Tail calls are matched on the graph, with a dedicated tail call
optimization that is actually testable. The instruction selection can
still fall back to a regular if the platform constraints don't allow to
emit a tail call (i.e. the return locations of caller and callee differ
or the callee takes non-register parameters, which is a restriction that
will be removed in the future).
Also explicitly limit tail call optimization to stubs for now and drop
the global flag.
BUG=v8:4076
LOG=n
Review URL: https://codereview.chromium.org/1114163005
Cr-Commit-Position: refs/heads/master@{#28219}