version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
mostly to make sure that it is going in the right direction. The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is there a simpler/more correct way to store the already-materialized
objects? (At the moment there is a custom root reference to JSArray
containing frames' FixedArrays with their captured objects.)
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=mstarzinger@chromium.org, danno@chromium.org
LOG=N
BUG=
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18918 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
by escape analysis). Added several tests that expose the bug.
Summary:
LCodegen::AddToTranslation assumes that Lithium environments are
generated by depth-first traversal, but LChunkBuilder::CreateEnvironment
was generating them in breadth-first fashion. This fixes the
CreateEnvironment to traverse the captured objects depth-first.
Note:
It might be worth considering representing LEnvironment by a list
with the same order as the serialized translation representation
rather than having two lists with a subtle relationship between
them (and then serialize in a slightly different order again).
R=titzer@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Review URL: https://codereview.chromium.org/93803003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18470 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This change implements a simple data-flow analysis pass over captured
objects to the existing escape analysis. It tracks the state of values
in the Hydrogen graph through CapturedObject marker instructions that
are used to construct an appropriate translation for the deoptimizer to
be able to materialize these objects again.
This can be considered a combination of scalar replacement of loads and
stores on captured objects and sinking of unused allocations.
R=titzer@chromium.org
TEST=mjsunit/compiler/escape-analysis
Review URL: https://codereview.chromium.org/21055011
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16098 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The current usage of this runtime function is broken as it does not
prevent inlining of the affected function but rather bails out from the
whole unit of compilation after trying to inline affected functions.
This simplifies said runtime function to avoid accidental misuse.
R=titzer@chromium.org
TEST=mjsunit/never-optimize
Review URL: https://codereview.chromium.org/19776006
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15762 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This allows the deoptimizer to materialize objects (e.g. the arguments
object) while deopting without having a consective stack area holding
the object values. The LEnvironment explicitly tracks locations for
these values and preserves them in the translation.
R=svenpanne@chromium.org
TEST=mjsunit/compiler/inline-arguments
Review URL: https://codereview.chromium.org/16779004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15087 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The predicate CanBeSpilled had a bug, prohibiting the necessary spilling and
correct splitting of live ranges. Removed a redundant assertion immediately done
by the callee anyway.
Thanks to Slava for help with that issue and the entertaining historical
background of the whole story... ;-)
BUG=177883
Review URL: https://codereview.chromium.org/12631012
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13891 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This allows Crankshaft to completely inline a f.apply() dispatch if the
exact number of arguments is known and the function is constant. The
deoptimizer doesn't generate the f.apply() frame during deoptimization,
so the materialized frames look like f.apply() did a tailcall.
R=jkummerow@chromium.org
TEST=mjsunit/compiler/inline-function-apply
Review URL: https://codereview.chromium.org/12263004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13665 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
HCheckPrototypeMaps currently records the prototype and the holder of the
prototype chain (both ends of the chain) and assumes that the chain elements
and their maps did not change in during the entirety of Crankshaft. The actual
traversal of the prototype chain happens in Lithium at code generation.
With parallel compilation, this assumption is not longer correct.
R=mstarzinger@chromium.org
BUG=
Review URL: https://chromiumcodereview.appspot.com/11864013
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13454 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Previously Crankshaft emitted a generic load for these, now we emit a load of a
named field, guarded by a proto chain check.
LCheckPrototypeMaps now returns the holder, which is for free, because it
already had to check its map as the last step, anyway. This is in sync with what
StubCompiler::CheckPrototype does.
Review URL: https://codereview.chromium.org/11338030
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@12847 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This fixes materialization of arguments objects for strict mode functions during
deoptimization. We materialize arguments from the stack area where optimized
code pushes the arguments when entering the inlined environment. For adapted
invocations we use the arguments adaptor frame for materialization.
R=svenpanne@chromium.org
BUG=v8:2261
TEST=mjsunit/regress/regress-2261,mjsunit/compiler/inline-arguments
Review URL: https://chromiumcodereview.appspot.com/10908194
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@12489 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
xmm0 is not saved across runtime call on x64 because MacroAssembler::EnterExitFrameEpilogue preserves only allocatable XMM registers unlike on ia32 where it preserves all registers.
Cleanup handling of shifts: SHR can deoptimize only when its a shift by 0, all other shift never deoptimize.
Fix type inference for i-to-t change instruction. On X64 this ensures that write-barrier is generated correctly.
R=danno@chromium.org
Review URL: https://chromiumcodereview.appspot.com/10868032
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@12373 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Safe operations are those that either do not observe unsignedness or have special support for uint32 values:
- all binary bitwise operations: they perform ToInt32 on inputs;
- >> and << shifts: they perform ToInt32 on left hand side and ToUint32 on right hand side;
- >>> shift: it performs ToUint32 on both inputs;
- stores to integer external arrays (not pixel, float or double ones): these stores are "bitwise";
- HChange: special support added for conversions of uint32 values to double and tagged values;
- HSimulate: special support added for deoptimization with uint32 values in registers and stack slots;
- HPhi: phis that have only safe uses and only uint32 operands are uint32 themselves.
BUG=v8:2097
TEST=test/mjsunit/compiler/uint32.js
Review URL: https://chromiumcodereview.appspot.com/10778029
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@12367 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Highlights of this CL:
* Introduced a new opcode in the deoptimizer for a setter stub frame.
* Added a global setter stub for returning after deoptimizing a setter.
* We do not need special deopt support for getters, although the getter stub creates an internal frame. The normal machinery works just right for this case, although we generate a stack that can never occur during normal fullcode execution. If this hurts us one day, we can parameterize and reuse the setter deopt machinery.
Review URL: https://chromiumcodereview.appspot.com/10855098
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@12328 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Currently we inline functions with different contexts only on ia32, so we have
to move the helper functions for the various contexts to the top level. Further
more, "new Object()" seems to prevent inlining, too, so we us a simple object
literal.
Although things get consistently inlined now, something strange seems to happen
in test/effect contexts: The DEOPT output seems to contain too few frames, and
we don't get any DEOPT ouput after the first time for those contexts. This has
to be investigated...
TBR=mstarzinger@chromium.org
Review URL: https://chromiumcodereview.appspot.com/10836258
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@12312 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Currently only simple setter calls are handled (i.e. no calls in count
operations or compound assignments), and deoptimization in the setter is not
handled at all. Because of the latter, we temporarily hide this feature behind
the --inline-accessors flag, just like inlining getters.
We now use an enum everywhere we depend on the handling of a return value,
passing around several boolean would be more confusing.
Made VisitReturnStatement and the final parts of TryInline more similar, so
matching them visually is a bit easier now.
Simplified the signature of AddLeaveInlined, the target of the HGoto can simply
be retrieved from the function state.
Review URL: https://chromiumcodereview.appspot.com/10836133
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@12286 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Inlined strict mode functions (that are not called as methods) will get
their receiver reset to undefined. This should not happen when inlining
constructors.
This change also simplifies the test suite to reuse the same closures
into which constructors get inlined and use gc() to force V8 to forget
collected type feedback.
R=vegorov@chromium.org
TEST=mjsunit/compiler/inline-construct
Review URL: https://chromiumcodereview.appspot.com/9597017
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10920 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The old code used a separate HToInt32 instruction which had a wrong register
constraint for the input register which caused wrong result when the stored value
is used after a typed array store. (UseRegister instead of UseTempRegister) when no
SSE3 is available.
This change fixes it by replacing HToInt32 with the corresponding HChange
instruction which has correct register contraints.
TEST=mjsunit/compiler/regress-toint32.js
Review URL: https://chromiumcodereview.appspot.com/9565007
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10891 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Modify PreProcessOsrEntry to work with OSR entries that have non-empty expression stack.
Modify graph builder to take for-in state from environment instead of directly referencing emitted instructions.
Extend %OptimizeFunctionOnNextCall with an argument to force OSR to make writing OSR tests easier: %OptimizeFunctionOnNextCall(f, "osr").
R=fschneider@chromium.org
TEST=test/mjsunit/compiler/optimized-for-in.js
Review URL: https://chromiumcodereview.appspot.com/9431030
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10796 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Only JSObject enumerables with enum cache (fast case properties, no interceptors, no enumerable properties on the prototype) are supported.
HLoadKeyedGeneric with keys produced by for-in enumeration are recognized and rewritten into direct property load by index. For this enum-cache was extended to store property indices in a separate array (see handles.cc).
New hydrogen instructions:
- HForInPrepareMap: checks for-in fast case preconditions and returns map that contains enum-cache;
- HForInCacheArray: extracts enum-cache array from the map;
- HCheckMapValue: map check with HValue map instead of immediate;
- HLoadFieldByIndex: load fast property by it's index, positive indexes denote in-object properties, negative - out of object properties;
Changed hydrogen instructions:
- HLoadKeyedFastElement: added hole check suppression for loads from internal FixedArrays that are knows to have no holes inside.
R=fschneider@chromium.org
BUG=
TEST=
Review URL: https://chromiumcodereview.appspot.com/9425045
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10794 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This extends the current support for nested object literals we already
have in Crankshaft, to also support nested array literals and mixed
nested literals containing arrays and objects. All three types are
generated by the unified HFastLiteral instruction.
All previous upper bounds on nested literal graphs remain unchanged,
keeping the size of generated code in check.
The main intention is to boost performance of two-dimensional array
literals containing constant elements (aka. matrices).
R=danno@chromium.org
TEST=mjsunit/compiler/literals-optimized
Review URL: https://chromiumcodereview.appspot.com/9403018
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10734 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Until now we only could inline as specialized HIR instructions when called
as a method (e.g. Math.abs)
It is very common practice to abbreviate calls to those functions by defining
a global or local variable like:
var a = Math.abs;
var x = a(123);
This change allows inlining them when called as a function (global or local).
Review URL: https://chromiumcodereview.appspot.com/9365013
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10640 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Changes the way we do lazy deoptimization:
1. For side-effect instructions, we insert the lazy-deopt call at
the following LLazyBailout instruction.
CALL
GAP
LAZY-BAILOUT ==> lazy-deopt-call
2. For other instructions (StackCheck) we insert it right after the
instruction since the deopt targets an earlier deoptimization environment.
STACK-CHECK
GAP ==> lazy-deopt-call
The pc of the lazy-deopt call that will be patched in is recorded in the
deoptimization input data. Each Lithium instruction can have 0..n safepoints.
All safepoints get the deoptimization index of the associated LAZY-BAILOUT
instruction. On lazy deoptimization we use the return-pc to find the safepoint.
The safepoint tells us the deoptimization index, which in turn finds us the
PC where to insert the lazy-deopt-call.
Additional changes:
* RegExpLiteral marked it as having side-effects so that it
gets an explicitlazy-bailout instruction (instead of
treating it specially like stack-checks)
* Enable target recording CallFunctionStub to achieve
more inlining on optimized code.
BUG=v8:1789
TEST=jslint and uglify run without crashing, mjsunit/compiler/regress-lazy-deopt.js
Review URL: http://codereview.chromium.org/8492004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10006 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This change fixes a off-by-one level error when dropping the
function from the environment. The function of the outermost
environment was not dropped.
BUG=v8:1785
TEST=test/mjsunit/compiler/regress-inline-callfunctionstub.js
Review URL: http://codereview.chromium.org/8341019
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@9789 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This change is based on my previous change enabling inlining calls-as-function
fixing the bugs related to deoptimization.
The function value on top of the environment was dropped too late in the old code.
As a result we could get a wrong value on top after deoptimization.
This change includes r9619. It was reverted because of test failures that are fixed
with this patch.
Review URL: http://codereview.chromium.org/8360001
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@9758 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
We have to check for uninitialized uses before phi-elimination. Otherwise we
may miss such a use and result in using the hole value instead. This
causes a NULL-dereference or assertion failure.
BUG=96989
TEST=mjsunit/compiler/regress-96989.js
Review URL: http://codereview.chromium.org/7974009
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@9337 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The approach is to handle the common case in the optimizing
compiler and to bailout for the rare corner cases.
This is done by initializing all local const-variables with
the hole value and disallowing any use of the hole value statically.
Review URL: http://codereview.chromium.org/6026006
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@8104 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Encapsulate the helper functions in mjsunit.js.
Now only exposes the exception class and the assertXXX functions.
Make assertEquals use === instead of ==.
This prevents a lot of possiblefalse positives in tests, and avoids
having to do assertTrue(expected === actual) when you need it.
Fixed some tests that were either buggy or assuming == test.
Review URL: http://codereview.chromium.org/6869007
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7628 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
In the cases where a global property cell cannot be used in the optimized code
use standard load ic to get the property instead of bailing out.
This is re-committing r7212 and r7215 which where reverted in r7239 with the addition of recoring the source position in the hydrogen code for the LoadGlobalCell instruction. To record that position an optional position field has been added to the variable proxy AST node.
Review URL: http://codereview.chromium.org/6758007
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7474 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The class did not correctly implement the RequiredInputRepresentation.
I changed this functions to be abstract so that all hydrogen classes
must implement it.
As a convention instructions with zero input operands return None as input
representation.
Instructions that can handle all input representations without converting before
also have None as required input representation (e.g. HTest)
All other instructions need a proper required input representation.
Review URL: http://codereview.chromium.org/6538088
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@6891 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
If we have a property access of the form this.x, where the access site sees
the global object, we can specialize the IC stub so that it performs a map
check without first performing a heap object check.
Ensure that we do not get in JS code with a non-JSObject this value by
deoptimizing at Function.prototype.apply if the first argument is not a
JSObject.
BUG=v8:1128
Review URL: http://codereview.chromium.org/6463025
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@6707 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
1. The placement of checks for negative zero has to be computed after
all conversion instructions have been inserted. I separated the code
into its own phase.
2. GVN need to take instruction flags into account when comparing
instructions for redundancy.
Review URL: http://codereview.chromium.org/6260035
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@6534 ce2b1a6d-e550-0410-aec6-3dcde31c8c00