Reason for revert:
This change rendered InstructionSequenceTest::SetNumRegs ineffectual, thus
loosening the tests that were using that API to ensure correct register
allocation under intentionally constrained setups.
For the problem stated in this CL, a solution needs to continue supporting the
intentionally set-up test configuration.
Original issue's description:
> MIPS: Fix bad RegisterConfiguration usage in InstructionSequence unit tests.
>
> Test InstructionSequenceTest has been initialized with a testing RegisterConfiguration
> instance defined in instruction-sequence-unittest.h, whereas class ExplicitOperand which
> is being tested used RegisterConfiguration from instruction.cc. In case these two
> instances are different, the tests would fail. The issue is fixed by using the same
> instance of RegisterConfiguration both for test code and code under test.
>
> Additionally, the tests in register-allocator-unittest.cc use hardcoded values
> for register and begin failing is the hardcoded register is not available for
> allocation. Fix by forcing the use of allocatable registers only.
>
> TEST=unittests.MoveOptimizerTest.RemovesRedundantExplicit,unittests.RegisterAllocatorTest.SpillPhi
> BUG=
>
> Committed: https://crrev.com/0cf56232209d4c9c669b8426680de18806f6c29a
> Cr-Commit-Position: refs/heads/master@{#40862}
TBR=dcarney@chromium.org,bmeurer@chromium.org,mstarzinger@chromium.org,vogelheim@chromium.org,titzer@chromium.org,ivica.bogosavljevic@imgtec.com
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG=
Review-Url: https://codereview.chromium.org/2587593002
Cr-Commit-Position: refs/heads/master@{#41777}
Adds assignment tracking to the bytecode analysis pass, and updates
bytecode graph builder to only create LoopExitValues for assigned
values.
Review-Url: https://codereview.chromium.org/2558093005
Cr-Commit-Position: refs/heads/master@{#41719}
MIPS[64]R6 supports only fusion multiply-accumulate instructions, and using
these causes failures of several tests that expect exact floating-point
results. Therefore we disable fusion multiply-accumulate in both emitted and
compiled code on R6.
TEST=cctest/test-run-machops/RunFloat64MulAndFloat64Add1,mjsunit/es6/math-expm1.js
mjsunit/es6/math-fround.js,mjsunit/compiler/multiply-add.js
BUG=
Review-Url: https://codereview.chromium.org/2569683002
Cr-Commit-Position: refs/heads/master@{#41717}
Wrap the liveness bitvectors from the bytecode liveness analysis with a
helper class, which makes the register/accumulator bits explicit.
Review-Url: https://codereview.chromium.org/2552723004
Cr-Commit-Position: refs/heads/master@{#41589}
JS operators always have an implicit context input, so just use that instead.
BUG=
Review-Url: https://codereview.chromium.org/2541813002
Cr-Commit-Position: refs/heads/master@{#41392}
Replaces the graph-based liveness analyzer in the bytecode graph builder
with an initial bytecode-based liveness analysis pass, which is added to
the existing loop extent analysis.
Now the StateValues in the graph have their inputs initialised to
optimized_out, rather than being modified after the graph is built.
Review-Url: https://codereview.chromium.org/2523893003
Cr-Commit-Position: refs/heads/master@{#41355}
Reason for revert:
Breaks the build:
https://build.chromium.org/p/client.v8/builders/V8%20Linux%20-%20shared/builds/14886
Original issue's description:
> [ignition/turbo] Perform liveness analysis on the bytecodes
>
> Replaces the graph-based liveness analyzer in the bytecode graph builder
> with an initial bytecode-based liveness analysis pass, which is added to
> the existing loop extent analysis.
>
> Now the StateValues in the graph have their inputs initialised to
> optimized_out, rather than being modified after the graph is built.
>
> Committed: https://crrev.com/1852300954c216c29cf93444430681d213e87925
> Cr-Commit-Position: refs/heads/master@{#41344}
TBR=jarin@chromium.org,rmcilroy@chromium.org,yangguo@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review-Url: https://codereview.chromium.org/2541443002
Cr-Commit-Position: refs/heads/master@{#41346}
Replaces the graph-based liveness analyzer in the bytecode graph builder
with an initial bytecode-based liveness analysis pass, which is added to
the existing loop extent analysis.
Now the StateValues in the graph have their inputs initialised to
optimized_out, rather than being modified after the graph is built.
Review-Url: https://codereview.chromium.org/2523893003
Cr-Commit-Position: refs/heads/master@{#41344}
Also lower JSToBoolean(x) where x is either some detectable receiver or
null, or any kind of receiver, null or undefined. Also fix a couple of
minor issues with the JSToBoolean lowering and tests.
R=yangguo@chromium.org
BUG=v8:5267
Review-Url: https://codereview.chromium.org/2530773002
Cr-Commit-Position: refs/heads/master@{#41241}
Since we are specializing on the native context, we don't have to load
the vector from the closure. For one thing, this reduces the machinery for
nodes that use a vector in their generic incarnation.
BUG=
R=mstarzinger@chromium.org
Review-Url: https://codereview.chromium.org/2529463002
Cr-Commit-Position: refs/heads/master@{#41221}
This is the TurboFan counterpart of http://crrev.com/2504263004, but it
is a bit more involved, since in TurboFan we always inline the appropriate
call to the @@hasInstance handler, and by that we can optimize a lot more
patterns of instanceof than Crankshaft, and even yield fast instanceof
for custom @@hasInstance handlers (which we can now properly inline as
well).
Also we now properly optimize Function.prototype[@@hasInstance], even if
the right hand side of an instanceof doesn't have the Function.prototype
as its direct prototype.
For the baseline case, we still rely on the global protector cell, but
we can address that in a follow-up as well, and make it more robust in
general.
TEST=mjsunit/compiler/instanceof
BUG=v8:5640
R=yangguo@chromium.org
Review-Url: https://codereview.chromium.org/2511223003
Cr-Commit-Position: refs/heads/master@{#41092}
The control edges in a TurboFan graph can form a cycle. To break this cycle in the int64-lowering we add special handling for loop nodes. Similar handling already exists for phi nodes and effectphi nodes, which breaks cycles formed by value edges and effect edges, respectively.
Review-Url: https://codereview.chromium.org/2511503002
Cr-Commit-Position: refs/heads/master@{#41071}
Currently, we are using the following sequence for load/store
with large offset (offset > 16b):
lui at, 0x1234
ori at, at, 0x5678
add at, s0, at
lw a0, 0(at)
This sequence can be optimized in the following way:
lui at, 0x1234
add at, s0, at
lw a0, 0x5678(at)
BUG=
Review-Url: https://codereview.chromium.org/2503493002
Cr-Commit-Position: refs/heads/master@{#40988}
Port 0322c20d17
Original commit message:
When storing an immediate integer or floating point zero, use the zero register
as the source value. This avoids the need to sometimes allocate a new register.
BUG=
Review-Url: https://codereview.chromium.org/2470133005
Cr-Commit-Position: refs/heads/master@{#40987}
SourcePosition::InliningId() refers to a the new table DeoptimizationInputData::InliningPositions(), which provides the following data for every inlining id:
- The inlined SharedFunctionInfo as an offset into DeoptimizationInfo::LiteralArray
- The SourcePosition of the inlining. Recursively, this yields the full inlining stack.
Before the Code object is created, the same information can be found in CompilationInfo::inlined_functions().
If SourcePosition::InliningId() is SourcePosition::kNotInlined, it refers to the outer (non-inlined) function.
So every SourcePosition has full information about its inlining stack, as long as the corresponding Code object is known. The internal represenation of a source position is a positive 64bit integer.
All compilers create now appropriate source positions for inlined functions. In the case of Turbofan, this required using AstGraphBuilderWithPositions for inlined functions too. So this class is now moved to a header file.
At the moment, the additional information in source positions is only used in --trace-deopt and --code-comments. The profiler needs to be updated, at the moment it gets the correct script offsets from the deopt info, but the wrong script id from the reconstructed deopt stack, which can lead to wrong outputs. This should be resolved by making the profiler use the new inlining information for deopts.
I activated the inlined deoptimization tests in test-cpu-profiler.cc for Turbofan, changing them to a case where the deopt stack and the inlining position agree. It is currently still broken for other cases.
The following additional changes were necessary:
- The source position table (internal::SourcePositionTableBuilder etc.) supports now 64bit source positions. Encoding source positions in a single 64bit int together with the difference encoding in the source position table results in very little overhead for the inlining id, since only 12% of the source positions in Octane have a changed inlining id.
- The class HPositionInfo was effectively dead code and is now removed.
- SourcePosition has new printing and information facilities, including computing a full inlining stack.
- I had to rename compiler/source-position.{h,cc} to compiler/compiler-source-position-table.{h,cc} to avoid clashes with the new src/source-position.cc file.
- I wrote the new wrapper PodArray for ByteArray. It is a template working with any POD-type. This is used in DeoptimizationInputData::InliningPositions().
- I removed HInlinedFunctionInfo and HGraph::inlined_function_infos, because they were only used for the now obsolete Crankshaft inlining ids.
- Crankshaft managed a list of inlined functions in Lithium: LChunk::inlined_functions. This is an analog structure to CompilationInfo::inlined_functions. So I removed LChunk::inlined_functions and made Crankshaft use CompilationInfo::inlined_functions instead, because this was necessary to register the offsets into the literal array in a uniform way. This is a safe change because LChunk::inlined_functions has no other uses and the functions in CompilationInfo::inlined_functions have a strictly longer lifespan, being created earlier (in Hydrogen already).
BUG=v8:5432
Review-Url: https://codereview.chromium.org/2451853002
Cr-Commit-Position: refs/heads/master@{#40975}
Reason for revert:
Breaks CQ trybots now, i.e. https://build.chromium.org/p/tryserver.v8/builders/v8_linux_mipsel_compile_rel/builds/24703/steps/compile%20with%20ninja/logs/stdio
Original issue's description:
> MIPS: Optimize load/store with large offset
>
> Currently, we are using the following sequence for load/store with large offset (offset > 16b):
>
> lui at, 0x1234
> ori at, at, 0x5678
> add at, s0, at
> lw a0, 0(at)
>
> This sequence can be optimized in the following way:
>
> lui at, 0x1234
> add at, s0, at
> lw a0, 0x5678(at)
>
> BUG=
TBR=ivica.bogosavljevic@imgtec.com,miran.karic@imgtec.com,v8-mips-ports@googlegroups.com,dusan.simicic@imgtec.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=
Review-Url: https://codereview.chromium.org/2500863003
Cr-Commit-Position: refs/heads/master@{#40959}
Currently, we are using the following sequence for load/store with large offset (offset > 16b):
lui at, 0x1234
ori at, at, 0x5678
add at, s0, at
lw a0, 0(at)
This sequence can be optimized in the following way:
lui at, 0x1234
add at, s0, at
lw a0, 0x5678(at)
BUG=
Review-Url: https://codereview.chromium.org/2486283003
Cr-Commit-Position: refs/heads/master@{#40953}
ToName conversion, i.e., ToPropertykey() is the
identify for strings and symbols.
BUG=v8:5623
Review-Url: https://codereview.chromium.org/2494073002
Cr-Commit-Position: refs/heads/master@{#40924}
This adds a new ExternalPointer type, which is an Internal type that is
used for ExternalReferences and other pointer values, like the pointers
into the asm.js heap. It also adds a PointerConstant operator, which we
use to represents these raw constants (we can probably remove that
particular operator again once WebAssembly ships with the validator).
R=mvstanton@chromium.org
BUG=v8:5267,v8:5270
Review-Url: https://codereview.chromium.org/2494753003
Cr-Commit-Position: refs/heads/master@{#40923}
Test InstructionSequenceTest has been initialized with a testing RegisterConfiguration
instance defined in instruction-sequence-unittest.h, whereas class ExplicitOperand which
is being tested used RegisterConfiguration from instruction.cc. In case these two
instances are different, the tests would fail. The issue is fixed by using the same
instance of RegisterConfiguration both for test code and code under test.
Additionally, the tests in register-allocator-unittest.cc use hardcoded values
for register and begin failing is the hardcoded register is not available for
allocation. Fix by forcing the use of allocatable registers only.
TEST=unittests.MoveOptimizerTest.RemovesRedundantExplicit,unittests.RegisterAllocatorTest.SpillPhi
BUG=
Review-Url: https://codereview.chromium.org/2433093002
Cr-Commit-Position: refs/heads/master@{#40862}
Port f07d2cdd6a
Original commit message:
A load instruction will implicitely clear the top 32 bits when writing to a W
register. This patch avoids generating a `mov` instruction to zero-extend the
result in this case.
For example, this occurs in the generated code for dispatching to the next
bytecode in the interpreter:
kind = BYTECODE_HANDLER
name = LdaZero
compiler = turbofan
Instructions (size = 36)
0x32e64c60 0 add x19, x19, #0x1 (1)
0x32e64c64 4 ldrb w0, [x20, x19]
0x32e64c68 8 mov w0, w0
^^^^^^^^^^
0x32e64c6c 12 lsl x0, x0, #3
0x32e64c70 16 ldr x1, [x21, x0]
0x32e64c74 20 movz x0, #0x0
0x32e64c78 24 br x1
Review-Url: https://codereview.chromium.org/2469253002
Cr-Commit-Position: refs/heads/master@{#40758}
Use HeapConstant for string_iterator_map rather than loading it
manually. This avoids unnecessary map checks.
BUG= v8:3822,v8:5267
Review-Url: https://codereview.chromium.org/2479563003
Cr-Commit-Position: refs/heads/master@{#40741}
This is preparation for using TF to create builtins that handle variable number of
arguments and have to remove these arguments dynamically from the stack upon
return.
The gist of the changes:
- Added a second argument to the Return node which specifies the number of stack
slots to pop upon return in addition to those specified by the Linkage of the
compiled function.
- Removed Tail -> Non-Tail fallback in the instruction selector. Since TF now should
handles all tail-call cases except where the return value type differs, this fallback
was not really useful and in fact caused unexpected behavior with variable
sized argument popping, since it wasn't possible to materialize a Return node
with the right pop count from the TailCall without additional context.
- Modified existing Return generation to pass a constant zero as the additional
pop argument since the variable pop functionality
LOG=N
Review-Url: https://codereview.chromium.org/2446543002
Cr-Commit-Position: refs/heads/master@{#40699}
Reason for revert:
Seems to break arm64 sim debug and blocks roll:
https://build.chromium.org/p/client.v8.ports/builders/V8%20Linux%20-%20arm64%20-%20sim%20-%20debug/builds/3294
Original issue's description:
> [turbofan] Support variable size argument removal in TF-generated functions
>
> This is preparation for using TF to create builtins that handle variable number of
> arguments and have to remove these arguments dynamically from the stack upon
> return.
>
> The gist of the changes:
> - Added a second argument to the Return node which specifies the number of stack
> slots to pop upon return in addition to those specified by the Linkage of the
> compiled function.
> - Removed Tail -> Non-Tail fallback in the instruction selector. Since TF now should
> handles all tail-call cases except where the return value type differs, this fallback
> was not really useful and in fact caused unexpected behavior with variable
> sized argument popping, since it wasn't possible to materialize a Return node
> with the right pop count from the TailCall without additional context.
> - Modified existing Return generation to pass a constant zero as the additional
> pop argument since the variable pop functionality
>
> LOG=N
TBR=bmeurer@chromium.org,mstarzinger@chromium.org,epertoso@chromium.org,danno@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
NOPRESUBMIT=true
Review-Url: https://codereview.chromium.org/2473643002
Cr-Commit-Position: refs/heads/master@{#40691}
This is preparation for using TF to create builtins that handle variable number of
arguments and have to remove these arguments dynamically from the stack upon
return.
The gist of the changes:
- Added a second argument to the Return node which specifies the number of stack
slots to pop upon return in addition to those specified by the Linkage of the
compiled function.
- Removed Tail -> Non-Tail fallback in the instruction selector. Since TF now should
handles all tail-call cases except where the return value type differs, this fallback
was not really useful and in fact caused unexpected behavior with variable
sized argument popping, since it wasn't possible to materialize a Return node
with the right pop count from the TailCall without additional context.
- Modified existing Return generation to pass a constant zero as the additional
pop argument since the variable pop functionality
LOG=N
Review-Url: https://codereview.chromium.org/2446543002
Cr-Commit-Position: refs/heads/master@{#40678}
- Modifies RegisterConfiguration to specify complex aliasing on ARM 32.
- Modifies RegisterAllocator to consider aliasing.
- Modifies ParallelMove::PrepareInsertAfter to handle aliasing.
- Modifies GapResolver to split wider register moves when interference
with smaller moves is detected.
- Modifies MoveOptimizer to handle aliasing.
- Adds ARM 32 macro-assembler pseudo move instructions to handle cases where
split moves don't correspond to actual s-registers.
- Modifies CodeGenerator::AssembleMove and AssembleSwap to handle moves of
different widths, and moves involving pseudo-s-registers.
- Adds unit tests for FP operand interference checking and PrepareInsertAfter.
- Adds more tests of FP for the move optimizer and register allocator.
LOG=N
BUG=v8:4124
Review-Url: https://codereview.chromium.org/2410673002
Cr-Commit-Position: refs/heads/master@{#40597}
MIPS64 doesn't support Word32 compare instructions. Instead it relies
that the values in registers are correctly sign-extended and uses
Word64 comparison instead. This behavior is correct in most cases,
but doesn't work when comparing signed with unsigned operands.
The solution proposed here tries to match a comparison of signed
with unsigned operand, and perform Word32Compare simulation only
in those cases. Unfortunately, the solution is not complete because
it might skip cases where Word32 compare simulation is needed, so
basically it is a hack.
BUG=
TEST=mjsunit/compiler/uint32
Review-Url: https://codereview.chromium.org/2391393003
Cr-Commit-Position: refs/heads/master@{#40398}
EffectPhis can cause a cycle in a TurboFan graph. We delay the
processing of EffectPhis in the Int64Lowering to break these cycles. We
do the same already for Phis.
R=titzer@chromium.org
BUG=v8:5518
TEST=unittests/Int64LoweringTest.EffectPhiLoop
Review-Url: https://codereview.chromium.org/2428583002
Cr-Commit-Position: refs/heads/master@{#40378}
This adds more useful information to the v8-heap-stats tool.
BUG=v8:5489
Review-Url: https://codereview.chromium.org/2394213003
Cr-Commit-Position: refs/heads/master@{#40361}
Adds a boolean flag to the liveness analysis which makes it also analyze
the accumulator. This can help prevent the accumulator escaping loops,
as well as decreasing the number of distinct state values nodes in the
graph.
The flag is a kind of ugly way to hack this in, however it is probably
the simplest to add, and (more importantly) to remove once the AST graph
builder is gone.
I measure a 2.6% improvement on Mandreel on my x64 machine, and a ~2%
improvement on Navier-Stokes. Other improvements are expected.
Review-Url: https://codereview.chromium.org/2428503002
Cr-Commit-Position: refs/heads/master@{#40359}
- Adds an optional representation field to VReg and TestOperand structs.
- Adds a simple FP allocation test to register-allocator-unittest.cc.
- Adds some simple FP tests to move-optimizer-unittest.cc.
LOG=N
BUG=v8:4124
Review-Url: https://codereview.chromium.org/2400513002
Cr-Commit-Position: refs/heads/master@{#40117}
There were once plans to generate cross-context code with TurboFan,
however that doesn't fit into the model anymore, and so all of this
is essentially dead untested code (and thus most likely already broken
in subtle ways). With this mode still in place it would also be a lot
harder to make inlining based on SharedFunctionInfo work.
BUG=v8:2206,v8:5499
R=jarin@chromium.org
Review-Url: https://codereview.chromium.org/2406803002
Cr-Commit-Position: refs/heads/master@{#40109}
With this CL, we devolve all Constants introduced as they are with an object handle into
* Range - for integers
* Nan
* MinusZero
* OtherNumberConstant - for doubles
* HeapConstant
We reduce the amount we have to inspect an object handle during optimization. Also, simplifications result. For example, you never have to check if a Range contains a HeapConstant.
BUG=
Review-Url: https://codereview.chromium.org/2381523002
Cr-Commit-Position: refs/heads/master@{#40041}
If possible, take the constant map from the (known) native context for
JSCreateIterResultObject, so that subsequent map checks can be
eliminated in case of iterator inlining.
R=jarin@chromium.org
BUG=v8:3822
Review-Url: https://codereview.chromium.org/2394783002
Cr-Commit-Position: refs/heads/master@{#39974}