Perform the following transformation:
| Before | After |
|------------------+---------------------|
| add w2, w0, w1 | adds w2, w0, w1 |
| cmp w2, #0x0 | b.<cond'> <addr> |
| b.<cond> <addr> | |
|------------------+---------------------|
| add w2, w0, w1 | adds w2, w0, w1 |
| cmp #0x0, w2 | b.<cond'> <addr> |
| b.<cond> <addr> | |
and the same for and instructions instead of add. When the result of the
add/and is not used, generate cmn/tst instead. We need to take care with which
conditions we can handle and what new condition we map them to.
BUG=
Review-Url: https://codereview.chromium.org/2065243005
Cr-Commit-Position: refs/heads/master@{#37400}
CMN is a flag-setting add operation, and therefore is commutative.
{Add,Sub}WithOverflow generate ADD/SUB instructions that cannot
support a ROR shift.
BUG=
Review-Url: https://codereview.chromium.org/2087233005
Cr-Commit-Position: refs/heads/master@{#37212}
This patch enables the following transformations in the instruction
selector:
| Before | After |
|------------------+------------------------|
| and x3, x1, #0x1 | tb{,n}z w1, #0, #+0x78 |
| cmp x3, #0x0 | |
| b.{eq,ne} #+0x80 | |
|------------------+------------------------|
| cmp x0, #0x0 | cb{,n}z x0, #+0x48 |
| b.{eq,ne} #+0x4c | |
I have not seen these patterns beeing generated by turbofan, however the
stubs hit these cases frequently. A particular reason is that we are
turning operations that check for a Smi into a single `tbz`.
As a concequence, the interpreter is affected thanks to inlining
turbofan stubs into it's bytecode handlers. I have noticed the size of
the interpreter was reduced by 200 instructions.
BUG=
Review-Url: https://codereview.chromium.org/2022073002
Cr-Commit-Position: refs/heads/master@{#36632}
Adding optional operators for FNeg for WebAssembly, as the current implementation was significantly suboptimal for ARM.
Review-Url: https://codereview.chromium.org/2011303002
Cr-Commit-Position: refs/heads/master@{#36544}
This patch adds support for the `Operand2_R_LSL_I` addressing mode to
loads and stores. This allows merging a shift instruction into a
MemoryOperand. Since the shift immediate is restricted to the log2 of
the operation width, the opportunities to hit this are slim. However,
Ignition's bytecode handlers hit this case all the time:
kind = BYTECODE_HANDLER
name = Star
compiler = turbofan
Instructions (size = 44)
0x23e67280 0 add x1, x19, #0x1 (1)
0x23e67284 4 ldrsb x1, [x20, x1]
0x23e67288 8 sxtw x1, w1
0x23e6728c 12 mov x2, fp
0x23e67290 16 str x0, [x2, x1, lsl #3]
^^^^^^^^^^^^^^^^^^^^^
0x23e67294 20 add x19, x19, #0x2 (2)
0x23e67298 24 ldrb w1, [x20, x19]
0x23e6729c 28 ldr x1, [x21, x1, lsl #3]
^^^^^^^^^^^^^^^^^^^^^
0x23e672a0 32 br x1
Additionally, I noticed the optimisation occurs once in both the
`StringPrototypeCharAt` and `StringPrototypeCharCodeAt` turbofan stubs.
BUG=
Review-Url: https://codereview.chromium.org/1972103002
Cr-Commit-Position: refs/heads/master@{#36227}
A load instruction will implicitely clear the top 32 bits when writing to a W
register. This patch avoids generating a `mov` instruction to zero-extend the
result in this case.
For example, this occurs in the generated code for dispatching to the next
bytecode in the interpreter:
kind = BYTECODE_HANDLER
name = LdaZero
compiler = turbofan
Instructions (size = 36)
0x32e64c60 0 add x19, x19, #0x1 (1)
0x32e64c64 4 ldrb w0, [x20, x19]
0x32e64c68 8 mov w0, w0
^^^^^^^^^^
0x32e64c6c 12 lsl x0, x0, #3
0x32e64c70 16 ldr x1, [x21, x0]
0x32e64c74 20 movz x0, #0x0
0x32e64c78 24 br x1
BUG=
Review-Url: https://codereview.chromium.org/1950013003
Cr-Commit-Position: refs/heads/master@{#36038}
When storing an immediate integer or floating point zero, use the zero register
as the source value. This avoids the need to sometimes allocate a new register.
BUG=
Review-Url: https://codereview.chromium.org/1945783002
Cr-Commit-Position: refs/heads/master@{#36013}
MachineType is now a class with two enum fields:
- MachineRepresentation
- MachineSemantic
Both enums are usable on their own, and this change switches some places from using MachineType to use just MachineRepresentation. Most notably:
- register allocator now uses just the representation.
- Phi and Select nodes only refer to representations.
Review URL: https://codereview.chromium.org/1513543003
Cr-Commit-Position: refs/heads/master@{#32738}
Use compare-negate instruction if the right-hand input to a compare is a
negate operation.
BUG=
Review URL: https://codereview.chromium.org/1410123009
Cr-Commit-Position: refs/heads/master@{#31866}
Float(32|64)Min:
// (a < b) ? a : b
fcmp da, db
fcsel dd, da, db, lo
Float(32|64)Max:
// (b < a) ? a : b
fcmp db, da
fcsel dd, da, db, lo
BUG=
Review URL: https://codereview.chromium.org/1360603003
Cr-Commit-Position: refs/heads/master@{#31621}
Adds support for loading from and storing to outer context
variables. Also adds support for declaring functions on contexts and
locals. Finally, fixes a couple of issues with StaContextSlot where
we weren't emitting the write barrier and therefore would crash in the
GC.
Also added code so that --print-bytecode will output the
function name before the bytecodes, and replaces MachineType with StoreRepresentation in RawMachineAssembler::Store and updates tests.
BUG=v8:4280
LOG=N
Review URL: https://codereview.chromium.org/1425633002
Cr-Commit-Position: refs/heads/master@{#31584}
Support negate with shifted input on ARM64 by supporting lhs zero registers for
binary operations, and removing explicit Neg instruction support.
Review URL: https://codereview.chromium.org/1404093003
Cr-Commit-Position: refs/heads/master@{#31263}
This patch explicitly names commuted conditions for floating point
comparisons, instead of relying on CommuteFlagsCondition. Otherwise, a
bug in this function would not be caught.
BUG=
Review URL: https://codereview.chromium.org/1364773002
Cr-Commit-Position: refs/heads/master@{#30905}
This patch checks the type of the lhs operand of a floating point
comparison, and commutes the operands if it is #0.0. It allows us to
optimize a comparison with zero, as the fcmp instruction accepts #0.0 as
rhs operand.
Code before for "0.0 < 0.123":
------------------------------
fmov d1, xzr
ldr d0, pc+96
fcmp d1, d0
b.lo #+0xc
Code after:
-----------
ldr d0, pc+92
fcmp d0, #0.0
b.gt #+0xc
Before this patch, we used unsigned condition codes for floating point
comparisons, but the unordered case was not correctly commuted.
Review URL: https://codereview.chromium.org/1356283003
Cr-Commit-Position: refs/heads/master@{#30881}
Support 32-bit cmp with shift/extend by reusing the existing add/sub shift and
extend code.
Review URL: https://codereview.chromium.org/1218103005
Cr-Commit-Position: refs/heads/master@{#29435}
Move the arithmetic shift from Int32MulHigh to a following Int32Add on ARM64.
This graph is commonly generated on reduction of signed integer division.
Review URL: https://codereview.chromium.org/1209413008
Cr-Commit-Position: refs/heads/master@{#29380}
With this patch, we can generate simple immediate-shift instructions for
immediates outside the range "0 <= imm < width". Several related
instruction selectors have also been updated accordingly.
Example of generated code:
---- Before --- ---- After ----
movz w0, #33 lsr w0, w1, #1
lsr w0, w1, w0
BUG=
Review URL: https://codereview.chromium.org/1179893003
Cr-Commit-Position: refs/heads/master@{#28977}
Before selecting multiply-accumulate for a multiplication with add operation,
check that the multiply can't be reduced to add-with-shift. This prevents
simple multiplications by 3, 5, etc turning into register moves and madd
instructions.
Review URL: https://codereview.chromium.org/1180863002
Cr-Commit-Position: refs/heads/master@{#28976}
Merge a following arithmetic or logical right shift into the existing shift
of ARM64's Int32MulHigh or Uint32MulHigh code.
BUG=
Review URL: https://codereview.chromium.org/1179503003
Cr-Commit-Position: refs/heads/master@{#28945}
Reason for revert:
Breaks InstructionSelectorTest.Word64ShrWithWord64AndWithImmediate on debug builds (but not optdebug builds). I'll investigate.
Original issue's description:
> [arm64][turbofan]: Handle any immediate shift.
>
> With this patch, we can generate simple immediate-shift instructions for
> immediates outside the range "0 <= imm < width". Several related
> instruction selectors have also been updated accordingly.
>
> Example of generated code:
>
> ---- Before --- ---- After ----
> movz w0, #33 lsr w0, w1, #1
> lsr w0, w1, w0
>
> BUG=
>
> Committed: https://crrev.com/36d771bbfa4af5efcc1c1dcf5b234445cb7ee722
> Cr-Commit-Position: refs/heads/master@{#28943}
TBR=bmeurer@chromium.org,ulan@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=
Review URL: https://codereview.chromium.org/1176393002
Cr-Commit-Position: refs/heads/master@{#28944}
With this patch, we can generate simple immediate-shift instructions for
immediates outside the range "0 <= imm < width". Several related
instruction selectors have also been updated accordingly.
Example of generated code:
---- Before --- ---- After ----
movz w0, #33 lsr w0, w1, #1
lsr w0, w1, w0
BUG=
Review URL: https://codereview.chromium.org/1179733004
Cr-Commit-Position: refs/heads/master@{#28943}
Select ubfiz for (x & mask) << imm where mask is contiguous and imm is non-zero.
BUG=
Review URL: https://codereview.chromium.org/1161643003
Cr-Commit-Position: refs/heads/master@{#28755}
Enable clang's shorten-64-to-32 warning flag on ARM64, and fix the warnings
that arise.
BUG=
Review URL: https://codereview.chromium.org/1131573006
Cr-Commit-Position: refs/heads/master@{#28412}
Select sbfx for ((x << k) >> k) in ARM64 instruction selector, and similarly
for ubfx. This is a more generic version of the previous sxtb/h selector.
BUG=
Review URL: https://codereview.chromium.org/1135543002
Cr-Commit-Position: refs/heads/master@{#28318}
These operators compute the absolute floating point value of some
arbitrary input, and are implemented without any branches (i.e. using
vabs on arm, and andps/andpd on x86).
R=svenpanne@chromium.org
Review URL: https://codereview.chromium.org/1066393002
Cr-Commit-Position: refs/heads/master@{#27662}
Support sxtb and sxth extend operators on add and subtract, as we've
done for ubtx/h. This is similar to ARM support for sxtab/h.
BUG=
Review URL: https://codereview.chromium.org/1064813003
Cr-Commit-Position: refs/heads/master@{#27624}
Add support for appending extend modes uxtb or uxth to add and subtract
instructions, and using them in the instruction selector.
BUG=
Review URL: https://codereview.chromium.org/1021533002
Cr-Commit-Position: refs/heads/master@{#27303}
The instruction selector now selects pseudo instructions: CompareAndBranch or
TestAndBranch which are associated with their continuations so that generic
code in the code generator will treat them as branch instruction and will be
able to apply optimization like avoiding branches when the code can falltrhough.
R=bmeurer@chromium.org
Review URL: https://codereview.chromium.org/798553002
Cr-Commit-Position: refs/heads/master@{#25773}