There's actually no need to have the transition as part of the HStoreNamedField instruction. In fact, it is cleaner and faster to generate a separate HStoreNamedField for the transition map. This will also help to eliminate map stores with store elimination, as well as reduce register pressure for transitioning stores on ia32.
R=hpayer@chromium.org
Review URL: https://codereview.chromium.org/295743002
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@21383 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Previously we tried to share some code on by a slightly confusing re-use
of LDivI for a (general) flooring division. Now we cleanly separate
concerns, just like for the rest of the division-like operations. Note
that ARM64 already did it this way.
If we really want to save some code, we can introduce some macro
assembler instructions and/or helper functions in the code generator in
a future CL, but we should really try to avoid being "clever" to save
just a few lines of trivial code. Effort != complexity. :-)
Renamed some related Lithium operands on the way for more consistency.
R=mstarzinger@chromium.org
Review URL: https://codereview.chromium.org/212703002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@20395 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Lithium currently supports 3 division-like operations on integral operands: "Normal" division (rounding towards zero), flooring division (rounding towards -Infinity) and modulus calculation (the counterpart for the "normal" division). For divisors which are a power of 2, one can efficiently use some bit fiddling to avoid the actual division for such operations. This CL cleanly splits off these operations into separate Lithium instructions, making the code much more maintainable and more consistent across platforms.
There are 2 basic variations of these bit fiddling algorithms: One involving branches and a seemingly more clever one without branches. Choosing between the two is not as easy as it seems: Benchmarks (and probably real-world) programs seem to favor positive dividends, registers and shifting units are sometimes scarce resources, and branch prediction is quite good in modern processors. Therefore only the "normal" division by a power of 2 is implemented in a branch-free manner, this seems to be the best approach in practice. If this turns out to be wrong, we can easily and locally change this.
R=bmeurer@chromium.org
Review URL: https://codereview.chromium.org/175143002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@19715 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Recent changes in IC logic meant that CallStubs no longer use the Contextual bit. IsUndeclaredGlobal() needed to adjust for that.
In fact, now the CL has morphed to remove the notion of storing contextual state in the IC at all, it just becomes some extra ic state of the load ic. This took some adjustment in harmony code to use the global receiver for certain stores.
Now it's clearer that only LoadICs actually record any information about contextual or not.
R=verwaest@chromium.org
Review URL: https://codereview.chromium.org/140943002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18660 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
call machinery. The change replaces CallNamed, CallKeyed,
CallConstantFunction and CallKnownGlobal hydrogen instructions with two
new instructions with a more lower level semantics:
1. CallJSFunction for direct calls of JSFunction objects (no
argument adaptation)
2. CallWithDescriptor for calls of a given Code object according to
the supplied calling convention.
Details:
CallJSFunction should be straightforward, the main difference from the
existing InvokeFunction instruction is the absence of argument adaptor
handling. (As a next step, we will replace InvokeFunction with an
equivalent hydrogen code.)
For CallWithDescriptor, the calling conventions are represented by a
tweaked version of CallStubInterfaceDescriptor. In addition to the
parameter-register mapping, we also define parameter-representation
mapping there. The CallWithDescriptor instruction has variable number of
parameters now - this required some simple tweaks in Lithium, which
assumed fixed number of arguments in some places.
The calling conventions used in the calls are initialized in the
CallDescriptors class (code-stubs.h, <arch>/code-stubs-<arch>.cc), and
they live in a new table in the Isolate class. I should say I am not
quite sure about Representation::Integer32() representation for some of
the params of ArgumentAdaptorCall - it is not clear to me wether the
params could not end up on the stack and thus confuse the GC.
The change also includes an earlier small change to argument adaptor
(https://codereview.chromium.org/98463007) that avoids passing a naked
pointer to the code entry as a parameter. I am sorry for packaging that
with an already biggish change.
Performance implications:
Locally, I see a small regression (.2% or so). It is hard to say where
exactly it comes from, but I do see inefficient call sequences to the
adaptor trampoline. For example:
;;; <@78,#24> constant-t
bf85aa515a mov edi,0x5a51aa85 ;; debug: position 29
;;; <@72,#53> load-named-field
8b7717 mov esi,[edi+0x17] ;; debug: position 195
;;; <@80,#51> constant-s
b902000000 mov ecx,0x2 ;; debug: position 195
;;; <@81,#51> gap
894df0 mov [ebp+0xf0],ecx
;;; <@82,#103> constant-i
bb01000000 mov ebx,0x1
;;; <@84,#102> constant-i
b902000000 mov ecx,0x2
;;; <@85,#102> gap
89d8 mov eax,ebx
89cb mov ebx,ecx
8b4df0 mov ecx,[ebp+0xf0]
;;; <@86,#58> call-with-descriptor
e8ef57fcff call ArgumentsAdaptorTrampoline (0x2d80e6e0) ;; code: BUILTIN
Note the silly handling of ecx; the hydrogen for this code is:
0 4 s27 Constant 1 range:1_1 <|@
0 3 t30 Constant 0x5bc1aa85 <JS Function xyz (SharedFunctionInfo 0x5bc1a919)> type:object <|@
0 1 t36 LoadNamedField t30.[in-object]@24 <|@
0 1 t38 Constant 0x2300e6a1 <Code> <|@
0 1 i102 Constant 2 range:2_2 <|@
0 1 i103 Constant 1 range:1_1 <|@
0 2 t41 CallWithDescriptor t38 t30 t36 s27 i103 i102 #2 changes[*] <|@
BUG=
R=verwaest@chromium.org, danno@chromium.org
Review URL: https://codereview.chromium.org/104663004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18626 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
by escape analysis). Added several tests that expose the bug.
Summary:
LCodegen::AddToTranslation assumes that Lithium environments are
generated by depth-first traversal, but LChunkBuilder::CreateEnvironment
was generating them in breadth-first fashion. This fixes the
CreateEnvironment to traverse the captured objects depth-first.
Note:
It might be worth considering representing LEnvironment by a list
with the same order as the serialized translation representation
rather than having two lists with a subtle relationship between
them (and then serialize in a slightly different order again).
R=titzer@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Review URL: https://codereview.chromium.org/93803003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18470 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
It was only used for Math.log, and even then only in full code and in %_MathLog. For crankshafted code, Intel already used the FP operations directly, while the ARM/MIPS ports were a bit lazy and simply called the stub. The latter directly call the C library now without any cache. It would be possible to directly generate machine code if somebody has the time, from what I've seen out in the wild it should be only about a dozen instructions.
LOG=y
R=yangguo@chromium.org
Review URL: https://codereview.chromium.org/113343003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18344 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This removes tons of architecture-specific code and makes it easy to
experiment with other pseudo-RNG algorithms. The crankshafted code is
extremely good, keeping all things unboxed and doing only minimal
checks, so it is basically equivalent to the handwritten code.
When benchmarks are run without parallel recompilation, we get a few
percent regression on SunSpider's string-validate-input and
string-base64, but these benchmarks run so fast that the overall
SunSpider score is hardly affected and within the usual jitter. Note
that these benchmarks actually run even faster when we don't
crankshaft at all on the main thread (the regression is not caused by
bad code, it is caused by Crankshaft needing a few hundred microsecond
for compilation of a trivial function). Luckily, when parallel
recompilation is enabled, i.e. in the browser, we see no regression at
all!
R=mstarzinger@chromium.org
Review URL: https://codereview.chromium.org/68723002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@17955 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The %_OneByteSeqStringSetChar intrinsic expects its arguments to be checked before being called for efficiency reasons, but the fuzzer provided no such checks. Now the intrinsic is robust to bad input if FLAG_debug_code is set.
R=yangguo@chromium.org
TEST=test/mjsunit/regress/regress-320948.js
BUG=chromium:320948
LOG=Y
Review URL: https://codereview.chromium.org/72813004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@17886 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Lithium uses indexes after the maximium value ID in the HGraph as indexes
of virtual registers and assumes that the maximum value ID does not change.
The IsStandardConstant and GetConstantXX functions could add constants to
HGraph, which aliased virtual registers with real values. This could confuse
the register allocator to think that a value in a virtual register is tagged
and to incorrectly set it in the pointer map.
BUG=298269
TEST=mjsunit/regress/regress-298269.js
R=verwaest@chromium.org
Review URL: https://chromiumcodereview.appspot.com/66693002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@17599 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This improves the generated code for HSeqStringSetChar across
all platforms, taking advantage of constant operands whenever
possible. It also drops the unused DefineSameAsFirst constraint
for the register allocator on x64 and ia32, where it caused
unnecessary spills when the string operand was live across the
HSeqStringSetChar instruction.
A new GVN flag StringChars is introduced to express dependencies
between HSeqStringSetChar, HStringCharCodeAt and the upcoming
HSeqStringGetChar (the GVNFlags type is now 64bit in size).
Also improves the test case.
TEST=mjsunit/string-natives
R=mstarzinger@chromium.org, yangguo@chromium.org
Review URL: https://codereview.chromium.org/57383004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@17521 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
In the process:
- Add a command-line flag --opt-code-positions to track source position information throughout optimized code.
- Add a subclass of the hydrogen graph builder to ensure that the source position is properly set on the graph builder for all generated hydrogen code.
- Overhaul handling of source positions in hydrogen to ensure they are passed through to generated code consistently and in most cases transparently.
Originally reviewed in this CL: https://codereview.chromium.org/24957003/
Review URL: https://codereview.chromium.org/29123008
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@17295 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Right now we eagerly save all allocatable double registers upon
entry to every Hydrogen code stub that uses HCallRuntime, and
restore them when we return. Since the HCallRuntime is on the
fallback path for code stubs, this is both a waste of time and
stack space in almost every case.
This patch adds a flag to the HCallRuntime, which controls whether
the instruction saves the double register itself (using the save
doubles flag for the CEntryStub), or whether its up the surrounding
code to handle the clobbering of double registers.
R=danno@chromium.org
Review URL: https://codereview.chromium.org/23530066
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@17044 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Since the per-context random number generator is now
properly seeded upon context creation, we do not need
to check for lazy-initialization anymore, and so we
can implement the HRandom instruction w/o having to
call into the C function (which means we don't need
to MarkAsCall anymore).
TEST=cctest/test-random
R=yangguo@chromium.org
Review URL: https://codereview.chromium.org/23478031
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16684 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- Use V8_FINAL and V8_OVERRIDE in Ast classes.
- Use V8_FINAL and V8_OVERRIDE in Lithium mips backend.
- Use V8_FINAL and V8_OVERRIDE in Lithium arm backend.
- Use V8_FINAL and V8_OVERRIDE in Lithium x64 backend.
- Use V8_FINAL and V8_OVERRIDE in Lithium ia32 backend.
- Use V8_FINAL and V8_OVERRIDE in Lithium classes.
- Use V8_FINAL and V8_OVERRIDE in Hydrogen classes.
TBR=dslomov@chromium.org
Review URL: https://codereview.chromium.org/22796020
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16244 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- Use V8_FINAL and V8_OVERRIDE in objects.
- Use V8_FINAL and V8_OVERRIDE in Ast classes.
- Use V8_FINAL and V8_OVERRIDE in Lithium mips backend.
- Use V8_FINAL and V8_OVERRIDE in Lithium arm backend.
- Use V8_FINAL and V8_OVERRIDE in Lithium x64 backend.
- Use V8_FINAL and V8_OVERRIDE in Lithium ia32 backend.
- Use V8_FINAL and V8_OVERRIDE in Lithium classes.
- Use V8_FINAL and V8_OVERRIDE in Hydrogen classes.
R=dslomov@chromium.org
Review URL: https://codereview.chromium.org/23064017
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16232 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This patch is to enhance the source code line information for profiler.
For the Hydrogen compilation, most of the source code line information
is not copied from the HInstruction the to corresponding LInstruction.
This patch defines one PositionBits field for LInstruction and copies the
sorce code position value from the HInstruction.
When Generating the native code, we use RecordPosition(..) function to
write LInstruction's position value to position recorder.
For the MIPS platform, I did not touch because I have no devices
to verify the modification on it.
R=danno@chromium.org
Review URL: https://codereview.chromium.org/21042003
Patch from Chunyang Dai <chunyang.dai@intel.com>.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16114 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This change implements a simple data-flow analysis pass over captured
objects to the existing escape analysis. It tracks the state of values
in the Hydrogen graph through CapturedObject marker instructions that
are used to construct an appropriate translation for the deoptimizer to
be able to materialize these objects again.
This can be considered a combination of scalar replacement of loads and
stores on captured objects and sinking of unused allocations.
R=titzer@chromium.org
TEST=mjsunit/compiler/escape-analysis
Review URL: https://codereview.chromium.org/21055011
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16098 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
I'd like to propagate bailout reason to cpu profiler.
So I need to save it into heap object SharedFunctionInfo.
But:
1) all bailout reason strings spread across all the sources.
2) they are native strings and if I convert them into String then I may have a performance issue.
3) one byte is enough for 184 bailout reasons. Otherwise we need 8 bytes for the pointer.
Also I think it would be nice to have error strings collected in one place.
In that case we will get additional benefits:
It allows us to keep this set of messages under control.
It gives us a chance to internationalize them.
It slightly reduces the binary footprint.
From the other hand the developers have to add new strings into that enum.
BUG=
R=jkummerow@chromium.org
Review URL: https://codereview.chromium.org/20843012
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16024 ce2b1a6d-e550-0410-aec6-3dcde31c8c00