fast-mode code generator.
AST expression nodes are annotated with a location when doing the
initial syntactic check of the AST. In the current implementation,
expression locations are 'temporary' (ie, allocated to the stack) or
'nowhere' (ie, the expression's value is not needed though it must be
evaluated for side effects).
For the assignment '.result = true' on IA32, we had before (with the
true value already on top of the stack):
32 mov eax,[esp]
35 mov [ebp+0xf4],eax
38 pop eax
Now:
32 pop [ebp+0xf4]
======== On x64, before:
37 movq rax,[rsp]
41 movq [rbp-0x18],rax
45 pop rax
Now:
37 pop [rbp-0x18]
======== On ARM, before (with the true value in register ip):
36 str ip, [sp, #-4]!
40 ldr ip, [sp, #+0]
44 str ip, [fp, #-12]
48 add sp, sp, #4
Now:
36 str ip, [fp, #-12]
Review URL: http://codereview.chromium.org/267118
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3076 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
fast code generator is optimized for compilation time and code size.
Currently it is only implemented on IA32. It is potentially triggered
for any code in the global scope (including code eval'd in the global
scope). It performs a syntactic check and chooses to compile in fast
mode if the AST contains only supported constructs and matches some
other constraints.
Initially supported constructs are
* ExpressionStatement,
* ReturnStatement,
* VariableProxy (variable references) to parameters and
stack-allocated locals,
* Assignment with lhs a parameter or stack-allocated local, and
* Literal
This allows compilation of literals at the top level and not much
else.
All intermediate values are allocated to temporaries and the stack is
used for all temporaries. The extra memory traffic is a known issue.
The code generated for 'true' is:
0 push ebp
1 mov ebp,esp
3 push esi
4 push edi
5 push 0xf5cca135 ;; object: 0xf5cca135 <undefined>
10 cmp esp,[0x8277efc]
16 jnc 27 (0xf5cbbb1b)
22 call 0xf5cac960 ;; code: STUB, StackCheck, minor: 0
27 push 0xf5cca161 ;; object: 0xf5cca161 <true>
32 mov eax,[esp]
35 mov [ebp+0xf4],eax
38 pop eax
39 mov eax,[ebp+0xf4]
42 mov esp,ebp ;; js return
44 pop ebp
45 ret 0x4
48 mov eax,0xf5cca135 ;; object: 0xf5cca135 <undefined>
53 mov esp,ebp ;; js return
55 pop ebp
56 ret 0x4
Review URL: http://codereview.chromium.org/273050
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3067 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
while, and for loops.
Previously they were distinguished by a type field, which required
runtime asserts to avoid invalid nodes (since not all loop types have
the same internal structure). Now they C++ type system is used to
require well-formed loop ASTs.
Because they do not share compilation code, we had very large
functions in the code generators that merely did a runtime dispatch to
a specific implementation based on the type.
Review URL: http://codereview.chromium.org/269049
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3048 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The construction of arrays when using the the Array function either as a constructor or a normal function is now handled fully in generated code in most cases. Only when Array is called with one argument which is either negative or abowe JSObject::kInitialMaxFastElementArray (which is currently 1000) or if the allocated object cannot fit in the room left in new space is the runtime system entered.
Two new native code built-in functions are added one for normal invocation and one for the construct call. The existing C++ builtin is renamed, but kept. When the normal invocation cannot be handled in generated code the C++ builtin is called. When the construct invocation cannot be handled in native code the generic construct stub is called (which will end up in the C++ builtin through a construct trampoline).
One thing that might be changed is preserving esi (constructor function) during the handling of a construct call. We know precisily what function we where calling anyway and can just reload it. This could remove the parameter construct_call to ArrayNativeCode and remove the handling of this from that function.
The X64 and ARM implementations are not part of this changelist.
Review URL: http://codereview.chromium.org/193125
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2899 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The main piece of this change was to add support for break on return for ARM. On ARM the normal js function return consist of the following code sequence.
mov sp, fp
ldmia sp!, {fp, lr}
add sp, sp, #4
bx lr
to a call to the debug break return entry code using the following code sequence
mov lr, pc
ldr pc, [pc, #-4]
<debug break return entry code entry point address>
bktp 0
The values of Assembler::kPatchReturnSequenceLength and Assembler::kPatchReturnSequenceLength are somewhat misleading, but they fit the current use in the debugger. Also Assembler::kPatchReturnSequenceLength is used in the IC code as well (for something which is not related to return sequences at all). I will change that in a separate changelist.
For the debugger to work also added recording of the return sequence in the relocation info and handling of source position recording when a function ends with a return statement.
Used the constant kInstrSize instead of sizeof(Instr).
Passes all debugger tests on both simulator and hardware (only release mode tested on hardware).
Review URL: http://codereview.chromium.org/199075
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2879 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Objects which require an additional fixed array to be allocated now have this allocated in generated code as well. Added allocation flags to the macro assembler new space allocation routines.
Changed the ia32 and x64 macro assemblers to take allocation flags to the allocation routines instead of boolean flag.
Review URL: http://codereview.chromium.org/201015
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2832 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This change moves the allocation of new objects into generated code. The allocation will bail out into the runtime system if the number of properties to allocate for the object exceeds the number of in-object properties.
Review URL: http://codereview.chromium.org/175045
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2797 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
For objects which only have simple assignments of the form this.x = ...; a
specialized constructor stub is now generated. This generated code allocates the
object and fills in the initial properties directly. If this fails for some
reason code continues in the generic constructor stub which in turn might pass
control to the runtime system.
Added counter to see how many objects are constructed using a specialized stub.
The specialized stub is only implemented for ia32 architecture in this change.
For x64 and ARM the generic construct stub is used.
This is change is identical to http://codereview.chromium.org/174392 (committed in r2753 and reverted in r2754) except that a few parts have already been committed from http://codereview.chromium.org/173469 (committed in r2762).
Review URL: http://codereview.chromium.org/173470
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2764 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
For objects which only have simple assignments of the form this.x = ...; a specialized constructor stub is now generated. This generated code allocates the object and fills in the initial properties directly. If this fails for some reason code continues in the generic constructor stub which in turn might pass control to the runtime system.
Added counter to see how many objects are constructed using a specialized stub.
The specialized stub is only implemented for ia32 architecture in this change. For x64 and ARM the generic construct stub is used.
Review URL: http://codereview.chromium.org/174392
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2753 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
they do nothing. The frame is currently always spilled, so they were
not doing anything useful.
The call sites have been left alone to mark where spills will
eventually be needed if we begin doing register allocation on ARM.
Review URL: http://codereview.chromium.org/164136
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2644 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
fast-mode compiler.
1. We avoid generating a useless temporary for assignments with
nontrivial right-hand sides. Instead of translating id = expr into:
...
tmp = <last expr instruction>
id = tmp
we generate directly
...
id = <last expr instruction>
by passing a data destination ('hint') down the AST. The semantics is
to use the destination as a result location if a temp is needed. It
may be ignored. NULL indicates I don't care and you should generate a
temp.
2. We correctly handle assignments as subexpressions. When building
the CFG for an expression we accumulate the assigned variables and we
emit a move to a fresh temporary if a value in a variable is in
jeopardy of being overwritten.
Review URL: http://codereview.chromium.org/165056
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2643 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
generated in one-pass from the source AST, code is generated from the
CFG. Enabled by the flag --multipass and disabled by default.
Rudimentary and currently only supports literal expressions and return
statements. There are some other known limitations (e.g., missing
support for tracing).
Review URL: http://codereview.chromium.org/159695
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2596 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
We did not handle the case where the left-hand-side expression was
fully compiled to control flow. There were also some assertions for
unary and binary expressions that crashed debug builds when the
expression was fully compiled to control flow.
Regression test added.
Review URL: http://codereview.chromium.org/160006
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2524 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
* Fast runtime calls for div and mod.
* Fix assembly and disassembly of multiply instructions.
* Strength reduce and inline multiplications to shift-add.
* Strength reduce and inline mod by power of 2.
* Strength reduce mod by other small integers to mul.
* Strength reduce div by 2 and 3.
Review URL: http://codereview.chromium.org/155047
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2355 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Remove the check for deleted properties in the global load inline cache if the property is known to be read only.
Propegate the in loop flag for the global call inline cache.
Changed the propagation of the code flags in the call stub compiler to compute these the same way for all types of call stubs and assert that the flags for the generated code is the same as those used for the cache lookup.
Addressed a few comments from previous review in test-api.cc.
Review URL: http://codereview.chromium.org/150101
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2308 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
* Fix incorrect signedness in disassembly of umull/mull on ARM.
* Fix incorrect register order in disassembly of umull/mull.
* Fix incorrect assembly of umull on ARM.
* Remove retroactively obsoleted restriction on choice of
registers in mul instructions on ARM.
Review URL: http://codereview.chromium.org/150002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2292 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Marsaglia's multiply-with-carry instead of mixing the
bits obtained from calling the system random() twice.
This seems to be a bit faster and gives a better
distribution than the system random() in particular on
Windows.
Review URL: http://codereview.chromium.org/126113
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2159 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
* Identify heap numbers that contain non-Smi int32s and do bit
ops on them without calling the fp hardware or emulation.
* Identify results that are non-Smi int32s and write them into
heap numbers without calling the fp hardware or emulation.
* Do unary minus on heap numbers without going into the runtime
system.
* On add, sub and mul if we have both Smi and heapnumber inputs
to the same operation then convert the Smi to a double and do
the op without going into runtime system. This also applies
if we have two Smi inputs but the result is not Smi.
Review URL: http://codereview.chromium.org/119241
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2131 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
instructions. The intention is that the snapshots generated
by the simulator should be usable on the hardware. Instead of
swi instructions we generate a branch to a swi instruction that
is not part of the snapshot. The call/jump is patched up in
the same way as other external references when the snapshot
is deserialized. This only works for EABI targets: on old ABI
targets we still emit some instructions not supported by the
simulator (fp coprocessor instructions).
Review URL: http://codereview.chromium.org/119036
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2127 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
deferred code snippets are highly stylized. They always make a call
to a stub or the runtime and then return. This change takes advantage
of that.
Creating a deferred code object now captures a snapshot of the
registers in the virtual frame. The registers are automatically saved
on entry to the deferred code and restored on exit.
The clients of deferred code must ensure that there is no change to
the registers in the virtual frame (eg, by allocating which can cause
spilling) or to the stack pointer. That is currently the case.
As a separate change, I will add either code to verify this constraint
or else code to forbid any frame effect.
The deferred code itself does not use the virtual frame or register
allocator (or even the code generator). It is raw macro assembler
code.
Review URL: http://codereview.chromium.org/118226
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2112 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
submitted in revisions 2093, 2094, 2099, and 2106.
There's no evidence that supports that these changes
should be the cause of the unexplained performance
regressions on the intl2 and DHTML page cyclers.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2109 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
called from within a loop or not. In the past we lost the
information if a call site went megamorphic before a lazily
compiled callee was called for the first time. Now we track
that correctly (this is an issue that affects richards).
We still don't manage to track the in-loop state through a
constructor call, since constructor calls use LoadICs instead
of CallICs. This issue affects delta-blue. So in this patch
we assume that lazy compilations that don't happen through a
CallIC happen from inside a loop. I have an idea to fix this
but this patch is big enough already.
With our improved tracking of in-loop state I have switched
off the inlining of in-object loads for code that is not in
a loop. This benefits compile speed. One issue is that
eagerly compiled code now doesn't get the in-object loads
inlined. We need to eagerly compile less code to fix this.
Review URL: http://codereview.chromium.org/115744
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2046 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This issue was raised by Brett Wilson while reviewing my changelist for readability. Craig Silverstein (one of C++ SG maintainers) confirmed that we should declare one namespace per line. Our way of namespaces closing seems not violating style guides (there is no clear agreement on it), so I left it intact.
Review URL: http://codereview.chromium.org/115756
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2038 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
encoding the values in one word and by using an indirection table for
handles.
This reduces compilation time by roughly 10% and we should be able to make the slow case equality checking of frame elements faster as well.
Review URL: http://codereview.chromium.org/115347
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1949 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
stack pointer to change by more than one in a corner case. If we push
a constant smi larger than 16 bits, we push it via a temporary
register. Allocating the temporary can cause a register to be spilled
from the frame somewhere above the stack pointer.
As a fix, do not use pushes to materialize ranges of elements of size
larger than one.
Review URL: http://codereview.chromium.org/92121
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1785 ce2b1a6d-e550-0410-aec6-3dcde31c8c00