Since 2.6.31 perf_events interface has been available in the
kernel. There's a nice tool called "perf" (linux-2.6/tools/perf) that
uses this interface and provides capabilities similar to oprofile. The
simplest form of its usage is just dumping the raw log (trace) of
events generated by the kernel. In this patch I'm adding a script
(tools/ll_prof.py) to build profiles based on perf trace and our code
log. All the heavy-lifting is done by perf. Compared to oprofile agent
this approach does not require recompilation and supports code moving
garbage collections.
Expected usage is documented in the ll_prof's help. Basically one
should run V8 under perf passing --ll-prof flag and then the produced
logs can be analyzed by tools/ll_prof.py.
The new --ll-prof flag enables logging of generated code object
locations and names (like --log-code), and also of their bodies, which
can be later disassembled and annotated by the script.
Review URL: http://codereview.chromium.org/3831002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5663 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
o Only substitute constant values for full-words and not for
accidental partial matches.
o Keep the order of macros as in macros.py and allow depending on the
previous macros. We used to allow depending on macros that
happened to sort after the current one in the dictionary.
Review URL: http://codereview.chromium.org/2800042
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5022 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Fix issues so v8 could be built as a DLL.
-. get rid of all the compiler warning by moving dllexport/dllimport
to the individual members for classes which have inline members.
-. update v8 gyp to build v8.dll for chromium multi-dll version (win
and component==shared_library)
Note: most of the code are contributed by sjesse.
Code review URL: http://codereview.chromium.org/2882009/show
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5006 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Chromium build.
v8.gyp no longer sets any V8_TARGET_ARCH_* macro on the Mac. Instead, the
proper V8_TARGET_ARCH_* macro will be set by src/globals.h in the same way as
the V8_HOST_ARCH_* macro when it detects that no target macro is currently
defined. The Mac build will attempt to compile all ia32 and x86_64 .cc files.
#ifdef guards in each of these target-specific source files prevent their
compilation when the associated target is not selected. For completeness,
these #ifdef guards are also provided for the arm and mips .cc files.
BUG=706
TEST=x86_64 Mac GYP/Xcode-based Chromium build (still depends on other changes)
Review URL: http://codereview.chromium.org/2133003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4666 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
of having a list of virtual frame pointers in the jump
target we have one virtual frame, which is the frame that
all have to merge to to branch to that frame. The virtual
frame in the JumpTarget is inside the JumpTarget, rather than
being an allocated object that is pointed to. Unfortunately
this means that the JumpTarget class has to be able to see
the size of a VirtualFrame object to compile, which in turn
lead to a major reorganization of related .h files. The
actual change of functionality in this change is intended
to be minimal (we now assert that the virtual frames match
when using JumpTarget instead of just assuming that they do).
Review URL: http://codereview.chromium.org/1961004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4631 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The V8 GYP file now uses its own target architecture. It default to the standard target_arch, but can be set to a separate value. E.g. using
export GYP_DEFINES="target_arch=ia32 v8_target_arch=arm"
makes it possible to have the V8 ARM simulator be used in a IA32 build.
Added some checking of supported host/target architecture combinations.
Review URL: http://codereview.chromium.org/1790001
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4490 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This is to make possible enabling usage of the new profiling subsystem
in Chromium without much hassle. The idea is pretty simple: unless the
new profiling API is used, all works as usual, as soon as Chromium
starts to use the new API, it will work too.
Review URL: http://codereview.chromium.org/1635005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4382 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Instead of testing the value of a constant frame element to determine
the type we compute its type information at construction time.
This speeds up querying the type information during code generation.
This change also adds support for Integer32 constants and sets
the type information accordingly.
Review URL: http://codereview.chromium.org/1277001
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4256 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Circular queues serve as a transport for communicating between
VM, stack sampler and analyzer threads. Logging requirements
for VM and stack sampler are completely different, that's why
I introduced two different versions of CQs.
Review URL: http://codereview.chromium.org/1047002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4159 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
A variable usage analysis pass was run on toplevel and lazily-compiled
code but never used. Remove this pass and the data structures it
builds.
The representation of variable usage for Variables has been changed
from a struct containing a (weighted) count of reads and writes to a
simple flag. VariableProxies are always used, as before. The unused
"object uses" is removed.
Review URL: http://codereview.chromium.org/669270
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4052 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Introducing a virtual-frame-inl.h file containing some platform-independent
virtual frame function which are small enough to be inlined.
Removed unnecessary #include of virtual-frame.h from register-allocator-inl.h
and added the necessary explicit includes in a number of files.
Review URL: http://codereview.chromium.org/660104
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3962 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Each frame element gets a new attribute with number type information. A frame element can be:
- smi
- heap number
- number (i.e. either of the above)
- or something else.
The type information is propagated along with all virtual frame operations.
Results popped from the frame carry the number information with them.
Two optimizations in the code generator make use of the new
information:
- GenericBinaryOpSyub omits map checks if input operands are numbers.
- Boolean conversion for numbers: Emit inline code for converting a number (smi or heap number) to boolean. Do not emit call to ToBoolean stub in this case.
Review URL: http://codereview.chromium.org/545007
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3861 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This change adds a post-order numbering to AST nodes that
are relevant for the fast code generator. It is only invoked
together with the fast compiler.
Also changed the ast printer to print the numbering for
testing purposes if it is present.
Review URL: http://codereview.chromium.org/553134
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3738 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The problem appeared due to a fact that stubs doesn't create a stack
frame, reusing the stack frame of the caller function. When building
stack traces, the current function is retrieved from PC, and its
callees are retrieved by traversing the stack backwards. Thus, for
stubs, the stub itself was discovered via PC, and then stub's caller's
caller was retrieved from stack.
To fix this problem, a pointer to JSFunction object is now captured
from the topmost stack frame, and is saved into stack trace log
record. Then a simple heuristics is applied whether a referred
function should be added to decoded stack, or not, to avoid reporting
the same function twice (from PC and from the pointer.)
BUG=553
TEST=added to mjsunit/tools/tickprocessor
Review URL: http://codereview.chromium.org/546089
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3673 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
As this is only needed for internal profiling (not for DevTools),
the following approach had been chosen:
- during snapshot creation, positions of serialized objects inside
a snapshot are logged;
- then during V8 initialization, positions of deserealized objects
are logged;
- those positions are used for retrieving code objects names from
snapshot creation log, which needs to be supplied to tick processor
script.
Positions logging is controlled with the new flag: --log_snapshot_positions.
This flag is turned off by default, and this adds no startup penalty.
To plug this fix to Golem, the following actions are needed:
- logs created using 'mksnapshot' need to be stored along with VM images;
- tick processor script needs to be run with '--snapshot-log=...' cmdline
argument.
BUG=571
Review URL: http://codereview.chromium.org/551062
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3635 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
it in regular flat strings that are part of the snapshot.
After this change we don't need libraries-empty.cc any more. In
this change libraries-empty.cc is just a the same as libraries.cc
and the scons build builds it but does not use it. We can move
in stages to a situation where it is not generated at all for all
the build systems that we have.
Review URL: http://codereview.chromium.org/360050
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3238 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
fast-mode code generator.
AST expression nodes are annotated with a location when doing the
initial syntactic check of the AST. In the current implementation,
expression locations are 'temporary' (ie, allocated to the stack) or
'nowhere' (ie, the expression's value is not needed though it must be
evaluated for side effects).
For the assignment '.result = true' on IA32, we had before (with the
true value already on top of the stack):
32 mov eax,[esp]
35 mov [ebp+0xf4],eax
38 pop eax
Now:
32 pop [ebp+0xf4]
======== On x64, before:
37 movq rax,[rsp]
41 movq [rbp-0x18],rax
45 pop rax
Now:
37 pop [rbp-0x18]
======== On ARM, before (with the true value in register ip):
36 str ip, [sp, #-4]!
40 ldr ip, [sp, #+0]
44 str ip, [fp, #-12]
48 add sp, sp, #4
Now:
36 str ip, [fp, #-12]
Review URL: http://codereview.chromium.org/267118
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3076 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Turned on with '--log-producers' flag, also needs '--noinline-new' (this is temporarily), '--log-code', '--log-gc'. Not all allocations are traced (I'm investigating.)
Stacks are stored using weak handles. Thus, when an object is collected, its allocation stack is deleted.
Review URL: http://codereview.chromium.org/267077
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3069 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
fast code generator is optimized for compilation time and code size.
Currently it is only implemented on IA32. It is potentially triggered
for any code in the global scope (including code eval'd in the global
scope). It performs a syntactic check and chooses to compile in fast
mode if the AST contains only supported constructs and matches some
other constraints.
Initially supported constructs are
* ExpressionStatement,
* ReturnStatement,
* VariableProxy (variable references) to parameters and
stack-allocated locals,
* Assignment with lhs a parameter or stack-allocated local, and
* Literal
This allows compilation of literals at the top level and not much
else.
All intermediate values are allocated to temporaries and the stack is
used for all temporaries. The extra memory traffic is a known issue.
The code generated for 'true' is:
0 push ebp
1 mov ebp,esp
3 push esi
4 push edi
5 push 0xf5cca135 ;; object: 0xf5cca135 <undefined>
10 cmp esp,[0x8277efc]
16 jnc 27 (0xf5cbbb1b)
22 call 0xf5cac960 ;; code: STUB, StackCheck, minor: 0
27 push 0xf5cca161 ;; object: 0xf5cca161 <true>
32 mov eax,[esp]
35 mov [ebp+0xf4],eax
38 pop eax
39 mov eax,[ebp+0xf4]
42 mov esp,ebp ;; js return
44 pop ebp
45 ret 0x4
48 mov eax,0xf5cca135 ;; object: 0xf5cca135 <undefined>
53 mov esp,ebp ;; js return
55 pop ebp
56 ret 0x4
Review URL: http://codereview.chromium.org/273050
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3067 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The profile is taken together with constructors profile. In theory, it
should represent a complete heap graph. However, this takes a lot of memory,
so it is reduced to a more compact, but still useful form. Namely:
- objects are aggregated by their constructors, except for Array and Object
instances, that are too hetereogeneous;
- for Arrays and Objects, initially every instance is concerned, but then
they are grouped together based on their retainer graph paths similarity (e.g.
if two objects has the same retainer, they are considered equal);
Review URL: http://codereview.chromium.org/200132
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2903 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The SCons build now has the options profilingsupport and debuggersupport for controlling the setting of the defines ENABLE_LOGGIGN_AND_PROFILING and ENABLE_DEBUGGER_SUPPORT. By default both are set to true.
The changes to the XCode project have not been tested.
Review URL: http://codereview.chromium.org/195061
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2875 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
-fstrict-aliasing is enabled by mainline gcc at -O2 and higher, but in Apple
gcc, it must be enabled explicitly. This results in a 1.5% improvement in V8
benchmark scores.
This also removes the -fno-exceptions and -fno-rtti settings from v8.gyp for
the Mac, and removes -fno-rtti from v8.gyp for Linux, because these settings
have become part of Chromium's common.gypi, included here, as of r23304 at the
latest. The settings in v8.gyp have become redundant.
Review URL: http://codereview.chromium.org/174154
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2734 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
system can't currently process stacks produced by gcc -fomit-frame-pointer
properly. The drawback outweighs the 2% performance improvement. Once
the crash reporting system is able to handle this optimization, it should be
revisited.
Review URL: http://codereview.chromium.org/173123
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2733 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
These files will make it possible to start working with the 64-bit version on Windows.
The GUID's of the x64 project files are the same as their ia32 counterparts, but that does not matter as they will never be used in the same solution.
Added a temporary #error when building 64-bit version on Windows.
Review URL: http://codereview.chromium.org/171111
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2711 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
MSVS names '.map' file using only module's name, so both 'a.exe' and 'a.dll' will have 'a.map' file. To distinguish an originating module, we're now checking for image base which is always 00400000 for .exe files, and not 00400000 for .dlls.
Verified that windows-tick-processor can now process logs from Chromium using .map file generated for 'chrome.dll', an that it still works for V8's 'shell.exe'.
Review URL: http://codereview.chromium.org/172044
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2699 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
generated in one-pass from the source AST, code is generated from the
CFG. Enabled by the flag --multipass and disabled by default.
Rudimentary and currently only supports literal expressions and return
statements. There are some other known limitations (e.g., missing
support for tracing).
Review URL: http://codereview.chromium.org/159695
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2596 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
It is activated with '--log-gc' flag.
JS object size is calculated as its size + size of 'properties' and 'elements' arrays, if they are non-empty. This doesn't take maps, strings, heap numbers, and other shared objects into account.
As Soeren suggested, I've moved ZoneSplayTree from jsregexp to zone, and removed now empty jsregexp-inl header file.
Review URL: http://codereview.chromium.org/159504
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2570 ce2b1a6d-e550-0410-aec6-3dcde31c8c00