it is the last patch of https://codereview.chromium.org/1012633002
All that we need here is to push the collected info to the profiler
and convert it into actionable information about deopt.
On the Next: get the info accessible by embedder.
BUG=chromium:452067
LOG=n
TEST=DeoptAtFirstLevelInlinedSource, DeoptAtSecondLevelInlinedSource, DeoptUntrackedFunction
Review URL: https://codereview.chromium.org/1013143003
Cr-Commit-Position: refs/heads/master@{#27403}
We use slightly different schema for JumpTable on arm64 than for x64.
We do a branch (B) to the JumpTable from the code,
then a branch (B) to the end of jump table code
and then branch to the deoptimizer code with putting
the return address into lr register (Call which is actually Blr).
As a result the 'from' address in Deoptimizer always points to
the end of JumpTable code and we can get nothing from this information.
0) I moved save_doubles and needs_frame code out of for_loop.
1) I replaced B commands with Bl so we put different return addresses
to lr register for the different jump table entries and replaced
the final Call with Br which do not touch lr register.
Also I removed the last_entry check so we will always do the Bl
even for the last entry because we need the right address in lr.
I don't think that this will affect the performance because it
just one more branch for entire deopt mechanics.
BUG=chromium:452067
LOG=n
Review URL: https://codereview.chromium.org/984893003
Cr-Commit-Position: refs/heads/master@{#27094}
The original code always returned the first entry from RelocInfo that matched with
bailout_id. But we may have a few different deopt reasons for one bailout_id.
So we need to get the one which matches with a particular call from JumpTable.
We can do this by checking not 'target_address' (it maps to bailout_id)
but 'from' address which maps to a particular JumpTable entry.
The test was reworked so it tests identical functions against different reasons.
BUG=chromium:452067
LOG=n
Review URL: https://codereview.chromium.org/984773003
Cr-Commit-Position: refs/heads/master@{#27076}
Reason for revert:
Some tests still flaky
Original issue's description:
> CpuProfiler: enable tests except four failing tests.
>
> Four tests are failing due to a problem with no frame ranges.
>
> BUG=
> LOG=n
>
> Committed: https://crrev.com/2be160e726f2be6272b77e53fbd556aded6024f1
> Cr-Commit-Position: refs/heads/master@{#27035}
TBR=yurys@chromium.org,svenpanne@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=
Review URL: https://codereview.chromium.org/987553005
Cr-Commit-Position: refs/heads/master@{#27037}
Four tests are failing due to a problem with no frame ranges.
BUG=
LOG=n
Review URL: https://codereview.chromium.org/976203003
Cr-Commit-Position: refs/heads/master@{#27035}
The root of problem is the fact that we don't track the position of 'this' statement but use them when visit compare statement.
As a result we have -1 as the position of left expression and the resulting relative position is negative and doesn't fit into BitField.
BUG=452067
TEST=test-cpu-profiler/SourceLocation
LOG=n
Review URL: https://codereview.chromium.org/940593002
Cr-Commit-Position: refs/heads/master@{#26741}
1) create beefy RelocInfo table when cpu profiler is active, so if a function
was optimized when profiler was active RelocInfo would get separate DeoptInfo
for the each deopt case.
2) push DeoptInfo from CodeEntry to ProfileNode.
When deopt happens we put the info collected on #1 into CodeEntry and record stack sample.
On the sampling thread we grab the deopt data and append it to the corresponding ProfileNode deopts list.
Sample profile dump.
[Top down]:
0 (root) 0 #1
1 29 #2
1 test 29 #3
2 opt_function 29 #4
2 opt_function 29 #5
deopted at 118 with reason 'not a heap number'
deopted at 137 with reason 'division by zero'
BUG=452067
LOG=n
Committed: https://crrev.com/ce8701b247d3c6604f24f17a90c02d17b4417f54
Cr-Commit-Position: refs/heads/master@{#26615}
Review URL: https://codereview.chromium.org/919953002
Cr-Commit-Position: refs/heads/master@{#26630}
Reason for revert:
static initializers broke the build
Original issue's description:
> CPUProfiler: Push deopt reason further to ProfileNode.
>
> 1) create beefy RelocInfo table when cpu profiler is active, so if a function
> was optimized when profiler was active RelocInfo would get separate DeoptInfo
> for the each deopt case.
>
> 2) push DeoptInfo from CodeEntry to ProfileNode.
> When deopt happens we put the info collected on #1 into CodeEntry and record stack sample.
> On the sampling thread we grab the deopt data and append it to the corresponding ProfileNode deopts list.
>
> Sample profile dump.
> [Top down]:
> 0 (root) 0 #1
> 1 29 #2
> 5 test 29 #3
> 3 opt_function 29 #4
> deopted at 52 with reason 'not a heap number'
> deopted at 71 with reason 'division by zero'
>
> BUG=452067
> LOG=n
>
> Committed: https://crrev.com/ce8701b247d3c6604f24f17a90c02d17b4417f54
> Cr-Commit-Position: refs/heads/master@{#26615}
TBR=jarin@chromium.org,svenpanne@chromium.org,yurys@chromium.org,alph@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=452067
Review URL: https://codereview.chromium.org/915173005
Cr-Commit-Position: refs/heads/master@{#26616}
1) create beefy RelocInfo table when cpu profiler is active, so if a function
was optimized when profiler was active RelocInfo would get separate DeoptInfo
for the each deopt case.
2) push DeoptInfo from CodeEntry to ProfileNode.
When deopt happens we put the info collected on #1 into CodeEntry and record stack sample.
On the sampling thread we grab the deopt data and append it to the corresponding ProfileNode deopts list.
Sample profile dump.
[Top down]:
0 (root) 0 #1
1 29 #2
5 test 29 #3
3 opt_function 29 #4
deopted at 52 with reason 'not a heap number'
deopted at 71 with reason 'division by zero'
BUG=452067
LOG=n
Review URL: https://codereview.chromium.org/919953002
Cr-Commit-Position: refs/heads/master@{#26615}
1) Deoptimizer::Reason was replaced with Deoptimizer::DeoptInfo
because it also has raw position. Also the old name clashes with DeoptReason enum.
2) c_entry_fp assignment call was added to EntryGenerator::Generate
So we can calculate sp and have a chance to record the stack for the deopting function.
btw it makes the test stable.
3) new kind of CodeEvents was added to cpu-profiler
4) GetDeoptInfo method was extracted from PrintDeoptLocation.
So it could be reused in cpu profiler.
BUG=452067
LOG=n
Review URL: https://codereview.chromium.org/910773002
Cr-Commit-Position: refs/heads/master@{#26545}
During generation code and relocation info are generated simultaneously.
When code generation is done you each code object has associated "relocation info".
Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection),
correspondences between the machine program counter and source locations for stack walking.
This patch:
1. Add more source positions info in reloc info to make it suitable for source level mapping.
The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and
(2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other).
I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark).
2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line.
If a source line is found that hit counter is increased by one for this line.
3. Add a new public V8 API to get the hit source lines by CDT CPU profiler.
Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown.
4.Add a test that checks how the samples are distributed through source lines.
It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version.
Patch from Denis Pravdin <denis.pravdin@intel.com>;
R=svenpanne@chromium.org, yurys@chromium.org
Review URL: https://codereview.chromium.org/682143003
Patch from Weiliang <weiliang.lin@intel.com>.
Cr-Commit-Position: refs/heads/master@{#25182}
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Reason for revert:
It broke layout test fast/events/window-onerror-02.html, error column reported by window.onerror is now wrong (I believe it is because of the change in full-codegen):
http://build.chromium.org/p/client.v8/builders/V8-Blink%20Linux%2064%20%28dbg%29/builds/652
Original issue's description:
> Extend CPU profiler with mapping ticks to source lines
>
> The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers.
> During generation code and relocation info are generated simultaneously.
> When code generation is done you each code object has associated "relocation info".
> Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection),
> correspondences between the machine program counter and source locations for stack walking.
>
> This patch:
> 1. Add more source positions info in reloc info to make it suitable for source level mapping.
> The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and
> (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other).
> I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark).
>
> 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line.
> If a source line is found that hit counter is increased by one for this line.
>
> 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler.
> Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown.
>
> 4.Add a test that checks how the samples are distributed through source lines.
> It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version.
>
> Patch from Denis Pravdin <denis.pravdin@intel.com>
> BUG=None
> LOG=Y
> R=svenpanne@chromium.org
>
> Committed: https://code.google.com/p/v8/source/detail?r=24389TBR=svenpanne@chromium.org,danno@chromium.org,alph@chromium.org,denis.pravdin@intel.com,weiliang.lin@intel.com
BUG=None
LOG=N
Review URL: https://codereview.chromium.org/624443005
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24394 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers.
During generation code and relocation info are generated simultaneously.
When code generation is done you each code object has associated "relocation info".
Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection),
correspondences between the machine program counter and source locations for stack walking.
This patch:
1. Add more source positions info in reloc info to make it suitable for source level mapping.
The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and
(2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other).
I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark).
2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line.
If a source line is found that hit counter is increased by one for this line.
3. Add a new public V8 API to get the hit source lines by CDT CPU profiler.
Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown.
4.Add a test that checks how the samples are distributed through source lines.
It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version.
Patch from Denis Pravdin <denis.pravdin@intel.com>
BUG=None
LOG=Y
R=svenpanne@chromium.org
Review URL: https://codereview.chromium.org/616963005
Patch from Denis Pravdin <denis.pravdin@intel.com>.
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24389 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The reuse of CodeCreateEvent for deopt events caused a CodeCreateEvent
fired twice for a code object. When the event was processed for the first
time it seized the no-fp-ranges from code object, so the second event
had no ranges info leaving code entry without them.
As a result when a cpu profile sample falls into the region it missed the
2nd stack frame.
LOG=N
BUG=
R=bmeurer@chromium.org, loislo@chromium.org
Review URL: https://codereview.chromium.org/290093005
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@21418 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- Distinguish between context bound scripts (Script) and context unbound scripts
(UnboundScript).
- Add ScriptCompiler (which will later contain functions for async compilation).
This is a breaking change, in particular, Script::New no longer exists (it is
replaced by ScriptCompiler::CompileUnbound). Script::Compile remains as a
backwards-compatible shorthand for ScriptCompiler::Compile.
Passing CompilerOptions with produce_data_to_cache = true doesn't do anything
yet; the only way to generate the data to cache is the old preparsing API. (To
be fixed in the next version.)
This is a fixed version of https://codereview.chromium.org/186723005/
BUG=
R=dcarney@chromium.org
Review URL: https://codereview.chromium.org/199063003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@19925 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- Distinguish between context bound scripts (Script) and context unbound scripts
(UnboundScript).
- Add ScriptCompiler (which will later contain functions for async compilation).
This is a breaking change, in particular, Script::New no longer exists (it is
replaced by ScriptCompiler::CompileUnbound). Script::Compile remains as a
backwards-compatible shorthand for ScriptCompiler::Compile.
Passing CompilerOptions with produce_data_to_cache = true doesn't do anything
yet; the only way to generate the data to cache is the old preparsing API. (To
be fixed in the next version.)
BUG=
R=dcarney@chromium.org
Review URL: https://codereview.chromium.org/186723005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@19881 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
All methods for accessing collected profiles by index are deprecated. The indexed storage may well be implemented by the embedder should he need it. CpuProfiler's responsibility is just to create CpuProfile object that contains all collected data and whose lifetime can be managed by the embedder.
BUG=chromium:327298
LOG=Y
R=svenpanne@chromium.org
Review URL: https://codereview.chromium.org/117353002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18337 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Made operator* return reference to the raw type, not pointer. New method 'get()' should be used when raw pointer is needed.
Also removed useless inline modifier from the SmaprtPointer methods and added const modifier to the methods that don't change smart pointer.
Made ~SmartPointerBase protected to avoid accidental calls of the non-virtual base class's destructor.
drive-by: fixed use after free in src/factory.cc
BUG=None
LOG=N
R=alph@chromium.org, svenpanne@chromium.org
Review URL: https://codereview.chromium.org/101763003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18275 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Profiler is now started from JavaScript. Since we always capture stack trace when starting profiler there should always be at least one expected sample in the profile.
Also changed ProfilerEventsProcessor::AddCurrentStack to make sure it call TickSample::Init to instead of custom initialization code.
BUG=v8:2920
R=jkummerow@chromium.org
Review URL: https://codereview.chromium.org/25686011
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@17140 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
These classes are meant to replace OS::Ticks() and OS::TimeCurrentMillis(),
which are broken in several ways. The ElapsedTimer class implements a
stopwatch using TimeTicks::HighResNow() for high resolution, monotonic
timing.
Also fix the CpuProfile::GetStartTime() and CpuProfile::GetEndTime()
methods to actually return the time relative to the unix epoch as stated
in the documentation (previously that was relative to some arbitrary
point in time, i.e. boot time).
The previous Windows issues have been resolved, and we now use GetTickCount64()
on Windows Vista and later, falling back to timeGetTime() with rollover
protection for earlier Windows versions.
BUG=v8:2853
R=machenbach@chromium.org, yurys@chromium.org
Review URL: https://codereview.chromium.org/23490015
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16413 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
These classes are meant to replace OS::Ticks() and OS::TimeCurrentMillis(),
which are broken in several ways. The ElapsedTimer class implements a
stopwatch using TimeTicks::HighResNow() for high resolution, monotonic
timing.
Also fix the CpuProfile::GetStartTime() and CpuProfile::GetEndTime()
methods to actually return the time relative to the unix epoch as stated
in the documentation (previously that was relative to some arbitrary
point in time, i.e. boot time).
BUG=v8:2853
R=machenbach@chromium.org
Review URL: https://codereview.chromium.org/23469013
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16398 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
These classes are meant to replace OS::Ticks() and OS::TimeCurrentMillis(),
which are broken in several ways. The ElapsedTimer class implements a
stopwatch using TimeTicks::HighResNow() for high resolution, monotonic
timing.
Also fix the CpuProfile::GetStartTime() and CpuProfile::GetEndTime()
methods to actually return the time relative to the unix epoch as stated
in the documentation (previously that was relative to some arbitrary
point in time, i.e. boot time).
BUG=v8:2853
R=machenbach@chromium.org, yurys@chromium.org
Review URL: https://codereview.chromium.org/23295034
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16388 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
New flag is added that allows to specify CPU profiler sampling rate in microseconds as command line argument. It was tested to work fine with 100us interval(currently it is 1ms). Default values are kept the same as in the current implementation. The new implementation is enabled only on POSIX platforms which use signals to collect samples. Other platforms that pause thread being sampled are to follow.
SIGPROF signals are now sent on the profiler event processor thread to make sure that the processing thread does fall far behind the sampling.
The patch is based on the previous one that was rolled out in r13851. The main difference is that the circular queue is not modified for now.
On Linux sampling for CPU profiler is initiated on the profiler event processor thread, other platforms to follow.
CPU profiler continues to use SamplingCircularQueue, we will probably replace it with a single sample buffer when Mac and Win ports support profiling on the event processing thread.
When --prof option is specified profiling is initiated either on the profiler event processor thread if CPU profiler is on or on the SignalSender thread as it used to be if no CPU profiles are being collected.
ProfilerEventsProcessor::ProcessEventsAndDoSample now waits in a tight loop, processing collected samples until sampling interval expires. To save CPU resources I'm planning to change that to use nanosleep as only one sample is expected in the queue at any point.
BUG=v8:2814
R=bmeurer@chromium.org
Review URL: https://codereview.chromium.org/21101002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16310 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
All the tests that started crashing create ProfilerEventsProcessor on the stack. After r16284 SamplingCircularQueue buffer is allocated as a field of the queue instead of separate heap object. This increased self size of ProfilerEventsProcessor by about 1Mb. Windows malloc fails to allocate such an object on the stack and crashes.
BUG=
R=jkummerow@chromium.org
Review URL: https://codereview.chromium.org/23093022
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16287 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
All of these values are derived from the self samples count and there is no need to evaluate them in v8 when clients can do that when needed on their side.
Also added unsigned GetHitCount() which should be used instead of double GetSelfSamplesCount(). I'm going to deprecate the latter one once Blink has switched to GetHitCount.
BUG=267595
TBR=svenpanne@chromium.org
Review URL: https://codereview.chromium.org/22710006
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16119 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The start and end time are now measured in microseconds and the type is int64_t.
This way it seems more natural as we are going to support submilisecond sampling
rate soon. Also it fixes cctest/test-cpu-profiler/ProfileStartEndTime test
failure caused by comparison between long double and double.
TEST=cctest/test-cpu-profiler/ProfileStartEndTime
BUG=v8:2824
R=bmeurer@chromium.org
Review URL: https://codereview.chromium.org/22155003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16067 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The start and end time are now measured in microseconds and the type is int64_t. This way it seems more natural as we are going to support submilisecond sampling rate soon. Also it fixes cctest/test-cpu-profiler/ProfileStartEndTime test failure caused by comparison between long double and double.
TEST=cctest/test-cpu-profiler/ProfileStartEndTime
BUG=v8:2824
R=alph@chromium.org, bmeurer@chromium.org
Review URL: https://codereview.chromium.org/22172002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16049 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
I'm going to change CPU profiler API and deprecate GetSelfTime, GetTotalTime and GetTotalSamplesCount on CpuProfileNode as all of those values are derived from self samples count and sampling rate. The sampling rate in turn is calculate based on the profiling duration so having start/end time and total sample count is enough for calculating smpling rate.
BUG=267595
R=alph@chromium.org, bmeurer@chromium.org
Review URL: https://codereview.chromium.org/21918002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16039 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Sampling rate is now calculated as total number of samples divided by profiling time in ms. Before the patch the sampling rate was updated once per 100ms which doesn't have any obvious advantage over the simpler method.
Also we are going to get rid of the profile node self and total time calculation in the v8 CPU profiler and only expose profiling start/end time for CpuProfile and number of ticks on each ProfileNode and let clients do all the math should they need it.
BUG=None
R=bmeurer@chromium.org, loislo@chromium.org
Review URL: https://codereview.chromium.org/21105003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15944 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The SafeStackFrameIterator used by CPU profiler checked if Isolate::c_entry_fp is null and if it is not it would think that the control flow currently is in some native code. This assumption is wrong because the native code could have called a JS function but JSEntryStub would not reset c_entry_fp to NULL in that case. This CL adds a check in SafeStackFrameIterator::IsValidTop for the case when there is a JAVA_SCRIPT frame on top of EXIT frame.
Also this CL changes ExternalCallbackScope behavior to provide access to the whole stack of the scope objects instead of only top one. This allowed to provide exact callback names for those EXIT frames where external callbacks are called. Without this change it was possible only for the top most native call.
BUG=None
R=loislo@chromium.org, yangguo@chromium.org
Review URL: https://codereview.chromium.org/19775017
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15832 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
1) report line number even if a script has no resource_name (evals);
a) do that for already compiled functions in log.cc;
b) do that for fresh evals in compiler.cc;
2) Implement the test for LineNumbers and make it fast and stable, otherwise we have to wait for tick samples;
a) move processor_->Join() call into new Processor::StopSynchronously method;
b) Process all the CodeEvents even if we are stopping Processor thread;
c) make getters for generator and processor;
3) Fix the test for Jit that didn't expect line numbers;
4) Minor refactoring:
a) in ProcessTicks;
b) rename enqueue_order_ to last_code_event_id_ for better readability;
c) rename dequeue_order_ to last_processed_code_event_id_ and make it a member for better readability;
BUG=
TEST=test-profile-generator/LineNumber
R=jkummerow@chromium.org, yurys@chromium.org
Review URL: https://codereview.chromium.org/18058008
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15530 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
When pc is inside FunctionApply builtin function the top frame may be either
2) Internal stack frame created by FunctionApply itself.
In this case we know its caller's pc and can correctly resolve calling function.
1) Frame of the calling JavaScript function that invoked .apply(). In this case we have no practical reliable way to find out the caller's pc so we mark the caller's frame as 'unresolved'.
All this logic is implemented in ProfileGenerator. SafeStackFrameIterator is extended to provide type of the current top stack frame (iteration actually starts from the caller's frame as we know top function from pc).
BUG=252097
R=jkummerow@chromium.org, loislo@chromium.org
Review URL: https://codereview.chromium.org/18269003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15468 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
When current function is FunctionCall builtin we have no reliable way to determine its caller function (in many cases the top of the sampled stack contains address of the caller but sometimes it does not). Instead of dropping the sample or its two top frames we simply mark the caller frame as '(unresolved function)'. It seems like a better approach that dropping whole sample as knowing the top function and the rest of the stack the user should be able to figure out what the caller was.
This change adds builtin id to CodeEntry objects. It will be used later to add similar top frame analysis for FunctionApply and probably other builtins.
BUG=None
TBR=loislo@chromium.org
Review URL: https://codereview.chromium.org/18422003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15436 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
When current function is FunctionCall builtin we have no reliable way to determine its caller function (in many cases the top of the sampled stack contains address of the caller but sometimes it does not). Instead of dropping the sample or its two top frames we simply mark the caller frame as '(unresolved function)'. It seems like a better approach that dropping whole sample as knowing the top function and the rest of the stack the user should be able to figure out what the caller was.
This change adds builtin id to CodeEntry objects. It will be used later to add similar top frame analysis for FunctionApply and probably other builtins.
BUG=None
R=jkummerow@chromium.org, loislo@chromium.org
Review URL: https://codereview.chromium.org/18316004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15426 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The only way to get v8::CpuProfiler instance in the V8 public API is to call v8::Iolate::GetCpuProfiler(). The method will return NULL if the isolate has not been initialized yet or has been torn down already. It is the client's reponsibility to make sure that CPU profiling has been stopped before disposing of the isolate.
This CL adds a test for this and several ASSRTS enforcing that assumptions. This allowed to be sure that heap is always setup when CPU profiling is being started. Based on that the number of places where already compiled functions are reported to the profiler event processor boils down to the single place (CpuProfiler::StartProcessorIfNotStarted). I'm going to rely on this assumption in further changes.
BUG=None
R=loislo@chromium.org, yangguo@chromium.org
Review URL: https://codereview.chromium.org/18336002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15415 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The bodies of methods in ProfilerEventProcessor were moved into CpuProfiler.
Multiple NewCodeEntry methods in CpuProfilesCollection were replaced with one which
simply passes arguments to the CodeEntry constructor.
And CpuProfiler just calls this method when it needs a CodeEntry object.
This NewCodeEntry method is required because CpuProfilesCollection keeps ownership of CodeEntry objects.
BUG=255392
TEST=existing tests
R=yangguo@chromium.org, yurys@chromium.org
Review URL: https://codereview.chromium.org/18053004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15405 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
CPU profiler doesn't use stack handlers so there is no need to iterate through them while traversing stack. This change SafeStackFrameIterator always iterate only frames and removes checks corresponding to the handlers iteration.
The problem described in the bug occurred because of a false assumption in SafeStackFrameIterator that if Isolate::c_entry_fp is not NULL then the top frame on the stack is always a C++ frame. It is false because we may have entered JS code again, in which case JS_ENTRY code stub generated by JSEntryStub::GenerateBody() will save current c_entry_fp value but not reset it to NULL and after that it will create ENTRY stack frame and JS_ENTRY handler on the stack and put the latter into Isolate::handler(top). This means that if we start iterating from c_entry_fp frame and try to compare the frame's sp with Isolate::handler()->address() it will turn out that frame->sp() > handler->address() and the condition in SafeStackFrameIterator::CanIterateHandles is not held.
BUG=252097
R=loislo@chromium.org, svenpanne@chromium.org
Review URL: https://codereview.chromium.org/17589022
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15348 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This change modifies code produced by BaseLoadStubCompiler::GenerateLoadCallback so that instead of calling AccessorGetter direcly it calls InvokeAccessorGetter which changes VM state and calls the actual callback. This way CPU profiler knows which external callback is being executed in this case. Indirect call happens only if CpuProfiler::is_profiling() is true.
This is exactly same change as r15116 with a build fix for test-api.cc
BUG=244580
TBR=danno@chromium.org
Review URL: https://codereview.chromium.org/16858013
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15135 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This reverts commit f323d984a73bab345c4eab5c1907552ccfa7ccaa.
Broke compilation on the bots with an error that doesn't occur locally:
CXX(target) /mnt/data/b/build/slave/v8-linux-debug/build/v8/out/Debug/obj.target/cctest/test/cctest/test-bignum-dtoa.o
../test/cctest/test-api.cc: In function ‘void FastReturnValueCallback(const v8::FunctionCallbackInfo<v8::Value>&) [with T = int]’:
../test/cctest/test-api.cc:1129: error: insufficient contextual information to determine type
../test/cctest/test-api.cc: In function ‘void FastReturnValueCallback(const v8::FunctionCallbackInfo<v8::Value>&) [with T = unsigned int]’:
../test/cctest/test-api.cc:1136: error: insufficient contextual information to determine type
../test/cctest/test-api.cc: In function ‘void FastReturnValueCallback(const v8::FunctionCallbackInfo<v8::Value>&) [with T = double]’:
../test/cctest/test-api.cc:1143: error: insufficient contextual information to determine type
../test/cctest/test-api.cc: In function ‘void FastReturnValueCallback(const v8::FunctionCallbackInfo<v8::Value>&) [with T = bool]’:
../test/cctest/test-api.cc:1150: error: insufficient contextual information to determine type
../test/cctest/test-api.cc: In function ‘void FastReturnValueCallback(const v8::FunctionCallbackInfo<v8::Value>&) [with T = void]’:
../test/cctest/test-api.cc:1157: error: insufficient contextual information to determine type
CXX(target) /mnt/data/b/build/slave/v8-linux-debug/build/v8/out/Debug/obj.target/cctest/test/cctest/test-circular-queue.o
BUG=None
TBR=svenpanne@chromium.org
Review URL: https://codereview.chromium.org/16838013
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15117 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This change fixes the case when the accessors are invoked from JSObject::{Get,Set}PropertyWithCallback.
It already works for inlined calls generated by StoreStubCompiler::CompileStoreCallback. The same still needs to be fixed for getter invocations generated by BaseLoadStubCompiler::CompileLoadCallback, corresponding case is commented out in the new test.
This is a slightly modified version of r14915 which was rolled back due to test timeout on Windows. Compared to r14915 the new tests use OS::TimeCurrentMillis instead of OS::Ticks as OS::Ticks has ms precision on Windows and trying to wait 10 ticks (us) will result in at least 1 ms pause.
BUG=244580
R=jkummerow@chromium.org, loislo@chromium.org
Review URL: https://codereview.chromium.org/15995017
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@14932 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This change fixes the case when the accessors are invoked from JSObject::{Get,Set}PropertyWithCallback.
It already works for inlined calls generated by StoreStubCompiler::CompileStoreCallback. The same still needs to be fixed for getter invocations generated by BaseLoadStubCompiler::CompileLoadCallback, corresponding case is commented out in the new test.
BUG=244580
R=jkummerow@chromium.org, loislo@chromium.org
Review URL: https://codereview.chromium.org/16004007
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@14915 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Stack iterator takes return address based on the frame pointer (ebp) and detects JS frames based on value at fp + StandardFrameConstants::kMarkerOffset. So in order the iterator to work correctly this values should be already setup for the current function. Stack frame is constructed at the very beginning of JS function code and destroyed before return. If sample is taken before before the frame construction is completed or after it was destroyed the stack iterator will wrongly think that FP points at the current functions frame base and will skip callers frame. To avoid this we mark code ranges where stack frame doesn't exist and completely ignore such samples.
This fixes cctest/test-cpu-profiler/CollectCpuProfile flakiness.
BUG=v8:2628
R=jkummerow@chromium.org
Review URL: https://codereview.chromium.org/14253015
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@14670 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The new test checks full CPU profiling cycle: using public
V8 API it starts profiling, executes a script, stops profiling
and analyzes collected profile to check that its top-down
tree has expected strutcture. The script that is being profiled
is guaranteed to run > 200ms to make sure enough samples
are collected.
To avoid possible flakiness due to non-deterministic time required
to start new thread on varios OSs when Sampler and ProfilerEventsProcessor
threads are being started the main thread is blocked until the threads
are running.
Also I removed the heuristic in profile-generator.cc where we try
to figure out if the value on top of the sampled stack is return address
of some frameless stub invocation. The code periodically gives false positive
with the new test ending up in an extra node in the collected cpu profile.
After discussion with jkummerow@ we concluded that the logic is too fragile
and that we can address frameless stub invocations in a more reliable way
later should they have a noticeable effect on cpu profiling.
BUG=None
Review URL: https://codereview.chromium.org/13627002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@14205 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
CPU profiler API is extended with methods that allow to retrieve individual samples from profile. Each sample is presented as a pointer to a node in the top-down profile tree. The samples will let us tie JS performance to time.
BUG=None
Review URL: https://codereview.chromium.org/12919002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13980 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
I tried to limit the use of v8::Isolate::GetCurrent() and v8::internal::Isolate::Current() as much as possible, but sometimes this would have involved restructuring tests quite a bit, which is better left for a separate CL.
BUG=v8:2487
Review URL: https://codereview.chromium.org/12716010
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13953 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The patch is based on the previous one that was rolled out: https://code.google.com/p/v8/source/detail?r=12985
On Linux sampling for CPU profiler is initiated on the profiler event processor thread, other platforms to follow.
CPU profiler continues to use SamplingCircularQueue, we will replave it with a single sample buffer when Mac and Win ports support profiling on the event processing thread.
When --prof option is specified profiling is initiated either on the profiler event processor thread if CPU profiler is on or on the SignalSender thread as it used to if no CPU profiles are being collected.
ProfilerEventsProcessor::ProcessEventsAndDoSample now waits in a tight loop, processing collected samples until sampling interval expires. To save CPU resources I'm planning to change that to use nanosleep as only one sample is expected in the queue at any point.
BUG=v8:2364
Review URL: https://codereview.chromium.org/12321046
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13735 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- perform CPU profiler sampling in the sampler thread as we used to;
- skip sampling in the sampling thread if processing thread is running;
- only install SIGPROF handler when CPU profiling is enabled.
BUG=v8:2364
Review URL: https://codereview.chromium.org/11231002
Patch from Sergey Rogulenko <rogulenko@google.com> and Andrey Kosyakov <caseq@chromium.org>.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@12985 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The preprocessor defines ENABLE_LOGGING_AND_PROFILING and ENABLE_VMSTATE_TRACKING has been removed as these where required to be turned on for Crankshaft to work. To re-enable reducing the binary size by leaving out heap and CPU profiler a new set of defines needs to be created.
R=ager@chromium.org
BUG=v8:1271
TEST=all
Review URL: http://codereview.chromium.org//7350014
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@8622 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Thread class was receiving an isolate parameter by default.
This approact violates the assumption that only VM threads
can have an associated isolate, and can lead to troubles,
because accessing the same isolate from different threads
leads to race conditions.
This was found by investigating mysterious failures of the
CPU profiler layout test on Linux Chromium. As almost all
threads were associated with some isolate, the sampler was
trying to sample them.
As a side effect, we have also fixed the DebuggerAgent test.
Thanks to Vitaly for help in fixing isolates handling!
R=vitalyr@chromium.org
BUG=none
TEST=none
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@8259 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The main issue was due to multiple recompilations of functions. Now
code objects are grouped by function using SFI object address.
JSFunction objects are no longer tracked, instead we track SFI object
moves. To pick a correct code version, we now sample return addresses
instead of JSFunction addresses.
tools/{linux|mac|windows}-tickprocessor scripts differentiate
between code optimization states for the same function
(using * and ~ prefixes introduced earlier).
DevTools CPU profiler treats all variants of function code as
a single function.
ll_prof treats each optimized variant as a separate entry, because
it can disassemble each one of them.
tickprocessor.py not updated -- it is deprecated and will be removed.
BUG=v8/1087,b/3178160
TEST=all existing tests pass, including Chromium layout tests
Review URL: http://codereview.chromium.org/6551011
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@6902 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
As several pages can run in a single V8 instance, it is possible to
have functions from different security contexts intermixed in a single
CPU profile. To avoid exposing function names from one page to
another, filtering is introduced.
The basic idea is that instead of capturing return addresses from
stack, we're now capturing JSFunction addresses (as we anyway work
only with JS stack frames.) Each JSFunction can reach out for
context's security token. When providing a profile to a page, the
profile is filtered using the security token of caller page. Any
functions with different security tokens are filtered out (yes, we
only do fast path check for now) and their ticks are attributed to
their parents.
Review URL: http://codereview.chromium.org/2083005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4673 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The simple formula "ms = ticks * sampler_interval" doesn't work,
because e.g. on Linux, the actual sampling rate can be 5 times
lower than the one set up in the code. To calculate actual sampling
rate, current time is periodically queried and processed along with
actual sampling ticks count.
Review URL: http://codereview.chromium.org/1539038
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4427 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This is to make possible enabling usage of the new profiling subsystem
in Chromium without much hassle. The idea is pretty simple: unless the
new profiling API is used, all works as usual, as soon as Chromium
starts to use the new API, it will work too.
Review URL: http://codereview.chromium.org/1635005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4382 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
In browser (DevTools) mode, only non-native JS code and callbacks are reported.
Also, added "(garbage collector)" entry which accumulates samples count in GC state.
Trying to display "(compiler)" and "(external)" only brings confusion,
because it ends up in displaying scripts code under "(compiler)" node, and DOM
event handlers under "(external)" node, which looks weird.
Review URL: http://codereview.chromium.org/1523015
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4357 ce2b1a6d-e550-0410-aec6-3dcde31c8c00