v8/src/profiler/profile-generator.h

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

517 lines
16 KiB
C
Raw Normal View History

// Copyright 2011 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
#ifndef V8_PROFILER_PROFILE_GENERATOR_H_
#define V8_PROFILER_PROFILE_GENERATOR_H_
#include <atomic>
#include <deque>
#include <limits>
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
#include <map>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>
#include "include/v8-profiler.h"
#include "src/base/platform/time.h"
#include "src/builtins/builtins.h"
#include "src/codegen/source-position.h"
#include "src/logging/code-events.h"
#include "src/profiler/strings-storage.h"
#include "src/utils/allocation.h"
namespace v8 {
namespace internal {
struct TickSample;
// Provides a mapping from the offsets within generated code or a bytecode array
[cpu-profiler] Reduce the size of inlining information Previously we stored the source position table, which stored a mapping of pc offsets to line numbers, and the inline_locations, which stored a mapping of pc offsets to stacks of {CodeEntry, line_number} pairs. This was slightly wasteful because we had two different tables which were both keyed on the pc offset and contained some overlapping information. This CL combines the two tables in a way. The source position table now maps a pc offset to a pair of {line_number, inlining_id}. If the inlining_id is valid, then it can be used to look up the inlining stack which is stored in inline_locations, but is now keyed by inlining_id rather than pc offset. This also has the nice effect of de-duplicating inline stacks which we previously duplicated. The new structure is similar to how this data is stored by the compiler, except that we convert 'source positions' (char offset in a file) into line numbers as we go, because we only care about attributing ticks to a given line. Also remove the helper RecordInliningInfo() as this is only actually used to add inline stacks by one caller (where it is now inlined). The other callers would always bail out or are only called from test-cpu-profiler. Remove AddInlineStack and replace it with SetInlineStacks which adds all of the stacks at once. We need to do it this way because the source pos table is passed into the constructor of CodeEntry, so we need to create it before the CodeEntry, but the inline stacks are not (they are part of rare_data which is not always present), so we need to add them after construction. Given that we calculate both the source pos table and the inline stacks before construction, it's just easier to add them all at once. Also add a print() method to CodeEntry to make future debugging easier as I'm constantly rewriting this locally. Bug: v8:8575, v8:7719, v8:7203 Change-Id: I39324d6ea13d116d5da5d0a0d243cae76a749c79 Reviewed-on: https://chromium-review.googlesource.com/c/1392195 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Cr-Commit-Position: refs/heads/master@{#58554}
2019-01-04 11:57:50 +00:00
// to the source line and inlining id.
class V8_EXPORT_PRIVATE SourcePositionTable : public Malloced {
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
public:
SourcePositionTable() = default;
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
[cpu-profiler] Reduce the size of inlining information Previously we stored the source position table, which stored a mapping of pc offsets to line numbers, and the inline_locations, which stored a mapping of pc offsets to stacks of {CodeEntry, line_number} pairs. This was slightly wasteful because we had two different tables which were both keyed on the pc offset and contained some overlapping information. This CL combines the two tables in a way. The source position table now maps a pc offset to a pair of {line_number, inlining_id}. If the inlining_id is valid, then it can be used to look up the inlining stack which is stored in inline_locations, but is now keyed by inlining_id rather than pc offset. This also has the nice effect of de-duplicating inline stacks which we previously duplicated. The new structure is similar to how this data is stored by the compiler, except that we convert 'source positions' (char offset in a file) into line numbers as we go, because we only care about attributing ticks to a given line. Also remove the helper RecordInliningInfo() as this is only actually used to add inline stacks by one caller (where it is now inlined). The other callers would always bail out or are only called from test-cpu-profiler. Remove AddInlineStack and replace it with SetInlineStacks which adds all of the stacks at once. We need to do it this way because the source pos table is passed into the constructor of CodeEntry, so we need to create it before the CodeEntry, but the inline stacks are not (they are part of rare_data which is not always present), so we need to add them after construction. Given that we calculate both the source pos table and the inline stacks before construction, it's just easier to add them all at once. Also add a print() method to CodeEntry to make future debugging easier as I'm constantly rewriting this locally. Bug: v8:8575, v8:7719, v8:7203 Change-Id: I39324d6ea13d116d5da5d0a0d243cae76a749c79 Reviewed-on: https://chromium-review.googlesource.com/c/1392195 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Cr-Commit-Position: refs/heads/master@{#58554}
2019-01-04 11:57:50 +00:00
void SetPosition(int pc_offset, int line, int inlining_id);
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
int GetSourceLineNumber(int pc_offset) const;
[cpu-profiler] Reduce the size of inlining information Previously we stored the source position table, which stored a mapping of pc offsets to line numbers, and the inline_locations, which stored a mapping of pc offsets to stacks of {CodeEntry, line_number} pairs. This was slightly wasteful because we had two different tables which were both keyed on the pc offset and contained some overlapping information. This CL combines the two tables in a way. The source position table now maps a pc offset to a pair of {line_number, inlining_id}. If the inlining_id is valid, then it can be used to look up the inlining stack which is stored in inline_locations, but is now keyed by inlining_id rather than pc offset. This also has the nice effect of de-duplicating inline stacks which we previously duplicated. The new structure is similar to how this data is stored by the compiler, except that we convert 'source positions' (char offset in a file) into line numbers as we go, because we only care about attributing ticks to a given line. Also remove the helper RecordInliningInfo() as this is only actually used to add inline stacks by one caller (where it is now inlined). The other callers would always bail out or are only called from test-cpu-profiler. Remove AddInlineStack and replace it with SetInlineStacks which adds all of the stacks at once. We need to do it this way because the source pos table is passed into the constructor of CodeEntry, so we need to create it before the CodeEntry, but the inline stacks are not (they are part of rare_data which is not always present), so we need to add them after construction. Given that we calculate both the source pos table and the inline stacks before construction, it's just easier to add them all at once. Also add a print() method to CodeEntry to make future debugging easier as I'm constantly rewriting this locally. Bug: v8:8575, v8:7719, v8:7203 Change-Id: I39324d6ea13d116d5da5d0a0d243cae76a749c79 Reviewed-on: https://chromium-review.googlesource.com/c/1392195 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Cr-Commit-Position: refs/heads/master@{#58554}
2019-01-04 11:57:50 +00:00
int GetInliningId(int pc_offset) const;
void print() const;
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
private:
[cpu-profiler] Reduce the size of inlining information Previously we stored the source position table, which stored a mapping of pc offsets to line numbers, and the inline_locations, which stored a mapping of pc offsets to stacks of {CodeEntry, line_number} pairs. This was slightly wasteful because we had two different tables which were both keyed on the pc offset and contained some overlapping information. This CL combines the two tables in a way. The source position table now maps a pc offset to a pair of {line_number, inlining_id}. If the inlining_id is valid, then it can be used to look up the inlining stack which is stored in inline_locations, but is now keyed by inlining_id rather than pc offset. This also has the nice effect of de-duplicating inline stacks which we previously duplicated. The new structure is similar to how this data is stored by the compiler, except that we convert 'source positions' (char offset in a file) into line numbers as we go, because we only care about attributing ticks to a given line. Also remove the helper RecordInliningInfo() as this is only actually used to add inline stacks by one caller (where it is now inlined). The other callers would always bail out or are only called from test-cpu-profiler. Remove AddInlineStack and replace it with SetInlineStacks which adds all of the stacks at once. We need to do it this way because the source pos table is passed into the constructor of CodeEntry, so we need to create it before the CodeEntry, but the inline stacks are not (they are part of rare_data which is not always present), so we need to add them after construction. Given that we calculate both the source pos table and the inline stacks before construction, it's just easier to add them all at once. Also add a print() method to CodeEntry to make future debugging easier as I'm constantly rewriting this locally. Bug: v8:8575, v8:7719, v8:7203 Change-Id: I39324d6ea13d116d5da5d0a0d243cae76a749c79 Reviewed-on: https://chromium-review.googlesource.com/c/1392195 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Cr-Commit-Position: refs/heads/master@{#58554}
2019-01-04 11:57:50 +00:00
struct SourcePositionTuple {
bool operator<(const SourcePositionTuple& other) const {
return pc_offset < other.pc_offset;
}
int pc_offset;
int line_number;
[cpu-profiler] Reduce the size of inlining information Previously we stored the source position table, which stored a mapping of pc offsets to line numbers, and the inline_locations, which stored a mapping of pc offsets to stacks of {CodeEntry, line_number} pairs. This was slightly wasteful because we had two different tables which were both keyed on the pc offset and contained some overlapping information. This CL combines the two tables in a way. The source position table now maps a pc offset to a pair of {line_number, inlining_id}. If the inlining_id is valid, then it can be used to look up the inlining stack which is stored in inline_locations, but is now keyed by inlining_id rather than pc offset. This also has the nice effect of de-duplicating inline stacks which we previously duplicated. The new structure is similar to how this data is stored by the compiler, except that we convert 'source positions' (char offset in a file) into line numbers as we go, because we only care about attributing ticks to a given line. Also remove the helper RecordInliningInfo() as this is only actually used to add inline stacks by one caller (where it is now inlined). The other callers would always bail out or are only called from test-cpu-profiler. Remove AddInlineStack and replace it with SetInlineStacks which adds all of the stacks at once. We need to do it this way because the source pos table is passed into the constructor of CodeEntry, so we need to create it before the CodeEntry, but the inline stacks are not (they are part of rare_data which is not always present), so we need to add them after construction. Given that we calculate both the source pos table and the inline stacks before construction, it's just easier to add them all at once. Also add a print() method to CodeEntry to make future debugging easier as I'm constantly rewriting this locally. Bug: v8:8575, v8:7719, v8:7203 Change-Id: I39324d6ea13d116d5da5d0a0d243cae76a749c79 Reviewed-on: https://chromium-review.googlesource.com/c/1392195 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Cr-Commit-Position: refs/heads/master@{#58554}
2019-01-04 11:57:50 +00:00
int inlining_id;
};
[cpu-profiler] Reduce the size of inlining information Previously we stored the source position table, which stored a mapping of pc offsets to line numbers, and the inline_locations, which stored a mapping of pc offsets to stacks of {CodeEntry, line_number} pairs. This was slightly wasteful because we had two different tables which were both keyed on the pc offset and contained some overlapping information. This CL combines the two tables in a way. The source position table now maps a pc offset to a pair of {line_number, inlining_id}. If the inlining_id is valid, then it can be used to look up the inlining stack which is stored in inline_locations, but is now keyed by inlining_id rather than pc offset. This also has the nice effect of de-duplicating inline stacks which we previously duplicated. The new structure is similar to how this data is stored by the compiler, except that we convert 'source positions' (char offset in a file) into line numbers as we go, because we only care about attributing ticks to a given line. Also remove the helper RecordInliningInfo() as this is only actually used to add inline stacks by one caller (where it is now inlined). The other callers would always bail out or are only called from test-cpu-profiler. Remove AddInlineStack and replace it with SetInlineStacks which adds all of the stacks at once. We need to do it this way because the source pos table is passed into the constructor of CodeEntry, so we need to create it before the CodeEntry, but the inline stacks are not (they are part of rare_data which is not always present), so we need to add them after construction. Given that we calculate both the source pos table and the inline stacks before construction, it's just easier to add them all at once. Also add a print() method to CodeEntry to make future debugging easier as I'm constantly rewriting this locally. Bug: v8:8575, v8:7719, v8:7203 Change-Id: I39324d6ea13d116d5da5d0a0d243cae76a749c79 Reviewed-on: https://chromium-review.googlesource.com/c/1392195 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Cr-Commit-Position: refs/heads/master@{#58554}
2019-01-04 11:57:50 +00:00
// This is logically a map, but we store it as a vector of tuples, sorted by
// the pc offset, so that we can save space and look up items using binary
// search.
[cpu-profiler] Reduce the size of inlining information Previously we stored the source position table, which stored a mapping of pc offsets to line numbers, and the inline_locations, which stored a mapping of pc offsets to stacks of {CodeEntry, line_number} pairs. This was slightly wasteful because we had two different tables which were both keyed on the pc offset and contained some overlapping information. This CL combines the two tables in a way. The source position table now maps a pc offset to a pair of {line_number, inlining_id}. If the inlining_id is valid, then it can be used to look up the inlining stack which is stored in inline_locations, but is now keyed by inlining_id rather than pc offset. This also has the nice effect of de-duplicating inline stacks which we previously duplicated. The new structure is similar to how this data is stored by the compiler, except that we convert 'source positions' (char offset in a file) into line numbers as we go, because we only care about attributing ticks to a given line. Also remove the helper RecordInliningInfo() as this is only actually used to add inline stacks by one caller (where it is now inlined). The other callers would always bail out or are only called from test-cpu-profiler. Remove AddInlineStack and replace it with SetInlineStacks which adds all of the stacks at once. We need to do it this way because the source pos table is passed into the constructor of CodeEntry, so we need to create it before the CodeEntry, but the inline stacks are not (they are part of rare_data which is not always present), so we need to add them after construction. Given that we calculate both the source pos table and the inline stacks before construction, it's just easier to add them all at once. Also add a print() method to CodeEntry to make future debugging easier as I'm constantly rewriting this locally. Bug: v8:8575, v8:7719, v8:7203 Change-Id: I39324d6ea13d116d5da5d0a0d243cae76a749c79 Reviewed-on: https://chromium-review.googlesource.com/c/1392195 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Cr-Commit-Position: refs/heads/master@{#58554}
2019-01-04 11:57:50 +00:00
std::vector<SourcePositionTuple> pc_offsets_to_lines_;
DISALLOW_COPY_AND_ASSIGN(SourcePositionTable);
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
};
[cpu-profiler] De-duplicate CodeEntry objects for inline stacks Within an inline stack we would have multiple copies of the exact same CodeEntry object to represent an inline frame. We had one copy for every time that the frame appeared in an inline stack. One CodeEntry can have multiple inline stacks and each stack can have multiple inline frames. In the common case, the stacks overlap and repeat frames. This CL creates a single CodeEntry object to represent each inlined function as an inline frame (for a given CodeEntry with inlinings). This removes most of the duplication of inline CodeEntry objects. We still have some duplication, e.g. if we inline bar() into foo() and foo2() but they are not themselves inlined into anything, then we will have two inline CodeEntry objects for bar(). Removing all duplication is harder to achieve because the lifetime of the inlined frame CodeEntry is now no longer tied to the inliner. Get rid of the InlineEntry struct as it is now indentical to CodeEntryAndLineNumber. We store the list of canonical inline CodeEntry objects on the CodeObject of the inlining function so that it can own the lifetimes of inlined frames. Also rename inline_locations_ to inline_stacks_ to be clearer. Bug: v8:7719 Change-Id: Ied765b4cce7fd33f3290798331f1e6767cc42e8c Reviewed-on: https://chromium-review.googlesource.com/c/1396086 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Reviewed-by: Alexei Filippov <alph@chromium.org> Cr-Commit-Position: refs/heads/master@{#58639}
2019-01-08 14:16:03 +00:00
struct CodeEntryAndLineNumber;
[cpu-profiler] Add source positions for inlined function calls Currently in both kCallerLineNumbers and kLeafNodeLineNumbers modes, we correctly capture inline stacks. In leaf number mode, this is simple as we simply add the path onto the existing tree. For caller line numbers mode this is more complex, because each path through various inlined function should be represented in the tree, even when there are multiple callsites to the same function inlined. Currently we don't correctly show line numbers for inlined functions. We do actually have this information though, which is generated by turbofan and stored in the source_position_table data structure on the code object. This also changes the behavior of the SourcePositionTable class. A problem we uncovered is that the PC that the sampler provides for every frame except the leaf is the return address of the calling frame. This address is *after* the call has already happened. It can be attributed to the next line of the function, rather than the calling line, which is wrong. We fix that here by using lower_bound in GetSourceLineNumber. The same problem happens in GetInlineStack - the PC of the caller is actually the instruction after the call. The information turbofan generates assumes that the instruction after the call is not part of the call (fair enough). To fix this we do the same thing as above - use lower_bound and then iterate back by one. TBR=alph@chromium.org Bug: v8:8575, v8:8606 Change-Id: Idc4bd4bdc8fb70b70ecc1a77a1e3744a86f83483 Reviewed-on: https://chromium-review.googlesource.com/c/1374290 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Reviewed-by: Yang Guo <yangguo@chromium.org> Cr-Commit-Position: refs/heads/master@{#58545}
2019-01-02 12:19:06 +00:00
class CodeEntry {
public:
// CodeEntry doesn't own name strings, just references them.
inline CodeEntry(CodeEventListener::LogEventsAndTags tag, const char* name,
const char* resource_name = CodeEntry::kEmptyResourceName,
int line_number = v8::CpuProfileNode::kNoLineNumberInfo,
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
int column_number = v8::CpuProfileNode::kNoColumnNumberInfo,
std::unique_ptr<SourcePositionTable> line_info = nullptr,
Address instruction_start = kNullAddress,
bool is_shared_cross_origin = false);
const char* name() const { return name_; }
const char* resource_name() const { return resource_name_; }
int line_number() const { return line_number_; }
int column_number() const { return column_number_; }
const SourcePositionTable* line_info() const { return line_info_.get(); }
int script_id() const { return script_id_; }
void set_script_id(int script_id) { script_id_ = script_id; }
int position() const { return position_; }
void set_position(int position) { position_ = position; }
void set_bailout_reason(const char* bailout_reason) {
EnsureRareData()->bailout_reason_ = bailout_reason;
}
const char* bailout_reason() const {
return rare_data_ ? rare_data_->bailout_reason_ : kEmptyBailoutReason;
}
void set_deopt_info(const char* deopt_reason, int deopt_id,
std::vector<CpuProfileDeoptFrame> inlined_frames);
CpuProfileDeoptInfo GetDeoptInfo();
bool has_deopt_info() const {
return rare_data_ && rare_data_->deopt_id_ != kNoDeoptimizationId;
}
void clear_deopt_info() {
if (!rare_data_) return;
// TODO(alph): Clear rare_data_ if that was the only field in use.
rare_data_->deopt_reason_ = kNoDeoptReason;
rare_data_->deopt_id_ = kNoDeoptimizationId;
}
void mark_used() { bit_field_ = UsedField::update(bit_field_, true); }
bool used() const { return UsedField::decode(bit_field_); }
void FillFunctionInfo(SharedFunctionInfo shared);
void SetBuiltinId(Builtins::Name id);
Builtins::Name builtin_id() const {
return BuiltinIdField::decode(bit_field_);
}
bool is_shared_cross_origin() const {
return SharedCrossOriginField::decode(bit_field_);
}
uint32_t GetHash() const;
bool IsSameFunctionAs(const CodeEntry* entry) const;
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
int GetSourceLine(int pc_offset) const;
[cpu-profiler] De-duplicate CodeEntry objects for inline stacks Within an inline stack we would have multiple copies of the exact same CodeEntry object to represent an inline frame. We had one copy for every time that the frame appeared in an inline stack. One CodeEntry can have multiple inline stacks and each stack can have multiple inline frames. In the common case, the stacks overlap and repeat frames. This CL creates a single CodeEntry object to represent each inlined function as an inline frame (for a given CodeEntry with inlinings). This removes most of the duplication of inline CodeEntry objects. We still have some duplication, e.g. if we inline bar() into foo() and foo2() but they are not themselves inlined into anything, then we will have two inline CodeEntry objects for bar(). Removing all duplication is harder to achieve because the lifetime of the inlined frame CodeEntry is now no longer tied to the inliner. Get rid of the InlineEntry struct as it is now indentical to CodeEntryAndLineNumber. We store the list of canonical inline CodeEntry objects on the CodeObject of the inlining function so that it can own the lifetimes of inlined frames. Also rename inline_locations_ to inline_stacks_ to be clearer. Bug: v8:7719 Change-Id: Ied765b4cce7fd33f3290798331f1e6767cc42e8c Reviewed-on: https://chromium-review.googlesource.com/c/1396086 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Reviewed-by: Alexei Filippov <alph@chromium.org> Cr-Commit-Position: refs/heads/master@{#58639}
2019-01-08 14:16:03 +00:00
struct Equals {
bool operator()(const std::unique_ptr<CodeEntry>& lhs,
const std::unique_ptr<CodeEntry>& rhs) const {
return lhs.get()->IsSameFunctionAs(rhs.get());
}
};
struct Hasher {
std::size_t operator()(const std::unique_ptr<CodeEntry>& e) const {
return e->GetHash();
}
};
[cpu-profiler] Reduce the size of inlining information Previously we stored the source position table, which stored a mapping of pc offsets to line numbers, and the inline_locations, which stored a mapping of pc offsets to stacks of {CodeEntry, line_number} pairs. This was slightly wasteful because we had two different tables which were both keyed on the pc offset and contained some overlapping information. This CL combines the two tables in a way. The source position table now maps a pc offset to a pair of {line_number, inlining_id}. If the inlining_id is valid, then it can be used to look up the inlining stack which is stored in inline_locations, but is now keyed by inlining_id rather than pc offset. This also has the nice effect of de-duplicating inline stacks which we previously duplicated. The new structure is similar to how this data is stored by the compiler, except that we convert 'source positions' (char offset in a file) into line numbers as we go, because we only care about attributing ticks to a given line. Also remove the helper RecordInliningInfo() as this is only actually used to add inline stacks by one caller (where it is now inlined). The other callers would always bail out or are only called from test-cpu-profiler. Remove AddInlineStack and replace it with SetInlineStacks which adds all of the stacks at once. We need to do it this way because the source pos table is passed into the constructor of CodeEntry, so we need to create it before the CodeEntry, but the inline stacks are not (they are part of rare_data which is not always present), so we need to add them after construction. Given that we calculate both the source pos table and the inline stacks before construction, it's just easier to add them all at once. Also add a print() method to CodeEntry to make future debugging easier as I'm constantly rewriting this locally. Bug: v8:8575, v8:7719, v8:7203 Change-Id: I39324d6ea13d116d5da5d0a0d243cae76a749c79 Reviewed-on: https://chromium-review.googlesource.com/c/1392195 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Cr-Commit-Position: refs/heads/master@{#58554}
2019-01-04 11:57:50 +00:00
void SetInlineStacks(
[cpu-profiler] De-duplicate CodeEntry objects for inline stacks Within an inline stack we would have multiple copies of the exact same CodeEntry object to represent an inline frame. We had one copy for every time that the frame appeared in an inline stack. One CodeEntry can have multiple inline stacks and each stack can have multiple inline frames. In the common case, the stacks overlap and repeat frames. This CL creates a single CodeEntry object to represent each inlined function as an inline frame (for a given CodeEntry with inlinings). This removes most of the duplication of inline CodeEntry objects. We still have some duplication, e.g. if we inline bar() into foo() and foo2() but they are not themselves inlined into anything, then we will have two inline CodeEntry objects for bar(). Removing all duplication is harder to achieve because the lifetime of the inlined frame CodeEntry is now no longer tied to the inliner. Get rid of the InlineEntry struct as it is now indentical to CodeEntryAndLineNumber. We store the list of canonical inline CodeEntry objects on the CodeObject of the inlining function so that it can own the lifetimes of inlined frames. Also rename inline_locations_ to inline_stacks_ to be clearer. Bug: v8:7719 Change-Id: Ied765b4cce7fd33f3290798331f1e6767cc42e8c Reviewed-on: https://chromium-review.googlesource.com/c/1396086 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Reviewed-by: Alexei Filippov <alph@chromium.org> Cr-Commit-Position: refs/heads/master@{#58639}
2019-01-08 14:16:03 +00:00
std::unordered_set<std::unique_ptr<CodeEntry>, Hasher, Equals>
inline_entries,
std::unordered_map<int, std::vector<CodeEntryAndLineNumber>>
inline_stacks);
const std::vector<CodeEntryAndLineNumber>* GetInlineStack(
int pc_offset) const;
void set_instruction_start(Address start) { instruction_start_ = start; }
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
Address instruction_start() const { return instruction_start_; }
CodeEventListener::LogEventsAndTags tag() const {
return TagField::decode(bit_field_);
}
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
static const char* const kWasmResourceNamePrefix;
V8_EXPORT_PRIVATE static const char* const kEmptyResourceName;
static const char* const kEmptyBailoutReason;
static const char* const kNoDeoptReason;
V8_EXPORT_PRIVATE static const char* const kProgramEntryName;
V8_EXPORT_PRIVATE static const char* const kIdleEntryName;
static const char* const kGarbageCollectorEntryName;
// Used to represent frames for which we have no reliable way to
// detect function.
V8_EXPORT_PRIVATE static const char* const kUnresolvedFunctionName;
V8_EXPORT_PRIVATE static const char* const kRootEntryName;
V8_INLINE static CodeEntry* program_entry() {
return kProgramEntry.Pointer();
}
V8_INLINE static CodeEntry* idle_entry() { return kIdleEntry.Pointer(); }
V8_INLINE static CodeEntry* gc_entry() { return kGCEntry.Pointer(); }
V8_INLINE static CodeEntry* unresolved_entry() {
return kUnresolvedEntry.Pointer();
}
V8_INLINE static CodeEntry* root_entry() { return kRootEntry.Pointer(); }
[cpu-profiler] Reduce the size of inlining information Previously we stored the source position table, which stored a mapping of pc offsets to line numbers, and the inline_locations, which stored a mapping of pc offsets to stacks of {CodeEntry, line_number} pairs. This was slightly wasteful because we had two different tables which were both keyed on the pc offset and contained some overlapping information. This CL combines the two tables in a way. The source position table now maps a pc offset to a pair of {line_number, inlining_id}. If the inlining_id is valid, then it can be used to look up the inlining stack which is stored in inline_locations, but is now keyed by inlining_id rather than pc offset. This also has the nice effect of de-duplicating inline stacks which we previously duplicated. The new structure is similar to how this data is stored by the compiler, except that we convert 'source positions' (char offset in a file) into line numbers as we go, because we only care about attributing ticks to a given line. Also remove the helper RecordInliningInfo() as this is only actually used to add inline stacks by one caller (where it is now inlined). The other callers would always bail out or are only called from test-cpu-profiler. Remove AddInlineStack and replace it with SetInlineStacks which adds all of the stacks at once. We need to do it this way because the source pos table is passed into the constructor of CodeEntry, so we need to create it before the CodeEntry, but the inline stacks are not (they are part of rare_data which is not always present), so we need to add them after construction. Given that we calculate both the source pos table and the inline stacks before construction, it's just easier to add them all at once. Also add a print() method to CodeEntry to make future debugging easier as I'm constantly rewriting this locally. Bug: v8:8575, v8:7719, v8:7203 Change-Id: I39324d6ea13d116d5da5d0a0d243cae76a749c79 Reviewed-on: https://chromium-review.googlesource.com/c/1392195 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Cr-Commit-Position: refs/heads/master@{#58554}
2019-01-04 11:57:50 +00:00
void print() const;
private:
struct RareData {
const char* deopt_reason_ = kNoDeoptReason;
const char* bailout_reason_ = kEmptyBailoutReason;
int deopt_id_ = kNoDeoptimizationId;
[cpu-profiler] De-duplicate CodeEntry objects for inline stacks Within an inline stack we would have multiple copies of the exact same CodeEntry object to represent an inline frame. We had one copy for every time that the frame appeared in an inline stack. One CodeEntry can have multiple inline stacks and each stack can have multiple inline frames. In the common case, the stacks overlap and repeat frames. This CL creates a single CodeEntry object to represent each inlined function as an inline frame (for a given CodeEntry with inlinings). This removes most of the duplication of inline CodeEntry objects. We still have some duplication, e.g. if we inline bar() into foo() and foo2() but they are not themselves inlined into anything, then we will have two inline CodeEntry objects for bar(). Removing all duplication is harder to achieve because the lifetime of the inlined frame CodeEntry is now no longer tied to the inliner. Get rid of the InlineEntry struct as it is now indentical to CodeEntryAndLineNumber. We store the list of canonical inline CodeEntry objects on the CodeObject of the inlining function so that it can own the lifetimes of inlined frames. Also rename inline_locations_ to inline_stacks_ to be clearer. Bug: v8:7719 Change-Id: Ied765b4cce7fd33f3290798331f1e6767cc42e8c Reviewed-on: https://chromium-review.googlesource.com/c/1396086 Commit-Queue: Peter Marshall <petermarshall@chromium.org> Reviewed-by: Tobias Tebbi <tebbi@chromium.org> Reviewed-by: Alexei Filippov <alph@chromium.org> Cr-Commit-Position: refs/heads/master@{#58639}
2019-01-08 14:16:03 +00:00
std::unordered_map<int, std::vector<CodeEntryAndLineNumber>> inline_stacks_;
std::unordered_set<std::unique_ptr<CodeEntry>, Hasher, Equals>
inline_entries_;
std::vector<CpuProfileDeoptFrame> deopt_inlined_frames_;
};
RareData* EnsureRareData();
struct V8_EXPORT_PRIVATE ProgramEntryCreateTrait {
static CodeEntry* Create();
};
struct V8_EXPORT_PRIVATE IdleEntryCreateTrait {
static CodeEntry* Create();
};
struct V8_EXPORT_PRIVATE GCEntryCreateTrait {
static CodeEntry* Create();
};
struct V8_EXPORT_PRIVATE UnresolvedEntryCreateTrait {
static CodeEntry* Create();
};
struct V8_EXPORT_PRIVATE RootEntryCreateTrait {
static CodeEntry* Create();
};
V8_EXPORT_PRIVATE static base::LazyDynamicInstance<
CodeEntry, ProgramEntryCreateTrait>::type kProgramEntry;
V8_EXPORT_PRIVATE static base::LazyDynamicInstance<
CodeEntry, IdleEntryCreateTrait>::type kIdleEntry;
V8_EXPORT_PRIVATE static base::LazyDynamicInstance<
CodeEntry, GCEntryCreateTrait>::type kGCEntry;
V8_EXPORT_PRIVATE static base::LazyDynamicInstance<
CodeEntry, UnresolvedEntryCreateTrait>::type kUnresolvedEntry;
V8_EXPORT_PRIVATE static base::LazyDynamicInstance<
CodeEntry, RootEntryCreateTrait>::type kRootEntry;
using TagField = BitField<CodeEventListener::LogEventsAndTags, 0, 8>;
using BuiltinIdField = BitField<Builtins::Name, 8, 22>;
static_assert(Builtins::builtin_count <= BuiltinIdField::kNumValues,
"builtin_count exceeds size of bitfield");
using UsedField = BitField<bool, 30, 1>;
using SharedCrossOriginField = BitField<bool, 31, 1>;
uint32_t bit_field_;
const char* name_;
const char* resource_name_;
int line_number_;
int column_number_;
int script_id_;
int position_;
std::unique_ptr<SourcePositionTable> line_info_;
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
Address instruction_start_;
std::unique_ptr<RareData> rare_data_;
DISALLOW_COPY_AND_ASSIGN(CodeEntry);
};
struct CodeEntryAndLineNumber {
CodeEntry* code_entry;
int line_number;
};
typedef std::vector<CodeEntryAndLineNumber> ProfileStackTrace;
class ProfileTree;
class V8_EXPORT_PRIVATE ProfileNode {
public:
inline ProfileNode(ProfileTree* tree, CodeEntry* entry, ProfileNode* parent,
int line_number = 0);
ProfileNode* FindChild(
CodeEntry* entry,
int line_number = v8::CpuProfileNode::kNoLineNumberInfo);
ProfileNode* FindOrAddChild(CodeEntry* entry, int line_number = 0);
void IncrementSelfTicks() { ++self_ticks_; }
void IncreaseSelfTicks(unsigned amount) { self_ticks_ += amount; }
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
void IncrementLineTicks(int src_line);
CodeEntry* entry() const { return entry_; }
unsigned self_ticks() const { return self_ticks_; }
const std::vector<ProfileNode*>* children() const { return &children_list_; }
unsigned id() const { return id_; }
unsigned function_id() const;
ProfileNode* parent() const { return parent_; }
int line_number() const {
return line_number_ != 0 ? line_number_ : entry_->line_number();
}
CpuProfileNode::SourceType source_type() const;
unsigned int GetHitLineCount() const {
return static_cast<unsigned int>(line_ticks_.size());
}
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
bool GetLineTicks(v8::CpuProfileNode::LineTick* entries,
unsigned int length) const;
void CollectDeoptInfo(CodeEntry* entry);
const std::vector<CpuProfileDeoptInfo>& deopt_infos() const {
return deopt_infos_;
}
Isolate* isolate() const;
void Print(int indent);
private:
struct Equals {
bool operator()(CodeEntryAndLineNumber lhs,
CodeEntryAndLineNumber rhs) const {
return lhs.code_entry->IsSameFunctionAs(rhs.code_entry) &&
lhs.line_number == rhs.line_number;
}
};
struct Hasher {
std::size_t operator()(CodeEntryAndLineNumber pair) const {
return pair.code_entry->GetHash() ^ ComputeUnseededHash(pair.line_number);
}
};
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
ProfileTree* tree_;
CodeEntry* entry_;
unsigned self_ticks_;
std::unordered_map<CodeEntryAndLineNumber, ProfileNode*, Hasher, Equals>
children_;
int line_number_;
std::vector<ProfileNode*> children_list_;
ProfileNode* parent_;
unsigned id_;
// maps line number --> number of ticks
std::unordered_map<int, int> line_ticks_;
std::vector<CpuProfileDeoptInfo> deopt_infos_;
DISALLOW_COPY_AND_ASSIGN(ProfileNode);
};
class V8_EXPORT_PRIVATE ProfileTree {
public:
explicit ProfileTree(Isolate* isolate);
~ProfileTree();
typedef v8::CpuProfilingMode ProfilingMode;
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
ProfileNode* AddPathFromEnd(
const std::vector<CodeEntry*>& path,
int src_line = v8::CpuProfileNode::kNoLineNumberInfo,
bool update_stats = true);
ProfileNode* AddPathFromEnd(
const ProfileStackTrace& path,
int src_line = v8::CpuProfileNode::kNoLineNumberInfo,
bool update_stats = true,
ProfilingMode mode = ProfilingMode::kLeafNodeLineNumbers);
ProfileNode* root() const { return root_; }
unsigned next_node_id() { return next_node_id_++; }
unsigned GetFunctionId(const ProfileNode* node);
void Print() {
root_->Print(0);
}
Isolate* isolate() const { return isolate_; }
void EnqueueNode(const ProfileNode* node) { pending_nodes_.push_back(node); }
size_t pending_nodes_count() const { return pending_nodes_.size(); }
std::vector<const ProfileNode*> TakePendingNodes() {
return std::move(pending_nodes_);
}
private:
template <typename Callback>
void TraverseDepthFirst(Callback* callback);
std::vector<const ProfileNode*> pending_nodes_;
unsigned next_node_id_;
ProfileNode* root_;
Isolate* isolate_;
unsigned next_function_id_;
std::unordered_map<CodeEntry*, unsigned> function_ids_;
DISALLOW_COPY_AND_ASSIGN(ProfileTree);
};
class CpuProfiler;
class CpuProfile {
public:
struct SampleInfo {
ProfileNode* node;
base::TimeTicks timestamp;
int line;
};
V8_EXPORT_PRIVATE CpuProfile(CpuProfiler* profiler, const char* title,
CpuProfilingOptions options);
// Checks whether or not the given TickSample should be (sub)sampled, given
// the sampling interval of the profiler that recorded it (in microseconds).
V8_EXPORT_PRIVATE bool CheckSubsample(base::TimeDelta sampling_interval);
// Add pc -> ... -> main() call path to the profile.
void AddPath(base::TimeTicks timestamp, const ProfileStackTrace& path,
int src_line, bool update_stats,
base::TimeDelta sampling_interval);
void FinishProfile();
const char* title() const { return title_; }
const ProfileTree* top_down() const { return &top_down_; }
int samples_count() const { return static_cast<int>(samples_.size()); }
const SampleInfo& sample(int index) const { return samples_[index]; }
int64_t sampling_interval_us() const {
return options_.sampling_interval_us();
}
base::TimeTicks start_time() const { return start_time_; }
base::TimeTicks end_time() const { return end_time_; }
CpuProfiler* cpu_profiler() const { return profiler_; }
void UpdateTicksScale();
V8_EXPORT_PRIVATE void Print();
private:
void StreamPendingTraceEvents();
const char* title_;
const CpuProfilingOptions options_;
base::TimeTicks start_time_;
base::TimeTicks end_time_;
std::deque<SampleInfo> samples_;
ProfileTree top_down_;
CpuProfiler* const profiler_;
size_t streaming_next_sample_;
uint32_t id_;
// Number of microseconds worth of profiler ticks that should elapse before
// the next sample is recorded.
base::TimeDelta next_sample_delta_;
static std::atomic<uint32_t> last_id_;
DISALLOW_COPY_AND_ASSIGN(CpuProfile);
};
class V8_EXPORT_PRIVATE CodeMap {
public:
CodeMap();
~CodeMap();
void AddCode(Address addr, CodeEntry* entry, unsigned size);
void MoveCode(Address from, Address to);
CodeEntry* FindEntry(Address addr);
void Print();
private:
struct CodeEntryMapInfo {
unsigned index;
unsigned size;
};
union CodeEntrySlotInfo {
CodeEntry* entry;
unsigned next_free_slot;
};
static constexpr unsigned kNoFreeSlot = std::numeric_limits<unsigned>::max();
void ClearCodesInRange(Address start, Address end);
unsigned AddCodeEntry(Address start, CodeEntry*);
void DeleteCodeEntry(unsigned index);
CodeEntry* entry(unsigned index) { return code_entries_[index].entry; }
std::deque<CodeEntrySlotInfo> code_entries_;
std::map<Address, CodeEntryMapInfo> code_map_;
unsigned free_list_head_ = kNoFreeSlot;
DISALLOW_COPY_AND_ASSIGN(CodeMap);
};
class V8_EXPORT_PRIVATE CpuProfilesCollection {
public:
explicit CpuProfilesCollection(Isolate* isolate);
void set_cpu_profiler(CpuProfiler* profiler) { profiler_ = profiler; }
bool StartProfiling(const char* title, CpuProfilingOptions options = {});
CpuProfile* StopProfiling(const char* title);
std::vector<std::unique_ptr<CpuProfile>>* profiles() {
return &finished_profiles_;
}
const char* GetName(Name name) { return resource_names_.GetName(name); }
bool IsLastProfile(const char* title);
void RemoveProfile(CpuProfile* profile);
// Finds a common sampling interval dividing each CpuProfile's interval,
// rounded up to the nearest multiple of the CpuProfiler's sampling interval.
// Returns 0 if no profiles are attached.
base::TimeDelta GetCommonSamplingInterval() const;
// Called from profile generator thread.
The idea behind of this solution is to use the existing "relocation info" instead of consumption the CodeLinePosition events emitted by the V8 compilers. During generation code and relocation info are generated simultaneously. When code generation is done you each code object has associated "relocation info". Relocation information lets V8 to mark interesting places in the generated code: the pointers that might need to be relocated (after garbage collection), correspondences between the machine program counter and source locations for stack walking. This patch: 1. Add more source positions info in reloc info to make it suitable for source level mapping. The amount of data should not be increased dramatically because (1) V8 already marks interesting places in the generated code and (2) V8 does not write redundant information (it writes a pair (pc_offset, pos) only if pos is changed and skips other). I measured it on Octane benchmark - for unoptimized code the number of source positions may achieve 2x ('lin_solve' from NavierStokes benchmark). 2. When a sample happens, CPU profiler finds a code object by pc, then use its reloc info to match the sample to a source line. If a source line is found that hit counter is increased by one for this line. 3. Add a new public V8 API to get the hit source lines by CDT CPU profiler. Note that it's expected a minor patch in Blink to pack the source level info in JSON to be shown. 4.Add a test that checks how the samples are distributed through source lines. It tests two cases: (1) relocation info created during code generation and (2) relocation info associated with precompiled function's version. Patch from Denis Pravdin <denis.pravdin@intel.com>; R=svenpanne@chromium.org, yurys@chromium.org Review URL: https://codereview.chromium.org/682143003 Patch from Weiliang <weiliang.lin@intel.com>. Cr-Commit-Position: refs/heads/master@{#25182} git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@25182 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-11-06 09:16:34 +00:00
void AddPathToCurrentProfiles(base::TimeTicks timestamp,
const ProfileStackTrace& path, int src_line,
bool update_stats,
base::TimeDelta sampling_interval);
// Limits the number of profiles that can be simultaneously collected.
static const int kMaxSimultaneousProfiles = 100;
private:
StringsStorage resource_names_;
std::vector<std::unique_ptr<CpuProfile>> finished_profiles_;
CpuProfiler* profiler_;
// Accessed by VM thread and profile generator thread.
std::vector<std::unique_ptr<CpuProfile>> current_profiles_;
base::Semaphore current_profiles_semaphore_;
DISALLOW_COPY_AND_ASSIGN(CpuProfilesCollection);
};
class V8_EXPORT_PRIVATE ProfileGenerator {
public:
explicit ProfileGenerator(CpuProfilesCollection* profiles);
void RecordTickSample(const TickSample& sample);
CodeMap* code_map() { return &code_map_; }
private:
CodeEntry* FindEntry(Address address);
CodeEntry* EntryForVMState(StateTag tag);
CpuProfilesCollection* profiles_;
CodeMap code_map_;
DISALLOW_COPY_AND_ASSIGN(ProfileGenerator);
};
} // namespace internal
} // namespace v8
#endif // V8_PROFILER_PROFILE_GENERATOR_H_