The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
// Copyright 2013 the V8 project authors. All rights reserved.
|
|
|
|
// Redistribution and use in source and binary forms, with or without
|
|
|
|
// modification, are permitted provided that the following conditions are
|
|
|
|
// met:
|
|
|
|
//
|
|
|
|
// * Redistributions of source code must retain the above copyright
|
|
|
|
// notice, this list of conditions and the following disclaimer.
|
|
|
|
// * Redistributions in binary form must reproduce the above
|
|
|
|
// copyright notice, this list of conditions and the following
|
|
|
|
// disclaimer in the documentation and/or other materials provided
|
|
|
|
// with the distribution.
|
|
|
|
// * Neither the name of Google Inc. nor the names of its
|
|
|
|
// contributors may be used to endorse or promote products derived
|
|
|
|
// from this software without specific prior written permission.
|
|
|
|
//
|
|
|
|
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
|
|
|
|
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
|
|
|
|
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
|
|
|
|
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
|
|
|
|
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
|
|
|
|
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
|
|
|
|
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
|
|
|
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
|
|
|
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
|
|
|
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
|
|
|
|
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
|
|
|
|
2017-07-13 11:22:37 +00:00
|
|
|
// Flags: --allow-natives-syntax --expose-gc
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
|
|
|
|
|
|
|
|
// Simple test of capture
|
|
|
|
(function testCapturedArguments() {
|
|
|
|
function h() {
|
|
|
|
return g.arguments[0];
|
|
|
|
}
|
|
|
|
|
|
|
|
function g(x) {
|
|
|
|
return h();
|
|
|
|
}
|
|
|
|
|
|
|
|
function f() {
|
|
|
|
var l = { y : { z : 4 }, x : 2 }
|
|
|
|
var r = g(l);
|
|
|
|
assertEquals(2, r.x);
|
|
|
|
assertEquals(2, l.x);
|
|
|
|
l.x = 3;
|
|
|
|
l.y.z = 5;
|
|
|
|
// Test that the arguments object is properly
|
|
|
|
// aliased
|
|
|
|
assertEquals(3, r.x);
|
|
|
|
assertEquals(3, l.x);
|
|
|
|
assertEquals(5, r.y.z);
|
|
|
|
}
|
|
|
|
|
2019-03-01 10:19:54 +00:00
|
|
|
%PrepareFunctionForOptimization(f);
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
f(); f(); f();
|
|
|
|
%OptimizeFunctionOnNextCall(f);
|
|
|
|
f();
|
|
|
|
})();
|
|
|
|
|
|
|
|
|
|
|
|
// Get the arguments object twice, test aliasing
|
|
|
|
(function testTwoCapturedArguments() {
|
|
|
|
function h() {
|
|
|
|
return g.arguments[0];
|
|
|
|
}
|
|
|
|
|
|
|
|
function i() {
|
|
|
|
return g.arguments[0];
|
|
|
|
}
|
|
|
|
|
|
|
|
function g(x) {
|
|
|
|
return {h : h() , i : i()};
|
|
|
|
}
|
|
|
|
|
|
|
|
function f() {
|
|
|
|
var l = { y : { z : 4 }, x : 2 }
|
|
|
|
var r = g(l);
|
|
|
|
assertEquals(2, r.h.x)
|
|
|
|
l.y.z = 3;
|
|
|
|
assertEquals(3, r.h.y.z);
|
|
|
|
assertEquals(3, r.i.y.z);
|
|
|
|
}
|
|
|
|
|
2019-03-01 10:19:54 +00:00
|
|
|
%PrepareFunctionForOptimization(f);
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
f(); f(); f();
|
|
|
|
%OptimizeFunctionOnNextCall(f);
|
|
|
|
f();
|
|
|
|
})();
|
|
|
|
|
|
|
|
|
|
|
|
// Nested arguments object test
|
|
|
|
(function testTwoCapturedArgumentsNested() {
|
|
|
|
function i() {
|
|
|
|
return { gx : g.arguments[0], hx : h.arguments[0] };
|
|
|
|
}
|
|
|
|
|
|
|
|
function h(x) {
|
|
|
|
return i();
|
|
|
|
}
|
|
|
|
|
|
|
|
function g(x) {
|
|
|
|
return h(x.y);
|
|
|
|
}
|
|
|
|
|
|
|
|
function f() {
|
|
|
|
var l = { y : { z : 4 }, x : 2 }
|
|
|
|
var r = g(l);
|
|
|
|
assertEquals(2, r.gx.x)
|
|
|
|
assertEquals(4, r.gx.y.z)
|
|
|
|
assertEquals(4, r.hx.z)
|
|
|
|
l.y.z = 3;
|
|
|
|
assertEquals(3, r.gx.y.z)
|
|
|
|
assertEquals(3, r.hx.z)
|
|
|
|
assertEquals(3, l.y.z)
|
|
|
|
}
|
|
|
|
|
2019-03-01 10:19:54 +00:00
|
|
|
%PrepareFunctionForOptimization(f);
|
2017-07-12 11:27:10 +00:00
|
|
|
f(); f(); f();
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
%OptimizeFunctionOnNextCall(f);
|
2019-03-01 10:19:54 +00:00
|
|
|
f();
|
|
|
|
%PrepareFunctionForOptimization(f);
|
|
|
|
f();
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
%OptimizeFunctionOnNextCall(f);
|
2017-07-12 11:27:10 +00:00
|
|
|
f(); f();
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
})();
|
|
|
|
|
|
|
|
|
|
|
|
// Nested arguments object test with different inlining
|
|
|
|
(function testTwoCapturedArgumentsNested2() {
|
|
|
|
function i() {
|
|
|
|
return { gx : g.arguments[0], hx : h.arguments[0] };
|
|
|
|
}
|
|
|
|
|
|
|
|
function h(x) {
|
|
|
|
return i();
|
|
|
|
}
|
|
|
|
|
|
|
|
function g(x) {
|
|
|
|
return h(x.y);
|
|
|
|
}
|
|
|
|
|
|
|
|
function f() {
|
|
|
|
var l = { y : { z : 4 }, x : 2 }
|
|
|
|
var r = g(l);
|
|
|
|
assertEquals(2, r.gx.x)
|
|
|
|
assertEquals(4, r.gx.y.z)
|
|
|
|
assertEquals(4, r.hx.z)
|
|
|
|
l.y.z = 3;
|
|
|
|
assertEquals(3, r.gx.y.z)
|
|
|
|
assertEquals(3, r.hx.z)
|
|
|
|
assertEquals(3, l.y.z)
|
|
|
|
}
|
|
|
|
|
2019-03-01 10:19:54 +00:00
|
|
|
%PrepareFunctionForOptimization(f);
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
%NeverOptimizeFunction(i);
|
|
|
|
f(); f(); f();
|
|
|
|
%OptimizeFunctionOnNextCall(f);
|
2019-03-01 10:19:54 +00:00
|
|
|
f();
|
|
|
|
%PrepareFunctionForOptimization(f);
|
|
|
|
f();
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
%OptimizeFunctionOnNextCall(f);
|
|
|
|
f(); f();
|
|
|
|
})();
|
|
|
|
|
|
|
|
|
|
|
|
// Multiple captured argument test
|
|
|
|
(function testTwoArgumentsCapture() {
|
|
|
|
function h() {
|
|
|
|
return { a : g.arguments[1], b : g.arguments[0] };
|
|
|
|
}
|
|
|
|
|
|
|
|
function g(x, y) {
|
|
|
|
return h();
|
|
|
|
}
|
|
|
|
|
|
|
|
function f() {
|
|
|
|
var l = { y : { z : 4 }, x : 2 }
|
|
|
|
var k = { t : { u : 3 } };
|
|
|
|
var r = g(k, l);
|
|
|
|
assertEquals(2, r.a.x)
|
|
|
|
assertEquals(4, r.a.y.z)
|
|
|
|
assertEquals(3, r.b.t.u)
|
|
|
|
l.y.z = 6;
|
|
|
|
r.b.t.u = 7;
|
|
|
|
assertEquals(6, r.a.y.z)
|
|
|
|
assertEquals(7, k.t.u)
|
|
|
|
}
|
|
|
|
|
2019-03-01 10:19:54 +00:00
|
|
|
%PrepareFunctionForOptimization(f);
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
f(); f(); f();
|
|
|
|
%OptimizeFunctionOnNextCall(f);
|
2019-03-01 10:19:54 +00:00
|
|
|
f();
|
|
|
|
%PrepareFunctionForOptimization(f);
|
|
|
|
f();
|
The current
version is passing all the existing test + a bunch of new tests
(packaged in the change list, too).
The patch extends the SlotRef object to describe captured and duplicated
objects. Since the SlotRefs are not independent of each other anymore,
there is a new SlotRefValueBuilder class that stores the SlotRefs and
later materializes the objects from the SlotRefs.
Note that unlike the previous implementation of SlotRefs, we now build
the SlotRef entries for the entire frame, not just the particular
function. This is because duplicate objects might refer to previous
captured objects (that might live inside other inlined function's part
of the frame).
We also need to store the materialized objects between other potential
invocations of the same arguments object so that we materialize each
captured object at most once. The materialized objects of frames live
in the new MaterielizedObjectStore object (contained in Isolate),
indexed by the frame's FP address. Each argument materialization (and
deoptimization) tries to lookup its captured objects in the store before
building new ones. Deoptimization also removes the materialized objects
from the store. We also schedule a lazy deopt to be sure that we always
get rid of the materialized objects and that the optmized function
adopts the materialized objects (instead of happily computing with its
captured representations).
Concerns:
- Is the FP address the right key for a frame? (Note that deoptimizer's
representation of frame is different from the argument object
materializer's one - it is not easy to find common ground.)
- Performance is suboptimal in several places, but a quick local run of
benchmarks does not seem to show a perf hit. Examples of possible
improvements: smarter generation of SlotRefs (build other functions'
SlotRefs only for captured objects and only if necessary), smarter
lookup of stored materialized objects.
- Ideally, we would like to share the code for argument materialization
with deoptimizer's materializer. However, the supporting data structures
(mainly the frame descriptor) are quite different in each case, so it
looks more like a separate project.
Thanks for any feedback.
R=danno@chromium.org, mstarzinger@chromium.org
LOG=N
BUG=
Committed: https://code.google.com/p/v8/source/detail?r=18918
Review URL: https://codereview.chromium.org/103243005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18936 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-01-30 10:33:53 +00:00
|
|
|
%OptimizeFunctionOnNextCall(f);
|
|
|
|
f(); f();
|
|
|
|
})();
|
[turbofan] Escape analysis support for LoadElement with variable index.
This adds support to the escape analysis to allow scalar replacement
of (small) FixedArrays with element accesses where the index is not a
compile time constant. This happens quite often when inlining functions
that operate on variable number of arguments. For example consider this
little piece of code:
```js
function sum(...args) {
let s = 0;
for (let i = 0; i < args.length; ++i) s += args[i];
return s;
}
function sum2(x, y) {
return sum(x, y);
}
```
This example is made up, of course, but it shows the problem. Let's
assume that TurboFan inlines the function `sum` into it's call site
at `sum2`. Now it has to materialize the `args` array with the two
values `x` and `y`, and iterate through these `args` to sum them up.
The escape analysis pass figures out that `args` doesn't escape (aka
doesn't outlive) the optimized code for `sum2` now, but TurboFan still
needs to materialize the elements backing store for `args` since there's
a `LoadElement(args.elements,i)` in the graph now, and `i` is not a
compile time constant.
However the escape analysis has more information than just that. In
particular the escape analysis knows exactly how many elements a non
escaping object has, based on the fact that the allocation must be
local to the function and that we only track objects with known size.
So in the case above when we get to `args[i]` in the escape analysis
the relevant part of the graph looks something like this:
```
elements = LoadField[elements](args)
length = LoadField[length](args)
index = CheckBounds(i, length)
value = LoadElement(elements, index)
```
In particular the contract here is that `LoadElement(elements,index)`
is guaranteed to have an `index` that is within the valid bounds for
the `elements` (there must be a preceeding `CheckBounds` or some other
guard in optimized code before it). And since `elements` is allocated
inside of the optimized code object, the escape analysis also knows
that `elements` has exactly two elements inside (namely the values of
`x` and `y`). So we can use that information and replace the access
with a `Select(index===0,x,y)` operation instead, which allows us to
scalar replace the `elements`, since there's no escaping use anymore
in the graph.
We do this for the case that the number of elements is 2, as described
above, but also for the case where elements length is one. In case
of 0, we know that the `LoadElement` must be in dead code, but we can't
just mark it for deletion from the graph (to make sure it doesn't block
scalar replacement of non-dead code), so we don't handle this for now.
And for one element it's even easier, since the `LoadElement` has to
yield exactly said element.
We could generalize this to handle arbitrary lengths, but since there's
a cost to arbitrary decision trees here, it's unclear when this is still
beneficial. Another possible solution for length > 2 would be to have
special stack allocation for these backing stores and do variable index
accesses to these stack areas. But that's way beyond the scope of this
isolated change.
This change shows a ~2% improvement on the EarleyBoyer benchmark in
JetStream, since it benefits a lot from not having to materialize these
small arguments backing stores.
Drive-by-fix: Fix JSCreateLowering to properly initialize "elements"
with StoreElement instead of StoreField (which violates the invariant
in TurboFan that fields and elements never alias).
Bug: v8:5267, v8:6200
Change-Id: Idd464a15a81e7c9653c48c814b406eb859841428
Reviewed-on: https://chromium-review.googlesource.com/c/1267935
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56442}
2018-10-08 11:16:36 +00:00
|
|
|
|
|
|
|
// Test variable index access to strict arguments
|
|
|
|
// with up to 2 elements.
|
|
|
|
(function testArgumentsVariableIndexStrict() {
|
|
|
|
function g() {
|
|
|
|
"use strict";
|
|
|
|
var s = 0;
|
|
|
|
for (var i = 0; i < arguments.length; ++i) s += arguments[i];
|
|
|
|
return s;
|
|
|
|
}
|
|
|
|
|
|
|
|
function f(x, y) {
|
|
|
|
// (a) arguments[i] is dead code since arguments.length is 0.
|
|
|
|
const a = g();
|
|
|
|
// (b) arguments[i] always yields the first element.
|
|
|
|
const b = g(x);
|
|
|
|
// (c) arguments[i] can yield either x or y.
|
|
|
|
const c = g(x, y);
|
|
|
|
return a + b + c;
|
|
|
|
}
|
|
|
|
|
2019-03-01 10:19:54 +00:00
|
|
|
%PrepareFunctionForOptimization(f);
|
[turbofan] Escape analysis support for LoadElement with variable index.
This adds support to the escape analysis to allow scalar replacement
of (small) FixedArrays with element accesses where the index is not a
compile time constant. This happens quite often when inlining functions
that operate on variable number of arguments. For example consider this
little piece of code:
```js
function sum(...args) {
let s = 0;
for (let i = 0; i < args.length; ++i) s += args[i];
return s;
}
function sum2(x, y) {
return sum(x, y);
}
```
This example is made up, of course, but it shows the problem. Let's
assume that TurboFan inlines the function `sum` into it's call site
at `sum2`. Now it has to materialize the `args` array with the two
values `x` and `y`, and iterate through these `args` to sum them up.
The escape analysis pass figures out that `args` doesn't escape (aka
doesn't outlive) the optimized code for `sum2` now, but TurboFan still
needs to materialize the elements backing store for `args` since there's
a `LoadElement(args.elements,i)` in the graph now, and `i` is not a
compile time constant.
However the escape analysis has more information than just that. In
particular the escape analysis knows exactly how many elements a non
escaping object has, based on the fact that the allocation must be
local to the function and that we only track objects with known size.
So in the case above when we get to `args[i]` in the escape analysis
the relevant part of the graph looks something like this:
```
elements = LoadField[elements](args)
length = LoadField[length](args)
index = CheckBounds(i, length)
value = LoadElement(elements, index)
```
In particular the contract here is that `LoadElement(elements,index)`
is guaranteed to have an `index` that is within the valid bounds for
the `elements` (there must be a preceeding `CheckBounds` or some other
guard in optimized code before it). And since `elements` is allocated
inside of the optimized code object, the escape analysis also knows
that `elements` has exactly two elements inside (namely the values of
`x` and `y`). So we can use that information and replace the access
with a `Select(index===0,x,y)` operation instead, which allows us to
scalar replace the `elements`, since there's no escaping use anymore
in the graph.
We do this for the case that the number of elements is 2, as described
above, but also for the case where elements length is one. In case
of 0, we know that the `LoadElement` must be in dead code, but we can't
just mark it for deletion from the graph (to make sure it doesn't block
scalar replacement of non-dead code), so we don't handle this for now.
And for one element it's even easier, since the `LoadElement` has to
yield exactly said element.
We could generalize this to handle arbitrary lengths, but since there's
a cost to arbitrary decision trees here, it's unclear when this is still
beneficial. Another possible solution for length > 2 would be to have
special stack allocation for these backing stores and do variable index
accesses to these stack areas. But that's way beyond the scope of this
isolated change.
This change shows a ~2% improvement on the EarleyBoyer benchmark in
JetStream, since it benefits a lot from not having to materialize these
small arguments backing stores.
Drive-by-fix: Fix JSCreateLowering to properly initialize "elements"
with StoreElement instead of StoreField (which violates the invariant
in TurboFan that fields and elements never alias).
Bug: v8:5267, v8:6200
Change-Id: Idd464a15a81e7c9653c48c814b406eb859841428
Reviewed-on: https://chromium-review.googlesource.com/c/1267935
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56442}
2018-10-08 11:16:36 +00:00
|
|
|
assertEquals(4, f(1, 2));
|
|
|
|
assertEquals(5, f(2, 1));
|
|
|
|
%OptimizeFunctionOnNextCall(f);
|
|
|
|
assertEquals(4, f(1, 2));
|
|
|
|
assertEquals(5, f(2, 1));
|
|
|
|
})();
|
|
|
|
|
|
|
|
// Test variable index access to sloppy arguments
|
|
|
|
// with up to 2 elements.
|
|
|
|
(function testArgumentsVariableIndexSloppy() {
|
|
|
|
function g() {
|
|
|
|
var s = 0;
|
|
|
|
for (var i = 0; i < arguments.length; ++i) s += arguments[i];
|
|
|
|
return s;
|
|
|
|
}
|
|
|
|
|
|
|
|
function f(x, y) {
|
|
|
|
// (a) arguments[i] is dead code since arguments.length is 0.
|
|
|
|
const a = g();
|
|
|
|
// (b) arguments[i] always yields the first element.
|
|
|
|
const b = g(x);
|
|
|
|
// (c) arguments[i] can yield either x or y.
|
|
|
|
const c = g(x, y);
|
|
|
|
return a + b + c;
|
|
|
|
}
|
|
|
|
|
2019-03-01 10:19:54 +00:00
|
|
|
%PrepareFunctionForOptimization(f);
|
[turbofan] Escape analysis support for LoadElement with variable index.
This adds support to the escape analysis to allow scalar replacement
of (small) FixedArrays with element accesses where the index is not a
compile time constant. This happens quite often when inlining functions
that operate on variable number of arguments. For example consider this
little piece of code:
```js
function sum(...args) {
let s = 0;
for (let i = 0; i < args.length; ++i) s += args[i];
return s;
}
function sum2(x, y) {
return sum(x, y);
}
```
This example is made up, of course, but it shows the problem. Let's
assume that TurboFan inlines the function `sum` into it's call site
at `sum2`. Now it has to materialize the `args` array with the two
values `x` and `y`, and iterate through these `args` to sum them up.
The escape analysis pass figures out that `args` doesn't escape (aka
doesn't outlive) the optimized code for `sum2` now, but TurboFan still
needs to materialize the elements backing store for `args` since there's
a `LoadElement(args.elements,i)` in the graph now, and `i` is not a
compile time constant.
However the escape analysis has more information than just that. In
particular the escape analysis knows exactly how many elements a non
escaping object has, based on the fact that the allocation must be
local to the function and that we only track objects with known size.
So in the case above when we get to `args[i]` in the escape analysis
the relevant part of the graph looks something like this:
```
elements = LoadField[elements](args)
length = LoadField[length](args)
index = CheckBounds(i, length)
value = LoadElement(elements, index)
```
In particular the contract here is that `LoadElement(elements,index)`
is guaranteed to have an `index` that is within the valid bounds for
the `elements` (there must be a preceeding `CheckBounds` or some other
guard in optimized code before it). And since `elements` is allocated
inside of the optimized code object, the escape analysis also knows
that `elements` has exactly two elements inside (namely the values of
`x` and `y`). So we can use that information and replace the access
with a `Select(index===0,x,y)` operation instead, which allows us to
scalar replace the `elements`, since there's no escaping use anymore
in the graph.
We do this for the case that the number of elements is 2, as described
above, but also for the case where elements length is one. In case
of 0, we know that the `LoadElement` must be in dead code, but we can't
just mark it for deletion from the graph (to make sure it doesn't block
scalar replacement of non-dead code), so we don't handle this for now.
And for one element it's even easier, since the `LoadElement` has to
yield exactly said element.
We could generalize this to handle arbitrary lengths, but since there's
a cost to arbitrary decision trees here, it's unclear when this is still
beneficial. Another possible solution for length > 2 would be to have
special stack allocation for these backing stores and do variable index
accesses to these stack areas. But that's way beyond the scope of this
isolated change.
This change shows a ~2% improvement on the EarleyBoyer benchmark in
JetStream, since it benefits a lot from not having to materialize these
small arguments backing stores.
Drive-by-fix: Fix JSCreateLowering to properly initialize "elements"
with StoreElement instead of StoreField (which violates the invariant
in TurboFan that fields and elements never alias).
Bug: v8:5267, v8:6200
Change-Id: Idd464a15a81e7c9653c48c814b406eb859841428
Reviewed-on: https://chromium-review.googlesource.com/c/1267935
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56442}
2018-10-08 11:16:36 +00:00
|
|
|
assertEquals(4, f(1, 2));
|
|
|
|
assertEquals(5, f(2, 1));
|
|
|
|
%OptimizeFunctionOnNextCall(f);
|
|
|
|
assertEquals(4, f(1, 2));
|
|
|
|
assertEquals(5, f(2, 1));
|
|
|
|
})();
|