This patch normalizes the casing of hexadecimal digits in escape
sequences of the form `\xNN` and integer literals of the form
`0xNNNN`.
Previously, the V8 code base used an inconsistent mixture of uppercase
and lowercase.
Google’s C++ style guide uses uppercase in its examples:
https://google.github.io/styleguide/cppguide.html#Non-ASCII_Characters
Moreover, uppercase letters more clearly stand out from the lowercase
`x` (or `u`) characters at the start, as well as lowercase letters
elsewhere in strings.
BUG=v8:7109
TBR=marja@chromium.org,titzer@chromium.org,mtrofin@chromium.org,mstarzinger@chromium.org,rossberg@chromium.org,yangguo@chromium.org,mlippautz@chromium.org
NOPRESUBMIT=true
Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_chromium_rel_ng
Change-Id: I790e21c25d96ad5d95c8229724eb45d2aa9e22d6
Reviewed-on: https://chromium-review.googlesource.com/804294
Commit-Queue: Mathias Bynens <mathias@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Cr-Commit-Position: refs/heads/master@{#49810}
This fix is two-fold:
1) Incremental UTF-8 decoding: Unify incorrect UTF-8 handling between V8 and
Blink.
Incremental UTF-8 decoding used to allow some overlong sequences / invalid code
points which Blink treated as errors. This caused the decoder and the Blink
UTF-8 decoder to produce a different number of bytes, resulting in random
failures when scripts were streamed (especially, this was detected by the
skipping inner functions feature which adds CHECKs against expected function
positions).
2) Non-incremental UTF-8 decoding: return the correct amount of invalid characters.
According to the encoding spec ( https://encoding.spec.whatwg.org/#utf-8-decoder
), the first byte of an overlong sequence / invalid code point generates an
invalid character, and the rest of the bytes are not processed (i.e., pushed
back to the byte stream). When they're handled, they will look like lonely
continuation bytes, and will generate an invalid character each.
As a result, an overlong 4-byte sequence should generate 4 invalid characters
(not 1).
This is a potentially breaking change, since the (non-incremental) UTF-8
decoding is exposed via the API (String::NewFromUtf8). The behavioral difference
happens when the client is passing in invalid UTF-8 (containing overlong /
surrogate sequences).
However, afaict, this doesn't change the semantics of any JavaScript program:
according to the ECMAScript spec, the program is a sequence of Unicode code
points, and there's no way to invoke the UTF-8 decoding functionalities from
inside JavaScript. Though, this changes the behavior of d8 when decoding source
files which are invalid UTF-8.
This doesn't change anything related to URI decoding (it already throws
exceptions for overlong sequences / invalid code points).
BUG: chromium:765608, chromium:758236, v8:5516
Bug:
Change-Id: Ib029f6a8e87186794b092e4e8af32d01cee3ada0
Reviewed-on: https://chromium-review.googlesource.com/671020
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Franziska Hinkelmann <franzih@chromium.org>
Reviewed-by: Camillo Bruni <cbruni@chromium.org>
Cr-Commit-Position: refs/heads/master@{#48105}
The bug occurred when we detected an erroneous char late, and put the last
character in a chunk into the "incomplete char" buffer. It was not correctly
retrieved when seeking.
BUG=v8:6836
Change-Id: I8ca946dfdb39244c5ca0bdcebe047047010b3a07
Reviewed-on: https://chromium-review.googlesource.com/670729
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Camillo Bruni <cbruni@chromium.org>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/master@{#48066}
U+feff is the UTF BOM but if it occurs inside the text, it's a "zero-width
no-break space". However, the UTF-8 decoder in script streaming still thought
it's a BOM and skipped it. The correct way to handle it would be to create a
U+feff code point instead - the Scanner will then handle it as whitespace.
This is a discrepancy between the Blink UTF-8 decoder and the V8 UTF-8 decoder,
and caused the source positions be off by one. This bug went unnoticed, since
normally off-by-one in this situation doesn't make the code to break.
BUG=chromium:758508,chromium:758236
Change-Id: Ib92a3ee65c402e21b77e42537db2a021cff55379
Reviewed-on: https://chromium-review.googlesource.com/632096
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#47583}
Unify, simplify logic, reduce UTF8 specific handling.
Intend of this is also to have stream views.
Stream views can be used concurrently by multiple threads, but
only one thread may fetch new data from the underlying source.
This together with unified stream view creation is intended to be
used for parse tasks.
BUG=v8:6093
Change-Id: I83c6f1e6ad280c28da690da41c466dfcbb7915e6
Reviewed-on: https://chromium-review.googlesource.com/535474
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Commit-Queue: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45994}
Also, as this is hard to track down, always DCHECK position after ReadBlock().
Change-Id: Ie32c3a311dd8df91f651b6d82ccacc7c95e6fde0
Reviewed-on: https://chromium-review.googlesource.com/528196
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45811}
This reverts commit 7fa071a48b.
Reason for revert: https://bugs.chromium.org/p/chromium/issues/detail?id=729482
Original change's description:
> Reland [parser] Refactor streaming scanner streams.
>
> Unify, simplify logic, reduce UTF8 specific handling.
>
> Intend of this is also to have stream views.
> Stream views can be used concurrently by multiple threads, but
> only one thread may fetch new data from the underlying source.
> This together with unified stream view creation is intended to be
> used for parse tasks.
>
> BUG=v8:6093
>
> Change-Id: I3bce48185fa2c986d16619a9a8ece3ff4c4f5e60
> Reviewed-on: https://chromium-review.googlesource.com/509489
> Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
> Reviewed-by: Marja Hölttä <marja@chromium.org>
> Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
> Cr-Commit-Position: refs/heads/master@{#45688}
TBR=marja@chromium.org,vogelheim@chromium.org,wiktorg@google.com
# Not skipping CQ checks because original CL landed > 1 day ago.
BUG=v8:6093
Change-Id: Iefa7c43a2f6ae3a7f3ef0f77d87b6ae36ae4be99
Reviewed-on: https://chromium-review.googlesource.com/525712
Reviewed-by: Marja Hölttä <marja@chromium.org>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Commit-Queue: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45725}
Unify, simplify logic, reduce UTF8 specific handling.
Intend of this is also to have stream views.
Stream views can be used concurrently by multiple threads, but
only one thread may fetch new data from the underlying source.
This together with unified stream view creation is intended to be
used for parse tasks.
BUG=v8:6093
Change-Id: I3bce48185fa2c986d16619a9a8ece3ff4c4f5e60
Reviewed-on: https://chromium-review.googlesource.com/509489
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
Cr-Commit-Position: refs/heads/master@{#45688}
This reverts commit ce538f70c1.
Reason for revert: breaks BOM handling (thus breaking Outlook web apps).
Original change's description:
> [parser] Refactor streaming scanner streams.
>
> Unify, simplify logic, reduce UTF8 specific handling.
>
> Intend of this is also to have stream views.
> Stream views can be used concurrently by multiple threads, but
> only one thread may fetch new data from the underlying source.
> This together with unified stream view creation is intended to be
> used for parse tasks.
>
> BUG=v8:6093
>
> Change-Id: Ied8e93090c506d4735080298f0fdaeed32043915
> Reviewed-on: https://chromium-review.googlesource.com/501789
> Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
> Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
> Reviewed-by: Marja Hölttä <marja@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#45336}
TBR=marja@chromium.org,vogelheim@chromium.org,jochen@chromium.org,wiktorg@google.com
BUG=v8:6093, chromium:724166
Change-Id: I022a23b8052d20d83a640c07b7864c622548bf90
Reviewed-on: https://chromium-review.googlesource.com/508888
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Adam Klein <adamk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45404}
Unify, simplify logic, reduce UTF8 specific handling.
Intend of this is also to have stream views.
Stream views can be used concurrently by multiple threads, but
only one thread may fetch new data from the underlying source.
This together with unified stream view creation is intended to be
used for parse tasks.
BUG=v8:6093
Change-Id: Ied8e93090c506d4735080298f0fdaeed32043915
Reviewed-on: https://chromium-review.googlesource.com/501789
Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45336}
... and TypeFeedbackMetadata to FeedbackMetadata.
BUG=
Change-Id: I2556d1c2a8f37b8cf3d532cc98d973b6dc7e9e6c
Reviewed-on: https://chromium-review.googlesource.com/439244
Commit-Queue: Igor Sheludko <ishell@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Michael Stanton <mvstanton@chromium.org>
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#42999}
The existing unit test explicitly checked for this case, but was - under
the right circumstances - defeated by the optimization to not re-run the
whole position search if we were close enough.
BUG=chromium:685618
Review-Url: https://codereview.chromium.org/2663883002
Cr-Commit-Position: refs/heads/master@{#42794}
Removed a wrong condition test in TwoByteExternalBufferedStream. This changed fixes errors that may occur under some conditions.
Review-Url: https://codereview.chromium.org/2469723002
Cr-Commit-Position: refs/heads/master@{#40722}
- Smaller, more consistent streams API (Advance, Back, pos, Seek)
- Remove implementations from the header, in favor of creation functions.
Observe:
- Performance:
- All Utf16CharacterStream methods have an inlinable V8_LIKELY w/ a
body of only a few instructions. I expect most calls to end up there.
- There used to be performance problems w/ bookmarking, particularly
with copying too much data on SetBookmark w/ UTF-8 streaming streams.
All those copies are gone.
- The old streaming streams implementation used to copy data even for
2-byte input. It no longer does.
- The only remaining 'slow' method is the Seek(.) slow case for utf-8
streaming streams. I don't expect this to be called a lot; and even if,
I expect it to be offset by the gains in the (vastly more frequent)
calls to the other methods or the 'fast path'.
- If it still bothers us, there are several ways to speed it up.
- API & code cleanliness:
- I want to remove the 'old' API in a follow-up CL, which should mostly
delete code, or replace it 1:1.
- In a 2nd follow-up I want to delete much of the UTF-8 handling in Blink
for streaming streams.
- The "bookmark" is now always implemented (and mostly very fast), so we
should be able to use it for more things.
- Testing & correctness:
- The unit tests now cover all stream implementations,
and are pretty good and triggering all the edge cases.
- Vastly more DCHECKs of the invariants.
BUG=v8:4947
Review-Url: https://codereview.chromium.org/2314663002
Cr-Commit-Position: refs/heads/master@{#39464}