Commit Graph

40 Commits

Author SHA1 Message Date
Benoît Lizé
a1da383fb3 parsing: Lock ExternalStrings in the ExternalStringStream.
Utf*Characterstream caches the data pointer of ExternalStrings through
ExternalStringStream, so lock the strings in ExternalStringStream.

Bug: chromium:877044
Cq-Include-Trybots: luci.chromium.try:linux_chromium_rel_ng
Change-Id: I241caaf64e109b33e2f9982573e11c514410509c
Reviewed-on: https://chromium-review.googlesource.com/1194003
Commit-Queue: Benoit L <lizeb@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Reviewed-by: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#55613}
2018-09-04 14:09:04 +00:00
Ross McIlroy
99a2440c22 [Scanner] Add support for cloning character streams.
Currently chunked and streaming character streams don't have cloning support as this
requires additional changes on the Blink side.

BUG=v8:8041

Change-Id: I167b3b1cd5ae1ac4038f3715b6a679d3e65d9a85
Reviewed-on: https://chromium-review.googlesource.com/1183429
Reviewed-by: Toon Verwaest <verwaest@chromium.org>
Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#55444}
2018-08-28 09:45:24 +00:00
Toon Verwaest
fcfd995aa1 [scanner] Go back to untemplatized scanning with buffering
This reverts the following 3 CLs:

Revert "[scanner] Templatize scan functions by encoding"
Revert "[asm] Remove invalid static cast of character stream"
Revert "[scanner] Prepare CharacterStreams for specializing scanner and parser by character type"

The original idea behind this work was to avoid copying, converting and
buffering characters to be scanned by specializing the scanner functions. The
additional benefit was for scanner functions to have a bigger window over the
input. Even though we can get a pretty nice speedup from having a larger
window, in practice this rarely helps. The cost is a larger binary.

Since we can't eagerly convert utf8 to utf16 due to memory overhead, we'd also
need to have a specialized version of the scanner just for utf8. That's pretty
complex, and likely won't be better than simply bulk converting and buffering
utf8 as utf16.

Change-Id: Ic3564683932a0097e3f9f51cd88f62c6ac879dcb
Reviewed-on: https://chromium-review.googlesource.com/1183190
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Cr-Commit-Position: refs/heads/master@{#55258}
2018-08-21 10:52:52 +00:00
Toon Verwaest
378375d2e5 [scanner] Templatize scan functions by encoding
This way we can avoid reencoding everything to utf16 (buffered) and avoid the
overhead of needing to check the encoding for each character individually.

This may result in a minor asm.js scanning regression due to one-byte tokens
possibly being more common.

Change-Id: I90b51c256d56d4f4fa2d235d7e1e58fc01e43f31
Reviewed-on: https://chromium-review.googlesource.com/1172437
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#55217}
2018-08-20 13:54:16 +00:00
Toon Verwaest
928e7b2973 [scanner] Decode utf8 as chunks come in to utf16, allowing unbuffered streaming
Change-Id: Iaad8bc94e9222d309749491df9a500544b5b37da
Reviewed-on: https://chromium-review.googlesource.com/1158687
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54877}
2018-08-02 19:00:09 +00:00
Toon Verwaest
2d40e2f445 [scanner] Prepare CharacterStreams for specializing scanner and parser by character type
This templatizes CharacterStream by char type, and makes them subclass ScannerStream.
Methods that are widely used by tests are marked virtual on ScannerStream and final on
CharacterStream<T> so the specialized scanner will know what to call. ParseInfo passes
around ScannerStream, but the scanner requires the explicit CharacterStream<T>. Since
AdvanceUntil is templatized by FunctionType, I couldn't mark that virtual; so instead
I adjusted those tests to operate directly on ucs2 (not utf8 since we'll drop that in
the future).

In the end no functionality was changed. Some calls became virtual in tests. This is
mainly just preparation.

Change-Id: I0b4def65d3eb8fa5c806027c7e9123a590ebbdb5
Reviewed-on: https://chromium-review.googlesource.com/1156690
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54848}
2018-08-01 15:11:50 +00:00
Florian Sattler
b45fdb342a [scanner] Adding AdvanceUntil to Utf16CharacterStream
AdvanceUntil allows the Utf16CharacterStream to advance until a charater is found
that passes the check.

Bug: v8:7926
Change-Id: Iae39fb24194aa0ee2f544a55a7847956aa324b64
Reviewed-on: https://chromium-review.googlesource.com/1151303
Commit-Queue: Florian Sattler <sattlerf@google.com>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54783}
2018-07-30 12:13:31 +00:00
Toon Verwaest
12a63796f7 [scanner-streams] Add relocatable character stream for on-heap utf16 streams
Change-Id: I388f6a6c937b6897efe9e88b06ba4b56670fea4f
Reviewed-on: https://chromium-review.googlesource.com/1151191
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54720}
2018-07-26 12:35:56 +00:00
Rodrigo Bruno
82632ed5e4 Reland^2 "Added External Strings to external memory accounting".
This CL depends on Reland^2 "Avoiding re-externalization of strings"
(Idb1b6d1b29499f66bf8cd704977c40b027f99dbd)..

Previously landed as Ied341ec6268000343d2a577b22f2a483460b01f5 and
I3fe2b294f6e038d77787cf0870d244ba7cc20550

Previously reviewed at https://chromium-review.googlesource.com/1121736 and
https://chromium-review.googlesource.com/1118164

Bug: chromium:845409
Change-Id: Ied50bbcaa22a90ecaf15dca19dbc9aaec1737223
Reviewed-on: https://chromium-review.googlesource.com/1147227
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Commit-Queue: Rodrigo Bruno <rfbpb@google.com>
Cr-Commit-Position: refs/heads/master@{#54712}
2018-07-26 09:25:41 +00:00
Michael Lippautz
71dddd145d Revert "Reland "[heap] Added External Strings to external memory accounting.""
This reverts commit 7bff339e7f.

Reason for revert: Breaks autoroll, see bug.

Bug: v8:7944

Original change's description:
> Reland "[heap] Added External Strings to external memory accounting."
> 
> This is a reland of 5863c0b652
> 
> Original change's description:
> > [heap] Added External Strings to external memory accounting.
> > 
> > Bug: chromium:845409
> > Change-Id: I3fe2b294f6e038d77787cf0870d244ba7cc20550
> > Reviewed-on: https://chromium-review.googlesource.com/1118164
> > Commit-Queue: Rodrigo Bruno <rfbpb@google.com>
> > Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> > Cr-Commit-Position: refs/heads/master@{#54110}
> 
> Bug: chromium:845409
> Change-Id: Ied341ec6268000343d2a577b22f2a483460b01f5
> Reviewed-on: https://chromium-review.googlesource.com/1121736
> Commit-Queue: Rodrigo Bruno <rfbpb@google.com>
> Reviewed-by: Peter Marshall <petermarshall@chromium.org>
> Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
> Reviewed-by: Hannes Payer <hpayer@chromium.org>
> Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#54410}

TBR=ulan@chromium.org,hpayer@chromium.org,mlippautz@chromium.org,petermarshall@chromium.org,rfbpb@google.com

Change-Id: Ie55586e84f44a2d83c7f97110d60abb86f0730c5
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: chromium:845409
Reviewed-on: https://chromium-review.googlesource.com/1136312
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Commit-Queue: Michael Lippautz <mlippautz@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54428}
2018-07-13 08:27:51 +00:00
Rodrigo Bruno
7bff339e7f Reland "[heap] Added External Strings to external memory accounting."
This is a reland of 5863c0b652

Original change's description:
> [heap] Added External Strings to external memory accounting.
> 
> Bug: chromium:845409
> Change-Id: I3fe2b294f6e038d77787cf0870d244ba7cc20550
> Reviewed-on: https://chromium-review.googlesource.com/1118164
> Commit-Queue: Rodrigo Bruno <rfbpb@google.com>
> Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#54110}

Bug: chromium:845409
Change-Id: Ied341ec6268000343d2a577b22f2a483460b01f5
Reviewed-on: https://chromium-review.googlesource.com/1121736
Commit-Queue: Rodrigo Bruno <rfbpb@google.com>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54410}
2018-07-12 13:19:28 +00:00
Toon Verwaest
c7ad1ddd44 [scanner] Drop lonely byte support as it's unused by blink anyway.
The embedder should ultimately be responsible for handling this since they
anyway give us a copy of the data. They can easily make sure that the chunks we
get do not have lonely bytes.

Cq-Include-Trybots: luci.chromium.try:linux_chromium_rel_ng
Change-Id: Ie862107bbbdd00c4d904fbb457a206c2fd52e5d0
Reviewed-on: https://chromium-review.googlesource.com/1127044
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54262}
2018-07-05 14:22:44 +00:00
Toon Verwaest
5063241306 [scanner] Rewrite character streams by separating underlying bytestreams from buffering.
Additionally now we only scan over flat heap strings.

Change-Id: Ia73c538a3c7923ec66089e16efa529ef3cea2d06
Reviewed-on: https://chromium-review.googlesource.com/1126938
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54258}
2018-07-05 12:59:28 +00:00
Yang Guo
0973a40867 Revert "[scanner] Rewrite character streams by separating underlying bytestreams from buffering."
This reverts commit 5f2f418d47.

Reason for revert: Speculative revert for LayoutTest timeouts

https://ci.chromium.org/buildbot/client.v8.fyi/V8-Blink%20Linux%2064/24596
https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8-Blink%20Linux%2064%20-%20future/4707
https://ci.chromium.org/buildbot/client.v8.fyi/V8-Blink%20Linux%2064%20(dbg)/12467

Original change's description:
> [scanner] Rewrite character streams by separating underlying bytestreams from buffering.
> 
> Additionally now we only scan over flat heap strings.
> 
> Change-Id: Ic449b19aecd7fc3f283a04a3df6a39772d471565
> Reviewed-on: https://chromium-review.googlesource.com/1125854
> Reviewed-by: Marja Hölttä <marja@chromium.org>
> Commit-Queue: Toon Verwaest <verwaest@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#54224}

TBR=marja@chromium.org,verwaest@chromium.org

Change-Id: Ica3026f318a85ec6bb24a38a8cd998f12c146d7e
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/1126819
Reviewed-by: Yang Guo <yangguo@chromium.org>
Commit-Queue: Yang Guo <yangguo@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54231}
2018-07-05 07:37:15 +00:00
Toon Verwaest
5f2f418d47 [scanner] Rewrite character streams by separating underlying bytestreams from buffering.
Additionally now we only scan over flat heap strings.

Change-Id: Ic449b19aecd7fc3f283a04a3df6a39772d471565
Reviewed-on: https://chromium-review.googlesource.com/1125854
Reviewed-by: Marja Hölttä <marja@chromium.org>
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54224}
2018-07-04 18:37:08 +00:00
Rodrigo Bruno
c5c4b588f1 [heap] Forcing external strings to be registered in the external string table.
Bug: chromium:845409
Cq-Include-Trybots: luci.chromium.try:linux_chromium_rel_ng
Change-Id: I2ab1ca18a900828e4e116f1b087925319d41bf97
Reviewed-on: https://chromium-review.googlesource.com/1124845
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Commit-Queue: Rodrigo Bruno <rfbpb@google.com>
Cr-Commit-Position: refs/heads/master@{#54203}
2018-07-04 10:55:26 +00:00
Jakob Kummerow
cfc6a5c2c6 Reland: [cleanup] Refactor the Factory
There is no good reason to have the meat of most objects' initialization
logic in heap.cc, all wrapped by the CALL_HEAP_FUNCTION macro. Instead,
this CL changes the protocol between Heap and Factory to be AllocateRaw,
and all object initialization work after (possibly retried) successful
raw allocation happens in the Factory.

This saves about 20KB of binary size on x64.

Original review: https://chromium-review.googlesource.com/c/v8/v8/+/959533
Originally landed as r52416 / f9a2e24bbc

Cq-Include-Trybots: luci.v8.try:v8_linux_noi18n_rel_ng
Change-Id: Id072cbe6b3ed30afd339c7e502844b99ca12a647
Reviewed-on: https://chromium-review.googlesource.com/1000540
Commit-Queue: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#52492}
2018-04-09 19:52:22 +00:00
Michael Achenbach
503e07c3ef Revert "[cleanup] Refactor the Factory"
This reverts commit f9a2e24bbc.

Reason for revert: gc stress failures not all fixed by follow up.

Original change's description:
> [cleanup] Refactor the Factory
> 
> There is no good reason to have the meat of most objects' initialization
> logic in heap.cc, all wrapped by the CALL_HEAP_FUNCTION macro. Instead,
> this CL changes the protocol between Heap and Factory to be AllocateRaw,
> and all object initialization work after (possibly retried) successful
> raw allocation happens in the Factory.
> 
> This saves about 20KB of binary size on x64.
> 
> Cq-Include-Trybots: luci.v8.try:v8_linux_noi18n_rel_ng
> Change-Id: Icbfdc4266d7be8b48d2fe085f03411743dc6a0ca
> Reviewed-on: https://chromium-review.googlesource.com/959533
> Commit-Queue: Jakob Kummerow <jkummerow@chromium.org>
> Reviewed-by: Hannes Payer <hpayer@chromium.org>
> Reviewed-by: Yang Guo <yangguo@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#52416}

TBR=jkummerow@chromium.org,yangguo@chromium.org,mstarzinger@chromium.org,hpayer@chromium.org

Change-Id: Idbbc53478742f3e9525eee83342afc6aedae122f
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Cq-Include-Trybots: luci.v8.try:v8_linux_noi18n_rel_ng
Reviewed-on: https://chromium-review.googlesource.com/999414
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Commit-Queue: Michael Achenbach <machenbach@chromium.org>
Cr-Commit-Position: refs/heads/master@{#52420}
2018-04-06 07:23:19 +00:00
Jakob Kummerow
f9a2e24bbc [cleanup] Refactor the Factory
There is no good reason to have the meat of most objects' initialization
logic in heap.cc, all wrapped by the CALL_HEAP_FUNCTION macro. Instead,
this CL changes the protocol between Heap and Factory to be AllocateRaw,
and all object initialization work after (possibly retried) successful
raw allocation happens in the Factory.

This saves about 20KB of binary size on x64.

Cq-Include-Trybots: luci.v8.try:v8_linux_noi18n_rel_ng
Change-Id: Icbfdc4266d7be8b48d2fe085f03411743dc6a0ca
Reviewed-on: https://chromium-review.googlesource.com/959533
Commit-Queue: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Cr-Commit-Position: refs/heads/master@{#52416}
2018-04-06 00:23:46 +00:00
Mathias Bynens
822be9b238 Normalize casing of hexadecimal digits
This patch normalizes the casing of hexadecimal digits in escape
sequences of the form `\xNN` and integer literals of the form
`0xNNNN`.

Previously, the V8 code base used an inconsistent mixture of uppercase
and lowercase.

Google’s C++ style guide uses uppercase in its examples:
https://google.github.io/styleguide/cppguide.html#Non-ASCII_Characters

Moreover, uppercase letters more clearly stand out from the lowercase
`x` (or `u`) characters at the start, as well as lowercase letters
elsewhere in strings.

BUG=v8:7109
TBR=marja@chromium.org,titzer@chromium.org,mtrofin@chromium.org,mstarzinger@chromium.org,rossberg@chromium.org,yangguo@chromium.org,mlippautz@chromium.org
NOPRESUBMIT=true

Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_chromium_rel_ng
Change-Id: I790e21c25d96ad5d95c8229724eb45d2aa9e22d6
Reviewed-on: https://chromium-review.googlesource.com/804294
Commit-Queue: Mathias Bynens <mathias@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Cr-Commit-Position: refs/heads/master@{#49810}
2017-12-02 01:24:40 +00:00
Marja Hölttä
6389b7e6b6 [unicode] Return (the correct) errors for overlong / surrogate sequences.
This fix is two-fold:

1) Incremental UTF-8 decoding: Unify incorrect UTF-8 handling between V8 and
Blink.

Incremental UTF-8 decoding used to allow some overlong sequences / invalid code
points which Blink treated as errors. This caused the decoder and the Blink
UTF-8 decoder to produce a different number of bytes, resulting in random
failures when scripts were streamed (especially, this was detected by the
skipping inner functions feature which adds CHECKs against expected function
positions).

2) Non-incremental UTF-8 decoding: return the correct amount of invalid characters.

According to the encoding spec ( https://encoding.spec.whatwg.org/#utf-8-decoder
), the first byte of an overlong sequence / invalid code point generates an
invalid character, and the rest of the bytes are not processed (i.e., pushed
back to the byte stream). When they're handled, they will look like lonely
continuation bytes, and will generate an invalid character each.

As a result, an overlong 4-byte sequence should generate 4 invalid characters
(not 1).

This is a potentially breaking change, since the (non-incremental) UTF-8
decoding is exposed via the API (String::NewFromUtf8). The behavioral difference
happens when the client is passing in invalid UTF-8 (containing overlong /
surrogate sequences).

However, afaict, this doesn't change the semantics of any JavaScript program:
according to the ECMAScript spec, the program is a sequence of Unicode code
points, and there's no way to invoke the UTF-8 decoding functionalities from
inside JavaScript. Though, this changes the behavior of d8 when decoding source
files which are invalid UTF-8.

This doesn't change anything related to URI decoding (it already throws
exceptions for overlong sequences / invalid code points).

BUG: chromium:765608, chromium:758236, v8:5516
Bug: 
Change-Id: Ib029f6a8e87186794b092e4e8af32d01cee3ada0
Reviewed-on: https://chromium-review.googlesource.com/671020
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Franziska Hinkelmann <franzih@chromium.org>
Reviewed-by: Camillo Bruni <cbruni@chromium.org>
Cr-Commit-Position: refs/heads/master@{#48105}
2017-09-21 10:44:40 +00:00
Marja Hölttä
68310c9f69 [scanner] UTF-8 handling fix (errors near chunk end).
The bug occurred when we detected an erroneous char late, and put the last
character in a chunk into the "incomplete char" buffer. It was not correctly
retrieved when seeking.

BUG=v8:6836

Change-Id: I8ca946dfdb39244c5ca0bdcebe047047010b3a07
Reviewed-on: https://chromium-review.googlesource.com/670729
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Camillo Bruni <cbruni@chromium.org>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/master@{#48066}
2017-09-18 14:13:26 +00:00
Marja Hölttä
9f21cab8c8 Revert "Reland#2 [parser] Refactor streaming scanner streams."
This reverts commit de9269f3c3.

Something's still wrong in the encoding handling (see bug).

Bug: chromium:763106
Cq-Include-Trybots: master.tryserver.chromium.linux:linux_chromium_rel_ng
Change-Id: Icd19dd42b84b9d090e191375a2942b9941110bcf
Reviewed-on: https://chromium-review.googlesource.com/657386
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/master@{#47924}
2017-09-08 13:36:04 +00:00
Marja Hölttä
4e45342994 [script streaming] Fix U+feff handling.
U+feff is the UTF BOM but if it occurs inside the text, it's a "zero-width
no-break space". However, the UTF-8 decoder in script streaming still thought
it's a BOM and skipped it. The correct way to handle it would be to create a
U+feff code point instead - the Scanner will then handle it as whitespace.

This is a discrepancy between the Blink UTF-8 decoder and the V8 UTF-8 decoder,
and caused the source positions be off by one. This bug went unnoticed, since
normally off-by-one in this situation doesn't make the code to break.

BUG=chromium:758508,chromium:758236

Change-Id: Ib92a3ee65c402e21b77e42537db2a021cff55379
Reviewed-on: https://chromium-review.googlesource.com/632096
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#47583}
2017-08-24 19:35:12 +00:00
Wiktor Garbacz
de9269f3c3 Reland#2 [parser] Refactor streaming scanner streams.
Unify, simplify logic, reduce UTF8 specific handling.

Intend of this is also to have stream views.
Stream views can be used concurrently by multiple threads, but
only one thread may fetch new data from the underlying source.
This together with unified stream view creation is intended to be
used for parse tasks.

BUG=v8:6093

Change-Id: I83c6f1e6ad280c28da690da41c466dfcbb7915e6
Reviewed-on: https://chromium-review.googlesource.com/535474
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Commit-Queue: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45994}
2017-06-19 10:18:01 +00:00
Wiktor Garbacz
f4f723e818 [parsing] Fix past the end position for streaming streams.
Also, as this is hard to track down, always DCHECK position after ReadBlock().

Change-Id: Ie32c3a311dd8df91f651b6d82ccacc7c95e6fde0
Reviewed-on: https://chromium-review.googlesource.com/528196
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45811}
2017-06-09 11:35:24 +00:00
Marja Hölttä
4ca7022295 Revert "Reland [parser] Refactor streaming scanner streams."
This reverts commit 7fa071a48b.

Reason for revert: https://bugs.chromium.org/p/chromium/issues/detail?id=729482

Original change's description:
> Reland [parser] Refactor streaming scanner streams.
> 
> Unify, simplify logic, reduce UTF8 specific handling.
> 
> Intend of this is also to have stream views.
> Stream views can be used concurrently by multiple threads, but
> only one thread may fetch new data from the underlying source.
> This together with unified stream view creation is intended to be
> used for parse tasks.
> 
> BUG=v8:6093
> 
> Change-Id: I3bce48185fa2c986d16619a9a8ece3ff4c4f5e60
> Reviewed-on: https://chromium-review.googlesource.com/509489
> Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
> Reviewed-by: Marja Hölttä <marja@chromium.org>
> Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
> Cr-Commit-Position: refs/heads/master@{#45688}

TBR=marja@chromium.org,vogelheim@chromium.org,wiktorg@google.com
# Not skipping CQ checks because original CL landed > 1 day ago.
BUG=v8:6093

Change-Id: Iefa7c43a2f6ae3a7f3ef0f77d87b6ae36ae4be99
Reviewed-on: https://chromium-review.googlesource.com/525712
Reviewed-by: Marja Hölttä <marja@chromium.org>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Commit-Queue: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45725}
2017-06-06 11:42:30 +00:00
Wiktor Garbacz
7fa071a48b Reland [parser] Refactor streaming scanner streams.
Unify, simplify logic, reduce UTF8 specific handling.

Intend of this is also to have stream views.
Stream views can be used concurrently by multiple threads, but
only one thread may fetch new data from the underlying source.
This together with unified stream view creation is intended to be
used for parse tasks.

BUG=v8:6093

Change-Id: I3bce48185fa2c986d16619a9a8ece3ff4c4f5e60
Reviewed-on: https://chromium-review.googlesource.com/509489
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
Cr-Commit-Position: refs/heads/master@{#45688}
2017-06-02 13:50:08 +00:00
Adam Klein
9397c1b73a Revert "[parser] Refactor streaming scanner streams."
This reverts commit ce538f70c1.

Reason for revert: breaks BOM handling (thus breaking Outlook web apps).

Original change's description:
> [parser] Refactor streaming scanner streams.
> 
> Unify, simplify logic, reduce UTF8 specific handling.
> 
> Intend of this is also to have stream views.
> Stream views can be used concurrently by multiple threads, but
> only one thread may fetch new data from the underlying source.
> This together with unified stream view creation is intended to be
> used for parse tasks.
> 
> BUG=v8:6093
> 
> Change-Id: Ied8e93090c506d4735080298f0fdaeed32043915
> Reviewed-on: https://chromium-review.googlesource.com/501789
> Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
> Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
> Reviewed-by: Marja Hölttä <marja@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#45336}

TBR=marja@chromium.org,vogelheim@chromium.org,jochen@chromium.org,wiktorg@google.com
BUG=v8:6093, chromium:724166

Change-Id: I022a23b8052d20d83a640c07b7864c622548bf90
Reviewed-on: https://chromium-review.googlesource.com/508888
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Adam Klein <adamk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45404}
2017-05-18 19:28:58 +00:00
Wiktor Garbacz
ce538f70c1 [parser] Refactor streaming scanner streams.
Unify, simplify logic, reduce UTF8 specific handling.

Intend of this is also to have stream views.
Stream views can be used concurrently by multiple threads, but
only one thread may fetch new data from the underlying source.
This together with unified stream view creation is intended to be
used for parse tasks.

BUG=v8:6093

Change-Id: Ied8e93090c506d4735080298f0fdaeed32043915
Reviewed-on: https://chromium-review.googlesource.com/501789
Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45336}
2017-05-16 11:34:41 +00:00
Wiktor Garbacz
4f5dcd4b4d [parser] Fix chunked utf8 stream handling.
BUG=v8:6377

Change-Id: I5bdd41bdda83d7efe4b37d24d44e2e8c2339a30a
Reviewed-on: https://chromium-review.googlesource.com/500168
Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45204}
2017-05-09 18:41:00 +00:00
bbudge
771e86fdf2 Remove Factory::NewStringFromASCII method.
BUG=none

Review-Url: https://codereview.chromium.org/2759513002
Cr-Commit-Position: refs/heads/master@{#43913}
2017-03-17 17:52:50 +00:00
ishell@chromium.org
32971301ea Rename TypeFeedbackVector to FeedbackVector.
... and TypeFeedbackMetadata to FeedbackMetadata.

BUG=

Change-Id: I2556d1c2a8f37b8cf3d532cc98d973b6dc7e9e6c
Reviewed-on: https://chromium-review.googlesource.com/439244
Commit-Queue: Igor Sheludko <ishell@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Michael Stanton <mvstanton@chromium.org>
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#42999}
2017-02-07 14:46:36 +00:00
vogelheim
10bb974ec3 [scanner] Regression test for Utf-8 BOM handling (crbug.com/685618).
The existing unit test explicitly checked for this case, but was - under
the right circumstances - defeated by the optimization to not re-run the
whole position search if we were close enough.

BUG=chromium:685618

Review-Url: https://codereview.chromium.org/2663883002
Cr-Commit-Position: refs/heads/master@{#42794}
2017-01-30 23:21:03 +00:00
verwaest
ce63eb08f9 [counters] Move waiting for more data from background-parsing into callbacks
BUG=

Review-Url: https://codereview.chromium.org/2549083002
Cr-Commit-Position: refs/heads/master@{#41492}
2016-12-05 15:47:12 +00:00
predrag.rudic
f04a9b4936 Fix 'MIPS: Fix Utf16CharacterStream scanner crash due to missaligned access'
Removed a wrong condition test in  TwoByteExternalBufferedStream. This changed fixes errors that may occur under some conditions.

Review-Url: https://codereview.chromium.org/2469723002
Cr-Commit-Position: refs/heads/master@{#40722}
2016-11-03 12:32:16 +00:00
vogelheim
138127a608 Fix bad-char handling in utf-8 streaming streams. Also add test.
R=jochen@chromium.org
BUG=chromium:651333, v8:4947

Review-Url: https://codereview.chromium.org/2391273002
Cr-Commit-Position: refs/heads/master@{#40004}
2016-10-05 17:18:58 +00:00
vogelheim
a2b8b6e7db Handle Utf-8 BOM at beginning of an Utf-8 stream.
(This should enable to drop the BOM handling in the Blink bindings.)

R=marja@chromium.org
BUG=v8:4947

Review-Url: https://codereview.chromium.org/2354973002
Cr-Commit-Position: refs/heads/master@{#39579}
2016-09-21 08:40:10 +00:00
vogelheim
b36c60cce8 Remove legacy API on Utf16CharacterStream.
BUG=v8:4947

Review-Url: https://codereview.chromium.org/2347883002
Cr-Commit-Position: refs/heads/master@{#39533}
2016-09-20 09:44:00 +00:00
vogelheim
642d6d314c Rework scanner-character-streams.
- Smaller, more consistent streams API (Advance, Back, pos, Seek)
- Remove implementations from the header, in favor of creation functions.

Observe:
- Performance:
  - All Utf16CharacterStream methods have an inlinable V8_LIKELY w/ a
    body of only a few instructions. I expect most calls to end up there.
  - There used to be performance problems w/ bookmarking, particularly
    with copying too much data on SetBookmark w/ UTF-8 streaming streams.
    All those copies are gone.
  - The old streaming streams implementation used to copy data even for
    2-byte input. It no longer does.
  - The only remaining 'slow' method is the Seek(.) slow case for utf-8
    streaming streams. I don't expect this to be called a lot; and even if,
    I expect it to be offset by the gains in the (vastly more frequent)
    calls to the other methods or the 'fast path'.
  - If it still bothers us, there are several ways to speed it up.
- API & code cleanliness:
  - I want to remove the 'old' API in a follow-up CL, which should mostly
    delete code, or replace it 1:1.
  - In a 2nd follow-up I want to delete much of the UTF-8 handling in Blink
    for streaming streams.
  - The "bookmark" is now always implemented (and mostly very fast), so we
    should be able to use it for more things.
- Testing & correctness:
  - The unit tests now cover all stream implementations,
    and are pretty good and triggering all the edge cases.
  - Vastly more DCHECKs of the invariants.

BUG=v8:4947

Review-Url: https://codereview.chromium.org/2314663002
Cr-Commit-Position: refs/heads/master@{#39464}
2016-09-16 08:29:52 +00:00