size_t is the correct data type for this purpose. Our APIs (in particular
ExternalSourceStream::GetMoreData) are already using it, and there were some
static_casts to convert between them.
This CL doesn't intend to fix all of V8, just the minimal sense-making part
around scanner character streams.
BUG=
Review URL: https://codereview.chromium.org/864273005
Cr-Commit-Position: refs/heads/master@{#26449}
1) Since we fill the output buffer both from the chunks and the conversion
buffer, it's possible that we run out of space and call CopyCharsHelper with 0
length. The underlying functions don't handle it gracefully, so check there.
2) There was a bug where we used to try to copy too many characters from the
beginning of the data chunk into the conversion buffer. Continuation bytes in
UTF-8 are of the form 0b10XXXXXX. If a byte is bigger than that, it's the first
byte of a new UTF-8 character and we should ignore it.
These two together (or maybe in combination with surrogates) are a probable
reason for crbug.com/420932.
3) The test data was off; \uc481 is \xec\x92\x81.
BUG=420932
LOG=N
R=yangguo@chromium.org
Review URL: https://codereview.chromium.org/662003003
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24725 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The problem was that there can be several multi-byte UTF-8 characters near the
splitting point of the data chunks, and the code didn't handle it properly.
This was also the source of crbug.com/417891 - I thought the crash can only
happen when V8 is passed invalid UTF-8 data, but it can also happen in the
abovementioned case. After the fix, we handle the valid UTF-8 case and also
guard against invalid UTF-8 data.
R=yangguo@chromium.org
BUG=chromium:417891
LOG=N
Review URL: https://codereview.chromium.org/654503002
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24547 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
1) Call DeserializeScopeChain only if it's going to do something
non-trivial. And we only need to internalize the AstValueFactory in those cases.
2) BufferedUtf16CharacterStream::FillBuffer doesn't need the length
argument. The length is always kBufferSize and the subclasses can just read it
(it's protected).
R=rossberg@chromium.org
BUG=
Review URL: https://codereview.chromium.org/381613003
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@22307 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- GenericStringUtf16CharacterStream::start_position_ was unused.
- GenericStringUtf16CharacterStream inherits from BufferedUtf16CharacterStream,
so no need to initialize buffer_cursor_ and buffer_end_ twice (this makes it
clearer which class in the inheritance chain takes care of which variables).
R=yangguo@chromium.org
BUG=
Review URL: https://codereview.chromium.org/216523004
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@20334 ce2b1a6d-e550-0410-aec6-3dcde31c8c00