vogelheim
138127a608
Fix bad-char handling in utf-8 streaming streams. Also add test.
...
R=jochen@chromium.org
BUG=chromium:651333, v8:4947
Review-Url: https://codereview.chromium.org/2391273002
Cr-Commit-Position: refs/heads/master@{#40004}
2016-10-05 17:18:58 +00:00
vogelheim
642d6d314c
Rework scanner-character-streams.
...
- Smaller, more consistent streams API (Advance, Back, pos, Seek)
- Remove implementations from the header, in favor of creation functions.
Observe:
- Performance:
- All Utf16CharacterStream methods have an inlinable V8_LIKELY w/ a
body of only a few instructions. I expect most calls to end up there.
- There used to be performance problems w/ bookmarking, particularly
with copying too much data on SetBookmark w/ UTF-8 streaming streams.
All those copies are gone.
- The old streaming streams implementation used to copy data even for
2-byte input. It no longer does.
- The only remaining 'slow' method is the Seek(.) slow case for utf-8
streaming streams. I don't expect this to be called a lot; and even if it is,
I expect it to be offset by the gains in the (vastly more frequent)
calls to the other methods or the 'fast path'.
- If it still bothers us, there are several ways to speed it up.
- API & code cleanliness:
- I want to remove the 'old' API in a follow-up CL, which should mostly
delete code, or replace it 1:1.
- In a 2nd follow-up I want to delete much of the UTF-8 handling in Blink
for streaming streams.
- The "bookmark" is now always implemented (and mostly very fast), so we
should be able to use it for more things.
- Testing & correctness:
- The unit tests now cover all stream implementations,
and are pretty good at triggering all the edge cases.
- Vastly more DCHECKs of the invariants.
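
Illustration only, not the code from this CL: a toy Python sketch of the API shape described above (Advance, Back, pos, Seek), with a cheap fast path when the target is already buffered and a slow path that refills the buffer. The class name, the END_OF_INPUT value and the chunk-list input are assumptions made for this sketch.

```python
END_OF_INPUT = -1  # hypothetical sentinel for this sketch, not V8's constant


class ChunkedCharStream:
    """Toy character stream over a list of string chunks (illustrative only)."""

    def __init__(self, chunks):
        self._chunks = [c for c in chunks if c]  # drop empty chunks
        self._starts = []  # absolute start position of each chunk
        total = 0
        for chunk in self._chunks:
            self._starts.append(total)
            total += len(chunk)
        self._length = total
        self._buf = ""       # currently buffered chunk
        self._buf_start = 0  # absolute position of _buf[0]
        self._cursor = 0     # index into _buf

    def pos(self):
        return self._buf_start + self._cursor

    def advance(self):
        # Fast path: the next character is already buffered.
        # Slow path: refill the buffer at the current position first.
        if self._cursor < len(self._buf) or self._refill(self.pos()):
            ch = self._buf[self._cursor]
            self._cursor += 1
            return ord(ch)
        return END_OF_INPUT  # position does not move at end of input

    def back(self):
        if self._cursor > 0:
            self._cursor -= 1          # fast path: stay inside the buffer
        elif self.pos() > 0:
            self.seek(self.pos() - 1)  # slow path: previous chunk

    def seek(self, pos):
        pos = max(0, min(pos, self._length))
        if self._buf_start <= pos < self._buf_start + len(self._buf):
            self._cursor = pos - self._buf_start  # fast path
        elif not self._refill(pos):
            # Only reachable when seeking to end of input.
            self._buf, self._buf_start, self._cursor = "", self._length, 0

    def _refill(self, pos):
        # Slow path: linear search for the chunk containing pos.
        for start, chunk in zip(self._starts, self._chunks):
            if start <= pos < start + len(chunk):
                self._buf, self._buf_start = chunk, start
                self._cursor = pos - start
                return True
        return False
```

The sketch switches buffers without copying chunk contents, which is the property the bookmarking notes above care about. A real UTF-8 stream would additionally have to map character positions to byte offsets in the slow path.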
BUG=v8:4947
Review-Url: https://codereview.chromium.org/2314663002
Cr-Commit-Position: refs/heads/master@{#39464}
2016-09-16 08:29:52 +00:00
clemensh
f0523e3046
[wasm] Add UTF-8 validation
...
Names passed for imports and exports are checked during decoding,
leading to errors if they are not valid UTF-8. Function names are not
checked during decoding, but rather lead to undefined being returned at
runtime if they are not valid UTF-8.
We need to do these checks on the Wasm side, since the factory
methods assume that they receive valid UTF-8 strings.
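
Illustration only, not the code from this CL: a small Python sketch of the two policies described above, failing at decode time for import/export names and returning a "no name" value at lookup time for function names. Python's strict UTF-8 decoder stands in for the validator, and all function and class names here are assumptions made for this sketch.

```python
class DecodeError(Exception):
    """Hypothetical error type raised while decoding a module."""


def is_valid_utf8(data: bytes) -> bool:
    # Python's strict decoder rejects overlong forms, encoded surrogates
    # and code points above U+10FFFF.
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def decode_import_name(raw: bytes) -> str:
    # Import/export names: invalid UTF-8 is an error during decoding.
    if not is_valid_utf8(raw):
        raise DecodeError("import name is not valid UTF-8")
    return raw.decode("utf-8")


def function_name_or_none(raw: bytes):
    # Function names: not checked up front; an invalid name just yields
    # None at lookup time (standing in for JavaScript's undefined).
    return raw.decode("utf-8") if is_valid_utf8(raw) else None
```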
R=titzer@chromium.org , yangguo@chromium.org
Review-Url: https://codereview.chromium.org/1967023004
Cr-Commit-Position: refs/heads/master@{#36208}
2016-05-12 13:02:14 +00:00
mstarzinger
92e85aed10
[presubmit] Fix build/include linter violations.
...
R=bmeurer@chromium.org
Review URL: https://codereview.chromium.org/1318863004
Cr-Commit-Position: refs/heads/master@{#30554}
2015-09-03 07:56:14 +00:00
jochen
3d5b2f807b
Update UTF-8 decoder to detect more special cases.
...
The blink version is stricter and for parsing it's important that both
decoders behave the same.
BUG=chromium:489944
R=vogelheim@chromium.org
LOG=n
Review URL: https://codereview.chromium.org/1148653007
Cr-Commit-Position: refs/heads/master@{#28601}
2015-05-22 18:47:16 +00:00
marja
0e3b5386ae
Scanner / Unicode decoding: use size_t instead of unsigned.
...
size_t is the correct data type for this purpose. Our APIs (in particular
ExternalSourceStream::GetMoreData) are already using it, and there were some
static_casts to convert between them.
This CL doesn't intend to fix all of V8, just the minimal sensible part
around scanner character streams.
BUG=
Review URL: https://codereview.chromium.org/864273005
Cr-Commit-Position: refs/heads/master@{#26449}
2015-02-05 07:54:34 +00:00
yangguo@chromium.org
8659e50723
Update Unicode to 7.0.0.
...
And do not use code points with PATTERN_* property for identifier start.
Maintain that \u180E is a white space character.
BUG=v8:2892
LOG=Y
R=dpino@igalia.com , mathias@qiwi.be
Review URL: https://codereview.chromium.org/638643002
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24473 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-10-08 14:55:03 +00:00
bmeurer@chromium.org
d07a2eb806
Rename ASSERT* to DCHECK*.
...
This way we don't clash with the ASSERT* macros
defined by GoogleTest, and we are one step closer
to being able to replace our homegrown base/ with
base/ from Chrome.
R=jochen@chromium.org , svenpanne@chromium.org
Review URL: https://codereview.chromium.org/430503007
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@22812 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-08-04 11:34:54 +00:00
mstarzinger@chromium.org
fec6e62dfb
Check alpha-sorting of includes during presubmit.
...
R=rossberg@chromium.org
Review URL: https://codereview.chromium.org/333013002
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@21894 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-06-20 08:40:11 +00:00
jochen@chromium.org
56a486c322
Use full include paths everywhere
...
- this avoids using relative include paths which are forbidden by the style guide
- makes the code more readable since it's clear which header is meant
- allows for starting to use checkdeps
BUG=none
R=jkummerow@chromium.org , danno@chromium.org
LOG=n
Review URL: https://codereview.chromium.org/304153016
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@21625 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-06-03 08:12:43 +00:00
bmeurer@chromium.org
d4b533d41b
Bulk update of Google copyright headers in source files.
...
R=svenpanne@chromium.org
Review URL: https://codereview.chromium.org/259183002
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@21035 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-04-29 06:42:26 +00:00
yangguo@chromium.org
b618d2a42a
Fix inconsistencies wrt whitespaces.
...
This relands r19196 with fixes.
BUG=v8:3109
LOG=Y
R=mstarzinger@chromium.org
Review URL: https://codereview.chromium.org/141323007
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@19222 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-02-10 12:43:10 +00:00
yangguo@chromium.org
02674ee414
Keep two empty lines between declarations for cpp files
...
R=yangguo@chromium.org
Review URL: https://codereview.chromium.org/18509003
Patch from Haitao Feng <haitao.feng@intel.com>.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15510 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2013-07-05 09:52:11 +00:00
yangguo@chromium.org
04ccb975f4
Remove InputBuffer
...
R=yangguo@chromium.org
BUG=
Review URL: https://chromiumcodereview.appspot.com/11727004
Patch from Dan Carney <dcarney@google.com>.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13298 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2013-01-03 09:18:01 +00:00
yangguo@chromium.org
eedcaf1866
Remove Utf8InputBuffer
...
R=yangguo@chromium.org
BUG=
Review URL: https://chromiumcodereview.appspot.com/11649018
Patch from Dan Carney <dcarney@google.com>.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13248 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-12-20 09:20:37 +00:00
erik.corry@gmail.com
03cfc4363b
Fix input and output to handle UTF16 surrogate pairs.
...
Review URL: https://chromiumcodereview.appspot.com/9600009
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11007 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-03-12 12:35:28 +00:00
mstarzinger@chromium.org
2a2ed5004f
Update Unicode tables to version 6.1.0.
...
R=erik.corry@gmail.com
BUG=v8:1965
Review URL: https://chromiumcodereview.appspot.com/9615005
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10933 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-03-06 09:43:12 +00:00
erik.corry@gmail.com
70da367f6b
More spelling changes.
...
Review URL: http://codereview.chromium.org/9231009
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10407 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-01-16 12:38:59 +00:00
vitalyr@chromium.org
7976ca2cbc
Merge isolates to bleeding_edge.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7271 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-03-18 20:35:07 +00:00
vitalyr@chromium.org
76e226f832
Revert r7268: it borked the history.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7269 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-03-18 19:41:05 +00:00
vitalyr@chromium.org
6ff7fdebd3
Merge isolates to bleeding_edge.
...
Review URL: http://codereview.chromium.org/6685088
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7268 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-03-18 18:49:56 +00:00
lrn@chromium.org
b9bd4952a7
Changed uncast -1 in unsigned context to use constant kSentinel.
...
Review URL: http://codereview.chromium.org/5993006
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@6133 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-01-03 10:28:39 +00:00
lrn@chromium.org
66574f31de
Unicode: Reduced size of tables.
...
Review URL: http://codereview.chromium.org/3043032
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5161 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-07-30 12:59:57 +00:00
lrn@chromium.org
1d24f5f56b
Updated unicode library.
...
Added Nl category to letters predicate (as required for JS identifiers).
Changed/simplified representation of canonicalization ranges.
Truncated tables to code points in the BMP (all that is used by JS).
Reformatted tables to avoid excessively long lines.
Removed duplicate entries from multi-character mapping result tables.
Review URL: http://codereview.chromium.org/3030026
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5155 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-07-30 07:10:22 +00:00
deanm@chromium.org
e9f42cde46
Small cleanup to Utf8::CalculateValue:
...
- Don't duplicate kMaxXByteChar constants.
- Don't compare signed and unsigned integers.
Review URL: http://codereview.chromium.org/155414
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2434 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-07-13 11:17:51 +00:00
erik.corry@gmail.com
2d4dd93bdd
Misc. portability fixes.
...
Review URL: http://codereview.chromium.org/42337
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1538 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-18 15:20:26 +00:00
christian.plesner.hansen@gmail.com
144c8c790a
Fixed problem where the two lower-case sigmas would uncanonicalize to
themselves and upper-case sigma, but upper-case sigma would
uncanonicalize to just lower-case final sigma.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@844 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-11-26 06:05:07 +00:00
christian.plesner.hansen@gmail.com
b57b4a15cd
Merge regexp2000 back into bleeding_edge
...
Review URL: http://codereview.chromium.org/12427
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@832 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-11-25 11:07:48 +00:00
christian.plesner.hansen@gmail.com
06fa6d1cde
- Case-sensitive atomic regular expressions now use the same code as
String.indexOf to do matching.
- The --log option is no longer automatically enabled by the other log
options.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@413 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-10-02 15:35:28 +00:00
christian.plesner.hansen@gmail.com
9bed566bdb
Changed copyright header from Google Inc. to the V8 project authors.
...
Added presubmit step to check copyright.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@242 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-09-09 20:08:45 +00:00
christian.plesner.hansen@gmail.com
3351499cb5
Fixed problem where Asian characters were not categorized as letters
because they were defined using different syntax in the unicode
database.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@200 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-09-08 10:45:01 +00:00
mads.s.ager@gmail.com
dceb5f6a8f
Improved test support.
...
Fixed issue with building samples and cctests on 64-bit machines.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@23 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-08-28 09:55:41 +00:00
christian.plesner.hansen
43d26ecc35
Initial export.
...
git-svn-id: http://v8.googlecode.com/svn/trunk@2 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-07-03 15:10:15 +00:00