Commit Graph

33 Commits

Author SHA1 Message Date
vogelheim
138127a608 Fix bad-char handling in utf-8 streaming streams. Also add test.
R=jochen@chromium.org
BUG=chromium:651333, v8:4947

Review-Url: https://codereview.chromium.org/2391273002
Cr-Commit-Position: refs/heads/master@{#40004}
2016-10-05 17:18:58 +00:00
vogelheim
642d6d314c Rework scanner-character-streams.
- Smaller, more consistent streams API (Advance, Back, pos, Seek)
- Remove implementations from the header, in favor of creation functions.

Observe:
- Performance:
  - All Utf16CharacterStream methods have an inlinable V8_LIKELY w/ a
    body of only a few instructions. I expect most calls to end up there.
  - There used to be performance problems w/ bookmarking, particularly
    with copying too much data on SetBookmark w/ UTF-8 streaming streams.
    All those copies are gone.
  - The old streaming streams implementation used to copy data even for
    2-byte input. It no longer does.
  - The only remaining 'slow' method is the Seek(.) slow case for utf-8
    streaming streams. I don't expect this to be called a lot; and even if,
    I expect it to be offset by the gains in the (vastly more frequent)
    calls to the other methods or the 'fast path'.
  - If it still bothers us, there are several ways to speed it up.
- API & code cleanliness:
  - I want to remove the 'old' API in a follow-up CL, which should mostly
    delete code, or replace it 1:1.
  - In a 2nd follow-up I want to delete much of the UTF-8 handling in Blink
    for streaming streams.
  - The "bookmark" is now always implemented (and mostly very fast), so we
    should be able to use it for more things.
- Testing & correctness:
  - The unit tests now cover all stream implementations,
    and are pretty good and triggering all the edge cases.
  - Vastly more DCHECKs of the invariants.

BUG=v8:4947

Review-Url: https://codereview.chromium.org/2314663002
Cr-Commit-Position: refs/heads/master@{#39464}
2016-09-16 08:29:52 +00:00
clemensh
f0523e3046 [wasm] Add UTF-8 validation
Names passed for imports and exports are checked during decoding,
leading to errors if they are no valid UTF-8. Function names are not
checked during decode, but rather lead to undefined being returned at
runtime if they are not UTF-8.

We need to do these checks on the Wasm side, since the factory
methods assume to get valid UTF-8 strings.

R=titzer@chromium.org, yangguo@chromium.org

Review-Url: https://codereview.chromium.org/1967023004
Cr-Commit-Position: refs/heads/master@{#36208}
2016-05-12 13:02:14 +00:00
mstarzinger
92e85aed10 [presubmit] Fix build/include linter violations.
R=bmeurer@chromium.org

Review URL: https://codereview.chromium.org/1318863004

Cr-Commit-Position: refs/heads/master@{#30554}
2015-09-03 07:56:14 +00:00
jochen
3d5b2f807b Update UTF-8 decoder to detect more special cases.
The blink version is stricter and for parsing it's important that both
decoders behave the same.

BUG=chromium:489944
R=vogelheim@chromium.org
LOG=n

Review URL: https://codereview.chromium.org/1148653007

Cr-Commit-Position: refs/heads/master@{#28601}
2015-05-22 18:47:16 +00:00
marja
0e3b5386ae Scanner / Unicode decoding: use size_t instead of unsigned.
size_t is the correct data type for this purpose. Our APIs (in particular
ExternalSourceStream::GetMoreData) are already using it, and there were some
static_casts to convert between them.

This CL doesn't intend to fix all of V8, just the minimal sense-making part
around scanner character streams.

BUG=

Review URL: https://codereview.chromium.org/864273005

Cr-Commit-Position: refs/heads/master@{#26449}
2015-02-05 07:54:34 +00:00
yangguo@chromium.org
8659e50723 Update unicode to 7.0.0.
And do not use code points with PATTERN_* property for identifier start.
Maintain that \u180E is a white space character.

BUG=v8:2892
LOG=Y
R=dpino@igalia.com, mathias@qiwi.be

Review URL: https://codereview.chromium.org/638643002

git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24473 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-10-08 14:55:03 +00:00
bmeurer@chromium.org
d07a2eb806 Rename ASSERT* to DCHECK*.
This way we don't clash with the ASSERT* macros
defined by GoogleTest, and we are one step closer
to being able to replace our homegrown base/ with
base/ from Chrome.

R=jochen@chromium.org, svenpanne@chromium.org

Review URL: https://codereview.chromium.org/430503007

git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@22812 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-08-04 11:34:54 +00:00
mstarzinger@chromium.org
fec6e62dfb Check alpha-sorting of includes during presubmit.
R=rossberg@chromium.org

Review URL: https://codereview.chromium.org/333013002

git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@21894 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-06-20 08:40:11 +00:00
jochen@chromium.org
56a486c322 Use full include paths everywhere
- this avoids using relative include paths which are forbidden by the style guide
- makes the code more readable since it's clear which header is meant
- allows for starting to use checkdeps

BUG=none
R=jkummerow@chromium.org, danno@chromium.org
LOG=n

Review URL: https://codereview.chromium.org/304153016

git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@21625 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-06-03 08:12:43 +00:00
bmeurer@chromium.org
d4b533d41b Bulk update of Google copyright headers in source files.
R=svenpanne@chromium.org

Review URL: https://codereview.chromium.org/259183002

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@21035 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-04-29 06:42:26 +00:00
yangguo@chromium.org
b618d2a42a Fix inconsistencies wrt whitespaces.
This relands r19196 with fixes.

BUG=v8:3109
LOG=Y
R=mstarzinger@chromium.org

Review URL: https://codereview.chromium.org/141323007

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@19222 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-02-10 12:43:10 +00:00
yangguo@chromium.org
02674ee414 Keep two empty lines between declarations for cpp files
R=yangguo@chromium.org

Review URL: https://codereview.chromium.org/18509003

Patch from Haitao Feng <haitao.feng@intel.com>.

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15510 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2013-07-05 09:52:11 +00:00
yangguo@chromium.org
04ccb975f4 Remove InputBuffer
R=yangguo@chromium.org
BUG=

Review URL: https://chromiumcodereview.appspot.com/11727004
Patch from Dan Carney <dcarney@google.com>.

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13298 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2013-01-03 09:18:01 +00:00
yangguo@chromium.org
eedcaf1866 Remove Utf8InputBuffer
R=yangguo@chromium.org
BUG=

Review URL: https://chromiumcodereview.appspot.com/11649018
Patch from Dan Carney <dcarney@google.com>.

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13248 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-12-20 09:20:37 +00:00
erik.corry@gmail.com
03cfc4363b Fix input and output to handle UTF16 surrogate pairs.
Review URL: https://chromiumcodereview.appspot.com/9600009

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11007 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-03-12 12:35:28 +00:00
mstarzinger@chromium.org
2a2ed5004f Update unicode tables to version 6.1.0.
R=erik.corry@gmail.com
BUG=v8:1965

Review URL: https://chromiumcodereview.appspot.com/9615005

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10933 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-03-06 09:43:12 +00:00
erik.corry@gmail.com
70da367f6b More spelling changes.
Review URL: http://codereview.chromium.org/9231009

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10407 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-01-16 12:38:59 +00:00
vitalyr@chromium.org
7976ca2cbc Merge isolates to bleeding_edge.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7271 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-03-18 20:35:07 +00:00
vitalyr@chromium.org
76e226f832 Revert r7268: it borked the history.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7269 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-03-18 19:41:05 +00:00
vitalyr@chromium.org
6ff7fdebd3 Merge isolates to bleeding_edge.
Review URL: http://codereview.chromium.org/6685088

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7268 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-03-18 18:49:56 +00:00
lrn@chromium.org
b9bd4952a7 Changed uncast -1 in unsigned context to use constant kSentinel.
Review URL: http://codereview.chromium.org/5993006

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@6133 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-01-03 10:28:39 +00:00
lrn@chromium.org
66574f31de Unicode: Reduced size of tables.
Review URL: http://codereview.chromium.org/3043032

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5161 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-07-30 12:59:57 +00:00
lrn@chromium.org
1d24f5f56b Updated unicode library.
Added Nl category to letters predicate (as requried for JS identifiers).
Changed/simplified representation of canonicalization ranges.
Truncated tables to code points in the BMP (all that is used by JS).
Reformatted tables to avoid excessively long lines.
Removed duplicate entries from multi-character mapping result tables.

Review URL: http://codereview.chromium.org/3030026

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5155 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-07-30 07:10:22 +00:00
deanm@chromium.org
e9f42cde46 Small cleanup to Utf8::CalculateValue:
- Don't duplicate kMaxXByteChar constants.
  - Don't compare signed and unsigned integers.

Review URL: http://codereview.chromium.org/155414


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2434 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-07-13 11:17:51 +00:00
erik.corry@gmail.com
2d4dd93bdd Misc. portability fixes.
Review URL: http://codereview.chromium.org/42337

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1538 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-18 15:20:26 +00:00
christian.plesner.hansen@gmail.com
144c8c790a Fixed problem where the two lower-case sigmas would uncanonicalize to
themselves and upper-case sigma, but upper-case sigma would
uncanonicalize to just lower-case final sigma.


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@844 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-11-26 06:05:07 +00:00
christian.plesner.hansen@gmail.com
b57b4a15cd Merge regexp2000 back into bleeding_edge
Review URL: http://codereview.chromium.org/12427

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@832 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-11-25 11:07:48 +00:00
christian.plesner.hansen@gmail.com
06fa6d1cde - Case-sensitive atomic regular expressions now use the same code as
String.indexOf to do matching.
- The --log option is no longer automatically enabled by the other log
  options.


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@413 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-10-02 15:35:28 +00:00
christian.plesner.hansen@gmail.com
9bed566bdb Changed copyright header from google inc. to v8 project authors.
Added presubmit step to check copyright.



git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@242 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-09-09 20:08:45 +00:00
christian.plesner.hansen@gmail.com
3351499cb5 Fixed problem where asian characters were not categorized as letters
because they were defined using different syntax in the unicode
database.



git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@200 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-09-08 10:45:01 +00:00
mads.s.ager@gmail.com
dceb5f6a8f Improved test support.
Fixed issue with building samples and cctests on 64-bit machines.



git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@23 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-08-28 09:55:41 +00:00
christian.plesner.hansen
43d26ecc35 Initial export.
git-svn-id: http://v8.googlecode.com/svn/trunk@2 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-07-03 15:10:15 +00:00