Commit Graph

2 Commits

Author SHA1 Message Date
Iain Ireland
17f8f12bcb [regexp] Document incorrect assertion in intl/regress-10573.js
There are at least three equivalence classes where this assertion
should not actually hold:

  '\u0390\u1fd3',              // ΐΐ
  '\u03b0\u1fe3',              // ΰΰ
  '\ufb05\ufb06',              // ſtst

Bug: v8:10591
Change-Id: I26cb43d2e67c54e689f1831ea13be46c73d5e92d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2231595
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68246}
2020-06-09 06:27:27 +00:00
Iain Ireland
b65fcfe925 [regexp] Fix non-unicode ignore-case backreferences
https://crrev.com/c/2072858 rewrote the implementation of non-unicode
ignore-case matches to comply with the JS spec in some corner
cases. It fixed character matches and character class matches.

We missed a similar bug in the implementation of back references. This
CL fixes that bug.

The main change is in regexp-macro-assembler.cc, where
CaseInsensitiveCompareUC16 is split into CaseInsensitiveCompareUnicode
(which has the same semantics as before) and
CaseInsensitiveCompareNonUnicode (which has the semantics described
here: https://tc39.es/ecma262/#sec-runtime-semantics-canonicalize-ch).

Most of the rest of the patch undoes https://crrev.com/c/2081816 to
once again make the unicode flag available to the macroassembler, so
that we can decide which helper function to call.

The testcase is a version of test/intl/regress-10248.js, modified to
test backreferences.

Bug: v8:10573
Change-Id: I70ef7d134d37f99b1f75a5eba17020e82d59f1b9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219284
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68129}
2020-06-03 08:59:08 +00:00