v8/regress-1275096.js at 97eff73b71a99ff6a8f97ea7ee8c15a058b96eaf - v8 - Gitea: Git with a cup of tea

AuroraMiddleware/v8

Jakob Gruber 2e17aaca2a [regexp] Fix CharacterRange limits again again again

When emitting code, character ranges must only specify ranges which
the actual subject string (one- or two-byte) may contain.

This was not always the case, specifically for ranges with
`from <= kMaxUint8` and `to > kMaxUint8`.

The reason this is so tricky: 1. not all parts of the pipeline know
whether we are compiling for one- or two-byte subjects; 2. for
case-insensitive regexps, an out-of-bounds CharacterRange may have an
in-bounds case equivalent (e.g. /[Ÿ]/i also matches 'ÿ' == \u{ff}),
which only gets added somewhere in the middle of the pipeline.

Our current solution is to clamp immediately before code emission. We
also keep the existing handling/dchecks of the 0x10ffff marker value
which may occur in the two-byte subject case.

Bug: v8:11069
Change-Id: Ic7b34a13a900ea2aa3df032daac9236bf5682a42
Fixed: chromium:1275096
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3306569
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78186}

2021-12-01 15:13:09 +00:00

8 lines

264 B

JavaScript

Raw Blame History

 // Copyright 2021 the V8 project authors. All rights reserved.
 // Use of this source code is governed by a BSD-style license that can be
 // found in the LICENSE file.
 // Flags: --no-regexp-tier-up
 /[\u{0}zPudf\u{d3}-\ud809\udccc]/iu.exec("");  // Don't crash :)