v8/test/mjsunit/regress/regress-1275096.js
Jakob Gruber 2e17aaca2a [regexp] Fix CharacterRange limits again again again
When emitting code, character ranges must only specify ranges which
the actual subject string (one- or two-byte) may contain.

This was not always the case, specifically for ranges with
`from <= kMaxUint8` and `to > kMaxUint8`.

The reason this is so tricky: 1. not all parts of the pipeline know
whether we are compiling for one- or two-byte subjects; 2. for
case-insensitive regexps, an out-of-bounds CharacterRange may have an
in-bounds case equivalent (e.g. /[Ÿ]/i also matches 'ÿ' == \u{ff}),
which only gets added somewhere in the middle of the pipeline.

Our current solution is to clamp immediately before code emission. We
also keep the existing handling/dchecks of the 0x10ffff marker value
which may occur in the two-byte subject case.

Bug: v8:11069
Change-Id: Ic7b34a13a900ea2aa3df032daac9236bf5682a42
Fixed: chromium:1275096
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3306569
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78186}
2021-12-01 15:13:09 +00:00

8 lines
264 B
JavaScript

// Copyright 2021 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// Flags: --no-regexp-tier-up
/[\u{0}zPudf\u{d3}-\ud809\udccc]/iu.exec(""); // Don't crash :)