regexec: Fix off-by-one bug in weight comparison [BZ #23036]

Each weight is prefixed by its length, and the length does not include
itself in the count.  This can be seen clearly from the find_idx
function in string/strxfrm_l.c, for example.  The old code behaved as if
the length itself counted, thus comparing an additional byte after the
weight, leading to spurious comparison failures and incorrect further
partitioning of character equivalence classes.
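
The effect can be reproduced outside glibc with a minimal sketch.  The
table layout below is a hypothetical stand-in for the collation weight
table (a length byte followed by that many weight bytes), and the names
weights_equal_old and weights_equal_new are illustrative only, not part
of regexec.c.  Two entries carry the same weight but are followed by
different unrelated bytes:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical weight table: each entry is a length byte followed by
   that many weight bytes.  The length does not count itself.  */
static const unsigned char weights[] = {
  2, 0x10, 0x20,	/* entry at index 0: weight { 0x10, 0x20 } */
  0xAA,			/* unrelated byte following entry 0 */
  2, 0x10, 0x20,	/* entry at index 4: the same weight */
  0xBB			/* different unrelated byte following entry 4 */
};

/* Mirrors the old loop: it compares weight_len + 1 bytes, so the byte
   after each weight takes part in the comparison.  */
static int
weights_equal_old (const unsigned char *w, size_t a, size_t b)
{
  size_t len = w[a];
  if (len != w[b])
    return 0;
  size_t cnt = 0;
  while (cnt <= len && w[a + 1 + cnt] == w[b + 1 + cnt])
    ++cnt;
  return cnt > len;
}

/* Mirrors the fixed code: exactly weight_len bytes are compared.  */
static int
weights_equal_new (const unsigned char *w, size_t a, size_t b)
{
  size_t len = w[a];
  return len == w[b] && memcmp (w + a + 1, w + b + 1, len) == 0;
}
```

With this table, weights_equal_old (weights, 0, 4) reports a mismatch
because it also compares 0xAA against 0xBB, while weights_equal_new
(weights, 0, 4) reports the two weights as equal.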

(cherry picked from commit 7b2f4cedf0)
Florian Weimer 2018-07-11 16:43:17 +02:00
parent 718445e569
commit 68c1bf8097
3 changed files with 28 additions and 23 deletions


@@ -1,3 +1,10 @@
+2018-07-10 Florian Weimer <fweimer@redhat.com>
+[BZ #23036]
+* posix/regexec.c (check_node_accept_bytes): When comparing
+weights, do not compare an extra byte after the end of the
+weights.
 2018-06-29 Sylvain Lesage <severo@rednegra.net>
 [BZ #22996]

NEWS

@@ -69,6 +69,7 @@ The following bugs are resolved with this release:
 [22947] FAIL: misc/tst-preadvwritev2
 [22963] cs_CZ: Add alternative month names
 [23005] Crash in __res_context_send after memory allocation failure
+[23036] regexec: Fix off-by-one bug in weight comparison
 [23037] initialize msg_flags to zero for sendmmsg() calls
 [23069] sigaction broken on riscv64-linux-gnu
 [23102] Incorrect parsing of consecutive $ variables in runpath entries


@@ -3848,30 +3848,27 @@ check_node_accept_bytes (const re_dfa_t *dfa, int node_idx,
           indirect = (const int32_t *)
             _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB);
           int32_t idx = findidx (table, indirect, extra, &cp, elem_len);
+          int32_t rule = idx >> 24;
+          idx &= 0xffffff;
           if (idx > 0)
-            for (i = 0; i < cset->nequiv_classes; ++i)
-              {
-                int32_t equiv_class_idx = cset->equiv_classes[i];
-                size_t weight_len = weights[idx & 0xffffff];
-                if (weight_len == weights[equiv_class_idx & 0xffffff]
-                    && (idx >> 24) == (equiv_class_idx >> 24))
-                  {
-                    int cnt = 0;
-                    idx &= 0xffffff;
-                    equiv_class_idx &= 0xffffff;
-                    while (cnt <= weight_len
-                           && (weights[equiv_class_idx + 1 + cnt]
-                               == weights[idx + 1 + cnt]))
-                      ++cnt;
-                    if (cnt > weight_len)
-                      {
-                        match_len = elem_len;
-                        goto check_node_accept_bytes_match;
-                      }
-                  }
-              }
+            {
+              size_t weight_len = weights[idx];
+              for (i = 0; i < cset->nequiv_classes; ++i)
+                {
+                  int32_t equiv_class_idx = cset->equiv_classes[i];
+                  int32_t equiv_class_rule = equiv_class_idx >> 24;
+                  equiv_class_idx &= 0xffffff;
+                  if (weights[equiv_class_idx] == weight_len
+                      && equiv_class_rule == rule
+                      && memcmp (weights + idx + 1,
+                                 weights + equiv_class_idx + 1,
+                                 weight_len) == 0)
+                    {
+                      match_len = elem_len;
+                      goto check_node_accept_bytes_match;
+                    }
+                }
+            }
         }
     }
   else