The optimization introduced in commit f13c2a8dff causes regressions in
sorting for languages whose digraphs change the sort order, such as
cs_CZ, which sorts ch between h and i.
My analysis shows that the fast-forwarding optimization in STRCOLL can
advance partway through a digraph and stop in the middle of it. The
digraph is then skipped, and the strings sort incorrectly. The
optimization is incorrect as implemented, so I'm removing it for 2.23,
and I will also commit this fix for 2.22, where it was originally
introduced.
This patch reverts the optimization and introduces a new bug-strcoll2.c
regression test that checks both cs_CZ.UTF-8 and da_DK.ISO-8859-1 and
ensures that each sorts one digraph correctly. The optimization can't be
reapplied without regressing this test.
Checked on x86_64: bug-strcoll2.c fails without this patch and passes
with it.
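The kind of check described above can be sketched as follows. This is not the contents of bug-strcoll2.c; the helper name sorts_between is hypothetical, and the sketch assumes only the standard setlocale and strcoll interfaces. It reports whether a string collates strictly between two others in a given locale, and returns -1 when the locale cannot be selected.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper, not taken from bug-strcoll2.c.
   Return 1 if LOW < MID < HIGH according to strcoll in LOCALE_NAME,
   0 if that ordering does not hold, and -1 if the locale (or the
   current LC_COLLATE setting) cannot be used.  */
static int
sorts_between (const char *locale_name, const char *low,
               const char *mid, const char *high)
{
  char saved[256];
  const char *current = setlocale (LC_COLLATE, NULL);

  /* Remember the current collation locale so it can be restored.  */
  if (current == NULL || strlen (current) >= sizeof saved)
    return -1;
  strcpy (saved, current);

  if (setlocale (LC_COLLATE, locale_name) == NULL)
    return -1;

  int ok = strcoll (low, mid) < 0 && strcoll (mid, high) < 0;

  setlocale (LC_COLLATE, saved);
  return ok;
}
```

On a system with the cs_CZ.UTF-8 locale installed and a correct strcoll, sorts_between ("cs_CZ.UTF-8", "h", "ch", "i") would be expected to return 1. In the C locale the same call returns 0, because the comparison is bytewise and "ch" sorts before "h".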
If a locale has no 8-bit characters whose case conversion differs from
the ASCII conversion (±0x20), then we can perform some optimizations.
These will follow later.
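As a rough illustration of the idea, and not the regex code itself, the sketch below upper-cases one byte. The function name upper_byte is hypothetical, and its map_notascii parameter only mirrors the flag named in the ChangeLog: when it is zero, ASCII letters are converted by subtracting 0x20 without consulting the locale.

```c
#include <ctype.h>

/* Hypothetical sketch.  Upper-case the byte C.  If MAP_NOTASCII is
   zero, no case mapping in the locale crosses between ASCII and
   non-ASCII, so ASCII bytes can take the fixed 0x20 shortcut.
   Otherwise fall back on the locale-aware toupper.  */
static unsigned char
upper_byte (unsigned char c, int map_notascii)
{
  if (!map_notascii && c < 0x80)
    return (c >= 'a' && c <= 'z') ? (unsigned char) (c - 0x20) : c;

  return (unsigned char) toupper (c);
}
```

The shortcut avoids a table lookup or multibyte conversion for the common ASCII case; bytes of 0x80 and above still go through the general path.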
* locale/langinfo.h (_NL_LOCALE_NAME): New macro.
[__USE_GNU] (NL_LOCALE_NAME): New macro.
* locale/nl_langinfo.c: Grok special item value for _NL_LOCALE_NAME,
return locale name string for the category.
* posix/regex_internal.h: Add forward declaration of re_dfa_t.
Replace last two parameters of re_string_allocate and
re_string_construct with pointer to DFA.
(re_dfa_t): Add map_notascii field.
* posix/regcomp.c (re_compile_internal): Add call of
re_string_construct.
(init_dfa): Initialize map_notascii.
* posix/regex_internal.c: Adjust definitions of re_string_allocate
and re_string_construct.
Pass DFA to re_string_construct. Adjust definition. Initialize
map_notascii field.
(build_wcs_upper_buffer): If map_notascii is zero use simplified
method to map ASCII values to upper case.
* posix/regex.c: Include localeinfo.h.
* posix/regexec.c: Adjust call of re_string_allocate.
* locale/langinfo.h: Add _NL_CTYPE_MAP_TO_NONASCII.
* locale/localeinfo.h (LIMAGIC): Change value.
* locale/categories.def: Add entry for _NL_CTYPE_MAP_TO_NONASCII.
* locale/C-ctype.h: Likewise.
* locale/programs/ld-ctype.c: Compute whether any mapping maps from
ASCII to non-ASCII value. Write out that value.
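The NL_LOCALE_NAME macro added in the entry above can be used with nl_langinfo to ask which locale is active for one category. The following is a minimal sketch that assumes a glibc system; the helper name category_locale_is is hypothetical, and the fallback definition only covers the case where _GNU_SOURCE was not in effect when <langinfo.h> was first included.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1   /* NL_LOCALE_NAME is a GNU extension.  */
#endif
#include <langinfo.h>
#include <locale.h>
#include <string.h>

#ifndef NL_LOCALE_NAME
/* Assumption: fall back on the underlying glibc macro.  */
# define NL_LOCALE_NAME(category) _NL_LOCALE_NAME (category)
#endif

/* Hypothetical helper.  Return nonzero if the locale currently
   selected for CATEGORY (LC_CTYPE, LC_COLLATE, ...) is named
   EXPECTED.  */
static int
category_locale_is (int category, const char *expected)
{
  const char *name = nl_langinfo (NL_LOCALE_NAME (category));
  return strcmp (name, expected) == 0;
}
```

A program that has not called setlocale runs in the "C" locale, so category_locale_is (LC_CTYPE, "C") is true at startup.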
2001-07-06 Paul Eggert <eggert@twinsun.com>
* manual/argp.texi: Remove ignored LGPL copyright notice; it's
not appropriate for documentation anyway.
* manual/libc-texinfo.sh: "Library General Public License" ->
"Lesser General Public License".
2001-07-06 Andreas Jaeger <aj@suse.de>
* All files under GPL/LGPL version 2: Place under LGPL version
2.1.
* locale/Makefile (headers): Add bits/locale.h.
* locale/langinfo.h: Don't include <locale.h>. Include <bits/locale.h>
and use __LC_ constants instead of LC_.
* locale/locale.h: Include <bits/locale.h> and define LC_ constants
using __LC_ constants.
* locale/bits/locale.h: New file.
* locale/loadlocale.c: Include <locale.h>.
* locale/nl_langinfo.h: Likewise.
2000-08-12 Andreas Jaeger <aj@suse.de>
* include/features.h (__STDC_ISO_10646__): Define.
Reported by Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>.
* include/features.h (__USE_ISOC99): Define for _XOPEN_SOURCE >= 600.
* locale/langinfo.h: Define YESSTR and NOSTR also for XPG4 (but not
for revision 6 and up).
* posix/sys/types.h: Define __need_timer_t and __need_clockid_t before
including <time.h>.
* time/time.h: Allow __need_timer_t and __need_clockid_t to be defined
to get definitions of just these types.
* signal/signal.h: Define thread signal handling functions also for
POSIX95.
* sysdeps/unix/sysv/linux/bits/types.h: Define thread types also for
POSIX95.
* sysdeps/unix/sysv/linux/alpha/bits/types.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/bits/types.h: Likewise.
* sysdeps/unix/sysv/linux/mips/bits/types.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/types.h: Likewise.
* sysvipc/sys/shm.h: Define pid_t for XPG.
* wcsmbs/wchar.h: Make the various wide char string and stream
functions available for the respective XPG versions.
2000-07-23 Bruno Haible <haible@clisp.cons.org>
* wctype/wchar-lookup.h: New file.
* wctype/iswctype.c: Include "wchar-lookup.h".
(__iswctype): Support alternate locale format with 3-level tables.
* wctype/iswctype_l.c (__iswctype_l): Likewise.
* wctype/towctrans.c (__towctrans): Likewise.
* wctype/towctrans_l.c (__towctrans_l): Likewise.
* wctype/wcfuncs.c: Include "wchar-lookup.h".
(__ctype32_wctype, __ctype32_wctrans): Declare external.
(__iswalnum, __iswalpha, __iswcntrl, __iswdigit, __iswlower,
__iswgraph, __iswprint, __iswpunct, __iswspace, __iswupper,
__iswxdigit, towlower, towupper): Support alternate locale format
with 3-level tables.
* wctype/wcextra.c (iswblank): Likewise.
* wctype/wcfuncs_l.c: Include "wchar-lookup.h".
(__iswalnum_l, __iswalpha_l, __iswcntrl_l, __iswdigit_l, __iswlower_l,
__iswgraph_l, __iswprint_l, __iswpunct_l, __iswspace_l, __iswupper_l,
__iswxdigit_l, __towlower_l, __towupper_l): Support alternate locale
format with 3-level tables.
* wctype/wcextra_l.c (__iswblank_l): Likewise.
* wctype/wctype.c (__wctype): Likewise. In the alternate locale
format, return a 3-level table pointer.
* wctype/wctype_l.c (__wctype_l): Likewise.
* wctype/wctrans.c (wctrans): Likewise.
* wctype/wctype.h (__ISwupper, __ISwlower, __ISwalpha, __ISwdigit,
__ISwxdigit, __ISwspace, __ISwprint, __ISwgraph, __ISwblank,
__ISwcntrl, __ISwpunct, __ISwalnum): New enum values.
(iswctype): Remove macro definition.
* wcsmbs/wcwidth.h: Include "wchar-lookup.h".
(internal_wcwidth): Support alternate locale format with 3-level
tables.
* locale/langinfo.h (_NL_CTYPE_CLASS_OFFSET, _NL_CTYPE_MAP_OFFSET):
New nl_items.
* locale/categories.def (_NL_CTYPE_CLASS_OFFSET, _NL_CTYPE_MAP_OFFSET):
Define them as being type "word".
* locale/C-ctype.c (_nl_C_LC_CTYPE): Add initializers for them.
* ctype/ctype-info.c (__ctype32_wctype, __ctype32_wctrans,
__ctype32_width): New exported variables.
* locale/lc-ctype.c (_nl_postload_ctype): Initialize them in the
alternate locale format. Don't initialize __ctype_names and
__ctype_width in the alternate locale format.
* locale/programs/localedef.h (oldstyle_tables): New declaration.
* locale/programs/localedef.c (oldstyle_tables): New variable.
(OPT_OLDSTYLE): New macro.
(options): Add --old-style option.
(parse_opt): Handle --old-style option.
* locale/programs/ld-ctype.c (locale_ctype_t): Add class_offset,
map_offset, class_3level, map_3level, width_3level members.
(ctype_output): Support for alternate locale format: Computation of
nelems changes. _NL_CTYPE_TOUPPER32, _NL_CTYPE_TOLOWER32 and
_NL_CTYPE_CLASS32 only 256 characters. _NL_CTYPE_NAMES empty.
New fields _NL_CTYPE_CLASS_OFFSET, _NL_CTYPE_MAP_OFFSET. Field
_NL_CTYPE_WIDTH now contains the three-level table. Extra elems
now contain both class and map tables.
(struct wctype_table): New type.
(wctype_table_init, wctype_table_add, wctype_table_finalize): New
functions.
(struct wcwidth_table): New type.
(wcwidth_table_init, wcwidth_table_add, wcwidth_table_finalize): New
functions.
(struct wctrans_table): New type.
(wctrans_table_init, wctrans_table_add, wctrans_table_finalize): New
functions.
(allocate_arrays): Support for alternate locale format: Set
plane_size and plane_cnt to 0. Restrict ctype->ctype32_b to the first
256 characters. Compute ctype->class_3level. Restrict ctype->map32[idx]
to the first 256 characters. Compute ctype->map_3level. Set
ctype->class_offset and ctype->map_offset. Compute ctype->width_3level
instead of ctype->width.
* iconv/gconv_trans.c: Correct a few bugs in the search loop. Remove
remainders of hash table.
* locale/categories.def: Remove remainders of transliteration
hash table.
* locale/langinfo.h: Likewise.
* locale/programs/ld-ctype.c: Likewise. Fix code to write out
transliteration tables.
* locale/gen-translit.pl: New file.
* locale/C-translit.h.in: New file.
* locale/C-ctype.c: Include C-translit.h. Initialize transliteration
data pointers with data from this file.
* locale/Makefile (distribute): Add C-translit.h.in, C-translit.h,
and gen-translit.pl.
Add rule to generate C-translit.h.
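The 3-level tables mentioned in the entry above can be pictured with the following sketch. It is not the layout used by wchar-lookup.h: the shift values, the struct name table3, the "index plus one, zero means absent" encoding, and the demo data are all assumptions chosen to keep the example small. The point is only that a sparse character class needs storage for the blocks that contain members.

```c
#include <stddef.h>
#include <stdint.h>

#define SHIFT1 10   /* bits handled by levels 2 and 3 together */
#define SHIFT2 5    /* bits handled by level 3 (32 chars per word) */
#define LEVEL2_BLOCK (1u << (SHIFT1 - SHIFT2))
#define MASK2 (LEVEL2_BLOCK - 1)
#define MASK3 ((1u << SHIFT2) - 1)

/* Hypothetical table layout.  Entries in level1 and level2 hold
   "block index + 1"; zero means that no block exists.  */
struct table3
{
  size_t level1_size;
  const uint32_t *level1;
  const uint32_t *level2;
  const uint32_t *level3;   /* one 32-bit bitmap per 32 characters */
};

/* Return 1 if WC is a member of the class described by T, else 0.  */
static int
table3_lookup (const struct table3 *t, uint32_t wc)
{
  uint32_t i1 = wc >> SHIFT1;
  if (i1 >= t->level1_size || t->level1[i1] == 0)
    return 0;

  uint32_t i2 = (wc >> SHIFT2) & MASK2;
  uint32_t b3 = t->level2[(t->level1[i1] - 1) * LEVEL2_BLOCK + i2];
  if (b3 == 0)
    return 0;

  return (int) ((t->level3[b3 - 1] >> (wc & MASK3)) & 1);
}

/* Demo data: the class contains only 'A'..'Z' (0x41..0x5A).  */
static const uint32_t demo_level1[1] = { 1 };
static const uint32_t demo_level2[LEVEL2_BLOCK] = { [2] = 1 };
static const uint32_t demo_level3[1] = { 0x07FFFFFEu };
static const struct table3 demo_upper =
  { 1, demo_level1, demo_level2, demo_level3 };
```

With this data, table3_lookup (&demo_upper, 'A') returns 1, while characters outside 'A'..'Z' fall through one of the "no block" checks or hit a clear bit and return 0.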
2000-07-17 Bruno Haible <haible@clisp.cons.org>
* iconv/gconv_open.c (__gconv_open): Initialize the __data
field of struct __gconv_trans_data differently. Don't pass NULL to
trans_init_fct. Simplify list append operation.
2000-07-14 Bruno Haible <haible@clisp.cons.org>
* intl/dcigettext.c (dcigettext): Call plural_eval on all platforms,
not only those having tsearch.
2000-07-17 Ulrich Drepper <drepper@redhat.com>
* locale/langinfo.h: Add placeholder values in enum for removed
LC_CTYPE entries.
2000-07-17 Jakub Jelinek <jakub@redhat.com>
* elf/dl-addr.c (_dl_addr): Keep searching in the _dl_loaded
chain if the PHDR check fails.
2000-07-17 Mark Kettenis <kettenis@gnu.org>
* nss/getent.c (print_hosts): Make sure we always print a space
between numeric addresses and hostnames.
2000-07-17 Wolfram Gloger <wg@malloc.de>
* malloc/malloc.c (chunk_alloc): Use mmap_chunk() only if allowed,
i.e. if n_mmaps_max>0.
2000-07-16 Mark Kettenis <kettenis@gnu.org>
* resolv/netdb.h (AI_V4MAPPED, AI_ALL, AI_ADDRCONFIG): Adjust
values to remove possible clash with other AI_* constants.
(AI_PASSIVE, AI_CANONNAME, AI_NUMERICHOST): Define as
hexadecimal constants to stress that they are bit flags.
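The property that the entry above is concerned with, namely that the AI_* constants are distinct single-bit flags that can be combined with bitwise OR, can be checked with a short sketch. The helper names are hypothetical; only the AI_* constants from <netdb.h> are real.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1   /* Make sure all AI_* constants are visible.  */
#endif
#include <netdb.h>
#include <stddef.h>

/* Return nonzero if exactly one bit is set in FLAG.  */
static int
is_single_bit (unsigned int flag)
{
  return flag != 0 && (flag & (flag - 1)) == 0;
}

/* Return 1 if the six AI_* flags named in the ChangeLog entry are
   single bits and no two of them share a bit, else 0.  */
static int
ai_flags_are_disjoint (void)
{
  static const unsigned int flags[] =
    { AI_PASSIVE, AI_CANONNAME, AI_NUMERICHOST,
      AI_V4MAPPED, AI_ALL, AI_ADDRCONFIG };
  unsigned int seen = 0;

  for (size_t i = 0; i < sizeof flags / sizeof flags[0]; ++i)
    {
      if (!is_single_bit (flags[i]) || (seen & flags[i]) != 0)
        return 0;
      seen |= flags[i];
    }
  return 1;
}
```

Because the flags do not overlap, a caller can set hints.ai_flags to a combination such as AI_PASSIVE | AI_CANONNAME and test each flag independently.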
2000-07-15 Mark Kettenis <kettenis@gnu.org>
* nss/getXXent_r.c [NEED__RES]: Include <resolv.h>.
(SETFUNC_NAME, ENDFUNC_NAME, REENTRANT_GETNAME): Use res_ninit
instead of res_init.
2000-07-12 Bruno Haible <haible@clisp.cons.org>
* iconv/gconv_open.c (__gconv_open): Merge duplicated code.
2000-07-12 Bruno Haible <haible@clisp.cons.org>
* iconv/gconv_builtin.c (__gconv_get_builtin_trans): Initialize
__modname.
2000-07-12 Bruno Haible <haible@clisp.cons.org>
* iconv/gconv_open.c (__gconv_open): Initialize
result->__steps[cnt].__data.
2000-07-12 Mark Kettenis <kettenis@gnu.org>
* nss/getent.c (services_keys): Pass port number in network byte
order in call to getservbyport.
2000-07-11 Andreas Jaeger <aj@suse.de>
* stdlib/Makefile (test-canon-ARGS): Fix for building in the
source dir.
* intl/Makefile (do-gettext-test): Likewise.
* dirent/Makefile (opendir-tst1-ARGS): Likewise.
2000-07-11 Andreas Schwab <schwab@suse.de>
* Makeconfig (run-program-prefix): New rule.
(built-program-cmd): Use run-program-prefix.
2000-06-17 Ulrich Drepper <drepper@redhat.com>
* iconv/gconv_trans.c: Implement handling of translit_ignore.
* locale/langinfo.h: Add entries for translit_ignore information.
* locale/categories.def: Add entries for new LC_CTYPE elements.
* locale/C-ctype.c: Add initializers for new fields. Use NULL
pointer instead of "" where possible.
* locale/programs/ld-ctype.c: Write out translit_ignore information.
* intl/Depend: Add localedata.
* intl/tst-gettext.c: Call setlocale for LC_CTYPE.
* intl/tst-gettext.sh: Set LOCPATH to localedata build dir.
* locale/langinfo.h: Add entries for default_missing information.
* locale/C-ctype.c: Add initializers for new fields.
* iconv/gconv_trans.c: If nothing matched, try to use default_missing
information.
* locale/categories.h: Add entries for all LC_CTYPE values.
* locale/programs/ld-ctype.c (ctype_output): Write out default_missing
information.
* localedata/tst-trans.c: Write out an error message if class is
not found.
2000-05-24 Ulrich Drepper <drepper@redhat.com>
* locale/programs/ld-collate.c (struct element_t): Add mbseqorder
and wcseqorder members.
(struct locale_collate_t): Likewise.
(collate_finish): Assign collation sequence value to each character.
Create tables for output.
(collate_output): Write out tables with collation sequence information.
* locale/C-collate.c: Provide C locale data for collation sequence
table.
* locale/langinfo.h: Add _NL_COLLATE_COLLSEQMB and
_NL_COLLATE_COLLSEQWC.
* locale/categories.def: Add entries for _NL_COLLATE_COLLSEQMB and
_NL_COLLATE_COLLSEQWC.
* posix/fnmatch.c: Define SUFFIX and WIDE_CHAR_VERSION before
include fnmatch_loop.c.
* posix/fnmatch_loop.c: Don't use strcoll while determining whether
character is matched by range expression. Use collation sequence
table. Outside glibc fall back on simple character value comparison.
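From the caller's side, the range expressions affected by the fnmatch change above look like the following. This is a minimal usage sketch with a hypothetical wrapper name; it shows only the public fnmatch interface, not the collation sequence lookup inside fnmatch_loop.c.

```c
#include <fnmatch.h>

/* Hypothetical wrapper.  Return nonzero if NAME matches PATTERN.  */
static int
matches (const char *pattern, const char *name)
{
  return fnmatch (pattern, name, 0) == 0;
}
```

In the C locale, matches ("[a-c]*", "banana") is true and matches ("[a-c]*", "date") is false. In other locales, which characters fall inside a range such as [a-c] depends on the locale's collation sequence.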
2000-02-11 Ulrich Drepper <drepper@redhat.com>
* locale/langinfo.h: Make CRNCYSTR a separate entry instead of an
alias for CURRENCY_SYMBOL.
* locale/programs/ld-monetary.c: Add support to write out CRNCYSTR
information. [PR libc/1583].
2000-01-28 Ulrich Drepper <drepper@cygnus.com>
* locale/C-monetary.c: Add initializers for new fields.
* locale/C-numeric.c: Likewise.
* locale/Makefile (distribute): Add indigits.h, indigitswc.h,
outdigits.h, and outdigitswc.h.
* locale/langinfo.h: Add _NL_MONETARY_DECIMAL_POINT_WC,
_NL_MONETARY_THOUSANDS_SEP_WC, _NL_NUMERIC_DECIMAL_POINT_WC,
and _NL_NUMERIC_THOUSANDS_SEP_WC.
* locale/indigits.h: New file.
* locale/indigitswc.h: New file.
* locale/outdigits.h: New file.
* locale/outdigitswc.h: New file.
* locale/programs/ld-monetary.c: Write out decimal point and
thousands separator info in wide character form.
* locale/programs/ld-numeric.c: Likewise.
* stdio-common/Makefile (routines): Add _i18n_itoa and _i18n_itowa.
(distribute): Add _i18n_itoa.h and _i18n_itowa.h.
* stdio-common/_i18n_itoa.c: New file.
* stdio-common/_i18n_itoa.h: New file.
* stdio-common/_i18n_itowa.c: New file.
* stdio-common/_i18n_itowa.h: New file.
* stdio-common/printf-parse.h: Parse 'I' flag.
* stdio-common/printf.h (struct printf_info): Add i18n field.
* stdio-common/vfprintf.c: Implement 'I' flag to print using locales'
outdigits.
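The 'I' flag from the entry above is a glibc extension to the printf family. The sketch below, with a hypothetical helper name, formats the same value with and without the flag and compares the results. It assumes a glibc system; other C libraries may reject the flag.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper.  Return 1 if formatting VALUE with the 'I'
   flag gives the same text as plain %d in the current locale, 0 if
   the text differs, and -1 on a formatting error.  */
static int
i_flag_matches_plain (int value)
{
  char with_flag[64];
  char plain[64];

  if (snprintf (with_flag, sizeof with_flag, "%Id", value) < 0
      || snprintf (plain, sizeof plain, "%d", value) < 0)
    return -1;

  return strcmp (with_flag, plain) == 0;
}
```

In the C locale the outdigits are the ASCII digits, so the two results are identical. In a locale that defines other outdigits, the 'I' variant would be expected to differ.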
1999-12-31 Ulrich Drepper <drepper@cygnus.com>
* locale/langinfo.h: Add constants for wide character collation
symbol table.
* locale/categories.def: Add appropriate entries for collate symbol
table entries.
* locale/C-collate.c: Add initializers for new entries.
Remove commented out code.
* locale/elem-hash.h: New file.
* locale/Makefile (distribute): Add elem-hash.h.
* locale/programs/ld-collate.c: Implement output of collate symbol
table.
* posix/regex.c: Implement collation class handling.
1999-12-13 Andreas Jaeger <aj@suse.de>
* resolv/resolv.h: Remove K&R compatibility.
* resolv/res_libc.c: Move definition of _res after res_init,
res_init should use the threaded specific context.
* resolv/Makefile (+cflags): Remove -Wno-comment since it's not
needed anymore.
* locale/langinfo.h: Add constants for wide character collation data.
* locale/categories.def: Add appropriate entries for collate entries.
* locale/C-collate.c: Add initializers for new entries.
* locale/programs/ld-collate.c: Implement output of wide character
tables.
* locale/programs/ld-ctype.c (allocate_arrays): Change algorithm to
compute wide character table size a bit: it now gives up a bit of
total table size for fewer levels.
1999-12-23 Ulrich Drepper <drepper@cygnus.com>
* locale/en_BW: New file.
* locale/en_ZW: New file.
Contributed by Schalk W. Cronjé <schalkc@ntaba.co.za>.
1999-12-20 Ulrich Drepper <drepper@cygnus.com>
* locale/categories.def: Remove most of the collate definitions.
* locale/langinfo.h: Comment out corresponding definitions.
* locale/programs/locale-spec.c (locale_special): Don't recognize the
collate names yet.
* locale/programs/ld-collate.c: Correct and optimize computation of
weights. Set up list of all definitions correctly. Start writing
function to generate output file.
* locale/programs/ld-ctype.c (allocate_arrays): Increment counter in
loop to compute default mapping.