Commit Graph

2 Commits

Author SHA1 Message Date
Zack Weinberg
97f8225d22 scripts/check-obsolete-constructs.py: Process all headers as UTF-8.
A few of our installed headers contain UTF-8 in comments.
check-obsolete-constructs opened files without explicitly specifying
their encoding, so it would barf on these headers if “make check” was
run in a non-UTF-8 locale.
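
For illustration only, a minimal sketch of the locale dependence and the
fix.  The header content, the file name example.h and the helper
read_header are hypothetical; the actual change is in
HeaderChecker.check.

```python
import os
import tempfile

# Hypothetical header content with a non-ASCII character in a comment.
HEADER_BYTES = "/* Copyright \u00a9 example */\nint x;\n".encode("utf-8")


def read_header(path):
    # Naming the encoding makes the result independent of the locale.
    with open(path, encoding="utf-8") as fp:
        return fp.read()


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.h")
    with open(path, "wb") as fp:
        fp.write(HEADER_BYTES)
    text = read_header(path)

# The same bytes cannot be decoded as ASCII, which is roughly what an
# implicit-encoding open() may attempt in a non-UTF-8 locale.
try:
    HEADER_BYTES.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```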

	* scripts/check-obsolete-constructs.py (HeaderChecker.check):
	Specify encoding="utf-8" when opening headers to check.
2019-03-14 09:44:22 -04:00
Zack Weinberg
711a322a23 Use a proper C tokenizer to implement the obsolete typedefs test.
The test for obsolete typedefs in installed headers was implemented
using grep, and could therefore get false positives on e.g. “ulong”
in a comment.  It was also scanning all of the headers included by
our headers, and therefore testing headers we don’t control, e.g.
Linux kernel headers.

This patch splits the obsolete-typedef test from
scripts/check-installed-headers.sh to a separate program,
scripts/check-obsolete-constructs.py.  Because it is written in Python,
it is feasible to make it tokenize C accurately enough to avoid false
positives on the contents of comments and strings.  It also only
examines $(headers) in each subdirectory--all the headers we install,
but not any external dependencies of those headers.  Headers whose
installed name starts with finclude/ are ignored, on the assumption
that they contain Fortran.
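
The following is a rough sketch of the idea, not the script itself: a
regex-based lexer that classifies comments and string literals as their
own tokens, so that an obsolete name inside them is never reported.  The
names TOKEN_RE, OBSOLETE and obsolete_identifiers are made up for this
example, and the set of obsolete names is illustrative (only “ulong” is
taken from the text above).

```python
import re

# Simplified token classes; a real C lexer needs more cases than this.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT> /\*.*?\*/ | //[^\n]* )
  | (?P<STRING>  "(?:[^"\\\n]|\\.)*" )
  | (?P<CHAR>    '(?:[^'\\\n]|\\.)*' )
  | (?P<IDENT>   [A-Za-z_][A-Za-z0-9_]* )
  | (?P<OTHER>   . )
""", re.VERBOSE | re.DOTALL)

# Illustrative set; the real list of obsolete typedefs is longer.
OBSOLETE = {"ushort", "uint", "ulong"}


def obsolete_identifiers(source):
    """Return obsolete names that appear as identifier tokens."""
    return [
        m.group()
        for m in TOKEN_RE.finditer(source)
        if m.lastgroup == "IDENT" and m.group() in OBSOLETE
    ]


sample = '/* ulong in a comment */ char *s = "ulong"; ulong x; // ulong\n'
hits = obsolete_identifiers(sample)
```

A plain grep for “ulong” would match four times in the sample line; the
lexer reports only the declaration of x.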

It is also feasible to make the new test understand the difference
between _defining_ the obsolete typedefs and _using_ the obsolete
typedefs, which means posix/{bits,sys}/types.h no longer need to be
exempted.  This uncovered an actual bug in bits/types.h: __quad_t and
__u_quad_t were being used to define __S64_TYPE, __U64_TYPE,
__SQUAD_TYPE and __UQUAD_TYPE.  These are changed to __int64_t and
__uint64_t respectively.  This is a safe change, despite the comments
in bits/types.h claiming a difference between __quad_t and __int64_t,
because those comments are incorrect.  In all current ABIs, both
__quad_t and __int64_t are ‘long’ when ‘long’ is a 64-bit type, and
‘long long’ when ‘long’ is a 32-bit type, and similarly for __u_quad_t
and __uint64_t.  (Changing the types to be what the comments say they
are would be an ABI break, as it affects C++ name mangling.)  This
patch includes a minimal change to make the comments not completely
wrong.
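
One possible heuristic for telling a definition from a use, sketched
under the assumption that comments and strings have already been
removed: inside a typedef, an obsolete name immediately followed by “;”
is the name being defined; anywhere else it is a use.  This is an
illustration, not necessarily how the script does it, and it ignores
function-pointer and array typedefs.  The names IDENT_OR_PUNCT, OBSOLETE
and obsolete_uses are hypothetical.

```python
import re

# Identifiers, or any single non-space character as punctuation.
IDENT_OR_PUNCT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\S")

# Illustrative set; only "ulong" is named in the commit message.
OBSOLETE = {"ulong"}


def obsolete_uses(declarations):
    """Return obsolete names that are used rather than defined."""
    tokens = IDENT_OR_PUNCT.findall(declarations)
    uses = []
    in_typedef = False
    for i, tok in enumerate(tokens):
        if tok == "typedef":
            in_typedef = True
        elif tok == ";":
            in_typedef = False
        elif tok in OBSOLETE:
            following = tokens[i + 1] if i + 1 < len(tokens) else ""
            if in_typedef and following == ";":
                continue  # "typedef ... ulong;" defines the name
            uses.append(tok)
    return uses


code = "typedef unsigned long ulong; typedef ulong foo_t; ulong x;"
found = obsolete_uses(code)
```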

sys/types.h was defining the legacy BSD u_intN_t typedefs using a
construct that was not necessarily consistent with how the C99 uintN_t
typedefs are defined, and is also too complicated for the new script to
understand (it lexes C relatively accurately, but it does not attempt
to expand preprocessor macros, nor does it do any actual parsing).
This patch cuts all of that out and uses bits/types.h's __uintN_t typedefs
to define u_intN_t instead.  This is verified to not change the ABI on
any supported architecture, via the c++-types test, which means u_intN_t
and uintN_t were, in fact, consistent on all supported architectures.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

	* scripts/check-obsolete-constructs.py: New test script.
	* scripts/check-installed-headers.sh: Remove tests for
	obsolete typedefs, superseded by check-obsolete-constructs.py.
	* Rules: Run scripts/check-obsolete-constructs.py over $(headers)
	as a special test.  Update commentary.
	* posix/bits/types.h (__SQUAD_TYPE, __S64_TYPE): Define as __int64_t.
	(__UQUAD_TYPE, __U64_TYPE): Define as __uint64_t.
	Update commentary.
	* posix/sys/types.h (__u_intN_t): Remove.
	(u_int8_t): Typedef using __uint8_t.
	(u_int16_t): Typedef using __uint16_t.
	(u_int32_t): Typedef using __uint32_t.
	(u_int64_t): Typedef using __uint64_t.
2019-03-13 09:39:43 -04:00