* manual/ctype.texi: Likewise.
	* manual/locale.texi: Likewise.
Ulrich Drepper 1999-08-27 19:52:08 +00:00
parent 2e8a853b6c
commit 6dd5b57e8b
3 changed files with 276 additions and 281 deletions


@ -3,11 +3,13 @@
* manual/argp.texi: Fixing language and typos.
* manual/conf.texi: Likewise.
* manual/contrib.texi: Likewise.
* manual/ctype.texi: Likewise.
* manual/filesys.texi: Likewise.
* manual/install.texi: Likewise.
* manual/job.texi: Likewise.
* manual/lang.texi: Likewise.
* manual/llio.texi: Likewise.
* manual/locale.texi: Likewise.
* manual/math.texi: Likewise.
* manual/nss.texi: Likewise.
* manual/pipe.texi: Likewise.


@ -266,34 +266,34 @@ with the SVID.
@section Character class determination for wide characters
The second amendment to @w{ISO C89} defines functions to classify wide
characters. The original @w{ISO C89} standard defined the type
@code{wchar_t} but failed to define any functions to operate on wide
characters.
characters. Although the original @w{ISO C89} standard already defined
the type @code{wchar_t}, no functions operating on them were defined.
The general design of the classification functions for wide characters
is more general. It allows extending the set of available
classifications beyond the set which is always available. The POSIX
standard specifies how the extension can be done and this is already
is more general. It allows extensions to the set of available
classifications, beyond those which are always available. The POSIX
standard specifies how extensions can be made, and this is already
implemented in the GNU C library implementation of the @code{localedef}
program.
The character class functions are normally implemented using bitsets.
I.e., for the character in question the appropriate bitset is read from
a table and a test is performed to determine whether a certain bit is
set in this bitset. Which bit is tested for is determined by the class.
The character class functions are normally implemented with bitsets,
with a bitset per character. For a given character, the appropriate
bitset is read from a table and a test is performed as to whether a
certain bit is set. Which bit is tested for is determined by the
class.
For the wide character classification functions this is made visible.
There is a type representing the classification, a function to retrieve
this value for a specific class, and a function to test using the
classification value whether a given character is in this class. On top
of this the normal character classification functions as used for
There is a classification type defined, a function to retrieve this
value for a given class, and a function to test whether a given
character is in this class, using the classification value. On top of
this the normal character classification functions as used for
@code{char} objects can be defined.
@comment wctype.h
@comment ISO
@deftp {Data type} wctype_t
The @code{wctype_t} can hold a value which represents a character class.
The ony defined way to generate such a value is by using the
The only defined way to generate such a value is by using the
@code{wctype} function.
@pindex wctype.h
@ -306,8 +306,8 @@ This type is defined in @file{wctype.h}.
The @code{wctype} returns a value representing a class of wide
characters which is identified by the string @var{property}. Beside
some standard properties each locale can define its own ones. In case
no property with the given name is known for the current locale for the
@code{LC_CTYPE} category the function returns zero.
no property with the given name is known for the current locale
selected for the @code{LC_CTYPE} category, the function returns zero.
@noindent
The properties known in every locale are:
@ -339,11 +339,11 @@ by a successful call to @code{wctype}.
This function is declared in @file{wctype.h}.
@end deftypefun
This makes it easier to use the commonly-used classification functions
that are defined in the C library. There is no need to use
To make it easier to use the commonly-used classification functions,
they are defined in the C library. There is no need to use
@code{wctype} if the property string is one of the known character
classes. In some situations it is desirable to construct the property
string and then it becomes important that @code{wctype} can also handle the
strings, and then it is important that @code{wctype} can also handle the
standard classes.
@cindex alphanumeric character
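As a rough illustration of the @code{wctype}/@code{iswctype} pair described in this hunk, the sketch below counts the wide characters of a string that belong to a named class. The helper name `count_in_class` and its return convention (-1 for an unknown class) are assumptions made for this example; they are not part of the manual or the library.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the characters of WS that belong to the
   class named by PROPERTY.  Returns -1 if the locale currently selected
   for LC_CTYPE does not know a class of that name.  */
static long
count_in_class (const wchar_t *ws, const char *property)
{
  /* Look the class up once; wctype returns zero for unknown names.  */
  wctype_t desc = wctype (property);
  long n = 0;

  if (desc == (wctype_t) 0)
    return -1;

  for (; *ws != L'\0'; ++ws)
    if (iswctype ((wint_t) *ws, desc))
      ++n;

  return n;
}
```

Because the property is an ordinary string, it can be built at run time, which is the situation in which the text says `wctype` must also accept the standard class names.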
@ -420,7 +420,7 @@ wide characters:
@smallexample
n = 0;
while (iswctype (*wc))
while (iswdigit (*wc))
@{
n *= 10;
n += *wc++ - L'0';
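The corrected loop from this hunk can be wrapped into a complete function as sketched below. The function name `parse_wide_digits` is hypothetical, and the sketch deliberately ignores overflow, just as the fragment in the manual does.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: convert the leading run of decimal digits in WC
   to a number.  Returns 0 if the string does not start with a digit.
   Overflow is not detected.  */
static unsigned long
parse_wide_digits (const wchar_t *wc)
{
  unsigned long n = 0;

  while (iswdigit ((wint_t) *wc))
    {
      n *= 10;
      n += (unsigned long) (*wc++ - L'0');
    }

  return n;
}
```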
@ -604,11 +604,11 @@ This function is a GNU extension. It is declared in @file{wchar.h}.
@node Using Wide Char Classes, Wide Character Case Conversion, Classification of Wide Characters, Character Handling
@section Notes on using the wide character classes
The first note is probably nothing astonishing but still occasionally a
The first note is probably not astonishing but still occasionally a
cause of problems. The @code{isw@var{XXX}} functions can be implemented
using macros and in fact, the GNU C library does this. They are still
available as real functions but when the @file{wctype.h} header is
included the macros will be used. This is nothing new compared to the
included the macros will be used. This is the same as the
@code{char} type versions of these functions.
The second note covers something new. It can be best illustrated by a
@ -630,8 +630,8 @@ is_in_class (int c, const char *class)
@}
@end smallexample
Now with the @code{wctype} and @code{iswctype} one could avoid the
@code{if} cascades. But rewriting the code as follows is wrong:
Now, with the @code{wctype} and @code{iswctype} you can avoid the
@code{if} cascades, but rewriting the code as follows is wrong:
@smallexample
int
@ -644,7 +644,7 @@ is_in_class (int c, const char *class)
The problem is that it is not guaranteed that the wide character
representation of a single-byte character can be found using casting.
In fact, usually this fails miserably. The correct solution for this
In fact, usually this fails miserably. The correct solution to this
problem is to write the code as follows:
@smallexample
@ -657,10 +657,10 @@ is_in_class (int c, const char *class)
@end smallexample
@xref{Converting a Character}, for more information on @code{btowc}.
Please note that this change probably does not improve the performance
Note that this change probably does not improve the performance
of the program a lot since the @code{wctype} function still has to make
the string comparisons. But it gets really interesting if the
@code{is_in_class} function would be called more than once using the
the string comparisons. It gets really interesting if the
@code{is_in_class} function is called more than once for the
same class name. In this case the variable @var{desc} could be computed
once and reused for all the calls. Therefore the above form of the
function is probably not the final one.
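The closing remark of this hunk, that the descriptor could be computed once and reused, might look like the following sketch. The names `in_class` and `count_bytes_in_class` are assumptions introduced for illustration; only `wctype`, `iswctype` and `btowc` come from the text.

```c
#include <stdio.h>   /* EOF */
#include <wchar.h>   /* btowc, WEOF, size_t */
#include <wctype.h>  /* wctype, iswctype */

/* Test one single-byte character against an already computed class
   descriptor.  The byte is converted with btowc, never by casting.  */
static int
in_class (int c, wctype_t desc)
{
  wint_t wc = btowc (c);

  return wc != WEOF && iswctype (wc, desc) != 0;
}

/* Hypothetical caller: the string comparison inside wctype happens
   once, and the descriptor is reused for every byte of S.  Returns 0
   if the class name is unknown in the current locale.  */
static size_t
count_bytes_in_class (const char *s, const char *class_name)
{
  wctype_t desc = wctype (class_name);
  size_t n = 0;

  if (desc == (wctype_t) 0)
    return 0;

  for (; *s != '\0'; ++s)
    if (in_class ((unsigned char) *s, desc))
      ++n;

  return n;
}
```

Note that a descriptor is only meaningful for the locale that was selected for `LC_CTYPE` when `wctype` was called, so a cached value should not outlive a locale change.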
@ -669,18 +669,17 @@ function is probably not the final one.
@node Wide Character Case Conversion, , Using Wide Char Classes, Character Handling
@section Mapping of wide characters.
As for the classification functions, the @w{ISO C} standard also
generalizes the mapping functions. Instead of only allowing the two
standard mappings, the locale can contain others. Again, the
@code{localedef} program already supports generating such locale data
files.
The mapping functions are also generalized by the @w{ISO C}
standard. Instead of just allowing the two standard mappings, a
locale can contain others. Again, the @code{localedef} program
already supports generating such locale data files.
@comment wctype.h
@comment ISO
@deftp {Data Type} wctrans_t
This data type is defined as a scalar type which can hold a value
representing the locale-dependent character mapping. There is no way to
construct such a value except using the return value of the
construct such a value apart from using the return value of the
@code{wctrans} function.
@pindex wctype.h
@ -693,8 +692,8 @@ This type is defined in @file{wctype.h}.
@deftypefun wctrans_t wctrans (const char *@var{property})
The @code{wctrans} function has to be used to find out whether a named
mapping is defined in the current locale selected for the
@code{LC_CTYPE} category. If the returned value is non-zero it can
afterwards be used in calls to @code{towctrans}. If the return value is
@code{LC_CTYPE} category. If the returned value is non-zero, you can use
it afterwards in calls to @code{towctrans}. If the return value is
zero no such mapping is known in the current locale.
Beside locale-specific mappings there are two mappings which are
@ -707,15 +706,15 @@ guaranteed to be available in every locale:
@pindex wctype.h
@noindent
This function is declared in @file{wctype.h}.
These functions are declared in @file{wctype.h}.
@end deftypefun
@comment wctype.h
@comment ISO
@deftypefun wint_t towctrans (wint_t @var{wc}, wctrans_t @var{desc})
The @code{towctrans} function maps the input character @var{wc}
according to the rules of the mapping for which @var{desc} is an
descriptor and returns the value so found. The @var{desc} value must be
@code{towctrans} maps the input character @var{wc}
according to the rules of the mapping for which @var{desc} is a
descriptor, and returns the value it finds. @var{desc} must be
obtained by a successful call to @code{wctrans}.
@pindex wctype.h
@ -723,8 +722,8 @@ obtained by a successful call to @code{wctrans}.
This function is declared in @file{wctype.h}.
@end deftypefun
The @w{ISO C} standard also defines for the generally available mappings
convenient shortcuts so that it is not necesary to call @code{wctrans}
For the generally available mappings, the @w{ISO C} standard defines
convenient shortcuts so that it is not necessary to call @code{wctrans}
for them.
@comment wctype.h
@ -765,6 +764,6 @@ This function is declared in @file{wctype.h}.
@end deftypefun
The same warnings given in the last section for the use of the wide
character classification function applies here. It is not possible to
character classification functions apply here. It is not possible to
simply cast a @code{char} type value to a @code{wint_t} and use it as an
argument for @code{towctrans} calls.
argument to @code{towctrans} calls.
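A minimal sketch of the warning above, applied to mappings: the byte is converted with `btowc` before `towctrans` is called, and converted back with `wctob`. The helper name `map_byte` and its fallback behaviour (returning the input unchanged when anything fails) are assumptions made for this example.

```c
#include <stdio.h>   /* EOF */
#include <wchar.h>   /* btowc, wctob, WEOF */
#include <wctype.h>  /* wctrans, towctrans */

/* Hypothetical helper: apply the named mapping to the single-byte
   character C.  Returns C unchanged if the mapping is unknown in the
   current locale, if C has no wide character form, or if the result
   has no single-byte form.  */
static int
map_byte (int c, const char *mapping)
{
  wctrans_t desc = wctrans (mapping);
  wint_t wc = btowc (c);   /* Convert the byte; do not cast it.  */
  int back;

  if (desc == (wctrans_t) 0 || wc == WEOF)
    return c;

  back = wctob (towctrans (wc, desc));
  return back == EOF ? c : back;
}
```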


@ -99,7 +99,7 @@ most of Spain.
The set of locales supported depends on the operating system you are
using, and so do their names. We can't make any promises about what
locales will exist, except for one standard locale called @samp{C} or
@samp{POSIX}. Later we will describe how to construct locales XXX.
@samp{POSIX}. Later we will describe how to construct locales.
@comment (@pxref{Building Locale Files}).
@cindex combining locales
@ -183,12 +183,12 @@ to use for all purposes except as overridden by the variables above.
@vindex LANGUAGE
When developing the message translation functions it was felt that the
functionality provided by the variables above is not sufficient. E.g., it
should be possible to specify more than one locale name. For an example
take a Swedish user who better speaks German than English, the programs
messages by default are written in English. Then it should be possible
to specify that the first choice for the language is Swedish, the second
choice is German, and if this also fails English is used. This is
functionality provided by the variables above is not sufficient. For
example, it should be possible to specify more than one locale name.
Take a Swedish user who speaks German better than English, and a program
whose messages are output in English by default. It should be possible
to specify that the first choice of language is Swedish, the second
German, and if this also fails to use English. This is
possible with the variable @code{LANGUAGE}. For further description of
this GNU extension see @ref{Using gettextized software}.
@ -226,7 +226,7 @@ category @var{category} to @var{locale}.
If @var{category} is @code{LC_ALL}, this specifies the locale for all
purposes. The other possible values of @var{category} specify an
individual purpose (@pxref{Locale Categories}).
single purpose (@pxref{Locale Categories}).
You can also use this function to find out the current locale by passing
a null pointer as the @var{locale} argument. In this case,
@ -250,19 +250,19 @@ don't make any promises about what it looks like. But if you specify
the same ``locale name'' with @code{LC_ALL} in a subsequent call to
@code{setlocale}, it restores the same combination of locale selections.
To ensure to be able to use the string encoding the currently selected
locale at a later time one has to make a copy of the string. It is not
guaranteed that the return value stays valid all the time.
To be sure you can use the returned string encoding the currently selected
locale at a later time, you must make a copy of the string. It is not
guaranteed that the returned pointer remains valid over time.
When the @var{locale} argument is not a null pointer, the string returned
by @code{setlocale} reflects the newly modified locale.
by @code{setlocale} reflects the newly-modified locale.
If you specify an empty string for @var{locale}, this means to read the
appropriate environment variable and use its value to select the locale
for @var{category}.
If a nonempty string is given for @var{locale} the locale with this name
is used, if this is possible.
If a nonempty string is given for @var{locale}, then the locale of that
name is used if possible.
If you specify an invalid locale name, @code{setlocale} returns a null
pointer and leaves the current locale unchanged.
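The advice in this hunk about copying the string returned by `setlocale` can be sketched as a save/restore pair. The helper names `save_locale` and `restore_locale` are hypothetical; the sketch only relies on `setlocale` accepting a null pointer to query the current locale, as the surrounding text describes.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of the name of the locale
   currently selected for CATEGORY, or NULL on failure.  The copy stays
   valid even after later setlocale calls.  */
static char *
save_locale (int category)
{
  const char *current = setlocale (category, NULL);
  char *copy;

  if (current == NULL)
    return NULL;

  copy = malloc (strlen (current) + 1);
  if (copy != NULL)
    strcpy (copy, current);
  return copy;
}

/* Hypothetical helper: reselect a locale saved by save_locale and free
   the copy.  Returns nonzero on success.  */
static int
restore_locale (int category, char *saved)
{
  int ok = saved != NULL && setlocale (category, saved) != NULL;

  free (saved);
  return ok;
}
```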
@ -303,7 +303,7 @@ with_other_locale (char *new_locale,
@end smallexample
@strong{Portability Note:} Some @w{ISO C} systems may define additional
locale categories and future versions of the library will do so. For
locale categories, and future versions of the library will do so. For
portability, assume that any symbol beginning with @samp{LC_} might be
defined in @file{locale.h}.
@ -332,7 +332,7 @@ Defining and installing named locales is normally a responsibility of
the system administrator at your site (or the person who installed the
GNU C library). It is also possible for the user to create private
locales. All this will be discussed later when describing the tool to
do so XXX.
do so.
@comment (@pxref{Building Locale Files}).
If your program needs to use something other than the @samp{C} locale,
@ -342,27 +342,27 @@ locale explicitly by name. Remember, different machines might have
different sets of locales installed.
@node Locale Information, Formatting Numbers, Standard Locales, Locales
@section Accessing the Locale Information
@section Accessing Locale Information
There are several ways to access the locale information. The simplest
There are several ways to access locale information. The simplest
way is to let the C library itself do the work. Several of the
functions in this library access implicitly the locale data and use
what information is available in the currently selected locale. This is
functions in this library implicitly access the locale data, and use
what information is provided by the currently selected locale. This is
how the locale model is meant to work normally.
As an example take the @code{strftime} function which is meant to nicely
As an example take the @code{strftime} function, which is meant to nicely
format date and time information (@pxref{Formatting Date and Time}).
Part of the standard information contained in the @code{LC_TIME}
category are, e.g., the names of the months. Instead of requiring the
category is the names of the months. Instead of requiring the
programmer to take care of providing the translations the
@code{strftime} function does this all by itself. When using @code{%A}
in the format string this will be replaced by the appropriate weekday
name of the locale currently selected for @code{LC_TIME}. This is the
easy part and wherever possible functions do things automatically as in
this case.
@code{strftime} function does this all by itself. @code{%A}
in the format string is replaced by the appropriate weekday
name of the locale currently selected by @code{LC_TIME}. This is an
easy example, and wherever possible functions do things automatically
in this way.
But there are quite often situations when there is simply no functions
to perform the task or it is simply not possible to do the work
But there are quite often situations when there is simply no function
to perform the task, or it is simply not possible to do the work
automatically. For these cases it is necessary to access the
information in the locale directly. To do this the C library provides
two functions: @code{localeconv} and @code{nl_langinfo}. The former is
@ -379,14 +379,13 @@ as far as the system follows the Unix standards.
@subsection @code{localeconv}: It is portable but @dots{}
Together with the @code{setlocale} function the @w{ISO C} people
invented @code{localeconv} function. It is a masterpiece of misdesign.
It is expensive to use, it is not extendable, and is not generally
usable as it provides access only to the @code{LC_MONETARY} and
@code{LC_NUMERIC} related information. If it is applicable for a
certain situation it should nevertheless be used since it is very
portable. In general it is better to use the function @code{strfmon}
which can be used to format monetary amounts correctly according to the
selected locale by implicitly using this information.
invented the @code{localeconv} function. It is a masterpiece of poor
design. It is expensive to use, not extendable, and not generally
usable as it provides access to only @code{LC_MONETARY} and
@code{LC_NUMERIC} related information. Nevertheless, if it is
applicable to a given situation it should be used since it is very
portable. The function @code{strfmon} formats monetary amounts
according to the selected locale using this information.
@pindex locale.h
@cindex monetary value formatting
@cindex numeric value formatting
@ -407,8 +406,8 @@ value.
@comment locale.h
@comment ISO
@deftp {Data Type} {struct lconv}
This is the data type of the value returned by @code{localeconv}. Its
elements are described in the following subsections.
@code{localeconv}'s return value is of this data type. Its elements are
described in the following subsections.
@end deftp
If a member of the structure @code{struct lconv} has type @code{char},
@ -487,7 +486,7 @@ members have the same value.)
In the standard @samp{C} locale, both of these members have the value
@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
what to do when you find this the value; we recommend printing no
what to do when you find this value; we recommend printing no
fractional digits. (This locale also specifies the empty string for
@code{mon_decimal_point}, so printing any fractional digits would be
confusing!)
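The recommendation in this hunk, to print no fractional digits when the member is `CHAR_MAX`, could be applied as in the following sketch. The helper name `fraction_digits_to_print` is hypothetical.

```c
#include <limits.h>
#include <locale.h>

/* Hypothetical helper: number of fractional digits to print for a
   monetary amount in the local format.  Treats the "unspecified" value
   CHAR_MAX as zero, following the recommendation in the text.  */
static int
fraction_digits_to_print (void)
{
  const struct lconv *lc = localeconv ();

  return lc->frac_digits == CHAR_MAX ? 0 : lc->frac_digits;
}
```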
@ -521,8 +520,8 @@ The local currency symbol for the selected locale.
In the standard @samp{C} locale, this member has a value of @code{""}
(the empty string), meaning ``unspecified''. The ISO standard doesn't
say what to do when you find this value; we recommend you simply print
the empty string as you would print any other string found in the
appropriate member.
the empty string as you would print any other string pointed to by this
variable.
@item char *int_curr_symbol
The international currency symbol for the selected locale.
@ -533,9 +532,9 @@ three-letter abbreviation determined by the international standard
followed by a one-character separator (often a space).
In the standard @samp{C} locale, this member has a value of @code{""}
(the empty string), meaning ``unspecified''. We recommend you simply
print the empty string as you would print any other string found in the
appropriate member.
(the empty string), meaning ``unspecified''. We recommend you simply print
the empty string as you would print any other string pointed to by this
variable.
@item char p_cs_precedes
@itemx char n_cs_precedes
@ -547,8 +546,8 @@ negative amounts.
In the standard @samp{C} locale, both of these members have a value of
@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
what to do when you find this value, but we recommend printing the
currency symbol before the amount. That's right for most countries.
what to do when you find this value. We recommend printing the
currency symbol before the amount, which is right for most countries.
In other words, treat all nonzero values alike in these members.
The POSIX standard says that these two members apply to the
@ -573,7 +572,7 @@ negative amounts.
In the standard @samp{C} locale, both of these members have a value of
@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
what you should do when you find this value; we suggest you treat it as
one (print a space). In other words, treat all nonzero values alike in
1 (print a space). In other words, treat all nonzero values alike in
these members.
These members apply only to @code{currency_symbol}. When you use
@ -581,7 +580,7 @@ These members apply only to @code{currency_symbol}. When you use
@code{int_curr_symbol} itself contains the appropriate separator.
The POSIX standard says that these two members apply to the
@code{int_curr_symbol} as well as the @code{currency_symbol}. But an
@code{int_curr_symbol} as well as the @code{currency_symbol}. However, an
example in the @w{ISO C} standard clearly implies that they should apply
only to the @code{currency_symbol}---that the @code{int_curr_symbol}
contains any appropriate separator, so you should never print an
@ -592,16 +591,16 @@ printing international currency symbols, and print no extra space.
@end table
@node Sign of Money Amount, , Currency Symbol, The Lame Way to Locale Data
@subsubsection Printing the Sign of an Amount of Money
@subsubsection Printing the Sign of a Monetary Amount
These members of the @code{struct lconv} structure specify how to print
the sign (if any) in a monetary value.
the sign (if any) of a monetary value.
@table @code
@item char *positive_sign
@itemx char *negative_sign
These are strings used to indicate positive (or zero) and negative
(respectively) monetary quantities.
monetary quantities, respectively.
In the standard @samp{C} locale, both of these members have a value of
@code{""} (the empty string), meaning ``unspecified''.
@ -615,7 +614,7 @@ unreasonable.)
@item char p_sign_posn
@itemx char n_sign_posn
These members have values that are small integers indicating how to
These members are small integers that indicate how to
position the sign for nonnegative and negative monetary quantities,
respectively. (The string used by the sign is what was specified with
@code{positive_sign} or @code{negative_sign}.) The possible values are
@ -650,36 +649,35 @@ symbol.
It is not clear whether you should let these members apply to the
international currency format or not. POSIX says you should, but
intuition plus the examples in the @w{ISO C} standard suggest you should
not. We hope that someone who knows well the conventions for formatting
monetary quantities will tell us what we should recommend.
not. We hope that someone who knows the conventions for formatting
monetary quantities well will tell us what we should recommend.
@node The Elegant and Fast Way, , The Lame Way to Locale Data, Locale Information
@subsection Pinpoint Access to Locale Data
When writing the X/Open Portability Guide the authors realized that the
@code{localeconv} function is not enough to provide reasonable access to
the locale information. The information which was meant to be available
locale information. The information which was meant to be available
in the locale (as later specified in the POSIX.1 standard) requires more
possibilities to access it. Therefore the @code{nl_langinfo} function
ways to access it. Therefore the @code{nl_langinfo} function
was introduced.
@comment langinfo.h
@comment XOPEN
@deftypefun {char *} nl_langinfo (nl_item @var{item})
The @code{nl_langinfo} function can be used to access individual
elements of the locale categories. I.e., unlike the @code{localeconv}
function which always returns all the information @code{nl_langinfo}
lets the caller select what information is necessary. This is very
fast and it is no problem to call this function multiple times.
elements of the locale categories. Unlike the @code{localeconv}
function, which returns all the information, @code{nl_langinfo}
lets the caller select what information it requires. This is very
fast and it is not a problem to call this function multiple times.
The second advantage is that not only the numeric and monetary
formatting information is available. Also the information of the
A second advantage is that in addition to the numeric and monetary
formatting information, information from the
@code{LC_TIME} and @code{LC_MESSAGES} categories is available.
The type @code{nl_type} is defined in @file{nl_types.h}.
The argument @var{item} is a numeric values which must be one of the
values defined in the header @file{langinfo.h}. The X/Open standard
defines the following values:
The type @code{nl_item} is defined in @file{nl_types.h}.  The argument
@var{item} is a numeric value defined in the header @file{langinfo.h}.
The X/Open standard defines the following values:
@vtable @code
@item ABDAY_1
@ -698,7 +696,7 @@ corresponds to Sunday.
@itemx DAY_5
@itemx DAY_6
@itemx DAY_7
Similar to @code{ABDAY_1} etc, but here the return value is the
Similar to @code{ABDAY_1} etc., but here the return value is the
unabbreviated weekday name.
@item ABMON_1
@itemx ABMON_2
@ -712,7 +710,7 @@ unabbreviated weekday name.
@itemx ABMON_10
@itemx ABMON_11
@itemx ABMON_12
The return value is abbreviated name for the month names. @code{ABMON_1}
The return value is the abbreviated name of the month.  @code{ABMON_1}
corresponds to January.
@item MON_1
@itemx MON_2
@ -726,129 +724,127 @@ corresponds to January.
@itemx MON_10
@itemx MON_11
@itemx MON_12
Similar to @code{ABMON_1} etc but here the month names are not abbreviated.
Similar to @code{ABMON_1} etc., but here the month names are not abbreviated.
Here the first value @code{MON_1} also corresponds to January.
@item AM_STR
@itemx PM_STR
The return values are strings which can be used in the time representation
which uses to American 1 to 12 hours plus am/pm representation.
The return values are strings which can be used in the representation of time
as an hour from 1 to 12 plus an am/pm specifier.
Please note that in locales which do not know this time representation
these strings actually might be empty and therefore the am/pm format
Note that in locales which do not use this time representation
these strings might be empty, in which case the am/pm format
cannot be used at all.
@item D_T_FMT
The return value can be used as a format string for @code{strftime} to
represent time and date in a locale specific way.
represent time and date in a locale-specific way.
@item D_FMT
The return value can be used as a format string for @code{strftime} to
represent a date in a locale specific way.
represent a date in a locale-specific way.
@item T_FMT
The return value can be used as a format string for @code{strftime} to
represent time in a locale specific way.
represent time in a locale-specific way.
@item T_FMT_AMPM
The return value can be used as a format string for @code{strftime} to
represent time using the American-style am/pm format.
represent time in the am/pm format.
Please note that if the am/pm format does not make any sense for the
selected locale the returned value might be the same as the one for
Note that if the am/pm format does not make any sense for the
selected locale, the return value might be the same as the one for
@code{T_FMT}.
@item ERA
The return value is value representing the eras of time used in the
current locale.
The return value represents the era used in the current locale.
Most locales do not define this value. An example for a locale which
does define this value is the Japanese. Here the traditional data
representation is based on the eras measured by the reigns of the
emperors.
Most locales do not define this value. An example of a locale which
does define this value is the Japanese one. In Japan, the traditional
representation of dates includes the name of the era corresponding to
the then-emperor's reign.
Normally it should not be necessary to use this value directly. Using
the @code{E} modifier for its formats the @code{strftime} functions can
be made to use this information. The format of the returned string
is not specified and therefore one should not generalize the knowledge
about the representation on one system.
Normally it should not be necessary to use this value directly.
Specifying the @code{E} modifier in their format strings causes the
@code{strftime} functions to use this information. The format of the
returned string is not specified, and therefore you should not assume
knowledge of it on different systems.
@item ERA_YEAR
The return value describes the name years for the eras of this locale.
The return value gives the year in the relevant era of the locale.
As for @code{ERA} it should not be necessary to use this value directly.
@item ERA_D_T_FMT
This return value can be used as a format string for @code{strftime} to
represent time and date using the era representation in a locale
specific way.
represent dates and times in a locale-specific era-based way.
@item ERA_D_FMT
This return value can be used as a format string for @code{strftime} to
represent a date using the era representation in a locale specific way.
represent a date in a locale-specific era-based way.
@item ERA_T_FMT
This return value can be used as a format string for @code{strftime} to
represent time using the era representation in a locale specific way.
represent time in a locale-specific era-based way.
@item ALT_DIGITS
The return value is a representation of up to @math{100} values used to
represent the values @math{0} to @math{99}. As for @code{ERA} this
value is not intended to be used directly, but instead indirectly
through the @code{strftime} function. When the modifier @code{O} is
used for format which would use numerals to represent hours, minutes,
seconds, weekdays, months, or weeks the appropriate value for this
locale values is used instead of the number.
used in a format which would otherwise use numerals to represent hours,
minutes, seconds, weekdays, months, or weeks, the appropriate value for
the locale is used instead.
@item INT_CURR_SYMBOL
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{int_curr_symbol} element of the @code{struct lconv}.
@item CURRENCY_SYMBOL
@itemx CRNCYSTR
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{currency_symbol} element of the @code{struct lconv}.
@code{CRNCYSTR} is a deprecated alias, still required by Unix98.
@code{CRNCYSTR} is a deprecated alias still required by Unix98.
@item MON_DECIMAL_POINT
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{mon_decimal_point} element of the @code{struct lconv}.
@item MON_THOUSANDS_SEP
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{mon_thousands_sep} element of the @code{struct lconv}.
@item MON_GROUPING
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{mon_grouping} element of the @code{struct lconv}.
@item POSITIVE_SIGN
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{positive_sign} element of the @code{struct lconv}.
@item NEGATIVE_SIGN
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{negative_sign} element of the @code{struct lconv}.
@item INT_FRAC_DIGITS
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{int_frac_digits} element of the @code{struct lconv}.
@item FRAC_DIGITS
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{frac_digits} element of the @code{struct lconv}.
@item P_CS_PRECEDES
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{p_cs_precedes} element of the @code{struct lconv}.
@item P_SEP_BY_SPACE
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{p_sep_by_space} element of the @code{struct lconv}.
@item N_CS_PRECEDES
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{n_cs_precedes} element of the @code{struct lconv}.
@item N_SEP_BY_SPACE
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{n_sep_by_space} element of the @code{struct lconv}.
@item P_SIGN_POSN
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{p_sign_posn} element of the @code{struct lconv}.
@item N_SIGN_POSN
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{n_sign_posn} element of the @code{struct lconv}.
@item DECIMAL_POINT
@itemx RADIXCHAR
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{decimal_point} element of the @code{struct lconv}.
The name @code{RADIXCHAR} is a deprecated alias still used in Unix98.
@item THOUSANDS_SEP
@itemx THOUSEP
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{thousands_sep} element of the @code{struct lconv}.
The name @code{THOUSEP} is a deprecated alias still used in Unix98.
@item GROUPING
This value is the same as returned by @code{localeconv} in the
The same as the value returned by @code{localeconv} in the
@code{grouping} element of the @code{struct lconv}.
@item YESEXPR
The return value is a regular expression which can be used with the
@ -859,37 +855,37 @@ The return value is a regular expression which can be used with the
@code{regex} function to recognize a negative response to a yes/no
question.
@item YESSTR
The return value is a locale specific translation of the positive response
The return value is a locale-specific translation of the positive response
to a yes/no question.
Using this value is deprecated since it is a very special case of
message translation and this better can be handled using the message
message translation, and is better handled by the message
translation functions (@pxref{Message Translation}).
@item NOSTR
The return value is a locale specific translation of the negative response
The return value is a locale-specific translation of the negative response
to a yes/no question. What is said for @code{YESSTR} is also true here.
@end vtable
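
The correspondence between these items and the members of
@code{struct lconv} can be checked at run time.  The following is a
minimal sketch, not part of the library: the helper name
@code{numeric_items_match_lconv} is made up for illustration, and it
uses only the Unix98 names @code{RADIXCHAR} and @code{THOUSEP} on the
assumption that the longer names may require feature-test macros.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: return 1 if nl_langinfo and localeconv agree on
   the decimal point and thousands separator of the current locale,
   0 otherwise.  */
int
numeric_items_match_lconv (void)
{
  const struct lconv *lc = localeconv ();
  assert (lc != NULL);

  /* Each nl_langinfo result is used immediately, because a later call
     may overwrite the string returned by an earlier one.  */
  if (strcmp (nl_langinfo (RADIXCHAR), lc->decimal_point) != 0)
    return 0;
  return strcmp (nl_langinfo (THOUSEP), lc->thousands_sep) == 0;
}
```

The helper inspects whatever locale is currently selected, so a caller
would normally run @code{setlocale} first.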
The file @file{langinfo.h} defines a lot more symbols but none of them
is official. Using them is completely unportable and the format of the
return values might change. Therefore it is highly requested to not use
them in any situation.
is official. Using them is not portable, and the format of the
return values might change. Therefore we recommend you not use
them.
Please note that the return value for any valid argument can be used for
in all situations (with the possible exception of the am/pm time format
related values). If the user has not selected any locale for the
appropriate category @code{nl_langinfo} returns the information from the
Note that the return value for any valid argument can be used
in all situations (with the possible exception of the am/pm time formatting
codes). If the user has not selected any locale for the
appropriate category, @code{nl_langinfo} returns the information from the
@code{"C"} locale. It is therefore possible to use this function as
shown in the example below.
If the argument @var{item} is not valid the global variable @var{errno}
If the argument @var{item} is not valid, the global variable @var{errno}
is set to @code{EINVAL} and a @code{NULL} pointer is returned.
@end deftypefun
An example for the use of @code{nl_langinfo} is a function which has to
print a given date and time in the locale specific way. At first one
might think the since @code{strftime} internally uses the locale
information writing something like the following is enough:
An example of @code{nl_langinfo} usage is a function which has to
print a given date and time in a locale-specific way. At first one
might think that, since @code{strftime} internally uses the locale
information, writing something like the following is enough:
@smallexample
size_t
@ -913,37 +909,37 @@ i18n_time_n_data (char *s, size_t len, const struct tm *tp)
@}
@end smallexample
Now the date and time format which is explicitly selected for the locale
in place when the program runs is used. If the user selects the locale
Now it uses the date and time format of the locale
selected when the program runs. If the user selects the locale
correctly there should never be a misunderstanding over the time and
date format.
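
The approach described above can be sketched as follows.  This is an
illustration only: the helper name @code{format_locale_date_time} is
hypothetical, and it assumes that passing the @code{D_T_FMT} string
from @code{nl_langinfo} as the @code{strftime} format is all that is
needed.

```c
#include <assert.h>
#include <langinfo.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format *tp into buf using the date-and-time
   format string supplied by the current LC_TIME locale.  Returns the
   number of bytes written, or 0 if the result did not fit.  */
size_t
format_locale_date_time (char *buf, size_t len, const struct tm *tp)
{
  assert (buf != NULL && tp != NULL && len > 0);
  return strftime (buf, len, nl_langinfo (D_T_FMT), tp);
}
```

If no locale has been selected with @code{setlocale}, the format of the
@code{"C"} locale is used.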
@node Formatting Numbers, , Locale Information, Locales
@node Formatting Numbers, Locale Information, Locales
@section A dedicated function to format numbers
We have seen that the structure returned by @code{localeconv} as well as
the values given to @code{nl_langinfo} allow to retrieve the various
pieces of locale specific information to format numbers and monetary
amounts. But we have also seen that the rules underlying this
information are quite complex.
the values given to @code{nl_langinfo} allow you to retrieve the various
pieces of locale-specific information to format numbers and monetary
amounts. We have also seen that the underlying rules are quite complex.
Therefore the X/Open standards introduce a function which uses this
information from the locale and so makes it is for the user to format
Therefore the X/Open standards introduce a function which uses such
locale information, making it easier for the user to format
numbers according to these rules.
@deftypefun ssize_t strfmon (char *@var{s}, size_t @var{maxsize}, const char *@var{format}, @dots{})
The @code{strfmon} function is similar to the @code{strftime} function
in that it takes a description of a buffer (with size), a format string
and values to write into a buffer a textual representation of the values
according to the format string. As for @code{strftime} the function
in that it takes a buffer, its size, a format string,
and values to write into the buffer as text in a form specified
by the format string. Like @code{strftime}, the function
also returns the number of bytes written into the buffer.
There are two difference: @code{strfmon} can take more than one argument
and of course the format specification is different. The format string
consists as for @code{strftime} of normal text which is simply printed
and format specifiers, which here are also introduced using @samp{%}.
Following the @samp{%} the function allows similar to @code{printf} a
sequence of flags and other specifications before the format character:
There are two differences: @code{strfmon} can take more than one
argument, and, of course, the format specification is different. Like
@code{strftime}, the format string consists of normal text, which is
output as is, and format specifiers, which are indicated by a @samp{%}.
Immediately after the @samp{%}, you can optionally specify various flags
and formatting information before the main formatting character, in a
similar way to @code{printf}:
@itemize @bullet
@item
@ -956,77 +952,74 @@ fill character. By default this character is a space character.
Filling with this character is only performed if a left precision
is specified. It is not used to pad the output to the given field width.
@item @samp{^}
The number is printed without grouping the digits using the rules of the
current locale. By default grouping is enabled.
The number is printed without grouping the digits according to the rules
of the current locale. By default grouping is enabled.
@item @samp{+}, @samp{(}
At most one of these flags must be used. They select which format to
represent the sign of currency amount is used. By default and if
@samp{+} is used the locale equivalent to @math{+}/@math{-} is used. If
@samp{(} is used negative amounts are enclosed in parentheses. The
At most one of these flags can be used. They select the format used to
represent the sign of a currency amount. By default, and if
@samp{+} is given, the locale equivalent of @math{+}/@math{-} is used. If
@samp{(} is given, negative amounts are enclosed in parentheses. The
exact format is determined by the values of the @code{LC_MONETARY}
category of the locale selected at program runtime.
@item @samp{!}
The output will not contain the currency symbol.
@item @samp{-}
The output will be formatted right-justified instead left-justified if
the output does not fill the entire field width.
The output will be formatted left-justified instead of right-justified if
it does not fill the entire field width.
@end table
@end itemize
The next part of a specification is an, again optional, specification of
the field width. The width is given by digits following the flags. If
no width is specified it is assumed to be @math{0}. The width value is
used after it is determined how much space the printed result needs. If
it does not require fewer characters than specified by the width value
nothing happens. Otherwise the output is extended to use as many
characters as the width says by filling with spaces. At which side
depends on whether the @samp{-} flag was given or not. If it was given,
the spaces are added at the right, making the output right-justified and
vice versa.
The next part of a specification is an optional field width. If no
width is specified @math{0} is taken. During output, the function first
determines how much space is required. If it requires at least as many
characters as given by the field width, it is output using as much space
as necessary. Otherwise, it is extended to use the full width by
filling with the space character. The presence or absence of the
@samp{-} flag determines the side at which such padding occurs. If
present, the spaces are added at the right making the output
left-justified, and vice versa.
So far the format looks familiar as it is similar to @code{printf} or
@code{strftime} formats. But the next two fields introduce something
new. The first one, if available, is introduced by a @samp{#} character
which is followed by a decimal digit string. The value of the digit
string specifies the width the formatted digits left to the radix
character. This does @emph{not} include the grouping character needed
if the @samp{^} flag is not given. If the space needed to print the
number does not fill the whole width the field is padded at the left
side with the fill character which can be selected using the @samp{=}
flag and which by default is a space. For example, if the field width
is selected as 6 and the number is @math{123}, the fill character is
@samp{*} the result will be @samp{***123}.
So far the format looks familiar, being similar to the @code{printf} and
@code{strftime} formats. However, the next two optional fields
introduce something new. The first one is a @samp{#} character followed
by a decimal digit string. The value of the digit string specifies the
number of @emph{digit} positions to the left of the decimal point (or
equivalent). This does @emph{not} include the grouping character when
the @samp{^} flag is not given. If the space needed to print the number
does not fill the whole width, the field is padded at the left side with
the fill character, which can be selected using the @samp{=} flag and by
default is a space. For example, if this width is 6, the number is
@math{123}, and the fill character is @samp{*}, the result will be
@samp{***123}.
The next field is introduced by a @samp{.} (period) and consists of
another decimal digit string. Its value describes the number of
characters printed after the radix character. The default is
selected from the current locale (@code{frac_digits},
@code{int_frac_digits}, see @pxref{General Numeric}). If the exact
representation needs more digits than those specified by the field width
the displayed value is rounded. In case the number of fractional digits
is selected to be zero, no radix character is printed.
The second optional field starts with a @samp{.} (period) and consists
of another decimal digit string. Its value describes the number of
digits printed after the decimal point. The default is selected
from the current locale (@code{frac_digits}, @code{int_frac_digits},
@pxref{General Numeric}). If the exact representation needs more digits
than this precision allows, the displayed value is rounded. If the
number of fractional digits is selected to be zero, no decimal point is
printed.
As a GNU extension the @code{strfmon} implementation in the GNU libc
allows as the next field an optional @samp{L} as a format modifier. If
this modifier is given the argument is expected to be a @code{long
double} instead of a @code{double} value.
As a GNU extension, the @code{strfmon} implementation in the GNU libc
allows an optional @samp{L} next as a format modifier. If this modifier
is given, the argument is expected to be a @code{long double} instead of
a @code{double} value.
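
A minimal sketch of the @samp{L} modifier follows.  The helper name
@code{format_long_double_amount} is hypothetical; the sketch only
shows that the modifier must match the type of the argument actually
passed.

```c
#include <assert.h>
#include <monetary.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: format a long double amount using the GNU `L'
   modifier.  Returns whatever strfmon returns.  */
ssize_t
format_long_double_amount (char *buf, size_t len, long double amount)
{
  assert (buf != NULL && len > 0);
  /* Without `L', strfmon would fetch a double from the argument list,
     so the modifier has to agree with the argument type.  */
  return strfmon (buf, len, "%Ln", amount);
}
```

Because @samp{L} is a GNU extension, code that must be portable should
convert the value to @code{double} and use plain @samp{%n} instead.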
Finally as the last component of the format there must come a format
specifying. There are three specifiers defined:
Finally, the last component is a format specifier. There are three
specifiers defined:
@table @asis
@item @samp{i}
The argument is formatted according to the locale's rules to format an
international currency value.
Use the locale's rules for formatting an international currency value.
@item @samp{n}
The argument is formatted according to the locale's rules to format an
national currency value.
Use the locale's rules for formatting a national currency value.
@item @samp{%}
Creates a @samp{%} in the output. There must be no flag, width
Place a @samp{%} in the output. There must be no flag, width
specifier or modifier given; only @samp{%%} is allowed.
@end table
As it is done for @code{printf}, the function reads the format string
As for @code{printf}, the function reads the format string
from left to right and uses the values passed to the function following
the format string. The values are expected to be either of type
@code{double} or @code{long double}, depending on the presence of the
@ -1034,15 +1027,15 @@ modifier @samp{L}. The result is stored in the buffer pointed to by
@var{s}. At most @var{maxsize} characters are stored.
The return value of the function is the number of characters stored in
@var{s}, including the terminating NUL byte. If the number of
characters stored would exceed @var{maxsize} the function returns
@var{s}, not including the terminating null byte. If the number of
characters stored would exceed @var{maxsize}, the function returns
@math{-1} and the content of the buffer @var{s} is unspecified. In this
case @code{errno} is set to @code{E2BIG}.
@end deftypefun
A few examples should make it clear how to use this function. It is
A few examples should make clear how the function works. It is
assumed that all the following pieces of code are executed in a program
which uses the locale valid for the USA (@code{en_US}). The simplest
which uses the USA locale (@code{en_US}). The simplest
form of the format is this:
@smallexample
@ -1055,15 +1048,15 @@ The output produced is
"@@$123.45@@-$567.89@@$12,345.68@@"
@end smallexample
We can notice several things here. First, the width for all formats is
different. We have not specified a width in the format string and so
this is no wonder. Second, the third number is printed using thousands
separators. The thousands separator for the @code{en_US} locale is a
comma. Beside this the number is rounded. The @math{.678} are rounded
to @math{.68} since the format does not specify a precision and the
default value in the locale is @math{2}. A last thing is that the
national currency symbol is printed since @samp{%n} was used, not
@samp{i}. The next example shows how we can align the output.
We can notice several things here. First, the widths of the output
numbers are different. We have not specified a width in the format
string, and so this is no wonder. Second, the third number is printed
using thousands separators. The thousands separator for the
@code{en_US} locale is a comma. The number is also rounded.
@math{.678} is rounded to @math{.68} since the format does not specify a
precision and the default value in the locale is @math{2}. Finally,
note that the national currency symbol is printed since @samp{%n} was
used, not @samp{%i}. The next example shows how we can align the output.
@smallexample
strfmon (buf, 100, "@@%=*11n@@%=*11n@@%=*11n@@", 123.45, -567.89, 12345.678);
@ -1076,13 +1069,13 @@ The output this time is:
"@@ $123.45@@ -$567.89@@ $12,345.68@@"
@end smallexample
Two things stand out. First, all fields have the same width (eleven
Two things stand out. Firstly, all fields have the same width (eleven
characters) since this is the width given in the format and since no
number required more characters to be printed. The second important
point is that the fill character is not used. This is correct since the
white space was not used to fill the space specified by the right
precision, but instead it is used to fill to the given width. The
difference becomes obvious if we now add a right width specification.
white space was not used to achieve a precision given by a @samp{#}
modifier, but instead to fill to the given width. The difference
becomes obvious if we now add a width specification.
@smallexample
strfmon (buf, 100, "@@%=*11#5n@@%=*11#5n@@%=*11#5n@@",
@ -1096,14 +1089,14 @@ The output is
"@@ $***123.45@@-$***567.89@@ $12,345.68@@"
@end smallexample
Here we can see that all the currency symbols are now aligned and the
space between the currency sign and the number is filled with the
selected fill character. Please note that although the right precision
is selected to be @math{5} and @math{123.45} has three characters right
of the radix character, the space is filled with three asterisks. This
is correct since as explained above, the right precision does not count
the characters used for the thousands separators in. One last example
should explain the remaining functionality.
Here we can see that all the currency symbols are now aligned, and that
the space between the currency sign and the number is filled with the
selected fill character. Note that although the width is selected to be
@math{5} and @math{123.45} has three digits left of the decimal point,
the space is filled with three asterisks. This is correct since, as
explained above, the width does not include the positions used to store
thousands separators. One last example should explain the remaining
functionality.
@smallexample
strfmon (buf, 100, "@@%=0(16#5.3i@@%=0(16#5.3i@@%=0(16#5.3i@@",
@ -1117,14 +1110,15 @@ This rather complex format string produces the following output:
"@@ USD 000123.450 @@(USD 000567.890)@@ USD 12,345.678 @@"
@end smallexample
The most noticeable change is the use of the alternative style to
represent negative numbers. In financial circles it is often done using
parentheses and this is what the @samp{(} flag selected. The fill character
is now @samp{0}. Please note that this @samp{0} character is not
regarded as a numeric zero and therefore the first and second number are
not printed using a thousands separator. Since we use in the format the
specifier @samp{i} instead of @samp{n} now the international form of the
The most noticeable change is the alternative way of representing
negative numbers. In financial circles this is often done using
parentheses, and this is what the @samp{(} flag selected. The fill
character is now @samp{0}. Note that this @samp{0} character is not
regarded as a numeric zero, and therefore the first and second numbers
are not printed using a thousands separator. Since we used the format
specifier @samp{i} instead of @samp{n}, the international form of the
currency symbol is used. This is a four-letter string, in this case
@code{"USD "}. The last point is that since the left precision is
selected to be three the first and second number are printed with an
extra zero at the end and the third number is printed unrounded.
@code{"USD "}. The last point is that since the precision right of the
decimal point is selected to be three, the first and second numbers are
printed with an extra zero at the end and the third number is printed
without rounding.
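
To experiment with these formats, the calls can be collected into one
small driver.  This is a sketch only: the function name
@code{print_walkthrough_examples} is hypothetical, and the three
argument values are assumed to be the same in every call.  The output
matches the examples above only if a US English locale has been
selected with @code{setlocale} beforehand; in the @code{"C"} locale the
currency symbol and grouping differ.

```c
#include <assert.h>
#include <monetary.h>
#include <stdio.h>

/* Hypothetical driver: run the three format strings quoted in the
   walkthrough and print each result to OUT.  Returns the number of
   strfmon calls that failed.  */
int
print_walkthrough_examples (FILE *out)
{
  char buf[100];
  int failures = 0;

  assert (out != NULL);

  /* Field width 11, fill character `*' (unused without `#').  */
  if (strfmon (buf, sizeof buf, "@%=*11n@%=*11n@%=*11n@",
               123.45, -567.89, 12345.678) == -1)
    failures++;
  else
    fprintf (out, "%s\n", buf);

  /* As before, plus five digit positions left of the decimal point.  */
  if (strfmon (buf, sizeof buf, "@%=*11#5n@%=*11#5n@%=*11#5n@",
               123.45, -567.89, 12345.678) == -1)
    failures++;
  else
    fprintf (out, "%s\n", buf);

  /* International form, parentheses for negatives, three decimals.  */
  if (strfmon (buf, sizeof buf, "@%=0(16#5.3i@%=0(16#5.3i@%=0(16#5.3i@",
               123.45, -567.89, 12345.678) == -1)
    failures++;
  else
    fprintf (out, "%s\n", buf);

  return failures;
}
```

Keeping the format strings as literals lets the compiler check them
against the arguments.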