From 0a57b83e4aecdc14c62a5a998288b5b40c799d4d Mon Sep 17 00:00:00 2001 From: Alexandre Oliva Date: Wed, 29 Jan 2014 05:20:37 -0200 Subject: [PATCH] * manual/macros.texi: Introduce macros to document multi thread, asynchronous signal and asynchronous cancellation safety properties. * manual/intro.texi: Introduce the properties themselves. --- ChangeLog | 7 + NEWS | 3 + manual/intro.texi | 685 +++++++++++++++++++++++++++++++++++++++++++++ manual/macros.texi | 165 +++++++++++ 4 files changed, 860 insertions(+) diff --git a/ChangeLog b/ChangeLog index d98157b41a..dc40e27bf9 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,10 @@ +2014-01-29 Alexandre Oliva + + * manual/macros.texi: Introduce macros to document multi + thread, asynchronous signal and asynchronous cancellation + safety properties. + * manual/intro.texi: Introduce the properties themselves. + 2014-01-27 Kaz Kojima * sysdeps/sh/sh4/Makefile: New file. diff --git a/NEWS b/NEWS index 32ce337f47..d47151603a 100644 --- a/NEWS +++ b/NEWS @@ -118,6 +118,9 @@ Version 2.19 * The _BSD_SOURCE feature test macro no longer enables BSD interfaces that conflict with POSIX. The libbsd-compat library (which was a dummy library that did nothing) has also been removed. + +* Preliminary documentation about Multi-Thread, Async-Signal and + Async-Cancel Safety has been added. Version 2.18 diff --git a/manual/intro.texi b/manual/intro.texi index deaf089b10..fb501a67f9 100644 --- a/manual/intro.texi +++ b/manual/intro.texi @@ -159,6 +159,14 @@ Utilities standard} (POSIX.2) are also implemented in @theglibc{}. These include utilities for dealing with regular expressions and other pattern matching facilities (@pxref{Pattern Matching}). +@menu +* POSIX Safety Concepts:: Safety concepts from POSIX. +* Unsafe Features:: Features that make functions unsafe. +* Conditionally Safe Features:: Features that make functions unsafe + in the absence of workarounds. +* Other Safety Remarks:: Additional safety features and remarks. 
+@end menu + @comment Roland sez: @comment The GNU C library as it stands conforms to 1003.2 draft 11, which @comment specifies: @@ -172,6 +180,683 @@ pattern matching facilities (@pxref{Pattern Matching}). @comment (not yet implemented) @comment confstr +@node POSIX Safety Concepts, Unsafe Features, , POSIX +@subsubsection POSIX Safety Concepts +@cindex POSIX Safety Concepts + +This manual documents various safety properties of @glibcadj{} +functions, in lines that follow their prototypes and look like: + +@sampsafety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} + +The properties are assessed according to the criteria set forth in the +POSIX standard for such safety contexts as Thread-, Async-Signal- and +Async-Cancel-Safety. Intuitive definitions of these properties, +attempting to capture the meaning of the standard definitions, follow. + +@itemize @bullet + +@item +@cindex MT-Safe +@cindex Thread-Safe +@code{MT-Safe} or Thread-Safe functions are safe to call in the presence +of other threads. MT, in MT-Safe, stands for Multi Thread. + +Being MT-Safe does not imply a function is atomic, nor that it uses any +of the memory synchronization mechanisms POSIX exposes to users. It is +even possible that calling MT-Safe functions in sequence does not yield +an MT-Safe combination. For example, having a thread call two MT-Safe +functions one right after the other does not guarantee behavior +equivalent to atomic execution of a combination of both functions, since +concurrent calls in other threads may interfere in a destructive way. + +Whole-program optimizations that could inline functions across library +interfaces may expose unsafe reordering, and so performing inlining +across the @glibcadj{} interface is not recommended. The documented +MT-Safety status is not guaranteed under whole-program optimization. +However, functions defined in user-visible headers are designed to be +safe for inlining. 
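As a minimal sketch of the point about combining MT-Safe calls, the C fragment below uses flockfile and funlockfile, the POSIX stream-locking functions, so that two fputs calls behave as one unit with respect to other threads writing to the same stream. The helper name write_record is hypothetical and not part of @theglibc{}; it only illustrates one way a caller might restore atomicity for a pair of calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper, not part of glibc: writes a two-line record to
   STREAM so that output from other threads cannot land between the two
   lines.  Returns 0 on success, -1 if either write failed.  */
int
write_record (FILE *stream, const char *first, const char *second)
{
  int status = 0;

  /* Each fputs call is MT-Safe by itself, but the pair is not atomic.
     flockfile makes the calling thread the owner of STREAM until the
     matching funlockfile.  */
  flockfile (stream);
  if (fputs (first, stream) == EOF || fputs (second, stream) == EOF)
    status = -1;
  funlockfile (stream);

  return status;
}
```

This sketch assumes that other threads also use stdio functions that take the stream lock; unlocked variants such as fputs_unlocked would bypass it.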
+ + +@item +@cindex AS-Safe +@cindex Async-Signal-Safe +@code{AS-Safe} or Async-Signal-Safe functions are safe to call from +asynchronous signal handlers. AS, in AS-Safe, stands for Asynchronous +Signal. + +Many functions that are AS-Safe may set @code{errno}, or modify the +floating-point environment, because their doing so does not make them +unsuitable for use in signal handlers. However, programs could +misbehave should asynchronous signal handlers modify this thread-local +state, and the signal handling machinery cannot be counted on to +preserve it. Therefore, signal handlers that call functions that may +set @code{errno} or modify the floating-point environment @emph{must} +save their original values, and restore them before returning. + + +@item +@cindex AC-Safe +@cindex Async-Cancel-Safe +@code{AC-Safe} or Async-Cancel-Safe functions are safe to call when +asynchronous cancellation is enabled. AC in AC-Safe stands for +Asynchronous Cancellation. + +The POSIX standard defines only three functions to be AC-Safe, namely +@code{pthread_cancel}, @code{pthread_setcancelstate}, and +@code{pthread_setcanceltype}. At present @theglibc{} provides no +guarantees beyond these three functions, but does document which +functions are presently AC-Safe. This documentation is provided for use +by @theglibc{} developers. + +Just like signal handlers, cancellation cleanup routines must configure +the floating point environment they require. The routines cannot assume +a floating point environment, particularly when asynchronous +cancellation is enabled. If the configuration of the floating point +environment cannot be performed atomically then it is also possible that +the environment encountered is internally inconsistent. 
+ + +@item +@cindex MT-Unsafe +@cindex Thread-Unsafe +@cindex AS-Unsafe +@cindex Async-Signal-Unsafe +@cindex AC-Unsafe +@cindex Async-Cancel-Unsafe +@code{MT-Unsafe}, @code{AS-Unsafe}, @code{AC-Unsafe} functions are not +safe to call within the safety contexts described above. Calling them +within such contexts invokes undefined behavior. + +Functions not explicitly documented as safe in a safety context should +be regarded as Unsafe. + + +@item +@cindex Preliminary +@code{Preliminary} safety properties are documented, indicating these +properties may @emph{not} be counted on in future releases of +@theglibc{}. + +Such preliminary properties are the result of an assessment of the +properties of our current implementation, rather than of what is +mandated and permitted by current and future standards. + +Although we strive to abide by the standards, in some cases our +implementation is safe even when the standard does not demand safety, +and in other cases our implementation does not meet the standard safety +requirements. The latter are most likely bugs; the former, when marked +as @code{Preliminary}, should not be counted on: future standards may +require changes that are not compatible with the additional safety +properties afforded by the current implementation. + +Furthermore, the POSIX standard does not offer a detailed definition of +safety. We assume that, by ``safe to call'', POSIX means that, as long +as the program does not invoke undefined behavior, the ``safe to call'' +function behaves as specified, and does not cause other functions to +deviate from their specified behavior. We have chosen to use its loose +definitions of safety, not because they are the best definitions to use, +but because choosing them harmonizes this manual with POSIX. + +Please keep in mind that these are preliminary definitions and +annotations, and certain aspects of the definitions are still under +discussion and might be subject to clarification or change. 
+ +Over time, we envision evolving the preliminary safety notes into stable +commitments, as stable as those of our interfaces. As we do, we will +remove the @code{Preliminary} keyword from safety notes. As long as the +keyword remains, however, they are not to be regarded as a promise of +future behavior. + + +@end itemize + +Other keywords that appear in safety notes are defined in subsequent +sections. + + +@node Unsafe Features, Conditionally Safe Features, POSIX Safety Concepts, POSIX +@subsubsection Unsafe Features +@cindex Unsafe Features + +Functions that are unsafe to call in certain contexts are annotated with +keywords that document their features that make them unsafe to call. +AS-Unsafe features in this section indicate the functions are never safe +to call when asynchronous signals are enabled. AC-Unsafe features +indicate they are never safe to call when asynchronous cancellation is +enabled. There are no MT-Unsafe marks in this section. + +@itemize @bullet + +@item @code{lock} +@cindex lock + +Functions marked with @code{lock} as an AS-Unsafe feature may be +interrupted by a signal while holding a non-recursive lock. If the +signal handler calls another such function that takes the same lock, the +result is a deadlock. + +Functions annotated with @code{lock} as an AC-Unsafe feature may, if +cancelled asynchronously, fail to release a lock that would have been +released if their execution had not been interrupted by asynchronous +thread cancellation. Once a lock is left taken, attempts to take that +lock will block indefinitely. + + +@item @code{corrupt} +@cindex corrupt + +Functions marked with @code{corrupt} as an AS-Unsafe feature may corrupt +data structures and misbehave when they interrupt, or are interrupted +by, another such function. Unlike functions marked with @code{lock}, +these take recursive locks to avoid MT-Safety problems, but this is not +enough to stop a signal handler from observing a partially-updated data +structure. 
Further corruption may arise from the interrupted function's +failure to notice updates made by signal handlers. + +Functions marked with @code{corrupt} as an AC-Unsafe feature may leave +data structures in a corrupt, partially updated state. Subsequent uses +of the data structure may misbehave. + +@c A special case, probably not worth documenting separately, involves +@c reallocing, or even freeing pointers. Any case involving free could +@c be easily turned into an ac-safe leak by resetting the pointer before +@c releasing it; I don't think we have any case that calls for this sort +@c of fixing. Fixing the realloc cases would require a new interface: +@c instead of @code{ptr=realloc(ptr,size)} we'd have to introduce +@c @code{acsafe_realloc(&ptr,size)} that would modify ptr before +@c releasing the old memory. The ac-unsafe realloc could be implemented +@c in terms of an internal interface with this semantics (say +@c __acsafe_realloc), but since realloc can be overridden, the function +@c we call to implement realloc should not be this internal interface, +@c but another internal interface that calls __acsafe_realloc if realloc +@c was not overridden, and calls the overridden realloc with async +@c cancel disabled. --lxoliva + + +@item @code{heap} +@cindex heap + +Functions marked with @code{heap} may call heap memory management +functions from the @code{malloc}/@code{free} family of functions and are +only as safe as those functions. This note is thus equivalent to: + +@sampsafety{@asunsafe{@asulock{}}@acunsafe{@aculock{} @acsfd{} @acsmem{}}} + + +@c Check for cases that should have used plugin instead of or in +@c addition to this. Then, after rechecking gettext, adjust i18n if +@c needed. +@item @code{dlopen} +@cindex dlopen + +Functions marked with @code{dlopen} use the dynamic loader to load +shared libraries into the current execution image. 
This involves +opening files, mapping them into memory, allocating additional memory, +resolving symbols, applying relocations and more, all of this while +holding internal dynamic loader locks. + +The locks are enough for these functions to be AS- and AC-Unsafe, but +other issues may arise. At present this is a placeholder for all +potential safety issues raised by @code{dlopen}. + +@c dlopen runs init and fini sections of the module; does this mean +@c dlopen always implies plugin? + + +@item @code{plugin} +@cindex plugin + +Functions annotated with @code{plugin} may run code from plugins that +may be external to @theglibc{}. Such plugin functions are assumed to be +MT-Safe, AS-Unsafe and AC-Unsafe. Examples of such plugins are stack +@cindex NSS +unwinding libraries, name service switch (NSS) and character set +@cindex iconv +conversion (iconv) back-ends. + +Although the plugins mentioned as examples are all brought in by means +of dlopen, the @code{plugin} keyword does not imply any direct +involvement of the dynamic loader or the @code{libdl} interfaces, those +are covered by @code{dlopen}. For example, if one function loads a +module and finds the addresses of some of its functions, while another +just calls those already-resolved functions, the former will be marked +with @code{dlopen}, whereas the latter will get the @code{plugin}. When +a single function takes all of these actions, then it gets both marks. + + +@item @code{i18n} +@cindex i18n + +Functions marked with @code{i18n} may call internationalization +functions of the @code{gettext} family and will be only as safe as those +functions. This note is thus equivalent to: + +@sampsafety{@mtsafe{@mtsenv{}}@asunsafe{@asucorrupt{} @ascuheap{} @ascudlopen{}}@acunsafe{@acucorrupt{}}} + + +@item @code{timer} +@cindex timer + +Functions marked with @code{timer} use the @code{alarm} function or +similar to set a time-out for a system call or a long-running operation. 
+In a multi-threaded program, there is a risk that the time-out signal +will be delivered to a different thread, thus failing to interrupt the +intended thread. Besides being MT-Unsafe, such functions are always +AS-Unsafe, because calling them in signal handlers may interfere with +timers set in the interrupted code, and AC-Unsafe, because there is no +safe way to guarantee an earlier timer will be reset in case of +asynchronous cancellation. + +@end itemize + + +@node Conditionally Safe Features, Other Safety Remarks, Unsafe Features, POSIX +@subsubsection Conditionally Safe Features +@cindex Conditionally Safe Features + +For some features that make functions unsafe to call in certain +contexts, there are known ways to avoid the safety problem other than +refraining from calling the function altogether. The keywords that +follow refer to such features, and each of their definitions indicates +how the whole program needs to be constrained in order to remove the +safety problem indicated by the keyword. Only when all the reasons that +make a function unsafe are observed and addressed, by applying the +documented constraints, does the function become safe to call in a +context. + +@itemize @bullet + +@item @code{init} +@cindex init + +Functions marked with @code{init} as an MT-Unsafe feature perform +MT-Unsafe initialization when they are first called. + +Calling such a function at least once in single-threaded mode removes +this specific cause for the function to be regarded as MT-Unsafe. If no +other cause for that remains, the function can then be safely called +after other threads are started. + +Functions marked with @code{init} as an AS- or AC-Unsafe feature use the +internal @code{libc_once} machinery or similar to initialize internal +data structures. + +If a signal handler interrupts such an initializer, and calls any +function that also performs @code{libc_once} initialization, it will +deadlock if the thread library has been loaded. 
+ +Furthermore, if an initializer is partially complete before it is +canceled or interrupted by a signal whose handler requires the same +initialization, some or all of the initialization may be performed more +than once, leaking resources or even resulting in corrupt internal data. + +Applications that need to call functions marked with @code{init} as an +AS- or AC-Unsafe feature should ensure the initialization is performed +before configuring signal handlers or enabling cancellation, so that the +AS- and AC-Safety issues related with @code{libc_once} do not arise. + +@c We may have to extend the annotations to cover conditions in which +@c initialization may or may not occur, since an initial call in a safe +@c context is no use if the initialization doesn't take place at that +@c time: it doesn't remove the risk for later calls. + + +@item @code{race} +@cindex race + +Functions annotated with @code{race} as an MT-Safety issue operate on +objects in ways that may cause data races or similar forms of +destructive interference out of concurrent execution. In some cases, +the objects are passed to the functions by users; in others, they are +used by the functions to return values to users; in others, they are not +even exposed to users. + +We consider access to objects passed as (indirect) arguments to +functions to be data race free. The assurance of data race free objects +is the caller's responsibility. We will not mark a function as +MT-Unsafe or AS-Unsafe if it misbehaves when users fail to take the +measures required by POSIX to avoid data races when dealing with such +objects. As a general rule, if a function is documented as reading from +an object passed (by reference) to it, or modifying it, users ought to +use memory synchronization primitives to avoid data races just as they +would should they perform the accesses themselves rather than by calling +the library function. 
@code{FILE} streams are the exception to the +general rule, in that POSIX mandates the library to guard against data +races in many functions that manipulate objects of this specific opaque +type. We regard this as a convenience provided to users, rather than as +a general requirement whose expectations should extend to other types. + +In order to remind users that guarding certain arguments is their +responsibility, we will annotate functions that take objects of certain +types as arguments. We draw the line for objects passed by users as +follows: objects whose types are exposed to users, and that users are +expected to access directly, such as memory buffers, strings, and +various user-visible @code{struct} types, do @emph{not} give reason for +functions to be annotated with @code{race}. It would be noisy and +redundant with the general requirement, and not many would be surprised +by the library's lack of internal guards when accessing objects that can +be accessed directly by users. + +As for objects that are opaque or opaque-like, in that they are to be +manipulated only by passing them to library functions (e.g., +@code{FILE}, @code{DIR}, @code{obstack}, @code{iconv_t}), there might be +additional expectations as to internal coordination of access by the +library. We will annotate, with @code{race} followed by a colon and the +argument name, functions that take such objects but that do not take +care of synchronizing access to them by default. For example, +@code{FILE} stream @code{unlocked} functions will be annotated, but +those that perform implicit locking on @code{FILE} streams by default +will not, even though the implicit locking may be disabled on a +per-stream basis. + +In either case, we will not regard as MT-Unsafe functions that may +access user-supplied objects in unsafe ways should users fail to ensure +the accesses are well defined. 
The notion prevails that users are +expected to safeguard against data races any user-supplied objects that +the library accesses on their behalf. + +@c The above describes @mtsrace; @mtasurace is described below. + +This user responsibility does not apply, however, to objects controlled +by the library itself, such as internal objects and static buffers used +to return values from certain calls. When the library doesn't guard +them against concurrent uses, these cases are regarded as MT-Unsafe and +AS-Unsafe (although the @code{race} mark under AS-Unsafe will be omitted +as redundant with the one under MT-Unsafe). As in the case of +user-exposed objects, the mark may be followed by a colon and an +identifier. The identifier groups all functions that operate on a +certain unguarded object; users may avoid the MT-Safety issues related +to unguarded concurrent access to such internal objects by creating a +non-recursive mutex associated with the identifier, and always holding the +mutex when calling any function marked as racy on that identifier, as +they would have to should the identifier be an object under user +control. The non-recursive mutex avoids the MT-Safety issue, but it +trades one AS-Safety issue for another, so use in asynchronous signal +handlers remains undefined. + +When the identifier relates to a static buffer used to hold return +values, the mutex must be held for as long as the buffer remains in use +by the caller. Many functions that return pointers to static buffers +offer reentrant variants that store return values in caller-supplied +buffers instead. In some cases, such as @code{tmpnam}, the variant is +chosen not by calling an alternate entry point, but by passing a +non-@code{NULL} pointer to the buffer in which the returned values are +to be stored. These variants are generally preferable in multi-threaded +programs, although some of them are not MT-Safe because of other +internal buffers, also documented with @code{race} notes. 
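The two workarounds described above for static return buffers can be sketched in C as follows. The sketch assumes, for illustration only, that gmtime is among the functions marked as racy on an identifier named tmbuf; the mutex and both helper names are hypothetical and belong to the application, not to @theglibc{}.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Hypothetical application-level mutex standing for the identifier that
   groups the racy functions (assumed here to be tmbuf).  */
static pthread_mutex_t tmbuf_lock = PTHREAD_MUTEX_INITIALIZER;

/* Workaround 1: hold the mutex for as long as the static buffer is in
   use, that is, until its contents have been copied to OUT.
   Returns 0 on success, -1 on failure.  */
int
utc_broken_down_locked (time_t t, struct tm *out)
{
  int status = -1;
  struct tm *p;

  pthread_mutex_lock (&tmbuf_lock);
  p = gmtime (&t);
  if (p != NULL)
    {
      *out = *p;
      status = 0;
    }
  pthread_mutex_unlock (&tmbuf_lock);
  return status;
}

/* Workaround 2: the reentrant variant stores its result in a
   caller-supplied buffer, so no shared static buffer is involved.
   Returns 0 on success, -1 on failure.  */
int
utc_broken_down_reentrant (time_t t, struct tm *out)
{
  return gmtime_r (&t, out) != NULL ? 0 : -1;
}
```

The first form only helps if every caller of every function sharing the same static buffer takes the same mutex, and it remains unsuitable for signal handlers for the reason given above.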
+ + +@item @code{const} +@cindex const + +Functions marked with @code{const} as an MT-Safety issue non-atomically +modify internal objects that are better regarded as constant, because a +substantial portion of @theglibc{} accesses them without +synchronization. Unlike @code{race}, which causes both readers and +writers of internal objects to be regarded as MT-Unsafe and AS-Unsafe, +this mark is applied to writers only. Writers remain equally MT- and +AS-Unsafe to call, but the then-mandatory constness of objects they +modify enables readers to be regarded as MT-Safe and AS-Safe (as long as +no other reasons for them to be unsafe remain), since the lack of +synchronization is not a problem when the objects are effectively +constant. + +The identifier that follows the @code{const} mark will appear by itself +as a safety note in readers. Programs that wish to work around this +safety issue, so as to call writers, may use a non-recursive +@code{rwlock} associated with the identifier, and guard @emph{all} calls +to functions marked with @code{const} followed by the identifier with a +write lock, and @emph{all} calls to functions marked with the identifier +by itself with a read lock. The non-recursive locking removes the +MT-Safety problem, but it trades one AS-Safety problem for another, so +use in asynchronous signal handlers remains undefined. + +@c But what if, instead of marking modifiers with const:id and readers +@c with just id, we marked writers with race:id and readers with ro:id? +@c Instead of having to define each instance of “id”, we'd have a +@c general pattern governing all such “id”s, wherein race:id would +@c suggest the need for an exclusive/write lock to make the function +@c safe, whereas ro:id would indicate “id” is expected to be read-only, +@c but if any modifiers are called (while holding an exclusive lock), +@c then ro:id-marked functions ought to be guarded with a read lock for +@c safe operation. 
ro:env or ro:locale, for example, seems to convey +@c more clearly the expectations and the meaning, than just env or +@c locale. + + +@item @code{sig} +@cindex sig + +Functions marked with @code{sig} as an MT-Safety issue (which implies an +identical AS-Safety issue, omitted for brevity) may temporarily install +a signal handler for internal purposes, which may interfere with other +uses of the signal, identified after a colon. + +This safety problem can be worked around by ensuring that no other uses +of the signal will take place for the duration of the call. Holding a +non-recursive mutex while calling all functions that use the same +temporary signal, blocking that signal before the call, and resetting its +handler afterwards are recommended. + +There is no safe way to guarantee the original signal handler is +restored in case of asynchronous cancellation; therefore, so-marked +functions are also AC-Unsafe. + +@c fixme: at least deferred cancellation should get it right, and would +@c obviate the restoring bit below, and the qualifier above. + +Besides the measures recommended to work around the MT- and AS-Safety +problem, in order to avert the cancellation problem, disabling +asynchronous cancellation @emph{and} installing a cleanup handler to +restore the signal to the desired state and to release the mutex are +recommended. + + +@item @code{term} +@cindex term + +Functions marked with @code{term} as an MT-Safety issue may change the +terminal settings in the recommended way, namely: call @code{tcgetattr}, +modify some flags, and then call @code{tcsetattr}; this creates a window +in which changes made by other threads are lost. Thus, functions marked +with @code{term} are MT-Unsafe. The same window enables changes made by +asynchronous signal handlers to be lost. These functions are also AS-Unsafe, +but the corresponding mark is omitted as redundant. 
+ +It is thus advisable for applications using the terminal to avoid +concurrent and reentrant interactions with it, by not using it in signal +handlers or blocking signals that might use it, and holding a lock while +calling these functions and interacting with the terminal. This lock +should also be used for mutual exclusion with functions marked with +@code{@mtasurace{:tcattr}}. + +Functions marked with @code{term} as an AC-Safety issue are supposed to +restore terminal settings to their original state, after temporarily +changing them, but they may fail to do so if cancelled. + +@c fixme: at least deferred cancellation should get it right, and would +@c obviate the restoring bit below, and the qualifier above. + +Besides the measures recommended to work around the MT- and AS-Safety +problem, in order to avert the cancellation problem, disabling +asynchronous cancellation @emph{and} installing a cleanup handler to +restore the terminal settings to the original state and to release the +mutex are recommended. + + +@end itemize + + +@node Other Safety Remarks, , Conditionally Safe Features, POSIX +@subsubsection Other Safety Remarks +@cindex Other Safety Remarks + +Additional keywords may be attached to functions, indicating features +that do not make a function unsafe to call, but that may need to be +taken into account in certain classes of programs: + +@itemize @bullet + +@c revisit: uses are mt-safe, distinguish from const:locale +@item @code{locale} +@cindex locale + +Functions annotated with @code{locale} as an MT-Safety issue read from +the locale object without any form of synchronization. Functions +annotated with @code{locale} called concurrently with locale changes may +behave in ways that do not correspond to any of the locales active +during their execution, but an unpredictable mix thereof. 
+ +We do not mark these functions as MT- or AS-Unsafe, however, because +functions that modify the locale object are marked with +@code{const:locale} and regarded as unsafe. Being unsafe, the latter +are not to be called when multiple threads are running or asynchronous +signals are enabled, and so the locale can be considered effectively +constant in these contexts, which makes the former safe. + +@c Should the locking strategy suggested under @code{const} be used, +@c failure to guard locale uses is not as fatal as data races in +@c general: unguarded uses will @emph{not} follow dangling pointers or +@c access uninitialized, unmapped or recycled memory. Each access will +@c read from a consistent locale object that is or was active at some +@c point during its execution. Without synchronization, however, it +@c cannot even be assumed that, after a change in locale, earlier +@c locales will no longer be used, even after the newly-chosen one is +@c used in the thread. Nevertheless, even though unguarded reads from +@c the locale will not violate type safety, functions that access the +@c locale multiple times may invoke all sorts of undefined behavior +@c because of the unexpected locale changes. + + +@c revisit: this was incorrectly used as an mt-unsafe marker. +@item @code{env} +@cindex env + +Functions marked with @code{env} as an MT-Safety issue access the +environment with @code{getenv} or similar, without any guards to ensure +safety in the presence of concurrent modifications. + +We do not mark these functions as MT- or AS-Unsafe, however, because +functions that modify the environment are all marked with +@code{const:env} and regarded as unsafe. Being unsafe, the latter are +not to be called when multiple threads are running or asynchronous +signals are enabled, and so the environment can be considered +effectively constant in these contexts, which makes the former safe. 
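The division of labor described above, with writers confined to single-threaded operation and readers free to run concurrently, might look like the following C sketch. The variable name APP_MODE and both function names are hypothetical; the caller is assumed to invoke configure_then_read before any other thread has been started.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* APP_MODE is a hypothetical variable name used only for illustration.  */
#define MODE_VAR "APP_MODE"

/* Reader thread: only calls getenv, relying on the environment staying
   constant while threads run.  Stores 1 in the int that ARG points to
   if the expected value was seen, 0 otherwise.  */
static void *
mode_reader (void *arg)
{
  int *matched = arg;
  const char *value = getenv (MODE_VAR);

  *matched = value != NULL && strcmp (value, "batch") == 0;
  return NULL;
}

/* Modifies the environment first, then starts two reader threads.
   Assumes no other thread exists when it is called.  Returns 0 if both
   readers saw the value, -1 otherwise.  */
int
configure_then_read (void)
{
  pthread_t threads[2];
  int matched[2] = { 0, 0 };
  int started = 0;
  int i;

  /* Writer call (setenv): made before any reader thread is created.  */
  if (setenv (MODE_VAR, "batch", 1) != 0)
    return -1;

  for (i = 0; i < 2; i++)
    {
      if (pthread_create (&threads[i], NULL, mode_reader, &matched[i]) != 0)
        break;
      started++;
    }
  for (i = 0; i < started; i++)
    pthread_join (threads[i], NULL);

  return (started == 2 && matched[0] && matched[1]) ? 0 : -1;
}
```

Calling setenv again after the readers have started would fall outside what this sketch covers, since the environment would no longer be effectively constant.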
+ + +@item @code{hostid} +@cindex hostid + +The function marked with @code{hostid} as an MT-Safety issue reads from +the system-wide data structures that hold the ``host ID'' of the +machine. These data structures cannot generally be modified atomically. +Since it is expected that the ``host ID'' will not normally change, the +function that reads from it (@code{gethostid}) is regarded as safe, +whereas the function that modifies it (@code{sethostid}) is marked with +@code{@mtasuconst{:@mtshostid{}}}, indicating it may require special +care if it is to be called. In this specific case, the special care +amounts to system-wide (not merely intra-process) coordination. + + +@item @code{sigintr} +@cindex sigintr + +Functions marked with @code{sigintr} as an MT-Safety issue access the +@code{_sigintr} internal data structure without any guards to ensure +safety in the presence of concurrent modifications. + +We do not mark these functions as MT- or AS-Unsafe, however, because +functions that modify this data structure are all marked with +@code{const:sigintr} and regarded as unsafe. Being unsafe, the latter +are not to be called when multiple threads are running or asynchronous +signals are enabled, and so the data structure can be considered +effectively constant in these contexts, which makes the former safe. + + +@item @code{fd} +@cindex fd + +Functions annotated with @code{fd} as an AC-Safety issue may leak file +descriptors if asynchronous thread cancellation interrupts their +execution. + +Functions that allocate or deallocate file descriptors will generally be +marked as such. Even if they attempted to protect the file descriptor +allocation and deallocation with cleanup regions, allocating a new +descriptor and storing its number where the cleanup region could release +it cannot be performed as a single atomic operation. 
Similarly, +releasing the descriptor and taking it out of the data structure +normally responsible for releasing it cannot be performed atomically. +There will always be a window in which the descriptor cannot be released +because it was not stored in the cleanup handler argument yet, or it was +already taken out before releasing it. It cannot be taken out after +release: an open descriptor could mean either that the descriptor still +has to be closed, or that it was already closed and its number was +reallocated by another thread or signal handler. + +Such leaks could be internally avoided, with some performance penalty, +by temporarily disabling asynchronous thread cancellation. However, +since callers of allocation or deallocation functions would have to do +this themselves, to avoid the same sort of leak in their own layer, it +makes more sense for the library to assume they are taking care of it +than to impose a performance penalty that is redundant when the problem +is solved in upper layers, and insufficient when it is not. + +This remark by itself does not cause a function to be regarded as +AC-Unsafe. However, cumulative effects of such leaks may pose a +problem for some programs. If this is the case, suspending asynchronous +cancellation for the duration of calls to such functions is recommended. + + +@item @code{mem} +@cindex mem + +Functions annotated with @code{mem} as an AC-Safety issue may leak +memory if asynchronous thread cancellation interrupts their execution. + +The problem is similar to that of file descriptors: there is no atomic +interface to allocate memory and store its address in the argument to a +cleanup handler, or to release it and remove its address from that +argument, without at least temporarily disabling asynchronous +cancellation, which these functions do not do. + +This remark does not by itself cause a function to be regarded as +generally AC-Unsafe. 
However, cumulative effects of such leaks may be +severe enough for some programs that disabling asynchronous cancellation +for the duration of calls to such functions may be required. + + +@item @code{cwd} +@cindex cwd + +Functions marked with @code{cwd} as an MT-Safety issue may temporarily +change the current working directory during their execution, which may +cause relative pathnames to be resolved in unexpected ways in other +threads or within asynchronous signal or cancellation handlers. + +This is not enough of a reason to regard so-marked functions as MT- or +AS-Unsafe, but when this behavior is optional (e.g., @code{nftw} with +@code{FTW_CHDIR}), avoiding the option may be a good alternative to +using full pathnames or file descriptor-relative (e.g., @code{openat}) +system calls. + + +@item @code{!posix} +@cindex !posix + +This remark, as an MT-, AS- or AC-Safety note to a function, indicates +the safety status of the function is known to differ from the specified +status in the POSIX standard. For example, POSIX does not require a +function to be Safe, but our implementation is, or vice versa. + +For the time being, the absence of this remark does not imply the safety +properties we documented are identical to those mandated by POSIX for +the corresponding functions. + + +@end itemize + @node Berkeley Unix, SVID, POSIX, Standards and Portability @subsection Berkeley Unix diff --git a/manual/macros.texi b/manual/macros.texi index daaf1c0aad..f280a8170a 100644 --- a/manual/macros.texi +++ b/manual/macros.texi @@ -47,4 +47,169 @@ GNU/Hurd systems GNU/Linux systems @end macro +@c Document the safety functions as preliminary. It does NOT expand its +@c comments. +@macro prelim {comments} +Preliminary: + +@end macro +@c Document a function as thread safe. +@macro mtsafe {comments} +| MT-Safe \comments\ + +@end macro +@c Document a function as thread unsafe.
+@macro mtunsafe {comments} +| MT-Unsafe \comments\ + +@end macro +@c Document a function as safe for use in asynchronous signal handlers. +@macro assafe {comments} +| AS-Safe \comments\ + +@end macro +@c Document a function as unsafe for use in asynchronous signal +@c handlers. This distinguishes unmarked functions, for which this +@c property has not been assessed, from those that have been analyzed. +@macro asunsafe {comments} +| AS-Unsafe \comments\ + +@end macro +@c Document a function as safe for use when asynchronous cancellation is +@c enabled. +@macro acsafe {comments} +| AC-Safe \comments\ + +@end macro +@c Document a function as unsafe for use when asynchronous cancellation +@c is enabled. This distinguishes unmarked functions, for which this +@c property has not been assessed, from those that have been analyzed. +@macro acunsafe {comments} +| AC-Unsafe \comments\ + +@end macro +@c Format safety properties without referencing the section of the +@c definitions. To be used in the definitions of the properties +@c themselves. +@macro sampsafety {notes} +@noindent +\notes\| + + +@end macro +@c Format the safety properties of a function. +@macro safety {notes} +\notes\| @xref{POSIX Safety Concepts}. 
+ + +@end macro +@macro mtasurace {comments} +race\comments\ +@end macro +@macro asurace {comments} +race\comments\ +@end macro +@macro mtsrace {comments} +race\comments\ +@end macro +@macro mtasuconst {comments} +const\comments\ +@end macro +@macro mtslocale {comments} +locale\comments\ +@end macro +@macro mtsenv {comments} +env\comments\ +@end macro +@macro mtshostid {comments} +hostid\comments\ +@end macro +@macro mtssigintr {comments} +sigintr\comments\ +@end macro +@macro mtuinit {comments} +init\comments\ +@end macro +@macro asuinit {comments} +init\comments\ +@end macro +@macro acuinit {comments} +init\comments\ +@end macro +@macro asulock {comments} +lock\comments\ +@end macro +@macro aculock {comments} +lock\comments\ +@end macro +@macro asucorrupt {comments} +corrupt\comments\ +@end macro +@macro acucorrupt {comments} +corrupt\comments\ +@end macro +@macro ascuheap {comments} +heap\comments\ +@end macro +@macro asuheap {comments} +heap\comments\ +@end macro +@macro ascudlopen {comments} +dlopen\comments\ +@end macro +@macro ascuplugin {comments} +plugin\comments\ +@end macro +@macro ascuintl {comments} +i18n\comments\ +@end macro +@macro asuintl {comments} +i18n\comments\ +@end macro +@macro acsfd {comments} +fd\comments\ +@end macro +@macro acsmem {comments} +mem\comments\ +@end macro +@macro mtascusig {comments} +sig\comments\ +@end macro +@macro mtasuterm {comments} +term\comments\ +@end macro +@macro acuterm {comments} +term\comments\ +@end macro +@macro mtstimer {comments} +timer\comments\ +@end macro +@macro mtascutimer {comments} +timer\comments\ +@end macro +@macro mtasscwd {comments} +cwd\comments\ +@end macro +@macro acscwd {comments} +cwd\comments\ +@end macro +@macro mtsposix {comments} +!posix\comments\ +@end macro +@macro mtuposix {comments} +!posix\comments\ +@end macro +@macro assposix {comments} +!posix\comments\ +@end macro +@macro asuposix {comments} +!posix\comments\ +@end macro +@macro acsposix {comments} +!posix\comments\ +@end macro 
+@macro acuposix {comments} +!posix\comments\ +@end macro + @end ifclear
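
Illustrative note, not part of the patch: the @code{fd} and @code{mem} remarks in manual/intro.texi recommend that callers suspend asynchronous cancellation for the duration of calls that allocate or release resources. The following is a minimal sketch of one way a caller might do that, assuming a POSIX threads environment. The helper name open_without_async_cancel is hypothetical and does not exist in the library; only pthread_setcanceltype and open are real interfaces.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>

/* Hypothetical helper, for illustration only: open PATH while the
   calling thread's cancellation type is temporarily switched to
   deferred, so that an asynchronous cancellation cannot interrupt the
   allocation of the descriptor half-way through.  Returns the new
   descriptor, or -1 with errno set.  */
static int
open_without_async_cancel (const char *path, int flags)
{
  int oldtype;
  int ignored;
  int fd;
  int err;

  /* Suspend asynchronous cancellation, remembering the previous type.  */
  err = pthread_setcanceltype (PTHREAD_CANCEL_DEFERRED, &oldtype);
  if (err != 0)
    {
      errno = err;
      return -1;
    }

  /* With deferred cancellation, open is still a cancellation point, but
     cancellation can no longer strike at an arbitrary instruction.  */
  fd = open (path, flags);

  /* Restore whatever cancellation type the caller had selected.  */
  pthread_setcanceltype (oldtype, &ignored);

  return fd;
}
```

A caller that later runs with asynchronous cancellation enabled would still need to record the returned descriptor where a cleanup handler can find it. This sketch covers only the window around the allocation itself.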