Both implementations now are very similar to one another, to the point
we could share the code if we wanted to. They are based on assembly
code for the Relaxed functions only, as the i386 and x86-64
architectures only allow for full memory ordering or something that
closely resembles it (see 8.2 "Memory ordering" in the Intel 64 and
IA-32 Architectures Software Developer's Manual Volume 3A). We could
add "lfence/mfence/sfence" in future versions if we wanted to (SSE2+).
Change-Id: I76966d9f8694edfece2c5ebd3387348fac721447
Reviewed-by: Bradley T. Hughes <bradley.hughes@nokia.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>