___ | ___ |
---|---|
Doc. No.: | P0959R3 |
Date: | 2020-09-09 |
Reply to: | Marius Bancila, Tony van Eerd |
Audience: | Library WG |
Title: | A Proposal for a Universally Unique Identifier Library |
A Proposal for a Universally Unique Identifier Library
I. Revision History
1.1 P0959R0
Original UUID library paper with motivation, specification, and examples.
1.2 P0959R1
Revised with feedback from the LWG and the community.
- Removed string constructors and replaced with an overloaded static member function `from_string()`.
- Parsing strings to `uuid` throws exception `uuid_error` instead of creating a nil uuid when the operation fails.
- {} included in the supported format for string parsing, i.e. `"{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}"`.
- Removed `state_size`.
- Renamed member function `nil()` to `is_nil()`.
- The default constructor is defaulted.
- Added a conversion constructor from `std::span<std::byte, 16>`.
- Added the member function `as_bytes()` to convert the `uuid` into a view of its underlying bytes.
- Constructing a `uuid` from a range with a size other than 16 is undefined behaviour.
- Replaced operators `==`, `!=` and `<` with the three-way operator `<=>`.
- Removed mutable iterators (but preserved the constant iterators).
- Removed typedefs and other container-like parts.
- Defined the correlation between the internal UUID bytes and the string representation.
- Added UUID layout and byte order specification from the RFC 4122 document.
- Added section on alternative ordering design.
- Most functions are constexpr.
- Replaced typedefs with using statements.
1.3 P0959R2
P0959R1 was discussed in San Diego LEWGI (attendance 13 people) with the following feedback in polls:
- `std::uuid`'s byte ordering should be fixed (implying we have the option to make it memcpy and there is no need for a container-like interface)

  SF | F | N | A | SA |
  ---|---|---|---|---|
  4 | 3 | 1 | 0 | 0 |

- `std::uuid` should have iterators (e.g. begin/end methods)

  SF | F | N | A | SA |
  ---|---|---|---|---|
  2 | 3 | 2 | 1 | 1 |

  Comment: interpreted as "author's discretion".
- `std::uuid` should be able to generate uuids from names

  SF | F | N | A | SA |
  ---|---|---|---|---|
  4 | 2 | 3 | 0 | 0 |

- explore having `std::uuid` be able to generate from times

  SF | F | N | A | SA |
  ---|---|---|---|---|
  4 | 3 | 2 | 0 | 0 |

- We want `std::basic_uuid_random_generator`

  SF | F | N | A | SA |
  ---|---|---|---|---|
  0 | 6 | 2 | 0 | 1 |

  Comments: Author is instructed to investigate less hazardous API, and not just fix the examples.
- Explore non-exception based approach to error handling in `std::uuid::from_string`

  SF | F | N | A | SA |
  ---|---|---|---|---|
  9 | 0 | 0 | 0 | 0 |

Based on this feedback the following changes have been done in this version:
- `std::uuid` byte order is fixed
- removed container interface (iterators and `size()`) because, fundamentally, a uuid is a single entity, an identifier, and not a container of other things
- removed `basic_uuid_random_generator` default constructor
- added `uuid_time_generator` functor to generate version 1 time-based uuids
- proper initialization for the pseudo-random generator engines in all examples
- removed `to_wstring()` and made `to_string()` a function template
- made `from_string()` a non-throwing function template returning `std::optional<std::uuid>`
- added `is_valid_uuid()`, a non-throwing function template that checks if a string contains a valid uuid
- removed the `std::wstring` overloaded call operator for `uuid_name_generator` and replaced it with function templates
- `uuid`s produced from names in different character sets or encodings are different (i.e. "jane" and L"jane")
- removed the class `uuid_error`
- footnote on the use of the name Microsoft
1.4 P0959R3
P0959R2 was discussed by LEWGI in Kona with the following feedback:
- The random number generator is not random enough; may want to consult W23 - or not provide a default.
- Why `string` and `char*` instead of `string_view`?
- Some methods that can be `constexpr`/`noexcept` are not.
- Need more explanations about the choices of ordering and how the RFC works in that regard.
- Need to specify which algorithm the name generator uses.
- The uuid should be a `class` not a `struct`.
- The three-way comparison operator cannot be defaulted if the class doesn't have at least exposition only members.
Based on this feedback and further considerations, the following changes have been done in this version:
- Added detailed explanation of the algorithms for generating uuids.
- `from_string()` and non-member `swap()` declared with `noexcept`.
- The `uuid` type is now defined as a class.
- Added an exposition-only member to hint at the possible internal representation of the uuid data.
- Added `uuid` constructors from arrays.
- Added implementation-specific constant uuids that can be used for generating name-based uuids.
- Added a new section (Open discussion) in this document to address questions/concerns from the committee.
- The global constants in the namespace `std` for generating name-based uuids are defined `inline`.
II. Motivation
Universally unique identifiers (uuid), also known as Globally Unique Identifiers (guids), are commonly used in many types of applications to uniquely identify data. A standard uuid library would benefit developers that currently have to either use operating system specific APIs for creating new uuids or resort to 3rd party libraries, such as boost::uuid.
UUIDs are 128-bit numbers that are for most practical purposes unique, without depending on a central registration authority for ensuring their uniqueness. Although the probability of UUID duplication exists, it is negligible. According to Wikipedia, "for there to be a one in a billion chance of duplication, 103 trillion version 4 UUIDs must be generated." UUID is an Internet Engineering Task Force standard described by RFC 4122.
The library proposed in this paper is a light one: it enables developers to generate random and name-based UUIDs, serialize and deserialize UUIDs to and from strings, validate UUIDs, and perform other common operations.
III. Impact On the Standard
This proposal is a pure library extension. It does not require changes to any standard classes, functions or headers. It does not require any changes in the core language, and it has been implemented in standard C++ as per ISO/IEC 14882:2017. The implementation is available at https://github.com/mariusbancila/stduuid.
IV. Design Decisions
The proposed library, which should be available in a new header called `<uuid>` in the namespace `std`, provides:
- a class called `uuid` that represents a universally unique identifier
- strongly typed enums `uuid_variant` and `uuid_version` to represent the possible variant and version types of a UUID
- function objects that generate UUIDs, called generators: `basic_uuid_random_generator<T>`, `uuid_random_generator`, `uuid_name_generator`, `uuid_time_generator`
- conversion functions from strings `from_string()` and to strings `std::to_string()`, as well as an overloaded `operator<<` for `std::basic_ostream`
- `std::swap()` overload for `uuid`
- `std::hash<>` specialization for `uuid`
Default constructor
Creates a nil UUID that has all the bits set to 0 (i.e. `00000000-0000-0000-0000-000000000000`).
uuid empty;
auto empty = uuid{};
Iterators constructors
The conversion constructor that takes two forward iterators constructs a `uuid` with the content of the range [first, last). It requires the range to have exactly 16 elements, otherwise the behaviour is undefined. This constructor follows the conventions of other containers in the standard library.
std::array<uuid::value_type, 16> arr{{
0x47, 0x18, 0x38, 0x23,
0x25, 0x74,
0x4b, 0xfd,
0xb4, 0x11,
0x99, 0xed, 0x17, 0x7d, 0x3e, 0x43
}};
uuid id(std::begin(arr), std::end(arr));
uuid::value_type arr[16] = {
0x47, 0x18, 0x38, 0x23,
0x25, 0x74,
0x4b, 0xfd,
0xb4, 0x11,
0x99, 0xed, 0x17, 0x7d, 0x3e, 0x43 };
uuid id(std::begin(arr), std::end(arr));
Array constructors
The array constructors create a `uuid` from an array or by direct initialization.
uuid id{ {0x47, 0x18, 0x38, 0x23,
0x25, 0x74,
0x4b, 0xfd,
0xb4, 0x11,
0x99, 0xed, 0x17, 0x7d, 0x3e, 0x43 } };
assert(to_string(id) == "47183823-2574-4bfd-b411-99ed177d3e43");
uuid::value_type arr[16] = {
0x47, 0x18, 0x38, 0x23,
0x25, 0x74,
0x4b, 0xfd,
0xb4, 0x11,
0x99, 0xed, 0x17, 0x7d, 0x3e, 0x43 };
uuid id(arr);
assert(to_string(id) == "47183823-2574-4bfd-b411-99ed177d3e43");
Span constructor
The conversion constructor takes a `std::span` and constructs a `uuid` from a contiguous sequence of 16 bytes.
std::array<uuid::value_type, 16> arr{ {
0x47, 0x18, 0x38, 0x23,
0x25, 0x74,
0x4b, 0xfd,
0xb4, 0x11,
0x99, 0xed, 0x17, 0x7d, 0x3e, 0x43
} };
std::span<uuid::value_type, 16> data(arr);
uuid id{ data };
assert(to_string(id) == "47183823-2574-4bfd-b411-99ed177d3e43");
Nil
A nil UUID is a special UUID that has all the bits set to 0. Its canonical textual representation is `00000000-0000-0000-0000-000000000000`. Member function `is_nil()` indicates whether the `uuid` has all the bits set to 0. A nil UUID is created by the default constructor or by parsing the strings `00000000-0000-0000-0000-000000000000` or `{00000000-0000-0000-0000-000000000000}`.
uuid id;
assert(id.is_nil());
std::optional<uuid> id = uuid::from_string("00000000-0000-0000-0000-000000000000");
assert(id.has_value() && id.value().is_nil());
`variant` and `version`
Member functions `variant()` and `version()` allow checking the variant type of the uuid and, respectively, the version type. These are defined by two strongly typed enums called `uuid_variant` and `uuid_version`.
uuid id = uuid::from_string("47183823-2574-4bfd-b411-99ed177d3e43").value();
assert(id.version() == uuid_version::random_number_based);
assert(id.variant() == uuid_variant::rfc);
Comparisons
Although it does not make sense to check whether a UUID is less than (or less than or equal to) another UUID, overloading this operator for `uuid` is necessary in order to store `uuid` values in containers such as `std::set` that by default use `operator <` to compare keys. Because operators `==` and `!=` are needed for checking whether two UUIDs are the same (a basic operation for an identifier type), the three-way comparison operator `<=>` is defined so that all comparison operators are provided.
std::array<uuid::value_type, 16> arr{ {
0x47, 0x18, 0x38, 0x23,
0x25, 0x74,
0x4b, 0xfd,
0xb4, 0x11,
0x99, 0xed, 0x17, 0x7d, 0x3e, 0x43
} };
uuid id1{ std::span<uuid::value_type, 16>{arr} };
uuid id2 = uuid::from_string("47183823-2574-4bfd-b411-99ed177d3e43").value();
assert(id1 == id2);
std::set<std::uuid> ids{
uuid{},
uuid::from_string("47183823-2574-4bfd-b411-99ed177d3e43").value()
};
assert(ids.size() == 2);
assert(ids.find(uuid{}) != ids.end());
Span view
Member function `as_bytes()` converts the `uuid` into a view of its underlying bytes.
std::array<uuid::value_type, 16> arr{ {
0x47, 0x18, 0x38, 0x23,
0x25, 0x74,
0x4b, 0xfd,
0xb4, 0x11,
0x99, 0xed, 0x17, 0x7d, 0x3e, 0x43
} };
uuid const id{ arr };
assert(!id.is_nil());
auto view = id.as_bytes();
assert(memcmp(view.data(), arr.data(), arr.size()) == 0);
Swapping
Both member and non-member `swap()` functions are available to perform the swapping of `uuid` values.
uuid empty;
uuid id = uuid::from_string("47183823-2574-4bfd-b411-99ed177d3e43").value();
assert(empty.is_nil());
assert(!id.is_nil());
std::swap(empty, id);
assert(!empty.is_nil());
assert(id.is_nil());
empty.swap(id);
assert(empty.is_nil());
assert(!id.is_nil());
String parsing
Static overloaded member function template `from_string()` creates `uuid` instances from various strings.
The input argument must have the form `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx` or `{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}` where `x` is a hexadecimal digit. The return value is a `std::optional<uuid>` that contains a valid `uuid` if the parsing completed successfully.
auto id1 = uuid::from_string("47183823-2574-4bfd-b411-99ed177d3e43");
assert(id1.has_value());
auto id2 = uuid::from_string(L"{47183823-2574-4bfd-b411-99ed177d3e43}");
assert(id2.has_value());
The order of the bytes in the input string is reflected directly in the internal representation of the UUID. That is, for an input string in the form `"aabbccdd-eeff-gghh-iijj-kkllmmnnoopp"` or `"{aabbccdd-eeff-gghh-iijj-kkllmmnnoopp}"` the internal byte order of the resulting UUID is `aa,bb,cc,dd,ee,ff,gg,hh,ii,jj,kk,ll,mm,nn,oo,pp`.
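The parsing rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed implementation: `hex_value` and `parse_uuid_bytes` are hypothetical names, the function works on a plain `std::array<std::uint8_t, 16>` instead of `uuid`, and it handles narrow strings only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper: value of one hexadecimal digit, or -1 if invalid.
constexpr int hex_value(char c) noexcept {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical stand-in for uuid::from_string(): parses
// "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", optionally wrapped in braces,
// and returns the 16 bytes in the order in which they appear in the text.
inline std::optional<std::array<std::uint8_t, 16>>
parse_uuid_bytes(std::string_view s) noexcept {
    if (s.size() == 38 && s.front() == '{' && s.back() == '}')
        s = s.substr(1, 36);               // strip the surrounding braces
    if (s.size() != 36) return std::nullopt;

    std::array<std::uint8_t, 16> bytes{};
    std::size_t out = 0;
    for (std::size_t i = 0; i < s.size();) {
        // Dashes are only valid at these four fixed positions.
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (s[i] != '-') return std::nullopt;
            ++i;
            continue;
        }
        int const hi = hex_value(s[i]);
        int const lo = hex_value(s[i + 1]);
        if (hi < 0 || lo < 0) return std::nullopt;
        bytes[out++] = static_cast<std::uint8_t>(hi * 16 + lo);
        i += 2;
    }
    return bytes;
}
```

Returning an empty `std::optional` on malformed input mirrors the non-throwing error handling chosen for `from_string()`.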
String conversion
Non-member template function `to_string()` returns a string with the UUID formatted to the canonical textual representation `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`, where `x` is a lower case hexadecimal digit.
auto id = uuid::from_string("47183823-2574-4bfd-b411-99ed177d3e43");
assert(id.has_value());
assert(to_string(id.value()) == "47183823-2574-4bfd-b411-99ed177d3e43");
assert(to_string<wchar_t>(id.value()) == L"47183823-2574-4bfd-b411-99ed177d3e43");
The order of the internal UUID bytes is reflected directly in the byte order of the string. That is, for a UUID with the internal bytes in the form `aa,bb,cc,dd,ee,ff,gg,hh,ii,jj,kk,ll,mm,nn,oo,pp` the resulting string has the form `"aabbccdd-eeff-gghh-iijj-kkllmmnnoopp"`.
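The correspondence between the internal bytes and the text can be illustrated with a short sketch. `format_uuid_bytes` is a hypothetical name, and the function operates on a plain byte array and produces a narrow string only; it is an assumption-laden stand-in for `to_string()`, not its specification.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for to_string(uuid const&): writes the 16 bytes in
// storage order as lower case hexadecimal, grouped 8-4-4-4-12.
inline std::string format_uuid_bytes(std::array<std::uint8_t, 16> const& bytes) {
    static constexpr char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(36);
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // A dash precedes bytes 4, 6, 8 and 10.
        if (i == 4 || i == 6 || i == 8 || i == 10) out.push_back('-');
        out.push_back(digits[bytes[i] >> 4]);    // high nibble first
        out.push_back(digits[bytes[i] & 0x0f]);  // then low nibble
    }
    return out;
}
```

Because no byte swapping takes place, the text reads the same on every platform regardless of its native endianness.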
Hashing
A `std::hash<>` specialization for `uuid` is provided in order to enable the use of `uuid`s in unordered associative containers such as `std::unordered_set`.
std::unordered_set<uuid> ids{
uuid{},
uuid::from_string("47183823-2574-4bfd-b411-99ed177d3e43").value(),
};
assert(ids.size() == 2);
assert(ids.find(uuid{}) != ids.end());
Generating new uuids
Several function objects, called generators, are provided in order to create different versions of UUIDs.
Examples for generating new UUIDs with the `basic_uuid_random_generator` class:
{
std::random_device rd;
auto seed_data = std::array<int, std::mt19937::state_size> {};
std::generate(std::begin(seed_data), std::end(seed_data), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
std::mt19937 engine(seq);
basic_uuid_random_generator<std::mt19937> dgen{engine};
auto id1 = dgen();
assert(!id1.is_nil());
assert(id1.as_bytes().size() == 16);
assert(id1.version() == uuid_version::random_number_based);
assert(id1.variant() == uuid_variant::rfc);
}
{
std::random_device rd;
auto seed_data = std::array<int, 6> {};
std::generate(std::begin(seed_data), std::end(seed_data), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
std::ranlux48_base engine(seq);
basic_uuid_random_generator<std::ranlux48_base> dgen{engine};
auto id1 = dgen();
assert(!id1.is_nil());
assert(id1.as_bytes().size() == 16);
assert(id1.version() == uuid_version::random_number_based);
assert(id1.variant() == uuid_variant::rfc);
}
{
std::random_device rd;
auto seed_data = std::array<int, std::mt19937::state_size> {};
std::generate(std::begin(seed_data), std::end(seed_data), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
auto engine = std::make_unique<std::mt19937>(seq);
basic_uuid_random_generator<std::mt19937> dgen(engine.get());
auto id1 = dgen();
assert(!id1.is_nil());
assert(id1.as_bytes().size() == 16);
assert(id1.version() == uuid_version::random_number_based);
assert(id1.variant() == uuid_variant::rfc);
}
Examples for generating new UUIDs with the `uuid_random_generator` type alias:
std::random_device rd;
auto seed_data = std::array<int, std::mt19937::state_size> {};
std::generate(std::begin(seed_data), std::end(seed_data), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
std::mt19937 engine(seq);
uuid const id1 = uuid_random_generator{engine}();
assert(!id1.is_nil());
assert(id1.as_bytes().size() == 16);
assert(id1.version() == uuid_version::random_number_based);
assert(id1.variant() == uuid_variant::rfc);
Examples for generating new UUIDs with the `uuid_name_generator` class:
uuid_name_generator dgen(uuid::from_string("415ccc2b-f5cf-4ec1-b544-45132a518cc8").value());
auto id1 = dgen("john");
assert(!id1.is_nil());
assert(id1.as_bytes().size() == 16);
assert(id1.version() == uuid_version::name_based_sha1);
assert(id1.variant() == uuid_variant::rfc);
auto id2 = dgen("jane");
assert(!id2.is_nil());
assert(id2.as_bytes().size() == 16);
assert(id2.version() == uuid_version::name_based_sha1);
assert(id2.variant() == uuid_variant::rfc);
auto id3 = dgen("jane");
assert(!id3.is_nil());
assert(id3.as_bytes().size() == 16);
assert(id3.version() == uuid_version::name_based_sha1);
assert(id3.variant() == uuid_variant::rfc);
auto id4 = dgen(L"jane");
assert(!id4.is_nil());
assert(id4.as_bytes().size() == 16);
assert(id4.version() == uuid_version::name_based_sha1);
assert(id4.variant() == uuid_variant::rfc);
assert(id1 != id2);
assert(id2 == id3);
assert(id3 != id4);
V. Technical Specifications
Header
Add a new header called `<uuid>`.
`uuid_variant` enum
namespace std {
enum class uuid_variant
{
ncs,
rfc,
microsoft,
future
};
}
Footnote: Microsoft is a registered trademark of Microsoft Corporation. This information is given for the convenience of users of this document and does not constitute an endorsement by ISO or IEC of these products.
`uuid_version` enum
namespace std {
enum class uuid_version
{
none = 0,
time_based = 1,
dce_security = 2,
name_based_md5 = 3,
random_number_based = 4,
name_based_sha1 = 5
};
}
`uuid` class
namespace std {
class uuid
{
public:
using value_type = uint8_t;
constexpr uuid() noexcept = default;
constexpr uuid(value_type(&arr)[16]) noexcept;
constexpr uuid(std::array<value_type, 16> const & arr) noexcept;
constexpr explicit uuid(std::span<value_type, 16> bytes);
template<typename ForwardIterator>
constexpr explicit uuid(ForwardIterator first, ForwardIterator last);
constexpr uuid_variant variant() const noexcept;
constexpr uuid_version version() const noexcept;
constexpr bool is_nil() const noexcept;
constexpr void swap(uuid & other) noexcept;
constexpr std::span<std::byte const, 16> as_bytes() const;
constexpr std::strong_ordering operator<=>(uuid const&) const noexcept = default;
template<class CharT = char>
static bool is_valid_uuid(CharT const * str) noexcept;
template<class CharT = char,
class Traits = std::char_traits<CharT>,
class Allocator = std::allocator<CharT>>
static bool is_valid_uuid(std::basic_string<CharT, Traits, Allocator> const & str) noexcept;
template<class CharT = char>
static std::optional<uuid> from_string(CharT const * str) noexcept;
template<class CharT = char,
class Traits = std::char_traits<CharT>,
class Allocator = std::allocator<CharT>>
static std::optional<uuid> from_string(std::basic_string<CharT, Traits, Allocator> const & str) noexcept;
private:
template <class Elem, class Traits>
friend std::basic_ostream<Elem, Traits> & operator<<(std::basic_ostream<Elem, Traits> &s, uuid const & id);
value_type data[16]; // exposition-only
};
}
non-member functions
namespace std {
inline constexpr void swap(uuid & lhs, uuid & rhs) noexcept;
template <class Elem, class Traits>
std::basic_ostream<Elem, Traits> & operator<<(std::basic_ostream<Elem, Traits> &s, uuid const & id);
template<class CharT = char,
class Traits = std::char_traits<CharT>,
class Allocator = std::allocator<CharT>>
inline std::basic_string<CharT, Traits, Allocator> to_string(uuid const & id);
}
Constants
The following implementation-specific constant uuid values can be used for generating name-based uuids.
namespace std {
inline constexpr uuid uuid_namespace_dns = /* implementation-specific */;
inline constexpr uuid uuid_namespace_url = /* implementation-specific */;
inline constexpr uuid uuid_namespace_oid = /* implementation-specific */;
inline constexpr uuid uuid_namespace_x500 = /* implementation-specific */;
}
Generators
Random uuid generators
`basic_uuid_random_generator<T>` is a class template for generating random or pseudo-random UUIDs (version 4, i.e. `uuid_version::random_number_based`).
The type template parameter represents a function object that implements both the `RandomNumberEngine` and `UniformRandomBitGenerator` concepts.
`basic_uuid_random_generator` can be constructed with a reference or pointer to an object that satisfies the `UniformRandomNumberGenerator` requirements.
namespace std {
template <typename UniformRandomNumberGenerator>
class basic_uuid_random_generator
{
public:
using engine_type = UniformRandomNumberGenerator;
explicit basic_uuid_random_generator(engine_type& gen);
explicit basic_uuid_random_generator(engine_type* gen);
uuid operator()();
};
}
A type alias `uuid_random_generator` is provided for convenience, as `std::mt19937` is probably the preferred choice of a pseudo-random number generator engine in most cases.
namespace std {
using uuid_random_generator = basic_uuid_random_generator<std::mt19937>;
}
The algorithm for generating random uuids is as follows:
- Set the two most significant bits (bits 6 and 7) of the `clock_seq_hi_and_reserved` to zero and one, respectively.
- Set the four most significant bits (bits 12 through 15) of the `time_hi_and_version` to the binary value 0100 (representing version 4 as defined in the section VII. UUID format specification).
- Set all the other bits to randomly (or pseudo-randomly) chosen values.
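The steps above can be sketched on a raw byte array. This is an illustration only, under stated assumptions: `make_random_uuid_bytes` is a hypothetical name, the result is a plain `std::array` rather than a `uuid`, and drawing 32-bit words from `std::uniform_int_distribution` is just one possible way of filling the bytes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical stand-in for basic_uuid_random_generator::operator()().
template <typename Engine>
std::array<std::uint8_t, 16> make_random_uuid_bytes(Engine& engine) {
    // std::uniform_int_distribution is not specified for 8-bit types,
    // so draw 32-bit words and split each into four octets.
    std::uniform_int_distribution<std::uint32_t> dist;
    std::array<std::uint8_t, 16> bytes{};
    for (std::size_t i = 0; i < bytes.size(); i += 4) {
        std::uint32_t const word = dist(engine);
        bytes[i]     = static_cast<std::uint8_t>(word >> 24);
        bytes[i + 1] = static_cast<std::uint8_t>(word >> 16);
        bytes[i + 2] = static_cast<std::uint8_t>(word >> 8);
        bytes[i + 3] = static_cast<std::uint8_t>(word);
    }
    // clock_seq_hi_and_reserved is octet 8: force its two top bits to 1 0.
    bytes[8] = static_cast<std::uint8_t>((bytes[8] & 0x3f) | 0x80);
    // Octet 6 is the most significant octet of time_hi_and_version:
    // force its top four bits to 0100 (version 4).
    bytes[6] = static_cast<std::uint8_t>((bytes[6] & 0x0f) | 0x40);
    return bytes;
}
```

Filling all bytes first and then overwriting the six fixed bits is equivalent to the order given in the list; 122 bits remain random.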
Name-based uuid generator
`uuid_name_generator` is a function object that generates new UUIDs from a name and has to be initialized with another UUID.
namespace std {
class uuid_name_generator
{
public:
explicit uuid_name_generator(uuid const& namespace_uuid) noexcept;
template<class CharT = char>
uuid operator()(CharT const * name);
template<class CharT = char,
class Traits = std::char_traits<CharT>,
class Allocator = std::allocator<CharT>>
uuid operator()(std::basic_string<CharT, Traits, Allocator> const & name);
};
}
This generator produces different uuids for the same text represented in different character sets or encodings. In other words, the uuids generated from "jane" and L"jane" are different.
The algorithm for generating name-based uuids is as follows:
- Use a uuid as a namespace identifier for all uuids generated from names in that namespace.
- Convert the name to a canonical sequence of octets (as defined by the standards or conventions of its name space); put the name space ID in network byte order.
- Compute the SHA-1 hash of the name space ID concatenated with the name.
- Copy the octets of the hash to the octets of the uuid as follows:
  - octets 0 to 3 of the hash to octets 0 to 3 of the `time_low` field,
  - octets 4 and 5 of the hash to octets 0 and 1 of `time_mid`,
  - octets 6 and 7 of the hash to octets 0 and 1 of `time_hi_and_version`,
  - octet 8 of the hash to `clock_seq_hi_and_reserved`,
  - octet 9 of the hash to `clock_seq_low`,
  - octets 10 to 15 of the hash to octets 0 to 5 of `node`.
- Set the four most significant bits (bits 12 through 15) of the `time_hi_and_version` field to binary value 0101 (representing version 5 as defined in the section VII. UUID format specification).
- Set the two most significant bits (bits 6 and 7) of the `clock_seq_hi_and_reserved` to zero and one, respectively.
- Convert the resulting uuid to local byte order.
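The copy and bit-setting steps of the list above can be sketched as follows. The sketch deliberately leaves out the hashing itself: `uuid_bytes_from_sha1` is a hypothetical helper that assumes the 20-octet SHA-1 digest of the namespace bytes followed by the name bytes has already been computed elsewhere.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: turns a precomputed SHA-1 digest into the 16 octets
// of a version 5 UUID. Since the fields are laid out contiguously in
// network byte order, the field-by-field copy reduces to copying the first
// 16 octets of the digest.
inline std::array<std::uint8_t, 16>
uuid_bytes_from_sha1(std::array<std::uint8_t, 20> const& digest) noexcept {
    std::array<std::uint8_t, 16> bytes{};
    for (std::size_t i = 0; i < bytes.size(); ++i) bytes[i] = digest[i];
    // Top four bits of time_hi_and_version (octet 6) become 0101: version 5.
    bytes[6] = static_cast<std::uint8_t>((bytes[6] & 0x0f) | 0x50);
    // Top two bits of clock_seq_hi_and_reserved (octet 8) become 1 0.
    bytes[8] = static_cast<std::uint8_t>((bytes[8] & 0x3f) | 0x80);
    return bytes;
}
```

The last four octets of the digest are discarded, so the same namespace and name always yield the same UUID.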
Time-based uuid generator
`uuid_time_generator` is a function object that generates a time-based UUID as described in the RFC 4122 document.
namespace std {
class uuid_time_generator
{
public:
uuid_time_generator() noexcept;
uuid operator()();
};
}
The generated uuid's parts are as follows:
- the timestamp is a 60-bit value, representing the number of 100-nanosecond intervals since 00:00:00.00, 15 October 1582.
- the clock sequence is a 14-bit value, that is initially a high-quality pseudo-random value; when the previous value is known, it is simply incremented by one.
- the node identifier is an IEEE 802 MAC address (when multiple are available any could be used); if no such address is available, a pseudo-randomly generated value may be used, in which case the multicast bit (least significant bit of the first byte) is set to 1, this to avoid clashes with legitimate IEEE 802 addresses.
The algorithm for generating time-based uuids is as follows:
- Consider the timestamp to be a 60-bit unsigned integer and the clock sequence to be a 14-bit unsigned integer. Sequentially number the bits in a field, starting with zero for the least significant bit.
- Determine the values for the UTC-based timestamp and clock sequence to be used in the UUID as a 60-bit count of 100-nanosecond intervals since 00:00:00.00, 15 October 1582.
- Copy the bits of the timestamp to the uuid bits as follows, using the same order of significance:
  - bits 0 through 31 to the `time_low` field
  - bits 32 through 47 to the `time_mid` field
  - bits 48 through 59 to the 12 least significant bits (bits 0 through 11) of the `time_hi_and_version` field
- Copy the bits of the clock sequence to the uuid bits as follows, using the same order of significance:
  - bits 0 through 7 to the `clock_seq_low` field
  - bits 8 through 13 to the 6 least significant bits (bits 0 through 5) of the `clock_seq_hi_and_reserved` field
- Set the four most significant bits (bits 12 through 15) of the `time_hi_and_version` field to the binary value 0001 (representing version 1 as defined in the section VII. UUID format specification).
- Set the two most significant bits (bits 6 and 7) of the `clock_seq_hi_and_reserved` to zero and one, respectively.
- Set the `node` field to the 48-bit IEEE address in the same order of significance as the address.
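The bit layout described by this algorithm can be sketched as below. All names are hypothetical, the clock sequence and node are taken as parameters rather than obtained from the system, and `uuid_timestamp_now` assumes that `std::chrono::system_clock` counts from the Unix epoch, which is only guaranteed since C++20.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <ratio>

// 100-nanosecond intervals between 00:00:00.00, 15 October 1582 and the
// Unix epoch (1 January 1970), as used by the RFC 4122 sample code.
constexpr std::uint64_t gregorian_to_unix_offset = 0x01B21DD213814000ULL;

// Hypothetical helper: the current time expressed as a UUID timestamp.
inline std::uint64_t uuid_timestamp_now() {
    using ticks_100ns = std::chrono::duration<std::int64_t, std::ratio<1, 10000000>>;
    auto const since_epoch = std::chrono::system_clock::now().time_since_epoch();
    auto const ticks = std::chrono::duration_cast<ticks_100ns>(since_epoch);
    return gregorian_to_unix_offset + static_cast<std::uint64_t>(ticks.count());
}

// Hypothetical helper: packs timestamp, clock sequence and node into the
// 16 octets of a version 1 UUID, each field most significant octet first.
inline std::array<std::uint8_t, 16>
make_time_uuid_bytes(std::uint64_t timestamp, std::uint16_t clock_seq,
                     std::array<std::uint8_t, 6> const& node) noexcept {
    std::array<std::uint8_t, 16> b{};
    // time_low: timestamp bits 0 through 31.
    b[0] = static_cast<std::uint8_t>(timestamp >> 24);
    b[1] = static_cast<std::uint8_t>(timestamp >> 16);
    b[2] = static_cast<std::uint8_t>(timestamp >> 8);
    b[3] = static_cast<std::uint8_t>(timestamp);
    // time_mid: timestamp bits 32 through 47.
    b[4] = static_cast<std::uint8_t>(timestamp >> 40);
    b[5] = static_cast<std::uint8_t>(timestamp >> 32);
    // time_hi_and_version: timestamp bits 48 through 59, version 0001 on top.
    b[6] = static_cast<std::uint8_t>(((timestamp >> 56) & 0x0f) | 0x10);
    b[7] = static_cast<std::uint8_t>(timestamp >> 48);
    // clock_seq_hi_and_reserved: clock sequence bits 8 through 13, variant 1 0 on top.
    b[8] = static_cast<std::uint8_t>(((clock_seq >> 8) & 0x3f) | 0x80);
    // clock_seq_low: clock sequence bits 0 through 7.
    b[9] = static_cast<std::uint8_t>(clock_seq);
    // node: the 48-bit address in its own order of significance.
    for (std::size_t i = 0; i < node.size(); ++i) b[10 + i] = node[i];
    return b;
}
```

Taking the timestamp as a parameter keeps the packing step deterministic; a generator would additionally have to manage the clock sequence when the clock moves backwards.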
Specialization
The template specialization of `std::hash` for the `uuid` class allows users to obtain hashes of UUIDs.
namespace std {
template <>
struct hash<uuid>
{
using argument_type = uuid;
using result_type = std::size_t;
result_type operator()(argument_type const &uuid) const;
};
}
VI. Alternative ordering design
The comparison of UUIDs is not an operation that makes logical sense but it is required for using UUIDs as keys for containers such as `std::set` or `std::map`. The technical specification of the `uuid` class given in the previous section features the three-way operator `<=>` as a member of the class. The C++ community, however, is divided on this particular aspect (with long discussions happening in the ISO C++ proposals forum) between those that want convenience and those that want rigour. Although the RFC 4122 document, as quoted in the next section, does specify rules for lexical equivalence, not everybody is happy with the definition of comparison operator `<`.
The alternative put forward is to define non-member ordering functors. In this case, the library can define the following functors:
- `uuid_lexicographical_order` performs a lexicographic comparison (can provide compatibility with other systems)
- `uuid_fast_order` performs a memberwise comparison for efficiency.
These could be used as the comparison function for containers such as `std::set` or `std::map`:
std::map<std::uuid, Value, std::uuid_fast_order> map;
template<typename Value, typename Allocator = std::allocator<std::pair<std::uuid const, Value>>>
using uuid_map = std::map<std::uuid, Value, std::uuid_lexicographical_order, Allocator>;
In this case, the `uuid` class should be defined as follows:
namespace std {
class uuid
{
public:
using value_type = uint8_t;
constexpr uuid() noexcept = default;
constexpr uuid(value_type(&arr)[16]) noexcept;
constexpr uuid(std::array<value_type, 16> const & arr) noexcept;
constexpr explicit uuid(std::span<value_type, 16> bytes);
template<typename ForwardIterator>
constexpr explicit uuid(ForwardIterator first, ForwardIterator last);
constexpr uuid_variant variant() const noexcept;
constexpr uuid_version version() const noexcept;
constexpr bool is_nil() const noexcept;
constexpr void swap(uuid & other) noexcept;
constexpr std::span<std::byte const, 16> as_bytes() const;
template<class CharT = char>
static bool is_valid_uuid(CharT const * str) noexcept;
template<class CharT = char,
class Traits = std::char_traits<CharT>,
class Allocator = std::allocator<CharT>>
static bool is_valid_uuid(std::basic_string<CharT, Traits, Allocator> const & str) noexcept;
template<class CharT = char>
static std::optional<uuid> from_string(CharT const * str) noexcept;
template<class CharT = char,
class Traits = std::char_traits<CharT>,
class Allocator = std::allocator<CharT>>
static std::optional<uuid> from_string(std::basic_string<CharT, Traits, Allocator> const & str) noexcept;
private:
template <class Elem, class Traits>
friend std::basic_ostream<Elem, Traits> & operator<<(std::basic_ostream<Elem, Traits> &s, uuid const & id);
value_type data[16]; // exposition-only
};
}
In addition, the following must be added to the `<uuid>` header:
namespace std {
struct uuid_lexicographical_order {
constexpr bool operator()(uuid const&, uuid const&) const;
};
struct uuid_fast_order {
constexpr bool operator()(uuid const&, uuid const&) const;
};
}
Should users require other types of comparison they can provide their own implementation and use them instead of the standard ones.
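To make the difference between the two functors concrete, here is one possible sketch. It is not the proposed implementation: `uuid_like`, `lexicographical_order` and `fast_order` are hypothetical stand-ins, and comparing two 64-bit words is only one assumed reading of "memberwise comparison".

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <set>

// Hypothetical stand-in for std::uuid, exposing its 16 bytes directly.
struct uuid_like {
    std::array<std::uint8_t, 16> data{};
};

// Byte-by-byte comparison. Since every field is stored most significant
// octet first, this agrees with the field-wise ordering of RFC 4122.
struct lexicographical_order {
    bool operator()(uuid_like const& a, uuid_like const& b) const noexcept {
        return std::lexicographical_compare(a.data.begin(), a.data.end(),
                                            b.data.begin(), b.data.end());
    }
};

// Compares the value as two native 64-bit words. This is a valid strict
// weak ordering, but the resulting order depends on the endianness of the
// platform and must not be relied upon across systems.
struct fast_order {
    bool operator()(uuid_like const& a, uuid_like const& b) const noexcept {
        std::uint64_t a_words[2];
        std::uint64_t b_words[2];
        std::memcpy(a_words, a.data.data(), sizeof a_words);
        std::memcpy(b_words, b.data.data(), sizeof b_words);
        if (a_words[0] != b_words[0]) return a_words[0] < b_words[0];
        return a_words[1] < b_words[1];
    }
};
```

Either functor can be passed as the comparison parameter of an ordered container, for example `std::set<uuid_like, fast_order>`; only the iteration order differs.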
VII. UUID format specification
The content of this section is copied from the RFC 4122 document that describes the UUID standard.
The UUID format is 16 octets; some bits of the eight octet variant field determine the layout of the UUID. That is, the interpretation of all other bits in the UUID depends on the setting of the bits in the variant field. The variant field consists of a variable number of the most significant bits of octet 8 of the UUID. The following table lists the contents of the variant field, where the letter "x" indicates a "don't-care" value.
Msb0 | Msb1 | Msb2 | Description |
---|---|---|---|
0 | x | x | Reserved, NCS backward compatibility. |
1 | 0 | x | The variant specified in this document. |
1 | 1 | 0 | Reserved, Microsoft Corporation backward compatibility. |
1 | 1 | 1 | Reserved for future definition. |
Only UUIDs of the variant with the bit pattern 10x are considered in this document. All the other references to UUIDs in this document refer to this variant. For simplicity, we will call this the RFC variant UUID.
The UUID record definition is defined only in terms of fields that are integral numbers of octets. The fields are presented with the most significant one first.
Field | Data Type | Octet number | Note |
---|---|---|---|
`time_low` | unsigned 32 bit integer | 0-3 | The low field of the timestamp |
`time_mid` | unsigned 16 bit integer | 4-5 | The middle field of the timestamp |
`time_hi_and_version` | unsigned 16 bit integer | 6-7 | The high field of the timestamp multiplexed with the version number |
`clock_seq_hi_and_reserved` | unsigned 8 bit integer | 8 | The high field of the clock sequence multiplexed with the variant |
`clock_seq_low` | unsigned 8 bit integer | 9 | The low field of the clock sequence |
`node` | unsigned 48 bit integer | 10-15 | The spatially unique node identifier |
In the absence of explicit application or presentation protocol specification to the contrary, a UUID is encoded as a 128-bit object, as follows:
The fields are encoded as 16 octets, with the sizes and order of the fields defined above, and with each field encoded with the Most Significant Byte first (known as network byte order).
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          time_low                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       time_mid                |         time_hi_and_version   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|clk_seq_hi_res |  clk_seq_low  |         node (0-1)            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         node (2-5)                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The version number is in the most significant 4 bits of the timestamp (bits 4 through 7 of the `time_hi_and_version` field). The following table lists the currently-defined versions for the RFC variant.
Msb0 | Msb1 | Msb2 | Msb3 | Version | Description |
---|---|---|---|---|---|
0 | 0 | 0 | 1 | 1 | The time-based version specified in this document. |
0 | 0 | 1 | 0 | 2 | DCE Security version, with embedded POSIX UIDs. |
0 | 0 | 1 | 1 | 3 | The name-based version specified in this document that uses MD5 hashing. |
0 | 1 | 0 | 0 | 4 | The randomly or pseudo-randomly generated version specified in this document. |
0 | 1 | 0 | 1 | 5 | The name-based version specified in this document that uses SHA-1 hashing. |
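The two tables of this section translate into a few bit tests. The sketch below is illustrative only: `variant_kind`, `variant_of` and `version_of` are hypothetical names operating on a raw byte array, standing in for the `variant()` and `version()` members of `uuid`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical counterpart of the proposed uuid_variant enum.
enum class variant_kind { ncs, rfc, microsoft, future };

// The variant lives in the most significant bits of octet 8.
constexpr variant_kind variant_of(std::array<std::uint8_t, 16> const& b) noexcept {
    std::uint8_t const octet = b[8];
    if ((octet & 0x80) == 0x00) return variant_kind::ncs;        // 0 x x
    if ((octet & 0xc0) == 0x80) return variant_kind::rfc;        // 1 0 x
    if ((octet & 0xe0) == 0xc0) return variant_kind::microsoft;  // 1 1 0
    return variant_kind::future;                                 // 1 1 1
}

// The version is the top nibble of octet 6, the most significant octet of
// time_hi_and_version. It is only meaningful for the RFC variant.
constexpr int version_of(std::array<std::uint8_t, 16> const& b) noexcept {
    return b[6] >> 4;
}
```

For the UUID `47183823-2574-4bfd-b411-99ed177d3e43` used throughout the examples, octet 6 is `0x4b` and octet 8 is `0xb4`, giving version 4 and the RFC variant.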
Rules for Lexical Equivalence
Consider each field of the UUID to be an unsigned integer as shown in the table in the section above. Then, to compare a pair of UUIDs, arithmetically compare the corresponding fields from each UUID in order of significance and according to their data type. Two UUIDs are equal if and only if all the corresponding fields are equal.
As an implementation note, equality comparison can be performed on many systems by doing the appropriate byte-order canonicalization, and then treating the two UUIDs as 128-bit unsigned integers.
UUIDs, as defined in this document, can also be ordered lexicographically. For a pair of UUIDs, the first one follows the second if the most significant field in which the UUIDs differ is greater for the first UUID. The second precedes the first if the most significant field in which the UUIDs differ is greater for the second UUID.
VIII. Open discussion
The LEWGI in Kona have raised the following questions or concerns, which are answered below.
Why `string` and `char*` instead of `string_view`?
The reason for having these two overloads is that it should be possible to create uuids both from literals and from string objects.
auto id1 = uuid::from_string("47183823-2574-4bfd-b411-99ed177d3e43"); // [1]
std::string str{ "47183823-2574-4bfd-b411-99ed177d3e43" };
auto id2 = uuid::from_string(str); // [2]
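With a single function template taking a `std::basic_string_view<CharT, Traits>` instead, template argument deduction would not work, because deduction does not consider the implicit conversion from a string literal or a `std::string` to a string view, and both [1] and [2] would produce compiler errors. The calls would have to be written in one of the following ways, neither being desirable.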
auto id1 = uuid::from_string(
std::string_view {"47183823-2574-4bfd-b411-99ed177d3e43"}); // [1]
std::string str{ "47183823-2574-4bfd-b411-99ed177d3e43" };
auto id2 = uuid::from_string(std::string_view {str}); // [2]
or
auto id1 = uuid::from_string<char, std::char_traits<char>>(
"47183823-2574-4bfd-b411-99ed177d3e43"); // [1]
std::string str{ "47183823-2574-4bfd-b411-99ed177d3e43" };
auto id2 = uuid::from_string<char, std::char_traits<char>>(str); // [2]
Need more explanations about the choices of ordering and how the RFC works in that regard.
The RFC states the rules for lexical equivalence. These are mentioned in section VII, in the paragraph Rules for Lexical Equivalence. They say that:
- to compare two UUIDs you must compare the fields they are composed of in the order of significance; two UUIDs are equal if all the fields are equal.
- a UUID follows another one if the most significant field in which the UUIDs differ is greater for the first UUID.
IX. References
- [1] Universally unique identifier, https://en.wikipedia.org/wiki/Universally_unique_identifier
- [2] A Universally Unique IDentifier (UUID) URN Namespace, https://tools.ietf.org/html/rfc4122
- [3] Boost Uuid Library, http://www.boost.org/doc/libs/1_65_1/libs/uuid/uuid.html
- [4] crossguid - Lightweight cross platform C++ GUID/UUID library, https://github.com/graeme-hill/crossguid