Universally Unique Identifiers (UUIDs)
by Doug Williams
m.douglas.williams at gmail.com
This library provides Universally Unique Identifiers (UUIDs) as defined in RFC 4122. A UUID is a 128-bit value that is externally encoded as a string in 8-4-4-4-12 format. This library provides functions for constructing time-based (type 1), name-based using MD5 hashing (type 3), (pseudo-)random (type 4), and name-based using SHA-1 hashing (type 5) UUIDs. A copy of RFC 4122 is included with this library.
An example of a time-based (type 1) UUID is:
f81d4fae-7dec-11d0-a765-00a0c91e6bf6
which was generated on Monday, February 3, 1997 at 5:43:12 PM GMT on a machine with an IEEE 802 MAC address of 00-a0-c9-1e-6b-f6.
The UUID library is available from the PLaneT repository.
(require (planet williams/uuid/uuid))
1 Interface
The UUID library provides the following functions:
(uuid-RFC-4122? uuid) → boolean?
  uuid : uuid?
(uuid-version uuid) → exact-nonnegative-integer?
  uuid : uuid-RFC-4122?
Version | Description |
1 | The time-based version specified in RFC 4122. |
2 | DCE Security version, with embedded POSIX UIDs. |
3 | The name-based version specified in RFC 4122 that uses MD5 hashing. |
4 | The randomly or pseudo-randomly generated version specified in RFC 4122. |
5 | The name-based version specified in RFC 4122 that uses SHA-1 hashing. |
The version is more accurately a sub-type, but the term is retained for compatibility.
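As an independent cross-check (using Python's standard uuid module rather than this library), the following sketch reads the version from the high four bits of octet 6, which is the 13th hexadecimal digit of the string form. The helper name version_of and the samples table are illustrative only; the UUIDs themselves are taken from the examples in this document.

```python
import uuid

# UUIDs copied from the examples in this document, with the version
# each one is expected to carry.
samples = {
    "f81d4fae-7dec-11d0-a765-00a0c91e6bf6": 1,
    "3d813cbb-47fb-32ba-91df-831e1593ac29": 3,
    "177f42e6-6f22-44d9-93f6-8c475170daf6": 4,
    "21f7f8de-8051-5b89-8680-0195ef798b6a": 5,
}


def version_of(text):
    """Return the version field: the high 4 bits of octet 6."""
    return uuid.UUID(text).bytes[6] >> 4


for text, expected in samples.items():
    print(text, version_of(text), expected)
```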
The comparison routines, such as uuid=?, uuid<?, and uuid>? used in the example in Section 2, compare UUIDs by treating each as equivalent to an unsigned 128-bit integer.
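A minimal sketch of this integer ordering, written with Python's standard uuid module rather than this library, is shown below. The compare helper is hypothetical; the two UUIDs are the type 4 UUID and the DNS name space UUID from the example in Section 2.

```python
import uuid

u4 = uuid.UUID("ab595962-0a37-4520-8bef-afc559955201")
dns = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")


def compare(a, b):
    """Return -1, 0, or 1 by comparing the unsigned 128-bit integer values."""
    return (a.int > b.int) - (a.int < b.int)


print(compare(u4, u4))   # 0: equal
print(compare(u4, dns))  # 1: 0xab... is greater than 0x6b...
```

This agrees with the output in Section 2, where the type 4 UUID compares greater than namespace-DNS.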
(hex-string->uuid hex-string) → uuid?
  hex-string : hex-string?
(string->uuid string) → (or/c uuid? false/c)
  string : string?
Examples:
(string->uuid "f81d4fae7dec11d0a76500a0c91e6bf6")
#<uuid f81d4fae-7dec-11d0-a765-00a0c91e6bf6>
(string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
#<uuid f81d4fae-7dec-11d0-a765-00a0c91e6bf6>
(uuid->hex-string uuid) → string?
  uuid : uuid?
Example:
(define UUID (string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
(uuid->hex-string UUID)
"f81d4fae7dec11d0a76500a0c91e6bf6"
(uuid->string uuid) → string?
  uuid : uuid?
Example:
(define UUID (string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
(uuid->string UUID)
"f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
(uuid->urn-string uuid) → string?
  uuid : uuid?
Example:
(define UUID (string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
(uuid->urn-string UUID)
"urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
nil-uuid
#<uuid 00000000-0000-0000-0000-000000000000>
The following identifiers are bound to predefined UUIDs that represent specific name spaces that are used to generate name-based UUIDs.
namespace-DNS
#<uuid 6ba7b810-9dad-11d1-80b4-00c04fd430c8>
namespace-URL
#<uuid 6ba7b811-9dad-11d1-80b4-00c04fd430c8>
namespace-OID
#<uuid 6ba7b812-9dad-11d1-80b4-00c04fd430c8>
namespace-X500
#<uuid 6ba7b814-9dad-11d1-80b4-00c04fd430c8>
1.1 Time-Based (Type 1) UUIDs
A time-based (type 1) UUID uses the current time as the number of 100-nanosecond intervals since 00:00:00.00 UTC, 15 October 1582 (60 bits), a clock sequence number to help avoid duplicates (14 bits), and the IEEE 802 MAC address (48 bits) to generate a unique identifier. Note that the current time field will not roll over until around A.D. 3400.
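The field layout can be checked by decoding the example UUID from the introduction. The sketch below uses Python's standard uuid module, not this library; decode_type1 and the constant name are illustrative. The constant is the number of 100-nanosecond intervals between 15 October 1582 and the Unix epoch.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100 ns intervals between 1582-10-15 and 1970-01-01 (the Unix epoch).
GREGORIAN_TO_UNIX_TICKS = 0x01B21DD213814000


def decode_type1(text):
    """Split a type 1 UUID into (UTC creation time, clock sequence, node)."""
    u = uuid.UUID(text)
    ticks = u.time  # the 60-bit timestamp
    micros = (ticks - GREGORIAN_TO_UNIX_TICKS) // 10
    when = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=micros)
    # Render the 48-bit node as a MAC address.
    node = "-".join(f"{b:02x}" for b in u.node.to_bytes(6, "big"))
    return when, u.clock_seq, node


when, clock_seq, node = decode_type1("f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
print(when.isoformat(), clock_seq, node)
```

The decoded time is 17:43:12 UTC on February 3, 1997, and the node is 00-a0-c9-1e-6b-f6, matching the description in the introduction.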
(make-uuid-1) → uuid?
Example:
(make-uuid-1)
#<uuid d2177dd0-eaa2-11de-a572-001b779c76e3>
(uuid-1->date uuid) → date?
  uuid : uuid?
Note that the following examples were run with the time zone set to MST (GMT-7).
Examples:
(define UUID (string->uuid "d2177dd0-eaa2-11de-a572-001b779c76e3"))
(date->string (uuid-1->date UUID) #t)
"Wednesday, December 16th, 2009 5:26:29pm"
(define UUID (string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
(date->string (uuid-1->date UUID) #t)
"Monday, February 3rd, 1997 10:43:12am"
1.2 Name-Based (Types 3 and 5) UUIDs
The version 3 or 5 UUID is meant for generating UUIDs from names that are drawn from, and unique within, some name space. The concept of name and name space should be broadly construed, and not limited to textual names. For example, some name spaces are the domain name space, URLs, ISO Object IDs (OIDs), X.500 Distinguished Names (DNs), and reserved words in a programming language.
Name-based UUIDs may be generated using either MD5 hashing, for type 3 UUIDs, or SHA-1 hashing, for type 5 UUIDs. If backward compatibility is not an issue, SHA-1 is preferred.
Note that there is an apparent error in the RFC 4122 specification. (See http://www.rfc-editor.org/errata_search.php?rfc=4122.) Specifically, the reference implementation swaps the eight octets 0..3, 4..5, and 6..7 twice, for the name space UUID and for the MD5 output, as foreseen for little-endian input, but the values are already big-endian; that is, only one swap is needed. Most implementations (e.g., the Unix uuid command and the Python library) use the corrected implementation, but some others do not. We have added a Boolean-valued #:legacy keyword to specify which result to compute: #f for the corrected version or #t for the original (i.e., ’buggy’) version. The default is the corrected version.
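The corrected computation can be sketched as follows, assuming the usual construction: hash the name space UUID in big-endian byte order followed by the name, keep the first 16 octets, and overwrite the version and variant bits. This is written with Python's hashlib and uuid modules, not this library. The name_based helper is hypothetical, and encoding the name as UTF-8 is an assumption.

```python
import hashlib
import uuid


def name_based(namespace, name, version):
    """Build a type 3 (MD5) or type 5 (SHA-1) UUID with no byte swapping."""
    hasher = hashlib.md5 if version == 3 else hashlib.sha1
    # The name space UUID is hashed in network (big-endian) order, as stored.
    digest = hasher(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(digest[:16])               # SHA-1 yields 20 octets; keep 16
    raw[6] = (raw[6] & 0x0F) | (version << 4)  # version in high nibble of octet 6
    raw[8] = (raw[8] & 0x3F) | 0x80            # RFC 4122 variant bits (10xx)
    return uuid.UUID(bytes=bytes(raw))


print(name_based(uuid.NAMESPACE_DNS, "www.widgets.com", 3))
print(name_based(uuid.NAMESPACE_DNS, "www.widgets.com", 5))
```

The two results agree with Python's uuid.uuid3 and uuid.uuid5, and with the non-legacy values shown in this section's examples.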
(make-uuid-3 namespace-uuid name [#:legacy legacy?]) → uuid?
  namespace-uuid : uuid?
  name : string?
  legacy? : boolean? = #f
Examples:
(make-uuid-3 namespace-DNS "www.widgets.com")
#<uuid 3d813cbb-47fb-32ba-91df-831e1593ac29>
(make-uuid-3 namespace-DNS "www.widgets.com" #:legacy #t)
#<uuid e902893a-9d22-3c7e-a7b8-d6e313b71d9f>
(make-uuid-5 namespace-uuid name [#:legacy legacy?]) → uuid?
  namespace-uuid : uuid?
  name : string?
  legacy? : boolean? = #f
Examples:
(make-uuid-5 namespace-DNS "www.widgets.com")
#<uuid 21f7f8de-8051-5b89-8680-0195ef798b6a>
(make-uuid-5 namespace-DNS "www.widgets.com" #:legacy #t)
#<uuid 13726f09-44a9-5eeb-8910-3525a23fb23b>
1.3 Pseudo-Random (Type 4) UUIDs
(make-uuid-4) → uuid?
Example:
(make-uuid-4)
#<uuid 177f42e6-6f22-44d9-93f6-8c475170daf6>
2 Example
The following example demonstrates various functions of the UUID library.
#lang scheme
(require scheme/date)
(require "uuid.ss")
; Time-Based UUIDs
(define U1 (make-uuid-1))
(printf "(make-uuid-1)~n~a~n"
        U1)
(printf "Created ~a~n~n"
        (date->string (uuid-1->date U1) #t))
; Name-Based UUID Using MD5 Hashing
(printf
 "(make-uuid-3 namespace-DNS \"www.widgets.com\")~n~a~n"
 (make-uuid-3 namespace-DNS "www.widgets.com"))
(printf
 "(make-uuid-3 namespace-DNS \"www.widgets.com\" #:legacy #t)~n~a~n~n"
 (make-uuid-3 namespace-DNS "www.widgets.com" #:legacy #t))
; Name-Based UUID Using SHA-1 Hashing
(printf
 "(make-uuid-5 namespace-DNS \"www.widgets.com\")~n~a~n"
 (make-uuid-5 namespace-DNS "www.widgets.com"))
(printf
 "(make-uuid-5 namespace-DNS \"www.widgets.com\" #:legacy #t)~n~a~n~n"
 (make-uuid-5 namespace-DNS "www.widgets.com" #:legacy #t))
; (Pseudo-)Random UUID
(define U4 (make-uuid-4))
(printf "(make-uuid-4)~n~a~n~n" U4)
(printf "U4 = ~a~n~n" U4)
(printf "(uuid->string U4)~n~s~n~n" (uuid->string U4))
(printf "(uuid->urn-string U4)~n~s~n~n" (uuid->urn-string U4))
; Comparisons
(printf "namespace-DNS = ~a~n" namespace-DNS)
(printf "(uuid=? U4 U4) = ~a~n" (uuid=? U4 U4))
(printf "(uuid=? U4 namespace-DNS) = ~a~n" (uuid=? U4 namespace-DNS))
(printf "(uuid<? U4 namespace-DNS) = ~a~n" (uuid<? U4 namespace-DNS))
(printf "(uuid>? U4 namespace-DNS) = ~a~n" (uuid>? U4 namespace-DNS))
This produces the following output:
(make-uuid-1)
#<uuid 7c769460-eac0-11de-a1ca-001b779c76e3>
Created Wednesday, December 16th, 2009 8:58:51pm

(make-uuid-3 namespace-DNS "www.widgets.com")
#<uuid 3d813cbb-47fb-32ba-91df-831e1593ac29>
(make-uuid-3 namespace-DNS "www.widgets.com" #:legacy #t)
#<uuid e902893a-9d22-3c7e-a7b8-d6e313b71d9f>

(make-uuid-5 namespace-DNS "www.widgets.com")
#<uuid 21f7f8de-8051-5b89-8680-0195ef798b6a>
(make-uuid-5 namespace-DNS "www.widgets.com" #:legacy #t)
#<uuid 13726f09-44a9-5eeb-8910-3525a23fb23b>

(make-uuid-4)
#<uuid ab595962-0a37-4520-8bef-afc559955201>

U4 = #<uuid ab595962-0a37-4520-8bef-afc559955201>

(uuid->string U4)
"ab595962-0a37-4520-8bef-afc559955201"

(uuid->urn-string U4)
"urn:uuid:ab595962-0a37-4520-8bef-afc559955201"

namespace-DNS = #<uuid 6ba7b810-9dad-11d1-80b4-00c04fd430c8>
(uuid=? U4 U4) = #t
(uuid=? U4 namespace-DNS) = #f
(uuid<? U4 namespace-DNS) = #f
(uuid>? U4 namespace-DNS) = #t
3 Issues and Comments
The biggest issue is that of the ’buggy’ reference implementation in RFC 4122 with regard to the generation of name-based UUIDs. It seems that the implementation in the Unix uuid command and the Python library (among others) is the correct implementation and we use this as the default behavior. However, it also seems that there are implementations ’in the wild’ that match the original RFC 4122 reference implementation. Therefore, we also provide this behavior using the #:legacy keyword.
The current time is measured with millisecond resolution, while the timestamp field counts 100-nanosecond intervals. This means we lose a significant amount of the available timestamp resolution: a factor of 10,000, or 4 decimal digits. The advantage is a simple, portable implementation. This is fine for low-volume UUID generation.
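To illustrate the loss of resolution, the sketch below (in Python, not this library's code) scales a millisecond clock reading to the 100-nanosecond units of the timestamp field. It assumes the timestamp is obtained by simple scaling; the library's actual computation may differ. The names timestamp_from_ms, TICKS_PER_MS, and GREGORIAN_TO_UNIX_MS are hypothetical.

```python
import time

TICKS_PER_MS = 10_000  # 1 ms = 10,000 intervals of 100 ns

# Milliseconds between 1582-10-15 and 1970-01-01 (the Unix epoch).
GREGORIAN_TO_UNIX_MS = 12_219_292_800_000


def timestamp_from_ms(unix_ms):
    """Scale a millisecond Unix time to a UUID timestamp in 100 ns units."""
    return (unix_ms + GREGORIAN_TO_UNIX_MS) * TICKS_PER_MS


now_ms = time.time_ns() // 1_000_000
ts = timestamp_from_ms(now_ms)
# The remainder is always 0: the low-order digits carry no information.
print(ts, ts % TICKS_PER_MS)
```

Every timestamp produced this way is a multiple of 10,000, so all UUIDs generated within the same millisecond share the same timestamp value.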
The current implementation does not maintain any state information for UUID generation. This means that we generate a new random clock sequence for every new time-based UUID, which increases the probability of collisions. Again, this is fine for low-volume UUID generation.
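As a rough estimate of that risk, the following Python sketch computes the birthday-style probability that at least two of n UUIDs receive the same random 14-bit clock sequence. It assumes all n are generated on the same node with the same timestamp value and that clock sequences are drawn uniformly; the function name collision_probability is illustrative.

```python
def collision_probability(n, clock_seq_bits=14):
    """Probability that at least two of n random clock sequences coincide."""
    space = 1 << clock_seq_bits  # 16,384 possible values for 14 bits
    if n > space:
        return 1.0
    p_unique = 1.0
    for i in range(n):
        # The (i+1)-th draw must avoid the i values already taken.
        p_unique *= (space - i) / space
    return 1.0 - p_unique


for n in (2, 10, 50, 150):
    print(n, round(collision_probability(n), 4))
```

For two UUIDs sharing a timestamp the probability is 1 in 16,384, and it grows roughly quadratically with n while n is small.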
At some point, (make-uuid-1) needs to allow the optional specification of the node to use. Currently, this is the primary MAC address for the machine on which the code is run, which could be considered a security issue in some cases.