Data history: Global identifiers

D. Adams
12oct01 1010


Our requirements call for a unique global identifier to be assigned to each event data object. Here we describe one possible implementation as a proof of principle. No doubt there are more sophisticated schemes, but the following has the virtue of being simple.

Requirements

The identifier must be unique, i.e. each value is assigned only once. Different machines around the world should be able to obtain identifiers quickly without relying on a high-speed connection to a central server. The identifier must have a compact persistent representation so the objects do not become too large.

Size

ATLAS acquires data at a rate of about 10**9 events/yr and can be expected to run for 20 years, i.e. 2 x 10**10 events. A factor 5 safety margin gives a total of 10**11 events. If we allow for 1000 objects/event, we obtain a total of 10**14 objects. This corresponds to 47 bits, so we choose an identifier size of 64 bits to allow room for wasted indices. This gives about 200,000 times more indices than expected objects (2**64 / 10**14 is roughly 1.8 x 10**5).
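
The arithmetic above can be checked with a few lines of Python. This is only a sanity check of the estimate; the constant names are illustrative and are not part of any ATLAS software.

```python
# Sanity check of the size estimate (constant names are illustrative).
EVENTS_PER_YEAR = 10**9
YEARS = 20
SAFETY_FACTOR = 5
OBJECTS_PER_EVENT = 1000

total_events = EVENTS_PER_YEAR * YEARS * SAFETY_FACTOR
total_objects = total_events * OBJECTS_PER_EVENT

# Number of bits needed to give every object a distinct value.
bits_needed = (total_objects - 1).bit_length()

# How many times larger the 64-bit space is than the expected object count.
headroom = 2**64 // total_objects

print(total_objects, bits_needed, headroom)
```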

Distribution

We need to distribute the indices in a manner that guarantees they are unique but does not have a high latency associated with the assignment of each value. A central source maintains a pool of index lists. The values in different lists do not overlap. One choice would be to maintain 2**40 (about 10**12) lists each containing 2**24 (about 17M) values, which together span the full 64-bit identifier.

Each computer requests one list from the central pool for each process that it expects to run. Each process gains exclusive use of a list, removes the indices as needed and then releases the list with its remaining values.
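
As a rough illustration of the pool, the sketch below keeps the state in memory and hands out lists as (list ID, next index) pairs. The class name ListPool and its methods are hypothetical; a real central source would need persistent storage and a network interface, neither of which is shown here.

```python
LIST_BITS = 40    # 2**40 lists in the pool
INDEX_BITS = 24   # 2**24 values per list


class ListPool:
    """Hypothetical in-memory stand-in for the central pool of index lists."""

    def __init__(self):
        self._next_list_id = 0   # first list ID never handed out
        self._released = []      # partly used lists returned by processes

    def acquire(self):
        """Return (list_id, next_index) for exclusive use by one process."""
        if self._released:
            return self._released.pop()
        if self._next_list_id >= 1 << LIST_BITS:
            raise RuntimeError("pool exhausted")
        list_id = self._next_list_id
        self._next_list_id += 1
        return (list_id, 0)

    def release(self, list_id, next_index):
        """Return a list to the pool along with its remaining values."""
        if next_index < 1 << INDEX_BITS:   # drop lists that are used up
            self._released.append((list_id, next_index))


pool = ListPool()
first = pool.acquire()    # fresh list
second = pool.acquire()   # a different, non-overlapping list
pool.release(first[0], 5) # first process used 5 indices
reused = pool.acquire()   # the partly used list is handed out again
```

Because a list is only ever held by one process at a time and list IDs are never reissued as new, the values drawn from different lists cannot collide.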

Implementation

A simple implementation would be to use a file for each list. The file would contain a unique 40-bit list ID (assigned by the central server) and the next 24-bit local index. The unique ATLAS identifier is constructed by appending the index to the list ID. The index is incremented each time an ID is assigned.
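
Appending the 24-bit index to the 40-bit list ID amounts to a shift and a bitwise OR. The following sketch assumes the list ID occupies the high 40 bits and the index the low 24 bits; the function names make_id and split_id are illustrative.

```python
LIST_BITS = 40
INDEX_BITS = 24
INDEX_MASK = (1 << INDEX_BITS) - 1


def make_id(list_id, index):
    """Build a 64-bit identifier: list ID in the high bits, index in the low bits."""
    if not 0 <= list_id < (1 << LIST_BITS):
        raise ValueError("list ID does not fit in 40 bits")
    if not 0 <= index < (1 << INDEX_BITS):
        raise ValueError("index does not fit in 24 bits")
    return (list_id << INDEX_BITS) | index


def split_id(global_id):
    """Recover (list_id, index) from a 64-bit identifier."""
    return global_id >> INDEX_BITS, global_id & INDEX_MASK


gid = make_id(3, 5)
print(gid, split_id(gid))
```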

A process gains exclusive use by opening and locking the file and releases the list by closing the file. The updated index is written to the file before it is closed.
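
A minimal sketch of this open, lock, assign, write and close cycle is given below. It assumes a Unix system (it uses the standard fcntl.flock call) and an invented file format of two decimal integers, "list_id next_index", on one line. The class name IdList and the file layout are assumptions made for illustration only.

```python
import fcntl
import os

INDEX_BITS = 24


class IdList:
    """Hypothetical wrapper around one list file (format: 'list_id next_index')."""

    def __init__(self, path):
        self._file = open(path, "r+")
        try:
            # Exclusive, non-blocking lock: fails if another process holds the list.
            fcntl.flock(self._file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            self._file.close()
            raise
        list_id, index = self._file.read().split()
        self.list_id = int(list_id)
        self.index = int(index)

    def next_id(self):
        """Assign the next identifier and increment the local index."""
        if self.index >= 1 << INDEX_BITS:
            raise RuntimeError("list exhausted")
        global_id = (self.list_id << INDEX_BITS) | self.index
        self.index += 1
        return global_id

    def close(self):
        """Write the updated index back, then close (which releases the lock)."""
        self._file.seek(0)
        self._file.truncate()
        self._file.write(f"{self.list_id} {self.index}\n")
        self._file.flush()
        os.fsync(self._file.fileno())
        self._file.close()
```

In this sketch the index is only written on close, so a process that crashes before calling close() would leave the old index in the file. A more careful version would have to persist the index more often or discard the list after an unclean exit.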


dladams@bnl.gov