UUIDs generator

Hi,

Are there any plans for a UUID generator?

So I’ve been looking for a good solution but I didn’t get far.

So 1 blob of entropy gives us 256 bits Nat (cryptographic entropy (randomness source) is only obtainable asyncronously in discrete chunks of 256 bits (32-byte sized Blobs))

This raises the question, how is the entropy generated? radioactive decay? random background radio noise?

Also how much does 1 blob of entropy cost in cycles?

To get a UUID we need to be able to generate a 128 bit label. Universally unique identifier - Wikipedia

Now in order to get a low probability collision for a 128-bit UUID, the ‘birthday effect’ tells us that a collision is likely after you’ve generated about 2^64 keys, provided you have 128 bits of entropy in each key. From what I’ve read you’ll start seeing collisions in about 2^50 but the chances of seeing a collision if you’ve populated your database with just 1000 billion keys are still negligible.

So at best we can use a full blob of entroopy for 2 UUIDs.

Now my question is, what’s the best way to generate up to 10 UUIDs using just 1 blob of entropy and ensure a low probability collision?

PS:

I’ve seens some possible solutions using the Hash Hash :: Internet Computer

The problem is Hash returns a 31 bit Nat and secondly not sure how deterministic is this.

1 Like

Tagging @claudio @rossberg as they’re more expert on the matter

I’m actually not expert on this area, so will refrain from giving advice.

The source of the Random.blob entropy bits is a Threshold Random Signature, computed, in concert, by a quorum of the replicas of the IC, which (AFAIK) should give “cryptographic” randomness.

I would not use Hash.mo hashing functions - these don’t provide any cryptographic guarantees at all - indeed, most of Motoko’s hashing support needs revisiting, IMO.

That’s my understanding as well.

Does a UUID generator really need cryptographic randomness? Or randomness at all, even?

A counter that counts can generate a stream of UUIDs and not use any crypto at all. It’s perfectly “secure” as a stream of UUIDs, except that the order of the stream is predictable. Does that matter here?

If not, Hash.mo is fine.

I see, according to Wikipedia, the difference between a GUID and UUID is whether there is centralized coordination in the generation or not:

Their uniqueness does not depend on a central registration authority or coordination between the parties generating them, unlike most other numbering schemes. While the probability that a UUID will be duplicated is not zero, it is close enough to zero to be negligible.

But if you control a canister that issues these, the need for UUID is simply absent. It’s a need that a decentralized app has that a single canister lacks.

Is there a good reason to have the generation be truly distributed?

In that case, you’d better just do it client-side, and not in Motoko at all, IMO.

Our application does require UUIDs to be random - or ‘hard to guess’ - so e.g. a consecutive numbering system would not be acceptable. We will also have a requirement to generate a large number of UUIDs on a continuous basis so latency may be an issue with cross-canister calls - it would be so much better to do this locally.

Client-side may work for some use-cases, but we’re likely to require UUID generation without direct client interaction so would not want to rely on this.

Grateful for more ideas!

How about using the Principal type from Candid/Motoko/Rust as a kind of UUID for resources?

They can be generated in many languages already. For instance, you can use the existing agent libraries (in JS and Rust) for the IC to generate a large set of Principal IDs randomly.

Those are cryptographically generated, AFAIK, so they will have the needed properties that @DATHKA and @Gabriel may be wanting, and it will be practically impossible to generate duplicates.

A nice benefit to re-using this type for a general UUID type is that all of these languages already know how to send, receive, print and compare them, and the standards for all of these choices are already established.

1 Like

Hi,
I did for POC a canister for UID I also use two modules from Enzo Haussecker @enzo Hex and sha256 which I modify a little to works with Nat8.

The maturity is low that why I store it in private repo.

As the random is slow I use a stack to store UID which are generated in parallel. using function :
public func createNewUids( number : Int)

You can get UID with :
public func getNewUid() : async ?UID

I also create a function that create a hash which is fast and it use low resources
public query func getNewId() : async Id

When I need a global UID I use getNewUid when I need an Id for internal use local to a canister or a limited datasets I use getNewId.

I share just the canister source code as zip if you have any interest.

uid.zip

2 Likes