Best way to query user data in stable memory

alexpowmow · September 20, 2024, 11:18pm

What Motoko based data structure are people using to query user data by Principal ID in stable memory? I was thinking about using HashMap or RBTree, but apparently I am not allowed to use those structures for stable memory - what is the reason for that too? Anyway, if anyone has an efficient way to query user data by Principal ID please let me know, and if that is not the best method please let me know what you are using instead. Thanks

jennifertran · September 21, 2024, 4:24am

Have you looked into stable-trie by @timo?

There are other packages as well, maybe Motoko-hashtable is a good fit as well?

timo · September 21, 2024, 4:34am

What exactly is your user data? To answer the question it is important to know whether it is constant size (same size for each user) or variable size (size differs from user to user). And also if user data can be overwritten later in a way that the size changes.

timo · September 21, 2024, 4:51am

It is not clear to me from your question if you want a data structure that lives in stable memory (in Regions) or if you want one that lives in heap and gets serialized to stable memory during upgrades.

For the first kind (Regions) there are only two that I know of:

https://mops.one/stable-trie
GitHub - sardariuss/MotokoStableBTree: https://forum.dfinity.org/t/icdevs-org-bounty-24-stablebtree-mokoko-up-to-10k/14867 (doesn’t seem to be published on mops)

For the second kind there are many options. If you want that kind please respond and I can list more. For this kind you can also use standard RBTree that you mentioned. It has share/unshare functions that you can put in preupgrade/postupgrade.

alexpowmow · September 21, 2024, 3:21pm

The user data is not constant size and I would like the ability to change its size at any point. I was going to use a Trie but I ran into issues using the motoko Hash library (mo:base/Hash) to hash the Principal ID. Thanks for taking the time to try and help btw!

alexpowmow · September 21, 2024, 3:23pm

I wanted to use the Motoko Trie library (mo:base/Trie) at first. However, in order to create a balanced Trie with Principal ID being the key I needed to hash it first. I have run into issues using the Motoko Hash library on Principal IDs. And it seems like the only applicable Motoko based data structure that is stable memory compatible. So I am just unsure what to use now.

alexpowmow · September 21, 2024, 3:26pm

I feel like it would be best to just have a fully compatible stable memory data structure. BTW this data structure is supposed to just store basic user data that they choose to give the application. Like name, interests, username. Things like that.

alexpowmow · September 21, 2024, 3:37pm

What are your thoughts on: https://mops.one/stableheapbtreemap ?

timo · September 21, 2024, 4:59pm

In that case I usually give out consecutive numerical user ids (for internal use only, not exposed to the user) to the principals as they sign up. The map principal → user id is stored in a tree. The user data is then stored in a vector with user id as the index in the vector. That approach gives a lot of flexibility. The expensive trie lookup is performed only once, afterwards all accesses are fast vector accesses (array accesses). Some internal tasks, if the user id is already known, can even be performed without any tree lookup. The vector can be declared stable and will be serialized at upgrade without you having to do anything. For the tree map principal → user id you now have more options than before because the user id is constant size.

If you care about space waste after deletion of a user: an entry in the vector will remain unused forever if a user id isn’t used anymore. If you delete the data to which the vector entry points then you will only waste 4 bytes for the unused vector entry. That seems acceptable.

For the tree map principal → user id we have developed and used two data structures specifically for this purpose. Both offer the inverse map user id → principal as well (which you may need or not).

The characteristics of the first one:

is a heap data structure
has a share/unshare interface for persistence across upgrades
is based on RBTree from base
is space efficient because it offers both directions of the map but stores each principal only once
has no deletion
expected to work with high 10s of millions of user (maybe not 100s of millions)

The characteristics of the second one:

is not a heap datastructure but lives directly in stable memory (Regions)
no serialization needed at upgrades
based on a trie implementation written from scratch
comes in two versions: first version called “Map” has deletion but not the inverse map user id → principal, second version called “Enumeration” has the inverse map but does not offer deletion
expected to work with billions of users

No deletion means you are going to store each principal forever. Often this is just ok.

timo · September 21, 2024, 5:13pm

That will work. It’s a heap datastructure but you can declare it stable and it will serialize automatically at upgrade time.

alexpowmow · September 21, 2024, 6:37pm

This is really great, exactly the response I needed. Thank you so much!

Topic		Replies	Views
Stable Memory for the Database in Motoko Developers	1	677	July 24, 2023
Is there a stable type like a HashMap where the key can be of Principal type and the value a custom type? Language Support	10	856	April 8, 2024
Developers documentation best Map/DB Motoko practices Motoko	2	93	October 9, 2024
Announcing Motoko package stable-trie Motoko	14	228	October 20, 2024
Any Code Examples for a Small User Database Using Stable Data Structures? Developers	6	626	October 19, 2023

Best way to query user data in stable memory

Related topics