Should I use HashMap or RBTree?

tokuryoo · May 10, 2023, 7:14am

Hi everyone,
The Motoko language provides HashMap and RBTree. Is there a best practice on which one to use? I am just studying. I am not familiar with algorithms.

docs - HashMap

Runtime: O(1)
Space: O(1)

docs - RBTree

Runtime: O(log(n)) worst case cost per insertion, removal, and retrieval operation.
Space: O(n) for storing the entire tree. n denotes the number of key-value entries (i.e. nodes) stored in the tree.

skilesare · May 10, 2023, 3:43pm

Probably this one: GitHub - ZhenyaUsenko/motoko-hash-map: Stable hash maps for Motoko

chenyan · May 10, 2023, 4:34pm

In terms of performance, we have some benchmark numbers for different map structures: Collection libraries | canister-profiling

icme · May 10, 2023, 8:45pm

HashMap:

provides most efficient O(1) insert, lookup, & deletion
space taken is O(n)
Depending on implementation every so often the hashtable underneath the map needs to double, which can result in an O(n) operation.

RBTree/BTree:

provides O(log(n)) insert, lookup, & deletion.
space taken is O(n)
Sorts the data by key within the tree. This allows for operations like scanning between two bounds, which is powerful if one wants to paginate through entries in the data structure that depend on a sorted ordering.

skilesare · May 10, 2023, 10:34pm

Byron is being a bit modest here…he also wrote GitHub - canscale/StableHeapBTreeMap: A functional BTreeMap that can be used with a stable variable in heap memory which I think is a pretty solid improvement on the BTree that maintains sort order.

Byron…you need to pimp your ride.

@chenyan…what is the mechanism that makes the rust maps so much more performant?

chenyan · May 11, 2023, 12:57am

Rust’s hash map uses the standard Swiss Table. Zhenya’s hash map uses V8’s hash map, which is better suited for storing large objects. Data structure-wise, there shouldn’t be much difference for normal use cases. The performance diff really comes from being written in different languages. Rust is much more low level than Motoko, so it can generate code that is much more efficient.

skilesare · May 11, 2023, 2:52am

Would you expect that Motoko would get more performant? Is there a particular mechanism that makes it not as efficient? Would adding a map at a lower level as a ‘native’ structure bring more performance in motoko?

chenyan · May 11, 2023, 4:31am

It’s always getting more performant. For example, see the perf improvements from 0.8.4 to 0.8.5: bump to moc 0.8.5 by chenyan-dfinity · Pull Request #43 · dfinity/canister-profiling · GitHub.

Being a high level language means there is a performance overhead in exchange for a native and convenient experience. You cannot expect Python to be as efficient as C. But there are still many Python developers. FFI can be an option to bridge the perf gap, but it’s a long way to support FFI.

tokuryoo · May 11, 2023, 8:29am

@skilesare
@chenyan
@icme

Thanks for your reply!

a.
Collection libraries | canister-profiling
The value of generate 50k in hashmap is 2_387_017_574. Is the unit of measure Cycle?

b.
Collection libraries | canister-profiling

We have a limit on the maximal cycles per round.

I understood that the less Cylce consumed by Map, the more processing can be done per round. Am I correct in understanding that faster Map processing does not mean faster response time?

c.
docs - HashMap

Runtime: O(1)
Space: O(1)

Maybe O(1) is wrong and O(n) is correct. I do not know.

d.
I had never heard of StableHeapBTreeMap before. Thanks.

e.
If I want to handle Huge data in a Map, I thought Rust should be adopted.
For general systems that are not Internet Computer, efficiency is related to processing speed and response time, and inefficient efficiency can be solved by replacing CPUs or adding servers, so there is no problem with a slow language. However, I thought that efficiency is important when processing on an Internet Computer and on a single subnet.

Topic		Replies	Views
Trie based backend Motoko	1	70	October 4, 2024
Motoko B-Tree, for efficient ordered key-value data Language Support Motoko	22	1730	May 18, 2022
(Heap-based) BTreeMap library for Motoko Language Support Motoko	2	821	March 10, 2023
Developers documentation best Map/DB Motoko practices Motoko	2	94	October 9, 2024
HashMap vs TrieMap Developers	4	1837	February 27, 2022

Should I use HashMap or RBTree?

Related topics