Rust Help: BTreeSet doesn't behave the way I am expecting it to

This is not specific to the IC. This is a general Rust problem that I’m stuck with.

So, my use case is I have a post that has 3 fields, the post ID, the publisher ID and the score. The combination of the publisher ID and the post ID makes a unique post. And the score is used to rank the posts. So, I store them in a standard BTreeSet. Basically I need to store them in a sorted order by score so that I can paginate over them. And when updating the scores, the existing post whose publisher ID and post ID combination match should be updated. So, I have implemented Eq , PartialEq , Ord and PartialOrd on the custom item.

I’ve created a minimal example here: GitHub - saikatdas0790/posts_in_btreeset

If you run the tests, you’ll see that when comparing the post objects directly, they behave as I expect them to. But when I put them in a BTreeSet, duplicates are created. I’m kinda lost on how to remedy this.

Thank you for any inputs :slight_smile:

The Ord trait is where most of your logic should be defined for the BTreeSet to work as you intend to with your own types.

That’s what I’ve done here

Try running the unit tests. It’s surprising that the first test passes but the last one fails

Oh I see what you mean now.

The thing is that a BTree only look for what is around a value you want to insert in the tree (lesser, greater, equal), this is why it has such a great performance.

If you look at your updated example here, when you want to insert ‘31 bis’ the algo is only comparing 18446 and 18_446_744_073_605_493_716, when trying to find in the tree where to put the value.

So because of the ordering of the BTree and its tree traversal mechanism, it cannot ‘know’ that there is a post_id somewhere that already exists in the data structure.

I had a similar problem, and one workaround would be to use another data structure like a BTreeMap in order to track the scores and reference the Post there.

I hope it helped.

Cheers

1 Like

A great library for stuff that needs to be owned somewhere, and then indexed or referenced however else in many ways, is slotmap. You insert a value, get back a key, and can then copy that key around freely without worrying about lifetimes. Basically like storing a list index, except versioned so removing values Just Works.

You can’t use this directly with an automatically-sorted collection, but you can easily store a regular vector and use binary_search_by_key to maintain a sort order - if you search for your new element, given an already-sorted slice, it will tell you in O(log n) where to insert it.

2 Likes

Thank you for the tip. I’ll see if it improves my current implementation.

I have currently implemented a custom collection that provides an iterator to yield values exactly as I need them as described above. The implementation uses a mix of a BTreeMap and a HashMap to store the values. But it solves my needs allowing storing duplicate entries for a same score, while still maintaining a sort order for items inserted and also providing a mechanism to update score values with equality checked on a different set of properties.

Quite proud of myself for having managed to figure this part out :slight_smile:

2 Likes