Reversing an array

Is there a good way to reverse an array? I’m not the best with for/loop/break, but I tried Iter.revRange, and it uses an Int not a Nat, so I couldn’t index the array with it.

          // @todo change to toIter when base is updated
          let l = List.fromArray(onError.toArray());
          for (f in List.toArray(List.reverse(l)).vals()) {
            f()
          };

image

i assumed these were mirror images of each other, there’s probably a reason but just giving my 2 cents.

import Array "mo:base/Array";

func reverse<A>(xs : [A]) : [A] {
  let size = xs.size();
  Array.tabulate(size, func (n : Nat) : A {
    xs[size - 1 - n];
  });
};

awesome, thanks so much

I made a pull request to add this to motoko-base:

1 Like

I’m claiming the assist on this one

We use a lot of this in our ORM. Don’t judge, I wrote it ages ago :slight_smile: Do you have a more efficient method?

Thanks again!

  // isUnique
  public func isUnique<T>(arr : [T], isEq : (T, T) -> Bool) : Bool {
    var list : List.List<T> = null;

    for (a in arr.vals()) {
      if (List.some<T>(list, func(v : T) { isEq(v, a) })) {
        return false;
      };
      list := List.push<T>(a, list);  // add value to list
    };

    return true;
  };

Instead of tabulating a new array, why not just use the old array and iterate through it in reverse.

You can repurpose the Iter.revRange function to be more Nat friendly. It’s slightly more complicated, but a lot more efficient as you don’t have to create an additional array.

public class revRangeExcludeLowerBound(x : Nat, y : Nat) {
  var i = x;
  public func next() : ?Nat { 
    if (i == y) { null }; // when it hits the lower bound, exit the iterator
    else {
      let j = i; 
      i -= 1; 
      ?j
    }
  };
};

Then use it with something like:

let arr = yourArray;
// iterate from the size of the array -> 0
// remember that 0 will exit the iterator without running the code in the
// for loop as it is the lower bound (see the if condition above)
for (i in revRangeExcludeLowerBound(arr.size(), 0)) {
  // when at arr.size this will access the last element (size - 1)
  // last element iterated upon will be when i == 1 (so the 0th element of the array)
  doSomething(arr[i - 1]);
}
1 Like

Yea you’re looking at O(n^2) right there.

Depends on the size of your array, but if you want uniqueness, or a lookup check I think you’re using the wrong data structure here if you want to check key uniqueness. I’d recommend using a HashMap or a Red-Black Tree and storing the data in there in the first place.

I’ll give you an example of how you can use this stable Red-Black Tree library to improve your runtime to O(n) StableRBTree/StableRBTree.mo at main · canscale/StableRBTree · GitHub


type Item = {
  name: Text;
  damage: Nat;
};

type Count = Nat;

let itemBackpack = RBTree.init<Item, Count>();

// this is how you seed a Red-Black Tree with your existing array of items to have an item count key value store
for (item in yourCurrentItemArray.vals()) {
  switch(RBTree.get<Item, Count>(itemBackpack, itemCompareTo, item)) {
    case null { RBTree.put<Item, Count>(item, 1) };
    case (?count) { RBTree.put<Item, Count>(item, count + 1 };
  }
};

// get unique items
let uniqueItems = Iter.filter<(Item, Count)>(
  RBTree.entries<Item, Count>(itemBackpack),
  func((item, count)) { count == 1 }
);

Another approach would be to just use a HashMap for even quicker lookups. Keep in mind that HashMaps are O(1), but aren’t nearly as memory efficient as Red-Black Trees. The Motoko HashMap implementation has been rumored to take up 28X the size

So the use case is One to Many relationships, so for instance if a Character has pets ID [20, 35, 55] and we try and adopt ID 20 it throws an error. It’s just an edge case really.

I think we do need to start looking at more efficient data structures, but because we have like 140 record types and also the stable variable issue, it’s a lot of work.

here’s the character definition. Have to do a screenshot as I’m getting a weird forum bug otherwise.

The IDs work mostly like database relationships.

That makes sense, you might get into trouble looking for all the adopted pets through each and every character each time you want to identify uniqueness/ownership. Better to just have a separate data structure that holds all the pets that have not been adopted/owned yet. Then you look it up in the unowned data strict and see if it’s not owned by someone.

should be just arr.size() not arr.size()+1

Good catch (was just pseudocoding it). Edited the post

So I tried it and it’s not inclusive of the upper and lower bounds, because of the pesky compiler Nat -1 thing.

What about this ?

  // revRangeNat
  // like revRange but returns ?Nat not ?Int
  // so can be used with arrays
  //
  // let r = revRangeNat(0)     // ?0 -> null
  // let r = revRangeNat(3, 1)  // ?3 -> ?2 -> ?1 -> null
  public class revRangeNat(x : Nat, y : Nat) {
    var i : Int = x;
    public func next() : ?Nat {
      if (i + 1 == y or i < 0) {
        null;
      } else {
        let j = Int.abs(i);
        i -= 1;
        ?j;
      };
    };
  };

    // revRangeNat
    do {
      let rev = LIter.revRangeNat(0, 0);
      assert(rev.next() == ?0);
      assert(rev.next() == null);
    };
    do {
      let rev = LIter.revRangeNat(3, 1);
      assert(rev.next() == ?3);
      assert(rev.next() == ?2);
      assert(rev.next() == ?1);
      assert(rev.next() == null);
    };

Do you have a tweet sized(or longer if you have more time) explanation of how a triemap is different than an rbtrie? And why the rbtrie is better?

@skilesare

TLDR Tweet: Red-Black trees are self balancing, whereas a Trie is not balanced and does not re-balance.

Going deeper if you care to…

This means that for Tries, depending on the order in which records are being inserted, one could end up with a lopsided Trie from the root node, resulting in an asymptotic complexity for get/put/delete of O(n). Tries are typically used for word dictionaries, as they conveniently store paths from the beginning of a words to the end, and can label points along the trie graph that symbolize the end of a word.

Red-Black Trees are self balancing binary trees, which follow several invariants upon each insertion in order to maintain the balanced state of the tree. This means that both subtrees from each root tree are balanced, resulting in an asymptotic complexity for get/put/delete of O(log2(n)).

To get a better mental picture of how Tries work, I recommend inserting items in the following order “a”, “ab”, “abc”, and “abcd” into this Trie visualizer tool Trie Visualization

To get a better mental picture of how Red-Black Trees work, I recommend inserting the same “a”, “ab”, “abc”, and “abcd” into this Red-Black tree visualizer tool Red/Black Tree Visualization and comparing the two results.


While traditionally Tries are used for “word dictionaries” and the size of each leaf is the size of the alphabet being used (26 letters for english), the current implementation of Trie in motoko-base is slightly modified and is not geared around this same use “word dictionary” use case. It sets an arbitrary limit of 8 elements per leaf, meaning that at each leaf of the Trie there room for 8 elements to be inserted. Elements are ordered and compared at each leaf of the Trie by their Hashed result, and collisions are handled in a similar fashion to a HashMap. Once this limit of 8 element in a leaf is surpassed, the Trie creates a branch leading to another deeper level of the Trie with another leaf (and 8 more element slots).

If the data that you are inserting into your TrieMap is inserted in a perfectly random ordering, you’ll most likely not see too big of a performance difference - but if it is inserted in a monotomically increasing order (like with a timestamp), you’ll end up with an incredibly lopsided Trie, which makes for bad query/update performance. Even a normal random insertion order is bound to have some elements in the Trie that are much deeper down than other elements, leading to inconsistent performance.


In most cases where key value lookups and operations are required, I would therefore say that in terms of Motoko libraries currently Red-Black Trees provide the best combination of performance and space efficiency out of any of the existing data structures.

I have plans to build out a B-Tree data structure in Motoko, which is another self balancing tree data structure that has slight performance benefits over Red-Black Trees as the size of the Tree increases past several GB, as well as a few slight additional benefits over the current functionally programmed implementation (in specific deletion).

3 Likes

@borovan

I found an even more delightful solution that just uses the existing Iter.revRange() function with your Int.abs() suggestion. Teamwork makes the dream work :laughing:

do {
  let arr = [2,4,6,8,10];
  for (i in Iter.revRange(arr.size(), 1)) {
    Debug.print(debug_show(arr[Int.abs(i-1)]));
  };
};
// Output

10
8
6
4
2

Just a bit of documentation here as of May 15, 2022, I tested the StableRBTree vs the RBTree and was surprised by what I found:

Basically, the orthogonal persistence is only 10% more efficient than using a stable representation.

I’m not sure for that efficiency I can justify having to jump through the pre/post upgrade hoops.

This means a lot of refactoring, but I think it will be worth it.

The real concern here was that I was only able to get about 180,000 items into an tree in one consensus round. I don’t know how much bigger the pre/post rounds are, but this could be an issue for folks using that methodology if you get to any kind of scale.

1 Like

I think your test might be flawed, leading to the large performance difference you observe.

In line 20, you declare a new local variable instead of using the outer var tree variable (on line 15)

   let tree = RBTree.RBTree<Nat,Text>(Nat.compare);

That means the tree will be dead at the end of the message and the GC won’t need to collect very much at all, leaving you more cycles to insert data in the first place.

At least, that’s my hunch.

image

1 Like