Motoko Base Library Changes

I’d agree that you don’t need Buffer at this point, especially after seeing the benchmarks.

First, a question: I found it interesting that List has a cheaper iter than Array. What causes that? It seems a bit counterintuitive.

Second, my concern is not so much that we need Buffer, but that sometimes Motoko is not as intuitive as it could be. Four years in, I’m just now figuring out that Array.tabulate should be my go-to for calculating a temporary transformation across an iter. Since we are remaking base, how do we get developers to realizations like this faster? To understand why you’d want to use Array.tabulate, you need to understand a bunch of things that are not exactly obvious:

  1. You can’t return classes or non-static types (including [var a] variable array items) from shared actor functions or queries. As a Motoko dev, this takes a while to figure out and adapt to. Once you do, you get accustomed to creating toArray-able collections to transform the data in your canister before it goes out to the world. This is a bit annoying because you don’t understand why you can return {x = 5; y = 6} but not {var x = 5; var y = 6}. But you adapt and get in the habit of doing these transformations.
  2. What you use for the transformation gets scattered about the community. People start with arrays but quickly realize that Array.init gives you a var array, which you can’t return, so you look for something else. Ahhh, Buffer. Then your storage gets more nuanced and you find HashMap, and no one tells you it is wildly inefficient at scale (thankfully fixed in new base). So you go find something else, and by the time you get back to transforming your data, the community has lost any sense of a best practice.
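For anyone landing here later, a minimal sketch of the transformation habit described in the list above, assuming the mo:base Array module. The Point and SharedPoint types and the snapshot function are hypothetical names made up for illustration:

```motoko
import Array "mo:base/Array";

// Hypothetical types: Point has var fields, so it is not shareable;
// SharedPoint has only immutable fields, so it is.
type Point = { var x : Nat; var y : Nat };
type SharedPoint = { x : Nat; y : Nat };

let points : [Point] = [
  { var x = 5; var y = 6 },
  { var x = 7; var y = 8 },
];

// Array.tabulate builds an immutable array directly from an index function,
// so no mutable intermediate (Array.init, Buffer) is needed.
func snapshot(ps : [Point]) : [SharedPoint] {
  Array.tabulate<SharedPoint>(
    ps.size(),
    func(i : Nat) : SharedPoint {
      let p : SharedPoint = { x = ps[i].x; y = ps[i].y };
      p
    },
  )
};
```

The result of snapshot(points) has type [SharedPoint], which a shared function is allowed to return, whereas [Point] is not.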

All of that to say, is there a way to short-circuit that journey? Maybe StaticArray? You would never consider storing changeable data in it, and the name would automatically associate with the error you get when you try to return non-static data. This base class would not allow or contain any var, so you’d only use it appropriately. And .tabulate is not very intuitive either. StaticArray could instead take the tabulate signature of Array on init, which is what I want when I’m doing this.

“Ok Austin, time to create this collection of static stuff that is valid to return.”

As an aside, is there some insidious reason that the engine can’t just freeze var items anyway when they are returned from an actor shared function? It sure would avoid a lot of this mess. Or, worst case, give us a .toStatic() function we can implement in our classes.

I assume you are referring to the FromIters.bench.mo benchmark, where the cost of Array.fromIter is compared to pure/List.fromIter. To create the array, the iterator must first be materialized to determine the size, and then the elements are copied into the array. The pure/List only needs to materialize, so it needs fewer instructions.
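A rough sketch of the two steps described above, assuming the mo:base Buffer and Iter modules rather than the new library's internals. arrayFromIter is a hypothetical name, and the real Array.fromIter may well be implemented differently:

```motoko
import Buffer "mo:base/Buffer";
import Iter "mo:base/Iter";

// Hypothetical helper showing why building an array from an iterator
// costs two passes: the final size is unknown until the iterator is drained.
func arrayFromIter<T>(iter : Iter.Iter<T>) : [T] {
  let buf = Buffer.Buffer<T>(8);
  // Pass 1: materialize the iterator, which also reveals the size.
  for (x in iter) { buf.add(x) };
  // Pass 2: copy the elements into an exactly sized immutable array.
  Buffer.toArray<T>(buf)
};
```

A linked list can stop after the first pass, which is where the instruction savings would come from.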

We definitely need to improve on that front. We have the ongoing documentation restructuring and a couple of ideas for popularizing Motoko features and patterns. But it is hard to be proactive about all the problems users have writing Motoko in production. That’s why your feedback is invaluable for us to keep improving Motoko.


About shareability: in the new base you can use the immutable data structures from the pure/ folder, e.g. pure/List, pure/Map, pure/Set, and so on. These are shareable and stable, so there is no need to convert them. They might be more or less efficient depending on the use case.

  1. Simply removing var from the type and freezing the values won’t help much, because it would create incompatible data types, and the functions from the module would no longer work on them. For example, removing var from the type of Map would make it impossible to call Map.get on the result later, making it unusable.
  2. Allowing mutable data to be shared and simply returned from actor methods is tricky to implement (cyclic references) and would be confusing as well: the memory would not be shared, but users would expect to get shared references between actors, which is not possible.
  3. StaticArray? As a way to get the error faster? I probably don’t fully understand this idea. You could write type MyDataIsShareable = shared () -> async MyData to force this error, but it would probably already show up in the actual methods where MyData is used.

We will keep thinking about this issue in general, but we don’t see a good solution yet.
For now you can either:

  • use immutable data structures that are shared by default (no var inside them)
  • or serialize/deserialize non-shareable (e.g. impure) data structures manually

We have ideas for reducing the manual labor of implementing such functions from scratch. They would be enabled by some of the future Motoko features we have in our backlog.

@rvanasa Are there any plans for a delete() or remove() API in the new List module?

That was one of the things that Buffer had previously.

Also tagging @timo here as well in case I’m missing something in Vector.

The function is called removeLast.

Ah, so the data structure has indexed insertion and retrieval, but doesn’t provide indexed removal then, correct?

What exactly is that? Do you mean for example [1,2,3] → [1,null,3] or [1,2,3] → [1,3]?

From the point of view of the data structure abstraction, I’d like something like List.remove(list, index), where I don’t have to care about the underlying representation.

In your example, after removing the 2 from [1,2,3], I’d want it to tell me that:

  1. List.size(list) == 2
  2. List.indexOf(list, Nat.equal, 2) is null
  3. List.get(list, 0) == 1 and List.get(list, 1) == 3

That isn’t possible with List. The remove function in Buffer has a warning that it is inefficient, which in other words means “has worst case linear complexity”. List was created to eliminate worst case linear complexity from all operations (except those where it is explicitly wanted, such as iterating over the List). Hence, List does not have a remove function.

Since deleting an entry in the middle means downshifting all entries that come after, in the worst case (deleting an entry near the start) this means re-creating the whole list. So you might just as well copy the entire list into a new list, minus the element you want to delete.

To make that easier, it would be useful to have a concat function that can concatenate slices of existing lists into a new list. With that, the linear complexity is obvious to anyone using the function. For now, while concat does not exist, you have to write your own function.
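As a sketch of the "copy everything minus one element" approach described above, here is a version on plain immutable arrays using Array.tabulate from mo:base. It is a stand-in for the idea, not the List API: removeAt is a hypothetical helper, and an equivalent for List would have to be written against that module's own functions.

```motoko
import Array "mo:base/Array";

// Hypothetical helper: returns a copy of xs without the element at index.
// Cost is linear in xs.size(), since every remaining element is copied.
func removeAt<T>(xs : [T], index : Nat) : [T] {
  // An out-of-range index leaves the array unchanged.
  if (index >= xs.size()) { return xs };
  Array.tabulate<T>(
    (xs.size() - 1 : Nat),
    func(i : Nat) : T {
      // Elements before the removed slot keep their position;
      // elements after it shift down by one.
      if (i < index) { xs[i] } else { xs[i + 1] }
    },
  )
};
```

Removing index 1 from [1, 2, 3] this way gives a new array of size 2 holding 1 and 3, which matches the behavior asked for earlier in the thread, at an explicitly linear cost.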

One more comment:

If by “indexed insertion” you mean an analogy to your “indexed removal”, i.e. for example [1,3,4,5] → [1,2,3,4,5], then that’s a misunderstanding. List does not provide that. It provides “indexed access”, read or write, where write means “overwrite”.

I’d warn that this is NOT obvious to anyone using the function. 90% of the devs coming to our ecosystem have never had to think about performance or cycles or complexity. Their code mostly runs on unbounded super chips in people’s browsers and chews up (according to my task manager) sometimes GBs of memory to talk to an API that downloads funny internet memes.

If we are designing something called “base”, we need to be super careful and perhaps do a bit of hand holding. Things like “pure” are super unhelpful as well.

I don’t know that we need to go to the extreme of calling things EasyButExpensiveMap and FastButBigMap, but if we are throwing stuff into one big “base” library, we need to do something.

At the other extreme, if we only put the “production” level libraries in base, then people bounce from the language when they can’t find the .delete(idx) function and have no explanation for why it is missing.

Balancing the computer science problems with the developer attraction, retention, and support problem needs specific and elevated attention.

(This is all further complicated by the “make it obvious for AIs while also making it impossible for AIs to do stupid things” problem, which is rapidly rocketing to the top of the priority list.)

Thanks again for all the feedback! We’ve had several internal conversations to find ways to improve the API and/or documentation in response to these suggestions.

The Motoko team is evaluating several options for rolling out the new base library in a way that allows previous Mops / Vessel packages to continue working as before:

  1. The first possibility is to allow different package versions for dependencies, similar to npm. For instance, if you use the llm Mops package and specify base = "1.0.0" in your mops.toml file, your project will use the new base library (1.0.0) while the dependency continues to use the previous version (0.14.9). In general, this would make it simpler for library developers to release breaking changes under the same package name.

  2. Another option is to allow explicitly stating the version in the import (similar to Deno or Go), such as import LLM "mo:llm/v2". This would enable using more than one version of a package in the same project.

  3. A third option is that we could make a push to update the entire ecosystem to use the new base library before the release, per internal discussions.

We are currently proceeding with (1.) and would greatly appreciate any feedback or suggestions so we can make this update as smooth as possible for the community.

I’m team v1.0.0 (extra words for min)

Currently base and moc are bundled together with dfx downloads. I know that the SDK team is doing some work to make it easier to split out dfx from moc, the replica, etc.

Are there any plans to make it easy to separate the Motoko components (moc from the base library that’s included with installs), or is the expectation that people manage that separation through mops.toml?

Currently the plan is to keep this the same, i.e. specifying a custom base library version in the mops.toml file. Do you have a use case where a different approach would make things simpler?

Ok that works.

In that case option 1 makes sense to me.
