If it is not feasible to implement some functionality in Motoko, is it possible to create a Motoko interface that can call an existing library?

I would like to implement a machine learning package for Motoko. This would be handy when creating canisters that interact with other canisters (e.g. a trading bot) or that quickly preprocess data retrieved by oracles by means of future HTTP requests.

Most machine learning methods use linear algebra routines, for instance to multiply, decompose or invert matrices. Linear algebra functions in high-level languages like Python usually call highly optimized, well-known packages like LAPACK, which offers high-quality and very well tested subroutines for linear algebra.

Reimplementing a linear algebra library in Motoko is out of the question due to implementation and maintenance complexity.

In cases like this one, where it is not feasible to implement a functionality in Motoko, is there a way to create a Motoko interface that can call an existing library?

I don’t believe so but it looks like there has been some discussion about this.

@claudio might be able to say more.

A way to achieve something similar today would be to compile an existing package to Wasm and expose it as a canister.

Here’s an example of doing that in C: examples/c/qr at 6f9478f1976d76d93a35ab141437648f715b44d8 · dfinity/examples · GitHub

What’s missing there is a .did file that would allow other canisters to know how to call into it. See What is Candid? :: Internet Computer

@paulyoung, thank you very much for the reply. I don’t think exposing a full LAPACK library as a canister is a good approach in my case. The main motivation for creating a linear algebra library in Motoko is precisely to simplify the development of machine learning libraries in Motoko, so having to run an additional canister just to use a Motoko library seems to go against that goal.

I really think that, for Motoko to become a mainstream language for implementing sophisticated contracts, it will need a simple way to interface with external libraries compiled to Wasm.

For the time being, I have no option other than moving my project to Rust and using the many ML/AI libraries available there.

@paulyoung @claudio, Can I use any Rust library if I develop my canister in Rust or are there any restrictions?
I’m asking because @skilesare says that:

I don’t understand what exactly this limitation is. Could you explain to what extent I can reuse Rust libraries?

I think this comment sums it up pretty well: any limitation in using rust to develop ic? · Issue #123 · dfinity/cdk-rs · GitHub

Why is it that I can only do “small bits of work” in my contract when leveraging existing libraries?

In my experience you aren’t limited to “small bits of work”.

The comment I linked to above by @AdamS says:

Rust is a great language to develop on the IC, and many of our official canisters like certified-assets and cycles-wallet are written in Rust. The limitations are those of any other WebAssembly target; std is available, but most functions available in std but not in alloc other than threading primitives will panic/error/generally do nothing useful, such as std::fs functions. This should allow you to compile against most any crate, but any code path that actually attempts to, for example, read the file-system, will unconditionally error. Shouldn’t be a problem for any library crates that don’t interact with I/O at all.
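As a rough illustration of the kind of code that quote describes as unproblematic, here is a minimal Rust sketch of a pure computation that takes its data as arguments instead of reading it from a file. The function name `mat_vec` is made up for this example and does not come from any crate mentioned in the thread; it relies only on slices and `Vec`, so nothing in it depends on file-system or threading support.

```rust
// Hypothetical helper for illustration only (not from an existing crate).
// It uses just slices and Vec, so it never touches the file system,
// threads, or other OS services that are unavailable on a Wasm target.
fn mat_vec(a: &[Vec<f64>], x: &[f64]) -> Option<Vec<f64>> {
    // Reject rows whose length does not match the vector instead of panicking.
    if a.iter().any(|row| row.len() != x.len()) {
        return None;
    }
    Some(
        a.iter()
            .map(|row| row.iter().zip(x).map(|(aij, xj)| aij * xj).sum::<f64>())
            .collect(),
    )
}

fn main() {
    let a = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    // The caller supplies the data; the function performs no I/O itself.
    println!("{:?}", mat_vec(&a, &[1.0, 1.0]));
}
```

A library built from functions like this one should fall under the "crates that don't interact with I/O" case, although that is an inference from the quote rather than something verified against a specific crate.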

To clarify, there are currently some limitations such as the amount of work that can be done in a single block but these apply to the IC in general and aren’t specific to using Rust.

I am sorry, this is really confusing for a beginner like me. Is this paragraph saying that I can basically use any Rust library as long as it does not attempt to use the file system? Is there any other constraint?

Where can I find more information about the “limitation of work that a canister can do in a single block”?

This has some good information: Deterministic Time Slicing

Looks like this is one more reason why I need a way to call a library function instead of a canister method. It confirms the need for a “simple way to interface external libraries compiled in WASM” instead of calling a canister.

Concerning my previous question about @AdamS’s comment: is he saying that I can basically use any Rust library as long as it does not attempt to use the file system? Is there any other constraint?

The file system is one example. I don’t know if there’s a comprehensive list of what does/doesn’t work but it seems there may be an opportunity to create a resource that does so.

Below is another limitation, at least for now. I thought this might interest you as well.

I’m sorry, I don’t understand what you are referring to when you write “resource” and “does so”.

I meant some documentation that people like you and I could refer to when we’re wondering what we can and can’t use 🙂

Some sort of static analysis tool that informs people of unsupported API usage would be great as well.

There is also a limitation with individual message size being capped at 2MB.

This means that it’s possible to create a Wasm file that is too large to be deployed using traditional methods.

I think @skilesare has a workaround for that scenario.

You are still constrained by the inter-canister message size (I think 3MB), but using a canister you can append a couple of chunks and get up to that limit. The ledger canister is just over 2MB, so you have to use this method.

If, 10 years in the future, Motoko is a dominant programming paradigm in the blockchain space, do you still feel this way?

I’d argue that it is imperative for these libraries to exist in an async-bounded work cycle framework. Even if time slicing solves long-running processes, I don’t think it solves blocking, and we will be right back at square one where we need chunkable computation for scalability. (I’d be thrilled to be wrong here, and hopefully time slicing doesn’t block.)

We have the funds and are growing the community to create these libraries. If you write up a spec for what you need and provide sample libraries that can be easily ported to Motoko, then we can write up an ICDevs bounty to try to get the work done.

Ultimately we likely need a few Manhattan Project-style efforts (without the mass destruction) to brute-force some Motoko libraries. RegEx, math libraries, templating libraries, workflow libraries, and media libraries all come to mind.

I think you are right. Actually, JavaScript seems to have native linear algebra libraries, so their authors probably came to the same conclusion. For instance, “lalolib.js” and “numericjs” implement singular value decomposition and other solvers natively in JS. Maybe these JS libraries could be taken as a reference.

I think a possible implementation strategy could be to gradually develop three packages: 1) core linear algebra and math tools, 2) a machine learning library implementing a set of simple methods, and 3) a few simple canister examples leveraging (2). Depending on which machine learning methods we decide to implement in (2), we would prioritize a few core functionalities in (1). So it is important to pick a useful yet simple ML method to start package (2) with. Good candidates are filtering methods like recursive least squares, multi-armed bandits and Kalman filters. Filtering methods are interesting because they are adaptive: they do not need a separate training phase, and therefore no offline training data. This functionality could be handy for the many “always on” forms of bots, oracles and other data processing engines.
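To give a feel for how small such a filtering method can be, here is a minimal sketch of a one-dimensional Kalman filter. It is written in Rust only because that is where the project is heading for now; the same few arithmetic steps should port to Motoko without any linear algebra library. The type `ScalarKalman`, its field names and the noise values are assumptions chosen for this example, not part of any existing package.

```rust
/// One-dimensional Kalman filter for a slowly drifting scalar value.
/// All names are illustrative and not taken from an existing crate.
struct ScalarKalman {
    estimate: f64,      // current state estimate
    variance: f64,      // uncertainty of the estimate
    process_noise: f64, // assumed drift of the true value per step
    measure_noise: f64, // assumed variance of each measurement
}

impl ScalarKalman {
    fn new(estimate: f64, variance: f64, process_noise: f64, measure_noise: f64) -> Self {
        ScalarKalman { estimate, variance, process_noise, measure_noise }
    }

    /// Fold one new measurement into the state and return the updated estimate.
    fn update(&mut self, measurement: f64) -> f64 {
        // Predict: the value may have drifted, so the uncertainty grows.
        self.variance += self.process_noise;
        // Update: blend prediction and measurement using the Kalman gain.
        let gain = self.variance / (self.variance + self.measure_noise);
        self.estimate += gain * (measurement - self.estimate);
        self.variance *= 1.0 - gain;
        self.estimate
    }
}

fn main() {
    // Made-up readings standing in for values delivered by an oracle.
    let mut filter = ScalarKalman::new(0.0, 1.0, 1e-4, 0.1);
    for &reading in &[10.2, 9.9, 10.1, 10.0, 9.8] {
        println!("estimate = {:.3}", filter.update(reading));
    }
}
```

Each call to `update` does a fixed, small amount of work and keeps only two numbers of state, which is why this family of methods fits an "always on" canister that receives one observation at a time.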

@skilesare I would like to investigate further the complexities of developing such libraries. I am not familiar with the implications of developing such a library in an “async-bounded work cycle framework”. Could you point me to where I can learn more about this specific difficulty when developing a Motoko library?

We could have a lengthy discussion at some point, but the highlights are that you only have so many instructions that you can use. An error occurs once you run out. So if you are running a long process (like updating an index on a collection), you may have to ‘chunk’ the process and step through the data set over a number of consensus rounds. Time slicing may fix this, but then I think the canister will be blocked until it finishes.

So ideally, you want to find some number of operations that takes up less than half the “block” and execute your long-running calculation over a number of blocks. In the index example, if you have 10,000 blog entries and processing 2,500 entries takes about 1/4 of the block, then you’d call the process 4 times. Unfortunately, Motoko can’t see the current balance of remaining cycles, or this would be much easier. I’ve had to just use trial and error in the past to find a good value.
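The chunking pattern from the index example could be sketched roughly like this. It is plain Rust with no IC-specific calls: `ChunkedIndexer`, `step` and `run_to_completion` are hypothetical names, the "index" is just a word count per entry, and the loop in `run_to_completion` stands in for what would be separate calls spread over several consensus rounds in a real canister.

```rust
/// Long-running job that is advanced one bounded chunk at a time.
/// Hypothetical sketch: in a canister, the state would live in canister
/// memory and each `step` would be triggered by its own message.
struct ChunkedIndexer {
    entries: Vec<String>,
    cursor: usize,           // index of the next entry to process
    chunk_size: usize,       // tuned by trial and error, as described above
    word_counts: Vec<usize>, // stand-in for the index being built
}

impl ChunkedIndexer {
    fn new(entries: Vec<String>, chunk_size: usize) -> Self {
        ChunkedIndexer {
            entries,
            cursor: 0,
            chunk_size: chunk_size.max(1), // avoid a chunk that makes no progress
            word_counts: Vec::new(),
        }
    }

    /// Process at most `chunk_size` entries; return true once everything is done.
    fn step(&mut self) -> bool {
        let end = (self.cursor + self.chunk_size).min(self.entries.len());
        for entry in &self.entries[self.cursor..end] {
            self.word_counts.push(entry.split_whitespace().count());
        }
        self.cursor = end;
        self.cursor == self.entries.len()
    }
}

/// Drive the job to completion and report how many calls it took.
fn run_to_completion(job: &mut ChunkedIndexer) -> usize {
    let mut calls = 1;
    while !job.step() {
        calls += 1;
    }
    calls
}

fn main() {
    let entries: Vec<String> = (0..10_000).map(|i| format!("blog entry {}", i)).collect();
    let mut job = ChunkedIndexer::new(entries, 2_500);
    println!("finished in {} calls", run_to_completion(&mut job));
}
```

With 10,000 entries and a chunk size of 2,500 the job finishes in 4 calls, matching the numbers in the example. Keeping the cursor in the job state is what lets other messages be handled between chunks.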