Long Term R&D: Motoko (proposal)

diegop · December 7, 2021, 3:39am

1. Summary

This is a motion proposal for the long-term R&D of the DFINITY Foundation, as part of the follow-up to this post: Motion Proposals on Long Term R&D Plans (Please read this post for context).

This project’s objective

Extend and improve the native language of the IC, Motoko, which includes: better IDE integration, package manager, logging and monitoring support, improved scalability of garbage collection and upgrades, and maintaining parity with new IC features such as heartbeat messages.

2. Discussion lead

Claudio Russo

3. How this R&D proposal is different from previous types

Previous motion proposals have revolved around specific features and tended to have clear, finite goals that are delivered and completed. They tended to be measured in days, weeks, or months.

These motion proposals are different and are defining the long-term plan that the foundation will use, e.g., for hiring and organizational build-out. They have the following traits and patterns:

Their scope is years, not weeks or months as in previous NNS motions
They have a broad direction but are active areas of R&D so they do not have an obvious line of execution.
They involve deep research in cryptography, networking, distributed systems, language, virtual machines, operating systems.
They are meant to match the strengths of where the DFINITY foundation’s expertise is best suited.
Work on these proposals will not start immediately.
There will be many follow-up discussions and proposals on each topic when work is underway and smaller milestones and tasks get defined.

An example may be the R&D for “Scalability” where there will be a team investigating and improving the scalability of the IC at various stages. Different bottlenecks will surface and different goals will be met.

3. How this R&D proposal is similar to what we have seen

We want to double down on the behaviors we think have worked well. These include:

Publicly identifying owners of subject areas to engage and discuss their thinking with the community
Providing periodic updates to the community as things evolve, milestones reached, proposals are needed, etc…
Presenting more and more R&D thinking early and openly.

This has worked well for the last 6 months so we want to repeat this pattern.

4. Next Steps

Developer forum intro posted
1-pager from the discussion lead posted
NNS Motion proposal submitted

5. What we are asking the community

Ask questions
Read 1-pager
Give feedback
Vote on the motion proposal

Frankly, we do not expect many nitty-gritty details because these are meant to address projects that go on for long time horizons.

The DFINITY foundation’s only goal is to improve the adoption of the IC so we want to sanity-check the projects we see necessary for growing the IC by having you (the ICP community) tell us what you all think of these active R&D threads we have.

6. What this means for the existing Roadmap or Projects

In terms of the current roadmap and proposals executed, those are still being worked on and have priority.

An intellectually honest way to look at this long-term R&D project is to see them as the upstream or “primordial soup” from which more baked projects emerge from. With this lens, these proposals are akin to asking, “what kind of specialties or strengths do we want to make sure DFINITY foundation has built up?”

Most (if not all) projects that the DFINITY foundation has executed or is executing are borne from long-running R&D threads. Even when community feedback tells the foundation, “we need X” or “Y does not work”, it is typically the team with the most relevant R&D area that picks up the short-term feature or project.

diegop · December 7, 2021, 4:46am

Please note:

Some folks gave asked if they should vote to “reject” any of the Long Term R&D projects as a way to signal prioritization. The answer is simple: “No, please, ACCEPT”

These long-term R&D projects are the DFINITY’s foundation’s thesis at R&D threads it should have across years (3 years is the number we sometimes use internally). We are asking the community to ACCEPT (pending 1-pager and more community feedback of course). Prioritization can come at a separate step.

claudio · December 7, 2021, 4:40pm

Motoko (1-pager)

1. Objective

The objective of this proposal is to gain community approval on DFINITY’s plans to extend and maintain Motoko as an attractive and feasible option for canister smart contract development on the IC.

2. Background

Motoko was conceived at a time when language options for development on the IC were few and had very sharp edges. While launching a new language is fraught with risk, it was felt that the available languages and compilers targeting WebAssembly were either too unsafe or too impractical, forgoing memory safety or requiring manual construction of asynchronous code. The situation has improved and Rust is now a viable option, but Motoko arguably still provides the least friction and most safety when authoring IC canisters.

Nevertheless, the challenges of supporting a new language and ecosystem are considerable, requiring heavy investment in:

Compiler technology
Runtime system (e.g. robust and scalable memory management, serialization etc.)
Libraries
Tooling (e.g. IDE support, debuggers, profilers, you name ‘em)
Documentation

3. Why this is important

The adoption of any new technology, including the IC, requires an easy on-ramp. Technology that is hard to use will struggle to thrive.

Motoko, with its relative ease of use and static enforcement of IC constraints provides a gentle introduction to the IC. However, in order to scale to industrial applications, Motoko and its toolchain will require significant research and development to deliver increased performance, scalability and productivity.

4. Topics under this project

Development of Motoko and its toolchain is a broad endeavour, covering both research and engineering. Our roadmap includes some of the commitments listed below.

Our ability to adhere to these mid- and long-term commitments will depend on resourcing and other developments. These commitment include:

Improved scalability of the garbage collector, most likely by adopting an incremental, generational collector.
Gaining and maintaining parity with newer IC features such as heartbeat messages and the new 128bit cycles API.
Improved scalability of stable variable serialization including reduced memory consumption and improved robustness (e.g. to stack overflow).
Opt-in, direct access to stable memory for scenarios unsuited to stable variables.
Language enhancements, such as better abstraction facilities for asynchronous code, generics with shared type, error propagation syntax, object composition/modification syntax, and other quality-of-life improvements.
Tooling improvements: more flexible package management, enhanced IDE support.

Improved logging and debugging support when and as supported by the IC.

Library development, expanding existing libraries (e.g. Iter.mo, Char.mo) and adding new ones (e.g. Time.mo)

5. Key milestones

Scalability: Provide access to most of 4GB Wasm memory for heap and large stable variable data, without compromising ability to upgrade.
Language: Support efficient and safe abstraction of asynchronous code.
Parity: Expose all (reasonable) IC System APIs in suitable safe and natural abstractions.
Ecosystem: Offer a wide selection of commonly requested libraries.

6. Discussion Leads

Andreas Rossberg and Claudio Russo

7. Why the DFINITY Foundation should make this a long-running R&D project

Motoko showcases the facilities of the IC and Candid in a coherent, seamless design unobscured by the library or macro-based encodings required to integrate third party languages.

Although not the only choice of language, most external projects to date have chosen Motoko as their development language. The Foundation has an interest and a responsibility to maintain and grow the value of these external investments.

8. Skills and Expertise necessary to accomplish this

Type theory, concurrency theory, language design and implementation (compilation), memory management/garbage collection, documentation, library design.

9. Open Research questions

The cost model of the IC is more abstract than typical hardware. Traditional compilers optimize for speed, or, occasionally code and memory size. Optimizing

for IC cycle consumption and Wasm page use requires atypical design choices.

The IC is a distributed computing system using asynchronous communication. Motoko’s actor model addresses some of the challenges of programming such a system, but still lacks good support for ensuring concurrency correctness, abstracting over asynchronous code, distributed transactions and other forms of actor coordination all of which are open research questions.
The IC currently provides a traditional form of access control based on caller identity, a simple form of stack inspection. This approach has known limitations which others have proposed are better addressed by capability based systems. Adding capabilities based security to both the IC and Motoko is an open, long-term research problem.
Multi-canister upgrade. Motoko supports actor classes, allowing programmatic installation of networks of actors, but (like the IC in general) has no good support for how to programmatically and safely upgrade installed instances.

10. Examples where community can integrate into project

Motoko and its tooling are developed in the open. We welcome and encourage feedback, suggestions and contributions from the community, particularly in the development of libraries which are sorely needed. Notable examples include a fully-fledged Time library and various cryptographic libraries normally expected on blockchain platforms.

Similarly, documentation is an open-ended project, and contributions from the community, e.g., tutorials, could help to improve the discoverability and understandability of Motoko features and its integration with the platform.

11. What we are asking the community

Review comments, ask questions, give feedback
Vote accept or reject on NNS Motion

coin_master · December 7, 2021, 6:43pm

The language development looks great already and I am excited for the roadmap.
I would argue that the most pressing issue now for onboarding new developers is the IDE support, i.e. code completion and goto functionality, etc…
Without IDE support currently it feels like working in the dark.

I also like to see in the roadmap a better and more descriptive error messages.

levi · December 7, 2021, 7:40pm

What is capabilities based security? and how does it differ from what we have now?

claudio · December 8, 2021, 8:56am

claudio · December 8, 2021, 9:02am

I assume you’ve tried the VSCode plugin and found it wanting? It does support completion on library references and “goto definition”, but not “goto references”.

It certainly needs more work. We are recruiting for someone to work in that area and would also welcome community contributions.

jzxchiang · December 12, 2021, 5:51am

Thanks, the roadmap looks great.

better abstraction facilities for asynchronous code

I’m guessing this also includes being able to define an actor method in a different file from the actor declaration.

Also, if it’d be possible to have an actor method call another actor method as a helper function but without async, which changes the atomicity semantics as well as (possibly) the latency.

For example,

actor {
    public func foo(): async () {
        let tmp = bar();
    };

    public func bar(): async Nat {
        5;
    };
};

Also, it’d be helpful to have more guidelines on error handling. Option types, throw, trap… it’s not always that clear to me which to use, even after reading this.

coin_master · December 12, 2021, 7:43am

Yes I am using this extension, I will check again the completion on library references part, I don’t recall it was there last time I checked.

jzxchiang · December 13, 2021, 12:11am

I assume you’ve tried the VSCode plugin and found it wanting?

I think a very useful plugin feature would be to show the inferred Motoko type of a variable when hovering over it with your cursor, like for TypeScript.

Also, “Go to Definition” for variables as well as functions defined in the same file as the call site. (It only works for functions defined in a separate module / file currently.)

jzxchiang · December 13, 2021, 12:20am

Another random one: inferring generic function argument types better. I noticed I need to constantly provide generic definitions to Trie functions, e.g. Trie.find<..., ...> even though it should know what the types are based on the Trie definition.

claudio · December 14, 2021, 11:09am

Thanks for the suggestions and the encouragement.

Regarding async abstraction.

We were primarily thinking of avoiding the extra yield in examples like:

private func flip() : lazy Bool  = lazy { 
  let blob = await Random.rand();
  blob[0] & 1 == 1 
};

shared func oldCoin() : async Bool {
    await flip();  // does two real awaits.
}

We deliberately don’t want to distinguish internal from external sends, but allowing one to abstract async code into a local function sans penalty would hopefully get you closer to where you want to be.

The problem is that we chose to make every await signal a definite commit point and it’s not clear how to provide a mechanism that is flexible but doesn’t provide concurrency footguns.

+1 on error handling guidlines - many don’t realize that throw doesn’t rollback state and that traps from inner calls are propagated as rejects. There is some documentation Errors and Options :: Internet Computer but not enough detail.

claudio · December 14, 2021, 11:10am

I’d be interested to see a concrete example.

For example,

        let t : Trie.Trie<Nat32,Text> = Trie.empty/*<Nat32,Text>*/();
        let k : Trie.Key<Nat32> = {hash = 1; key = 1};
        let o = Trie.find/*<Nat32,Text>*/(t, k, Nat32.equal);

seems to work, provided you provide an annotation to resolve the overloaded numeric literals in the key value. I’ve inserted the missing, inferred annotations.

claudio · December 14, 2021, 11:13am

Both of those should not be out of reach, given resources.

claudio · December 14, 2021, 11:32am

It seems to work for me in VSCode (after a saving the file sans errors). Both on identifiers imported from base and local libraries. Right click the identifier and select goto definition.

This only works on references to library members, not local definitions.

jzxchiang · December 15, 2021, 2:22am

Wait, if await flip() now does two real awaits, doesn’t that compound the problem? I thought the goal was to make await flip() do zero real awaits. Not sure I fully understand the example.

I think we may be talking about two different things.

My original example was for a shared (async) function to call another shared (async) function in the same canister without incurring delay from an actual yield. Basically, turning an async function into a sync function.

I think your example is the opposite: turning a sync function into an async function. I’m not really sure the benefit of that, besides being able to throw errors in private functions.

that traps from inner calls are propagated as rejects.

Oh wow, I didn’t know that. I’m guessing that applies to any “lambda” (like those passed to Array.map) that traps. Is the difference between a trap and a reject ever significant? I thought assert traps were a type of rejection (which still rollback state, unlike throws).

jzxchiang · December 15, 2021, 2:26am

I haven’t tried on the most recent Motoko version but for example…

let commentsArr = Iter.toArray(Trie.iter(comments));

Array.mapFilter<(Types.CommentId, Types.CommentInfo), Types.PublicCommentInfo>(
  commentsArr,
  func((commentId, commentInfo)) {
  ...

If I try to get rid of the generic <> for Array.mapFilter, the compiler complains that it can’t infer the type of variable commentId.

But it should be able to deduce it, given that it should know the type of commentsArr. This is where showing the type of an identifier on cursor hover would be useful IMO.

rossberg · December 15, 2021, 8:25am

@jzxchiang, Motoko uses an approach called bidirectional type checking. That means, roughly speaking, that type information can either propagate inward-out or outward-in, but it cannot usually flow “sideways”.

It is of course possible to have more general type inference, but in the presence of subtyping, that get’s complicated and potentially surprising very quickly. In particular, it’s not always immediately obvious what the “best” type for something is. In your example, e.g. the result type of the argument function, and thereby the element type of the produced output array.

There has been recent research to extend something like ML-style full type inference to subtyping, but it isn’t quite battle-tested, type errors become more challenging for users, and it’s not directly compatible with a few other features that the Motoko type system has.

rossberg · December 15, 2021, 1:59pm

Yeah, we should probably try to optimise tail awaits if possible, like in the example you gave (assuming you meant async where it said lazy). Or other transformations that do not observably change the actor’s behaviour.

Where I would draw the line is with introducing alternative forms of disguised async. I think that would be premature optimisation in the design of the language, of the kind that causes more harm than good.

Programmers must learn that async is expensive. It’s as simple as that. Consequently, avoid infecting functions with async except where absolutely necessary. Always separate the async communication layer from compute-intensive “business” logic.

Providing fuzzy features that make await cheap in some executions but not in others just makes the cost model more opaque overall, and is counterproductive in the grand scheme of things, because it introduces unexpected performance cliffs. I also bet that many devs will be very confused by a subtle distinction like lazy vs async.

And that’s just performance. What’s even worse is that atomicity, commits and rollbacks would become similarly unpredictable. They’re already is tricky enough to program correctly without further fuzziness.

FWIW, when I think of better abstraction facilities for async, I have in mind enabling more powerful composition of futures and future-based combinators. If those can be optimised, even better, but it should remain semantically transparent.

rossberg · December 15, 2021, 2:05pm

No, that’s a different problem. The only way that would be possible – without losing scope-based encapsulation – would be some form of mixin composition on actors. But that’s not an easy feature to add.

Most OO languages, e.g. Java, work fine without the ability to put methods elsewhere. In Motoko, you can at least write auxiliary modules.

Topic		Replies	Views
The future of Motoko and concerns Developers	11	2756	March 29, 2023
Long Term R&D: SDK (proposal) Roadmap	23	3253	May 4, 2022
#ask How should we improve Motoko? Developers	69	3814	August 28, 2023
Motion Proposals on Long Term R&D Plans Roadmap	17	8122	June 10, 2022
12 Holhos - An ICDevs Fundraiser DFINITY	18	1621	November 24, 2022