A small side note about Motoko GC (or GC in general): in theory, a GC may happen in any message that performs an allocation. For example, one update message may allocate many objects but not reach the threshold for triggering GC. Then the next update message may allocate a single object, go over the threshold, and trigger GC.
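To make that concrete, here is a minimal illustrative Motoko sketch; the method names and loop bound are mine, and the actual GC threshold depends on the runtime:

```motoko
import Buffer "mo:base/Buffer";
import Iter "mo:base/Iter";

actor {
  let buf = Buffer.Buffer<Nat>(0);

  // Allocates many objects, but may still stay below the GC threshold,
  // so no GC runs during this message.
  public func growALot() : async () {
    for (i in Iter.range(0, 999_999)) { buf.add(i) };
  };

  // A later message making a single small allocation can be the one
  // that crosses the threshold and triggers GC.
  public func growALittle() : async () {
    buf.add(0);
  };
};
```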
For reference, in both my current case (and those of many Motoko developers), I think many of us would like to see a 4-10x increase on this limit in order to work around some of the current GC limitations and use as much of the full heap as possible.
Thanks, that matches our experiments with Motoko GC. The plan is to first get some production coverage with the 2x limit. If everything is okay, then we will gradually increase the limit to ca. 30B-50B instructions (6x-10x of the current limit).
The 2x increase is being rolled out to all subnets this week, btw.
The install_code DTS has already been rolled out to all subnets without issues.
Could we discuss more why we wouldn't want DTS for queries? For example, imagine a database serving queries. Those could be complex and take a little while. I hit the instruction limit various times when developing Sudograph Beta (with very unoptimized queries, but still).
If there are use cases and user demand for slow queries, then we can add DTS for queries. I wasn't aware of such use cases and my intuition was that queries are short-running.
If we decide to add DTS for queries, then we probably also need some way of charging for queries because long-running execution will consume CPU/memory resources.
I think it's reasonable that complex database queries will run into the cycle limits relatively easily, but I don't have hard data on an optimized DB solution. If I had to make my best guess, though, I imagine this will be the case.
I agree that it would be nice to charge for queries before we allow long-running queries, but it is a bit odd if we don't immediately allow long queries, because then some messages would work in replicated mode but not in query mode, right?
it is a bit odd if we don't immediately allow long queries, because then some messages would work in replicated mode but not in query mode, right?
No, it is consistent. A query method has the same non-DTS instruction limit of 5B instructions in both replicated and non-replicated mode. In other words, running a query as an update doesn't activate DTS for it.
Maybe I misunderstood you. I guess you meant that if we take the same function foo() and put it inside a query method, then it will run out of instructions, but if we put it inside an update method, then it might succeed? If so, then yes, there is an inconsistency.
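A minimal Motoko sketch of that inconsistency; the busy-loop is just a stand-in for any instruction-heavy foo():

```motoko
actor {
  // Stand-in for some instruction-heavy computation.
  func foo() : Nat {
    var acc : Nat = 0;
    var i : Nat = 0;
    while (i < 1_000_000_000) { acc += i; i += 1 };
    acc
  };

  // Queries have no DTS: this can trap with out-of-instructions
  // at the 5B non-DTS limit.
  public query func fooQuery() : async Nat { foo() };

  // Updates get the higher, sliced DTS limit: the same work may succeed.
  public func fooUpdate() : async Nat { foo() };
};
```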
Yes, we are currently at 2x limit. DTS looks good so far in production, so I think we could go to 6x relatively quickly: in a couple of replica versions.
There is one non-technical issue that we discovered with Motoko that needs to be resolved before we go to 6x.
The issue is out-of-memory handling in Motoko. Currently, the low instruction limit for updates acts as a safeguard against Motoko canisters hitting the hard 4GB memory limit: when the memory usage of a Motoko canister grows to 1-2GB, update messages start failing with out-of-instructions errors. At that point, upgrading the canister is still possible (because upgrade messages have a higher instruction limit), so the owner of the canister can salvage the canister and its data by upgrading it to a new version that uses less memory.
With the 6x DTS, the canister will be able to grow to 4GB with update messages. Once the canister reaches 4GB and updates start failing due to out-of-memory, then upgrades will also fail. This means that the canister becomes stuck without any fix.
I have an idea to solve this problem by introducing a "freezing threshold" for memory. It would be a canister settings parameter with a default value of 3GB. When the canister reaches that limit, updates start failing, but upgrades continue to work. The owner of the canister would be able to increase or decrease the parameter.
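A sketch of what that could look like from the owner's side, assuming a hypothetical memory_freezing_threshold field in the management canister's settings (this field does not exist today; it only illustrates the proposal):

```motoko
actor Owner {
  // Hypothetical interface: the real management canister settings do
  // NOT have this field; it is purely illustrative.
  let IC = actor ("aaaaa-aa") : actor {
    update_settings : shared {
      canister_id : Principal;
      settings : { memory_freezing_threshold : ?Nat };
    } -> async ();
  };

  // Raise or lower the memory threshold (in bytes) for a canister.
  public func setMemoryFreezingThreshold(canister : Principal, bytes : Nat) : async () {
    await IC.update_settings({
      canister_id = canister;
      settings = { memory_freezing_threshold = ?bytes };
    });
  };
};
```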
This is awesome @ulan thanks for the update - can't wait to test it out!
Just to be clear, this memory limit is because of overflowing the heap/main memory, and is different from the upgrades failing due to upgrade cycles limitations, correct? I believe streaming serialization was implemented (see @claudio's comment in the link) that allows very close to (slightly less than) 4GB of heap memory to be serialized to stable memory during upgrades.
Also, with respect to the "freezing threshold" idea:
I think it would be great if this were a system func "hook" that a canister developer could tie into and trigger an action once this threshold is hit.
With CanDB I'm doing something similar (though not implemented at the language/system level, obviously). I currently have two fixed limits that are lower than the 4GB heap limit. These limits are:
An INSERT_THRESHOLD, after which no new items can be inserted
An UPDATE_THRESHOLD, after which no new items can be modified (i.e., prevents a user from appending to the attribute metadata associated with a record)
I use these thresholds to both trigger auto-scaling actions as well as to permit or reject additional inserts/updates to the CanDB data store in a canister.
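A minimal sketch of that pattern, with illustrative constants and names (the actual CanDB values and API differ):

```motoko
import Prim "mo:⛔";

actor {
  // Illustrative thresholds in bytes; CanDB's actual values differ.
  let INSERT_THRESHOLD : Nat = 1_000_000_000;
  let UPDATE_THRESHOLD : Nat = 1_500_000_000;

  public func insert(_key : Nat, _value : Nat) : async Bool {
    if (Prim.rts_heap_size() > INSERT_THRESHOLD) {
      // Reject the insert; an auto-scaling action could be kicked off here.
      return false;
    };
    // ... perform the insert ...
    true
  };

  public func updateItem(_key : Nat, _value : Nat) : async Bool {
    if (Prim.rts_heap_size() > UPDATE_THRESHOLD) { return false };
    // ... modify the existing item ...
    true
  };
};
```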
Based on some initial tests, I found that 2X DTS was able to push heap memory to roughly 2X its previous limit before the GC hit the instruction limit. Big improvement - and looking forward to 6X DTS.
Reference: for these tests, I'm just inserting into an RBTree<Nat, Nat>.
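For reference, a sketch of what such a test loop might look like (batch size and names are mine, not the original test code):

```motoko
import RBTree "mo:base/RBTree";
import Nat "mo:base/Nat";

actor {
  let tree = RBTree.RBTree<Nat, Nat>(Nat.compare);
  var next : Nat = 0;

  // Insert `n` entries per update call and keep growing the heap
  // until the GC instruction limit (or memory) gives out.
  public func insertBatch(n : Nat) : async () {
    var i = 0;
    while (i < n) {
      tree.put(next, next);
      next += 1;
      i += 1;
    };
  };
};
```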
@ulan I think with large blob data and 2X DTS we still might be able to push canisters to grow to 4GB anyway (for example, by inserting 1.5-1.9MB chunks), so this 4GB update issue will still exist.
Yes, exactly. It is about the memory limit. Streaming serialization allocates a small buffer, which may also fail if the update calls use all the available 4GB memory.
I think it would be great if this were a system func "hook" that a canister developer could tie into and trigger an action once this threshold is hit.
That would be some kind of memory pressure callback? I.e., a user-defined function that is called by the system when the canister's Wasm memory reaches a user-defined threshold. I like the idea. Perhaps it could be generalized to canister lifecycle callbacks/notifications: low cycles notification, low memory notification, execution failure notification, etc.
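Until something like that exists at the system level, a user-land approximation is possible by polling in the heartbeat; a minimal sketch, with illustrative threshold and names:

```motoko
import Prim "mo:⛔";

actor {
  // Illustrative threshold (~3GB); pick a value suited to your canister.
  let MEMORY_THRESHOLD : Nat = 3_000_000_000;
  var notified = false;

  // User-defined reaction to memory pressure: reject writes,
  // trigger auto-scaling, alert an operator, etc.
  func onMemoryPressure() {
    // ...
  };

  system func heartbeat() : async () {
    if (not notified and Prim.rts_memory_size() > MEMORY_THRESHOLD) {
      notified := true;
      onMemoryPressure();
    };
  };
};
```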
With CanDB I'm doing something similar (though not implemented at the language/system level, obviously). I currently have two fixed limits that are lower than the 4GB heap limit.
You're way ahead of many developers who don't think about potential out-of-memory issues.
Based on some initial tests, I found that 2X DTS was able to push heap memory to roughly 2X its previous limit before the GC hit the instruction limit. Big improvement - and looking forward to 6X DTS.
Thanks for running the test! It's great to see DTS helping.
I think with large blob data and 2X DTS we still might be able to push canisters to grow to 4GB anyway (for example, by inserting 1.5-1.9MB chunks), so this 4GB update issue will still exist.
Small update: we increased the limit to 20B instructions. It will be rolled out to all subnets next week. Further increases are blocked by the memory freezing threshold feature.
@icme: I learned today that Motoko can sometimes perform GC in a heartbeat. I previously thought that Motoko always called the user function as a separate update message that does GC if needed, but my understanding was incorrect. If your canister has a heartbeat, then it might fail with out-of-instructions.
I'll try to implement DTS for heartbeats as soon as possible.
In the meantime, the Motoko team will try to prepare a release that avoids GC during heartbeats, which users can opt into by installing from the GitHub release page (without waiting for a release of dfx).
UPDATE: a release for manual installation is here: