Heartbeat improvements / Timers [Community Consideration]

Right, there will be the same upgrade challenges as with the normal calls. We could block the upgrade until there are no more pending timers, or we could cancel the timers, allowing the canisters to handle the situation in pre_ and post_upgrade functions…

Of course, SDK and Motoko teams will be in the loop. We’ll share the design once it’s ready.

2 Likes

Just chiming in my 2 cents here.

We are building a dapp that uses heartbeats extensively. Our current architecture has us spinning up canisters for every logged in individual where the canister acts as a proxy for all interaction that they do with the app.

A couple of distinct scenarios where we use heartbeat are:

  • A logged-in user creates a new post. This post has a score that gets calculated on creation. There are multiple parameters that contribute to the score. One of them is freshness/staleness. The older a post is, the lower this component is. Currently, we recalculate this component every hour and update the score. After 48 hours have passed, the freshness component reaches 0 and doesn’t need to be recalculated. So, for this, we essentially need a setInterval (JS lingo) that stops after 48 hours.
  • There are contests/tournaments that end users start, and they can choose an arbitrary time in the future for the contest to end. This would ideally need some sort of setTimeout (JS lingo) that fires once at the specified time.
  • The user-created content is synced with “cache” canisters that aggregate and rank posts sent to them by the individual canisters mentioned earlier. The individual canisters periodically update them with the top posts from their repositories as scores get recalculated. These syncs run indefinitely on an interval, so this needs a setInterval that never expires.
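As a rough illustration of the first scenario, here is a minimal Rust sketch of an hourly schedule that stops after 48 hours. The function names and the linear decay of the freshness component are assumptions made for the example; the post does not say how freshness is actually computed.

```rust
/// Seconds in one hour.
const HOUR: u64 = 3_600;
/// Per the post above, the freshness component reaches 0 after 48 hours.
const FRESHNESS_WINDOW: u64 = 48 * HOUR;

/// Hypothetical freshness component: 1.0 for a brand-new post, falling
/// linearly to 0.0 at 48 hours. The linear shape is an assumption.
fn freshness(age_secs: u64) -> f64 {
    if age_secs >= FRESHNESS_WINDOW {
        0.0
    } else {
        1.0 - age_secs as f64 / FRESHNESS_WINDOW as f64
    }
}

/// Times (in seconds) at which an hourly "setInterval" would fire for a post
/// created at `created_at`, stopping once the freshness window has elapsed.
fn recalculation_times(created_at: u64) -> Vec<u64> {
    (1..=FRESHNESS_WINDOW / HOUR)
        .map(|tick| created_at + tick * HOUR)
        .collect()
}
```

Under these assumptions each post needs 48 recalculations in total, after which nothing has to run for it.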

Currently we use ic-cron by the brilliant @senior.joinu for everything but our cycle costs are insane.

We just started this week and we estimate it will cost us around 2T cycles for every user canister every week. That amounts to around $10 for every user every month, which comes to around $100,000 in cycle costs every month for 10,000 users.
We expect at least 10,000 users a month for this app, since our MVP app reached 8,000 active users while we were actively maintaining it.

Our main point of contention with the current heartbeat implementation is that our canisters should not be billed for the IC choosing to call into them on every block. Rather, they should only be billed for the times our canister has explicitly asked to be called.

8 Likes

Thanks for sharing, @saikatdas0790, very insightful.

2 Likes

It would make sense for this kind of project to have a central cron canister that runs heartbeat and calls into the other canisters with timers. I started working on this but paused the project when DFINITY announced they’d redo the heartbeat implementation. I’ve made the project public, so feel free to take a look at it and adapt it for your implementation. There was still a lot to do, like access lists, but as far as I can remember the flow of adding yourself to the queue and receiving remote_heartbeat worked.

2 Likes

I’m not sure I completely agree. We did consider doing this, but this will not scale beyond a certain point.

Here’s some napkin calculations:
Let’s assume heartbeat executes every second. That’s 86,400 executions a day. Let’s also assume it picks up a single task during each execution, so as not to go over the single-round cycle limit. Tasks are things like recalculating scores or syncing posts to a cache canister, among others, and with, say, two such tasks each running twice every hour:

That way you have 4 tasks for each user every hour, which is 4 * 24 (hours a day) = 96 tasks for a user every day. That caps your users at 86,400 (max tasks in a single day) / 96 (tasks for a single user) = 900 for a single heartbeat canister. This means you now need to shard these heartbeat canisters as well if you have more than roughly 1,000 users.
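The napkin calculation above can be written out as a small Rust sketch. The function names are made up for the example, and it keeps the same assumptions: one heartbeat per second and one task handled per heartbeat.

```rust
/// Heartbeats per day, assuming one heartbeat per second.
const SECONDS_PER_DAY: u64 = 24 * 60 * 60; // 86,400

/// Upper bound on users a single scheduler canister can serve when it
/// handles one task per heartbeat. `tasks_per_user_per_hour` must be > 0.
fn max_users_per_scheduler(tasks_per_user_per_hour: u64) -> u64 {
    let tasks_per_user_per_day = tasks_per_user_per_hour * 24;
    SECONDS_PER_DAY / tasks_per_user_per_day
}

/// Number of scheduler canisters needed for `users`, rounding up.
fn schedulers_needed(users: u64, tasks_per_user_per_hour: u64) -> u64 {
    let cap = max_users_per_scheduler(tasks_per_user_per_hour);
    (users + cap - 1) / cap
}
```

With 4 tasks per user per hour, `max_users_per_scheduler(4)` gives 900, and `schedulers_needed(10_000, 4)` gives 12 shards for 10,000 users.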

There is also a matter of orchestrating which canisters need to be called by these heartbeat canisters and adding and removing from their respective collections as well as calling between them which makes for quite a complicated project by itself.

On the other hand, colocating each canister’s periodic functions within the canister itself is much simpler and doesn’t involve the above song and dance.

In case I misunderstood your suggestion, please feel free to correct me.

I meant instead of running heartbeat() on every user canister, you’d have the same functionality but run it from a remote_heartbeat() function. Then, using my example above, you’d “register” each client-canister with a remote_heartbeat at say 30 minutes. Or whatever interval you need.

Then the main heartbeat canister will call each client canister with remote_heartbeat() at the set interval. This way you have all the implementation canister-side and also can use a much cheaper version of heartbeat - you only pay for heartbeat on one main canister, and for each remote_heartbeat() you just pay what you need (e.g. once every 30 minutes)
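A minimal sketch of this registration flow, simulated in plain Rust with no IC dependencies. The `Scheduler` type, its methods, and the use of strings as canister IDs are assumptions for illustration and are not taken from the project mentioned above.

```rust
use std::collections::BTreeMap;

/// One registered client canister and when it is next due.
struct Registration {
    interval_secs: u64,
    next_due: u64,
}

/// State of the central canister, the only one running a real heartbeat.
#[derive(Default)]
struct Scheduler {
    clients: BTreeMap<String, Registration>,
}

impl Scheduler {
    /// A client canister asks to receive `remote_heartbeat` every `interval_secs`.
    fn register(&mut self, canister_id: &str, interval_secs: u64, now: u64) {
        let registration = Registration {
            interval_secs,
            next_due: now + interval_secs,
        };
        self.clients.insert(canister_id.to_string(), registration);
    }

    /// Called from the scheduler's own heartbeat. Returns the clients whose
    /// `remote_heartbeat` should be called now and reschedules each of them.
    fn on_heartbeat(&mut self, now: u64) -> Vec<String> {
        let mut due = Vec::new();
        for (id, reg) in self.clients.iter_mut() {
            if now >= reg.next_due {
                reg.next_due = now + reg.interval_secs;
                due.push(id.clone());
            }
        }
        due
    }
}
```

In a real canister, the returned IDs would be turned into inter-canister calls; here they are only collected so the scheduling logic can be checked in isolation.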

3 Likes

FWIW this is what I’m doing on my project while I wait for DFINITY’s timer. It’s working well so far.

2 Likes

Hey guys,
Just to update you, we’ve aligned across teams on the following.

On the IC side there will be a single one-off global timer.
On the Motoko and SDK sides there will be standard library support for multiple and interval timers.

IC side changes:

  1. Add a new System API ic0.set_global_timer(time: i64) -> i64 function (not exposed to the users).
  2. The function schedules a call to the exported canister_global_timer Wasm method in some round after the specified time (similar to the canister_heartbeat method we have now).
  3. The function returns the previous value of the timer.
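The three points above can be modelled with a toy Rust struct. This is only a simulation of the described behaviour, not the System API itself; treating 0 as "no timer set" is an assumption of the sketch.

```rust
/// Toy model of the single one-off global timer described above.
/// A deadline of 0 means "no timer set" (an assumption of this sketch).
#[derive(Default)]
struct GlobalTimer {
    deadline: u64,
}

impl GlobalTimer {
    /// Mirrors the described `set_global_timer`: stores the new deadline
    /// and returns the previous value of the timer.
    fn set(&mut self, time: u64) -> u64 {
        std::mem::replace(&mut self.deadline, time)
    }

    /// Called once per simulated round with the current time. Returns true
    /// if `canister_global_timer` would be invoked in this round.
    fn on_round(&mut self, now: u64) -> bool {
        if self.deadline != 0 && now >= self.deadline {
            self.deadline = 0; // one-off: the timer must be set again to fire again
            true
        } else {
            false
        }
    }
}
```

Because there is only one timer slot, setting a new deadline overwrites the old one, which is why the previous value is returned to the caller.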

Motoko/Rust CDK changes:

Note: the API is not final, feel free to suggest.

  1. Add a timer library with the following interface:

    1. timer_id = set_timer(delay, fun);
      – one-off execution of the fun after a minimum delay in seconds.
    2. timer_id = set_timer_interval(delay, fun);
      – periodic execution of the fun with a minimum delay in seconds.
    3. clear_timer(timer_id);
      – clear the previously set one-off or periodic timer_id.
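To make the proposed interface concrete, here is a minimal, hedged Rust sketch of how multiple one-off and periodic timers could be layered on a single global timer. It is a simulation in plain Rust, not the CDK library: the `Timers` type is hypothetical, and the explicit `now` parameters stand in for the system time that a real library would read itself.

```rust
use std::collections::BTreeMap;

type TimerId = u64;

struct Timer {
    deadline: u64,
    interval: Option<u64>, // Some(delay) for periodic timers
    fun: Box<dyn FnMut()>,
}

#[derive(Default)]
struct Timers {
    next_id: TimerId,
    timers: BTreeMap<TimerId, Timer>,
}

impl Timers {
    fn insert(&mut self, deadline: u64, interval: Option<u64>, fun: Box<dyn FnMut()>) -> TimerId {
        let id = self.next_id;
        self.next_id += 1;
        self.timers.insert(id, Timer { deadline, interval, fun });
        id
    }

    /// One-off execution of `fun` after at least `delay` seconds.
    fn set_timer(&mut self, now: u64, delay: u64, fun: impl FnMut() + 'static) -> TimerId {
        self.insert(now + delay, None, Box::new(fun))
    }

    /// Periodic execution of `fun` with at least `delay` seconds in between.
    fn set_timer_interval(&mut self, now: u64, delay: u64, fun: impl FnMut() + 'static) -> TimerId {
        self.insert(now + delay, Some(delay), Box::new(fun))
    }

    /// Clears a previously set one-off or periodic timer.
    fn clear_timer(&mut self, id: TimerId) {
        self.timers.remove(&id);
    }

    /// Earliest deadline: the value to hand to the single global timer.
    fn next_deadline(&self) -> Option<u64> {
        self.timers.values().map(|t| t.deadline).min()
    }

    /// Body of the global timer callback: run every timer due at `now`.
    fn on_global_timer(&mut self, now: u64) {
        let due: Vec<TimerId> = self
            .timers
            .iter()
            .filter(|(_, t)| t.deadline <= now)
            .map(|(id, _)| *id)
            .collect();
        for id in due {
            let mut timer = self.timers.remove(&id).expect("id was collected from the map");
            (timer.fun)();
            if let Some(delay) = timer.interval {
                timer.deadline = now + delay; // re-arm periodic timers
                self.timers.insert(id, timer);
            }
        }
    }
}
```

After each call to `on_global_timer`, a real library would presumably re-arm the global timer with `next_deadline()`, since the underlying timer is one-off.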

Other details:

  1. The canister_global_timer will be called with the Heartbeat SystemAPI type, so the same restrictions apply as for the heartbeats.
  2. The timers will be canceled during the upgrade, so the canister is expected to serialize the timers library state and later restore it and set the timer in the canister_post_upgrade.
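One way to read the second point: since closures cannot be serialized, the timers state that survives an upgrade has to be plain data. The sketch below illustrates that idea under several assumptions: the `Task` variants, the text encoding, and the function names are all hypothetical, and a real canister would write to stable memory rather than to a `String`.

```rust
/// What a timer should do, stored as data so it can survive an upgrade.
/// The task names are hypothetical.
#[derive(Clone, Debug, PartialEq)]
enum Task {
    RecalculateScore { post_id: u64 },
    EndContest { contest_id: u64 },
}

#[derive(Clone, Debug, PartialEq)]
struct PendingTimer {
    deadline: u64,
    interval: Option<u64>, // None for one-off timers
    task: Task,
}

/// Stand-in for the pre-upgrade step: one line per timer, interval 0 = one-off.
fn encode(timers: &[PendingTimer]) -> String {
    timers
        .iter()
        .map(|t| {
            let (kind, id) = match &t.task {
                Task::RecalculateScore { post_id } => ("score", *post_id),
                Task::EndContest { contest_id } => ("contest", *contest_id),
            };
            format!("{},{},{},{}", t.deadline, t.interval.unwrap_or(0), kind, id)
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Stand-in for the post-upgrade step: rebuild the timers, skipping bad lines.
fn decode(saved: &str) -> Vec<PendingTimer> {
    saved
        .lines()
        .filter_map(|line| {
            let parts: Vec<&str> = line.split(',').collect();
            if parts.len() != 4 {
                return None;
            }
            let deadline = parts[0].parse::<u64>().ok()?;
            let interval = match parts[1].parse::<u64>().ok()? {
                0 => None,
                n => Some(n),
            };
            let id = parts[3].parse::<u64>().ok()?;
            let task = match parts[2] {
                "score" => Task::RecalculateScore { post_id: id },
                "contest" => Task::EndContest { contest_id: id },
                _ => return None,
            };
            Some(PendingTimer { deadline, interval, task })
        })
        .collect()
}

/// Deadline to re-arm the global timer with after restoring.
fn next_deadline(timers: &[PendingTimer]) -> Option<u64> {
    timers.iter().map(|t| t.deadline).min()
}
```

After decoding, the canister would set the global timer to `next_deadline(...)`, matching the "restore it and set the timer" step described above.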

We’re starting to implement the IC side changes, the library support will follow shortly…

12 Likes

For the
2. timer_id = set_timer_interval(delay, fun);

Could we add an optional parameter for something like stop_after, which could either be n executions or t (milli)seconds?

1 Like

A valid point, and I don’t have a good answer yet :disappointed:

The generic solution would be to allow the timer handler to cancel itself based on some conditions, but I guess there might be issues with the borrow checker in Rust…

Maybe the timer handler could return a boolean to stop the periodic calls?
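A small Rust sketch of the boolean-returning handler idea, driven over simulated time. The `run_interval` helper is hypothetical, and the polarity (return `true` to keep going, `false` to stop) is an assumption, since the post does not fix it.

```rust
/// Drives one periodic handler over simulated time, from `start` to `end`.
/// The handler receives the current time and returns `true` to keep running
/// or `false` to stop (this polarity is an assumption of the sketch).
/// Returns how many times the handler was called. `delay` must be > 0.
fn run_interval(
    start: u64,
    delay: u64,
    end: u64,
    mut handler: impl FnMut(u64) -> bool,
) -> u32 {
    let mut calls = 0;
    let mut now = start + delay;
    while now <= end {
        calls += 1;
        if !handler(now) {
            break; // the handler asked for the periodic calls to stop
        }
        now += delay;
    }
    calls
}
```

For the 48-hour freshness case, `run_interval(0, 3_600, 100 * 3_600, |now| now < 48 * 3_600)` calls the handler 48 times and then stops, without any separate cleanup code.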

1 Like

I think being able to declaratively specify it during timer creation is a better, more convenient, and safer API than writing cleanup logic separately and messing up there :slight_smile:

True, it’s much simpler in this case. I’m just not sure if it would cover all the use cases…

Say, if we want to stop the timer on some error, or when the post is deleted and we don’t need to recalculate its score anymore…

1 Like

But couldn’t we just use clear_timer for those cases where there’s additional business logic that needs to be run?

The API surface should remain as simple as possible so I think the proposed API is really good.

To cancel an interval timer after a certain time you can simply set another timer which does the cancellation. There is no need to add complexity to the API.
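The "second timer cancels the first" pattern can be sketched as follows. This is a plain-Rust simulation with hypothetical names; timer actions are modelled as an enum rather than closures so that one timer can clear another without fighting the borrow checker.

```rust
use std::collections::BTreeMap;

type TimerId = u64;

enum Action {
    /// Stand-in for real work: only counts executions.
    Recalculate,
    /// Cancels another timer, as suggested above.
    Clear(TimerId),
}

struct Timer {
    deadline: u64,
    interval: Option<u64>, // Some(delay) for periodic timers, delay > 0
    action: Action,
}

#[derive(Default)]
struct Timers {
    next_id: TimerId,
    timers: BTreeMap<TimerId, Timer>,
    recalculations: u32,
}

impl Timers {
    fn add(&mut self, deadline: u64, interval: Option<u64>, action: Action) -> TimerId {
        let id = self.next_id;
        self.next_id += 1;
        self.timers.insert(id, Timer { deadline, interval, action });
        id
    }

    fn clear_timer(&mut self, id: TimerId) {
        self.timers.remove(&id);
    }

    /// Fires every due timer up to `now`, earliest deadline first
    /// (ties broken by timer id).
    fn advance_to(&mut self, now: u64) {
        loop {
            let next = self
                .timers
                .iter()
                .filter(|(_, t)| t.deadline <= now)
                .min_by_key(|(id, t)| (t.deadline, **id))
                .map(|(id, _)| *id);
            let id = match next {
                Some(id) => id,
                None => break,
            };
            let mut timer = self.timers.remove(&id).expect("id came from the map");
            match timer.action {
                Action::Recalculate => self.recalculations += 1,
                Action::Clear(target) => self.clear_timer(target),
            }
            if let Some(delay) = timer.interval {
                timer.deadline += delay; // re-arm periodic timers
                self.timers.insert(id, timer);
            }
        }
    }
}
```

Registering an hourly `Recalculate` interval plus a one-off `Clear` timer at 48 hours yields exactly 48 recalculations in this model, with no extra parameter on the interval API.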

1 Like

Sure, I think it’s fine since this is a lower level API and we can have wrapper libraries that handle these cases

Is this a typo, or is there a reason for using i64 instead of u64?

Sure, it is u64, but it’s not a typo. It’s just the way our specification describes the system API calls.

1 Like

The motion proposal for canister timers is now live: 88293

4 Likes

That’s because of the WebAssembly type system: “Integers are not inherently signed or unsigned, their interpretation is determined by individual operations.”

Even though the System API uses i64 it will be interpreted as an unsigned number similar to how the existing ic0.time : () -> (timestamp : i64); does it.

It is also worth noting that ic0.set_global_timer() will not be directly exposed by the Rust CDK. Instead, we will have a higher-level library that operates on proper time types.
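The i64/u64 point can be shown with a couple of Rust casts. The helper names are made up for the example; the only claim is that an `as` cast between `i64` and `u64` keeps the bit pattern, so a value declared as i64 at the Wasm boundary can be read back as an unsigned timestamp.

```rust
/// Reinterprets the raw `i64` crossing the Wasm boundary as an unsigned
/// timestamp. The cast keeps the bit pattern unchanged.
fn timestamp_from_wasm(raw: i64) -> u64 {
    raw as u64
}

/// The reverse direction: pass an unsigned timestamp as a Wasm `i64`.
fn timestamp_to_wasm(timestamp: u64) -> i64 {
    timestamp as i64
}
```

For example, `timestamp_to_wasm(u64::MAX)` is `-1` when viewed as a signed number, and `timestamp_from_wasm(-1)` gives `u64::MAX` back.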

3 Likes