Is there a way to give precedence to certain function executions if there is an active queue in the canister's / subnet's execution?

I want heartbeat to jump the queue to the first spot in line.

E.g., once all private function calls (or public functions called by the canister itself) in the queue are done executing, don’t execute any externally triggered functions yet: first trigger heartbeat, then continue with the externally triggered queue.


I was told “no”.

I’m thinking in terms of people attempting to DoS time-critical function executions by just filling the queue with random calls for as long as they need the canister to remain effectively frozen.

What approach can be taken to prevent that sort of thing?


I want heartbeat to jump the queue to the first spot in line.

The heartbeat (or timer) is executed always as the first message whenever a canister is scheduled for execution.

I was told “no”.

Was this in some other forum thread? If so, can you give me a pointer? Maybe there was a misunderstanding.

It could be the case that whoever told you that meant that a canister cannot choose between ingress/inter-canister calls and which one gets executed when (there’s a simple round-robin mechanism we use for executing these messages). This part is true.


It seems you’re thinking in terms of the subnet’s queue, whereas I’m thinking in terms of the canister’s own queue. Not sure how it works behind the scenes or whether there is such a thing as “the canister’s own queue”. But now I’m worried about the same problem but at the subnet level: how does one ensure that certain functions of certain canisters get called eg within 30 seconds of their intended trigger and not eg 4 weeks later after a DoS attack is abandoned?

Further context for the question:

What I need it for actually checks enough time has passed first so that it triggers something inside heartbeat only every eg 30 seconds, not every heartbeat. (It doesn’t matter if it’s 22 seconds or 39 seconds in this case. I’m using the timestamp for a minimum time elapsed threshold, and it definitely needs to happen within let’s say 10 minutes: in that sense it is time-critical. If it takes eg 4 weeks the system would break). How do I ensure that heartbeat and the function calls within it run on their own “priority” lane in practice, protected from external DoS-motivated calls?

The aim is achieving security in practice.

It seems you’re thinking in terms of the subnet’s queue, whereas I’m thinking in terms of the canister’s own queue. Not sure how it works behind the scenes or whether there is such a thing as “the canister’s own queue”.

No, I was referring to the order within a canister’s queue. Canisters have an ingress queue (for all ingress messages they receive) and further input/output queues for each canister they communicate with. When we execute messages for a canister, heartbeats and timers are injected in the front of their queue (if the canister defines them of course). Then, we start executing ingress and inter-canister messages from the input queues of this canister.

But now I’m worried about the same problem but at the subnet level: how does one ensure that certain functions of certain canisters get called eg within 30 seconds of their intended trigger and not eg 4 weeks later after a DoS attack is abandoned?

Some background: The scheduling algorithm takes into account the so-called compute allocation of canisters when it’s trying to figure out which canister(s) should be executed next. If your canister has specified a compute allocation of X%, it roughly means that it will be scheduled for execution X% of rounds (i.e. with 50% allocation you should expect your canister will be triggered for execution every other round). Even if you don’t set that, your canister still makes progress but there are no specific guarantees. The compute allocation that can be reserved by canisters on a subnet is capped to 50%, to actually ensure that there’s sufficient room for the best-effort (i.e. no compute allocation or equivalent to 0) canisters to make progress.

So, if you want to really make sure that your canister is scheduled on a stricter frequency, you can consider setting a non-zero compute allocation (it costs extra cycles to reserve). Whenever the canister is triggered, its heartbeat will be the first that is executed (no matter how many other messages exist in its own queue).

What I need it for actually checks enough time has passed first so that it triggers something inside heartbeat only every eg 30 seconds, not every heartbeat.

First, I’d recommend using timers instead if you only need this to happen every 30 seconds, they’re much more efficient than the heartbeat. Timers are also put in the front of the queue and are executed before other messages.

(It doesn’t matter if it’s 22 seconds or 39 seconds in this case. I’m using the timestamp for a minimum time elapsed threshold, and it definitely needs to happen within let’s say 10 minutes: in that sense it is time-critical. If it takes eg 4 weeks the system would break)

The timers work exactly like you describe. The time you set is used as the minimum elapsed threshold to trigger the timer. To reiterate the above: given that the compute allocation that can be reserved by all canisters in a subnet is 50%, that should give your canister enough room to execute frequently enough even if it’s running on a best-effort basis. I’d expect that the 10-minute mark you mention would never be hit – you’d need a lot of other canisters with many messages each to exist to prevent yours from running within that period. And if you want to be extra sure, just set a small compute allocation (e.g. 1%) and then you can be guaranteed that your canister runs at least once every 100 rounds (so roughly once every 100 seconds given the usual block rate on a subnet).


Incredibly helpful. Many things I didn’t know about.

Where can I see a Motoko code snippet like this but for timer? Internet Computer Content Validation Bootstrap

How do I set compute allocation for a Motoko canister, and how do I estimate or understand cycles cost differences vs not setting any?

If the canister, for simplicity, has a 50% allocation, does that mean that it will certainly be called every other block, or that a DoS attack would have to increase its output to ensure effective freezing? At a higher level: to what extent, assuming of course an adversarial situation, where eg the attacker will also attempt setting his own canisters to the highest possible allocation, does setting allocations solve the DoS attack vector for time-critical function executions?

The aim is to understand what would happen (and protect against unacceptable outcomes) in an adversarial context rather than in a fair-use context.


Where can I see a Motoko code snippet like this but for timer? Internet Computer Content Validation Bootstrap

I’m not sure if the Motoko support is fully rolled out yet. @claudio @ggreif can you weigh in here? And point to an example if it exists?

How do I set compute allocation for a Motoko canister

This is independent of Motoko vs Rust. It’s an option in dfx. See the docs for more details.

how do I estimate or understand cycles cost differences vs not setting any?

You can find the relevant cost here. See specifically “Compute Percent Allocated Per Second”. If no compute allocation is set, there’s no extra charge for your canister.

If the canister, for simplicity, has a 50% allocation, does that mean that it will certainly be called every other block, or that a DoS attack would have to increase its output to ensure effective freezing?

It will be called every other block and the heartbeat/timer will be triggered first.

At a higher level: to what extent, assuming of course an adversarial situation, where eg the attacker will also attempt setting his own canisters to the highest possible allocation, does setting allocations solve the DoS attack vector for time-critical function executions?

As long as you have secured a compute allocation for your canister, you will get the guarantee I mentioned (i.e. X% allocation will trigger its execution X out of 100 rounds). If a subnet has already reached the limit on how much compute allocation can be reserved (either legitimately or some adversary is trying to affect your canisters living in the same subnet), you can always go to another subnet. If you’re worried that an adversary will try to fill up all IC subnets to prevent your canisters from running, well, that’s a whole other story but I think this is quite unlikely to happen.


Beware the docs say

This should be a percent in the range [0…100].

rather than 50.

Just to confirm: is 10M per second correct or is it per block?

Great. That’s probably enough of a guarantee for my purposes. Are “rounds” blocks? If not, what are they exactly?

I keep thinking of how things could be made to go wrong: even if a canister makes a call that on purpose doesn’t return anything for 4 weeks, and say that deliberately-stuck function call also includes calls to my canister’s public functions, the subnet still keeps going and my canister’s execution is not affected, correct?

I was just pointed to this: motoko-base/src/Timer.mo at master · dfinity/motoko-base · GitHub

If there’s a code snippet somewhere confirming correct usage it would be great.


Timers are in moc since 0.7.5 and are present in dfx 0.13.0-beta.1! Have fun!

(See the thread of Heartbeat improvements / Timers [Community Consideration] - #138 by ggreif for the complete discussion.)

Here is the documentation which hasn’t yet found its way into the portal, but will soon!
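
For readers who want something concrete in the meantime, here is a minimal sketch of the library-level API in `mo:base/Timer` (the module mentioned earlier in the thread). The actor name `Ticker`, the counter `ticks` and the 30-second interval are illustrative assumptions, not taken from the documentation, and newer compiler versions may expect an explicit `<system>` type argument on the `recurringTimer` call.

```motoko
import Timer "mo:base/Timer";
import Debug "mo:base/Debug";

actor Ticker {
  // Hypothetical counter, only here so the effect of the timer is observable.
  var ticks : Nat = 0;

  // The periodic job.
  func job() : async () {
    ticks += 1;
    Debug.print("Tick " # debug_show (ticks));
  };

  // Arm a recurring timer as part of actor initialization.
  // The duration is a minimum delay: the job runs no sooner than every
  // 30 seconds, and possibly later if the canister is scheduled late.
  ignore Timer.recurringTimer(#seconds 30, job);

  public query func tickCount() : async Nat { ticks };
}
```

Calling `tickCount` after a minute or so should show the counter advancing; `Timer.cancelTimer` takes the id returned by `recurringTimer` if the job ever needs to be stopped.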


I’m not sure I follow. Yeah, the value of compute allocation for a canister is allowed to be a percent in the range 0…100. So, any value in between (including 50) would be valid.

Just to confirm: is 10M per second correct or is it per block?

It is per second. On subnets where the block rate happens to be 1 block/s, you can also think of it as the per-block value. Technical detail: the charging happens periodically based on the time that has passed since the last time the canister was charged (so seconds is definitely the base unit you need to care about).

Are “rounds” blocks? If not, what are they exactly?

Consensus produces a new block, delivers it to the upper layers of the IC where execution happens and then we execute a “round” where we process messages from various canisters within some time limit which ideally matches the block rate (so we can execute as much as possible before the next block). Every new block delivered triggers a new round of execution.

I keep thinking of how things could be made to go wrong: even if a canister makes a call that on purpose doesn’t return anything for 4 weeks, and say that deliberately-stuck function call also includes calls to my canister’s public functions, the subnet still keeps going and my canister’s execution is not affected, correct?

Correct. The problem you might see in that case is that your canister is not upgradable because of the outstanding callback but that’s orthogonal and should not affect regular execution of your canister.


I’m not sure if I’m confused or what’s going on.

Without getting too deep into the weeds, do you see any space for DoS attacks in gaps that may arise from the fact that rounds need not coincide exactly with the block rate (and perhaps might diverge greatly under certain adversarial circumstances?)?

Again very enlightening. Thank you.

By the way, I don’t expect to have this problem since for security reasons the canister will likely be non-upgradable / blackholed, but is there a solution being worked on? I saw talk about it on the forum at some point but don’t know if there have been developments or whether it’s something canisters will have to live with.

The compute allocation that can be reserved by canisters on a subnet is capped to 50%

This refers to the total allocation that can be claimed by all canisters on a subnet. Subnets have more than one core available for execution, typically 4 (this is for update calls specifically). So, when we say that the total on a subnet is capped at 50% it means that 2 out of the 4 cores can be reserved. So, you could have 1 canister claiming 1 core (by setting its own compute allocation to 100%) and others claiming the other core. Is it clearer now?

Without getting too deep into the weeds, do you see any space for DoS attacks in gaps that may arise from the fact that rounds need not coincide exactly with the block rate (and perhaps might diverge greatly under certain adversarial circumstances?)?

We actually try to ensure that this does not happen. The limits we impose on canister execution strive to make it such that rounds take as close as possible to the time we have between blocks. It’s not perfect but we continually improve the heuristics used for this. I’d consider it a bug if that happens (and we take these cases very seriously) and as such in your analysis I’d take it as something that should hold.

By the way, I don’t expect to have this problem since for security reasons the canister will likely be non-upgradable / blackholed, but is there a solution being worked on? I saw talk about it on the forum at some point but don’t know if there have been developments or whether it’s something canisters will have to live with.

My team is the one driving a solution on this. We hope to provide something reasonable within the first half of the year.


Absolutely. I live at the application layer so I hadn’t even heard of multiple cores.

Glad you’re working on updates. If some sort of security / defi working group arises I’d be interested in listening in.

Perfect little snippet, thanks.

system func timer(setGlobalTimer : Nat64 -> ()) : async () {
  let next = Nat64.fromIntWrap(Time.now()) + 20_000_000_000;
  setGlobalTimer(next); // absolute time in nanoseconds
  print("Tick!");
}

Can a canister adjust its compute allocation from within itself? eg use Timer to check for certain conditions, then increase / reduce compute allocation accordingly.

If it can be done, how might one replace print("Tick!"); here for that compute allocation adjustment to say 1%?

system func timer(setGlobalTimer : Nat64 -> ()) : async () {
  let next = Nat64.fromIntWrap(Time.now()) + 20_000_000_000;
  setGlobalTimer(next); // absolute time in nanoseconds
  print("Tick!");
}

Separate issue; implications of the above still sinking in for me:

On Discord, this scenario:

Say in the case someone calls function X while A is executing.
A is executing, and inside it there are calls to B and C:

  • await B
    // here someone else calls X, not from inside A
  • await C
    return from A

Would that run A, B, C, return A, enter X or

would it run A, B, X, C, return from A

got this answer:

An await is an abstraction over a callback being executed later. What order stuff runs in depends on a lot of factors, especially subnets. But in the most pessimistic execution case, the latter; nothing prevents calling another method before the callback runs.

So if we bring that back to the timer, and executing time-sensitive processes, we might have:

Timer is executing, and inside it there are async calls to B, C, D, some inter-canister, some to other canisters (assume all canisters are well-behaved and don’t delay return), some of which themselves call other canisters with async / await:

  • await B // maybe an inter-canister call to a function that itself awaits a call in another canister
    // at this point someone else, not from inside Timer and from outside the canister, calls function X in our canister
  • await C // again may include async calls to other canisters within it
    return / exit from timer

Doesn’t the Discord answer say that X could be executed before C, and therefore before Timer finishes processing, and therefore before the time-critical process is finalised?

If so, what approach or flow can be used to guarantee that the complete body of timer is executed first, including its async calls, or otherwise guarantee that the time-critical process in the body of timer is indeed finalised on time such that the system (in this case a defi protocol), which depends on that time-critical execution, won’t malfunction?

I sort of created a control flow system that sets a variable to false when certain things are being processed, and then checks that condition before executing most of the body of the rest of the canister’s functions, but

(a) is this the only way (and even a way that works) and

(b) ideally there would be a way that doesn’t even attempt executing other functions, partly because the fact that other functions are being kept from entering the if statement where most of the body is only guarantees that those things don’t happen, rather than guaranteeing that the complete body of timer does happen (on time).

I expect there may be misconceptions in my framing.


I’m still using dfx 0.12.1. If I update to 0.13.0-beta is anything likely to break / do I need to make any manual changes other than running an update command?

12 doesn’t seem to let me do this sort of thing for manual import:

import { setTimer = setTimerNano; cancelTimer = cancel } = "mo:⛔";
import { fromIntWrap } = "Nat64";

The canister would need to be a controller of itself to be able to adjust its compute allocation (only controllers can update settings of the canister such as its compute allocation).
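
As a rough sketch of what that could look like in Motoko, assuming the canister has already been added to its own controller list: the helper below calls `update_settings` on the management canister (`aaaaa-aa`). The actor name `Self` and the helper name `setComputeAllocation` are hypothetical, and the settings record lists only a subset of the optional fields.

```motoko
import Principal "mo:base/Principal";

actor Self {
  // Subset of the management canister's settings record.
  // A null field means "leave this setting unchanged".
  type CanisterSettings = {
    controllers : ?[Principal];
    compute_allocation : ?Nat;
    memory_allocation : ?Nat;
    freezing_threshold : ?Nat;
  };

  // Handle to the management canister, declaring only the method used here.
  let ic : actor {
    update_settings : {
      canister_id : Principal;
      settings : CanisterSettings;
    } -> async ();
  } = actor "aaaaa-aa";

  // Private helper (hypothetical name): request `percent` compute allocation.
  // The call is rejected unless this canister is one of its own controllers.
  func setComputeAllocation(percent : Nat) : async () {
    await ic.update_settings({
      canister_id = Principal.fromActor(Self);
      settings = {
        controllers = null;
        compute_allocation = ?percent;
        memory_allocation = null;
        freezing_threshold = null;
      };
    });
  };
}
```

Inside the `timer` body, `print("Tick!");` could then be replaced by `await setComputeAllocation(1);`. Keeping the helper private matters: a public version would let any caller change the canister's allocation. The request can also be rejected, for instance if the subnet has no allocation left to reserve, so the `await` may throw.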

Doesn’t the Discord answer say that X could be executed before C, and therefore before Timer finishes processing, and therefore before the time-critical process is finalised?

Yes, indeed, if your timer performs further async calls to other canisters, then the IC cannot guarantee that X in your example wouldn’t be executed before C. In other words, the scenario you describe is possible.

If so, what approach or flow can be used to guarantee that the complete body of timer is executed first, including its async calls, or otherwise guarantee that the time-critical process in the body of timer is indeed finalised on time such that the system (in this case a defi protocol), which depends on that time-critical execution, won’t malfunction?

If your timer is performing downstream async calls, things become more complicated as you’ve started realizing.

I sort of created a control flow system that sets a variable to false when certain things are being processed, and then checks that condition before executing most of the body of the rest of the canister’s functions

Yes, that’s actually how I’d tackle the issue as well: essentially use a guard that prevents any other function from running (or rather makes it return early, as you said) while a time-sensitive task is still incomplete, to give that task every chance to complete.
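
A minimal sketch of such a guard in Motoko might look like the following. All names (`Guarded`, `criticalInProgress`, `step`, `runCritical`, `x`) are made up for illustration, and `step` is a local stand-in for the downstream calls B and C, which in practice would go to other canisters.

```motoko
actor Guarded {
  // Hypothetical flag: true while the time-critical task is still in flight.
  var criticalInProgress : Bool = false;
  var stepsDone : Nat = 0;

  // Stand-in for a downstream call (B or C in the example above).
  func step() : async () { stepsDone += 1 };

  // The time-critical task; in practice it would be started from the timer.
  func runCritical() : async () {
    criticalInProgress := true; // committed once the first await is reached
    try {
      await step(); // other messages can be executed while this is pending
      await step();
    } catch (_) {
      // A failed call ends up here, so the flag below is still cleared.
    };
    criticalInProgress := false;
  };

  // An externally callable method that backs off while the task is running.
  public func x() : async Text {
    if (criticalInProgress) { return "busy, try again later" };
    "executed"
  };
}
```

The guard only makes `x` return early; it does not stop `x` from being scheduled between the two awaits. Note also that a trap after the first `await` would leave the flag set, since the assignment to `true` has already been committed by then, so a real implementation would probably want a timeout or some other way to clear it.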

ideally there would be a way that doesn’t even attempt executing other functions, partly because the fact that other functions are being kept from entering the if statement where most of the body is only guarantees that those things don’t happen, rather than guaranteeing that the complete body of timer does happen (on time).

Not entirely sure I’m following your concern here. If you mean that you’d like some way to tell the system “look I have this super sensitive task that I need done, so drop on the floor everything else coming at my canister except for whatever helps me complete this task”, then we don’t have that I’m afraid.

There were some ideas in the past to allow canisters to provide their own scheduling of messages, so in your case you could choose to give priority to the responses for B, C and D and delay any X as long as you need. This never really went further than a potential idea though.
