All-or-nothing batch transaction ICRC standard?

Does an all-or-nothing batch transaction standard already exist?

For example, right now, most DeFi applications that make transactions want to assess some sort of fee, which requires 3 transactions (a rough sketch follows the list):

  1. ICRC-2 transfer_from: the app collects the amount + app fee from the payer and sends it to a holding account
  2. The app sends the amount to the payee destination (could be another user of the app; runs in parallel with 3)
  3. The app sends the fee to the fee-collection destination (the app revenue account; runs in parallel with 2).
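
To make the flow concrete, here is a rough Motoko sketch from the app canister’s side. Everything here is illustrative: the ledger ID is a placeholder, payWithFee is a made-up name, and the argument records are trimmed down from the real ICRC-1/ICRC-2 ones.

import Principal "mo:base/Principal";
import Result "mo:base/Result";

actor AppCanister {
  // Trimmed-down ICRC types, for illustration only; real ICRC-1/ICRC-2
  // records also carry fee, memo, created_at_time, and richer error variants.
  type Account = { owner : Principal; subaccount : ?Blob };
  type Ledger = actor {
    icrc2_transfer_from : shared { from : Account; to : Account; amount : Nat } -> async { #Ok : Nat; #Err : Text };
    icrc1_transfer : shared { to : Account; amount : Nat } -> async { #Ok : Nat; #Err : Text };
  };

  // Hypothetical ledger canister ID.
  let ledger : Ledger = actor ("mxzaz-hqaaa-aaaar-qaada-cai");

  public shared func payWithFee(payer : Account, payee : Account, amount : Nat, appFee : Nat) : async Result.Result<(), Text> {
    let holding : Account = { owner = Principal.fromActor(AppCanister); subaccount = null };
    let revenue : Account = holding; // in practice, a dedicated fee subaccount

    // Step 1: pull amount + app fee from the payer (requires a prior icrc2_approve).
    switch (await ledger.icrc2_transfer_from({ from = payer; to = holding; amount = amount + appFee })) {
      case (#Err e) { return #err(e) };
      case (#Ok _) {};
    };

    // Steps 2 and 3, fired in parallel: pay the payee and collect the app fee.
    let payout = ledger.icrc1_transfer({ to = payee; amount = amount });
    let fee = ledger.icrc1_transfer({ to = revenue; amount = appFee });
    switch (await payout, await fee) {
      case (#Ok _, #Ok _) { #ok };
      // If either leg fails, funds are stranded in the holding account:
      // exactly the rollback problem an atomic batch would eliminate.
      case _ { #err("payout or fee transfer failed") };
    };
  };
}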

The issue with this is that instead of hitting the ledger directly from the frontend, all of these calls have to be made through the app’s canister. So instead of 2-4 seconds for a ledger update on a fiduciary subnet, you get:

2 sec (app canister update call)
+ 6 sec (cross-subnet call for step 1)
+ 6 sec (cross-subnet calls for steps 2 & 3, in parallel)

= 14 seconds, minimum.

If all-or-nothing batch transactions were enabled, then we could define the order of transactions and send them all together (1, 2, 3 in order), and if any one of them fails (say #3), the rest are rolled back, since these would all occur synchronously on the ledger.

This would greatly reduce the need to create complex saga-style conditional payout workflows with asynchronous rollback conditions.
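
To make the proposal concrete, here is one hypothetical shape such an endpoint could take, in Motoko. The name icrc_atomic_batch_transfer, the types, and the error form are all invented for illustration; nothing here is from an existing or draft standard.

// Hypothetical all-or-nothing batch endpoint (not from any standard).
module {
  public type Account = { owner : Principal; subaccount : ?Blob };
  public type TransferArg = { from : Account; to : Account; amount : Nat };

  public type AtomicLedger = actor {
    // On success: one block index per transfer, applied in the given order.
    // On failure: the index of the first failing transfer, with no state change at all.
    icrc_atomic_batch_transfer : shared [TransferArg] ->
      async { #Ok : [Nat]; #Err : { failed_index : Nat; reason : Text } };
  };
}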

I’m not saying that this can’t be built without all-or-nothing batch transactions, but rather that it would eliminate a lot of DeFi complexity, speed up compound transactions, and help out any ICP app that wants to incorporate just-in-time fees into its business.

3 Likes

ICRC-7 already has the concept of (atomic) batch transfers.
ICRC-4 tries to do the same for ICRC-1, but it seems it is not finalized yet.

All-or-nothing batch transfers are not always easy to guarantee; in ICRC-7 the ledger has to indicate explicitly whether it supports them through icrc7_atomic_batch_transfers : () -> (opt bool) query;. I do not see a reason why this cannot be done for fungible tokens too.
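
As a sketch, a client could consume that flag like this (the query signature is taken from ICRC-7 as quoted; the surrounding actor and the isAtomic helper are invented for illustration):

actor Client {
  type Icrc7Ledger = actor {
    icrc7_atomic_batch_transfers : shared query () -> async ?Bool;
  };

  public func isAtomic(ledgerId : Text) : async Bool {
    let ledger : Icrc7Ledger = actor (ledgerId);
    switch (await ledger.icrc7_atomic_batch_transfers()) {
      case (?true) { true };   // ledger guarantees all-or-nothing batches
      case _ { false };        // null or ?false: assume non-atomic (the standard's default)
    };
  };
}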

Other thread: ICRC4 - Batch Transfers - Nearing Finalization - Please Review

2 Likes

Hasn’t this type of flow been one of the main issues holding back ICP DeFi? It sounds like you’ve described atomic vs. async DeFi. I’m glad to hear it’s being worked on in ICRC-7 and ICRC-4.

You can get atomicity if the items are on the same canister. If they aren’t on the same canister, then atomicity can still be achieved, but via a saga, and that involves latency. I don’t think there is a great solution. It is the same dilemma that every L2 has: they have atomicity once the assets are on that chain, but you have to get them there first.

2 Likes

Can it be done at the IC protocol level, at least within the same subnet? Multiple async calls to multiple canisters, but as one atomic transaction.

1 Like

Maybe. There are some peculiarities in how Motoko works with the replica where async calls on the same subnet are ‘like sync’ calls most of the time, but I don’t think you can guarantee it. (This is how the canpack stuff works with Rust libraries.)

In theory, I think that other pending async calls initiated during the round by other canisters on the subnet could be interspersed between your calls, but I think the biggest ‘danger’ is when the round is closing and it starts bumping calls to the next round. Timers could be scheduled before your await runs, and that could cause atomicity issues.

See: Motoko - Timers - Specific Behavior - #7 by berestovskyy
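
Here is a minimal Motoko sketch of that hazard, with drain standing in for any other message (such as a timer callback) that happens to run while spend is suspended at its await:

actor Hazard {
  var balance : Nat = 100;

  // Any other message scheduled in the meantime, e.g. a timer, can mutate state.
  public func drain() : async () { balance := 0 };

  public func spend() : async Text {
    if (balance < 50) { return "insufficient" };
    await async {};  // suspension point: other messages may interleave here
    // If drain() ran during the await, the check above is stale and the
    // subtraction below traps: the two halves were not atomic.
    balance -= 50;
    "spent"
  };
}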

1 Like

How do sharded blockchains like MultiversX achieve all-or-nothing transactions across shards, despite the inherently asynchronous nature of the network? Are there specific techniques or architectural patterns they employ that could be adapted to improve composability and atomicity on the Internet Computer, considering its subnet architecture shares some similarities with sharding?

So my guess is we are stuck with processing transactions one at a time, which can take up to one minute. This will make it extremely hard to attract DeFi projects which value great user experience. Believe it or not, crypto people see speed as good user experience.

I want to clarify a few things here.

  1. As an ICP user, making an ICP transaction from an app to a ledger on ICP doesn’t hold up other ledger transactions from happening. There are general message, ingress, and instruction limits, but ledgers on ICP are quick and can process thousands of transactions per round of consensus (~2-4 sec). The call is quick if it is made from a principal/delegated identity on the frontend (log in with II, Plug, NFID, or any ecosystem wallet).

  2. Canisters can send hundreds of calls (transactions) at a time, in parallel, to any other canister right now. That’s what this icrc2-batch library does. The current ~500-call limit from canister A → canister B comes from the canister output queue limit, which could be raised in the future.

  3. Referring to point #1, making calls with a delegated identity will always be quick. The slowdown in the case I’ve described occurs when an application wants to perform the transaction asynchronously via an inter-canister call, without triggering a wallet pop-up, for UX purposes. For example, the ICRC-2 standard (approve/transfer_from) allows a user to approve another principal (user or app) to spend X amount of funds on their behalf. This enables things like recurring payments, or transferring funds without needing to click yes for every single action, drawing down the X amount of funds that has been previously approved.

    In our case, we’d like to be able to trigger these transactions through a canister for improved UX on the user’s behalf (no endless popups), as well as to assess fees on top of each payment that is made. That’s the primary reason we’re going through the canister to perform each of these actions (a sketch of the approve step follows this list).
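
For reference, a rough Motoko sketch of the shapes involved in that one-time approval. The records are trimmed down (real ICRC-2 ApproveArgs also carries from_subaccount, expected_allowance, expires_at, fee, memo, and created_at_time), and in practice the user’s wallet issues this call against the ledger directly:

module {
  public type Account = { owner : Principal; subaccount : ?Blob };

  // The user approves the app canister (the spender) once...
  public type ApproveArgs = {
    spender : Account;  // the app canister's principal
    amount : Nat;       // allowance covering many future payments
  };

  public type Ledger = actor {
    icrc2_approve : shared ApproveArgs -> async { #Ok : Nat; #Err : Text };
    // ...and the app canister can then spend against the allowance,
    // popup-free, via icrc2_transfer_from, taking its fee on each payment.
  };
}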

With that context in place, having an all-or-nothing batch endpoint, where all calls are made to the same ledger canister, would reduce the number of edge cases and the amount of complexity that apps need to handle. Projects on ICP want to both provide value and make a profit, and this would help with both!

State changes within a canister are synchronous within a round of consensus, so going from ICRC-4 (batch transactions), which is already in place, to all-or-nothing batch transactions isn’t a technical problem. It’s just something that the ICRC standards community needs to align on and prioritize.
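
A minimal sketch of why that holds inside a single ledger canister. The state and types are toys, but the structure is the point: validate everything, then commit, with no await anywhere in between:

import Array "mo:base/Array";

actor SketchLedger {
  // Toy state: account index -> balance (a real ledger keys on ICRC accounts).
  var balances : [var Nat] = [var 100, 50, 0, 0];

  type Tx = { from : Nat; to : Nat; amount : Nat };

  public func atomicBatchTransfer(txs : [Tx]) : async { #Ok; #Err : Nat } {
    // Pass 1: validate every transfer against a working copy of the balances.
    let projected : [var Nat] = Array.thaw(Array.freeze(balances));
    var i = 0;
    for (tx in txs.vals()) {
      if (projected[tx.from] < tx.amount) { return #Err(i) }; // nothing committed yet
      projected[tx.from] -= tx.amount;
      projected[tx.to] += tx.amount;
      i += 1;
    };
    // Pass 2: commit. There is no await anywhere above, so the whole batch
    // executes in one message and nothing can interleave: all or nothing.
    balances := projected;
    #Ok
  };
}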

The main point of this forum post from my side was to trigger a conversation, and learn if any work is being done on this front.

Moving forward is both as easy (and as hard) as getting community consensus to both implement (code) and upgrade (governance) existing token ledgers (ICP, ckBTC, SNS, etc.) to support all-or-nothing batch transactions. I’d expect ICRC-4 (batch transactions without all-or-nothing semantics) to come first, but this would be a nice addition.

@quint above mentioned ICRC-7; upon reading more about ICRC-7, I found:

" icrc7:atomic_batch_transfers of type bool (optional): true if and only if batch transfers of the ledger are executed atomically, i.e., either all transfers execute or none, false otherwise. Defaults to false if the attribute is not defined."

The atomic batch transfer behavior referenced in this quote is stored in the canister’s metadata, which makes me think that all batch transactions on the canister are either atomic or not. I’m not sure why there isn’t the option to perform both atomic and non-atomic batch transfers on the same canister/ledger :man_shrugging:, but maybe that’s a point for discussion moving forwards.

2 Likes

If an ICRC-7 ledger supports atomic transactions, there’s no need for non-atomic calls, since both would basically result in the same behavior and response in that ledger implementation.

There is only one key difference: an atomic ledger will immediately return a single error element. But even a non-atomic ledger can return a single error as its response, which will need to be handled just the same.

So basically an atomic ledger doesn’t actually behave differently from the perspective of a client: it returns a list of 0 or more responses, possibly including an error. But if you know the ledger is atomic (by checking this metadata field), you additionally know that it won’t return a partial list of responses.
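
A small sketch of what that means for client code. The per-item optional result shape is modeled on the ICRC-7 batch response; the countSucceeded helper is invented for illustration:

module {
  // ICRC-7-style per-item result: null means the item was not attempted.
  public type TransferResult = ?{ #Ok : Nat; #Err : Text };

  // The same handling works against atomic and non-atomic ledgers alike;
  // atomicity only adds the guarantee that a returned list is never partial.
  public func countSucceeded(results : [TransferResult]) : Nat {
    var ok = 0;
    for (r in results.vals()) {
      switch (r) {
        case (?(#Ok _)) { ok += 1 };
        case _ {};  // error or skipped: handled identically either way
      };
    };
    ok
  };
}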

1 Like

Does this assume the same subnet? I can’t ever get more than like 12 async calls at a time before hitting the instruction limit on most of my tests. Curious if I’m doing something wrong.

For example: Motoko Playground - DFINITY

The function “test” uses self-call one-shots, and I get 55 in one round with an artificial await after every item. Call test and then getSum.

If I remove the await on line 23, I only get 24 items processed before I get:

Call was rejected:
Request ID: 7894f1259e0345929f4c030cacff02b4831c613920ae76e45b93da76d03371ca
Reject code: 4
Reject text: could not perform oneway

(It is actually somewhat interesting that 24 of these actually get processed. Because I hit a trap, I would expect the whole thing to get rolled back.)

If I instead call ignore thisactor.doAThing(x); (a standard async call), I get the same behavior.

It looks like you are doing an await here as well once you get above a threshold… what threshold are you currently using? icrc2-batch/internal/WrappedICRC2Actor.mo at 9d43b1b9fd3d3bb776df14bdd2df08fd919c6220 · MemeFighterCo/icrc2-batch · GitHub

It is a small point, but we should be clear that these batches are being spread across multiple rounds, and you can’t assume elsewhere in your code that other stuff isn’t happening while you are waiting multiple rounds for all the items to be processed.

Hey @skilesare, sorry for the delay on this.

Here’s an example that should allow you to do more than 24 calls at a time.

Motoko Playground - DFINITY

This example does 100 calls at a time.

The bottleneck I would have expected you to hit in this example, at around 200-500 calls, is due to cycles reserved for outgoing calls, since canisters on the Motoko playground have a limited cycles balance, but that doesn’t seem to be occurring :thinking:

I would have then expected you to reach the canister output queue limit at around ~500 outgoing calls, but I just ran this code with 600 calls, which originally made me think there is some optimization for calls directed back at the same canister.

However, I then set up this 2-canister example in the playground just to make sure.

Sender canister

Receiver canister

Seems to work fine with 600 outgoing calls, which I did not expect at all :man_shrugging:. I’d still stick to fewer than 500 outgoing calls at a time to be safe and implement batching logic, but hopefully these examples are helpful.

@dsarlis or @claudio - any idea why this example allows me to queue up more than 500 calls at a time between canisters?

In your example you have the actual await inside a helper, which means that when it actually awaits each future it is doing an individual await and resetting your instruction limit each time. It seems creating an async* future does not do the same calculation that a straight-up async future does. Interesting.

It makes sense that you are not running out of cycles, because you are doing them all inline even though it seems like you are doing them in parallel. (I understand that functionally it may not matter.) If you look at my updated example: https://m7sm4-2iaaa-aaaab-qabra-cai.raw.ic0.app/?tag=2787402379 you will see, after running test() and then getSum, that only about 26 executions happen per block.

This may be fine if you are calling something on the same canister, but if you were calling something on another subnet, because each await halts execution until it resolves, I would imagine it would take 500*6 seconds to get through them all. Sure enough: if you check out Motoko Playground - DFINITY you’ll see that each balance check takes 6 seconds and they are not going in parallel. Click getSum while waiting for test to finish. I’m guessing the response will time out eventually… but I wonder what happens on the canister? I guess it keeps going. (Edit: I got

Server returned an error:
Code: 400 ()
Body: Specified ingress_expiry not within expected range: Minimum allowed expiry: 2024-09-05 16:46:07.283220654 UTC, Maximum allowed expiry: 2024-09-05 16:51:37.283220654 UTC, Provided expiry: 2024-09-05 16:46:00 UTC

…but that may be just because it was an ingress call.)

There does seem to be some nice accounting going on if you’re calling the same subnet/canister here, though, so it is a nice pattern to use to keep from having to process the futures in batches like I had been doing… but as soon as you cross a subnet boundary you’ll want to process in batches. See the speed improvement on Motoko Playground - DFINITY, where I actually do get 8 balance responses about every 6 seconds.

In-lining the call instead of awaiting it fixes this.

Check out this Motoko Playground example.
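
A minimal sketch of the pattern (hypothetical ledgerId parameter and a trimmed icrc1_balance_of signature): evaluating the call expression enqueues the message immediately, so firing all the calls before awaiting any of them lets them proceed in parallel:

import Buffer "mo:base/Buffer";

actor Caller {
  type Account = { owner : Principal; subaccount : ?Blob };
  type Ledger = actor {
    icrc1_balance_of : shared query Account -> async Nat;
  };

  public func sumBalances(ledgerId : Text, owners : [Principal]) : async Nat {
    let ledger : Ledger = actor (ledgerId);
    let futures = Buffer.Buffer<async Nat>(owners.size());
    for (o in owners.vals()) {
      // Call fired here, not awaited: the message is enqueued immediately.
      futures.add(ledger.icrc1_balance_of({ owner = o; subaccount = null }));
    };
    var sum = 0;
    // Awaiting already-in-flight calls; each await resets the instruction limit.
    for (f in futures.vals()) { sum += await f };
    sum
  };
}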

If you go above 25 calls in parallel, you’ll receive this error:

Call was rejected:
Request ID: 35071e1b663a96b69ca970787b13653e09cbbec4b40f747c3aaf741e021fea00
Reject code: 4
Reject text: could not perform remote call

This is actually due to the canister running out of cycles, which are reserved when a call is made (20B per call). Since the Motoko playground limits how many cycles are on a canister at any point in time, running this on a mainnet canister with a larger cycles balance ensures that more calls can be made in parallel.

Then you should run into the next xnet call limit at ~500 calls, which is the canister output queue limit. You can get around this by batching calls (a sketch follows).
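
A sketch of that batching, assuming the ~500-slot output queue discussed above; the chunk size of 400 is an arbitrary safety margin:

import Buffer "mo:base/Buffer";

actor Batcher {
  type Account = { owner : Principal; subaccount : ?Blob };
  type Ledger = actor {
    icrc1_balance_of : shared query Account -> async Nat;
  };

  let chunkSize = 400; // assumption: comfortably below the ~500 output queue limit

  public func sumAll(ledgerId : Text, owners : [Principal]) : async Nat {
    let ledger : Ledger = actor (ledgerId);
    var sum = 0;
    var i = 0;
    while (i < owners.size()) {
      // Fire one chunk of calls without awaiting...
      let futures = Buffer.Buffer<async Nat>(chunkSize);
      var j = i;
      while (j < owners.size() and j < i + chunkSize) {
        futures.add(ledger.icrc1_balance_of({ owner = owners[j]; subaccount = null }));
        j += 1;
      };
      // ...then drain the chunk, freeing queue slots and reserved cycles,
      // before starting the next one.
      for (f in futures.vals()) { sum += await f };
      i := j;
    };
    sum
  };
}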

2 Likes