ICRC-3 Draft v2 and Next Steps

Hi,
The Ledger & Tokenization Working Group has finalized the second draft of the ICRC-3 standard for accessing the Block Log of Ledgers. You can find the rendered document here.

The changes since the first draft are:

  1. an improved definition of the icrc3_get_tip_certificate endpoint to fetch the certified tip of the chain, i.e. the last block index and the last block hash
  2. a suggested algorithm to download and verify the blocks of a ledger
  3. a new endpoint icrc3_get_archives to fetch the archive nodes of the ledger together with their ranges of blocks
  4. a new generic block schema that is independent of the fungible token standards. This provides a base standard to access the block log of any ledger (sketched in Motoko below)
  5. a more precise definition of Value and the representation-independent hashing function over Value, including examples
  6. an example for each ICRC-1 and ICRC-2 block schema
  7. various improvements based on community feedback
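
For reference, the generic block value of change 4 looks roughly like this when expressed as a Motoko type; the rendered document has the authoritative Candid definition.

// The generic block Value from the draft, expressed as a Motoko type.
// A block is typically a #Map of named fields; nesting lets any ledger
// describe its own transaction schema.
type Value = {
  #Blob : Blob;
  #Text : Text;
  #Nat : Nat;
  #Int : Int;
  #Array : [Value];
  #Map : [(Text, Value)];
};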

Please have a look and feel free to leave feedback here or, better, in the PR itself.

The working group will now proceed with voting and then we will make the NNS motion proposal. The plan is to have the vote on the motion proposal before the end of the year.

9 Likes

Hi, when can I expect ICRC-3 to be finalized or moved out of the draft state?
If that's not happening in the near future, what interface standard should I follow?

1 Like

The inner block schemas: couldn't they be Candid as well? Just a binary blob we can decode later.

Usage will then be:

import Debug "mo:base/Debug";

// Where ICRC_Transfer is a custom type
let ?transfer = from_candid(block.content) : ?ICRC_Transfer else Debug.trap("err");
let from = transfer.tx.from;

Instead of using it like this:

import Array "mo:base/Array";
import Debug "mo:base/Debug";

let #Map(m) = block else Debug.trap("err");
let ?(_, tx) = Array.find<(Text, Value)>(m, func (k, _) = k == "tx") else Debug.trap("err");
let #Map(txm) = tx else Debug.trap("err");
let ?(_, from) = Array.find<(Text, Value)>(txm, func (k, _) = k == "from") else Debug.trap("err");

I guess that’s not only a question for ICRC-3; in general a lot of ICRCs are getting these generic variant schemas. By now every CDK should have a to_candid and from_candid equivalent. I think in size they will be equal, or the Candid version wins. In speed, running multiple Array.finds to get your values out will probably be slower too, and will require libraries to make it usable. @claudio Makes me wonder what the problem is with the first way of doing it, with binary Candid sub-schemas.

Off the top of my head a couple of reasons:

  1. While CDKs may have from/to candid, can you imagine calling dfx canister call my_service icrc3_get_blocks(...) and getting back binary?
  2. My understanding is that the binary representation of Candid is NOT guaranteed. It may change from spec to spec. Imagine a blackholed indexing canister scanning logs that all of a sudden stops being able to understand the blob format.
  3. Transfers are fairly flat and basic, but future transactions for unknown services may be deeply nested. Giving a strongly typed language some method of parsing through records of unknown Candid types and making sense of them has long-term value.
  4. The types in Value have direct parallels to the calculation of the representation-independent hash that is used to generate the ‘blockchain’ of these transaction logs (a rough sketch follows this list).
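
To make reason 4 concrete, here is a rough Motoko sketch of the representation-independent hashing as I read the draft (the spec text is authoritative). sha256 is assumed to be supplied by a package, since mo:base has none, and the #Int case (SLEB128) is omitted for brevity; Value is as sketched earlier in the thread.

import Array "mo:base/Array";
import Blob "mo:base/Blob";
import Buffer "mo:base/Buffer";
import Debug "mo:base/Debug";
import Nat "mo:base/Nat";
import Nat8 "mo:base/Nat8";
import Order "mo:base/Order";
import Text "mo:base/Text";

// Unsigned LEB128 encoding of a Nat: 7 bits per byte, high bit = "more".
func leb128(n : Nat) : [Nat8] {
  var num = n;
  let buf = Buffer.Buffer<Nat8>(5);
  loop {
    let byte = Nat8.fromNat(num % 128);
    num := num / 128;
    if (num == 0) { buf.add(byte); return Buffer.toArray(buf) };
    buf.add(byte + 128);
  };
};

// Lexicographic comparison of byte strings, used to sort #Map entries.
func compareBytes(a : [Nat8], b : [Nat8]) : Order.Order {
  var i = 0;
  while (i < a.size() and i < b.size()) {
    if (a[i] < b[i]) return #less;
    if (a[i] > b[i]) return #greater;
    i += 1;
  };
  Nat.compare(a.size(), b.size());
};

func hashValue(sha256 : [Nat8] -> [Nat8], v : Value) : [Nat8] {
  switch v {
    case (#Blob b) sha256(Blob.toArray(b));
    case (#Text t) sha256(Blob.toArray(Text.encodeUtf8(t)));
    case (#Nat n) sha256(leb128(n));
    case (#Int _) Debug.trap("SLEB128 encoding omitted in this sketch");
    case (#Array a) {
      let hashes = Array.map<Value, [Nat8]>(a, func (x) = hashValue(sha256, x));
      sha256(Array.flatten<Nat8>(hashes));
    };
    case (#Map m) {
      // hash(key) || hash(value) per entry, sorted, then hashed together.
      let entries = Array.map<(Text, Value), [Nat8]>(m, func (k, x) =
        Array.flatten<Nat8>([
          sha256(Blob.toArray(Text.encodeUtf8(k))),
          hashValue(sha256, x)
        ])
      );
      sha256(Array.flatten<Nat8>(Array.sort(entries, compareBytes)));
    };
  };
};

The point is that every case of Value maps one-to-one onto a hashing rule, which an opaque Candid blob would not give you.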

One suggestion for dealing with them is to have some solid helper libraries. This is roughly what I was attempting to do with GitHub - icdevsorg/candy_library at 0.3.0-alpha, which is a supertype of Value and should give some nice helpers (although there are a few to add). There is also GitHub - ZhenyaUsenko/motoko-candy-utils which has the beginnings of a path syntax.

import Utils "mo:candy_utils/CandyUtils";
import { get; getShared; getAll; getAllShared; path } "mo:candy_utils/CandyUtils";
import D "mo:base/Debug";

let #Nat(myAmount) = getShared(block, path("tx.amt")) else D.trap("err");

Some work to do on it yet…and I think I may need to add some #Map functions for candy v3.

And also, yes…for known types…we should likely have a parser that takes the Value and produces a Candid version of it (a rough sketch below).
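
A minimal hedged sketch of that direction, using a hypothetical lookup helper over #Map entries and a parser for a known transfer shape (field names follow the draft schema, Account decoding elided, Value as sketched earlier in the thread):

import Array "mo:base/Array";

// Hypothetical helper: look up a key among a #Map's entries.
func lookup(m : [(Text, Value)], key : Text) : ?Value {
  switch (Array.find<(Text, Value)>(m, func (k, _) = k == key)) {
    case (?(_, v)) ?v;
    case null null;
  };
};

type ParsedTx = { amt : Nat; fee : ?Nat };

// Sketch of a parser turning a generic block into a typed record.
func parseTx(block : Value) : ?ParsedTx {
  let #Map(m) = block else return null;
  let ?(#Map(tx)) = lookup(m, "tx") else return null;
  let ?(#Nat(amt)) = lookup(tx, "amt") else return null;
  let fee = switch (lookup(tx, "fee")) {
    case (?(#Nat(f))) ?f;
    case _ null;
  };
  ?{ amt; fee };
};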

1 Like

My understanding is that the binary representation of candid is NOT guaranteed. It may change from spec to spec.

I don’t think the binary format will change. We may add some more types in the future, but that won’t change the existing format. However, the binary representation of a fixed value is not unique: different implementations can produce different raw bytes. So we cannot compute the hash based on the binary data.

In Rust, we have an IDLValue type which allows decoding the binary data directly without using the subtyping rule. The variant is roughly equivalent to the Value type defined here. But manipulating a generic value like IDLValue/Value is always verbose and inconvenient. Ideally, we could convert the generic value into something native in the host language, which would make accessing the data easier.

1 Like

Thanks for taking the time to explain!

I’m not using dfx call myself, but yes, that will be a problem. It will also be a problem for dashboards, but that is fixable: there can be a repository with schema id → candid mappings, with the schema id stored next to the binary. Clients like dfx and the dashboard can fetch schemas and use them automatically. Making a registry will be a problem though - who governs it, how schemas get added, how to test, etc.

That’s good to know.
Well, hashing could just work the same way, by operating on values after decoding the binary Candid blocks. Schema makers would need to ensure they do not use unhashable types.
We could probably have a function that decodes unknown Candid to { _23423 : { _634534 : [234,2342] } } and hashes that in a way that guarantees uniformity across implementations?

Please, let’s do that if we are not going for the Candid option. I’m still not convinced that generic values are better, but this will make it a lot less of a problem.

Hi,
FYI DFINITY has decided to take a bit more time to review the ICRC-3 standard. This means we will wait a bit longer to make the motion proposal. Sorry for the inconvenience.

3 Likes

It would be great if, when fetching transaction history, a canister only interested in its own history didn’t have to go through everyone else’s.
Something like this could work: we still have the whole log, but if we pass a slot-range parameter we only get what’s in that slot range. There can be 1024 slots. The slot a transaction gets added to depends on the hash of the from and to owner : Principal. This means every tx gets added to the full log and to two slot logs (probably no need for phash in the slot logs). Then our canister can just process 1/1024 of the whole log and be certain it’s getting all transactions from and to it (see the sketch below).
Probably good for the indexer too - it can become multi-canister.
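
A minimal sketch of the slot assignment I have in mind, assuming mo:base’s 32-bit Principal.hash (a production design would likely specify a stronger hash):

import Principal "mo:base/Principal";
import Nat32 "mo:base/Nat32";

// Hypothetical slot assignment: hash the account owner's principal and
// reduce it modulo 1024 slots.
let slots : Nat = 1024;

func slotOf(owner : Principal) : Nat {
  Nat32.toNat(Principal.hash(owner)) % slots;
};

// Every tx is appended to the full log plus the slot logs
// slotOf(from.owner) and slotOf(to.owner).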

I don’t think we want to mix up ICRC-3 and indexing. An indexing ICRC is a great idea, and designing it in a way that a ledger can host the index would be ideal.

The trade-off to be wary of is speed and agility of the head ledger canister vs. robust functionality. We’ve erred on moving functionality off to other canisters which, given the proper trust assumptions, should be OK as long as the head ledger canister doesn’t rely on the extra functionality itself (because you lose atomicity).

Since ICRC-3 generates an immutable record, it should be able to report an indexing canister tree that is sufficient for reporting back just your transactions.

Hi Mario,
Do you think it’s a good idea to follow the ICP Ledger methods for getting the transaction data for now?

1 Like

The ICP Ledger won’t be able to support ICRC-3 no matter what because of AccountIdentifier. For the ICP Ledger, I suggest using query_encoded_blocks. If you are fetching blocks from outside the IC then remember to validate them. The approach is very similar to ICRC-3 except that the certificate is embedded in the response of query_encoded_blocks when you query a suffix of the Ledger. You can see an example in the ICP Index canister’s build_index function.

For ICRC Ledgers (ckBTC, SNSes, …) you can already use something very similar to ICRC-3. The endpoint get_blocks is essentially the same as icrc3_get_blocks, while get_data_certificate is the same as icrc3_get_tip_certificate (simplified shapes sketched below). You can see an example in the ICRC Rosetta prototype’s sync_from_the_tip function.
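
Roughly, the certificate endpoints line up like this (simplified and hedged; consult the ledger .did files for the authoritative signatures):

// Simplified shape of the get_data_certificate result; the ICRC-3
// icrc3_get_tip_certificate returns essentially the same data.
type DataCertificate = {
  // Signature over the certified tree root; only available in query calls.
  certificate : ?Blob;
  // CBOR-encoded hash tree containing the last block index and hash.
  hash_tree : Blob;
};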

2 Likes

It’s not really indexing, more like storing txs in 1024 logs instead of one. On each transaction, we will probably have 2 principal->slot computations and 2 more Vec add operations. If that is taking too many resources - ok.
When another canister (indexer) does the splitting, it will be at least one inter-canister call slower. The indexer has to somehow launch tens of icrc3_get_blocks queries simultaneously to be able to keep up. How many can it launch when lagging behind? 50, 100? Splitting the log eliminates the need for that, at least for canisters interested in their own transactions.

The issue is that it adds complexity to the Ledger. Using more resources is fine, adding complexity is not. Our philosophy is to keep the Ledger as simple as possible.

I also see another issue. Some applications/canisters may need to fetch the full log (e.g. Rosetta). How do you serve it? Do you have one additional log containing everything?

If the indexer is in the same subnet then it should work. The index may lag behind, but it should catch up after a few seconds. For instance, the ICP Index is behind the Ledger by 25 to 150 blocks every second but it quickly catches up. Consider that the ICP Index is simple and queries batches of blocks sequentially (a sketch of such a loop is below), so it could be further improved by running multiple requests simultaneously.
The question is whether it matters for a consumer to wait a few more seconds to get newer blocks.
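
For illustration, a hedged sketch of such a sequential catch-up loop; names are illustrative and the archived_blocks callbacks of the real icrc3_get_blocks response are elided:

actor Indexer {
  type Value = {
    #Blob : Blob;
    #Text : Text;
    #Nat : Nat;
    #Int : Int;
    #Array : [Value];
    #Map : [(Text, Value)];
  };

  // Illustrative ledger interface; archived_blocks callbacks elided.
  type Ledger = actor {
    icrc3_get_blocks : ([{ start : Nat; length : Nat }]) -> async {
      log_length : Nat;
      blocks : [{ id : Nat; block : Value }];
    };
  };

  stable var nextIndex : Nat = 0;

  // Fetch batches sequentially until the tip is reached.
  public func catchUp(ledger : Ledger) : async () {
    label sync loop {
      let res = await ledger.icrc3_get_blocks([{ start = nextIndex; length = 1000 }]);
      if (res.blocks.size() == 0) break sync;
      for (b in res.blocks.vals()) {
        // store(b.id, b.block); // append to the local index here
        nextIndex := b.id + 1;
      };
      if (nextIndex >= res.log_length) break sync;
    };
  };
}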

Another advantage of the indexers is that they can be distributed by account.

It’s not a problem for us unless there are thousands of tx/s for long periods of time. I guess that’s a problem for another day.
We are currently using get_transactions in our contracts (on another subnet). (Please look it up if you have time - the DeVeFi ledger middleware.) The indexer is currently made to fit end-consumer wallets. If it could return information about all subaccounts of a principal and also allow a from - to range to be specified, not just return the last N blocks, we could probably use it instead.
I wonder if it will be a problem if hundreds of canisters send get_transactions/icrc3_get_blocks calls every second from different subnets.

Can’t we make ICRC-3 also support AccountIdentifier?

We would likely need an alternate schema defined for legacy ICP-style ledgers (ICP and OGY are the only two I know of). ICRC-3 logs are extensible, so we could propose an alternative schema for those that looks a lot like ICRC-1/standards/ICRC-3 at icrc-3 · dfinity/ICRC-1 · GitHub, but where a hypothetical ICRC-57 LegacyAccount is represented as a #Blob of the account-id bytes, with ops of 57xfer, 57mint, and 57burn (illustrated below). Whether or not the legacy ledger would support this would be up to DFINITY to decide and support. It would be further complicated by the fact that the current “blockchain” on these ledgers probably doesn’t use this schema, and thus it would take an extraordinary effort to ‘replay’ the chain and calculate the ICRC-57 certificate, which would need to coexist with the classic certificate (which I think we’ve accounted for in the proposal, but it is still a bit stinky to think about having to support).
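
Purely for illustration, such a block might render as a Value like this (ICRC-57, the op name, and the helper are hypothetical, as above):

// Hypothetical ICRC-57-style transfer block as a Value, with the legacy
// account represented as a #Blob of the account-id bytes.
func legacyXferBlock(phash : Blob, from : Blob, to : Blob, amt : Nat) : Value {
  #Map([
    ("phash", #Blob(phash)),
    ("tx", #Map([
      ("op", #Text("57xfer")),
      ("from", #Blob(from)),
      ("to", #Blob(to)),
      ("amt", #Nat(amt))
    ]))
  ]);
};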

1 Like

I don’t have enough context here, but using a Blob, though convenient, seems less safe to me and morally equivalent to using a string.

The producer and consumer need to agree on the format of the encoded value, perhaps Candid, but it could be any binary encoding.

When using candid, the producer and consumer need to further agree on the type of the encoded value. None of that is enforced by the contract, just like when using a text encoding.

I guess one could also contemplate a mixture of the variant format with a (Candid) blob extensibility point.

And as Yan points out, there are different Candid encodings for the same values, which people might not bear in mind when coding, and which lead to trouble with hashing, equality testing, etc.

We can, but the issue is not just the AccountIdentifier. The ICP Ledger uses its own type of blocks, encoding, and hashing, which are fundamentally different from those of the ICRC Ledgers. The reason ICRC moved away from the approach used by the ICP Ledger is that it is not extensible and is difficult for clients to use.

1 Like

What do you think about putting an account: opt Account field in the GetBlocksArg type, so that when the field is set, the canister returns the blocks indexed for the specified account? This way the standard is compatible with both kinds of implementations: the accounts’-blocks-indexes can be stored on the ledger itself like @infu mentioned, or on a different canister like the SNS index canisters.

In the ICRC-3 GetBlocksResult, the canister returns the blocks that it has, plus callback functions pointing to where the caller can get the other blocks. This callback is used to point the caller towards the archive canisters, but it can also point back to the ledger itself with the next chunk if the ledger canister has more blocks than fit in a single message, or it can point to an index canister which can then point back to itself with the chunks. So ICRC-3 as it stands is compatible with both implementation types (the ledger holds all the blocks, or the ledger creates archive canisters that hold them), and the same compatibility can cover the index-canister functionality by adding the optional field.

When the field is set, the canister returns the blocks indexed for that account within the range of start and len, or uses the callback in the GetBlocksResponse to point to another canister (like the SNS index canister) that can return them (sketched below).
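
Sketched as a type, the suggestion might look like this (hypothetical; the account field is NOT part of the current draft):

type Account = { owner : Principal; subaccount : ?Blob };

// GetBlocksArg as in the draft, plus the suggested optional filter field.
type GetBlocksArg = {
  start : Nat;
  length : Nat;
  // When set, return (or point to, via the callback) only the blocks
  // indexed for this account within the requested range.
  account : ?Account;
};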

There are many other possible filters: to, from, date, amount range, memo, schema (approve, xfer, etc.).

These would all be interesting. In the token call today we discussed that an add-on standard could add a filter function. The lift to do this is non-trivial, so I’d recommend a separate, optional function with a bit more complexity and leave the straight-line function intact. Not a strong opinion.