Ledger & Tokenization Working Group Update

Are there meeting minutes of the WG meeting for those who weren’t able to attend?

What was the outcome?

Regarding the textual encoding: no objections to the encoding with explicit subaccount length and “not a principal” marker. Everyone agreed that having short subaccounts with automatic padding might be useful. We can specify that encoders are allowed to do such space optimizations, and decoders must respect them.

Ok, so encoders can use the abbreviation with automatic padding or can choose not to? That means there is no uniqueness, or, more precisely, we can have anywhere between 1 and 33 different encodings for the same account (because the encoder can choose to abbreviate any number of trailing zero bytes).
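To make the 1–33 count concrete, here is a minimal Python sketch (the function name `possible_encodings` is made up for illustration): if an encoder may drop any number of trailing zero bytes from a 32-byte subaccount, a subaccount with k trailing zeros has k+1 valid abbreviations, so the all-zero default subaccount has 33.

```python
# Hypothetical illustration: enumerate the subaccount byte strings an
# encoder could emit if it may drop any number of trailing zero bytes.
def possible_encodings(subaccount):
    assert len(subaccount) == 32
    # Count trailing zero bytes.
    k = 0
    while k < 32 and subaccount[31 - k] == 0:
        k += 1
    # The encoder may drop 0..k trailing zeros, giving k+1 variants.
    return [subaccount[: 32 - d] for d in range(k + 1)]

print(len(possible_encodings(bytes(32))))         # 33 (default subaccount)
print(len(possible_encodings(bytes(range(1, 33)))))  # 1 (no trailing zeros)
```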

Regarding the choice of marker for “not a principal” I think we should also inform and ask people outside of the ICRC-1 group.

Yes, but the encoding is already non-unique in the presence of default subaccounts. The WG agreed that we should be flexible in the encodings that we accept and advise applications not to use textual representations as keys (in a map/database, etc.), but rather decode the strings and normalize them. That’s what the ledger has to do as well.
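A minimal Python sketch of the normalization step recommended here (the helper name `canonical_subaccount` and the choice of `None` for the default are illustrative, not from the spec): decode the string, pad the subaccount, and collapse the all-zero case before using the result as a key.

```python
# Hypothetical normalization: pad a short (possibly absent) subaccount to
# 32 bytes, and treat the all-zero subaccount as the default (None), so
# all encodings of the same account map to one canonical key.
def canonical_subaccount(sub):
    padded = (sub or b"").ljust(32, b"\x00")
    return None if padded == bytes(32) else padded
```

With this, `(principal, canonical_subaccount(decoded_bytes))` is a safe map key regardless of which of the 1–33 abbreviations the sender used.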

Yes, I’ve talked to @bjoern about it before the WG meeting and he was cool with that. I’ll prepare a spec change.

Is that so easy? Thinking for example about a block explorer where you paste in the account identifier and then get back all transactions of it. So then I would paste in one encoding but in the transaction list that I see it may then appear in a different encoding? Or does it appear in a completely different way, as a pair with the subaccount id bytes exposed? Might be confusing. Whether the pair is exposed or the textual encoding is shown can be easily solved with a toggle. But not the same way between different textual encodings.

Made a general post about the topic here: Using the principals' textual encoding for other things than principals

I also started thinking about block explorers and how users might be confused because some app didn’t apply an optimization, but another did.

Then we have to enforce optimizations on the spec level and reject unoptimized encodings from the very start. This will give us uniqueness, but will break some natural properties, such as

∀ a ∈ Account : decodeAccount(encodeAccount(a)) = a

I think that’s not a huge problem, but it would be nice to hear more opinions, e.g., from @bogdanwarinschi.

What is a here? The pair (principal, 32 bytes)? Then the property isn’t broken.

The pair is (principal, opt blob). Optimizing (principal, opt DEFAULT) to (principal, null) is required for uniqueness, unless we remove the requirement that (principal, opt DEFAULT) is the same as (principal, null) (which would be incompatible with the ICP ledger).

Maybe I don’t understand. The problem with the default account was always there, wasn’t it? What changes with respect to it?

I am trying to understand the statement:

Why does enforcing an optimization break a property that held before? The problem was already there without the optimization, or?

We would basically have to say that (principal, 32*0x00) is not a valid account, only (principal, null) is. If you want to have the cited property, the encoder has to reject (principal, 32*0x00).

But maybe I’m misunderstanding something. What does (principal, opt DEFAULT) stand for by the way?

That’s the same as (principal, 32*0x00).

If we want the uniqueness property, we have to enforce all optimizations, including the substitution of (principal, null) for (principal, 32*0x00). So decode(encode(principal, 32*0x00)) is equal to (principal, null), which is semantically equivalent to (principal, 32*0x00), but structurally different (most programming languages will consider these two values non-equal). I think we can live with that, but that’s what I meant by breaking a natural property. In the original proposal, decode(encode(principal, 32*0x00)) = (principal, 32*0x00).
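The semantic-vs-structural distinction can be sketched in a few lines of Python (the `normalize` helper and the placeholder principal text are made up for illustration):

```python
# Sketch: with the enforced optimization, decoding always yields the
# (principal, None) form for the default subaccount, which is
# semantically the same account as (principal, 32*0x00) but a
# structurally different value.
DEFAULT = bytes(32)

def normalize(account):
    principal, sub = account
    return (principal, None) if (sub is None or sub == DEFAULT) else (principal, sub)

a = ("aaaaa-aa", DEFAULT)  # explicit default subaccount
b = ("aaaaa-aa", None)     # omitted subaccount

print(normalize(a) == normalize(b))  # True: semantically equal
print(a == b)                        # False: structurally different
```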

Then we lose another natural property: that encoding is a total function.


Well, within Candid types and Motoko types it is not total anyway, because blob/Blob doesn’t limit the size to 32 bytes. So encode will always reject some values that pass the type check. So we could also reject 32*0x00.

Anyway, remind me why we want the account type to be (principal, opt blob) rather than (principal, blob)? Is it just more convenient to make calls by hand with dfx in the null case? Or does it make client code look nicer when dealing with default accounts?

Both of these + the fact that most users don’t need to care about subaccounts.
And the killer feature IMO is that this way each principal can be a valid account.

Yes, but on which level? You mean (principal) is a subtype of (principal, opt blob)? So we don’t even have to write “null” anywhere?

Ok, I think I see your point. You propose that we make (principal, blob) the canonical form, so the blob always has to be specified in the ledger interface, but the ledger and the client libraries can take liberties in how they store the “default” value?

I see the null case as a space and usability optimization for the most common case. We can live without it, but it’s a bit hard to go back and change ICRC-1 to make subaccounts required.

Can we just live with the ambiguity of the default account and with not having some natural properties for this special case?

I don’t see any serious issues with that, I’ll update the spec to require unique textual representation.

Alternatively, I think it is also acceptable to tell users that all zeros isn’t a valid subaccount. They can either use the default account or they can specify a non-zero subaccount. It makes sense, and the worry that a function isn’t total seems minor. What the internal representation does with that is up to the developer. They can represent the default by all-zeros or in some other way. That is not visible to the user. But the encode and decode functions have to throw an error if you feed in what would result in an all-zero subaccount.
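A minimal Python sketch of such an encoder-side guard (the function name `check_subaccount` is hypothetical): the explicit all-zero subaccount is rejected at the encoding boundary, while the internal representation remains the developer’s choice.

```python
# Hypothetical guard for the rule proposed above: callers either omit
# the subaccount (default account) or pass a non-zero 32-byte one; an
# explicit all-zero subaccount is rejected.
def check_subaccount(sub):
    if sub is None:
        return None  # default account
    if len(sub) != 32:
        raise ValueError("subaccount must be exactly 32 bytes")
    if sub == bytes(32):
        raise ValueError("all-zero subaccount is not valid; omit it instead")
    return sub
```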


I’ve updated the spec; valid textual representations are unique now. I’ve also sketched a reference implementation in Motoko.

Could we make the special trailing byte 0x7f instead of 0xff?

If the use of principals proliferates and more values of that byte get meanings assigned to them, then we don’t close off the option of expanding the space via LEB encoding of those values.
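The LEB point can be illustrated with a short unsigned-LEB128 sketch in Python: 0x7f is the largest value whose ULEB128 encoding is a single byte, so reserving 0x7f as the marker leaves multi-byte encodings (0x80 0x01 and up) available for future marker values, whereas 0xff can never appear as a complete single-byte ULEB128 value.

```python
# Unsigned LEB128 encoder: 7 value bits per byte, high bit set on all
# bytes except the last.
def uleb128(n):
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        out.append(byte | (0x80 if n else 0))
        if not n:
            return bytes(out)

print(uleb128(0x7F).hex())  # '7f'   -- largest single-byte encoding
print(uleb128(0x80).hex())  # '8001' -- next value needs two bytes
```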