Announcing "Token Standard" as topic of the first meeting of the Ledger & Tokenization Working Group

@Maxfinity, thank you for your last-minute change proposal w.r.t. the textual representation. @roman-kashitsyn, @benji, and I just had a good discussion on the proposal and what has been discussed in the forum by @Maxfinity, @timo and others.

Here’s a summary of our findings.

The properties we want to achieve

  • A textual encoding of any non-reserved principal is a valid textual encoding of the default account of that principal on the ledger.
  • The decoding function is injective (i.e., different valid encodings correspond to different accounts). This property enables applications to use text representation as a key, for example in a map.
  • Protection against copy-paste errors or typos
  • Human readability (particularly the ability to identify the subaccount with the naked eye)

Approach

We concluded that the most suitable representation meeting those properties is the one presented in the following examples (note that the principal contains a checksum over itself):

  • 4kydj-ryaaa-aaaag-qaf7a-cai (default subaccount = principal)
  • 4kydj-ryaaa-aaaag-qaf7a-cai:1 (simple subaccount, no checksum on subaccount)
  • 4kydj-ryaaa-aaaag-qaf7a-cai:3fCe35D21Aa8 (complex subaccount, contains checksum over the whole 2-tuple through the case of letters in the hexadecimal representation of the subaccount)

Informal specification

Let f be the textual encoding function specified as follows, where || is string concatenation. Let principal be a principal in textual representation and subaccount a subaccount in byte array representation.

f(principal, subaccount) := principal || “:” || chk(principal || “:” || hex(subaccount), hex(subaccount))

chk(a, b) is a checksum function that sets the case of the letters in b based on the SHA-256 hash of a. The string b is a canonicalized hexadecimal string, i.e., comprising only the characters [0…9, a…f]. The string a is hashed with SHA-256 to obtain h. For each character with index i in b, the result contains that character in uppercase if it is a letter and the 4*i-th bit of h is 1, and in lowercase otherwise. Digits are copied from b to the result unchanged. I.e., we capitalise the letter symbols of b in the output based on the hash h of a. This is analogous to Ethereum address checksums (EIP-55).
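To make the description of chk concrete, here is a minimal Python sketch. It is only an illustration under assumptions that the informal specification does not settle: a is encoded as ASCII before hashing, and the bits of h are numbered most-significant first (which corresponds to the "hash hex digit >= 8" rule used in EIP-55 implementations).

```python
import hashlib

HEX_LETTERS = "abcdef"


def chk(a: str, b: str) -> str:
    """Set the case of the letters in b from the SHA-256 hash of a.

    Assumptions (not fixed by the proposal): a is ASCII-encoded before
    hashing, and bit 0 is the most significant bit of the first hash byte.
    b must be lowercase hex with at most 64 characters, so 4*i < 256.
    """
    h = hashlib.sha256(a.encode("ascii")).digest()
    out = []
    for i, ch in enumerate(b):
        bit_index = 4 * i
        byte = h[bit_index // 8]
        bit = (byte >> (7 - bit_index % 8)) & 1
        # Only letters can carry a checksum bit; digits are copied as-is.
        if bit == 1 and ch in HEX_LETTERS:
            out.append(ch.upper())
        else:
            out.append(ch)
    return "".join(out)
```

Because only the letters a…f can change case, a string b consisting solely of digits comes back unchanged, which is why simple subaccounts carry no checksum.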

In words

  • The encoding is created from the principal, followed by a colon, followed by the subaccount
  • The subaccount is defined as follows:
    • Take the subaccount in hexadecimal representation with leading zeroes stripped
    • Compute a checksum over the principal and subaccount and represent it through the case of the letters in the hexadecimal representation of the subaccount (the principal remains untouched)
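The steps above can be sketched end to end as follows. This is a hedged illustration, not a reference implementation: the function names encode_account and decode_account are hypothetical, subaccounts are assumed to be 32 bytes long, the checksum input is assumed to be the principal, a colon, and the stripped lowercase hex string encoded as ASCII, and hash bits are assumed to be numbered most-significant first.

```python
import hashlib

HEX_DIGITS = "0123456789abcdef"


def chk(a: str, b: str) -> str:
    # Uppercase the letter at index i of b if bit 4*i of SHA-256(a) is 1
    # (assumed MSB-first bit numbering, ASCII encoding of a).
    h = hashlib.sha256(a.encode("ascii")).digest()
    out = []
    for i, ch in enumerate(b):
        bit = (h[(4 * i) // 8] >> (7 - (4 * i) % 8)) & 1
        out.append(ch.upper() if bit and ch in "abcdef" else ch)
    return "".join(out)


def encode_account(principal: str, subaccount: bytes) -> str:
    """Encode (principal, subaccount) in the proposed textual form."""
    if len(subaccount) != 32:  # assumption: fixed 32-byte subaccounts
        raise ValueError("subaccount must be 32 bytes")
    hex_sub = subaccount.hex().lstrip("0")
    if not hex_sub:
        # Default (all-zero) subaccount: the principal alone.
        return principal
    return principal + ":" + chk(principal + ":" + hex_sub, hex_sub)


def decode_account(text: str) -> tuple[str, bytes]:
    """Decode and verify; rejects non-canonical forms to stay injective."""
    principal, sep, hex_sub = text.partition(":")
    if not sep:
        return principal, bytes(32)
    lower = hex_sub.lower()
    if not lower or lower[0] == "0" or len(lower) > 64:
        raise ValueError("non-canonical subaccount")
    if any(c not in HEX_DIGITS for c in lower):
        raise ValueError("subaccount is not hexadecimal")
    if chk(principal + ":" + lower, lower) != hex_sub:
        raise ValueError("checksum mismatch")
    return principal, bytes.fromhex(lower.rjust(64, "0"))


principal = "4kydj-ryaaa-aaaag-qaf7a-cai"
print(encode_account(principal, bytes(32)))
print(encode_account(principal, (1).to_bytes(32, "big")))
```

The two calls at the end print the principal alone and the principal followed by ":1", matching the first two examples in the post. The decoder rejects an empty subaccount, leading zeroes, and wrong letter casing, so each account has exactly one accepted textual form.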

Question / discussions

  • Do we need to include the “:” in the input to the checksum as a “domain separator”? It does no harm, but it is not clear whether it is really needed. If not, it should be removed.
  • Why does Ethereum EIP-55 use the 4*i-th bit and not just the i-th bit? For a good hash function, this should not make any difference.
  • Should we rather hash the byte-array representations? That might be cleaner and would be an easy change, but we need to discuss it.

What does this achieve?

  • The checksum is part of the resulting subaccount only if the hexadecimal representation of the subaccount contains letters, because digits have no case. “Simple” subaccounts like 1, 2, or 1234 therefore carry no checksum. For subaccounts derived through a hash function, e.g., SHA-224 or SHA-256, about 6 of every 16 hexadecimal digits are letters, so we have an expected ~21 or ~24 bits of checksum, respectively, expressed through the casing of the letters, which catches copy-paste errors with high probability.
  • We leave the principal untouched, so it can be easily compared via eyeballing.
  • We have a checksum over everything if the subaccount is not a “simple” subaccount. Having checksums over complex, long subaccounts addresses the requirements raised in the forum thread. Not having checksums over short, simple subaccounts seems OK in the light of the discussions.
  • Having upper/lowercase in the subaccount has not been seen as an issue so far (but also not explicitly addressed). @timo?
  • Users can still write simple subaccounts by hand, as these are not checksummed.
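The ~21 and ~24 bit figures for hash-derived subaccounts can be checked with a short calculation. The sketch below assumes the hash output is uniformly random, so each hexadecimal digit is one of the six letters a…f with probability 6/16, and it ignores the negligible effect of stripping leading zeroes. The function name is hypothetical.

```python
def expected_checksum_bits(hash_bits: int) -> float:
    """Expected number of letters (checksum bits) in the hex form of a hash.

    Assumes uniformly random output: each of the hash_bits / 4 hex digits
    is a letter (a-f) with probability 6/16.
    """
    hex_digits = hash_bits // 4
    return hex_digits * 6 / 16


print(expected_checksum_bits(224))  # 56 hex digits -> 21.0
print(expected_checksum_bits(256))  # 64 hex digits -> 24.0
```

With k checksum bits, a random corruption of the subaccount that is independent of the hash passes the check with probability of roughly 2^-k, assuming SHA-256 behaves like a random function.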

We think that this approach is the best compromise we can make given the discussion we have had so far in the forum. It has checksums where helpful, but skips them where we think that they are less required. We think this is a clear improvement over the previous proposal.

Please let us know what you think about going forward with this proposal. If we do not hear objections, someone needs to spec it properly and then we can open a vote on it. At least it seems that what we have now is strictly better than what we had before.
