Thoughts on the token standard

The more I think about this, the more difficult I see the problem of atomic cross-canister transactions. (Sorry, this will be a long post.)


Some background…

A common use case of tokens is for a user to spend some tokens to perform some action on the blockchain. For example, a user wants to spend 20 LINK to convert it to BAT on some DEX. This entire flow should be treated as a transaction.

There are 3 parties involved:

  • Spender (either user or canister)
  • Token canister
  • Receiver canister (e.g. the DEX canister in the above example)

Let’s abbreviate these parties as S, T, and R, respectively.

S always initiates the transaction, since they own the tokens. So S → T is the first step. The question is what happens next?

In ERC-20, the flow is S → T (approve), S → R (startAction, e.g.), and R → T (transferFrom).
In ERC-667, the flow is S → T (transferAndCall) and T → R (onTokenTransfer).

Notice that the caller-callee relationship for T and R is inverted in ERC-667 versus ERC-20. The benefit of ERC-667 over ERC-20 is one transaction for the spender instead of two, which means lower gas fees (on Ethereum) and a simpler UX. The drawback is that reentrancy may be an issue if implemented incorrectly.

@senior.joinu’s proposal to use pub/sub is an extension of the ERC-667 flow, where T calls multiple Rs and not just the receiver of the spent tokens. Generally speaking, it inherits the same properties as the simpler ERC-667 flow, so I don’t talk about it here.


So I tried porting over the transferAndCall function from ERC-667 onto IC in Motoko. Since updates that call other canisters are not atomic, I added a rollback in the event of a callback failure.

This is what it looks like:

actor Token {

    private stable var balances: Trie.Trie<Principal, Nat> = Trie.empty();

    private func _key(p: Principal) : Trie.Key<Principal> { { key = p; hash = Principal.hash(p) } };
    
    private func _getBalance(p: Principal): Nat {
        let balance = Trie.find(balances, _key(p), Principal.equal);
        switch (balance) {
            case (null) { 0 };
            case (?balance) { balance };
        };
    };

    private func _setBalance(p: Principal, value: Nat): () {
        balances := Trie.put(balances, _key(p), Principal.equal, value).0;
    };

    public shared(msg) func transferAndCall(to: Principal, value: Nat): async () {
        let fromBalance = _getBalance(msg.caller);
        let toBalance = _getBalance(to);

        assert fromBalance >= value;

        _setBalance(msg.caller, fromBalance - value);
        _setBalance(to, toBalance + value);

        // Skip this entire try block if `to` is a user principal and not a canister principal.
        // There may be no good way to determine that...
        try {
             let toActor = actor Principal.toText(to): actor {
                 onTokenTransfer: (from: Principal, value: Nat) -> async ()
             };
             await toActor.onTokenTransfer(msg.caller, value);
        } catch (err) {
            // Rollback in the event of callback error
            _setBalance(msg.caller, fromBalance);
            _setBalance(to, toBalance);

            // Can't assert (i.e. trap) instead of throw, otherwise rollback would be reverted
            throw Error.reject("transferAndCall failed");
        };
    };
};

In the distributed transactions world, I think this would be an example of the saga pattern, with the rollback called a “compensating transaction”.

Here’s the problem…

Isolation is poor.

For example, let’s say transferAndCall has updated the balances and is currently await-ing on the onTokenTransfer callback to complete. In the meantime, either the spender or the receiver queries their balance. Then, onTokenTransfer failed and the balances are reverted. The data queried is no longer accurate. This is a dirty read.

Even worse, someone could have tried making illegal transfers during that await-ing time, e.g. spending tokens they have at the time but won’t after onTokenTransfer eventually fails. But if they would’ve had enough tokens either way (had onTokenTransfer failed or succeeded), then their transfer would be reverted by the rollback, leading to a lost update.

How can we solve this?

  1. We could move the onTokenTransfer call to before we update the balances but after we assert. That way, we don’t need to roll anything back, because the balance updating code can’t fail and it’s also the last thing that happens. The problem is: what if onTokenTransfer actually needs to do something with those tokens, e.g. a DEX canister wants to swap those tokens for another type of token? Those tokens wouldn’t exist if we made the call before updating the balances.
  2. Alternatively, we could keep the current order, but keep tracking of “pending balances” in addition to the actual balances. Initially, we update the pending balances, and then we call onTokenTransfer. If that call succeeds, we “finalize” the pending balances by updating the actual balances. If that call fails, we revert the pending balances. This way, dirty reads don’t happen because they never see the pending balances, only the actual balances. Lost updates can be prevented by not proceeding if a pending transfer is in progress (not clear on the details here). This general approach is called semantic locking. The issue is that a) locking isn’t great, and b) the same issue as above, where the onTokenTransfer function wants to do something with its newly received tokens but can’t because they are still “pending” and not “finalized” here.

Not really sure what to conclude from all of this…

Am I overcomplicating things? Does an ERC-20 flow with approve and transferFrom avoid some of these problems with an ERC-667 flow? I’d be really interested to hear what others think.

5 Likes