Are two canisters for an ICRC ledger suite (ledger and index) still necessary?

The ICRC ledger suite, consisting of a ledger and an index canister, was originally designed when canister memory was more limited.

Now that canisters have 500GB of memory available to them, could it make sense to merge index canisters into the ledger canister?

Pros:

  • atomic index updates
  • significantly decreased cycle costs (no polling index canister timer)
  • simpler integration UX (just have to import or link to one canister, instead of two)

Cons:

  • Moves away from the microservices approach by coupling ledger and index functionality; more risk if a bad update is pushed
  • Would require a new standard
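
To make the "atomic index updates" point concrete, here is a minimal Python sketch. It is not the actual ledger or index canister code, and every class and method name in it is hypothetical. It contrasts a separate index that only catches up when its poll (standing in for the timer) fires with a merged ledger that updates the per-account index in the same call as the transfer.

```python
from collections import defaultdict


class Ledger:
    """Toy append-only ledger; stands in for the ledger canister."""

    def __init__(self):
        self.blocks = []  # each block is (from_account, to_account, amount)

    def transfer(self, src, dst, amount):
        self.blocks.append((src, dst, amount))
        return len(self.blocks) - 1  # index of the new block


class PollingIndex:
    """Separate index that only syncs when poll() is called (e.g. by a timer)."""

    def __init__(self, ledger):
        self.ledger = ledger
        self.synced = 0  # number of ledger blocks already indexed
        self.by_account = defaultdict(list)

    def poll(self):
        for i in range(self.synced, len(self.ledger.blocks)):
            src, dst, _ = self.ledger.blocks[i]
            for account in {src, dst}:  # set avoids double entries on self-transfers
                self.by_account[account].append(i)
        self.synced = len(self.ledger.blocks)


class MergedLedger(Ledger):
    """Hypothetical merged design: the index is updated inside transfer()."""

    def __init__(self):
        super().__init__()
        self.by_account = defaultdict(list)

    def transfer(self, src, dst, amount):
        i = super().transfer(src, dst, amount)
        for account in {src, dst}:
            self.by_account[account].append(i)
        return i
```

With the separate index, a transfer is invisible to index readers until the next `poll()`; with the merged variant, the index entry exists as soon as `transfer()` returns.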

More generally, given the scalability improvements of the past few years, I’d be curious to understand whether the ledger would be designed any differently if it were redesigned today.

We’re still at a 4GB limit for Motoko, aren’t we (unless we write stuff to stable memory)?

My concern with 500GB of ledger is >4GB of indexes.

I’m all for a simpler solution. I’ve been pushing for a GENERIC ICRC_3 index canister that can index any ICRC-3 ledger, not just token ledgers. Since it is indexing loads of data and indexes are large, perhaps we can stripe it across multiple canisters/subnets but have a unified interface. Since it is indexing multiple archive canisters we can call it CanDB😬.

(but we should really do this)

On another note, I’d nominate ICRC-72 for reducing costs (and to get notification of transactions back on the table… I’m already working on it).

As for simple UX, we have been talking about adding a metadata endpoint to the ledger that points to its index canister, which would be helpful. I think this is "coming soon"™

Found it: Merge pull request #116 from dfinity/bw/indexdiscovery · dfinity/ICRC@904fbb1 · GitHub

I’m not sure I follow what you mean by load distribution. All transactions still hit the ledger, which is a single canister.

Can you elaborate on what you mean by independent scaling? Currently a single canister can grow up to 500GB, which is half the capacity of a single subnet (1TB).

Transactions still hit the ledger, but requests for e.g. account balances or the TX history of an account can be offloaded to an indexer

very interesting question. personally I think it should not be merged. I would even question whether I would want to run the indexer on-chain, at least until there is a good mechanism to avoid polling. I see too many downsides here in terms of maintenance costs and usability for data consumers and their end users.

I would generally like to know how important you think on-chain indexing is for apps consuming the data vs. using an off-chain indexer that does the same. certainly the benefit of on-chain indexing is that you can be 100% sure the served data is correct and not manipulated. if I were building a wallet or dapp that needs that data, I would most likely prefer a traditional (central) API that serves the data in the most efficient way possible. we should also keep in mind that API consumers would probably prefer one API call over aggregating several different API calls.

are there any concrete use cases besides being able to 100% trust the data?

one simple example of the drawbacks is how long it takes to fetch token-related data when opening OISY or other products that consume the on-chain data directly.

this is in the works as @skilesare mentioned 👍

just want to follow up here and say that if you need to be sure to get the current balance, you should call the ledger directly for that unless the indexer is atomically updated (which is not the case)

or alternatively run the indexing service for supported tokens on my own if I want to be sure it is correct and/or if the API provider is too expensive.

Ah, these are query calls though, and since update and query calls use different threads, these queries shouldn’t impact transaction throughput, right?

Another benefit of having the indexer on chain is that a canister can integrate with the index canister directly. It is much harder to integrate with, or trust, an off-chain service. I’m not arguing with the cost/efficiency benefits of having this indexing happen off-chain. But currently, this can all be done in a single canister.

Some comparisons:
The ICP ledger canister currently holds 4 GiB of state.
The ICP index canister currently holds ~7.94 GiB of state.

The OpenChat ledger canister currently holds 61 MiB of state.
The OpenChat index canister currently holds 420 MiB of state.
There’s also 1 archive canister that holds 112 MiB of state.

So if you have a super frequently used ledger, this means heap memory (limited to 6GB) needs to increase or you need to store this data in stable memory. But most single-token ledgers seem like they’d be fine for a few years before having to migrate to stable memory. That migration isn’t ideal, but a single canister seems decent to start with until we get more canister heap memory 🤷‍♂️
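
A quick back-of-the-envelope check of the figures above, as a Python sketch. It assumes the "6GB" heap limit means 6 GiB, takes the "~7.94 GiB" figure at face value, and leaves the archive canister out of the merged total; the function and variable names are just illustrative.

```python
MIB_PER_GIB = 1024

# State sizes in MiB, taken from the comparison above.
suites = {
    "ICP": {"ledger": 4 * MIB_PER_GIB, "index": 7.94 * MIB_PER_GIB},
    "OpenChat": {"ledger": 61, "index": 420},
}

# Assumption: the 6GB heap limit is treated as 6 GiB.
HEAP_LIMIT_MIB = 6 * MIB_PER_GIB


def summarize(sizes):
    """Index-to-ledger ratio and whether ledger + index would fit in heap."""
    merged = sizes["ledger"] + sizes["index"]
    return {
        "index_to_ledger_ratio": sizes["index"] / sizes["ledger"],
        "merged_mib": merged,
        "fits_in_heap": merged <= HEAP_LIMIT_MIB,
    }


for name, sizes in suites.items():
    print(name, summarize(sizes))
```

Under these assumptions the index is roughly 2x the ledger for ICP and roughly 6.9x for OpenChat. A merged ICP canister (about 11.9 GiB) would already exceed the heap limit, while a merged OpenChat canister (481 MiB) would fit comfortably.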

100% agree. but I am actually wondering about the use cases where a canister directly integrates with an index canister on-chain. personally I cannot really think of one at this point. but I am eager to learn in what case that is desired by devs out there.

my point is more coming from a DX and UX perspective for the most common use-cases:

  • which tokens does the user’s principal hold?
  • what are the balances of the user’s principal on each ledger?
  • what is the (global) token tx-history / activity log of a user’s principal?

to me it seems that this is currently kind of a pain. you need to fetch the data from multiple different index canisters. additionally, an index canister might not even be available for these ledgers. from my point of view, the data is typically only needed in the UI.

potentially you can of course create a multi-ledger indexer on-chain. I am not aware of such an attempt yet and I am also not sure how feasible this really is.
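
For illustration, a rough Python sketch of the fan-out a consumer (or a multi-ledger indexer) would have to do to answer the three questions above. Everything here is hypothetical: the `IndexClient` interface, its `transactions_of` method, and the in-memory stand-in are not a real canister API, and balances are naively derived by summing signed amounts (fees ignored). The point is only the aggregation step, including the case where a ledger has no index canister.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Tx:
    ledger: str
    timestamp: int
    amount: int  # signed: negative means outgoing


class IndexClient(Protocol):
    """Hypothetical per-ledger index interface."""

    def transactions_of(self, principal: str) -> list[Tx]: ...


class InMemoryIndex:
    """Stand-in for one index canister, backed by a dict."""

    def __init__(self, txs: dict[str, list[Tx]]):
        self._txs = txs

    def transactions_of(self, principal: str) -> list[Tx]:
        return list(self._txs.get(principal, []))


def portfolio(principal: str, indexes: dict[str, Optional[IndexClient]]):
    """Query every ledger's index and merge the results into one view."""
    balances, history, unindexed = {}, [], []
    for ledger, index in indexes.items():
        if index is None:  # no index canister available for this ledger
            unindexed.append(ledger)
            continue
        txs = index.transactions_of(principal)
        if txs:
            balances[ledger] = sum(t.amount for t in txs)
            history.extend(txs)
    history.sort(key=lambda t: t.timestamp, reverse=True)  # newest first
    return balances, history, unindexed
```

Each entry in `indexes` costs one call per ledger, which is the aggregation overhead described above; a ledger mapped to `None` simply ends up in `unindexed` and contributes nothing to the history.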
