Sharded Database/Multi Canister Architecture Discussion

I am deciding whether to build my database as a multi-canister architecture because of the 4 GiB Motoko canister storage limitation. I have some questions about this decision that I have not been able to answer.

  1. Is the best tactic for a multi-canister architecture to have one “management” canister that is responsible for creating, keeping a record of, and querying dynamically created canisters? If so, is the “management” canister at risk of becoming a bottleneck? Every request must first be routed to the management canister and then to the corresponding sub-canister, so the management canister will receive a lot of traffic. Are there any ways to prevent the “management” canister from becoming a bottleneck?
  2. If I use the above method to shard my database, how can I perform database-wide searches? If each canister only holds a subsection of the data, would I have to query all of the sharded canisters to conduct a full database search? This would be very inefficient and not very practical. Is there a good way to shard my database so it can be easily searched as a whole?
  3. I have read in the forum that enhanced orthogonal persistence could in the VERY near future allow for a Motoko canister to hold up to 400 GiB in heap memory. This would solve all of my problems. Can I expect this update within the next month or so? Is it worth waiting it out?
  4. If enhanced orthogonal persistence is still in the distant future, or if it will not actually provide 400 GiB of heap memory, is it worth it for me to look into the Motoko Stable Regions API? Can the Stable Regions API handle a lot of query and update traffic efficiently and at a reasonable cost? Is its extra cost and complexity worth it to avoid spinning up the equivalent of 100 canisters (debatably 200) to achieve the same storage functionality?
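On question 1, one common way to keep a management canister off the hot path is for the client to fetch the shard list once, cache it, and compute the target shard locally. The following is only a minimal sketch, assuming hash-based partitioning; `ShardRouter`, `fetch_directory`, and the shard IDs are hypothetical placeholders, not part of any IC library:

```python
import hashlib
from typing import Callable, List, Optional


def shard_index(key: str, shard_count: int) -> int:
    """Map a key to a shard using a stable hash (Python's hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardRouter:
    """Client-side router: asks the management canister for the shard list only once."""

    def __init__(self, fetch_directory: Callable[[], List[str]]):
        self._fetch_directory = fetch_directory  # hypothetical call to the management canister
        self._shards: Optional[List[str]] = None
        self.directory_calls = 0

    def _directory(self) -> List[str]:
        # Fetch and cache the shard list on first use.
        if self._shards is None:
            self._shards = list(self._fetch_directory())
            self.directory_calls += 1
        return self._shards

    def refresh(self) -> None:
        """Drop the cache, e.g. after a shard reports that it no longer owns a key."""
        self._shards = None

    def route(self, key: str) -> str:
        shards = self._directory()
        return shards[shard_index(key, len(shards))]


def fake_management_canister() -> List[str]:
    # Stand-in for a real directory query; the IDs are made up.
    return ["shard-a", "shard-b", "shard-c"]


router = ShardRouter(fake_management_canister)
targets = [router.route(f"user-{i}") for i in range(100)]
```

With this shape, the management canister only sees traffic when a client starts up or refreshes its cache, and all data requests go straight to the shards. Note that plain modulo hashing moves most keys when the shard count changes, so a range directory or consistent hashing may suit a growing database better.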

Sorry for all of my questions recently. I just do not want to rush my backend architecture, so that I can save myself the technical debt later. Thank you to all the DFINITY developers for your help so far. It is much appreciated, keep up the great work!

A lot of this depends on how you will query your DB and where you will query it from (do you need to query from canisters on the IC or from a web app?). If most queries will come from outside the IC, then CanDB is a solid solution, as it has been in production for a while and has a client library that takes care of stitching queries together for you.
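For a hand-rolled setup, "stitching queries together" on the client usually means a scatter-gather: send the same query to every shard in parallel, then merge the partial results. The sketch below is illustrative only and is not how CanDB's client library is implemented; `search_shard`, the in-memory `SHARDS` data, and the score-based ranking are all assumptions standing in for real canister calls:

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Record = Tuple[int, str]  # (score, document id)

# In-memory stand-ins for shard canisters; the data is made up.
SHARDS: Dict[str, List[Record]] = {
    "shard-a": [(9, "doc-1"), (4, "doc-7")],
    "shard-b": [(7, "doc-3")],
    "shard-c": [(8, "doc-5"), (2, "doc-9")],
}


def search_shard(shard_id: str, min_score: int, limit: int) -> List[Record]:
    """Hypothetical per-shard query: local matches, best first, capped at `limit`."""
    matches = [r for r in SHARDS[shard_id] if r[0] >= min_score]
    return sorted(matches, reverse=True)[:limit]


def search_all(min_score: int, limit: int) -> List[Record]:
    """Fan the query out to every shard in parallel, then merge the sorted partials."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(lambda s: search_shard(s, min_score, limit), SHARDS))
    # Every partial list is sorted descending, so a k-way merge keeps global order.
    return list(heapq.merge(*partials, reverse=True))[:limit]


top = search_all(min_score=3, limit=3)
```

Because each shard returns at most `limit` results, the client merges at most `limit` times the number of shards, but the number of calls still grows with the shard count. If searches are frequent, a separate index keyed by search term could narrow the fan-out to the shards that actually hold matches.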

If you need access from other IC canisters, you may just want to go with a standard canister unless you are really, really sure you’re going to grow faster than the wasm64 and Motoko heap will expand.

Stable memory with the Regions API from Motoko works and is an option, but keep in mind that reading and writing to stable memory is a bit more expensive (although I think there may have been some improvements lately).

I want to query from a web application, but I do not want to use a pre-made solution. I would like to use my own data structures and remain flexible.

To what extent will my performance decrease and my request cost increase when using the Stable Regions API? Are there any actual numbers?

Do you believe that a “management” canister can be a bottleneck or should it work just fine?

Do you know what the time frame is for enhanced orthogonal persistence, and whether it will actually increase Motoko stable heap memory to 400 GiB?