It is disappointing that the app developer must write, from scratch, solutions to manage persistent state using only in-memory data structures, with no direct access to storage. Orthogonal persistence doesn't seem to make solving any of the standard issues any easier.
One seems to be faced with adding dump query methods to retrieve in-memory data in bulk, then bulk delete methods, and so on. If I were developing a serious actor that needs to manage records forever, or even for a limited period such as 30 days, those methods would be the first I'd write, and of course they must be maintained as I make changes to the global in-memory data structures. Bulk update is also a must if one wants to deploy large sets of data quickly. The IC provides nothing here.
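To make the point concrete, here is a minimal sketch in Motoko of the kind of bulk-access methods I mean. The actor, its record layout, and the method names are all hypothetical; only the base-library calls are real:

```motoko
import HashMap "mo:base/HashMap";
import Iter "mo:base/Iter";
import Text "mo:base/Text";

actor Records {
  // A hypothetical in-memory record store keyed by Text.
  var records = HashMap.HashMap<Text, Text>(16, Text.equal, Text.hash);

  // Bulk read: dump everything so an external tool can back it up.
  public query func dumpAll() : async [(Text, Text)] {
    Iter.toArray(records.entries())
  };

  // Bulk load: deploy a large data set in one update call.
  public func bulkPut(batch : [(Text, Text)]) : async () {
    for ((k, v) in batch.vals()) { records.put(k, v) };
  };

  // Bulk delete.
  public func bulkDelete(keys : [Text]) : async () {
    for (k in keys.vals()) { records.delete(k) };
  };
};
```

Every one of these methods must be revised by hand whenever the shape of `records` changes; none of them comes for free from the platform.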
Source code is a bad place to define data structures that must persist their data forever; databases and database schemas were invented to fix this glaring problem. Source code is also a bad place to store schema definitions. Yet that is the only place the IC provides for developers to perform those crucial functions, putting a semantic and implementation burden on the developer greater than the alternatives impose.
For canisters, as for any such system today, when the schema changes, upgrade code must be written and tested. Motoko, I suppose solely because it is defined by Dfinity for the IC, includes in its language definition the hook points to perform one kind of upgrade, so you don't have to invent that hook yourself even more painfully. Having written such upgrade code many times, I can assure you it is painful, buggy code to write, and impossible to automate. How testing this upgrade logic will work in this environment then becomes a concern, raising questions such as: how long does it take to populate a canister with a GB of data from scratch, locally (if possible) or on the mainnet? Note, for one, that a complete backup after an upgrade is usually a matter of course. The IC doesn't provide this at any time, as far as I can tell.
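For reference, the hooks in question look roughly like this. The counter-and-cache state is an invented example, but `stable` variables and the `preupgrade`/`postupgrade` system functions are the actual Motoko mechanism:

```motoko
import HashMap "mo:base/HashMap";
import Iter "mo:base/Iter";
import Text "mo:base/Text";

actor Counter {
  // `stable` variables survive upgrades automatically.
  stable var count : Nat = 0;

  // Non-stable structures are lost on upgrade and must be
  // flattened into a stable holding area by hand.
  var cache = HashMap.HashMap<Text, Nat>(16, Text.equal, Text.hash);
  stable var cacheEntries : [(Text, Nat)] = [];

  // Called by the system just before the new code is installed.
  system func preupgrade() {
    cacheEntries := Iter.toArray(cache.entries());
  };

  // Called once after the new code starts running: rebuild
  // the in-memory structure from the stable form.
  system func postupgrade() {
    for ((k, v) in cacheEntries.vals()) { cache.put(k, v) };
    cacheEntries := [];
  };
};
```

Every change to `cache`'s shape means revisiting both hooks, and a mistake in either can strand or corrupt the only copy of the data.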
The problem is that there seem to be no abstractions available to developers for managing data externally, separately from the canister code itself. Update calls are the only mechanism available. This means paying for the consensus protocol on every call, even when you simply don't want or need it, e.g., during initialization.
That sqlite just happens to be almost runnable in a canister shows how hard the problem is. As lovely as having an SQL API would be, as a developer you are now faced with the prospect of treating all the scripts and statements you might need as strings passed to update functions, with a commensurate loss of semantics in the method calls themselves, or else constantly updating the canister's code. At least one could then update table structures and data without having to upgrade the canister itself, but the clumsiness of this approach makes using such a package in an actor problematic to me, especially without any examples to review.
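The string-typed interface I have in mind would look something like the sketch below; `exec` and its result shape are my invention for illustration, not any actual package's API:

```motoko
import Result "mo:base/Result";

actor SqlDb {
  // The whole statement travels as untyped Text: the compiler can no
  // longer check that an insert into "users" carries a name and an age,
  // and every caller pays the update-call consensus cost.
  public func exec(sql : Text) : async Result.Result<[[Text]], Text> {
    // ... hand the string to the embedded SQL engine here ...
    #ok([])
  };
};

// Contrast with the typed method the canister would otherwise expose:
//   public func insertUser(name : Text, age : Nat) : async ()
```

The typed variant keeps argument checking in the language; the `exec` variant pushes every error to runtime inside the canister.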