Thanks for linking that discussion.
This clone/copy-and-delete methodology is exactly my thinking. Fully copying a canister through inter-canister calls is both expensive and slow, and it introduces a whole set of distributed-systems problems: what do you do if a 1-2 GB data copy fails partway through (since the copy would presumably have to be split across multiple update calls)?
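To make the failure mode concrete, here is a minimal sketch of a resumable chunked copy. All names are hypothetical and this is not an IC API: `send_chunk` stands in for one inter-canister update call, and the persisted `cursor` is what would let a failed multi-gigabyte copy resume from where it stopped instead of restarting.

```python
CHUNK = 4  # tiny for illustration; a real copy would use far larger chunks


class TransferError(Exception):
    """Stands in for a failed/trapped update call."""


def send_chunk(dest: bytearray, offset: int, chunk: bytes, fail: bool) -> None:
    """Deliver one chunk to the destination; may fail mid-copy."""
    if fail:
        raise TransferError("update call failed")
    dest[offset:offset + len(chunk)] = chunk


def copy_with_resume(src: bytes, dest: bytearray,
                     cursor: dict, failures: set) -> None:
    """Copy src into dest chunk by chunk, committing progress in `cursor`."""
    while cursor["offset"] < len(src):
        off = cursor["offset"]
        chunk = src[off:off + CHUNK]
        try:
            send_chunk(dest, off, chunk, fail=off in failures)
        except TransferError:
            failures.discard(off)  # simulate: the retry succeeds
            continue               # resume from the persisted cursor
        cursor["offset"] = off + len(chunk)  # advance only after success


src = bytes(range(10))
dest = bytearray(len(src))
cursor = {"offset": 0}
copy_with_resume(src, dest, cursor, failures={4})  # inject one mid-copy failure
```

The key design point is that progress is committed only after a chunk lands, so a crash between update calls leaves a cursor you can resume from rather than a half-copied canister of unknown state.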
While this works great for backups, you have to spin up that extra canister ahead of time. That doesn't work if you're scaling out reactively, and on top of that it doesn't really address data that is unevenly distributed (skewed by a particular primary key).
The intermediate solution a few teams have come up with is to pick a specific entity that they don't expect to push the 4 GB canister limit anytime soon (say, a user, or a message chat on OpenChat) and spin up a new canister for each instance of that entity. That will most likely be my approach in the intermediate term, as I'm certain a deeper IC-based solution (either the replica copy solution or something else) is the way to go in the long term.
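The canister-per-entity pattern above can be sketched roughly as a factory with a registry from entity key to canister. The names here are illustrative only (this is not the IC management-canister API): `canister_for` stands in for routing logic that creates a fresh canister the first time a given user or chat appears.

```python
import itertools


class CanisterFactory:
    """Routes each entity (e.g. a user or a chat) to its own canister."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.registry: dict[str, str] = {}  # entity key -> canister id

    def canister_for(self, entity_key: str) -> str:
        """Return the entity's canister, spinning one up on first use."""
        if entity_key not in self.registry:
            # Stand-in for creating a real canister via the management canister.
            self.registry[entity_key] = f"canister-{next(self._ids)}"
        return self.registry[entity_key]


factory = CanisterFactory()
a = factory.canister_for("chat-42")
b = factory.canister_for("chat-42")  # same entity, same canister
c = factory.canister_for("chat-43")  # new entity gets its own canister
```

The trade-off is exactly the one described above: this sidesteps data copying entirely, but it only works if no single entity ever outgrows the 4 GB limit on its own.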
Building off of this, I'm assuming these clones would still be bounded by the individual subnet a particular application runs on, which would then become the next bottleneck.
@diegop From a brief glance I didn't see this on the 2022 Sneak Preview Roadmap. Are there any roadmap items that might tackle this problem, or engineers who could comment on the feasibility of (and work required for) efficient canister data cloning?