Hey everyone!
Sorry for the long delay.
The new draft of the technical design is ready and available here:
(this is a new link, the old one leads to the previous draft of the documentation for history-preserving purposes)
This new design is much simpler than the previous one. I got rid of all the additional stuff and focused solely on the core functionality, making sure it can be extended later. Now we only have 4 documents. Everything else that was covered earlier but didn't make it into the current draft is considered a subject for future development.
The design itself was heavily reworked in an attempt to enable all the features we have been discussing here in this thread.
First of all, the state itself is now implemented with a different data structure, which greatly simplifies the design. It is still a b-tree of b-trees, but arranged in a slightly different way, with only numbers (currently, Nat values) as possible keys. This makes the whole database globally ordered, which, together with the new CompositeKey feature, enables a lot of interesting stuff.
Second, we now have a separate document describing Queries, which are functions that allow clients to fetch a lot of data quickly and cheaply while traveling between shards. This document was added in response to @infu's and @berestovskyy's comments about not being able to return a lot of data to clients with the previous design. Now it is possible, and it is very efficient.
@infu, I've put a lot of thought into your idea of making transactions in union-db similar to map-reduce. What I realised is that this is not possible with the current design, since each shard only knows about its closest neighbors to the left and right, and not about all the shards. The good news is that the new transaction design allows us to implement this parallelism pattern later, when we implement Caching.
By the way, about the transactions: they are now completely different. They no longer rely on custom syntax or coroutines (@berestovskyy). Instead, the whole distributed transaction engine is now just a minimalistic implementation of the Saga pattern (more on that in the docs). This decouples them from the rest of the library (@ulan) and makes it possible to use them in any other distributed context, even without the rest of union-db.
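For readers who have not met the Saga pattern before, here is a minimal sketch of the general idea in Python. It is not the union-db API: `Step`, `run_saga` and the step names are hypothetical and exist only for illustration. Each step pairs an action with a compensation, and when a step fails, the compensations of the steps that already completed run in reverse order.

```python
from typing import Callable, List, Tuple

# Hypothetical step type: (name, action, compensation). Not the union-db API.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; if one fails, undo the completed ones in reverse."""
    completed: List[Step] = []
    for step in steps:
        _name, action, _compensate = step
        try:
            action()
        except Exception:
            # Compensate only the steps that finished, newest first.
            for _, _, undo in reversed(completed):
                undo()
            return False
        completed.append(step)
    return True


# Usage: a toy order flow where the payment step fails.
log: List[str] = []


def charge_payment() -> None:
    raise RuntimeError("payment declined")


ok = run_saga([
    ("reserve_stock", lambda: log.append("reserve"), lambda: log.append("release")),
    ("charge_payment", charge_payment, lambda: log.append("refund")),
])
```

Here the stock reservation is released after the payment fails, and the failed step itself is not compensated. Between the reservation and the release, other parties can observe the intermediate state, which is the missing isolation that the discussion of ACD versus ACID refers to.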
Sagas are inherently ACD, not ACID. But I put a lot of effort into showing through the examples that this is not a big issue, and that there are many ways of adding an isolation layer on top if you really need it. Moreover, there is an example that implements a fully functional 2PC protocol using only the proposed Saga implementation.
Another exciting thing I want to tell you is that this big redesign initially started with @berestovskyy's comment about deadlocks. Now, with this new database structure, composite keys, iterators and a little bit of savvy, you can implement a 2PC protocol that is completely free of deadlocks, by breaking the circular wait condition and making all your locks get acquired in the same order.
There is a whole example about this in the documentation. Make sure you read it if that seems interesting to you.
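To make the lock-ordering idea concrete, here is a small sketch in Python. It is not the 2PC example from the docs and not the union-db API: the `locks` table, `lock_in_order`, `unlock` and `transfer` are hypothetical names, and in-process `threading.Lock` objects stand in for whatever locks a real distributed implementation would use. The only point is that if every participant acquires its locks in ascending key order, a circular wait cannot form.

```python
import threading
from typing import Dict, Iterable, List, Tuple

# Hypothetical lock table: one lock per numeric record key.
locks: Dict[int, threading.Lock] = {k: threading.Lock() for k in range(5)}


def lock_in_order(keys: Iterable[int]) -> List[int]:
    """Acquire the locks for `keys` in ascending key order, without duplicates."""
    ordered = sorted(set(keys))
    for k in ordered:
        locks[k].acquire()
    return ordered


def unlock(ordered: List[int]) -> None:
    """Release previously acquired locks, last acquired first."""
    for k in reversed(ordered):
        locks[k].release()


def transfer(a: int, b: int, results: List[Tuple[int, int]]) -> None:
    # Both (1, 3) and (3, 1) end up locking key 1 first, then key 3.
    held = lock_in_order([a, b])
    try:
        results.append((a, b))
    finally:
        unlock(held)


results: List[Tuple[int, int]] = []
t1 = threading.Thread(target=transfer, args=(1, 3, results))
t2 = threading.Thread(target=transfer, args=(3, 1, results))
t1.start()
t2.start()
t1.join()
t2.join()
```

If each thread instead locked its keys in the order it received them, one could hold key 1 while waiting for key 3 and the other the reverse. A globally ordered key space makes the sorted order well defined for every participant.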
I did my best to bring my original idea as close as I could to what you guys have in mind. As @ulan proposed, the scope of the PoC has also changed. We'll first try to implement this new transaction engine, which should work with any other use case. This will allow us to decide whether it is even reasonable to implement cross-subnet transactions or not. The demo project is also different now (@infu): as a demonstration of the transaction engine, I will implement an online shopping app with several microservice canisters and an external invoice-canister-based payment system.
Thank you all so much for your help! Also, big thanks to @domwoe for helping me steer this project in the right direction - without your help it would not have happened.
Please ask me anything. The docs are open for commenting, as usual. One note: the Transactions and Queries documents are very heavy on code, so my apologies to anyone who is less technical. I thought this was the best way to realistically show the various flows that can now be implemented with union-db.