I have been talking with the founder of Arweave through DM, and I am going to post excerpts from our conversation with his permission.
Some of this conversation is in response to @PaulLiu’s posts here.
These are just excerpts. I may remove or shorten some responses and correct typos, and the order/flow might not be exactly as it was in the DMs:
Jordan: How are you going to secure these processes running on possibly untrusted CUs?
Sam: I think this is a misperception. The CUs will be staked, so if they provide you with a false state attestation they can be slashed. This gives you a security guarantee proportionate to their stake – if you would like more stake, you can just ask more CUs
Jordan: Your claims of smart contract-level verifiable compute seem strange when you haven’t yet explained or figured out how it works. I’ve been talking with your team at ETH Denver
Sam: I think the core difference between the approach of ICP and ao is that ao has dissociated compute and consensus, while ICP bundles them
Sam: Having them separated lets you do interesting things with them – like let the compute be in other types of VM, run longer, require VM extensions, etc
Sam: It also gives you a ton more flexibility on the consensus side, too: Choice of SU (giving ordering and DA guarantees) is left to the developer. You can run a trusted one (like the PoA testnet ao currently runs), or you can use a staked one (as we plan for the base mainnet release), but you could also use Bitcoin, Arweave, EigenDA, or Celestia without changing the core data protocol
Sam: So to put it in the simplest sense: Your process is secured by the availability of its inputs, because verifiability of that (with any deterministic VM) gives you reproducibility of the state. You can then ask any CU – or even just run it yourself – to calculate your state.
Sam: The staked CUs actually give you far higher guarantees on reading the state vs traditional blockchains (I think this holds for ICP, too?). Typically, a user reads the state of the network by sending an HTTP request to a gateway/indexer/RPC. In ao, when you ask a CU (also via HTTP) you get a signed response from the node. If you later find that this state attestation was invalid (either by asking another CU, or running it yourself), then you can slash the CU’s stake.
Sam: You can also validate it yourself much faster, as execution is decoupled from consensus. Essentially each process is its own independently verifiable ‘blockchain’ – so you can calculate its state without generating the state of any other process. I believe if you tried to do this in ICP you would have to re-validate the entire subnet?
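A minimal sketch of the "validate it yourself" idea above, assuming a process's state is a pure function of its ordered message log. The transition function, message shape, and hashing scheme here are hypothetical toys for illustration, not ao's actual data protocol:

```python
import hashlib
import json


def apply_message(state, message):
    """Toy deterministic transition: add `amount` to `account`'s balance."""
    new_state = dict(state)
    account = message["account"]
    new_state[account] = new_state.get(account, 0) + message["amount"]
    return new_state


def state_hash(state):
    """Hash a canonical JSON encoding so that equal states hash identically."""
    encoded = json.dumps(state, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def replay(messages):
    """Recompute one process's state from its own ordered inputs only."""
    state = {}
    for message in messages:
        state = apply_message(state, message)
    return state


def attestation_is_valid(messages, attested_hash):
    """Check a CU's attested state hash by re-running the log locally."""
    return state_hash(replay(messages)) == attested_hash


# Hypothetical message log for a single process.
log = [
    {"account": "alice", "amount": 5},
    {"account": "bob", "amount": 3},
    {"account": "alice", "amount": -2},
]
honest_hash = state_hash(replay(log))
forged_hash = state_hash({"alice": 100})
```

Only this one process's log is replayed; no other process's state is needed, which is the point Sam makes about each process being independently verifiable.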
Jordan: I am referring to the idea that the network of CUs will be untrusted by default, as in they will be independent of the process owner, thus possibly Byzantine, and some kind of mechanism would need to be created on top to trust them. The staking is part of that mechanism it seems
Jordan: Yes but how does slashing work? Who can slash? How do you prove that a computation was done incorrectly without zk? It seems you would need some kind of consensus mechanism, ideally in real-time so that a computational result can be trusted without a lot of latency
Jordan: Sounds correct, now I wonder where the consensus on ao will come from, as it is necessary to provide security guarantees without zk…I think with zkWasm for example relying just on Arweave for consensus on inputs and outputs would be enough, correct?
Jordan: Having VM optionality and long-running computations is a definite improvement over ICP currently, still confused on how/where the consensus will come from though
Jordan: Consensus on inputs is crucial of course, but not sufficient (zkVM might change that story). I can’t just ask one CU to process something because I can’t trust that one CU, stake would help but I don’t think that’s sufficient for all use cases, some kind of consensus amongst multiple CUs sounds like what is needed
Jordan: Also running it myself is not going to be feasible for building web-scale applications for example. I’m not sure what your ambitions with AO are, but on ICP we’re building BFT web servers, databases, full-on web backends. For example we have Express.js running on ICP, you essentially deploy a BFT 13-40x replicated JS server. We have basic SQLite and hopefully soon Postgres compiled to Wasm running (PGLite). You can’t run this stuff yourself to verify it
Jordan: ICP doesn’t have automated slashing like that, but nodes can be removed by the DAO, and subnets can provide signatures on all outputs (I believe they do actually for all state change requests, not for read-only requests by default, but they can do that too). The exact verifiability I don’t remember, as in where exactly these signatures are verified
Jordan: It’s just completely impractical to expect a user/dev to verify a web-scale backend with potentially gigabytes of data; this doesn’t scale or make sense for many use cases.
Jordan: We’re going for full-blown web-scale GHz-level compute over gigabytes of data, you can’t run this stuff on a client
Jordan: Now again, having a zkVM at the core would change the story
Jordan: Sadly I am still left unsatisfied, I don’t see where the consensus on the processes is supposed to happen, well I guess it’s supposed to happen outside of the system? But then someone needs to build this
Jordan: As a dev I want to build full web backends on AO, I want http, Postgres, etc
Jordan: I want to just run the process and have bounds under which it is provably secure, provably BFT
Jordan: I don’t see that here, I see some building blocks maybe
Jordan: For example I want an algorithm/protocol here that says as long as 2/3 CUs come to the same output with the same inputs then I am guaranteed honesty or something like that
Jordan: These are essentially the guarantees ICP, Ethereum, Bitcoin, etc give us, and these are all networks potentially full of Byzantines
Jordan: Byzantines are processes that can fail for arbitrary reasons including dishonesty, which is exactly what a CU is even if staked
Jordan: The stake just helps weed out Byzantines, but I think more than that is required
Jordan: In the end: where is the 2/3 etc real-time BFT guarantee coming from?
Sam: Ok, I think we are getting to the crux here. There are three layers:
- CUs: You can ask as many CUs as you want in order to gain as much certainty (backed by stake) as you desire. Because ordering+DA is separated from execution, you do not need an explicit consensus mechanism at the CU level. For example, if you wanted to have the equivalent of 100% stake (on a PoS execution network) you could simply ask and pay every CU to do the computation. If they all come to the same result, you have your ‘consensus’. This is flexible to reaching 2/3, 1/3, or just a USD token equivalent quantity of stake.
- Slashing: What if they don’t agree? You can raise a vote on the staking process, calling others to calculate the real state and vote. This is a form of BFT consensus, but only executed when necessary. The core staking process itself will use Arweave’s BFT consensus as the SU to ensure availability and ordering, without dependence on any specific node. The staking process will also allow subledgers that provide faster votes on low-confirmation-time BFT ledgers, or even simply staked SUs – but participants can always default back to the Arweave network if problems occur on these subledgers.
- Finally, everything rolls up to availability on top of Arweave’s consensus.
You are totally right that zkWASM will improve this. You could use any untrusted CU to grab your state.
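The CU and slashing layers described above can be sketched roughly as follows. This is an illustration under stated assumptions, not ao's staking process: the `Attestation` type, the three outcome labels, and the idea of passing a `required_stake` threshold are all hypothetical names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    cu_id: str
    state_hash: str
    stake: int  # amount the CU stands to lose if this attestation is false


def stake_by_result(attestations):
    """Sum the stake standing behind each distinct attested state hash."""
    totals = defaultdict(int)
    for attestation in attestations:
        totals[attestation.state_hash] += attestation.stake
    return dict(totals)


def evaluate(attestations, required_stake):
    """Return (decision, state_hash).

    "accept":          all CUs agree and enough stake backs the result.
    "need_more_stake": CUs agree, but the caller should ask more CUs.
    "dispute":         CUs disagree, so a vote would have to be raised.
    """
    totals = stake_by_result(attestations)
    if len(totals) > 1:
        return ("dispute", None)
    if not totals:
        return ("need_more_stake", None)
    result, stake = next(iter(totals.items()))
    if stake >= required_stake:
        return ("accept", result)
    return ("need_more_stake", None)


agreeing = [Attestation("cu1", "abc", 100), Attestation("cu2", "abc", 250)]
conflicting = agreeing + [Attestation("cu3", "xyz", 50)]
```

The threshold is whatever the caller wants it to be (a fraction of total stake or an absolute amount), and the expensive path, the vote, is only reached when attestations conflict.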
Jordan: So can I achieve this: spin up a process on 31 CUs, for every incoming message have the client/user check that at least 21/31 agree?
Jordan: How will a user/client be able to verify that the computation was performed correctly without a lot of latency?
Jordan: I’m imagining a client using HTTP, they’ll want to perform an HTTP call to a server running in a process, and in the response they’ll want to know somehow that 21/31 agreed
Jordan: Is this possible with low latency?
Jordan: And another question, do you foresee the kinds of use cases I’ve been describing as being plausible with AO? Web servers, databases, frontends, full on web scale backend applications?
Jordan: * and by low latency I mean ideally under one second, but in the worst case a few seconds would maybe be acceptable, maybe
Sam: Yep! Or any other configuration. You could just ask any number in parallel and aggregate the results (or the first x% to reply, etc).
Sam: You could build an aggregator like this, or you could just have the client speak to many nodes in parallel. Both would work, although an aggregator doesn’t exist right now
Sam: These results you collate are a big step up on traditional RPC nodes, too, because they are staked against. If the CU lied you could take your result and have the node be slashed – regardless of whether ‘you’ is a web browser, another MU, etc.
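A rough sketch of the "ask many CUs in parallel and aggregate" approach, using the 21/31 example from earlier. The CUs here are simulated as plain Python callables; real CUs would be reached over HTTP and return signed responses, and signature checking is deliberately left out. All function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed


def query_until_quorum(compute_units, message, quorum):
    """Ask every CU in parallel; return a result once `quorum` CUs agree on it.

    Returns None if no single result reaches the quorum.
    """
    counts = Counter()
    pool = ThreadPoolExecutor(max_workers=max(1, len(compute_units)))
    try:
        futures = [pool.submit(cu, message) for cu in compute_units]
        for future in as_completed(futures):
            try:
                result = future.result()
            except Exception:
                continue  # an unreachable or failing CU simply does not count
            counts[result] += 1
            if counts[result] >= quorum:
                return result
        return None
    finally:
        # Do not wait for slow CUs once a decision has been made.
        pool.shutdown(wait=False, cancel_futures=True)


# Simulated CUs (assumptions for the example only).
def honest_cu(message):
    return f"state-after:{message}"


def lying_cu(message):
    return "forged-state"


def offline_cu(message):
    raise ConnectionError("CU unreachable")


cus = [honest_cu] * 21 + [lying_cu] * 7 + [offline_cu] * 3
agreed = query_until_quorum(cus, "msg-1", quorum=21)
no_quorum = query_until_quorum(cus, "msg-1", quorum=22)
```

Because the function stops at the first result to reach the quorum, the latency seen by the client is roughly that of the 21st-fastest agreeing CU rather than the slowest one. The same logic could live in an aggregator instead of the client.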
Sam: Definitely decentralized databases and backends!
Jordan: Awesome!
Jordan: But what about latency?
Jordan: I’m thinking there will be some significant latencies on the order of seconds at least??
Sam: I think the claims of servicing HTTP requests ‘directly’ from ICP canisters are unfortunately relatively weak, because there is always a ‘gateway’ node that has to sit between the user’s browser and network. The question is how the gateway will be incentivized (given it is transferring all of the bandwidth) and trusted (given lack of good browser options for validation). I think you could do exactly the same thing with ao (with staked results in the browser, too!), but I don’t think it would be a high integrity claim we would be happy standing by to call it ‘serving’ HTTP requests from the processes. It can definitely pre-process the results, though! Is there anything I am missing on the ICP side here?
Sam: It will definitely depend on the way you write your process, but in the mode we expect to be the default (staked independent MUs, SUs, and CUs), it should be sub-1s latency. In the current setup we get roundtrip latency of ~500-800ms.
Sam: If you want to move 1b USD in a process, though, the message recipient may want to wait a period of time with the message in the buffer (even if it is received with very high stake)
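One way a recipient might express that kind of hold period is a value-tiered delay. The sketch below is purely illustrative: the `BufferedMessage` type, the tier values in `POLICY`, and the use of USD value as the risk measure are assumptions, not anything ao specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferedMessage:
    message_id: str
    value_usd: float
    received_at: float  # seconds on the recipient's clock


def hold_seconds(value_usd, tiers):
    """Pick the hold time of the highest tier the value reaches.

    `tiers` is a list of (min_value_usd, hold_seconds), sorted ascending.
    """
    hold = 0.0
    for min_value, seconds in tiers:
        if value_usd >= min_value:
            hold = seconds
    return hold


def ready_to_process(message, now, tiers):
    """True once the message has sat in the buffer for its full hold time."""
    return now - message.received_at >= hold_seconds(message.value_usd, tiers)


# Hypothetical tiers: no delay for small values, a day for 1b USD and above.
POLICY = [(0, 0.0), (10_000, 60.0), (1_000_000_000, 86_400.0)]

small = BufferedMessage("m1", 25.0, received_at=1_000.0)
huge = BufferedMessage("m2", 1_000_000_000.0, received_at=1_000.0)
```

Low-value messages pass straight through, so the sub-second latency mentioned above is unaffected for ordinary traffic.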
Sam: Again, the approach is to allow flexibility and modularity, rather than a monolithic design where one is not needed
Jordan: There is some nuance here. ICP has a system of boundary nodes that accept plain HTTP requests and convert them into API requests that are then sent (over HTTP still I believe) to the replicas, where they are then gossipped or otherwise communicated among themselves
Jordan: So yes, there is essentially a proxy layer, the current plan is to allow anyone to run these proxies into the network
Jordan: But you can also call the API nodes directly over HTTP…I’m not sure if you can call the replica nodes themselves directly. We are doing this to get around some authentication issues right now, where in the browser we override global fetch and intercept the dev’s requests only if a certain HTTP header is present. We then use a client-side “agent” to perform the API calls, bypassing the translation of raw HTTP requests into the API call requests
Jordan: Calling the claims weak might be warranted, but practically speaking the developer can write Express or other http servers in their canisters
Jordan: There are of course limitations, but the developer experience is getting better. Here’s an example of a very simple Express application that you can deploy to ICP: [https://github.com/demergent-labs/azle/blob/main/examples/hello_world/src/backend/index.ts](https://github.com/demergent-labs/azle/blob/main/examples/hello_world/src/backend/index.ts)
Jordan: There will always be infrastructure required to receive and translate HTTP requests, resolve DNS, and deal with networking that might be outside of the guarantees of the blockchain…perhaps?
Jordan: But in the short-term only the HTTP Gateways themselves, the ones that convert raw HTTP requests into HTTP API requests, will be outside of the ICP network’s incentivization mechanisms
Jordan: And when I talk about serving the HTTP requests, I think I’m mostly interested in as a developer being able to write an HTTP server using normal web development libraries in my process, and just deploying that to the network and having it work
Jordan: Obviously with good-enough security guarantees, but I believe ICP has pretty good guarantees on that and will be improving
Jordan: If AO can provide a similar experience, that’s what I’m most interested in knowing
Jordan: It sounds like you’re saying that it can
Jordan: If latency, cost, scalability, and developer experience are good enough, then that’s amazing
Sam: Yep – the ‘pre-processing’ (generating the HTML etc to serve) is definitely possible, and cool that it can be validated. Serving web apps is an entirely different game though. In the Arweave ecosystem we have the @ar_io_network which is building decentralized gateway infrastructure, but getting the incentives right is highly non-trivial.
Sam: Yeah – it is definitely an exciting idea. I don’t see any reason it shouldn’t work on ao, with some of the same caveats as ICP. Will probably take a little while for the compilation tooling to get there though
Sam: This conversation has actually got me thinking much more about whether CUs could essentially be gateways themselves?
Sam: You pay them for access, which solves the incentive problem, and their responses must be signed (we have a system called P3 for adding crypto auth in HTTP headers: https://arweave.net/UoDCeYYmamvnc0mrElUxr5rMKUYRaujo9nmci206WjQ…).
Sam: Even if the CU thinks that the client it is talking to is a web browser that won’t validate the signature in the headers (or re-validate the execution from other CUs), the risk of being slashed is so high that there is a very strong incentive not to lie about the results still
Sam: So in theory you could ‘serve’ the site directly from the CU, with some strong guarantees. And of course, you could swap out the CU just like you can swap out an Arweave gateway (ex. [http://sam.arweave.dev](http://sam.arweave.dev), [http://sam.g8way.io](http://sam.g8way.io), [http://sam.ar-io.dev](http://sam.ar-io.dev), etc)
Sam: Can’t imagine how you could subsidize the CU-as-gateway, though.
Jordan: My biggest concern with AO is still verifiability, and his takes validate my suspicions that AO may be lacking there
Jordan: Being proven wrong would be very interesting of course
Sam: …If so desired, many CUs in ao can sign the same state to attest to its validation. This is essentially the same as providing the chain signatures that ICP has
Sam: A number of signatures from different staked CUs can give you far better verifiability than a BLS sig from many different unstaked nodes. Lack of stake means that these subnets are highly vulnerable to ‘eclipse’ style attacks. In ICP these can affect not only state attestations to users (which, I believe, are even relayed through unsigned and unstaked gateways?) but also the passage of messages between subnets.
Sam: Somewhere in the message they also mention that no intermediate states are stored. This is again incorrect – nodes in ao already store staked snapshots on Arweave that other nodes can validate and also resume execution from. Again, joining an ICP subnet from a BLS signature of state at a block height is far riskier, as nothing at all prevents an attacker from slowly eclipsing the subnet and signing any data they like with the trusted keys.
Sam: A final point to note is that in general the entire ao architecture is flexible at its core. Processes are free to choose any requirements that they want in terms of security – they could always trust a single set of keys (PoA), require a variable stake on the messages they receive, or even delay processing for riskier messages for a period of time (the delay the author mentions in rollups). You could even implement the precise security mechanisms of ICP if so desired. Instead of being prescriptive and enforcing one model on all users+devs, ao lets you choose what makes sense for each individual use case.
Jordan: Just to check on something, CUs have flexible verifiability of messages between processes, right? Process A → Process B, Process B could require X signatures from specific CUs before accepting the message?
Sam: I wouldn’t call it flexible verification of messages at the CU level (they cryptographically attest to faithfully executing the code of the process), but yes, the processes can flexibly verify the messages as they like. Process B can require any number of signatures, optionally attached to any amount of stake, or require a ZK witness of correct execution if they prefer.
Sam: Or even only accept messages from a certain set of other processes, or require a signature from a specific key
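The per-process acceptance rules Sam lists could be modelled along these lines. This is a sketch under assumptions: the `AcceptancePolicy` fields and message shape are hypothetical, and signatures and ZK witnesses are assumed to have been cryptographically verified before this check runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    cu_id: str
    stake: int


@dataclass(frozen=True)
class Message:
    sender_process: str
    signatures: tuple = ()
    has_zk_witness: bool = False  # assumed already verified elsewhere


@dataclass(frozen=True)
class AcceptancePolicy:
    min_signatures: int = 0
    min_total_stake: int = 0
    allowed_senders: frozenset = frozenset()   # empty means any sender
    required_signers: frozenset = frozenset()  # CUs that must have signed
    accept_zk_witness: bool = False            # a witness replaces signatures


def accepts(policy, message):
    """Decide whether the receiving process should accept `message`."""
    if policy.allowed_senders and message.sender_process not in policy.allowed_senders:
        return False
    if policy.accept_zk_witness and message.has_zk_witness:
        return True
    # Count each CU once, even if it attached several signatures.
    stake_by_signer = {s.cu_id: s.stake for s in message.signatures}
    if not policy.required_signers <= stake_by_signer.keys():
        return False
    if len(stake_by_signer) < policy.min_signatures:
        return False
    return sum(stake_by_signer.values()) >= policy.min_total_stake


policy = AcceptancePolicy(
    min_signatures=2,
    min_total_stake=500,
    allowed_senders=frozenset({"process-a"}),
)
two_sigs = (Signature("cu1", 300), Signature("cu2", 300))
```

For example, `accepts(policy, Message("process-a", two_sigs))` passes, while the same message with only one signature, or from a sender outside `allowed_senders`, is rejected.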
Sam: Have been thinking that it might be interesting to demonstrate what an ICP-style security model implemented in ao would look like. It’s actually far simpler than the PoS system that we expect most processes will employ
Sam: I also think that a reasonable way to sum it up is in line with afat’s diagram: https://us1.discourse-cdn.com/flex023/uploads/dfn/original/3X/4/c/4cf1cb80b97d881fbd5d6b515bd767ea6bd463d6.jpeg
Sam: I thought it was quite neat
Sam: The exact placement of the dots is hard to follow (why is EVM on IC further to the right than EVM on ao, when his point is that IC is essentially ‘EVM compatible’ – you could run EVM on ao+IC), but the idea to visually show that ao contains a flexible plane of different trade-offs is cool
Sam: I think that afat is right that you could run a good ‘permissioned decentralization’ SU for ao on an IC subnet. This would be great! I remain to be convinced about the trust model for IC subnets, but this would be a cool option to give people if they want it.
Sam: For CUs, however, IC would be very limited relative to ‘normal’ staked ao CUs. Having consensus at the state layer (rather than inputs) puts fundamental restrictions on what they can achieve (lower execution counts, no VM choice, VM extensions, etc)
Sam: The subnet-consensus-on-execution approach even limits you from achieving greater trust on your execution outputs if you so desire: In ao, you could ask a full ‘subnet’s worth’ of CUs about the state, or you could ask every CU about the state – or just a couple. Whatever your preference is
Sam: So for that reason, I don’t think IC CUs would be competitive.
Sam: But for SUs, it could be very cool!