RFC: Standardising how smart contracts expose state

Just saying “possibly” to indicate that I wouldn’t consider it a blocker if we can’t support everything.

Based on the abstract, the paper certainly sounds interesting. Not open access though, and it’s too late today for my brain to process papers anyway. Did we discuss this paper before, when designing certified variables and/or Candid?

Certified variables on the Candid layer? · Issue #1814 · dfinity/motoko · GitHub is also relevant, and may provide another attack vector for this problem: define the various data accesses as plain Candid-returning query methods, and (somehow) have a generic mechanism to certify these.


Replicated queries are essentially calling https://sdk.dfinity.org/docs/interface-spec/index.html#http-query via an ingress message. If you call it as a query, you do not have any certification that you can validate. But if you call it as an ingress message, you get the result in the ingress status (via read_state), for which you now have a certification.

canister_status returns private information about the canister. This is information that the canister does not want to expose to the rest of the world, only to its controllers. The set of a canister’s controllers has always been public. This proposal is not attempting to change any restrictions. The public data will remain public and the private data will remain private.

Precisely, thanks for stating clearly what I meant to say in the original proposal.

Indeed, this is the usual point on which we keep getting stuck.

I suppose, conceptually, what the proposal is suggesting is that read_state return the following struct:

(HashTree, Option<Certificate>). The hash tree is always returned, and when executing as a non-replicated call, the certificate is additionally returned.

Instead would something like the following make sense?

Result<CandidStruct, (HashTree, Certificate)>. Now when executing a replicated call, you get an easier-to-digest struct, and when executing a non-replicated call, you get the HashTree and the certificate.
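To make the proposed shape concrete, here is a minimal sketch of how a caller might consume such a result. All type names and the handling logic are placeholders for illustration, not the real IC system API:

```rust
// Hypothetical stand-ins for the types discussed above; nothing here is
// part of the actual IC interface.
struct CandidStruct(String);
struct HashTree(Vec<u8>);
struct Certificate(Vec<u8>);

// The proposed shape: Ok in replicated mode, Err (tree + certificate) in
// non-replicated mode.
type ReadStateResult = Result<CandidStruct, (HashTree, Certificate)>;

fn handle(result: ReadStateResult) -> String {
    match result {
        // Replicated call: the plain struct, ready to use directly.
        Ok(CandidStruct(s)) => s,
        // Non-replicated call: the caller must verify the certificate
        // against the hash tree before trusting the data.
        Err((HashTree(tree), Certificate(cert))) => {
            format!("verify {} tree bytes against {} cert bytes",
                    tree.len(), cert.len())
        }
    }
}
```

The point of the sketch is just that the two execution modes force two very different code paths onto every caller, which is what the rest of the thread wrestles with.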


Something like that, with a generic (not application specific) way to relate the hash tree to the payload in the CandidStruct. So that, for example, Candid UI or ic.rocks can validate such responses from arbitrary canisters.

Hmm, that seems to be a good criterion for evaluating solutions: is it expressive and general enough to be compatible with such canister-agnostic tools?

(Or we just do certified queries on the system level (threshold signatures on query call responses, independent of the main chain, internal link), and save sooo much complexity and effort in upper layers…)

Could this also be a good opportunity to definitively decide what information about a canister we would like to be private or public by default? There have been earlier discussions about this.

I know that there are solutions like setting the blackhole canister as a controller, but as a developer that just seems cumbersome to me (now needing to manage the cycles of the blackhole canister as well just to expose this information).

I know I have seen @wang, @stephenandrews, and myself express a preference for making the currently private information public by default and optionally private. I’ve seen @Sherlocked say that he doesn’t think the cycle balance of a canister is too much information. I’ve seen @Levi say that cycle balance is too much information.

I would love to hear if other community members have an opinion about this as well.


@Fulco : good points. I agree these need to be addressed as well. This will significantly increase the scope of this RFC. Based on the discussions so far, I think I need to go back to the drawing board a bit on replicated vs. non-replicated queries. I have some vague ideas that I want to write down first.


I was thinking about the above proposed API for read_state and how I was not happy with it. I ended up writing down my thoughts on why I am not happy with it. I am posting them here to keep the conversation going. No precise proposal yet. I think more discussions and design is needed here.

Replicated vs. non-replicated queries

There are two modes of executions possible on the IC. Replicated mode is when all honest nodes on the subnet perform the execution and non-replicated mode is when just a single node performs the execution.

There are two types of functions that a canister can have. Update functions are those whose state changes are preserved (amongst other capabilities), and query functions are those whose state changes are discarded (amongst other capabilities).

Since update functions modify the state of the canister, all honest nodes on the subnet need to execute them; in other words, update functions can only run in replicated mode.

On the other hand, state changes from query functions are discarded, so they are fine to run in non-replicated mode. When we originally designed the IC, we asserted that, in terms of capabilities, query functions are strictly less capable than update functions, so it should be fine to execute query functions in replicated mode as well, only with limited capabilities.

The above decision had important ramifications for improving developer experience. It enables update functions to call query functions which then execute in replicated mode. This has the benefit that canisters do not have to provide duplicate function definitions: one callable from update functions and one callable by users in query calls.

Then we implemented support for data certification. When a query function is executing in non-replicated mode, it can use this feature to return a certificate for data that the caller can validate. As certificate validation is only necessary when executing in non-replicated mode, it is not available in replicated mode. This means a query function running in non-replicated mode has different capabilities than a query running in replicated mode. The certification is also not available for update functions (as they can only run in replicated mode), so it is no longer the case that query functions are strictly less capable than update functions.
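The asymmetry described above can be modeled with a tiny mock. This is loosely patterned on how the existing certification API behaves (a certificate is only available in non-replicated query executions), but everything here is a stand-in, not the real system API:

```rust
// Minimal model of the capability asymmetry: a certificate exists only
// in non-replicated executions. This mocks the behavior; the real
// certificate comes from the system, not from user code.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Mode {
    Replicated,
    NonReplicated,
}

// Returns Some(certificate) only in non-replicated mode, mirroring the
// fact that update functions (replicated-only) can never obtain one.
fn data_certificate(mode: Mode) -> Option<Vec<u8>> {
    match mode {
        Mode::NonReplicated => Some(vec![0xC3]), // certificate available
        Mode::Replicated => None,                // no certificate here
    }
}
```

So a query in replicated mode (or an update function) simply sees `None`, which is exactly the capability gap the rest of the post analyzes.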

In a comment above, I made a proposal for how functions executing in replicated mode will have different capabilities than functions executing in non-replicated mode. In particular, I proposed the following function signature:

fn read_state(…) → Result<CandidStruct, (HashTree, Certificate)>

When a function executing in the replicated mode calls this function, it gets back a CandidStruct and when a function executing in the non-replicated mode calls this function, it gets back a (HashTree, Certificate).

This API is not very nice because an update function will always have to handle the Err case even though we know that the Err case is impossible.
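The awkwardness is easy to see in a sketch. Below, `read_state_stub` is a hypothetical stand-in for the proposed system call; in replicated mode it would always return Ok, but the type cannot express that, so every update method ends up with a dead error branch:

```rust
// Placeholder types; not the real IC system API.
struct CandidStruct(u64);
struct HashTree;
struct Certificate;

// Stand-in for the proposed system call. In replicated mode this would
// always succeed, but the signature does not say so.
fn read_state_stub() -> Result<CandidStruct, (HashTree, Certificate)> {
    Ok(CandidStruct(42))
}

// An update function runs only in replicated mode, so Err can never
// happen, yet the type still forces us to write a branch for it.
fn update_method() -> u64 {
    match read_state_stub() {
        Ok(CandidStruct(v)) => v,
        // Dead code that every update method must nonetheless spell out.
        Err(_) => unreachable!("no certificates in replicated mode"),
    }
}
```

Having `unreachable!` (or an equivalent panic) sprinkled through every update method is exactly the kind of API friction the post objects to.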

What we are seeing above are examples of the different capabilities that query functions have when executing in the replicated and non-replicated modes. More specifically, update functions and query functions in non-replicated mode have different sets of capabilities and query functions in replicated mode have a subset of capabilities of the two.

We can come up with better names for the two classes of queries: queries only callable in non-replicated mode (Q1); and queries callable in both replicated and non-replicated modes (Q2).

We can then say that update functions can only call Q2 queries and in the future when we support inter-canister queries Q1 queries can also call Q2 queries. We can also refine the proposal for read_state to instead define two different functions:

fn certified_read_state(…) → (HashTree, Certificate)

fn non_certified_read_state(…) → CandidStruct

certified_read_state is only callable from Q1 and non_certified_read_state is callable from both update functions and Q2.
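Sketching the refined two-function proposal side by side shows how it removes the impossible error branch. Again, all types and bodies are illustrative mocks, not the real interface:

```rust
// Placeholder types; this sketches the two-function refinement above,
// it is not the real IC interface.
struct CandidStruct(String);
struct HashTree(Vec<u8>);
struct Certificate(Vec<u8>);

// Callable only from Q1 queries (non-replicated mode): always certified.
fn certified_read_state() -> (HashTree, Certificate) {
    (HashTree(vec![0xAA]), Certificate(vec![0xBB]))
}

// Callable from update functions and Q2 queries: plain data, no
// impossible Err case left to handle.
fn non_certified_read_state() -> CandidStruct {
    CandidStruct("state".into())
}

fn update_method() -> String {
    // An update method calls the non-certified variant directly.
    let CandidStruct(s) = non_certified_read_state();
    s
}
```

Each caller now picks the function whose return type matches its execution mode, so neither ever sees a case it cannot encounter.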


Good analysis! This bifurcation, making query methods no longer just a restricted kind of update method, has always bothered me (certified variables, but also inter-canister calls). But just giving up and letting the developer deal with this even increased complexity isn’t a satisfying answer either. Isn’t a goal to hide the complexities of blockchain and crypto from the user? I hope we can hide them at lower levels of abstraction that normal developers don’t deal with.

Like we have reasonably successfully hidden differences between ingress update calls and inter-canister calls from the developer (e.g. polling only needed in one of them; certification only involved in one of them). We should try hard to maintain that level of abstraction, also for query calls. After all, they are “just” an optimization…


I agree, the goal should always be to not expose unnecessary complexity to the user. However, we should also avoid the temptation of premature abstraction. In particular, in this case, maybe the complexity can be hidden by some language extension or a library. I would still like to experiment with the raw APIs without adding support for the abstractions in the system. Upgrading systems while having to maintain backwards compatibility is a chore.


I morally agree with @nomeata’s point that we should not make developers’ lives harder than necessary. But I also think that by using abstractions that do not quite fit, we did this already. The API for certification is one pretty good example. But there are more (meetings with @akhilesh.singhania are often insightful …), like:

  • It is unsafe to call untrusted canisters or canisters on untrusted subnets, which may never return, thus not allowing the calling canister to stop.
  • The semantics of calls to other canisters are not quite what one may expect, as state changes prior to the call are persisted. That means developers must be very careful not to unexpectedly leave state inconsistent during the call or when potentially trapping after it. Which in turn means that one has to understand intricacies of the platform even when working with “nice” abstractions.

To me it seems that sometimes abstractions that don’t quite fit may be worse than no abstractions at all.


Both good reasons to develop in the pure actor model and ditch this convenient but dangerous async stuff :slight_smile:


I don’t think that’s him.

Just want to chime in that once Enable canisters to make HTTP(S) requests is implemented, one can use it to call the read_state endpoint on the IC itself. Not exactly a satisfying solution, but it will be possible.


If a piece of private state is only meant to be read by controllers, then the security of a call must ensure that the caller is genuine, not just that the result is certified. But of course this is more of a question about how non-replicated query call is going to work.

That’s a fantastic idea. Certifying data is a huge pain and doesn’t scale well. The only way out of this misery is automation.

Amen. I wrote two async runtimes for Rust canisters, and I still hate all things async passionately. State machines are the only true way to specify and build distributed software.

Also, I wish I learned about TLA+ 3 years ago :slight_smile:


What is the current status of this? Has it been deployed or is it still in the design or implementation phase? Thanks.

I wanted to follow up if the read_state query function (callable by both canisters and external users) will launch soon. Thanks!

Hi @jzxchiang, this has not really been moved further than discussions about what an MVP solution would look like, so I wouldn’t expect it to launch soon.

Hey all,

Very very very late to the game here and only just seeing how useful this proposal is. ic-spec is an intimidating document!

Did we ever end up somewhere here? I know we have outbound HTTP requests so… technically… this is possible. But I think being able to verify the /canister/<canister_id>/* paths would solve a ton of composability issues we struggled with while building quark.
