Let's have a candid discussion about Candid

  • Has someone already used this title somewhere else? Seems familiar

TLDR

  1. Here’s where Demergent Labs stands on Candid vs JSON at the moment: https://x.com/lastmjs/status/1735690729723195808

  2. We would love feedback on our choice

  3. Should we as a community begin the process of deprioritizing Candid and replacing it with a more well-known serialization stack e.g. JSON and JSON schema?

IMO it isn’t only about which solution is technically superior, but about barriers to adoption.

Deprioritizing Candid at Demergent Labs

At Demergent Labs we’ve been building TypeScript/JavaScript and Python CDKs since 2022. Both projects are functional: one is about to leave beta and the other is in a good beta state.

We’ve also been engaged in documentation around these CDKs and generally figuring out the best way to educate people/devs about ICP, and to get them started with canister development.

As we see it, Candid and the tooling around it present a large barrier to learning how to develop canisters on ICP. Even with good abstractions, the concept of RPC with query and update methods, the need to generate bindings to hook up a front-end, and restricted non-native data structures (like records that can’t have dynamic field names) create significant hurdles for newcomers.

This is what we have observed over time in our education work, especially at in-person hackathons with relatively young web developers who are completely new to ICP. It is especially difficult for a dev to figure out how to hook up their frontend to their backend, as they must generate bindings, hook up an agent, and engage in RPC calls, as opposed to just using regular HTTP with window.fetch and JSON.

It’s also taken a significant amount (perhaps most) of our development time at Demergent Labs to properly integrate Candid into the CDKs, providing a proper level of abstraction that is a good developer experience, and dealing with bugs or misunderstandings in the spec or implementations. And Candid in JS is not very performant, so we’ve had to deprioritize its use in StableBTreeMap, opting for JSON instead. I would guess the general consensus is that Candid serde is not very performant relative to other choices.

So at Demergent Labs we have decided to deprioritize education and tooling around Candid and the query/update RPC paradigm. We can’t remove it yet, and it’s still very important to the ecosystem and thus the CDKs, but we will first be introducing interacting with canisters using regular HTTP REST APIs that use JSON as the serialization format.

This aligns with our general thesis: developer adoption is best facilitated by removing as many barriers to development as possible, and new concepts should be kept to a minimum and introduced as late as possible in the developer journey.

Thus, our goal is to provide first-class support and education around building JSON HTTP APIs using TypeScript, JavaScript, and Python. This will result in a more familiar server environment for new developers, and allow them to get started very quickly with minimal ICP-specific tooling.
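
To make the JSON-over-HTTP idea concrete, here is a minimal TypeScript sketch of a request handler shaped roughly like a canister's http_request method. It is not Azle's or Kybra's API: `handleRequest`, the `json` helper, the `/greet` route, and the use of plain strings for bodies (the real interface carries bytes) are all assumptions made for illustration.

```typescript
// Simplified request/response shapes, loosely modeled on what a canister's
// http_request method receives and returns. Bodies are plain strings here
// to keep the sketch short (an assumption; on the wire they are bytes).
type HttpRequest = {
  method: string;
  url: string;
  headers: [string, string][];
  body: string;
};

type HttpResponse = {
  status_code: number;
  headers: [string, string][];
  body: string;
};

// Build a JSON response with the given status code.
function json(status_code: number, value: unknown): HttpResponse {
  return {
    status_code,
    headers: [["content-type", "application/json"]],
    body: JSON.stringify(value),
  };
}

// Hypothetical handler: POST /greet with {"name": "..."} returns a greeting.
function handleRequest(req: HttpRequest): HttpResponse {
  if (req.method !== "POST" || req.url !== "/greet") {
    return json(404, { error: "not found" });
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(req.body);
  } catch {
    return json(400, { error: "body is not valid JSON" });
  }
  const name = (parsed as { name?: unknown } | null)?.name;
  if (typeof name !== "string") {
    return json(400, { error: "expected a string field 'name'" });
  }
  return json(200, { greeting: `Hello, ${name}!` });
}
```

A frontend could then talk to such an endpoint with an ordinary `fetch(url, { method: "POST", body: JSON.stringify({ name: "Ada" }) })` followed by `await response.json()`, which is the familiar workflow this post is arguing for.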

What do you think?

I would love comments and opinions on our decision.

I would also like to open up the discussion on Candid itself and its usefulness to the ecosystem. We should consider whether it would be better to embrace a more well-known and widely adopted serialization format; I suggest JSON and maybe JSON Schema.

Candid is not a requirement at the protocol level, and as canister developers we can choose to expose our canister endpoints as we please. Though Candid is the preferred method now, it is not necessary.

We should discuss moving away from Candid as a community. I would like to dig into its merits and drawbacks to see if this would be appropriate.

17 Likes

Would certainly save me a lot of time.

1 Like

Sounds great for Azle-to-frontend communication. I would still want to make Azle-to-canister calls in Candid and also expose Candid methods for other IC canisters to call.
I’ve personally made icblast (GitHub: infu/icblast) to take care of the obstacles around Candid in frontends. It removes the [0] opt (which is probably something new devs find very weird). It also has ‘toState/fromState’ functions that make objects coming from agentJs serializable: they contain classes and BigInt, and if you try to put them in Redux or JSON.stringify you will not get what you expect. Also, the bindings are automatically generated. There is no TypeScript support, but developing frontends using these little things is smooth.

I think Taggr uses json. I found it pretty hard to call their canisters - probably because there are no docs and tools. They may have something to say

2 Likes

The other most common obstacle is probably when a dev needs to call their canister to authorize a principal or pass init parameters. Then they have to learn a brand new language - text candid.
Perhaps dfx should allow json or we need some tool so this works:
dfx canister call method 'tool canister_id {json}' and it converts the json to text candid
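
As a rough illustration of what such a converter could look like, here is a minimal TypeScript sketch that turns a JSON value into textual Candid. It is not an existing dfx feature or tool; `jsonToCandidText`, `toDfxArgument`, and the helper names are hypothetical. Because it has no .did file to consult, the mapping is a guess: numbers are emitted as bare literals, arrays become `vec`, objects become `record`, and there is no way to express Candid-only types such as `principal` or `variant`.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Escape a string for use inside a Candid text literal.
function quote(s: string): string {
  const escaped = s
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n");
  return `"${escaped}"`;
}

// Plain identifiers are written bare; anything else is quoted.
// (Names that collide with Candid keywords are not handled in this sketch.)
function fieldName(name: string): string {
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(name) ? name : quote(name);
}

// Convert a JSON value to a textual Candid value without type information.
function jsonToCandidText(value: Json): string {
  if (value === null) return "null";
  if (typeof value === "boolean") return value ? "true" : "false";
  if (typeof value === "number") return String(value);
  if (typeof value === "string") return quote(value);
  if (Array.isArray(value)) {
    const items = value.map((item) => jsonToCandidText(item));
    return items.length === 0 ? "vec {}" : `vec { ${items.join("; ")} }`;
  }
  const obj: { [key: string]: Json } = value;
  const fields = Object.keys(obj).map(
    (key) => `${fieldName(key)} = ${jsonToCandidText(obj[key])}`
  );
  return fields.length === 0 ? "record {}" : `record { ${fields.join("; ")} }`;
}

// Wrap a single value in parentheses, the way call arguments are written.
function toDfxArgument(value: Json): string {
  return `(${jsonToCandidText(value)})`;
}
```

A real tool would read the canister's .did file and use the expected types to decide between, say, `nat` and `float64`, or `text` and `principal`.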

3 Likes

If I am not getting it wrong, this is actually great.
We need standards for HTTP REST APIs at the canister level to remove barriers for developers new to the ecosystem, so that anyone can start integrating a frontend with canisters without learning anything new.
Candid has a few perks: you do not need to write extra code for a REST API, and any non-technical person can do testing. But from a dev perspective, REST APIs and good JSON handling at the canister level are very much needed.

1 Like

100% agree, but I really like Internet Identity…

How do http_request and http_request_update factor into this discussion?

I just implemented the http_request support in icpp-pro (C++ CDK), and was able to completely abstract away Candid:

  IC_HttpRequest request;
  ic_api.from_wire(request);
...
// request is a regular C++ structure.
// No Candid knowledge needed to receive and process regular REST calls based on JSON.
...
  IC_HttpResponse response;
  response.status_code = 200;
  response.headers = request.headers;
  response.body = request.body;
  response.upgrade = false;
  ic_api.to_wire(response);

I saw that the Rust CDK also abstracts away the Candid in the http_request wrapper, but not completely.

What more beyond abstracting away Candid from the developer who wants to implement REST with JSON is needed?

I am checking into porting the oats webserver, but that is a very early idea only.

It would be very interesting to hear more details at the technical level for Kybra.

2 Likes

Before collecting my thoughts into a formal opinion, I’ll lay out a few thoughts and questions (not necessarily for you @lastmjs, but also for DFINITY).

  1. This has been an immense frustration of mine as well, but it goes even deeper than just CANDID. There is also CBOR. The first time I jumped on DSCVR and started sniffing network traffic, I remember this distinct crestfallen feeling when I started looking at the request and response bodies and realized that thousands and thousands of casual hackers were just going to bounce right off because they couldn’t inspect the traffic. (h/t to @jorgenbuilder and the awesome IC Inspector that finally made this 1000x easier, but still, convincing devs to install a plug-in to get started seemed to be unnecessary friction.)

  2. There is straight JSON, and then there is CANDID as JSON which you feed to the agent. Just passing both in the HTTP request body instead of having to go through an agent would be a big improvement I think.

  3. How does the IC handle transaction de-dupe at the ingress layer (or does it even)? There is specific code in the ledger canisters that tries to provide some dedupe, but I also assumed that there was something happening at the agent/request level to make sure that a malicious boundary node can’t just keep feeding your signed request back into the IC over and over (i.e., someone grabs your post to taggr and then spams taggr with it over and over until you run out of charge, creating the same post over and over). So does the IC do something to keep this from happening, and if so, what does REST-based access need to do to get the same protections? Ethereum transactions have a nonce that is always increasing that has to be submitted with each request, and the state machine inside the EVM actually keeps track of this nonce for every account. Duplicating that kind of architecture across subnets would be untenable. Perhaps the agent is doing something under the hood with the root and/or subnet sig to keep things from being duped (maybe with a sliding trx window). Or maybe it is all on the canister writer, in which case we need to beef up our tutorials on defending against malicious boundary nodes.

  4. Does the current http_request and http_request_update pathway not suffice for what you want to do? You can certainly send PUTs, POSTs, and GETs there, but I guess you need to handle identifying a principal in some custom way… and you have to use the raw endpoint because… (see 5)

  5. I know certification plays into this as well and that even certification v2 isn’t robust enough to handle the likely infinite responses from an API (although this likely only affects queries… but I think upgraded updates are validated by boundary nodes as well). The agent is (I believe) doing some kind of validation of the results based on the root key so that your web dapp code never gets results from data it can’t certify. You’ll have to push this functionality to the boundary nodes (similar to the way certification works today). (I’ve talked a bit with @NathanosDev about a potential v3 of certification that defines atomic level certifications for data elements, but man does it sound like a headache to manage, and it would need some serious tooling to make it easy for devs.)

  6. How do we manage user sessions and the handling of principals? Do you just put a signature in the request headers that proves the principal? Do you try to do session management? That seems hard until we have SEV because the sessions would theoretically be inspectable by a node.

  7. The nice thing about candid… and I admit this is a bit abstract… is that it has “good shape” and is “shaped like the Internet Computer”. Variants are an awesome abstraction for the kinds of state machines that are ideal for the IC. The ability for it to compress down and make transport faster than JSON likely has a real effect over billions and billions of messages. But again there is likely an abstraction of “CANDID as JSON” that: 1) is at least JSON, so new devs (and AI coding tools) can write it with no friction, 2) can be defined with tools like swagger and json-schema, 3) can be converted by boundary nodes at run time. It would be interesting to hear from DFINITY how much additional overhead this step might lead to and how many ingress messages per second we’d be taking off the table.

  8. The future composable and interoperable IC has far more dynamic inter-canister messages than ingress messages so the additional text-based parsing of inter-canister messages likely has a real effect and is likely why candid and cbor were chosen in the first place.

  9. I’d imagine that, just as the boundary nodes could translate ‘CANDID as JSON’ to CANDID, there is an express plug-in that would convert ‘CANDID to CANDID as JSON’ on the canister side.
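
The “sliding trx window” idea from item 3 can be sketched in a few lines of TypeScript. This is only an illustration of the technique under stated assumptions, not a description of what the IC, the agent, or the ledger canisters actually do: `ReplayGuard`, its `accept` method, and the millisecond timestamps are hypothetical. Each request carries an id and an expiry; the guard rejects anything expired, anything that expires too far in the future, and anything it has already seen, and it forgets ids once their expiry has passed so the table stays bounded.

```typescript
// A minimal replay guard: remember each request id until its expiry passes.
class ReplayGuard {
  // request id -> expiry time in milliseconds
  private seen: { [id: string]: number } = Object.create(null);
  private maxWindowMs: number;

  constructor(maxWindowMs: number) {
    this.maxWindowMs = maxWindowMs;
  }

  // Returns true if the request should be processed, false if it is rejected.
  accept(requestId: string, expiryMs: number, nowMs: number): boolean {
    this.prune(nowMs);
    if (expiryMs <= nowMs) return false; // already expired
    if (expiryMs > nowMs + this.maxWindowMs) return false; // expiry too far ahead
    if (requestId in this.seen) return false; // replay inside the window
    this.seen[requestId] = expiryMs;
    return true;
  }

  // Drop ids whose expiry has passed; replays of them fail the expiry check.
  private prune(nowMs: number): void {
    for (const id of Object.keys(this.seen)) {
      const expiry = this.seen[id];
      if (expiry !== undefined && expiry <= nowMs) delete this.seen[id];
    }
  }
}
```

The key property is that an id only needs to be remembered while its request is still valid, so no per-account, ever-increasing nonce has to be tracked.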

3 Likes

Thanks for starting this community discussion!

In a very abstract sense, I would say that the only thing Candid brings, compared to other serialization formats, is the type level guarantee of interface backward compatibility. If this feature is not important to canisters, there is no point in using Candid. In an ideal IC ecosystem, canisters depend on each other. We don’t want one canister upgrade to ever break other canisters, especially considering some canisters are immutable (e.g., blackholed), and they cannot keep up with the new interface change of other canisters. That’s the whole motivation of starting a new strongly typed serialization format like Candid. And we are not alone in this endeavor: if you look at the new WASI component model, the WASI interface type has a very similar flavor to Candid.
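
To illustrate the kind of check this guarantee rests on, here is a deliberately simplified TypeScript model of record subtyping. It is an assumption-laden sketch, not the Candid specification's full relation: it only covers a few primitives, `opt`, and records, and the names `CandidType`, `prim`, `opt`, `record`, and `isSubtype` are hypothetical. The question it answers is: can a value written with type `sub` be read by someone who expects `sup`?

```typescript
type CandidType =
  | { kind: "prim"; name: string }
  | { kind: "opt"; inner: CandidType }
  | { kind: "record"; fields: { [name: string]: CandidType } };

const prim = (name: string): CandidType => ({ kind: "prim", name });
const opt = (inner: CandidType): CandidType => ({ kind: "opt", inner });
const record = (fields: { [name: string]: CandidType }): CandidType => ({
  kind: "record",
  fields,
});

// Can a value of type `sub` be decoded by a reader that expects `sup`?
function isSubtype(sub: CandidType, sup: CandidType): boolean {
  // In this simplified model, a reader expecting an optional value can
  // always fall back to null, so anything is accepted where opt is expected.
  if (sup.kind === "opt") return true;
  if (sub.kind === "prim" && sup.kind === "prim") {
    return sub.name === sup.name || (sub.name === "nat" && sup.name === "int");
  }
  if (sub.kind === "record" && sup.kind === "record") {
    const subFields = sub.fields;
    const supFields = sup.fields;
    // Every field the reader expects must be present and compatible, or be
    // optional. Extra fields in `sub` are simply ignored by the reader.
    return Object.keys(supFields).every((name) => {
      const expected = supFields[name];
      if (!Object.prototype.hasOwnProperty.call(subFields, name)) {
        return expected.kind === "opt";
      }
      return isSubtype(subFields[name], expected);
    });
  }
  return false;
}
```

Under this model, adding a field to a method's result is safe for old callers, while a new required field in an argument is not; making the new argument field `opt` restores compatibility.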

Candid and the tooling around it provide a large barrier to learning how to develop canisters on ICP.

That’s true. But I think learning any new serialization format takes a lot of time. If the developer is new to protobuf, GraphQL or JSON, it’s not easy to learn in a few minutes.

the concept of RPC with query and update methods

That seems to be a separate problem, not related to Candid.

dealing with restricted non-native data structures (like records that can’t have dynamic field names) causes significant hurdles for newcomers.

How many production bugs come from JSON being able to arbitrarily add and remove fields?

It is especially difficult for a dev to figure out how to hook up their frontend to their backend, as they must generate bindings, hook up an agent, and engage in RPC calls, as opposed to just using regular HTTP with window.fetch and JSON.

Agreed. We should provide better documentation and tooling to make this easier. But I don’t see it as a reason to move away from Candid. Also if the developers are familiar with protobuf, our binding generation is very similar.

It’s also taken a significant amount (perhaps most) of our development time at Demergent Labs to properly integrate Candid into the CDKs

Totally agree. The largest effort in developing a new CDK is probably developing a Candid library that feels ergonomic and native in the host language. But I would argue that this is a one-time cost for the CDK developers, and all canister developers can benefit from this effort. Developing any binding library takes a lot of time. As an example, you can look at how many years it took to develop serde_json and how many language features were added to the Rust compiler due to this development.

And Candid in JS is not very performant, so we’ve had to deprioritize its use in StableBTreeMap, opting for JSON instead.

If the usage doesn’t require interface backward compatibility, you don’t have to use Candid. You are free to use a more performant and space efficient serialization format for storing internal data. That’s totally fine.

I would guess the general consensus is that Candid serde is not very performant relative to other choices.

Yes. The focus is to ensure backward compatibility in deserialization. serde is in general not very performant. We use serde mainly to improve the UX, so that users can write native Rust structures. If you have a particular use case, which is too slow, we are happy to look into how to improve the performance.

we will first be introducing interacting with canisters using regular HTTP REST APIs that use JSON as the serialization format.

There could be a way to wrap Candid, so that it exposes a REST API at the higher level. But I haven’t given it much thought yet. The problem with JSON is that it’s unstructured. It’s easy to get started, but you can get a lot of bugs later on, and it’s not composable.

developer adoption is best facilitated by removing as many barriers to development as possible, and new concepts should be kept to a minimum and introduced as late as possible in the developer journey.

Agreed. We share the same goal!

Candid is not a requirement at the protocol level, and as canister developers we can choose to expose our canister endpoints as we please.

The interface spec doesn’t require Candid, mainly due to separation of concerns. Candid sits higher than the protocol level. Theoretically, you can use other formats, but all current tooling is built on the assumption that the canister speaks Candid. Using other formats would cause fragmentation of the ecosystem. Canisters using Candid can never talk to canisters which don’t use Candid. Plus you lose the backward compatibility guarantee.

8 Likes

I’ve been wondering about this. Is the component model something that dfinity is considering to support? It seems to be gaining a lot of traction: WebAssembly/component-model on GitHub (repository for design and specification of the Component Model).
WIT would be their alternative to candid: component-model/design/mvp/WIT.md at main · WebAssembly/component-model · GitHub

2 Likes

Imo Candid should definitely stay the only inter-canister communication format.
I like @skilesare 's idea - boundary nodes being able to transform it bidirectionally to JSON and other formats.
Most devs start by making their canisters work only for their frontend and they plan on releasing documentation and API for other devs in the far future. If they use other formats, that won’t help us increase the network effect. It would be way harder to create something using other canisters if Candid wasn’t the dominant format and everyone picked their personal favorite format - Msgpack, JSON, Protobuf, Candid, CBOR, etc.
So it will be best if boundary nodes do the transformation or CDKs somehow allow multiple input/output formats.

1 Like

An OpenAPI documentation integration would be cool.

2 Likes

The other most common obstacle is probably when a dev needs to call their canister to authorize a principal or pass init parameters. Then they have to learn a brand new language - text candid.
Perhaps dfx should allow json or we need some tool so this works

@infu This is totally doable, and it has been on my to-do list for a while. Candid has mappings to various languages, e.g., JS, Motoko and Rust. It’s not hard to let users write values in these languages and have dfx convert them into Candid.

We need standards for HTTP Rest apis at canister level to remove barriers for new developers to the ecosystem.

@h1teshtr1path1 The IC has always supported a REST API (see The Internet Computer Interface Specification | Internet Computer), and a REST API has nothing to do with JSON. The agent library provides an abstraction to make communicating with the IC easier, without directly using the REST API. But if you want, you can certainly use the HTTPS endpoint directly.

WIT would be their alternative to candid

@Tbd I would say that WIT and Candid are complementary to each other. In the WIT spec, they explicitly mention that subtyping is not implemented: component-model/design/mvp/WIT.md at main · WebAssembly/component-model · GitHub. And subtyping is arguably the only thing we implement in Candid.

I like @skilesare 's idea - boundary nodes being able to transform it bidirectionally to JSON and other formats.

@infu @skilesare Boundary nodes only take traffic from outside of the IC. Inter-canister calls won’t go through boundary nodes. So a transformer at the boundary node level doesn’t help with inter-canister calls.

3 Likes

I do note that under the hood, candid is quite flexible, so e.g. it can handle new fields being added or expected fields being omitted if that is really wanted or needed. BUT this needs diving deep into things such as IDLTypes. I feel as if there is quite a bit of space to make candid more ergonomic without radical design changes. I write this as someone who primarily consumes candid from other sources, sometimes with mismatching .did files and with fields being added or removed unexpectedly. I am also a bit of an outsider, as I don’t work on Candid core but have made tools such as idl2json to make my own use of candid easier. Of course that doesn’t mean that canisters MUST use candid but I do feel optimistic about candid being a good common language for canisters in the long term.

Maybe my next tool (after finally updating idl2json) should be a tool that converts a .did file into a JSON schema. :wink:

5 Likes

Maybe better would be a tool that converts json schema to a .did file. That way a developer focused on JSON can get candid for interoperability in the IC ecosystem without having to learn candid.
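
A minimal sketch of what such a converter might look like, in TypeScript. It only understands a small subset of JSON Schema, and the mapping is one possible choice rather than an established convention: `integer` becomes `int`, `number` becomes `float64`, and properties not listed in `required` become `opt`. The names `Schema`, `schemaToCandid`, and `didTypeDeclaration` are hypothetical.

```typescript
// The small subset of JSON Schema this sketch understands.
type Schema = {
  type: "string" | "integer" | "number" | "boolean" | "array" | "object";
  items?: Schema;
  properties?: { [name: string]: Schema };
  required?: string[];
};

// Map a schema to a Candid type expression.
// Property names are assumed to be plain identifiers.
function schemaToCandid(schema: Schema): string {
  switch (schema.type) {
    case "string":
      return "text";
    case "integer":
      return "int";
    case "number":
      return "float64";
    case "boolean":
      return "bool";
    case "array":
      if (schema.items === undefined) {
        throw new Error("array schema needs 'items'");
      }
      return `vec ${schemaToCandid(schema.items)}`;
    case "object": {
      const props: { [name: string]: Schema } = schema.properties ?? {};
      const required: string[] = schema.required ?? [];
      const fields = Object.keys(props).map((name) => {
        const fieldType = schemaToCandid(props[name]);
        // Properties that are not required become optional fields.
        return required.indexOf(name) >= 0
          ? `${name} : ${fieldType}`
          : `${name} : opt ${fieldType}`;
      });
      return fields.length === 0 ? "record {}" : `record { ${fields.join("; ")} }`;
    }
    default:
      throw new Error("unsupported schema type");
  }
}

// Produce a named type declaration suitable for a .did file.
function didTypeDeclaration(name: string, schema: Schema): string {
  return `type ${name} = ${schemaToCandid(schema)};`;
}
```

Features such as `oneOf`, `enum`, or `$ref` would need a mapping to variants and named types, which is where most of the real design work would be.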

3 Likes

I’m curious about the obstacle of learning candid: does it mainly refer to the textual representation? If so, we can totally hide this from dfx, e.g., allowing users to use JSON, Rust, or Motoko with dfx, and convert them into Candid under the hood.

2 Likes

In your experience, is JSON Schema a thing in real world development? Do front end devs use e.g. typescript to generate JSON Schema? Or is it just something that pedants and people who love messing around with these things (I stand guilty as charged - I am a bit paranoid about critical systems not breaking) create and everyone else ignores?

I think that needs to be answered by people having difficulty with it today. I learnt candid years ago when the documentation was minimal and there were few examples. The pain I had is probably very different from the pain that newcomers to the ecosystem have today. That said, my impression from talking to folks is that it is just another thing to grapple with when devs are already being overloaded with new concepts. So, maybe causing candid to disappear into the background, at least early in the development cycle, would flatten the learning curve. Or good clear guidelines on when Candid is needed (interoperability with other canisters) and when it is optional.

2 Likes

Describing interfaces in Candid is good, way better than JSON Schema.
image

Writing text Candid in the command line or scripts in combination with other complexities is where it becomes overwhelming.

This reminds me of ffmpeg - which can have really complex command line input:
image

And someone wrote this library https://www.npmjs.com/package/fluent-ffmpeg making things a bit tidier. It just generates the text command and runs it.

We could have something like that for dfx. The part of the script above will turn into this:


let neuron_id = await dfx.network(NETWORK)
.canister(GOVERNANCE_CANISTER_ID)
.candid("candid/sns_governance.did")
.call("list_neurons", {of_principal:DX_PRINCIPAL, limit:1})
.then(x => x.neurons[0].id.id)

I like this a lot… so I’ll add few more commands :slight_smile:


let me = await dfx.identity("some").getPrincipal()

await dfx.build("main");

await dfx.canister(GOVERNANCE_CANISTER_ID)
.install("./build/main.wasm")
.candid("./build/main.did")
.initParams({ 
   admin: me,
   some: [1,2,3,4,5],
   logo: await getFile("./mylogo.png")
   });

9 Likes

I agree Candid and bash don’t really work well together. That’s why I try to use ic-repl for scripting. The above examples can be written as follows in ic-repl:

identity default "~/.config/dfx/identity/default/identity.pem";
call nns.list_neurons(record {of_principal = opt default; limit = 1});
let neuron_id = _.neurons[0].id[0].id;
neuron_id

function deploy(wasm, init) {
  let id = call ic.provisional_create_canister_with_cycles(record { settings = null; amount = null });
  call ic.install_code(
    record {
      arg = encode wasm.__init_args(init);
      wasm_module = wasm;
      mode = variant { install };
      canister_id = id.canister_id;
    },
  );
  id
};
identity default "~/.config/dfx/identity/default/identity.pem";
let canister = deploy(
  file("build/main.wasm"), 
  record { admin = default; some = vec {1;2;3}; logo = file("./mylogo.png"); }
);

2 Likes

It massively helps web developers. Web “hackers” turn into developers, but they start by hacking. They start by intercepting the network traffic(using the network tab in chrome) to see what is sent to the server and what comes back. If every dapp on the IC had clear text json going from the dapp to the boundary nodes and clear text json responses coming back I’d estimate that we’d have had orders of magnitude more developers engaging with our dapps.

1 Like