Community Consideration: Explore Query Charging

I also agree with the statement above about adding both on-chain and off-chain GPU compute facilities to the IC.

The greatest need for the data protection guarantees offered by the IC is in AI inference queries, which will process vast amounts of private and protected information about individuals and businesses. Inference queries clearly belong on-chain, using CPU and GPU computation. That makes it important to support charging for queries with larger instruction limits, and to let callers choose between single replica node query computation (the default) and verified query computation (by choice).

For AI model training, which typically has much higher memory, throughput and data storage requirements, off-chain GPU compute will be necessary in the near term. We can discuss this in more detail over in the DeAI WG thread to keep this thread focused on query charging. However, one related question for the Dfinity engineering team: who should we ask about the goals and implementation progress for on-chain GPU compute in the Gen 3 replica node specification currently under development and testing at Dfinity?
In our last meeting, the DeAI WG discussed the need to better understand how canister developers might use Gen 3 replica node GPU compute in the future, and how the choice of physical GPUs specified for Gen 3 will align with their product development requirements for AI inference and model training/fine-tuning.
@stefan-kaestle, could you point us in the right direction?

As this is somewhat off-topic, we can take the GPU discussion over here: Technical Working Group DeAI
