Parallel execution for canisters

Thanks a lot for all the input, everyone. @Kurt I haven’t dived into all the technical details you shared regarding PyPIM, but that’s where having a discussion over a call would be beneficial as we discussed via DM.

I want to address a few high-level points for now:

I think “off-chain” and “on-chain” are overloaded terms. What is usually meant by “off-chain” is that whatever is being referenced has a weak trust model and/or is unreliable, and the converse for “on-chain”.

When it comes to the AI workers approach we’re exploring, there are different ways of decentralizing it. For example, if AI workers are hosted by the node providers with a similar replication factor as subnets, then it would have the exact same trust model as most ICP subnets, and would therefore be just as “on-chain” as anything else on the Internet Computer.

In that case, would they not meet your trust requirements? If you have a specific use-case in mind (real use-cases, not hypothetical ones, please), I’d be happy to discuss further.

Thanks for the input. I’m not familiar with JAM, but will happily take a look at their approach. A (very) quick look at their docs shows the following:

JAM, short for Join-Accumulate Machine, represents a prospective design to succeed the relay chain […] However, within the actual chain, only the Join and Accumulate functions are executed, while the Collect and Refine processes occur off-chain.

So in my first-pass understanding, it does seem like the computation happens off-chain. But, as I alluded to earlier, I’d rather not have a conversation about what counts as “on-chain” vs. “off-chain”. It would be far more productive to talk tech and discuss specific solutions or alternatives to AI workers rather than compare projects based on high-level claims.

Hey Jdcv97, comparing ICP’s 1B parameter limit to ChatGPT’s 1.8T is like comparing apples to sheep—it misses the point. ICP isn’t trying to brute-force massive models on-chain; it’s built for something different. The real bottleneck isn’t GPUs—those are already old news next to PyPIM, photonic chips, and AI-specific ASICs. Dfinity’s delay in rolling out GPUs might actually be a blessing, letting them skip the outdated tech and jump straight to what’s next.

Edge AI is coming fast. In the next 18 months, we’ll have mini AI models running on our phones, with data locked down in ICP canisters. That’s not just hosting—it’s secure, decentralized execution at a level no centralized cloud can touch. ICP’s play isn’t about raw compute; it’s about being the trust layer for AI agents and sensitive data in a world where security trumps everything.

Why do we always have to rely on technologies built by big tech companies? I thought this R&D team’s whole purpose was to create new tech from scratch. There’s nothing that innovative that Dfinity has built for ICP: the QUIC protocol from Google, ECDSA, BLS, threshold signatures, WebAssembly.

Honestly, I thought Dfinity’s purpose was to create something revolutionary and innovative, but we are stagnant and relying on traditional tech companies to build things for us.

You mentioned something called “PyPIM”. I will look into that, but again you are suggesting we use someone else’s tech.

So why do we have 200 “engineers” if these guys can’t create anything that helps us scale? I get the ChatGPT part; I’m just saying 1B parameters is bullshit, nothing of actual value in the real world. Which LLM are you going to execute in these canisters if they can’t process more than 1B parameters?

Another thing: when you ask these engineers for solutions, they never come back with “we will create new tech that addresses this particular limitation.” Instead they give the easy answer: “there’s no solution yet and we don’t know what to do” (not their exact words, but that’s the message). Sometimes they also say “if the community has any ideas on how we can do it, we’re open to suggestions.” Come on, if they are asking the community what to do, then we are basically fucked (sorry for the term).

@Jan what is this R&D team about? Are these guys really building new technologies, or are they just waiting in the office until IBM, Google, or Apple builds something that helps us?

Will these agents, the LLM code, be running inside canisters, with execution going through consensus? That’s what I mean by on-chain. But if that code is being executed inside AWS EKS with GPUs inside a Docker container, then that’s off-chain.

My use case is not something I’m building but what is being sold by @dominicwilliams to his investors and users: “LLMs will be running inside canisters, absorbing blockchain properties.” Enterprises will run their custom AI inside canisters; law firms will host their AI in these canisters so their data can’t be hacked or suffer prompt injection.

Can the AI workers you mentioned be decentralized? I mean, give the code to the SNS, with the SNS having on-chain control over that code?

PyPIM has not been used by anyone in a blockchain/crypto setup; the idea to incorporate it is mine, and the technology is very new.

A bit more about it for context:

There are a few companies manufacturing PIM chips, but it’s more about the architecture than the hardware.

The paper titled “PyPIM: Integrating Digital Processing-in-Memory from Microarchitectural Design to Python Tensors” presents a framework called PyPIM, which bridges the gap between high-level Python programming and low-level microarchitectural design for digital processing-in-memory (PIM) systems. The framework is designed to simplify the development of PIM applications and to make it easy to port existing tensor-oriented Python programs to PIM.

Key Concepts and Contributions:

Processing-in-Memory (PIM):

PIM architectures perform computations directly within the memory, reducing the need for data transfer between the CPU and memory, thereby mitigating the “memory wall” problem.

The paper focuses on digital memristive PIM, which uses memristors for both storage and logic operations.
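To make the idea concrete (this is an illustration of digital PIM in general, not PyPIM’s actual implementation): memristive PIM designs typically compute logic gates such as NOR directly inside the memory array, with every row operating in parallel, so data never travels to a CPU. A toy NumPy sketch that models a crossbar as a bit matrix:

```python
import numpy as np

# Toy model: each crossbar row holds bit-cells; an in-array NOR
# combines two columns into a third without moving data off-chip.
crossbar = np.array([[1, 0, 0],
                     [0, 0, 0],
                     [1, 1, 0],
                     [0, 1, 0]], dtype=np.uint8)

def in_array_nor(xbar, col_a, col_b, col_out):
    """Simulate a row-parallel NOR: every row computes simultaneously."""
    xbar[:, col_out] = 1 - (xbar[:, col_a] | xbar[:, col_b])

in_array_nor(crossbar, 0, 1, 2)
print(crossbar[:, 2])  # NOR of columns 0 and 1, computed "in memory"
```

Since NOR is functionally complete, arbitrary arithmetic can in principle be composed from operations like this one, which is what makes general-purpose in-memory computing possible.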

Microarchitecture and ISA:

The paper proposes a microarchitecture that supports efficient operation decoding for partitions, flexible addressing, and inter-crossbar communication.

An instruction set architecture (ISA) is introduced to abstract the implementation details of memristive digital PIM, enabling general-purpose PIM algorithm development.

Development Library:

A high-level Python library is proposed, which allows developers to write PIM applications using familiar tensor operations, similar to NumPy and PyTorch.

The library includes dynamic memory management and general-purpose algorithms, making it easier to develop PIM applications.
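As a hypothetical sketch of what such a tensor-style interface could look like: the `PIMTensor` class and its methods below are invented for illustration and are backed by plain NumPy, not by real PIM hardware or PyPIM’s actual API.

```python
import numpy as np

class PIMTensor:
    """Hypothetical tensor wrapper: on real hardware, operations would
    be dispatched to PIM crossbars; here they fall back to NumPy."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=np.float32)

    def __add__(self, other):
        # On PIM hardware this would be a row-parallel in-memory add.
        return PIMTensor(self.data + other.data)

    def matmul(self, other):
        # Matrix multiply, the dominant op in deep-learning workloads.
        return PIMTensor(self.data @ other.data)

a = PIMTensor([[1.0, 2.0], [3.0, 4.0]])
b = PIMTensor([[5.0, 6.0], [7.0, 8.0]])
c = a.matmul(b)
print(c.data)  # [[19. 22.] [43. 50.]]
```

The appeal of this style is that a NumPy or PyTorch program could be retargeted to PIM with minimal changes, which is the portability claim the paper makes.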

Host Driver:

A host driver translates high-level Python code into low-level micro-operations, enabling efficient execution on PIM hardware.

The driver is designed to be flexible and can be updated without replacing the hardware, so it does not become a bottleneck for PIM performance.
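A minimal sketch of the driver idea, with invented micro-op names (ACTIVATE/ADD are illustrative, not a real ISA): the point is only that one high-level operation fans out into per-partition micro-operations, and that this translation lives in software and can be updated without touching the hardware.

```python
def compile_elementwise_add(dst, src_a, src_b, num_partitions):
    """Translate one high-level tensor add into per-partition micro-ops.
    Micro-op names here are illustrative, not PyPIM's actual ISA."""
    micro_ops = []
    for p in range(num_partitions):
        micro_ops.append(("ACTIVATE", p))                 # select partition p
        micro_ops.append(("ADD", p, dst, src_a, src_b))   # in-array add
    return micro_ops

program = compile_elementwise_add(dst=2, src_a=0, src_b=1, num_partitions=4)
print(len(program))  # 2 micro-ops per partition -> 8
```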

Simulator:

A GPU-accelerated simulator is developed to verify the correctness and performance of PIM applications, serving as a drop-in replacement for physical PIM chips.

Potential Impact on ICP Crypto with AI:

Efficient AI Computations:

AI applications, particularly those involving deep learning, often require extensive matrix and tensor operations. PyPIM’s ability to perform these operations directly in memory can significantly speed up computations by reducing data transfer times.

The framework’s support for high-throughput arithmetic and parallelism can be leveraged to accelerate AI model training and inference.
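For example, inference through a dense layer is dominated by exactly the matrix-vector arithmetic that in-memory parallelism targets. A plain NumPy version of the computation that a PIM device would keep inside the array (weights stay resident; only the activations move):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8)).astype(np.float32)  # layer weights
x = rng.standard_normal(8).astype(np.float32)       # input activations
b = np.zeros(4, dtype=np.float32)                   # bias

# One dense-layer forward pass: matmul + bias + ReLU.
# On a PIM device, W stays resident in the crossbar; only x and y move.
y = np.maximum(W @ x + b, 0.0)
print(y.shape)  # (4,)
```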

Integration with Existing AI Frameworks:

PyPIM’s Python library can be integrated with existing AI frameworks like PyTorch and TensorFlow, allowing developers to seamlessly incorporate PIM capabilities into their AI pipelines.

This integration can help optimize AI workloads, making them more efficient and faster.

Cryptographic Applications:

Cryptographic algorithms often involve intensive computational tasks, such as encryption, decryption, and key generation. PyPIM’s efficient processing capabilities can be utilized to accelerate these tasks.

The framework’s support for bitwise operations and parallelism can be particularly beneficial for cryptographic algorithms that rely on such operations.
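As a simple illustration of the bitwise-parallel workloads meant here: XOR-based encryption, the core of stream ciphers and one-time pads, maps directly onto the kind of wide, row-parallel bit operations a PIM array performs. A NumPy sketch:

```python
import numpy as np

msg = np.frombuffer(b"on-chain secret!", dtype=np.uint8)
key = np.frombuffer(b"0123456789abcdef", dtype=np.uint8)

# XOR every byte in parallel -- the kind of wide bitwise operation
# a PIM array can apply across all rows at once.
cipher = msg ^ key
plain = cipher ^ key  # XOR with the same key decrypts

print(plain.tobytes())  # round-trips back to the original message
```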

Energy Efficiency:

PIM architectures are known for their energy efficiency, as they reduce the need for data movement. This can be particularly advantageous for AI and cryptographic applications that require significant computational resources.

By leveraging PyPIM, AI applications on ICP could achieve more energy-efficient computation, which is crucial for sustainable and scalable solutions.

Scalability:

The framework’s support for inter-crossbar communication and dynamic memory management can help scale AI and cryptographic applications to handle larger datasets and more complex computations.

This scalability is essential for applications that require processing large volumes of data, such as training deep learning models on extensive datasets.

In summary, PyPIM offers a comprehensive framework for integrating digital PIM capabilities into AI and cryptographic applications.