Technical Working Group DeAI

looks interesting, thank you @apotheosis ! That sounds great for a session :+1:
Next Thursday we’ll have a session focused on hardware and infrastructure, but maybe the week after (Sep 25), or Oct 2 if that works better?

Absolutely relevant to our long-running discussions about whether it would be possible to reach consensus over the outputs of identical LLMs running on identically configured accelerator hardware in a subnet. I was reading their article about this earlier today. You can find it here Defeating Nondeterminism in LLM Inference - Thinking Machines Lab

1 Like

Hi everyone, thank you for today’s call (2025.09.11). Special thanks to Sanjeev for sharing his expertise on Active Inference, and Tim for organizing this session! This is the generated summary (short version, please find the long version here): This talk by Sanjeev Namjoshi (VERSES) introduced Active Inference and the Free Energy Principle as a unifying framework for perception, action, learning, and self-organization. It explained how agents build generative models to minimize free energy by aligning predictions with observations, balancing exploration and exploitation, and maintaining identity within preferred states. Key concepts included surprisal, variational free energy, Markov blankets, predictive coding, and Bayesian mechanics as a potential “fourth branch of physics.” Practical notes highlighted modularity, multi-agent systems, and data efficiency compared to deep RL, with current tooling spanning MATLAB (SPM), PyMDP, RxInfer, and ActiveInference.jl. VERSES applies these ideas in industrial domains such as robotics, logistics, and supply chains, valuing interpretability, efficiency, and adaptability.
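As a side note for anyone new to the framework, the core quantities are easy to play with numerically. Here is a toy sketch (illustrative numbers only, not from the talk) of surprisal and variational free energy for a two-state discrete generative model:

```python
import math

# Toy discrete generative model: two hidden states s, one observed outcome o.
# The numbers are illustrative assumptions, not from the presentation.
p_s = [0.5, 0.5]            # prior p(s)
p_o_given_s = [0.9, 0.2]    # likelihood p(o | s) for the observed o

# Marginal likelihood p(o) and surprisal -ln p(o)
p_o = sum(ps * po for ps, po in zip(p_s, p_o_given_s))
surprisal = -math.log(p_o)

def free_energy(q):
    """Variational free energy F = E_q[ln q(s) - ln p(o, s)]."""
    return sum(
        qs * (math.log(qs) - math.log(p_s[i] * p_o_given_s[i]))
        for i, qs in enumerate(q) if qs > 0
    )

# The exact posterior p(s | o) minimizes F, at which point F equals surprisal.
posterior = [p_s[i] * p_o_given_s[i] / p_o for i in range(2)]

print(round(surprisal, 4))
print(round(free_energy([0.5, 0.5]), 4))  # crude belief: F sits above surprisal
print(round(free_energy(posterior), 4))   # optimal belief: F equals surprisal
```

With the exact posterior, free energy equals the surprisal; any cruder belief sits above it, which is exactly what "minimizing free energy" exploits as an upper bound.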

Links shared during the call:

3 Likes

Might be a good ‘discussion session’ then. I am in no way an expert on the matter, but we could discuss the article and what it means in the context of blockchain on the 25th, as @patnorris suggested.

3 Likes

that sounds like a plan :+1:

Hi everyone, join us this Thursday for our call on Accelerated Infrastructure for AI where we focus on hardware and infrastructure to support AI workloads on ICP. @icarus will lead the session :muscle: I’m looking forward to it, see you then

5 Likes

Hi everyone, thank you for today’s call (2025.09.18). Special thanks to @icarus for leading the call and sharing his expertise! This is the generated summary (short version, please find the long version here): The call explored FPGA-based AI accelerators (e.g., Intel Agilex/Positron) as a flexible, “computer-on-a-card” middle path between expensive NVIDIA GPUs and risky ASICs for ICP. These cards (ARM cores + DDR4/HBM2, PCIe/400 GbE, BMC) run Linux and a transformer-optimized fabric that can load Hugging Face models, with vendors claiming ~2–3× better latency/throughput and energy efficiency vs H100 for mid-size LLMs thanks to mat-vec/dataflow specialization—and they’re field-reprogrammable for ongoing gains. The group discussed ICP fit: standardized 6-year hardware bets, TEE-enabled replica VMs with a simple, low-maintenance sidecar inference stack, and keeping security/determinism by avoiding external network exposure. Next steps center on a roadmap to first prototypes, evaluating vendor ecosystems, and seeding a hardware-oriented developer track (e.g., via ETH Zurich, lower-cost Agilex variants).

Links shared during the call:

1 Like

Hi everyone, this week’s session will be about non-determinism in LLMs :muscle:
We’ll discuss the links @apotheosis and @icarus shared about "Defeating Nondeterminism in LLM Inference":
GitHub: GitHub - thinking-machines-lab/batch_invariant_ops
Article: Defeating Nondeterminism in LLM Inference - Thinking Machines Lab

I’m looking forward to the session and seeing everyone then :+1:

3 Likes

Indeed!

In the larger context of LLMs for blockchain, non-deterministic output was a major problem. But with determinism you can have consensus, i.e. run an LLM directly within a consensus mechanism with a modest performance hit (~20% or so, according to the paper).
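For intuition on why batching matters at all: floating-point addition is not associative, so whenever the reduction order inside a kernel depends on batch size, the same prompt can produce different numbers. A tiny pure-Python illustration of the underlying effect (not the paper's code):

```python
# Floating-point addition is not associative, so the reduction order a
# kernel happens to pick (which can depend on batch size) changes results.
vals = [1e16, 1.0, -1e16, 1.0]

left_to_right = ((vals[0] + vals[1]) + vals[2]) + vals[3]  # order A
pairwise = (vals[0] + vals[2]) + (vals[1] + vals[3])       # order B

print(left_to_right)  # 1.0  (the first 1.0 is absorbed by 1e16)
print(pairwise)       # 2.0  (both 1.0s survive)
```

Batch-invariant kernels fix the reduction order regardless of batch size, which is where the modest performance hit comes from.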

Looking forward to the discussion.

2 Likes

Hi everyone, thank you for today’s call (2025.09.25). Special thanks to @apotheosis and @icarus for sharing the links about nondeterminism in LLMs that led to this discussion! This is the generated summary (short version, please find the long version here): This session explored a new perspective on determinism in LLM inference, highlighting that non-determinism stems primarily from batching rather than GPU floating-point math. The group discussed how enforcing deterministic batching could make LLM outputs consensus-safe for decentralized systems like the Internet Computer (IC), albeit at a performance trade-off. Determinism was connected to proofs of execution, zero-knowledge ML (ZKML), and trust layers for agent-to-agent interactions, with concrete implications for DeFi, regulated domains, and privacy-preserving use cases. The conversation also touched on practical verification strategies (hashes, black hole canisters, ZK proofs), the role of temperature in inference, and the IC’s unique advantages over TEE-based approaches, setting the stage for further exploration of standardized proof layers and asynchronous, verifiable AI agents.
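The hash-based verification strategy that came up can be sketched in a few lines; the token IDs below are hypothetical and this only shows the comparison idea, not actual IC code:

```python
import hashlib

def output_digest(token_ids):
    """Hash a replica's output token sequence for cross-replica comparison."""
    payload = ",".join(str(t) for t in token_ids).encode()
    return hashlib.sha256(payload).hexdigest()

# With batch-invariant (deterministic) inference, every honest replica
# produces the same token sequence, so consensus only needs to compare
# a 32-byte digest rather than the full output.
replica_a = [101, 2009, 2003, 102]  # hypothetical token IDs
replica_b = [101, 2009, 2003, 102]

print(output_digest(replica_a) == output_digest(replica_b))  # True
```

Any single diverging token changes the digest, so disagreement is cheap to detect even for long outputs.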

Links shared during the call:

2 Likes

Hi everyone, I hope you’re having a great start to your week! For this week’s call, are there any items you would like to see on the agenda, or is there anything you would like to present or discuss during the call?

1 Like

I will not be able to make it this week. But there is a whole web3 conversation around trustless agents and ERC-8004. It might be good for people to look into that and take notes on how ICP can be used for that, and probably much more.

1 Like

Here is the top post explaining ICaiBus, which I was talking about in the DeAI meeting just now.

2 Likes

Hi everyone, thank you for today’s call (2025.10.02). This is the generated summary (short version, please find the long version here): The group aligned on practical limits of running large AI fully inside IC’s WASM/consensus, explored paths to higher throughput via async “sidecar” compute on node hardware (and future accelerators), and proposed a parallel initiative to standardize trust/minimal-trust tooling for agents (identity, reputation, registries). A recurring monthly session for hardware acceleration will continue, and a new monthly track for agent tooling & standards is proposed. Pub/Sub on IC (ICRC-77/-72; “ICAIBus”) was highlighted as a fit for async AI workflows. Next week features ICP Coder (Motoko AI assistant) @Gianm; an FPGA deep-dive update is planned in 2 weeks @icarus.
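For anyone unfamiliar with the pub/sub pattern itself, here is a minimal in-memory sketch of the idea; this is a generic illustration of the pattern, not the ICRC-72/-77 interfaces:

```python
from collections import defaultdict

class Bus:
    """Minimal in-memory publish/subscribe bus (generic pattern sketch)."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Each subscriber is notified independently; on the IC this would be
        # an async inter-canister call rather than a direct function call.
        for handler in self.subscribers[topic]:
            handler(message)

bus = Bus()
results = []
bus.subscribe("inference.done", lambda msg: results.append(msg))
bus.publish("inference.done", {"request_id": 1, "tokens": 42})
print(results)
```

The decoupling is what makes it a fit for async AI workflows: the publisher (e.g. an inference worker) doesn't need to know which canisters consume its results.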

Links shared during the call:

2 Likes

Thanks for putting this together

1 Like

Hi everyone, this week @Gianm will present ICP Coder, the Motoko AI assistant he created :muscle: I’m looking forward to the session! As usual, it’s in the ICP Discord’s voice channel at 3pm CET on Thursday; this is the event: ICP

3 Likes

Hi everyone, thank you for today’s call (2025.10.09). Special thanks to @Gianm and @DiegoWenHao for presenting ICP Coder! This is the generated summary (short version, please find the long version here): The session featured Gian and Diego presenting ICP Coder, an open-source Motoko-focused AI coding assistant that integrates with VS Code/Cursor via an MCP server and leverages RAG over curated Motoko documentation and code. It currently runs locally using Chroma as the vector database and Google Gemini as the LLM, offering semantic retrieval and context-aware code generation. The team plans to host a Web2 version soon and later migrate to ICP canisters once technical constraints allow. Discussion covered database scalability, embedding model trade-offs, and future directions like Agentic RAG, configurable sentence transformers, and multi-language support (Rust, TypeScript, Python). The demo successfully showcased live ingestion and retrieval, impressing participants with its hands-on, transparent setup.
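To make the retrieval step concrete for readers who haven't used RAG: below is a dependency-free toy where bag-of-words "embeddings" stand in for Chroma plus a real embedding model, and the doc snippets are hypothetical:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' standing in for a sentence transformer."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical snippets standing in for curated Motoko docs in the vector DB.
docs = [
    "actor class declarations in Motoko",
    "stable variables persist across canister upgrades",
    "async functions and inter-canister calls",
]

def retrieve(query, k=1):
    """Return the k most similar docs; the LLM would see these as context."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

print(retrieve("how do stable variables survive upgrades"))
```

A real setup swaps `embed` for a sentence-transformer model and the list for a vector database, but the rank-by-similarity-then-feed-to-LLM flow is the same.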

Links shared during the call:

3 Likes

Thank you so much for the space to share our work, will keep improving the projects, new features soon!!

1 Like

Hi everyone, join us this Thursday for our next session on Accelerated Infrastructure for AI on ICP led by @icarus :muscle: I’m looking forward to the call and seeing you then!

Hi everyone, thank you for today’s call (2025.10.16). Special thanks to @icarus for sharing his latest research! This is the generated summary (short version, please find the long version here): In today’s call, the group explored why modern FPGAs are emerging as serious, inference-optimized accelerators for transformer LLMs—and how their properties map to IC/DeAI goals. Modern FPGAs have evolved into full SoC-class accelerators that combine reconfigurable logic, on-chip memory, ARM cores, and high-speed networking—making them powerful, deterministic, and upgradable inference engines for transformer models. Case studies like Positron AI’s Llama-3-8B inference show ~4× better performance-per-dollar and ~4.5× higher energy efficiency than NVIDIA H100 GPUs, enabled by dataflow-style architectures and 32 GB on-fabric HBM that minimize memory transfers. Their ability to run fixed-timing, verifiable pipelines and receive remote bitstream upgrades positions FPGAs as a strong complement to GPUs for decentralized, deterministic AI execution on the Internet Computer.

Links shared during the call:

2 Likes