Technical Working Group DeAI

looks interesting, thank you @apotheosis ! That sounds great for a session :+1:
Next Thursday we’ll have a session focused on hardware and infrastructure, but maybe the week after (Sep 25), or Oct 2 if that works better?

Absolutely relevant to our long-running discussions about whether it would be possible to reach consensus over the outputs of identical LLMs running on identically configured accelerator hardware in a subnet. I was reading their article about this earlier today. You can find it here Defeating Nondeterminism in LLM Inference - Thinking Machines Lab

1 Like

Hi everyone, thank you for today’s call (2025.09.11). Special thanks to Sanjeev for sharing his expertise on Active Inference, and Tim for organizing this session! This is the generated summary (short version, please find the long version here): This talk by Sanjeev Namjoshi (VERSES) introduced Active Inference and the Free Energy Principle as a unifying framework for perception, action, learning, and self-organization. It explained how agents build generative models to minimize free energy by aligning predictions with observations, balancing exploration and exploitation, and maintaining identity within preferred states. Key concepts included surprisal, variational free energy, Markov blankets, predictive coding, and Bayesian mechanics as a potential “fourth branch of physics.” Practical notes highlighted modularity, multi-agent systems, and data efficiency compared to deep RL, with current tooling spanning MATLAB (SPM), PyMDP, RxInfer, and ActiveInference.jl. VERSES applies these ideas in industrial domains such as robotics, logistics, and supply chains, valuing interpretability, efficiency, and adaptability.
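As a side note for anyone new to the framework, the core quantities are easy to play with numerically. Here is a toy sketch (illustrative numbers only, not from the talk) of surprisal and variational free energy for a two-state discrete generative model:

```python
import math

# Toy discrete generative model: two hidden states s, one observed outcome o.
# The numbers are illustrative assumptions, not from the presentation.
p_s = [0.5, 0.5]            # prior p(s)
p_o_given_s = [0.9, 0.2]    # likelihood p(o | s) for the observed o

# Marginal likelihood p(o) and surprisal -ln p(o)
p_o = sum(ps * po for ps, po in zip(p_s, p_o_given_s))
surprisal = -math.log(p_o)

def free_energy(q):
    """Variational free energy F = E_q[ln q(s) - ln p(o, s)]."""
    return sum(
        qs * (math.log(qs) - math.log(p_s[i] * p_o_given_s[i]))
        for i, qs in enumerate(q) if qs > 0
    )

# The exact posterior p(s | o) minimizes F, at which point F equals surprisal.
posterior = [p_s[i] * p_o_given_s[i] / p_o for i in range(2)]

print(round(surprisal, 4))
print(round(free_energy([0.5, 0.5]), 4))  # crude belief: F sits above surprisal
print(round(free_energy(posterior), 4))   # optimal belief: F equals surprisal
```

With the exact posterior, free energy equals the surprisal; any cruder belief sits above it, which is exactly what "minimizing free energy" exploits as an upper bound.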

Links shared during the call:

3 Likes

Might be a good ‘discussion session’ then. I am in no way an expert on the matter, but we could discuss the article and what it means in the context of blockchain on the 25th, as @patnorris suggested.

3 Likes

that sounds like a plan :+1:

Hi everyone, join us this Thursday for our call on Accelerated Infrastructure for AI where we focus on hardware and infrastructure to support AI workloads on ICP. @icarus will lead the session :muscle: I’m looking forward to it, see you then

5 Likes

Hi everyone, thank you for today’s call (2025.09.18). Special thanks to @icarus for leading the call and sharing his expertise! This is the generated summary (short version, please find the long version here): The call explored FPGA-based AI accelerators (e.g., Intel Agilex/Positron) as a flexible, “computer-on-a-card” middle path between expensive NVIDIA GPUs and risky ASICs for ICP. These cards (ARM cores + DDR4/HBM2, PCIe/400 GbE, BMC) run Linux and a transformer-optimized fabric that can load Hugging Face models, with vendors claiming ~2–3× better latency/throughput and energy efficiency vs H100 for mid-size LLMs thanks to mat-vec/dataflow specialization—and they’re field-reprogrammable for ongoing gains. The group discussed ICP fit: standardized 6-year hardware bets, TEE-enabled replica VMs with a simple, low-maintenance sidecar inference stack, and keeping security/determinism by avoiding external network exposure. Next steps center on a roadmap to first prototypes, evaluating vendor ecosystems, and seeding a hardware-oriented developer track (e.g., via ETH Zurich, lower-cost Agilex variants).

Links shared during the call:

1 Like

Hi everyone, this week’s session will be about non-determinism in LLMs :muscle:
We’ll discuss the links @apotheosis and @icarus shared about "Defeating Nondeterminism in LLM Inference":
GitHub: GitHub - thinking-machines-lab/batch_invariant_ops
Article: Defeating Nondeterminism in LLM Inference - Thinking Machines Lab

I’m looking forward to the session and seeing everyone then :+1:

3 Likes

Indeed!

In the larger context of LLMs for blockchain, non-deterministic output was a major problem. But with determinism you can have consensus, i.e. run an LLM directly within a consensus mechanism with a modest performance hit (~20% or so, according to the paper).
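For intuition on why batching matters at all: floating-point addition is not associative, so whenever the reduction order inside a kernel depends on batch size, the same prompt can produce different numbers. A tiny pure-Python illustration of the underlying effect (not the paper's code):

```python
# Floating-point addition is not associative, so the reduction order a
# kernel happens to pick (which can depend on batch size) changes results.
vals = [1e16, 1.0, -1e16, 1.0]

left_to_right = ((vals[0] + vals[1]) + vals[2]) + vals[3]  # order A
pairwise = (vals[0] + vals[2]) + (vals[1] + vals[3])       # order B

print(left_to_right)  # 1.0  (the first 1.0 is absorbed by 1e16)
print(pairwise)       # 2.0  (both 1.0s survive)
```

Batch-invariant kernels fix the reduction order regardless of batch size, which is where the modest performance hit comes from.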

Looking forward to the discussion.

2 Likes

Hi everyone, thank you for today’s call (2025.09.25). Special thanks to @apotheosis and @icarus for sharing the links about nondeterminism in LLMs that led to this discussion! This is the generated summary (short version, please find the long version here): This session explored a new perspective on determinism in LLM inference, highlighting that non-determinism stems primarily from batching rather than GPU floating-point math. The group discussed how enforcing deterministic batching could make LLM outputs consensus-safe for decentralized systems like the Internet Computer (IC), albeit at a performance trade-off. Determinism was connected to proofs of execution, zero-knowledge ML (ZKML), and trust layers for agent-to-agent interactions, with concrete implications for DeFi, regulated domains, and privacy-preserving use cases. The conversation also touched on practical verification strategies (hashes, black hole canisters, ZK proofs), the role of temperature in inference, and the IC’s unique advantages over TEE-based approaches, setting the stage for further exploration of standardized proof layers and asynchronous, verifiable AI agents.
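The hash-based verification strategy that came up can be sketched in a few lines; the token IDs below are hypothetical and this only shows the comparison idea, not actual IC code:

```python
import hashlib

def output_digest(token_ids):
    """Hash a replica's output token sequence for cross-replica comparison."""
    payload = ",".join(str(t) for t in token_ids).encode()
    return hashlib.sha256(payload).hexdigest()

# With batch-invariant (deterministic) inference, every honest replica
# produces the same token sequence, so consensus only needs to compare
# a 32-byte digest rather than the full output.
replica_a = [101, 2009, 2003, 102]  # hypothetical token IDs
replica_b = [101, 2009, 2003, 102]

print(output_digest(replica_a) == output_digest(replica_b))  # True
```

Any single diverging token changes the digest, so disagreement is cheap to detect even for long outputs.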

Links shared during the call:

2 Likes

Hi everyone, I hope you’re having a great start to your week! For this week’s call, are there any items you would like to see on the agenda, or is there anything you would like to present or discuss during the call?

1 Like

I will not be able to make it this week. But there is a whole web3 conversation around trustless agents and ERC-8004. It might be good for people to look into that and take notes on how ICP can be used for that, and probably much more.

1 Like

Here is the top post explaining ICaiBus, which I was talking about in the DeAI meeting just now.

2 Likes

Hi everyone, thank you for today’s call (2025.10.02). This is the generated summary (short version, please find the long version here): The group aligned on practical limits of running large AI fully inside IC’s WASM/consensus, explored paths to higher throughput via async “sidecar” compute on node hardware (and future accelerators), and proposed a parallel initiative to standardize trust/minimal-trust tooling for agents (identity, reputation, registries). A recurring monthly session for hardware acceleration will continue, and a new monthly track for agent tooling & standards is proposed. Pub/Sub on IC (ICRC-77/-72; “ICAIBus”) was highlighted as a fit for async AI workflows. Next week features ICP Coder (Motoko AI assistant) @Gianm; an FPGA deep-dive update is planned in 2 weeks @icarus.
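For anyone unfamiliar with the pub/sub pattern itself, here is a minimal in-memory sketch of the idea; this is a generic illustration of the pattern, not the ICRC-72/-77 interfaces:

```python
from collections import defaultdict

class Bus:
    """Minimal in-memory publish/subscribe bus (generic pattern sketch)."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Each subscriber is notified independently; on the IC this would be
        # an async inter-canister call rather than a direct function call.
        for handler in self.subscribers[topic]:
            handler(message)

bus = Bus()
results = []
bus.subscribe("inference.done", lambda msg: results.append(msg))
bus.publish("inference.done", {"request_id": 1, "tokens": 42})
print(results)
```

The decoupling is what makes it a fit for async AI workflows: the publisher (e.g. an inference worker) doesn't need to know which canisters consume its results.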

Links shared during the call:

2 Likes

Thanks for putting this together

1 Like

Hi everyone, this week @Gianm will present ICP Coder, the Motoko AI assistant he created :muscle: I’m looking forward to the session! As usual, it’s in the ICP Discord’s voice channel at 3pm CET on Thursday; this is the event: ICP

3 Likes

Hi everyone, thank you for today’s call (2025.10.09). Special thanks to @Gianm and @DiegoWenHao for presenting ICP Coder! This is the generated summary (short version, please find the long version here): The session featured Gian and Diego presenting ICP Coder, an open-source Motoko-focused AI coding assistant that integrates with VS Code/Cursor via an MCP server and leverages RAG over curated Motoko documentation and code. It currently runs locally using Chroma as the vector database and Google Gemini as the LLM, offering semantic retrieval and context-aware code generation. The team plans to host a Web2 version soon and later migrate to ICP canisters once technical constraints allow. Discussion covered database scalability, embedding model trade-offs, and future directions like Agentic RAG, configurable sentence transformers, and multi-language support (Rust, TypeScript, Python). The demo successfully showcased live ingestion and retrieval, impressing participants with its hands-on, transparent setup.
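To make the retrieval step concrete for readers who haven't used RAG: below is a dependency-free toy where bag-of-words "embeddings" stand in for Chroma plus a real embedding model, and the doc snippets are hypothetical:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' standing in for a sentence transformer."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical snippets standing in for curated Motoko docs in the vector DB.
docs = [
    "actor class declarations in Motoko",
    "stable variables persist across canister upgrades",
    "async functions and inter-canister calls",
]

def retrieve(query, k=1):
    """Return the k most similar docs; the LLM would see these as context."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

print(retrieve("how do stable variables survive upgrades"))
```

A real setup swaps `embed` for a sentence-transformer model and the list for a vector database, but the rank-by-similarity-then-feed-to-LLM flow is the same.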

Links shared during the call:

3 Likes

Thank you so much for the space to share our work, will keep improving the projects, new features soon!!

1 Like

Hi everyone, join us this Thursday for our next session on Accelerated Infrastructure for AI on ICP led by @icarus :muscle: I’m looking forward to the call and seeing you then!

Hi everyone, thank you for today’s call (2025.10.16). Special thanks to @icarus for sharing his latest research! This is the generated summary (short version, please find the long version here): In today’s call, the group explored why modern FPGAs are emerging as serious, inference-optimized accelerators for transformer LLMs—and how their properties map to IC/DeAI goals. Modern FPGAs have evolved into full SoC-class accelerators that combine reconfigurable logic, on-chip memory, ARM cores, and high-speed networking—making them powerful, deterministic, and upgradable inference engines for transformer models. Case studies like Positron AI’s Llama-3-8B inference show ~4× better performance-per-dollar and ~4.5× higher energy efficiency than NVIDIA H100 GPUs, enabled by dataflow-style architectures and 32 GB on-fabric HBM that minimize memory transfers. Their ability to run fixed-timing, verifiable pipelines and receive remote bitstream upgrades positions FPGAs as a strong complement to GPUs for decentralized, deterministic AI execution on the Internet Computer.

Links shared during the call:

2 Likes