AI-Driven Governance & Security for ICP

AI-Driven Governance & Security for ICP bridges the Eliza AI Agent Framework with the Internet Computer’s Network Nervous System (NNS) to automate proposal analysis.

Problem

Governance on the Internet Computer is uniquely powerful: proposals can ship code that directly impacts core infrastructure. With this power comes risk. Voting windows are short, proposal details are scattered, and voters are faced with an increasing volume of complex technical context. Today, this process faces several challenges:

  • Time-constrained review windows often leave little space for deep technical scrutiny.

  • Fragmented context makes it hard for voters to trace proposals back to actual code changes.

  • High risk of missed issues means vulnerabilities or regressions can slip through unnoticed.

Solution

AI-Driven Governance & Security for ICP acts as an automated analyst for governance.

  • Proposals are fetched from the NNS governance canister.

  • The system infers the target repository and commit range, retrieves GitHub diffs, and audits the changes with a large language model.

  • Findings are aggregated into a structured report that is stored on-chain.

  • A lightweight web application presents the results in human-friendly visuals, while an OpenChat bot posts real-time links as reports go live.

This creates a seamless, verifiable audit trail for every governance action.
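The repository-and-commit inference step above can be sketched roughly as follows. This is an illustrative assumption, not the project's actual code: the helper name, interface, and regex patterns are hypothetical, chosen because NNS upgrade summaries commonly embed a GitHub URL and a `<from>..<to>` commit range (the Eliza framework is TypeScript, so the sketch uses TypeScript):

```typescript
// Hypothetical sketch of the triage step: given an NNS proposal summary,
// infer the target repository and commit range so the corresponding
// GitHub diff can be fetched for AI auditing.

interface ProposalTarget {
  repo: string;        // e.g. "dfinity/ic"
  fromCommit: string;  // start of the commit range
  toCommit: string;    // end of the commit range
}

function inferTarget(summary: string): ProposalTarget | null {
  // Assumes the summary embeds a GitHub URL and a `<from>..<to>` hex range.
  const repoMatch = summary.match(/github\.com\/([\w.-]+\/[\w.-]+)/);
  const rangeMatch = summary.match(/\b([0-9a-f]{7,40})\.\.([0-9a-f]{7,40})\b/);
  if (!repoMatch || !rangeMatch) return null;
  return { repo: repoMatch[1], fromCommit: rangeMatch[1], toCommit: rangeMatch[2] };
}

const summary =
  "Elect new replica version. Source: https://github.com/dfinity/ic " +
  "changes: 0a1b2c3d4e5f..9f8e7d6c5b4a";
console.log(inferTarget(summary)); // logs the inferred repo and commit range
```

Real proposal summaries vary in format, so a production version would need several fallback patterns and a way to flag proposals whose target cannot be inferred.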

System Components

  • Eliza NNS Plugin - Fetches and filters NNS proposals for downstream processing.

  • AI Neuron Agent - Analyzes proposals, retrieves associated code, and generates structured AI reports.

  • AI Neuron Canister - Provides on-chain storage for metadata and reports, with pagination and batch access.

  • Web App - Allows users to browse proposals, view detailed reports, and inspect severity breakdowns.

  • OpenChat Bot - Notifies subscribed communities when new reports are available, linking directly to the web app.
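As a rough illustration of the pagination-and-batch pattern the canister component describes, here is a minimal TypeScript sketch. The `ReportStore` interface, method names, and report shape are assumptions for illustration, not the canister's actual Candid API:

```typescript
// Illustrative model of paginated report access, with an in-memory stub
// standing in for the on-chain AI Neuron canister.

interface Report {
  proposalId: number;
  severity: "low" | "medium" | "high";
  summary: string;
}

interface ReportStore {
  // Returns up to `limit` reports starting at `offset`.
  getReports(offset: number, limit: number): Report[];
}

// Drain the store page by page, the way a web app might populate its list view.
function fetchAll(store: ReportStore, pageSize = 2): Report[] {
  const all: Report[] = [];
  for (let offset = 0; ; offset += pageSize) {
    const page = store.getReports(offset, pageSize);
    all.push(...page);
    if (page.length < pageSize) break; // short page => no more data
  }
  return all;
}

const sample: Report[] = [
  { proposalId: 1, severity: "low", summary: "routine upgrade" },
  { proposalId: 2, severity: "high", summary: "consensus change" },
  { proposalId: 3, severity: "medium", summary: "config tweak" },
];

// Stub implementation backed by the sample data above.
const stub: ReportStore = {
  getReports: (offset, limit) => sample.slice(offset, offset + limit),
};

console.log(fetchAll(stub).length); // 3 reports across two pages
```

Against a real canister the same loop would issue query calls rather than slice an array, but the control flow is the same.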

The Value

  • For voters - timely, contextualized risk insights before casting a vote.

  • For developers - precise findings down to the file and line level.

  • For the community - a permanent, auditable history of governance outcomes.

Future Plans

The framework is planned to extend beyond the NNS to the broader Service Nervous System (SNS) ecosystem. This will enable:

  • Ingestion and analysis of SNS governance proposals across participating DAOs.

  • Per-SNS dashboards with trend analytics and risk scoring.

  • OpenChat alerts linking directly to full reports for DAO communities.

Status of the Project

Proof-of-concept components are live, including proposal ingestion, AI-driven auditing, on-chain report storage, a web-based report viewer, and OpenChat notifications.

Links

:globe_with_meridians: Website: https://kcyll-maaaa-aaaak-quk5q-cai.icp0.io/
:bird: X post: https://x.com/querio_io/status/1967281794753171777
:open_file_folder: Repository: https://github.com/CrossChainLabs-ICP/ai-neuron
:clapper_board: Demo: https://www.youtube.com/watch?v=hK7f7GCjkZY

8 Likes

thanks for sharing, @andreea!

please consider adding this also here → 📢 Call for Ecosystem Updates – Developer Newsletter #6

would love to get some feedback from experienced reviewers on whether they find this valuable. (cc @Lorimer @wpb @Zane @ZackDS @hpeebles @ilbert @cyberowl @zenithcode @yuvika @ipsita @Gwojda @timk11 @LaCosta )

2 Likes

Here are others who have participated in proposal reviews and have opinions on the utility of AI-generated reviews…
@NathanosDev @ZoLee @tiago89 @Gekctek @Cris.MntYetti

Personally, I prefer to see human effort on these proposal reviews. I am all for automating repetitive tasks and making the reviewer’s life easier, but I don’t like the idea of automating the critical thinking or diluting the learning opportunity. I’m a first principles guy though. I prefer to periodically derive the formulas that I use every day so I can stay sharp on the fundamentals. That’s kind of how I see these proposal reviews. We need people in the community who are studying the code and understand how it works. It seems to me that a good way to stay sharp on the fundamentals is to read through it yourself in order to understand the changes as opposed to relying on an AI. I can understand and respect other opinions though. I’m not the one actually performing the work, so I would defer to their opinions anyway.

4 Likes

It’s definitely a useful tool that can help get an overview of the changes. Personally I can’t speak about the code-related stuff, but for proposal types that require more of an analytical perspective rather than a technical one, like Motion, PM&NA or SM&BN, AI has been useful for creating tools that automate some of the tasks: fetching node, operator, and datacenter records and reward tables, generating subnet maps, and parsing that data in a more friendly and readable fashion. I wouldn’t rely on 100% AI-generated reviews to make an informed decision though; most of the job for these proposal types is to actually cross-reference the fetched data with other sources, or do external research on Google and public registries. AI is just another tool that can be added to the toolbox and help save some time on repetitive tasks, but it can’t, and shouldn’t, replace the human factor entirely.

Having said this, it’s good to see that some are working on making proposals more approachable to a broader audience. ICP governance is quite complex and understanding all the nuances is not an easy task; those who invested in the protocol more often than not lack the necessary skills to understand them, especially the technical ones (me being one of them, I can only get so far when it comes to changes that require a deep understanding of the code), so having a tool that can at least break down the complexities and give a high-level overview of what’s happening is a great step forward. We need less gatekeeping, and this means creating tools that make proposals more digestible for everyday investors too.

4 Likes

I’m really keen to see what it can do. I remember there was some discussion a while back about the use of AI in proposal reviews. Basically Dfinity wanted to be sure that funded reviewers weren’t relying on AI tools to generate the content instead of doing the work ourselves. When this came up I had a look at what ChatGPT and similar tools could do on NNS proposals, to see if there was indeed something to worry about, and the type of commentary they came up with was generally quite low quality. At best it was very long-winded and wandered way off point, and at worst it was straight out wrong or didn’t even comment on the right code sections.

To see a tool that’s specifically tailored for this task would be very interesting and I’m frankly quite excited about the concept. I don’t think it will replace the humans, and there always need to be humans making the end decisions on anything that comes out of AI, as is the case in just about any field. That’s certainly the case in healthcare, which is my main background. AI tools for skin cancer diagnosis are a good case in point. Various such tools are said to be “better than a dermatologist” but while an AI tool might be quite accurate at picking a melanoma, it’s not the dermatologist’s job to simply tell you which lesions are melanoma and which ones aren’t. The dermatologist’s job is to pick which lesions warrant removal for further study, or which ones deserve a wait-and-see approach, to counsel the patient on risks and benefits of any procedure, to take into account their level of understanding, perhaps their access to support at home and psychological factors, and several other highly nuanced tasks. Nonetheless, the AI tools can be very useful if we appreciate their limitations and use them for their strengths.

I had a quick look at this tool and could see some inaccuracies straight up, but I appreciate that it’s early days and it takes time to fine-tune these things. I’ve reviewed several of these proposals myself so I’ll try and have a closer look at it when I get a chance.

@andreea You also mentioned OpenChat notifications. What do you have in place there? Quite separate from this I’ve been hoping to find something that can be used for notifications of proposals narrowed to specific topics.

4 Likes

Thanks for the tag @marc0olo, just some quick thoughts from me.

Ultimately the IC, and Web3 in general, needs this sort of thing for decentralised governance to really make sense. Governance needs to be convenient or it’ll never really be particularly decentralised. So many proposals would actually be quite simple to review if the grunt work were taken out of it.

I think it’s important that the output is known to be AI generated, and therefore known that it will have its limitations and cautions. Would be great to mitigate that by allowing humans to comment on the report.

Critical codebase changes will always be difficult, but doesn’t mean it can’t be made easier for regular voters to feel in the loop.

Longer term the AI would really need to be running on chain using an open source model that others can inspect for biases.


Great project @andreea! I’m looking forward to seeing this in action. Even if it’s basic, it’s great to have something in place that can be improved over time.

4 Likes

Did a high-level overview. TL;DR: it is great, and just as Alex highlighted, it is something that can be improved. Will try and run this to learn more about the LLM used. Also, it would be great if anyone trying it out would share the results so we can take a look at them and review them.

2 Likes

I think it’s an interesting idea and we have to start somewhere. I think AI can potentially enhance human-conducted audits if used in the right way.
I don’t think we can use it as an individual contributor whose accuracy we can trust at this point, but rather as a tool to raise potential red flags that can then be analyzed by a human, which could help keep things from slipping through the cracks.
But at this point I wouldn’t trust the AI if it just gave a thumbs up.

I think it would be more interesting, now and down the line, to have a series of different models trying to reach a consensus, but I just don’t think we’re there yet.
That being said, even if we’re not fully there, I think it’s important to try to build and test out these systems so that when we do get there we have systems already in place.
So I’m generally for it.

2 Likes

@wpb @Cris.MntYetti @timk11 @Lorimer, thank you all for your thoughtful feedback, it’s very helpful at this stage. The common thread is clear: reviews are important and AI shouldn’t replace the human factor. What it can do is handle the repetitive parts and surface context faster, making governance more approachable and helping reviewers focus on what matters.

Also, adding clear labeling that outputs are AI-generated, along with enabling community comments on reports, are both excellent suggestions. As for running open-source models, this is something we can do since we’re using the Eliza framework. Running them on-chain will be on our to-do list for when this becomes possible.

Regarding the OpenChat bot, the code is available on GitHub.

As you already noticed, the tool is still early, so feedback like yours will help refine new features. We truly appreciate the perspectives shared so far.

5 Likes