Any suggestions on other forums where I could share it?
How can I join the discord channel?
Clicking the link above does not work for me.
Maybe Hacker News and Telegram groups; there are ICP ones and also DeAI ones (not IC-specific) that could be relevant. As this is Motoko-specific, maybe you can also get in contact with @Seb, who’s organizing Motoko programming bootcamps, to see if there’s a good place to announce it there. On Awesome IC, there’s a section for Motoko (GitHub - dfinity/awesome-internet-computer: A curated list of awesome projects and resources relating to the Internet Computer Protocol) and there’s also an Awesome Motoko repo (GitHub - motoko-unofficial/awesome-motoko: A curated list of Motoko code and resources.)
Are you in the ICP Developer Community Discord (ICP Developer Community)? There’s a section for Working-Groups, and our channel is the latest one added there, so it’s at the bottom of the list.
Does this link to our channel work? ICP Developer Community
I am in now. Thanks for the link!
I was not using Discord, in favor of OpenChat, which is working great for the C++ community, although it’s not as easily discoverable for those on Discord.
Summary: Meeting 2023.12.14
Project Overview and Technical Aspects: MotokoLearn is aimed at developing a library similar to Python’s scikit-learn, catering to users who don’t require GPUs. It deals with tabular data, not image or sound data, and uses machine learning models like ensemble trees. The technical details involve classifiers, regression models, binary trees, and handling instruction limits in the underlying protocol. There’s also a discussion of the Motoko language used for development.
Challenges and Solutions: The conversation in the call highlights challenges like instruction limits and high computational costs. Solutions include using a weight function in the algorithm and chunking processes to fit within instruction limits. The discussion indicates a future improvement plan to accommodate larger sample sizes and reduce costs.
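For context, the chunking pattern mentioned above is a common way to work around per-message instruction limits on the IC: process a bounded batch per call, then reschedule yourself with a timer so each continuation gets a fresh instruction budget. Below is a minimal sketch in Rust using the ic_cdk_timers crate; it is not MotokoLearn’s actual code, and the training step, cursor handling, and batch size are illustrative.

```rust
use std::time::Duration;

/// Illustrative batch size; tune it so one batch fits the instruction limit.
const BATCH: usize = 1_000;

/// Hypothetical training step: fit part of the ensemble on one batch.
fn train_on_batch(start: usize) {
    // ... process samples [start, start + BATCH) ...
    let _ = start;
}

/// Process one batch, then reschedule via a zero-delay timer. Each timer
/// invocation is a fresh message with its own instruction budget.
fn continue_training(cursor: usize, total: usize) {
    train_on_batch(cursor);
    let next = cursor + BATCH;
    if next < total {
        ic_cdk_timers::set_timer(Duration::ZERO, move || continue_training(next, total));
    }
}
```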
Platform and Integration Aspects: MotokoLearn is developed for the Internet Computer platform. There’s a focus on how different hardware pieces, like GPUs, could be controlled and secured for AI applications on IC. Discussions on deploying Angular projects on IC and using Python packages suggest a versatile and developer-friendly approach.
Community and Collaboration: New participants introduce themselves, bringing diverse backgrounds from infrastructure, node provisioning, and software development. They express interest in decentralized AI, particularly regarding the intensity of training AI models and the costs associated with on-chain activities. The idea of a GPU subnet and decentralized control of hardware resources is discussed as a potential community initiative.
Future Plans and Next Steps: The group discusses transitioning their meetings to Discord for better integration and continuity. There’s an intent to create a GitHub repository for the group, where they can share knowledge, showcase materials, and coordinate on technical initiatives like vector databases and the GPU subnet.
Security and Neutrality in AI Models: There’s a conversation about the importance of having AI models trained within a secure and neutral infrastructure to avoid issues like Trojan networks, where models behave as expected except under certain triggers.
Decentralized Infrastructure and Node Providers: The discussion delves into the concept of decentralized infrastructure on the IC, where node providers contribute to a larger global platform. The idea is to extend control to individual GPU servers so that IC canisters can manage them securely.
In summary, the call covers a range of topics from technical details of MotokoLearn, challenges in AI model training and deployment, the potential of decentralized infrastructure in AI, to community collaboration and future initiatives. The focus is on leveraging the Internet Computer platform for efficient, secure, and cost-effective AI solutions.
Hi everyone, I’m looking forward to our next call tomorrow, Thu Dec 21 at 5pm UTC. We’ll have the call on Discord this time, in the ICP Developer Community’s voice channel: ICP Developer Community
Agenda-wise, these are some things we’d like to discuss:
Burn deep learning framework and similar initiatives (@mnl)
Next steps for action items (Brainstorming via Miro board, Awesome IC/DeAI, GPU subnet)
What else would you like to see on the agenda?
See you tomorrow!
Hi everyone, thank you for today’s (2023.12.21) discussion. This is the generated summary:
The meeting encompassed a comprehensive discussion on the integration and deployment of AI models in decentralized environments, particularly focusing on the Internet Computer (IC) platform. Here’s a consolidated summary of the entire meeting:
- @mnl led the initial discussion, highlighting challenges in deploying AI models, especially with C++ and Rust-based systems.
- Two AI frameworks, “Burn” (GitHub - tracel-ai/burn: Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.) and “Candle” (GitHub - huggingface/candle: Minimalist ML framework for Rust), were introduced. “Candle” is popular but lacks extensibility, while “Burn” offers more flexibility with open backends.
- The group discussed experimenting with these frameworks to enhance AI model integration with Rust and C++, with a focus on stable structures.
- Technical aspects, such as computational efficiency and the size of AI models, were debated, alongside considerations of computation costs for network and canister owners.
- The discussion shifted to the “Custom Canisters” feature, enabling custom canister types in dfx.json for user-friendly automated conversions.
- Marcin proposed a custom canister type for AI to streamline model deployment from platforms like Hugging Face.
- The group emphasized improving the developer experience in deploying AI models, regardless of whether they’re on decentralized infrastructure.
- A potential AI-focused DAO (Decentralized Autonomous Organization) for infrastructure management was discussed.
- Challenges in using external orchestration platforms like Ray (GitHub - ray-project/ray: Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads., Ray v2 Architecture - Google Docs) for distributed GPU compute within the IC were highlighted, considering network boundaries and data transfer limits.
- A participant introduced the Ionet Project, seeking clarity on its workflow, particularly regarding booking and deploying on a PPU (Processing Power Unit).
- @icarus, a systems engineering expert, shared insights into the technical documentation of software and hardware, stressing the importance of security in decentralized systems.
- The idea of building a GPU subnet within the IC system was discussed, envisioning a secure and efficient model distinct from current Web 3 and Web 2 platforms.
- A request was made for a private session to delve deeper into the technicalities of building infrastructure like Ionet.
- Participants were reminded to contribute to testimonials and brainstorming sessions, focusing on the GPU discussion.
- Testimonials: see Marcin’s post and thread in our Discord channel: Discord
- The same ideas can be posted to @lastmjs’s tweet: https://twitter.com/lastmjs/status/1737149036241563910
- Brainstorming with Miro: see @icpp ’s post in our Discord channel: Discord
- The meeting schedule for next week was left open due to the holiday season, encouraging asynchronous discussions on Discord.
- The meeting concluded with holiday wishes, acknowledging the productive discussions and looking forward to future collaborations.
Throughout the meeting, there was a clear focus on enhancing AI capabilities within a decentralized framework, balancing technical feasibility, security considerations, and ease of development.
Hi everyone, happy new year! I hope you all had a great transition into 2024.
We can pick up our regular calls this Thursday, Jan 4 again. As before, the calls will be on Thursdays at 5pm UTC and we’ll use the voice channel in the ICP Developer Community Discord:
Discord invite link: ICP Developer Community
voice channel link: ICP Developer Community
Please let @icpp or me know if there’s anything specific you’d like to talk about and have on the agenda.
See you then!
Hey! We plan on joining this working group going forward.
We are really interested in the GPU subnet and/or in using ICME’s zk tech as middleware.
Great, looking forward to having you! If there’s anything else you’d like to see on the agenda, please let us know.
Hi everyone, thank you for joining today. This is the generated summary of today’s (2024.01.04) discussion:
Introductions and Project Overviews: Several individuals introduce themselves and describe their specific areas of interest or projects they’re working on. This includes work on IoT, ICP (Internet Computer Protocol), AI, NLP (Natural Language Processing), and more.
Utilization of Tools and Platforms: The conversation kicks off with acknowledgments of the existing tools and platforms and how participants are engaging with them.
Blockers and Challenges: Participants discuss current challenges, such as instruction limits and memory constraints in AI development, and how these might be affecting their projects.
Technical Solutions and Ideas: There’s an in-depth technical discussion about potential solutions to the challenges presented, including using specialized subnets for latency-sensitive applications, utilizing stable memory for large model weights, and the impact of instruction limits on AI capabilities.
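On the stable-memory point above: here is a minimal sketch of storing large model weights in stable memory, assuming the ic-stable-structures crate; the chunked byte layout and the names are illustrative, not any specific project’s code.

```rust
use std::cell::RefCell;

use ic_stable_structures::memory_manager::{MemoryId, MemoryManager, VirtualMemory};
use ic_stable_structures::{DefaultMemoryImpl, StableBTreeMap};

type Memory = VirtualMemory<DefaultMemoryImpl>;

thread_local! {
    static MEMORY_MANAGER: RefCell<MemoryManager<DefaultMemoryImpl>> =
        RefCell::new(MemoryManager::init(DefaultMemoryImpl::default()));

    // Chunk index -> serialized slice of the weight tensor. Stable memory can
    // hold weights far beyond the 4 GiB wasm heap; only the chunks needed for
    // the current computation are pulled onto the heap.
    static WEIGHTS: RefCell<StableBTreeMap<u32, Vec<u8>, Memory>> =
        RefCell::new(StableBTreeMap::init(
            MEMORY_MANAGER.with(|m| m.borrow().get(MemoryId::new(0))),
        ));
}

fn store_chunk(index: u32, bytes: Vec<u8>) {
    WEIGHTS.with(|w| w.borrow_mut().insert(index, bytes));
}

fn load_chunk(index: u32) -> Option<Vec<u8>> {
    WEIGHTS.with(|w| w.borrow().get(&index))
}
```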
Community and Action Items: The group also talks about leveraging forums and other community platforms to discuss these issues more broadly and gather support. They propose creating threads to discuss specific problems like instruction limits and memory issues.
Timeliness and Efficiency: There’s a concern about the timeliness of responses, especially in voice applications, and how latency can affect user experience. The importance of balancing speed and accuracy in AI responses is highlighted.
Future Directions and Queries: The conversation turns to the future, considering how new features like websockets might influence development and the potential for continually evolving infrastructure to meet AI needs.
Integration of GPUs in Nodes for AI: The participants discuss the potential and challenges of integrating GPUs into the nodes of the network to facilitate more robust AI and machine learning tasks. They explore the implications, including the technical feasibility, consensus mechanism modifications, and the significant cost associated with high-end GPUs.
Data Privacy and Security: The conversation also focuses on ensuring stringent data privacy and security measures. Participants express the need to adhere to strict data regulations like the EU’s and discuss using cryptographic keys and client-side encryption to enhance user data security. They emphasize the user’s ownership and control over their data and the platform’s inability to access or share this data without consent.
Technical and Financial Implications: There’s a detailed discussion about the technical roadmap, the financial implications of adding GPUs to the nodes (referred to as Gen. 3 specs), and the need for replicated compute across nodes. The need for transparency in the development process and a clear understanding of the costs involved is highlighted.
Community Involvement and Feedback: The participants mention the importance of community involvement, brainstorming sessions, and feedback mechanisms. They discuss using Discord channels and forums for ongoing discussion and updates, reflecting a collaborative approach to problem-solving and innovation.
Query Structure Modification
To enhance AI capabilities and reinforce security on the Internet Computer (IC), I propose a modification to the current query structure. By introducing properties like `ic_cdk::query(cycles_fee=true)` or `ic_cdk::query(cycles_fee=true, composite_query=true)`, we can responsibly increase instruction limits per query (currently capped at 5 billion for queries, 20 billion for updates, and 200 billion for initialization). This change not only maintains essential safeguards against DDoS attacks but also significantly broadens the potential for AI on the IC. The impact on AI inference would be substantial, enabling larger model sizes, cost-effective resource utilization, and notably faster response times. Providing AI developers these tools will also prove to be a crucial step towards the efficient training of ML models on the IC.
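To make the proposed annotation concrete: today’s ic_cdk already supports query annotations such as `composite = true`; the `cycles_fee` flag sketched below is the hypothetical addition this proposal describes, not an existing API.

```rust
use ic_cdk::query;

// Real, existing API: a plain (free, 5B-instruction-capped) query.
#[query]
fn classify(_input: Vec<f32>) -> u32 {
    // ... run small-model inference within today's limits ...
    0
}

// Real, existing API: a composite query, which may call other queries.
#[query(composite = true)]
async fn classify_many(_input: Vec<f32>) -> u32 {
    0
}

// Hypothetical (this proposal): opting into query charging in exchange for a
// higher instruction limit. `cycles_fee` does NOT exist in ic_cdk today.
//
// #[query(cycles_fee = true)]
// fn classify_large(_input: Vec<f32>) -> u32 { ... }
```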
Proposal for GPU Integration in Dfinity’s Gen 3 Nodes: Embracing Small-Scale GPUs for Scalability and Efficiency
In the context of Dfinity’s evolving technology, particularly the upcoming Gen 3 node specifications, I propose a strategic shift in the GPU integration approach. Instead of opting for the high-end, energy-intensive H100 GPUs, I recommend adopting many more, but smaller and less powerful, GPUs. This recommendation is grounded in the capabilities of the current Gen 2 nodes, which are equipped with 500 GB RAM and robust processing power. These nodes adeptly parallelize processing and concurrently run numerous single-threaded WASM canisters, each limited to 4 GB of RAM consumption.
While GPUs offer significant improvements in execution parallelization, their architecture does not support task-specific parallelization effectively. This limitation means that using a GPU with 80 GB of VRAM for a task that only requires 1 GB would still monopolize the entire GPU resource. While the allure of state-of-the-art (SotA) hardware is understandable, the integration of any form of GPU into Dfinity’s nodes would mark a significant advancement over their current capabilities.
Starting with smaller GPUs presents a scalable and cost-effective solution. It not only complements the existing parallel processing features of the nodes but also provides the flexibility to incrementally increase GPU power in response to growing demands and evolving task complexities. Most if not all deep learning models can be split across many GPUs. This approach is not just about upgrading hardware; it’s about aligning technological enhancements with strategic goals for scalability, efficiency, and future growth.
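To illustrate the model-splitting claim above: under a pipeline-parallel scheme, a model’s layers can be partitioned into stages pinned to different small GPUs, so no single device needs to hold the whole model. This is a toy sketch; the types and layer math are placeholders, not any real framework’s API.

```rust
/// Placeholder layer; a real implementation would hold a weight tensor and
/// run a GPU kernel in `forward`.
struct Layer {
    scale: f32,
}

impl Layer {
    fn forward(&self, input: &[f32]) -> Vec<f32> {
        input.iter().map(|x| x * self.scale).collect()
    }
}

/// A contiguous slice of the model's layers, pinned to one (small) GPU.
struct Stage {
    device_id: usize,
    layers: Vec<Layer>,
}

/// Activations flow stage to stage, so peak memory per device is bounded by
/// that device's own layers, not by the full model.
fn pipeline_forward(stages: &[Stage], input: Vec<f32>) -> Vec<f32> {
    stages.iter().fold(input, |acts, stage| {
        stage.layers.iter().fold(acts, |a, layer| layer.forward(&a))
    })
}
```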
@jeshli ,
Those are great proposals.
I wonder if your query proposal is a good candidate for a formal NNS proposal.
Does anyone in the WG have experience with submitting an NNS proposal?
Thank you for finding the Query Charging forum thread and cross-posting. That thread provided insightful information on the topic:
- Current Situation: As we know, canisters on the IC are only charged for updates, not for queries. Dfinity agrees that this leads to an imbalance as canisters with more update traffic bear higher costs, inadvertently subsidizing those with more query traffic. This discrepancy, which originated from technical challenges at IC’s launch, contradicts the principle of fair resource consumption charging and is seen as unsustainable.
- Technical Background: The absence of query charging initially stemmed from the complexity in achieving consensus among all nodes on the cycles to charge for queries. This is complicated by the fact that queries are executed by a single, non-replicated node.
- Gradual Implementation Proposal: To lessen the impact on canisters unprepared for query charges, a gradual implementation of query charging is suggested. This would involve incrementally increasing query fees from 0% to 100% over several months, starting on one subnet before rolling out to the entire network.
It appears that the IC team is already exploring query charging, recognizing the need to avoid making queries entirely free. The gradual implementation approach seems prudent. The critical aspect now might be to discuss with Dfinity the idea of increasing the instruction limit. Ideally, the IC team, as the protocol experts, could find an optimal solution where higher instruction limits scale linearly with cycle costs. Since development is already underway, our primary role could be to emphasize the importance and urgency of this consideration.
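As a back-of-the-envelope sketch of the pricing shape discussed above (linear in instructions, with a fee fraction ramping from 0% to 100%); the function and numbers are illustrative, not DFINITY’s actual fee schedule.

```rust
/// Hypothetical query charge: linear in instructions executed, scaled by a
/// rollout fraction that ramps from 0 to 100 over several months.
fn query_charge(instructions: u64, cycles_per_instruction: u128, rollout_pct: u128) -> u128 {
    instructions as u128 * cycles_per_instruction * rollout_pct / 100
}

fn main() {
    // Example: a 2B-instruction query at a made-up rate, halfway through rollout.
    let charge = query_charge(2_000_000_000, 1, 50);
    println!("charged {charge} cycles"); // 1_000_000_000 cycles
}
```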
I would happily join you in any effort to go through a more formal process as well, because it could be a fun learning process.
@icpp and @jeshli
I have submitted the required Node Provider proposals to the NNS, and while a governance proposal would be different in format, I doubt it is any more difficult (as in, it’s “easy”).
It is done using the dfx command-line tool, running in an environment where a hotkey links the dfx proposal submission to your neuron. The neuron needs at least 10 ICP in it to cover the proposal cost (returned if the proposal is executed rather than rejected).
Happy to help with this and if we (the WG) want to submit the proposal when it is ready then I can offer my neuron and ICP for the deposit on behalf of the group.
@icarus ,
Thanks for that explanation and offering the use of your neuron for this purpose. It indeed sounds straightforward.
Let’s do it!
Is there a template doc that we can fill out?
The formal part of the process would be submitting an NNS proposal of type “Motion” under the topic “Governance”. It has no payload data to be automatically executed (if accepted), but contains a simple motionText statement and a full description of the proposed intent, to be read and accepted (or not) by the IC voting community.
Therefore the work mostly involves carefully describing the basis for, intent of, consequences of, and work required by the proposal, should it be accepted.
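For orientation, the payload of a Motion is tiny. Roughly mirrored as Rust types, it looks like the sketch below; the field names follow my reading of the public NNS governance candid interface and may not match the current definitions exactly, so treat this as an approximation.

```rust
/// The motion itself carries only the one-line statement the community votes on.
struct Motion {
    motion_text: String,
}

/// The surrounding proposal carries the human-readable context: the full
/// description of intent, consequences, and work required.
struct Proposal {
    title: Option<String>,
    summary: String, // the detailed rationale voters actually read
    url: String,     // typically a link to the forum discussion
    action: Option<Action>,
}

enum Action {
    Motion(Motion),
    // ... other NNS proposal actions omitted ...
}
```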
A clear example of this is illustrated in the Community Consideration: Explore Query Charging thread itself, which began with an RFD (request for discussion), was then formed into a draft of the proposal text here, and was then formally submitted to the NNS here as a proposal named “Motion for Query Stats Aggregation”.
This was accepted by majority vote and is now in a status of “Executed” which forms a commitment by the IC community to support implementation of that motion in the form of @stefan-kaestle and his team coding the canister query stats feature to support query charging.
If we (as the DeAI WG, or @jeshli as the lead on this particular topic) want to make a formal proposal for a specific query charging model, then I think we can take proposal 123481 as a process template and also refer to it in the new proposal to show it is compatible with that already executed motion.
Other than that procedural work, I think the best approach is exactly what @jeshli has already initiated in the query charging thread: explain our rationale, provide use cases and analysis, and ask for community feedback and discussion. Then write a proposal description that is specific enough to be clearly understood and executed (by the IC dev team), but not so specific that unnecessary technical details are enforced on the implementation.
So, TL;DR: keep going as we are and form a clear, concrete proposal that builds on prior work and discussion and could actually be executed by IC devs.
Should we separate this proposal discussion out into its own thread and link back here as the origin of our discussion about GPU subnets, Gen 3 replica aspects, and the requirements we are seeking for GPU compute calls from IC canisters?
I also responded to the GPU paragraph of your post on the Explore Query Charging thread.
@jeshli, if you agree, please go ahead so we can call in the relevant Dfinity engineering team contacts to ask questions about current specs and intentions.
We should also be aware that there is significant interest among the wider IC community in GPU-enabled IC subnets, due to the current focus on AI technology. I think we can facilitate effective information sharing with the community by keeping it focused on IC developer requirements for current on-chain AI canister development work, such as yours @icpp, ELNA (@branbuilder), Kinic, and others.