I need to store multiple types of data, such as ChatHistory and UserProfile, in a canister

Hi,
I have a problem and I need everyone’s help.
I want to store chat history on a canister, but since each canister has a maximum capacity of 4GB, I plan to build a canister controller that orchestrates the creation of a new canister when the current one is nearly full. When retrieving a list of data, it should call multiple canisters. Please provide a solution or related documentation. Thank you very much.

I’m quite familiar with this scenario. Here’s what you need to do:

  1. Dynamically create canisters via a central canister (A Canister) and keep track of each canister’s currently used storage size, along with the set of keys stored in each canister (similar to BTreeMap<CanisterId, HashSet<Key>>). When fetching a key, you can simply iterate through this list of canisters and check each key set, which avoids storing a separate Key-to-CanisterId entry for every key. You could also consider using a Bloom filter or other structures, depending on your specific needs.
  2. When requesting a piece of data, you can send the request to A Canister, which will then retrieve the actual value using the composite-query method (note: this requires that A Canister and the data-storing canisters reside on the same subnet).
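The bookkeeping described in the two steps above could be sketched roughly as follows. This is only an in-memory Python model of the index that A Canister would keep, not canister code: the names `Bucket`, `BucketRegistry`, `assign`, and `locate` are made up for illustration, the `bucket-N` IDs are placeholders for real canister IDs, and the 3 GiB soft limit is an assumed budget.

```python
from dataclasses import dataclass, field

# Assumed per-canister budget; pick a value that leaves headroom for the runtime.
SOFT_LIMIT_BYTES = 3 * 1024**3


@dataclass
class Bucket:
    canister_id: str
    used_bytes: int = 0
    keys: set = field(default_factory=set)


class BucketRegistry:
    """In-memory model of the index kept by the central canister."""

    def __init__(self, soft_limit=SOFT_LIMIT_BYTES):
        self.soft_limit = soft_limit
        self.buckets = []

    def _create_bucket(self):
        # Stand-in for creating a real storage canister; the ID is a placeholder.
        bucket = Bucket(canister_id=f"bucket-{len(self.buckets)}")
        self.buckets.append(bucket)
        return bucket

    def assign(self, key, size_bytes):
        """Record `key` in the active bucket, rolling over when it would exceed the limit."""
        active = self.buckets[-1] if self.buckets else self._create_bucket()
        if active.used_bytes + size_bytes > self.soft_limit:
            active = self._create_bucket()
        active.keys.add(key)
        active.used_bytes += size_bytes
        return active.canister_id

    def locate(self, key):
        """Return the ID of the bucket holding `key`, or None if the key is unknown."""
        for bucket in self.buckets:
            if key in bucket.keys:
                return bucket.canister_id
        return None
```

In a real deployment `assign` would sit behind the update call that writes data, and `locate` would pick the storage canister to forward the query to.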

Additionally, keep in mind that the entire Wasm heap memory cannot be used solely for data storage, since some of it is needed for runtime operations. Therefore, it’s recommended to limit storage to around 3GB per canister.

Alternatively I could recommend:

  • Motoko: enable Enhanced Orthogonal Persistence
  • Rust: use the ic-stable-structures crate

Both of these language-specific alternatives allow a single canister to store significantly more data than the 4 GB heap limit (multiple hundreds of GB).

Opting for multiple canisters should primarily be considered for architectural reasons, since doing so adds significant complexity.

Thank you for your response. I will consider the options carefully to make the most suitable choice.

Thank you for your response. I will look into it further based on your suggestions and the link you provided. Thanks!

@nguyendola read more about this here: Officializing Enhanced Orthogonal Persistence (EOP)

Thank you very much, @marc0olo. I will look into it more. By the way, how about Rust?

https://internetcomputer.org/docs/building-apps/developer-tools/cdks/rust/stable-structures

I have some questions that I hope you can help me with.

I’m currently running a canister in a local environment.

  1. When I deploy the project, a default canister is created with canister_id = avqkn-guaaa-aaaaa-qaaea-cai.

  2. I have a .did file with the following content:
    type ChatHistory = record {
      id : nat64;
      updated_at : nat64;
      content : text;
      reply_to : opt text;
      source : SourceType;
      canister_id : opt text;
      created_at : nat64;
      sender : SenderType;
      user_id : text;
      group_id : opt text;
      message_id : opt text;
    };
    type CreateChatHistoryRequest = record {
      content : text;
      reply_to : opt text;
      source : SourceType;
      sender : SenderType;
      user_id : text;
      group_id : opt text;
      message_id : opt text;
    };
    type ListChatHistoryRequest = record {
      source : SourceType;
      page : opt nat64;
      user_id : text;
      limit : opt nat64;
      reverse_order : opt bool;
    };
    type Result = variant { Ok; Err : text };
    type Result_1 = variant { Ok : nat64; Err : text };
    type Result_2 = variant { Ok : vec ChatHistory; Err : text };
    type Result_3 = variant { Ok : ChatHistory; Err : text };
    type SenderType = variant { AI; Human };
    type SourceType = variant { Web; Telegram };
    type UpdateChatHistoryRequest = record { id : nat64; content : text };
    service : () -> {
      check_and_create_canister : () -> (Result);
      create_history : (CreateChatHistoryRequest) -> (Result_1);
      delete_history : (nat64) -> (Result);
      get_active_canister : () -> (opt text) query;
      get_canister_ids : () -> (vec text) query;
      get_histories : (ListChatHistoryRequest) -> (Result_2);
      get_history : (nat64) -> (Result_3) query;
      update_history : (UpdateChatHistoryRequest) -> (Result_3);
    }

I loaded this .did file into my Python project using the code below:

from ic import Client, Identity, Agent, Canister

def __init__(self, settings: Settings):
    self.client = Client(url=settings.canister_url)
    self.identity = Identity()
    self.agent = Agent(identity=self.identity, client=self.client)
    # Load the Candid interface so ic-py can expose the canister methods
    with open(settings.candid, "r") as f:
        self.candid = f.read()
    self.canister = Canister(agent=self.agent, canister_id=settings.canister_id, candid=self.candid)

async def create_history_async(self, chat: dict):
    try:
        chat_data = {
            'content': 'Example Chat Message',
            'reply_to': None,
            'source': {'Web': None},
            'sender': {'Human': None},
            'user_id': '240520',
            'group_id': None,
            'message_id': None
        }
        return await self.canister.create_history_async(chat_data)
    except Exception as e:
        print(f"Error calling create_history_async: {e}")

Some details:

  • canister_id: avqkn-guaaa-aaaaa-qaaea-cai
  • canister_url: localhost port 4943
  • candid: path to icp_storage.did
  • ic-py version: 1.0.1

But when I run the create_history_async method, I get this error:

Error calling create_history_async: Invalid record {<ic.candid.TextClass object at 0x14b0003d0>: 'text', <ic.candid.OptClass object at 0x39fb832e0>: 'opt (text)', <ic.candid.OptClass object at 0x39fb83340>: 'opt (text)', <ic.candid.OptClass object at 0x39fb833a0>: 'opt (text)'} argument: {'content': 'Example Chat Message', 'reply_to': None, 'source': {'Web': None}, 'sender': {'Human': None}, 'user_id': '240520', 'group_id': None, 'message_id': None}

Using dfx works fine:
dfx canister call icp_storage create_history '(
  record {
    content = "Example Chat Message";
    reply_to = null;
    source = variant { Web };
    sender = variant { Human };
    user_id = "240520";
    group_id = null;
    message_id = null;
  }
)'
(variant { Ok = 4 : nat64 })

If I’m wrong about anything, please explain it to me. Thank you so much!

It seems this issue is caused by using ic-py, as ic-py’s handling of Candid has some problems. I recommend first trying the Rust agent.

Since ic-py hasn’t been maintained for a long time (it was previously managed by another team), I’ve only recently taken over its maintenance, and the Candid-related fixes will follow the security patches—so it may take some time.

@nguyendola or, if you are more familiar with JavaScript/TypeScript, you can use the JavaScript agent.

Let us know if you need further help!

Hi,
I think I’ve found the cause. I tested it with the get_histories function and it has the same issue.

#[derive(CandidType, Serialize, Deserialize, Clone, Debug)]
pub struct ListChatHistoryRequest {
    pub user_id: String,
    pub source: u8,
    pub page: Option<u64>,
    pub limit: Option<u64>,
}

For fields of type Option, in Python you need to wrap the value in square brackets (a list).
For example:
params = {
    "user_id": "240520",
    "source": 0,
    "page": [0],    # instead of 0
    "limit": [50],  # instead of 50
}
return await self.canister.get_histories_async(params)

I haven’t tried the create_history function yet, but I think it works the same way.
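If create_history does work the same way, the request could be built with a small helper like the one below. This is a sketch I have not run against the canister: `opt` and `build_create_history_args` are made-up helper names, and it assumes ic-py wants opt fields as a list, with `[]` for null and `[value]` for a present value, while variants stay as single-key dicts as in the earlier post.

```python
def opt(value):
    """Encode an optional value as a list: [] for null, [value] otherwise (assumed ic-py convention)."""
    return [] if value is None else [value]


def build_create_history_args(content, user_id, source="Web", sender="Human",
                              reply_to=None, group_id=None, message_id=None):
    """Build the CreateChatHistoryRequest record with every opt field wrapped."""
    return {
        "content": content,
        "reply_to": opt(reply_to),
        "source": {source: None},   # variant with no payload
        "sender": {sender: None},   # variant with no payload
        "user_id": user_id,
        "group_id": opt(group_id),
        "message_id": opt(message_id),
    }


chat_data = build_create_history_args("Example Chat Message", "240520")
```

The resulting `chat_data` would then be passed to `self.canister.create_history_async(chat_data)` in place of the dict that used plain `None` values.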

Thanks!
