Awarded: ICDevs.org - Bounty #45 - File Uploader Pattern - JS, Rust, Motoko - $10k

Thanks for your answer, this time I clearly understand what I should do!

Hi @heyuanxun, any updates?

I just want to provide a progress report for the Motoko side of file uploading. I closely followed the interface from @dfinity/assets. However, I simplified the interface by not creating two distinct methods for uploading files. Previously, there was one method for files smaller than a certain byte size and another for chunking larger files. Now, I have designed it to chunk all files, regardless of size. Many tests demonstrate how this works, and I plan to create a video explaining its usage. I would appreciate feedback from the community regarding any additional features or improvements they would like to see.
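To make the flow concrete, here is a minimal TypeScript sketch of what chunking every file looks like from the client side. The actor method names (create_chunk, commit_batch) and their argument shapes are assumptions loosely modeled on the asset canister interface, not the actual API of this repository.

```typescript
// Hypothetical canister interface for illustration only; the real method
// names and argument shapes in the repo may differ.
interface UploadActor {
  create_chunk: (args: { content: Uint8Array; order: number }) => Promise<{ chunk_id: number }>;
  commit_batch: (args: {
    chunk_ids: number[];
    filename: string;
    content_type: string;
  }) => Promise<string>;
}

const CHUNK_SIZE = 2 * 1024 * 1024; // 2MB, the size discussed in this thread

async function uploadFile(actor: UploadActor, file: File): Promise<string> {
  const bytes = new Uint8Array(await file.arrayBuffer());
  const chunkIds: number[] = [];

  // Every file is chunked, regardless of size.
  for (let offset = 0, order = 0; offset < bytes.length; offset += CHUNK_SIZE, order++) {
    const content = bytes.slice(offset, offset + CHUNK_SIZE);
    const { chunk_id } = await actor.create_chunk({ content, order });
    chunkIds.push(chunk_id);
  }

  // Committing the batch assembles the chunks into a single asset on the canister.
  return actor.commit_batch({
    chunk_ids: chunkIds,
    filename: file.name,
    content_type: file.type,
  });
}
```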

Chunking is required over about 2MB. What do you have the chunk size at?

I have the chunk size set to 2MB.

As I wrote above, I did the same in my project. If possible (preferably), make the chunk size adjustable, since the user sometimes needs to compute a hash or checksum, and that doesn’t always complete within a single round.

I just didn’t understand why a developer would choose a chunk size smaller than 2MB. I guess it might be useful on extremely poor or unstable internet connections. I’m going to add retry to the client; that should be easy enough. If there’s anything else you think might be good to have, let me know.

So I added retry logic. I think one thing still missing is a checksum. I’m going to think about this and then add it.
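For reference, the retry idea can be sketched as a small wrapper around each chunk upload. The attempt count and backoff below are illustrative defaults, not the values used in the library.

```typescript
// Generic retry helper for illustration; not the library's actual implementation.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, delayMs = 1000): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Simple linear backoff between attempts; helps on unstable connections.
      await new Promise((resolve) => setTimeout(resolve, delayMs * (i + 1)));
    }
  }
  throw lastError;
}

// Usage: retry each chunk upload rather than restarting the whole file.
// await withRetry(() => actor.create_chunk({ content, order }));
```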

We are the MetaBox team, and we will start working on this bounty.
In MetaBox, once we have fulfilled the requirements mentioned above, we will abstract these functions.
MetaBox : app.metabox.rocks

Hey C-B-Elite…the bounty is currently assigned to Cyberowl.

Can you take a look at the code above and offer any suggestions that you would find helpful for MetaBox? Thanks.

Seems like a checksum is a good idea. Here is my approach to doing it:

#Checksum Verification

Client-side (using JavaScript):
a. Compute the checksum of the file before uploading, using a JavaScript library that supports the desired checksum algorithm (e.g., MD5, SHA-1, or SHA-256).
b. Send the computed checksum along with the file to the server.

Server-side:
a. Assemble the file from all the chunks.
b. Compute the checksum of the received file using the same algorithm as the client.
c. Compare the server-generated checksum with the client-generated checksum.

If the checksums match, it is highly likely that the file was uploaded correctly and without errors. If they don’t match, there might be an issue with the uploaded file, such as corruption or an incomplete transfer.
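As a rough illustration of the client-side step, the browser’s Web Crypto API can produce a SHA-256 digest before upload; how the digest is sent along with the batch and recomputed on the canister side is assumed here, not taken from the repository.

```typescript
// Compute a hex-encoded SHA-256 digest of a file in the browser (illustrative).
async function sha256Hex(file: File): Promise<string> {
  const buffer = await file.arrayBuffer();
  const digest = await crypto.subtle.digest("SHA-256", buffer);
  return Array.from(new Uint8Array(digest))
    .map((byte) => byte.toString(16).padStart(2, "0"))
    .join("");
}
```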

I thought maybe it would be good to add a file sync mechanism as well, but that will probably take more time. However, I think a simple checksum will be good for file integrity in case some chunks are injected in some way while uploading the file.

#Version-Update

I implemented CRC-32 checksums for file uploads. This ensures that if any chunk is missing when committing a batch, an error will be triggered. Additionally, checksums are performed at both the chunk and file levels, allowing for potential synchronization of different chunks in the future. For instance, updating specific parts of an executable file could be made possible. While I haven’t tested this functionality yet, it’s an experiment I plan to explore next. Another aspect I’m interested in trying is implementing gzip compression and decompression on the client side.
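For readers unfamiliar with CRC-32, here is a generic reference implementation in TypeScript of the kind of chunk-level checksum described above. It is a sketch for illustration, not the code from the repository.

```typescript
// Standard CRC-32 (IEEE) over a byte array; each chunk can be checksummed this way,
// and a mismatch or missing chunk surfaces when the batch is committed.
const CRC_TABLE = (() => {
  const table = new Uint32Array(256);
  for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
      c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
    }
    table[n] = c >>> 0;
  }
  return table;
})();

function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}
```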

Is there any particular reason why you chose to use transient memory instead of ZhenyaUsenko’s stable hashmap? AFAIK it is more performant than the one used in the Motoko base library despite using stable memory, and it would lift the storage limit per canister to 48GB.

I don’t think ZhenyaUsenko’s uses stable memory; it is just a module. There is no particular reason, just not bringing in more dependencies than necessary. I still don’t understand why we don’t support improving the base hashmap instead of creating additional deps. However, if the community prefers it, I can change it.

#Version-Update

Taking your feedback into consideration, @Zane, I have updated the code to move towards the hashmap dep: refactor: use dep hashmaps from ZhenyaUsenko · cybrowl/upload-file@904a3e8 · GitHub. I think it makes the code easier to read as well. Anyway, let me know if you have any other suggestions.

That is great, but just to be sure I’d also ask @skilesare’s opinion on this one. The additional dependency argument you pointed out makes sense, and since he is the bounty issuer he deserves to decide if it’s a worthwhile compromise for more storage capacity.

Are more granular per-asset permissions planned?

Also remember to update this line to match the new domain; ic0.app won’t be used anymore for new canisters after the 20th of April: upload-file/utils.mo at main · cybrowl/upload-file · GitHub

Yeah, I just saw a tweet about the domain name change. I don’t think the hashmap dep adds any increase in storage capacity, but I will add support for stable storage next as well, since it seems like a good thing. There are limitations, though, and I will outline them in the readme. Permissions per asset can only be done via token generation and are definitely something to explore.

#Version-Update

Initially, I wrote the @dfinity/file-upload package in JavaScript. I chose this approach because I didn’t want to learn TypeScript and didn’t see the need for it in such a small library. However, to maintain consistency with agent-js, I decided to rewrite it in TypeScript. I also updated all my tests to be compatible with Jest. I created a new repository to manage changes while testing the package at GitHub - cybrowl/agent-js-file-upload: fileupload pkg. After testing, it can be more easily merged into agent-js. Currently, there is a minor issue I couldn’t fix with agent-js, where the Jest tests are not working for me. It seems to be a configuration issue related to the monorepo structure.

The next step is to start using the library in a project, and I am considering building a simple UI for people to see what they can do with the project. Domwoe also suggested that I should add it to Internet Computer Loading, which I plan to do soon.

Using ZhenyaUsenko’s hashmap is great. It is fast and performant. It doesn’t do stable storage, but it takes advantage of the stable migrations provided by Motoko, so you shouldn’t need to do pre/post upgrade.

A stable version that gets you up to 48GB would be great, but I think we need to let @matthewhammer burn in the stable region stuff (is that ready to go, Matt?)
