Completed: ICDevs.org Bounty #8 - HttpRequest Parser

I’d suggest switching to triemap as hashmap has serious memory issues.
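
For anyone making the same switch, here is a minimal sketch of `TrieMap` from the Motoko base library. The `headers` map and its entries are made up for illustration and are not taken from the parser's source.

```motoko
import TrieMap "mo:base/TrieMap";
import Text "mo:base/Text";

// Hypothetical header store; the name and contents are illustrative only.
// Unlike HashMap, the TrieMap constructor takes no initial capacity.
let headers = TrieMap.TrieMap<Text, Text>(Text.equal, Text.hash);

headers.put("content-type", "text/html");
headers.put("host", "localhost:8000");

// `get` returns an optional value, just as it does for HashMap.
let contentType : ?Text = headers.get("content-type");
```

Because `put` and `get` have the same shape as their `HashMap` counterparts, the change is mostly limited to the import and the constructor call.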

Do you happen to have some examples that demonstrate how to use it?

Yes, I have an example canister in the repo with a simple HTML form for uploading a file and some fields. You can clone the repo and deploy it locally to access the form at http://localhost:8000. I have a function called debugRequestParser that prints the parsed request to the console once the form is submitted. This function should show how to use and access some fields in the object.

How does one actually call it when it is deployed to either the local network or the IC?

The parser is a module that you can import into your canister. Add the vessel.dhall and package-set.dhall files from the example canister, then add the line import HttpParser "mo:HttpParser"; to your code. You call it in the http_request function on each incoming request to the canister. Here’s a snippet of what it would look like:

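    public query func http_request(rawReq: HttpParser.HttpRequest) : async HttpParser.HttpResponse {
        // Parse the raw request into a structured object.
        let req = HttpParser.parse(rawReq);

        // Destructure the parsed URL; the `original` field is bound to the local name `url`.
        let {host; port; path; queryObj; anchor; original = url} = req.url;

        ...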

Thanks, I will switch to that. Do you have any other suggestions on how to be more efficient? Parsing files over 30 KB is really slow; it takes about 40 seconds. I think it’s because I’m concatenating the characters in every line when I only need to check the first character for a match and move to the next line if it fails. I will try this out and get back to you about any performance improvements.
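
The idea of checking only the first character can be sketched as follows. This is not the module's actual code: `startsWithChar` and `countCandidateLines` are hypothetical helpers, and the choice of `'-'` as the character of interest (as in a multipart boundary line) is an assumption.

```motoko
import Text "mo:base/Text";

// True when the first character of `line` is `c`.
// Only the first character is read; no intermediate Text is built.
func startsWithChar(line : Text, c : Char) : Bool {
  switch (line.chars().next()) {
    case (?first) { first == c };
    case null { false };
  };
};

// Counts the lines that begin with '-' and skips every other line
// after looking at a single character.
func countCandidateLines(body : Text) : Nat {
  var count = 0;
  for (line in Text.split(body, #text "\n")) {
    if (startsWithChar(line, '-')) { count += 1 };
  };
  count
};

let candidates = countCandidateLines("--boundary\nfield\n--boundary--");
```

Lines that fail the first-character test are rejected without any per-character concatenation, which is the saving described above.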

I just implemented this, and there were some improvements. The module now parses data at about 100 KB/s, which is still relatively slow: at that rate, a 3 GB file would take over 8 hours to parse. I will check other similar parsing libs to see how I can increase the performance further.

Wow. We are going to need a strategy for that. We really need a highly performant regex function. I’ll try to take a look, but perhaps @paulyoung has an idea?

One thing to keep in mind is that requests on the IC are limited to 2MB, so you are unlikely to run into a scenario where you need to parse more than that. I think this applies to http_request as well.

If you want a file bigger than that, you have to chunk it.
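
As a rough illustration of chunking, the sketch below splits a byte array into pieces of at most `chunkSize` bytes. The `chunk` function is hypothetical, and the caller is assumed to pick a `chunkSize` that keeps each request under the limit mentioned above.

```motoko
import Array "mo:base/Array";
import Nat "mo:base/Nat";

// Splits `bytes` into consecutive chunks of at most `chunkSize` bytes.
func chunk(bytes : [Nat8], chunkSize : Nat) : [[Nat8]] {
  assert chunkSize > 0;
  // Number of chunks, rounding up.
  let count = (bytes.size() + chunkSize - 1) / chunkSize;
  Array.tabulate<[Nat8]>(count, func(c : Nat) : [Nat8] {
    let start = c * chunkSize;
    // The last chunk may be shorter than `chunkSize`.
    let len = Nat.min(chunkSize, bytes.size() - start);
    Array.tabulate<Nat8>(len, func(i : Nat) : Nat8 { bytes[start + i] })
  })
};

// Five bytes with a chunk size of two give chunks of sizes 2, 2 and 1.
let parts = chunk([1, 2, 3, 4, 5], 2);
```

Each chunk would then be sent in its own call and reassembled on the canister side; how that upload API looks is outside this sketch.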

As I said earlier in this thread, I suggest using parser combinators, or at least a parser that consumes the input as it goes.

I haven’t used these but they might be a good place to start.
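
To make the suggestion concrete, here is a very small parser-combinator sketch written from scratch. It does not use any existing library, all names (`Parser`, `satisfy`, `many`, `pair`, `exactly`) are hypothetical, and it favors clarity over speed. Each parser takes the input and a position and returns a value plus the new position, so the input is consumed as parsing proceeds.

```motoko
import Iter "mo:base/Iter";
import Text "mo:base/Text";

// A parser reads from `input` at a position and returns a value and the next position,
// or null on failure.
type Parser<A> = ([Char], Nat) -> ?(A, Nat);

// Matches one character that satisfies `pred`.
func satisfy(pred : Char -> Bool) : Parser<Char> {
  func(input : [Char], pos : Nat) : ?(Char, Nat) {
    if (pos < input.size() and pred(input[pos])) { ?(input[pos], pos + 1) } else { null }
  }
};

// Matches exactly the character `expected`.
func exactly(expected : Char) : Parser<Char> {
  satisfy(func(c : Char) : Bool { c == expected })
};

// Applies `p` zero or more times and collects the matched characters.
func many(p : Parser<Char>) : Parser<Text> {
  func(input : [Char], pos : Nat) : ?(Text, Nat) {
    var out = "";
    var i = pos;
    label scan loop {
      switch (p(input, i)) {
        case (?(c, next)) { out #= Text.fromChar(c); i := next };
        case null { break scan };
      };
    };
    ?(out, i)
  }
};

// Runs `a`, then runs `b` on the remaining input, and pairs the results.
func pair<A, B>(a : Parser<A>, b : Parser<B>) : Parser<(A, B)> {
  func(input : [Char], pos : Nat) : ?((A, B), Nat) {
    switch (a(input, pos)) {
      case (?(x, p1)) {
        switch (b(input, p1)) {
          case (?(y, p2)) { ?((x, y), p2) };
          case null { null };
        }
      };
      case null { null };
    }
  }
};

// Usage: parse a `key=value` pair made of lowercase letters.
func isLower(c : Char) : Bool { c >= 'a' and c <= 'z' };
let word = many(satisfy(isLower));
let keyValue = pair<Text, (Char, Text)>(word, pair<Char, Text>(exactly('='), word));

let raw = "host=localhost";
let result = keyValue(Iter.toArray(raw.chars()), 0);
```

Small parsers such as `word` and `exactly('=')` are combined into larger ones with `pair`, which is the core idea the tutorials on this topic build on.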

Any new updates? Looking forward to paying out the bounty!

Yes, I’ve made a few updates. I’ve added support for percent-encoded search queries, written unit tests for each class and added documentation for the ParsedHttpRequest data type.

However, I haven’t been able to get the module to parse files faster. I’ve tried looking into parser combinators (thanks @paulyoung for this btw), but I haven’t been able to wrap my head around them. It will take some time to understand how they work and use them in the module.

Please push your changes as they are already super useful. There are better ways to get files into the IC than http_request anyway. We can always explore and add a version later.

This is a beginner tutorial in Rust but I think it describes some fundamental concepts well. I would focus on the “Defining the parser” and “Combinators” sections.

https://bodil.lol/parser-combinators/

Something like this might be closer in syntax: Building Parser Combinators (Part 1) - Swift Talk - objc.io

Hey @skilesare, I have updated the repo with my changes and completed the development for this version. I believe the package is ready to be reviewed, and I am awaiting your feedback on any other improvements that could be made.

Fantastic. This bounty is officially in preview mode.

We are using a fork in the Origyn NFT already! I’ll review and update to this.

Community: Please review @tomijaga’s work. This is a very cool library that will save you a ton of time when interpreting http_request queries!

This bounty is now closed and awarded. Congrats @tomijaga!

Thank you, @skilesare and everyone in this forum for your contributions and support.
I will continue to maintain the repo and work on a more performant version in my free time.
If you are interested in using this lib in your project, you can find the repo here