ICDevs.org Bounty #8 - HttpRequest Parser

This bounty likely needs some more discussion around the signature of the parsed object. I’m not an HTTP expert, so please weigh in below if you can think of some other functions/objects that would be helpful or if I’m not handling some edge cases. One thing I considered was if we should provide hardcoded pathways to some of the common headers. It is up for debate.

Create an HTTP Request Parser in Motoko - #8

Current Status: Discussion

  • Discussion (01/11/2022) ← We are here
  • Ratification
  • Open for application
  • Assigned
  • In Review
  • Closed

Latest Official Issue Info - ICDevs.org

Bounty Details

  • Current Bounty Amount: 20 ICP
  • ICDevs.org Match Available: 20 ICP - (For every ICP sent to 860bd56f4c8a9d40f26462e51e2a4dd4e27cf0e1463372a1179df089695bfd63, ICDevs.org will add one more ICP to the bounty, up to 20 ICP. After 20 ICP, donations to the above address will add 0.25 ICP to this issue and 0.75 ICP to fund other ICDevs.org initiatives.)
  • Time Left: Expires 12/31/2022
  • Project Type: Single Contributor
  • Opened: 01/22/2022
  • Time Commitment: Days
  • Project Type: Traditional
  • Experience Type: Beginner - Motoko
  • Issue Type: Application Development

Description

This bounty gives the opportunity to

  • learn how http_request works with the Internet Computer
  • learn and contribute to string parsing in Motoko
  • learn how to create and publish a vessel package

The developer will need to create a vessel package called HttpRequestParser that parses the HTTPRequest type into a more useable and extensible object. We suggest:

{
    method: Text;
    url: {
        original: Text;
        protocol: Text; // http or https - may always be https?
        port: Nat16; // maybe always 443? What about local replica?
        host: {
            original: Text;
            array: [Text]; // (canisterID at 0, ic0 at 1, app at 2) - will we always have this structure?
            canister: Principal; // parse the canisterID into a principal
        };
        path: {
            original: Text;
            array: [Text]; // split path by "/" into an array that can be referenced
        };
        query: {
            original: Text; // everything after the ? and before an anchor
            get: (Text) -> ?Text; // pass in a key and get value. null if not present
            hashMap: HashMap<Text, Text>;
            keys: [Text]; // list of query keys
        };
        anchor: Text; // an anchor if available (after the #)
    };
    headers: {
        original: [(Text, Text)];
        get: (Text) -> ?Text; // pass in a key and get value. null if not present
        hashMap: HashMap<Text, Text>;
        keys: [Text]; // list of header keys
    };
    body: ?{ // GET requests won't have a body
        original: Blob;
        size: Nat; // size of the body
        form: { // if the content-type is as specified at https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/POST, parse the form and populate the collection
            get: (Text) -> ?Text; // pass in a key and get value. null if not present
            hashMap: HashMap<Text, Text>;
            keys: [Text]; // list of form keys
        };
        text: () -> Text; // converts the Blob to plain text
        files: (Text) -> ?Buffer<Nat8>; // returns the formdata as a byte array; null if the form entry does not exist
        file: () -> ?Buffer<Nat8>; // if not formdata and only one file is provided it will be here
        bytes: (start: Nat, end: Nat) -> Buffer<Nat8>; // return the specified bytes from the blob
    };
};

This library is an opportunity to start a RegEx-like library for Motoko. RegEx is big and complicated, so this is not a requirement, but if the bounty hunter wants to dive into the basics of RegEx and explore how well (or poorly) Motoko suits it, that would be a bonus.

Prior art that may help in getting you started:

https://github.com/aramakme/aramakme_nft_auction/blob/f0ca7fb629814dc24a90ad84c7d024a49390e38b/main.mo#L2757
https://github.com/dfinity/motoko-base/blob/57c3bb724dfe36928d443f5a81446872bf646de9/src/Text.mo#L346

To apply for this bounty you should:

  • Include links to previous work writing tutorials and any other open-source contributions (i.e., your GitHub).
  • Include a brief overview of how you will complete the task. This can include things like which dependencies you will use, how you will make it self-contained, the sacrifices you would have to make to achieve that, or how you will make it simple. Anything that can convince us you are taking a thoughtful and expert approach to this design.
  • Give an estimated timeline on completing the task.
  • Post your application text to the Bounty Thread

Selection Process

The ICDevs.org Developer Advisors will propose a vote to award the bounty and then vote on it.

Bounty Completion

Please keep your ongoing code in a public repository (a fork or branch is OK). Please provide regular (at least weekly) updates. Code commits count as updates if you link to your branch/fork from the bounty thread. We just need to be able to see that you are making progress.

The balance of the bounty will be paid out at completion.

Once you have finished, please alert the dev forum thread that you have completed the work and tell us where we can find it. We will review it and award the bounty if the terms have been met. If any coordination work (like a pull request) or additional documentation is needed, we will let you know what is required before we can award the bounty.

Bounty Abandonment and Re-awarding

If you cease work on the bounty for a prolonged period (at the Developer Advisory Board’s discretion), or if the quality of work degrades to the point that we think someone else should be working on the bounty, we may re-award it. We will be transparent about this and try to work with you to push through and complete the project, but sometimes it may be necessary to move on or to augment your contribution with another resource, which would result in a split bounty.

Funding

The bounty was generously funded by the community and the DFINITY Bounty Accelerator Grant. If you would like to turbocharge this bounty you can seed additional donations of ICP to 860bd56f4c8a9d40f26462e51e2a4dd4e27cf0e1463372a1179df089695bfd63. ICDevs will match the bounty 1:1 for the first 20 ICP and then 0.25:1 after that. All donations will be tax deductible for US Citizens and Corporations. If you send a donation and need a donation receipt, please email the hash of your donation transaction, physical address, and name to [email protected]. More information about how you can contribute can be found at our donations page.

General Bounty Process

Discussion

The draft bounty is posted to the DFINITY developer’s forum for discussion

Ratification

The Developer Advisory Board will propose that a bounty be ratified, and a vote will take place to ratify it. Until a bounty is ratified by the Dev Council, it hasn’t been officially adopted. Please take this into consideration if you are considering starting early.

Open for application

Developers can submit applications to the Dev Forum post. The council will consider these as they come in and propose a vote to award the bounty to one of the applicants. If you would like to apply anonymously, you can send an email to austin at icdevs dot org or send a PM on the dev forum.

Assigned

A developer is currently working on this bounty. You are free to contribute, but any splitting of the award will need to be discussed with the currently assigned developer.

In Review

The Dev Council is reviewing the submission

Awarded

The award has been given and the bounty is closed.

Matches

10 ICP - DFINITY Accelerator Grant

6 Likes

Seen from the client’s perspective, I think this is the feature I’d wish for the most.

Here are some references that might come in handy:

1 Like

I used the url crate to parse the URL portion in Rust. It might serve as some inspiration.

I suggest trying to use parser combinators over regular expressions.

4 Likes

I would strongly suggest not assuming that the Host always refers to a canister. That this is currently usually the case is a slightly embarrassing state of affairs; let’s not cement it further. (In fact it’s not even true now; identity.ic0.app exists, and people do run their own icx-proxy instances to get nice hostnames.)

Also, it’s odd to represent data redundantly (map and function and list?). I’d parse it as just [(Text,[Text])], this way repeated fields are supported (as they are allowed in HTTP), and users needing a hashmap can use HashMap.ofList easily.
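
A minimal Motoko sketch of the [(Text, [Text])] shape suggested above. The function name groupHeaders is hypothetical and not part of any existing package, and the sketch compares header names case-sensitively even though HTTP header names are case-insensitive. It only assumes Array.append from the base library.

```motoko
import Array "mo:base/Array";

// Hypothetical helper: groups repeated header names so that every value
// is kept, preserving the order in which names were first seen.
// Note: names are compared case-sensitively in this sketch.
func groupHeaders(headers : [(Text, Text)]) : [(Text, [Text])] {
  var grouped : [(Text, [Text])] = [];
  for ((name, value) in headers.vals()) {
    var next : [(Text, [Text])] = [];
    var found = false;
    for ((k, vs) in grouped.vals()) {
      if (k == name) {
        // Existing name: append the new value to its list.
        found := true;
        next := Array.append<(Text, [Text])>(next, [(k, Array.append<Text>(vs, [value]))]);
      } else {
        next := Array.append<(Text, [Text])>(next, [(k, vs)]);
      };
    };
    if (not found) {
      // First time this name is seen: start a new entry.
      next := Array.append<(Text, [Text])>(next, [(name, [value])]);
    };
    grouped := next;
  };
  grouped
};
```

Repeated Array.append calls make this quadratic, which is probably acceptable for the handful of headers on a typical request; a real implementation might prefer a Buffer or a HashMap keyed by name.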

Great point! I’ll adjust the requirements.

I forgot about those! I’m trying to think if there is any reason for this to try to get the canister ID and I don’t think there is…this should be a tool to get things organized and not a place to get work done so anyone using it that wants the canister ID should be able to get from their canister code.

1 Like

Updated Spec.

Changes:

  • Moved files into the form, since that is how they arrive when multiple files are uploaded

  • Support multiple files of the same name

  • Support multiple headers of the same name

  • Support multiple form fields of the same name

  • Specified that the get function is just a helper method. IMO obj.header.get("content") is cleaner than HashMap<Text, [Text]>.ofList("content", myList); (And I can’t actually find ofList in the base library…is this something new @nomeata ?)

    {
        method: Text;
        url: {
            original: Text;
            protocol: Text; // http or https - may always be https?
            port: Nat16; // maybe always 443? What about local replica?
            host: {
                original: Text;
                array: [Text]; // host split at the "."s
            };
            path: {
                original: Text;
                array: [Text]; // split path by "/" into an array that can be referenced
            };
            query: {
                original: Text; // everything after the ? and before an anchor
                get: (Text) -> ?Text; // helper function: pass in a key and get value. null if not present
                hashMap: HashMap<Text, Text>;
                keys: [Text]; // list of query keys
            };
            anchor: Text; // an anchor if available (after the #)
        };
        headers: {
            original: [(Text, Text)];
            get: (Text) -> ?[Text]; // helper function: pass in a key and get values. null if not present
            hashMap: HashMap<Text, [Text]>;
            keys: [Text]; // list of header keys
        };
        body: ?{ // GET requests won't have a body
            original: Blob;
            size: Nat; // size of the body
            form: { // if the content-type is as specified at https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/POST, parse the form and populate the collection
                get: (Text) -> ?[Text]; // helper function: pass in a key and get values. null if not present
                hashMap: HashMap<Text, [Text]>;
                keys: [Text]; // list of form keys
                files: (Text) -> ?[Buffer<Nat8>]; // helper function: returns the formdata as byte arrays; null if the form entry does not exist
            };
            text: () -> Text; // converts the Blob to plain text
            file: () -> ?Buffer<Nat8>; // if not formdata and only one file is provided it will be here
            bytes: (start: Nat, end: Nat) -> Buffer<Nat8>; // helper function: return the specified bytes from the blob
        };
    };

This is now in “application” status. Who wants to build it? This is a chance to build something that almost every project will use in the future.

If you are one of the projects that needs it, consider accelerating the bounty (info above).

2 Likes

What should it look like in the end?
Option 1
A canister with a function that takes a JSON string parameter, for example http_request(json: Text) -> Trie,
where json is { method: Text; url: { }…}
And all interaction on the client is also via Actor and HttpAgent (@dfinity/agent)
Or
Option 2
Is it possible to build a request from scratch using HTTP/HTTPS specifications?

I would love to work on this bounty. I have a repo with some of the fields in the spec implemented here https://github.com/tomijaga/http-parser.mo . I plan to add tests and documentation once I have completed the body field. I estimate it will take two weeks to complete this project.

My Solution

My solution splits the url into its different parts (scheme, domain, subdirectories, query and anchor) and returns them in the object spec.
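
As an illustration of that kind of split, here is a rough Motoko sketch. It is not the code from the linked repository; the names UrlParts, splitFirst and parseUrl are hypothetical, the port is left inside the host, and no percent decoding is done. It assumes only Text.split, Text.tokens and Iter.toArray from the base library.

```motoko
import Iter "mo:base/Iter";
import Text "mo:base/Text";

// Hypothetical record; `queryText` avoids the reserved word `query`.
type UrlParts = {
  scheme : Text;
  host : Text;
  path : [Text];
  queryText : Text;
  anchor : Text;
};

// Splits `t` at the first occurrence of `sep`.
// Returns the part before it and, if `sep` was found, the part after it.
func splitFirst(t : Text, sep : Text) : (Text, ?Text) {
  let parts = Text.split(t, #text sep);
  let head = switch (parts.next()) { case (?h) { h }; case null { "" } };
  switch (parts.next()) {
    case null { (head, null) };
    case (?second) {
      // Re-join any later fields so only the first separator splits.
      var rest = second;
      for (p in parts) { rest := rest # sep # p };
      (head, ?rest)
    };
  }
};

func parseUrl(url : Text) : UrlParts {
  let (beforeAnchor, maybeAnchor) = splitFirst(url, "#");
  let (beforeQuery, maybeQuery) = splitFirst(beforeAnchor, "?");
  let (schemeText, afterScheme) = switch (splitFirst(beforeQuery, "://")) {
    case (s, ?rest) { (s, rest) };
    case (rest, null) { ("", rest) }; // no scheme, e.g. a bare "/path"
  };
  let (hostText, maybePath) = splitFirst(afterScheme, "/");
  let pathText = switch (maybePath) { case (?p) { p }; case null { "" } };
  {
    scheme = schemeText;
    host = hostText;
    // Text.tokens skips empty segments such as a trailing "/".
    path = Iter.toArray(Text.tokens(pathText, #char '/'));
    queryText = switch (maybeQuery) { case (?q) { q }; case null { "" } };
    anchor = switch (maybeAnchor) { case (?a) { a }; case null { "" } };
  }
};
```

Splitting off the anchor first and the query second keeps a "?" or "/" inside the fragment from being misread as part of the query or path.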

For the request body, I plan to convert the blob to text and follow the specifications on the site POST - HTTP | MDN.
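
A small sketch of the blob-to-text step for the simplest case, an application/x-www-form-urlencoded body. The name parseFormBody is hypothetical; the sketch does not handle multipart/form-data or percent decoding, and it keeps repeated keys as separate pairs. It assumes Text.decodeUtf8, Text.tokens, Text.split and Text.join from the base library.

```motoko
import Array "mo:base/Array";
import Text "mo:base/Text";

// Hypothetical helper: decodes a urlencoded body into key/value pairs.
// Returns null if the body is not valid UTF-8.
func parseFormBody(body : Blob) : ?[(Text, Text)] {
  switch (Text.decodeUtf8(body)) {
    case null { null };
    case (?text) {
      var fields : [(Text, Text)] = [];
      // Each "&"-separated token is one "key=value" pair.
      for (pair in Text.tokens(text, #char '&')) {
        let parts = Text.split(pair, #char '=');
        let key = switch (parts.next()) { case (?k) { k }; case null { "" } };
        // Everything after the first "=" belongs to the value.
        let value = Text.join("=", parts);
        fields := Array.append<(Text, Text)>(fields, [(key, value)]);
      };
      ?fields
    };
  }
};
```

Returning an option keeps the invalid-UTF-8 case explicit instead of silently producing an empty form.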

An external dependency I am using to retrieve the HeaderField and Request types is the http.mo package from aviate-labs.

An issue I came across is I can not use query as a field in the object because it is already a keyword in the Motoko programming language. So I changed it to queryObj . I am open to better naming suggestions.

From a user’s perspective, I think it would be helpful to add a deserialize() method that converts a JSON text in the body to an object.

About Me

I have some experience contributing to open source. This includes a JS SDK, a Rust SDK and an HD key generation tool for an open-source blockchain project. These projects can be found on my GitHub profile.

I took part in the Motoko Bootcamp and learned a lot about Motoko from completing the daily challenges. I better understand Motoko syntax and how to use vessel for importing and publishing packages.

2 Likes

Fantastic. Thanks for the detailed application. I’ll submit it to the board, but I don’t see any reason why you shouldn’t get started!

1 Like

You are assigned! Please create a repo to hold your work and let us know what it is!

1 Like

I think the repo is private, can you make it public? I’d love to follow along

1 Like

My bad, I’ve made it public now. GitHub - tomijaga/http-parser.mo: HTTP Request Parser for Motoko

2 Likes

@tomijaga ,

Can I already do some testing, or should I wait a bit?

I came across this recently and thought the people in this thread might be interested.

The url crate refers to the URL standard whereas this adheres to IETF RFC 3986.

1 Like

Is there a comparable standard for a body parser as well? I’d like to get something out and then we can get nitty about standards because they usually help.

What’s left is percent decoding, testing and documentation. So you can start testing it out. If you run into any problems, pls let me know.

I made some changes to the initial spec. I added a fileKeys array to the form object type and created a new type for files.

form: {
    get: (Text) -> ?[Text];
    hashMap: HashMap.HashMap<Text, [Text]>;
    keys: [Text];

    fileKeys: [Text];
    files: (Text) -> ?[File];
};

public type File = {
    name: Text;
    filename: Text;

    mimeType: Text;
    mimeSubType: Text;

    start: Nat;
    end: Nat;
    bytes: Buffer.Buffer<Nat8>;
};

@tomijaga ,
that’s great! I will let you know how it works out.

Do you happen to have some examples that demonstrate how to use it?

How does one actually call it when it is deployed to either the local network or the IC?

It would be great if you could share some examples. Maybe in curl format or, even better, a Postman collection?

1 Like