Questions about data structures and migrations

Hi everyone,

I’m new here and have recently learned about DFINITY. I’ve started toying around with Motoko (already made a PR :stuck_out_tongue:) and the idea of orthogonal persistence (which I’ve always been a fan of), and I have a few questions in mind.

With these questions, I’m trying to understand how to build the data layer for a Reddit-like platform (which I think is a great use case) on DFINITY and hopefully discover some of the potential best practices for doing so.

Relational data

Given a Reddit-like platform where we have users, communities and posts, how can we efficiently find a user's posts within a community?

Iterating over the entire posts array for every query would be slow and costly with a data set as big as Reddit’s. Even if the data lives in fast NVRAM rather than on SSDs, a full scan still does far too much computation for this query.

What I have in my mind now is introducing indexes in the form of hash maps, for example:

  • Map<UserId, List<PostId>> - A map from a user id to the ids of the posts that user authored.
  • Map<Tuple<Community, UserId>, List<PostId>> - A map from a (community, user) pair to the ids of that user's posts within that community (solves the querying efficiency problem above).
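To make the second index concrete, here is a rough sketch of what I have in mind. It assumes the base library's HashMap, List, Hash and Text modules; the id types, `keyEq`, `keyHash`, `indexPost` and `postsOf` are names I made up for illustration, and I'm not claiming this is the recommended way to do it.

```motoko
import HashMap "mo:base/HashMap";
import Hash "mo:base/Hash";
import List "mo:base/List";
import Text "mo:base/Text";

// Hypothetical id types; a real app might use Principal or Nat instead.
type UserId = Text;
type CommunityId = Text;
type PostId = Nat;

// Composite key for the (community, user) index.
type Key = (CommunityId, UserId);

// HashMap needs explicit equality and hash functions for the key type.
func keyEq(a : Key, b : Key) : Bool {
  a.0 == b.0 and a.1 == b.1
};

// Hash both parts joined by a separator. Distinct keys may collide,
// but keyEq still tells them apart, so lookups stay correct.
func keyHash(k : Key) : Hash.Hash {
  Text.hash(k.0 # "/" # k.1)
};

let postsByCommunityAndUser =
  HashMap.HashMap<Key, List.List<PostId>>(16, keyEq, keyHash);

// Record that `user` authored `post` in `community`.
func indexPost(community : CommunityId, user : UserId, post : PostId) {
  let key = (community, user);
  let existing = switch (postsByCommunityAndUser.get(key)) {
    case (?ids) { ids };
    case null { List.nil<PostId>() };
  };
  postsByCommunityAndUser.put(key, List.push<PostId>(post, existing));
};

// Look up a user's posts in a community without scanning every post.
func postsOf(community : CommunityId, user : UserId) : List.List<PostId> {
  switch (postsByCommunityAndUser.get((community, user))) {
    case (?ids) { ids };
    case null { List.nil<PostId>() };
  }
};
```

A lookup such as `postsOf("motoko", "alice")` then costs one hash lookup instead of a pass over all posts, at the price of keeping the map in sync on every write.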

However, this category of solutions requires developers to manage indexes manually in code, which brings us back to C-like memory-management problems: developers forget to update an index or corrupt one by mistake.

I imagine a better solution would be to build database-like data structures which abstract index management from the developer, doing it automatically as new elements are inserted/updated/removed.
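As a minimal sketch of that kind of abstraction (again only an assumption about how it could look, not an existing library), a class can keep the primary storage and a secondary index private, so the only way to change the data is through methods that update both. `PostStore`, `Post`, `natHash` and the method names are hypothetical.

```motoko
import HashMap "mo:base/HashMap";
import Hash "mo:base/Hash";
import List "mo:base/List";
import Nat "mo:base/Nat";
import Text "mo:base/Text";

type PostId = Nat;
// Hypothetical post shape, for illustration only.
type Post = { id : PostId; author : Text; community : Text; title : Text };

// Simple hash for Nat keys, built from calls that exist in the base library.
func natHash(n : Nat) : Hash.Hash { Text.hash(Nat.toText(n)) };

class PostStore() {
  // Both maps are private: callers cannot touch one without the other.
  let posts = HashMap.HashMap<PostId, Post>(16, Nat.equal, natHash);
  let byAuthor = HashMap.HashMap<Text, List.List<PostId>>(16, Text.equal, Text.hash);

  func idsOf(author : Text) : List.List<PostId> {
    switch (byAuthor.get(author)) {
      case (?ids) { ids };
      case null { List.nil<PostId>() };
    }
  };

  // Removing a post also removes its id from the author index.
  public func remove(id : PostId) {
    switch (posts.remove(id)) {
      case (?old) {
        let rest = List.filter<PostId>(idsOf(old.author), func(p : PostId) : Bool { p != id });
        byAuthor.put(old.author, rest);
      };
      case null {};
    };
  };

  // Inserting (or replacing) a post updates the index in the same call.
  public func insert(post : Post) {
    remove(post.id); // avoid a stale index entry when replacing a post
    posts.put(post.id, post);
    byAuthor.put(post.author, List.push<PostId>(post.id, idsOf(post.author)));
  };

  // Query through the index instead of scanning all posts.
  public func byUser(author : Text) : List.List<Post> {
    List.mapFilter<PostId, Post>(idsOf(author), func(id : PostId) : ?Post { posts.get(id) })
  };
};
```

The index here is hard-coded for one field. A general "per-column" version would have to take the key-extraction function as a parameter, which is the part I'm hoping a library could provide.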

Does Motoko offer some sort of an array-like data structure that supports indexing on a “per-column” basis?
Do you think such a data structure could ever be enough to replace the need for a database table with indexes (for at least the 80% use case)?

Is there another solution to this problem that I’m not aware of?

Data migration

How can we handle changes to data structures in canisters?

For example, if I add one field, delete one field and rename another in the Profile structure in LinkedUp, what would happen to the existing data?

Is there some sort of standard for doing data migrations?
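In the absence of a standard, the only approach I can think of is an explicit conversion between the old and new shapes. Here is a rough sketch; the field names are made up and are not LinkedUp's actual Profile fields, and `ProfileV1`, `ProfileV2`, `migrate` and `migrateAll` are hypothetical names.

```motoko
import Array "mo:base/Array";

// Hypothetical "before" shape (not LinkedUp's real fields).
type ProfileV1 = { firstName : Text; lastName : Text; company : Text };

// Hypothetical "after" shape: `company` is deleted, `firstName` is
// renamed to `givenName`, and `headline` is added.
type ProfileV2 = { givenName : Text; lastName : Text; headline : Text };

// Convert one record from the old shape to the new one.
func migrate(old : ProfileV1) : ProfileV2 {
  {
    givenName = old.firstName; // renamed field: copy the value across
    lastName = old.lastName;   // unchanged field
    headline = "";             // added field: needs some default value
    // `company` is dropped simply by not copying it
  }
};

// Convert a whole collection of existing records.
func migrateAll(olds : [ProfileV1]) : [ProfileV2] {
  Array.map<ProfileV1, ProfileV2>(olds, migrate)
};
```

What I don't know is where such a conversion should run during a canister upgrade, which is really the heart of my question.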

BTW, I couldn’t find much information about how structural types in Motoko work. If they are anything like structs in statically typed languages, then field names aren’t stored in memory. Is that correct? Take this type, for example:

// Define to-do item properties
type ToDo = {
	id: Nat;
	description: Text;
	completed: Bool;
};
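From what I can tell at the type level (this says nothing about the memory layout, which is what I'm really asking about), "structural" means two types with the same fields are interchangeable, and a record with extra fields can be used where fewer are expected. `Task` and `Identified` are names I made up for this sketch.

```motoko
type ToDo = { id : Nat; description : Text; completed : Bool };

// A separately declared type with exactly the same fields.
type Task = { id : Nat; description : Text; completed : Bool };

// A type that only requires an `id` field.
type Identified = { id : Nat };

let todo : ToDo = { id = 1; description = "Write a forum post"; completed = false };

// Accepted: ToDo and Task have the same structure, so they are the same type.
let task : Task = todo;

// Accepted: a record with more fields is a subtype of one with fewer.
let justId : Identified = todo;
```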

That’s very well put. You’ve essentially described the same thought process I went through regarding indexing and abstracting it away. I’d welcome more input on this from anyone trying similar things.

There may be room for the team to suggest some approaches too and possibly add library support.

More information and tooling for data migration are on the way; the approach may include separating your data into its own canister.

And welcome to the forum!


Great question. Regarding data migration, this sounds similar to the problem Solidity contracts faced, which was to some extent solved with proxy contracts. Take a look at https://blog.indorse.io/a-well-tested-guide-to-upgradeable-proxy-ethereum-smart-contracts-f4b5111c12b0. I’m curious about the relational data myself, but I’m too new to DFINITY to answer that :smile:
