Private user data, big data, and data isolation


I have been fleshing out an idea for an open-source developer-oriented contextual meta-search engine in the engineering problem/solution space. It would do both local and system-wide machine learning.

I would love to develop or plan on developing on dfinity as a platform, but there are many questions I have which have no answer in the documentation.

  1. I understand that application data is encrypted, but that level of encryption is not enough when it comes to private data. The application should not be able to read the private data of a user, otherwise if the application is hacked, the user data is exposed. What provisions are there for securing user data?
  2. Can an IC application run analysis on local machine data and have file access?
  3. What is the performance on data access? Is it sufficient to run big data operations?
  4. What level of isolation is there between the data of separate bundles? Is all data on one block-chain? Or is data per IC (like a private blockchain)? Is the blockchain sharded at all or is it a traditional bitcoin/ethereum 2D array type block structure?

The reason for the 4th question is isolation, performance and vulnerability. The performance of an app I create on a single blockchain will be affected by demand of others or attacks (DDOS and otherwise) on the blockchain.

I know these are a lot of questions, and I know you guys are working on building a working system.

I would argue that documenting the future state of the project and answers to questions such as these would attract real-world projects and developer support earlier. I think many would like to develop on an open, secure, immutable system, but without knowing how their real-world problems like big data, data-privacy and isolation, etc would be solved, they have to wait.



Thanks Jim. Great input.

Generally: The docs are very early stage at the moment so while it’s not there yet, info like this will begin to appear over time. I agree much of this would be useful to developers.

Just re point 2: You could explore including a wasm binary in with the front end files and pulling that down to the browser for local execution, it could then have sandboxed access to the local machine as per browser executed webassembly.


Briefly on point 4: The data isn’t replicated identically across every node, it’s striped across the network; a lot of work has gone into optimising access and performance, but there aren’t any public details to share on this at the moment.

You will be able to run private chains too, and connect them to the public network, if that’s useful to you.

1 Like

Hi Ori,
Great - those are good options. I will definitely consider dfinity then. Right now I will spend some time (besides my day job) in design, so I will continue to follow the project during that process.
Thanks for your help.


No problem. Really good to have you here, it’s great to have these discussions.

1 Like

Is there any more general information on data privacy?

Jimbo mentions that he/she understands that application data is encrypted. Is there any more information Dfinity’s privacy model?


Hey DK, just what the FAQ briefly mentions at the moment, there aren’t really any more details public on all this yet.


Cross-Linking this