Private user data, big data, and data isolation

jimbo · February 29, 2020, 7:07pm

Howdy,

I have been fleshing out an idea for an open-source developer-oriented contextual meta-search engine in the engineering problem/solution space. It would do both local and system-wide machine learning.

I would love to develop or plan on developing on dfinity as a platform, but there are many questions I have which have no answer in the documentation.

1. I understand that application data is encrypted, but that level of encryption is not enough when it comes to private data. The application should not be able to read the private data of a user, otherwise if the application is hacked, the user data is exposed. What provisions are there for securing user data?
2. Can an IC application run analysis on local machine data and have file access?
3. What is the performance on data access? Is it sufficient to run big data operations?
4. What level of isolation is there between the data of separate bundles? Is all data on one block-chain? Or is data per IC (like a private blockchain)? Is the blockchain sharded at all or is it a traditional bitcoin/ethereum 2D array type block structure?

The reason for the 4th question is isolation, performance and vulnerability. The performance of an app I create on a single blockchain will be affected by the demand of other apps or by attacks (DDoS and otherwise) on the blockchain.

I know these are a lot of questions, and I know you guys are working on building a working system.

I would argue that documenting the future state of the project, along with answers to questions such as these, would attract real-world projects and developer support earlier. I think many would like to develop on an open, secure, immutable system, but without knowing how their real-world problems (big data, data privacy, isolation, etc.) would be solved, they have to wait.

Thanks,
–jim

Ori · February 29, 2020, 9:16pm

Thanks Jim. Great input.

Generally: The docs are very early stage at the moment so while it’s not there yet, info like this will begin to appear over time. I agree much of this would be useful to developers.

Just re point 2: You could explore bundling a wasm binary with the front-end files and pulling it down to the browser for local execution. It could then have sandboxed access to the local machine, as with any browser-executed WebAssembly.
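
A rough sketch of what that could look like on the browser side, in TypeScript. This is only an illustration under assumptions, not anything DFINITY-specific: the names `ADD_WASM`, `instantiateBytes` and `loadFromFrontend` are made up for the example, and the embedded bytes are a tiny hand-assembled module exporting a single `add` function, standing in for a real analysis binary. Only the standard WebAssembly JavaScript API is used.

```typescript
// Hypothetical sketch: all names here are illustrative, not part of any DFINITY API.
// WebAssembly and fetch are read from globalThis so the sketch does not depend
// on DOM typings being configured.
const WA = (globalThis as any).WebAssembly;

// A minimal wasm module exporting add(i32, i32) -> i32, standing in for a
// real binary that would be shipped alongside the front-end files.
const ADD_WASM = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                     // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: local0 + local1
]);

type AddExports = { add: (a: number, b: number) => number };

// Synchronous instantiation; fine for small modules and for local testing.
function instantiateBytes(bytes: Uint8Array): AddExports {
  const compiled = new WA.Module(bytes);
  // The empty import object means the module gets no host capabilities at all.
  // It can only reach what the page explicitly passes in here.
  const instance = new WA.Instance(compiled, {});
  return instance.exports as AddExports;
}

// In a page, the binary would be fetched from wherever the front end is served.
// The URL is an assumption supplied by the caller.
async function loadFromFrontend(url: string): Promise<AddExports> {
  const response = await (globalThis as any).fetch(url);
  const bytes = new Uint8Array(await response.arrayBuffer());
  const { instance } = await WA.instantiate(bytes, {});
  return instance.exports as AddExports;
}

const { add } = instantiateBytes(ADD_WASM);
console.log(add(2, 3)); // 5
```

The import object is where the sandboxing shows up: a module instantiated with `{}` cannot touch files or the network, so any access to local data would have to be handed over by the page, for example bytes from a file the user picked.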

Ori · February 29, 2020, 9:26pm

Briefly on point 4: The data isn’t replicated identically across every node, it’s striped across the network; a lot of work has gone into optimising access and performance, but there aren’t any public details to share on this at the moment.

You will be able to run private chains too, and connect them to the public network, if that’s useful to you.

jimbo · March 1, 2020, 3:17am

Hi Ori,
Great - those are good options. I will definitely consider dfinity then. Right now I will spend some time (besides my day job) in design, so I will continue to follow the project during that process.
Thanks for your help.
–jim

Ori · March 1, 2020, 8:18am

No problem. Really good to have you here, it’s great to have these discussions.

Dunning · March 2, 2020, 3:10pm

Is there any more general information on data privacy?

Jimbo mentions that he/she understands that application data is encrypted. Is there any more information on Dfinity’s privacy model?

Ori · March 2, 2020, 9:05pm

Hey DK, just what the FAQ briefly mentions at the moment. There aren’t really any more public details on this yet.

cryptoschindler · September 23, 2020, 1:13pm

Cross-Linking this
