Secure Data Storage WG Agenda - Thu Jan 21st, 2021
Agenda
- IPR Reminder
- Introductions and Re-Introductions
- Replication Discussion Continued
- Issue Review
Notes
Introductions / Re-introductions
- Jace from Bloom re-introduced himself
Replication Discussion
- Orie gave a brief reflection on our prior discussion on replication, touching on topics such as:
  - filtered / partial replication: are we looking for EDVs to support syncing only a subset of information?
  - how this intersects with the AuthZ model we select
  - what role the client plays and what role the server plays, e.g., is it server controlled, client controlled, or somewhere in between, like client controlled but server facilitated?
- Adrian asked, "What would we do differently from other DBs like CouchDB?"
- Orie answered Adrian's question, saying that right now it is not defined. The likely solution will be similar, but quite unique constraints arise when the data is client-side encrypted.
- Adrian asked how an architecture like this protects against a honeypot-style approach.
- Daniel B observed that AuthZ is an important component in syncing that directly relates to the trust model assumed between clients and instances.
- Andreas stated that the real problem in syncing between instances is the model we assume around consistency, i.e., do we assume strong consistency?
- Dave L responded to Adrian's query by stating his assumption that the server would not have the ability to decrypt the documents; at most it might hold the zcaps required to make authorized calls to another instance.
- Michael asked whether the assumed replication model is multi-master or not, and how granular the sync gets.
- Chris asked whether, if we are going to adopt a model like CouchDB's replication, we want to invest in attempting to standardize it.
- Daniel B shared previous experiences with CouchDB, finding it incompatible with their requirements for hubs.
- Orie responded to Michael's question about partial sync by explaining how CouchDB works with map-reduce queries, citing npm as a popular project that leverages CouchDB. He also noted the distinct difficulties that are introduced when the data is client encrypted.
- Adrian expressed frustration at the introduction of hubs into the conversation, saying that he finds them orthogonal and a distraction from the discussion on sync and replication. He clarified his concern behind the honeypot question, saying that by encrypting we create a key management issue where clients need HSMs to interact with an EDV.
- Dmitri agreed that Adrian's general question around key management is a good one, but to an extent out of scope for this working group.
- Manu, in response to the question around standardizing a replication model like CouchDB's, said it is potentially a very difficult and involved task. He also asked whether anyone had implemented a model like this, and whether we could instead talk about a simpler replication model such as uni-directional replication.
- Zokama asked why CouchDB is the main focus of the discussion.
- Michael echoed the same sentiments as Zokama
- Tobias responded to Adrian's question, saying he does not expect clients to need an HSM to interact with an EDV. Instead, an EDV client will need to be able to perform crypto operations, but where the keys are sourced from will likely vary, e.g., an HSM, an SSM, or derivation from knowledge factors like a password.
- Orie asked to re-frame the discussion back into use-cases for replication. He also proposed that we initially constrain the conversation to uni-directional replication (the simple case) before moving on to the more complex derivatives.
- Dmitri brought the conversation back to some core properties: uni-directional vs bi-directional, partial vs full sync, and how the sync is performed. He also echoed Orie's ask to start simple with respect to sync and replication.
- Daniel said he has done a ton of research on sync and replication, wants to be sure we build a system that caters for the more complicated sync scenarios, and is committed to that cause.
- Manu echoed Dmitri's and Orie's ask to work on uni-directional replication initially, and asked us to work on more terse use cases that describe the business need.
- Adrian asked for the use-cases to be more granular and relatable than just backup and recovery
- Andreas pointed out there are other examples of P2P replication protocols that can be looked at for sources of inspiration.
- Michael asked what sort of change tracking is built into an EDV to date.
- Manu asked those using the terminology "master slave" to instead move towards language like "primary secondary", clarifying that he doesn't believe anyone on the call means anything bad when they use it; he just wants to drive the ecosystem towards more inclusive language.
- Dmitri responded to Michael's query by saying it is partially dependent on what use-cases we use to guide us.
Attendees
- Dmitri Z
- Tobias Looker
- Kaylia Young
- Daniel Buckner
- Manu Sporny
- Dave Longley
- Orie Steele
- Adrian Gropper
- Andreas Freund
- Chris Were
- Derek Trider
- George Aristy
- Jace Hensley
- Juan Caballero
- Michael Herman
- Michael Shea
- Nader Helmy
- Sze Wong
- Troy Ronda
- Zokama
Recording (Zoom)
Transcript (Otter.ai)