Data portability, privacy and personal data stores

Marshall Kirkpatrick wrote an interesting post yesterday on ReadWriteWeb entitled “Towards a Value-Added User Data Economy”. He applies network theory to data portability to show that all companies will be better off if they all allow data to be ported in and out. Essentially, each social application adds value to the data, making the overall experience richer on every site. To adopt this strategy a site needs confidence in the quality of its offering and in its ability to keep innovating, but the alternative is to try to lock the user in, and we all know where that ends up – eventually.

The “eventually” is important here – it took a long time for AOL’s walled garden to fail, and their investors and management made a lot of money in the meantime.

Marshall also discusses the privacy issues created by data portability. Until I read his post my thoughts on this topic had been limited to the simple point that porting data from one application to another creates more copies in more places, thereby increasing privacy risks. Marshall makes the additional point that, because these applications are social, our personal data is inextricably bound up with that of our friends – thereby increasing the complexity of the problem.

All of which makes me think of personal data stores – the idea that we store our core personal data in a single place and allow services to access it on a permissioned basis. The sites that access that data and add value to it might store the derivative data they create, but the core data would be in one place. Replacing the many-to-many relationships of multiple social apps talking to each other with a hub-and-spoke architecture like this would give the user better control over their private data whilst maintaining the network benefits that data portability offers.

I feel an example might aid understanding.

Let’s say I’m a music fan – the core data would be the music I listen to – maybe scrobbled by LastFM, or scanned from my hard drive. That should live in my personal data store and be accessed by derivative services that might generate recommendations. The recommendations wouldn’t constitute core data and could live in the application that generated them.
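
To make the music example concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `PersonalDataStore` and `Recommender` classes, the `listening_history` category, the sample artists and the toy recommendation rule are all hypothetical and do not describe any existing service. It only shows the shape of the idea: core data lives in the store, access is granted per service and per category, and the derivative recommendations stay with the service that generated them.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalDataStore:
    """Hypothetical hub holding a user's core data and per-service grants."""
    core: dict[str, list[str]] = field(default_factory=dict)
    grants: dict[str, set[str]] = field(default_factory=dict)

    def add(self, category: str, item: str) -> None:
        # Core data (e.g. a listening history) is written to the store.
        self.core.setdefault(category, []).append(item)

    def grant(self, service: str, category: str) -> None:
        # The user permits one service to read one category of data.
        self.grants.setdefault(service, set()).add(category)

    def revoke(self, service: str, category: str) -> None:
        self.grants.get(service, set()).discard(category)

    def read(self, service: str, category: str) -> list[str]:
        # Permission is checked on every read, so a revoked grant
        # blocks the very next request.
        if category not in self.grants.get(service, set()):
            raise PermissionError(f"{service} may not read {category}")
        return list(self.core.get(category, []))


class Recommender:
    """Hypothetical derivative service: it stores its own output only."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.recommendations: list[str] = []

    def refresh(self, store: PersonalDataStore) -> None:
        history = store.read(self.name, "listening_history")
        # Toy rule standing in for a real recommendation engine.
        self.recommendations = [f"more like {a}" for a in sorted(set(history))]


store = PersonalDataStore()
for artist in ["Radiohead", "Portishead", "Radiohead"]:
    store.add("listening_history", artist)

recommender = Recommender("recommender")
store.grant("recommender", "listening_history")
recommender.refresh(store)
print(recommender.recommendations)
# ['more like Portishead', 'more like Radiohead']
```

Calling `store.revoke("recommender", "listening_history")` makes the next `read` raise `PermissionError`, which is the kind of user control the hub is meant to provide.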

The personal data store might be an existing service like Facebook (or even LastFM) or a new service created specifically to perform this function. And different people might choose to use different applications as their hub.

This model of a personal data store, where the user allows different services to access the data on a fine-grained, permissioned basis, has a lot in common with the VRM vision of how advertising might evolve.

I’m attracted to the conceptual elegance of this view of the future, as well as the efficiencies and benefits I describe above. I think we will get there eventually, but it may be we don’t take the straightest path.

  • Dave Brown

    Nic, in your “what I listen to” example I think LastFM and others need to have a copy of your data so that they can do clever stuff like data-mining comparisons between you and others to find more recommendations for you – it wouldn’t work if they had to retrieve all the data every time from each individual’s data store. If you are a site manipulating this data or doing some sort of transaction then it will get very complicated if you don’t store the personal data but rely on an external data store (e.g. your shipping address can change over time but the merchant needs to know what was shipped where in case of issues).

  • nic

    Good point Dave. For the model to work they would have to copy the data and then synch up periodically, or something similar.
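
The copy-and-sync approach mentioned in this exchange could look roughly like the following Python sketch. It rests on assumptions that are not in the discussion: the store is imagined to expose an append-only change log, the service keeps a cursor into that log, and the `Store` and `ServiceCopy` names are hypothetical. Edits and deletions are not handled.

```python
class Store:
    """Hypothetical personal data store with an append-only change log."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def append(self, track: str) -> None:
        self.log.append(track)

    def changes_since(self, cursor: int) -> list[str]:
        # Return only the entries the caller has not seen yet.
        return self.log[cursor:]


class ServiceCopy:
    """Hypothetical service-side copy, refreshed by periodic syncs."""

    def __init__(self) -> None:
        self.tracks: list[str] = []
        self.cursor = 0

    def sync(self, store: Store) -> int:
        # Pull the delta rather than the whole history each time.
        new = store.changes_since(self.cursor)
        self.tracks.extend(new)
        self.cursor += len(new)
        return len(new)


store = Store()
store.append("track A")
store.append("track B")

copy = ServiceCopy()
first = copy.sync(store)   # pulls both existing tracks
store.append("track C")
second = copy.sync(store)  # pulls only the new track
third = copy.sync(store)   # nothing new to pull
```

The service can run its comparisons against `copy.tracks` locally, while each sync transfers only what has changed since the last one.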

  • Dave, Nic,

    I discussed this topic with Joe Andrieu, and he came up with some great points with regard to the rights to read, write, etc.

    “… Here are a few rights that users might want to be able to secure for their data, as well as some privileges they could provide to vendors:

    Reciprocity–That vendors who access a particular type of data also agree to reciprocally provide updates to that data. For example, I might let Amazon access my media history records if they agree to update it with my past and future media purchases at Amazon.

    Non-propagation–No further distribution of the data beyond the specific services authorized. No reselling to third-parties. No re-use by other divisions.

    Non-persistence–No retention of the data beyond the session of the current transaction. For example, an emergency room physician can access my personal medical history while I’m under his or her care, but he or she can’t store that data on any internal systems.

    Anonymous Persistence–Data can be retained, but only if it is suitably anonymized and disassociated from the individual user.

    Editable Persistence–Data may be retained by the vendor, but it must be editable and deletable by the user.

    Anonymized Analytic Rights–Vendor has the right to query the PD at a later point for business or operational analysis, as long as that analysis ensures anonymity after the fact.”

    So one more business opportunity for somebody to tackle this … 😉

    Enjoy the weekend
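
One way to picture the rights listed in this comment is as combinable terms attached to each grant of access. The Python sketch below is illustrative only: the `Right` and `Grant` names, the deny-by-default rule for retention, and the particular combinations of terms given to each vendor are assumptions, not something the quoted list specifies.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Right(Flag):
    """The terms from the quoted list, as combinable flags (names illustrative)."""
    RECIPROCITY = auto()
    NON_PROPAGATION = auto()
    NON_PERSISTENCE = auto()
    ANONYMOUS_PERSISTENCE = auto()
    EDITABLE_PERSISTENCE = auto()
    ANONYMIZED_ANALYTICS = auto()


@dataclass(frozen=True)
class Grant:
    vendor: str
    data_type: str
    terms: Right

    def may_retain(self, anonymized: bool) -> bool:
        # Non-persistence rules out retention beyond the session.
        if Right.NON_PERSISTENCE in self.terms:
            return False
        # Editable persistence: the vendor may keep the data
        # (it must then stay editable and deletable by the user).
        if Right.EDITABLE_PERSISTENCE in self.terms:
            return True
        # Anonymous persistence: only anonymized data may be kept.
        if Right.ANONYMOUS_PERSISTENCE in self.terms:
            return anonymized
        # Assumption: with no persistence term granted, deny by default.
        return False

    def may_share(self) -> bool:
        # Non-propagation forbids passing the data on to anyone else.
        return Right.NON_PROPAGATION not in self.terms


# The emergency-room example: access during care, no storage, no sharing.
er = Grant("emergency room", "medical history",
           Right.NON_PERSISTENCE | Right.NON_PROPAGATION)

# The Amazon example: reciprocity comes from the list; pairing it with
# editable persistence is an assumption made for this sketch.
amazon = Grant("Amazon", "media history",
               Right.RECIPROCITY | Right.EDITABLE_PERSISTENCE)
```

A personal data store could consult `may_retain` and `may_share` before releasing data, although enforcing such terms once the data has left the store is a contractual question rather than a technical one.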

  • My company has developed a fast-growing personal data store to which we are considering adding controlled-access features (and others) like those discussed above. It’s called Evernote and has been in invitation-only beta since late February: http://www.evernote.com. If anyone interested in “Data portability, privacy and personal data stores” wants to try it out and give me feedback on features, send an email request with the subject “personal data store” to [email protected] and I’ll send you an invitation.

  • I am looking forward to the time when these ‘system/platform’ issues are resolved and we can move onto the more exciting stuff – in Chris Saad’s words: “…then it’s a competition to see which vendors can add the most value to the free flow of data”

    david – looking forward to seeing how evernote will contribute to this – will email you shortly.

  • iainhenderson

    Hi Nick – yes, I'd say the personal data store is absolutely critical to VRM – part of the plumbing; and VRM won't really do much without it other than some tactical, point-making stuff.

    Personally, I don't think the path from here to there will waver too much – the smartest organisations have already figured out that the current modus operandi around 'we gather the data' is not optimal and will engage with personal data stores as soon as they emerge.

    Mydex CIC is working on that, as are others in the VRM space – I'd hope to see something live in the market within 6 to 9 months from now (Sept 2009).

    Cheers

    Iain
