Mining personal data – the next big frontier


This week Eric Schmidt of Google said he would help us answer questions like “What am I going to do tomorrow?” I applaud the sentiment here, I really do, but I don’t think Eric is the right guy for the job, and he certainly isn’t going about it the right way.

A lot of people have a bad reaction when Google does things like this. “Does Eric Schmidt want to sniff the armpits of my mind?” is a very funny example, and indeed this post was in part inspired by some friends saying at dinner last night how much Schmidt’s arrogance pissed them off.

Underlying all this are some very real privacy concerns which I will come back to, but first I want to focus on how useful these sorts of services could be.

Firstly, at a consumer level, services that automatically personalise based on my personal data (click stream, search history, anything else I tell them) will simply be better for me. Services that do this in a small way, like Last.fm and our portfolio company Lovefilm, are massively popular, so imagine the power of a cross-media recommendation engine.
Secondly (and this is why Google is interested), you can charge MUCH more for personalised ads.

In my somewhat utopian view of the world I hope we will end up in a virtuous circle where we offer more personal information, our favourite sites make more money, our favourite sites make themselves better, we give them more data… and so it goes on.

So, back to privacy. The problem is that in order to get good data about me I need to let you monitor everything I do. I need to trust you completely in order to do that, which probably means that the guy who collects and manages the data for me needs to work for me, i.e. I am his customer. If it is someone who makes money directly from having that personal information, like, say, Google, then there is too much conflict of interest. Plus they will only be able to exploit my data on their own properties, which is likely to be only a small fraction of my surfing activity.

Phorm (formerly known as 121 Media and now out of stealth mode) is making this play here in the UK via relationships with ISPs. This has always struck me as a smart route. Another thought is that maybe it ends up in big data companies that are a bit like today’s credit-checking bureaus. The privacy issues mean that this space will be heavily regulated, and that seems to lend itself to this model.

The prize here is big, really big, but it is matched by the challenges. The most obvious ones are: persuading consumers to let you capture their data, then finding a good technical way of doing it, then finding a way to let websites take advantage of the data to personalise their sites and their ads, and then charging for it.

This is not a new idea. I have been talking about it since I started in VC back in 1999, and there have been numerous attempts to make it happen. Microsoft, Google and Yahoo! all have this as their strategy, and the whole behavioural targeting movement, including Wunderloop, is about getting value out of personal data. The reason we haven’t gotten very far down the track up to now is that the challenges I listed above are very hard to overcome.

Now I think times might be a-changin’.