I have been hearing and thinking a lot about personal data stores recently. Three small examples: personal data stores are at the heart of VRM, which I have been blogging about recently; yesterday I was learning about Y Combinator startup WebMynd, which is a clickstream history play (if you are reading, please get in touch); and this morning I have been reading about how personal data stores mean traditional search is dead:

Consider how much information you voluntarily provide on your Facebook profile. Now imagine if you could combine that with your Netflix renting and Amazon buying habits. Then throw in the suggestions of your friends and the pages you visit the most often. All those various sources of information about you are currently stored in different locations—on your computer’s browser history, on your Facebook page, on the servers for Netflix and Amazon—but just imagine how accurate a search could be if every time you had a query, the mass of data about you that exists on the Internet could inform the results.

Because everything we do on the web leaves a trail, lots of information is being generated. All these ideas are variations on the theme of capturing that data in one place and then using it as the basis of a service. This concept is not new to technology – CRM was all about collecting customer data in one place and using that as the basis to revolutionise customer service, the whole data warehousing and business intelligence space is about collecting and mining enterprise data more generally, and you can even look at Google’s index of the web as another example.

The challenge for consumer-oriented apps in this market is finding a way to make the data capture painless and to deliver benefits while the data store is still small. This contrasts with the enterprise examples I gave above, where CIOs, convinced of the long-term benefits, invested considerable amounts of time and money in the expectation of future payback. That won’t happen at the consumer level.

As WebMynd clearly understand, the way through this challenge is to build a great service which doesn’t depend on the data store to start with, but generates the data as a by-product. Then the service can start getting better once enough information has been captured.