Shared data services – the next frontier?

I have an interest in cloud computing from two angles:

  1. software infrastructure via our investment in Zeus (which is making some great headway in this space); and
  2. what it means for web apps.

So I took the time yesterday to read this long but very interesting article on cloud computing by Paul Miller, which was posted on ReadWriteWeb back in December.

My key takeaway was that the next wave of value creation on the web could well come from efficiencies wrung from sharing data.

Part of this is an old idea – we have all known for a long time that the value of most web services resides in their data rather than anywhere else.  Facebook’s software could be recreated by Google or Microsoft or anyone else with enough money, but their data, now that is a different story.  Moreover, luminaries like Tim O’Reilly have been talking about this pretty clearly for 3-4 years (for more on this, check out point 3 from Tim’s What is web2.0 post from Sept 2005, do a search on ‘data is the Intel inside’, or look at what I have written about it before).  Going back further still, Tim Berners-Lee has been making this point under the moniker ‘semantic web’ since the early days of the web.

There were, however, two things that stood out as new for me.

Firstly, Miller paints a beautifully simple picture of the productivity gains to be had from shared data as the logical next frontier, following on the heels of network infrastructure hardware (Sun, Cisco), network infrastructure software (BEA, Oracle), and web apps (Google, Amazon).  He puts it thus:

Just as ‘we’ used to duplicate and under-utilize computational resources, so we do something very similar with our data. We expensively enter and re-enter the same facts, over and over again. We over-engineer data capture forms and schemas, making collection exorbitantly expensive, whilst often appearing to do all we can to limit opportunities for re-use. Under the all-too-easy banners of ‘security’ and ‘privacy’ we secure individual data stores and fail to exploit connections with other sources, whether inside or outside the enterprise.

Secondly, this brings out the value to be created by sharing data rather than by controlling it.  There is a fantastic example of this unfolding before our eyes in the form of Twitter – as Tim O’Reilly pointed out, Twitter would have a fraction of its current impact if it hadn’t made itself and its data so open to third party apps (my favourite of which is Tweetdeck, btw).