In the ‘shadows’ of big data

February 8, 2013 · Venture Capital

I’ve just read Betaworks’ 2012 Shareholder Letter, written by CEO John Borthwick. At 43 pages it’s a long read, and it’s pretty dense too. But there’s an amazing amount of great insight there – from developments in the investment market to interesting startup themes.

One of the themes John talks about is drawing value from the ‘shadows’ of big data – i.e. gleaning insights from big data sets that bear little relation to the reason the data was collected in the first place. It is easy to think of some obvious examples – including traffic data from people using Google Maps (see picture above) and opening hours for restaurants and bars from Foursquare check-in data – but the real opportunity will be in finding the much larger number of smaller pockets of value. Call it making big data useful, if you will. This applies to both consumer and enterprise, and I think we will see startups succeed both in building the enabling infrastructure and tools, and in building the end user applications.

I’m going to close with a fun example of drawing insight from the shadows of data, also from the Betaworks shareholder letter. This time the original dataset was links shared on Bitly (quote from Hilary Mason, Bitly Chief Scientist):

“People in Brooklyn are more likely to read about food than people in any other part of NYC. People in Manhattan are more likely to read about business. People in Queens are more likely to read about sports.

The top Wikipedia article of 2012 was "Lunch"

People who read about fashion read about physics, but people who read about chemistry don’t read about anything else (which is behavior we see in two other categories — religion and adult).

Sports was the one topic for which people actually only want to consume fresh information. Every other topic was roughly equal, with religion being the slowest.”

That made me laugh on a number of levels. I know people think with their stomachs, but it’s amazing that ‘lunch’ is the most popular article in the world’s most read encyclopedia, and it’s amusing to read about chemists… (not that I draw any conclusions).

  • Is their analysis based on the sharer of the link or the person who clicks on the link? My suspicion is that they are looking at the sharer, so they are saying that people who share chemistry links only share chemistry links. This seems to make sense, since chemistry is quite an inaccessible subject for non-chemists. Assuming most of the links are shared on Twitter, this would mean that chemistry news on Twitter is mostly shared by specialist chemistry sources (e.g. Chemists Weekly). On the other hand, fashion and physics are much more mainstream (physics as in stars and similar “wow” parts of physics), so they will be shared by mainstream sources.

    So in conclusion I propose that chemists don’t only read about chemistry. I think chemistry sources are just more focused on only that subject.

    A recovering chemist

  • ☺ I don’t know whether the basis was clicks or shares, but I would be surprised if chemists were really very different to other scientists in terms of the breadth of their interests. I mostly posted the example because it was amusingly provocative.
