Friday, April 16, 2010

Economist: Data, too much information and the Yottabyte

I have been saying for three years now that the biggest challenge we have to face in the online travel industry is "too much information". Now, thanks to the Economist, we can actually start to see how much is too much.

Once or twice a month the Economist publishes a special report on a country or subject. A few weeks back they published Data, data everywhere, a supplement devoted to the amount of information and data swirling around us. Let me jump straight to an extract from the punchline of the article:
"According to one estimate, mankind created 150 exabytes (billion gigabytes) of data in 2005. This year, it will create 1,200 exabytes. Merely keeping up with this flood, and storing the bits that might be useful, is difficult enough. Analysing it, to spot patterns and extract useful information, is harder still."
Not all of this is online, but according to Cisco, 667 exabytes of data will be flowing over the internet by 2013.
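To get a feel for how fast that is, here is a rough back-of-the-envelope sketch in Python of the growth implied by the two figures in the quote. It assumes "this year" means 2010 (the year of this post) and that growth was smooth from year to year, which the quote does not actually say. The names in it are my own, for illustration only.

```python
# Figures from the Economist quote; treating "this year" as 2010 is an assumption.
DATA_2005_EB = 150      # exabytes created in 2005
DATA_2010_EB = 1_200    # exabytes expected to be created in 2010
YEARS = 2010 - 2005

# Overall multiple between the two years.
growth_multiple = DATA_2010_EB / DATA_2005_EB

# Compound annual growth rate, assuming smooth year-on-year growth.
annual_growth = growth_multiple ** (1 / YEARS) - 1

print(f"Growth multiple over {YEARS} years: {growth_multiple:.0f}x")
print(f"Implied annual growth: {annual_growth:.1%}")
```

In other words, an eightfold jump in five years works out to the amount of new data growing by roughly half again every year.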

To put an exabyte into context, storing 1 exabyte of data would take 15.6 million top-of-the-range 64GB iPads. I struggle to think how we can capture, digest, store, manage, secure and use that amount of data.
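For anyone who wants to check the iPad sum, here is a minimal sketch of the arithmetic. It assumes decimal units (1 exabyte = 10^18 bytes, 64GB = 64 x 10^9 bytes) and ignores the space the iPad itself uses up; the function name is hypothetical.

```python
EXABYTE = 10**18            # bytes, decimal definition (an assumption)
IPAD_CAPACITY = 64 * 10**9  # bytes in a 64GB iPad, decimal definition

def ipads_needed(total_bytes, device_bytes=IPAD_CAPACITY):
    """Number of devices needed to hold total_bytes, rounded up."""
    return -(-total_bytes // device_bytes)  # ceiling division on integers

ipads_per_exabyte = ipads_needed(EXABYTE)
ipads_for_2010 = ipads_needed(1_200 * EXABYTE)  # the 1,200 exabytes in the quote

print(f"iPads per exabyte: {ipads_per_exabyte:,}")  # 15,625,000
print(f"iPads for 1,200 exabytes: {ipads_for_2010:,}")
```

So the 15.6 million figure holds up, and the 1,200 exabytes expected this year would need well over 18 billion of them.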

The Economist gave three interesting snapshots of companies trying to deal with this amount of data:
  • Facebook: currently storing more than 40 billion photos;
  • Wal-Mart: processing 1 million transactions per hour; and
  • Oracle, IBM, Microsoft and SAP: have spent more than $156 billion on buying software firms specialising in data management and analytics.
The official definition of information is data processed into a timely, relevant and accurate form. An IBM survey reported by the Economist found that half of the managers quizzed did not trust the information on which they had to base decisions. We have more data than we know what to do with, but we don't trust it to help us come to the right outcome. The data flood's first consequence is to stifle our ability to turn data into information.

From all this it is clear that we need to learn some new words. Screw the giga, tera, peta and even exabyte. Time to introduce you to the Zettabyte (2 to the power of 70 bytes, or roughly 1,000 exabytes) and the Yottabyte (2 to the power of 80 bytes, or roughly 1,000 Zettabytes). Though you will not need to worry about the Yotta just yet. Even the Economist admits that the Yotta is currently not just too much information but "too big to imagine".
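A note on the "2 to the power of" figures: powers of two and round thousands do not quite line up. A short sketch of the difference, assuming the usual convention that each prefix is either 1,000 times (decimal) or 1,024 times (binary) the one before; the function and variable names are mine.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]

def unit_table():
    """Return (prefix, decimal bytes, binary bytes, binary/decimal ratio) rows."""
    rows = []
    for step, name in enumerate(PREFIXES, start=1):
        decimal = 10 ** (3 * step)   # e.g. exa = 10^18
        binary = 2 ** (10 * step)    # e.g. exa = 2^60
        rows.append((name, decimal, binary, binary / decimal))
    return rows

table = {name: (dec, binv, ratio) for name, dec, binv, ratio in unit_table()}

for name, (dec, binv, ratio) in table.items():
    print(f"{name + 'byte':<10} binary is {ratio:.3f}x the decimal size")
```

By the binary definition a Zettabyte is 1,024 exabytes rather than exactly 1,000, and the gap between the two conventions widens to about 21% by the time you reach the Yottabyte.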

The Data special report is a great read - check it out.

PS if you want to read more about what exciting things I think we should be doing with data, check out my series of posts on my concept of EveryYou.

thanks to J.Kleyn for the photo via flickr


Erez Armoza said...

I think the Economist article has failed to mention one major source of the overwhelming amount of data, which is social media. The number of blog posts, tweets, etc. about any subject is overwhelming.
Mining that information is very difficult, and the amount of time business people are spending on following their industry and their competitors may prove to be spiraling out of control.

I've been reading your blog for a while, but since this is the first time I'm responding, I guess this is a good opportunity to compliment you on a great blog.

Erez Armoza

Tim Hughes said...

@Erez - thanks for the comment and feedback. I agree with you. It is relatively easy now for a customer care agent to track most of the mentions of a company on Twitter, Facebook etc. It is only a matter of time before we can no longer manually track the quantity of data on social networks.

Anonymous said...

Tim, the numbers don't seem to add up ...

- 150 exabytes in 2005.
- 1,200 exabytes in 2010.
- according to Cisco, by 2013 667 exabytes of data will be flowing over the internet.

Tim Hughes said...

@Anon - the difference is that one is the amount of data in total (the bigger number) and the other is the amount of data flowing over the internet (the smaller number).