Every second, the Large Hadron Collider (LHC) near Geneva, Switzerland, produces 40 terabytes of data. That’s more data than can currently be stored and analyzed. The scientists working on the project are forced to collect just a slim subset of the data, and they reluctantly ignore the rest. While this example is at the top end of the data deluge that our increasingly digitized world is creating, we can all relate to other, closer-to-home examples: every minute, 20 hours of video are uploaded to the popular video-sharing site YouTube. On Facebook, the biggest and most popular social networking site, 2 billion photos are uploaded each month. Whether it is our text messages (more are sent each day than there are people on earth), or credit card transactions that reveal the intimate details of our purchasing behaviors, or recorded search entries in search engines that tell us so much about the human experience, or the cameras that photograph and video us as we go about our daily activities, the quantity of data that humanity is collecting and storing is staggering. And it’s increasing exponentially. We are moving from data scarcity to data abundance and disrupting our conventional view of economics. We call it big data.
You might ask: why is big data happening, and what are the implications?
This data deluge is being driven by a number of factors. First, we are increasingly migrating from an analog to a digital world. We don’t use the gelatin silver process for photos anymore; we store them as bits in an ethereal cloud. The same applies to such things as movies, music, newspapers, books, our bank records, our bills, and our airline tickets. It’s a very long list. Second, it’s become cheap to store digital data. At the time of writing, the estimated cost of a gigabyte of storage is 10 cents. The hard drive I am writing this blog on holds 100 GB: a total cost of $10 for a drive that contains every digital photo I have taken and every song I have purchased, and there is still plenty of room left over. Finally, the tools for us to produce content, store it, and share it have become highly accessible and, quite often, free! Today, any one of us can be a publisher or a retailer who accepts credit cards. This creates enormous amounts of data.
Collecting and sharing all this data clearly presents a number of challenges and, in my view, incredible innovation opportunities. This is not last year’s data-mining. This is data-mining on steroids! While I won’t dwell on the challenges, they are clear: increasing transparency in our lives potentially results in less privacy. More information and equal access to it mean that the value of some intellectual property is trending downward. Easy access to content and communication tools means even the bad guys have a voice on the global stage. The risks and challenges are many, but I want to turn our attention to the innovation opportunities.
To track the spread of the H1N1 flu virus, the Centers for Disease Control and Prevention (CDC) was able to leverage the search entries of a popular search engine. People search for things like symptoms when they are sick, and the search engine knows roughly where each search comes from. Large volumes of these searches can therefore be turned into a beautiful visual of how a disease is spreading, both over time and in what direction. It’s the invisible knowledge that the data deluge unveils that makes me excited about this area.
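To make the idea concrete, here is a minimal Python sketch of the counting step behind that kind of map. It is only an illustration, not how the CDC or any search engine actually does it: the log records, the region names, and the `SYMPTOM_TERMS` list are all made up for the example, and `weekly_symptom_counts` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical, made-up search-log records: (week, region, query text).
SEARCH_LOG = [
    ("week 1", "Texas", "flu symptoms"),
    ("week 1", "Texas", "high fever and cough"),
    ("week 1", "Ohio", "cheap flights"),
    ("week 2", "Texas", "flu shot near me"),
    ("week 2", "Ohio", "fever remedies"),
    ("week 2", "Ohio", "flu symptoms in children"),
]

# Assumed list of words that hint the searcher may be sick.
SYMPTOM_TERMS = {"flu", "fever", "cough"}


def weekly_symptom_counts(log, terms=SYMPTOM_TERMS):
    """Count symptom-related searches per (week, region) pair."""
    counts = Counter()
    for week, region, query in log:
        # A query counts once if it contains any symptom word.
        if terms & set(query.lower().split()):
            counts[(week, region)] += 1
    return counts


counts = weekly_symptom_counts(SEARCH_LOG)
for (week, region), total in sorted(counts.items()):
    print(week, region, total)
```

Plot those per-region counts week by week on a map and you get the kind of picture described above. A real system would need far more care than this, for instance adjusting for how many people search in each region in the first place.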
I’ve identified five net new possibilities that big data presents:
- Answer formerly unanswerable questions. With so much data being collected in so many different ways, it is now possible to ask and get answers to questions we just couldn’t before. For example, if we analyze the social connections between employees that their digital footprints create, might we be able to identify experts in specific knowledge domains? Sometimes the questions will be deliberate, but I’ll bet we’ll also stumble onto answers to insightful questions we never intended to ask.
- Formulate new questions. Now that we know certain data is being collected, what new questions could we ask? Imagine how empowering that could be to C-suite executives!
- More informed, evidence-based decision making. How many times have you wished you had more detail on a subject in order to make a better decision? Better data can mean better and timelier decisions. Big data opens up a whole new opportunity for competitive advantage.
- Democratization of data. We no longer need to build systems that silo critical data and create an enterprise digital divide. Aggregated data in volume can easily be made available so everyone can benefit. It can be repurposed, resulting in new value that goes way beyond its initial intent. For example, governments all over the world are embracing open-data policies. This gives the electorate unprecedented access to local, state, and federal insight. Local community groups are using the data to combat crime and to discover societal inequalities (is this neighborhood getting unfair access to a resource?). Give data to your employees and they may tell you something about your organization that reduces costs or builds your next billion-dollar product or service.
- Visualization of invisible knowledge. Big data creates amazing and valuable visualizations. And these visualizations unveil secrets that were previously hidden. For example, tag clouds (groups of words that become larger the more each word is used) tell us what people are talking about on social networks. A map of the world superimposed with real-time stock trading data flows tells us a lot about global commerce.
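As a rough illustration of the tag cloud idea, here is a minimal Python sketch that turns word counts into font sizes. The function name `tag_cloud_sizes`, the size range of 10 to 40 points, and the straight-line scaling are my own assumptions for the example; real tag cloud tools differ in the details and usually filter out common words like "the" first.

```python
import re
from collections import Counter


def tag_cloud_sizes(text, min_size=10, max_size=40, top_n=5):
    """Map the top_n most frequent words to font sizes.

    The most-used word gets max_size, the least-used of the
    selected words gets min_size, and the rest fall in between.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}

    highest = counts[0][1]
    lowest = counts[-1][1]
    span = highest - lowest

    sizes = {}
    for word, count in counts:
        # If every word appears equally often, show them all at max_size.
        scale = (count - lowest) / span if span else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


sizes = tag_cloud_sizes("data data data cloud cloud big")
print(sizes)  # {'data': 40, 'cloud': 25, 'big': 10}
```

Feed it a day's worth of status updates instead of a six-word string and the biggest words on the screen are what people are talking about.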
Of course, I’m only skimming the surface here. What I’m trying to convey in this introductory blog on the subject is that big data is a big deal. Moreover, rather than viewing it as a threat, as many folks will, we should see it as an opportunity for incredible innovation. Big data will bring entirely new value propositions, and it will force the reinvention of entire industries and business models.
In my view, big data is the next big thing.