“Big data is everywhere! And you should be doing something with it; otherwise your value is diminished…”

Just read the ads and analyst reports.
I don’t know about you, but every time I hear the words “big data” and someone tries to explain to me that I should be doing something about it, I get a little depressed and frustrated.
As with “cloud”, big data has entered the vernacular in a manner that basically could mean anything to anyone.
So what is big data?
One definition is that big data is “high-volume, high-speed and diverse modes of information that require advanced analytical techniques to organise, interpret and process”.
It can also be defined as large data sets that have manifested themselves via the advent of sensor and sensor-based processing. These sensors range from smartphones, to PLCs, to software-derived data results.
So rather than big data being a process, or outcome, or panacea, it is actually the problem.
Why?
Well, we have all this data. We store it; we may get some value from it during its creation, and then what?
It’s about predictive analytics. Tell me something about what is going to happen so I can make a decision. How do I get enhanced insight?
Rather than just storing this data ad infinitum, we need to decide whether it’s of value and do something with it.
In projects I have been involved with around machine data, we see that the real value comes from about one percent of the data collected.
So 99 percent has no value? It depends on how you define value. If it’s just control data – on or off – then I would surmise that its value is limited.
And there’s the rub.
Some interesting statistics around big data:
- 90 percent of the world’s data was created in the last two years,
- More data was created in 2012 than in the previous 5000 years,
- By 2020, 40 percent of the world’s data will come from sensors,
- Walmart collects 2.5 petabytes of data a day from customer transactions.
Are you scared yet? Are you ready? Do you want to be ready?
So how do we deal with this data deluge?
Let’s cut through the marketing hype and get to the real issues.
- Firstly, the solution is not about software. Put down that brochure that espouses knowledge and problem-free analytics.
- Secondly, the solution is about understanding your data. Not at the minutiae, but understanding it in a way that allows you to think more holistically about the things that you know are not being answered.
- Thirdly, it’s about trial and error. The data we collect can and does hide gems of knowledge and understanding. This has context and people have context. Match the data context to the people context and you have value. This value may be fleeting, but it’s value nonetheless.
- Fourthly, it’s about timing. Certain types of data have value that is time specific. Its ability to provide decision support depends on its visibility. Understand the strata of your decision processes to eke out the right data at the right time.
- And lastly, collecting lots of data doesn’t mean you have insight. Insight comes from context, analysis and understanding. Get rid of data you know is of little value. Reduce your surface area and focus.
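To make that last point concrete, here is a minimal sketch of one way to reduce your surface area: drop signals that never take more than two distinct values, such as on/off control data. The signal names, the sample readings and the `min_distinct` threshold are all made up for illustration; what counts as low value in your plant or business is a judgment only you and your people can make.

```python
from typing import Dict, List


def drop_low_value_signals(
    readings: Dict[str, List[float]], min_distinct: int = 3
) -> Dict[str, List[float]]:
    """Keep only signals that show at least `min_distinct` distinct values.

    Assumption for illustration: a signal that only ever takes one or two
    values (constant, or on/off control data) carries little analytical value.
    """
    return {
        name: values
        for name, values in readings.items()
        if len(set(values)) >= min_distinct
    }


# Hypothetical machine data: two control-style signals and one measurement.
readings = {
    "pump_on": [0, 1, 1, 0, 1],
    "motor_temp_c": [61.2, 61.9, 63.4, 66.0, 70.1],
    "line_id": [7, 7, 7, 7, 7],
}

kept = drop_low_value_signals(readings)
print(sorted(kept))
```

A rule this crude will sometimes throw away something useful, which is where the trial and error comes in: adjust the rule, look at what survives, and ask whether it answers a question someone actually has.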
So your big data journey has already started. Just forget about the buzzwords and let your instinct, experience, intuition and creativity lead you down the path of enlightenment.
And don’t ask yourself what you think you know; ask the data what you don’t know.