Big Data is a term that is used all too frequently, sometimes with limited understanding of what it actually means. Rewind a few years and Big Data took the world of BI and analytics by storm. There was then a blind frenzy to come up with a proper definition so that people could understand exactly what was meant. Someone had an attachment to the letter V and so the overused ‘volume, velocity and variety’ came to life. This was then extended with every other v-word in the dictionary: ‘veracity, visualization, value, variability’ and so on.
Don’t get me wrong; it is great to have an accessible understanding of the term Big Data, but simplifying and moulding the definition so we can all use words beginning with V takes us back to square one and muddies that understanding. More than that, in my opinion the whole thing set off in the wrong direction at the very start. Many put the focus on just the technical side of data, whether it be its size, structure or how often it is updated. This focus came with the advent of new technologies such as Hadoop or NoSQL systems, where new technical advances meant that data was processed faster (“velocity”), more of it could be stored (“volume”) and structured or unstructured versions could be captured and analyzed (“variety”).
As a result, people started to believe that you could only refer to a Big Data use case if all the described v-words were met, or at least the initial three. And that’s where the shortcomings come to light. What about situations where people store and analyze petabytes of just structured data? Or if they analyze vast amounts of unstructured data but just run nightly bulk load jobs? Are they not cases of Big Data, too?
And so, back to my point: The Big Data discussion focused too much on the technical side of things. It was only later when additional v-words were introduced into the mix that a broader discussion could take place that included non-technical aspects such as “value.” And yet, what on earth does value have in common with data volumes? Does more data equal more value? And where is the connection between velocity and visualization? In fact, is there any real correlation between all these dimensions at all? One that conjures up the perfect Big Data scenario? I doubt it. And always have done.
I believe that the market meant something totally different; it was talking about “digitalization.” These exciting Big Data use cases were different because they were brand-new; they were things you had not heard of before. We quickly saw the emergence of new technology trends: mobile devices, geolocation and traffic information generated by millions of GPS-tracking devices, products that included sensors to measure all kinds of interesting things, logistics and production chains where everything could suddenly be tracked accurately. All these new approaches threw up the need for totally new applications for the world of data. Police departments started to predict the probability of crimes, companies started to offer real-time road traffic monitoring, new value-add services such as predictive maintenance came about. In short, completely new data-driven business models were born.
Today, we live in an age where unstructured e-mails and photos are stored, processed and analyzed. As a result, there are data lakes that hold vast amounts of bits and bytes, with mobile devices and sensors guaranteeing an almost seamless, uninterrupted flow of data to be captured and processed. But while this is true, ask yourself a simple question: whenever you read an interesting Big Data use case, is it not true that it has to do with the new digitalized era? You’ll soon agree that it does. It is for that reason that I hope we will stop talking about Big Data using the Vs, for no matter how interesting they may sound, they are uncorrelated and too technical. Instead, let’s find a far more suitable and simpler definition for what we have been talking about over the last few years. My personal vote is for D and digitalization.
Indeed, perhaps the time has come to bury the expression “Big Data” for good, as Mark Torr stated in his recent post “Is Big Data Dead?”