Big Data: not unprecedented; but not bunk, either

Wednesday July 23rd, 2014

Larry Ellison, Oracle’s flamboyant CEO, once remarked that “the computer industry is the only industry that is more fashion-driven than women’s apparel”.  The industry’s current favourite buzzword – “Big Data” – is so hyped that it has crossed over from the technology lexicon and entered the public consciousness via mainstream media.  In the process, it has variously been described as both “unprecedented” and “bunk”.

So is this all just marketing hype, intended to help vendors ship more product?  Or is there something interesting going on here?

To understand why the current Big Data phenomenon is not unprecedented, recall that Retailers, to take just one example, have lived through not one but two step-changes in the volume of information that their operations produce, both in less than three decades: first EPoS systems and later RFID technology transformed their ability to analyse, understand and manage those operations.

To make this concrete: Teradata shipped the world’s first commercial Massively Parallel Processing (MPP) system with a Terabyte of storage to Kmart in 1986.  By the standards of the day this was an enormous system (it filled an entire truck when shipped) that enabled Kmart to capture sales data at the store / SKU / day level – and to revolutionise the Retail industry in the process.  Today the laptop on which I am writing this blog has a Terabyte of storage – and store / SKU / transaction level data is table-stakes for a modern Retailer trying to compete with Walmart’s demand-driven supply chain and Amazon’s sophisticated customer behavioural segmentation.  Similar analogies can be drawn for the impact of billing systems and modern network switches in telecommunications, for branch automation and online banking systems in retail finance, and so on.

The reality is that we have been living with exponential growth in data volumes since the invention of the modern digital computer, as the inexorable progress of Moore’s Law has enabled more and more business processes to be digitised.  And anxiety about how to cope with perceived “information overload” predates even that invention.  The eight years that it took hard-pressed human calculators to process the data collected for the 1880 U.S. census motivated Herman Hollerith to invent the punched “Hollerith cards”; he went on to found the Tabulating Machine Company, a forerunner of International Business Machines (IBM).

Equally, I would argue that it is a mistake to dismiss Big Data as “bunk”, because significant forces are currently re-shaping the way organisations think about Information and Analytics.  These forces were unleashed, beginning in the late 1990s, by three disruptive technological innovations that have produced seismic shocks in business and society; three new waves of Big Data have been the result.

The first of these shocks was the rise (and rise, and rise) of the World Wide Web, which enabled Internet champions like Amazon, eBay and Google to emerge.  These Internet champions soon began to dominate their respective marketplaces by leveraging low-level “clickstream” data to enable “mass customisation” of their websites, based on sophisticated Analytics that enabled them to understand user preferences and behaviour.  If you were worried that my use of “seismic shock” in the previous paragraph smacked of hyperbole, know that some commentators are already predicting that Amazon – a company that did not exist prior to 1995 – may soon be the largest retailer in the world.

Social Media technologies – amplified and accelerated by the impact of increasingly sophisticated and increasingly ubiquitous mobile technologies – represent the second of these great disruptive forces.  The data they generate are increasingly enabling organisations to understand not just what we do, but where we do it, how we think, and who we share our thoughts with.  LinkedIn’s “People You May Know” feature is a classic example of this second wave of Big Data, but in fact even understanding indirect customer interactions can be a huge source of value to B2C organisations – witness the “collaborative filtering” graph Analytics techniques behind the increasingly sophisticated recommendation engines that have underpinned much of the success of next-generation Internet champions like Netflix.

The “Internet of Things” – networks of interconnected smart devices that are able to communicate with one another and the world around them – is the third disruptive technology-led force to emerge in only the last two decades.  Its ramifications are only now beginning to become apparent.  A consequence of the corollary of Moore’s Law – simple computing devices are now incredibly inexpensive and fast becoming more so – the Internet of Things is leading to the instrumentation of more and more everyday objects and processes.  The old saw that “what gets measured gets managed” is increasingly redundant as we enter an era in which rugged, smart – and, above all, cheap – sensors will effectively make it possible to measure anything and everything.

We can crudely characterise the three “new waves” of Big Data that have accompanied these seismic shocks as enabling us to understand, respectively: how people interact with things; how people interact with people; and how complex systems of things interact with one another.  Collectively, the three new waves make it possible for Analytics to evolve from the study of transactions to the study of interactions and observations.  Where once we collected and integrated data that described transactions and events and then inferred behaviour indirectly, we can increasingly measure and analyse the behaviour – of systems as well as of people – directly.  In an era of hyper-competition – itself a product of both globalisation and digitisation – effectively analysing these new sources of data, and then acting on the resulting insight to change the way we do business, can provide organisations with an important competitive advantage, as the current enthusiasm for Data Science also testifies.

Contrary to some of the more breathless industry hype, much of what we have learnt about Information Management and Analytics during the last three decades is still relevant – but effectively exploiting the three “new waves” of Big Data also requires that we master some new challenges.  And these are the subject of part 2 of this blog, coming soon.
