Part 2 – The Value of Big Data
In Part 1 we saw that the main paradigm shift of Big Data is about limits: until now, IT always restricted the business in the amount and complexity of data it could store and analyse. No more: with Big Data the message is “all the data, all of the time”.
What does this mean for the business?
First of all, no more throwing away useful data. We can afford, quite cheaply, to store everything forever. What makes this happen? The Hadoop Distributed File System (HDFS), combined with an efficient method for processing and retrieving the stored data in parallel (MapReduce).
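To make the MapReduce idea concrete, here is a minimal single-machine sketch in plain Python. It is not Hadoop code: the function names and the sample event log are hypothetical, and a real cluster would run the map and reduce steps on many machines at once. It only illustrates the three stages (map, shuffle, reduce) applied to raw records that would otherwise have been thrown away.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit an (event_type, 1) pair for every raw record."""
    for _customer_id, event_type in records:
        yield event_type, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in groups.items()}

# Hypothetical raw event log: (customer_id, event_type)
events = [
    ("c1", "call"),
    ("c2", "sms"),
    ("c1", "call"),
    ("c3", "web"),
    ("c2", "call"),
]

event_counts = reduce_phase(shuffle(map_phase(events)))
print(event_counts)
```

Because each record is mapped independently and each key is reduced independently, both stages can be spread over many machines, which is what makes the approach scale.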
Second, we can derive useful insights from the data in reasonable time (both human time and machine time). The reduction in human time is due to advanced analytics techniques and the ability to write SQL-like code (which most analysts are familiar with) against the data. The reduction in machine time is a design feature of Hadoop: the work is performed in parallel by a large number of cheap processors.
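As an illustration of what “SQL-like code against the data” looks like to an analyst, the sketch below runs an ordinary aggregate query. We use Python's built-in sqlite3 module purely as a stand-in so the example is runnable anywhere; the `calls` table and its columns are hypothetical, and on a Hadoop platform a similar query would be submitted through a SQL-like layer rather than SQLite.

```python
import sqlite3

# In-memory database used only as a stand-in for a SQL-like layer over raw data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (customer_id TEXT, duration_sec INTEGER)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [("c1", 120), ("c2", 30), ("c1", 300), ("c3", 45)],
)

# A familiar GROUP BY query: calls and total talk time per customer
rows = conn.execute(
    """
    SELECT customer_id, COUNT(*) AS n_calls, SUM(duration_sec) AS total_sec
    FROM calls
    GROUP BY customer_id
    ORDER BY total_sec DESC
    """
).fetchall()

for customer_id, n_calls, total_sec in rows:
    print(customer_id, n_calls, total_sec)

conn.close()
```

The point is the reduction in human time: the analyst writes the same kind of query as before and leaves the parallel execution to the platform.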
Lastly, the value comes from combining analysis of the “previously not feasible” data with well-known enterprise information. For example, we may analyse our customer base to locate influencers and predict their churn probability (“previously not feasible”); this can be translated into actionable insight once it is combined with our knowledge of the value of each customer (probably from the Data Warehouse).
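The combination step can be sketched in a few lines of Python. Everything here is assumed for illustration: the customer IDs, the churn probabilities (standing in for the output of the Big Data analysis), the annual values (standing in for Data Warehouse figures) and the `value_at_risk` helper are all hypothetical.

```python
# Hypothetical output of the churn analysis: customer_id -> churn probability
churn_probability = {"c1": 0.80, "c2": 0.10, "c3": 0.55}

# Hypothetical Data Warehouse figures: customer_id -> annual value
annual_value = {"c1": 1200.0, "c2": 5000.0, "c3": 300.0, "c4": 900.0}

def value_at_risk(churn, value):
    """Join the two sources on customer_id and rank by expected loss."""
    rows = []
    for customer_id, probability in churn.items():
        if customer_id in value:  # keep only customers known to both sources
            expected_loss = round(probability * value[customer_id], 2)
            rows.append((customer_id, expected_loss))
    # Largest expected loss first: these customers deserve attention first
    return sorted(rows, key=lambda row: row[1], reverse=True)

ranked = value_at_risk(churn_probability, annual_value)
for customer_id, expected_loss in ranked:
    print(customer_id, expected_loss)
```

Neither source is actionable on its own: a churn score says who might leave, the value figure says who matters, and only the joined ranking tells the business where to spend its retention effort.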
Note that some of these Analytics projects will end up as “explored but no business value found”.
This can happen for several reasons:
- Insufficient data for the perceived goals
- Insufficient data quality
- Lack of business ability to use the insights for real business advantage
- Inability to prove a business case
- Real lack of business opportunity
This is where a “sandbox” approach is required: the “sandbox” is a data store that allows the enterprise to load data quickly, even if it is not 100% enterprise-ready, and do useful things with it. Successful attempts that quickly show promise can be improved and moved to the ‘proper’ data stores. Failures happen quickly and cheaply. This is an environment for the Data Scientists and the Data Analysts: here they can explore the data, gain insights and try them out. We call this area the “Data Exploration Platform”.
In the third and final part of this series, we look at how these can be overcome; we also justify the title “Big Data is not Hadoop”.
Ben Bor is a Senior Solutions Architect at Teradata ANZ and a specialist in maximising the value of enterprise data. He gained international experience on projects in Europe, America, Asia and Australia. Ben has over 30 years’ experience in the IT industry. Prior to joining Teradata, Ben worked for international consultancies for about 15 years and for international banks before that. Connect with Ben Bor via LinkedIn.