The Revolution in Data Discovery

Posted on: November 14th, 2013 by Guest Blog

In what appears to be a short space of time, there has been a revolution in the data discovery and analytics space. Mainstream BI, the art of delivering static reports and dashboards focused on the descriptive ‘What happened?’, is now overshadowed by the more diagnostic approach of data discovery, which focuses on ‘Why did it happen, and what will happen next?’.

In 2012 the visualisation-based data discovery component of the BI market grew at a rate of more than 30% and is expected to outpace the overall BI market by a factor of 3-to-1 until 2015.

However, for me the proliferation of visualisation-based discovery is only one of three factors behind data discovery’s current en vogue status. Certainly, visualisation software from vendors such as Tableau and QlikTech has empowered the end user and helped remove the traditional IT SDLC bottlenecks which often held back innovation and insights, but there are two other significant weapons in today’s analytical arsenal.

The first of these is the increasing capability of today’s RDBMS-based Enterprise Data Warehouse (EDW) platforms to perform sophisticated analytics in-database. This significantly reduces model development and execution timeframes, providing a much more agile, iterative and productive environment for analysts to work within. The in-database philosophy also enables analysts to develop their models on entire historical data sets rather than being restricted to samples, so they benefit from the power of the MPP database engine and avoid transferring data across the network altogether.
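To make the in-database idea concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module purely as a stand-in for an MPP warehouse, and the sales table, its columns and the function names are all hypothetical. The first function pulls every row out of the database and aggregates in the application; the second pushes the same calculation into the engine so that only the small result set leaves it.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory table (hypothetical schema, illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    rows = [("north", 100.0), ("north", 300.0), ("south", 50.0), ("south", 150.0)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def average_client_side(conn):
    """Extract every row, then aggregate in the application."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        running_sum, count = totals.get(region, (0.0, 0))
        totals[region] = (running_sum + amount, count + 1)
    return {region: s / n for region, (s, n) in totals.items()}


def average_in_database(conn):
    """Let the database engine aggregate; only one row per region comes back."""
    query = "SELECT region, AVG(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query))


if __name__ == "__main__":
    conn = build_demo_db()
    print(average_client_side(conn))
    print(average_in_database(conn))
```

Both functions return the same averages. The difference is where the work happens: on a real warehouse holding the full history, the second form avoids moving the detail rows across the network at all.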

Secondly, there is the coming of ‘Big Data’ and the need not only to store new types of data which are not well suited to traditional relational databases, but also to start to understand the possible value, if any, of this new data. This is an analytical task which is typically better suited to the MapReduce framework than to SQL-based code.
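For readers who have not met MapReduce, the programming model can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce steps applied to free text, not how a real cluster framework is invoked, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of raw text."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big discovery"]))
```

Because each line is mapped independently and each key is reduced independently, a framework can spread both steps across many machines, which is what makes the model attractive for loosely structured data that does not fit neatly into relational tables.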

There is a lot of talk in the market about the need for a Discovery Platform, and I agree: if a company is serious about analytics, then this will inevitably be on the shopping list in the coming months or years. More important, however, is the focus on delivering the complete discovery eco-system: an environment which enables analysts and data scientists to seamlessly combine data from both traditional EDW platforms and the new big-data-driven Discovery Platforms, to start answering the questions that have never been asked before or to discover the questions which have yet to be thought of.

These discovery eco-systems need to support OLAP, statistical analysis, text analysis and predictive modelling, and should deliver pre-built analytic function libraries to enable rapid, iterative data analysis and provide faster time to value. After all, discovery is by definition all about ‘being first’, and the competitive advantage and associated benefits to those who are first can be significant indeed.

These are certainly exciting times in the world of data discovery and analytics, and if I had my time at University again perhaps I would be swapping my Accounting and Finance degree for one in Pure Mathematics and Statistics, so I could get myself one of those cool Data Scientist titles. Oh well, hindsight is a wonderful thing.

Want to learn more about Data Discovery? Listen to key thought leaders, including Scott Gnau, President, Teradata Labs, at our upcoming Data Discovery webinar by registering for “Gain an Unfair Advantage with a Data Discovery Platform” on 3 December, 2013 at 1pm (AEDT).

David Hudson is a Senior Solutions Consultant at Teradata ANZ. He has 10 years of data warehousing experience, primarily focussed on Enterprise Data Model solutions, including data integration, ETL design and logical data modelling.
