Tag Archives: MapReduce

Spotting the pretenders in Data Science

Wednesday February 15th, 2017

The term “Data Scientist” is often over-used or even abused in our industry. Just the other morning I was watching TV and a news piece came on talking about the hottest careers in 2017 and data science was top of the list. Of course this is good for those who have been dealing with data for… Read More »

The Revolution in Data Discovery

Thursday November 14th, 2013

In what appears to be a short space of time, there has been a revolution in the data discovery and analytics space. Mainstream BI, the art of delivering static reports and dashboards focused on the descriptive ‘What Happened?’, is now overshadowed by the more diagnostic approach to data discovery focusing on the ‘Why did… Read More »

Hype, Hadoop and The Logical Data Warehouse

Monday June 24th, 2013

“Hadoop has fundamentally transformed the economics of data management, making it possible to choose to keep all (of) one’s data, without an exorbitant, on-going investment in a cumbersome technology that can’t keep pace with the growth of data or the evolving needs of a business.” Mike Olson (Cloudera), quoted from Warehouse = Relic? Cloudera’s Mike… Read More »

5 Steps to Making BI Smarter in Big Data Analytics

Monday May 20th, 2013

In recent months, I met with Business Intelligence (BI) teams in different countries to discuss Big Data Analytics. What transpired from the meetings is a clear lack of awareness of what Big Data Analytics can do for the BI team and how Big Data Analytics fits within enterprise data warehousing (EDW). As ambassadors to… Read More »

Reduce the Pain of MapReduce

Thursday September 6th, 2012

Recently I re-read Dean and Ghemawat’s much-cited 2004 paper, which did so much to popularize MapReduce. I thought it would be nice to implement a couple of the problems that they cite as well suited to being solved by MapReduce. What made it even nicer was the ease with which these… Read More »

In Defence Of The Data Warehouse Database Management System

Friday February 24th, 2012

Gartner’s “hype cycle” model for the deployment of new technology rests on the assumption that our industry typically over-hypes innovation, so that new technologies typically scream up “the peak of inflated expectations” – before crashing unceremoniously into “the trough of disillusionment”. Select technologies eventually proceed – typically at a more leisurely pace – onto “the… Read More »