Big Data and the Three Little Pigs

Tuesday September 2nd, 2014

I’ve recently been involved in a project that advised clients on how to manage their enterprise data assets. This invariably revolved around how to remain agile and responsive to business demands for analytics while maintaining the integrity and reliability of the data. There is also the issue of the soaring ETL costs of provisioning this data.

Big Data looms large in this discussion: enterprises need to manage this beast while continuing to support traditional structured, relational enterprise data. The answer lies in matching the level of data integrity and reliability to the intended use of the data. We often hear of an approach where data is categorised into value types: Gold, Silver or Bronze. I actually prefer the three little pigs model of STRAW, STICKS, and BRICKS. This more closely reflects the nature of housing data in environments with differing levels of protection (integrity and reliability), matched to the effort (cost) of delivery.

Financial, external reporting and regulatory compliance data needs to be built in a house made of BRICKS. It needs to be on a solid foundation, with data reconcilable to the original source. And, as is often the case for financial data, it is used for multiple functions across the enterprise, covering decisions on product and service pricing, sales performance analysis and the all-important staff and executive incentive reward calculations.

Data housed in STICKS is sturdy and has a well-thought-out design and structure. It is application specific and is not designed for enterprise usage and cross-functional sharing. A lot of marketing data marts would generally fall into this category, with purpose-specific data transformation. The data transformation, reconciliation and lineage requirements are just enough for the specific business purpose and nothing more. By nature, a house made of STICKS is cheaper and faster to build than a house made of BRICKS.

And finally, the house made of HAY (the STRAW of the fable). It’s essentially a pile of data that is transformed, grouped and summarised on the specific day that it needs to be used. There is little structure, and the data is probably still in the original source format. You can shape it in any way you want. The solution design is often not easily replicated or scaled up, but it is good for answering a non-specific and non-recurring business issue. As such, there is little requirement for reconciliation and replication.
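For readers who like to see such a model written down, the three houses can be sketched as a small lookup table in Python. This is only an illustration: the field names, the level labels, the ordinal cost values and the `choose_tier` rule are my own assumptions for the sketch, not part of any Teradata product or formal governance standard.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    HAY = "hay"
    STICKS = "sticks"
    BRICKS = "bricks"


@dataclass(frozen=True)
class TierPolicy:
    # Level labels are illustrative assumptions, not a standard vocabulary.
    reconciliation: str      # "minimal", "purpose-specific" or "full"
    lineage: str             # "minimal", "purpose-specific" or "full"
    enterprise_shared: bool  # designed for cross-functional use?
    relative_cost: int       # ordinal only: 1 = cheapest to build


POLICIES = {
    Tier.HAY: TierPolicy("minimal", "minimal", False, 1),
    Tier.STICKS: TierPolicy("purpose-specific", "purpose-specific", False, 2),
    Tier.BRICKS: TierPolicy("full", "full", True, 3),
}


def choose_tier(financial_or_regulatory: bool, recurring: bool) -> Tier:
    """Hypothetical rule of thumb that matches protection to intended use."""
    if financial_or_regulatory:
        return Tier.BRICKS   # must reconcile to the original source
    if recurring:
        return Tier.STICKS   # application-specific, built to last
    return Tier.HAY          # one-off question, shaped on the day


if __name__ == "__main__":
    tier = choose_tier(financial_or_regulatory=False, recurring=True)
    print(tier.name, POLICIES[tier])
```

A one-off discovery question lands in HAY, a recurring marketing data mart lands in STICKS, and anything feeding financial or regulatory reporting lands in BRICKS whether or not it recurs.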

The analogy is fairly elementary. Where it comes into its own is in applying it to Data Governance principles. Business areas want a fast and cheap answer if they can get away with it, while IT wants a solution that it can stand on to deliver a robust and reliable service. Herein lies the inherent conflict in most Business and IT discussions.

The HAY solution is attractive because it’s generally quick and cheap, and the cost of failure is very low. Data labs and discovery environments are thriving because of this new economics of value recognition from data (including Big Data), especially in a self-service environment. It is common for the business to prefer “quick” and “cheap” solutions.

The big shock comes when IT comes back with the cost of converting the HAY solution into a STICKS or BRICKS environment. “Why can’t they use the solution we designed in 2 weeks?” Well, they can. But building bricks on top of a hay foundation, or even sticks on hay, doesn’t work. And there is little IT saving in leveraging a design done in HAY to build STICKS and BRICKS solutions. The main value of the HAY work is the certainty that the information produced will be useful.

Teradata’s approach to enterprise information management recognises the differences in data use, and it has developed platform solutions to cater for BRICKS (Integrated Data Warehouse), STICKS (Data Appliances) and HAY (Teradata Aster, Hadoop) requirements. The main challenge, though, is building a Data Governance policy and process that recognises CONSEQUENCES. Stakeholders involved in the decision on a data management approach (HAY, STICKS or BRICKS) need to be fully cognizant of the downstream consequences when their brilliant “idea” developed in HAY needs to be migrated to a more stable environment.

There is of course a Big Bad Wolf looming in this analogy. But that’s for another blog.

Renato Manongdo is a Senior Financial Services Industry Consultant at Teradata ANZ and is also the practice lead for Business Value Measurement in Asia Pacific. Connect with Renato Manongdo on LinkedIn.

Renato Manongdo

Industry Consultant at Teradata
Renato Manongdo is a Senior Financial Services Industry Consultant at Teradata ANZ. Renato provides senior management advice on how clients can better capitalise on their information and data investments. He has a particular focus on Business Intelligence and Analytics as they apply to business processes covering customer, channel and product management functions across Financial Services, including Big Data.

One thought on “Big Data and the Three Little Pigs”

  1. Raja

    I implemented Hadoop and Hive for a different business case, and it is really good and helpful. Some clients are adamant. In this case, in my opinion, if HAY is required by the client, then so be it. Let us make hay while the sun shines. When downstream is flooded with dirt, then think of saving by catching at straw if it can save; else the wall made of robust bricks is there :).
