A friend of mine sent me an interesting article on the differences between the Dead Sea and the Sea of Galilee, which I reproduce below with due credit to its original author. It offers a good analogy for the different styles of data lake implementation and their consequences.
“I recall how fascinated I was when we were being taught about the Dead Sea in Geography class at school. As you may know, the Dead Sea is really a lake, not a sea (and as my Geography teacher pointed out, if you understood that, it would guarantee 4 marks in the term paper!). It is so high in salt content that the human body can float easily. You can almost lie down and read a book! … And all that saltiness has meant that there is no life at all in the Dead Sea. No fish. No vegetation. No sea animals. Nothing lives in the Dead Sea. And hence the name: Dead Sea.”
Dead Sea – Source: http://www.dailyscubadiving.com/wp-content/uploads/2010/03/dead-sea.jpg
“While the Dead Sea has remained etched in my memory, I don’t seem to recall learning about the Sea of Galilee in my school Geography lesson. So when I heard about the Sea of Galilee and the Dead Sea and the tale of the two seas – I was intrigued. It turns out that the Sea of Galilee is just north of the Dead Sea. Both the Sea of Galilee and the Dead Sea receive their water from river Jordan. And yet, they are very, very different.
Unlike the Dead Sea, the Sea of Galilee is pretty, resplendent with rich, colourful marine life. There are lots of plants and lots of fish too! In fact, the Sea of Galilee is home to over twenty different types of fishes”.
Sea of Galilee – Source: http://templars.wordpress.com/category/jerusalem/page/3/
“Same region, same source of water, and yet while one sea is full of life, the other is dead. How come?
The River Jordan flows into the Sea of Galilee and then flows out. The water simply passes through the Sea of Galilee in and then out – and that keeps the Sea healthy and vibrant, teeming with marine life.
But the Dead Sea is so far below the mean sea level, that it has no outlet. The water flows in from the river Jordan, but does not flow out. There are no outlet streams. It is estimated that over a million tons of water evaporate from the Dead Sea every day, leaving it salty, full of minerals and unfit for any marine life”.
Just as the water flowing in and out of the Sea of Galilee provides a healthy and vibrant ecosystem, disparate sources of data in their native format flowing through the data lake are expected to increase agility and accessibility to gain insights from data analysis and drive enterprise-wide business value.
Many organisations claim to have a data lake by virtue of collecting large volumes of data onto a single platform. Often, the data is raw and unorganised, so it looks suspiciously like the OLTP system files from which it was sourced. These implementations are better described as data dumping grounds, and they effectively resemble the Dead Sea: data flows in, but little of value flows out. While the data may be consolidated onto a single platform in such deployments, it has most likely not been integrated for effective decision support.
A data lake, when properly planned and executed as part of a logical data warehouse, integrates data so that business users can navigate and analyse it across multiple subject areas without resorting to heroics to traverse the business relationships in the data. Perhaps ‘A tale of two Seas’ offers a lesson or two for those trying to plan, harvest and nourish the data and information ecosystem in the enterprise effectively.
Read more on the latest news release about Teradata Loom and the benefits to companies using Hadoop.
Sundara Raman is a Senior Communications Industry Consultant at Teradata. He has 30 years of experience in the telecommunications industry, spanning the fixed line, mobile, broadband and Pay TV sectors. He specialises in Business Value Consulting, business intelligence, Big Data and Customer Experience Management solutions for communication service providers. Connect with Sundara on LinkedIn.