In Star Trek, “the Borg” refers to an alien race that conquers planets, absorbing the people, technology, and resources into the Borg collective. Even Captain Picard becomes a Borg and chants “We are the Borg. You will be assimilated. Resistance is futile.”
It strikes me that the relational database has behaved similarly since its birth. Over the last thirty years, Teradata and other RDBMS vendors have innovated and modernized, constantly revitalizing what it means to be an RDBMS. But some innovations come from start-up companies that are later assimilated into the RDBMS, and some are reactions to competition. Regardless, many innovations eventually end up in the code base of multiple RDBMS vendor products, with proper respect for patents, of course. Here are some examples of cool technologies assimilated into the Teradata Database:
• MOLAP cubes storm the market in the late 1990s, with Essbase setting the pace and Cognos inventing desktop cubes. MicroStrategy and Teradata team up to push ROLAP SQL down into the database for parallel speed. Hyperion Essbase and Teradata also build Hybrid OLAP integration together. Essbase gets acquired, MOLAP cubes fall out of fashion, and in-database ROLAP goes on to provide the best of both worlds as CPUs get faster.
• Early in the 2000s, a startup called Sunopsis shows the distinct advantage of running ELT transformations in-database to get parallel performance with Teradata. ELT takes off in the industry like a rocket. Teradata Labs also collaborates with Informatica to push PowerCenter transformation logic down into SQL for amazing extract, load, and transform speed. Sunopsis gets acquired. More ETL vendors adopt ELT techniques. Happy DBAs and operations managers meet their nightly batch performance goals. More startups disappear.
• XML and XQuery become the rage in the press, until almost every RDBMS adds a data type for XML, plus shred and unshred operators. XML-only database startups are marginalized.
• The uptick of predictive analytics in the market drives collaboration between Teradata and SAS back in 2007. SAS Procs are pushed down into the database to run massively parallel, opening up tremendous performance benefits for SAS users. Many RDBMS vendors go on to copy this technique; SAS is in the limelight, and eventually even Hadoop programmers want to run SAS in parallel. Later we see “R,” Fuzzy Logix, and others run in-database too. Sounds like the proverbial win-win to me.
• In-memory technology from QlikView and TIBCO Spotfire excites the market with order-of-magnitude performance gains. Several RDBMS vendors then adopt in-memory concepts. But in-memory has limitations on memory size and cost vis-à-vis terabytes of data. Consequently, Teradata introduces Teradata Intelligent Memory, which automatically caches hot data in memory while managing many terabytes of hot and cold data on disk. The hottest two to three percent of the data, ranked by data temperature (that is, popularity with users), is kept in memory, delivering superfast response time. Cool! Or is it hot?
• After reading the Google research paper on MapReduce, a startup called Aster Data invents SQL-MapReduce (SQL-MR) to add flexible processing to a flexible database engine. This cool innovation leads Teradata to acquire Aster Data. Within a year, Aster strikes a nerve across the industry: MapReduce is in-database! This month, Aster earns numerous #1 scores in Ovum’s “Decision Matrix: Selecting an Analytic Database 2013-14” (January 2014). The race is on for MapReduce in-database!
• The NoSQL community grabs headlines with its unique designs and reliance on JSON data and key-value pairs. MongoDB is hot, using JSON data, while Couchbase and Cassandra leverage key-value stores. Teradata promptly decides to add JSON data (semi-structured data) to the database and goes the extra mile to put JSONPath syntax into SQL. Teradata also adds the name-value-pair SQL operator (NVP) to extract JSON or key-value store data from weblogs. Schema-on-read technology gets assimilated into the Teradata Database. Java programmers are pleased. Customers make plans. More wins.
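For readers who have not worked with name-value pairs or JSON paths, here is a minimal Python sketch of the kind of extraction the last bullet describes. It is only an illustration of the idea, not Teradata's NVP operator or its JSONPath support; the weblog URL, the JSON document, and the function name `name_value_pairs` are all made up for this example.

```python
import json
from urllib.parse import parse_qs, urlsplit


def name_value_pairs(log_url):
    """Return the first value for each name in a URL's query string."""
    query = urlsplit(log_url).query
    # parse_qs maps each name to a list of values; keep only the first one
    return {name: values[0] for name, values in parse_qs(query).items()}


# Hypothetical weblog entry: the structure is discovered at read time
entry = "http://example.com/search?user=42&item=camera&color=black"
pairs = name_value_pairs(entry)

# Hypothetical JSON document; the lookup below is roughly what a
# JSONPath expression such as $.customer.segments[0] would select
doc = json.loads('{"customer": {"id": 42, "segments": ["outdoor", "photo"]}}')
first_segment = doc["customer"]["segments"][0]

print(pairs["item"], first_segment)
```

In both cases no table schema is declared up front: the names and paths are resolved when the data is read, which is the schema-on-read idea in miniature.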
“One trend to watch going forward, in addition to the rise of multi-model NoSQL databases, is the integration of NoSQL concepts into relational databases. One of the methods used in the past by relational database vendors to restrict the adoption of new databases to handle new data formats has been to embrace those formats within the relational database. Two prime examples would be support for XML and object-oriented programming.”
- Matt Aslett, The 451 Group, Next-Generation Operational Databases 2012-2016, Sep 17, 2013
I’ve had conversations with other industry analysts, and they’ve confirmed Matt’s opinion: RDBMS vendors will respond to market trends, innovations, and competitive threats by integrating those technologies into their offerings. Unlike the Borg’s conquests, a lot of these assimilations by RDBMS vendors are friendly collaborations (MicroStrategy, Informatica, SAS, Fuzzy Logix, Revolution R, etc.). Others are just the recognition of new data types that need to be in the database (JSON, XML, BLOBs, geospatial, etc.).
Why is it good to have all these innovations inside the major RDBMSs? Everyone is having fun right now with their science projects because hype is very high for this startup or that startup or this shiny new thing. But when it comes time to deploy production analytic applications to hundreds or thousands of users, all the “ities” suddenly become critical: “ities” that the new kids don’t have and the RDBMS does, like reliability, recoverability, security, and availability. Companies like Google can bury shiny new 1.oh-my-god quality software in an army of brilliant computer scientists. But Main Street and Wall Street companies cannot.
More important, many people are doing new multi-structured data projects in isolation, such as weblog analysis, sensor data, graph analysis, or social text analysis. Soon enough they discover that the highest value comes from combining that data with all the rest of the data the organization has collected on customers, inventories, campaigns, financials, etc. Great, I found a new segment of buyer preferences. What does that mean to campaigns, sales, and inventory? Integrating new big data into an RDBMS is a huge win going forward, much better than keeping the different data sets isolated in the basement.
Like this year’s new BMW or Lexus, RDBMSs modernize; they define modern. But unlike cars, relational database systems don’t grow old; they don’t rust or wear out. RDBMSs evolve to stay current and constantly introduce new technology.
We are the RDBMS! Technology will be assimilated. Resistance is futile.