Teradata RainStor 7 – Seven Years of Enterprise Archiving

by Mark Cusack, Chief Architect, Teradata RainStor

The RainStor team at Teradata is very excited to announce the release of RainStor® 7.  This software release is the first since RainStor became a member of the Teradata family last December.  The primary objective of my first blog post for Teradata is to describe the new features of the release.  However, I’d also like to take the opportunity to give some insight into the journey we’ve taken since becoming part of Teradata.

It’s been seven years since the first production deployment of RainStor, and version 7 further strengthens our position as an innovator and leader in big data archiving.  RainStor is Teradata’s official archiving solution, which is a testament to Teradata’s belief in archiving as a discipline of growing importance, and to their superb taste in acquisitions, of course.  The release comprises a new product lineup that more accurately reflects what drives enterprises to archive data, and provides more choice in terms of storage platforms.  It should come as no surprise that this release also sets RainStor on a course of deeper integration with the Teradata Unified Data Architecture™.

The product offerings in RainStor 7 are divided into three categories.  First, we have the flagship Teradata RainStor Regulatory Archive solution, which includes all the data governance, security, query access, and compliance features that our customers are familiar with.  The Regulatory Archive is available on a wide range of platforms, including Hadoop (via the Teradata Appliance for Hadoop or commodity Hadoop), NAS/SAN shared storage, and EMC Isilon with SmartLock, the last of which provides SEC Rule 17a-4 levels of compliance for archived data.

Second in the new lineup is the Teradata RainStor Online Archive solution, which was designed for users who are principally concerned with maintaining deep history for analytics purposes, rather than for reasons of compliance.  These users think of archives as the system of record for their enterprise, which can be mined for new business insights and competitive advantage.  The Online Archive has all of the classic RainStor features such as security, data lifecycle management, and interactive SQL query access, but lacks the compliance-oriented features offered by the Regulatory Archive, such as encryption, masking and WORM support.  This solution is available on Hadoop and on shared storage.

The final variant of RainStor 7 is the Teradata RainStor System Archive solution.  This solution specifically targets archiving of Teradata data warehouses.  It has the compression, security and data governance features of the other two solutions, but doesn’t include access to RainStor’s native SQL query engine.  Archived data in the System Archive is accessed from the source Teradata system via Teradata QueryGrid, and Hadoop is the only supported storage platform for now.

Support for Teradata® QueryGrid 14.10 is the stand-out new feature in RainStor 7.  With QueryGrid support, Teradata users can now access their archived data by referencing RainStor tables from Teradata queries using the QueryGrid syntax they are familiar with.  RainStor’s take on QueryGrid 14.10 is worth highlighting.  In addition to supporting the standard partition filtering and column pruning that QueryGrid provides, RainStor’s version also allows complex filtering expressions to be pushed down from Teradata to RainStor.  Further, far more of the computational workload is transferred from the Teradata system to the RainStor cluster with RainStor’s QueryGrid implementation.  We’ve also exposed RainStor’s security layer within the QueryGrid syntax, which means that only authorized users can access remote data in RainStor on Hadoop from their Teradata queries.
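To make the idea of pushing filters down concrete, here is a minimal, purely illustrative sketch in Python.  It is not RainStor or QueryGrid code: the table contents, the remote_scan function and the filter are hypothetical stand-ins, and it only models the general technique of predicate pushdown and column pruning by counting how many values would cross the wire in each case.

```python
# Toy model of predicate pushdown and column pruning.
# Everything here is hypothetical: it does not use any RainStor or
# QueryGrid API, and the data is made up for illustration.

# Stand-in for an archived table held on the remote (archive) side.
ARCHIVE = [
    {"trade_id": 1, "region": "EMEA", "amount": 120.0, "year": 2009},
    {"trade_id": 2, "region": "APAC", "amount": 75.5, "year": 2012},
    {"trade_id": 3, "region": "EMEA", "amount": 310.0, "year": 2013},
    {"trade_id": 4, "region": "AMER", "amount": 42.0, "year": 2013},
]


def remote_scan(rows, columns=None, predicate=None):
    """Stand-in for the archive side: filter and prune before 'sending'."""
    out = []
    for row in rows:
        if predicate is not None and not predicate(row):
            continue  # row is dropped remotely and never transferred
        if columns is None:
            out.append(dict(row))
        else:
            out.append({c: row[c] for c in columns})
    return out


def values_transferred(rows):
    """Count the individual column values that crossed the 'network'."""
    return sum(len(r) for r in rows)


def wanted(row):
    """A compound filter, more than a simple partition test."""
    return row["year"] >= 2010 and (row["region"] == "EMEA" or row["amount"] < 50)


# Without pushdown: ship every row and column, then filter locally.
shipped = remote_scan(ARCHIVE)
local_result = [
    {"trade_id": r["trade_id"], "amount": r["amount"]}
    for r in shipped
    if wanted(r)
]

# With pushdown: the archive applies the filter and the column list itself.
pushed = remote_scan(ARCHIVE, columns=["trade_id", "amount"], predicate=wanted)

print("values shipped without pushdown:", values_transferred(shipped))
print("values shipped with pushdown:   ", values_transferred(pushed))
print("same answer either way:", local_result == pushed)
```

Both approaches return the same rows, but in the second the filtering work and the data reduction happen on the archive side, which is the effect described above.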

We’ve expanded our platform coverage in this new release to include support for RainStor running on Hortonworks HDP 2.2 and Cloudera CDH 5.3.  We’ve also added support for the Teradata Appliance for Hadoop 2.1, and for SLES 11.

Since the last release, we’ve also made a number of under-the-hood enhancements to our parallel SQL query engine with an emphasis on improving performance.  We’ve reaffirmed our commitment to supporting as many different dialects of SQL as possible by adding new IBM® Netezza® functions.  Finally, we’ve added support for user-defined functions.  Users can now write scalar functions in C or C++ and reference them within their own RainStor SQL queries.

We’ve tried to maintain a business-as-usual posture during our first six months at Teradata, which means that this release is set against a backdrop of new deployments, and ongoing support for customers.  We’ve worked hard to ensure that we continue to meet our existing commitments, as well as delivering the latest release of RainStor on time.  It’s been an exciting journey so far.  And there’s lots more excitement to come, now that we are a part of something much, much bigger.

Learn more about RainStor 7.

Mark Cusack joined Teradata in 2014 as part of its RainStor acquisition.  As a founding developer and Chief Architect at RainStor, he has worked on many different aspects of the product since 2004.  Most recently, Mark led the efforts to integrate RainStor with Hadoop and with Teradata.  He holds a master’s degree in computing and a PhD in physics.


