Garbage In-Memory, Expensive Garbage

Posted on: July 7th, 2014 by Patrick Teunissen 2 Comments

A first anniversary is always special and in May I marked my first with Teradata. In my previous lives I spent almost ten years with Shell and seventeen years building my own businesses focused on data warehousing and business intelligence solutions for SAP. With my last business, “NewFrontiers”, I leveraged all twenty-seven years of ERP experience to develop a shrink-wrapped solution to enable SAP analytics.

Through all of those years, and now through my first year with Teradata, the logical design of SAP has remained the same. To be clear, when I say SAP, I mean R/3, or ‘R/2 with a mouse’ if you are old enough to remember. Today R/3 is also known as the SAP Business Suite, ERP, or whatever the latest name may be. Anyway, when I talk about SAP I mean the application that made the company rightfully world famous and that is used for transaction processing by almost all large multinational businesses.

My core responsibility at Teradata is the engineering of the analytical solution for SAP. My first order of business was to focus my team on delivering an end-to-end business analytics product suite, optimized for Teradata, for analyzing ERP data. Since completing our first release, my attention has turned to adding new features that help companies take their SAP analytics to the next level. To this end, my team is just putting the finishing touches on a near real-time capability based on data replication technology. This will definitely be the topic of upcoming blogs.

Over the past year, the integration and optimization process has greatly expanded my understanding of Teradata’s differentiated capabilities. The one capability that draws the attention of types like me, the ‘SAP guys and girls’, is Teradata Intelligent Memory. In-memory computing has become a popular topic in the SAP community [1], and the computer’s main memory is an important part of Teradata Intelligent Memory. However, Intelligent Memory is more than “In-Memory”: the database recognizes that not all memory is created equal and delivers a solution that uses the “right memory for the right purpose”. In this solution, the most frequently used data, the hottest, is stored in memory; warm data is processed from solid state drives (SSD); and colder, less frequently accessed data from hard disk drives (HDD). This solution allows your business to make decisions on all of your SAP and non-SAP data while coupling in-memory performance with spinning-disk economics.
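To make the idea concrete, here is a minimal sketch, written in Python, of how data could be classified by temperature and assigned to a storage tier. The thresholds (the seven-day window, the one-year cut-off, the access counts) are purely illustrative assumptions of mine, not Teradata’s actual placement rules:

    # Illustrative sketch of temperature-based tier placement.
    # Thresholds are assumed for this example; they are not Teradata's rules.
    from dataclasses import dataclass
    from datetime import datetime, timedelta

    @dataclass
    class Partition:
        name: str
        last_accessed: datetime
        accesses_last_30_days: int

    def assign_tier(p: Partition, now: datetime) -> str:
        """Map a data partition to DRAM, SSD or HDD based on how hot it is."""
        age = now - p.last_accessed
        if age < timedelta(days=7) and p.accesses_last_30_days > 100:
            return "DRAM"   # hottest data stays in main memory
        if age < timedelta(days=365):
            return "SSD"    # warm data on solid state
        return "HDD"        # cold, rarely touched data on spinning disk

    now = datetime(2014, 7, 7)
    print(assign_tier(Partition("sales_2009", datetime(2009, 3, 1), 0), now))       # HDD
    print(assign_tier(Partition("sales_2014_q2", datetime(2014, 7, 1), 500), now))  # DRAM

An old, completed sales order ends up on the cheapest tier, while last week’s orders sit in main memory where the queries actually hit.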

This concept of using the “right memory for the right purpose” is very compelling for our Teradata Analytics for SAP solutions. Often when I explain what Teradata Analytics for SAP Solutions does, I draw a line between DATA and CONTEXT. Computers need DATA like cars need fuel, and the CONTEXT is where you drive the car. Most people do not go to the same place every time, but they do go to some places more frequently than others (e.g. work, freeways, coffee shops) and under more time pressure (e.g. traffic).

In this analogy, organizations almost always start building an “SAP data warehouse” by loading all the DATA kept in the production database of the ERP system. We call that process the initial load. In the Teradata world we often have to do this multiple times, because building an integrated data warehouse usually involves sourcing from multiple SAP ERPs. Typically, these ERPs vary in age, history, version, governance, MDM, etc. Archival is a non-trivial process in the SAP world, and the majority of the SAP systems I have seen are carrying many years of old data [2]. Loading all this SAP data in memory is an expensive and reckless thing to do.

Teradata Intelligent Memory provides CONTEXT by storing the hot SAP data in memory, guaranteeing lightning-fast response times. It then automatically moves the less frequently accessed data to lower-cost, lower-performance storage across the SSD and HDD media spectrum. The resulting combination of Teradata Analytics for SAP and Teradata Intelligent Memory delivers in-memory performance with very high memory hit rates at a fraction of the cost of ‘In-Memory’ solutions. And in this business, costs are a huge priority.
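A quick back-of-the-envelope calculation shows why this matters. The numbers below are entirely hypothetical, assumed relative storage costs and an assumed access skew rather than measured figures, but they illustrate how a small hot slice kept in memory can serve most accesses at a fraction of the all-in-memory cost:

    # Hypothetical cost comparison: all-in-memory vs. tiered placement.
    # All figures are assumed for illustration only.
    DRAM_COST_PER_GB = 10.0   # relative cost units, not real prices
    SSD_COST_PER_GB = 1.0
    HDD_COST_PER_GB = 0.1

    total_gb = 10_000                     # assumed total SAP data volume
    hot, warm, cold = 0.10, 0.30, 0.60    # assumed data split across tiers
    hot_hit_rate = 0.90                   # assume the hot 10% serves 90% of accesses

    all_in_memory = total_gb * DRAM_COST_PER_GB
    tiered = total_gb * (hot * DRAM_COST_PER_GB
                         + warm * SSD_COST_PER_GB
                         + cold * HDD_COST_PER_GB)

    print(f"all in-memory: {all_in_memory:,.0f} cost units")
    print(f"tiered:        {tiered:,.0f} cost units "
          f"({tiered / all_in_memory:.0%} of the all-in-memory cost), "
          f"with {hot_hit_rate:.0%} of accesses still served from DRAM")

Under these assumptions the tiered layout costs roughly 14% of the all-in-memory layout, while nine out of ten accesses are still answered from DRAM.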

The title of this blog is a variation on the good old “Garbage In, Garbage Out” (GIGO) phrase: In-Memory is a great feature, but not all data needs to go there! Use it in an intelligent way, and don’t use it as a garbage dump, because for that it is simply too expensive.

Patrick Teunissen is the Engineering Director at Teradata responsible for the Research & Development of the Teradata Analytics for SAP® Solutions at Teradata Labs in the Netherlands. He is the founder of NewFrontiers which was acquired by Teradata in May 2013.

Endnotes:
1 Needless to say I am referring to SAP’s HANA database developments.

2 Data that is older than 2 years can be classified as old. Transactions, such as sales and costs, are often compared with a budget/plan and with the previous year; sometimes with the year before that, but hardly ever with data older than that.

2 Responses

  1. Rob Klopp

    July 29, 2014

    As you must surely know… SAP HANA does not require all data to be stored in-memory. Further, it manages the data kept in-memory using an LRU algorithm that keeps hot data in-memory and allows colder data to float to disk.

    Thank you,

    Rob

    • Patrick Teunissen

      August 1, 2014

      Hello Rob,

I was not debating the HANA technology or its capabilities. Your blogs may be a much better source for that ;-).

I often speak to our customers who run SAP R/3 (ECC, ERP) for transaction processing, and what HANA has certainly done is put the computer’s main memory in the spotlight. I find that impressive, because until about 5 years ago SAP R/3 users were not talking about hardware and smart database stuff. But today people talk about in-memory computing, and some say that all data should go there because the costs are dropping so fast that they are becoming negligible. What I am pointing out in my blog is that storing data in main memory can be great for some purposes, but whatever anyone says, it is still expensive compared to other storage media.

Added to that, the R/3 (ECC, ERP) application is unique, but the transaction data in its database is not. In fact, this data looks and behaves like the transaction data processed in any other ERP application. For example, a sales order created in R/3 will, from a certain moment in time, hardly ever appear in a query once it has been shipped, invoiced and completed. I have seen R/3 systems that are 10 years old and older, and I think that keeping all these completed orders “in memory” is not the right thing to do. Obviously I have pointed out how Teradata addresses this issue with our Intelligent Memory.

But since you have brought up the point: I was indeed aware that HANA has the capability to rebuild the in-memory database should that be needed. However, HANA’s LRU capability was new to me, and based on what I could find on the internet I get the impression that data is removed (“unloaded”) from the database and loaded back again when needed for a query. I am not sure whether that is the same as what I am talking about.

