Big Data Analytics

In advance of the upcoming webinar Achieving Pervasive Analytics through Data & Analytic Centricity, Dan Woods, CTO and editor of CITO Research, sat down with Clarke Patterson, senior director, Product Marketing, Cloudera, and Chris Twogood, vice president of Product and Services Marketing, Teradata, to discuss some of the ideas and concepts that will be shared in more detail on May 14, 2015.

Dan:

Having been briefed by Cloudera and Teradata on Pervasive Analytics and Data & Analytic Centricity, I have to say it’s refreshing to hear vendors talk constructively about WHY and HOW big data is important, rather than offering platitudes and jumping straight into the technical details of the WHAT, as is so often the case.

Let me start by asking you both, in your own words, to describe Pervasive Analytics and Data & Analytic Centricity, and why this is an important concept for enterprises to understand?

Clarke:

During eras of global economic shifts, there is always a key resource discovered that becomes the spark of transformation for organizations that can effectively harness it. Today, that resource is unquestionably ‘data’. Forward-looking companies realize that to be successful, they must leverage analytics in order to provide value to their customers and shareholders. In some cases they must package data in a way that adds value and informs employees, or their customers, by deploying analytics into decision-making processes everywhere. This idea is referred to as pervasive analytics.

I would point to the success that Teradata’s customers have had over the past decades in making analytics pervasive throughout enterprises. The spectrum across which their customers have gained value is comprehensive, from business intelligence reporting and executive dashboards, to advanced analytics, to enabling front-line decision makers, and embedding analytics into key operational processes. And while those opportunities remain, the explosion of new data types and the breadth of new analytic capabilities are leading successful companies to recognize the need to evolve the way they think about data management and processes in order to harness the value of all their data.

Chris:

I couldn’t agree more. It’s interesting, now that we’re several years into the era of big data, to see how different companies have approached this opportunity, which really boils down to two approaches. Some companies have asked what they can do with the newer technology that has emerged, while others define a strategic vision for the role of data and analytics in supporting their business objectives and then map the technology to that strategy. The former, which we refer to as an application centric approach, can produce some benefits, but typically runs out of steam as agility slows and new costs and complexities emerge. The latter is proving to create substantially more competitive advantage, as organizations put data and analytics – not a new piece of technology – at the center of their operations. Ultimately, companies that take a data and analytic centric approach conclude that multiple technologies are required; their acumen in applying the right tool to the right job naturally progresses, and the usual traps and pitfalls are avoided.

Dan:

Would you elaborate on what is meant by “companies need to evolve the way they think about data management”?

Chris:

Pre “big data,” there was a single approach to data integration, whereby data is made to look the same, or normalized, in some form of persistence such as a database; only then can value be created. The idea is that by absorbing the costs of data integration up front, the costs of extracting insights decrease. We call this approach “tightly coupled.” This is still an extremely valuable methodology, but it is no longer sufficient as a sole approach to managing all data in the enterprise.

Post “big data,” using the same tightly coupled approach to integration undermines the value of newer data sets that have unknown or under-appreciated value. Here, new methodologies to “loosely couple” or not couple at all are essential to cost-effectively manage and integrate the data. These distinctions are incredibly helpful in understanding the value of Big Data, where best to think about investments, and highlighting the challenges that remain a fundamental hindrance to most enterprises.

But regardless of how the data is most appropriately managed, the most important thing is to ensure that organizations retain the ability to connect-the-dots for all their data, in order to draw correlations between multiple subject areas and sources and foster peak agility.

Clarke:

I’d also point out that leading companies are evolving the way they approach analytics. We can analyze any kind of data now – numerical, text, audio, video – and discover insights in this complex data. Further, new forms of procedural analytics have emerged in the era of big data, such as graph, time-series, machine learning, and text analytics.

This allows us to expand our understanding of the problems at hand. Key business imperatives like churn reduction, fraud detection, increasing sales and marketing effectiveness, and operational efficiencies are not new, and have been skillfully leveraged by data-driven businesses with tightly coupled methods and SQL-based analytics – that’s not going away. But when organizations harness newer forms of data that add to the picture, and new complementary analytic techniques, they realize better churn and fraud models, greater sales and marketing effectiveness, and more efficient business operations.

To learn more, please join the Achieving Pervasive Analytics through Data & Analytic Centricity webinar on Thursday, May 14, from 10:00 to 11:00am PT.

 

High Level Data Analytics Graph (Healthcare Example)

Michael Porter, in an excellent article in the November 2014 issue of the Harvard Business Review[1], points out that smart connected products are broadening competitive boundaries to encompass related products that meet a broader underlying need. Porter elaborates that the boundary shift is not only from the functionality of discrete products to cross-functionality of product systems, but in many cases expanding to a system of systems such as a smart home or smart city.

So what does all this mean from a data perspective? In that same article, Porter mentions that companies seeking leadership need to invest in capturing, coordinating, and analyzing more extensive data across multiple products and systems (including external information). The key take-away is that the movement of gaining competitive advantage by searching for cross-functional or cross-system insights from data is only going to accelerate and not slow down. Exploiting cross-functional or cross-system centrality of data better than anyone else will continue to remain critical to achieving a sustainable competitive advantage.

Understandably, as technology changes, the mechanisms and architecture used to exploit this cross-system centrality of data will evolve. Current technology trends point to a need for a data & analytic-centric approach that leverages the right tool for the right job and orchestrates these technologies to mask complexity for the end users; while also managing complexity for IT in a hybrid environment. (See this article published in Teradata Magazine.)

As businesses embrace the data & analytic-centric approach, the following types of questions will need to be addressed: How can business and IT decide on when to combine which data and to what degree? What should be the degree of data integration (tight, loose, non-coupled)? Where should the data reside and what is the best data modeling approach (full, partial, need based)? What type of analytics should be applied on what data?

Of course, to properly address these questions, an architecture assessment is called for. But for the sake of going beyond the obvious, one exploratory data point in addressing such questions could be to measure and analyze the cross-functional/cross-system centrality of data.

By treating data and analytics as a network of interconnected nodes in Gephi[2], the connectedness between data and analytics can be measured and visualized for such exploration. We can examine a statistical metric called Degree Centrality[3] which is calculated based on how well an analytic node is connected.
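To make the idea concrete, here is a minimal sketch in Python using the networkx library (rather than Gephi) of how Degree Centrality can be computed over a toy graph of analytics and the data subject areas they touch. The node and edge names are invented for illustration and are not taken from the healthcare graph discussed below.

```python
# Minimal sketch (Python + networkx): measuring Degree Centrality for a toy
# analytics graph. Node and edge names below are illustrative only.
import networkx as nx

G = nx.Graph()

# Analytics (nodes) connected to the data subject areas they draw on (edges).
edges = [
    ("Readmission Risk", "Claims"),
    ("Readmission Risk", "Clinical Notes"),
    ("Readmission Risk", "Lab Results"),
    ("Fraud Detection", "Claims"),
    ("Fraud Detection", "Provider Network"),
    ("Patient Churn", "Call Center"),
    ("Patient Churn", "Claims"),
]
G.add_edges_from(edges)

# Degree centrality: the number of links incident upon a node,
# normalized by (n - 1) in networkx.
centrality = nx.degree_centrality(G)

for node, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{node:20s} {score:.2f}")
```

In this toy example, "Claims" would surface as the most central node, which is the kind of cross-functional signal the exploration described here is looking for.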

The high-level sample data analytics graph demonstrates the cross-functional Degree Centrality of analytics from an industry-specific perspective (healthcare). It also underscores, from an industry perspective, the need for organizations to build an analytical ecosystem that can easily harness this cross-functional Degree Centrality of data analytics. (Learn more about Teradata’s Unified Data Architecture.)

In the second part of this blog post series we will walk through a zoomed-in view of the graph, analyze the Degree Centrality measurements for sample analytics, and draw some high-level data architecture implications.

[1] https://hbr.org/2014/11/how-smart-connected-products-are-transforming-competition

[2] Gephi is a tool to explore and understand graphs. It is a complementary tool to traditional statistics.

[3] Degree centrality is defined as the number of links incident upon a node (i.e., the number of ties that a node has).


Ojustwin Naik (MBA, JD) is a Director with 15 years of experience in planning, development, and delivery of analytics. He has experience across multiple industries and is passionate about nurturing a culture of innovation based on clarity, context, and collaboration.

 

(This post discusses the results[1] of Forrester Consulting’s research examining the economic impact and ROI of an online retailer that implemented a Teradata analytics solution. The new integrated platform runs big data analytics and replaces the retailer’s third-party analytics solution.)

What is more beneficial? The quantifiable pay out of a big data solution...or the resulting improvements in corporate culture like encouraging innovation and increased productivity?

In this case, the big data solution is a Teradata Aster Discovery Platform.  Previously, the retailer relied upon a third-party web solution for analytics – which was inefficient, difficult to manage and not at all scalable. And with its limited IT support staff and ever exploding business requirements, the online business needed an easy-to-manage big data analytics solution able to handle its compiled customer data.

The platform has the ability to analyze and manage unstructured data, plus data visualization tools to help illuminate key business insights and improve marketing efficiency. And, it’s easy on labor costs. Because of the platform’s ready-to-use functionality, acquiring data, performing analysis and presenting output can be done by a wide variety of IT skill sets and resources. The organization does not need a full team of expensive data scientists and engineers to manipulate and use data.

Does it pay out? Forrester confirmed the retailer’s increases in new customer conversions, overall sales, savings from IT and end user productivity...all resulting in a direct impact to net bottom line profits.

“For us, it has been relatively easy to monetize and justify our investment in Aster Discovery Platform; the changes that have resulted from the product have offered us much increase in revenue.”

~Director of data engineering, online retailer

The cultural intangibles? The retailer estimates that 20% of its total employees (both IT and business) see a direct benefit: a gradual increase in their productivity from Year 1 to Year 3, due to how quickly business insights can be generated and business practices optimized.

Performance throughout the organization improved dramatically. With the Aster Discovery Platform, the online retailer avoids multi-step, non-scalable procedures to run analytics and instead can just type a simple query. The organization’s planning process has become tighter. Better forecasts and predictions using predictive insights allow the organization more efficiency within the product life cycle, delivering noticeable impact across a variety of measures like incremental sales, customer retention and customer satisfaction.

“We have a lot of test cases and product changes that we have been able to make internally as a result of the analytics that is taking place on the platform.”

~Director of data engineering, online retailer

The Teradata Aster Discovery Platform is the industry’s first next-generation, integrated discovery platform. Designed to provide high-impact insights, it makes performing powerful and complex analyses across big data sets easy and accessible.

[1] The Total Economic Impact™ of Teradata Aster Discovery Platform: Cost Savings and Business Benefits Enabled by Implementing Teradata’s Aster Discovery Platform. October 2014. © 2014, Forrester Research, Inc.

Learn more about Teradata’s big data analytics solutions.

Big Data Apps Solve Industry Big Data Issues

Posted on: February 13th, 2015 by Manan Goel

 

Leverage the power of big data to solve stubborn industry issues faster than your competition

Big data solutions – easier, faster and simpler – are today’s best means of securing an advantage in 2015. If there were a way to quickly and easily leverage big data to address the nagging issues in your industry – like shopping cart abandonment for retail or churn for wireless carriers – wouldn’t that be appealing?

What if the leader in big data analytics told you that insights into the problems in your industry and organization could be in your hands and operational within a matter of weeks...and with far less complexity than big data solutions available just 6 months or a year ago?

Teradata has developed a collection of purpose-built analytical apps that address opportunities surrounding things like behavioral analytics, customer churn, marketing attribution and text analytics. These apps are built using the Teradata Aster AppCenter and were intentionally developed to solve pressing big data business challenges.

Industries covered include consumer financial, communications and cable,  healthcare (payer, provider and pharmaceutical), manufacturing, retail, travel and hospitality, and entertainment and gaming.  Review the following to see if your organization’s issues are covered.

  • Consumer Financial – Fraud Networks and Customer Paths.
  • Communications – Network Finder, Paths to Churn and Customer Satisfaction.
  • Cable – Customer Behavioral Segmentation and Customer Paths.
  • Healthcare – Paths to Surgery, Admission Diagnosis Procedure Paths, HL7 Parser, Patient Affinity & Length of Stay, Patient Compare, Impact Analysis and Drug Prescription Affinity Analysis.
  • Retail – Paths to Purchase, Attribution (multi-channel), Shopping Cart Abandonment, Checkout Flow Analysis, Website Flow Analysis, Customer Product Review Analysis and Market Basket & Product Recommendations.
  • Travel & Hospitality – Customer Review Text Analysis, Website Conversion Paths, Diminishing Loyalty, Customer Review Sentiment Analysis.
  • Entertainment & Gaming – Companion Matcher, Diminishing Playing Time, Network Finder and Paths to Churn.

More and more, organizations – and in particular business users and senior management – understand the value of and opportunities nested in big data. But across the enterprise, managers struggle with a number of hurdles, like large upfront investments, labor demands (both resource time and specialized skills), and a perceived glacial movement toward real insights and operational analytics.

Now the biggest hurdles are removed. Big data apps tackle the investment, time and skills gap. Configured to your organization by Teradata professional services, the apps enable quick access to self-service analytics and discovery across the organization.  Big data apps allow for lower upfront investment and faster time to value – a matter of weeks, not months or years. How? Industry accepted best practices, analytic logic, schema, visualization options and interfaces are all captured in the pre-built templates.

Enter the world of big data in a big way. Tackle your biggest issues easily. Realize value faster. Let the excitement of discovery with big data help analytics infiltrate your organization. Momentum is a powerful driver in instilling a culture of innovation.

Learn more about big data and big data solutions.

LA kicks off the 2014 Teradata User Group Season

Posted on: April 22nd, 2014 by Guest Blogger

 

By Rob Armstrong,  Director, Teradata Labs Customer Briefing Team

After presenting for years at the Teradata User Group meetings, it was refreshing to see some changes in this roadshow. While I had my usual spot on the agenda to present Teradata’s latest database release (15.0), we had some hot new topics, including Cloud and Hadoop; more business-level folks were there; more companies were researching Teradata’s technology (vs. just current users); and there was a hands-on workshop the following day for the more technically inclined, walking through real-world Unified Data Architecture™ (UDA) use cases of a Teradata customer. While LA tends to be a smaller venue than most, the room was packed and we had 40% more attendees than last year.

I would be remiss if I did not give a big thanks to the partner sponsors of the user group meeting. In LA we had Hortonworks and Dot Hill as our gold and silver sponsors. I took a few minutes to chat with them and found out some interesting upcoming items. Most notably, Lisa Sensmeier from Hortonworks talked to me about Hadoop Summit, which is coming up in San Jose, June 3-5. Jim Jonez, from Dot Hill, gave me the latest on their newest “Ultra Large” disk technology, in which they’ll have 48 1 TB drives in a single 2U rack. It is not in the Teradata lineup yet, but we are certainly intrigued for the proper use case.

Now, I’d like to take a few minutes to toot my own horn about the Teradata Database 15.0 presentation that had some very exciting elements to help change the way users get to and analyze all of their data.  You may have seen the recent news releases, but if not, here is a quick recap:

  • Teradata 15.0 continues our Unified Data Architecture with the new Teradata QueryGrid. This is the new environment to define and access data from Teradata to other data servers such as Apache Hadoop (Hortonworks), the Teradata Aster Discovery Platform, Oracle, and others, and it lays the foundation for an extension to even more foreign data servers. 15.0 simplifies the whole definition and usage, and adds bi-directional data movement and predicate pushdown. In a related session, Cesar Rojas provided some good recent examples of customers taking advantage of the entire UDA ecosystem, where data from all of the Teradata offerings was used together to generate new actions.
  • The other big news in 15.0 is the inclusion of the JSON data type. This allows customers to store JSON documents directly in a column and then apply “schema on read” for much greater flexibility with greatly reduced IT resources. As the JSON document changes, no table or database changes are necessary to absorb the new content (see the sketch below for the general idea).
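As a hedged illustration of the schema-on-read idea, the plain-Python sketch below (not Teradata SQL, and not the 15.0 JSON syntax) stores documents as raw JSON strings and picks fields out only at read time, so a new attribute appearing in later documents requires no schema change. All records and field names are made up.

```python
# Illustrative only: the "schema on read" idea behind a JSON column.
# Documents are stored as-is; fields are extracted at query time,
# so new attributes need no table change. Example records are invented.
import json

stored_rows = [
    '{"customer_id": 1, "event": "page_view", "page": "/fees"}',
    '{"customer_id": 2, "event": "call", "duration_sec": 540}',
    # A later document adds a brand-new attribute; nothing upstream changes.
    '{"customer_id": 3, "event": "call", "duration_sec": 45, "sentiment": "angry"}',
]

def read_field(raw_doc: str, field: str, default=None):
    """Pull a field out of a raw JSON document at read time."""
    doc = json.loads(raw_doc)
    return doc.get(field, default)

for row in stored_rows:
    print(read_field(row, "event"), read_field(row, "sentiment", "n/a"))
```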

Keep your eyes and ears open for the next Teradata User Group event coming your way, or better yet, just go to the webpage: http://www.teradata.com/user-groups/ to see where the bus stops next and to register.  The TUGs are free of charge.  Perhaps we’ll cross paths as I make the circuit? Until then, ‘Keep Calm and Analyze On’ (as the cool kids say).

 Since joining Teradata in 1987, Rob Armstrong has worked in all areas of the data warehousing arena.  He has gone from writing and supporting the database code to implementing and managing systems at some of Teradata’s largest and most innovative customers.  Currently Rob provides sales and marketing support by traveling the globe and evangelizing the Teradata solutions.

Change and “Ah-Ha Moments”

Posted on: March 31st, 2014 by Ray Wilson

 

This is the first in a series of articles discussing the inherent nature of change and some useful suggestions for helping operationalize those “ah-ha moments."

Nobody has ever said that change is easy. It is a journey full of obstacles. But those obstacles are not impenetrable, and with the right planning and communication, many of them can be cleared away, making a more defined path for change to follow.

So why do we so often see failures that could have been avoided if obvious changes had been addressed before the problem occurred? The data was analyzed, and yet nobody acted on these insights. Why does the organization fail to, as I call it, operationalize the ah-ha moment? Was it a conscious decision?

From the outside looking in it is easy to criticize organizations for not implementing obvious changes. But from the inside, there are many issues that cripple the efforts of change, and it usually boils down to time, people, process, technology or financial challenges.  

Companies make significant investments in business intelligence capabilities because they realize that hidden within the vast amounts of information they generate on a daily basis, there are jewels to be found that can provide valuable insights for the entire organization. For example, with today’s analytic platforms, business analysts in the marketing department have access to sophisticated tools that can mine information and uncover reasons for the high rate of churn occurring in their customer base. They might do this by analyzing all interactions and conversations taking place across the enterprise and the channels where customers engage the company. Using this data, analysts then begin to see various paths and patterns emerging from these interactions that ultimately lead to customer churn.
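As a rough sketch of the kind of path analysis described above, the following Python snippet (with invented sample events) orders each customer's interactions, takes the events immediately preceding a churn event, and counts the most common sequences. A real analysis on an analytic platform would of course operate over far more data and channels.

```python
# Toy path-to-churn analysis: group events by customer, keep the events just
# before churn, and count the most common paths. Sample events are invented.
from collections import Counter, defaultdict

# (customer_id, timestamp, event)
events = [
    (1, 1, "dropped_call"), (1, 2, "support_call"), (1, 3, "churn"),
    (2, 1, "bill_dispute"), (2, 2, "support_call"), (2, 3, "churn"),
    (3, 1, "dropped_call"), (3, 2, "support_call"), (3, 3, "churn"),
    (4, 1, "page_view"),    (4, 2, "purchase"),
]

by_customer = defaultdict(list)
for cust, ts, event in sorted(events, key=lambda e: (e[0], e[1])):
    by_customer[cust].append(event)

paths_to_churn = Counter()
for history in by_customer.values():
    if history and history[-1] == "churn":
        # The two events immediately preceding churn form the "path".
        paths_to_churn[tuple(history[-3:-1])] += 1

for path, count in paths_to_churn.most_common():
    print(" -> ".join(path), count)
```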

These analysts have just discovered the leading causes of churn within their organization and are at the apex of the ah-ha moment. They now have the insights to stop the mass exodus of valuable customers and positively impact the bottom line. It seems obvious that these insights would be acted upon and operationalized immediately, but that may not be the case. Perhaps the recently discovered patterns leading to customer churn touch so many internal systems, processes and organizations that getting organizational buy-in to the necessary changes is mired down in an endless series of internal meetings.

So what can be done given these realities? Here’s a quick list of tips that will help you enable change in your organization:

  • Someone needs to own the change and then lead rather than letting change lead him or her.
  • Make sure the reasons for change are well documented including measurable impacts and benefits for the organization.
  • When building a change management plan, identify the obstacles in the organization and make sure to build a mitigation plan for each.
  • Communicate the needed changes through several channels.
  • Be clear when communicating change. Rumors can quickly derail or stall well thought out and planned change efforts.
  • When implementing changes make sure that the change is ready to be implemented and is fully tested.
  • Communicate the impact of the changes that have been deployed.  
  • Have enthusiastic people on the team and train them to be agents of change.
  • Establish credibility by building a proven track record that will give management the confidence that the team has the skills, creativity and discipline to implement these complex changes. 

Once implemented, monitor the changes closely and anticipate that some changes will require further refinement. Remember that operationalizing the ah-ha moment is a journey – a journey that can bring many valuable and rewarding benefits along the way.

So, what’s your experience with operationalizing your "ah-ha moment"?

Ready to Explore Data Connections?

Posted on: March 20th, 2014 by Guest Blogger

 

By Bill Franks, Chief Analytics Officer

I’ll be participating in the Data Discovery In Action virtual event this next Thursday, March 27. The focus of the event is how to combine various processing paradigms and analytic techniques to maximize the ability of an organization to discover and deploy new high impact analytics. During my talk, I’ll outline what it takes to make discovery a core competency in your organization. I’ll discuss topics ranging from why you need a discovery platform, to how a discovery platform lowers risk while enabling innovation, to staffing and organizational issues, to important cultural considerations.

The goal of my talk is to get attendees’ mindsets in the right place in advance of the presentations that follow. The importance of data discovery is rising with the onslaught of big data. It is necessary for organizations to shift from traditional, siloed, and distinct discovery environments to one that is fully integrated within a unified analytics environment. I think the virtual event has a terrific lineup of speakers and topics to explain how to do this. Attendees will be provided everything from high-level strategic advice to actual demonstrations of discovery in action. I encourage you to take the time to attend. You can register for the event here. I look forward to seeing you (virtually).

Get to know Bill Franks.

 

Farrah Bostic, presenting a message encouraging both skepticism and genuine intimacy, was one of the most provocative speakers at Strata 2014 in Santa Clara earlier this year. As founder of The Difference Engine, a Brooklyn, NY-based agency that helps companies with research and digital and product strategy, Bostic warns her clients away from research that seems scientific but doesn’t create a clear model of what customers want.

Too often, Bostic says, numbers are used to paint a picture of a consumer, someone playing a limited role in an interaction with a company. The goal of the research is to figure out how to “extract value” from the person playing that role. Bostic suggests that “People are data too.” Instead of performing research to confirm your strategy, Bostic recommends using research to discover and attack your biases. It is better to create a more complete, genuine understanding of a customer, and then figure out how your product or service can provide value that makes their lives better and helps them achieve their goals.

After hearing Bostic speak, I had a conversation with Dave Schrader, director of marketing and strategy at Teradata, about how to bring a better model of the customer to life. As Scott Gnau, president of Teradata Labs, and I pointed out in “How to Stop Small Thinking from Preventing Big Data Victories,” one of the key ways big data creates value is by improving the resolution of the models used to run a business. Here are some of the ways that models of the customer can be improved.

The first thing that Schrader recommends is to focus on the levers of the business. “What actions can you take? What value will those actions provide? How can those actions affect the future?” said Schrader. This perspective helps focus attention on the part of the model that is most important.

Data then should be used to enhance the model in as many ways as possible. “In a call center, for example, we can tell if someone is pressing the zero button over and over again,” said Schrader. “This is clearly an indication of frustration. If that person is a high value customer, and we know from the data that they just had a bad experience – like a dropped call with a phone company, or 10 minutes on the banking fees page before calling, it makes sense to raise an event and give them special attention. Even if they aren’t a big spender, something should be done to calm them down and make sure they don’t churn.” Schrader suggests that evidence of customer mood and intent can be harvested in numerous ways, through voice and text analytics and all sorts of other means.
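A toy sketch of the rule Schrader describes might look like the following (Python; the field names, thresholds, and value tiers are invented for illustration, not Teradata functionality): a frustration signal is combined with customer value and a recent bad experience to decide whether to raise an event.

```python
# Toy escalation rule: repeated zero presses plus customer value plus a recent
# bad experience trigger an alert. All fields and thresholds are invented.
from dataclasses import dataclass

@dataclass
class CallContext:
    customer_value: str          # e.g. "high", "medium", "low"
    zero_presses: int            # how many times the caller hit 0 in the IVR
    recent_bad_experience: bool  # e.g. dropped call, long stay on the fees page

def should_escalate(ctx: CallContext) -> bool:
    frustrated = ctx.zero_presses >= 3
    if frustrated and ctx.customer_value == "high":
        return True
    # Lower-value customers still get attention if frustration follows a bad experience.
    return frustrated and ctx.recent_bad_experience

print(should_escalate(CallContext("high", 4, False)))  # True
print(should_escalate(CallContext("low", 5, True)))    # True
print(should_escalate(CallContext("low", 1, True)))    # False
```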

“Of course, you should be focused on what you know and how to do the most with that,” said Schrader. “But you should also be spending money or even 10% of your analyst’s time to expand your knowledge in ways that help you know what you don’t know.” Like Bostic, Schrader recommends that experiments be done to attack assumptions, to find the unknown unknowns.

To really make progress, Schrader recommends finding ways to break out of routine thinking. “Why should our analysts be chosen based on statistical skills alone?” asks Schrader. “Shouldn’t we find people who are creative and empathetic, who will help us think new thoughts and challenge existing biases? Of course we should.” Borrowing from the culture of development, Schrader suggests organizing data hack-a-thons to create a safe environment for wild curiosity. “Are you sincere in wanting to learn from data? If so, you will then tolerate failure that leads to learning,” said Schrader.

Schrader also recommends being careful about where in an organization to place experts such as data scientists. “You must add expertise in areas that will maximize communication and lead to storytelling,” said Schrader. In addition, he recommends having an open data policy wherever possible to encourage experimentation.

In my view, Bostic and Schrader are both crusaders who seek to institutionalize the spirit of the skeptical gadfly. It is a hard trick to pull off, but one that pays tremendous dividends.

By: Dan Woods, Forbes Blogger and Co-Founder of Evolved Media

Big Apple Hosts the Final Big Analytics Roadshow of the Year

Posted on: November 26th, 2013 by Teradata Aster

 

Speaking of ending things on a high note, New York City on December 6th will play host to the final event in the Big Analytics 2013 Roadshow series. Big Analytics 2013 New York is taking place at the Sheraton New York Hotel and Towers in the heart of Midtown on bustling 7th Avenue.

Reflecting on the illustrious journey of the Big Analytics 2013 Roadshow – kicking off in San Francisco, then traveling through major international destinations including Atlanta, Dallas, Beijing, Tokyo and London, and finally culminating in the Big Apple – it truly encapsulated today’s appetite for collecting, processing, understanding and analyzing data.

Big Analytics Roadshow 2013 stops in Atlanta

Drawing business & technical audiences across the globe, the roadshow afforded the attendees an opportunity to learn more about the convergence of technologies and methods like data science, digital marketing, data warehousing, Hadoop, and discovery platforms. Going beyond the “big data” hype, the event offered learning opportunities on how technologies and ideas combine to drive real business innovation. Our unyielding focus on results from data is truly what made the events so successful.

Continuing the rich lineage of delivering quality Big Data information, the New York event promises to pack a tremendous amount of Big Data learning & education. The keynotes for the event include such industry luminaries as Dan Vesset, Program VP of Business Analytics at IDC; Tasso Argyros, Senior VP of Big Data at Teradata; and Peter Lee, Senior VP of Tibco Software.

Teradata team at the Dallas Big Analytics Roadshow


The keynotes will be followed by three tracks: Big Data Architecture, Data Science & Discovery, and Data-Driven Marketing. Each of these tracks will feature industry luminaries like Richard Winter of WinterCorp, John O’Brien of Radiant Advisors and John Lovett of Web Analytics Demystified. They will be joined by vendor presentations from Shaun Connolly of Hortonworks, Todd Talkington of Tableau and Brian Dirking of Alteryx.

As with every Big Analytics event, it presents an exciting opportunity to hear first hand from leading organizations like Comcast, Gilt Groupe & Meredith Corporation on how they are using Big Data Analytics & Discovery to deliver tremendous business value.

In summary, the event promises to be nothing less than the Oscars of Big Data and will bring together the who’s who of the Big Data industry. So, mark your calendars, pack your bags and get ready to attend the biggest Big Data event of the year.

Teradata’s UDA is to Data as Prius is to Engines

Posted on: November 12th, 2013 by Teradata Aster

 

I’ve been working in the analytics and database market for 12 years. One of the most interesting pieces of that journey has been seeing how the market is ever-shifting. Both the technology and business trends during these short 12 years have massively changed not only the tech landscape today, but also the future evolution of analytic technology. From a “buzz” perspective, I’ve seen “corporate initiatives” and “big ideas” come and go. Everything from “e-business intelligence,” which was a popular term when I first started working at Business Objects in 2001, to corporate performance management (CPM) and “the balanced scorecard.” From business process management (BPM) to “big data”, and now the architectures and tools that everyone is talking about.

The one golden thread that ties each of these terms, ideas and innovations together is that each is aiming to solve the questions related to what we are today calling “big data.” At the core of it all, we are searching for the right way to let the explosion of data and analytics that today’s organizations face be harnessed and understood. People call this the “logical data warehouse”, “big data architecture”, “next-generation data architecture”, “modern data architecture”, “unified data architecture”, or (I just saw last week) “unified data platform”. What is all the fuss about, and what is really new? My goal in this post and the next few will be to explain how the customers I work with are attacking the “big data” problem. We call it the Teradata Unified Data Architecture, but whatever you call it, the goals and concepts remain the same.

Mark Beyer from Gartner is credited with coining the term “logical data warehouse”, and there is an interesting story and explanation behind it. A nice summary of the term is:

“The logical data warehouse is the next significant evolution of information integration because it includes ALL of its progenitors and demands that each piece of previously proven engineering in the architecture should be used in its best and most appropriate place. …”

and

“… The logical data warehouse will finally provide the information services platform for the applications of the highly competitive companies and organizations in the early 21st Century.”

The idea of this next-generation architecture is simple: When organizations put ALL of their data to work, they can make smarter decisions.

It sounds easy, but as data volumes and data types explode, so does the need for more tools in your toolbox to help make sense of it all. Within your toolbox, data is NOT all nails and you definitely need to be armed with more than a hammer.

In my view, enterprise data architectures are evolving to let organizations capture more data. The data was previously untapped because the hardware costs required to store and process such an enormous amount of data were simply too high. However, the declining cost of hardware (thanks to Moore’s law) has opened the door for more data (types, volumes, etc.) and processing technologies to be successful. But no single technology can be engineered and optimized for every dimension of analytic processing, including scale, performance or concurrent workloads.

Thus, organizations are creating best-of-breed architectures by taking advantage of new technologies and workload-specific platforms such as MapReduce, Hadoop, MPP data warehouses, discovery platforms and event processing, and putting them together into a seamless, transparent and powerful analytic environment. This modern enterprise architecture enables users to get deep business insights and allows ALL data to be available to an organization, creating competitive advantage while lowering the total system cost.

But why not just throw all your data into files and put a search engine like Google on top? Why not just build a data warehouse and extend it with support for “unstructured” data? Because, in the world of big data, the one-size-fits-all approach simply doesn’t work.

Different technologies are more efficient at solving different analytical or processing problems. To steal an analogy from Dave Schrader—a colleague of mine—it’s not unlike a hybrid car. The Toyota Prius can average 47 mpg with hybrid (gas and electric) vs. 24 mpg with a “typical” gas-only car – almost double! But you do not pay twice as much for the car.

How’d they do it? Toyota engineered a system that uses gas when I need to accelerate fast (and also to recharge the battery at the same time), electric mostly when driving around town, and braking to recharge the battery.

Three components integrated seamlessly – the driver doesn’t need to know how it works. It is the same idea with the Teradata UDA, which is a hybrid architecture for extracting the most insights per unit of time – at least doubling your insight capabilities at reasonable cost. And, business users don’t need to know all of the gory details. Teradata builds analytic engines—much like the hybrid drive train Toyota builds—that are optimized and used in combinations with different ecosystem tools depending on customer preferences and requirements, within their overall data architecture.

In the case of the hybrid car, battery power and braking systems, which recharge the battery, are the “new innovations” combined with gas-powered engines. Similarly, there are several innovations in data management and analytics that are shaping the unified data architecture, such as discovery platforms and Hadoop. Each customer’s architecture is different depending on requirements and preferences, but the Teradata Unified Data Architecture recommends three core components of a comprehensive architecture – a data platform (often called a “Data Lake”), a discovery platform and an integrated data warehouse. There are other components, such as event processing, search, and streaming, which can be used in data architectures, but I’ll focus on the three core areas in this blog post.

Data Lakes

In many ways, this is not unlike the operational data store we’ve seen between transactional systems and the data warehouse, but the data lake is bigger and less structured. Any file can be “dumped” in the lake with no attention to data integration or transformation. New technologies like Hadoop provide a file-based approach to capturing large amounts of data without requiring ETL in advance. This enables large-scale data processing for data refining, structuring, and exploring data prior to downstream analysis in workload-specific systems, which are used to discover new insights and then move those insights into business operations for use by hundreds of end-users and applications.

Discovery Platforms

Discovery platforms are a new class of workload-specific system, optimized to perform multiple analytic techniques in a single workflow, combining SQL with statistics, MapReduce, graph, or text analysis to look at data from multiple perspectives. The goal is to ultimately provide more granular and accurate insights to users about their business. Discovery platforms enable a faster investigative analytical process to find new patterns in data and identify different types of fraud or consumer behavior that traditional data mining approaches may have missed.

Integrated Data Warehouses

With all the excitement about what’s new, companies quickly forget the value of consistent, integrated data for reuse across the enterprise. The integrated data warehouse has become a mission-critical operational system that is the point of value realization or “operationalization” for information. The data within a massively parallel data warehouse has been cleansed and provides a consistent source of data for enterprise analytics. By integrating relevant data from across the entire organization, a couple of key goals are achieved. First, organizations can answer the kind of sophisticated, impactful questions that require cross-functional analyses. Second, they can answer questions more completely by making relevant data available across all levels of the organization. Data lakes (Hadoop) and discovery platforms complement the data warehouse by enriching it with new data and new insights that can now be delivered to thousands of users and applications with consistent performance (i.e., they get the information they need quickly).

A critical part of incorporating these novel approaches to data management and analytics is putting new insights and technologies into production in reliable, secure and manageable ways for organizations.  Fundamentals of master data management, metadata, security, data lineage, integrated data and reuse all still apply!

The excitement of experimenting with new technologies is fading. More and more, our customers are asking us about ways to put the power of new systems (and the insights they provide) into large-scale operation and production. This requires unified system management and monitoring, intelligent query routing, metadata about incoming data and the transformations applied throughout the data processing and analytical process, and role-based security that respects and applies data privacy, encryption and other required policies. This is where I will spend a good bit of time in my next blog post.