Analyzing your Analytics

January 26, 2016

In all my travels to hundreds of customer meetings, one theme pulses through them all like ocean waves: scalability. Every corporation I visit is challenged by the increasing scale of data and by its ability to wrench value from it. New kinds of data arrive every month, and analytics have entered the boardroom as a competitive imperative. As technologists, we at Teradata have a lot of this covered; it’s our birthright, our core competency. But something is broken. Analytics people are not scalable. People and budgets don’t scale up as fast as data. We can’t manufacture silicon data scientists. If we cannot grow the analytic staff to match the demands of the company, there are two choices. One: stop trying to keep up with analytic competitors and stagnate. Two: dramatically increase the productivity of existing analytic users.

Look at the way business analysts work, especially the on-boarding of new hires, and you find that most corporations rely on word-of-mouth and tribal knowledge to share experiences about their data and analytics. This means Alice-the-new-hire must struggle through 12 months of on-the-job training (i.e., learning from mistakes) to become productive. Do you have a reference guide for your corporate analytics to give to Alice? Are there internal blogs on how to calculate your key performance indicators? How can Alice find a subject matter expert to learn the right way to calculate net present value in your company? Enterprises need an internal collaboration platform where business users can learn everything they need to know about their corporate analytics. Word-of-mouth is not a good strategy. After all, your company paid huge sums to discover that knowledge, so why is it locked up in one person’s head? Analysts need to know where to access the right data, what transformations to make to keep metric definitions consistent, and what needs to be shared with whom. People need to learn from each other systematically. They need to leap-frog tasks by grabbing ideas and guidance from others who have done similar tasks. Newly hired business analysts need to tap into that corporate knowledge in seconds, not hours. In my view, we must build a future where business users get recommendations like “people who work with this data set also use this other data set.” Sound familiar? Applying the wisdom of crowds is what will enable decision making to scale up.
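To make the “people who work with this data set also use this other data set” idea concrete, here is a minimal sketch of a co-usage recommender. It is only an illustration of the general technique, not a description of how any product works; the `usage` log, the data set names, and the functions `build_co_usage` and `recommend` are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical usage log: the data sets each analyst has queried.
usage = {
    "alice": {"orders", "customers"},
    "bob": {"orders", "customers", "returns"},
    "carol": {"orders", "returns"},
    "dave": {"web_clicks"},
}

def build_co_usage(usage_by_user):
    """Count, for each pair of data sets, how many users work with both."""
    co_usage = defaultdict(Counter)
    for datasets in usage_by_user.values():
        for a, b in combinations(sorted(datasets), 2):
            co_usage[a][b] += 1
            co_usage[b][a] += 1
    return co_usage

def recommend(dataset, co_usage, top_n=3):
    """Return the data sets most often used alongside `dataset`."""
    neighbors = co_usage.get(dataset, Counter())
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]

co_usage = build_co_usage(usage)
print(recommend("orders", co_usage))  # ['customers', 'returns']
```

A real platform would presumably weight recent usage more heavily and respect access permissions, but even this simple count shows how crowd behavior can point a new hire toward related data.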

Collaborative ideation is part of Teradata’s Sentient Enterprise vision describing the evolution of analytic ecosystems over the next decade. We have plenty of technical goals to achieve, but they won’t be successful without an evolution in corporate culture too. Collaborative ideation is aimed at accelerating business analyst productivity. It combines a software platform with crowdsourcing techniques. Imagine a platform for analyzing your analytics. This will allow us to capture knowledge and prior work and to automate many discoveries. To kick-start this platform, it must be automatically populated with a lot of information on how users actually use analytic tools. The business analysts then add their personal knowledge to this platform, somewhat like a combined LinkedIn and Wikipedia for analytics.

Teradata is partnering with Alation to provide a collaborative ideation platform to our customers. It fits precisely into Teradata’s Unified Data Architecture (UDA) because of its multi-system support strategy. Alation connects to the Teradata Data Warehouse, Hadoop distributions and, soon, Aster Analytics. See 5 Stages in Becoming a Sentient Enterprise for a short introduction.

On first examination, technical people will think Alation is a metadata tool. Alation automatically builds its catalog by combining metadata from MySQL, Teradata, Hive, BI tools, and other sources. It collects information on business users, queries, who uses the queries, and schemas. Because of the data collected, data scientists and programmers will enjoy using the data lineage view, the automatic SQL syntax recommendations, and the popularity scores for each column in a table. Something I especially like is the data temperature analysis at the column level. Data stewards will like being able to prioritize their work, because they get consistent insight into which heavily used tables and columns should be documented first. Governance teams can track how data sets and users relate to each other. And yes, there is security built in using OAuth, LDAP, and Active Directory. It even leverages security permissions from the RDBMS to protect access to table data.
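As a rough illustration of how column popularity could guide a data steward’s priorities, the sketch below counts column references in a query log and lists the undocumented columns, most used first. It assumes the log has already been parsed into (user, table, column) records; the log contents and the functions `column_popularity` and `documentation_queue` are hypothetical, and this is not Alation’s actual scoring method.

```python
from collections import Counter

# Hypothetical, already-parsed query log: one (user, table, column)
# record for every column reference found in a query.
query_log = [
    ("alice", "sales", "amount"),
    ("bob", "sales", "amount"),
    ("bob", "sales", "region"),
    ("carol", "sales", "amount"),
    ("carol", "customers", "segment"),
]

def column_popularity(log):
    """Count how many times each (table, column) pair is referenced."""
    return Counter((table, column) for _, table, column in log)

def documentation_queue(log, documented=frozenset()):
    """List undocumented columns, most heavily used first."""
    counts = column_popularity(log)
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [column for column, _ in ranked if column not in documented]

already_documented = {("sales", "region")}
for table, column in documentation_queue(query_log, already_documented):
    print(f"{table}.{column}")
```

With this sample log, `sales.amount` comes first because three queries reference it, which is exactly the kind of signal that tells a steward where documentation effort pays off soonest.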

If you conclude this is an interesting metadata tool, you are half right. The automated capture of multi-system metadata is the first half of the solution. Business user productivity and crowdsourcing are the other half. Business users, programmers, and data scientists can tag items in the catalog. They post things like data descriptions, how to calculate a KPI, or pitfalls in working with the data. They can attach articles, best practices, tips, even SQL queries to the catalog. Like a wiki, the catalog helps people capture their hard-won research where everyone else can see it. And it doesn’t get lost when someone leaves the company.

Alation also provides subject matter expert searches. If Alice-the-new-hire does a search and finds that Bob uses Table-Y extensively, she can go talk to Bob regarding her Table-Y reports. Or maybe Alice-the-new-hire needs to research the differences between a Hadoop table and the data warehouse table it was derived from. With Alation, these tasks become easy. Alice solves these problems in minutes instead of spending days and weeks making mistakes.
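The expert search in the Alice and Bob example can be approximated by ranking users by how often they query a given table. The sketch below assumes a simple log of (user, table) records; the log contents and the function `find_experts` are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical query log: one (user, table) record per query.
query_log = [
    ("bob", "table_y"),
    ("bob", "table_y"),
    ("bob", "table_x"),
    ("alice", "table_y"),
    ("carol", "table_y"),
    ("carol", "table_y"),
    ("bob", "table_y"),
]

def find_experts(table, log, top_n=2):
    """Return the users who query `table` most often, heaviest first."""
    counts = Counter(user for user, queried in log if queried == table)
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [user for user, _ in ranked[:top_n]]

print(find_experts("table_y", query_log))  # ['bob', 'carol']
```

Query counts are only a proxy for expertise, so a fuller version might also consider who wrote the documentation or whose queries others reuse.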

Some might wonder if this Silicon Valley startup can deliver on such ambitious goals. In my mind, being the collaboration hub for tens of petabytes of data spread across a million tables and 900 eBay analysts establishes Alation’s credibility. (See the eBay video.) There seems to be a natural fit, a symbiosis between Teradata customers and Alation. Consequently, Alation is already installed in other major Teradata sites like Safeway, PayPal and GoDaddy. The number of Teradata customers interested in Alation has exceeded our expectations.

Scalability is surely a challenge in many dimensions. When we had orders of magnitude less data, we had orders of magnitude fewer productivity concerns. Collaborative ideation is an elegant way to benefit from the expansion of data and analytics and not be trampled by it. Programmers collaborate on GitHub. Sales and Marketing collaborate with Jive software. I recommend business analysts use Alation for the same reason: it helps people help each other.
