Data scientist

 

Farrah Bostic, presenting a message encouraging both skepticism and genuine intimacy, was one of the most provocative speakers at Strata 2014 in Santa Clara earlier this year. As founder of The Difference Engine, a Brooklyn, NY-based agency that helps companies with research and digital and product strategy, Bostic warns her clients away from research that seems scientific but doesn’t create a clear model of what customers want.

Too often, Bostic says, numbers are used to paint a picture of a consumer, someone playing a limited role in an interaction with a company. The goal of the research is to figure out how to “extract value” from the person playing that role. Bostic suggests that “People are data too.” Instead of performing research to confirm your strategy, Bostic recommends using research to discover and attack your biases. It is better to build a genuine, more complete understanding of a customer and then figure out how your product or service can provide value that makes that person’s life better and helps them achieve their goals.

After hearing Bostic speak, I had a conversation with Dave Schrader, director of marketing and strategy at Teradata, about how to bring a better model of the customer to life. As Scott Gnau, president of Teradata Labs, and I pointed out in “How to Stop Small Thinking from Preventing Big Data Victories,” one of the key ways big data creates value is by improving the resolution of the models used to run a business. Here are some of the ways that models of the customer can be improved.

The first thing that Schrader recommends is to focus on the levers of the business. “What actions can you take? What value will those actions provide? How can those actions affect the future?” said Schrader. This perspective helps focus attention on the part of the model that is most important.

Data should then be used to enhance the model in as many ways as possible. “In a call center, for example, we can tell if someone is pressing the zero button over and over again,” said Schrader. “This is clearly an indication of frustration. If that person is a high value customer, and we know from the data that they just had a bad experience (like a dropped call with a phone company, or 10 minutes on the banking fees page before calling), it makes sense to raise an event and give them special attention. Even if they aren’t a big spender, something should be done to calm them down and make sure they don’t churn.” Schrader suggests that evidence of customer mood and intent can be harvested in numerous ways, through voice and text analytics and all sorts of other means.
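Schrader's call center example can be read as a simple triage rule. The sketch below is only an illustration of that reading, not anything Teradata ships: the `CallSession` fields, the `triage` function, the outcome labels and the threshold of three consecutive zero presses are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class CallSession:
    """Hypothetical record of one call; every field name is an assumption."""
    customer_id: str
    keypresses: str              # keys pressed in the phone menu, e.g. "1000"
    high_value: bool             # is this a high value customer?
    recent_bad_experience: bool  # e.g. a dropped call or a long visit to the fees page


def max_run(keys: str, key: str = "0") -> int:
    """Return the length of the longest consecutive run of `key` in `keys`."""
    best = run = 0
    for pressed in keys:
        run = run + 1 if pressed == key else 0
        best = max(best, run)
    return best


def triage(session: CallSession, zero_threshold: int = 3) -> str:
    """Decide how to treat a caller; the threshold is an assumed tuning knob."""
    frustrated = max_run(session.keypresses) >= zero_threshold
    if not frustrated:
        return "normal"
    if session.high_value and session.recent_bad_experience:
        # High value customer who just had a bad experience: raise an event.
        return "priority_attention"
    # Frustrated but not flagged above: still act so they don't churn.
    return "retention_follow_up"


if __name__ == "__main__":
    print(triage(CallSession("c1", "10000", True, True)))
    print(triage(CallSession("c2", "000", False, False)))
    print(triage(CallSession("c3", "1203", True, True)))
```

In practice the frustration signal would more likely come from several sources at once (voice and text analytics, as Schrader notes), with the repeated zero press being just one input.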

“Of course, you should be focused on what you know and how to do the most with that,” said Schrader. “But you should also be spending money or even 10% of your analyst’s time to expand your knowledge in ways that help you know what you don’t know.” Like Bostic, Schrader recommends that experiments be done to attack assumptions, to find the unknown unknowns.

To really make progress, Schrader recommends finding ways to break out of routine thinking. “Why should our analysts be chosen based on statistical skills alone?” asks Schrader. “Shouldn’t we find people who are creative and empathetic, who will help us think new thoughts and challenge existing biases? Of course we should.” Borrowing from the culture of development, Schrader suggests organizing data hack-a-thons to create a safe environment for wild curiosity. “Are you sincere in wanting to learn from data? If so, you will then tolerate failure that leads to learning,” said Schrader.

Schrader also recommends being careful about where in an organization to place experts such as data scientists. “You must add expertise in areas that will maximize communication and lead to storytelling,” said Schrader. In addition, he recommends having an open data policy wherever possible to encourage experimentation.

In my view, Bostic and Schrader are both crusaders who seek to institutionalize the spirit of the skeptical gadfly. It is a hard trick to pull off, but one that pays tremendous dividends.

By: Dan Woods, Forbes Blogger and Co-Founder of Evolved Media

Big Apple Hosts the Final Big Analytics Roadshow of the Year

Posted on: November 26th, 2013 by Teradata Aster No Comments

 

Speaking of ending things on a high note, New York City on December 6th will play host to the final event in the Big Analytics 2013 Roadshow series. Big Analytics 2013 New York is taking place at the Sheraton New York Hotel and Towers in the heart of Midtown on bustling 7th Avenue.

The Big Analytics 2013 Roadshow has been an illustrious journey. After kicking off in San Francisco, this year the Roadshow traveled through major destinations including Atlanta, Dallas, Beijing, Tokyo and London before culminating in the Big Apple. It truly encapsulated today's appetite for collecting, processing, understanding and analyzing data.

Big Analytics Roadshow 2013 stops in Atlanta

Drawing business & technical audiences across the globe, the roadshow afforded the attendees an opportunity to learn more about the convergence of technologies and methods like data science, digital marketing, data warehousing, Hadoop, and discovery platforms. Going beyond the “big data” hype, the event offered learning opportunities on how technologies and ideas combine to drive real business innovation. Our unyielding focus on results from data is truly what made the events so successful.

Continuing the rich lineage of delivering quality Big Data information, the New York event promises to pack in a tremendous amount of Big Data learning & education. The keynotes for the event include such industry luminaries as Dan Vesset, Program VP of Business Analytics at IDC, Tasso Argyros, Senior VP of Big Data at Teradata & Peter Lee, Senior VP of Tibco Software.

Teradata team at the Dallas Big Analytics Roadshow


The keynotes will be followed by three tracks around Big Data Architecture, Data Science & Discovery & Data Driven Marketing. Each of these tracks will feature industry luminaries like Richard Winter of WinterCorp, John O’Brien of Radiant Advisors & John Lovett of Web Analytics Demystified. They will be joined by vendor presentations from Shaun Connolly of Hortonworks, Todd Talkington of Tableau & Brian Dirking of Alteryx.

As with every Big Analytics event, it presents an exciting opportunity to hear firsthand from leading organizations like Comcast, Gilt Groupe & Meredith Corporation on how they are using Big Data Analytics & Discovery to deliver tremendous business value.

In summary, the event promises to be nothing less than the Oscars of Big Data and will bring together the who’s who of the Big Data industry. So, mark your calendars, pack your bags and get ready to attend the biggest Big Data event of the year.

Introducing In-Database Visual MapReduce Functions

Posted on: February 20th, 2013 by Teradata Aster No Comments

 

Ever since Aster Data became part of Teradata a couple of years ago, we have been fortunate to have the resources and focus to accelerate our rate of product innovation. In the past 8 months alone, we have led the market in deploying big analytics on Hadoop and introducing an ultra-fast appliance for discovering big data insights. Our focus is to provide the market with the best big data discovery platform; that is, the most efficient, cost-effective, and enterprise-friendly way to extract valuable business insights from massive piles of structured and unstructured data.

Today I am excited to announce another significant innovation that extends our lead in this direction. For the first time, we are introducing in-database, SQL-MapReduce-based visualization functions, as part of the Teradata Aster Discovery Platform 5.10 software release. These are functions that take the output of an analytical process (either SQL or MapReduce) and create an interactive data visualization that can be accessed directly from our platform through any web browser. There are several functions that we are introducing with today's announcement, including functions that let you visualize flows of people or events, graphs, and arbitrary patterns. These functions complement your existing BI solution by extending the types of information you can visualize without adding the complexity of another BI deployment.

It took significant engineering effort, and innovation from our field in working with customers, to make a discovery platform produce in-database, in-process visualizations. So, why bother? Because these functions have three important characteristics: they are beautiful, powerful, and instant. Let me elaborate in reverse order.

Instant: the goal of a discovery platform like Aster’s is to accelerate the hypothesis --> analysis --> validation iteration process. One of the major big data challenges is that the data is so complex that you don't even know what questions to ask. So you start with tens or hundreds of possible questions that you need to quickly implement and validate until you find the couple of questions that extract the gold nuggets of information from the data. Besides analyzing the data, having access to instant visualizations can help data scientists and business analysts understand whether they are on the right path to finding the insights they're looking for. Being able to rapidly analyze and now visualize the insights in-process can accelerate the discovery cycle and cut an analyst's time and cost by more than 80%, as has been recently validated.

Powerful: Aster comes with a broad library of pre-built SQL-MapReduce functions. Some of the most powerful, like nPath, crunch terabytes of customer or event data and produce patterns of activity that yield significant insights in a single pass of the data, regardless of the complexity of the pattern or history being analyzed. In the past, visualizing these insights required a lot of work, even after the insight was generated, because there were no specialized visualization tools that could consume the insight as-is to produce the visualizations. Abstracting the insights in order to visualize them is sub-optimal because it kills the 'a-ha!' moment. With today’s announcement, we provide analysts with the ability to natively visualize concepts such as a graph of interactions or patterns of customer behavior with no compromises and no additional effort!

Beautiful: We all know that numbers and data are only as good as the story that goes with them. By having access to instant, powerful and also aesthetically beautiful in-database visualizations, you can do justice to your insights and communicate them effectively to the rest of the organization, whether that means business clients, executives, or peer analysts.

In addition, with this announcement we are introducing four buckets of pre-built SQL-MapReduce functions, i.e. Java functions that can be accessed through a familiar SQL or BI interface. These buckets are Data Acquisition (connecting to external sources and acquiring data); Data Preparation (manipulating structured and unstructured data to quickly prepare for analysis); Data Analytics (everything from path and pattern analysis to statistics and marketing analytics); and Data Visualization (introduced today). This is the most powerful collection of big data tools available in the industry today, and we're proud to provide them to our customers.

Teradata Aster Discovery Portfolio - figure 2

Our belief is that our industry is still scratching the surface in terms of providing powerful analytical tools to enterprises that help them find more valuable insights, more quickly and more easily. With today's launch, the Teradata Aster Discovery Platform reconfirms its lead as the most powerful and enterprise-friendly tool for big data analytics.

Leading the Pack with Unified Data Architecture

Posted on: January 29th, 2013 by Scott Gnau No Comments

 

In the technology game, industry analysts are important players, and some would argue that Gartner is right up there near the top with their Magic Quadrant reports. Those of us who follow Gartner’s Magic Quadrants know the importance of that deceptively simple-looking market research grid. Behind it lies a wealth of knowledge, with uniform criteria that bring useful snapshots of markets and their participants. I am again proud to see that Gartner’s latest MQ covering data warehousing and analytics continues to show Teradata leading the pack for our vision and performance.

Over the years, we’ve shared our vision with Gartner for a future where information is readily collected, processed and integrated in boundless configurations to allow businesses to exploit all of their data to their advantage. With the demonstrated success of data-driven organizations, we are again seeing our vision become a reality for many organizations capturing, analyzing and gaining insights from traditional and new data types in a heterogeneous environment.

This vision aligns perfectly with Gartner’s view of a Logical Data Warehouse. At a high level, Gartner defines the Logical Data Warehouse as an information management architecture where all data, including highly unstructured data, is stored and analyzed. This architecture includes technology approaches like data virtualization, distributed processes and ontological metadata, among other characteristics, to enable a single version of the truth. In a recent Gartner blog, analyst Mark Beyer says that the “logical data warehouse is the next significant evolution of information integration …this is important. This is big. This is NOT buzz. This is real.”

We at Teradata agree. This is real, and our own vision and R&D investment has closely aligned with this. The fact that our best-in-class systems are available today and have the ability to analyze structured, unstructured and semi-structured data, or what we call multi-structured data as an all-up term, shows how long indeed we’ve been on this path.  The October 2012 release of the Teradata Unified Data Architecture introduced a new framework for business users – very much aligned with Gartner’s vision of the Logical Data Warehouse – to ask any question, against any data, with any analytic, at any time across multiple Teradata systems – analytical platforms and discovery platforms – and open source Hadoop data management platforms. This is the result of years of development and millions of dollars of R&D investment.  This investment has enabled us to be the first to deliver a solution like UDA to the market, empowering our customers to change their game by competing on analytics.

UDA

We continue to get positive reports from our customers as we allow organizations to deploy, support, manage, and seamlessly access all their data in an integrated and dynamic Teradata Unified Data Architecture. Teradata’s integration of these technologies, which our customers have learned is more than the sum of the individual components, creates real value.

These efforts are all in service to a vision of intelligent systems that leverage the value of data warehouse, data discovery and data staging technology. We believe in the value of open source technology; that’s why the Teradata Unified Data Architecture supports open source Apache Hadoop. The Teradata Unified Data Architecture is further certified with HDP from Hortonworks and enables a host of interoperability features, which allow for the transparent, seamless movement of data in and out of diverse systems for storage, refinement and analysis.

The Teradata Unified Data Architecture indeed represents the new normal in combining systems and approaches. It captures, refines, and stores detailed data in Hadoop. Teradata Aster then performs subsequent analysis for the discovery of new insights. And then, the resulting intelligence is made available by the Teradata Integrated Data Warehouse for use across the enterprise.

The Teradata Unified Data Architecture, with best-in-class technology, provides business users fast and seamless answers to their questions regardless of the type of data analyzed. In the process, we are embodying what Gartner and others value as the leadership in building practical solutions that help businesses derive the best insights possible from all their data, whether big, small, or somewhere in between.

Scott Gnau

2 months & 10 questions on new Aster Big Analytics Appliance

Posted on: December 18th, 2012 by Teradata Aster No Comments

 

It’s been about two months since Teradata launched the Aster Big Analytics Appliance and since then we have had the opportunity to showcase the appliance to various customers, prospects, partners, analysts, journalists etc. We are pleased to report that since the launch the appliance has already received the “Ventana Big Data Technology of the Year” award and has been well received by industry experts and customers alike.

Over the past two months, starting with the launch tweetchat, we have received numerous inquiries about the appliance and think now is a good time to answer the 10 most frequently asked questions about the new Teradata Aster offering. Without further ado, here are the top 10 questions and their answers:

WHAT IS THE TERADATA ASTER BIG ANALYTICS APPLIANCE?

The Aster Big Analytics Appliance is a powerful, ready-to-run platform that is pre-configured and optimized specifically for big data storage and analysis. A purpose-built, integrated hardware and software solution for analytics at big data scale, the appliance runs Teradata Aster patented SQL-MapReduce® and SQL-H technology on a time-tested, fully supported Teradata hardware platform. Depending on workload needs, it can be exclusively configured with Aster nodes, Hortonworks Data Platform (HDP) Hadoop nodes, or a mixture of Aster and Hadoop nodes. Additionally, integrated backup nodes are available for data protection and high availability.

WHO WILL BENEFIT MOST BY DEPLOYING THE APPLIANCE?

The appliance is designed for organizations looking for a turnkey integrated hardware and software solution to store, manage and analyze structured and unstructured data (i.e., multi-structured data formats). The appliance meets the needs of both departmental and enterprise-wide buyers and can scale linearly to support massive data volumes.

WHY DO I NEED THIS APPLIANCE?

This appliance can help you gain valuable insights from all of your multi-structured data. Using these insights, you can optimize business processes to reduce cost and better serve your customers. More importantly, these insights can help you innovate by identifying new markets, new products, new business models etc. For example, by using the appliance a telecommunications company can analyze multi-structured customer interaction data across multiple channels such as web, call center and retail stores to identify the path customers take to churn. This insight can be used proactively to increase customer retention and improve customer satisfaction.
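To make the churn example concrete, here is a minimal plain-Python sketch of a "path to churn" count. It is not the appliance's SQL-MapReduce or nPath interface; the event names, the `(customer_id, timestamp, event)` tuple layout, the `paths_before_churn` function and the window of three events are all assumptions made for illustration.

```python
from collections import Counter


def paths_before_churn(events, churn_event="churn", window=3):
    """Count the sequences of up to `window` events that immediately precede churn.

    `events` is an iterable of (customer_id, timestamp, event_name) tuples,
    a hypothetical layout chosen for this sketch.
    """
    # Group each customer's events in time order.
    by_customer = {}
    for customer_id, _ts, name in sorted(events, key=lambda e: (e[0], e[1])):
        by_customer.setdefault(customer_id, []).append(name)

    paths = Counter()
    for names in by_customer.values():
        if churn_event not in names:
            continue  # customer has not churned
        end = names.index(churn_event)
        path = tuple(names[max(0, end - window):end])
        if path:
            paths[path] += 1
    return paths


if __name__ == "__main__":
    # Made-up interaction data across web, call center and retail store channels.
    sample = [
        ("a", 1, "web:view_bill"), ("a", 2, "call_center:complaint"),
        ("a", 3, "store:visit"), ("a", 4, "churn"),
        ("b", 1, "web:view_bill"), ("b", 2, "call_center:complaint"),
        ("b", 3, "store:visit"), ("b", 4, "churn"),
        ("c", 1, "web:view_bill"), ("c", 2, "web:upgrade"),
    ]
    for path, count in paths_before_churn(sample).most_common():
        print(count, " -> ".join(path))
```

The most frequent paths are candidates for proactive retention: a customer who has completed the first steps of a common churn path but has not yet churned is the one to contact.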

WHAT’S UNIQUE ABOUT THE APPLIANCE?

The appliance is an industry first in tightly integrating SQL-MapReduce®, SQL-H and Apache Hadoop. The appliance delivers a tightly integrated hardware and software solution to store, manage and analyze big data. The appliance delivers integrated interfaces for analytics and administration, so all types of multi-structured data can be quickly and easily analyzed through SQL based interfaces. This means that you can continue to use your favorite BI tools and all existing skill sets while deploying new data management and analytics technologies like Hadoop and MapReduce. Furthermore, the appliance delivers enterprise class reliability to allow technologies like Hadoop to now be used for mission critical applications with stringent SLA requirements.

WHY DID TERADATA BRING ASTER & HADOOP TOGETHER?

With the Aster Big Analytics Appliance, we are not just putting Aster and Hadoop in the same box. The Aster Big Analytics Appliance is the industry’s first unified big analytics appliance, providing a powerful, ready to run big analytics and discovery platform that is pre-configured and optimized specifically for big data analysis. It provides intrinsic integration between the Aster Database and Apache Hadoop, and we believe that customers will benefit the most by having these two systems in the same appliance.

Teradata’s vision stems from the Unified Data Architecture. The Aster Big Analytics Appliance offers customers the flexibility to configure the appliance to meet their needs. Hadoop is best for capturing, storing and refining multi-structured data in batch, whereas Aster is a big analytics and discovery platform that helps derive new insights from all types of data. Depending on the customer’s needs, the appliance can be configured with all Aster nodes, all Hadoop nodes or a mix of the two.

WHAT SKILLS DO I NEED TO DEPLOY THE APPLIANCE?

The Aster Big Analytics Appliance is an integrated hardware and software solution for big data analytics, storage, and management. It is designed as a plug and play solution that does not require special skill sets.

DOES THE APPLIANCE MAKE DATA SCIENTISTS OR DATA ANALYSTS IRRELEVANT?

Absolutely not. By integrating the hardware and software in an easy to use solution and providing easy to use interfaces for administration and analytics, the appliance allows data scientists to spend more time analyzing data.

In fact, with this simplified solution, your data scientists and analysts are freed from the constraints of data storage and management and can now spend their time on value added insights generation that ultimately leads to a greater fulfillment of your organization’s end goals.

HOW IS THE APPLIANCE PRICED?

Teradata doesn’t disclose product pricing as part of its standard business operating procedures. However, independent research conducted by industry analyst Dr. Richard Hackathorn, president and founder, Bolder Technology Inc., confirms that on a TCO and Time-to-Value basis the appliance presents a more attractive option vs. commonly available do-it-yourself solutions. http://teradata.com/News-Releases/2012/Teradata-Big-Analytics-Appliance-Enables-New-Business-Insights-on--All-Enterprise-Data/

WHAT OTHER ASTER DEPLOYMENT OPTIONS ARE AVAILABLE?

Besides deploying via the appliance, customers can also acquire and deploy Aster as a software only solution on commodity hardware or in a public cloud.

WHERE CAN I GET MORE INFORMATION?

You can learn more about the Big Analytics Appliance via http://asterdata.com/big-analytics-appliance/  – home to release information, news about the appliance, product info (data sheet, solution brief, demo) and Aster Express tutorials.

 

Join the conversation on Twitter for additional Q&A with our experts:

Manan Goel @manangoel | Teradata Aster @asterdata

 

For additional information please contact Teradata at http://www.teradata.com/contact-us/

Santa Claus and Data Scientists

Posted on: December 3rd, 2012 by Teradata Aster No Comments

 

Who do you believe in more – Santa Claus or Data Scientists? That’s the debate we’re having in New York City on Dec 12th at Big Analytics 2012. Because the event is sold out, this panel discussion, which digs a little deeper behind the hype, will be simulcast live.

Some believe that data scientists are a new breed of analytic professional that merges mathematics, statistics, programming, visualization, and systems operations (and perhaps a little quantum mechanics and string theory for good measure) all in one. Others say that Data Scientists are simply data analysts who live in California.

Whatever you believe, the skills gap for “data scientists” and analytic professionals is real and not expected to close until 2018. Businesses see the light in terms of data-driven competitive advantage, but are they willing to fork out $300,000/yr for a person that can do data science magic? That’s what CIO Journal is reporting with the guidance that “CIOs need to make sure that they are hiring for these positions to solve legitimate business problems, and not just because everyone else is doing it too”.

Universities like Northwestern University have built programs and degrees in analytics to help close the gap. Technology vendors are bridging the gap to make new analytic techniques on big data tenable to a broader set of analysts in mainstream organizations. But is data science really new? What are businesses doing to unlock and monetize new insights? What skills do you need to be a “data scientist”? How can you close the gap? What should you be paying attention to?

Mike Gualtieri from Forrester Research will be moderating a panel to answer these questions and more with:

  • Geoff Guerdat, Director of Data Architecture, Gilt Groupe
  • Bill Franks, Chief Analytics Officer, Teradata
  • Bernard Blais, SAS
  • Jim Walker, Director of Product Marketing, Hortonworks

 

Join the discussion at 3:30 EST on Dec 12th where you can ask questions and follow the discussion thread on Twitter with #BARS12, or follow along on TweetChat at: http://tweetchat.com/room/BARS12

... it certainly beats sitting up all night with milk and cookies looking out for Santa!