Monthly Archives: December 2017

A holiday message from Teradata

December 25, 2017

A holiday message from Teradata

2017 has been an exciting year full of data and analytics. It seems that every day has brought us new examples of how data and analytics can transform businesses and revolutionize the management of organizations – no matter what their size. It’s been a great year for Teradata, and together with our customers and partners, every success continually confirms our strong belief that data and analytics are rapidly becoming essential for businesses: defining how they are run and how they compete.

At Teradata, our goal is to make it as simple as possible to unlock the value of data across the enterprise – providing organizations with the tools and skills they need to drive revenue and make better business decisions. We want to support companies in their continued focus on using their data to drive the business forward, whether they want to better inform their decision makers, find new working efficiencies, discover new sources of revenue, or pursue any business transformation you can imagine!

With the year drawing to a close, we want to thank you for your continued engagement with Teradata through this blog. The Teradata blog is a fantastic outlet for our thought leaders and industry professionals to discuss the most pressing and forward-looking issues facing each and every industry today. We hope you have learned something, challenged your views, or maybe even reinforced them, on topics ranging from product innovation to customer experience and risk mitigation, to name just a few.

Looking to 2018, we already have an idea of the topics and technologies that are going to matter to your business, and we’ve got teams of industry consultants and data scientists excited to kick start the conversation. We’ll be back early in January with regular content for you to digest, authored by the true data and analytics experts from Teradata and Think Big Analytics.

May good health, peace, prosperity, and happiness be yours throughout the holidays and the new year.

Happy holidays from all of us at Teradata.

The Future Supply Chain: Breaking Down Silos and Anticipating Change

December 20, 2017


What do self-driving cars and supply chains have in common?

On the surface, the process for both may seem entirely automated. But dig a little deeper and both rely on a deep store of collected and analyzed data so they can deal with any anomalies that come up on the fly. You need to build in as much knowledge in advance as possible and work with exceptions as you go.

Blake Johnson, consulting assistant professor of management science and engineering at Stanford University, anticipates that having this built-in knowledge in supply chain management will lead to significant business outcomes.

“If I’m an executive … the more I can bake in the same kind of self-driving controls, rules and priorities, that would be great for me,” he says. He imagines a future where businesses can catalogue different circumstances and make recommendations to management on what the next steps should be the vast majority of the time.

Supply chain’s history has been full of uncertainty — uncertain demand, uncertain supply, uncertain manufacturing yields, shifting regulations and other factors that were hard to predict, according to Johnson.

“Add on top of that that there are a lot of entities in our own company — sales, supply chain, procurement, logistics. They only get part of the puzzle, information wise or incentive wise,” he says. And once you add external entities to the mix, then supply chain companies face a siloing of information across a complex network. This can leave supply chain companies feeling like they are stuck in L.A. traffic.

The shift to smart automation

But the supply chain of the future will shift away from former business intelligence norms, where industries could only capture a picture of what is happening right now. Instead, they will shift to an approach that blends knowledge of past successes and errors with automation, staying one step ahead by anticipating future market demand.

At its highest level, Johnson says supply chains will be able to know what action should be taken to meet future demand and then assign who owns that action. And there are two fundamental steps to getting there.

First, businesses must understand that uncertain circumstances are not going to go away — but they can quantify them better with better analysis of past events. By setting operating boundaries, a supplier can commit to placing an order within a certain range given the circumstances. Putting a minimum and maximum in place sidesteps today’s scenario, where orders are inflated to keep the sales department happy and then corrected with a flood of change orders later on.

“Typically, as a supplier, I share a forecast with you. You know it’s gonna be wrong, and I know it’s gonna be wrong,” he says.

Second, these new parameters can help supply chain partners make a demand forecast. And then these analytics hold everyone accountable.

The most effective automation requires a lot of pre-committed actions,  just like self-driving cars, which means more upfront work. But that leads to complete alignment, says Johnson. “Everyone knows exactly what to do based on the circumstances they’re in.”

It will take a cultural shift to move from a single theoretical plan for an exact order to an approach that estimates many possible orders, with automated triggers that carry out the best response for each of those possibilities, but that shift equals savings down the road, says Johnson. And automation that leverages and learns from past data will allow companies to find better tasks for their employees.

“I think this makes people’s lives better. I work with one company that all 300 employees just negotiate change orders,” he says. “We can have these people do something interesting and add value instead.”

Is it Too Late for Your Business to Win the Race to AI?

December 19, 2017


Artificial intelligence feels like it’s in the middle of an arms race — in the boardroom at least.

Over the last few months, nearly every executive I’ve talked to, be it in banking, telco, manufacturing or many other industries, has talked up AI, machine learning and deep learning as “must-have” technologies. And while Teradata’s latest AI report reveals that 80 percent of enterprises have invested in AI, from my experience, not many of these businesses feel like they have AI down pat.

Gartner has commented that “AI washing” is muddying the benefits of AI — making everyone feel like they have adopted it but proliferating confusion about its actual benefits.

The reality is, for any enterprise seeking AI dominance, it’s less about the technologies they buy today than it is about the foundation they’ve built up till now. It’s a tough balance to strike: It will be very difficult for any business to leapfrog over AI’s fundamentals — a strong data science foundation with advanced analytics capabilities. But at the same time, businesses can’t take long to play catch-up.

Michele Goetz of Forrester says, “While many of the rules for business competitiveness and survival had already been redefined before AI became broadly available, its emergence as a viable capability has brought markets and businesses to a tipping point as the next cycle of technology disruption begins in earnest.” It is clearly a business differentiator that is rapidly going from sought-after asset to mature technology.

However, the larger an organization is, the more complex it is. These organizations likely have more fragmentation, a larger proliferation of bad data science practices and have a higher exposure to disruption. This is a double-edged sword. Complexity is difficult to overcome, but complex organizations also have more areas that AI can transform.

The hype around AI is so fervent that enterprises have assigned it a “winner takes all” stature. But it’s not hard to see why this has businesses laser focused on adopting the technology. You need only a short memory to recall what Amazon has done to retail in the last few years. Even the most established giants of that industry are feeling the heat of competition. The reality is enterprises today feel they either have to be a threat or be threatened. They see AI as an opportunity they can’t miss.

The good news is, competitors haven’t figured out how to turn AI into a silver bullet either. So while the race is certainly on, there are no clear winners just yet. The challenge is the agility to change. And even the businesses right now that have the most agility are struggling in that sense.  

Jeff Bezos likes to call Amazon a “Day One” company — always in startup mode, always ready to change. In fact, not adopting AI is something that Bezos believes threatens to turn companies into “Day Two” organizations. At the same time, Amazon isn’t known for its employees’ long tenures. The rate of change necessary to keep delivering for customers at these companies can, in other ways, become a source of internal friction. We are truly witnessing some sort of terminal velocity point for rate of change, and if a business can live up to both internal and external pressures, it could come out on top in the AI revolution.

All this promise and opportunity, combined with the need to pivot fast, is creating a lot of tension surrounding AI adoption. And the only way to not get caught up in the drama is to have those foundational elements that ensure your business is data-driven in the first place. If you aren’t there now, it’s time to get there right now. It’s time to let go of any technological debt your company has invested in. You need an architecture that allows you to deliver on today and predict tomorrow. You need to be ready to embrace new approaches, and you have to remain competitive. To do anything else is to risk irrelevancy.

Want to measure your progress with AI against other enterprises? Read our recent State of Artificial Intelligence for Enterprises global report, where businesses in multiple industries get granular about what they’re doing with AI technologies.



Atif Kureishy – VP, Global Emerging Practices | AI & Deep Learning at Think Big Analytics, a business outcome-led global analytics consultancy.

Based in San Diego, Atif specializes in enabling clients across all major industry verticals, through strategic partnerships to deliver complex analytical solutions built on machine and deep learning. His teams are trusted advisors to the world’s most innovative companies to develop next-generation capabilities for strategic data-driven outcomes in areas of artificial intelligence, deep learning & data science.

Atif has more than 18 years in strategic and technology consulting working with senior executive clients. During this time, he has both written extensively and advised organizations on numerous topics, ranging from improving the digital customer experience and multi-national data analytics programs for smarter cities, to cyber network defense for critical infrastructure protection, financial crime analytics for tracking illicit funds flow, and the use of smart data to enable analytic-driven value generation for energy & natural resource operational efficiencies.

Smart Cities 2.0 — Boosting Citizen Engagement

December 18, 2017


Today, a smart city mostly means infrastructure — better roads, more efficient energy use and a higher quality of life for citizens as a result of all these new, connected data points. But the next step for smart cities may look a lot less like public resource management and a lot more like marketing.

The next step of smart city management may map citizen journeys, the same way marketers now track and cajole buyers along a customer journey to get a desired outcome. For enterprises, the goal is usually repeat sales, but for cities, it could be items like boosting public transportation use or determining where the next downtown marketplace should exist.

We’ve seen cities attempt to crowdsource citizen data before, for example by creating pothole registries. But the next approach wouldn’t require engagement on the front end. It would blend available data sets to get a cohesive picture of all citizens — not just the ones already engaged enough to fill out a form.

And if cities take this passive monitoring approach, they can track behavior not only of citizens, but of visitors as well. Then locations like college campuses, which have a weekly influx of thousands of people on football Saturdays, can guide tourists to use available public services and also stand up additional services on these days that cater to natural traffic flow evident from this data set.

Transportation is a key area that citizen engagement initiatives could tackle. With un-siloed data, cities could take particular interest in multimodal transportation — connecting how their citizens use trains, buses, taxis, ride-hailing services and so forth. Governments could put together apps that show citizens the fastest way to reach a certain location during rush hour, or the cheapest option for getting to work.

Of course, through all this, there is a privacy angle that cities must tackle. The point is not to literally track every single citizen by name and mobile phone. The point is to make citizen data pseudonymized — not anonymized — so each person becomes represented in trend data. When data is anonymous, it’s not rich enough to provide context clues to base analysis on. But pseudonymized data allows cities to track an entity that represents an individual or group of individuals. For instance, cities could come up with some type of citizen classification, kind of like how businesses come up with buyer personas — there is the doctor, the teacher, the underserved citizen, the government worker. Then a city could provide individuals fitting each of those groups with targeted engagement opportunities based on these categories.
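As a rough illustration of the idea (not any particular city’s implementation), the sketch below shows one common way to pseudonymize identifiers in Python: hashing a raw device or citizen ID with a secret salt so the same person always maps to the same stable pseudonym, which keeps trend analysis possible without exposing identity. The field names and salt handling here are hypothetical.

```python
import hashlib

SALT = "city-secret-salt"  # assumption: kept private by the city and rotated per policy

def pseudonymize(raw_id: str) -> str:
    """Map a raw identifier (e.g., a smartphone beacon ID) to a stable pseudonym."""
    return hashlib.sha256((SALT + raw_id).encode("utf-8")).hexdigest()[:16]

# Hypothetical trip records keyed by device ID.
trips = [
    {"device_id": "A123", "mode": "bus"},
    {"device_id": "A123", "mode": "train"},
    {"device_id": "B456", "mode": "taxi"},
]

# Replace the identifier with a stable pseudonym so multimodal journeys can
# still be linked to one (unnamed) entity and rolled up into personas.
for trip in trips:
    trip["citizen"] = pseudonymize(trip.pop("device_id"))

print(trips)
```

Because the pseudonym is stable, analysts could still count, for example, how many distinct entities combine bus and train trips, without ever handling names or phone numbers.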

As cities grow, they must be smart enough to know they need digitally connected citizen information, like the beacon in everyone’s smartphone, but also savvy enough not to infringe on cultural norms and personal freedoms. The point is to make cities better, not to know what each citizen’s social life looks like. And this kind of common dataset availability can only be made truly powerful at the government level. Instead of the end goal being a product or a sale, it can be engaging underserved neighborhoods or connecting citizens with the resources they need to thrive in their everyday lives. From thousand-year-old cities in Europe to newer municipalities stateside, citizen engagement initiatives will allow cities to take their smart city planning to the next level.

When cities apply internet of things technology to a problem and integrate it with real-time data on citizen behavior, cities can provide better emergency care, on-time transit options and access to government facilities in a way that caters to the public. Instead of being an internet of things, the cities of the future will more likely look like an internet of us.

Learn more about how Siemens is building a smart city with the Internet of Things.


 

John Timmerman has spent the last 23 years with Teradata and has seen all sides of the Teradata enterprise and across most industries and geographies through his work in sales, business development, sales support, product management and marketing. For the last 12 years, his focus has been in the areas of CRM, Customer Interaction Management and Inbound Marketing.

 

Q&A with Sri Raghavan: The Future of AI for Enterprises

December 14, 2017


Sri Raghavan, senior global product marketing manager at Teradata, answered a few questions on algorithms, bias detection and the maturity of enterprises using AI.

As data analytics progresses, do you think there will be significant progress in the sophistication of algorithms?

It’s not so much that one algorithm is going to make a difference in your life. Gone are the days when there was a silver bullet you could use to address questions.

To get better analytics, we are doing a couple of things. We are, of course, using new algorithms and new theory to develop them, but we are also combining algorithms in a multi-genre way. That’s Teradata’s way — being able to mix and match intelligence. Yes, new models and algorithms are always being developed, but we are always finding ways to mix and match the techniques, and we are getting better results over and over again.

How can enterprises be aware of any biases appearing in their analytics?

There’s no universal panacea for this, and that’s an unsatisfactory answer. Everyone is biased. Once you have a frame of reference in your mind, a bias always occurs, unless it’s a tabula rasa, which no one past 6 months of life ever has. It’s impossible to take biases entirely out of the picture.

I’m not so concerned about removing biases in some numerical sense. What concerns me more is whether you are aware of your biases. An honest person will at least acknowledge their biases and provide recommendations that work around them.

I don’t think biases will ever go away. But I want people to understand that there are biases that they bring to the discipline of data collection. And as long as they are able to detail it out in as honest of a way as possible, that’s the best thing you can do.

Do you think organizations are properly structured right now to deal with the level of agility they need to have to be data-driven in the manner described in “The Sentient Enterprise” book?

It’s a hard question to answer, because many organizations don’t experiment with organizational structure to figure out sentience. The notion of sentience is something we are — no pun intended — becoming conscious of, but it involves many parts of the organizational hierarchy. Teradata recognizes that a business needs to center around the customer. How are you impacting your customers, and how are you working for them? It affects other activities in the organization.

Where are we in the continuum of AI use in business?

I don’t believe that AI even has come to the point where people are focused on ROI. I think there is still an enormous amount of confusion about what AI is. Everyone talks about AI and machine learning, deep learning, neural networks, expert systems as if all these terms can be interchangeably used. We have a huge taxonomic confusion. People don’t understand the genealogy of how these disciplines evolve, and they don’t quite understand how they can be applied to address customer problems. The industry is getting better at education, but many companies have a long way to go before they significantly lower the barrier of AI use.

If companies aren’t focused on ROI yet, how can they get there?

One of the big things is not to treat AI as monolithic. There are many intractable problems today that can be addressed through perfectly well-established techniques, not that AI isn’t well established — it’s been around forever. But what’s most important, and what Teradata has been focused on for quite some time, is that it’s not about the technology or the underpinning analytics. It’s about what ails you. Is your problem customer attrition? Is your problem fraud? Is your problem that patients are not satisfied? Is your problem the need to provide quality services?

Once you put a frame around that and define the use case, then that automatically allows data scientists and analytics professionals to pick and choose the various analytical techniques which then inform the use cases.

I think that’s a far better approach than a frame of reference that says, “I’m going to go do AI.” It’s not fitting the problem to the analytic; it’s fitting the analytic to the problem. The problem needs to come first.

 

Want to learn more about how AI is being used in the Enterprise? Get detailed insights here.


Sri Raghavan is a Senior Global Product Marketing Manager at Teradata and is in the big data area with responsibility for the AsterAnalytics solution and all ecosystem partner integrations with Aster. Sri has more than 20 years of experience in advanced analytics and has had various senior data science and analytics roles in Investment Banking, Finance, Healthcare and Pharmaceutical, Government, and Application Performance Management (APM) practices. He has two Master’s degrees in Quantitative Economics and International Relations respectively from Temple University, PA and completed his Doctoral coursework in Business from the University of Wisconsin-Madison. Sri is passionate about reading fiction and playing music and often likes to infuse his professional work with references to classic rock lyrics, AC/DC excluded.

BYOL, Fold/Unfold Now Available on Both Azure, AWS

December 13, 2017

Good news! Release 5 has been published to both Azure and AWS Marketplaces, bringing important new features including:

  • Azure and AWS: BYOL (Bring Your Own License)
  • Azure: Fold/Unfold
  • Azure: New Storage and AMP Configurations
  • AWS: Storage Elasticity
  • AWS: Teradata AppCenter

BYOL (Bring Your Own License) gives you the ability to buy portable software subscriptions directly from Teradata and then deploy on either Azure or AWS Marketplaces – rather than do the entire transaction through the Marketplaces.

Buying directly from Teradata gives you more purchase options, including:

  • Bundled offers that include Teradata hardware, software, and/or services
  • Lower pricing for longer term commitments (e.g., 1 year or 3 years)

Having BYOL is great for your flexibility because it enables self-service TCore portability between Public Cloud platforms (AWS and Azure) – which helps de-risk your deployment decision. This “Move Any Time” advantage is one of the key tenets of the Teradata Everywhere strategy.

—-

Teradata Database Fold/Unfold on both Azure and AWS gives you the ability to double or quadruple your compute resources while keeping AMPs and storage volumes unchanged. Thus, in a matter of minutes, you can Unfold your Teradata system to increase compute capacity when needed, and then Fold to reduce compute capacity (and cost) when no longer needed.
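To make the shape of that concrete, here is a purely illustrative Python sketch (my own toy model, not a Teradata API or actual sizing): unfolding multiplies the running compute nodes by 2x or 4x while the AMP count and storage volumes stay fixed, and folding returns compute (and cost) to the prior level.

```python
from dataclasses import dataclass, replace

@dataclass
class SystemConfig:
    nodes: int        # compute nodes currently running (changes on Fold/Unfold)
    amps: int         # AMPs stay unchanged
    storage_tb: int   # storage volumes stay unchanged

def unfold(cfg: SystemConfig, factor: int = 2) -> SystemConfig:
    """Illustration only: Unfold doubles or quadruples compute capacity."""
    assert factor in (2, 4)
    return replace(cfg, nodes=cfg.nodes * factor)

def fold(cfg: SystemConfig, factor: int = 2) -> SystemConfig:
    """Illustration only: Fold returns compute (and cost) to the prior level."""
    assert factor in (2, 4) and cfg.nodes % factor == 0
    return replace(cfg, nodes=cfg.nodes // factor)

base = SystemConfig(nodes=4, amps=160, storage_tb=40)   # hypothetical numbers
peak = unfold(base)      # extra compute for month-end processing
print(peak)
print(fold(peak))        # back to the base configuration when no longer needed
```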

The Azure Getting Started Guide has more information about Fold/Unfold – and the equivalent information for AWS is here.

—-

Besides BYOL, Storage Elasticity is the big change in AWS in Release 5. Instead of fixed increments of available storage for the m4.xlarge family of instances (e.g., 5TB, 15TB, 20TB, 30TB), you can now select any storage value from 5TB to 60TB in 1TB increments – and storage may be adjusted before or after provisioning.

Storage Elasticity enables you to start small and grow only when needed, avoiding the requirement to specify (and pay for) a higher quantity of Elastic Block Store (EBS) storage up front when it’s not being utilized.
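As a quick sanity check on those rules (a throwaway helper of my own, not part of any Teradata or AWS tooling), a requested size simply has to be a whole number of terabytes between 5 and 60:

```python
def validate_storage_tb(requested_tb: int) -> int:
    """Check a requested size against the Release 5 rules for the
    m4.xlarge instance family: 5 TB to 60 TB, in 1 TB increments."""
    if not isinstance(requested_tb, int):
        raise ValueError("storage must be requested in whole-TB increments")
    if not 5 <= requested_tb <= 60:
        raise ValueError("storage must be between 5 TB and 60 TB")
    return requested_tb

print(validate_storage_tb(5))    # start small...
print(validate_storage_tb(23))   # ...and grow later, 1 TB at a time
```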

The AWS Getting Started Guide has more information about Storage Elasticity; see “Storage Expansion”.

—-

Lastly, Release 5 includes the publication of Teradata AppCenter on AWS Marketplace. Teradata AppCenter – not to be confused with Teradata Aster AppCenter, which is still listed on AWS Marketplace – is part of Teradata IntelliSphere, which bundles 10 ecosystem software ingredients into one convenient package that may be added to a Teradata Database subscription.

Teradata AppCenter is a self-service environment that enables the easy creation and reuse of analytics. With numerous prebuilt features, Teradata AppCenter allows your data scientists and developers to find, build, share, and deploy analytics; non-technical users can run apps, visually study results, and share insights.

See the Teradata AppCenter and Teradata IntelliSphere pages for more details.

—-

Bottom line: there are now three different ways that you can buy and use Teradata software in the Public Cloud using standard Azure and AWS infrastructure:

  1. IntelliCloud* – Advanced and Enterprise tiers
  2. BYOL – Base, Advanced, and Enterprise tiers
  3. Marketplace – all four Teradata Database tiers

*IntelliCloud on Azure will be launched later this month for US commercial Azure regions; stay tuned!

All exciting stuff – and we will continue to innovate across all our Public Cloud offers.

Let us know if you have any questions or comments about what we offer on Azure and AWS.


Brian Wood is director of cloud marketing at Teradata. He is a results-oriented technology marketing executive with 15+ years of digital, lead gen, sales / marketing operations & team leadership success. He has an MS in Engineering Management from Stanford University, a BS in Electrical Engineering from Cornell University, and served as an F-14 Radar Intercept Officer in the US Navy.

Supervised learning in disguise: the truth about unsupervised learning

December 12, 2017


One of the first lessons you’ll receive in machine learning is that there are two broad categories: supervised and unsupervised learning. Supervised learning is usually explained as the kind where you provide the correct answers as training data, and the machine learns patterns it can apply to new data. Unsupervised learning is (apparently) where the machine figures out the correct answers on its own.

Supposedly, unsupervised learning can discover something new that has not been found in the data before. Supervised learning cannot do that.

The problem with definitions

It’s true that there are two classes of machine learning algorithm, and each is applied to different types of problems, but is unsupervised learning really free of supervision?

In fact, this type of learning also involves a whole lot of supervision, but the supervision steps are hidden from the user. This is because the supervision is not explicitly presented in the data; you can only find it within the algorithm.

To understand this, let us first consider supervised learning. A prototypical method for supervised learning is regression. Here, the input and the output values – named X and Y respectively – are provided to the algorithm. The learning algorithm then estimates the model’s parameters so that it predicts the outputs (Y) for new inputs (X) as accurately as possible.

In other words, supervised learning finds a function: Y’ = f(X)

Supervised learning success

Supervised learning success is assessed by seeing how close Y’ is to Y, i.e. by computing an error function.
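As a minimal sketch of this (using scikit-learn for illustration, with toy numbers rather than any real data set), the supervision is the fact that Y is handed to the algorithm alongside X, and success is scored by an explicit error function:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Toy data: Y is provided alongside X -- this is the supervision.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

model = LinearRegression().fit(X, Y)   # learn f such that Y' = f(X)
Y_prime = model.predict(X)             # Y'

# Success: how close is Y' to the externally provided Y?
print("mean squared error:", mean_squared_error(Y, Y_prime))
```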

This general principle of supervision in learning is the basic principle for logistic regression, support vector machines, decision trees, deep learning networks and many other techniques.

In contrast, unsupervised learning does not provide Y for the algorithm – only X is provided. Thus, for each given input we do not explicitly provide a correct output. The machine’s task is to “discover” Y on its own.

A common example is cluster (or clustering) analysis. Before a clustering analysis, no clusters are known for the data points within the inputs, and yet the machine finds those clusters after the analysis. It’s almost as if the machine is creative – discovering something new in the data.

Nothing new

In fact, there is nothing new; the machine discovers only what it has been told to discover. Every unsupervised algorithm specifies what needs to be found in the data.

There must be a criterion defining what success is. We don’t let algorithms do whatever they want, or ask machines to perform random analyses. There is always a goal to be accomplished, and that goal is carefully formulated as a constraint within the algorithm.

For example, in a clustering algorithm, you may require the distances between cluster centroids to be maximized, while the distances between data points belonging to the same cluster are minimized. In effect, for each data set there is an implicit Y: a requirement, for example, to maximize the ratio of between-cluster to within-cluster distance.
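To make this hidden supervision concrete, here is a small sketch (again scikit-learn, with toy data): k-means receives only X, but its objective (minimizing the within-cluster sum of squared distances, exposed as the inertia) is baked into the algorithm in advance and plays the role of the implicit Y.

```python
import numpy as np
from sklearn.cluster import KMeans

# Only X is provided -- no labels, no explicit Y.
X = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# The "supervision" is internal: the algorithm was told ahead of time to
# minimize the within-cluster sum of squared distances (inertia).
print("cluster labels:", kmeans.labels_)
print("internal objective (inertia):", kmeans.inertia_)
```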

Therefore, the lack of supervision in these algorithms is nothing like the metaphorical “unsupervised child in a porcelain shop”, as this would not give us particularly useful machine learning. Instead, what we have is more akin to letting adults enter a porcelain shop without having to send a nanny too. The reason for our trust in adults is that they have already been supervised during childhood and have since (hopefully) internalized some of the rules.

Something similar happens with unsupervised machine learning algorithms; supervision has been internalized, as these methods come equipped with criteria that define what counts as good or bad model behaviour. Just as (most) adults have an internal voice telling them not to smash every item in the shop, unsupervised machine learning methods possess internal machinery that dictates what constitutes good behaviour.

Supervised vs. unsupervised

Fundamentally, the difference between supervised and unsupervised learning boils down to whether the computation of error utilizes an externally provided Y, or whether Y is internally computed from input data (X).

In both cases there is a form of supervision.

As all unsupervised learning is actually supervised, the main differentiator becomes the frequency at which intervention takes place. For example, do we intervene for each data point or just once, when the algorithm for computing Y out of X is designed?

Hence, within the so-called unsupervised methods, supervision is present, but hidden (it is disguised) because no special effort is required from the end user to supply supervision data. The algorithm seems to be magically supervised without an apparent supervisor. However, this does not mean that someone hasn’t gone through the pain of setting up the proper equations to implement an internal supervisor.

Consequently, unsupervised learning methods don’t truly discover anything new in any way that would overshadow the “discoveries” of supervised methods.

Looking to discuss how machine learning – supervised or unsupervised – can benefit your business? Get in touch.


Danko Nikolic is a brain and mind scientist, as well as an AI practitioner and visionary. His work as a senior data scientist at Teradata focuses on helping customers with AI and data science problems. In his free time, he continues working on closing the mind-body explanatory gap, and using that knowledge to improve machine learning and artificial intelligence.

From Senegal to North Korea: Finding New Analytics Solutions to Fight Economic Disparity

December 11, 2017

Enterprises have grown dramatically in their ability to apply analytics to solve corporate ailments. But, as clichéd as it must sound, there is a far loftier purpose for analytics than just enabling businesses to make money, reduce costs, mitigate risks, engage in smarter purchases and so on. Yes, these are important. But, of equal importance ought to be the focus on how we can use data and our vast troves of analytics core competence in areas that correct societal wrongs and better people’s lives.

One of these areas is enabling better economic outcomes in regions of the world where bias or lack of information are holding a culture back.

Financial credit to the indigent and the entrepreneurial in Senegal

In Senegal, a West African nation of about 14 million people that is plagued by high unemployment rates, the drive to become economically prosperous is often unmatched by the ability to lubricate the credit system to serve the wellspring of entrepreneurship in Senegalese society. Traditional sources of data to determine creditworthiness of the poor are universally absent. Much of the economic activity is cash based and reliance on some of the modern-day credit instruments is often minimal to nonexistent. Under these circumstances, applying some of the standard credit evaluation methodologies is likely to be a nonstarter.

Enter the French digital finance group Microcred, which works to contribute to the growth of local economies in developing nations by offering simple, accessible financial services. The organization lends to nearly half a million micro-entrepreneurs in eight African countries, borrowers who have weak guarantees or none at all and who would otherwise be unable to access financial markets.

Microcred, in partnership with DataKind (a pro bono group that connects data scientists with nonprofit organizations), undertook primary data collection through highly personalized loan applications that considered the unique economic conditions of the population it aimed to serve, and used that data to develop credit risk models that realistically assess the likelihood of loan repayment or default. The data from about 110,000 loan applications made over a seven-year period was used to build rich predictive models of default likelihood. The idea behind these models was not just to screen out likely defaulters and make loans that are likely to be safe, but also to identify the drivers of loan default and significantly expand the borrower pool in the process. The result is far greater availability of sensible credit options, higher levels of economic activity and greater overall economic prosperity that significantly alleviates widespread poverty and desperation.

There are many such examples of advanced analytics having a positive impact on other areas, such as anticipating and protecting against environmental disasters, developing lifesaving medical interventions, and creating social programs for women and children in less developed economies.  

What happens when data is sparse or missing?

While wanting to effect good outcomes with data is a worthy aspiration, it is not always easy. Sometimes crucial data are missing and that can hamper the best of us from making reasonable assessments of the available policy options. Take for example structured survey data on national income levels. Those of us living in the developed world are accustomed to interacting with entire institutions dedicated to collecting, processing and analyzing data at scale on all things associated with economic development. This data is often triangulated with other sources of information so the resulting conclusions hold up to rigorous scrutiny — biased electioneering slogans notwithstanding. This is not the case in less developed economies where data is often poorly collected and curated and can be malleable.  

This is where alternate data sources can be used as effective proxies. Sendhil Mullainathan, an economist from Harvard, shows how several new techniques and unusual data can be used to accurately assess economic activity. In Uganda, satellite photos can be used to estimate harvest sizes so earnings from agriculture for the year can be accurately forecast. In North Korea, nighttime luminosity obtained from satellite data pinpoints the stark divergence in external estimates of rural electrification numbers from official figures.  

Economic activity in North Korea is either a purely daytime affair or is not quite as robust as what their government leads us to believe. In Rwanda, cell phone metadata was used to determine concentrations of wealth. Richer people tended to make calls of longer durations at certain times of the day, while less economically fortunate people tended to make calls of shorter duration. All these assessments result in public, economic and political policy prescriptions, none of which are possible without an out-of-the-box analytic approach where unconventional data sources are included in the analyses.

Organizations like Flowminder have pioneered innovative data collection, ingestion and analytics practices to address intractable problems, such as precision epidemiology and understanding the effects of natural disasters on population displacement. The upshot is that the non-availability of good, structured data should be a wakeup call to search for other, nontraditional sources that are available but hidden, and on which multi-genre analytics can be applied to deliver critical insights that address serious socioeconomic challenges.



Sri Raghavan is a Senior Global Product Marketing Manager at Teradata and is in the big data area with responsibility for the AsterAnalytics solution and all ecosystem partner integrations with Aster. Sri has more than 20 years of experience in advanced analytics and has had various senior data science and analytics roles in Investment Banking, Finance, Healthcare and Pharmaceutical, Government, and Application Performance Management (APM) practices. He has two Master’s degrees in Quantitative Economics and International Relations respectively from Temple University, PA and completed his Doctoral coursework in Business from the University of Wisconsin-Madison. Sri is passionate about reading fiction and playing music and often likes to infuse his professional work with references to classic rock lyrics, AC/DC excluded.

Understanding Teradata Elasticity

December 7, 2017

As a child in 1980 with the last name of Armstrong, you were bound to be teased with the nickname “Stretch.” (Google it.) Now many years later, it is a fortunate coincidence to have that nickname “legacy” as Teradata delivers on a long-desired capability of greater elasticity of the database environment with the hybrid cloud offerings and our Teradata Everywhere licensing models.

Scalability and elasticity

It is first important to note that for elasticity to be realized, there must already be a scalable foundation in place. It makes no sense to expand platform resources if the software cannot fully take advantage of that expansion.

While Teradata has been the leader in scalability for decades, that scale did come with some strings attached. Due to a variety of hardware and software design features, the ability to grow or shrink the Teradata environment required downtime to make sure the data was redistributed and the parallelism took full advantage of all resources. It was also true that Teradata was normally deployed as a physical system in a customer’s data center.

But with the advent of cloud environments, both public and private, people are looking for a different type of scale and elasticity, much more on demand and without outages. They also need to scale with much more granularity and frequency. People expect all environments to be quickly and seamlessly configured to fit the need of the day. More importantly, they also want to pay only for what is actually used for those purposes.

Any one of these desires presents challenges and, when taken together, they look daunting.  Clearly there is not one solution, but there needs to be a spectrum of options that provide an elastic continuum.

Elasticity comes in many flavors

The hybrid cloud environment recognizes that companies will invest in many different types of deployment, from on-premises physical systems to managed cloud offerings and publicly available cloud providers. Providing elasticity across all these options must include not only the hardware aspects but the software ones as well. Each environment brings different challenges — and opportunities. Teradata now has four distinct types of elasticity to offer, described below:


Dynamic Workload Prioritization

While not elasticity in the classic sense, one of the goals of elasticity is to manage resources to meet demand. Here, there is a defined system, and rather than adjusting the amount of resources, the resources are directed to the most critical workloads. The underlying goal is to meet service level agreements and ensure that prioritized workloads get completed. With Dynamic Workload Prioritization, systems can be configured to ensure that the right resources are applied to the right workloads, dynamically, at the right time, according to business needs.

Performance on Demand

As we move through the spectrum, being able to add more system capacity on demand provides for elastic demand, both up and down. In this instance, there is a physical platform and, by using system controls, the CPU is capped at 75 percent, with correspondingly reduced operating costs. As workloads change or peak periods are encountered, additional resources can be made available, and charged for, in 1 percent increments. After the peak volume has passed, the system can be brought back down to the 75 percent level. This requires no downtime and ensures costs are more directly related to need and usage. The benefit here is that companies do not need to “over purchase” for the peak and have “paid for” resources sitting idle during normal workloads.
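As a back-of-the-envelope illustration of that billing idea (the usage figures and rate below are made up, not Teradata pricing), only the increments above the 75 percent baseline are charged, and only on the days they are actually used:

```python
# Hypothetical daily CPU usage as a percentage of full platform capacity.
daily_usage_pct = [75, 75, 80, 92, 100, 78, 75]

BASELINE_PCT = 75        # capacity included in the base configuration
RATE_PER_PCT_DAY = 10.0  # made-up charge for each extra 1% of CPU per day

# Sum the 1% increments used above the baseline across the week.
increments = sum(max(0, usage - BASELINE_PCT) for usage in daily_usage_pct)
print("extra 1% increments used:", increments)
print("on-demand charge:", increments * RATE_PER_PCT_DAY)
```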

Expand compute power

With the new IntelliFlex platform, Teradata can add CPU and IO independently, and has used this capability to provide the next level of elasticity. In this option, nodes are connected to the platform and, with a small restart, additional CPU and memory resources are made available to handle peak workloads for a temporary timeframe, such as month-end processing or an annual peak like the Christmas season. This is called “unfolding” a system. Again, after the peak need has passed, the system can be “folded” back to the original configuration, once again reducing costs.

Rapid operational expansion

In the past, a Teradata system expansion required an extensive outage to redistribute the data to the new nodes in the system. With Teradata 16.10 and the introduction of multiple hashmaps, that is no longer required. In this scenario, new capacity is added to the system for a longer-term capacity need. The customer then has the option to migrate tables to the larger configuration (i.e., more AMPs) at their discretion. Tables can be grouped so they are redistributed together, and it is no longer an “all or nothing” expansion.

This option is more applicable when customers need to expand to accommodate data and processing growth, rather than to meet a temporary surge.

Getting what you paid for and paying for what you get

Elasticity is about making sure you get the resources you need to meet your workload while paying only for the resources you actually use at the time. By combining the above elasticity options across the software and hardware spectrums with the Teradata Database licensing options, customers can now match system resources to workload needs throughout the ebb and flow of their business processes.


Starting with Teradata in 1987, Rob “Stretch” Armstrong has contributed in virtually every aspect of the data warehouse and analytical processing arenas. Rob’s work in the computer industry has been dedicated to data-driven business improvement and more effective business decisions and execution. Roles have encompassed the design, justification, implementation and evolution of enterprise data warehouses.

In his current role, Rob continues the Teradata tradition of integrating data and enabling end user access for true self-driven analysis and data-driven actions. Increasingly, he incorporates the world of non-traditional “big data” into the analytical process.  He also has expanded the technology environment beyond the on-premises data center to include the world of public and private clouds to create a total analytic ecosystem.

Rob earned a B.A. degree in Management Science with an emphasis in mathematics and relational theory at the University of California, San Diego. He resides and works from San Diego.

Built like Blockchain? Creating a Foundation for Trusting AI Models

December 6, 2017

What is my AI model doing?

That question is critically important to companies today — especially in heavily regulated industries. Banks need to clearly tell their customers and regulators why they blocked someone’s request for more credit or why a certain transaction triggered a fraud warning. But the answer isn’t always immediately obvious.

This isn’t just an issue in financial services or for insurance companies, though. Any business needs the ability to trust its analytics, to be certain its corporate intelligence isn’t leading the business awry. In fact, Gartner predicts that by 2022, enterprise AI projects with built-in transparency will be 100 percent more likely to get funding from chief information officers.

It’s no surprise then that, within AI’s current capabilities, data scientists are the golden children of the analytics-driven enterprise. They hold the keys when it comes to training up a model in the first place, emphasizing representation learning, or feature learning, to build out supervised neural networks using labeled input data. In the current state of AI, the model economy is a precious resource, and data scientists can unlock the black box and see what’s actually behind an insight.

But it’s inevitable that one day we’ll progress past very structured data and labor-intensive feature engineering and move on to a world where AI models can leverage unstructured data and derive insights. We’ll no longer assign data scientists the job of wrangling data on the front end. We’ll have semantic smoothing that’s able to perfectly understand natural language and sentiment. And even though there are physical limitations on computational power itself, one day we’ll push those to the limit, stretching power, space and cooling restrictions right up to their environmental constraints. Quantum principles are already being applied to specific problem sets, and these kinds of landmark moments will likely define 21st century computing.

Alongside all these amazing accomplishments, there must be a proportional effort to build trust into these AI models as they progress. Another one of Gartner’s 2018 predictions is “AI will fuel a broad reaction in terms of growing concerns over liability, privacy violations, ‘fake news’ and pervasive digital distrust.”

The reality is that if you push more intelligence into a machine and then that machine collaborates with other machines, that environment needs to be steeped in trust. Otherwise, unchecked power is bound to be exploited.

But, advancements in an adjacent field that seems poised to get just as hyped as AI could hold some answers — blockchain.

Currently best known as the technology behind cryptocurrencies, blockchain creates immutable ledgers that are a prime example of how to decentralize the way machines interact with each other through a process that ensures the output they exchange is embedded with trust. This technology is still in its infancy, but many experts are already imagining how blockchain paired with AI could create massive-scale distributed databases that push out more sophisticated — and still auditable — AI models. As such, it’s no surprise that highly regulated fields, like health care, have generated the most blockchain hype.

This type of trust in AI models isn’t inevitable, however. It’s something that needs to be prioritized to ensure that as the field progresses, it has the fidelity necessary to shape how enterprises make their decisions.

For more on overcoming the issue of trusting smart machines, read our blog on how to cultivate understanding among us humans and establish trust in AI.


 

Atif Kureishy – VP, Global Emerging Practices | AI & Deep Learning at Think Big Analytics, a business outcome-led global analytics consultancy.

Based in San Diego, Atif specializes in enabling clients across all major industry verticals, through strategic partnerships to deliver complex analytical solutions built on machine and deep learning. His teams are trusted advisors to the world’s most innovative companies to develop next-generation capabilities for strategic data-driven outcomes in areas of artificial intelligence, deep learning & data science.

Atif has more than 18 years in strategic and technology consulting working with senior executive clients. During this time, he has both written extensively and advised organizations on numerous topics, ranging from improving the digital customer experience and multi-national data analytics programs for smarter cities, to cyber network defense for critical infrastructure protection, financial crime analytics for tracking illicit funds flow, and the use of smart data to enable analytic-driven value generation for energy & natural resource operational efficiencies.