The New Year’s arrival is always a good time for any industry to take stock and chart the way forward. And, as we kick off 2014, it definitely seems to me we’re at a crossroads on several key issues in big data analytics. Below are a few signposts to consider, and I’ll be writing more about these in the weeks and months ahead.
The Expanding Data Universe: Consider modern-day, wide-scale revolutions such as the move from 3G to 4G in telecommunications, the augmentation of traditional web logs with behavioral and social data, the transition from aggregate to real-time sensor data, and the migration of consumers from the Web to mobile apps. These trends carry huge implications for our industry. Part of Hadoop’s allure is the promise of low-cost storage, but what about insights? Our sector needs to meet high expectations as the data management backbone that supports sudden increases in data volume of 10x or more, and we also need to combine data sources to see the full picture and turn data into insight and insight into action.
Cloud Evolution: I’ve spoken before about how the cloud is increasingly a realm not just for traditional services but also for high-performance computing, and this evolution comes with some hurdles. As confidence and capabilities grow, the cloud is an increasingly viable option for business-critical operations, and cloud-based architectures can allow for more frequent and easier experimentation, a hallmark of successful analytics. We as an industry need to continue to optimize both traditional and cloud-based solutions to offer the flexibility that makes breakthrough insights possible.
New Techniques, New Users: We are constantly seeing new and exciting frontiers emerge in big data analytics, including NoSQL, MapReduce, graph analysis and beyond, that multiply options and perspectives around an ever-expanding variety of data. Our industry has to continue working to blend these data sets and capabilities cohesively. Just as important, we need to find more ways to extend easy, intuitive access to as many eyes as possible, not just data scientists. The best insights come from a nexus of many data types, many analytic options and many different people taking part in the effort.
Further Optimizing In-Memory: In-memory technology promises lightning-fast performance. But while the cost of memory is generally going down, it remains the most expensive component of the hardware stack. Performance and cost requirements should not be at odds, and our industry needs to keep working on comprehensive multi-temperature data management solutions that strategically and automatically reserve in-memory technology for only the hottest, most frequently used data. Economical, systems-level approaches to multi-temperature data management are the new norm, and we need to continue optimizing.
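To make the multi-temperature idea a little more concrete, here is a minimal toy sketch in Python. It is only an illustration under stated assumptions, not how any particular product works: the `TieredStore` class, its access-count notion of "temperature" and the data set names are all hypothetical, and two plain dictionaries stand in for memory and disk.

```python
from collections import Counter


class TieredStore:
    """Toy two-tier store (hypothetical): a small hot tier, a large cold tier."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = {}          # stands in for scarce, expensive memory
        self.cold = {}         # stands in for cheaper disk storage
        self.hits = Counter()  # access count per key, used as "temperature"

    def put(self, key, value):
        # New data starts cold; it has to earn its place in memory.
        if key in self.hot:
            self.hot[key] = value
        else:
            self.cold[key] = value

    def get(self, key):
        if key in self.hot:
            self.hits[key] += 1
            return self.hot[key]
        value = self.cold[key]  # raises KeyError for unknown keys
        self.hits[key] += 1
        self._maybe_promote(key)
        return value

    def _maybe_promote(self, key):
        if self.hot_capacity <= 0:
            return
        # Free room in the hot tier: promote immediately.
        if len(self.hot) < self.hot_capacity:
            self.hot[key] = self.cold.pop(key)
            return
        # Otherwise swap with the coolest hot key, but only if this key
        # has been accessed strictly more often.
        coolest = min(self.hot, key=lambda k: self.hits[k])
        if self.hits[key] > self.hits[coolest]:
            self.cold[coolest] = self.hot.pop(coolest)
            self.hot[key] = self.cold.pop(key)


store = TieredStore(hot_capacity=2)
for name in ("orders", "clicks", "archive_2009"):
    store.put(name, f"<{name} data>")

# Simulated workload: two busy data sets and one rarely touched archive.
for name in ["orders"] * 5 + ["clicks"] * 3 + ["archive_2009"]:
    store.get(name)

print(sorted(store.hot))   # the two busiest data sets
print(sorted(store.cold))  # the archive stays on the cheaper tier
```

A real system would presumably weigh recency, data size and the cost of moving data as well, but even this sketch shows the economic point: memory is spent only where the access pattern justifies it, and the placement happens automatically rather than by hand.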