I have been struggling to reconcile two different thoughts over the last few months, and watching a video recently forced me to think about this again. There seems to be a catch-22 in analytics: you need tools and mechanisms to find the value in new data, but those tools can only be justified by the value they have yet to find. The first thought is that I see a lot of organisations struggling to get their analytics initiatives underway and sustainable; there are many articles on the web about this. The second is a question: why should organisations have a data discovery capability? Is this a marketing term or is there real value in it?
Something occurred to me recently. We were discussing the value chain of analytics, or something analogous to a value chain for analytics in large organisations, and exploring how different pieces of data have different value and how this could be used in a BI Centre of Excellence to engage with users. The question was: how do you decide when to put new data into the warehouse, and what data remains outside the warehouse?
What occurred to me was that the value of much of the data we were considering had already been established or simply assumed (someone asked “loudly” enough for it and they got it). It was being stored and managed in a data warehouse, and it was being accessed by users using various toolsets. Although critical to the management of the business, it was fundamentally operational in nature. The value of a particular piece of data had been established previously, and significant investment had subsequently gone into getting that piece of data into the warehouse on an ongoing basis. That piece of data was now being used to manage and change the business, so it was creating impact.
A short digression, because the value of data is determined by what it can be used to accomplish. Data has no intrinsic value. In fact, even insight or actionable insight has no value unless it is put into action or changes something. Unless you change the business, a process or an offering in some way, the data is merely interesting, not important. Analytics teams are often cut off from the business and from the ability to impact the business in a meaningful way.
The question became: how do you get new pieces of data added to the operational store, the warehouse, when you don’t yet know they are valuable? Data becomes valuable by being used, and the warehouse is where it gets used. But there is a significant investment in integrating a piece of data into the warehouse, so you have to know it is worth something before you invest in operationalising it. This seems circular: you have to know the data is valuable before you make it available, yet making it available is what makes it valuable, so you cannot actually know in advance.
Knowing that a piece of data is worth something is also important in justifying analytics teams and getting analytics initiatives up and going. Until you know for a fact that something has value, there is no way to build a business case that funds the building of an analytics team. I have seen a number of organisations try to address this problem by employing a small team, sometimes an individual, to be the analytics team. This appears to solve the problem because it becomes an opex matter, something the business can fund month to month. But this approach struggles to gain momentum, struggles to justify its value and is very difficult to build a long-term business case around.
These teams are either accommodated in IT, where they can often get the data but not the business question, or housed in the business, where they struggle to get the data. In addition, much of the value in analytics comes from looking across the business. Silos tend to be quite good at optimising their own narrow world; the value comes from optimising across silos.
So how do you confirm there is value in something, like data, when you are removed from the business process itself?
This is when a discovery platform may be valuable. If you can provide easily accessible analytical ‘sandboxes’ that are both easy to use and can access all types of data, you change the problem from one of funding to one of testing the findings. Currently, discovering the value in a piece of data is hard. There is no single technology that addresses all requirements, so users have to employ multiple tools. HDFS and Hadoop are attracting a lot of interest but are not the easiest to use, especially for business users. SQL is positioned as more of a business language but does not access all data structures. So what do you do if you want to find valuable data but are skills-constrained?
Someone mentioned to me that one of the ways this can be done is by using an agile analytics methodology or approach. In my experience of agile, admittedly mostly in software development, “agile” has often become an excuse for no documentation, no objective or no accountability, so I have been a little sceptical about anything labelled ‘agile’. Admittedly it has come a long way since I first bumped into agile, so I decided to test my bias and looked through some articles on an internal website about an analytical agility capability. Don’t get me wrong, I buy the drivers for agility – very short term deliverables, direct business involvement, output that matters more than governance or methodology – so I would like it to work. These articles were also about analytic agility, not so much an agile methodology.
After reading through most of them, there was something that did strike me, and it revolved around what we call a ‘Discovery Platform’.
The only way to identify new items of valuable data is to experiment and test. Something I have been hearing is “fail fast”, which sounds bad but really means test lots of things, do it properly and determine quickly which ones are not going to work. Take the successful experiments and operationalise them fast. This is what a discovery platform can enable. I would like to get other people’s views, but to me this is a way to rapidly test things and determine which data should be integrated into the data warehouse for the long term.
This is where the Discovery platform is emerging: a set of technologies that makes it easier to integrate multiple sources and types of data while providing a uniform mechanism to access them, namely SQL. These technologies also provide a means to test an insight rapidly and thereby prove its value before investing in operationalising it. You get to test and evaluate the value in a meaningful way before having to invest in making the data available.
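To make the idea of a sandbox a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular discovery platform works: the data, the table names and the functions `build_sandbox` and `avg_spend_by_pricing_visit` are all hypothetical, and an in-memory SQLite database stands in for the sandbox. It loads one structured source (sales) and one semi-structured source (raw JSON web logs) side by side, then uses plain SQL to ask whether the new data appears to relate to something the business cares about.

```python
import json
import sqlite3

# Hypothetical data: sales rows of the kind already held in a warehouse,
# and raw JSON web logs that have not yet been integrated anywhere.
sales = [("c1", 120.0), ("c2", 0.0), ("c3", 340.0), ("c4", 0.0)]
web_logs = [
    '{"customer": "c1", "page": "pricing"}',
    '{"customer": "c2", "page": "home"}',
    '{"customer": "c3", "page": "pricing"}',
]


def build_sandbox(sales_rows, raw_logs):
    """Load both sources into a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, spend REAL)")
    conn.execute("CREATE TABLE visits (customer TEXT, page TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", sales_rows)
    # Flatten the JSON events into rows so SQL can reach them.
    parsed = [(e["customer"], e["page"]) for e in map(json.loads, raw_logs)]
    conn.executemany("INSERT INTO visits VALUES (?, ?)", parsed)
    return conn


def avg_spend_by_pricing_visit(conn):
    """Average spend, split by whether the customer viewed the pricing page."""
    rows = conn.execute(
        """
        SELECT EXISTS (
                   SELECT 1 FROM visits v
                   WHERE v.customer = s.customer AND v.page = 'pricing'
               ) AS saw_pricing,
               AVG(s.spend) AS avg_spend
        FROM sales s
        GROUP BY saw_pricing
        """
    ).fetchall()
    return {bool(flag): avg for flag, avg in rows}


if __name__ == "__main__":
    sandbox = build_sandbox(sales, web_logs)
    print(avg_spend_by_pricing_visit(sandbox))
```

If an experiment like this shows a difference worth acting on, that is the evidence for investing in bringing the web logs into the warehouse properly. If it does not, you have failed fast and spent very little.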
If anyone has a view on either an agile approach to identifying valuable data or how a Discovery platform can help in operationalising analytics, it would be great to hear it.
Craig Rodger is a senior Pre-sales Consultant with Teradata ANZ focusing on advanced analytics. He has spent 20 years in the IT industry working on how to get value out of systems rather than getting things into them. Having been a member of a number of executive management teams in software, technology and consulting companies and helping build a number of technology business ventures he joined an advanced analytics vendor.