
Sick as a parrot

Image Source: http://www.marketing.ch

Oh dear, oh dear, oh dear. Electing to change my flights so that I could watch the England versus Germany game in the Lufthansa lounge at Frankfurt airport en route to Madrid on Sunday turned out not to be my smartest ever move; not only did I have to suffer the indignity of watching England crash out to a technically and tactically superior German team, I was also surrounded by jubilant German fans as the debacle unfolded. For me at least, the controversy over the disallowed Lampard goal should not obscure the fact that England performed poorly throughout the tournament. The young German team has earned the right to contest a quarter-final with Diego Maradona’s Argentina team, although I cannot quite bring myself to wish them well in that endeavour.

 

(This should not upset my German friends and colleagues unduly, as every team that I have cheered for on tour has been ignominiously defeated: France crashed 2-0 to Mexico the day after I chanted “allez les Bleus!” in Paris; in Rome, citing a long-dead maternal Italian great-grandfather, I melodramatically ripped off my business shirt to declare my allegiance to the Azzurri, only to watch Italy succumb 3-2 to Slovakia; and I have said enough already about the England game. Football and I may be about to enter a period of trial separation.)

 

And so to Madrid, for the final stop of our 2010 CTO Road Show. Madrid’s Puerta del Sol is the centre of the country’s road network (el kilómetro cero). The Spanish traffic administration is spearheading the application of analytics in road and traffic management – and is going well beyond merely spreading information about congested motorways and on-going construction work in doing so. The authority responsible for traffic administration – the Dirección General de Tráfico (DGT) – is mining its traffic data to identify trouble spots where significantly more accidents occur than elsewhere and then taking measures to improve road safety at these places. The DGT started on the journey of developing the analytical capabilities necessary to succeed in this endeavour about three years ago, when it started the implementation of an EDW that incorporated data from a variety of sources, including driver, vehicle and meteorological data as well as traffic tickets and data from automated radar devices. By tracking the impact of speed controls, for example, the authorities can optimize the location of their radar devices; bad news for drivers like myself with a heavy right foot, but important to improving public safety, nonetheless. This analysis will become both easier and more sophisticated this year, as DGT is preparing to make use of Teradata’s new geospatial capabilities to plot the location of traffic incidents with greater accuracy and to manipulate this data much more efficiently.
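
For the technically curious, the kind of trouble-spot analysis described above can be sketched in a few lines of Python. This is a minimal illustration only and not the DGT's actual method: the function name `find_hot_spots`, the grid cell size and the two-standard-deviation threshold are all assumptions made for the example.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def find_hot_spots(accidents, cell_size=0.01, z_threshold=2.0):
    """Return grid cells with significantly more accidents than elsewhere.

    accidents:   iterable of (latitude, longitude) pairs.
    cell_size:   edge of a grid cell in degrees (hypothetical default).
    z_threshold: how many standard deviations above the mean cell count
                 a cell must be to count as a trouble spot (hypothetical).
    """
    # Bucket every accident into the grid cell that contains it.
    counts = Counter(
        (math.floor(lat / cell_size), math.floor(lon / cell_size))
        for lat, lon in accidents
    )
    if len(counts) < 2:
        return []
    values = list(counts.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # Every cell has the same count, so nothing stands out.
        return []
    return sorted(
        cell for cell, n in counts.items() if (n - mu) / sigma > z_threshold
    )
```

A real geospatial platform would work with road segments and proper spatial types rather than a crude latitude/longitude grid, but the principle is the same: count incidents by location and look for the outliers.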

 

All of which is proof – if proof were needed – of what my travelling companion and Teradata technology supremo, Stephen Brobst, has been saying throughout our European tour: the “Internet of things” is upon us and time and space are critical dimensions where the Tsunami of sensor data is concerned. The accurate and consistent representation of time and location attributes is necessary, but not sufficient; DBMS technology must also be able to support scalable, simple, high-performance manipulation of these data. As Stephen likes to point out, “a write-only Data Warehouse is not very interesting”; critically, we must also be able to manipulate these data. Stephen’s explanation of Teradata’s own implementation of high-performance geo-spatial processing is a model of clarity; if you were unfortunate enough to miss his presentation on tour and would like to know more, please get in touch to discuss your requirements.

 

While Madrid is focussing on the calles and carreteras, Paris is using analytics to improve travellers’ experience of public transport. The Syndicat des transports d'Île-de-France (STIF) has been recording departure and arrival times, carrier and dates of travel - mainly by capturing anonymous mobile phone data of each individual traveller - since 2008. By using special algorithms to process the resulting data - more than 17 million events per day - STIF is able to model the impact of construction detours or line outages on consumer usage patterns. This gives STIF a much better idea of the actual transportation demand, frequency, journey time and even punctuality, for which the authority had to rely on passenger surveys in the past. The result is greatly improved investment decision-making – not to mention a happier and more relaxed travelling public.

 

This combination of detailed data and geospatial capabilities can be further enhanced with powerful visualization techniques. For example, 4D visualizations add additional dimensions like time to the picture, turning a detailed event map into a movie.  Traffic authorities can use these visualizations to study how traffic patterns vary between weekday rush hours and Saturdays, for example, or to get an impression of the effects that a closed road would have on the whole road system and schedule maintenance works accordingly.  City planners may also be interested to analyze how average income levels in different areas correlate with traffic patterns, to predict likely traffic volumes as cities develop and evolve.

 

Good visualization technology – like that provided by Teradata partner Tableau - can help analysts to rapidly uncover meaning and relationships in even very large and complex data-sets.  Several million English fans did not need advanced technology to see clearly on Sunday that which one Italian apparently could not: that Peter Crouch would have been much more likely to score the 3 goals that England needed to avoid elimination from the 2010 World Cup than Emile Heskey.  Probably it made no difference to the final outcome, but still Fabio Capello was blind to the bleedin’ obvious.

 

Until the next time, adios, amigos.

 

Martin Willcox



It’s a numbers game (of two halves)

Zurich

The world centre of football is… Zurich! South Africa may be getting all of the attention at the moment, but it is here in Switzerland's largest city that most of the important decisions are made. Zurich is, of course, home to world governing body FIFA and it is here that decisions are taken about how the rules of the game should be enforced, what price broadcasters should pay to secure TV rights to the World Cup – and how that King’s Ransom should then be distributed.

 

This year’s tournament will probably earn FIFA a record $2B; by comparison, the 2002 tournament earned it only a paltry £636M (or $958M at current exchange rates). The cost to South Africa of staging the 2010 World Cup is estimated at $3.5B; a price that some commentators consider far too great for a country blighted by want on such a large scale to bear; especially as the lucrative – and predictable – marketing rights are FIFA’s to sell, whilst only the unpredictable revenues from local ticket sales are returned to the host nation. FIFA, it should be said, has made a $500M contribution towards the infrastructure costs incurred by the South African organizing committee. And it will earn no substantial additional revenues until the 2014 World Cup, despite the fact that it will incur significant costs in organizing less glamorous competitions and in promoting the game at grass-roots level across the world between now and then.

 

Rather harder to defend, perhaps, are players’ incomes, which have grown even more spectacularly than have FIFA’s revenues. According to one respected review, the average annual salary of footballers in England's top league broke through the £1m ceiling in 2007, which means that even the journeymen in the current England team now earn near enough the national average annual salary in just one week. England’s World Cup winning team of 1966 were also relatively well-paid - but earned 6 times more than the average annual salary, not 52 times more.

 

In the last 30 years, meanwhile, processor performance has increased by five million times, whilst storage performance has increased by a factor of only five. This yawning performance gap – bigger even than the gap between the performance of the England team and the expectation of the English media and fans - is clearly unsustainable, particularly for I/O intensive workloads such as data warehousing. And this is the reason for our excitement about solid-state storage (SSD) technology; as Teradata technology supremo Stephen Brobst has been explaining on our tour of the region, SSD technology improves random I/O performance by two orders of magnitude and will transform the database industry.

 

Though revolutionary, SSD technology remains more expensive than magnetic storage technology - and that disparity will endure, even if, as Stephen predicts, the cost of SSD technology continues to decrease by 40% year-on-year. Not only that, but the industry faces an explosion in data volumes.

 

Total stored data volumes are increasing exponentially. Stephen makes the point that the Petabyte (10^15 Bytes) age is already upon us - and that the Zettabyte (10^21 Bytes) age is in the line-of-sight. One of the current high-profile drivers of this growth - Web 2.0 and its myriad micro-conversations, mediated by Facebook & Co. – is in fact less important than the revolution in sensor technology and the Tsunami of sensor data that will follow. As Stephen says of Social Networking: “in the end, there are only so many monkeys and so many typewriters – but there will be trillions of interconnected sensors.” Those interconnected sensors will re-define the meaning of “detailed data”, as we move from collecting transaction data in the data warehouse to capturing interaction data.

 

(Stephen, by the way, is an avid social networker himself and is paraphrasing the old adage that given enough monkeys and enough typewriters the Complete Works of Shakespeare could be reproduced, not making pejorative judgements about the value of social networks and those that spend their time sharing online! At least I don’t think he is calling me a monkey.)

 

Teradata was, of course, the first vendor to announce an all solid-state storage data warehouse appliance, an achievement that we are justifiably proud of. But not even in our wildest dreams do we envisage that it will make sense for most of our customers to put most of their data on SSDs. And trust me, my wildest dreams are pretty wild…

 

That’s not just because solid-state storage is relatively expensive (though it is), and not just because the unit price of magnetic storage also continues to fall (which it does, albeit less spectacularly than the price of solid-state storage). It’s because the Pareto rule also applies to storage access: 80% or more of access requests address 20% or less of the data managed by the average analytical database. It follows that no sane CFO will approve a request to store most or all of an organization’s - exponentially increasing, remember – data on the most expensive, high-performance storage devices. Not when he or she could get roughly the same benefit for 20% of the outlay. And insane CFOs are as rare as England hat tricks at the World Cup (another iPod Shuffle to the first reader who can correctly identify the player who last accomplished this feat and the match and tournament concerned).

 

All of which is why Teradata’s all solid-state storage appliance is just the beginning of our journey with this technology - and the reason that the ultimate destination is mixed-storage products, with “hot” and “cold” data automatically and transparently migrated between the different storage devices using our industry-leading Teradata Virtual Storage software. The automated migration of data is much more than cool, geeky engineering, by the way - it’s essential to the trick of getting an 80% performance boost for a 20% investment, because today’s “hot data” is yesterday’s old news. That ball will keep on rolling…
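
For readers who like to see the arithmetic, here is a minimal Python sketch of the hot/cold idea. It is purely illustrative and says nothing about how Teradata Virtual Storage works internally; the function names, the block identifiers and the 20% fast-tier share are assumptions made for the example.

```python
def assign_tiers(access_counts, fast_fraction=0.2):
    """Split data blocks into a hot set (fast storage) and a cold set.

    access_counts: dict mapping a block id to its recent access count.
    fast_fraction: share of the blocks that fits on the fast tier
                   (hypothetical; 0.2 mirrors the 20% in the text).
    """
    # Most frequently accessed first; ties broken by block id so the
    # result is deterministic.
    ranked = sorted(access_counts, key=lambda b: (-access_counts[b], b))
    n_fast = int(len(ranked) * fast_fraction)
    return set(ranked[:n_fast]), set(ranked[n_fast:])


def hit_rate(access_counts, hot):
    """Fraction of all accesses that would be served from the hot set."""
    total = sum(access_counts.values())
    if total == 0:
        return 0.0
    return sum(access_counts[b] for b in hot) / total


# Ten blocks: two busy ones account for 800 of 1,000 accesses.
counts = {"a": 400, "b": 400}
counts.update({name: 25 for name in "cdefghij"})
hot, cold = assign_tiers(counts)
print(sorted(hot), hit_rate(counts, hot))
```

Re-running `assign_tiers` on fresh access counts is the "migration" step: when yesterday's cold block becomes today's favourite, it moves into the hot set and something else drops out.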

 

Enough of these number games, time for the real thing. England are through to the last 16 of the World Cup – just! – and take on Germany on Sunday, a game I will be watching in the Lufthansa lounge at Frankfurt airport, en route to Madrid. Meat pie, sausage roll – come on England, give us a goal!

 

Martin Willcox


Infamy! Infamy! They’ve all got it in for me!

Rome

 

I promised yesterday that normal service would be resumed, so the title of today’s post is a Kenneth Williams one-liner from the film “Carry on Cleo”. This gag was voted the funniest in film history - ahead even of Leslie Nielsen’s legendary “…and don’t call me Shirley” - in a poll conducted by Sky Movies. I mention this because it’s just possible that one day this very question will be all that separates you from the million Pounds / Euros / Dollars in “Who Wants to be a Millionaire”.

 

Actually I mention this because Williams plays the part of Julius Caesar in the film and I need a tenuous link to the Roman Empire. (Did you see what I did there? Maybe I could do this for a living!)

Benjamin Disraeli said of Rome: “A great city, whose image dwells in the memory of man” – and to wander the streets of Rome is indeed to marvel that so much architecture, beauty and history can be consumed locally. What must it be like to live here; to be permanently surrounded by all of this? Does one eventually forget to stop and stare at the Colosseum, the Pantheon, the Arch of Constantine? Do modern Romans instead hurry past all of this beauty to beat the morning rush at Starbucks?

 

And then, of course, there is the Forum. As impressive as the ruins undoubtedly are, you still need a vivid sense of imagination to see this place for what it really was two millennia ago: the very centre of the civilized world. As well as being arguably the world’s first global marketplace, the Forum was the centre of the Roman Republic. Governed by a complicated system of rules reflecting the class system, the Roman Republic has since become synonymous with the very idea of “the public”. Legendary statesmen like Cicero sought popular acclaim here; and cunning politicians like Mark Antony (“I come to bury Caesar, not to praise him” – Shakespeare, Julius Caesar) instigated public outrages that eventually swept the Republic away. The times, it seems, have always been a-changing.

 

The ideal of the “forum” continues to thrive with the arrival of the Internet age, as people increasingly meet, flirt and debate online. Most of us welcome these new virtual opportunities for window shopping and self-expression; indeed, as Teradata technology supremo Stephen Brobst has pointed out in his Social Network Analysis presentation on our tour of the region, the younger generation increasingly prefer social networking tools to SMS, chat, e-mail and voice-calls.

 

But is there trouble in paradise? Facebook recently found itself in the centre of a storm after loosening its privacy rules and giving business partners much wider access to member profiles than before. The resulting publicity seems to have made many Facebook members consider for the first time whether they are revealing more about themselves than perhaps they should, with at least one survey claiming that a majority of users are considering withdrawing from the network altogether.

 

Beyond the confines of social networks, Google has been accused of intercepting access codes and data packages from private Wi-Fi networks with its Street View cars. And the objective of the authors of the web-site entitled www.pleaserobme.com was to raise awareness of the risks associated with the “over-sharing” of geo-spatial data, rather than to actually provide material assistance to the criminal fraternity.

 

(When operational, www.pleaserobme.com aggregated public Twitter messages that had been pushed through Foursquare - a service whose whole raison d’etre is to tell others where you are right now – and which, of course, enables others to infer that you are not at home. Having made their point, the creators of the site promptly shut down the aggregation process, although it is still worth visiting for the pop-art depiction of a grizzled felon.)

 

Social Networking and related technology is – just like genetic science – advancing faster than the modern public’s understanding of the consequences. And who is “the public” in today’s interconnected world, anyway? Teenagers in Sydney and Stockholm may soon have more in common with each other than with their older neighbours next door. And why would we expect a sixteen-year-old who has grown up with the Internet to see privacy and security through the same lens as a sixty-year-old former East German, with recent experience and bitter memories of living in a totalitarian surveillance state?

 

Against this background, there is an increasingly urgent need for Society to tackle the question of what organizations and States can – and cannot - legitimately do with these data. Social Networking tools are transforming the lives of hundreds of millions of us for the better; and the mountains of data that they generate have the potential to transform B2C businesses, as well, by providing insights into customer behaviours and preferences never before available. We will all be poorer if the slings and arrows of outrageous privacy infringements kill the goose just as it is laying the golden eggs.

 

One major European Teradata customer understands this better than most. The screensaver on the desktop PCs in its online division features a young girl, on the point of blowing out the candles on her birthday cake. “Why do you want my date of birth?”, reads the caption. “Are you going to send me a birthday present?” It’s a constant and visible reminder to staff: whenever you design business processes that will capture the personal data of consumers, you must ensure that customers know and understand both that the data will be stored and why the data will be stored.

 

Social media networks and other online services need the informed consent of their members if they are to survive and not be swept away, as the ancient Roman Republic eventually was. More than that, the smartest companies will increasingly come to see transparency and honesty about their information management practices as a competitive advantage. We would all rather do business with people that we trust. Just ask Julius Caesar.

 

Martin Willcox


Walking the ghost back home

Keen readers of this blog (hi Mum!) will have noted that I get much of my news from The Economist. Some attempt – mostly incorrectly - to infer my politics from this affiliation; whilst others recognize instead the value to the frequent traveler of a publication of such density and authority. Slim enough to fit into a laptop bag, an edition of The Economist nevertheless features reliable news reports from around the globe plus a passable survey of business, science, technology and the arts to boot. If it only had something as frivolous as a sports section…

 

It is also beautifully written. Witness the painful eloquence with which the paper reported the recent tragic death of the Polish President, Lech Kaczynski and his entourage - among them many of the country’s best and brightest - in a plane crash:

“The death of Poland's president carries a terrible echo of his country's past… the Katyn massacre, of 22,000 Polish officers in the spring of 1940… was more than the illegal execution of prisoners-of-war. It was the decapitation of the country’s pre-war elite. The officers, including many reservists, were the lawyers, doctors, teachers and intellectuals who would have posed the most profound challenge to the cynical division of Poland under the Hitler-Stalin pact of 1939.”

 

In a cruel twist of irony, Kaczynski and much of the cream of Polish intelligentsia were, of course, on their way to a ceremony to honour the Katyn dead; and so one elite was stolen and another died in attempting to celebrate their sacrifice.

 

Only the hardest heart could fail to be moved by Poland’s loss of two elites in the space of 70 years – not to mention all of the other and greater losses that came before and after Katyn - just as only the strongest of those of us that fly regularly could fail to shudder inwardly at the pictures of the burning wreckage of Kaczynski’s plane. The Poles themselves, instead of preparing for their summer holidays, find themselves mid-way through an unexpected Presidential election, with all of the national soul-searching that accompanies such contests.

 

So, no long stories from me today; normal service – complete with bad jokes and tortured metaphors – will be resumed tomorrow. In the meantime, I will sign off instead by sending my best wishes to all of my Polish friends and colleagues; by expressing my hope that the sincere and spontaneous outbursts of grief and sympathy from neighbours-and-former-adversaries to east and west will help to lay the ghosts and divisions of Katyn - and that which came before and after - to rest; and with these words of comfort, also from The Economist:

This… accident is appalling. But it does not derail Poland’s path to success, out of the ruins of the pre-war republic, from the devastation of war and communist rule…


Amen.

Martin Willcox


In the shadow of the Stephansdom

Image Source: http://tanjal21.wordpress.com/wahrzeichen/

Oh, Vienna! If, like me, you are of a certain age, you will find it almost impossible to say those words, rather than sing them out loud. Our formative years, it turns out, are just that.

 

This was my third or fourth visit to Vienna, but the first time that I actually got to see more of the city than the airport and a business hotel.  And there is a lot to see: Vienna, former capital of the Austro-Hungarian empire, has enough Imperial majesty and Romanesque, Baroque and Classical splendour for three capitals, never mind one.  The beauty is more than skin-deep; successive surveys by the Economist Intelligence Unit and international HR consultancy Mercer have ranked Vienna amongst the very best cities in the world to live.  Most importantly, I found a great bar near the Stephansdom with cold beer and that was showing the football!  I hope that my Spanish friends enjoyed the game, but I predict that the fact that David Villa’s left leg is only for standing on will cost you dearly later in the tournament…

 

Austria’s most famous recent export is arguably Arnold Schwarzenegger.  I thought of The Governator recently when reading an article in The Economist, concerning the development of metals that can heal themselves.  Are the researchers employed by Skynet?  Is the rise of The Terminators upon us?

 

Probably not.  Whilst the twin disciplines of Artificial Intelligence and Robotics have made incredible strides in recent years, building a robot that resembles anything like a human being in form or in function is still an elusive goal, the preserve of science fiction, not engineering reality.  The next generation of smart robots are more likely to resemble amoebas than Terminators, although as this article points out, many of us “may find the idea of having an autonomous blob roaming around inside [us]” at least as frightening as we find the idea of pathological robots roaming the planet.

 

With the benefit of hindsight, it is easy to pour scorn on the past visions of many of the futurologists: we don’t fly to work in sky-cars; and neither do we meet all of our nutritional needs by popping a single pill.  But in some respects, the technological future that is almost upon us is even more incredible.  “The Internet of Things” may shortly connect very nearly everything with everything else.  A world in which 50 to 100 trillion objects are equipped with sensor technology and can be identified, tracked, measured and interact with one another is in the line of sight.

 

RFID tags, by way of example, have had a slower and more painful gestation than was originally prophesied, but the technology is increasingly robust and is likely to be much more widely deployed in the near future.  The tracking of items all the way through the supply chain will revolutionize the Retail and Transportation and Logistics industries.  Retailers, for example, will for the first time be able to understand whether you purchased that FIFA World Cup promotional pack of beer from the aisle display or the display at the till.  Much more interestingly from the point of view of the consumer, your fridge will know that the steak you bought nearly a week ago needs to be cooked today or thrown away.  And will e-mail you at work to tell you that it has ordered a pepper sauce, potatoes, a nice, crisp salad and a bottle of Bordeaux to go with it, to arrive just after you get home in the evening.  Cheers!

 

Some of this is already upon us. A leading aero engine manufacturer streams engine data (component performance, temperature, vibration, etc., etc.) from networks of sensors distributed throughout the engines to a Teradata database whilst the aeroplane that the engines power is still in flight. Analysing this data and correlating it with historical engine data enables said manufacturer to predict when components will fail and to arrange for scheduled, preventative maintenance.

 

For passengers this is reassuring, but the real impact is an economic one; mercifully, modern aircraft rarely fall out of the sky because their engines fail - but they are sometimes grounded for unscheduled maintenance.  This unscheduled maintenance represents a vast cost to the airlines - and to the engine manufacturers, who are increasingly obliged to provide their power plants on a lease or “power by the hour” basis.  Spot that a particular sub-component is running hot - and that this is characteristic of the failure of a larger component within the next 30 hours of flying time - and you can get a replacement engine bearing (and an engineer qualified to fit it) to the engine and the aeroplane concerned, before the former grounds the latter.  This sort of technology enables leading operators to keep their aircraft in the air for close to 18 hours in every 24; a remarkable achievement when you consider how complex a system a modern commercial airliner is.

 

None of which is remotely as cool as my current favourite application of sensor technology: the Wembley mousetrap.  I’ve told this story before, but it’s so good that it bears repeating.

 

The mousetraps at the new Wembley stadium are equipped with mobile phone chips and “phone home” when they detect that they have caught a rodent.  As is often the case where the introduction of new technology is concerned, the principal motivation is to improve operational efficiency; the same number of smart traps require fewer maintenance staff than does an equivalent number of dumb traps, because now we only need send someone to empty the traps that have caught something, rather than to check every trap, every day.

 

Keep the detailed data, however, and now you can forecast when a particular trap could next be expected to become temporary home to one of our furry friends.  Now when a trap catches a rodent at 5PM on a Friday, the stadium operator can make an intelligent decision about whether to pay overtime to empty it - or leave the “sleeping” rodent to lie until Monday.
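
By way of illustration, here is a minimal Python sketch of that forecast. It assumes nothing more sophisticated than "last catch plus the average gap between past catches"; the function name `forecast_next_catch` and the sample dates are hypothetical, and a real stadium operator would no doubt want something smarter (match days, seasons and so on).

```python
from datetime import datetime, timedelta


def forecast_next_catch(catch_times):
    """Estimate when a trap will next catch something.

    catch_times: list of datetimes at which this trap "phoned home".
    Returns the last catch plus the mean gap between past catches,
    or None when there is too little history to say anything.
    """
    if len(catch_times) < 2:
        return None
    ordered = sorted(catch_times)
    # Gaps between each catch and the one that followed it.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps, timedelta()) / len(gaps)
    return ordered[-1] + mean_gap


history = [
    datetime(2010, 6, 1, 17, 0),
    datetime(2010, 6, 4, 17, 0),
    datetime(2010, 6, 11, 17, 0),  # 5PM on a Friday
]
print(forecast_next_catch(history))
```

If the forecast lands well after Monday morning, the trap is unlikely to be needed over the weekend and the overtime can be saved.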

 

Integrate these data with other data and things start to get really interesting: by overlaying the location of the traps on a drawing of the stadium and using a heat map I can start to understand where the traps that catch the most rodents are located and which routes the vermin are using to get into the stadium; add the location of the fast food outfits and now I know who isn’t managing the trash properly; and so on and so on and so on…  When sensor technology, combined with best practice approach to information management (keep the detailed data, integrate the data with other data) can revolutionize something as banal as the humble mousetrap, you know it is an idea whose time has come.

 

Sensor technology will soon be ubiquitous.  Detailed data from networks of intelligent sensors – especially when integrated with other data - will give the organizations that we serve and that serve us unprecedented insight into all manner of behaviours and processes, existing and not yet imagined.  And all of these sensor measurements will be associated with a time and a place, both of which will be critical to our understanding of the meaning of these measurements.  Time we discussed yesterday, place or “location” is a theme to which we will return in due course.  But for now it’s time for me to re-locate to 52° 14’ North, 21° 0’ East; via 48° 11’ North, 16° 56’ East.  Adios, amigos!

 

Martin Willcox

 


The Time Traveller’s Life

Image Source: www.schwarzaufweiss.de

And so back to Blighty on the Eurostar. Home for me is Sheffield and so I experienced a sense of dislocation as we pulled into St. Pancras station. Normally my arrival there signifies the start of a journey; this time it was something else: not quite a homecoming - and not quite a beginning, either.

 

Still I shouldn’t complain too much about finding myself in London on the eve of England’s second World Cup qualifier. The train to Sheffield tomorrow should get me home to my family just in time for the second half of the game; had the 18th found us in Moscow or Rome, I would be likely to miss the entire match. England’s third and decisive tie next Wednesday afternoon will find me in Warsaw, pretending to pay attention to Teradata technology supremo, Stephen Brobst, whilst frantically texting my friends back home for minute-by-minute updates.

 

Of course, we English – not to be confused with the Brits, a wider clan that includes our Celtic cousins – are often regarded as rather aloof, cold even. Not a bit of it! Beneath the thin veneer of the “stiff upper lip” we are more sentimental even than the Italians.

 

To see the truth of this assertion, mention “1966” to any red-blooded Englishman, then watch as his eyes mist over and prepare to be regaled with tales of footballing heroics for several hours. Never mind that many of us - myself included – were not even born when England won the World Cup by defeating West Germany 4-2; never mind that one of the crucial goals may not even have crossed the goal line; this was surely the greatest game of football ever played! If time travel machines are ever invented and commercialized, the queue to get into the 1966 Wembley World Cup Final will be extended by several million Englishmen, all of us trembling in anticipation.

 

(If, however, you prefer the stiff upper lip on your Englishman, use instead the phrases “1970”, “Peter Bonetti” and “howler”; unfortunately, we English have experienced erratic goalkeeping before Rob Green re-acquainted us with it in the game against the USA last Saturday.)

 

Funny thing, time. As I pointed out in my Paris blog recently, perception and reality are frequently not aligned as well as we might like them to be, especially in the Data Warehouse. Consider that facts are typically recorded in any database some time after they occur in “the real world” – and that even in these days of near real-time data acquisition, several systems may lie between the data warehouse and the actual business event in question. Consider also that this latency – the gap between the timing of an action and its recording in the database - can vary very significantly: human beings forget to run an end-of-day process, delaying the transmission of point-of-sale data for some store or other by hours or days; telecommunications switches go on strike and hold on to call detail records for far longer than they should, etc., etc., etc. Not only that, but with the increasing use of the Data Warehouse for predictive modelling, we increasingly want to store forecast values for the same variable, as calculated at multiple, different points in time.

 

This problem has long been recognized and understood in academic circles; researchers studying the representation of temporal data distinguish between “valid time” (the interval during which a certain state persists – or is forecast to persist - in the external world); and “transaction time” - the set of times that a particular proposition was represented in the database as being true.
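
To make the distinction concrete, here is a minimal sketch of a "bitemporal" record in Python. The `Fact` class, the `city_as_of` helper and the sample customer are all hypothetical, invented purely for illustration; the sketch assumes half-open periods (the end date is excluded) and uses `date.max` as a stand-in for "until further notice".

```python
from dataclasses import dataclass
from datetime import date

FOREVER = date.max  # assumption: stand-in for an open-ended period


@dataclass(frozen=True)
class Fact:
    customer: str
    city: str
    valid_from: date  # valid time: when the state began in the real world
    valid_to: date    # valid time: when it ended (exclusive)
    tx_from: date     # transaction time: when the database began asserting this row
    tx_to: date       # transaction time: when the database stopped asserting it (exclusive)


def city_as_of(rows, customer, valid_on, known_on):
    """Where did we believe, on `known_on`, that the customer lived on `valid_on`?"""
    for r in rows:
        if (r.customer == customer
                and r.valid_from <= valid_on < r.valid_to
                and r.tx_from <= known_on < r.tx_to):
            return r.city
    return None


# The customer moved to Madrid on 1 June, but the database only heard about it on 10 June.
rows = [
    Fact("c1", "London", date(2009, 1, 1), FOREVER, date(2009, 1, 2), date(2010, 6, 10)),
    Fact("c1", "London", date(2009, 1, 1), date(2010, 6, 1), date(2010, 6, 10), FOREVER),
    Fact("c1", "Madrid", date(2010, 6, 1), FOREVER, date(2010, 6, 10), FOREVER),
]

print(city_as_of(rows, "c1", date(2010, 6, 5), known_on=date(2010, 6, 5)))
print(city_as_of(rows, "c1", date(2010, 6, 5), known_on=date(2010, 6, 15)))
```

The same question about 5 June gets a different answer depending on when it is asked: the first call reflects what the database believed at the time, the second what it believed after the correction arrived.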

 

And databases have another problem with the representation of time: the concept of an interval is not natively supported in the ANSI SQL standard. We timestamp data rows with a representation of the point-in-time that some event occurred, but when we want to know “how long has this been true” or “how long since this has been true” we must identify and compare multiple columns and/or rows. This makes these queries difficult to express and inefficient to run.
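
A rough Python sketch shows why point-in-time rows make "how long has this been true" awkward: each row has to be paired with the next one before any duration can be computed. The `to_periods` function and the account status data are hypothetical, made up for illustration only.

```python
from datetime import datetime


def to_periods(events, now):
    """Turn point-in-time (timestamp, status) rows into (status, start, end) periods."""
    ordered = sorted(events)
    periods = []
    for i, (started, status) in enumerate(ordered):
        # Each state lasts until the next recorded change, or until "now" for the latest row.
        ended = ordered[i + 1][0] if i + 1 < len(ordered) else now
        periods.append((status, started, ended))
    return periods


# Hypothetical status changes for one account.
events = [
    (datetime(2010, 6, 1, 9, 0), "open"),
    (datetime(2010, 6, 10, 9, 0), "suspended"),
    (datetime(2010, 6, 15, 9, 0), "open"),
]
now = datetime(2010, 6, 17, 19, 11, 6)

periods = to_periods(events, now)
status, started, ended = periods[-1]
print(status, "for", ended - started)
```

With a native period type, the start and end would live in a single value on a single row, and the pairing step would disappear.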

 

It gets (even more) complex. When we represent time in the database, we should really take care to enforce a number of different temporal constraints: something shouldn’t end before it has begun; particular states shouldn’t overlap with one another (what does it mean if I am recorded as married both from 1999 until now and from 2002 until now?); and so on.
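
As a minimal sketch, the two constraints mentioned above can be checked by hand in Python as follows. The `check_periods` function is hypothetical, and it assumes periods are `(start, end)` pairs with an exclusive end and a fixed date standing in for "now".

```python
from datetime import date


def check_periods(periods):
    """Return a list of constraint violations for (start, end) periods."""
    problems = []
    for start, end in periods:
        if end < start:
            problems.append(f"ends before it begins: {start} to {end}")

    # Walk the periods in start order, remembering the latest end seen so far;
    # any period that starts before that end overlaps an earlier one.
    latest_end = None
    for start, end in sorted(periods):
        if latest_end is not None and start < latest_end:
            problems.append(f"overlaps an earlier period: {start} to {end}")
        latest_end = end if latest_end is None else max(latest_end, end)
    return problems


now = date(2010, 6, 17)  # assumption: a fixed date in place of the moving "now"
marriages = [(date(2002, 1, 1), now), (date(1999, 1, 1), now)]

for problem in check_periods(marriages):
    print(problem)
```

Every application that writes to the table has to remember to run checks like these, which is exactly the kind of repetition that invites inconsistency.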

 

Attentive readers will note that I just introduced the rather ambiguous expression “now” with careless abandon. In fact it is more correct to speak of “the moving point now”: it is now 19:11:06 on Thursday 17th June, for example – but in a minute, “now” will be 19:12:06 and a minute later “now” will be 19:13:06 and a minute later…

 

How much support does the majority of ANSI SQL-based database technology provide for the management of all of this complexity? Basically, none. A customer wishing to represent valid and transaction time in the database, for example, must add 4 timestamp columns to every table in the database and then write complex DDL and DML statements over-and-over-and-over to keep the data consistent and to extract meaningful temporal information from it. This was a significant issue before the data warehouse started to become a key regulatory-and-compliance reporting platform in many organizations; now it’s a disaster waiting to happen.

 

Regulation, of course, is in fashion like never before. The new government in Britain wants the banking sector to pay a new levy to reimburse the taxpayer for some of the cost of supporting our, ahem, rather over-extended financial institutions. At international level, an update of the Basel Accord (Basel III) is underway, designed to tighten rules concerning risk exposure, liquidity and obligatory equity capital buffers. An impact study currently underway using real data provided voluntarily by 300 banks from around the world has apparently been something of a challenge for some of the institutions concerned; it was reported in the German edition of the Financial Times last week that at one German bank, 20 employees had laboured for 3 months just to extract these data (you won’t be surprised to learn that the institution in question does not have an Enterprise Data Warehouse solution from Teradata). Privacy and security concerns and the general “anti-business” sentiment that is the (probably inevitable) consequence of recent travails mean that other industries, too, face increased regulation.

 

All of which is why the marquee feature in the next release of the Teradata DBMS (13.10, due in the second half of this year) is native temporal support: period data types and operators to represent intervals; temporal tables in which temporal semantics are automatically applied by the DBMS; automated maintenance of temporal data, resulting in the hassle-free creation of history rows; easy-peasy-to-express temporal queries (what is, what was, what will be); and all of the above implemented and optimized for scalable, high-performance parallel execution, using an approach based on the closest thing that the database industry has to a standard. Think of it as time travel for the data warehouse.

 

Monday (valid time) will find us in Vienna; I’ll tell you all about it in my next blog post, uploaded and available on Tuesday (transaction time).

 

Martin Willcox


Your friends are bad for your health

Image Source: www.ee.washington.edu

I think that someone once said, “The man who has grown tired of Paris has grown tired of life itself”. And that is a sentiment with which I was in complete agreement as I strolled from my hotel to the venue for the Paris leg of our CTO Tour in the sunshine. True, the entente cordiale had been strained somewhat by the fact that the landing page for the Wi-Fi service in my hotel had wanted me to click on an American flag for English language instructions – a fact that my travelling companion, Teradata technology supremo Stephen Brobst, found hilarious - but Paris is beautiful in the sunshine and life was good. Arriving at the venue, I greeted a French colleague warmly and told him what a beautiful day it was. “Nonsense!”, he told me, “it’s cold for June!” Perception and reality, you see, are not always the same thing – a theme to which we will return in a future post.

 

My perception is shaped by the cool, damp island to the west that I am proud to call home, but such is the pull of Paris that throughout recent history people have been drawn here from all over the world: artists, tourists, gourmets, businessmen. Paris has accommodated large expat communities from the English-speaking world since the 19th century - including Irishman James Joyce, singer and bon viveur Jim Morrison and American writer Henry James - and many other great men and women besides who have enjoyed huge cultural influence on their home countries whilst living on the banks of the Seine. Joyce, for example, was able to publish his Ulysses here, remaining safe from prosecution, though the work was deemed to be obscene in both the UK and the US and was therefore banned. And the attraction still remains, partly perhaps because you don’t always know what’s going on and what’s going to happen next.

 

In today’s globalized economy, there are probably more expats all over the world than ever before (including my parents, surrounded by vineyards and sunshine in the South of France). Expats often band together, of course, united by a common experience. The Facebook generation might like to think that it has invented social networking - just as the Baby Boomers like to think that they invented sex in the 60s – but those tight-knit groups of expats, artists and musicians from down the years prove that actually, social networks are as old as human interaction itself. The tools, methods and the speed of communication may have changed, but most everything else remains the same.

 

The science of social network analysis also pre-dates Facebook and Twitter (think Stanley Milgram’s decades old “Six Degrees of Separation” experiment): people (“nodes”) are connected to one another by relationships, direct and indirect. Graph theory and related mathematical techniques then enable us to quantify the strength and depth of these networks. And we can identify and measure relationships in all sorts of ways that don’t necessarily involve Facebook, Twitter and the rest; if you and I transfer money to one another or call one another - to name but two examples - then a quantifiable relationship exists between us. And some of these relationships can even identify us; you might change your mobile phone provider, but you still call your Mother – and probably at the same time every Sunday after lunch (or maybe that’s just me).
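
As a small illustration of how a relationship can be quantified from ordinary transactional data, the Python sketch below counts calls between pairs of people, regardless of who dialled. The `tie_strength` function and the call records are hypothetical; real call detail records would carry far more than a caller and a callee.

```python
from collections import Counter


def tie_strength(calls):
    """Count calls between each pair of people, ignoring who called whom."""
    strength = Counter()
    for caller, callee in calls:
        pair = tuple(sorted((caller, callee)))
        strength[pair] += 1
    return strength


# Hypothetical (caller, callee) records.
calls = [
    ("martin", "mum"),
    ("mum", "martin"),
    ("martin", "stephen"),
    ("martin", "mum"),
]

ties = tie_strength(calls)
for pair, count in ties.most_common():
    print(pair, count)
```

A count is only the crudest possible measure; call duration, time of day and regularity could all be folded in the same way.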

 

If that all sounds rather worthy and academic, then consider this intriguing fact: statistically at least, it is probable that your friends are more popular than you are. This is called the “friendship paradox” and it arises because human networks rely on “connectors”: the popular people that we all know and want to be friends with; the people who always know what’s going on and where it’s at; the human glue that cements our relationships with our wider social circle.
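
The paradox is easy to reproduce on a toy network. The Python sketch below compares the average number of friends per person with the average number of friends that a person's friends have; the `friendship_paradox` function and the five-person network, with one connector at its centre, are invented for illustration.

```python
from collections import defaultdict


def friendship_paradox(edges):
    """Return (mean friends per person, mean friends-of-friends per person)."""
    friends = defaultdict(set)
    for a, b in edges:
        friends[a].add(b)
        friends[b].add(a)

    degree = {person: len(fs) for person, fs in friends.items()}
    mean_degree = sum(degree.values()) / len(degree)

    # For each person, average the popularity of their friends,
    # then average that figure across everybody in the network.
    mean_friend_degree = sum(
        sum(degree[f] for f in fs) / len(fs) for fs in friends.values()
    ) / len(friends)
    return mean_degree, mean_friend_degree


# Hypothetical network: one connector who knows everybody else.
edges = [("connector", "a"), ("connector", "b"), ("connector", "c"), ("connector", "d")]

mean_degree, mean_friend_degree = friendship_paradox(edges)
print(mean_degree, mean_friend_degree)
```

Four of the five people have the connector as their only friend, so the typical friend is far better connected than the typical person.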

 

Identify the connectors and many interesting possibilities are available to you. Recent research, for example, suggests that the oh-so-popular connectors contract illnesses before most of the rest of us (serves them right) - they are in contact with more people, after all, each of whom has a finite chance of being infected - and play an important role in the spread of epidemics, because they then go on to infect all of their other many friends after contracting the dread disease themselves. It follows that if health authorities knew who these people were, they could spot outbreaks weeks in advance of current surveillance methods. Perhaps our friends should carry a Government Health Warning?

 

Connectors are also very interesting to marketing types, as – just like bands of expats - we fickle humans tend to value the opinions of our friends and family much more than we value the opinions of strangers with whom we share little and have no common experience. Identify the influencers in the social networks and influence them yourself and you realize a powerful multiplier effect. It follows that traditional recency-frequency-value approaches to customer segmentation – as valuable as they often are - tell only part of the story; an “unprofitable” customer may actually be very profitable indeed, because the value of the trade that they bring to you may greatly exceed the value of their direct contribution.
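
A deliberately simplistic Python sketch of that last point follows. It assumes, purely for illustration, that we somehow know which customer influenced which, and it credits an influencer with the direct value of the customers they brought in, one hop only. The `network_value` function, the figures and the attribution rule are all hypothetical.

```python
def network_value(direct_value, influenced_by, customer):
    """Direct value plus the direct value of customers this customer influenced."""
    brought_in = sum(
        direct_value[other]
        for other, influencer in influenced_by.items()
        if influencer == customer
    )
    return direct_value[customer] + brought_in


# Hypothetical annual contribution per customer; the connector loses us money directly.
direct_value = {"connector": -5, "a": 40, "b": 30, "c": 20}
# Hypothetical attribution: who persuaded whom to become a customer.
influenced_by = {"a": "connector", "b": "connector"}

print(network_value(direct_value, influenced_by, "connector"))
print(network_value(direct_value, influenced_by, "c"))
```

On direct contribution alone the connector looks like the worst customer of the four; once the trade they bring in is counted, they are the most valuable.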

 

Facebook, Twitter and their ilk potentially take this to a whole new level; if I can persuade you to share your data with me – by offering you a free download each week if you become a “fan” of my brand, for example – then I can understand what you think of me (by running sentiment analyses on your posts) and who you are influencing (by assessing how well connected you are). Bring all of this information together in one place with all of the rest of the organization’s data and you have something very close to a true, three-hundred-and-sixty degree view of the consumer. For example, if I complain about the price of the new iPhone 4 but then the sales data shows that I bought one anyway, then Apple probably has the price about right; but if I moan about the price and don’t purchase, then it could be that I and all of my friends are about to desert to Nokia or Google Android.

 

All of which is why Apple should send me the new iPhone and iPad immediately, to arrive at the venue – Altitude 360, Millbank Tower – for the London leg of our CTO Road Show for Friday. Well, it’s worth a try!

 

Martin Willcox
