Getting to know data is big business

Businesses are successfully analysing vast amounts of data to understand consumer behaviour and forecast trends, as Annich McIntosh discovers


Big data is a term increasingly bandied about to describe the terabytes (TBs) of stored information now being held in databanks around the world.

It is potentially big business. This is true not only for the mega IT companies that want to collect, store and interpret the mass of data, but also for every other company, because it offers the chance to learn more about every facet of its business, its customers, and the goods and services it provides.

Big data refers to data of all types, but scientists are responsible for a great deal of it.

To give one current example, the new Large Hadron Collider at CERN on the Swiss-French border is capable of generating 40 TB of data per second of operation. Astrophysics, genetics and meteorology are all areas of scientific research producing data at large and ever-increasing rates.

In the context of customer loyalty, big data enables companies to learn even more about the actions and behaviour of consumers, especially online users, potentially providing useful insights into our wants and needs.

Take Google, for example. It manages to interpret our searches and even the words in emails to trigger advertisements on the subjects we may be interested in. Is this helpful or a gross invasion of privacy? Regardless of your moral stance, the analysis of big data is like a giant rolling snowball and its progress can only be wondered at.

Data has been interpreted for a long time. Tesco, for example, reinvented itself using data analytics, an approach that is now a well-accepted tenet of all loyalty programmes. The principle was easy for consumers to understand: if they agreed to share their shopping behaviour with the retailer, they would receive loyalty points as a thank-you. Most of us agreed.

The issue gets rather more blurred as this collection moves into the online space.

Data generated by users themselves has until recently been very difficult to analyse and interpret. Labelled “unstructured data”, or textual information, it covers all social media posts, including those on Twitter and Facebook, as well as “recommends” and “likes”. Being able to analyse these usefully would be a massive advantage for any company.

This new category of textual information generated by people through emails, instant messaging, blogging and so on contains both structured and well-defined fields, together with more free-form information where context and inference are vital to a complete understanding of the information. It is sometimes called “hybrid data”.

A further category, and by far the largest, is audio, image and video data. This data has the loosest structural characteristics, is the most voluminous, and is the most difficult from which to extract meaningful conclusions and useful information.

Unstructured and hybrid data categories are at the heart of the current excitement around big data. Businesses, including Amazon, Google, eBay, Twitter and Facebook, are using this information in enormous volumes to understand consumer behaviour, and predict specific needs and overall trends.

So what is so important about hybrid data?

Kris McKenzie, head of customer relationship management (CRM) at SAP UK & Ireland, describes it as: “Information which comes in all forms, shapes and sizes. In each customer interaction, vast quantities of both structured and unstructured data is generated in a multitude of guises. This can be anything from text to numerical information, of which all requires analysis in order to develop valuable insight into customer preferences.”

He says: “The biggest sin businesses can make in this situation is wasting data. Companies are guilty of hoarding data, which they have no idea how to manipulate. This effectively results in money and valuable resources going down the drain, not to mention the fact that this makes your business vulnerable in the face of competitors who are optimising the information.”

So where does all this leave us, the consumers, to whom much of this data refers?

Some 2.7 zettabytes of data exist in the digital universe today. Predictions suggest that 35 zettabytes of data will be generated annually by 2020. IDC estimates that by 2020 business transactions on the internet, including both business-to-business and business-to-consumer, will reach 450 billion per day. Facebook alone stores, accesses and analyses 30+ petabytes of user-generated data. How do you make sense of it all? Is there a benefit to making sense of it?

Nick Whitehead, senior director at IT company Oracle and author of a recent white paper on information management and big data, explains: “What organisations are beginning to work up to is the capture and analysis of model data that has never been captured before. It is a natural extension on what has been happening previously. For example, shopping behaviour can be analysed now in specific locations; the social media stuff can be added and this mix of the old and the new can be very useful.”

Geolocation, which lets companies know where customers are when they use their smartphones, is another good example of where big data can be useful. “We can tell where customers are but, until recently, we couldn’t match that with what they had done previously or what it was they had been researching online that they wanted to buy,” says Mr Whitehead. “Using this information, companies can serve consumers better.”

Patrick Rohrbasser, chief executive of customer insight consultancy emnos, likens the current frenzy around big data to the mid-1990s when retailers launched loyalty cards without the IT capability to usefully analyse the millions of transactions a day that were generated.

“Then, as now, the challenge was analogous to attempting to drink from a fire hose without drowning,” he says. “Analysing big data requires special skills within an integrated team, understanding the retail challenges and then being able to isolate those data sources that can most cost-effectively be brought to bear on the problem. It requires retail knowledge, data management best practice, technology, analysis capability and transformational consultancy to ensure the whole business benefits.”