Emails, instant messaging, social media posts, images, videos, employee login times: all these have one thing in common – they’re forms of unstructured data.
“Unlike many other forms of company data, which can be stored and collated in a database, this information is disorganised and difficult, if not impossible, to analyse,” says Robert Rutherford, chief executive of IT consultancy company QuoStar.
But the data holds incredible value. In fact, IT analyst company IDC estimates that by 2020, 37 per cent of unstructured data will be useful if properly analysed, resulting in $430 billion in productivity gains for organisations that can make use of it. The data can be used to improve the way an organisation operates internally, and can also help it to provide new products and services to customers or improve existing offerings.
What’s more, according to the Veritas 2017 Data Genomics Index, 16 per cent of an organisation’s data is unknown, unstructured data, and this is growing year on year, with the number of unknown files held by organisations increasing by more than 50 per cent between 2016 and 2017.
Categorising unstructured data
So how can the data be unlocked? Mr Rutherford believes that to understand the value of data, companies first need to know what kinds of information they hold. This may seem obvious, but unless companies know the different data types they have and where these sit in the organisation, the data will be incredibly difficult to mine for value.
This requires companies to start by categorising their unstructured information. “While this can seem like a simple process, it is often a hidden challenge because systems do not allow companies to classify their data at an inception point, which means the information remains unstructured and hard to analyse,” Mr Rutherford explains.
Artificial intelligence (AI) tools can help organisations to streamline this process so data can be categorised quickly, but a human element is still required to understand the data being processed. The next step is to analyse this data in the same way that analytics already provides insight from structured data. That insight then needs to be made actionable and delivered to senior decision-makers to act on.
Jeremy Stimson, chief technology officer (CTO) at reputation risk management software company Polecat, says business stakeholders shouldn’t have to do the work of discovery themselves and this is why managing unstructured data is complicated.
“This means being able to convey data in ways that would simplify and communicate insights with clarity to vast audiences, through sharp data visualisations, charting, graphs and illustrative models, for example. There’s a whole infrastructure at work to turn unstructured data into something a CEO can actually use,” he says.
CDOs can enable businesses to mine unstructured data
It is for this reason that there needs to be a specific C-level executive who manages this complexity of data within an organisation: someone who is not necessarily part of the IT department, but can work alongside a CTO or chief information officer, as well as the chief marketing officer and other C-level executives.
The chief data officer (CDO) is not only a position on the rise, but the role is taking on more importance. According to Gartner, more than half of CDOs report directly to a top business leader and CDOs in general are now not only focused on data governance, data quality and regulatory drivers, but also delivering tangible business value and enabling a data-driven culture.
The chief data officer can access the information hidden in disorganised datasets, and enable the business to mine unstructured data and incorporate it into a wider strategy. However, Nigel Vaz, chief executive of Publicis.Sapient International, points out that a business which hires a CDO needs to ensure that person has real scope to make changes within the organisation.
“The CDO role cannot be a surrogate for collective C-suite ownership of data, but must add a set of complementary skills founded on an understanding of data as a driver of organisational efficiency and, crucially, of future customer value,” he says.
Dealing with unstructured data varies from organisation to organisation
Wrightington, Wigan and Leigh NHS Foundation Trust is seeking this kind of individual at the moment. “While the NHS is slower than the private sector, with things like GDPR [General Data Protection Regulation] now in force, it’s more apparent that there needs to be representation at board level to talk about data and analytics,” says the trust’s head of business intelligence and acting associate director of information management and technology Mark Singleton.
“We have interviews for our data protection officer who will be reporting to someone at a board level and, depending on who is recruited, they may also become our CDO,” he adds.
But different organisations have different ways of approaching how they deal with unstructured data. For example, Adobe implemented a new operating model, where its leaders agreed on a consistent data structure and definitions so the insights they gained from the customer journey could be used to improve and personalise experiences.
How unstructured data can fit into a comprehensive data strategy
Meanwhile, Hotels.com has three different data-related functions. One leads on how data is created, captured and managed; another leads on turning that data into helpful capabilities for its customers by using technologies such as machine learning and AI; and a CTO leads on how to act on this and get it in front of customers.
“The three of us together form a tight-knit community, an ecosystem and workflow,” says Hotels.com’s chief data science officer Matthew Fryer, who leads the middle function.
Unstructured data, therefore, forms a large part of Mr Fryer’s role and he categorises the data into three different groups. The first is where Hotels.com uses data to make predictions and recommendations, whether that is recommending the best hotel or the best filter to a customer, or making predictions for internal forecasting.
The second group is where the company is trying to improve on what is often a fragmented and complex travel industry. This means trying to help with a customer’s entire journey and their travelling plans, while keeping their preferences in mind.
“This is where we use some newer innovative techniques like displaying the image from a hotel that best suits their preferences, and analysing tens of millions of verified text reviews to give us and the user more insight,” says Mr Fryer, who adds that video analysis could be an area of growth in the years to come.
Another form of unstructured data that Hotels.com is working on being able to use is speech. The idea would be to enable a customer to explain everything they wanted from a hotel by speech to a service such as Amazon’s Alexa and for the system or virtual assistant to respond with clear answers.
No right or wrong when dealing with unstructured data
While Hotels.com has a clear workflow for how it uses data and three data leaders within the organisation, some organisations are not hiring a specific chief data officer, and are instead investing in third-party resources and training to help their existing staff make better use of the data.
The Serious Fraud Office (SFO), for example, has to deal with unstructured data such as emails, documents and other written communications, and currently has a team of people who support its case teams in making sure they get the best out of the data systems they have, says Ben Denison, the SFO’s CTO.
There is clearly no right or wrong answer when dealing with unstructured data, but organisations are tasked with understanding what data they have at their disposal, defining and categorising it, analysing the data for insight, and then acting on that insight. For larger organisations, the logical move is to employ a chief data officer who can oversee this process and keep doing so as the volume of unstructured data grows.