Balancing privacy with intelligence

Big data analytics have been described as a natural progression from cloud computing, which has effectively liberated organisations from data storage limitations. But possible invasion of individuals’ privacy is a matter of concern, writes Joanna Goodman.

A recent IDC Digital Universe study, Extracting Value from Chaos, found that the world’s data is doubling every two years. But size is not big data’s only differentiating feature. As Steve Shelton, head of data at BAE Systems Detica, explains, its other attributes include immediacy – real-time information based on behavioural, transactional and location data, activity on internal and external networks, and social media.

Big data is differentiated by its unstructured and unpredictable format as it includes information from multiple platforms and devices. These considerations impact how it is collected, stored and used, and explain why it raises significant privacy and security implications.

The value of big data analytics lies in their insights into user behaviour. This is sometimes considered intrusive. An article in the New York Times Magazine earlier this year drew attention to how Target’s analysis of customer shopping patterns correctly identified a student’s pregnancy before she had announced it to her family.

Some find the Target example unnerving, but it demonstrates the power of big data analysis, which has produced impressive results in highly regulated industries that handle sensitive personal data. In healthcare, tracking the use and effectiveness of treatments has supported pharmaceutical product development; in financial services, real-time analytics around transactional behaviour helps to detect and prevent fraud.

“The process involves analysing large volumes of structured and unstructured data to find intelligence quickly enough for action to be taken,” explains Mr Shelton. “BAE Detica’s NetReveal works with retail banks and insurance companies to uncover organised fraud rings involving current accounts, payment cards or insurance claims by identifying the links buried in high-volume transactional data.”
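The idea of "identifying the links buried in high-volume transactional data" can be illustrated with a minimal sketch. This is not how NetReveal works, which the article does not describe; it is a generic link-analysis technique (union-find over shared attributes), and the function name, field names and sample claims are all hypothetical.

```python
from collections import defaultdict


def find_linked_groups(claims):
    """Group claim IDs that share any attribute value, directly or via a chain.

    `claims` maps a claim ID to a dict of attributes (hypothetical fields
    such as phone or address). Returns only groups with more than one claim.
    """
    parent = {}

    def find(x):
        # Locate the representative of x's group, compressing the path.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the groups containing a and b.
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (field, value) -> first claim ID that used it
    for claim_id, attrs in claims.items():
        find(claim_id)  # register the claim even if it links to nothing
        for field, value in attrs.items():
            token = (field, value)
            if token in first_seen:
                union(first_seen[token], claim_id)
            else:
                first_seen[token] = claim_id

    groups = defaultdict(set)
    for claim_id in claims:
        groups[find(claim_id)].add(claim_id)
    return [group for group in groups.values() if len(group) > 1]


# Illustrative data only: C1 and C2 share a phone, C2 and C3 share an address.
claims = {
    "C1": {"phone": "555-0101", "address": "1 High St"},
    "C2": {"phone": "555-0101", "address": "9 Low Rd"},
    "C3": {"phone": "555-0199", "address": "9 Low Rd"},
    "C4": {"phone": "555-0300", "address": "4 Mill Ln"},
}
linked = find_linked_groups(claims)
```

Here C1 and C3 have nothing in common directly, yet both end up in the same group through C2. A chain of this kind is easy to miss when records are reviewed one at a time.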

Businesses use big data analysis internally to predict and identify performance issues in networks and hardware. Analysing employee behaviour patterns using electronic security pass records and network log-in data can prevent potential data breaches from inside the organisation.

However, big data has significant privacy implications. According to Ben Knieff, director of fraud product marketing at NICE Actimize, a fraud solutions vendor that uses big data analytics, big data is a double-edged sword. “When analysed intelligently, it can provide consumers with valuable offers or be used to reduce fraud,” he says. “The question is how well it is secured by the trusted parties holding it and how much control the owner or creator has over its use and distribution.”

Customer and industry panels at EMC’s 2012 Data Science Summit made it clear that privacy is a central concern for businesses working with big data. Chris Roche, regional director at Greenplum, EMC’s big data division, explains that this is partly because social data is a critical part of the big data equation. It is “one of the drivers of big data analytics and a major source of concern regarding privacy”, says Mr Roche.

He highlights the gap between the Facebook generation, who have few concerns about what information they put into the public domain, and others who tend to be more cautious. “We live in a world that is divided on where to draw the privacy line,” he says.

The big question is how much information consumers of products and services are willing to give up in exchange for value. According to Conrad Bennett, vice president of technical services, Europe, the Middle East and Africa (EMEA), at web analytics company Webtrends, the key is mutual respect. “It is about enabling people to opt in or out, and respecting their choices and their privacy around sensitive data, notwithstanding the potential technical challenges.” He adds that people’s expectations of privacy and the amount of information they choose to share vary significantly by age group and technical aptitude.

Because big data uncovers more granular intelligence than standard data, it raises additional privacy concerns. Mr Roche and Mr Shelton point out that anonymising data may not always be effective, as correlating big data with fast data may be enough to identify individuals. In other circumstances, such as fraud detection, anonymising data would be counterproductive.

Mark Webber, technology transactions partner at law firm Osborne Clarke, agrees. “Initial classification and control, and careful dataflow analysis should take into account IP [intellectual property], and regulatory and ethical factors. It is important to consider what combination of information actually identifies people.”
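The point about combinations of information can be made concrete with a small sketch. It counts how many records in a nominally anonymised data set are singled out by a given combination of fields, a simplified form of the k-anonymity check used in privacy engineering. The field names and records are invented for illustration and are not drawn from any source quoted here.

```python
from collections import Counter


def unique_combinations(records, quasi_identifiers):
    """Return the field-value combinations that match exactly one record.

    A combination that appears only once points to a single individual,
    even though no name or account number is present in the data.
    """
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return [combo for combo, n in counts.items() if n == 1]


# Hypothetical "anonymised" records: no names, only coarse attributes.
records = [
    {"postcode": "SW1", "birth_year": 1980, "gender": "F"},
    {"postcode": "SW1", "birth_year": 1980, "gender": "M"},
    {"postcode": "SW1", "birth_year": 1975, "gender": "F"},
    {"postcode": "N4", "birth_year": 1980, "gender": "F"},
    {"postcode": "N4", "birth_year": 1980, "gender": "F"},
]

by_postcode = unique_combinations(records, ["postcode"])
by_all_three = unique_combinations(
    records, ["postcode", "birth_year", "gender"]
)
```

On this sample, postcode alone singles out nobody, but postcode, birth year and gender together isolate three of the five records. Each field looks harmless in isolation; the combination is what identifies people.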

Businesses should know where their data resides. This is particularly relevant to cloud services as privacy and access rights vary between jurisdictions. Anders Etgen Reitz, a partner at Danish law firm IUNO, explains that Danish companies with more than 250 employees have to employ a data protection officer, and data transfers outside the EU require approval from the Danish Data Protection Authority. The European Commission is considering proposals to harmonise EU regulations, which would include these stricter rules.

Big data raises similar security issues to all cloud services – most organisations do not have sufficient storage and processing capability to cope with the size and complexity of big data sets. This intensifies the risks associated with third-party partners involved in collection and processing. “Businesses need to be comfortable that anyone handling data on their behalf has the relevant policies and procedures in place,” says Mr Bennett.

Big data presents the same risks of data breach as standard data, but its nature can protect it from fraudulent use. “Although big data is often potentially sensitive or valuable, it is protected by the fact that it is vast, unstructured, complex and obscure,” says Mr Shelton. “Losing a raw big data set is not the same as losing customer information. It requires particular analysis to access the intelligence that it may hold.”

Big data analytics have uncovered significant opportunities based on in-depth customer and user intelligence, and the key to maximising these is respect between all parties involved. “Consents are crucial,” says Mr Webber. “Before data can be used, it is critical to get informed, freely given consent. Fair and transparent use of data is the key to making the most of big data.”