Web creator Tim Berners-Lee on the future of data

Data literacy will drive innovation, easing global warming and empowering citizens, according to Sir Tim Berners-Lee and Sir Nigel Shadbolt
Inventor of the World Wide Web Sir Tim Berners-Lee (left) and Sir Nigel Shadbolt (right)

Billions of us use the World Wide Web as our primary tool to interact online. Today, its creator Sir Tim Berners-Lee is on a new mission: to ensure data is used appropriately to create the public sector of the future.

Berners-Lee partnered with artificial intelligence (AI) expert Sir Nigel Shadbolt in 2012 to found the Open Data Institute (ODI). At the ODI Summit in early November the pair of computer scientists warned that now is a pivotal moment. As we hurtle into the digital era powered by data-hungry algorithms and AI, it’s critical to collaborate with good intentions and maximise the potential of technology, for the sake of the planet and its inhabitants. 

The acceleration of digital transformation necessitated by the coronavirus chaos is exciting, but there’s a responsibility for authorities around the world to keep pace with this incredible change. Those in power must set standards, encourage data to be opened and shared responsibly, and narrow the ever-widening skills gap. The quicker that data literacy in both private and public sectors can be improved, the better for everyone.

As Berners-Lee points out, the pandemic has inadvertently boosted public awareness of how data can save and enrich lives. “Something that took off hugely was communication through data, with the government telling us to ‘flatten the curve’ [and limit the spread of the virus],” he says. “I would imagine that the data literacy of the general population has gone up a chunk.”

Driving change 

By improving their data literacy, leaders and members of the public could understand and challenge how data is presented, Shadbolt suggests. As public sector technology and its application develop in the coming years, fuelled by more and better-quality data, greater scrutiny will help shape products and services for the digital era.

The opening of more data sources will super-charge the public sector of the future and drive innovation, says Shadbolt. The chair of the ODI – who’s been principal of Jesus College at Oxford University since 2015, among other roles – points to the success of open data pioneer Transport for London (TfL). Often held up as an exemplar of open data, TfL offers data feeds and guidelines about air quality, cycling, walking, planning and more. 

In 2017, Deloitte calculated that TfL’s release of open data generated annual economic benefits and savings of up to £130 million for travellers, the capital and the organisation itself. Additionally, many private businesses have taken advantage and cashed in on the open application programming interfaces (APIs).

“Imagine that a lot of data relevant to everything climate-related was just being routinely published using standard APIs,” Shadbolt continues. “It’s what we saw happen with TfL. And there’s just a bunch of sectors and areas to go for.”

However, it can be dangerous to blindly follow data. Shadbolt wonders whether Boris Johnson’s refrain during the pandemic that the government would “follow the data” to justify its coronavirus-related decisions sent out the wrong message. “It was quite a bad phrase, in some respects,” he says, “because while there should be a basic ability to understand the data, we need to interrogate and critique that data.”

Questioning data sources is not just essential to fight fake news on social media and elsewhere – it will also enable public sector organisations to build greater trust, Berners-Lee says. With more connected data, they could trigger a shift from reactive to proactive services. 

It’s a virtuous circle, because trusted and quality datasets will widen the possibilities and reach of public sector technology and empower citizens. “Provenance is important for data quality, and provenance is important for trust,” he says.

Building trust

For example, Berners-Lee says a doctor should be able to look at the digital notes of a person with diabetes and open a data narrative explaining how this diagnosis was made and other relevant history. Public trust in the data used by the public sector is central to the adoption of technologies and services, he points out.

The general public seemed to fall into different categories regarding coronavirus data, Berners-Lee says. Some accepted recommendations for pushing the curve down, but others “don’t listen to the same people as we might. Instead, they find groups of people – the conspiracy theorists – usually on social media, who make up all kinds of strange things about the pandemic, vaccines or climate change, for that matter.”

Shadbolt says experts act in good faith with the information available at a specific time, but their visibility is limited if they have scant amounts of data. The wider the variety of good quality data sources, the fuller the picture. “We’ve talked a lot about how it’s important, particularly during the pandemic, not to regard the scientists, medics and people in white coats as telling you the whole truth,” he says. “They’re trying to give the best information, very often under conditions of considerable uncertainty.” We must take a nuanced approach, he argues, understanding that “the data can be good, but it never gives a complete picture.”

Those in the public sector and beyond must be “critically reflective” of data. “All our responses are made, in a sense, standing on the edge of error. But that’s what science is: it can believe something is wrong and can revise what we believe as these things unfold.” 

While the collaborative use of data will create smarter public services in the UK, this approach is crucial on a larger scale if humanity is to overcome its biggest challenges. It’s been vital in the response to coronavirus, while a cooperative, non-competitive and can-do attitude is also essential to reduce global warming.

“We’ve just been living through an existential crisis – a global pandemic – and we’re in the midst of another one unfolding, with the climate challenge,” says Shadbolt. “Data will be an essential part of [solving this]: the infrastructure, the institutions we might need, the trust we have [in its use], and our literacy.”

Sir Patrick Vallance, the UK’s chief scientific advisor, echoed this view at the 2021 United Nations Climate Change Conference (COP26). He warned that global warming poses a greater risk than Covid-19 and that more people will die from it than from the pandemic if the public sector doesn’t act quickly. Vallance also said the climate crisis could last 100 years and require “a combination of technology and behavioural change”.

Shadbolt concurs but stresses that opening data and boosting cross-sector collaboration will accelerate meaningful change on a macro and micro scale and increase the capabilities of public sector technology. “While environment data is in the news because of COP26, there is other information that can help spur action,” he says, hinting that greater transparency from public sector organisations will ratchet up pressure on private companies to keep clean. For example, he notes that data on utility companies discharging sewage will help the Environment Agency, which struggles with funds and support. 

“We are starting to gain a sense of what data’s going to make a difference – everything from emissions to insulation. There’s a whole network of interconnected data types that we can bring together, much of it held in the public sector, and some of it held in the private sector,” he says. “We need to begin that work on joint public-private enterprises, though we are beginning to see the private sector, with its commitments to ESG, saying ‘we now have to have a public purpose as well as a private one.’” Publishing some of this data “would be a great first step”, he adds. 

Information advantage

Berners-Lee and Shadbolt were appointed as information advisors to the government in June 2009. The duo led the team that developed data.gov.uk, a single point of access for UK non-personal governmental public data. This offers real-time information on a range of topics, such as government spending, digital service performance, crime and justice, transport and more.

When the pair founded the not-for-profit ODI nine years ago, the mission was to “connect, equip and inspire people around the world to innovate with data”. Almost a decade later, the ODI continues to provide free and paid-for training courses and learning materials both in-house and online. These cover theory and practice surrounding data publishing and use. The ODI has long championed open data as a public good, but has always emphasised that effective governance models are necessary to protect citizens.

Some 20 months since the start of the coronavirus crisis, people are beginning to appreciate the ODI’s work and concerns around data standards. “When the pandemic began we provided a data publication template,” says Shadbolt. “The challenge was so many people wanted to contribute data. It needed sorting and we had to determine what was helpful. If there was just a little more awareness around open standards to publish data, so that it is in a more interoperable format, it would be better for everyone.”

For public sector technology to thrive, however, public trust is critical, says Berners-Lee, who notes a difference in attitudes to tech in the UK compared to the US. “Typically in the UK people trust the government and don’t trust [the tech] industry, and in the US people trust industry and don’t trust the government,” he says. More should be done to assuage fears about how tech giants handle user data, he adds. “To an extent, it’s how people are brought up and therefore cultural. But for people in the UK to trust these large American companies then you need to have serious legislation and regulation.”

The backlash against the allegedly avaricious Facebook, which according to a recent whistleblower puts user engagement ahead of safety, is a cautionary tale, suggests Berners-Lee. It should give pause to public sector organisations that embrace technology solutions and partner with companies without fully understanding their data privacy policies or questioning their values. More than ever, digital products must be “good by design” from the outset.

Data management is integral to these processes. Here too the coronavirus has proven useful, testing the robustness of so-called ‘trusted research environments’. “In these environments, the data stays behind a firewall and it’s modelled and analysed with tools that can go behind the firewall,” Shadbolt explains. “The data never actually leaves the highly secure data storage areas where 47 million patient records are linked, but incredible insights are gained.”

Offering an alternative, he says: “The other solution is to leave the data with the people who generate it, which is very local. There are different technical solutions there and there are different institutions we can build to share this. It’s a complicated area, but the ODI is looking very carefully at making data sharing more effective.”

Unfinished business

What does the future hold for the ODI as it nears its 10-year anniversary? “We started off explaining to people working in the public sector how to put your data on the web,” says Berners-Lee. Now, however, “we realise it’s important to cover the whole spectrum, from public to private, but it’s also about developing policies as well.”

This assessment chimes with Shadbolt. “There is unfinished business,” he says. “The whole commitment to getting data out there was started with open data initiatives that were very much focused around the public sector – everything from hospital data to educational data to transport data. That work has gone well. We’re now looking at extending those learnings. As governments move on [in their digital transformation journeys], you want to ensure that momentum is kept up and that the infrastructure is there to help sustain publishing the data out.”

Returning to the global climate crisis, he says of the ODI’s mission: “We did anticipate that in trying to build a trusted research data ecosystem it would become one of the consequential questions for the future of the planet and the future of our wellbeing. There’s a huge amount of work to do. We’re trying to make sense of it in terms of programmes of work, from data literacy to institutions, from ethics to infrastructure.”

Shadbolt adds: “Fundamentally the ODI’s work is about listening, it’s about trying to take ideas and put them in a format that allows that to scale. We may be an organisation of 60-odd people but we think we can have a fantastic impact and so we need to reach out and sustain ourselves to make a better future.”