How Valencia crushed Covid with AI

September 8, 2021

The article below was published in WIRED on Sept 8th, 2021, written by Willem Marx.

By leveraging algorithms and unorthodox data sources, an MIT researcher has made Valencia a Covid-19 data pioneer

When Covid-19 hit Spain last spring, the country quickly reached breaking point. In Madrid, doctors described an “avalanche” of patients as they practised “combat medicine” and emergency triage in intensive care units that were operating on a war-like footing. The first Covid-19 death was recorded on March 1. A month later, just under a thousand people were dying each day. Ambulances choked hospital approach roads and ice rinks were transformed into morgues.

In mid-March, as the virus spread to all regions of Spain, Nuria Oliver realised that this poorly understood threat required immediate action. And Oliver, who is a data scientist, felt particularly well qualified to help with this public health crisis: in her previous roles at telecoms giants, she had developed tools using GPS data to track the spread of H1N1 influenza in Mexico, Ebola in the Democratic Republic of Congo, and malaria in Mozambique. “The context was there, and the timing was right,” she says. Oliver reached out to her local government contacts in the region of Valencia, explaining how data might help combat the unfolding crisis.

A native of Alicante, Oliver had earned a PhD in 2000 from the Massachusetts Institute of Technology. For her thesis she had created algorithms that used video and other sensory inputs to automatically detect, recognise, and predict various forms of human interaction. She subsequently spent more than seven years on a research team at Microsoft, then took on senior positions at telecoms firms Telefonica and Vodafone – where she gained a global reputation for her work modelling human behaviour – before returning to her hometown for personal reasons.

In 2019, just months before the pandemic hit, Valencia’s regional government asked Oliver to help develop a new strategy that would incorporate artificial intelligence into its governing methodology. The Generalitat Valenciana, as the regional government is known, administers one of several Spanish regions that enjoy a stronger degree of autonomy, and Oliver, excited by the opportunity, devoted herself to the project. As an evangelist for artificial intelligence and machine learning in policy-making – she often participates in public events with upbeat titles like “AI for Good” – Oliver had long wanted to change the way governments leverage AI.

“The dream is making government more efficient, making government more effective, enabling decisions that are based on evidence and scientific knowledge,” she says. “We have the tools to be able to do this.” Although the use of artificial intelligence has become relatively commonplace in large technology and engineering firms over the past decade, it is still unusual for governments to use such systems successfully. A recent OECD research paper found that just 36 countries worldwide had developed a national AI strategy that focused on the public sector, and only a fraction of those had begun real-world implementation. Valencia published its own new AI strategy – developed in consultation with Oliver – in November 2019.

When the pandemic came along, despite its horror, Oliver saw it as an opportunity to apply her theories about data’s usefulness for public policy on a large scale. She was appointed as Valencia’s first ever AI commissioner and was fortunate to find a kindred spirit, Ana Berenguer, working in the very heart of the regional government machinery. A former lawyer, Berenguer oversaw analysis and public policy in the cabinet of Valencia’s president, Ximo Puig. She ensured that Oliver joined a panel of other experts – economists, epidemiologists, and medical doctors – that would brief Puig on a weekly or fortnightly basis, and offer him recommendations on how best to contain the virus. As she looked at the faces of the other participants in those early video conference meetings, Berenguer recalls, she realised how different their approach was from Oliver’s. “They had been, traditionally, extremely sceptical of the work – especially the epidemiological side of the work we were doing with data.” Oliver was undeterred, but did recognise that the task ahead of her would require a significant amount of support. Armed with this official mandate she raced to build a data-driven epidemiological powerhouse by recruiting a multidisciplinary volunteer team of around two dozen scientists, researchers and academics.

Miguel Angel Lozano, a computer science researcher at the University of Alicante in southern Spain, saw one of Oliver’s initial emails pleading for help. For two decades Lozano had researched and lectured on computer science, with a recent focus on delineating patterns in complex systems like public transportation networks. “I think everyone wants to help with this situation,” Lozano recalls thinking. He agreed to participate, and quickly saw where his experience with mapping transportation data could be useful.

Based on her previous work with Telefonica and Vodafone, Oliver knew that measuring and understanding mobility could be crucial to staving off the spread of the virus. “If we don’t move, there is no pandemic,” is how she puts it. By chance, Spain’s National Institute of Statistics had recently started to examine anonymised telecoms data from three large mobile phone companies, for economic purposes and for monitoring the movements of the country’s workforce during rush hour, among other things. After some negotiation, Oliver and her team were granted unique access to this same mobility data and, Lozano says, “applied some of the methods we were working on to this information”.

Francisco Escolano, another University of Alicante computer scientist who had joined Oliver’s team, found the mobility data spreadsheets delivered by the statisticians to be sizable, but also relatively basic: they simply showed that a specific number of people moved from one area to another in a specified period of time. To make that data more practically useful, Escolano helped develop a system that would translate the statisticians’ spreadsheets into the Python coding language. This not only allowed them to study the data more closely but also to create clear visuals from it that could be understood by political decision makers.
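As a rough illustration of what working with such origin-destination records in Python might look like, here is a minimal sketch. It is not the team's code: the column names, area names and trip counts are invented for the example, and only the standard library is used.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the shape the article describes: a number of people
# moved from one area to another in a given period. The column names and all
# values are assumptions made for this illustration.
SAMPLE_CSV = """origin,destination,period,trips
Alicante,Valencia,2020-03-16,1200
Valencia,Alicante,2020-03-16,900
Valencia,Castellon,2020-03-16,400
Alicante,Valencia,2020-03-17,300
"""


def load_flows(text):
    """Sum trips per (origin, destination) pair across all periods."""
    flows = defaultdict(int)
    for row in csv.DictReader(io.StringIO(text)):
        flows[(row["origin"], row["destination"])] += int(row["trips"])
    return dict(flows)


def net_inflow(flows):
    """Arrivals minus departures for each area."""
    net = defaultdict(int)
    for (origin, destination), trips in flows.items():
        net[origin] -= trips
        net[destination] += trips
    return dict(net)


flows = load_flows(SAMPLE_CSV)
net = net_inflow(flows)
for area, balance in sorted(net.items()):
    print(f"{area}: {balance:+d}")
```

Once the flows sit in a dictionary like this, per-area summaries or charts for non-specialists are a short step away.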

Oliver’s scientists garnered data from a variety of other sources. They scoured public releases from the likes of Facebook and Google, looking for complementary data that would cover any gaps. They also developed a digital survey for the residents of Valencia, whose questions changed over time as the pandemic evolved.

The surveys were surprisingly well-received by residents, with more than 140,000 responses in the first 40 hours. Over the next few months they generated hundreds of thousands more responses – data points – that helped the Generalitat see what its populace was thinking and feeling almost in real time. Responses included the location, time and manner of social mixing, the personal protection measures residents were taking, how they perceived the relative safety of different activities from grocery shopping to restaurant dining, and whether individuals felt financially secure enough to self-isolate if required. “We have been able to answer questions that we wouldn’t have been able to answer otherwise,” Oliver says.

The survey results proved especially effective during cabinet meetings, says Berenguer, because they showed politicians how behaviour was changing in a way they could easily understand. If people sounded like they were dropping their guard and socialising too freely, for example, the government might initiate a new public awareness campaign encouraging compliance with mask-wearing or social distancing. But self-reported increases in social interaction could also help improve the accuracy of case number estimates.

The team also developed other predictive models based on machine learning. One allowed them to forecast the prevalence of Covid-19 in a given area at a given moment; another helped them analyse wastewater from baths, basins, washing machines, and showers, and hunt down anomalies that might reflect changes in local infection rates. A third allowed them to predict future hospitalisation rates, recognising when intensive care units might reach capacity. This proved hugely helpful for local healthcare authorities as the pandemic wore on, allowing them to move personnel and equipment across the region to meet expected demand.
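The article does not say how the wastewater model spotted anomalies. As a stand-in, the sketch below uses a simple trailing-window z-score, a common baseline technique rather than the team's machine-learning approach. The readings, window size and threshold are all invented for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=7, threshold=3.0):
    """Return indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale to compare against.
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily viral-load readings with one obvious spike.
readings = [10, 11, 9, 10, 12, 10, 11, 10, 30, 11]
anomalies = flag_anomalies(readings)
print(anomalies)
```

A real system would need to account for rainfall, population movement and sampling noise, which is presumably where a learned model earns its keep.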

During the first and second waves, Valencia was spared the worst of the pandemic; in the first week of April 2020, for example, residents of Madrid were dying at four times the rate of those in Valencia. By early November 2020, Valencia had the lowest number of total accumulated infections per population size of any region in Spain.

But that changed in December 2020, when a third wave – fuelled by the more contagious Alpha variant that had emerged in England – caught regional authorities off-guard. Though Valencia later became the only part of Spain to ban inter-regional travel around the Christmas period, the Alpha variant’s transmissibility had already meant the virus was circulating more than was understood from testing. Visitors from elsewhere in Spain and overseas had been travelling in and out of this popular tourist destination during months of loosened social distancing restrictions – with disastrous results. The region’s average daily recorded infections jumped from 1,450 in late December to more than 8,000 a month later, and in that same timespan hospitalisations more than tripled and daily deaths soared six-fold. Other than a handful of areas in Portugal, Spain’s neighbour, the Valencia region saw the highest 14-day cumulative incidence of infection in all of Europe.

This proved to be a defining moment for the wider recognition of Oliver’s work. Berenguer says that during some of the earliest meetings with other experts, in spring 2020, it had always felt “very tense” whenever Oliver presented her team’s findings. The epidemiologists, Berenguer says, were “not really very nice, they were extremely sceptical” of predictions built on such unorthodox data sources.

But in December 2020, a specific episode illustrated how that scepticism was beginning to wane, as the team’s various models spewed out predictions that proved remarkably accurate time and again. Spanish regional governments do enjoy significant autonomy, but lack the authority to institute unilateral curfews on citizens, which requires a magistrate’s approval. Up until late 2020, the head of epidemiology in Valencia’s health department had always taken charge of making the case for such interventions. But one day in December an official in the health department rang Berenguer before going to the judge for approval, insisting that the team’s data could help make the case for closing restaurants. When the judge saw that data, he approved the curfew and restaurant closures, and has done so on each occasion since. “It’s amazing,” Berenguer says. “We’re one of the only regions that will actually get that pass, because we’re doing so much good effort, in evidence that we needed.” That, she says, was the last time the region’s top epidemiologists voiced any scepticism in meetings about the use of innovative sources of data.

By this point Oliver’s eclectic research group felt increasingly confident about what it had achieved. The team agreed to enter a technology contest sponsored by non-profit organisation XPRIZE, which was offering $500,000 for an AI system able to automatically develop a pandemic response plan. The competition – which would pit Valencia’s efforts against top global academic institutions – boiled down to building two tools: a “predictor”, able to forecast infection numbers in dozens of different countries, and a “prescriptor”, able to devise detailed, albeit hypothetical, approaches to containing Covid-19. For the predictor, Oliver’s team relied on many of their existing techniques, but expanded them by introducing worldwide infection data that had been recorded and collated by researchers at Johns Hopkins University, allowing forecasts of case numbers for 236 separate regions and nations up to six months ahead of time. They also factored in data from the University of Oxford that enabled them to adjust numbers depending on government interventions such as social distancing mandates, school closures, and alternating factory shifts.

Amid the pandemic’s third wave in Spain, the predictive system developed for the XPRIZE was also put to use in the real-world scenario of Valencia. The tool forecast a local peak for infections in late January. The region’s health authorities were worried about ICU bed capacity, and so understanding the timing and scale of this peak was crucial. “We really hoped the model was right,” Oliver says. The model not only predicted the scale of the January surge accurately, falling within less than a percentage point of the total case numbers, but it also correctly forecast the precise day those infection numbers would begin to decline. The estimates allowed hospital capacity to ramp back up in time, and meant social distancing measures were reintroduced early enough. After that success, political decision makers finally understood the power of the system Oliver and her team had built. “They really trusted the model that we built for the XPRIZE,” Oliver says. “Now we can do so many things. It’s on the political agenda.”

But it is the second phase of the XPRIZE contest – the development of a “prescriptor” – that may prove more consequential for the future of AI as a policy-making tool. As a basis, the team calculated a series of data points to plot a curved-line graph, where the y-axis measured the stringency of various government interventions, and the x-axis marked the total number of cases, hospitalisations or deaths. The area below the line represented the range of unrealistic combinations – where rules were too strict to be enforceable or mortality rates too high to be palatable. An optimal solution for a given hypothetical government would instead lie somewhere along the curved line.
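The curved line described above is what optimisation researchers would call a Pareto front: the set of plans for which no alternative is both less stringent and less costly in cases. The sketch below shows one plain way to compute such a front. It is an illustration only, not the team's prescriptor, and the plan names, stringency scores and case counts are invented.

```python
def pareto_front(plans):
    """Keep the plans that no other plan dominates.

    Each plan is (name, stringency, cases), and lower is better for both.
    One plan dominates another if it is no worse on both measures and
    strictly better on at least one.
    """
    front = []
    for name, stringency, cases in plans:
        dominated = any(
            (s2 <= stringency and c2 <= cases)
            and (s2 < stringency or c2 < cases)
            for _, s2, c2 in plans
        )
        if not dominated:
            front.append((name, stringency, cases))
    # Sort by stringency so the front reads like a curve, left to right.
    return sorted(front, key=lambda plan: plan[1])


# Hypothetical intervention plans: (name, stringency score, projected cases).
plans = [
    ("A", 1, 900),
    ("B", 3, 500),
    ("C", 4, 600),  # stricter than B and more cases, so dominated by B
    ("D", 6, 200),
    ("E", 8, 200),  # stricter than D for the same cases, so dominated by D
    ("F", 9, 100),
]

front = pareto_front(plans)
print([name for name, _, _ in front])
```

Every plan on the front is a defensible choice; which one to pick depends on how much stringency a government is willing to accept for each case avoided.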

The University of Oxford analysis shows that, in any given country, on any given day, there are almost eight million variables that can be tweaked, from full or partial school closures, to changes to retail opening hours, to a broad gamut of office return policies. Oliver and her colleagues created a model that at a given moment would pick just ten interventions that lead to the “optimal trade-off” – keeping both costs and case counts to an acceptable minimum. The basic premise, Oliver says, was “you put the stuff into the system, and then it tells you what to do”.

Over time some intervention options might have to be removed and others added in order to adapt to an evolving pandemic. Alberto Conejero, one of Oliver’s closest collaborators and a professor of applied mathematics at the Polytechnic University of Alicante, says that this is a fiendishly difficult process. But, if done right, this model might be helpful in future, non-pandemic contexts. One example is agriculture, where farmers have to balance the efficacy of pesticides against their harmful effects on human health. “You discover that there is not a unique best decision, and you always have to keep a trade-off between decisions,” Conejero says. This approach, he continues, might provide an algorithmic roadmap that shows politicians how to act in the best interests of the economy, society and the population in general.

Escolano likes to describe the input as “your reality,” and the output as being “what is the best decision to take”. But he says there are yet more complexities that should be taken into account, and which humans may be better qualified to judge than an AI. The social cost of some measures in a pandemic – like closing schools – might feel much higher for the people living under them, however effective those measures may be at driving down case numbers. Ultimately, he says, the parameters for acceptable social or economic costs need to be decided by politicians.

For Oliver, who has spent a career focused on these kinds of problems, this model represents something of a Holy Grail, because it could form the foundation for optimising public policy decisions across government. Her team won the XPRIZE contest, and the paper they co-wrote outlining their work subsequently won the top award at Europe’s most prestigious academic conference on machine learning, the ECML PKDD.

She and the others remain focused on the vaccine rollout in the Valencia region, which has one of the most vaccinated populations in the world, with a higher percentage of inoculated adults – above 90 percent – than in the US and UK. The citizen surveys have continued, and found that 93 percent of female and 90 percent of male respondents expressed a willingness to get jabbed.

Thanks to their experience over the past 18 months the researchers now have a powerful predictor that’s been road-tested during a time of unprecedented strain, and continues to be used across Valencia. They have also created a system that can suggest a small number of specific, effective pandemic-related policies or interventions that a government can adopt. In the future this could be repurposed for other policy areas, but right now, for any one of hundreds of regions or countries around the world, it would be able to recommend the measures expected to deliver the lowest number of infections within designated budget limits. Ultimately, says Escolano, after all the billions that have been spent on futile interventions over the past 18 months, cost-effectiveness should matter, both to the citizens who pay taxes and the governments who spend them. “You need early detectors and early warnings,” he says. “And this could be one of them.”