The role of Big Data and Open Data in Covid-19


Ever since COVID-19 struck, politicians, scientists and medical researchers have been hard at work trying to uncover the nature of the virus: what exactly it is, why it affects some people more than others, what can be done to reduce the spread, and where it will go next, as predicted by population models, travel patterns and so on. This is very much a data and analytics problem. In essence, it is a classic Big Data exercise. Big Data lies at the heart of efforts to comprehend and forecast the impact that Coronavirus will have on all of us.

Researchers are dealing with enormous amounts of information, under pressure to produce results and gain insights very quickly, too quickly for humans to comprehend and process the data on their own.

Advanced technologies are very much proving their worth in mining scientific literature and open data, tracking the virus, modelling and even understanding social media responses to gauge community concerns and model likely behaviours.

Epidemic modelling is complex, and any model of what might happen rests on a number of assumptions. That is why there is no one standard model or predicted outcome presented as the single source of truth. The assumptions could concern the likely behaviour of a population with regard to social distancing and essential travel, health system capacity and ICU beds, border closures, school closures, population density, demographics such as the share of vulnerable and older people, and many more. These can differ culturally and geographically. My favourite town in Northern Italy, Mantova, is being hit very hard by Covid-19, and there are many theories as to why the impact has been so much greater in Italy than in other populations. Now we’re beginning to see the US overtaking Italy in the graphs in terms of rates of infection and predicted deaths.
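To show how much the outcome depends on the assumptions, here is a minimal sketch using the textbook SIR (susceptible, infected, recovered) model. This is my own illustration rather than any model used by health authorities. The population size, the transmission rates attached to each distancing scenario and the recovery rate are all made-up values chosen for illustration, not fitted to any COVID-19 data.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Step a basic SIR model forward with simple Euler integration.

    beta  : assumed transmission rate per day (hypothetical value)
    gamma : assumed recovery rate per day (hypothetical value)
    Returns (peak number infected at once, total ever infected).
    """
    s = float(population - initial_infected)  # susceptible
    i = float(initial_infected)               # currently infected
    r = 0.0                                   # recovered / removed
    peak = i
    for _ in range(int(round(days / dt))):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak, population - s


# Illustrative assumptions only: each scenario changes just the transmission rate.
GAMMA = 0.1  # assumes an average infectious period of about 10 days
scenarios = {
    "no distancing": 0.30,
    "moderate distancing": 0.20,
    "strict distancing": 0.12,
}

results = {}
for name, beta in scenarios.items():
    peak, total = simulate_sir(1_000_000, 10, beta, GAMMA, days=365)
    results[name] = (peak, total)
    print(f"{name:>20}: R0={beta / GAMMA:.1f}, "
          f"peak infected={peak:,.0f}, ever infected={total:,.0f}")
```

Changing a single assumption, the transmission rate, gives three very different epidemics from the same starting point. Real models layer many more assumptions on top of this one, which is why their forecasts diverge.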

Epidemiologists and policymakers need to aggregate and synthesize the available data to feed into the modelling on a global basis. This is where Big Data combined with machine learning and AI proves its worth. We need real-time scenarios with as much data as possible. The more detail we can get, the better. It is a truism that the more data you have, the more accurate the insight, and with Covid-19 spreading rapidly it has never been more obvious that we need this information quickly.

This is a global pandemic and COVID-19 is caused by a novel virus. It’s new! It keeps evolving, and as it does we need to keep refining the models as the variables change. The data sets from the COVID-19 pandemic will likely form part of the evidence package that will be presented to regulatory authorities once a therapy or therapies have been identified that appear to be effective. This will potentially set a precedent for how Big Data and open data can be used in similar situations in the future.

Now more than ever, the impact of coronavirus around the world has sparked the need for open data, data sharing and unprecedented partnerships to achieve the insights we need. To best support the critical need for data-driven decision making by citizens and their governments, a COVID-19 strategy should consider the following stages of the open data value chain:

  1. Availability – What data are available to understand the spread of COVID-19?
  2. Openness – How are open data and data availability affecting the fight against COVID-19?
  3. Dissemination – What technologies and visualizations/dashboards are available to understand COVID-19?
  4. Uptake and Use – What have been the challenges to using COVID-19 data and what are the resources to help understand the data?
  5. Feedback – What are digital initiatives people can join to combat COVID-19 with increased research and data?

Collaboration across borders, nationally and internationally, will contribute significantly to finding viable solutions. Open data sets about where people are moving, population demographics, the impact of travel restrictions, shopping patterns, infection epicentres, even exercise data: none of this is impossible to get. To make any real progress in this situation, we need as much information as we can get, and we need to bring together the people who understand the epidemiology and those who understand Big Data.

It’s a very interdisciplinary problem, and to make any headway, we need a greater focus on open data and data sharing.

Selfishly, I just want to be able to re-visit the beautiful town of Mantova in Italy as soon as I can to hug my extended family and to feel that the world is safe again.


This article originally appeared on the author’s LinkedIn page.