COVID-19 datasets

COVID-19 datasets are public databases for sharing case data and medical information related to the COVID-19 pandemic.

Aggregate statistics

United States

Volunteer/non-government

Publisher | Date of first publication | In official use? | Update frequency | Geographic level | Timeseries | Testing sites | Testing number | Cases | Hospitalizations | Deaths | Vaccination sites | Vaccination number | Description
Coders Against COVID/GISCorps[1] | March 22, 2020[2] | Yes, by FEMA[3] and State of California[4] | Daily | Point (lat/long) | Yes | Yes | No | No | No | No | Yes | No | A dataset of COVID-19 testing locations in the United States and Puerto Rico
USAFacts[5] | April 24, 2020[6] | Yes, by CDC[7] | Daily | County | Yes | No | No | Yes | | Yes | No | No | A dataset of county-level coronavirus cases and deaths that is updated daily
COVID Tracking Project[8] | March 7, 2020[9] | No | Daily | State | Yes | No | Yes | Yes | Yes | Yes | No | No | A volunteer-run database of testing and medical stats in the United States
Sentiment Analysis of users reviews on COVID-19 contact tracing mobile applications[10][11] | March 2021 | | | | | | | | | | | | A dataset intended to support sentiment analysis of users' reviews of COVID-19 contact tracing mobile applications

U.S. Department of Health & Human Services

Name | Geographic level | Timeseries | Testing sites | Testing number | Cases | Hospitalizations | Deaths | Vaccination sites | Vaccination number
COVID-19 Diagnostic Laboratory Testing (PCR Testing) Time Series (archived 2021-03-12 at the Wayback Machine) | State | Yes | No | Yes | No | No | No | No | No
COVID-19 Reported Patient Impact and Hospital Capacity by Facility (archived 2021-03-12 at the Wayback Machine) | Point (lat/long) | Yes | No | No | No | Yes | No | No | No
COVID-19 Estimated Patient Impact and Hospital Capacity by State (archived 2021-03-12 at the Wayback Machine) | State | No | No | No | No | Yes | No | No | No
COVID-19 Reported Patient Impact and Hospital Capacity by State (archived 2021-03-08 at the Wayback Machine) | State | No | No | No | No | Yes | No | No | No
COVID-19 Reported Patient Impact and Hospital Capacity by State Timeseries (archived 2021-03-10 at the Wayback Machine) | State | Yes | No | No | No | Yes | No | No | No
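
The coverage columns in the table above can also be treated as a small machine-readable catalog. The following is a minimal sketch in Python, assuming a hypothetical `DatasetEntry` record and `find_datasets` helper (neither comes from HHS or any published tool); the rows are transcribed from the table, and the metric names such as `hospitalizations` are illustrative labels for its columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One row of the table above (hypothetical record type for illustration)."""
    name: str
    geographic_level: str
    timeseries: bool
    metrics: frozenset = frozenset()  # columns marked "Yes" in the table


# Rows transcribed from the HHS table; metric labels are assumed names.
HHS_CATALOG = [
    DatasetEntry(
        "COVID-19 Diagnostic Laboratory Testing (PCR Testing) Time Series",
        "State", True, frozenset({"testing_number"})),
    DatasetEntry(
        "COVID-19 Reported Patient Impact and Hospital Capacity by Facility",
        "Point (lat/long)", True, frozenset({"hospitalizations"})),
    DatasetEntry(
        "COVID-19 Estimated Patient Impact and Hospital Capacity by State",
        "State", False, frozenset({"hospitalizations"})),
    DatasetEntry(
        "COVID-19 Reported Patient Impact and Hospital Capacity by State",
        "State", False, frozenset({"hospitalizations"})),
    DatasetEntry(
        "COVID-19 Reported Patient Impact and Hospital Capacity by State Timeseries",
        "State", True, frozenset({"hospitalizations"})),
]


def find_datasets(catalog, metric, timeseries_only=False):
    """Return names of entries that cover `metric`, optionally only timeseries ones."""
    return [
        entry.name
        for entry in catalog
        if metric in entry.metrics and (entry.timeseries or not timeseries_only)
    ]


if __name__ == "__main__":
    # List hospitalization datasets that are published as a timeseries.
    for name in find_datasets(HHS_CATALOG, "hospitalizations", timeseries_only=True):
        print(name)
```

A lookup like this only reflects the coverage flags recorded in the table; it does not fetch or validate the underlying data.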

Global

  • Johns Hopkins Coronavirus Resource Center: Global aggregated data including cases, testing, contact tracing, and vaccine development[12]
  • World Health Organization (WHO) Coronavirus Disease Dashboard: a database of confirmed cases and deaths reported globally and broken down by region.[13] This database is part of the WHO Health Data Platform.[14]
  • COVID-19 Africa Open Data Project: a volunteer-run database and dashboard reporting region, country and district level case counts, deaths, healthcare worker infections, healthcare services and urgent needs.[15]

Data hubs

  • Health Data Research UK provides a searchable registry of health data resources from the United Kingdom, including COVID-19 related datasets.
  • NIH Open Access Datasets: The National Institutes of Health provide open-access data and computational resources related to COVID-19.[16]
  • COVID-19 Open Research Dataset (CORD-19): The Semantic Scholar project of the Allen Institute for AI hosts CORD-19, a public dataset of academic articles about COVID-19 and related research.[17] The dataset is updated daily and includes both peer-reviewed articles and preprints.[18] CORD-19 was originally released on March 16, 2020, by researchers and leaders from the Allen Institute for AI, Chan Zuckerberg Initiative, Georgetown University's Center for Security and Emerging Technology, Microsoft, and the National Library of Medicine.[19] The dataset is built by text mining the current research literature.[20]

Topic-specific and special-interest resources

Genomics

Imaging (Radiology)

  • Characteristic imaging features on chest radiographs and computed tomography (CT) of people who are symptomatic include asymmetric peripheral ground-glass opacities without pleural effusions.[24] The University of Montreal and Mila created the "COVID-19 Image Data Collection", a public repository of chest imaging, in March 2020.[25][26][27] The Medical Imaging Databank in Valencian Region released a large dataset of chest imaging from Spain.[28][29] The Italian Radiological Society is compiling an international online database of imaging findings for confirmed cases.[30] Online radiology case-sharing platforms such as Eurorad and Radiopaedia also host COVID-19 case data and imaging.[31][32]

References

  1. ^ Torpey, Holly; Caballero, M.D., Jorge A. (eds.). "COVID-19 Testing Sites Data". GISCorps/Coders Against COVID Testing Site Data. Retrieved 2020-12-14.
  2. ^ "Volunteer group develops a COVID-19 testing location database for the U.S." TechCrunch. Retrieved 2021-02-03.
  3. ^ "App of the Week: COVID-19 Testing Sites Locator". Federal Emergency Management Agency (FEMA). Retrieved 2021-02-03.
  4. ^ "COVID-19 Testing Sites in California". www.arcgis.com. Retrieved 2021-02-03.
  5. ^ "US Coronavirus Cases and Deaths". USAFacts.org. 2021-02-03. Retrieved 2021-02-03.
  6. ^ "Detailed Methodology and Sources: COVID-19 Data". USAFacts. Retrieved 2021-02-03.
  7. ^ "Coronavirus Outbreak Stats & Data". USAFacts. Retrieved 2021-02-03.
  8. ^ Stephens, Autumn. "Tracking Star in Oakland". Diablo Magazine. Retrieved 2020-09-18.
  9. ^ "About Us". The COVID Tracking Project. Retrieved 2021-02-03.
  10. ^ Ahmad, Kashif (2021). "A Benchmark Dataset for Sentiment Analysis of Users' Reviews on COVID-19 Contact Tracing Applications". Harvard Dataverse. doi:10.7910/DVN/1RDRCM.
  11. ^ Ahmad, Kashif (2021). "Sentiment Analysis of Users' Reviews on COVID-19 Contact Tracing Apps with a Benchmark Dataset". arXiv:2103.01196 [cs.CL].
  12. ^ "Home". Johns Hopkins Coronavirus Resource Center. Retrieved 2020-09-18.
  13. ^ "WHO Coronavirus Disease (COVID-19) Dashboard". covid19.who.int. Retrieved 2020-09-18.
  14. ^ "World Health Data Platform - WHO". www.who.int. Retrieved 2020-09-18.
  15. ^ "COVID-19 Africa Open Data". Retrieved 16 November 2020.
  16. ^ "Open-Access Data and Computational Resources to Address COVID-19 | Data Science at NIH". datascience.nih.gov. Retrieved 2020-10-13.
  17. ^ "CORD-19". Semantic Scholar. Retrieved 2020-10-13.
  18. ^ "Analysis of COVID-19 publications identifies research gaps". EurekAlert!. Retrieved 2020-10-13.
  19. ^ "Call to Action to the Tech Community on New Machine Readable COVID-19 Dataset". whitehouse.gov. Retrieved 2020-10-13 – via National Archives.
  20. ^ "NLM Leverages Data, Text Mining to Sharpen COVID-19 Research Databases". governmentciomedia.com. Retrieved 2020-10-13.
  21. ^ "GISAID". www.gisaid.org. Retrieved 29 December 2020.
  22. ^ "Nextstrain SARS-CoV-2 Dashboard". nextstrain.org. Retrieved 29 December 2020.
  23. ^ "Nextstrain". docs.nextstrain.org. Retrieved 29 December 2020.
  24. ^ Li Y, Xia L (March 2020). "Coronavirus Disease 2019 (COVID-19): Role of Chest CT in Diagnosis and Management". American Journal of Roentgenology. 214 (6): 1280–1286. doi:10.2214/AJR.20.22954. PMID 32130038. S2CID 212416282.
  25. ^ "COVID-19 related projects". Mila. COVID-19 image data collection. Retrieved 12 July 2020.
  26. ^ "COVID-19 image data collection". GitHub. Retrieved 12 July 2020.
  27. ^ Cohen, Joseph (25 March 2020). "COVID-19 image data collection". arXiv:2003.11597 [eess.IV].
  28. ^ "BIMCV-COVID19, Datasets related to COVID19's pathology course". Medical Imaging Databank in Valencian Region Medical. Retrieved 12 July 2020.
  29. ^ de la Iglesia Vayá, Maria (1 June 2020). "BIMCV COVID-19+: a large annotated dataset of RX and CT images from COVID-19 patients". arXiv:2006.01174 [eess.IV].
  30. ^ "COVID-19 Database". Società Italiana di Radiologia Medica e Interventistica (in Italian). Retrieved 2020-03-11.
  31. ^ "Pneumothorax and pneumomediastinum: a rare complication in the evolution of COVID-19 pneumonia". Eurorad. Retrieved 12 July 2020.
  32. ^ Bell, Daniel; Knipe, Henry. "COVID-19 (summary)". Radiopaedia. Retrieved 12 July 2020.
