How Synthetic Data Accelerates Coronavirus Research

August 7, 2020 by cbn

When your research could save COVID-19 patients, you don’t want to wait around for institutional approval to use patient data in research. Here’s an alternative.

In the midst of a crisis, quick action is often necessary to prevent greater damage. But when you operate in an environment or industry governed by many rules and regulations, quick action can be pretty difficult.

Such is the case with healthcare research. Plenty of data is gathered every day about patients — their age, gender, ethnicity, underlying health conditions, and more. But the data is sensitive and protected. After all, it’s some of the most personal data there is about people.

Image: terovesalainen - stock.adobe.com

Now imagine you are a healthcare researcher working on issues around the COVID-19 pandemic. That data is valuable, and being able to work with it quickly means finding answers faster and potentially saving more lives.

“If you look at the traditional way that we access patient data for research and innovation purposes, it tends to be quite cumbersome and not particularly timely,” said Philip Payne, chief data scientist and associate dean for health and data science at Washington University School of Medicine in St. Louis. “That’s because there’s a very complex set of regulatory hurdles as well as technical hurdles.”

Philip Payne

Those barriers include the need to maintain the privacy and confidentiality of patients. But modern data analytics is iterative, so researchers must repeatedly request data and wait for it. They may have to go back to governing bodies to get access to additional data, and that can take weeks or months. The protected status of patient data makes it hard to do analytic research quickly and flexibly enough to affect a rapidly evolving crisis like the coronavirus pandemic.

Speed matters in a pandemic. Rules designed to protect patient privacy slow it all down to a crawl. But you can’t throw those rules out the window, either.

To access data at the speed required while also respecting the privacy and governance needs of patient data, Washington University in St. Louis, Jefferson Health in Philadelphia, and other healthcare organizations have opted for an alternative: synthetic data.

Gartner defines synthetic data as data that is “generated by applying a sampling technique to real-world data or by creating simulation scenarios where models and processes interact to create completely new data not directly taken from the real world.”

Here’s how Payne describes it: “We can take a set of data from real world patients but then produce a synthetic derivative that statistically is identical to those patients’ data. You can drill down to the individual row level and it will look like the data extracted from the EHR (electronic health record), but there’s no mutual information that connects that data to the source data from which it is derived.”

Why is that so important?

“From the legal and regulatory and technical standpoint, this is no longer potentially identifiable human subjects’ data, so now our investigators can literally watch a training video and get access to the system,” Payne said. “They can sign a data use agreement and immediately start iterating through their analysis.”

In the case of Washington University and Jefferson Health, researchers are using a platform for synthetic data called MDClone that specializes in synthetic data in healthcare. This platform takes real patient data and examines the statistical distribution of things that define those patients. The statistics about real patients are carried forward into the synthetic data set. The platform essentially creates a simulated set of patients. Researchers are able to begin data analysis work using the synthetic data after an hour-long training session and signing a data use agreement. That compares to weeks or months required when researchers need to get approval from an institutional review board to use actual patient data.
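To make the idea of carrying statistics forward concrete, here is a minimal sketch in Python. It is not MDClone's method, which the article does not detail; it assumes the simplest possible approach, summarizing each column on its own and sampling new rows from those summaries. The function names (`fit_marginals`, `sample_synthetic`) and the toy records are hypothetical, made up for illustration only.

```python
import random
import statistics
from collections import Counter


def fit_marginals(records, numeric_fields, categorical_fields):
    """Summarize each column independently: mean and standard deviation
    for numeric fields, value counts for categorical fields."""
    model = {"numeric": {}, "categorical": {}}
    for field in numeric_fields:
        values = [r[field] for r in records]
        model["numeric"][field] = (statistics.mean(values), statistics.stdev(values))
    for field in categorical_fields:
        counts = Counter(r[field] for r in records)
        model["categorical"][field] = (list(counts.keys()), list(counts.values()))
    return model


def sample_synthetic(model, n, seed=None):
    """Draw n simulated rows from the fitted summaries.
    No row is copied from the source records."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for field, (mean, stdev) in model["numeric"].items():
            # Assumption: the numeric field is roughly normally distributed.
            row[field] = rng.gauss(mean, stdev)
        for field, (labels, weights) in model["categorical"].items():
            # Sample categories in proportion to their observed frequency.
            row[field] = rng.choices(labels, weights=weights, k=1)[0]
        rows.append(row)
    return rows


# Made-up toy records for illustration; not real patient data.
toy_records = [
    {"age": 34, "icu": "no"},
    {"age": 51, "icu": "no"},
    {"age": 67, "icu": "yes"},
    {"age": 45, "icu": "no"},
    {"age": 72, "icu": "yes"},
    {"age": 58, "icu": "no"},
    {"age": 29, "icu": "no"},
    {"age": 80, "icu": "no"},
]

model = fit_marginals(toy_records, ["age"], ["icu"])
synthetic = sample_synthetic(model, 1000, seed=0)
```

Sampling each column independently preserves per-column statistics but discards relationships between columns, such as age and ICU admission. A production system would need to preserve those relationships as well, and to verify that no synthetic row can be traced back to a real patient.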

That speed is essential when you are racing for new insights about a novel coronavirus that has already killed more than 150,000 people in the United States and more than 700,000 people around the world. Researchers are racing for a vaccine and treatments.

For Washington University in St. Louis, the data team was able to recognize another important trend about patients in the health system’s network of 15 hospitals and two physician groups. The team was looking at the anticipated maximum patient load, how many patients would require the ICU, how many would require ventilators, how many would require dialysis, and the personnel required for all this.

The team was able to quickly realize that its hospitals in north St. Louis were seeing greater rates of admissions and ICU admissions among COVID-19 patients. A data analysis revealed that African Americans were about 2.5 times more likely to be admitted to the hospital than any other patient group, Payne said. Once admitted, Black patients’ odds of ending up in the ICU were four times greater than those of other patient populations.
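The two figures above are different kinds of measure: "times more likely" reads as a ratio of risks, while "odds ... four times greater" is a ratio of odds. The short Python sketch below shows how each is computed from a two-by-two table. The counts are invented purely for illustration and are not the Washington University data, and the function names are hypothetical.

```python
def risk_ratio(events_a, total_a, events_b, total_b):
    """Ratio of the event probability in group A to that in group B."""
    return (events_a / total_a) / (events_b / total_b)


def odds_ratio(events_a, total_a, events_b, total_b):
    """Ratio of the event odds (events / non-events) in group A to group B."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Invented counts for illustration only: 30 of 100 people in group A
# and 10 of 100 people in group B experience the event.
rr = risk_ratio(30, 100, 10, 100)
or_ = odds_ratio(30, 100, 10, 100)
print(f"risk ratio: {rr:.2f}, odds ratio: {or_:.2f}")
```

On the same counts the odds ratio comes out larger than the risk ratio, which is why the two numbers should not be compared directly.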

Payne said that insight led to working with public health groups to better support communities at risk.

Washington University is using MDClone in its cloud-first Microsoft Azure implementation, but MDClone can also be deployed on-premises.

To further COVID-19 research and other advanced health work, last month MDClone announced The Global Network, a research and knowledge-sharing collaborative that protects patient privacy through the use of synthetic data. The Global Network will focus on three pillars of research in its first year: health services, clinical medicine, and precision medicine. At launch, members included Washington University, Jefferson Health, and Intermountain Healthcare in the western states, among several others. The network enables collaboration across these medical organizations, which can accelerate and improve research.

“Synthetic data can remove restrictions to sharing data externally so you can innovate faster,” said Josh Rubel, chief commercial officer at MDClone.

Jessica Davis has spent a career covering the intersection of business and technology at titles including IDG’s Infoworld, Ziff Davis Enterprise’s eWeek and Channel Insider, and Penton Technology’s MSPmentor. She’s passionate about the practical use of business intelligence, …
