Coronavirus Helps in a Shift Toward Anonymization of Trial Data
Patient privacy has become a more pressing concern as the COVID-19 pandemic continues and a larger share of the population participates in clinical trials.
“With everything that is happening around us, and even more so in the context of the pandemic, there is recognition that access to data is critically important,” says Sarah Lyons, general manager for IQVIA’s Privacy Analytics. “Anonymization has become very highly relevant” for sponsors and sites.
Lyons said she had “seen a steady increase” in statistical anonymization for clinical trial transparency across her customer base, with volumes “more than tripling year-over-year” since 2016. “The volume increases have accelerated even beyond this rate over the past two years,” she said.
Autumn Cook, a spokeswoman for the FDA’s Center for Drug Evaluation and Research (CDER), said the privacy of clinical trial participants could be effectively protected through anonymization, redaction or both. She added that there were “several practical aspects” in determining which technique would be preferred or used.
The choice, Cook says, depends on who is using the data and how. “For example, when FDA/CDER discloses agency records pursuant to the Freedom of Information Act, it will not modify information in the records but will redact any confidential information to maintain the integrity of the underlying official record.”
Even so, Cook said, the FDA “acknowledges that either approach is effective in protecting patient privacy. The selection of an anonymization approach can depend on the purpose of the disclosure, such as use of the data in further research, disclosing official agency records under FOIA, etc.”
But risk-based statistical anonymization enables researchers to share rich data that can drive greater advances in patient health than data redaction can, Lyons says.
Indirect identifying information about clinical trial participants — including demographics, medical event dates and serious adverse events — could prove “very valuable for secondary analysis,” Lyons told attendees at a recent CenterWatch webinar, but that data could also be used to reidentify a trial participant. She says information, such as a name, an address or a patient ID, “definitely needs to be replaced, such as with a pseudonym, or removed when we’re anonymizing the information.”
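As a rough illustration of the step Lyons describes, the Python sketch below removes some direct identifiers and replaces a patient ID with a pseudonym. It is a minimal sketch under stated assumptions, not a description of any vendor's method: the field names, the sample record, the secret key and the choice of a keyed hash (HMAC-SHA256) to build the pseudonym are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical field names; a real dataset would define its own direct identifiers.
REMOVE = {"name", "address"}
PSEUDONYMIZE = {"patient_id"}

def pseudonym(value, secret_key):
    """Derive a stable pseudonym with a keyed hash (HMAC-SHA256), truncated for readability."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "P-" + digest[:12]

def strip_direct_identifiers(record, secret_key):
    """Drop removable identifiers and replace the remaining ones with pseudonyms."""
    cleaned = {}
    for field, value in record.items():
        if field in REMOVE:
            continue  # removed outright
        if field in PSEUDONYMIZE:
            cleaned[field] = pseudonym(str(value), secret_key)
        else:
            cleaned[field] = value  # indirect identifiers pass through unchanged
    return cleaned

# Hypothetical record and key, for illustration only.
record = {"patient_id": "SITE01-0042", "name": "Jane Doe",
          "address": "1 Main St", "age": 47, "adverse_event": "rash"}
cleaned = strip_direct_identifiers(record, secret_key=b"example-key-not-for-production")
```

The same patient ID always maps to the same pseudonym under a given key, so records can still be linked across tables. The indirect identifiers (here, age and adverse event) are left untouched, which is why this step alone does not address the re-identification risk Lyons mentions.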
According to Lyons, statistical anonymization of both direct and indirect identifiers is reached at an anonymization threshold of 0.09. At that threshold, “we can defensibly demonstrate that the information has been anonymized,” she said, adding that the threshold conforms with guidance prescribed by the European Medicines Agency (EMA) and Health Canada, and is “very much consistent with global health practices for anonymizing data,” including benchmarks under HIPAA and Europe’s new data privacy rules.
“When we anonymize data, we’re transforming it to the degree necessary to ensure privacy protection and deliver statistical proof that it has indeed been anonymized.
“We’re actually measuring, in statistical terms, how identifying the information is — in other words, the likelihood of participants being re-identified in the data,” she said.
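One simplified way to picture such a measurement is sketched below in Python. It assumes a common textbook approach that the article does not attribute to IQVIA: participants who share the same combination of indirect identifiers form a group, and each participant's risk is taken as 1 divided by the size of that group. The field names, the sample records and the decision to compare the maximum risk against the threshold are illustrative assumptions; only the 0.09 figure comes from Lyons.

```python
from collections import Counter

THRESHOLD = 0.09  # figure cited by Lyons; everything else here is illustrative

def reidentification_risk(records, quasi_identifiers):
    """Return (max_risk, avg_risk), where a record's risk is 1 / size of its group."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    group_sizes = Counter(keys)
    risks = [1 / group_sizes[k] for k in keys]
    return max(risks), sum(risks) / len(risks)

# Hypothetical records: 12 participants share one profile, 1 participant is unique.
records = [{"age_band": "40-49", "sex": "F", "country": "CA"} for _ in range(12)]
records.append({"age_band": "80-89", "sex": "M", "country": "CA"})

fields = ["age_band", "sex", "country"]
max_risk, avg_risk = reidentification_risk(records, fields)
meets_threshold = max_risk <= THRESHOLD  # False: the unique participant has risk 1.0

# Without the unique participant, every record sits in a group of 12.
max_risk_common, _ = reidentification_risk(records[:12], fields)
common_meets_threshold = max_risk_common <= THRESHOLD  # True: 1/12 is below 0.09
```

Under this simplified model, a group of 12 or more gives a per-record risk at or below 0.09, while a single unique participant pushes the maximum risk to 1.0. A real risk assessment would also weigh factors such as who receives the data and under what controls.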
Regulators have been making a push toward anonymization for years. The first phase of EMA Policy 0070, which covers the publication of clinical study documents on the agency’s portal, took effect in January 2015. A second phase governing patient data has yet to be implemented. Health Canada’s Public Release of Clinical Information, which has a scope and anonymization guidance similar to the EMA policy, was rolled out in 2019. IQVIA said other regulatory bodies, including Japan’s Pharmaceuticals and Medical Devices Agency, were enacting similar transparency measures.
But David Friedland, senior vice president at data-masking company IRI, says there are still instances where redaction is more appropriate. “In many cases, there’s more downside risk to maintaining the data unredacted,” he told CenterWatch Weekly. “It depends on who you are and what your responsibility is for that data.”
“The [HIPAA] law hasn’t changed one way or the other to loosen any of these protections, but I think it’s useful for people who want to go into a clinical trial to know that if health information privacy laws are followed, they have a very good expectation of privacy,” Friedland said. “The problem is there are breaches of HIPAA regulations all the time through inadvertent or maleficent actors who are hacking into systems to try to get at that data. Therein lies the liability and the need to protect the data before it’s breached.”
“To be more compliant, it’s just easier to redact it all if you can. The operations involved with anonymizing data are a lot more time-consuming and energy-draining. It’s for a better benefit, but still, if you don’t have to do that work, why would you want to if you could just simply redact, encrypt or otherwise obfuscate the data instead?”
At IRI, Friedland said customers often look to comply with HIPAA by following its safe harbor provision and redacting 18 key identifiers from data. But he said IRI’s customers — clinical data processors, big pharma, consultancies and others — can also opt for the rule’s expert determination method, which includes risk-scoring and anonymization exercises.
“Researchers really don’t need access to key identifiers so much as they need access to the quasi-identifiers, which are the indirect identifiers or attributes in data that are true about someone but not unique to them,” Friedland said. But those identifiers “are still useful for research purposes.”
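A short Python sketch can show how quasi-identifiers might be kept useful for research while becoming less distinctive. It assumes one familiar technique, generalization, which the article does not attribute to IRI: exact ages are coarsened into 10-year bands and event dates are reduced to the year. The field names and sample rows are hypothetical.

```python
from datetime import date

def generalize(record):
    """Coarsen quasi-identifiers: age to a 10-year band, event date to its year."""
    low = (record["age"] // 10) * 10
    return {
        "age_band": f"{low}-{low + 9}",
        "event_year": record["event_date"].year,
        "sex": record["sex"],
    }

# Hypothetical participants who differ in exact age and event date.
rows = [
    {"age": 41, "sex": "F", "event_date": date(2020, 3, 14)},
    {"age": 47, "sex": "F", "event_date": date(2020, 11, 2)},
]
generalized = [generalize(r) for r in rows]

# After generalization the two participants share one profile.
same_profile = generalized[0] == generalized[1]
```

The two participants were distinguishable on exact age and date, but after generalization they share a single profile. The trade-off is lost precision, so the band width and date granularity would need to suit the planned secondary analysis.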
He added that other challenges lay ahead, including locating and blurring so-called “dark data,” or information not stored in structured datasets.
“You have a world now that involves unstructured data sources, unstructured repositories, cloud repositories, streaming data and real-time data,” Friedland said. “It all needs to be anonymized or redacted, one way or another.”
A recent white paper by d-wise cited the results of a 2020 survey of “key transparency audiences” by Informa CBI, which found half of respondent organizations said regulatory changes were the primary driver for increased transparency. The international firm, which advises CROs in the technology sphere, cited third-party research that included dozens of interviews and one focus group of senior executives involved with regulatory submissions.
“As companies build transparency teams and have completed a few submissions, the focus is beginning to shift to anonymizing data to provide the utility essential for innovation and responsible-sharing strategies,” d-wise said.