Big Data in clinical trials: Promise and pitfalls

Monday, August 15, 2016

Privacy versus medical progress. Proprietary ownership versus public information. Cost-effectiveness versus unnecessary burdens. Welcome to the swirl of opinions and emotions around Big Data and its potential—or not—in the clinical trials industry.

Big Data is currently being used on a limited basis in the clinical trials arena, but experts believe its widespread use is coming in the near future. Some hail the great promise it holds in furthering drug discovery. Others are skeptical that it will bring much value and say that enthusiasm should be tempered.

A 2016 survey examining the views of the public and the industry about privacy concerns regarding the use of Big Data in clinical trials has reopened discussions about the potential and the benefits, the barriers and the concerns. The survey, conducted by SCORR Marketing in partnership with CenterWatch, recorded the opinions of 300 members of the general public and 39 professionals in the drug development industry.

Big Data is defined as extremely large data sets that can be mined by a computer using predictive analytics to uncover patterns, trends and associations, especially those relating to human behavior and interactions. These data come from social media posts, digital pictures and videos, Internet searches, purchase transactions, cell phone GPS signals and myriad other sources.

According to IBM, 2.3 trillion gigabytes of data are created every day—so much that 90% of the data in the world today has been created in the last two years alone. Digital Universe estimates that by 2020, there will be 5,200 gigabytes of data for every man, woman and child on Earth.

Big Data is used for everything from consumer marketing and optimizing business practices to financial trading and improving sports performance. The Advanced Performance Institute notes that in healthcare, Big Data is being used in such areas as decoding DNA, developing algorithms to predict infections in premature infants, anticipating the development of epidemics and much more.

In the clinical research industry, Big Data offers the tantalizing potential of solving the ongoing dilemma of subject recruitment. Only 5% of cancer patients actually join clinical trials, according to the National Cancer Institute. The Tufts Center for the Study of Drug Development found that 37% of clinical trials fail to reach their recruitment goals, and 11% of sites fail to recruit a single patient.

The potential of Big Data in clinical research goes beyond patient identification and recruitment. It can also be used to identify new, targeted therapies based on genetic markers and biomarkers; evaluate protocol feasibility; identify adverse event responses among patient subpopulations; assess efficacy and safety responses during interim evaluations in adaptive clinical trials; and pre-populate electronic data capture (EDC) case report forms.

Using Big Data to identify appropriate patients for clinical trials is an area that holds much promise and may be the first big step. “It’s like fishing with a fish finder,” said John Potthoff, chief executive officer at Elligo Health Research. “You can really see where patients are and where to go to get them. When you are looking at a bigger pool, you can be more targeted in finding patients who meet the inclusion and exclusion criteria the best.”

But Dawn Sauro, president of Development Innovations at Sarah Cannon Research Institute, sees some significant barriers to overcome. “I don’t think we can get around the privacy issue any time soon, so I think we will be using the data to inform and to guide decisions but not necessarily to recruit in the near future,” she said. “I feel like people think it’s a magic bullet, but [it will be a while before it’s usable] for anything other than marketing trends and decision making. Maybe that’s all it ever will be.”

Current Big Data usage

Ellen Kelso, executive director of Strategic Development for Chesapeake IRB, said that a few groups are piloting the use of Big Data to create data sets of high-quality screening candidates for clinical trials. The work involves online analysis of information garnered from individuals who shared or sought medical information to understand their condition, or who searched for information on possible therapies or study results for a specific condition.

“Social media systems reach billions of people, so even if a small number share this sort of information, it amounts to millions of people making statements including relevant data,” she pointed out. “As the required details go up, the numbers of candidates drop, but the quality of the pool of potential screening candidates rises until it is highly refined.”
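The narrowing Kelso describes can be pictured as a screening funnel. The sketch below is only an illustration under stated assumptions: the `Candidate` record, the `screening_funnel` helper, the sample records and the criteria are all hypothetical and fully synthetic, and none of it reflects any method described by the people quoted here.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    # Hypothetical, fully synthetic record; no real data source is implied.
    candidate_id: int
    age: int
    conditions: set = field(default_factory=set)
    sought_trial_info: bool = False


def screening_funnel(candidates, criteria):
    """Apply each criterion in order and record the pool size after each step."""
    pool = list(candidates)
    sizes = [("all candidates", len(pool))]
    for label, predicate in criteria:
        pool = [c for c in pool if predicate(c)]
        sizes.append((label, len(pool)))
    return pool, sizes


# Made-up sample pool for illustration only.
candidates = [
    Candidate(1, 54, {"type 2 diabetes"}, True),
    Candidate(2, 17, {"type 2 diabetes"}, True),
    Candidate(3, 61, {"type 2 diabetes", "heart failure"}, False),
    Candidate(4, 45, {"asthma"}, True),
    Candidate(5, 38, {"type 2 diabetes"}, False),
]

# Assumed inclusion and exclusion criteria, applied from broad to narrow.
criteria = [
    ("has target condition", lambda c: "type 2 diabetes" in c.conditions),
    ("adult (18 or older)", lambda c: c.age >= 18),
    ("no excluded condition", lambda c: "heart failure" not in c.conditions),
    ("searched for trial information", lambda c: c.sought_trial_info),
]

pool, sizes = screening_funnel(candidates, criteria)
for label, size in sizes:
    print(f"{label}: {size}")
```

Each added criterion shrinks the pool, and the candidates who remain match more of the protocol. That is the trade-off between numbers and quality that Kelso points to.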

Kelso estimates that more than 60% of trials are using some form of preliminary online analysis to help identify potential research subjects. Big Data is also being used in prospective clinical trials, although the extent is difficult to quantify.

In retrospective clinical trials, the use of Big Data is more concrete. The National Health and Nutrition Examination Survey (NHANES) database is being mined for baseline information and to create a healthy cohort as a control group. Event information is gathered from databases such as the Death Index and Medicaid. Kelso said the use of these databases sets a useful pattern to follow and expand on.

Questions and challenges

The questions and the challenges are plentiful. Who owns these staggering volumes of data? How does one sift through them to find the gold amid the dross? What are the privacy and ethical considerations? Does the use of Big Data for advanced recruiting breach patient confidentiality?

“There is a fine line between protecting individuals’ privacy rights and trying to take advantage of the technologies that we have today in an effort to potentially improve medical science,” said Karen Wall, deputy general counsel at CRO Chiltern. “With the wealth of data available and so many people who are very sick, if the data can help us improve world health, that’s a great thing.”

In the SCORR survey, industry professionals indicated higher levels of trust than the general public pertaining to who can use their data and what types of data can be used. The general public was more open to the utilization of Big Data through a personal physician than by government or the healthcare industry. And the public expressed a clear preference for keeping healthcare records separate from daily life data, such as social media posts or online shopping history.

To develop the survey, SCORR gathered input from key industry leaders. “Big Data can provide a great opportunity for the industry by providing information to help locate the patients,” said Lea Studer, senior vice president of marketing and communications at SCORR, who presented the survey results at the Drug Information Association (DIA) Annual Meeting in Philadelphia in June. “To really make that difference, though, we’ve got to change the perception of how it is handled.”

Richard Malcolm, former CEO of Acurian and current partner at Bingham Associates, said patient privacy is a complicated matter. “I believe a lot of the issues around privacy is not the lack of regulations, it’s that you don’t know when you’ve signed off your privacy,” he said. “Do you know when you’ve given people the ability to aggregate your information and share it with others?”

Personal information is collected in multiple silos, he noted, and is now being combined to develop a multifaceted picture of each individual. If a person gives permission at one point for a certain aspect of their medical history to be shared, does this mean the entire picture can be made public?

Whether this picture can be used to approach potential trial participants is a gray area. “I think there will be several more years of struggle at the regulatory level in our industry to find the consistency to allow clinical research to integrate with the [privacy] laws that are also supposed to impact the Googles of the world,” Wall said.

The question is whether the use of Big Data for subject recruitment is different from using it to target people interested in buying a car, for example. “For patient recruitment, I don’t think it’s nefarious and I don’t think there’s anything dangerous about it,” Malcolm said. “We’re not selling; we’re really just making them aware of an opportunity, which is dramatically different.”

Protection and trust

The SCORR survey indicated that patients turn most often to their physicians for guidance when it comes to clinical trial information and participation. “We found that, of all the people that consumers trust, physicians were at the top,” Studer said of the survey. “So it’s going to be important to get physicians more involved [in trial recruitment].”

Jennifer Byrne, chief executive officer of PMG Research, agrees. “We talk about the patient as partners, but the thought of patients truly becoming direct partners with pharma is overreaching and unrealistic in today’s world. The best partnership for patients is in concert with their healthcare support system.”

Engaging greater physician interest and referrals around clinical trials is key, experts say. But there are certain ethical questions to consider when Big Data enters the picture. As Sauro put it, “Can Dr. Jones mine data to find a patient at Dr. Smith’s office and then approach that patient to say, ‘I have a clinical trial’?”

Another question is whether providing sites with a wealth of Big Data will help or hinder. “The issue is how we make the process easier for investigators and sites to find the right patients,” said Neil Ferguson, chief commercial officer at INC Research. “If Big Data enables that, it’s a success. If it makes it more cumbersome because we are not streamlining the process and driving efficiencies, then Big Data can become more of a problem because we could get stifled by the amount of information that we’re holding.”

How to protect confidential data once it crosses into the clinical trial realm is also a consideration. Cloud-based technology can be the most cost-effective way to store and disseminate Big Data—but it also can be vulnerable to cyber break-ins. “The anxiety people have is, ‘What if my employer finds out that I was being treated for bipolar disease or I have HIV?’” Malcolm pointed out. “I think that has a very low probability of occurrence, but I think it’s possible.”

Enormous potential

Many say that once these hurdles are addressed, the potential is enormous. Tighter inclusion and exclusion protocols and more targeted studies are making it hard to find patients that fit narrow indications. Big Data can make the enrollment process faster and more cost effective by matching the right patients to the appropriate trials.

“Used responsibly, Big Data can help us find the increasingly difficult-to-identify patients and reach out to them in an efficient way in a very regulated environment to tell them that there is a trial opportunity that they might be interested in,” Malcolm said.

Potthoff extended his fishing analogy to explain the benefits. “Instead of sticking your line out waiting for the fish to come by, you can actually target patients who meet those criteria,” he said. “If you enroll faster, you accelerate everything. Reducing the timeline is a huge cost savings to the sponsors, and it’s beneficial all around by helping to bring the trial to more eligible patients.”

For now, the notoriously slow-moving industry is approaching Big Data usage with caution. Potthoff said, “Our industry is slow to adopt new technologies, and purposefully so. I think there’s cautiousness around introducing something into a clinical trial space and potentially ruining your data or causing harm.”

Ferguson believes Big Data has value in identifying potential subjects, but that value is limited. “At the end of the day, what you need are high-performing sites with well-trained investigators who are well-motivated and who can identify the right patients, keep them retained and provide quality data,” he said. “Big Data can give you indicators as to where those patients may be, but it’s not going to be the panacea for the right study.”

Malcolm is optimistic about the future role of Big Data. “I think Big Data has great potential,” he said. “Filling clinical trials is important to keep progress in medicine moving forward. So we need to have the dialogue about where the vulnerabilities are and how we make sure we maintain the very high level of responsibility that is expected of us.” 

Lisa Catanese, ELS, has been a medical writer and editor since 1986, covering clinical trials, medical research, newly approved drugs and devices, consumer health education, continuing medical education and more. She is a member of the American Medical Writers Association and is certified by the Board of Editors in the Life Sciences.

This article was reprinted from Volume 23, Issue 08, of The CenterWatch Monthly, an industry-leading publication providing hard-hitting, authoritative business and financial coverage of the clinical research space.
