Visualizing Data: Turning Data into Comprehensible, Valuable Insights
Identifying and communicating the stories clinical trial data tell is critical to delivering actionable insights, but it’s best to start small when organizing data into easily visualized images.
“Data visualization,” which conveys the stories held within data through graphic representations and turns them into understandable, usable information and insights, is a tool that can help maximize efficiency and inform business decisions within institutions, including sites, universities, IRBs and human research protections departments.
Raw data may be mostly meaningless when it comes to gleaning insights; turning that data into information is where meaningfulness starts to show, says Suzanne Caruso, senior vice president of insights and analytics at WCG Clinical. “It comes together in a way that you start to be able to look at what is the insight, what are we seeing from the data.”
The prospect of presenting data visually may tempt institutions to bite off more than they can chew when first venturing out, Caruso said during the Public Responsibility in Medicine and Research (PRIM&R) virtual conference last week. She advised beginning with a single, focused question instead. Don’t be concerned about starting your data visualization journey in this simple way, she said; a multitude of other questions will arise during the process.
“I have learned the hard way that when you start to think about data visualization, that is too big an idea. Everyone needs to start with asking a question,” Caruso said.
First, take the data relevant to that question and format it into a structured dataset of rows and columns; think of the basic Microsoft Excel format most people are familiar with, she advised. Some institutions will have access to platforms that can offer tremendous help with data organization, such as Amazon Web Services, Microsoft Access, Microsoft Power BI and different IRB systems, but for those that don’t have those resources or much of a tech budget, Excel can still do the trick. Even just a simple column listing all protocol numbers can be considered a structured dataset.
Using this structured dataset, you can start to build out information categorized for querying, she said. That list of protocol numbers, for example, could be organized by the kinds of protocols coming in, the specific sponsor or department they’re coming from and the number of protocols submitted by specific entities. Time periods for submissions can also be useful to track.
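As a rough illustration of that kind of categorization, the sketch below tallies a tiny structured dataset by protocol type, sponsor and submission month using only Python's standard library. The records, field names and the `count_by` helper are all invented for this example and do not come from Caruso's presentation; a spreadsheet pivot table would achieve the same thing.

```python
from collections import Counter

# Hypothetical protocol records: every ID, type, sponsor and date is invented.
protocols = [
    {"protocol": "P-001", "type": "initial", "sponsor": "Sponsor A", "submitted": "2023-01-05"},
    {"protocol": "P-002", "type": "amendment", "sponsor": "Sponsor A", "submitted": "2023-01-27"},
    {"protocol": "P-003", "type": "amendment", "sponsor": "Sponsor B", "submitted": "2023-01-30"},
    {"protocol": "P-004", "type": "continuing review", "sponsor": "Sponsor B", "submitted": "2023-02-10"},
    {"protocol": "P-005", "type": "amendment", "sponsor": "Sponsor A", "submitted": "2023-02-26"},
]


def count_by(records, field):
    """Count how many records share each value of `field`."""
    return Counter(r[field] for r in records)


by_type = count_by(protocols, "type")
by_sponsor = count_by(protocols, "sponsor")
# The first seven characters of an ISO date ("YYYY-MM") identify the month.
by_month = Counter(r["submitted"][:7] for r in protocols)

for label, counts in (("type", by_type), ("sponsor", by_sponsor), ("month", by_month)):
    print(label, dict(counts))
```

Each tally answers one narrow question (which submission types dominate, who submits most, when submissions arrive), which fits the advice to start with a single question rather than a full dashboard.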
With data organized as useful information, you can now start exploring insights that give way to more questions and, eventually, solutions, Caruso said. Organizing by protocol submission type, for example, one might see that a significant percentage of submissions are coming in as protocol amendments or that a single therapeutic area makes up a large share of submissions.
Then you can drill down further; you may find that amendment submissions are slow at the start of every month but skyrocket near the end, for instance, or that turnaround times (TAT) spike every summer. The obvious foundational questions, of course, are why these things are occurring and how they are impacting operations. Answering those questions and others that arise can enable greater efficiencies and better allocation of resources.
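A drill-down of this sort might look like the following sketch, which splits amendment submissions into early-month and late-month buckets and averages turnaround time by calendar month. The data are invented, and both the day-21 cutoff and the `month_part` helper are assumptions chosen purely for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical submissions: (submission date, type, turnaround in days). All invented.
submissions = [
    (date(2023, 6, 2), "amendment", 12),
    (date(2023, 6, 27), "amendment", 15),
    (date(2023, 6, 29), "amendment", 14),
    (date(2023, 7, 28), "amendment", 21),
    (date(2023, 7, 30), "initial", 25),
    (date(2023, 10, 3), "initial", 9),
    (date(2023, 10, 26), "amendment", 8),
]


def month_part(d, cutoff=21):
    """Label a date 'late' if it falls on or after the cutoff day, else 'early'."""
    return "late" if d.day >= cutoff else "early"


# Count amendments arriving early versus late in the month.
amendments_by_part = defaultdict(int)
for d, kind, _ in submissions:
    if kind == "amendment":
        amendments_by_part[month_part(d)] += 1

# Average turnaround time per calendar month, across all submission types.
tat_by_month = defaultdict(list)
for d, _, tat in submissions:
    tat_by_month[d.month].append(tat)
avg_tat = {m: mean(v) for m, v in sorted(tat_by_month.items())}

print(dict(amendments_by_part))
print(avg_tat)
```

The numbers only show where a pattern sits; the question of why amendments cluster late in the month or why turnaround slows in summer still has to be answered by the people running the process.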
“You can start to evaluate the why and then start to say, ‘How do we deal with this influx of amendments that comes in and how do I prepare my organization to receive submissions that come in ebb and flow?’ You can start to build out different business processes,” she said.
Once the data are organized and the questions answered, organizations can go further and put together an even more robust, bird’s-eye view of the organization and its operations. This, the actual visualization part of the process, can be done through a variety of software, third-party vendors or a combination of the two.
Sites with tech resources have a lot of product options in this area; a number of clinical trial management systems and solutions, for instance, come with their own built-in data visualization capabilities, while standalone products can also be powerful tools. Forbes has named Power BI, Tableau, Qlik Sense, Klipfolio, Looker, Zoho Analytics and Domo as this year’s best data visualization solutions.
For sites with fewer resources, Power BI and Tableau are powerful software options that offer free versions, which can be used while sites push for additional funds and resources on the data visualization front, Caruso said.
The Program for the Protection of Human Subjects (PPHS) within Mount Sinai’s Icahn School of Medicine has significantly overhauled its data visualization approach in recent years, going from a basic, labor-intensive process to a more efficient system that, while still being perfected, is much improved.
In the past, PPHS would ask a data-driven question, pull relevant (and irrelevant) data into an Excel spreadsheet, spend time cleaning the data and then apply relevant filters. The department would then summarize the data using formulas and finally present the summarized data in a table as an answer to the question, Marilyn Eshikena, an IRB manager at PPHS, said at the PRIM&R conference.
But this approach presented some problems for the department. For one, PPHS staff had to devote significant time to cleaning the data, putting the data summaries together and assessing the summarized data, which did not convey the full story in one view. It also didn’t make trends and insights readily apparent, nor did it provide answers to new questions that arose from the presented data, she said.
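To make the older pull, clean, filter and summarize workflow concrete, here is a minimal sketch of those four steps in Python's standard library rather than Excel. It is not PPHS's actual process: the raw export, its column names and the `clean` helper are hypothetical, and the filter (approved submissions in 2023) is just an example question.

```python
import csv
import io
from collections import Counter

# Hypothetical raw export; column names and values are invented for illustration.
RAW = """protocol,type,status,year
P-001, Amendment ,approved,2023
P-002,initial,APPROVED,2023
P-003,amendment,,2023
P-004,Initial,approved,2022
P-005,amendment,pending,2023
"""


def clean(row):
    """Trim whitespace and lowercase every field; return None if any field is empty."""
    cleaned = {k: v.strip().lower() for k, v in row.items()}
    return cleaned if all(cleaned.values()) else None


# Pull and clean: rows with a missing value are dropped.
rows = [c for r in csv.DictReader(io.StringIO(RAW)) if (c := clean(r))]

# Filter: keep only the rows relevant to the question being asked.
approved_2023 = [r for r in rows if r["year"] == "2023" and r["status"] == "approved"]

# Summarize and present the answer as a small table.
summary = Counter(r["type"] for r in approved_2023)
print(f"{'type':<12}{'approved in 2023':>18}")
for kind, n in sorted(summary.items()):
    print(f"{kind:<12}{n:>18}")
```

The resulting table answers exactly one question. Any follow-up question means rerunning the filter and summary steps, which is the limitation Eshikena described.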
Today, the department kicks its data-driven questions out to a research IT team that cleans the data, applies filters and returns with a spreadsheet that includes only the dataset PPHS is interested in. PPHS then summarizes that data and presents it in a table.
While using a research IT team has helped save time, the game-changing move came when PPHS upgraded from Excel to Tableau, which enabled a far more visually informative dashboard, Eshikena said. PPHS has also used Power BI to generate a dashboard that presents detailed graphics and information on submissions, approvals, average TAT for various activities and reviews, and more.
The next steps in its data visualization journey, according to Eshikena, will be to purchase Power BI licenses that expand the software’s capabilities, collaborate with the IT department to funnel IRB database data into Power BI, and eventually create automated reports and dashboards. This will enable live reporting and cut out the need to wait for and work with data in a spreadsheet.
“We want to be able to ask a data-driven question, push a button … and then gain answers not just to that question, but to other questions that may arise, and also have multiple levels of insight provided,” she said.