NIH to expand critical catalog for genomics research

The NIH plans to expand its Encyclopedia of DNA Elements (ENCODE) Project, a genomics resource used by many scientists to study human health and disease. Funded by the National Human Genome Research Institute (NHGRI), part of NIH, the ENCODE Project is generating a catalog of all the genes and regulatory elements—the parts of the genome that control whether genes are active or not—in humans and select model organisms. With four years of additional support, NHGRI builds on a long-standing commitment to developing freely available genomics resources for use by the scientific community.

“ENCODE has created high-quality and easily accessible sets of data, tools and analyses that are being used extensively in studies to interpret genome sequences and to understand the consequence of genomic variation,” said Elise Feingold, Ph.D., a program director in the Division of Genome Sciences at NHGRI. “These awards provide the opportunity to strengthen this foundation by expanding the breadth and depth of the resource.”

Since launching in 2003, ENCODE has funded a network of researchers to develop and apply methods for mapping candidate functional elements in the genome, and to analyze the enormous database of generated genomic information. The data and tools generated by ENCODE are organized by two groups: a data coordinating center, which houses the data and provides access to the resource through an open-access portal, and a data analysis center, which synthesizes the data into an encyclopedia for use by the research community.

Pending the availability of funds, NHGRI plans to commit up to $31.5 million in fiscal year 2017 for these awards. With this funding, ENCODE will expand the scope of these efforts to include characterization centers, which will study the biological role that candidate functional elements may play and develop methods to determine how they contribute to gene regulation in a variety of cell types and model systems. Additionally, the project will enhance the ENCODE catalog by developing a way to incorporate data provided by the research community, and will use biological samples from research participants who have explicitly consented to unrestricted sharing of their genomic data.

At its core, ENCODE is about enabling the scientific community to make discoveries by using basic science approaches to understand genomes at the most fundamental level. Its catalog of genomic information can be used for a variety of research projects—for example, generating hypotheses about what goes wrong in specific diseases or understanding the processes that determine how the same genome sequence is used in different parts of the body to make cells with specialized functions. More than 1,600 scientific publications by the research community have used ENCODE data or tools.

“We found that many of the people that are using the ENCODE resource are doing so for disease studies, and this attests to its translational value,” said Mike Pazin, Ph.D., a program director in NHGRI’s Division of Genome Sciences.