1000 Genomes Project data available on Amazon Cloud

April 2, 2012
CenterWatch Staff

The world's largest set of data on human genetic variation, produced by the international 1000 Genomes Project, is now publicly available on the Amazon Web Services (AWS) cloud, according to the National Institutes of Health (NIH).

The public-private collaboration demonstrates the kind of solutions that may emerge from the Big Data R&D Initiative announced last week by the White House Office of Science and Technology Policy.

"The explosion of biomedical data has already significantly advanced our understanding of health and disease. Now we want to find new and better ways to make the most of these data to speed discovery, innovation and improvements in the nation's health and economy," said NIH director Francis S. Collins, M.D., Ph.D. Collins was among agency leaders speaking in support of the initiative at the launch event.

At least six federal science agencies, including the NIH, the National Science Foundation, the Department of Defense and the Department of Energy, will initially take part in the Big Data initiative, together committing more than $200 million to a collaborative effort to develop core technologies and other resources needed by researchers to manage and analyze enormous data sets.

Among the NIH components participating in the Big Data initiative are the National Human Genome Research Institute (NHGRI) and the NIH National Center for Biotechnology Information (NCBI) — a division of the National Library of Medicine. NHGRI played a lead role in organizing and funding the international 1000 Genomes Project. NCBI, along with the European Bioinformatics Institute of Hinxton, England, began making 1000 Genomes Project data freely available to researchers in 2008.

Since the project's launch in 2008, the data set has grown enormously. At 200 terabytes, the equivalent of 16 million file cabinets filled with text or more than 30,000 standard DVDs, the current 1000 Genomes Project data set is a prime example of big data that has become so massive that few researchers have the computing power to use it.
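
To put the 200-terabyte figure in perspective, the short sketch below works through the arithmetic. The DVD capacity (4.7 GB, single layer), the decimal unit definitions and the 100 Mbit/s link speed are assumptions chosen for illustration; none of them comes from the NIH announcement.

```python
# Rough scale estimates for a 200-terabyte data set.
# Assumptions (not from the article): decimal units, a 4.7 GB
# single-layer DVD, and a sustained 100 Mbit/s connection.

TB = 10**12  # bytes in a decimal terabyte
GB = 10**9   # bytes in a decimal gigabyte


def dvd_count(dataset_bytes, dvd_bytes=4.7 * GB):
    """Number of DVDs needed to hold the data set."""
    return dataset_bytes / dvd_bytes


def download_days(dataset_bytes, megabits_per_second):
    """Days needed to transfer the data set at a constant link speed."""
    bits = dataset_bytes * 8
    seconds = bits / (megabits_per_second * 10**6)
    return seconds / 86_400  # seconds per day


size = 200 * TB
print(f"DVDs needed: {dvd_count(size):,.0f}")
print(f"Days to download at 100 Mbit/s: {download_days(size, 100):,.0f}")
```

Under these assumptions the data set fills well over 30,000 DVDs, and a single copy would take roughly half a year to download, which illustrates why few institutions can work with the data locally.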

To help solve that problem, AWS has just posted the 1000 Genomes Project data for free as a public data set, providing a centralized repository on the Amazon Simple Storage Service. The data can be accessed directly through services such as Amazon Elastic Compute Cloud and Amazon Elastic MapReduce, which provide organizations with the highly scalable resources needed to power the big data and high-performance computing applications common in research. Researchers pay only for the additional AWS resources they need to further process or analyze the data.
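
As a rough illustration of what a public data set on the Amazon Simple Storage Service looks like to a researcher, the sketch below builds a bucket-listing request and parses the XML that S3 returns. The bucket name `1000genomes`, the `example/` prefix and the assumption that the bucket permits anonymous listing are not stated in the announcement and should be checked against current AWS documentation; the helper names are hypothetical.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by S3 bucket-listing responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def listing_url(bucket, prefix="", max_keys=10):
    """Build a ListObjectsV2 URL for a bucket (hypothetical helper)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return (key, size_in_bytes) pairs from a listing response."""
    root = ET.fromstring(xml_text)
    return [
        (
            item.findtext("s3:Key", namespaces=S3_NS),
            int(item.findtext("s3:Size", namespaces=S3_NS)),
        )
        for item in root.findall("s3:Contents", S3_NS)
    ]


def fetch_listing(bucket, prefix="", max_keys=10):
    """Fetch and parse a listing over HTTPS; requires network access
    and assumes the bucket allows anonymous listing."""
    url = listing_url(bucket, prefix, max_keys)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_listing(response.read().decode("utf-8"))
```

Calling `fetch_listing("1000genomes", prefix="example/")` would, if those assumptions hold, return the names and sizes of a few objects without downloading the data itself. Analyses run on AWS compute services can read the same objects in place.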

The public-private collaboration to store the data in the AWS cloud allows any researcher to access and analyze the data at a fraction of the cost it would take for their institution to acquire the needed internet bandwidth, data storage and analytical computing capacity.

"Improving access to data from this important project will accelerate the ability of researchers to understand human genetic variation and its contribution to health and disease," said NHGRI director Eric D. Green, M.D., Ph.D. NHGRI is a major funder of the 1000 Genomes Project, along with Wellcome Trust of London and BGI-Shenzhen of China.

Cloud access also enables users to analyze the data much more quickly, because it eliminates download time and lets users run their analyses over many servers at once. "Putting the data in the cloud provides a tremendous opportunity for researchers around the world who want to study large-scale human genetic variation but lack the computer capability to do so," said Richard Durbin, Ph.D., co-director of the 1000 Genomes Project and joint head of human genetics at the Wellcome Trust Sanger Institute in Hinxton, England.

Paul Flicek, D.Sci., co-leader of the 1000 Genomes Project Data Coordination Center (DCC), added that the new venue “fulfills a central goal of the 1000 Genomes Project to make the data as widely available as possible to accelerate medical discoveries.”
