The European Bioinformatics Institute (EMBL-EBI) has seen web data requests rise from 62 million to 81 million on an average day since 2019. Researchers around the world depend on its large-scale biological data sets to drive future research, and throughout 2021/22 EMBL-EBI plans to transfer a further 7 petabytes (PB). One petabyte is equal to 3.4 years of continuous full HD video recording – so that’s a lot of data!
The data transfer is enabled via Jisc’s Janet Network – one of the fastest and most secure internet networks in the world. Dr Steven Newhouse, head of technical services at EMBL-EBI, says: “In recent years we’ve seen an explosion in data and uptake in scientific activity, especially with the COVID-19 pandemic. The Janet networking capability and shared data centre space we access from Jisc gives us the flexibility to increase capacity and meet demand.”
A world leader in bioinformatics, EMBL-EBI is also working with UK Biobank on global human disease research.
Dr Mallory Freeberg, project lead at the European Genome-phenome Archive (EGA), says: “The UK Biobank resource enables a wide variety of biomedical research areas, for example, the effects of lifestyle factors like diet and physical activity on health, how genetics contributes to responses to population-scale health concerns like the coronavirus pandemic, and development of early detection methods for common diseases among the UK population.”
Robust digital infrastructure enables EMBL-EBI and UK Biobank to continue collaborating. As well as providing long-term storage of UK Biobank data, EMBL-EBI is transferring this data to the cloud-based analysis platform DNAnexus. This is among the first transfers at this scale from EMBL-EBI to a cloud-based service, but such transfers are expected to become increasingly routine.
In 2017, EMBL-EBI shared the first release of genetic data held by UK Biobank via the European Genome-phenome Archive (EGA) – a joint resource developed by EMBL-EBI and the Centre for Genomic Regulation (CRG) in Barcelona, Spain. And this work is set to continue. “The hosting of UK Biobank data still has several more years to run,” explains Newhouse. “There will be future UK Biobank data releases that will provide a network load in addition to our own growth.”
Steve Kennett, executive director of e-infrastructure at Jisc, adds: “The work that EMBL-EBI and UK Biobank are doing is profoundly important, not only for UK research but for human health on a global scale.
“We are delighted to be able to support these efforts through our Janet Network, and look forward to continuing our work with these organisations as they grow and evolve.”
EMBL-EBI’s director, Ewan Birney, will deliver a keynote at Jisc’s Networkshop49 on Tuesday 28 April, discussing how important data science has become for biological and health research. Book your place – free for Jisc member institutions.