Published in the European Journal of Epidemiology, the paper describes how the Data Portal – one of the cornerstones of DPUK’s MRC-funded activity – was set up to provide rapid access to cohort data and inform studies aimed at preventing and treating conditions such as dementia.
Launched in November 2017, the Data Portal brings together health records for over 3 million people in 42 cohort studies within a free-to-access resource. Researchers can identify which cohorts are relevant to their proposed area of study, apply for access to the data, and then analyse it in a secure, remote environment complete with analytical software packages and cross-cohort capability.
Study senior author Professor John Gallacher, Director of DPUK and Professor of Cognitive Health at Oxford University, said: ‘No matter where you are in the world, if you’re a researcher with a great idea then the Data Portal allows you to access the information you need to test your hypothesis. Our vision is to optimise and democratise the UK’s rich heritage of health data collection, powering it with state-of-the-art technology to take on the critical challenge of finding ways to prevent and treat dementia. The Data Portal is the only resource of its scale globally, and we look forward to seeing how it will develop and the impact it will have in the future.’
Joint-first author Dr Sarah Bauermeister, Senior Data and Science Manager at DPUK, said: ‘The Data Portal was born out of a recognition that big data is the future of medical research, and that dementia researchers need access to information from cohort studies in a structured, curated way – efficiently and at scale. With an effective treatment for dementia having proved elusive for more than a century, this type of high-quality data – from cognitive test results and biological samples to brain imaging and genetics – may hold the key to identifying the changes that take place in the brain and body long before symptoms start to appear.’
Since the Data Portal’s launch, there have been almost 600 requests to access individual cohort data, spanning 72 institutions (academic, commercial and governmental) across 16 countries. Study topics include the links between childhood experiences and later-life cognition, successful cognitive ageing in the over-90s, and the association between Parkinson’s disease, dementia and insulin resistance.
The new paper, featuring more than 60 authors from DPUK’s university, industry and funding body partners, describes the end-to-end, seven-layer architecture that allows researchers to discover, access and analyse standardised cohort data. Although optimised for dementia research, the Data Portal’s remote access model ‘provides a generic solution that can be adapted to any outcome for which relevant data is available’.
According to the paper, a multi-cohort-focused repository such as the Data Portal provides an integrated solution to challenges including the need to access multi-modal data at scale to achieve statistical rigour; the prohibitive size of large modern datasets; the growing value of triangulation and replication using independent datasets; and the increasing burden of mastering bespoke models for different types of data. Among the planned future developments are linkage with electronic health records, enhanced global connectivity, and the use of cohort data to aid recruitment to highly targeted experimental medicine studies.
The authors write that the benefits of the repository-based Data Portal approach include a reduction in administrative burden for cohort teams, and an important contribution to the democratisation of science (particularly for researchers in developing countries, and those without access to high-end computing power).
The authors conclude: ‘The DPUK Data Portal was established by the MRC to support the development of new treatments for dementia by using cohort data to inform experimental medicine. By streamlining procedures for cohort research teams, increasing data accessibility for researchers, reducing costs, and adding value, the Data Portal is an investment in the future of cohort studies.’