Research Data Goes Cloud

Written on May 25, 2020

Every time you undertake research, you create new knowledge about our world and, thus, new data. The challenge is how to store and manage all of this data for later reuse.

NFDI is an initiative within Germany, put forward by the Joint Science Conference in late 2018 and backed financially by the federal and state governments, to establish a distributed cloud infrastructure to address this issue. The acronym NFDI stands for the lovely German term “Nationale ForschungsDatenInfrastruktur,” i.e., national research data infrastructure. The NFDI Directorate is based in Karlsruhe, while the data management is handled by the Karlsruhe Institute of Technology (KIT) and the Leibniz Institute for Information Infrastructure (FIZ).

This data management is based on the so-called FAIR principles: Findable, Accessible, Interoperable, and Reusable. This means that research data and metadata need to be findable by both humans and bots, accessible in a standardized way, well integrated with other data, and reusable under their respective licenses. The last point, ‘reusable,’ is crucial because it is the ultimate goal of FAIR: to make data more reusable and thus science more efficient.

Research Data Management: Now and in the Future

NFDI is currently in the process of forming consortia and reviewing funding proposals, with a total amount of €85 million provisioned for the establishment of up to 30 consortia across all sciences. The long-term goal is to build an independent legal entity dedicated to research data management in Germany in conjunction with other initiatives such as the European Open Science Cloud (EOSC).

Since NFDI is a very recent initiative, probably not many researchers or students have heard of it so far. Within the Youth German Physical Society, I am heading a team dedicated to spreading the word about NFDI and contributing a youth perspective on research data management. On the one hand, this means envisioning how such a research data infrastructure could work five to ten years from now, when we will be the ones using it.

Imagine you are reading a scientific publication. Today, that means scrolling through a PDF, and if you would like to reuse some data points, you have to extract them from a crappy, low-resolution screenshot or plead with the authors and hope they will send over some data. Now imagine these articles were web-first, linked to all of the underlying research data, and featured interactive graphics like those in IPython: you could put your cursor on a plot and read off a data point, display the publishing license, and right-click to export the data for reuse in your own simulation, in accordance with that license. And imagine you had been introduced to this data management system during your studies.
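To make this vision a little more concrete, here is a minimal Python sketch of what a license-aware export behind such an interactive figure might look like. Everything in it is an assumption made for illustration: the `Dataset` record, the `export_csv` function, and the small set of reusable licenses are hypothetical and are not part of NFDI or any existing service.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical set of licenses that permit reuse; a real system would
# consult the repository's own license registry instead.
REUSABLE_LICENSES = {"CC0-1.0", "CC-BY-4.0"}


@dataclass
class Dataset:
    """Hypothetical record linking a published figure to its data."""
    identifier: str   # persistent identifier such as a DOI (findable)
    license: str      # license string attached to the data (reusable)
    columns: tuple    # column names including units (interoperable)
    rows: list        # the actual data points


def export_csv(dataset: Dataset) -> str:
    """Return the data as CSV text, refusing if the license forbids reuse."""
    if dataset.license not in REUSABLE_LICENSES:
        raise PermissionError(
            f"License {dataset.license!r} does not permit reuse"
        )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Keep identifier and license with the data so provenance survives export.
    writer.writerow(
        [f"# source: {dataset.identifier}", f"license: {dataset.license}"]
    )
    writer.writerow(dataset.columns)
    writer.writerows(dataset.rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example = Dataset(
        identifier="doi:10.0000/example",
        license="CC-BY-4.0",
        columns=("temperature_K", "resistance_ohm"),
        rows=[(4.2, 0.0), (77.0, 1.3)],
    )
    print(export_csv(example))
```

The point of the sketch is that the license and the identifier travel with the data: the export is refused when the license does not allow reuse, and otherwise the exported file still says where the data came from.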

On the other hand, our goal is to discuss how NFDI can be integrated into teaching, lab courses, and thesis writing, so that students become acquainted with using such a cloud-based research database. This includes, for example, providing sample data as open educational resources. A complete concept is presented in a position paper of the Federal Council of Physics Students in Germany.

Event: Satellite Workshop ‘NFDI @ Teaching’

On June 3, 2020, our team from the Youth German Physical Society, together with the Federal Council of Physics Students in Germany, is organizing a satellite workshop ‘NFDI @ Teaching’ for the Conference on a FAIR Data Infrastructure for Materials Genomics, which is going to take place from 9 am till 1 pm as a Zoom seminar. On this occasion, we would like to discuss designing a research data infrastructure with respect to the needs of young, aspiring researchers and what NFDI could look like in practice at university.

Also, in November 2019, we ran a design thinking process within the Youth German Physical Society to work out who will use NFDI (by creating personas) and how these people are going to use it (by writing user stories). The results of this process will be presented as a poster contribution to our satellite workshop.

Looking forward to seeing you on June 3, 2020!