
Podcast Blechhammer
Research data management at Schmalkalden University of Applied Sciences introduces itself.
Svetlana Knaub enquires.
M
This is the start of our new podcast on the topic of Research Data Management, abbreviated “RDM”,
at Schmalkalden University. The podcast aims to encourage researchers to engage with the subject of RDM
in order to organize their research more efficiently and use their results more sustainably.
Mr. Fehling, you are a contact person for research data management at our university. Perhaps you could start
by briefly introducing yourself and then tell us something about the HAWK project on research data management.
A
My name is Peer Fehling. I am a trained chemist, so the research data I have dealt with myself is primarily measurement data. The research data management community in Thuringia, as nationwide, is made up of people from all scientific areas, so that it benefits from diverse input from the individual disciplines. I found my way into research data management when I took up my position at Schmalkalden University in the FDM-HAWK project.
The FDM-HAWK project has been running at Schmalkalden University of Applied Sciences since December 2022. The German abbreviation FDM-HAWK stands for: Competence Cluster Research Data Management at Universities of Applied Sciences in Thuringia. This already shows that several institutions are involved.
M
Which facilities are these?
A
In addition to the Schmalkalden University of Applied Sciences, the Erfurt University of Applied Sciences, the Ernst Abbe University of Applied Sciences Jena and the Nordhausen University of Applied Sciences are also involved.
M
What is the project about?
A
During the research process, starting with the planning of projects, the application for funding and the realization of the projects, a great deal of data is collected. The aim of research data management is to preserve this data throughout the data life cycle in the best possible way, based on standardized rules, and to make it available to other interested parties beyond the end of the project.
M
Why is this data of interest to third parties?
A
More than ever, modern research is carried out by highly specialized teams and the results of individual research groups are closely connected to other groups. This often involves considerable human and financial resources. Some projects are financed by funding organizations with taxpayers’ money. In the data-driven age, data and information are the true treasure of research that must be preserved. And data that is collected with taxpayers' money should also be accessible to any interested parties, for instance to avoid duplicate surveys and thus double funding.
M
Why is the project focusing specifically on universities of applied sciences,
and on those in Thuringia in particular?
A
It must be said that the topic of research data management has gained increasing attention in recent years. One of the main reasons for this has been the efforts of funding organizations to create binding standards for handling research data. In this context, the "Guidelines of the German Research Foundation on Good Scientific Practice for Handling Research Data" should be mentioned. Other funding organizations, such as the Volkswagen Foundation or the European Union, require applicants to make statements on the handling of the data generated in their research projects.
The establishment of research data management at universities in Germany began several years ago, not least because basic research there is financed to a large extent with taxpayers’ money.
M
How should we visualize this "establishment of research data management"?
A
Local state initiatives for research data management have been founded, which actively support universities in the development of research data management. In Thuringia, this is the Thuringian Competence Network for Research Data Management, German abbreviation TKFDM, which emerged from the University of Jena. But there are also RDM initiatives in other German federal states. Examples include HeFDI for Hesse, BW-FDM in Baden-Württemberg and FDM Bavaria. At universities, temporary positions were initially created as contact points for researchers on the topic of "research data management", and many of these positions have since been made permanent. This is an important prerequisite for the planned consolidation of research data management structures at universities of applied sciences.
M
And what is special about Universities of Applied Sciences when it comes to research data management compared to universities?
A
One major difference is the focus of research. While universities focus on basic research, research at universities of applied sciences is very industry- and application-oriented. Accordingly, the stakeholders have a strong interest in protecting sensitive data and securing competitive advantages from research activities. However, this is also possible with an appropriately tailored research data management system, for example through targeted licenses and the protection of exploitation rights. Nevertheless, it should not be forgotten that the re-use of third-party data also brings advantages for one's own research.
M
Where can I find out more about research data management and get started quickly?
A
At Schmalkalden University, there is a section on research data management on the website of the "Research and Transfer" department, where further information is compiled. But of course, we invite everyone to follow this podcast, which is intended as an introduction to the topic of "research data management".
M
Mr. Fehling, thank you very much for this informative overview of the project
"Research data management competence cluster at Universities of Applied Sciences in Thuringia".
In the next episode, we want to clarify what is meant by research data and research data management.
And with that, we say goodbye.

Podcast Blechhammer
Research data and research data management - getting started.
Svetlana Knaub finds out how.
M
Welcome everybody. Today's topic of our podcast on "Research data management at Universities of Applied Sciences"
deals with the following questions: What is research data and what is research data management?
Mr. Fehling, could you please explain what is meant by research data?
A
Everyone can probably imagine what research data means, at least from the perspective of their own academic background. However, this subject-specific perspective makes it challenging to establish a standardized definition. Just think of the multitude of measurement data generated in the natural and engineering sciences or the data surveys often used in the social sciences.
M
That seems to be almost all the data that has anything to do with research?
A
In the "Guidelines on the Handling of Research Data", the German Research Foundation has used an enumeration to address the topic of "What is research data" and summarised measurement data, laboratory values, texts, objects from collections or samples that are created, developed or analyzed as part of scientific work under the heading of "research data". Methodological test procedures, such as questionnaires or software, simulations and survey data, that means data related to individual “observation units", such as people, households or companies, are also mentioned.
M
That sounds pretty bulky!
A
We can simplify this and say that research data is all data that is generated or used in the planning, realization and documentation of scientific projects. Scientific projects include project work as well as bachelor's, master's or doctoral theses.
M
And why is research data so important today?
After all, such data already existed in the past, even though in a different form.
A
That’s correct. But the current situation is as follows: Modern research is no longer carried out exclusively by outstanding individuals and no longer takes place in a "quiet room". Instead, we are dealing with teamwork between highly specialized scientific units that work together across universities at a national and international level. At this point, we need to make the connection to digitalization. Modern research generates a constantly growing flood of digital data, which, in addition to practical application, represents the real treasure of research. It is not difficult to recognize that research data forms the basis for successful scientific work and reflects its success. Modern research is "data-driven", which means that strategic decisions about the alignment of research are made on the basis of analyzing and interpreting data.
M
The importance of research data management is now also becoming clear...
A
That's right. Basically, every researcher already takes care of their data. However, this process is becoming increasingly demanding and time-consuming, not to mention the information technology "know-how". It involves resource planning, available storage structures, data protection and data security, backup strategies, data archiving and much more.
M
How does research data management support researchers?
A
Research data management provides suitable tools for these tasks and advises researchers on the preparation, implementation and organization of their work, effectively leaving more time for the actual research. The security and usability of the research data are always guaranteed during the project and beyond.
M
Which university institutions are involved in this process?
A
In addition to a local contact person for research data management at the university, the university data center and the library are also involved. These form the "research data infrastructure". In a broader sense, it is also about information management. The basis for this is the electronic archiving and subsequent utilization of research data. In a nutshell, research data management combines all methodological, conceptual, organizational and technical measures and procedures for handling research data during its “life cycle".
M
We hope that we have been able to contribute to the understanding of the terms
"research data" and "research data management".
The next episode will deal with the topic of "Research Data Management and Data Life Cycle".
Until then, we say goodbye.

Podcast Blechhammer
Research data management - preferably FAIR!
Svetlana Knaub finds out how this works.
M
Welcome to the third episode on Research Data Management at Schmalkalden University.
In the second episode, we talked about research data and research data management.
At the end, when it came to how exactly the support for researchers in organizing their research data looks like,
the term "data lifecycle” came up. What is the data life cycle?
A
The data life cycle is an illustrative model for handling research data over the timeline of its existence. In simplified terms, it is divided into individual sections that do not run in strict succession, but overlap in time to some extent.
These sections are:
1. Planning the research project
2. The generation of data
3. Analyzing and processing the data
4. Sharing and publishing the data
5. Archiving the data, and, last but not least,
6. The subsequent use of the data.
It should be noted that this structure appears under different labels, although the content is the same.
M
Can you explain the individual phases in more detail, please?
A
Of course. The research project begins with the planning phase. It is important to clarify which data is required, generated, processed and stored. This includes logistical and infrastructural considerations regarding available storage options and storage capacities, the regulation of responsibilities, the definition of file and directory structures, but also the acquisition of funds for structured research data management.
M
That's quite a lot of information on this first point. How do you keep track of it all?
A
All this information flows into a so-called data management plan. This is a useful tool, and a number of structured templates are available for it. At the Business Information Center of Schmalkalden University, corresponding processes are being designed, and handouts in the form of a flowchart are available to help with creating data management plans. We will return to this in a later episode, as data management plans are now required by many funding providers in the application process for research projects.
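Purely as an illustration, the kinds of information that flow into a data management plan can be sketched as a small structured record. None of the field names or values below come from an official template; they are made up for this example.

```python
# Illustrative outline of the information a data management plan collects.
# All keys and values are hypothetical, not taken from any official template.
dmp = {
    "project": "Hypothetical sensor study",
    "data_types": ["measurement series", "calibration logs"],
    "estimated_volume_gb": 50,
    "storage": "university network drive with nightly backup",
    "responsibilities": {"data_curation": "project lead"},
    "file_naming": "YYYY-MM-DD_experiment_run.csv",
    "archiving_period_years": 10,  # cf. the 10-year good-practice requirement
}

# Print the plan as a simple checklist.
for key, value in dmp.items():
    print(f"{key}: {value}")
```

A structured record like this can later be transferred into one of the template-based tools mentioned above.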
M
And this is followed in second place by the "Data generation” section.
A
Right. Research data is generated through experiments, measurements, observations, simulations, surveys or other processes. This varies greatly from subject to subject. In this context, it is important to obtain the consent of the data owner when using third-party data or to check the license restrictions.
M
As a small note: We will also return to the topic of the legal aspects and licensing of research data in a later episode.
A
That's right. Which brings us to the third section, "Analyzing and processing the research data". The responsibility for this lies with the researchers themselves. The associated processes include, for example, digitizing, transcribing, checking, validating, interpreting and, in the case of personal data, anonymizing or pseudonymizing the data.
M
And how do I know in the end how I got from my raw data to the processed data?
A
This is a very important aspect. So we need a kind of description for our data: this is metadata. Put simply, metadata is data about data. Its collection is a core component of processing research data, and metadata plays an important role in the retrieval and re-use of research data.
M
Can we perhaps give an example of the use of metadata?
A
The metadata of digital photographs is well known. They show when the photo was taken and with which camera settings. The GPS coordinates provide information about the location.
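To make the photo example concrete, the following sketch records such descriptive metadata in a sidecar file next to the data it describes. The field names, camera name and coordinates are illustrative only, not an EXIF or any other standard.

```python
import json

# Illustrative (non-standard) metadata for a digital photo, similar in
# spirit to the EXIF information embedded in image files.
photo_metadata = {
    "file": "IMG_0423.jpg",
    "captured": "2023-06-14T09:31:05",
    "camera": "Example Cam X100",            # hypothetical device name
    "aperture": "f/2.8",
    "exposure_s": 0.004,
    "gps": {"lat": 50.720, "lon": 10.452},   # made-up coordinates
}

# Store the metadata in a sidecar file alongside the photo itself,
# so the description travels with the data.
with open("IMG_0423.json", "w") as f:
    json.dump(photo_metadata, f, indent=2)
```

The same pattern, data file plus a machine-readable description next to it, carries over directly to research data.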
M
And the topic of metadata will also be discussed in more detail in a later podcast episode.
And that brings us to section four of the research data lifecycle, the "sharing and publishing" of research data.
What needs to be considered here?
A
Before data is shared or published, copyright and access rights, patent rights or licenses should be defined. If data is to be published in a research data repository, that means on a special public or institutional server, it is possible to exert a targeted influence in this respect when selecting the repository. Persistent identifiers, or PIDs for short, can be created to uniquely identify and reference the data. PIDs make it easier to find the data on the network. A special form of PID are, for example, Digital Object Identifiers, DOIs, which some of you may have already come across in the literature.
M
What do I ultimately gain from sharing or publishing my research data?
A
The published research data are a direct expression of the success of the research and enhance not only the researcher's own reputation, but also that of the researcher’s scientific institution. This should be expectation and incentive enough. In addition, dissemination in existing networks and communities opens up collaboration opportunities for future research projects.
M
That is plausible. And this brings us to the fifth section: Archiving research data. Why should research data be archived?
A
Quite clearly: To make scientific results comprehensible in the long term.
M
What does long-term mean?
A
Long-term means at least 10 years. Based on the guidelines of the German Research Foundation on "Good Scientific Practice", this requirement is already being implemented by most universities and colleges in internal guidelines on the "Handling of Research Data", including at Schmalkalden University of Applied Sciences.
M
What is archiving?
A
For archiving, the data, or different versions of it, are copied to long-term storage, where they are kept permanently and securely. The original data can then be deleted. Archive storage is not necessarily designed for short-term access; it is often designed as a kind of data depot with correspondingly longer access times. We will return to the topic of archiving in a later episode of the podcast.
M
Which data should be archived?
A
In principle, all important project data should be archived after completion of a research project. This allows you to fulfill your obligation to provide evidence later.
M
This leaves us with the final stage of the research data life cycle, the “Re-usability of research data".
What does this mean and who uses the research data later?
A
Re-usability means that published research data, for example in data journals or repositories, can be used later by the researchers themselves or by third parties, with or without restrictions. Contextual re-evaluation of research data, for instance viewing it from a different perspective, can open up completely new research questions and approaches. Time-consuming and cost-intensive preliminary investigations are reduced, and the overall scientific output of the research is improved both qualitatively and quantitatively. The owners of the research data can decide for themselves who may use the data, and whether with restrictions or not.
M
That was a lot of information about the research data lifecycle. In the end, can we derive a recommendation on how to handle research data correctly and make the best use of knowledge about the research data lifecycle?
A
Yes, there are the international FAIR principles. FAIR, spelled F-A-I-R, is an acronym. The letters stand for:
F for findable
A for accessible
I for interoperable and
R for reusable.
This describes guidelines for handling research data so that it is suitable for reuse by both humans and machines. We have heard about some of the means of achieving this in today's discussion of the data lifecycle. Data that complies with the FAIR principles is also called FAIR data, a term that is worth remembering.
M
Perhaps we can summarize once again how we make our research data FAIR?
A
Findability is ensured by assigning persistent identifiers to the data, such as a DOI, and an ORCID, the Open Researcher and Contributor ID, to the author. The accessibility of the data is ensured by its licensing and a long-term storage method; the metadata also makes a decisive contribution to this. Interoperability is achieved by using open and free data formats with long-term usability. The use of standardized terms, that means a special vocabulary, is also an important aspect; this applies equally to data and metadata. Many of the points already mentioned are important for the reusability of research data, such as the use of open file formats, structured metadata, standardized vocabulary or machine-readable licenses. We will go into the licenses in more detail in a later episode.
M
Mr. Fehling, thank you for the interesting explanations.
In the next episode of our podcast, we will be looking at
Open Science, Open Access and Open Data.
See you next time.

Podcast Blechhammer
Open science - free access to information.
Svetlana Knaub seeks clarity.
M
Welcome to the fourth episode of our Research Data Management podcast.
In the third episode, we talked about the data lifecycle, the FAIR principles and FAIR research data.
Today, we will take a look at the relationship between Open Science, Open Access, Open Data and research data management.
What does one have to do with the other?
A
There is no precise definition of the term "Open Science". Open Science is intended to promote the transparency of scientific processes in general and access to scientific information in particular. Individual elements of the research process are made freely accessible, including, for example, publications, laboratory reports, software and research data. The barrier-free exchange of scientific findings thus enables a higher quality of science and is part of the digitization strategy of both the German federal states and the European Union.
M
What does this mean for us as a University of Applied Sciences
with regard to the practical implementation of research results in industry?
A
The economy naturally benefits directly from an easier transfer of scientific knowledge. Innovative strength and competitiveness are improved and the quality of industry-related research is enhanced, which also benefits future collaboration projects. However, Open Science does not only affect research that can be profitably implemented in industry, but all scientific disciplines.
M
And how can Open Access and Open Data be categorized in relation to Open Science?
A
Open Science is a generic term for a group of measures that all aim to improve the accessibility, dissemination and re-usability of scientific knowledge. In addition to Open Access, this also includes Open Source and Open Data. Open Access aims to achieve unrestricted access to scientific publications. Open Source concerns the reuse of software and is probably already familiar to many. Finally, Open Data endeavours to make research data freely available.
However, further thought is already being given to Open Hardware for experimental setups, Open Services for support services and Open Educational Resources for teaching materials, opening up new fields of action in the sense of "free availability".
M
And where do we stand as a University of Applied Sciences in this process?
A
Since 2021, Schmalkalden University has had an Open Access Policy, that means a set of guidelines and recommendations on this topic, which can be viewed on the university's website. The university names Open Science and Open Access as part of its canon of values. All members of the university are called upon to participate in the realization of the Open Science concept within the scope of their possibilities. This implies, for example, submitting publications to Open Access journals, permanently securing the exploitation rights to electronic publication versions, that means not assigning them to publishers, and publishing in freely accessible form. A distinction is made here between the "Golden Path", the first publication in an Open Access medium, and the "Green Path", the secondary publication of scientific work, simultaneously with the first publication or afterwards, in an Open Access repository such as the Digital Library of Thuringia or the Thuringian Research Data Repository REFODAT, which is now available.
M
Where can I get advice on Open Access Publication channels at our university?
A
The first point of contact is the library, which also provides the relevant information on its website. In 2021, Schmalkalden University joined the nationwide DEAL agreement with Springer Nature, which has been extended until 2028. This provides campus-wide access to around 2000 journals published by Springer. Articles by first authors of the university in Springer Closed Access journals, that means subscription journals, are made available worldwide as Open Access, and the publication fees in pure Open Access journals from Springer Nature are paid by the state of Thuringia. There is also an Open Access publication fund for the state of Thuringia. In any case, you should inform yourself in advance about the publication and exploitation rights. There are also numerous financing models for Open Access publications.
M
And what about the implementation of the Open Science concept
with regard to Open Data and research data at Schmalkalden University?
A
The Open Access Policy recommends storing research data in a way that is findable, accessible, interoperable and reusable in accordance with the FAIR principles. We talked about how this can be achieved in the last episode.
M
I would like to thank you for your insights into the topics of "Open Science, Open Access and Open Data".
The next episode is about data documentation and the importance of metadata in research data management.
Until then, we say goodbye.

Podcast Blechhammer
Metadata describes the world of data.
Svetlana Knaub questions the details.
M
Welcome to the fifth episode of our Research Data Management podcast. Today, the topics of
"data documentation and metadata" will be discussed in more detail in the context of research data management.
As briefly mentioned in the 3rd episode of our podcast, metadata is data about data.
Using the example of a digital photo file, Mr. Fehling has already explained that, for example,
the date, aperture or GPS coordinates are such data. What is the significance of data documentation and
metadata for research data management?
A
On the one hand, data documentation is important for the reproducibility of research in terms of good scientific practice; on the other hand, it is also important for the subsequent use of research data. If it is not known under what conditions the data was created or what it says, it is practically worthless. The data used to describe research data is called metadata. Metadata is therefore data about data that is indispensable for interpreting the research data, that means for understanding it. Ideally, metadata is both human-readable and machine-readable and thus enables the data to be interpreted by technical systems. And a dataset that cannot be found, or is difficult to find due to missing metadata, cannot be reused. This would eliminate a core element of effective research data management.
M
Let's summarize: Metadata should ideally... ?
A
... be structured, standardized and machine-readable. Only by describing the data with metadata can the research comply with the FAIR principles. Ultimately, each dataset can only be as useful as the metadata that describes it.
M
We remember episode 3 of our podcast, where the FAIR principles were discussed in more detail.
Can you briefly mention them again, as we are sure to come across them more often?
A
Of course. FAIR is an acronym that summarizes the requirements for the preparation of research data. It means:
F for findable
A for accessible
I for interoperable, that means it can be processed across platforms
and
R for reusable.
M
What information about the research data should definitely be included in the metadata?
A
There is the "5 W rule":
Who, What, Where, When and Why?
So, WHO created the data and HOW, WHAT does the data say, WHERE was it created, WHEN and for WHAT purpose, that means WHY.
This makes it clear that metadata is created at all stages of the research data lifecycle, starting with planning, through data collection, data analysis, data archiving or storage and subsequent use. The research data is fully described with information on the research project, the relevant data set and the files it contains. This project-related information is set out in data management plans, which are already mandatory for many funding providers when applications are submitted. We will discuss this in a later episode of our RDM podcast.
M
What options are available to the researcher for creating metadata?
You mentioned that, ideally, metadata should be stored in a standardized way.
Are there tools for this?
A
A simple form is the creation of a README file. Some will be familiar with this format from software, where important information about authorship, versions or licenses is stored. Similarly, a README file for research data contains descriptive information about the data. Keyword: the 5 Ws. The README file is often written in Markdown syntax, and corresponding templates are available on the internet, for example on the GitHub platform. Another option is the codebook, which contains information on all variables of a data set. Imagine a table in a non-proprietary, that means freely usable, file format, for example comma-separated values, CSV. There should not be several tables on one sheet; title lines, comments, blank lines, evaluations and special characters should be omitted; and values should be separated into number and unit of measurement. This is also called "well-structured data".
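As a sketch of what well-structured data can mean in practice, the following hypothetical example writes a small measurement table to a CSV file: one table per file, a single header row, no comments or blank lines, and the numeric value kept separate from its unit. The column names and values are invented for illustration.

```python
import csv

# Hypothetical measurement data; names and values are illustrative only.
# Note that the numeric value and its unit sit in separate columns.
rows = [
    {"sample_id": "S01", "temperature_value": 21.4, "temperature_unit": "degC"},
    {"sample_id": "S02", "temperature_value": 22.1, "temperature_unit": "degC"},
    {"sample_id": "S03", "temperature_value": 20.9, "temperature_unit": "degC"},
]

# One table per file, one header line, no blank lines or comments:
# this keeps the file easy to parse for both humans and machines.
with open("measurements.csv", "w", newline="") as f:
    writer = csv.DictWriter(
        f, fieldnames=["sample_id", "temperature_value", "temperature_unit"]
    )
    writer.writeheader()
    writer.writerows(rows)
```

A codebook for this file would then describe each of the three columns: its meaning, data type and permitted values.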
Well-structured metadata can be obtained by using metadata schemas, which are available as templates. They can be generic, that means generally valid, or subject-specific. Administrative and bibliographic metadata can be standardized across disciplines; the creation of process metadata and descriptive metadata is more demanding.
M
Can you give some examples of frequently used metadata schemas?
A
A well-known generic metadata standard is Dublin Core, originated by the Dublin Core Metadata Initiative. It describes the data history using 15 core fields. All fields are optional and can be extended if required, so that the standard can be tailored to your data. Another generic option is the DataCite Metadata Generator. It creates data documentation in XML format on a question-and-answer basis and is based on Dublin Core. It is maintained by the DataCite consortium.
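As a minimal sketch, a Dublin Core record using a few of the 15 core fields could be assembled with Python's standard XML tools. The namespace URI is the official Dublin Core element set; the title, author name and DOI below are placeholders, not a real dataset.

```python
import xml.etree.ElementTree as ET

# Official namespace of the Dublin Core Metadata Element Set.
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

# A minimal record using a few of the 15 Dublin Core fields;
# all values below are placeholders for illustration.
record = ET.Element("record")
for field, value in [
    ("title", "Example measurement dataset"),
    ("creator", "Jane Doe"),
    ("date", "2024-01-15"),
    ("format", "text/csv"),
    ("identifier", "doi:10.1234/example"),  # hypothetical DOI
]:
    el = ET.SubElement(record, f"{{{DC}}}{field}")
    el.text = value

# Serialize the record to XML.
xml_text = ET.tostring(record, encoding="utf-8").decode()
print(xml_text)
```

Since all Dublin Core fields are optional, such a record can start small and be extended field by field as the dataset description grows.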
M
And the subject-specific metadata standards?
A
Among the subject-specific metadata standards, CMDI, that means Component Metadata Infrastructure, for the field of "Artificial Intelligence", and EngMeta, that means Engineering Metadata, for the engineering sciences should be mentioned.
M
Where can I find something about available metadata standards?
A
A good overview of metadata standards is provided by the Metadata Standards Catalog of the Research Data Alliance, an international organization that aims to promote the open exchange of data. The website FAIRsharing.org, a curated registry of data and metadata standards, and the Digital Curation Centre, a British organization focusing on data management and the digital archiving of data, should also be mentioned.
M
The metadata schema therefore specifies how the information on my research data is structured.
Is it up to me which terms or keywords I use for this?
A
That is an important point. The content should also meet certain standards, and special vocabularies and terminologies are available for this purpose. They are intended to bring different or incorrect spellings to a common denominator or to correct them. The terms are organized into categories called taxonomies. These categories can then be related to each other in a model-like manner to form ontologies. The result is a network of knowledge on a topic, or across disciplines, that can be used easily, efficiently and without contradiction thanks to its standardization. The matter is greatly simplified here; in reality it is more complex.
M
And where can I find more detailed information on this topic?
A
One example is the NFDI4ING Terminology Service of the National Research Data Infrastructure, a service provided specifically for the engineering sciences. Here, subject-specific terminologies for different areas of engineering are developed and networked. The subject areas are divided into 7 archetypes, which are all abbreviated by first names. The archetype DORIS, for example, stands for High Performance Measurement and Computation.
M
And what exactly does the National Research Data Infrastructure do?
A
The National Research Data Infrastructure, abbreviated NFDI, is a non-profit association founded in 2021 and funded by the federal and state governments of Germany. The aim is to make research data usable in the long term through networking. To this end, research institutions from various fields work together. The NFDI provides services, training courses and standards for handling data. The NFDI is divided into 5 sections. One of them is called "Metadata, Terminologies and Provenance". In each section, several subject-specific consortia work together thematically. As of 2024, there are 27 consortia in total.
M
But back to the topic of metadata. Where is the metadata stored?
A
The metadata is stored directly with the data it describes. This can be directly in the file, as with a photo, or linked to the actual data.
M
And how do I find the metadata or the data in my search?
A
Metadata is assigned a persistent identifier, or PID, upon publication. The Digital Object Identifier, abbreviated DOI, of publications, for instance, is well known. This creates the link between metadata and research data. The findability of the metadata itself is realized through registration and indexing in a metadata directory, which can be searched for information. It is important to note that metadata remains available even if the actual reference data no longer exists, perhaps because the server is offline or the archiving period has expired.
This means that important information on data history and usage rights is available even without the actual data. We will return to this topic in a later episode on the subject of "Publishing research data and repositories".
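The role of the DOI as a persistent link can be sketched in a few lines: a DOI is not itself a web address, but the doi.org resolver turns it into one. The DOI used in the example is made up.

```python
def doi_to_url(doi: str) -> str:
    """Turn a DOI like '10.1234/abcd' into a resolvable link.

    The doi.org resolver redirects the request to the current
    location of the object, so the link stays valid even if the
    data later moves to a different server.
    """
    return f"https://doi.org/{doi}"

# Hypothetical DOI for illustration only.
print(doi_to_url("10.1234/example.dataset"))
# https://doi.org/10.1234/example.dataset
```

This indirection is exactly what makes PIDs "persistent": the identifier stays fixed while the location it resolves to can change.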
M
Mr. Fehling, thank you for the information on "Data documentation and metadata".
The data management plan and useful tools are the focus of the next episode. Bye.