Advances in data science and machine learning have led to breakthroughs in many technological fields, such as computer vision and voice recognition. Inspired by these breakthroughs, data science techniques are increasingly applied in research on porous materials. These advances also challenge how we view research data and how we manage and publish our experimental and computational data. As a result, all major funding agencies and journals are adapting their funding and publishing guidelines, requiring increasingly detailed research data management plans and openly available raw data and metadata published in suitable repositories. To help meet these requirements, we will introduce research data management tools for the whole data life cycle that are specifically designed for work with zeolites, MOFs and related porous materials. In the second part of our workshop, we will focus on the opportunities that arise from exploiting research data on porous materials, which promise to accelerate the development of new materials.
The workshop is aimed at doctoral researchers and postdocs working on experimental or computational studies of zeolites, MOFs and related porous materials. In addition, participants will have the opportunity to share their own experiences and discuss them with the speakers.
Prof. Dr. Marcus Rose | TU Darmstadt, Darmstadt |
---|---|
Dr. Manuel Tsotsalas | Karlsruher Institut für Technologie (KIT), Karlsruhe |
Prof. Dr. Wolfgang Kleist | Technische Universität Kaiserslautern, Kaiserslautern |
Dr. Nadja Möller | DECHEMA e.V., Frankfurt am Main |
14:00 - 14:10
Welcome and technical information
Wolfgang Kleist and Nadja Möller
14:10 - 14:35
Introduction to RDM for porous materials
Manuel Tsotsalas
14:35 - 15:35
ELNs
Nicole Jung (Chemotion)
ELN: Chemotion and NFDI4Chem
The role of ELNs to improve scientific documentation and reporting: examples taken from Chemotion-ELN
Electronic Lab Notebooks (ELNs) are a key prerequisite for comprehensive documentation of research processes, the digital storage of research data, and their reuse. ELNs can be used to plan, record and store experiments or research data and, in combination with repositories, to disclose them. In the long run, the benefit of ELNs is the option to store and manage data in a standardized way and to enrich the data with (automatically generated) information such as metadata, identifiers and descriptors. For scientists, ELNs offer advantages such as faster research processes and faster access to information. Selected benefits of Chemotion, an ELN designed for the discipline of chemistry, will be presented to illustrate the use of research data management tools. The ELN offers special features for chemical work and includes diverse functions that also allow it to be used in other disciplines. Both the chemistry-specific modules and the generic, adaptable modules will be presented briefly.
Chemotion ELN can be used in combination with the open access repository Chemotion. Research data can be disclosed to the public by transferring information directly from the ELN to the repository. Together, the interoperable ELN and repository ensure both an easy disclosure process and the availability of comprehensive data, including primary data and descriptions.
The systems Chemotion ELN and Chemotion Repository are part of the strategy of the National Research Data Infrastructure for Chemistry (NFDI4Chem) in Germany. The strategy and measures of NFDI4Chem will be described in brief.
Matthias Schwotzer (eLabFTW)
Case study of the implementation of an ELN in a research institute or "every change hurts"
Implementing an electronic laboratory notebook (ELN) system in the daily work of a research institute brings drastic changes to everyday routines. Once the decision has been made and suitable software has been identified, the path to an operational ELN with the corresponding hardware is, in principle, simple and quick. The real challenge of development and implementation, however, starts at this stage. The implementation of eLabFTW, a widely used system, at the Institute for Functional Interfaces (IFG) at the Karlsruhe Institute of Technology (KIT) is shown as an example of this process. It becomes obvious that the organisation of the system has to adapt flexibly to the needs of each use case. Another challenge is motivating scientists to adopt the system and use it efficiently, since the advantages of such a system often become apparent only after a longer period of operation. Once the necessary activation energy has been supplied, helped in particular by external pressure from the research landscape to integrate modern tools for research data management (RDM) into daily practice, the system can continue to develop almost spontaneously within the lively environment of an institute. Such a process nevertheless requires intensive care and continuous maintenance of the ELN system. In this contribution, these aspects will be highlighted and presented with reference to the eLabFTW installation at the IFG.
Pepe Marquez (NOMAD-ELN)
"NOMAD-ELN: Electronic Laboratory Notebook features for experimental Material Science"
NOMAD is a platform and open-source software powered by the NFDI project FAIRmat for making Materials Science data FAIR. NOMAD was originally designed for computational Materials Science. Now, the software is rapidly evolving to manage, process, and publish experimental data. As part of this development, ELN features are available in NOMAD. These features allow users to design data schemas, combine manual inputs with automatic file parsing and easily implement custom data visualization. Data can be retrieved with powerful search capabilities specifically designed for Materials Science, either in a graphical user interface (GUI) or through an API. Moreover, NOMAD can be installed on your local servers (as a NOMAD Oasis), which keeps your data private. A local NOMAD Oasis can be connected to a federated (FAIRmat) network, allowing data sharing and publishing.
In this talk, we will showcase the NOMAD-ELN features. These include data schema customization, visualization, and search capabilities for your experimental data. Additionally, tools can be run directly on your NOMAD Oasis, allowing cloud computing and data analysis.
15:35 Coffee-Break
15:55 - 16:55
Data formats and repositories
Stefan Kaskel (AIF)
"A universal standard archive file for adsorption data"
New advanced adsorbents are a crucial driver for the development of energy and environmental applications. Machine learning and data mining techniques now offer tremendous potential for identifying an adsorbent for a particular application. However, the current practice of reporting adsorption isotherms as graphs and figures does not allow the original experimentally measured data to be reproduced.
Here we propose the specification of a new standard adsorption information file (AIF) inspired by the ubiquitous crystallographic information file (CIF) and based on the self-defining text archive and retrieval (STAR) procedure, also used to represent biological nuclear magnetic resonance experiments (NMR-STAR) [1]. The AIF is a flexible, general and easily extensible free-format archive file that is readily human and machine readable: it is simple to edit using a basic text editor and simple to parse for database curation. This format represents the first step toward an open adsorption data format as a basis for a decentralized adsorption data library. Such an open format may facilitate the electronic transmission of adsorption data between laboratories, journals and larger databases, in an effort to increase open science in the field of porous materials in the future.
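To illustrate why a STAR-based text file is simple to parse, here is a minimal Python sketch that reads a small subset of STAR syntax (one data block, single-line items, and `loop_` tables). The data names in the example text, such as `_adsorp_pressure`, are placeholders assumed for this sketch and are not taken from the AIF specification; the function `parse_star_like` is likewise hypothetical and is not part of any AIF tooling.

```python
import shlex

# Illustrative AIF-like text. The data names below are placeholders chosen
# for this sketch; they are NOT quoted from the published specification.
EXAMPLE = """\
data_example_isotherm
_exptl_temperature 77.3
_exptl_adsorptive 'nitrogen'
loop_
_adsorp_pressure
_adsorp_amount
0.01 1.20
0.10 3.45
0.50 5.10
"""


def parse_star_like(text):
    """Parse a small subset of STAR syntax (hypothetical helper).

    Handles one data block, single-line key-value items and loop_ tables.
    Multi-line values and nested structures are not handled.
    """
    block = {"name": None, "items": {}, "loops": []}
    loop = None  # the loop currently being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith("data_"):
            block["name"] = line[len("data_"):]
            loop = None
        elif line == "loop_":
            loop = {"columns": [], "rows": []}
            block["loops"].append(loop)
        elif line.startswith("_"):
            tokens = shlex.split(line)  # honours quoted values
            if len(tokens) == 1 and loop is not None and not loop["rows"]:
                # A bare data name directly after loop_ declares a column.
                loop["columns"].append(tokens[0])
            else:
                # Otherwise it is a key-value item; this also ends any loop.
                block["items"][tokens[0]] = " ".join(tokens[1:])
                loop = None
        elif loop is not None:
            # Any other line inside a loop is one row of values.
            loop["rows"].append(shlex.split(line))
    return block


block = parse_star_like(EXAMPLE)
pressures = [float(row[0]) for row in block["loops"][0]["rows"]]
```

All values are kept as strings and converted only where needed, which leaves the interpretation of units and types to the code that uses the data.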
Lena Pilz (MOF)
"Chemotion Repository – A use case"
The need to share, store and provide scientific data has increased in recent years. Repositories are a very useful tool here: they provide the digital capabilities for this and can also offer helpful functions for publishing data, for example together with a journal article.
The Chemotion repository provides DOIs to make data citable, identifiable, and searchable. The main features of the Chemotion repository will be discussed and a use case will demonstrate how to move from measurement data and lab notes to published data.
Sandor Brockhauser (FAIRmat)
"FAIRmat for FAIR Data Management"
NOMAD has long provided an efficient data sharing platform for materials science. As one of the flagship projects of the German National Research Data Infrastructure initiative (NFDI), the FAIRmat project now aims to extend the NOMAD platform and provide an integrated solution for FAIR data management that also covers the needs of synthesis and experimental characterisation laboratories. In addition to enabling local deployments in laboratories, with the option of integrating them into the NOMAD data sharing network, the newly developed NOMAD Oasis offers a customisable electronic lab notebook as well as data exploration, analysis and visualisation services. The latter services require an allocated compute infrastructure, on which containerised tools run as cloud services made available in the user's browser. To make this level of integration possible, together with machine-actionable and interoperable reuse of the data, standardised data modelling and tools for capturing and transforming all required metadata have also been made available to the community.
16:55 Coffee-Break
17:15 - 17:40
Data science
Seda Keskin (High throughput computational screening and ML)
“Modeling of Porous Materials: From Molecular Simulations to Machine Learning”
High-throughput computational screening based on molecular simulations is very useful for quickly assessing the potential of metal-organic frameworks (MOFs) and covalent organic frameworks (COFs) for various applications. Molecular simulations offer a wealth of structural property and performance data for MOFs and COFs that must be further examined. The recent application of machine learning (ML), another rapidly expanding field, to high-throughput computational screening of MOFs has been very successful in revealing the performance trends of these materials across a variety of applications, particularly gas storage and separation. In this talk, we will describe the present state of the art in ML-assisted computational screening of MOFs and discuss the opportunities and difficulties that are arising in the field of data science for the design and discovery of porous materials.
17:40 - 18:00
Discussion
18:00 - 18:10
Summary
Marcus Rose
18:10 End of the Workshop
(As of 07 November 2022. Subject to alterations.)
More information on the programme will follow.
The workshop will be streamed via Zoom.
Registration is limited and will be handled on a "first come, first served" basis.
Image source: KIT