Data catalog
https://data.trr379.de hosts the data catalog of the TRR379. The site is currently a work in progress and will be published once the project has started to accumulate data.
A data catalog is a website that displays and organises the information you provide about your collected datasets. Its purpose is to make data more FAIR, specifically more findable, even if access to the data is restricted. The catalog also encourages collaboration between partner sites: if all partners share a common catalog that showcases their respective datasets, each partner can see what data has already been collected at the other sites and can more easily propose collaborations.
The data catalog only makes metadata (i.e. descriptive fields about your dataset) openly available; there is no need to make the actual data file content publicly accessible. In other words, you do not need to upload sensitive data (or publicly available data, for that matter) anywhere for the dataset to be part of the catalog, only its metadata. Whether you provide full access to a dataset or merely a limited description of it is entirely within your control. If a partner wishes to pursue a collaboration regarding a certain dataset, that partner will still need to follow the existing institutional routes for requesting access to your dataset. You have full control over what information is displayed in the catalog and over whether the data files are publicly accessible.
Catalog sources follow a specific design and structure.
The main source of metadata for the data catalog is the DataLad dataset at ./data, also referred to as the catalog superdataset or the catalog homepage dataset.
The idea is that this superdataset acts as a container for the metadata of all new datasets that should be represented in the catalog.
In a typical catalog source repository, you will find the following structure:
./catalog
- this is where the data catalog sources live
- the live catalog site serves this directory
./code
- this directory contains scripts that are used for catalog updates
./data
- this is a DataLad subdataset of the current data-catalog dataset/repo
- its origin is at: abcd-j/data
- it functions as the superdataset for all datasets rendered in the catalog, i.e. it is the homepage of the catalog
./docs
- documentation sources
- contains documentation for catalog users and contributors
./inputs
- input files used during catalog creation, updates, and testing
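
The layout above can be checked with a few lines of Python. This is only a minimal sketch: the function name missing_dirs and the constant EXPECTED_DIRS are hypothetical helpers invented for illustration and are not part of the catalog's own scripts. It assumes the five top-level directories listed above and nothing else about the repository.

```python
from pathlib import Path

# Top-level directories taken from the layout described above.
# (EXPECTED_DIRS and missing_dirs are hypothetical names for this sketch.)
EXPECTED_DIRS = ("catalog", "code", "data", "docs", "inputs")


def missing_dirs(repo_root):
    """Return the expected top-level directories that are absent from repo_root."""
    root = Path(repo_root)
    # Keep the order of EXPECTED_DIRS so the result is predictable.
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling missing_dirs(".") from the root of a catalog source repository returns an empty list when all five directories are present. Note that it only checks that ./data exists as a directory, not that the DataLad subdataset has actually been installed there.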