Howto TRR379
This is the documentation site of the Collaborative Research Center / Transregio 379: Neuropsychobiology of Aggression (TRR379).
Select any topic from the menu or search this site for information.
TRR379 operates a number of websites for outreach, coordination, and documentation purposes.
This is the main website of the TRR379.
This website is not just a public-facing view on the consortium. It is specifically built to be the core component of the metadata concept of TRR379. It provides a collection of canonical definitions of entities essential for the function of TRR379. Such entities include
Any such entity has a dedicated page on the website, with a stable URL that serves as a URI for that entity. As such, these URLs can be used in any TRR379-related metadata to declare relationships to TRR379 entities, for example, the authorship of a publication, the origin project of a data release, etc.
The website is built with the static site generator Hugo and capitalizes on Hugo's taxonomy feature. Any page on the site is built from a metadata record. For Hugo, this metadata is presented in the form of a page’s front matter. However, these metadata may themselves be generated from the result of a database query.
Here is an example record for TRR379 spokesperson Ute Habel:
---
title: Ute Habel
projects:
  - a02
  - a04
  - q01
  - q04
sites:
  - Aachen
  - Juelich
roles:
  - pi
  - spokesperson
layout: contributor
params:
  orcid: 0000-0003-0703-7722
  name-title: Prof. Dr. rer. soc.
  affiliation: Department of Psychiatry, Psychotherapy and Psychosomatics, Faculty of Medicine, RWTH Aachen University
  sortkey: "Habel, Ute"
---
<additional content on the page's subject>
From this information, the page that identifies and describes Ute Habel as a spokesperson is generated. It also links and references her record on the respective pages for projects, sites, and roles she is associated with. Consequently, the URL https://www.trr379.de/contributors/ute-habel/ can serve as a URI for Ute Habel within the TRR379 metadata. Moreover, ute-habel is a unique identifier for her as a contributor to TRR379.
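As a minimal sketch of how a page's front matter could be generated from the result of a database query, the following Python snippet renders a record as YAML front matter and derives the page URI from its title. The record layout, the helper names (`slugify`, `render_front_matter`, `contributor_uri`), and the slug rule (lowercase, hyphen-separated) are assumptions made for illustration. They are not the actual TRR379 tooling.

```python
import re

BASE_URL = "https://www.trr379.de"  # site root, as in the example URL above


def slugify(title: str) -> str:
    """Assumed slug rule: lowercase ASCII words joined by hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def render_front_matter(record: dict) -> str:
    """Render a flat record (strings and lists of strings) as YAML front matter."""
    lines = ["---"]
    for key, value in record.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


def contributor_uri(title: str) -> str:
    """Build the stable page URL that doubles as a URI for a contributor."""
    return f"{BASE_URL}/contributors/{slugify(title)}/"


# A record as it might come out of a database query (hypothetical layout).
record = {
    "title": "Ute Habel",
    "projects": ["a02", "a04", "q01", "q04"],
    "sites": ["Aachen", "Juelich"],
    "roles": ["pi", "spokesperson"],
    "layout": "contributor",
}

print(render_front_matter(record))
print(contributor_uri(record["title"]))
```

The sketch handles only flat keys and simple lists. A real generator would more likely rely on a YAML library to cover nesting and quoting.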
While other special-purpose identification systems exist (e.g., https://orcid.org for academics), this approach is automatically applicable to any concept and entity relevant to TRR379, including roles, data acquisition methods, instruments, etc. The domain root trr379.de represents a unique namespace to define and reference any required entities. This enables a timely and unencumbered development of a metadata concept for TRR379, without hindering alignment with and mapping to more global efforts and initiatives.
Structured metadata is rendered to an HTML website with Hugo using a template. This approach separates the information from its presentation.
The look of the website can be altered by adjusting the template, or switching to a different template entirely. This requires familiarity with Hugo and its templating mechanism.
At present, the congo template is used.
https://docs.trr379.de (this site) is the main documentation source for the TRR379. It offers information on facilities and procedures.
https://data.trr379.de hosts the data catalog of the TRR379. This site is currently a work in progress, and will be published once the project has started.
TRR379 uses a dedicated CalDAV server at https://cal.trr379.de. This service can be used to host any number of online calendars. Individual calendars can be configured to be publicly accessible, or only for internal (authenticated) consumption.
Any number of non-public calendars can be created and shared with specific audiences. This can be useful to coordinate data acquisition sessions, organize room bookings, or schedule slots for internal meetings and events.
Contact the management team to request a dedicated calendar.
All public calendars are available via a CalDAV URL and can be included in any calendar solution, such as Google Calendar.
The URLs follow the pattern https://cal.trr379.de/public/<name>, where <name> is the calendar name, as stated in the list below. For the events calendar this is https://cal.trr379.de/public/events.
For use with Google Calendar, replace https:// with webcal://. For example, the events calendar can be added to Google Calendar with the URL webcal://cal.trr379.de/public/events.
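The URL pattern above can be expressed as a small helper. This is only an illustrative sketch: the function name `public_calendar_url` is hypothetical, and the host and path are taken from the pattern described above.

```python
CALENDAR_HOST = "cal.trr379.de"


def public_calendar_url(name: str, scheme: str = "https") -> str:
    """Build the URL of a public calendar following the /public/<name> pattern."""
    if scheme not in ("https", "webcal"):
        raise ValueError(f"unsupported scheme: {scheme}")
    return f"{scheme}://{CALENDAR_HOST}/public/{name}"


# CalDAV URL for generic clients, and the webcal:// variant for Google Calendar.
print(public_calendar_url("events"))
print(public_calendar_url("events", scheme="webcal"))
```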
Available public calendars:
events: General TRR379 events
Identifiers are an essential component of the TRR379 research data management (RDM) approach. This is reflected in the visible organization of information on the consortium website, but also in the schemas that define the structure of metadata on TRR379 outputs.
Many systems for identifying particular types of entities have been developed. A well-known example is DOI for digital objects, most commonly used for publications. However, many others exist, like ROR for research organizations, or Cognitive Atlas for concepts, tasks, and phenotypes related to human cognition.
RDM in the TRR379 aims to employ and align with existing systems as much as possible to maximize interoperability with other efforts and solutions. However, no particular identifier system is required or exclusively adopted by TRR379.
Instead, anything and everything that is relevant for TRR379 has an identifier in a TRR379-specific namespace.
TRR379 uses URIs as identifiers that map onto the structure of the main consortium website.
For example, the full TRR379 identifier for the spokesperson Ute Habel is https://trr379.de/contributors/ute-habel.
In this URI, https://trr379.de is the unique TRR379-specific namespace prefix, and contributors/ute-habel is the TRR379-specific identifier for Ute Habel (where contributors is a sub-namespace for agents that in some way contribute to the consortium).
Even though Ute Habel can also be identified by the ORCID 0000-0003-0703-7722, via the quasi-standard identifier system for researchers, this identifier is considered an optional alternative rather than a requirement for TRR379 RDM.
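The decomposition of a TRR379 URI into namespace prefix, sub-namespace, and local name can be sketched in a few lines of Python. The function name `split_identifier` and the assumption that every identifier has exactly two path segments are illustrative and not part of any TRR379 specification.

```python
from urllib.parse import urlsplit

NAMESPACE = "https://trr379.de"  # namespace prefix as given in the text


def split_identifier(uri: str) -> tuple[str, str, str]:
    """Split a TRR379 URI into (namespace prefix, sub-namespace, local name)."""
    parts = urlsplit(uri)
    prefix = f"{parts.scheme}://{parts.netloc}"
    if prefix != NAMESPACE:
        raise ValueError(f"not in the TRR379 namespace: {uri}")
    # Ignore empty segments so a trailing slash does not matter.
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) != 2:
        raise ValueError(f"expected <sub-namespace>/<name>: {uri}")
    return prefix, segments[0], segments[1]


print(split_identifier("https://trr379.de/contributors/ute-habel"))
```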
The reasons for this approach are simplicity and flexibility.
An identifier in TRR379 RDM is a simple text label, in a self-managed namespace. This self-managed namespace can cover any and all entity types that require identification within TRR379. In many cases, an identifier directly maps to a page on the main consortium website. This is a simple strategy to document the nature of any entity. It also establishes the main website as a central, straightforward instrument for communicating and deduplicating identifiers in a distributed research consortium.
Even though any relevant entity can receive a TRR379-specific identifier with the approach described above, the utility of these identifiers is limited to TRR379-specific procedures and activities. However, a TRR379 metadata record on a research site (e.g., https://trr379.de/sites/aachen ) can be annotated with any alternative identifier for the same entity (e.g., https://ror.org/04xfq0f34 ). Thereby it is possible to combine the benefits of a self-governed, project-specific identifier namespace with the superior discoverability and interoperability of established identification systems for particular entities.
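One way to picture such an annotated record is a structure that carries the TRR379 identifier plus a list of alternative identifiers. The sketch below is hypothetical: the class `EntityRecord`, its field `same_as`, and the lookup helper are illustrative names and do not reflect the actual TRR379 metadata schema.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRecord:
    """Hypothetical record: one TRR379 identifier plus optional alternatives."""

    identifier: str
    same_as: list[str] = field(default_factory=list)

    def all_identifiers(self) -> list[str]:
        return [self.identifier, *self.same_as]


def find_by_any_identifier(records, uri):
    """Return the first record that carries `uri` as primary or alternative identifier."""
    for record in records:
        if uri in record.all_identifiers():
            return record
    return None


aachen = EntityRecord(
    identifier="https://trr379.de/sites/aachen",
    same_as=["https://ror.org/04xfq0f34"],
)

# The record can be found through either identifier system.
print(find_by_any_identifier([aachen], "https://ror.org/04xfq0f34"))
```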
The additional documentation linked below provides more information on particular identifiers used by TRR379.
This page is a more in-depth description of the rationale behind the SOP for participant identifiers used by TRR379.
Q01 is the central recruitment project. Any participant included in the core TRR379 dataset is registered with Q01 and receives an identifier. This identifier is unique within TRR379 and stable across the entire lifespan of TRR379.
The dataset acquired by TRR379 is longitudinal in nature. Therefore participants need to be reliably identified and re-identified for follow-up visits. Because participants are not expected to remember their TRR379 identifier, it is necessary to store personal data on a participant for the time of their participation in data acquisition activities.
In order to avoid needlessly wide-spread distribution of this personal data, participant registration and personal data retention is done only at the site where a person participates in TRR379 data acquisitions. Each site:
The site-issued identifiers have a unique, site-specific prefix (e.g., a letter like A for Aachen), such that each site can self-organize their own identifier namespace without having to synchronize with all other sites to avoid duplication.
The identifiers must not have any other information encoded in them.
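A minimal sketch of issuing such identifiers, assuming a random token appended to the site prefix. The token length, the alphabet, and the function name `issue_identifier` are assumptions; the SOP may prescribe a different format.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits  # assumed character set


def issue_identifier(site_prefix: str, existing: set[str], length: int = 6) -> str:
    """Draw a random identifier with a site prefix that is not yet in `existing`."""
    while True:
        # The random part carries no information about the participant.
        token = "".join(secrets.choice(ALPHABET) for _ in range(length))
        candidate = site_prefix + token
        if candidate not in existing:
            existing.add(candidate)
            return candidate


issued: set[str] = set()
first = issue_identifier("A", issued)   # "A" for Aachen, as in the example above
second = issue_identifier("A", issued)
print(first, second)
```

Because uniqueness only has to be checked against the site's own set of issued identifiers, no coordination with other sites is needed.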
The TRR379 participant identifiers, as described above, are pseudonymous. Using these TRR379-specific identifiers only, for any TRR379-specific communication and implementations, is advised for compliance with the GDPR principles of necessity and proportionality of personal data handling. This includes, for example, data analysis scripts that can be expected to become part of a more widely accessible documentation or publication.
Any TRR379 site that issues identifiers is responsible for strictly separating personal data used for (re-)identifying a participant, such as health insurance ID, government ID card numbers, or name and date of birth. This information is linked to TRR379-specific identifiers in a dedicated mapping table. Access to this table is limited to specifically authorized personnel.
When a participant withdraws, or when a study’s data acquisition is completed, the mapping of the TRR379 identifier to personal identifying information (1) is destroyed, by removing the associated record (row) from the mapping table. At this point, the TRR379 identifier itself can be considered anonymous. Consequently, occurrences of such identifiers in any published or otherwise shared records, or computer scripts need not be redacted.
The validity of the statement above critically depends on the identifier-issuing sites maintaining a strictly separate, confidential mapping of identifiers to personal identifying information, and not encoding participant-specific information into the identifier itself.
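The life cycle of a mapping-table row (register, re-identify at a follow-up visit, destroy on withdrawal) can be sketched as follows. The class `MappingTable` is a hypothetical in-memory stand-in. A real table would live in access-restricted storage, and the personal data shown is fabricated for the example.

```python
from typing import Optional


class MappingTable:
    """In-memory stand-in for a site's confidential identifier mapping table."""

    def __init__(self) -> None:
        self._rows: dict[str, dict[str, str]] = {}

    def register(self, identifier: str, personal_data: dict[str, str]) -> None:
        if identifier in self._rows:
            raise ValueError(f"identifier already registered: {identifier}")
        self._rows[identifier] = dict(personal_data)

    def reidentify(self, personal_data: dict[str, str]) -> Optional[str]:
        """Look up the identifier of a returning participant, if one exists."""
        for identifier, row in self._rows.items():
            if row == personal_data:
                return identifier
        return None

    def destroy(self, identifier: str) -> None:
        """Remove the row; the identifier can no longer be linked to a person."""
        self._rows.pop(identifier, None)


table = MappingTable()
person = {"name": "Jane Doe", "date_of_birth": "1990-01-01"}  # fabricated data
table.register("A7K2Q9", person)
print(table.reidentify(person))  # follow-up visit: identifier is recovered
table.destroy("A7K2Q9")          # withdrawal or end of data acquisition
print(table.reidentify(person))  # the link is gone
```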
For each project or study that is covered by its own ethics documentation and approval, separate and dedicated participant identifiers are used that are different from a Q01-identifier for a person. This is done to enable such projects to fulfill their individual requirements regarding responsible use of personal data. In particular, it enables any individual project to share and publish data without enabling trivial, undesired, and unauthorized cross-referencing of data on an individual person, acquired in different studies.
These project-specific identifiers are managed and issued in the same way as described above.
Importantly, the mapping of the Q01-identifier and a project-specific identifier is typically not shared with the requesting project. This is done to prevent accidental and undesired co-occurrence of the two different identifiers in a way that enables unauthorized agents to reconstruct an identifier mapping that violates the boundaries of specific research ethics.
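The separation between Q01 identifiers and project-specific identifiers can be illustrated with a small registry that keeps the link table internal and hands only the project-specific identifier to the requesting project. The class name, the token format, and the example identifiers are assumptions for illustration only.

```python
import secrets


class ProjectIdentifierRegistry:
    """Hypothetical registry kept by the issuing site; the link table stays internal."""

    def __init__(self, site_prefix: str) -> None:
        self._site_prefix = site_prefix
        # (project, Q01 identifier) -> project-specific identifier
        self._links: dict[tuple[str, str], str] = {}

    def issue(self, project: str, q01_id: str) -> str:
        """Return the project-specific identifier, creating it on first request."""
        key = (project, q01_id)
        if key not in self._links:
            used = set(self._links.values())
            while True:
                # Random token: not derived from the Q01 identifier in any way.
                candidate = self._site_prefix + secrets.token_hex(4).upper()
                if candidate not in used:
                    break
            self._links[key] = candidate
        return self._links[key]


registry = ProjectIdentifierRegistry("A")
id_for_a02 = registry.issue("a02", "A123456")  # made-up Q01 identifier
id_for_a04 = registry.issue("a04", "A123456")
print(id_for_a02, id_for_a04)  # two unrelated identifiers for the same person
```

A project receives only the return value of `issue`, so data published by two projects cannot be trivially cross-referenced through a shared identifier.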
Sometimes it is necessary to generate participant identifiers that are not compliant with the procedures and properties described above. For example, an external service provider may require particular information to be encoded in an identifier (e.g., sex, age, date of acquisition).
If this is the case, an additional identifier must be generated for that specific purpose. Its use must be limited in time and it must not be reused for other purposes.
Identifier generation and linkage to the standard Q01-participant identifiers is done using the procedure described for project-specific identifiers above.
[This is a draft under discussion]
The planned RDM infrastructure is a federation of interoperable site infrastructures. The key design principle is that no primary data are aggregated to a central infrastructure.
We aim to establish an infrastructure that is suitable for use within the TRR379, but not limited to this scope. Once deployed, the associated services are usable beyond the scope of TRR379.
The following schema sketches the planned infrastructure. Components that hold (primary) data are depicted in yellow. Components that hold (mostly or exclusively) metadata are shown in blue. The direction of information flow is indicated by arrows, exchange of data by solid arrows, and metadata-only exchange by dotted arrows. Infrastructures that are only accessible to authorized agents are labeled with a “lock” symbol. Further details on individual components are provided below.
graph TB;
  subgraph Central services
    C1[("Collaboration portal<br>hub.trr379.de<br> 🔐")]:::meta
    C2{{"Data search<br>query.trr379.de<br> 🔐"}}:::meta
    C3(Main website<br>www.trr379.de):::meta
    C4("Data catalog<br>data.trr379.de"):::meta
  end
  subgraph "Aachen 🔐"
    A1[(hub)]:::data
    A1a[(lab1)]:::data
    A1b[(lab2)]:::data
    A2{{query}}:::meta
  end
  subgraph "Frankfurt 🔐"
    F1[(hub)]:::data
    F2{{query}}:::meta
  end
  subgraph "Heidelberg 🔐"
    H1[(hub)]:::data
    H1a[(ZI-hub)]:::data
    H1b[(lab1)]:::data
    H2{{query}}:::meta
  end
  A1 -.-> C1
  A1a -.-> A1
  A1b <---> A1
  A1 -.-> A2
  A2 <-.-> C2
  F1 -.-> C1
  F2 <-.-> C2
  F1 -.-> F2
  C3 <-.-> C1
  C3 -.-> C2
  H1 -.-> C1
  H1a <-.-> C1
  H1a <-.-> H1
  H1b <---> H1
  H1 -.-> H2
  H2 <-.-> C2
  C1 -.-> C2
  C1 -.-> C4
  C3 -.-> C4
  H1b <---> A1a
  %% node links to actual services
  click C1 href "https://hub.trr379.de"
  click C3 href "https://www.trr379.de"
  %% classes to distinguish data and metadata nodes
  classDef data fill:#ffa200,color:#000
  classDef meta fill:#5C99C8,color:#000
  %% invisible link purely for manipulating the grouping
  C2 ~~~ A1b
  C4 ~~~ A1b
  C2 ~~~ F1
  C2 ~~~ H1b
All central services are metadata-focused. No primary data acquired at participating sites are aggregated into central databases/storage.
This is the main hub for collecting actionable links to all TRR379 resources and information. The software solution for this hub is Forgejo-aneksajo. It is a free and open-source software package, and the direct service counterpart of DataLad, the main tool proposed for implementing reproducible research workflows in TRR379 labs.
The hub will store DataLad datasets, referencing all TRR379 data resources without hosting any actual data. Instead, the DataLad datasets point to the individual institutional data stores, or to community data repositories when and where data have been published.
The hub is also a place to deposit (shared) computational environments, and implementations of (shared) data processing pipelines, software publications, and source code repositories under a uniform TRR379 umbrella.
A test site is deployed at https://hub.trr379.de and is being evaluated.
See this page for a description of the main website. Importantly, the website renders essential metadata for the TRR379 (contributors, roles, publications, projects, research topics, etc.). It provides a unique URI for any such entity, to be used as an identifier in all TRR379 (meta)data resources.
The website is programmatically generated from a repository hosted on the TRR379 hub, to enable contributions by all TRR379 members.
The TRR379 data catalog is a website dedicated to providing a uniform (read-only) view on the TRR’s data resources. It is rendered programmatically by DataLad Catalog from metadata on TRR379 data resources hosted on the TRR379 hub.
This site is indexed by specialized search engines like Google’s dataset search and is a key enabler for general findability of TRR379 resources.
This is a federated data discovery service that is tailored to the cohort dataset acquired by TRR379 as a whole. It will enable the discovery of individual data records matching a given set of criteria, regardless of the contributing TRR379 site.
The service is federated. Each site runs its own instance and has sole authority over which metadata are shared with other TRR379 sites. Only these metadata properties will be accessible by TRR379 at large, while more detailed metadata records can be used for in-house queries.
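The idea that each site exposes only an approved subset of metadata properties can be sketched as an allow-list filter applied before a consortium-wide query. This is a conceptual illustration only. It does not use or mirror the NeuroBagel API, and all property names, records, and function names are made up.

```python
def shareable_view(record: dict, shared_properties: set[str]) -> dict:
    """Keep only the properties a site has agreed to share."""
    return {key: value for key, value in record.items() if key in shared_properties}


def federated_query(site_records: dict, allow_lists: dict, criteria: dict) -> list[dict]:
    """Match criteria against the shareable view of every site's records."""
    matches = []
    for site, records in site_records.items():
        for record in records:
            view = shareable_view(record, allow_lists[site])
            # Properties a site does not share can never match a criterion.
            if all(view.get(key) == value for key, value in criteria.items()):
                matches.append(view)
    return matches


# Fabricated example data; each site holds richer records than it shares.
site_records = {
    "aachen": [{"site": "aachen", "modality": "fmri", "age": 34}],
    "frankfurt": [{"site": "frankfurt", "modality": "eeg", "age": 51}],
}
allow_lists = {
    "aachen": {"site", "modality"},
    "frankfurt": {"site", "modality"},
}

print(federated_query(site_records, allow_lists, {"modality": "fmri"}))
```

In this sketch the more detailed properties (here `age`) remain usable for in-house queries at the site that holds them, but are invisible to consortium-wide queries.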
The proposed solution for the query service is a version of NeuroBagel adapted to the data nature and needs of TRR379.
Sites are free to implement any RDM solutions, as long as that infrastructure provides
with the aim to enable reproducible research from primary data to published results within and across TRR379.
Q02 supports sites with software solutions that facilitate interoperability within TRR379. This includes the local deployment of the software systems used to run the central services.
We aim for individual sites to run their own data hubs, using the same software solution as the central https://hub.trr379.de. In contrast to the central hub, these institutional sites can directly use the storage features of Forgejo-aneksajo, and host arbitrary amounts of data. TRR379 can communicate data availability using a federation protocol.
Mostly referred to as SOP.