1. Set of documents
This document is one of the Decentralized Digital Heritage Network specifications produced by the ErfgoedPod project by the Dutch Digital Heritage Network (NDE), meemoo - Flemish Institute for Archives and Ghent University - IDLab:
Decentralized Digital Heritage Network architecture
Use cases & Business processes (this document)
Common infrastructure in Cultural Heritage Institutions
This project also contributes to the following companion specifications of the ResearcherPod project:
Artefact Lifecycle Event Log
The NDE-Usable programme of the Dutch Digital Heritage Network (NDE) aims to redesign the way cultural heritage institutions and applications (e.g. service portals that allow users to search for specific heritage data) exchange data. The current landscape shows a cascade of tightly coupled data aggregators, which has proven inflexible and leads to brittle integrations. As a result, both the data-providing institutions and the aggregators face high maintenance costs, funds that would be better spent elsewhere. Moreover, it leads to data duplication, with a subsequent loss of control for the cultural heritage institutions.
NDE-Usable envisions a discovery infrastructure that guides applications to the relevant original data sources, rather than aggregating and republishing their data. The design integrates applications with institutions' collection data in a more loosely coupled manner, enabling institutions to redirect their service integrations to other vendors more easily. To this end, NDE-Usable provides implementation guidelines, interoperability requirements and essential discovery services, i.e. a register for dataset descriptions and a search index for shared terminology sources (Network of Terms).
The principles of Decentralized Web and Solid resonate strongly with NDE-Usable’s network design. Therefore, the ErfgoedPod project explores what a possible implementation of the NDE-Usable vision could look like when applying the Solid ecosystem. The project outputs a possible design of a decentralized exchange network for digital heritage data. ErfgoedPod shares the goals of the current NDE-Usable infrastructure, but is considered a research effort, and is therefore developed independently with a lower technology readiness level. In essence, the project tests whether the principles of a decentralized social network - actors announcing, sharing & following information - are a sustainable basis for exchanging digital heritage data.
The main project outcomes are protocols and components for creating generic decentralized exchange networks for various types of metadata pertaining to artefacts, which is joint work with the ResearcherPod project. Additionally, ErfgoedPod provides a small background study, an architectural design and a description of relevant use cases to apply these generic protocols and components in the digital heritage domain. This document captures the use cases that originate from the NDE-Usable programme and the high-level design of a decentralized network for digital heritage artefacts. To that end, it identifies a list of representative roles and services in the Dutch Digital Heritage Network and the business processes that they should be able to execute. These business processes then serve as a basis for a proof-of-concept implementation of such a network, as further outlined in the architecture specification.
3. Context Dutch Digital Heritage Network
The NDE-Usable programme attempts to design an intermediary network infrastructure that facilitates:
End-user Web portals to discover and consume data about cultural heritage objects; and
Cultural heritage institutions to advertise and provide data about cultural heritage collections and objects.
NDE-Usable deliberately avoids the unsustainable practice of aggregating data by designing a network of intermediary services that help Web portals navigate to relevant datasets that institutions host and publish themselves. By basing such a design on Solid, the ErfgoedPod project organizes communication between the institutions, the Web portals and the mediating services using standardized protocols and in a loosely coupled fashion. This greatly increases data mobility: an integration with one service instance can be replaced with another at lower cost.
3.1. Design of a Decentralized Digital Heritage Network
ErfgoedPod provides generic designs of a network architecture and software components, described in the architecture specification. This design can be used to enable different types of actors to interact about an artefact. Actor types include the Web portals and the Cultural heritage institutions, but also value-adding services that populate the intermediary network space, such as repositories for long-term preservation or data registries. They can take up one or more of the following roles:
Digital Heritage Pod: storing and providing data in the network, typically the Cultural heritage institutions;
Digital Heritage Service Hub: consuming data from Data Pods to perform a service on top of that data, typically the value-adding service providers;
Collector: discovering and collecting data from the network for further processing or to present it to end-users, typically the Web portals, but in some cases the service providers.
Each role can have multiple instances, possibly partitioned by region or domain, creating a selection of Service Hubs to interact with. Unlike for the metadata providers, it is up to each Service Hub instance how its service is implemented. However, if a Service Hub provides data to other actors, it has to adopt a Pod. Hence, it is not uncommon for an actor to combine several roles, especially when the service includes re-sharing data in some other shape or form. For instance, a summarization service provider (Digital Heritage Service Hub) can discover and collect data (Collector) and store the summaries for redistribution (Digital Heritage Pod).
On the network, the different actors interact with each other by sending notifications, which can be grouped according to the roles that are involved.
Pod - Pod: one institution stores an artefact (i.e. a digital heritage object or collection) and draws the attention of another institution to that artefact. Examples are two institutions sharing metadata because of a loan, or mutually enriching their collections with each other's metadata.
Data Pod - Service Hub: one institution requests a service from a service provider. Examples include institutions wanting to register their dataset in a dataset registry or institutions starting an ingest process in a digital archive.
Service Hub - Service Hub: one service provider involves another service provider in order to complete a service. Examples include a dataset summarization service depending on a dataset registry service.
Collector - Pod: an end-user portal collects collection data from different institutions. Examples include a thematic website about a certain topic that collects cross-institutional data about that topic.
Collector - Service Hub: an end-user portal depends on a service provider to collect data. Examples include the discovery of institutional data pods or of enrichments on that data.
An illustration of the results is given in the diagram below.
The Pods and Service Hubs contain the same minimal set of components:
an Inbox resource to receive Linked Data Notifications (Architecture Decentralized Digital Heritage Network §pod);
a Rulebook to participate in the network (Architecture Decentralized Digital Heritage Network §participate);
a Query index and an Artefact Lifecycle Event Log to allow their information to be collected by others (Architecture Decentralized Digital Heritage Network §collection-information-from-a-decentralized-digital-heritage-network).
The complete setup can be summarized in a component blueprint, which can be used to implement any component in the Decentralized Digital Heritage Network. An overview is illustrated in the figure below.
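As a rough illustration of this minimal component set, the sketch below models it as a small data structure. The class name `ComponentBlueprint`, its field names and the example URLs are hypothetical; they are not defined by the architecture specification.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ComponentBlueprint:
    """Hypothetical model of the minimal component set of a Pod or Service Hub."""

    inbox_url: str        # Inbox resource that receives Linked Data Notifications
    event_log_url: str    # Artefact Lifecycle Event Log exposed to collectors
    query_index_url: str  # Query index exposed to collectors
    rulebook: List[str] = field(default_factory=list)  # rules for participating
    roles: Set[str] = field(default_factory=set)       # roles this actor combines

    def can_redistribute(self) -> bool:
        # An actor that provides data to other actors has to adopt a Pod.
        return "Digital Heritage Pod" in self.roles


# Example: a summarization service that combines all three roles (assumed URLs).
summarizer = ComponentBlueprint(
    inbox_url="https://summarizer.example/inbox/",
    event_log_url="https://summarizer.example/eventlog",
    query_index_url="https://summarizer.example/index",
    roles={"Digital Heritage Service Hub", "Collector", "Digital Heritage Pod"},
)
```

Keeping the roles as a set reflects that a single actor can combine several roles, as in the summarization example above.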
The remainder of this section provides more detail on how the current cultural heritage landscape would populate such a high-level design. In the tables below, we aim to establish a shared understanding of the different § 3.2 Actors in the network (who is participating?), § 3.4 Roles in the network (what is their function?), and § 3.3 Artefacts in the network (what are they exchanging?) throughout this project and related initiatives. The terminology originates from the high level design and the datasets requirements and is aligned with the Dutch architecture standard DERA and the Flemish vocabulary standard OSLO.
3.2. Actors in the network
The Digital Heritage Network (NDE) brings together a large number of active parties in one interoperable network. In Decentralized Digital Heritage Network architecture, these parties are defined as Actors. An actor represents a human or organizational stakeholder that needs to interact with the digital heritage data in the network. The list of possible actor types is given below.
|DERA (NL) Actoren
|OSLO (VL) Cultureel Erfgoed Event
|Cultural Heritage Institution
|Organisation that manages digital heritage information and wants to share this information over the network.
|Actor that selects, enriches or transforms cultural heritage information to provide certain services.
|Anyone who wants to use cultural heritage information.
3.3. Artefacts in the network
On the Digital Heritage Network, actors can exchange information pertaining to certain data objects related to digital heritage. These data objects are denoted by the Decentralized Digital Heritage Network architecture as artefacts. An artefact is the object of an interaction between actors. Hence, within the scope of this project, actors exchange messages about an artefact, not the artefact itself.
The list of relevant digital heritage artefacts is given below. We distinguish between digital heritage artefacts (artefacts with direct cultural heritage value) and "metadata" artefacts (artefacts that are about other artefacts).
|DERA (NL) Bedrijfsobjecten
|OSLO (VL) Cultureel Erfgoed Object
|Digital heritage artefacts
|Generic representation of an entity.
|Cultural Heritage Object
|Object with cultural heritage value.
|Self-contained information related to a Cultural Heritage Object.
|Information that is not explicitly present in the original Cultural Heritage Object.
|Collection of Cultural Heritage Objects.
|Word or phrase to denote a concept.
|Source of controlled sets of terms.
|A machine-actionable profile of the organization with basic details (e.g. identifier, name).
Cultural Heritage Object
|Descriptive document about a Cultural Heritage Object.
|Metadata Cultuurhistorisch object
|Descriptive document about an enrichment.
|Descriptive document about a Dataset.
|Metadata Term Source
|Descriptive document about a Term Source.
3.4. Roles in the network
Actors can take up different roles in the Digital Heritage Network depending on whether they provide data about artefacts, consume data about artefacts or provide additional services that in some way affect the artefact's lifecycle. The list of possible roles is given below.
|DERA (NL) Rollen
|OSLO (VL) Cultureel Erfgoed Object
|Person that manages the collections and datasets.
|Term Source Maintainer
|Person that manages and curates the Terms and Term Sources.
|Cultural Heritage Object Maintainer
|Person that manages the collections and datasets.
|Bronhouder metadata cultuurhistorisch object
|Person that manages an enrichment service (provider).
|Bronhouder metadata verrijkingen
|Entity that queries the network for cultural heritage information.
|Entity that provides a service in the network.
|Service Portal Provider
|Entity that provides a Service Portal to end users that selects and displays cultural heritage information from the network.
|Catalog for metadata about Datasets with cultural heritage information. Can make topic-based selections of datasets and runs enrichments services to improve these selections.
|Network of Terms
|Catalog for metadata on Term Sources. Can make selections of Term Sources based on topic.
|Catalog for finding relations between terms and objects.
|Entity that crawls the network in a targeted fashion in order to collect data that matches a query. It is used to construct Knowledge Graphs.
4. Pilot Use Cases
This section documents a selection of use cases from three Dutch Digital Heritage Network members: Van Gogh Worldwide, Brabants Erfgoed and Oorlogsbronnen. They help construct a reference set of business processes that are common in a network of digital heritage stakeholders. The result is described in § 5 Business processes.
Because the ErfgoedPod project attempts to make digital heritage more usable, the selected use cases deliberately take the perspective of the end-user applications or Service Portals. Therefore, the Service Portal Providers were consulted with the following three questions:
How do they discover datasets?
How do they decide which of these datasets are relevant for their portal?
How can the Dutch Digital Heritage Network design, in particular the Dataset Registry, Knowledge Graph, and Network of Terms actor roles, help this process?
|How can NDE support?
|Van Gogh Worldwide
|Manual effort by the Service Portal Provider based on a fixed list of cultural heritage institutions.
|Automating the discovery and delivery of metadata on relevant artworks. This includes enabling portals to select data using their relevancy criteria.
|Automatically via a regional aggregation platform Brabant Cloud, but without a discovery process.
Criteria are defined by the Brabant Cloud platform:
|Enabling the portal to find and select data using their relevancy criteria.
| A more dynamic identification of sources that make (substantial) references to WWII (or similar) terms.
In a later stage: performing a full-text analysis of datasets to identify terms from the WWII thesaurus. Stimulating the detailed description of datasets. Alerting the portals when metadata about a certain time period is discovered.
5. Business processes
This section describes the different business processes that network actors in a specific role should be able to execute in a Decentralized Digital Heritage Network. To that end, it covers processes from the perspective of the following actor-role combinations:
the Cultural Heritage Institutions and Term Source Providers that act as maintainers. A single organization can be both actors, but this is not always the case;
organizations that take up the Dataset Registry Function;
organizations that act as a Network of Terms;
organizations that act as a Knowledge Graph.
Each business process (BP1 - 10) is introduced with a high-level description, followed by a more detailed description of its implementation.
5.1. Perspective of the Cultural Heritage Institution
This section takes an internal perspective on the Cultural Heritage Institution. This includes business processes that only affect actors inside a single institution and are often of a preparatory or operational nature, such as setting up a software component.
5.1.1. (BP1) Initialize a Digital Heritage Pod from a Solid data pod
A cultural heritage institution wants to participate in the (decentralized) Dutch Digital Heritage Network. Therefore, a Solid data pod is required to serve as the main exchange hub for metadata about cultural heritage objects.
A maintainer, who is employed at the institution, creates a data pod using an existing service or by hosting a Solid server locally.
Next, the maintainer prepares this data pod by creating two resources:
The Solid data pod is now a compliant Digital Heritage Pod and the cultural heritage institution can participate in the network.
5.1.2. (BP2) Enabling automation of business processes by initializing an orchestration service
A cultural heritage institution requires interaction with other network actors to execute its business processes. Rather than manually executing the consecutive steps of such a process, an institution's maintainer can set up an Orchestrator component to automate such tasks.
A maintainer, who is employed at the institution, creates an orchestrator instance using an existing service or by running it locally as a background process.
The maintainer supplies the orchestrator instance with the location of its inbox resource, so the orchestrator is able to read notifications that might trigger a business process.
Next, the orchestrator requires a machine-readable version of all the business processes that the cultural heritage institution wants to execute on the network.
Once the orchestrator has acknowledged a successful reception of the inbox and the business processes, the initialization is complete.
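The initialization steps above can be sketched as a small state check. This is a minimal sketch under stated assumptions: the `Orchestrator` class, its method names, the inbox URL and the representation of a business process as a named list of steps are all hypothetical and only illustrate the order of the steps.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Orchestrator:
    """Hypothetical orchestrator that must be configured before it can run."""

    inbox_url: Optional[str] = None
    processes: Dict[str, List[str]] = field(default_factory=dict)

    def set_inbox(self, inbox_url: str) -> bool:
        """Record where notifications are read from; returns an acknowledgement."""
        self.inbox_url = inbox_url
        return True

    def load_processes(self, processes: Dict[str, List[str]]) -> bool:
        """Accept machine-readable business processes (here: named step lists)."""
        if not processes:
            return False
        self.processes = dict(processes)
        return True

    def is_initialized(self) -> bool:
        """Initialization is complete once inbox and processes are both known."""
        return self.inbox_url is not None and bool(self.processes)


orchestrator = Orchestrator()
inbox_ok = orchestrator.set_inbox("https://institution.example/inbox/")
processes_ok = orchestrator.load_processes(
    {"register-dataset": ["notify registry", "await confirmation"]}
)
```

In this sketch the maintainer would treat the initialization as finished only when both calls return an acknowledgement and `is_initialized()` holds.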
5.1.3. (BP3) Notifying institutions about enrichments
Two cultural heritage institutions can also improve the interconnectedness of their collections in a reciprocal discovery process. This facilitates the creation of links, backlinks or augmented metadata between cultural heritage objects.
A cultural heritage institution (A) adds information (e.g. a link) about a cultural heritage object to an existing dataset.
Cultural heritage institution (A) notifies a target cultural heritage institution (B) that it has information about possible cultural heritage objects in their collection.
Cultural heritage institution (B) then adds this information or backlinks to the cultural heritage objects of cultural heritage institution (A).
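To make the notification step more concrete, the sketch below builds a possible notification body that institution (A) could send to institution (B). It assumes an ActivityStreams 2.0 `Announce` activity as the payload; the exact vocabulary is not fixed by this document, and the function name and all URLs are hypothetical.

```python
import json


def build_enrichment_notification(sender: str, recipient: str,
                                  enrichment_url: str, about_object: str) -> dict:
    """Build a JSON-LD notification announcing an enrichment (assumed shape)."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": sender,       # institution (A), which holds the enrichment
        "target": recipient,   # institution (B), which is being notified
        "object": {
            "id": enrichment_url,     # where the added information can be found
            "type": "Document",
            "context": about_object,  # the object in B's collection it refers to
        },
    }


notification = build_enrichment_notification(
    sender="https://institution-a.example/profile#me",
    recipient="https://institution-b.example/profile#me",
    enrichment_url="https://institution-a.example/enrichments/42",
    about_object="https://institution-b.example/objects/sunflowers",
)
body = json.dumps(notification, indent=2)
```

Under Linked Data Notifications, such a body would be delivered with an HTTP POST to the inbox of institution (B), using the content type `application/ld+json`. The notification only links to the enrichment; institution (B) decides whether to add the information or a backlink.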
5.2. Perspective from the Cultural Heritage Institution interacting with the Dataset Registry
This section takes the perspective of a Cultural Heritage Institution interacting with a Dataset Registry Service, which implements the design of the Dataset Registry-function using the Solid and ResearcherPod architecture. This entails business processes that are shared between both parties, where the institution requests a service (i.e. adding, updating or removing dataset summaries from the dataset registry) and the Dataset Registry responds to that request. The response includes the possible outcome of the performed service.
5.2.1. (BP3) A Cultural Heritage Institution registers itself to a Dataset Registry
A dataset registry service collects descriptions of datasets (metadata dataset) from cultural heritage institutions. Based on this information, the dataset registry service can navigate a consumer to the digital heritage pods that store relevant datasets. However, in order to achieve this, a cultural heritage institution needs to be known and trusted by the dataset registry service.
A Cultural Heritage Institution maintains a machine-readable organisation profile that contains a basic description of the institution.
A maintainer, who is employed at the institution, submits a registration request to the dataset registry service containing a link to the profile.
When processing the request, the dataset registry downloads the organisation profile and verifies the institution's eligibility for the service.
If the registration of the cultural heritage institution is accepted, the dataset registry service adds the institution to the list of registered dataset maintainers and informs the institution of a successful registration.
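The registration flow can be sketched as follows. This is an illustration only: the `DatasetRegistry` class, the stubbed profile download and the eligibility rule (a profile must carry an identifier and a name) are assumptions, not requirements stated by the Dataset Registry design.

```python
from typing import Callable, Dict, Set


class DatasetRegistry:
    """Hypothetical in-memory registry of trusted dataset maintainers."""

    def __init__(self, fetch_profile: Callable[[str], Dict[str, str]]):
        self._fetch_profile = fetch_profile  # stands in for an HTTP download
        self.registered: Set[str] = set()

    def register(self, profile_url: str) -> str:
        """Process a registration request and return the reply to the institution."""
        profile = self._fetch_profile(profile_url)
        # Assumed eligibility rule: the profile has an identifier and a name.
        if not profile.get("identifier") or not profile.get("name"):
            return "rejected"
        self.registered.add(profile["identifier"])
        return "accepted"


# Stub profiles used instead of real network access (hypothetical URLs).
PROFILES = {
    "https://museum.example/profile": {
        "identifier": "https://museum.example/profile#org",
        "name": "Example Museum",
    },
    "https://unknown.example/profile": {},
}
registry = DatasetRegistry(fetch_profile=lambda url: PROFILES.get(url, {}))
result = registry.register("https://museum.example/profile")
```

Passing the download function in as a parameter keeps the sketch runnable without a network; a real service would fetch the machine-readable organisation profile from the institution's pod.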
5.2.2. (BP4) A Cultural Heritage Institution registers a new Dataset with the Dataset Registry
Once a cultural heritage institution is registered with a dataset registry service, it can add dataset descriptions or metadata dataset to this service.
A maintainer, who is employed at the institution, notifies the dataset registry service that a new dataset distribution is available. The maintainer supplies a link to the metadata dataset.
The dataset registry service checks whether the cultural heritage institution is registered. If so, it downloads the metadata dataset and adds it to its index to enable search.
When the indexing is done, the dataset registry service informs the cultural heritage institution that the dataset is part of the dataset registry service.
5.2.3. (BP6) A Cultural Heritage Institution updates a registered Dataset in the Dataset Registry
A cultural heritage institution can release a new version of a dataset that is already registered with a dataset registry service. In that case, it needs to reflect the update in the dataset registry service.
A maintainer, who is employed at the institution, notifies the dataset registry service that a new dataset distribution of a registered dataset is available. The maintainer supplies a link to the updated metadata dataset.
The dataset registry service checks whether the cultural heritage institution is registered. If so, it downloads the metadata dataset and replaces the old version in the search index.
When the indexing is done, the dataset registry service informs the cultural heritage institution that the dataset has been updated in the dataset registry service.
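Registering a new dataset (§ 5.2.2) and updating a registered one differ mainly in whether an index entry already exists. The sketch below illustrates this with an index keyed by dataset identifier. The `DatasetIndex` class, the reply strings and the `id` and `version` fields are hypothetical, and the download of the metadata dataset is assumed to have happened before the call.

```python
from typing import Dict, Set


class DatasetIndex:
    """Hypothetical search index keyed by dataset identifier."""

    def __init__(self, registered_institutions: Set[str]):
        self.registered = registered_institutions
        self.entries: Dict[str, Dict[str, str]] = {}

    def handle_notification(self, institution: str,
                            description: Dict[str, str]) -> str:
        """Process a 'dataset available' notification and return the reply."""
        if institution not in self.registered:
            return "rejected"
        dataset_id = description["id"]
        reply = "updated" if dataset_id in self.entries else "added"
        self.entries[dataset_id] = description  # an update replaces the old version
        return reply


index = DatasetIndex(registered_institutions={"https://museum.example/profile#org"})
first = index.handle_notification(
    "https://museum.example/profile#org",
    {"id": "https://museum.example/datasets/paintings", "version": "1"},
)
second = index.handle_notification(
    "https://museum.example/profile#org",
    {"id": "https://museum.example/datasets/paintings", "version": "2"},
)
```

Keying the index on the dataset identifier means an update never leaves two versions of the same dataset searchable at once.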
5.3. Perspective Knowledge Graph / Network Of Terms - Dataset Registry
This section takes the perspective of the interaction between the Dataset Registry Service and the applications Knowledge Graph and Network of Terms. This entails business processes where both services notify each other about important events. Hence, these processes are often initiated by the completion of another business process (e.g. one between the institution and the dataset registry).
5.3.1. (BP7) A Knowledge Graph or Network of Terms subscribes to a topic
The Knowledge Graph and Network of Terms construct topic-oriented sets of metadata by collecting relevant metadata from the network. In case of the Knowledge Graph, these sets are about cultural heritage objects, and in case of the Network of Terms, these sets are about term sources.
Both applications need to discover datasets from cultural heritage institutions that have data relevant to their topic. Therefore, they rely on the dataset registry service to help them narrow the search space based on the metadata dataset and point them to the dataset locations.
A Knowledge Graph or Network of Terms sends a subscription request on a certain topic to the dataset registry service.
The dataset registry service adds the sender to the index entry of that topic and informs them of the subscription.
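A subscription index of this kind could be as simple as a mapping from topic to subscriber inboxes. The sketch below is one possible shape; the `SubscriptionIndex` class, the confirmation string and the inbox URL are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class SubscriptionIndex:
    """Hypothetical topic-to-subscriber index kept by the dataset registry."""

    def __init__(self) -> None:
        self._by_topic: DefaultDict[str, Set[str]] = defaultdict(set)

    def subscribe(self, topic: str, subscriber_inbox: str) -> str:
        """Add the sender to the topic entry and return a confirmation."""
        self._by_topic[topic].add(subscriber_inbox)
        return f"subscribed to {topic}"

    def subscribers(self, topic: str) -> Set[str]:
        """Return a copy of the inboxes subscribed to a topic."""
        return set(self._by_topic.get(topic, set()))


subscriptions = SubscriptionIndex()
confirmation = subscriptions.subscribe("WWII", "https://kg.example/inbox/")
```

Storing the subscriber's inbox rather than only its name lets the registry address later notifications directly.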
5.3.2. (BP8) A Knowledge Graph discovers a dataset related to a subscribed topic
When the dataset registry service receives a new or updated dataset description (Metadata Dataset) via § 5.2.2 (BP4) A Cultural Heritage Institution registers a new Dataset with the Dataset Registry or § 5.2.3 (BP6) A Cultural Heritage Institution updates a registered Dataset in the Dataset Registry, it identifies the topics that are related to the dataset.
The dataset registry service cross-checks each topic against the subscription index.
If Knowledge Graph applications are subscribed to any of the topics, they are notified that a dataset is relevant to their search. A link to the dataset is included in the notification.
When receiving this notification, the Knowledge Graph extracts the link and downloads the Dataset from the Cultural Heritage Institution's Digital Heritage Pod.
The Knowledge Graph processes the dataset and incorporates it in the topic-oriented set of metadata.
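The cross-check between dataset topics and subscriptions can be sketched as a set lookup. This is a minimal sketch, assuming the topics of a dataset have already been identified; the function name, the topics and the URLs are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def notifications_for_dataset(
    dataset_url: str,
    dataset_topics: Set[str],
    subscriptions: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (subscriber inbox, dataset link) pairs for interested subscribers."""
    recipients: Set[str] = set()
    for topic in dataset_topics:
        recipients |= subscriptions.get(topic, set())
    # A subscriber matching several topics is notified once; sorting keeps
    # the order stable.
    return [(inbox, dataset_url) for inbox in sorted(recipients)]


subscriptions = {
    "WWII": {"https://kg-war.example/inbox/"},
    "Van Gogh": {"https://kg-art.example/inbox/", "https://kg-war.example/inbox/"},
}
outgoing = notifications_for_dataset(
    "https://museum.example/datasets/paintings",
    {"Van Gogh", "WWII"},
    subscriptions,
)
```

Each pair carries the link to the dataset, so the receiving Knowledge Graph can download it from the institution's Digital Heritage Pod without a further lookup.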
5.3.3. (BP9) Discovering a Term source
When the dataset registry service receives a new or updated term source description (Metadata Term Source) via § 5.2.2 (BP4) A Cultural Heritage Institution registers a new Dataset with the Dataset Registry or § 5.2.3 (BP6) A Cultural Heritage Institution updates a registered Dataset in the Dataset Registry, it identifies the topics that are related to the term source.
The dataset registry service cross-checks each topic against the subscription index.
If Network of Terms applications are subscribed to any of the topics, they are notified that a term source is relevant to their search. A link to the term source is included in the notification.
When receiving this notification, the Network of Terms extracts the link and downloads the term source from the Cultural Heritage Institution's Digital Heritage Pod.
The Network of Terms processes the term source and incorporates it in its search index.