1. Set of documents
This document is one of the Decentralized Digital Heritage Network specifications produced by the ErfgoedPod project by Netwerk Digitaal Erfgoed, meemoo - Flemish Institute for Archives and Ghent University - IDLab:
-
Decentralized Digital Heritage Network architecture (this document)
This project also contributes to the following companion specifications of the ResearcherPod project:
2. Introduction
The Decentralized Digital Heritage Network is a protocol and a set of best practices to establish a sustainable exchange network of digital heritage data between cultural heritage institutions and their services. It is an application of the generic ResearcherPod boilerplate architecture for decentralized Web networks based on the [solid-protocol] and [ACTIVITYSTREAMS-VOCABULARY]. This document lays out the high-level concepts and design of the Decentralized Digital Heritage Network.
3. Overview
The decentralized digital heritage network
create diagram when more is known about the components on mellon side
4. Actors
A Decentralized Digital Heritage Network consists of multiple parties that actively participate in the exchange. Such a party is henceforth called a network actor.
All actors operate under the same protocol, but can differ in purpose. Every actor in a decentralized digital heritage network is therefore one of the following types:
- Cultural Heritage Institution
-
an individual or organisation producing and sharing digital heritage data;
- Service provider
-
an organisation consuming and processing digital heritage data to provide a service to other actors in the network;
- Service portal
-
an organisation consuming digital heritage data to provide a service to end-users.
5. Artefacts
A Digital Heritage Artefact is a unit of digital heritage data that is the object of exchange between actors. We distinguish the following types of artefacts:
- Dataset
-
The description of a collection of data as defined in Requirements for Datasets §dataset.
- Dataset description
-
Metadata that publishers should provide about their dataset aligned with the machine-readable publication model described in [requirements-datasets].
- Actor profile
-
A description of an actor that is a member of the network. Often, this is about the organisation acting as cultural heritage institution, service provider, or service portal. It is used as part of a registration or identification process.
All artefacts have a lifecycle that consists of a sequence of lifecycle events. A Lifecycle Event is a documented activity that reflects a change to the artefact’s presence or positioning on the network. The occurrence of such an event can render the artefact eligible for certain services or exchanges. For instance, a dataset (= artefact) can only be archived by an archiving service once it is registered (= lifecycle event).
We distinguish the following types of lifecycle events:
- create: the artefact is created
- destroy: the artefact is no longer available
- update: a new version of the artefact is available
- store: the artefact was stored in a Digital Heritage Pod
- announce: the artefact was promoted over the network
- register: a registry (service provider) added the artefact to its service
- preserve: an archive (service provider) stored the artefact for long-term preservation
- index: a repository indexed the artefact for search
- enrich: the artefact’s metadata was enriched
- link: a link was added to or from the artefact
complete the list of lifecycle events.
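The relation between lifecycle events and eligibility for a service can be sketched as follows. This is a minimal illustration, not part of the specification: the `REQUIRED_BEFORE` table and the `is_eligible` function are hypothetical names, and only the rule "preserve requires register" is taken from the example in the text above.

```python
from enum import Enum


class LifecycleEvent(Enum):
    """Lifecycle event types listed in this section."""
    CREATE = "create"
    DESTROY = "destroy"
    UPDATE = "update"
    STORE = "store"
    ANNOUNCE = "announce"
    REGISTER = "register"
    PRESERVE = "preserve"
    INDEX = "index"
    ENRICH = "enrich"
    LINK = "link"


# Hypothetical eligibility table: events that must already have occurred.
# Only the "preserve requires register" entry comes from the text.
REQUIRED_BEFORE = {
    LifecycleEvent.PRESERVE: {LifecycleEvent.REGISTER},
}


def is_eligible(history, requested):
    """Return True if every prerequisite of `requested` occurs in `history`."""
    required = REQUIRED_BEFORE.get(requested, set())
    return required.issubset(set(history))


history = [LifecycleEvent.CREATE, LifecycleEvent.STORE]
print(is_eligible(history, LifecycleEvent.PRESERVE))  # False: not yet registered
history.append(LifecycleEvent.REGISTER)
print(is_eligible(history, LifecycleEvent.PRESERVE))  # True
```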
6. Components
6.1. Digital Heritage Pod
The Digital Heritage Pod is a Cultural Heritage Institution's main exchange hub for sharing digital heritage information with external service providers and other cultural heritage institutions. By design, the Digital Heritage Pod is a passive component: it can respond to requests for the digital heritage artefacts it stores, but cannot start an interaction with other actors (see the additional components laid out in § 6.3 Participating in a decentralized Digital Heritage Network for active participation).
The foundation of the Digital Heritage Pod is a Solid Data Pod (see The Solid Protocol §data-pod), consisting of:
-
a datastore to store digital heritage artefacts such as datasets, dataset descriptions, or profiles of organisations;
-
a Web Access Control (WAC) mechanism to manage read-write access to data pod resources;
-
a Linked Data Notification (LDN) inbox to receive Linked Data Notifications [LDN] from other actors;
-
a Linked Data Platform (LDP) subset implementation to expose the datastore contents and the inbox to other actors.
The digital heritage artefacts stored in the datastore originate from two data management systems at the cultural heritage institution:
-
a Digital Asset Management System: a centralized system where cultural heritage institutions (efficiently) store, organize, manage, access and distribute their media assets.
-
a Collections Management System: software used by the cultural heritage institution to organize, control, and manage collections objects by "tracking all information related to and about" those objects.
Finally, the pod exposes an Artefact Lifecycle Event Log: a resource containing an immutable log that records all lifecycle events related to artefacts known to the pod.
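One way to picture an immutable event log is an append-only structure whose entries cannot be changed after they are written. The sketch below is an in-memory illustration only; the class names `LogEntry` and `EventLog` and the chosen fields are assumptions, and the specification does not prescribe this layout.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEntry:
    """One recorded lifecycle event; frozen so an entry cannot be modified."""
    timestamp: datetime
    event: str      # e.g. "create", "register"
    artefact: str   # URL of the artefact
    agent: str      # WebID of the acting organisation


class EventLog:
    """Append-only log: entries can be added and read, never changed or removed."""

    def __init__(self):
        self._entries = []

    def append(self, event, artefact, agent, timestamp=None):
        entry = LogEntry(timestamp or datetime.now(timezone.utc), event, artefact, agent)
        self._entries.append(entry)
        return entry

    def entries(self):
        # Return a tuple so callers cannot mutate the internal list.
        return tuple(self._entries)

    def history(self, artefact):
        """All events recorded for one artefact, in order of insertion."""
        return [e for e in self._entries if e.artefact == artefact]
```

Calling `history()` with the URL of an artefact answers the question "what has happened to this artefact so far?", which is the kind of lifecycle information other actors look for.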
6.2. Digital Heritage Service Hub
A Digital Heritage Service Hub is a service provider's exchange hub to make its services available to other network actors such as cultural heritage institutions or service portals. It consists of some of the same interface components as the Digital Heritage Pod:
-
a Linked Data Notification (LDN) inbox to receive Linked Data Notifications [LDN] from other actors;
-
a Linked Data Platform (LDP) subset implementation to expose the inbox to other actors;
-
an Artefact Lifecycle Event Log resource that publishes and records all lifecycle events related to artefacts processed by the service provider.
In contrast to the Digital Heritage Pod, it is unspecified what other subcomponents a service hub should provide. Processes that store data, provide security or execute the services are considered a black box.
6.3. Participating in a decentralized Digital Heritage Network
To actively participate in the network, actors require a few components that enable them to interact with other actors. These components commonly complement a Digital Heritage Pod or Digital Heritage Service Hub.
Interactions between actors are always about a digital heritage artefact and result in a lifecycle event of that artefact.
-
a Policy: a set of rules that dictates what actions need to be taken when a lifecycle event occurs. These originate from the digital heritage network participation agreement, possibly amended with procedures imposed by the institution, the discipline, or personal preference.
-
an Orchestrator: an autonomous agent that watches the inbox of the Digital Heritage Pod or Digital Heritage Service Hub, and can interpret and execute the policy.
-
a Dashboard: a user interface for users to interact with the contents of the digital heritage pod.
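How an orchestrator might watch an inbox and apply a policy can be sketched as below. This is a simplified illustration under stated assumptions: the inbox is a plain list of dictionaries rather than an LDN inbox fetched over HTTP, the policy is a hypothetical mapping from activity type to action names, and the actions are only planned, not executed.

```python
class Orchestrator:
    """Watches an inbox and applies a policy to each new notification."""

    def __init__(self, policy):
        # policy: maps an activity type (e.g. "Create") to a list of action names.
        self.policy = policy
        self.seen = set()  # ids of notifications already handled

    def process(self, inbox):
        """Handle every notification not seen before; return the planned actions."""
        planned = []
        for notification in inbox:
            nid = notification["id"]
            if nid in self.seen:
                continue  # already handled on an earlier pass
            self.seen.add(nid)
            for action in self.policy.get(notification["type"], []):
                planned.append((action, notification["object"]))
        return planned


orchestrator = Orchestrator({"Create": ["announce"]})
inbox = [{"id": "n1", "type": "Create", "object": "http://pod.example/dataset.ttl"}]
print(orchestrator.process(inbox))  # [('announce', 'http://pod.example/dataset.ttl')]
print(orchestrator.process(inbox))  # []: the notification was already handled
```

Tracking the identifiers of handled notifications keeps repeated passes over the same inbox from triggering the same action twice.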
6.4. Collecting information from a decentralized Digital Heritage Network
Actors retrieve two types of information from the network:
-
lifecycle information: given an object, what actions has it been involved in? E.g. added to register X, indexed by Y, or archived by Z.
-
descriptive information: given a description, what objects match it? E.g. which Pods have works by Peter Paul Rubens?
- Query index
-
An index that allows for fine-grained search into the contents of the files stored in the Data Pod.
- Collector
-
An agent that queries or crawls the decentralized network for the distributed information needed to answer a given query.
- Filters
-
Description of the information that needs to be collected.
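The interplay between a collector and its filters can be sketched as follows. The example replaces the network with an in-memory dictionary, and the pod URLs, record fields and function names are all hypothetical; a real collector would retrieve records over HTTP or through a query index.

```python
def matches(record, filters):
    """A record matches when it has every key/value pair named in the filters."""
    return all(record.get(key) == value for key, value in filters.items())


def collect(pods, filters):
    """Visit each pod and gather (pod URL, record) pairs that satisfy the filters.

    `pods` stands in for the network: a dict of pod URL -> list of records.
    """
    results = []
    for pod_url, records in pods.items():
        for record in records:
            if matches(record, filters):
                results.append((pod_url, record))
    return results


# Illustrative data only.
pods = {
    "https://pod-a.example/": [
        {"title": "Work 1", "creator": "Peter Paul Rubens"},
        {"title": "Work 2", "creator": "Another Artist"},
    ],
    "https://pod-b.example/": [
        {"title": "Work 3", "creator": "Peter Paul Rubens"},
    ],
}
hits = collect(pods, {"creator": "Peter Paul Rubens"})
print(sorted({pod for pod, _ in hits}))
# ['https://pod-a.example/', 'https://pod-b.example/']
```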
7. Technical aspects
7.1. WebID
WebID is a simple universal identification mechanism for the Web and a core aspect of Solid. ErfgoedPod uses it to identify the organisations acting in the network (e.g. a Cultural Heritage Institution or a Registry).
Example: http://kb.nl#me
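A WebID with a fragment identifier, such as the one in the example, refers to the organisation itself, while the URL without the fragment refers to the profile document that describes it. A minimal sketch of that split, using only the Python standard library (the helper name `split_webid` is an assumption):

```python
from urllib.parse import urldefrag


def split_webid(webid):
    """Split a WebID into its profile document URL and its fragment."""
    document, fragment = urldefrag(webid)
    return document, fragment


print(split_webid("http://kb.nl#me"))  # ('http://kb.nl', 'me')
```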
7.2. Linked Data Notifications (LDN)
Linked Data Notifications is the communication protocol between two actors in the network. It defines an inbox that receives notifications [LDN]. An inbox can be discovered through a Link header when requesting a resource, such as the WebID.
Example:
POST /inbox HTTP/1.1
Host: registry.nde.nl
Content-Type: application/ld+json;profile="https://www.w3.org/ns/activitystreams"
Content-Language: en

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "summary": "KB created dataset.ttl",
  "type": "Create",
  "actor": "http://kb.nl#me",
  "object": "created pod.kb.nl/dataset.ttl"
}
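Inbox discovery and notification construction can be sketched as below. The Link relation `http://www.w3.org/ns/ldp#inbox` is the one defined by [LDN]; the function names, the simplified Link header parsing and the example URLs are assumptions. The sketch performs no network requests and does not cover every valid Link header syntax.

```python
import json
import re

LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"


def find_inbox(link_header):
    """Return the inbox URL advertised in an HTTP Link header, or None."""
    # Each link looks like: <url>; rel="relation"
    for url, params in re.findall(r'<([^>]*)>\s*((?:;[^,]*)*)', link_header):
        rels = re.search(r'rel="?([^";]*)"?', params)
        if rels and LDP_INBOX in rels.group(1).split():
            return url
    return None


def create_notification(actor, obj, summary):
    """Build an Activity Streams 2.0 'Create' notification body."""
    return json.dumps({
        "@context": "https://www.w3.org/ns/activitystreams",
        "summary": summary,
        "type": "Create",
        "actor": actor,
        "object": obj,
    })


header = '<https://registry.example/inbox>; rel="http://www.w3.org/ns/ldp#inbox"'
print(find_inbox(header))  # https://registry.example/inbox
```

The returned string from `create_notification` would be sent as the body of a POST request to the discovered inbox, with the `application/ld+json` content type shown in the example above.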
7.3. Eventlog
The eventlog is a mandatory log stored in each Pod or Service Hub (e.g. a Registry) that participates in the network. It stores the lifecycle events of datasets and other artefacts.
Example:
pod.kb.nl/eventlog
@prefix lode: <http://linkedevents.org/ontology/>.
@prefix time: <http://www.w3.org/2006/time#>.
# The dc and xsd prefixes were missing; the dc namespace is assumed.
@prefix dc: <http://purl.org/dc/elements/1.1/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.

_:1 a lode:Event;
    lode:atTime [
        a time:Instant;
        time:inXSDDateTimeStamp "2020-04-12T10:30:00+10:00"^^xsd:dateTimeStamp
    ];
    lode:involvedAgent <http://kb.nl#me>;
    dc:description "Created pod.kb.nl/dataset.ttl".
...
[2020-04-12T10:30:00+10:00] Created pod.kb.nl/dataset.ttl
[2020-04-12T11:30:00+10:00] Created pod.kb.nl/dataset-desc.ttl
[2020-04-12T12:30:00+10:00] Requested registration: pod.kb.nl/dataset.ttl with registry.nde.nl
[2020-04-12T13:30:00+10:00] registry.nde.nl registered pod.kb.nl/dataset.ttl
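The bracketed log lines in the example can be turned into structured records with a few lines of code. The sketch below assumes the `[timestamp] message` layout shown above, which is only one possible human-readable rendering of the log; the names `LINE` and `parse_log` are hypothetical.

```python
import re
from datetime import datetime

# One log line: an ISO 8601 timestamp in square brackets, then a message.
LINE = re.compile(r"^\[(?P<time>[^\]]+)\]\s+(?P<message>.+)$")


def parse_log(text):
    """Turn '[timestamp] message' lines into (datetime, message) tuples."""
    events = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            events.append((datetime.fromisoformat(match["time"]), match["message"]))
    return events


sample = """\
[2020-04-12T10:30:00+10:00] Created pod.kb.nl/dataset.ttl
[2020-04-12T13:30:00+10:00] registry.nde.nl registered pod.kb.nl/dataset.ttl
"""
events = parse_log(sample)
print(len(events), events[0][0].isoformat())  # 2 2020-04-12T10:30:00+10:00
```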
7.3.1. Rulebook
A rulebook is a configuration file with machine-readable business rules and is the driver for the Orchestrator component. It dictates what actions the Orchestrator should take when it is notified of an event (typically by an incoming Linked Data Notification).
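What a machine-readable rulebook could look like is sketched below. The JSON layout with `when` and `then` keys, the action names and the function names are invented for illustration; this document does not define a rulebook format.

```python
import json

# Hypothetical rulebook: each rule pairs conditions with actions.
RULEBOOK_JSON = """
{
  "rules": [
    {"when": {"type": "Create"}, "then": ["announce"]},
    {"when": {"type": "Accept", "actor": "http://registry.example#me"},
     "then": ["log-register"]}
  ]
}
"""


def load_rulebook(text):
    """Parse the rulebook and check that every rule has both parts."""
    rules = json.loads(text)["rules"]
    for rule in rules:
        if "when" not in rule or "then" not in rule:
            raise ValueError("rule needs 'when' and 'then'")
    return rules


def actions_for(rules, notification):
    """Collect the actions of all rules whose conditions all match the notification."""
    actions = []
    for rule in rules:
        if all(notification.get(k) == v for k, v in rule["when"].items()):
            actions.extend(rule["then"])
    return actions


rules = load_rulebook(RULEBOOK_JSON)
print(actions_for(rules, {"type": "Create", "actor": "http://pod.example#me"}))
# ['announce']
```

Keeping the rules in a separate file means an institution can change its procedures without changing the code that evaluates them.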
8. Application to business processes
This section implements each business process from [bp-nde] using the architecture described in this document. The process description, the involvement of the § 6 Components, and other implementation details are noted in the following template:
- Roles
-
The different actor roles that interact with the process.
- Components
-
The components that are involved.
- Goal
-
The final successful outcome that completes the process.
- Stakeholders
-
Anybody or anything with an interest or investment in how the system performs.
- Preconditions
-
The elements that must be true before a use case can occur.
- Triggers
-
The events that cause the use case to begin.
- Postconditions
-
What the system should have completed by the end of the steps.
- Procedure
-
The process and steps taken to reach the end goal, including the necessary functional requirements and their anticipated behaviors.
In addition to this template, an example HTTP exchange is added for each process.
8.1. (BP1) Initialize a Digital Heritage Pod
Roles | |
---|---|
Components | |
Goal | The organization has an operational Digital Heritage Pod. |
Stakeholders | |
Preconditions | A Cultural Heritage Object Maintainer and a reachable running Solid Data Pod are provided by the Cultural Heritage Institution. |
Triggers | A Cultural Heritage Object Maintainer wants to share metadata with Solid networks. |
Postconditions | A Solid Pod has been initialized as Digital Heritage Pod. It is running and is reachable by other actors in the Decentralized Digital Heritage Network. |
System |
|
Example HTTP Sequence
use a more generic organization domain than kb.nl
8.2. (BP2) Register an Orchestrator for a Digital Heritage Pod
Roles | |
---|---|
Components | |
Goal | The organization has an operational Orchestrator connected to its Digital Heritage Pod. |
Stakeholders | |
Preconditions | There is a reachable Digital Heritage Pod within the Cultural Heritage Institution. |
Triggers | A Cultural Heritage Institution wants to share metadata with Solid networks. |
Postconditions | An orchestrator is running and is reachable by the institution. |
System |
|
Example HTTP Sequence
8.3. (BP3) Adding a Cultural Heritage Institution to the Registry
Roles | |
---|---|
Components | |
Goal | The network is aware of the organization: it can be discovered by exploring or querying the Registry. |
Stakeholders |
|
Preconditions |
|
Triggers | A Cultural Heritage Institution wants to join the network |
Postconditions | The Organisation Profile is added to the search query index of the Registry. The Cultural Heritage Institution is aware that it is registered with the Registry. |
System |
|
Example HTTP Sequence
split sequence with and without orchestrator?
8.4. (BP4) Adding a new Dataset to the Registry
is there a new version of the dataset or of the dataset description?
Roles | |
---|---|
Components | |
Goal | The Dataset’s location shows up in search results when querying the Registry with dataset-level metadata. |
Stakeholders |
|
Preconditions |
|
Triggers | A new Dataset is ready for publication |
Postconditions |
|
Procedure |
|
Example HTTP Sequence
8.5. (BP5) Updating a Dataset in the Registry
Roles | |
---|---|
Components |
|
Goal | Reflect the latest changes of a registered Dataset in the query results of the Registry. |
Stakeholders |
|
Preconditions |
|
Triggers | A new version of a Dataset is ready to be released. |
Postconditions |
|
Procedure |
|
Example HTTP Sequence
8.6. (BP6) Adding a Registry to the ACL of a Digital Heritage Pod
Roles | |
---|---|
Components | |
Goal | The Registry can access the necessary data in the Digital Heritage Pod |
Stakeholders | / |
Preconditions | The Registry does not have access to the Digital Heritage Pod of the Cultural Heritage Institution |
Triggers | The Registry instance is unknown to the Digital Heritage Pod |
Postconditions | The Registry is added to the ACL list of the Digital Heritage Pod |
Procedure |
|
Example HTTP Sequence
8.7. (BP7) Constructing backlinks between objects of two institutions
is the orchestrator picking up new links or does it get a notification?
Roles | |
---|---|
Components |
|
Goal | Cultural Heritage Institution B adds links to Objects of Cultural Heritage Institution A after becoming aware of links from Objects of Cultural Heritage Institution A to Objects of Cultural Heritage Institution B. |
Stakeholders | Knowledge Graphs Browser/ Service Portals Users |
Preconditions | A Dataset of Cultural Heritage Institution A known to the Registry contains links to objects of Cultural Heritage Institution B. The links have not been processed by the Registry yet. |
Triggers | The Registry has added or updated a Metadata Dataset in its Query Index |
Postconditions | Cultural Heritage Institution B has enriched its Datasets with links to Objects of Cultural Heritage Institution A |
Procedure |
|
Example HTTP Sequence
8.8. (BP8) Subscribing to a topic
Roles | |
---|---|
Components | |
Goal | A Service Portal Provider or Network of Terms subscribes at a Registry to a topic, defined by a query, so that it receives notifications about related artefacts. |
Stakeholders |
|
Preconditions |
|
Triggers | A Service Portal wants to subscribe to artefacts or a certain topic. |
Postconditions | The Service Portal is subscribed to a dataset topic in the Registry. |
Procedure |
|
Example HTTP Sequence
8.9. (BP9) Discovering a dataset related to a subscribed topic
Roles | |
---|---|
Components | |
Goal | A Service Portal Provider retrieves a discovered Dataset for further processing. |
Stakeholders |
|
Preconditions |
|
Triggers | The Registry has added or updated a Metadata Dataset in its Query Index. |
Postconditions | The Service Portal has retrieved the Dataset. |
Procedure |
|
Example HTTP Sequence
8.10. (BP10) Discovering a Term source
Roles | |
---|---|
Components | |
Goal | A Network of Terms Provider adds a new Term Source to its index. |
Stakeholders |
|
Preconditions |
|
Triggers | The Registry has added or updated a Metadata Term Source in its Query Index. |
Postconditions | The Network of Terms has retrieved the Term Source. |
Procedure |
|
Example HTTP Sequence