Repository Tools 2: DSpace as a data source
Introduction
The DSpace data source allows Elements to harvest publications from your DSpace institutional repository via a Repository Tools 2 (RT2) integration. The RT2 repository data sources do not function in the same way as the publisher/aggregator data sources, and it is important that they are set up and configured correctly.
Overview of crosswalking and configuration
Important notes before you start
Do not enable this data source until you have completed the preparation, crosswalking and configuration processes. We strongly recommend that all of these steps are first performed and tested in a Test environment before they are applied to your production instance of Elements.
Step 1. Prepare DSpace
Ensure your DSpace is a supported version and that the REST API and the OAI-PMH API have been enabled.
DSpace's OAI index must be refreshed regularly to ensure that DSpace metadata changes are reflected in Elements in a timely fashion. Please see the relevant documentation for your DSpace version: 5.x, 6.x, 7.x, 8.x or 9.x. If there are no adverse effects on system performance, we recommend an hourly schedule to match the frequency of Elements' differential harvest.
Note: REST and OAI-PMH endpoints are required to harvest from DSpace to Elements. If you also wish to deposit from Elements to DSpace (i.e. a full RT2 integration) and you use DSpace 5.x or 6.x, you will also need to enable the SWORDv2 API endpoint. This is not necessary for DSpace 7.x users (note that deposit functionality is not supported in Elements 6.8).
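As a quick sanity check that the OAI-PMH endpoint is enabled and that recent metadata changes are visible through it, you can build standard OAI-PMH requests by hand. The sketch below is illustrative only: the base URL is a hypothetical placeholder (the real path depends on your DSpace version and deployment), and it does not represent how Elements itself performs its harvest. It uses only the `Identify` and `ListRecords` verbs and the `oai_dc` metadata prefix defined by the OAI-PMH protocol.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the OAI-PMH URL of your own DSpace.
OAI_BASE = "https://repository.example.edu/oai/request"


def build_oai_url(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    # Skip unset arguments so they do not appear in the query string.
    query.update({k: v for k, v in arguments.items() if v is not None})
    return f"{base_url}?{urlencode(query)}"


def one_hour_before(moment):
    """Format the UTC time one hour before `moment` as an OAI-PMH datestamp."""
    return (moment - timedelta(hours=1)).strftime("%Y-%m-%dT%H:%M:%SZ")


# Identify: confirms the endpoint responds and reports its configuration.
identify_url = build_oai_url(OAI_BASE, "Identify")

# ListRecords with "from": lists records changed in the last hour.
# A fixed time is used here so the example is reproducible.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
changes_url = build_oai_url(
    OAI_BASE,
    "ListRecords",
    metadataPrefix="oai_dc",
    **{"from": one_hour_before(now)},  # "from" is a Python keyword
)

print(identify_url)
print(changes_url)
```

If a recently edited item does not appear in the `ListRecords` response, the OAI index has probably not been refreshed since the edit. Some repositories only accept day-level datestamps; in that case pass a date such as `2024-01-01` instead of a full timestamp.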
Step 2. Plan and prepare your harvest crosswalk
To feed information from your repository into Elements, we need to specify how repository data should appear in Elements. Each piece of metadata we want to copy from the repository is mapped from its repository field to an Elements field, and a crosswalk map file defines these mappings. To find out more about how to plan and prepare your harvest crosswalk, see Repository Tools 2: Repository as Data Source and the Repository Tools 2: Defining Crosswalks Guide.
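The idea of a field-to-field mapping can be illustrated with a small conceptual sketch. This is not the crosswalk map file format (that is described in the Defining Crosswalks Guide); the field names on both sides, and the `crosswalk` helper, are hypothetical and chosen only to show how mapped fields are copied and unmapped fields are left behind.

```python
# Hypothetical mapping: repository metadata field -> Elements field name.
FIELD_MAP = {
    "dc.title": "title",
    "dc.contributor.author": "authors",
    "dc.date.issued": "publication-date",
    "dc.identifier.doi": "doi",
}


def crosswalk(repo_item, field_map):
    """Copy each mapped field that the item has; ignore unmapped fields."""
    return {
        target: repo_item[source]
        for source, target in field_map.items()
        if source in repo_item
    }


item = {
    "dc.title": ["An example article"],
    "dc.contributor.author": ["Smith, J."],
    "dc.description.provenance": ["internal note"],  # not in the map
}

record = crosswalk(item, FIELD_MAP)
print(record)
```

Planning a crosswalk is largely a matter of deciding which repository fields belong in a table like `FIELD_MAP` and which should deliberately be left out.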
Step 3. Configure and Enable DSpace as a data source
The final step is to configure and enable DSpace as a data source. To find out how to do this, please see Repository Tools 2: Configuring DSpace.
Important notes about DSpace as a data source
DSpace as a data source and Repository Tools 2
DSpace as a data source can be used as stand-alone functionality or form part of a full RT2 integration (whereby users can deposit to DSpace via Elements, using Elements' publication metadata). Please note that a full RT2 integration requires a Repository Tools license; use of DSpace as a data source alone does not. For more information about RT2, see the Repository Tools 2: Functional Overview.
Integrating with other DSpace Repositories
The DSpace data source integration has been designed to allow you to connect Elements to your own institution's dedicated DSpace institutional repository. Connecting Elements to DSpace repositories that do not belong to your institution is not a supported use of the functionality. The harvesting code is optimised under the assumption that the harvested data are all outputs of your institution. Amongst other things, this enables us to infer more about the likelihood that any given repository item was authored by a member of your staff with a matching name. This is critical to avoid overpopulating users' pending publication lists with inaccurate matches and slowing down the Elements database with irrelevant publications. If you are interested in connecting to an instance of DSpace that contains content shared with other institutions, please discuss this with Symplectic to ascertain whether your particular usage is supported.
Supported DSpace versions
Elements supports RT2 integrations with the following versions of DSpace:
[Table: Elements versions (rows) against DSpace versions (columns), colour-coded according to the key below.]
Key:
Green - the DSpace version is supported
Yellow - the DSpace version is supported with some limitations (see below)
Grey - the DSpace version is not supported.
DSpace 5.9 and 5.10 do not support the RT2 integration due to issues with the DSpace API. For details, see https://jira.duraspace.org/browse/DS-4085 (affects 5.9 and 5.10) and https://jira.duraspace.org/browse/DS-4000 (affects 5.9 only).
As RT2 is based on API-to-API integration, certain functionality may not be available for a particular version of DSpace due to limitations or variations in the DSpace API for that version.
| DSpace version | Functionality limitations |
|---|---|
| DSpace 5.4 | Differential Harvest is not available. This functionality allows modified records to be updated in Elements between the standard scheduled refreshes. |
| DSpace 6.0 | Embargo functionality: It is not possible to send embargo end dates from DSpace to Elements. As a result, the OA Monitor and other parts of Elements are not aware of embargoes. If your institution uses embargoes, we recommend contacting the Symplectic team for further information before integrating with DSpace 6.0. |
| DSpace 6.1, 6.2 | Subsequent deposit functionality is not currently supported due to an issue with the DSpace API that makes it impossible to automatically restrict files upon deposit. |
| DSpace 7.x | Elements 6.8 supports only harvest functionality. |
| DSpace 7.6.2 - 7.6.x | Deposit functionality is not supported for Elements 6.8 to 6.17 inclusive. |
| DSpace 7.x, 8.x, 9.x | |
Using DSpace's Authority Control and Virtual Metadata features with Elements Automated Metadata Updates
The authority control and virtual metadata features of DSpace 7 and above allow DSpace to automatically enrich DSpace item metadata. They are similar to the Elements automated metadata updates feature, in that they grant the system authority to automatically control the metadata of DSpace items. Using automated metadata updates at the same time as either of these DSpace features can lead to conflicts between the two systems, which in turn can cause corruption and/or loss of item metadata and DSpace entity relationships. The level of compatibility between these features varies with Elements version; see the advice below.
Elements 6.9 to 6.22
Using Elements automated metadata updates with DSpace authority control and/or virtual metadata is not a supported use of the system, as doing so causes inevitable metadata conflicts. These conflicts arise because updates are made using HTTP PUT operations, which push a complete set of item metadata back to DSpace after it has been interpreted and modified by the automated metadata update process. This results in metadata conflicts, such as:
DSpace authority control references being overwritten by updates from Elements.
DSpace virtual metadata (i.e. enriched metadata values sourced from other entities linked to the item) being written back to DSpace as 'real' metadata on the item itself.
Elements 6.23 and above
Starting with Elements 6.23, using automated metadata updates with DSpace authority control and/or virtual metadata is supported with some important limitations. There must be no overlap between the set of DSpace fields that Elements has authority to update and the set of fields using DSpace's authority control or virtual metadata features. If any field is configured in such a way that both Elements and DSpace may write to it automatically, the systems may disagree on what its correct value should be and conflicts will arise.
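The no-overlap requirement amounts to checking that two sets of field names are disjoint. The following sketch shows one way to audit this before enabling automated metadata updates. The field names and the `find_overlap` helper are hypothetical; you would need to assemble both lists yourself from your Elements and DSpace configuration.

```python
def find_overlap(elements_update_fields, dspace_managed_fields):
    """Return the fields that both systems could write to automatically."""
    return sorted(set(elements_update_fields) & set(dspace_managed_fields))


# Hypothetical: fields Elements has been granted authority to update.
elements_fields = ["dc.title", "dc.description.abstract", "dc.contributor.author"]

# Hypothetical: fields using DSpace authority control or virtual metadata.
dspace_fields = ["dc.contributor.author", "dc.subject"]

conflicts = find_overlap(elements_fields, dspace_fields)
if conflicts:
    print("Fields at risk of conflict:", ", ".join(conflicts))
else:
    print("No overlap found.")
```

In this example `dc.contributor.author` appears in both lists, so it would need to be removed from one system's control before the configuration meets the requirement.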
This limited support in v6.23+ is possible because updates are now made using minimal HTTP PATCH operations; these send updates only to the DSpace fields that are being overwritten, leaving all other fields unaffected. This ensures that authority control references and virtual metadata added by DSpace will be unaffected by Elements updates, as long as Elements is not granted authority to update any fields that may contain them.
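The difference between a full replacement and a minimal update can be shown with a simplified in-memory model. This is only a sketch of the principle: the dictionaries below are not real DSpace REST payloads, the `authority` key and both helper functions are assumptions made for illustration, and no HTTP requests are sent.

```python
import copy


def apply_put(stored_metadata, submitted_metadata):
    """Full replacement: the stored metadata becomes exactly what was sent."""
    return copy.deepcopy(submitted_metadata)


def apply_patch(stored_metadata, changed_fields):
    """Minimal update: only the named fields change; the rest is untouched."""
    updated = copy.deepcopy(stored_metadata)
    updated.update(copy.deepcopy(changed_fields))
    return updated


# Item as held in the repository, with an authority reference on the author.
stored = {
    "dc.title": [{"value": "Old title"}],
    "dc.contributor.author": [{"value": "Smith, J.", "authority": "abc-123"}],
}

# Complete metadata set pushed back after processing, in which the
# authority reference has not been preserved.
full_set = {
    "dc.title": [{"value": "New title"}],
    "dc.contributor.author": [{"value": "Smith, J."}],
}

after_put = apply_put(stored, full_set)
after_patch = apply_patch(stored, {"dc.title": [{"value": "New title"}]})

print("authority" in after_put["dc.contributor.author"][0])
print("authority" in after_patch["dc.contributor.author"][0])
```

Both approaches update the title, but only the minimal update leaves the author's authority reference in place, because the author field is never sent.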
