Should I use the API or the reporting database?
Symplectic Elements offers two interfaces for the automated bulk extraction of research data: the API and the reporting database.
What are their purposes, and which should you use for your requirements?
Purpose of the API
The API is a RESTful web service made available to all clients. Its broad purpose is to provide a machine-readable interface to Elements that supports robust data harvest and import functions from which data pipelines to other systems can be built.
In contrast to the reporting database, its design is optimised to prioritise stability over flexibility:
Stability for your downstream systems: via backwards compatible endpoints that will not change their behaviour even when Elements is upgraded;
Faithfulness of data representation: through a highly normalised data model;
Low latency data distribution: support for the creation of up-to-the-minute low-bandwidth differential data feeds;
Security: control over access rights;
Robustness of availability: nothing you do to the API, short of a DOS attack, will block its operation.
Purpose of the reporting database
The reporting database is a reporting-oriented database mirror of the Elements operational database. Its broad purpose is to provide a synchronised data warehouse from which clients can quickly create ad-hoc human-readable reports or perform regular ETL to downstream systems.
In contrast to the API, its design is optimised to prioritise flexibility over stability:
Ease of setup: through low-cost database regeneration;
Ease of customisation: through encouragement to experiment with your own data analysis stored procedures and indexes. Have you messed up? Simply regenerate the database, correct your modifications script, re-apply and try again;
Ease of use: through meaningfully named tables and columns;
Ease of access: through direct exposure of a database to your SQL scripts and tools;
Responsiveness to change: The reporting database can be easily regenerated after you have made sweeping changes to the Elements type system.
Which should I use for systems integration?
If you are aiming for a data feed into another system that absolutely must continue to function after Elements has been upgraded, use the API.
The reporting database may also be a good starting point for implementing a quick systems integration solution, since the work required to consume data through the API is more than the work required to extract the same data from the reporting database. However, the following considerations should also be taken into account. If any of them become a problem for you, Symplectic will likely point you towards re-implementing your solution against the API, so please do consider the maintenance of your solution when choosing whether to use the API or the reporting database for your data extraction problem.
You should not use the Discovery API to integrate systems as this is not maintained for client use. Further information on the Discovery API.
An upgrade to Elements has broken my system feed from the reporting database
Why can this happen? The reporting database aims to stay up to date with respect to the data structures in Elements as Elements evolves as a product. Because a simple table structure is used in the reporting database, this cannot be done unless the reporting database tables are rebuilt after structural changes in Elements, and in particular after an Elements version upgrade.
Why is this a problem for you? If you have not anticipated the possibility that the table structures of the reporting database might change, then you might be faced with the surprise of having to unexpectedly alter your custom SQL scripts to work with the new table structures. You are faced with a choice between not upgrading Elements, or carefully managing the timing of the upgrade with your SQL script maintenance.
Why is this not a problem with the API? The API provides multiple backwards compatible endpoints, all of which are maintained such as not to break interoperability with anything written against them, even when Elements is upgraded. You have up to 2 years after an upgrade to worry about systems interoperability maintenance if you have written against the most recent endpoint. Crucially, you can do this at your leisure between Elements version upgrades.
What should I do? If you value ongoing stability over ease of creation/alteration, then use the API. If you value ease of creation/alteration over stability, use the reporting database.
An upgrade to Elements has broken my custom SSRS reports
Why can this happen? You must use the reporting database if you wish to implement a custom SSRS report accessible from the Elements user interface. You cannot use the API for this. However, since the table schema of the reporting database can change between versions of Elements, it is possible that some of the the SQL queries you have carefully crafted to drive your custom SSRS reports no longer function correctly after an upgrade of Elements.
Why is this a problem for you? It is always a problem when custom reports break during a version upgrade.
What should I do? An upgrade of Elements should always be performed in a development or test environment before being committed to in a production environment. Test each of your dependencies on the reporting database and consult the release notes for any breaking changes made to the reporting database structure. Symplectic will do its best to release as detailed a step-by-step guide as it can with product release notes whenever it makes changes to the reporting database that are likely to break your scripts. Always factor in the cost associated with this process when deciding to create and maintain a new custom SSRS report. If you value stability of your SSRS report over the convenience of consuming reporting tables managed by Elements, then you do have the option to implement your report to pull data from your own database, created and populated by you from a custom API data extraction. That way, you will have a report that is shielded from alterations to the reporting database tables generated by Elements itself, and it will not break on upgrade of Elements.
A reporting database "full sync" made some data disappear!
Why can this happen? All of the tables in the reporting database are periodically truncated and repopulated. This is called a "full sync", and is generally scheduled by the client to occur automatically, perhaps once a week at the weekend. It can take up to a few hours for the very largest Elements databases. In addition, an Elements administrator can manually run a full sync at any time. During a full sync, the reporting database contains only a partial set of data. See How the Reporting Database is Kept Up to Date for more information.
Why is this a problem for you? If you have not anticipated the possibility that a full sync results in data not being fully available, you might execute queries against the reporting database during a full sync. This will result in unexpectedly incomplete results set.
Why is this not a problem with the API? The API does not have a similar cleardown and refill lifecycle.
What should I do? If you value the constant availability of data, then either use the API, or periodically cache copies of the data you need in your own downstream system. In any case, ensure your use of the reporting database falls well outside periods in which a full sync might be run.
Which should I use to calculate statistics and reports?
Use the reporting database.
The API can of course also be used, but you will likely find the reporting database far quicker and more intuitive to use if all you want to do is query the Elements data set for statistics and measures.
Why can't I just use the Elements operational database for reporting?
Direct manipulation of the operational database is specifically against the terms of the licence agreement, and is an unsupported use of Elements. The operational database schema is under rapid and constant development, and is deliberately left undocumented.
We provide backwards compatible APIs intended for critical long term integrations and a reporting database with a defined schema and documented changes between versions for more fluid requirements.
Historically, clients who have nevertheless built reporting or integration solutions involving the main operational database have found it to represent a significant barrier to upgrade, both in terms of time and cost, as internal processes become dependent on undocumented and changing database structures.
For these reasons you must not access data from the operational database directly.
If you feel that the Elements API or the reporting database has shortcomings, we would very much welcome your input in our Feature Request forum.
