Why is the bibliographic author name not correctly resolved/hyperlinked to the appropriate user profile?

Elements offers functionality to allow the association of an author name in bibliographic metadata with the corresponding user of Elements. These associations can be seen as hyperlinks in the Elements user interface in various places. See the following images for examples of where the system has identified the Elements user "Daniel Hook" with the author names "Hook DW" and "Hook, Daniel W." in bibliographic metadata.

What determines these associations?

These associations are automatically calculated and frequently refreshed by a background process (the indexer service). As of version 5.5 of Elements, author to user associations are typically recalculated by the system within 2 seconds of data changes made to either the publication or the user. Occasionally, there can be a longer delay if the system is under very heavy load. Before v5.5 of Elements, these calculations were made much less frequently.

Choosing the user when editing an author list does not set the association

When editing an author list in a manual data record (see the screenshot below), a user can choose which Elements user an author name corresponds to. Despite the impression of control this gives, that choice is not saved in the system. The functionality was originally intended only to adjust the contents of the chosen user's approved and pending lists with respect to this publication, and that remains its only function. The association between the chosen user and the edited author entry is discarded by the system as soon as the Save button is clicked.

Instead, once the data is saved, the background process mentioned above reviews the whole publication as soon as it can, recalculating the correspondences between all of the author names in the data and the set of users of Elements who have claimed the publication.

Ordinarily, this automatic background recalculation correctly associates every user who has claimed the publication with the appropriate author name in every bibliographic record of the publication. For some user/author name combinations, however, the association is not made correctly.

How are the associations calculated?

For each user who has claimed the publication, the system uses a complex algorithm to make a best guess as to which author name in the bibliographic data corresponds to that user. Because this depends on both the quality of available data and a certain degree of subjectivity where data is ambiguous, it is theoretically impossible to get this best guess correct 100% of the time.

Sometimes this will manifest as an inability to associate the user to any of the author names, and sometimes it will manifest as an association to the wrong author name.

The algorithm takes account of:

  • Feature similarity of authors across the different data source records in the publication, such as authors having the same email addresses, last names, initials, other names, ORCIDs, and positions in the author list;

  • Strength of compatibility of bibliographic author name with the user's name variants in their search settings (and as of v5.7 their name data in the HR feed);

  • Strength of compatibility of bibliographic author identifiers with the user's claimed and rejected identifiers.
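To make the idea of "strength of compatibility" between an author name and a name variant more concrete, here is a deliberately simplified sketch. It is not the algorithm Elements uses: the Name type, the parse_name and compatibility functions, and the point values are all hypothetical, chosen only to illustrate that a full first-name match can count for more than a bare initial match, and that a conflicting name can rule an association out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    last: str
    given: tuple[str, ...]  # full given names or single-letter initials, lower-cased


def parse_name(text: str) -> Name:
    """Parse 'Last, Given I.' into a Name (hypothetical helper, comma format only)."""
    last, _, rest = text.partition(",")
    parts = rest.replace(".", ". ").split()
    given = tuple(p.strip(".").lower() for p in parts if p.strip("."))
    return Name(last.strip().lower(), given)


def compatibility(author: Name, variant: Name) -> int:
    """Return an illustrative score; 0 means the two names conflict."""
    if author.last != variant.last:
        return 0
    score = 5  # arbitrary points for a matching last name
    for a, v in zip(author.given, variant.given):
        if len(a) > 1 and len(v) > 1:
            if a != v:
                return 0  # two different full given names conflict
            score += 3  # full given names agree
        elif a[0] == v[0]:
            score += 1  # an initial is consistent with the other name
        else:
            return 0  # initials disagree
    return score


variant = parse_name("Hook, Daniel W.")
print(compatibility(parse_name("Hook, Daniel W."), variant))
print(compatibility(parse_name("Hook, D. W."), variant))
print(compatibility(parse_name("Hook, David"), variant))
```

In this toy model, "Hook, Daniel W." scores higher against the variant than "Hook, D. W." does, and "Hook, David" is rejected outright. The real calculation also weighs identifiers and similarity across data source records, which this sketch ignores.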

What might have gone wrong?

If the system is not successfully associating a user with the appropriate authors in the bibliographic data, it may be possible to alter the system's best guess by providing the system with higher quality data with which to make its association inferences.

Incorrectly formatted name variants in the user's search settings

The most common reason the system fails to associate a user with the correct author entry is an incorrect or incorrectly formatted name variant in the user's search settings.

Name variants should be entered such that the last name appears first, followed by a comma, followed by the user's other names and/or initials. Where initials are provided, they should be terminated with a full stop.

This format clearly distinguishes the user's last name from their other names, and clearly distinguishes initials from first names and from each other. The system will accept names provided without the disambiguating commas and full stops, but works best when they are provided. When they are omitted, there is a significant risk that the system will be unable to tell last names from other names, and will misinterpret the structure of the name.

One consequence of this is that the user's name is not properly recognised in bibliographic data.

Provision of name variants featuring the user's full first name(s) can help the system disambiguate between authors with the same last name.
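The recommended format described above (last name, comma, then other names and/or initials, each initial ending in a full stop) can be checked mechanically before name variants are entered. The following is a minimal sketch of such a check. The function name is_well_formed_variant is hypothetical, the check is not part of Elements, and the rule that rejects all-capital given names is an assumption made here because such tokens cannot be told apart from unpunctuated initials.

```python
import re

# A single letter followed by a full stop, e.g. "W."
_INITIAL = re.compile(r"[^\W\d_]\.")
# Two or more characters: a letter, then letters, hyphens or apostrophes.
_FULL_NAME = re.compile(r"[^\W\d_](?:[^\W\d_]|['\-])+")


def is_well_formed_variant(variant: str) -> bool:
    """Return True if the variant looks like 'Last, Given I.' (illustrative check only)."""
    last, comma, rest = variant.partition(",")
    if not comma or not last.strip():
        return False  # no comma, or nothing before it
    tokens = rest.replace(".", ". ").split()  # lets "D.W." split into "D." and "W."
    if not tokens:
        return False  # nothing after the comma
    for token in tokens:
        if _INITIAL.fullmatch(token):
            continue
        # Assumption: an all-capital token such as "DW" is treated as ambiguous.
        if _FULL_NAME.fullmatch(token) and not token.isupper():
            continue
        return False
    return True


for candidate in ["Hook, Daniel W.", "Hook, D.W.", "van Halen, Edward", "Hook DW", "CHEN LI"]:
    print(candidate, "->", is_well_formed_variant(candidate))
```

The first three candidates pass and the last two fail. A check like this only confirms the punctuation and layout; it cannot tell whether the name itself is correct for the user.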

As an extreme example of an ambiguously specified name variant, the system will find it very difficult to correctly interpret the name variant "CHEN LI", which because of use of all capitals and a lack of any appropriate punctuation, could feasibly correspond to any of the following possibilities:

  • Chen, L. I. (could be Louise Irene Chen)

  • Chen, Li

  • Li, C. H. E. N. (could be Charles Harold E. N. Li)

  • ...

The system chooses one interpretation, and if that is not the correct interpretation, then the accuracy of detection of that user's name in data will suffer.
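To see how quickly the possibilities multiply, the short sketch below enumerates some readings of a two-word variant that has no punctuation. It is purely illustrative: candidate_readings is a hypothetical function, it only handles variants of exactly two words, and it says nothing about which reading Elements would actually choose.

```python
def candidate_readings(variant: str) -> list[str]:
    """List plausible 'Last, Given' readings of a two-word, unpunctuated variant."""
    first, second = variant.split()  # raises ValueError unless there are exactly two words
    readings = []
    for last, other in ((first, second), (second, first)):
        last = last.capitalize()
        # Reading 1: the other word is a full given name.
        readings.append(f"{last}, {other.capitalize()}")
        # Reading 2: the other word is a run of initials.
        initials = " ".join(f"{letter}." for letter in other.upper())
        readings.append(f"{last}, {initials}")
    return readings


for reading in candidate_readings("CHEN LI"):
    print(reading)
```

For "CHEN LI" this prints "Chen, Li", "Chen, L. I.", "Li, Chen" and "Li, C. H. E. N.". Entering the variant as "Chen, Li" or "Chen, L. I." removes the guesswork.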

The highlighted names in the example below are formatted correctly, and the un-highlighted names are formatted ambiguously, risking misinterpretation by the system. Sticking to the required format is particularly important for users with compound last names, such as "van Halen."

Inaccurate source data

Sometimes, bibliographic data is incorrect at source. Perhaps the data source has misspelled the user's name in the author list, or has specified an incorrect identifier (perhaps that of another person) as a part of the author data. This can result in a lack of association, or another user being associated to the author instead of the correct user.

At the current time, we have opted not to introduce a means of manually correcting user associations in Elements, principally because we do not believe the feature would be used widely enough to meaningfully reduce the number of incorrect inferences.

The available route is to correct the data at source by contacting the relevant data provider, and then to wait until the data is automatically refreshed in Elements. At that point, Elements will automatically recalculate the correspondences using the corrected metadata.

Use of associations in downstream systems and reports

It is common to want to use the user-to-author associations calculated by Elements in downstream systems, such as public web portals and institutional repositories, or in custom reports, for example to hyperlink author entries to staff profiles.

Please bear in mind that, because the associations are calculated by an automated algorithm, they cannot be guaranteed to be 100% accurate. Take this into account when deciding whether to use the calculated association information in downstream systems or reports. If you are not comfortable with some level of error in the results of the calculations, you should not use them.

Even if you choose to ignore the calculated correspondences, you can still determine which users have claimed the publication. That information is always available from Elements, both in the reporting database (in the [Publication User Relationship] table) and in the API (by examining <relationships> between users and publications).

Finally, we are always looking to improve the way Elements calculates user-to-author correspondences, so any feedback would be very helpful.

