PKP Copenhagen 2023 Sprint Notes released: Multilingual Metadata in Crossref

By PKP Copenhagen Sprint Working Group "Multilingual Metadata in Crossref" / PKP Communications
Image description: More than thirty community members from around the world gathered at the PKP Copenhagen 2023 Sprint to work on PKP software. The group is arranged in three rows, with some people standing at the back and some sitting at the front. It is highly diverse, with members from different countries, backgrounds, areas of expertise, and organizations. The photo is in black and white, with the projector screen in the background and the sprint tables and chairs in the foreground. The Crossref (sponsor) and Royal Danish Library (host) logos are in the lower right; the PKP 25-year anniversary logo is in the lower left. The image announces the release of the PKP Copenhagen 2023 sprint notes.

The third set of sprint notes is now available from the PKP Copenhagen Sprint, hosted by the Royal Danish Library in June 2023.

Sprints involve PKP community members coming together in diverse groups to work on PKP software and support. The Royal Danish Library hosted eight working groups at the PKP Copenhagen Sprint last June. This is a summary of one such group’s work.

Group Members

  • Clinton Graham, University of Pittsburgh
  • Radek Gomola, Masaryk University Press Czech Republic
  • Susan Collins, Crossref
  • Emma Uhl, Public Knowledge Project
  • Ramana Fragola, National Library of Sweden
  • Jyrki Heinonen, Federation of Finnish Learned Societies
  • Esmee Klumpenaar, University of Groningen

Background: Why is this topic important?

Journals are collecting multilingual metadata for articles.  Crossref can index this either as supplementary metadata (when there is multilingual metadata for a monolingual article), or as independent DOIs (for articles that have been translated).  This facilitates discovery in the user’s language.

Goals

  • Metadata exported to Crossref is currently monolingual (primary locale only). We will evaluate whether current practice allows us to add multilingual metadata directly to the existing export, or whether new DOIs need to be generated. https://github.com/pkp/crossref-ojs/issues/21
  • Survey current usage
  • Find use-case examples
  • Identify concerns and limits

Survey results

Survey current usage: when article translations are published, are editors creating different submissions for each translation?  Or are galleys added to a single submission?  What are best practices?

  • Of 95 journals participating in the Coalition Publica program in Canada, 20 journals had some type of translated content.
  • Five journals provided translations as multiple galleys under one submission;
  • Seven journals provided translations as separate submissions;
  • Seven journals provided translated metadata only.
    • One journal provided translations as separate submissions and as multiple galleys under one submission inconsistently.
    • One journal combined multiple languages into one galley. (Not recommended.)

Use-case examples:

Multiple formats, multiple languages, single submission (non-Latin language): https://jarps.net/journal/article/view/32

The most common scenario among journals in the Nordics is to publish articles with a galley in one language and metadata in multiple languages.

The University of Pittsburgh also has a primary use case of a monolingual article with metadata in multiple languages.  When articles are published with full translations, published translations appear as galleys under a common submission, e.g. here: https://feministasylum.pitt.edu/faci/article/view/90

Other concerns and limitations

Documentation Needs

PKP documentation does not currently describe which metadata fields are exported. It should.

Are there concerns with the completeness of the multilingual metadata available to submit to Crossref? Note the title specifically.

Titles are a child element of journal_article, which carries a language attribute.

Are there limitations in the Crossref schema used in OJS 3.3 (schema 4.3.6) compared with the one used in OJS 3.4 (schema 5.3.1)?

4.3.6 and 5.3.1 both allow translations of:

  • Abstract (and thus the title)
  • Copyright holder (not exported currently)
  • Contributor (Person) names (but what does the language attribute mean?)

The primary title element (and the journal metadata) does not have a lang attribute.  This will be used for citation purposes.  But, note that the OJS citation generator changes the article title per language!
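To make the distinction concrete, the sketch below builds an article fragment with a single untagged primary title and one language-tagged abstract per locale. It is written in Python for illustration only and is not the plugin's actual (PHP) code. The namespace URIs are the Crossref 5.3.1 and JATS ones; the helper name `build_journal_article`, the two-letter locale codes, and the sample text are assumptions.

```python
import xml.etree.ElementTree as ET

CROSSREF_NS = "http://www.crossref.org/schema/5.3.1"
JATS_NS = "http://www.ncbi.nlm.nih.gov/JATS1"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize Crossref elements without a prefix and JATS elements as jats:*.
ET.register_namespace("", CROSSREF_NS)
ET.register_namespace("jats", JATS_NS)


def build_journal_article(primary_locale, titles, abstracts):
    """Hypothetical helper: one untagged primary title, one abstract per locale."""
    article = ET.Element(f"{{{CROSSREF_NS}}}journal_article",
                         {"language": primary_locale})
    titles_el = ET.SubElement(article, f"{{{CROSSREF_NS}}}titles")
    # The primary <title> carries no language attribute of its own.
    title_el = ET.SubElement(titles_el, f"{{{CROSSREF_NS}}}title")
    title_el.text = titles[primary_locale]
    # Each abstract is tagged with xml:lang so translations stay distinguishable.
    for locale, text in abstracts.items():
        abstract_el = ET.SubElement(article, f"{{{JATS_NS}}}abstract",
                                    {f"{{{XML_NS}}}lang": locale})
        paragraph = ET.SubElement(abstract_el, f"{{{JATS_NS}}}p")
        paragraph.text = text
    return article


article = build_journal_article(
    "en",
    {"en": "A sample title", "fr": "Un titre d'exemple"},
    {"en": "A sample abstract.", "fr": "Un exemple de résumé."},
)
print(ET.tostring(article, encoding="unicode"))
```

In this sketch the translated title is simply not exported, which mirrors the open question above about how complete the multilingual title metadata can be.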

UI/UX design: Will a new option be needed to prevent automatically adding multilingual metadata to submissions, or can we presume to always add this?

Would this break anything if we always presumed to send all metadata translations? We don’t believe so.

Differences

Only in 4.3.6

<xsd:attributeGroup ref="metadata_distribution_opts.att"/>
<xsd:annotation>
<xsd:documentation></xsd:documentation>
</xsd:annotation>
<xsd:attributeGroup ref="metadata_distribution_opts.att"/>

Only in 5.3.1

<xsd:element ref="titles" minOccurs="0" maxOccurs="1"/>
<xsd:element ref="acceptance_date" minOccurs="0" maxOccurs="1"/>
<xsd:element ref="scn_policies" minOccurs="0" maxOccurs="1"/>

Changed tags:

Old:

<xsd:element ref="special_numbering" minOccurs="0"></xsd:element>

New:

<xsd:element ref="special_numbering" minOccurs="0"/>

Old:

<xsd:element ref="jats:abstract" minOccurs="0" maxOccurs="10">

New:

<xsd:element ref="jats:abstract" minOccurs="0" maxOccurs="unbounded"/>
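The change to the abstract limit matters once every translated abstract is exported. The following is a minimal Python sketch of a guard an exporter could apply, assuming the old limit of 10 belongs to schema 4.3.6 and the unbounded one to 5.3.1, as the comparison above suggests. The helper `limit_abstracts` and the locale keys are hypothetical.

```python
# Maximum number of <jats:abstract> elements per article, taken from the
# maxOccurs values quoted above; None means unbounded.
MAX_ABSTRACTS = {"4.3.6": 10, "5.3.1": None}


def limit_abstracts(abstracts, schema_version, primary_locale):
    """Hypothetical helper: order abstracts primary-first, then apply the cap."""
    if schema_version not in MAX_ABSTRACTS:
        raise ValueError(f"unknown schema version: {schema_version}")
    # sorted() is stable, so the primary locale moves to the front and the
    # remaining locales keep their original order.
    ordered = sorted(abstracts.items(), key=lambda item: item[0] != primary_locale)
    cap = MAX_ABSTRACTS[schema_version]
    return ordered if cap is None else ordered[:cap]
```

Placing the primary locale first means that, if a cap applies, the abstract dropped is never the one in the article's main language.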

Relevant changes:

  • add required ‘version’ attribute to <doi_batch>
  • add acceptance_date element to journal article
  • Added in 5.3.0: support for ROR and other organization identifiers:
    • replace <affiliation> tag with <affiliations> to support new affiliations structure
    • add <institution_id> element to support ROR and other org IDs
    • make either <institution_id> or <institution_name> required within <institution>
  • relax regex for given_name element to allow numbers

Code:

https://github.com/pkp/crossref-ojs/compare/main...ulsdevteam:crossref-ojs:pkp-cph-crossref?expand=1

Useful links

Related issues on PKP Github:

Results

The group proposed a modification to the crossref-ojs plugin that adds JATS abstracts and titles for every language with data entered.

https://github.com/pkp/crossref-ojs/compare/main...ulsdevteam:crossref-ojs:pkp-cph-crossref?expand=1
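Exporting "every language with data entered" implies skipping locales whose fields are empty. A minimal sketch of that selection step follows, written in Python rather than the plugin's PHP. The helper `locales_with_data` and the sample locale keys are hypothetical and do not reflect the code in the linked comparison.

```python
def locales_with_data(localized_values):
    """Return {locale: text} for locales whose value is non-empty after trimming."""
    return {
        locale: text.strip()
        for locale, text in localized_values.items()
        if text and text.strip()
    }


abstracts = {"en": "An abstract.", "fr": "  ", "da": None, "sv": " Ett abstrakt. "}
print(locales_with_data(abstracts))
```

Here the whitespace-only French value and the missing Danish value are left out, so only English and Swedish would be sent.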

Next Steps

Many of the journals surveyed publish full translations of articles as galleys on a single submission, which does not conform to Crossref best practice guidelines. Future work could improve the documentation and OJS best practice recommendations to bring the community into closer alignment with those guidelines.

This enhancement might be a candidate to be back-ported to OJS 3.3.

Thanks

Thank you to the working group for all its effort on multilingual metadata in Crossref, and for sharing its notes. Our thanks to the Royal Danish Library for hosting the sprint, and to the PKP community, both in attendance at the sprint and elsewhere, for their valuable guidance. We are also grateful to Crossref for sponsoring this event.