The sprint notes from the PKP Hannover Sprint, hosted by the Leibniz Information Centre for Science and Technology in September 2023, are now available.

Sprints involve PKP community members joining diverse groups to work on PKP software and support. The Leibniz Information Centre for Science and Technology (TIB) hosted six working groups at the PKP Hannover Sprint in September. This is a summary of the fourth working group, which focused on XML.
Group members
- Jeanette Hatherill, Coalition Publica
- Edith Cannet, IR Métopes
- Dominique Roux, IR Métopes
- Marisa Tutt, PKP
- Martin Brändle, University of Zurich
- Dulip Withanage, TIB
- Ipula Ranasinghe, TIB
Background
XML has been a long-standing topic of conversation at PKP. For reference, see the discussion at the 2018 PKP Sprint and its overview of XML tools for OJS.
What are the problems users face in production?
- Compatibility between JATS and OpenEdition / Commons Publishing TEI (a sub-vocabulary of TEI with a focus on granularity), thinking in terms of modularity (cf. CRAFT OA).
- Submitting XML and converting it to full-text HTML (including HSS content).
- Metadata (automatically completed, controlled): taken from the submitted file vs. from the OJS database.
- For editors needing to export to multiple places, the current practice in Canada is to find money and outsource.
- It is difficult for editors to follow a structured format to get output.
- Documentation can go a long way in supporting the user journey.
- What is required for plugins, and then what documentation is required?
- The biggest problem is getting people to enter data correctly and avoid duplication; the more OJS can generate automatically, the better.
- Editors need simple, user-friendly ways to avoid errors.
- The real problem lies in manual work, so the aim is to find ways of minimizing input errors.
- Would a metadata checker make it possible to populate the metadata fields from an imported XML file?
- Compatibility between different JATS versions, e.g., a requirement for exactly version 1.2.
- Uncertainty about which metadata fields are needed or required.
- What is being used, so that support can be expanded from there?
- We need to make evidence-based decisions.
- Can we understand the 80%?
- What flexibility is needed for the 20%?
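The JATS version compatibility problem raised in the list above could be detected early with a small check at import time. The following is a minimal sketch, assuming the file declares its version in the `dtd-version` attribute of the root `<article>` element. The function name `jats_version_ok` and the set of supported versions are hypothetical and not part of OJS or any PKP plugin.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the versions a given import tool claims to support.
SUPPORTED_JATS_VERSIONS = {"1.2"}


def jats_version_ok(xml_text, supported=SUPPORTED_JATS_VERSIONS):
    """Return (ok, declared_version) for a JATS document given as a string."""
    root = ET.fromstring(xml_text)
    # Assumption: the version is declared on the root element.
    declared = root.get("dtd-version")
    return (declared in supported, declared)


sample = '<article dtd-version="1.1" article-type="research-article"><front/></article>'
ok, version = jats_version_ok(sample)
print(ok, version)
```

Reporting the declared version alongside the result lets an editor see why a file was rejected instead of receiving a generic import error.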
Goals
- Test the OJS Métopes TEI to JATS integration.
- Document the workflow as a use case, to help identify the requirements for XML publishing and map the landscape.
Results
1. Testing worked!
Plugin development on GitHub: https://github.com/withanage/tei2jats/tree/main
Demo: https://github.com/withanage/teitojats/raw/main/demo.webm
2. Documenting the workflow to contribute to identifying the requirements for XML publishing and the landscape. Utilize as a use case.
Dulip described the current workflow within OJS for converting DOCX to JATS (https://github.com/Vitaliy-1/docxConverter) and for editing with the Texture editor (https://github.com/pkp/texture), highlighting some of the limitations. He also explained some of the workflows around the use of these plugins at TIB Open Publishing.
The Métopes team provided an overview of the publishing landscape in France and their workflows. Métopes provides a suite of services to French journals, aimed mainly at editors. Generally speaking, France has a culture of templating, with OpenEdition and Cairn being the major players; the templating happens at both the author and editor level. The full-text display of content is embedded in the culture of French publishing. Single-source publishing is the norm: editors have XML at the base.
Numerous publishers in France have XML-encoded collections (TEI Commons, JATS xx, etc.) and wish to expose them in full text (e.g. on OpenEdition or Cairn). Humanities and social sciences (HSS) publishers need rich editorial environments (in terms of the editorial objects processed) that are independent of the JATS schema, adaptable and extensible (low tech: XSLT), and oriented towards OJS and OMP.
Métopes has existed since 1999-2000. It has had support from the French government since 2012 and, since 2016, has been a major piece of national infrastructure as part of the National Open Science Initiative.
Generally speaking, a team that has varied between 2 and 4 people systematically runs 3-day training sessions with incoming editors across the French publishing system. The team's time is split into ⅓ training, ⅓ support/help, and ⅓ development. Teams of engineers and developers across institutions contribute to the development. The focus is on low-cost technologies and low-cost implementation.
Métopes provides tools for building (style sheets and XSLTs), transforming (Circé and XSLTs) and displaying (xml2html_Pkp_plugin) structured collections, with a particular focus on tools built around TEI Commons Publishing and JATS Publishing. Validation is implemented in an add-on, developed and maintained by Métopes, for an XML editor (XMLmind XML Editor); Métopes has bought the sources in order to distribute the software to public publishers.
- Management and display of the complexity of XML
- Modular template based on DOCX, freely available
- Templates are available for LibreOffice but have fewer options
- Available in English, Spanish and French.
- Direct download: https://git.unicaen.fr/fnso/i-fair-ir/modeles-stylage/-/blob/master/MS_Word/Metopes/COMMONS-Metopes.dotm (to be made openly available)
The current (and now somewhat deprecated) PKP plugins built around the JATS schema have been tested and have shown their limitations from our HSS point of view (footnotes, for example). The solution would be to develop an OJS plugin; the plugin being developed for OJS also aims to permit the display and use of indexes.
To do this work, Métopes looked at the following:
- TEI Commons Publishing schema (Métopes, OpenEdition) ODD
- Word processing model for structured content.
- XSLT transformation to TEI Commons Publishing
- XML environments (XMLmind XML Editor) for editing (schema validation), annotation (repository API, publication data links), and transformation (XSLT to JATS, OpenEdition, Cairn, ePub…).
- Transformation XSLTs (including TEI<>JATS).
- Transformation server (Circé) and pipelines (API)
- Structured content exposure plugin: xml2html_Pkp_plugin (2 implementations: Journal JATS>OJS; Book TEI>OMP)
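To give a feel for what a TEI to JATS transformation does at the element level, here is a deliberately tiny sketch. It is written in Python with the standard library rather than XSLT, and it is not the Métopes stylesheet: the mapping table covers only four elements, attributes are dropped, and the function name `tei_to_jats_fragment` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative subset only; a real TEI Commons to JATS mapping is far larger.
TAG_MAP = {"div": "sec", "head": "title", "p": "p"}


def tei_to_jats_fragment(tei_elem):
    """Recursively copy a TEI element tree, renaming elements to JATS names."""
    local = tei_elem.tag.split("}")[-1]  # strip the TEI namespace, if any
    if local == "hi" and tei_elem.get("rend") == "italic":
        new_tag = "italic"
    else:
        # Unmapped elements keep their local name (a real tool would report them).
        new_tag = TAG_MAP.get(local, local)
    out = ET.Element(new_tag)
    out.text = tei_elem.text
    out.tail = tei_elem.tail
    for child in tei_elem:
        out.append(tei_to_jats_fragment(child))
    return out


tei_sample = (
    '<div xmlns="http://www.tei-c.org/ns/1.0"><head>Intro</head>'
    '<p>Some <hi rend="italic">text</hi>.</p></div>'
)
jats = tei_to_jats_fragment(ET.fromstring(tei_sample))
print(ET.tostring(jats, encoding="unicode"))
```

Even this toy version shows why footnotes, indexes and other rich HSS objects are hard: each needs its own structural rule, not just a renamed tag.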
IR Métopes functions
- Response to publishers’ requests
- Demand for Métopes in Africa, Latin America, and the Middle East
- Opening up TEI content
- Breaking out of French particularism
- Results of testing existing plugins
- Choice of specific development
- Full-text exposure plugin (oriented towards galley production) that accepts any XML feed as input and produces HTML (an OJS or OMP galley) as output, with parameterization using XSLT, a template plus CSS for the graphics layer, and JS for interactivity.
In the context of the CRAFT OA project
Basis of interoperability between systems
Link: https://operas-eu.org/projects/craft-oa/
Next Steps
The goal of the sprint is to complete short projects before the end, but often there are leftovers to explore and the results here become the starting point for future work.
APPLICABILITY FOR LARGER OJS COMMUNITY
OJS users based in Europe (AEUP, Association of European University Presses), Latin America (REUN, Red de Editoriales de Universidades de la Argentina; ASEUC, Asociación de Editoriales Universitarias de Colombia), Africa (e.g. Wits University Press in South Africa), Lebanese universities, and elsewhere can now find proposals and solutions for linking the use of Métopes for the production of structured content with its dissemination in full-text format using PKP tools.
Another experiment was carried out by directly deploying edited content (the EHESS journal Annales HSS) on Cambridge University Press servers.
How transferable is this to a distributed platform?
WORKFLOWS
Different workflows may be possible for publishing XML content from external systems with OJS.
These workflows must support the asynchronous operation of processes that may take some time, e.g. unpacking or transformation.
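One common way to meet this asynchronous requirement is to return a job identifier immediately and let the caller poll for the result. The sketch below illustrates the pattern with Python's standard library only. `JobRegistry`, its methods, and the `transform` stand-in are hypothetical names; nothing here reflects how OJS or Circé actually schedule work.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobRegistry:
    """Runs slow tasks in the background and exposes their status by job id."""

    def __init__(self, max_workers=2):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, task, *args):
        # Return at once with an id; the task keeps running in a worker thread.
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = self._executor.submit(task, *args)
        return job_id

    def status(self, job_id):
        future = self._jobs[job_id]
        if not future.done():
            return {"state": "running"}
        if future.exception() is not None:
            return {"state": "failed", "error": str(future.exception())}
        return {"state": "finished", "result": future.result()}

    def wait(self, job_id, timeout=None):
        # Blocks until the job ends; convenient for demos, not for request handlers.
        self._jobs[job_id].exception(timeout=timeout)
        return self.status(job_id)


def transform(document):
    # Stand-in for a slow step such as unpacking or an XSLT pipeline.
    return document.upper()


registry = JobRegistry()
job_id = registry.submit(transform, "<tei/>")
print(registry.wait(job_id, timeout=5))
```

In a real deployment the status would be read over an API call rather than in-process, but the contract is the same: submit, receive an id, check back later.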
Variant 1 describes a workflow where the external service for the transformation of content to other formats is called from OJS.
Circé is a public service that has XML/XSLT pipelines for the transformation and validation of documents. Link: https://metopes.unicaen.fr/circeui/
Variant 1 – using external service (simplified, may also be asynchronous)
Variant 2 is a workflow where the publication of documents is pushed from the external service into OJS.
Variant 2 – harvesting a package from an external service (external service pushes publication)
Example process description:
- The external system creates an ingestion package (like a SIP for digital archives, OAIS compliant).
- The external system validates all package content and creates, e.g. a METS description of the package and its contents, including checksums.
- The external system makes an API (or other) call to OJS and communicates the package location.
- OJS fetches the package and writes a log of all transactions.
- OJS sends the URL of the log back as an API answer.
- OJS extracts the package and does checksum validations.
- If the checks pass, OJS publishes the content.
- OJS sends confirmation of the finished task back to the external system.
- The external system fetches the log and can process it further in case of errors.
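The packaging and checksum steps of this process can be sketched in a few lines. This is an illustration under stated assumptions, not a proposal for the actual exchange format: a ZIP archive stands in for the ingestion package, a JSON file named `manifest.json` stands in for the METS description, and `build_package` and `verify_package` are hypothetical function names.

```python
import hashlib
import io
import json
import zipfile


def build_package(files):
    """External-system side: zip the files plus a manifest of SHA-256 checksums."""
    manifest = {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, data in files.items():
            package.writestr(name, data)
        # Assumption: a JSON manifest replaces the METS description for brevity.
        package.writestr("manifest.json", json.dumps(manifest))
    return buffer.getvalue()


def verify_package(package_bytes):
    """Receiving side: recompute checksums and return (all_ok, transaction_log)."""
    log = []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        manifest = json.loads(package.read("manifest.json"))
        for name, expected in manifest.items():
            actual = hashlib.sha256(package.read(name)).hexdigest()
            log.append({"file": name, "ok": actual == expected})
    return all(entry["ok"] for entry in log), log


package_bytes = build_package({"article.xml": b"<article/>", "fig1.png": b"\x89PNG"})
ok, log = verify_package(package_bytes)
print(ok, log)
```

The returned log plays the role of the transaction log that the external system fetches afterwards, so a failed checksum can be traced to a specific file.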
Questions:
- OAIS compliance of OJS?
- Can the API be used to trigger asynchronous processes?
- Are exchange mechanisms other than a REST API more suitable?
Requirements to do:
For variant 2, find a suitable message protocol that is generic enough to be used between any external system and OJS (e.g. SOAP and WSDL?). Should OJS prescribe the protocol, with other services having to comply with it?
Variants 1 and 2: Develop requirements for the plugin to configure an (abstract) external service (REST API endpoints, message protocol, API keys or OAuth2).
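As a starting point for these requirements, the settings such a plugin would need can be written down as a small data structure with validation. This is only a sketch: `ExternalServiceConfig`, its field names, and the `example.org` URL are hypothetical and do not correspond to any existing OJS plugin setting or to the Circé API.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class ExternalServiceConfig:
    """Hypothetical settings for one abstract external service."""

    name: str
    endpoint_url: str
    message_protocol: str = "rest"
    auth_method: str = "api_key"  # "api_key" or "oauth2"
    api_key: Optional[str] = None
    oauth2_token_url: Optional[str] = None

    def validation_errors(self):
        """Return a list of human-readable problems; empty means usable."""
        errors = []
        if urlparse(self.endpoint_url).scheme not in ("http", "https"):
            errors.append("endpoint_url must be an http(s) URL")
        if self.auth_method not in ("api_key", "oauth2"):
            errors.append("unknown auth_method")
        if self.auth_method == "api_key" and not self.api_key:
            errors.append("api_key is required when auth_method is 'api_key'")
        if self.auth_method == "oauth2" and not self.oauth2_token_url:
            errors.append("oauth2_token_url is required when auth_method is 'oauth2'")
        return errors


config = ExternalServiceConfig(
    name="transformation-service",
    endpoint_url="https://example.org/api",  # placeholder, not a real service
    api_key="secret",
)
print(config.validation_errors())
```

Returning a list of messages rather than raising on the first problem suits a settings form, where an editor should see every issue at once.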
Next Steps for Métopes in their project
- Finalizing the TEI Commons to JATS transformation (October 2023).
- Consider connections with Circé pipelines.
- More advanced testing and implementation of xml2html_Pkp_plugin (TEI for OJS and BITS for OMP).
- Storing the HTML page in the database.
- Inclusion in Métopes v. 3.1 and Métopes 4.
- Manage the economy of metadata circulation between the workflow management system and the source document.
- Validation of metadata quality.
- Connection to an online XML validator-editor.
Métopes Integration
- Move the Saxon parser to the central configuration of OJS.
- Validate TEI files and JATS XML files.
- Identify the content type when loading XML files.
- JATS to TEI.
- Rename the plugin to a more user-friendly name.
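Identifying the content type when loading a file can often be done from the root element alone. The following is a minimal sketch, assuming TEI files use a root `TEI` element in the TEI namespace, JATS files a root `article` element, and BITS files a root `book` element. The function name `identify_content_type` and its return labels are hypothetical, and a production check would also validate against the schema.

```python
import xml.etree.ElementTree as ET

TEI_NAMESPACE = "http://www.tei-c.org/ns/1.0"


def identify_content_type(xml_text):
    """Guess the vocabulary of an XML document from its root element."""
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{TEI_NAMESPACE}}}TEI":
        return "tei"
    if root.tag == "article":
        return "jats"
    if root.tag == "book":
        return "bits"
    return "unknown"


print(identify_content_type('<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader/></TEI>'))
print(identify_content_type('<article dtd-version="1.2"><front/></article>'))
```

Routing on this result would let a single upload field accept several vocabularies and choose the matching transformation.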
References
https://www.balisage.net/Proceedings/vol26/print/Imsieke01/BalisageVol26-Imsieke01.html