Article Production in PKP Software: Directions and Plans

By parth sarin

This post highlights production changes and plans for PKP software versions 3.5 and 3.6 that will lower the cost of publishing, increase accessibility, and support new publishing models. Note that while this post focuses on OJS, all PKP software (OJS, OMP, OPS) is built upon a shared codebase, which means some of the changes and plans also apply to OMP or OPS.  

Open Journal Systems (OJS) is designed to give journals full control over the file formats they accept and publish. Current versions of OJS allow journals to manage a wide variety of file types, from word processing documents to LaTeX manuscripts to multimedia files.

By not intervening in the content of the files uploaded, OJS has given journals maximum flexibility in what files they accept and publish. However, this approach has also limited our ability to develop features that could lower the cost of publishing, increase accessibility, and push forward new publishing models.

Over the last year, we have been working with members of the PKP community to develop a new workflow for article production in formats that better serve the needs of journals using OJS. We have received feedback from hundreds of people at sprints (e.g., the sprint in Turin, Italy), at regular convenings of a working group focused on article production, and through public comments at webinars.

Now, we are pleased to announce our plans for article publishing in OJS and other PKP applications: OJS will begin to directly support the production of the most common publishing formats, namely HTML, PDF, and JATS XML.

To enable this, we will make three sets of changes that will independently benefit journals, preprint servers, and presses. When brought together, these changes will enable low-cost, accessible article production.

Preparing for Production

Most publishers need to move content between systems. In OJS, by contrast, the publishing front end is connected to the manuscript management functions. This allows some of the production steps to take place at any stage in the editorial workflow, including right at the submission stage.

Leading up to production, OJS will facilitate the following:

➡️ Capturing metadata: OJS 3.6 will prompt authors to upload their manuscript as the first step of submission, and the system will support tools that automatically pre-fill metadata. For example, for journals that can deploy or query GROBID, there are OJS plugins that import metadata from its output into the submission.

➡️ Editing content: The main contents of the article (i.e., without the title page or any additional information) can be extracted from the author submission into the new editor within OJS. This tool will allow for editing an article in a minimal HTML format and will support the standard elements of a research article (headings, tables and figures with captions and notes, footnotes, lists, references, and so on).

➡️ Processing references: The article’s references will be entered separately from the main article file and stored as their own piece of metadata (there is already a field for this in OJS). Each reference will be processed by the OJS Citation Manager (currently a plugin, but planned for inclusion in a later OJS 3.5 release, not 3.5.0), and if it includes a recognizable DOI or accession number, additional citation metadata will be populated from OpenAlex, ORCID, or Wikidata.
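To make the reference-processing step more concrete, here is a minimal Python sketch of how a raw reference string might be checked for a recognizable DOI before any metadata lookup. It is not the OJS Citation Manager's code: the function names `extract_doi` and `partition_references` are hypothetical, and the pattern is a common DOI heuristic assumed for illustration only.

```python
import re
from typing import List, Optional, Tuple

# Heuristic DOI pattern (an assumption for illustration, not the rule OJS uses):
# the "10." prefix, a 4-9 digit registrant code, a slash, then the suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_doi(reference: str) -> Optional[str]:
    """Return the first DOI-like string found in a raw reference, or None."""
    match = DOI_PATTERN.search(reference)
    if match is None:
        return None
    # Trailing punctuation usually belongs to the sentence, not the DOI.
    # This is a heuristic and can clip a DOI that really ends in ")".
    return match.group(0).rstrip(".,;)")


def partition_references(
    references: List[str],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split references into (reference, doi) pairs and references with no DOI."""
    with_doi: List[Tuple[str, str]] = []
    without_doi: List[str] = []
    for reference in references:
        doi = extract_doi(reference)
        if doi is not None:
            with_doi.append((reference, doi))
        else:
            without_doi.append(reference)
    return with_doi, without_doi


if __name__ == "__main__":
    refs = [
        "Smith, J. (2020). An example article. Example Journal, 1(2). "
        "https://doi.org/10.1234/abcd.5678.",
        "Doe, A. (2019). A book with no identifier. Example Press.",
    ]
    enrichable, plain = partition_references(refs)
    print(enrichable)
    print(plain)
```

In a sketch like this, only the references in the first list would be candidates for enrichment from an external source; the rest would stay as the author entered them.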

Production

Once the content and metadata have been reviewed and finalized in their respective fields in the system, the production of the various formats will rely on a series of export tools that automatically transform the already structured content using templates, layouts, and stylesheets.

➡️ Just-in-time exports: OJS 3.6 will support on-the-fly exports to JATS XML, HTML, and PDF without any manual intervention. This ensures that any updates or corrections to metadata or content are reflected across all formats immediately.

➡️ Multiple formats: Different versions of the same article can be produced, not just using different stylesheets, but also with different templates. For example, HTML versions of articles could be displayed within the main OJS structure (e.g., below the abstracts) or opened as a standalone page.

➡️ Easier multimedia embedding: Images, audio, and video can be embedded directly within an article's HTML structure, instead of being linked as external resources.

➡️ Markup fidelity: The editor can produce marked-up article content at a level of granularity determined by our analysis of existing JATS XML documents published by Open Research Europe, Coalition Publica, Redalyc, SciELO, and Consejo Superior de Investigaciones Científicas, as well as the requirements for inclusion in widely used indexes like PubMed Central. We will publish the precise XML schema supported by the editor prior to the release of OJS 3.6.

➡️ Greater accessibility: Giving OJS access to the contents of articles allows us to ensure that export formats follow accessibility best practices for users with a wide range of needs and abilities.

➡️ Built with open source tools: As with everything at PKP, all components built into OJS, such as the HTML editor, will be built using open source software so that they can be sustainably maintained and improved upon.

➡️ Optimized for integrations: By creating a central, standard way of storing metadata, content, and references, OJS will be able to offer access to these components to plugins and additional tools, further optimizing and improving production. Third-party integrations have always been a strength of OJS, and the production workflow will be no different.

We expect that this production workflow will immediately lower costs for those using third-party services (e.g., outsourcing) and will greatly improve quality (e.g., aesthetics, accessibility, and indexing) for the large majority of journals that today publish only simple PDFs (e.g., printed from a Word document).
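The idea behind just-in-time exports can be sketched in a few lines of Python: one structured record is the single source, and every format is rendered from it on request, so a correction shows up everywhere at once. The `Article` and `Section` classes and the `to_html` and `to_jats` functions below are hypothetical names chosen for illustration, not part of OJS, and the XML produced is only a JATS-like fragment (a valid JATS document needs more, such as journal metadata).

```python
from dataclasses import dataclass, field
from html import escape
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class Section:
    heading: str
    paragraphs: List[str]


@dataclass
class Article:
    """Hypothetical stand-in for the structured content stored in the system."""
    title: str
    sections: List[Section] = field(default_factory=list)


def to_html(article: Article) -> str:
    """Render the record as a minimal HTML fragment."""
    parts = [f"<h1>{escape(article.title)}</h1>"]
    for section in article.sections:
        parts.append(f"<h2>{escape(section.heading)}</h2>")
        parts.extend(f"<p>{escape(p)}</p>" for p in section.paragraphs)
    return "\n".join(parts)


def to_jats(article: Article) -> str:
    """Render the same record as a JATS-like XML fragment (not schema-valid)."""
    root = ET.Element("article")
    meta = ET.SubElement(ET.SubElement(root, "front"), "article-meta")
    title_group = ET.SubElement(meta, "title-group")
    ET.SubElement(title_group, "article-title").text = article.title
    body = ET.SubElement(root, "body")
    for section in article.sections:
        sec = ET.SubElement(body, "sec")
        ET.SubElement(sec, "title").text = section.heading
        for paragraph in section.paragraphs:
            ET.SubElement(sec, "p").text = paragraph
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    article = Article("Draft title", [Section("Introduction", ["First paragraph."])])
    article.title = "Corrected title"  # one correction in the source record...
    print(to_html(article))            # ...appears in the HTML export
    print(to_jats(article))            # ...and in the XML export
```

Because nothing is cached here, there is no separate galley file to regenerate after a correction. A real system would likely add caching and a PDF renderer on top of the same record.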

Production Workflow Diagram

The following diagram illustrates the planned production workflow for OJS 3.6:

Design Decisions

There are a number of important design decisions for this plan:

➡️ Bring your own JATS XML: Using the new JATS XML home, introduced in OJS 3.5, users can upload any XML markup they produce using their own tools or a third-party vendor. Key metadata fields in the uploaded XML can be compared against the metadata in OJS and, with some limitations, the article body content can be imported into the editor.

➡️ “Minimal HTML” as an intermediary: The final galleys are produced from a “minimal HTML” representation of the article that will be editable inside the article editor. This offers a simpler, less detailed route to HTML and PDF production, so journals that do not need JATS are not required to produce it.

➡️ Interoperable with OPS: Bringing production tools to OJS means that the same tools will be made available for Open Preprint Systems (OPS). This is crucial, as few preprint servers today support posting the HTML or XML versions of a manuscript alongside the author-supplied PDF file.

➡️ Swappable stylesheets and templates: The final export to HTML and PDF involves placing the minimal HTML representation of an article into a template and adding a stylesheet. One of the benefits of this approach is that these galleys can automatically adopt the fonts and colors of the journal, and further customization can be added.
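As a rough illustration of the swappable template and stylesheet idea, the Python sketch below places a minimal HTML fragment into a template and adds a stylesheet built from a journal's font and color. Everything in it is an assumption made for the example: the template markup, the names `PAGE_TEMPLATE`, `EMBEDDED_TEMPLATE`, `journal_stylesheet`, and `render_galley`, and the sample font and color values are hypothetical and do not describe how OJS implements its exports.

```python
from html import escape
from string import Template

# Hypothetical standalone-page template.
PAGE_TEMPLATE = Template("""<!DOCTYPE html>
<html lang="$lang">
<head>
<meta charset="utf-8">
<title>$title</title>
<style>$stylesheet</style>
</head>
<body>
<article>
$body
</article>
</body>
</html>""")

# Hypothetical template for showing the article inside an existing page.
EMBEDDED_TEMPLATE = Template("""<section class="article-fulltext">
<style>$stylesheet</style>
$body
</section>""")


def journal_stylesheet(font_family: str, accent_color: str) -> str:
    """Build a tiny stylesheet from a journal's font and accent color."""
    return (
        f"body {{ font-family: {font_family}; }} "
        f"h1, h2 {{ color: {accent_color}; }}"
    )


def render_galley(
    minimal_html: str,
    title: str,
    stylesheet: str,
    template: Template = PAGE_TEMPLATE,
    lang: str = "en",
) -> str:
    """Place the minimal HTML into the chosen template with the chosen stylesheet."""
    # The body is assumed to be trusted, already-sanitized editor output.
    return template.substitute(
        lang=lang, title=escape(title), stylesheet=stylesheet, body=minimal_html
    )


if __name__ == "__main__":
    body = "<h1>Example article</h1>\n<p>First paragraph.</p>"
    css = journal_stylesheet("Georgia, serif", "#004466")
    print(render_galley(body, "Example article", css))
    print(render_galley(body, "Example article", css, template=EMBEDDED_TEMPLATE))
```

The same article body is reused unchanged in both calls; only the template differs, which is the property that makes a standalone page and an embedded view possible from one source.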

❓ If you have any technical questions related to these changes and plans, please post in the PKP Community Forum so everyone can benefit.