PKP Turin 2024 Sprint notes: Typesetting

By PKP Turin Sprint Working Group “Typesetting” & PKP Communications
Typesetting working group at the PKP Turin Sprint 2024

The Typesetting summary from the PKP Turin Sprint, hosted by the CRAFT-OA project in October 2024, is now available.

Sprints involve PKP community members joining diverse groups to work on PKP software and support. In October, the CRAFT-OA project and the University of Turin hosted eight working groups at the PKP Turin Sprint. This is a summary of one such group’s work.

Abstract

The session aimed to assess existing tools and workflows used in OJS to improve document handling, particularly for JATS XML and HTML generation. The focus was on balancing technical improvements with accessibility for authors across different regions and skill levels. The main objective was to co-create one or two workflows adaptable to OJS and identify compatible tools to support XML and HTML document types, improving accessibility and ease of use.

Building on the morning session, the discussion focused on workflow adjustments and the potential impact of integrating document editing tools directly in OJS. The group aimed to outline a workflow where tools like FidusWriter could improve the editorial process, automate HTML generation, and support JATS XML output.

Working Group members

Background

Group 1: (Patricia, Erwan, Davin, Jan, Denis)

  • Metopes: This step operates independently of OJS, with a Word-like interface where styles and tags can be applied. Files are imported to an XML editor, generating TEI XML. However, this process is quite demanding for editors.
  • Editor Type and Previewing Outputs: Are we considering an HTML or XML editor for OJS? Templates within OJS could streamline this. It’s also important to think through how outputs are previewed—JATS XML editors may be too complex for editors, so finding an intuitive solution will be key.

JM Workflow Example:

  • Word files are exchanged during the review.
  • Copyediting takes place in Google Docs once accepted.
  • Files are converted to Markdown with Pandoc and cleaned up manually, then converted to JATS XML. XSLT is used to streamline the XML cleanup for PDF generation.
  • Final files (XML, HTML, PDF) are uploaded to OJS, which is challenging because OJS 3.4+ doesn’t support an XML viewer.
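
The conversion chain in this example can be sketched as a sequence of command lines. This is a minimal sketch, not the actual JM scripts: it assumes the copyedited text is exported as a .docx file, that the `pandoc` and `xsltproc` command-line tools are installed, and that a journal-specific stylesheet exists (the name `cleanup.xsl` and all file names are hypothetical). The function only builds the commands, so the manual Markdown cleanup can happen between steps.

```python
from pathlib import Path


def build_pipeline(docx: str, stylesheet: str = "cleanup.xsl") -> list[list[str]]:
    """Return command lines for a docx -> Markdown -> JATS -> cleaned JATS chain.

    Assumptions (not from the sprint notes): the input is a .docx export,
    and `cleanup.xsl` is a hypothetical journal-specific stylesheet.
    """
    stem = Path(docx).with_suffix("")
    markdown = f"{stem}.md"
    jats = f"{stem}.xml"
    cleaned = f"{stem}.clean.xml"
    return [
        # Step 1: Word to Markdown; manual cleanup of the .md file follows.
        ["pandoc", "-f", "docx", "-t", "markdown", "-o", markdown, docx],
        # Step 2: cleaned Markdown to standalone JATS XML.
        ["pandoc", "-f", "markdown", "-t", "jats", "-s", "-o", jats, markdown],
        # Step 3: apply the XSLT cleanup before PDF generation.
        ["xsltproc", "-o", cleaned, stylesheet, jats],
    ]
```

Each returned command can be run with `subprocess.run(cmd, check=True)` once the previous step's output has been reviewed.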

Early Conversion Considerations:

  • Getting authors involved in editing early on may require additional training or support, as some may not be comfortable with these tools. Kotahi, for instance, allows direct author edits.

XML Pipeline Options:

  • Transpect (open-source, with paid setup) offers a configuration cascade that could support our needs. The timing of XML validation will also be important, with acceptance as the ideal point. Using an HTML editor in OJS would require highly structured HTML styles to meet our needs.
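
To illustrate what a check at acceptance might look like, here is a minimal sketch using only the Python standard library. It is not Transpect and not full JATS validation: it tests well-formedness and the presence of the top-level `<article>`, `<front>`, and `<body>` elements. Requiring `<body>` is an assumed house rule, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def check_jats_skeleton(xml_text: str) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed.

    Only well-formedness and the top-level skeleton are tested.
    This is not DTD or schema validation.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    problems = []
    if root.tag != "article":
        problems.append(f"root element is <{root.tag}>, expected <article>")
    # Requiring <body> is an assumed house rule for this sketch.
    for child in ("front", "body"):
        if root.find(child) is None:
            problems.append(f"missing <{child}>")
    return problems
```

A real pipeline would follow this with validation against the JATS DTD or schema in a dedicated XML toolchain.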

Group 2: (Joao, Alec, Cecilia, Rino, Piero, …)

References can be challenging, as they often cross between body text and metadata. Not everyone in our group is currently working with JATS, but there’s strong interest in tools like FidusWriter.

We’re currently outsourcing a lot of the formatting and conversion work, with tools like Pandoc and Unoserver used to convert between formats. However, we’ve had mixed experiences with HTML-to-PDF conversions, especially with references, footnotes, images, and captions. It’s a reminder to approach HTML-to-PDF conversion carefully, as it’s not always straightforward.

Finally, there’s a noted concern about changing the tools available to authors, so keeping their experience in mind will be essential as we move forward.

Group 3: (John, Martin, Nathalie, Hanna, Kay)

  • Online Editors Across Disciplines: Online editors could serve many disciplines well, provided they meet specific needs. Ideally, the editor would hide JATS as much as possible, with options to view or edit it depending on user role (similar to HTML editors).
  • Support Questions: Once users log into an OJS environment, support questions often become editor-related (“How do I make things bold?” or “Why did my text disappear?”), a challenge less common with offline tools like Word.
  • Learning from Other Tools: It could be valuable to ask the creators of similar editing tools, like CLARIAH and their data stories editor, about their experiences with these kinds of challenges. (As a note, we’ve already been in touch with CLARIAH about a potential tool for editing JATS XML into enriched HTML – thanks, Kay!)
  • User-Friendly Experience: The goal is to create an editing experience similar to Google Docs or Overleaf that feels intuitive and familiar.
  • Extensibility with Plugins: The editor should support plugins for specialized tasks, such as creating MathML for mathematical formulas or CML for chemical drawings that can embed into JATS.
  • Exploring Existing JATS-XML Workflows: It might be helpful to look into established JATS-XML workflows, like JATS XML at srce.hr and JATS XML Converter Service Guidelines and Tutorial.
  • Peer Review in a Live Editor: For live editor peer review, we could consider allowing annotations or adding a form specifically for peer review questions.
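
As a concrete picture of what a formula plugin would need to produce, the sketch below builds a JATS `<inline-formula>` containing MathML, using the Python standard library. It shows only the target markup, not how an editor plugin would be implemented; the function name and the sample sentence are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The W3C MathML namespace, conventionally bound to the "mml" prefix in JATS.
MML = "http://www.w3.org/1998/Math/MathML"
ET.register_namespace("mml", MML)


def inline_formula(identifier: str) -> ET.Element:
    """Build a JATS <inline-formula> holding a one-identifier MathML expression."""
    formula = ET.Element("inline-formula")
    math = ET.SubElement(formula, f"{{{MML}}}math")
    mi = ET.SubElement(math, f"{{{MML}}}mi")
    mi.text = identifier
    return formula


# Embed the formula in a paragraph, with running text before and after it.
paragraph = ET.Element("p")
paragraph.text = "Let "
formula = inline_formula("n")
formula.tail = " be the sample size."
paragraph.append(formula)
serialized = ET.tostring(paragraph, encoding="unicode")
print(serialized)
```

Hiding this markup behind a formula widget, while keeping it available to users with the right role, is the kind of separation the group described for the editor.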

Goals

Sprint: Co-create one or two workflows that integrate seamlessly with OJS, identifying compatible tools and where specific settings should be located.

Overall PKP Goal:

  • Enable JATS XML Export (for PubMed).
  • Display and Generate HTML: Determine the best stage for HTML generation, whether at submission, after copyediting, or at another point in the workflow.
  • File Conversion: Facilitate conversion of files from Word, LaTeX, and PDF formats to HTML.
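
One way to picture the file conversion goal is a small helper that picks a Pandoc reader from the file extension. This is a sketch under assumptions, not a PKP design: the function and mapping names are hypothetical, and it assumes `pandoc` is installed. Pandoc has no PDF reader, so PDF input is rejected here and would need a different tool.

```python
from pathlib import Path

# pandoc input formats keyed by file extension. PDF is absent because
# pandoc cannot read PDF, so that case needs a different tool.
PANDOC_INPUT = {".docx": "docx", ".tex": "latex"}


def html_command(source: str) -> list[str]:
    """Build a pandoc command that converts `source` to standalone HTML."""
    suffix = Path(source).suffix.lower()
    if suffix not in PANDOC_INPUT:
        raise ValueError(f"no pandoc reader assumed for {suffix or 'this file'}")
    target = str(Path(source).with_suffix(".html"))
    return ["pandoc", "-f", PANDOC_INPUT[suffix], "-t", "html", "-s", "-o", target, source]
```

For example, `html_command("paper.tex")` yields a command that writes `paper.html`, while `html_command("scan.pdf")` raises `ValueError`.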

Results

Discussion Summary:

  • Author Culture Considerations: There is a general concern about changing the tools that authors use, as many are comfortable with current practices.
  • Workflows: We discussed both existing and ideal workflows, focusing on how typesetting, HTML, and XML generation could take place within OJS.
    • Current Workflow (John/Alec):
      • Submission: Authors submit in Word format.
      • Review: Reviews are conducted using Word or HTML generated from the Word file.
      • Revisions: Reviewer comments and revisions are fed back into the submission and can be regenerated into HTML as needed.
      • HTML Generation: A WYSIWYG viewer would be ideal for converting and displaying HTML content, with OJS maintaining a full audit trail of all actions.
    • In-Text Editor Access: The in-text editor would be restricted to editors only.
  • Tools and Challenges:
    • We highlighted the tools currently in use, challenges with existing workflows, and some of the tricky aspects of converting HTML to PDF, especially with references, footnotes, images, and captions. It’s important not to underestimate the complexity of HTML-to-PDF conversion.
    • Several additional questions were raised during the discussion and will need further consideration.

Key Learnings:

  • Many journals within this community do not use the copyediting stage.
  • We identified additional tools currently used by community members that may be worth considering.
  • Many authors within the OJS community still prefer working with Word.
  • Wishlist: An author invitation process could be beneficial, potentially building on the GDPR improvements expected in version 3.5.

Proposed workflow

Submission: Authors submit their work in Word format.

Review: Reviews are conducted either in Word or through generated HTML.

Revisions: Reviewer comments and revisions are incorporated back into the submission, with the option to regenerate the content in HTML.

HTML Generation: A WYSIWYG HTML generator/converter is used, allowing for easy viewing and editing.

Audit Trail: OJS maintains a complete audit trail of all actions to ensure transparency and traceability.

In-Text Editor Access: The in-text editor feature would be available only to editors, providing them with specialized editing capabilities.
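
The proposed workflow can be sketched as a small data model to make the stages, the audit trail, and the editor-only restriction concrete. This is an illustration under assumptions, not OJS code: all class, method, and role names are hypothetical, and the stages are treated as strictly linear even though revisions may in practice loop back to review.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Stage(Enum):
    SUBMISSION = 1
    REVIEW = 2
    REVISIONS = 3
    HTML_GENERATION = 4


@dataclass
class Submission:
    title: str
    stage: Stage = Stage.SUBMISSION
    # Each entry is (UTC timestamp, user, action).
    audit_trail: list[tuple[str, str, str]] = field(default_factory=list)

    def _log(self, user: str, action: str) -> None:
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_trail.append((stamp, user, action))

    def advance(self, user: str) -> None:
        """Move to the next stage and record who did it."""
        if self.stage is Stage.HTML_GENERATION:
            raise ValueError("already at the final stage")
        self.stage = Stage(self.stage.value + 1)
        self._log(user, f"advanced to {self.stage.name}")

    def edit_text(self, user: str, role: str) -> None:
        """In-text editing is limited to the editor role; denials are logged too."""
        if role != "editor":
            self._log(user, "edit denied")
            raise PermissionError("in-text editor is restricted to editors")
        self._log(user, "edited text")
```

Logging denied edits alongside successful ones keeps the trail complete, which matches the transparency goal stated above.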

Next Steps

Devika will review the notes and generate a proposed workflow based on today’s discussion.

Thank you

We once again thank the Sprint host institutions and all participants for their valuable contributions to the PKP user community. Special thanks to the CRAFT-OA Project and the University of Turin for their support and partnership.