PKP XML Publishing Roadmap & ePub format

Post by vlilloh » Mon Aug 17, 2009 9:49 am

Hi PKP team,

ePub seems more established than ever as the standard format for electronic publishing (International Digital Publishing Forum, Adobe, Sony, ...).

I don't see any references to this format anywhere on the PKP website. Will PKP have a policy on this? ePub for scholarly documents?

In the PKP Wiki I read:
The goal is to have our software ingest XML (initially NLM Journal Publishing Tag Set Version 2.3 with Docbook, TEI and Erudit to follow) and render/display HTML and PDF.

Are you still thinking about Docbook, TEI and Erudit?

Do you know of any NLM to ePub conversion tool? For example, I think there are XSLT style sheets to convert DocBook XML books and articles to EPUBs.

I ask all this because, as a digital editor, I am still hesitating between one format and another. To make a final decision, I need to know the following for sure:
1) Support from PKP software.
2) That the publication adapts to any device: computers, smartphones, ebook readers (Kindle, Sony,...), the future iPad... :)

What do you recommend, looking to the future?

Re: PKP XML Publishing Roadmap & ePub format

Post by jmacgreg » Wed Aug 19, 2009 11:16 am

Hi Vicente,

Some good questions! I'm not currently aware of any XSLT that converts NLM to ePub, although I am somewhat familiar with the Docbook XSLT, which does indeed contain ePub XSLT.

We haven't been concentrating as much as we'd like on the XML conversion stuff written about on the wiki, but we'll get there eventually. I would say that if you are currently working with NLM, keep on doing so; and if you haven't yet made up your mind, NLM is probably the best way to go. When we do update the PKP software suite's XML functionality, NLM will be the backbone of pretty much everything (to put it simply). Any other 'supported' XML formats (Docbook; TEI; Erudit; etc.) will be converted into NLM upon ingest, and the document will remain NLM XML as it works its way through the system.

NLM can (and will) be converted into HTML and/or PDF by OJS, depending on server capabilities. (You can already do this by using the existing XML Galleys plugin.) I'm not particularly familiar with eBook readers, but most should handle either one of these formats easily (the Kindle DX now supports PDF without conversion, for example). However, if you do manage to convert your documents into ePub format, you can upload those as a galley just like any other file.

Re: PKP XML Publishing Roadmap & ePub format

Post by vlilloh » Thu Aug 20, 2009 1:44 am

Hi James,

Thank you very much for your answers, as always :)

Would the ideal workflow today be the following?
- DOC, ODT, ... to NLM (with support from Lemon8), and NLM to PDF & HTML (with OxS)
- Optionally, DOC to ePub (not much work with the right tools)
Anything to correct?

That would be great for journal and conference documents, but what would happen with monographs and OMP support?
What workflow is recommended to have full OMP software support in the future? NLM for monographs too? What about non-academic documents such as novels? I'm a bit lost here.

And one last question. What do you think about DITA? Do you think you might eventually work with documents like this?

Re: PKP XML Publishing Roadmap & ePub format

Post by mj » Thu Sep 10, 2009 5:42 am

Hi Vicente,

A couple of comments to follow on those from James, as well as your questions:

The main reason why we're focusing around the NLM XML format specifically, is that it is heavily semantically-oriented -- that is, it is intended to represent the meaning and structure of a document in detail (for example, complex citation information), rather than just its layout/appearance. This is why the workflows you see use NLM as a source for generating layout representations like PDF and HTML. ePub (at least, OPS) falls more closely toward the latter; basically being a variant on XHTML and some embedded CSS-type rules. The more recent OPF packaging standard does provide some facility for use of item-level metadata using Dublin Core, but it's still quite basic.
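To illustrate how basic the item-level metadata is, here is a minimal sketch in Python that builds an OPF 2.0-style package document with a few Dublin Core elements. The function name `build_opf`, the file name `article.xhtml` and the `pub-id` identifier are assumptions made up for the example, and the result is not a complete, valid package (a real ePub 2 package would also need an NCX table of contents referenced from the spine).

```python
import xml.etree.ElementTree as ET

# Namespaces used by OPF 2.0 package documents and Dublin Core elements.
OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("", OPF_NS)
ET.register_namespace("dc", DC_NS)


def build_opf(title, identifier, language, creator=None):
    """Return a minimal OPF-style package document as a string.

    Hypothetical helper for illustration only; names and ids are assumptions.
    """
    package = ET.Element(
        f"{{{OPF_NS}}}package",
        {"version": "2.0", "unique-identifier": "pub-id"},
    )

    # Item-level metadata: plain Dublin Core elements, nothing richer.
    metadata = ET.SubElement(package, f"{{{OPF_NS}}}metadata")
    ET.SubElement(metadata, f"{{{DC_NS}}}title").text = title
    ident = ET.SubElement(metadata, f"{{{DC_NS}}}identifier", {"id": "pub-id"})
    ident.text = identifier
    ET.SubElement(metadata, f"{{{DC_NS}}}language").text = language
    if creator:
        ET.SubElement(metadata, f"{{{DC_NS}}}creator").text = creator

    # Manifest lists the files in the package; spine gives the reading order.
    manifest = ET.SubElement(package, f"{{{OPF_NS}}}manifest")
    ET.SubElement(
        manifest,
        f"{{{OPF_NS}}}item",
        {"id": "article", "href": "article.xhtml",
         "media-type": "application/xhtml+xml"},
    )
    spine = ET.SubElement(package, f"{{{OPF_NS}}}spine")
    ET.SubElement(spine, f"{{{OPF_NS}}}itemref", {"idref": "article"})

    return ET.tostring(package, encoding="unicode")
```

Calling `build_opf("Sample Article", "urn:uuid:0000", "en")` gives a title, an identifier and a language, and not much more. Compare that with the detailed citation and structural markup NLM carries.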

That said, some workflows you're more likely to see would include:

  • NLM --> ePub
  • ePub --> PDF (probably already there now)
  • ePub --> HTML (likely similar to how Kindle-type devices render ePub presently)

So, if ePub is sufficient for your purposes, then it's a fine format to use -- but our core work will likely continue around NLM. That said, ePub is definitely an important format for OMP in particular (since it's designed for monographs and non-journal type documents), so to answer your second question, I think an "ideal" workflow would be something like:

DOC/ODT --> NLM --> ePub --> HTML/PDF
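
As a rough idea of what the NLM --> ePub step involves, here is a small Python sketch that turns a very simple NLM-style article into the kind of XHTML that ePub content documents are built from. It is only an illustration under stated assumptions: the function name `nlm_to_xhtml` is made up, only `article-title`, `sec`, `title` and `p` are handled, inline markup is flattened to plain text, and a real conversion would more likely be done with XSLT.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
ET.register_namespace("", XHTML_NS)


def nlm_to_xhtml(nlm_xml):
    """Convert a minimal NLM-style article string to an XHTML string.

    Hypothetical helper for illustration; covers only titles and paragraphs.
    """
    article = ET.fromstring(nlm_xml)
    title = article.findtext(".//article-title", default="Untitled")

    html = ET.Element(f"{{{XHTML_NS}}}html")
    head = ET.SubElement(html, f"{{{XHTML_NS}}}head")
    ET.SubElement(head, f"{{{XHTML_NS}}}title").text = title
    body = ET.SubElement(html, f"{{{XHTML_NS}}}body")
    ET.SubElement(body, f"{{{XHTML_NS}}}h1").text = title

    # Each top-level <sec> becomes an optional <h2> plus its paragraphs.
    for sec in article.findall("./body/sec"):
        heading = sec.findtext("title")
        if heading:
            ET.SubElement(body, f"{{{XHTML_NS}}}h2").text = heading
        for p in sec.findall("p"):
            # Flatten inline markup (e.g. <italic>) to plain text.
            ET.SubElement(body, f"{{{XHTML_NS}}}p").text = "".join(p.itertext())

    return ET.tostring(html, encoding="unicode")
```

The semantic detail in the NLM source (citations, structured metadata) is exactly what gets lost in a step like this, which is why NLM sits upstream of ePub in the workflow.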

Lastly, regarding DITA, it looks like a much richer standard for electronic documents, and may be something we'll consider working with in the future, but it's a relatively new standard and since it doesn't have much adoption as yet, our limited efforts are better directed toward what's in use currently (being NLM and ePub).

Hope this helps,

