PKP Support Forum

TEI?

For discussion of ideas, functional requirements, interests regarding the Open Monograph Press.

Moderators: jmacgreg, John

Post by lhumble » Fri Jun 17, 2011 2:48 pm

Hi there,

I just attended my first DHSI conference and am all jazzed up about TEI markup. Very interested in the use of TEI for scholarly publishing.

Will OMP accommodate this type of XML file?
lhumble
 
Posts: 3
Joined: Sun Apr 10, 2011 8:50 pm

Re: TEI?

Post by JasonNugent » Mon May 21, 2012 12:01 pm

Hi lhumble,

OMP currently supports OAI, which can provide different types of document payloads via crosswalks. We are also including basic support for ONIX 3.0, a very rich metadata format for the exchange of book information, used by many resellers such as Amazon.

To answer your question: at first, our OAI export will support Dublin Core. There will probably be more work in this area eventually, though, since OJS supports other formats.
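
As a rough illustration of what a Dublin Core payload over OAI looks like from a harvester's side, here is a minimal sketch. It assumes a standard OAI-PMH endpoint; the base URL, the sample record, and the helper names `list_records_url` and `dc_fields` are hypothetical and not taken from OMP itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the unqualified Dublin Core elements carried in an oai_dc record.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample record; a real one would come from the repository.
SAMPLE_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Example Monograph</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:type>Book</dc:type>
</oai_dc:dc>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request for one metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def dc_fields(xml_text):
    """Collect the Dublin Core elements of one record into a dict of lists."""
    root = ET.fromstring(xml_text)
    fields = {}
    for el in root:
        if el.tag.startswith("{" + DC_NS + "}"):
            name = el.tag.split("}", 1)[1]
            fields.setdefault(name, []).append((el.text or "").strip())
    return fields


if __name__ == "__main__":
    # The host below is a placeholder, not a real OMP installation.
    print(list_records_url("https://press.example.org/oai"))
    print(dc_fields(SAMPLE_RECORD))
```

A crosswalk to another format would, in this picture, simply be a different `metadataPrefix` served from the same records.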

Regards,
Jason
JasonNugent
Site Admin
 
Posts: 848
Joined: Tue Jan 10, 2006 6:20 am

Re: TEI?

Post by springday » Fri Sep 07, 2012 1:52 pm

I think TEI is not a markup format for metadata, like OAI or ONIX, but a markup format for book content, so it's more comparable to DocBook or DITA. It would be great if OMP supported all three, because if book content were saved in those formats, the software could handle automatic transformations into all kinds of other formats (PDF via XSL-FO, EPUB and .docx via XSLT).
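
To make the idea of transforming semantic content concrete, here is a minimal sketch that turns a tiny TEI fragment into HTML. In practice this would be an XSLT stylesheet; the Python below is only a stand-in using the standard library, and the sample fragment, the `TAG_MAP` table, and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical TEI fragment; real documents carry a teiHeader and much more.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><div>
    <head>Chapter One</head>
    <p>Semantic markup records <hi rend="italic">what</hi> a passage is.</p>
  </div></body></text>
</TEI>"""

# Block-level TEI elements and the HTML elements they are rendered as.
TAG_MAP = {"head": "h1", "p": "p"}


def render_inline(el):
    """Render an element's text, turning <hi rend="italic"> into <em>."""
    parts = [escape(el.text or "")]
    for child in el:
        inner = render_inline(child)
        if child.tag == TEI_NS + "hi" and child.get("rend") == "italic":
            inner = f"<em>{inner}</em>"
        parts.append(inner)
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def tei_to_html(tei_text):
    """Emit one HTML block per mapped TEI element, in document order."""
    root = ET.fromstring(tei_text)
    out = []
    for el in root.iter():
        local = el.tag.replace(TEI_NS, "")
        if local in TAG_MAP:
            html_tag = TAG_MAP[local]
            out.append(f"<{html_tag}>{render_inline(el)}</{html_tag}>")
    return "\n".join(out)


if __name__ == "__main__":
    print(tei_to_html(SAMPLE_TEI))
```

The point is that the source only says a span is a heading or an emphasised phrase; each output format decides separately how that should look.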

Regards,
Kai
springday
 
Posts: 111
Joined: Wed Jul 25, 2012 2:56 pm
Location: Munich, Germany

Re: TEI?

Post by asmecher » Fri Sep 07, 2012 2:05 pm

Hi Kai,

So far we're playing agnostic about the content of document uploads (typically Word documents or the like), but a full XML workflow is definitely on our agenda. Have a look e.g. at http://www.oucs.ox.ac.uk/oxgarage/ for one promising project that we've been discussing internally. Lots of integration potential. The big question remains, however: who is responsible for investing the time in transforming a layout-centric document (e.g. doc, PDF, and almost everything else) into a semantic one (e.g. TEI or NLM XML)? Authors have the vested interest, but mostly don't understand the difference between semantic and layout formats.

Regards,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8315
Joined: Wed Aug 10, 2005 12:56 pm

Re: TEI?

Post by springday » Fri Sep 07, 2012 4:26 pm

Hi Alec,

In the publishing house I work for, we're still hoping we can enforce a rigid workflow based on paragraph styles. We have an empty Word template with predefined paragraph and character styles for our authors, a manual on how to use them, and a copy-editing team that is supposed to check not just for typos, etc., but also for correct styling. We then also ask our layout staff to maintain this style information, although they're allowed to add certain extra styles that we have defined as well.
Those workflows will never be perfect, but the results of converting from either Office Open XML or IDML (InDesign XML) to a more semantic XML format are not so bad either. However, I must also admit that we're still very far from production level; I'm still at the stage of random testing here and there.
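
A style-based conversion of this kind can be sketched as a lookup from paragraph style names to semantic elements. The sketch below assumes the paragraphs have already been extracted from the Word file as (style name, text) pairs; the style names, the target element names, and the `unmapped-style` attribute are all hypothetical and do not describe any publisher's actual template.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from template paragraph styles to semantic elements.
STYLE_TO_ELEMENT = {
    "ChapterTitle": "title",
    "BodyText": "para",
    "BlockQuote": "quote",
}


def paragraphs_to_xml(paragraphs):
    """Convert (style, text) pairs into a <chapter> element.

    Returns the element plus the list of style names that had no mapping,
    so that copy-editing can fix them in the source document.
    """
    chapter = ET.Element("chapter")
    unknown = []
    for style, text in paragraphs:
        tag = STYLE_TO_ELEMENT.get(style)
        if tag is None:
            # Fall back to a plain paragraph but keep the offending style name.
            unknown.append(style)
            child = ET.SubElement(chapter, "para")
            child.set("unmapped-style", style)
        else:
            child = ET.SubElement(chapter, tag)
        child.text = text
    return chapter, unknown


if __name__ == "__main__":
    doc = [
        ("ChapterTitle", "Intro"),
        ("BodyText", "Hello."),
        ("MyFancy", "Oops."),
    ]
    chapter, unknown = paragraphs_to_xml(doc)
    print(ET.tostring(chapter, encoding="unicode"))
    print("Styles needing attention:", unknown)
```

Reporting unmapped styles rather than silently guessing mirrors the copy-editing check described above: the conversion is only as good as the discipline in the styled source.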

Best wishes,
Kai

Re: TEI?

Post by asmecher » Mon Sep 10, 2012 8:48 am

Hi Kai,

Our first attempt at XML transformation was with the retired Lemon8-XML tool, which attempted to perform the transformation automatically (without a transformation-oriented style set). This is still used in one or two places, but in general it was too domain-specific and often didn't meet expectations. We've been speaking with a number of groups, and it seems that style-based transformation is the most successful means used in production. (The other approaches we commonly hear of are doing the transformation manually, either with in-house labour or with an offshore team, but neither of those approaches is very good IMO for what we're trying to build.) As I understand it, oxgarage facilitates a style-based transformation much like the one you're doing.

We still discuss automatic approaches sometimes (not requiring specialized styling) and may even begin some limited active development around it, but that would be speculative work.

In any case, we're very interested in integrating these approaches more closely with our workflow tools, particularly if we can make use of solid external efforts like oxgarage. I think the XML tools built into PHP5 are becoming very solid, too; when we first looked into this there were competing XML APIs and none of them were especially reliable. We've recently dropped support for PHP4 and can finally start making use of some of those language features.

As with many parts of our work, we're sometimes limited by the fact that we don't actually publish any journals of our own -- we're dependent on outside groups for feedback. On such a production-oriented issue as XML workflow, the more information, feedback, and outside review/contribution we can collect about these processes the better we can do.

Regards,
Alec Smecher
Public Knowledge Project Team

Re: TEI?

Post by springday » Mon Sep 10, 2012 11:02 am

Hello Alec,

I remember I tested Lemon8-XML once about two years ago, and it's probably still on my localhost test server somewhere. Starting out by making guesses about the meaning inherent in a document's structure is a very interesting, yet extremely difficult, approach. It certainly has the lowest barrier to entry and is a good help for people who find it difficult to separate the form and content of documents.
However, I see you also understand why so many people choose a different approach, trying to apply some formatting constraints right in the source so that the clean-up work is easier later. But the Lemon8-XML type of approach is not dead. It reminds me of the businesses built around conversions from PDF. I have not seen any totally convincing result yet, but I've seen a lot of people and software trying to get something "meaningful" out of (non-tagged) PDF, even if it is only something like XHTML or EPUB. It may even work well for fiction books, but it's very hard for academic / professional literature.
Anyway, I'll keep in mind that you're interested in those workflow topics. For the moment I'm very busy customizing our new OJS-based website www.reinhardt-journals.de, but I hope to get some more time in October / November to push our publishing workflows. If I encounter anything that might be of interest here, I'll let you know.

Regards, Kai

P.S.: Thanks for pointing me to oxgarage, I had never heard of this project before.

