I've been reading about how complex it is to create a parser that can successfully parse most kinds of documents and convert them into XML with Lemon8. I agree that it will never be possible to build a parser that works 100% effectively with every document that must be converted to XML.
My question is: why doesn't Lemon8 provide guidelines on how Word or ODT documents should be structured to obtain 100% parsing efficiency? I understand this might differ significantly from the original Lemon8 objectives, but if guidelines were provided, editors could adapt their own guidelines to comply with the Lemon8 ones and improve conversion efficiency. Maybe something could be added to the original document to let Lemon8 know that it has been edited to fit Lemon8 requirements.
An example is the fact that, in order to parse effectively, Lemon8 now recognizes only one author per article, because document formatting differs so much from one document to another.
Does anyone have a template or guidelines for creating documents that Lemon8 can parse more effectively?
It is easier for non-technical people to format a Word or OpenOffice document in a certain way than to correct the XML file once it has been converted.