Clean MS WORD Garbage out of Abstracts


Postby tshore » Wed Apr 14, 2010 8:50 am

TinyMCE has created a pretty big mess for my conferences. Despite this little note:

[Attachment: Picture 8.png]

that I hardcoded into:

templates/author/submit/step3.tpl

I still have a ton of messy abstracts.

Questions:

1. How can I clean the abstracts in the papers report so that we can make the program?
2. What can we do to help prevent this from happening next year?
3. A related question - How can I clean the HTML out of the abstracts in the papers report?
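
For question 3, one possible approach is to post-process the exported abstracts outside OCS. The sketch below is only an illustration using Python's standard library; the names `TextOnly` and `html_to_text` are made up for this example and are not part of OCS or the papers report.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes and drop every tag, comment, and style block."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # greater than 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self.skip_depth += 1
        elif tag in ("p", "br", "div"):
            # Keep words from adjacent blocks from running together.
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(fragment):
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(fragment)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

Comments, including Word's conditional `<!--[if gte mso 9]>` blocks, are dropped because the parser's default comment handler ignores them. All formatting is lost, so this only suits a plain-text program listing.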

Thank you.
tshore
 
Posts: 264
Joined: Fri Nov 18, 2005 12:48 pm
Location: Hamilton, Ontario

Re: Clean MS WORD Garbage out of Abstracts

Postby albertosa » Wed Apr 14, 2010 12:25 pm

I had the same situation!

The problem seems to be related to copying and pasting from Word 2007: it adds a mess of hidden text-formatting code to the text.
I simply disabled the TinyMCE and it worked well.

I also had to clean the affected submissions one by one: I used Notepad to get rid of the annoying code.
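
The manual cleanup described above could probably be scripted. The following is a rough sketch, not a tested tool: the patterns are assumptions about what Word typically adds on paste (conditional comments, `<o:p>` tags, `Mso*` classes, inline styles), and `strip_word_markup` is a hypothetical name. Note that it removes all inline `style` attributes, not only the `mso-` ones.

```python
import re

# Markup that Word commonly adds. Illustrative, not an exhaustive list.
REMOVE = [
    re.compile(r"<!--.*?-->", re.S),                        # comments, incl. [if gte mso] blocks
    re.compile(r"<(style|xml)\b.*?</\1\s*>", re.S | re.I),  # embedded style/xml islands
    re.compile(r"</?(o|w|v|st1):[^>]*>", re.I),             # Office namespace tags such as <o:p>
    re.compile(r'\s+class="?Mso\w*"?', re.I),               # class="MsoNormal" and similar
    re.compile(r'\s+style="[^"]*"', re.I),                  # double-quoted inline styles
    re.compile(r"\s+style='[^']*'", re.I),                  # single-quoted inline styles
]

# A <span> with no attributes whose content holds no further <span>.
BARE_SPAN = re.compile(r"<span\s*>((?:(?!<span\b).)*?)</span>", re.S | re.I)


def strip_word_markup(html):
    """Remove Word-specific markup while keeping ordinary tags like <p> and <b>."""
    for pattern in REMOVE:
        html = pattern.sub("", html)
    # Unwrap attribute-free spans, innermost first, until none remain.
    while True:
        html, count = BARE_SPAN.subn(r"\1", html)
        if count == 0:
            break
    return html.strip()
```

Regular expressions are fragile on malformed HTML, so it would be wise to spot-check the output before overwriting any submission.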
albertosa
 
Posts: 25
Joined: Fri Jul 04, 2008 9:09 am
Location: Guimarães/Braga - Portugal

Re: Clean MS WORD Garbage out of Abstracts

Postby michael » Fri Apr 16, 2010 9:28 am

Hi Trudy,

We'll be upgrading TinyMCE to the newest version which appears to have better support for cleaning up pasted HTML. In particular, there is now a new option, paste_auto_cleanup_on_paste:

"If enabled contents will be automatically processed when you paste using Ctrl+V or similar methods. This is enabled by default."

As I understand it, with this enabled TinyMCE will perform the cleaning regardless of the method of pasting by the user.

For cleaning existing records in the db, HTML Tidy may be helpful. You can install it standalone or as part of PHP.
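
As a rough illustration of cleaning records in place, the loop below reads each abstract, passes it through a cleaner, and writes it back only if it changed. Everything specific here is an assumption: it uses an in-memory SQLite database as a stand-in for the real one, the `paper_settings` table layout is simplified and hypothetical, and the `clean` function is a placeholder that could be swapped for a call to Tidy. Back up the database before trying anything like this.

```python
import re
import sqlite3


def clean(html):
    """Placeholder cleaner: drops HTML comments only. Swap in Tidy or similar."""
    return re.sub(r"<!--.*?-->", "", html, flags=re.S)


def clean_abstracts(conn, cleaner=clean):
    """Rewrite every abstract whose cleaned form differs; return rows changed."""
    rows = conn.execute(
        "SELECT paper_id, setting_value FROM paper_settings "
        "WHERE setting_name = 'abstract'"
    ).fetchall()
    changed = 0
    for paper_id, value in rows:
        cleaned = cleaner(value)
        if cleaned != value:
            conn.execute(
                "UPDATE paper_settings SET setting_value = ? "
                "WHERE paper_id = ? AND setting_name = 'abstract'",
                (cleaned, paper_id),
            )
            changed += 1
    conn.commit()
    return changed


def demo():
    """Build a tiny hypothetical table and clean it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE paper_settings "
        "(paper_id INTEGER, setting_name TEXT, setting_value TEXT)"
    )
    conn.executemany("INSERT INTO paper_settings VALUES (?, ?, ?)", [
        (1, "abstract", "<p>Fine</p>"),
        (2, "abstract", "<p>Messy</p><!--[if gte mso 9]><xml></xml><![endif]-->"),
        (2, "title", "Untouched <!-- x -->"),
    ])
    changed = clean_abstracts(conn)
    values = [row[0] for row in conn.execute(
        "SELECT setting_value FROM paper_settings ORDER BY paper_id, setting_name"
    )]
    return changed, values
```

Updating row by row through the database driver avoids editing a dump file by hand, where a stray quote could break the SQL.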
michael
 
Posts: 405
Joined: Thu Mar 29, 2007 2:09 pm

Re: Clean MS WORD Garbage out of Abstracts

Postby ramon » Mon Sep 13, 2010 1:22 pm

Hello Michael,

How do we upgrade TinyMCE, regardless of OxS version?
Which package from Moxiecode should we download?

How would one fix the database?
Should we run Tidy on a database dump file then insert the dump back?
Do you have a working example on how to clean MSOffice garbage styles?
ramon
 
Posts: 931
Joined: Wed Oct 15, 2003 6:15 am
Location: Brasília/DF - Brasil

Re: Clean MS WORD Garbage out of Abstracts

Postby michael » Tue Sep 14, 2010 3:20 pm

Hi Ramon,

ramon wrote:How do we upgrade TinyMCE, regardless of OxS version?
Which package from Moxiecode should we download?


It may be easiest to get the latest version of OJS or OCS and simply copy/move the tinymce folder from the lib/pkp/lib directory to its corresponding spot in your OxS installs.

In each of your installs, you'll also need to update plugins/generic/tinymce/TinyMCEPlugin.inc.php to add the new clean paste option. Please see this patch.

Don't forget to reload and/or clear your template cache once the upgrade is complete. You can also check whether you're picking up the new version of TinyMCE by clicking on the Help (question mark) icon in the TinyMCE editor.

ramon wrote:How would one fix the database?
Should we run Tidy on a database dump file then insert the dump back?
Do you have a working example on how to clean MSOffice garbage styles?


I've only used Tidy directly on HTML files, so you may need to experiment a little here to see what works reliably. In terms of particular Tidy parameters, you'll need the "clean" and "word-2000" options.
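
For anyone scripting this, here is a minimal sketch of calling the Tidy command-line tool with those two options from Python. It assumes a `tidy` executable is on the PATH; the function names are hypothetical, and the remaining flags are just one reasonable choice.

```python
import shutil
import subprocess


def tidy_command(executable="tidy"):
    """Build an argument list enabling Tidy's `clean` and `word-2000` options."""
    return [
        executable,
        "-quiet",                  # suppress the banner and summary
        "-utf8",                   # read and write UTF-8
        "--clean", "yes",          # replace legacy presentational tags with CSS
        "--word-2000", "yes",      # strip Word 2000 specific markup
        "--show-warnings", "no",
    ]


def tidy_document(html, executable="tidy"):
    """Pipe HTML through Tidy; return None if Tidy is missing or reports errors."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        tidy_command(executable),
        input=html,
        capture_output=True,
        encoding="utf-8",
    )
    # Tidy exits with 0 on success, 1 on warnings, and 2 on errors.
    return result.stdout if result.returncode < 2 else None
```

Tidy returns a complete HTML document by default, so the body would still need to be extracted before storing an abstract.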

If you don't yet have a local install or want some quick feedback on the various options, there's an online version of Tidy. Obviously that's for small tests and not your database dump.

Hope that helps!

Michael

P.S. If you can post your experiences I'm sure others will find it quite helpful.
michael
 
Posts: 405
Joined: Thu Mar 29, 2007 2:09 pm

Re: Clean MS WORD Garbage out of Abstracts

Postby ramon » Wed Sep 15, 2010 1:11 pm

Hello Michael,

I tried to update my TinyMCE (on a test server):
- Downloaded OJS 2.3.2-1
- Copied the tinymce folder from the PKP Lib: cp -Rf ojs-2.3.2-1/lib/pkp/lib/tinymce/ ciinf/lib/
- Edited plugins/generic/tinymce/TinyMCEPlugin.inc.php and added the lines from the GitHub patch you mentioned

Now, when I try to create a new review form, for example, it's throwing the following error:
Real path failed


I checked the code and it seems to be related to the path to the cache...
ramon
 
Posts: 931
Joined: Wed Oct 15, 2003 6:15 am
Location: Brasília/DF - Brasil

Re: Clean MS WORD Garbage out of Abstracts

Postby michael » Wed Sep 15, 2010 4:12 pm

Hi Ramon,

Can you please confirm that your cp is correct?

It looks to me like you'd lose the top-level tinymce directory with that cp command (i.e. ciinf/lib/tinymce should have the contents of lib/pkp/lib/tinymce).
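
Whether a trailing slash on the source copies the directory itself or only its contents can depend on the cp implementation, so one way to rule out ambiguity is to name the target directory explicitly. The sketch below does that in Python; `replace_tinymce` is a hypothetical helper, and the paths in the usage note are taken from the command quoted in this thread.

```python
import shutil
from pathlib import Path


def replace_tinymce(source_dir, lib_dir):
    """Copy `source_dir` to `lib_dir/tinymce`, replacing any existing copy."""
    source = Path(source_dir)
    target = Path(lib_dir) / "tinymce"
    if not source.is_dir():
        raise FileNotFoundError(f"no such directory: {source}")
    if target.exists():
        # Remove the old version so stale files do not linger.
        shutil.rmtree(target)
    shutil.copytree(source, target)
    return target
```

For the layout above this would be called as `replace_tinymce("ojs-2.3.2-1/lib/pkp/lib/tinymce", "ciinf/lib")`, which always leaves the files in ciinf/lib/tinymce.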

Cheers,
Michael
michael
 
Posts: 405
Joined: Thu Mar 29, 2007 2:09 pm

Re: Clean MS WORD Garbage out of Abstracts

Postby ramon » Thu Sep 16, 2010 5:43 am

Hello Michael,

I believe the cp command is correct, as the tinymce folder already exists in the destination folder.
Here's the result of that command

I also tried updating TinyMCE manually, downloading it from Moxiecode, but that did not work either.
ramon
 
Posts: 931
Joined: Wed Oct 15, 2003 6:15 am
Location: Brasília/DF - Brasil

Re: Clean MS WORD Garbage out of Abstracts

Postby michael » Thu Sep 16, 2010 7:18 pm

Hi Ramon,

Unfortunately I can't reproduce this -- I'm not sure what could be happening with your install.

For what it's worth, doing a standard cp -rf lib/pkp/lib/tinymce ojs224/lib/ works for me without any errors. I simply reload the page and the newer version of TinyMCE loads as expected.

Cheers,
Michael
michael
 
Posts: 405
Joined: Thu Mar 29, 2007 2:09 pm

Re: Clean MS WORD Garbage out of Abstracts

Postby phil » Tue Oct 05, 2010 8:11 am

Michael,

We also need to resolve problems caused by end users pasting directly from Word (and ending up with a lot of character garbage). I hope I have understood this part of the conversation correctly: with the following changes, we can expect end users to simply paste (abstracts, for example) from their favourite word processor and the result will be clean text.

1. We are using OJS 2.3.1.2 and TinyMCE v3.2.2.3. Perhaps it is not necessary to upgrade TinyMCE to accomplish this, and we could just make the edits suggested at http://github.com/pkp/ojs/commit/1c5f76 ... 2fa#diff-0? What do you think?

The most recent version available from TinyMCE is 3.3.9.2. If upgrading is recommended, is this the version now being used by OxS, or the version you'd recommend?

2. Whether we upgrade or not (I believe I understand the general procedures for doing so) I am confused by the "Subproject commit .........." portion of the edit instructions at the github posting above.

a. If upgrading is NOT required is this specific "Subproject" text edit required?
b. Where are these statements found? It refers to ../ojs/lib which is only a directory?
c. Re: the alphanumeric string following 'commit', should this be used explicitly as is, or is this string dependent on something like the TinyMCE version?

-Subproject commit 3820075758715ea7723959ab597133eb5b0e20dc
+Subproject commit a5aa401f600754ac774ebb2a59c5ffb54098fd00

Thanks very much Michael,

/Phil
Canadian Journal of Public Health
phil
 
Posts: 15
Joined: Sun Jan 24, 2010 12:54 pm

Re: Clean MS WORD Garbage out of Abstracts

Postby michael » Wed Oct 06, 2010 10:21 am

Hi Phil,


phil wrote:I hope I have understood this part of the conversation correctly: with the following changes, we can expect end users to simply paste (abstracts, for example) from their favourite word processor and the result will be clean text.


Yes, that's correct.

phil wrote:1. We are using OJS 2.3.1.2 and TinyMCE v3.2.2.3. Perhaps it is not necessary to upgrade TinyMCE to accomplish this, and we could just make the edits suggested at http://github.com/pkp/ojs/commit/1c5f76 ... 2fa#diff-0? What do you think?


Improved Word-paste functionality was added in TinyMCE 3.2.3, so you would need to both upgrade your version of TinyMCE and apply the patch in that commit.
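
The version comparison behind that answer can be spelled out as below. This is just a worked illustration: the 3.2.3 threshold comes from this thread, and the helper names are made up.

```python
def parse_version(text):
    """Turn a dotted version string like '3.2.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Release that added the improved Word paste handling, per this thread.
WORD_PASTE_MINIMUM = parse_version("3.2.3")


def needs_upgrade(installed):
    """Return True if the installed TinyMCE predates the Word paste cleanup."""
    return parse_version(installed) < WORD_PASTE_MINIMUM
```

Comparing tuples rather than strings matters here: 3.2.2.3 sorts before 3.2.3 because the third component, 2, is less than 3, regardless of the extra fourth component.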

phil wrote:The most recent version available from TinyMCE is 3.3.9.2. If upgrading is recommended, is this the version now being used by OxS, or the version you'd recommend?


I'd recommend the version that we're currently using in lib/pkp (3.3.2) since it's been tested with OxS.

phil wrote:2. Whether we upgrade or not (I believe I understand the general procedures for doing so) I am confused by the "Subproject commit .........." portion of the edit instructions at the github posting above.


You can safely ignore the subproject commit. That corresponds to the commit in lib/pkp, the shared library that we use for all of the OxS apps. In effect it's the update to the TinyMCE library itself, but you'll be copying that from a recent OJS tarball that includes TinyMCE 3.3.2.

Cheers,
Michael
michael
 
Posts: 405
Joined: Thu Mar 29, 2007 2:09 pm

Re: Clean MS WORD Garbage out of Abstracts

Postby ramon » Thu Dec 23, 2010 1:38 pm

Hello all,

Just to report that I managed to update TinyMCE on development and production servers.
Since the download file is a ZIP, I believe the earlier failure may have had to do with the Linux GZip command.
Unpacking on a WinXP machine with WinZip and then transferring the files to the server worked perfectly.
Pasting from MS Word now automagically clears all the garbage styling, thank the lord for the TinyMCE upgrade!
I haven't tested with OpenOffice, though...
ramon
 
Posts: 931
Joined: Wed Oct 15, 2003 6:15 am
Location: Brasília/DF - Brasil

