strip_unsafe_html truncating abstracts

makouvlei
Posts: 16
Joined: Wed Nov 17, 2010 2:56 am

strip_unsafe_html truncating abstracts

Postby makouvlei » Fri Sep 02, 2011 1:31 am

Hi, we're using OJS 2.2.3 and have abstracts that contain mathematical expressions, e.g. "P<.001" (encoded as "P&lt;.001"), which results in strip_unsafe_html truncating the content:

<div>{$article->getArticleAbstract()|strip_unsafe_html|nl2br}</div>

Is there something I can add to "allowed_html" in config.inc.php to prevent this, or is there another way around it? It's a relatively controlled environment with limited access to abstract editing, so we could probably remove strip_unsafe_html safely in this particular instance, but another solution would be preferable.

Thanks
J van Tonder

asmecher
Posts: 10015
Joined: Wed Aug 10, 2005 12:56 pm

Re: strip_unsafe_html truncating abstracts

Postby asmecher » Fri Sep 02, 2011 8:53 am

Hi makouvlei,

How are these abstracts being entered into the system? Through the regular submission process, unless you've disabled the TinyMCE plugin, they'll be entered via a rich text editor and stored in the database as HTML, which includes encoding characters like <, >, and & as "&lt;", "&gt;", and "&amp;" respectively. Those entities will make it through the HTML filter and should be treated properly by any subsequent processes.
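To illustrate why the encoding matters, here is a minimal Python sketch of a tag-stripping filter. It is a hypothetical stand-in, not the actual OJS strip_unsafe_html code; the function name naive_strip_tags and the ALLOWED_TAGS set are assumptions made for the example. It shows how a bare "<" can be read as the start of a tag and swallow the rest of the text, while the entity-encoded form passes through untouched.

```python
import html
import re

# Assumed whitelist for illustration only; not the OJS "allowed_html" default.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br"}

# A "tag" here is '<', an optional '/', a name-like run, then everything
# up to the next '>' or, if there is none, the end of the input.
TAG_LIKE = re.compile(r"<\s*/?\s*([^\s>/]*)[^>]*(?:>|$)")


def naive_strip_tags(text: str) -> str:
    """Remove anything that looks like a tag unless its name is whitelisted."""
    def keep_or_drop(match: re.Match) -> str:
        name = match.group(1).lower()
        return match.group(0) if name in ALLOWED_TAGS else ""
    return TAG_LIKE.sub(keep_or_drop, text)


raw = "P<.001 was significant"
encoded = html.escape(raw)  # "P&lt;.001 was significant"

print(repr(naive_strip_tags(raw)))      # the bare '<' swallows the rest
print(repr(naive_strip_tags(encoded)))  # the entity survives the filter
```

In this sketch the unencoded string is cut down to "P", because everything after the "<" is treated as one unterminated tag, which resembles the truncation described in the first post. The encoded string contains no "<" at all, so the filter has nothing to remove and the browser later renders "&lt;" as "<".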

Regards,
Alec Smecher
Public Knowledge Project Team

makouvlei
Posts: 16
Joined: Wed Nov 17, 2010 2:56 am

Re: strip_unsafe_html truncating abstracts

Postby makouvlei » Mon Sep 05, 2011 9:56 am

Thanks Alec, you're right: the problem lies with the HTML editor, which is enabled but behaving a bit strangely.

It appears that TinyMCE is not saving the HTML encoding, even though it displays the characters as encoded in its HTML view. Even if I save encoded characters directly in the database, they become unencoded again once edited in TinyMCE. Any idea what might cause this?
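When checking what actually reached the database after an edit, a small diagnostic can help. The Python sketch below is a hypothetical helper, not part of OJS or TinyMCE; the name bare_lt_positions and the regular expression are assumptions. It reports the positions of any "<" that does not begin something shaped like a tag, which is the kind of character that gets lost when the encoding is not saved.

```python
import re
from typing import List

# A '<' counts as "bare" when it is NOT followed by an optional '/',
# a letter, and a closing '>' before the next angle bracket.
BARE_LT = re.compile(r"<(?!/?[A-Za-z][^<>]*>)")


def bare_lt_positions(stored: str) -> List[int]:
    """Return the index of every '<' that does not start a tag-shaped run."""
    return [match.start() for match in BARE_LT.finditer(stored)]


# Example values standing in for abstracts copied out of the database.
print(bare_lt_positions("P<.001"))                      # unencoded comparison
print(bare_lt_positions("P&lt;.001 in <em>both</em>"))  # properly encoded
```

An empty list suggests the stored text is encoded as expected; any positions returned point at characters a tag filter is likely to misread. The check is deliberately loose and only looks at the shape of the text, so it does not validate the HTML itself.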

Regards
J van Tonder

asmecher
Posts: 10015
Joined: Wed Aug 10, 2005 12:56 pm

Re: strip_unsafe_html truncating abstracts

Postby asmecher » Tue Sep 06, 2011 9:52 am

Hi makouvlei,

What version of OJS are you using, and have you altered it, i.e. by upgrading TinyMCE to a newer version? There are two elements to consider here: first, TinyMCE itself, which might be an old version; and second, the plugin that integrates TinyMCE with OJS, which includes some configuration to set up things like character encodings.

Regards,
Alec Smecher
Public Knowledge Project Team

makouvlei
Posts: 16
Joined: Wed Nov 17, 2010 2:56 am

Re: strip_unsafe_html truncating abstracts

Postby makouvlei » Wed Sep 07, 2011 2:43 am

Hi Alec

OJS version is 2.2.3

I upgraded TinyMCE some time ago: majorVersion:"3",minorVersion:"3.2",releaseDate:"2010-03-25"

I tried a few things after going through the TinyMCE documentation (e.g. adding entity_encoding : "raw" to tinyMCE.init in TinyMCEPlugin.inc.php), but no luck.

Regards
J van Tonder

asmecher
Posts: 10015
Joined: Wed Aug 10, 2005 12:56 pm

Re: strip_unsafe_html truncating abstracts

Postby asmecher » Wed Sep 07, 2011 9:08 am

Hi makouvlei,

That's quite an old copy of OJS; if possible, I'd recommend that you try upgrading to a recent release, which will come with a newer copy of TinyMCE.

Regards,
Alec Smecher
Public Knowledge Project Team

makouvlei
Posts: 16
Joined: Wed Nov 17, 2010 2:56 am

Re: strip_unsafe_html truncating abstracts

Postby makouvlei » Fri Sep 16, 2011 1:38 am

Hi Alec

We're not in a position to upgrade OJS at this time. I tried upgrading cleanVar() in Core.inc.php (also adding the functions utf8_bad_strip and utf8_normalize in String.inc.php, as well as the "phputf8" folder), but then found that all HTML, including valid tags, is encoded and appears as written-out tags in the browser.

However, if I disable charset_normalization altogether, everything appears to go smoothly.

I see in viewtopic.php?f=8&t=7307 that you suggest charset_normalization can be disabled if the TinyMCE is up to date.

Seeing as our TinyMCE version was upgraded a while ago to 3.2 (same as in OJS 2.3.6) I should then be able to safely disable charset_normalization?

Regards
J van Tonder

asmecher
Posts: 10015
Joined: Wed Aug 10, 2005 12:56 pm

Re: strip_unsafe_html truncating abstracts

Postby asmecher » Fri Sep 16, 2011 9:19 am

Hi makouvlei,

Yes, it's always safe to disable that option. You should be fine.

Regards,
Alec Smecher
Public Knowledge Project Team

ctparker
Posts: 2
Joined: Wed Jan 11, 2012 9:40 am

Re: strip_unsafe_html truncating abstracts

Postby ctparker » Fri Jan 13, 2012 10:51 am

I am also experiencing this issue with our site, in the article titles listed in submission pages. I believe it is a more general problem. I've added my analysis of the issue to Bug 6900:
http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=6900

