
Fix for charset bugs found at some archives


Postby ozp » Wed Jun 13, 2007 2:55 pm

I found many archives (most of them running OJS) with bugs related to character sets and HTML tags.

We made a filter for those errors:

In plugins/preprocessors/regex/RegexPreprocessorPlugin.inc.php, add this to the preprocessEntry() function:


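   $fieldsToChange = array('title', 'description', 'creator', 'rights', 'type', 'source', 'subject');

   // Remove any HTML tags from the value
   $value = strip_tags($value);

   if (in_array($field->getName(), $fieldsToChange)) {
      // Run utf8_decode() only when "encode" was passed on the command line
      foreach ($_SERVER['argv'] as $arg) switch ($arg) {
         case 'encode':
            $value = utf8_decode($value);
            break;
      }

      // Convert HTML entities into UTF-8 characters;
      // ENT_NOQUOTES keeps the behavior of the original null flag
      $value = html_entity_decode($value, ENT_NOQUOTES, 'UTF-8');
   }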

When you call harvest.php, you have to add "encode" after the command.
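
For anyone who wants to try the clean-up steps outside the Harvester first, here is a minimal standalone sketch. The function name cleanFieldValue() and its $decodeUtf8 parameter are hypothetical and not part of the Harvester code. It also swaps utf8_decode() (deprecated as of PHP 8.2) for mb_convert_encoding(), which needs the mbstring extension, and the ENT_QUOTES flag is an assumption. The idea that the decode step targets values that were UTF-8 encoded twice is a guess too, since the post does not say why it is needed.

```php
<?php
// Hypothetical helper, NOT part of the Harvester code base.
// It applies the same three steps as the filter to a plain string.
function cleanFieldValue($value, $decodeUtf8 = false) {
    // Drop any HTML tags left in the metadata value
    $value = strip_tags($value);

    // Optional step, standing in for utf8_decode(): convert UTF-8 to ISO-8859-1.
    // Assumption: useful when a value was UTF-8 encoded twice.
    if ($decodeUtf8) {
        $value = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');
    }

    // Turn entities such as &ccedil; into real UTF-8 characters
    return html_entity_decode($value, ENT_QUOTES, 'UTF-8');
}

// Prints: Educação
echo cleanFieldValue('<b>Educa&ccedil;&atilde;o</b>'), "\n";
```

Passing true as the second argument plays the role of the "encode" argument on the command line.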

Postby asmecher » Wed Jun 13, 2007 3:57 pm

Hi ozp,

Thanks -- FYI, we're working on a general solution for problems with illegal characters for the Harvester, OJS 2.x, and OCS 2.x. It's still under development, but should be included in the next release of each.

Alec Smecher
Public Knowledge Project Team
Don't miss the First International PKP Scholarly Publishing Conference
July 11 - 13, 2007, Vancouver, BC, Canada
