Problem with German character encoding [RESOLVED]

Postby joelfisler » Wed Dec 19, 2007 2:31 am

Hi,
We are currently evaluating OCS at the University of Zurich: http://www.conferences.uzh.ch
Everything seems to work except for the encoding. People are submitting abstracts, and the German "Umlaute" don't display correctly. So instead of:
"Was für Möglichkeiten bietet OLAT", OCS will display "Was f?r M?glichkeiten bietet OLAT"
Any idea how we could fix it? We tried many things, but without success. Are there other people out there who use OCS in German (or French or Spanish, etc.) and don't have this encoding problem?

Thanks for your help.
joelfisler
 
Posts: 1
Joined: Wed Dec 19, 2007 2:24 am

Re: Problem with German character encoding

Postby mazzo » Wed Dec 19, 2007 5:12 am

Hi

The configuration used is:
locale = en_US
client_charset = utf-8
connection_charset = utf8
database_charset = utf8

The MySQL database tables were using utf8 encoding; MySQL itself and the Apache web server use iso-8859-1 encoding.

Checking Form.inc.php we found:

   if ($value === utf8_decode($value) && $value !== utf8_encode($value)) {
      // string is cp1252
      // transliterate cp1252->utf8 to work in utf-8
      // utf8_decode to work in latin-1 (information may be lost)
      import('core.Transcoder');
      $trans =& new Transcoder('CP1252', 'UTF-8');
      $value = $trans->trans($value);

   } elseif ($value !== utf8_decode($value) && $value !== utf8_encode($value)) {
      // string is not within utf-8(?)
      // normalize to ASCII (lowest common encoding) - information will be lost
      import('core.Transcoder');
      $trans =& new Transcoder('UTF-8', 'ASCII');
      $value = $trans->trans($value);
   }

For some reason (?) with our environment

($value !== utf8_decode($value) && $value !== utf8_encode($value))

matched and the ASCII Transcoder was invoked. Commenting out the Transcoder helped, but that is probably not the right way to solve the problem.
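
To see why that second condition matches, it may help to run both comparisons on a correctly encoded UTF-8 string. The sketch below is only an illustration, not OCS code: oldCheckBranch() is a hypothetical helper, and legacyDecode()/legacyEncode() are stand-ins built on mb_convert_encoding() (assuming the mbstring extension is loaded), since utf8_decode()/utf8_encode() are deprecated as of PHP 8.2.

```php
<?php
// Stand-in for utf8_decode(): UTF-8 -> ISO-8859-1 (assumes ext-mbstring).
function legacyDecode(string $s): string {
    return mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');
}

// Stand-in for utf8_encode(): ISO-8859-1 -> UTF-8 (assumes ext-mbstring).
function legacyEncode(string $s): string {
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

// Hypothetical helper: reports which branch of the quoted check a value takes.
function oldCheckBranch(string $value): string {
    if ($value === legacyDecode($value) && $value !== legacyEncode($value)) {
        return 'cp1252->utf8';
    } elseif ($value !== legacyDecode($value) && $value !== legacyEncode($value)) {
        return 'utf8->ascii';
    }
    return 'untouched';
}

$umlaut = "f\xC3\xBCr"; // "für" as valid UTF-8 bytes

echo oldCheckBranch($umlaut), "\n"; // utf8->ascii
echo oldCheckBranch('OLAT'), "\n";  // untouched
```

A valid UTF-8 string containing an umlaut differs from both its decoded and its re-encoded form, so it lands in the ASCII branch, which would explain the "?" characters. Plain ASCII passes through untouched.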

Any hint is appreciated.
mazzo
 
Posts: 4
Joined: Wed Dec 19, 2007 4:57 am

Re: Problem with German character encoding

Postby mj » Wed Dec 19, 2007 7:23 am

Hi all,

This issue has been addressed in Bugzilla #3089 and a patch for OCS 2.0 is available at http://pkp.sfu.ca/bugzilla/attachment.cgi?id=375&action=edit. The new code has been thoroughly tested and is part of the recent OJS 2.2 release; it will also be included in OCS 2.1 when it is released as well.

Hope this helps,
mj
Site Admin
 
Posts: 304
Joined: Fri Mar 26, 2004 9:32 am
Location: Toronto, Canada

Re: Problem with German character encoding

Postby mazzo » Thu Dec 20, 2007 7:06 am

mj wrote:Hi all,

This issue has been addressed in Bugzilla #3089 and a patch for OCS 2.0 is available at http://pkp.sfu.ca/bugzilla/attachment.cgi?id=375&action=edit. The new code has been thoroughly tested and is part of the recent OJS 2.2 release; it will also be included in OCS 2.1 when it is released as well.

Hope this helps,


Unfortunately, this did not help. As far as I understand, character translation in Form.inc.php happens even if everything is running in utf-8, which should not happen. We applied the suggested patch, but the "Umlaute" were still stripped out. We made the following change in Form.inc.php:

--- Form.inc.php.ori 2007-12-19 12:36:39.474715000 +0100
+++ Form.inc.php 2007-12-20 14:48:58.818284000 +0100
@@ -91,7 +91,11 @@
       if (is_string($value)) {

          // check for Windows-1252 encoding, and transliterate if necessary
-         if ($value === utf8_decode($value) && $value !== utf8_encode($value)) {
+         if ($value === utf8_decode(utf8_encode($value))) {
+            // string is utf8
+            // nothing must be done
+            ;
+         } elseif ($value === utf8_decode($value) && $value !== utf8_encode($value)) {
             // string is cp1252
             // transliterate cp1252->utf8 to work in utf-8
             // utf8_decode to work in latin-1 (information may be lost)

If encoding and then decoding $value returns the original value, we do not strip any characters. This solved our problem, but maybe there is a better solution?
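
One thing worth checking about this round-trip test is which inputs it actually lets through. The sketch below is an illustration under assumptions, not OCS code: latin1ToUtf8() and utf8ToLatin1() are hypothetical stand-ins for utf8_encode() and utf8_decode(), built on mb_convert_encoding() (so mbstring is assumed to be loaded), and survivesRoundTrip() is a hypothetical name for the new condition.

```php
<?php
// Stand-in for utf8_encode(): ISO-8859-1 -> UTF-8 (assumes ext-mbstring).
function latin1ToUtf8(string $s): string {
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

// Stand-in for utf8_decode(): UTF-8 -> ISO-8859-1 (assumes ext-mbstring).
function utf8ToLatin1(string $s): string {
    return mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');
}

// Hypothetical name for the condition introduced by the change above.
function survivesRoundTrip(string $value): bool {
    return $value === utf8ToLatin1(latin1ToUtf8($value));
}

$samples = [
    'utf8'   => "M\xC3\xB6glichkeiten", // "Möglichkeiten" as UTF-8 bytes
    'latin1' => "M\xF6glichkeiten",     // same word as ISO-8859-1 bytes
    'ascii'  => 'OLAT',
];

foreach ($samples as $label => $value) {
    echo $label, ': ', var_export(survivesRoundTrip($value), true), "\n";
}
```

All three samples pass, including the ISO-8859-1 one. Encoding from ISO-8859-1 and decoding back is lossless for every byte, so the condition appears to hold for any input; in effect it switches the transliteration off rather than detecting UTF-8.
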
The configuration used is:

client_charset = utf-8
connection_charset = utf8
database_charset = utf8
charset_normalization = On

Thanks Roberto
mazzo
 
Posts: 4
Joined: Wed Dec 19, 2007 4:57 am

Re: Problem with German character encoding

Postby mj » Thu Dec 20, 2007 1:42 pm

Hi Roberto,

What you've actually discovered is an error in the patch -- thank you, and please accept my apologies. I'll update the patch set to fix it. The setData function in Form.inc.php should look like this:

   function setData($key, $value) {

      if (is_string($value)) $value = Core::cleanVar($value);

      $this->_data[$key] = $value;
   }


This is all setData needs to do, because all of the UTF-checking code is now contained within the String and Transcoder classes. You're right that the old check using utf8_encode/utf8_decode doesn't work correctly; it has been replaced by the more reliable String::isUTF8() method. This is wrapped by the cleanVar() method, which is used throughout OCS. Once again, all of these improvements are in the current CVS and will be part of OCS 2.1.
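
For readers without the PKP source at hand, here is a minimal sketch of the general idea: validate first, and transcode only when the input is not already UTF-8. It is not the actual String::isUTF8() or Core::cleanVar() code. The names isValidUtf8() and normalizeInput() are hypothetical, the mbstring extension is assumed to be loaded, and treating every non-UTF-8 input as Windows-1252 is an assumption made for the example.

```php
<?php
// Hypothetical stand-in for a UTF-8 validity check (not PKP's String::isUTF8()).
function isValidUtf8(string $s): bool {
    return mb_check_encoding($s, 'UTF-8');
}

// Hypothetical stand-in for input cleaning (not PKP's Core::cleanVar()).
// Assumption: anything that is not valid UTF-8 arrived as Windows-1252.
function normalizeInput(string $s): string {
    if (isValidUtf8($s)) {
        return $s; // already UTF-8: leave umlauts untouched
    }
    return mb_convert_encoding($s, 'UTF-8', 'Windows-1252');
}

$utf8   = "M\xC3\xB6glichkeiten"; // "Möglichkeiten" as UTF-8 bytes
$cp1252 = "M\xF6glichkeiten";     // same word as Windows-1252 bytes

var_dump(normalizeInput($utf8) === $utf8);   // bool(true)
var_dump(normalizeInput($cp1252) === $utf8); // bool(true)
```

The point of validating before converting is that correctly encoded UTF-8 input is never passed through a transcoder at all.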

Thanks again for your patience, and let me know if updating the code above solves the problem.
mj
Site Admin
 
Posts: 304
Joined: Fri Mar 26, 2004 9:32 am
Location: Toronto, Canada

Re: Problem with German character encoding

Postby mazzo » Thu Dec 20, 2007 3:48 pm

Yes, your modified function worked in our configuration.

Thank you for this quick fix!

Roberto
mazzo
 
Posts: 4
Joined: Wed Dec 19, 2007 4:57 am

Re: Problem with German character encoding

Postby luis » Wed Jan 30, 2008 6:17 pm

The new code works fine for me too. I'm using the Brazilian Portuguese locale (pt_BR). Thank you!

Luis
luis
 
Posts: 1
Joined: Wed Jan 30, 2008 5:32 pm

Re: Problem with German character encoding [RESOLVED]

Postby mbria » Tue Apr 29, 2008 11:39 am

Thanks a lot for the patch. It also worked for me on a brand new OCS installation in a Latin charset context.

BTW, I downloaded OCS 2.0.0.0 from your site today, and this bug is still there.

If the new version is not going to be released soon, I strongly encourage you to publish a minor release with the patch, since it's an important issue in scenarios where utf8 is not fully adopted yet.

Once again, a million thanks for the fix and the wonderful development.

Cheers,

m.
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am

Re: Problem with German character encoding [RESOLVED]

Postby mj » Tue Apr 29, 2008 11:53 am

Hi all,

Glad to hear you've all been able to get things working with this fix. There is a complete patch against OCS 2.0.0 on the Bugzilla entry http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=3089, but we are putting the finishing touches on the next release, OCS 2.1, as we speak, and are aiming to have it released shortly. This new release includes the UTF-8 fix as well as numerous other improvements.

Regards,
MJ
mj
Site Admin
 
Posts: 304
Joined: Fri Mar 26, 2004 9:32 am
Location: Toronto, Canada

Re: Problem with German character encoding [RESOLVED]

Postby mbria » Tue Apr 29, 2008 4:37 pm

Good news and wonderful work done with OCS2.

The more I see of your development work, the more I fall in love. ;-)

Thanks a lot for your help,

m.
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am

