
Bug 6088 - OMP clean-up: fix missing/stale translation keys
Product: OMP
Classification: Unclassified
Component: General
Hardware: PC Linux
Importance: P3 normal
Assigned To: beghelli
Duplicates: 4483 5797 5874
Depends on:
Blocks: 4483
Reported: 2010-10-31 08:59 PDT by jerico
Modified: 2011-06-21 16:17 PDT (History)
Description jerico 2010-10-31 08:59:26 PDT
OMP translation files contain lots of stale (unused) translation keys. There are also a lot of translation keys missing in the translation files.

Grep all translation keys used in the application and compare them to the keys in the translation files. Remove all stale keys from the translation files and add missing keys.
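
The comparison described above can be sketched as a set difference over two sorted key lists. This is a minimal sketch, not the project's tooling: the function name compare_keys and its two input files (one key per line, however they were collected) are assumptions made for illustration.

```shell
# compare_keys USED_FILE DEFINED_FILE   (hypothetical helper)
# USED_FILE:    keys referenced by the application, one per line
# DEFINED_FILE: keys present in the translation files, one per line
# Prints "stale: KEY" for keys that are defined but never used and
# "missing: KEY" for keys that are used but never defined.
compare_keys() (
    used_sorted=$(mktemp)
    defined_sorted=$(mktemp)
    # comm needs both inputs sorted in the same collation order.
    LC_ALL=C sort -u "$1" > "$used_sorted"
    LC_ALL=C sort -u "$2" > "$defined_sorted"
    # -13 keeps lines that appear only in the second file (defined, unused).
    LC_ALL=C comm -13 "$used_sorted" "$defined_sorted" | sed 's/^/stale: /'
    # -23 keeps lines that appear only in the first file (used, undefined).
    LC_ALL=C comm -23 "$used_sorted" "$defined_sorted" | sed 's/^/missing: /'
    rm -f "$used_sorted" "$defined_sorted"
)
```

With the two lists in place, a call such as compare_keys /tmp/used.txt /tmp/keys.txt would print one line per stale or missing key, which can be reviewed by hand before anything is deleted.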
Comment 1 jerico 2010-12-13 20:25:54 PST
IMO some issues to clarify are:
- When do we put translations into central files and when do we create topic-specific files?
- What are the performance implications of additional files?
- What goes into the app and what goes into lib/pkp?
- How do we organize translation keys /inside/ a translation file?

I guess Alec already has some ideas about this, but I've not seen them written down anywhere so far.

It would be great if this could be documented somewhere, e.g. on the Wiki, so that we can be more consistent wrt translation files in the future.
Comment 2 Alec Smecher 2010-12-14 09:34:05 PST
I've been following something of a common sense approach for localization files, but that has admittedly been based on a number of assumptions that I haven't tested. I've started some germinal documentation at http://pkp.sfu.ca/wiki/index.php/Localization but if you have anything specific to ask or add, it's welcome.
Comment 3 Juan Pablo Alperin 2011-03-20 11:59:50 PDT
*** Bug 4483 has been marked as a duplicate of this bug. ***
Comment 4 Juan Pablo Alperin 2011-03-20 12:02:58 PDT
Do we have any tools to go through the locale files and check if each key is used somewhere in a template or code (i.e. automatically grep for each key in the code base)? This would let us remove keys that are no longer in use from the locale files. 

A tool to do the opposite would be a little bit trickier, but at least we could look for any Locale::translate(.*) and {translate key='(.*)' ... } and then grep to see if it is in at least one locale file.
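
One way the reverse check could start is by pulling literal keys out of the two call forms named above. This is a rough sketch: extract_used_keys is a hypothetical helper, the regular expressions only cover a quoted literal as the key, and keys passed through variables or built at runtime are not detected.

```shell
# extract_used_keys [ROOT]   (hypothetical helper)
# Prints every literal locale key found in .tpl and .php files under ROOT,
# one per line, sorted and de-duplicated.
extract_used_keys() (
    root=${1:-.}
    {
        # Smarty templates: {translate key="some.key"} or key='some.key'
        find "$root" -type f -name '*.tpl' -exec \
            grep -h -o -E "\{translate[[:space:]]+key=[\"'][^\"']+[\"']" {} + |
            sed -E "s/.*key=[\"']([^\"']+)[\"']/\1/"
        # PHP: Locale::translate('some.key') with a quoted first argument
        find "$root" -type f -name '*.php' -exec \
            grep -h -o -E "Locale::translate\([[:space:]]*[\"'][^\"']+[\"']" {} + |
            sed -E "s/.*\([[:space:]]*[\"']([^\"']+)[\"']/\1/"
    } | LC_ALL=C sort -u
)
```

Each key in the output could then be looked up in the locale files with a fixed-string grep. Keys that never match are candidates for "missing", subject to the caveat about dynamically built keys.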

Not that we should build these tools now. But something that we can think about as a quick check in the build process. In the meantime, we need to do the following:
- go through locale files as best as possible and organize things according to the Wiki entry linked above
- be more conscientious when adding/using locale keys. A series of quick greps before adding a new key would go a long way to avoiding problems in the future.
Comment 5 Juan Pablo Alperin 2011-03-20 12:27:07 PDT
*** Bug 5797 has been marked as a duplicate of this bug. ***
Comment 6 Juan Pablo Alperin 2011-03-20 12:28:00 PDT
*** Bug 5874 has been marked as a duplicate of this bug. ***
Comment 7 Alec Smecher 2011-03-20 12:32:57 PDT
This is very brute force but will do the trick.

1. Save a list of all locale keys to a file:

sed -n -e "s/.*key=\"\([^\"]\+\)\".*/\1/p" `fgrep -l locale.dtd \`find . -name \*.xml\`` > /tmp/keys.txt

2. Search the codebase (.tpl and .php files) for each key in that list. Keys that don't show up anywhere can be removed, unless they are kept in the PKP library for the sake of other apps:

for key in `cat /tmp/keys.txt`; do fgrep -q $key `find . -name \*.php` `find . -name \*.tpl` || (echo -n "$key: "; fgrep -l $key `find . -name \*.xml` | sed -n -e "s/\(.*\)\/[a-zA-Z_]\+\/\([a-zA-Z.]\+\)$/\1\/*\/\2/p" | sort | uniq); done

I wouldn't worry so much about locale keys that are referenced in the code and don't appear in the locale files, since those become apparent during testing.

Before I add a new locale key, I always grep for it in existing files:

fgrep -i -l ">Some Text Here<" `find . -name \*.xml`
Comment 8 jerico 2011-03-21 12:04:21 PDT
Some time ago Alec and I established a list of important requirements that need to be worked into the wiki page mentioned above before we can use it as a basis for decision making. Alec promised to do this. Alec, did you do that already? I'm not sure whether I'm subscribed to changes to that page...

The reason is that to a certain extent the concepts documented on that wiki page are themselves (partly) responsible for the current translation key chaos. In other words: trying to follow them will not prove very effective in producing a better translation key system.

The most important problem with the current translation key system is that we allow "topical" files for translation keys. This is not good: different devs can have very different ideas of what belongs in which file, because "semantical" categories are much more subjective than "syntactical" categories.

This is the same reason our "semantical" approach to CSS has proven impossible for a group to maintain. A "semantic" approach only works if it is updated by a single developer who can keep it consistent or (as in the case of meta-data authorities) if you create something like a consistent "semantic dictionary and thesaurus" around it, which we probably do not have time to maintain. These problems persist to some extent in the current CSS approach, which still leads to inconsistencies there, e.g. between grid.less and index.less where this distinction is being made.

There are more problems with the current translation key approach (e.g. performance, etc.) which we also addressed in our requirements list.
Comment 9 beghelli 2011-04-25 17:59:57 PDT
Removed unused locale keys in OMP:
Comment 10 Matthew Crider 2011-05-03 10:26:41 PDT
Bruno, did you use Alec's sed/grep method to clear out unused locale keys? I think you may have deleted quite a few that are being used, e.g. several keys used in the submission wizard, the select option labels in the stage participant modal, and elsewhere.
Comment 11 beghelli 2011-05-03 13:16:00 PDT
Yes Matt, I used Alec's methods. But I also wrote a script myself to delete the keys in the list that Alec's scripts produced, so the error could be in that step. I will have another look at this. Thanks for checking.
Comment 12 beghelli 2011-05-04 03:55:52 PDT
I looked again at all the scripts I used and saw no error. The only thing I overlooked the first time was that some keys were in the unused list but were used in some XML files in the registry directory. I forgot to look at those cases. So, when Matt warned me about some used keys that had been deleted, I looked at them, and they match exactly the ones I removed that were used only in XML files.

I think I didn't notice because my presses were already installed, so the information for those keys was already in the database.
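
The gap described here could be narrowed by treating every XML file that is not a locale file as a possible usage site. A sketch under stated assumptions: unused_keys is a hypothetical helper, locale files are recognised by a reference to locale.dtd (as in the command from Comment 7), and matching is by plain substring, so a key that is a prefix of another key is conservatively treated as used.

```shell
# unused_keys KEYS_FILE [ROOT]   (hypothetical helper)
# Prints the keys from KEYS_FILE that appear in no .php file, no .tpl file,
# and no XML file other than the locale files themselves.
unused_keys() (
    keys_file=$1
    root=${2:-.}
    haystack=$(mktemp)
    # Collect every place a key can legitimately be referenced.
    find "$root" -type f \( -name '*.php' -o -name '*.tpl' \) \
        -exec cat {} + > "$haystack"
    # grep -L lists files that do NOT mention locale.dtd, i.e. XML files
    # that are not locale files (registry files and the like).
    find "$root" -type f -name '*.xml' -exec grep -L -F 'locale.dtd' {} + |
        while IFS= read -r xml; do cat "$xml"; done >> "$haystack"
    # Report each key that never occurs in the collected text.
    while IFS= read -r key; do
        [ -n "$key" ] || continue
        grep -q -F -- "$key" "$haystack" || printf '%s\n' "$key"
    done < "$keys_file"
    rm -f "$haystack"
)
```

Run as unused_keys /tmp/keys.txt . from the application root, this would leave keys that are referenced only from registry XML out of the removal list.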

@Matt: I think this commit will resolve your problem. If it doesn't, just let me know.

Bring back some used keys:
Comment 13 Juan Pablo Alperin 2011-06-21 16:15:09 PDT
Can this be closed?
Comment 14 Matthew Crider 2011-06-21 16:17:48 PDT
Yes, Bruno resolved the translation issues I was seeing.