Design for Collaborative Translation Tool

Hi all,


This thread is the continuation of http://community.elgg.org/pg/forum/topic/808710/collaborative-internalization-for-elgg/.

Here is my initial design for the collaborative web-based translation tool.

Approach:

  1. The user will upload the plugin in .ZIP format to this web-based tool. The same will be done for the core.
  2. The tool will automatically unzip the file and read the manifest.xml file to get the plugin details. The details will be added to the PLUGIN_MASTER table.
  3. Then the PHP files inside the languages folder will be scanned and read individually.
  4. Every language key and value in each language file will be parsed and loaded into the PLUGIN_STRINGS table.
  5. All the loaded strings will then be shown on a new page where users can translate each string into the language they need.
  6. English will be treated as the reference language; the strings for other languages will be shown alongside it, together with the percentage of translation completed.
  7. A user can add a new language for the plugin if it has not already been translated into that language.
  8. For each language selected, all the strings will be displayed. A user can report a particular string if its translation is not correct; for that string, the str_marked column will be set to TRUE.
  9. Other users can then review the strings whose str_marked column is TRUE and come up with a better translation.
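The actual tool would presumably be written in PHP like Elgg itself, but to make steps 2–4 concrete, here is a rough Python sketch of the unzip-and-parse stage. The function name, the manifest layout (simple child elements such as `<name>` and `<version>`), and the single-quoted `'key' => 'value'` pattern are assumptions for illustration; the real Elgg manifest schema and language-file syntax would need closer handling (escaped quotes, double-quoted strings, etc.).

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# Matches entries like  'some:key' => 'Some value'  inside an Elgg
# language array. A production parser would also need to handle
# escaped quotes and double-quoted strings.
ENTRY_RE = re.compile(r"'([^']+)'\s*=>\s*'([^']*)'")

def parse_plugin_zip(zip_bytes):
    """Return (manifest_fields, {lang: {key: value}}) from a plugin .zip."""
    manifest, strings = {}, {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith('manifest.xml'):
                # Assumed manifest format: flat child elements.
                root = ET.fromstring(zf.read(name))
                for child in root:
                    manifest[child.tag] = (child.text or '').strip()
            elif '/languages/' in name and name.endswith('.php'):
                lang = name.rsplit('/', 1)[1][:-4]   # 'en.php' -> 'en'
                src = zf.read(name).decode('utf-8')
                strings[lang] = dict(ENTRY_RE.findall(src))
    return manifest, strings
```

The returned `manifest` dict would feed the PLUGIN_MASTER insert, and each `(lang, key, value)` triple would become a row in PLUGIN_STRINGS.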



PLUGIN_MASTER Table:

plugin_id      : Integer, auto-generated, Primary Key
plugin_name    : Text(30)
plugin_version : Text(10)
plugin_author  : Text(30)
plugin_tms     : CURRENT TIMESTAMP


PLUGIN_STRINGS Table:

plugin_id      : Integer, Foreign Key referencing PLUGIN_MASTER
str_lang       : Text(5)
str_key        : Text(50)
str_value      : Text(200)
str_marked     : Boolean - Default value of FALSE
plugin_tms     : Insert TIMESTAMP
plugin_updt_tms: Last Update TIMESTAMP


PLUGIN_LANGUAGES Table:

iso_lang       : Text - Holds the valid ISO language codes

  • Yes. But that alone does not solve the requirement.

    We need to have a rating system to have the best translation.

  • A few comments on gettext: before we switch to gettext, we would need to evaluate its speed compared to the current approach. We would also need to figure out what other reasons the original developers of Elgg had for moving away from it. In the small amount of investigation that I did on this, I turned up the following:

    • gettext may require that Apache be restarted after making changes to a language file
    • gettext may require that the server have the locale installed before it will use a particular language file (this means that if the server does not have the fr locale, it won't use the French language file)

    I'm not going to be able to look into this any more for a while. It would be great if members of the community would. Obviously, other FOSS projects are using gettext, so how did they work around these issues (if in fact they exist)?

  • Let me try to do a brief analysis of how other projects implement gettext.

    Thanks for these points, Cash.

  • Purus: Check Drupal. It uses gettext.

  • @Cash: I think that gettext-based apps use the locale setting of the app, not the environment. I am one of the main translators (to Hungarian, of course) of LimeSurvey, an online questionnaire management tool, and it is based on gettext: .po files serve as the raw translation files, which are then compiled into (human-unreadable) .mo files. I don't remember ever having to restart any service to put updates to these files into action; they were instantaneous (of course, you have to have access to the file system to upload the translations, but that's all).
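    For anyone unfamiliar with the workflow described above, a .po file is just a plain-text list of msgid/msgstr pairs; the entry below is an invented example, with an illustrative French translation:

    ```po
    # Comment lines and metadata headers are allowed at the top.
    msgid "Welcome to the site"
    msgstr "Bienvenue sur le site"
    ```

    The compile step is a one-liner with the GNU gettext tools, e.g. `msgfmt site.po -o site.mo`, and the resulting .mo file is what the application actually loads at runtime.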

  • Thanks for the post, Daniel. In my reading on this so far, it looks like the app sets the locale but there may be a requirement that the server has this locale installed. There is some documentation on this here: http://www.php.net/manual/en/function.setlocale.php. It's a little vague. The comments seem to point to requiring the server to have the locale installed. I don't have any actual experience with gettext though.

  • And I forgot to add: normally I use Poedit as my translation parser/editor/export tool. It has a dumb but still useful translation memory service included that digs through the strings already translated and comes up with suggestions (I say it's dumb because the suggestions are usually just raw piles of words). Poedit can perform gettext string extraction on its own if it has the whole source code stored locally to be parsed. If the source code gets updated, it has to be downloaded to the same local folder and the gettext process redone, and Poedit will come up with a list of obsolete/fuzzy/missing strings. That feature is somewhat advanced, though.

  • Hi, Cash, just take a look here (maybe it's worth spending two hours checking out the whole source code and taking a peek at the translation status window).

    Given the nature of questionnaire-based research, the tool must be able to handle respondents in multiple languages; that is, the same questionnaire can have multiple language versions of the same question (and, very frequently, answer) strings, while all the built-in strings (pagination, controls, etc.) appear in the respondent's chosen language. (The survey display language is normally set by a URL parameter.)

    Anyway, has anyone had a language display problem stemming from a missing (non-installed) locale?

  • You should check out this online translation tool: https://poeditor.com. I've been using it for a few weeks now and it really feels good; no troubles so far. It's a good example of an efficient tool.

Feedback and Planning

Discussions about the past, present, and future of Elgg and this community site.