Free software is going international! The Translation Project is a way to get maintainers, translators and users all together, so free software will gradually become able to speak many native languages.
The GNU gettext tool set contains everything maintainers need for internationalizing their packages for messages. It also contains quite useful tools for helping translators localize messages into their native language, once a package has already been internationalized.
To achieve the Translation Project, we need many interested people who like their own language and write it well, and who are also able to synergize with other translators speaking the same language. If you'd like to volunteer to work at translating messages, please send mail to your translating team.
Each team has its own mailing list, courtesy of Linux International. You may reach your translating team at the address `email@example.com´, replacing ll by the two-letter ISO 639 code for your language. Language codes are not the same as country codes given in ISO 3166. The following translating teams exist:
For example, you may reach the Chinese translating team by writing to `firstname.lastname@example.org´. When you become a member of the translating team for your own language, you may subscribe to its list. For example, Swedish people can send a message to `email@example.com´, having this message body:
Keep in mind that team members should be interested in working at translations, or at solving translational difficulties, rather than merely lurking around. If your team does not exist yet and you want to start one, please write to `firstname.lastname@example.org´; you will then reach the coordinator for all translator teams.
This is now official, GNU is going international! Here is the announcement submitted for the January 1995 GNU Bulletin:
A handful of GNU packages have already been adapted and provided with message translations for several languages. Translation teams have begun to organize, using these packages as a starting point. But there are many more packages and many languages for which we have no volunteer translators. If you'd like to volunteer to work at translating messages, please send mail to `firstname.lastname@example.org´ indicating what language(s) you can work on.
This document should answer many questions for those who are curious about the process or would like to contribute. Please at least skim over it; we hope this will cut down a little on the high volume of e-mail generated by this collective effort towards internationalization of free software.
Most free programming which is widely shared is done in English, and currently, English is used as the main language of communication between national communities collaborating on free software. This very document is written in English. This will not change in the foreseeable future.
However, there is a strong appetite from national communities for having more software able to write using national language and habits, and there is an ongoing effort to modify free software in such a way that it becomes able to do so. The experiments conducted so far raised an enthusiastic response from pretesters, so we believe that internationalization of free software is destined to succeed.
To suggest clarifications, additions or corrections to this document, please e-mail `email@example.com´.
Facing this internationalization effort, a few users expressed their concerns. Some of these doubts are presented and discussed here.
Some people wonder whether using GNU gettext necessarily brings their package under the protective wing of the GNU General Public License or the GNU Library General Public License, when they do not want to make their program free, or want other kinds of freedom. The simplest answer is "normally not".
The gettext-runtime part of GNU gettext, i.e. the contents of libintl, is covered by the GNU Library General Public License. The gettext-tools part of GNU gettext, i.e. the rest of the GNU gettext package, is covered by the GNU General Public License.
The mere marking of localizable strings in a package, or conditional inclusion of a few lines for initialization, is not really including GPL'ed or LGPL'ed code. However, since the localization routines in libintl are under the LGPL, the LGPL needs to be considered. It gives the right to distribute the complete unmodified source of libintl even with non-free programs. It also gives the right to use libintl as a shared library, even for non-free programs. But it gives the right to use libintl as a static library or to incorporate libintl into another library only to free software.
On a larger scale, the true solution would be to organize some kind of fairly precise setup in which volunteers could participate. I gave some thought to this idea lately, and realize there will be some touchy points. I thought of writing to Richard Stallman to launch such a project, but feel it might be good to shake out the ideas between ourselves first. Most probably Linux International has some experience in the field already, or would perhaps like to orchestrate the volunteer work. Food for thought, in any case!
I guess we have to set up something early, somehow, that will help many possible contributors of the same language to interlock and avoid duplication of work, and further be put in contact to solve together the problems particular to their tongue (in most languages, there are many difficulties peculiar to translating technical English). My Swedish contributor acknowledged these difficulties, and I'm well aware of them for French.
This is surely not a technical issue, but we should manage things so that the effort of locale contributors is maximally useful, despite the national team layer interfacing between contributors and maintainers.
The Translation Project needs some setup for coordinating language
coordinators. Localizing evolving programs will surely
become a permanent and continuous activity in the free software community,
once well started.
The setup should be minimally completed and tested before GNU
gettext becomes an official reality. The e-mail address
`firstname.lastname@example.org´ has been set up for receiving
offers from volunteers and general e-mail on these topics. This address
reaches the Translation Project coordinator.
I also think GNU will need, sooner than it thinks, someone to set up a way to organize and coordinate these groups. Some kind of group of groups. My opinion is that it would be good for GNU to delegate this task to a small group of collaborating volunteers, shortly. Perhaps a list of these national committees can be published in `gnu.announce´.
My role as coordinator would simply be to refer to Ulrich any German-speaking volunteer interested in localization of free software packages, and maybe to help national groups organize initially, while maintaining national registries until national groups are ready to take over. In fact, the coordinator should make it easy for volunteers to get in contact with one another for creating national teams, which should then select one coordinator per language, or country (regionalized language). If well done, the coordination should be useful without being an overwhelming task, just long enough to put delegations in place.
I suggest we look for volunteer coordinators/editors for individual languages. These people will scan contributions of translation files for various programs, for their own languages, and will ensure high and uniform standards of diction.
From my current experience with other people in these days, those who provide localizations are very enthusiastic about the process, and are more interested in the localization process than in the program they localize, and want to do many programs, not just one. This seems to confirm that having a coordinator/editor for each language is a good idea.
We need to choose someone who is good at writing clear and concise prose in the language in question. That is hard--we can't check it ourselves. So we need to ask a few people to judge each other's writing and select the one who is best.
I announced my prerelease to a few dozen people, and you would not believe all the discussions it has generated already. I shudder to think what will happen when this is launched, for true, officially, worldwide. Who am I to arbitrate between two Czechoslovak users contradicting each other, for example?
I assume that your German is not much better than my French so that I would not be able to judge about these formulations. What I would suggest is that for each language there is a group for people who maintain the PO files and judge about changes. I suspect there will be cultural differences between how such groups of people will behave. Some will have relaxed ways, reach consensus easily, and have anyone of the group relate to the maintainers, while others will fight to death, organize heavy administrations up to national standards, and use strict channels.
The German team is putting out a good example. Right now, they are maybe half a dozen people revising translations of each other and discussing the linguistic issues. I do not even have all the names. Ulrich Drepper is taking care of coordinating the German team. He subscribed to all my pretest lists, so I do not even have to warn him specifically of incoming releases.
I'm sure that it is a good idea to get teams for each language working on translations. That will make the translations better and more consistent.
Taking French for example, there are a few sub-cultures around computers which developed diverging vocabularies. Picking volunteers here and there without addressing this problem in an organized way, early in the project, might produce a distasteful mix of internationalized programs, and possibly trigger endless quarrels among those who really care.
Keeping some kind of unity in the way French localization of
internationalized programs is achieved is a difficult (and delicate) job.
Knowing the Latin character of French people (:-), if we take this
the wrong way, we could end up nowhere, or waste a lot of energy.
Maybe we should begin to address this problem seriously before
gettext becomes officially published. And I suspect that this
I expect the next big changes after the official release. Please note that I use the German translation of the short GPL message. We need to set a few good examples before the localization goes out for true in the free software community. Here are a few points to discuss:
If we get any inquiries about GNU
gettext, send them on to:
The `*-pretest´ lists are quite useful to me; maybe the idea could be generalized to many GNU and non-GNU packages. But to each maintainer his or her own way!
François, we have a mechanism in place here at `gnu.ai.mit.edu´ to track teams, support mailing lists for them and log members. We have a slight preference that you use it. If this is OK with you, I can get you clued in.
Things are changing! A few years ago, when Daniel Fekete and I
asked for a mailing list for GNU localization, hosted at the FSF, we
were politely invited to organize it anywhere else, and so we did.
For communicating with my pretesters, I later made a handful of
mailing lists located at iro.umontreal.ca and administered by
majordomo. These lists have been very dependable
I suspect that the German team will organize itself a mailing list located in Germany, and so forth for other countries. But before they organize for true, it could surely be useful to offer mailing lists located at the FSF to each national team. So yes, please explain to me how I should proceed to create and handle them.
We should create temporary mailing lists, one per country, to help people organize. Temporary, because once regrouped and structured, it would be fair that the volunteers from each country bring their list back there and manage it as they want. My feeling is that, in the long run, each team should run its own list, from within its country. There also should be some central list to which all teams could subscribe as they see fit, as long as each team is represented in it.
There will surely be some discussion about these messages after the packages are finally released. If people now send you some proposals for better messages, how do you proceed? Jim, please note that right now, as I put forward nearly a dozen localizable programs, I receive both the translations and the coordination concerns about them.
If I put one of my things to pretest, Ulrich receives the announcement and passes it on to the German team, who make last minute revisions. Then he submits the translation files to me as the maintainer. For free packages I do not maintain, I would not even hear about it. This scheme could be made to work for the whole Translation Project, I think. For security reasons, maybe Ulrich (national coordinators, in fact) should update the central registry kept at the Translation Project (Jim, me, or Len's recruits) once in a while.
In December/January, I was aggressively ready to internationalize all of GNU, giving myself the duty of one small GNU package per week or so, taking many weeks or months for bigger packages. But it does not work this way. I first did all the things I'm responsible for. I've nothing against some missionary work on other maintainers, but I'm also losing a lot of energy over it--same debates over again.
And when the first localized packages are released we'll get a lot of responses about ugly translations :-). Surely, and we need to have beforehand a fairly good idea about how to handle the information flow between the national teams and the package maintainers.
Please start saving somewhere a quick history of each PO file. I know for sure that the file format will change, allowing for comments. It would be nice if each file had a kind of log, and references for those who want to submit comments or gripes, or otherwise contribute. I sent a proposal for a fast and flexible format, but it has not yet been accepted by the GNU deciders. I'll tell you when I have more information about this.
A translator sometimes has only a limited amount of time per week to spend on a package, and some packages have quite large message catalogs (over 1000 messages). Therefore she wishes to translate first the messages that are the most visible to the user, or that occur most frequently. This section describes how to determine these "most urgent" messages. It also applies to determining the "next most urgent" messages after the message catalog has already been partially translated.
In a first step, she uses the programs like a user would do. While she
does this, the GNU
gettext library logs into a file the not yet
translated messages for which a translation was requested from the program.
In a second step, she uses the PO mode to translate precisely this set of messages.
Here are more details. The GNU libintl library (but not the corresponding functions in GNU libc) supports an environment variable GETTEXT_LOG_UNTRANSLATED, which names a log file. The GNU libintl library will log into this file the messages for which gettext() and related functions couldn't find the translation. If the file doesn't exist, it will be created as needed. On systems with GNU libc a shared library `preloadable_libintl.so´ is provided that can be used with the ELF `LD_PRELOAD´ environment variable.
So, in the first step, the translator uses these commands on systems with GNU libc:
$ LD_PRELOAD=/usr/local/lib/preloadable_libintl.so
$ export LD_PRELOAD
$ GETTEXT_LOG_UNTRANSLATED=$HOME/gettextlogused
$ export GETTEXT_LOG_UNTRANSLATED
and these commands on other systems:
$ GETTEXT_LOG_UNTRANSLATED=$HOME/gettextlogused
$ export GETTEXT_LOG_UNTRANSLATED
Then she uses and peruses the programs. (It is a good and recommended practice to use the programs for which you provide translations: it gives you the needed context.) When done, she removes the environment variables:
$ unset LD_PRELOAD
$ unset GETTEXT_LOG_UNTRANSLATED
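As an alternative to exporting and later unsetting the variables, the logging environment can be confined to a single command. The following is only a sketch: run_logged is a hypothetical helper name, not part of GNU gettext, and the check for the preloadable library simply reuses the path from the commands above, which may differ on your system.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not part of GNU gettext): runs COMMAND with logging of
# untranslated messages enabled for that one command only. The function body
# is a subshell, so nothing leaks into the calling shell's environment.
run_logged() (
    log=$1
    shift
    # Path taken from the example above; an assumption about the local install.
    preload=/usr/local/lib/preloadable_libintl.so
    if [ -f "$preload" ]; then
        # GNU libc systems: preload the logging libintl as well.
        exec env LD_PRELOAD="$preload" GETTEXT_LOG_UNTRANSLATED="$log" "$@"
    else
        # Other systems: the program is assumed to use GNU libintl already.
        exec env GETTEXT_LOG_UNTRANSLATED="$log" "$@"
    fi
)

# Example use (commented out so that sourcing this file runs nothing):
# run_logged "$HOME/gettextlogused" ls --help
```

With this approach there is nothing to unset afterwards, at the cost of having to prefix every program the translator wants to observe.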
The second step starts with removing duplicates:
$ msguniq $HOME/gettextlogused > missing.po
The result is a PO file, but needs some preprocessing before the Emacs PO mode can be used with it. First, it is a multi-domain PO file, containing messages from many translation domains. Second, it lacks all translator comments and source references. Here is how to get a list of the affected translation domains:
$ sed -n -e 's,^domain "\(.*\)"$,\1,p' < missing.po | sort | uniq
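To see what the command above extracts, it can be tried on a small multi-domain file. The sample below is made up for illustration (the messages, the temporary file name and the function name list_domains are all hypothetical); only the domain lines matter to sed.

```shell
# Hypothetical sample of a multi-domain PO file, for illustration only.
sample=${TMPDIR:-/tmp}/sample-missing.po
cat > "$sample" <<'EOF'
domain "coreutils"

msgid "cannot open %s"
msgstr ""

domain "tar"

msgid "Unexpected EOF in archive"
msgstr ""

domain "coreutils"

msgid "write error"
msgstr ""
EOF

# Same sed expression as in the text, wrapped in a function for reuse.
list_domains() {
    sed -n -e 's,^domain "\(.*\)"$,\1,p' < "$1" | sort | uniq
}

list_domains "$sample"
# prints:
#   coreutils
#   tar
```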
Then the translator can handle the domains one by one. For simplicity, let's use environment variables to denote the language, domain and source package.
$ lang=nl             # your language
$ domain=coreutils    # the name of the domain to be handled
$ package=/usr/src/gnu/coreutils-4.5.4   # the package where it comes from
She takes the latest copy of `$lang.po´ from the Translation Project, or from the package (in most cases, `$package/po/$lang.po´), or creates a fresh one if she's the first translator (see section 5 Creating a New PO File). She then uses the following commands to mark the not urgent messages as "obsolete". (This doesn't mean that these messages - translated and untranslated ones - will go away. It simply means that Emacs PO mode will ignore them in the following editing session.)
$ msggrep --domain=$domain missing.po | grep -v '^domain' \
  > $domain-missing.po
$ msgattrib --set-obsolete --ignore-file $domain-missing.po $domain.$lang.po \
  > $domain.$lang-urgent.po
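The file names in these commands are built purely by shell variable expansion, which can be hard to read at a glance. As a small sketch (the function name workflow_files is hypothetical), the following prints the names that the two commands above read and write for a given domain and language:

```shell
# workflow_files DOMAIN LANG
# Hypothetical helper: prints the file names used by the msggrep and
# msgattrib commands above, one per line.
workflow_files() (
    domain=$1
    lang=$2
    # '-' and '.' cannot be part of a variable name, so each expansion
    # ends exactly where the text intends.
    printf '%s\n' \
        "$domain-missing.po" \
        "$domain.$lang.po" \
        "$domain.$lang-urgent.po"
)

workflow_files coreutils nl
# prints:
#   coreutils-missing.po
#   coreutils.nl.po
#   coreutils.nl-urgent.po
```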
Then she translates `$domain.$lang-urgent.po´ using Emacs PO mode.
(FIXME: I don't know whether
preserve obsolete messages, as they should.)
Finally she restores the not urgent messages (with their earlier
translations, for those which were already translated) through this command:
$ msgmerge --no-fuzzy-matching $domain.$lang-urgent.po $package/po/$domain.pot \
  > $domain.$lang.po
Then she can submit `$domain.$lang.po´ and proceed to the next domain.
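Within a domain, the raw log can also hint at which messages occur most frequently, provided it is examined before msguniq removes the duplicates. The sketch below rests on two assumptions that are not stated in the text: that the log holds one entry per failed lookup, and that each msgid fits on a single line. The function name rank_msgids is hypothetical.

```shell
# rank_msgids LOGFILE
# Hypothetical helper: counts how often each single-line msgid appears in the
# raw log and lists the most frequent ones first. Multi-line msgids (which
# start with an empty first line) are not handled by this sketch.
rank_msgids() {
    grep '^msgid "' "$1" | sort | uniq -c | sort -rn
}

# Example use on the raw log, before msguniq is run:
# rank_msgids "$HOME/gettextlogused" | head -n 20
```

Each output line carries a count followed by the msgid line, so the messages at the top are reasonable candidates to translate first.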