[Padre-dev] localizing plugin documentation
Enrique Nell
perl_nell at telefonica.net
Thu Apr 30 17:15:49 PDT 2009
Hi
On Apr 30, 2009, at 3:31 PM, Joaquin Ferrero wrote:
> We used OmegaT for the Perl Spanish Project:
I have been using OmegaT lately. It hangs frequently on Mac OS X, but
otherwise it's quite nice.
It provides fuzzy matching and handles document updates conveniently,
so I think it is a good option for translating documentation.
The issue here is that most current computer-assisted translation
(CAT) tools, like OmegaT, work at the segment level, and POD is not
too friendly in this respect: paragraphs are formatted with hard line
breaks, so a typical segment (a sentence) is broken across several
lines.
For instance, a CAT tool will find 3 segments in the following
single-sentence paragraph:
If a message can be controlled by the C<warnings> pragma, its warning
category is included with the classification letter in the description
below.
There's no point in using this kind of translation memory tool if we
are not going to process complete sentences.
So, to handle this correctly, we should pre-process the POD files and
either remove these "inner" line breaks or turn them into a literal
"\n". Is there a CPAN module available for this?
Could it be done with gettext? (I don't know much about it... good
references are welcome.)
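A minimal sketch of the first option (removing the "inner" line
breaks), written in Python for illustration and not taken from any
existing CPAN module. The function name unwrap_pod is hypothetical.
It assumes the input is a pure .pod file, treats any paragraph whose
first line is indented as verbatim, and does not special-case
=begin/=end regions:

```python
import re


def unwrap_pod(text: str) -> str:
    """Join hard-wrapped lines inside ordinary POD paragraphs.

    Assumptions (not from the original post): the input is pure POD,
    paragraphs are separated by one or more blank lines, and a
    paragraph whose first line starts with whitespace is verbatim.
    """
    # Split on runs of blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n(?:[ \t]*\n)+", text.strip("\n"))
    unwrapped = []
    for para in paragraphs:
        if para[:1] in (" ", "\t"):
            # Verbatim paragraph: line breaks are significant, keep them.
            unwrapped.append(para)
        else:
            # Ordinary or command paragraph: one line per paragraph.
            unwrapped.append(" ".join(line.strip() for line in para.splitlines()))
    return "\n\n".join(unwrapped) + "\n"


sample = (
    "If a message can be controlled by the C<warnings> pragma, its warning\n"
    "category is included with the classification letter in the description\n"
    "below.\n"
    "\n"
    "    my $x = 1;\n"
    "    print $x;\n"
)
result = unwrap_pod(sample)
print(result)
```

With the sample above, the three-line paragraph comes out as a single
line, so a CAT tool would see one segment, while the indented code
block keeps its line breaks.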
Some time ago I developed a short program to solve a similar problem:
I had to remove some of the line breaks
from a set of PDF export files to get full sentences, in order to
align them and create a translation memory.
It was based on some heuristics and the output wasn't perfect, but it
fixed a high percentage of the issues.
Perhaps some of you know better ways of doing this.
> The translation is at sentence level, and the process is very fast
> if the players interchange the translation memories.
> Well, in theory...
In a typical project you won't have enough repetitions and high fuzzy
matches to get a speed boost (except for naïve texts); that would
require working at a finer granularity (at the chunk level).
But it sure helps to improve consistency and makes updating the files
a lot easier: OmegaT pre-translates
the new version of the file using the translation memory and it also
shows the differences for fuzzy matches.
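To illustrate the idea of a fuzzy match against a translation memory,
here is a rough sketch using Python's difflib. It is not OmegaT's
actual algorithm; the function names, the 0.7 threshold and the sample
memory entry are all assumptions made for the example:

```python
import difflib


def best_fuzzy_match(segment, memory, threshold=0.7):
    """Return (source, target, score) for the closest memory entry,
    or None if nothing scores at least `threshold` (an assumed cutoff)."""
    best = None
    for source, target in memory.items():
        score = difflib.SequenceMatcher(None, segment, source).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, target, score)
    return best


def word_diff(old, new):
    """List the words removed from `old` ("- ") and added in `new` ("+ ")."""
    delta = difflib.ndiff(old.split(), new.split())
    return [tok for tok in delta if tok.startswith(("+ ", "- "))]


# Hypothetical one-entry translation memory (English -> Spanish).
memory = {"Open the file.": "Abra el archivo."}

match = best_fuzzy_match("Open the file now.", memory)
changes = word_diff("Open the file.", "Open the file now.")
print(match)
print(changes)
```

The translator would then be offered the stored translation together
with the list of changed words, instead of starting from scratch.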
Enrique