Olifant: Translation Memory Editor: Tms and Glossaries
Granada (Spain)
skype: javier-herrera
www.javierh.net
Pitfalls
1) Since the find function doesn't search backwards, i.e., it ignores the translation units
preceding the one you're in, a "word not found" message may be misleading.
2) You may find that the latest changes made aren't undone with the well-known keyboard
shortcut Ctrl+Z. This may become very frustrating when you realise that the undo option on the edit
menu doesn't work either, especially if you've just inadvertently deleted segments that you intended
to keep. But don't despair. Olifant distinguishes between two similar but not identical functions. One,
"undo last edit", applies to changes in the text per se, and the other, "undo last table changes" (also
in the edit menu), to operations that affect database structure, which include flagging or deleting
segments and swapping target text entries.
Practical cases
While the project site has a number of tutorials, a read-through of the problems addressed here
may be useful, even if the reader isn't faced with the situations described (in fact, some of the
examples are a bit strained: in case 2, for instance, it should be the client's responsibility to furnish
a suitable memory). This section describes Olifant's major functions. The resources discussed can
in all likelihood be combined with others and used to formulate strategies for dealing with any
number of complex situations. The cases are set out below, in "advice column" format and with
illustrations whose aesthetics lie midway between comic book and Ikea manual.
Case 1: I have a TM in tmx format and want to change the assigned source and target
languages from Irish to British English and from Colombian to European Spanish. I've opened it
with conventional word processors, which either won't edit the tags (Word) or have no bulk find
and replace option (Notepad).
JavierH: To begin with, let me say that you don't need a sledgehammer to crack a nut: both
OpenOffice and Notepad++ have bulk find and replace functions. The latter is a simple tool
actually designed for programmers, but is so readily installed that it's worth downloading just for this
purpose. A word of advice for anyone planning to use Olifant and study the images below: don't
confuse the (create a) new function with the import a translation memory function. The latter, as
I've mentioned, is used to add segments from a second memory to the TM that's open: in other
words, to merge two memories.
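For readers who would rather script the change than use a text editor, the same bulk find and replace can be sketched in a few lines of Python. This is only a minimal sketch and has nothing to do with Olifant itself: the function name replace_lang_codes and the codes en-IE, en-GB, es-CO and es-ES are assumptions chosen for illustration, so check which codes and attributes (srclang, xml:lang or, in some older files, lang) your own TMX actually contains.

```python
import re

# Assumed locale codes for this example; adjust to the codes in your own TMX.
LANG_MAP = {"en-IE": "en-GB", "es-CO": "es-ES"}

# Matches srclang="..", xml:lang=".." or lang="..", with single or double quotes.
LANG_ATTR = re.compile(r'''\b(srclang|xml:lang|lang)(\s*=\s*)(["'])([^"']+)\3''')


def replace_lang_codes(tmx_text, lang_map=LANG_MAP):
    """Swap language codes inside language attributes only, never in segment text."""
    lookup = {old.lower(): new for old, new in lang_map.items()}

    def swap(match):
        name, equals, quote, code = match.groups()
        # Codes are compared case-insensitively; unknown codes are left untouched.
        new_code = lookup.get(code.lower(), code)
        return f"{name}{equals}{quote}{new_code}{quote}"

    return LANG_ATTR.sub(swap, tmx_text)


sample = (
    '<header srclang="en-IE"/>'
    '<tu><tuv xml:lang="en-IE"><seg>colour</seg></tuv>'
    '<tuv xml:lang="es-CO"><seg>color</seg></tuv></tu>'
)
converted = replace_lang_codes(sample)
print(converted)
```

To apply it to a real file, read the TMX as text with the encoding declared on its first line, pass the contents through replace_lang_codes and write the result to a new file, keeping the original as a backup.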
****************************
C2: My client has sent me an updated memory for a follow-up job and asked me to use his
version, to which a series of important segments has been added. Normally, I would simply use the
most recent TM and disregard my previous version, but I happen to have invested a lot of effort in
modifying mine to adapt it to the terminology and style preferred by the end user. Since those
preferences aren't accommodated in the database that I've just received, I can only conclude that
its sole advantage over mine is that it has more TUs. Ideally, I'd like to have a procedure that would
merge the two, keeping the best of each. How can Olifant help in this case?
JH: The function that you can use here is overwrite. First, open the new memory with
Olifant and then import the earlier TM into it, as shown in the screenshots. Where the English
field appears in both memories, the entry will carry the Spanish segment from the earlier TM (the
second memory opened), while the segments that appear only once, i.e., that weren't in the prior
database, will remain intact.
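The logic of that merge can be pictured with a small Python sketch. It is a toy model of the behaviour described above, not Olifant's actual implementation: each memory is reduced to a dictionary mapping the English source to the Spanish target, the function name import_with_overwrite is hypothetical, and I'm assuming that imported entries missing from the open memory are simply added.

```python
def import_with_overwrite(open_tm, imported_tm):
    """Return a merged memory in which imported targets win on duplicate sources."""
    merged = dict(open_tm)  # work on a copy so the open memory is left untouched
    for source, target in imported_tm.items():
        # Duplicate source: the imported target replaces the existing one.
        # New source: the entry is added (an assumption of this sketch).
        merged[source] = target
    return merged


# The client's updated memory (opened first) and my own adapted TM (imported).
client_tm = {"Save the file": "Guarde el archivo", "Print": "Imprimir"}
my_tm = {"Save the file": "Guarde el fichero"}

merged = import_with_overwrite(client_tm, my_tm)
for source, target in merged.items():
    print(f"{source} -> {target}")
```

The order matters: whichever memory is imported second is the one whose wording survives for the duplicated segments.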
****************************
C3: I have a problem rather like that set out in the preceding query, but I don't
necessarily want to accept certain segments just because they're associated with one of the TMs. I
have to be free to compare the two versions of some, or all, of the translation units, delete the
version I dislike and edit the other one conscientiously according to my own criteria.
JH: In that case, you need to create a blank memory where you'll merge the other two.
Then flag the TUs whose original field appears in both memories (the program calls them duplicate
entries) for ready identification, and display each segment with its counterpart to compare
each pair in detail. It's advisable to label the two TMs during the import process by creating a third
field that identifies the origin of each entry. Lastly, use filters to display only the flagged
segments. What you shouldn't do in this case is overwrite.
You need to perform step 2 because displaying any additional fields created isn't the default setting
in Olifant.
With this last step, the entries are automatically shown in alphabetical order.
What we've done is use the filter to display only the TUs we need to see (the ones previously
flagged). Otherwise, we'd have to use the scrollbar to run through the entire text to locate the
flagged units.
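For anyone who finds it easier to follow the procedure as code, the sequence of merge, label, flag and filter can be sketched in Python. This is an illustration of the idea only, not a description of how Olifant works internally: the function names, the "origin" and "flagged" keys and the labels "A" and "B" are all assumptions made for the example.

```python
from collections import defaultdict


def merge_with_origin(tm_a, tm_b):
    """Pour both memories into one list, labelling each entry with its origin."""
    merged = []
    for origin, tm in (("A", tm_a), ("B", tm_b)):
        for source, target in tm:
            merged.append(
                {"source": source, "target": target, "origin": origin, "flagged": False}
            )
    return merged


def flag_duplicates(entries):
    """Flag every entry whose source text appears in both memories."""
    origins_by_source = defaultdict(set)
    for entry in entries:
        origins_by_source[entry["source"]].add(entry["origin"])
    for entry in entries:
        entry["flagged"] = len(origins_by_source[entry["source"]]) > 1


def flagged_only(entries):
    """Filter to the flagged entries, sorted so that counterparts sit together."""
    flagged = [entry for entry in entries if entry["flagged"]]
    return sorted(flagged, key=lambda entry: (entry["source"], entry["origin"]))


tm_a = [("Save the file", "Guarde el fichero"), ("Close", "Cerrar")]
tm_b = [("Save the file", "Guarde el archivo"), ("Print", "Imprimir")]

merged = merge_with_origin(tm_a, tm_b)
flag_duplicates(merged)
shown = flagged_only(merged)
for entry in shown:
    print(entry["origin"], "|", entry["source"], "|", entry["target"])
```

Nothing is overwritten at any point: both versions of each duplicated unit stay in the merged memory until you decide which one to delete.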
****************************
C4: My client has sent me an updated TM. I don't know what changes are involved, only
that corrections have been made. I want to identify the corrections that are most often repeated to
standardise on terminology and style, even in the segments with no matches. But I'm afraid that to
do so, I'll have to use the concordance button and check term by term to determine whether the
client accepted the wording I've been using. I tried to use the Word compare and combine
documents function, but since there are so many units in the earlier TM that aren't in the new
memory, and vice-versa, the text looks like a battlefield, rendering any careful review extremely
laborious. Does Olifant have some feature that yields the same result but less messily?
JH: No, but it does make our lives somewhat simpler. We can use a couple of tricks to
eliminate the segments that only create noise, and produce two Word tables, one for each TM, with
the rest.
Start off as in the case of memory A+B, i.e., using the two databases you're working with.
Go through all the same steps and once the duplicate entries have been filtered, sort
alphabetically on the "segment origin" field (by simply clicking on the name of the field). That will
group all the memory A segments at the top of the list and the memory B segments at the bottom.
Lastly, export memory A+B as a Wordfast file, even if you don't have that program. It's the only
manageable format that can be readily converted into a table with a conventional word processor,
so that the text can later be split in two. Since you're applying the filter, only the segments
displayed when the export is performed will be processed. Now all you have to do is compare
the resulting files.
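If the two tables end up as plain data rather than Word documents, the final comparison can also be scripted. The Python sketch below is one possible way of doing it, not something Olifant offers: it assumes each exported row has been reduced to a source, a target and an origin label ("A" for the earlier TM, "B" for the client's update, both labels being my own convention), and it uses the standard difflib module to count which word-level replacements recur most often.

```python
from collections import Counter
from difflib import SequenceMatcher


def split_by_origin(entries):
    """Split the filtered export into two source -> target tables, one per memory."""
    table_a = {e["source"]: e["target"] for e in entries if e["origin"] == "A"}
    table_b = {e["source"]: e["target"] for e in entries if e["origin"] == "B"}
    return table_a, table_b


def repeated_corrections(table_a, table_b):
    """Count word-level replacements between the two versions of each segment."""
    counts = Counter()
    for source, old_target in table_a.items():
        new_target = table_b.get(source)
        if new_target is None or new_target == old_target:
            continue  # no counterpart, or the client left the segment as it was
        old_words, new_words = old_target.split(), new_target.split()
        matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "replace":
                change = (" ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
                counts[change] += 1
    return counts


entries = [
    {"source": "Save the file", "target": "Guarde el fichero", "origin": "A"},
    {"source": "Save the file", "target": "Guarde el archivo", "origin": "B"},
    {"source": "Open the file", "target": "Abra el fichero", "origin": "A"},
    {"source": "Open the file", "target": "Abra el archivo", "origin": "B"},
    {"source": "Close", "target": "Cerrar", "origin": "A"},
    {"source": "Close", "target": "Cerrar", "origin": "B"},
]

table_a, table_b = split_by_origin(entries)
corrections = repeated_corrections(table_a, table_b)
for (old, new), times in corrections.most_common():
    print(f"{old} -> {new}: {times}")
```

The count only captures straightforward word substitutions; insertions, deletions and reordered sentences would still need to be reviewed by eye.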
Note: the original Spanish version of this article was published in the Autumn 2012 issue (number
7) of the journal La Linterna del Traductor.