METAL: Siemens' machine translation system

1990/03/01 Sagarna, Andoni - Engineer. Source: Elhuyar magazine

The German company Siemens is a major manufacturer of everything related to electricity. That activity led it, after ten years of research, to develop METAL, the most advanced machine translation system currently on the market.

Product history

As is well known, the German company Siemens is a major manufacturer of everything related to electricity. It also manufactures telephone exchanges. That activity led it, after ten years of research, to develop METAL, the most advanced machine translation system currently on the market. The 100,000 pages of documentation for a telephony system could not be translated from German into English within a reasonable time and at a reasonable cost. Why not, then, respond by developing a product? And why not start from a development that was already under way?

Siemens acquired METAL (Machine Edited Text Aspiring Legibility), which was being developed at the University of Texas. Ten years later, the system can translate from German into English.

Before the launch, Siemens tested METAL thoroughly: first at its Munich headquarters from 1986, then at the translation offices of Comex and Schönau und Damels in Zürich, and then at Philips Kommunikations-Industrie in Nuremberg, the University of Villingen, Mannesmann Kiense and Hille.

Although METAL so far translates only from German into English, German-to-Spanish and English-to-German versions may enter testing this spring.

Working procedure

METAL is more than a translation tool. It can be defined as an integrated package for the repetitive technical translation of large volumes of text. The software requires two computers: a mid-range Siemens SINIX computer, running the Unix operating system, which handles the text formatting, and a LISP machine holding the grammar and vocabulary needed to carry out the translation. The two machines are linked by an Ethernet network. The user works on a PC connected as a terminal of the SINIX computer.

The source text is loaded at this terminal, either from a disk or by optical scanning with an OCR (character recognition) program. The text to be translated is sent from the terminal to the SINIX computer. Once it has been prepared there, it is sent to the LISP machine for translation, and once that work is complete, it is transferred back to SINIX for post-editing. Additions to the lexicon and grammatical adjustments to the machine translation are made on the screen of the LISP machine.

Clearly, the aim of METAL is not only to translate: it also addresses the problems of entering and formatting the document. It therefore preserves the graphics, tables, fonts, etc. of the original document. If this could be done rigidly, the problem would be simple, but as any translator knows, the length and order of the words change in translation.

To overcome this difficulty, METAL first separates the formatting from the text. METAL accepts text produced in word processors such as WordStar or WordPerfect.

Once the formatting and the text have been separated, the text is split into short sentences and sent to the LISP machine for translation.
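The separation step described above can be illustrated with a minimal sketch. It is not METAL's actual implementation: the `{b}`-style inline format codes and the function names `split_format_and_text` and `split_sentences` are hypothetical, and real WordStar or WordPerfect files use their own control codes.

```python
import re

# Hypothetical inline format codes such as {b} or {/b} (an assumption made
# for this sketch; real word-processor files use different control codes).
FORMAT_CODE = re.compile(r"\{/?[a-z]+\}")


def split_format_and_text(document):
    """Return (plain_text, codes). Each entry of codes is (offset, code),
    where offset is the position in plain_text at which the code occurred."""
    plain_parts = []
    codes = []
    offset = 0      # length of the plain text collected so far
    last_end = 0    # end of the previous format code in the document
    for match in FORMAT_CODE.finditer(document):
        chunk = document[last_end:match.start()]
        plain_parts.append(chunk)
        offset += len(chunk)
        codes.append((offset, match.group()))
        last_end = match.end()
    plain_parts.append(document[last_end:])
    return "".join(plain_parts), codes


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


doc = "Das {b}Telefon{/b} ist neu. Es funktioniert gut."
plain, codes = split_format_and_text(doc)
sentences = split_sentences(plain)
```

Keeping the offsets of the format codes is one possible way to reattach formatting later; since word length and order change in translation, a real system would need to tie the codes to words or phrases rather than to character positions.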

First, this machine searches the text for unknown words, that is, words not found in the dictionary. Once these are listed, the user must encode the new lexical entries according to linguistic criteria, using the auxiliary window system that the program provides.

The preliminary analysis shows the occurrences of the new words and their contexts. In this way, the user can quickly see how these words are used. Incidentally, it is also a way to detect spelling mistakes, since misspelled words will normally have an unknown form.
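As a rough illustration of this preliminary analysis, the sketch below lists the words missing from a dictionary together with the contexts in which they appear. The function name `find_unknown_words`, the tiny word list and the sample text are all invented for the example; a real system would also have to handle inflected forms.

```python
import re


def find_unknown_words(text, dictionary, width=2):
    """Map each word missing from `dictionary` to the contexts it occurs in.
    A context is the word plus up to `width` neighbouring words per side."""
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    unknown = {}
    for i, word in enumerate(words):
        if word.lower() not in dictionary:
            context = " ".join(words[max(0, i - width): i + width + 1])
            unknown.setdefault(word.lower(), []).append(context)
    return unknown


# Hypothetical mini-dictionary of known word forms (lower case).
known = {"das", "ist", "ein", "neues", "neu", "system", "telefon"}
text = "Das Telefon ist ein neues Sytem. Das Vermittlungssystem ist neu."
report = find_unknown_words(text, known)
```

Here `report` contains two unknown forms: the genuinely new word "vermittlungssystem" and the misspelling "sytem", which shows how the same listing also exposes typing errors.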

METAL uses three basic dictionaries: two monolingual dictionaries of 50,000 entries each, one German and one English, and a dictionary of equivalences between the words of the two languages. These dictionaries are hierarchical: grammatical morphemes at the top, common vocabulary beneath them, and general technical vocabulary below that. In addition, there are technical dictionaries (computer science, telecommunications, medicine, etc.) organized as modules.

The preliminary analysis mentioned above creates a series of glossary files that report how each term is translated in the different specialized technical dictionaries.

There is also a compound-word file: after analyzing the unknown compound words, the system generates provisional equivalents from the meanings of their components. On average, these are correct in about 70% of cases. The more technical the text, the more successful this method is.
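The idea of deriving a provisional equivalent from the components of a compound can be sketched as follows. This is only an assumption about how such a step might work, not a description of METAL's algorithm; the functions `split_compound` and `provisional_equivalent` and the four-entry lexicon are hypothetical.

```python
def split_compound(word, lexicon):
    """Return a list of known components covering `word`, or None.
    Tries the longest first component first and allows a linking 's'."""
    word = word.lower()
    if word in lexicon:
        return [word]
    for cut in range(len(word) - 1, 0, -1):
        head, rest = word[:cut], word[cut:]
        if head not in lexicon:
            continue
        # German compounds often insert a linking "s" between components.
        candidates = [rest]
        if rest.startswith("s") and len(rest) > 1:
            candidates.append(rest[1:])
        for candidate in candidates:
            tail = split_compound(candidate, lexicon)
            if tail is not None:
                return [head] + tail
    return None


def provisional_equivalent(word, lexicon):
    """Join the component translations in source order, or return None."""
    parts = split_compound(word, lexicon)
    if parts is None:
        return None
    return " ".join(lexicon[part] for part in parts)


# Hypothetical German-to-English component lexicon.
lexicon = {"telefon": "telephone", "netz": "network",
           "vermittlung": "switching", "system": "system"}

first = provisional_equivalent("Telefonnetz", lexicon)
second = provisional_equivalent("Vermittlungssystem", lexicon)
```

Joining the component translations in source order happens to work for these examples; it is exactly the kind of guess that can be wrong, which is consistent with the equivalents being only provisional.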

When translating a term, the system first looks for the equivalent in the most specialized dictionaries and, if it is not found there, falls back on the more general ones. However, the user can change this order if desired.
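This search order amounts to a cascading lookup, sketched below. The dictionary names, their contents and the function `translate_term` are invented for the illustration.

```python
def translate_term(term, dictionaries, order):
    """Look `term` up in each dictionary named in `order`; first hit wins.
    Returns (equivalent, dictionary_name), or (None, None) if not found."""
    for name in order:
        entry = dictionaries[name].get(term)
        if entry is not None:
            return entry, name
    return None, None


# Hypothetical dictionaries, from most specialized to most general.
dictionaries = {
    "telecommunications": {"Leitung": "line"},
    "general_technical": {"Leitung": "conduit", "Schalter": "switch"},
    "common": {"Leitung": "management", "Schalter": "counter", "Haus": "house"},
}
default_order = ["telecommunications", "general_technical", "common"]

specialized = translate_term("Leitung", dictionaries, default_order)
# The user may change the order, for example to prefer the common vocabulary.
reordered = translate_term(
    "Leitung", dictionaries,
    ["common", "general_technical", "telecommunications"])
```

With the default order "Leitung" comes out as "line"; with the reversed order the same term comes out as "management", which is why the order matters for technical texts.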

METAL's dictionaries are not quite like the ones we see in book form. Each entry includes morphological and syntactic information, represented by rewriting rules. For new words, the system proposes default rules.

Translation is driven by a database of linguistic rules responsible for analyzing sentences. The analysis starts with the most deeply embedded phrases. It then works gradually toward the surface, applying constituent-structure rules at each level. On reaching the surface nodes, it builds a tree structure for the whole sentence.

Before selecting the final tree, the system uses a probabilistic strategy in cases where more than one rule could apply. This requires a lot of memory: more than 120 MB.
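The combination of bottom-up analysis and probability-based selection can be illustrated with a probabilistic CKY parser, a standard textbook technique used here as a stand-in. The article does not say which parsing algorithm METAL uses, and the toy English grammar, its probabilities and the function `best_parse` are all made up for the example.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form with made-up probabilities.
LEXICAL = {            # word -> [(category, probability)]
    "she": [("NP", 1.0)],
    "eats": [("V", 1.0)],
    "fish": [("NP", 0.6), ("V", 0.4)],
    "with": [("P", 1.0)],
    "forks": [("NP", 1.0)],
}
BINARY = [             # (parent, left child, right child, probability)
    ("S", "NP", "VP", 1.0),
    ("VP", "V", "NP", 0.7),
    ("VP", "VP", "PP", 0.3),
    ("NP", "NP", "PP", 0.2),
    ("PP", "P", "NP", 1.0),
]


def best_parse(words):
    """Return (probability, tree) of the best 'S' analysis, or None."""
    n = len(words)
    # chart[(i, j)][category] = (probability, tree) for the span words[i:j]
    chart = defaultdict(dict)
    for i, word in enumerate(words):
        for category, prob in LEXICAL.get(word, []):
            chart[(i, i + 1)][category] = (prob, (category, word))
    # Build longer spans from shorter ones, working outward to the sentence.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for parent, left, right, rule_prob in BINARY:
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        left_prob, left_tree = chart[(i, k)][left]
                        right_prob, right_tree = chart[(k, j)][right]
                        prob = rule_prob * left_prob * right_prob
                        # Keep only the most probable analysis per category.
                        if prob > chart[(i, j)].get(parent, (0.0, None))[0]:
                            chart[(i, j)][parent] = (
                                prob, (parent, left_tree, right_tree))
    return chart[(0, n)].get("S")


probability, tree = best_parse("she eats fish with forks".split())
```

The sentence is ambiguous: "with forks" can attach to the verb phrase or to "fish". Both analyses are built, and the one with the higher probability under these invented figures (attachment to the verb phrase) is kept as the final tree.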

Once the tree has been obtained, the LISP machine converts the sentence into a representation similar to that of case grammar.

Based on this level of deep analysis, the system generates an output tree in the target language. The user can modify the node codes of this tree if necessary.

The system analyzes each sentence and stores the translation obtained in an output file for post-editing.

Cost and benefit

METAL translates about 200 pages in an 8-hour day. This speed may seem high or low, but if you take the whole translation process into account (including formatting), it is quite fast, since a post-editor cannot prepare more than 40 or 50 pages a day. Therefore, to finalize during the following day the work that METAL does in one night, five post-editing stations are needed. Even if translation itself were made faster, the only option would be to put more personnel and more machines into the subsequent process. How much does this cost? The figures are:

SINIX MX 300 with laser printer and peripherals: 2,600,000 pts., plus a machine maintenance cost of 22,000 pts. per month. SINIX machine software: 208,000 pts. LISP machine: 6,500,000 pts. METAL translation software: 5,850,000 pts., plus a maintenance cost of 60,000 pts. per month.

This gives a total investment of about 15,000,000 pesetas and a monthly maintenance cost of 82,000 pesetas.
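The figures quoted in this section can be checked with a few lines of arithmetic. The variable names are chosen for this sketch, and the first-year figure is a derived illustration that does not appear in the article.

```python
import math

# Purchase prices quoted in the article, in pesetas.
purchase = {
    "SINIX MX 300 with laser printer and peripherals": 2_600_000,
    "SINIX machine software": 208_000,
    "LISP machine": 6_500_000,
    "METAL translation software": 5_850_000,
}
# Monthly maintenance costs quoted in the article, in pesetas.
monthly_maintenance = {"SINIX MX 300": 22_000, "METAL software": 60_000}

total_purchase = sum(purchase.values())
total_monthly = sum(monthly_maintenance.values())

# Derived illustration: purchase plus twelve months of maintenance.
first_year_cost = total_purchase + 12 * total_monthly

# Post-editors needed to keep up with 200 machine-translated pages a day,
# at 40 and at 50 post-edited pages per person per day.
pages_per_day = 200
post_editors_needed = [math.ceil(pages_per_day / rate) for rate in (40, 50)]
```

The itemized prices add up to 15,158,000 pesetas, which the article rounds to 15,000,000, and the maintenance items add up to exactly 82,000 pesetas per month. At 40 pages per person per day, five post-editors are needed; at 50 pages, four would suffice.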

Do the sums yourself if you are thinking of buying.
