ElkarOla: Linguistic technologies at the service of an intelligent, industrial, inclusive and multilingual territory

2017/06/19 Leturia Azkarate, Igor - Computer scientist and researcher, Elhuyar Hizkuntza eta Teknologia. Source: Elhuyar aldizkaria

The ElkarOla project is the final result of 15 years of collaboration between the most representative organisations in the Basque Country in the research and development of language and speech technologies. As a sample of the work done, three demos have been built to show what these technologies can do in three strategic areas.
Demo for the advanced manufacturing area: an augmented reality teleassistance system for use between an expert and an operator. Ed. Igor Leturia

ElkarOla is a strategic research project carried out in 2015 and 2016 in the field of language and speech technologies. Examples of these technologies are tools for translation, information management (search engines, information extraction, sentiment analysis), language resources (dictionaries, corpora, spell checkers) and speech tools (speech recognition, speech synthesis). The project has worked especially on the Basque language, but also on other local and nearby languages.

The project partners are Elhuyar, the UPV-EHU research groups Ixa and Aholab, the Vicomtech-IK4 technology center and the Tecnalia Research & Innovation foundation, with Elhuyar as coordinator.

These five institutions have been collaborating for 15 years in the research and development of language and speech technologies for the Basque language. Prior to this project, they carried out four others: Hizking21 (2002-2004), AnHitz (2006-2008), BerbaTek (2009-2011) and Ber2Tek (2012-2014). While the previous ones were oriented to the language industry, ElkarOla focuses on the areas of RIS3 Euskadi.

The RIS3 Smart Specialization Strategy is a regional strategy driven from Europe, aimed at innovation and development. Under this strategy, each region, taking into account its productive capacities and potential, defines strategic areas and concentrates resources and investments in them. In the case of the Basque Autonomous Community, RIS3 Euskadi sets three priorities: advanced manufacturing, energy, and biosciences and health.

Although language and speech technologies are not among these priorities, they are an important cross-cutting line with application in all of them. For this reason ElkarOla, in addition to basic research, has also carried out applied research in the aforementioned RIS3 areas and, through technology transfer, has brought various tools and applications to market and to the public.

Demos for priority areas

The bioscience and health demo is a search engine for health terms and relationships. Ed. Igor Leturia

As a final result of the project, we have developed three demos that reflect the possible contribution of these technologies and the collaboration between the consortium entities in these RIS3 areas.

The demo for the field of advanced manufacturing is an augmented reality teleassistance system for use between an expert and an operator. It shows how language and speech technologies can contribute in a noisy industrial environment. If the operator wants to keep their hands free to work, they wear smart glasses or keep a tablet at hand. The expert helps remotely through another tablet or a computer, and receives in real time what the operator indicates and what the operator sees through the tablet or glasses. The expert gives instructions verbally, but since the operator may be in an environment made noisy by machines, the system automatically transcribes these instructions (through speech recognition) and translates them (through machine translation) so that they reach the operator in writing. The text is shown on the operator's device in real time, overlaid on what the operator is seeing, guiding them step by step through the task. In addition, the app displays the expert's remote instructions in augmented reality, using arrows and similar markers.
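The transcribe-then-translate flow of the teleassistance demo can be sketched as a two-stage pipeline. This is only a minimal illustration, not the project's implementation: the names `Caption` and `make_caption_pipeline` are hypothetical, and the recognizer and translator below are toy stand-ins, since the article does not say which engines the demo uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Caption:
    source_text: str      # the expert's instruction, as transcribed
    translated_text: str  # the text the operator reads on screen


def make_caption_pipeline(
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
) -> Callable[[bytes], Caption]:
    """Chain speech recognition and machine translation into one step."""
    def pipeline(audio_chunk: bytes) -> Caption:
        text = recognize(audio_chunk)
        return Caption(source_text=text, translated_text=translate(text))
    return pipeline


# Toy stand-ins for illustration only; real engines would replace them.
def fake_recognize(audio_chunk: bytes) -> str:
    # Pretends the "audio" is already UTF-8 text.
    return audio_chunk.decode("utf-8")


def fake_translate(text: str) -> str:
    # One-entry glossary instead of a real translation model.
    glossary = {"Itxi balbula": "Close the valve"}
    return glossary.get(text, text)


caption_for = make_caption_pipeline(fake_recognize, fake_translate)
print(caption_for("Itxi balbula".encode("utf-8")).translated_text)
```

Passing the two stages in as functions keeps the pipeline independent of any particular recognizer or translator, so either one could be swapped for another language pair.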

The biosciences and health demo is a search engine for health terms and relations. It is a first prototype of a search engine for medical entities (diseases and medicines) and the relationships between them, and it works on a corpus of extracts of scientific articles in Spanish on medical topics. This corpus has been labeled manually in order to train and evaluate a system that automatically detects adverse drug reactions. Labeled entities include, on the one hand, generic drugs, drug brands and substances and, on the other, diseases and symptoms. Labeled relationships include causes (what causes a disease) and treatments (which medicine a disease is treated with). The search engine lets you search by entity or by relationship and view graphically the entities and relationships detected in each document.
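Searching by entity or by relationship, as described for the health demo, can be sketched as a filter over labeled relations. This is a minimal illustration under assumed names (`Relation`, `search`) and with invented sample annotations; it does not reflect the demo's actual data model or corpus.

```python
from typing import List, NamedTuple, Optional, Tuple


class Relation(NamedTuple):
    kind: str    # "causes" or "treats"
    source: str  # drug, brand or substance
    target: str  # disease or symptom


# Invented sample annotations as (document id, relation) pairs.
ANNOTATIONS: List[Tuple[str, Relation]] = [
    ("doc1", Relation("treats", "ibuprofen", "headache")),
    ("doc2", Relation("causes", "ibuprofen", "gastritis")),
    ("doc3", Relation("treats", "omeprazole", "gastritis")),
]


def search(
    annotations: List[Tuple[str, Relation]],
    entity: Optional[str] = None,
    kind: Optional[str] = None,
) -> List[Tuple[str, Relation]]:
    """Return the annotations that match the given entity and/or relation kind."""
    hits = []
    for doc_id, rel in annotations:
        if kind is not None and rel.kind != kind:
            continue
        if entity is not None:
            names = (rel.source.lower(), rel.target.lower())
            if entity.lower() not in names:
                continue
        hits.append((doc_id, rel))
    return hits


# Documents in which gastritis appears as the target of a "causes" relation.
for doc_id, rel in search(ANNOTATIONS, entity="gastritis", kind="causes"):
    print(doc_id, rel.source, rel.kind, rel.target)
```

A linear scan is enough for a sketch; a real search engine over a large corpus would normally build an index instead.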

Demo of a dialogue agent for customer service. Ed. Igor Leturia

Finally, as a demo for the territory area, we have developed a dialogue agent for customer service. Customer service is a key element in providing quality service in some sectors, but many of these services involve repetitive, low-value tasks or phases (user identification, form completion, simple queries…). Dialogue system technologies, along with natural language processing and artificial intelligence, allow this type of task to be automated so that technicians can use their time more efficiently. The demo is a dialogue system integrated into a web interface. It uses natural language processing techniques and statistical classification algorithms to identify the user and to decide which department the incident described by the user should be transferred to. The system responds with both text and speech synthesis, and the user can give input both in writing and by voice.
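The article does not say which statistical classification algorithm the dialogue agent uses. As one possible illustration of routing a user's message to a department, here is a minimal multinomial Naive Bayes classifier with add-one smoothing. The class name `NaiveBayesRouter`, the department labels and the training sentences are all invented for this sketch, which covers only the routing step, not user identification.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesRouter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self, examples):
        self.word_counts = defaultdict(Counter)  # department -> token counts
        self.doc_counts = Counter()              # department -> number of examples
        self.vocabulary = set()
        for text, department in examples:
            tokens = tokenize(text)
            self.word_counts[department].update(tokens)
            self.doc_counts[department] += 1
            self.vocabulary.update(tokens)

    def route(self, message):
        """Return the department with the highest log-probability score."""
        tokens = tokenize(message)
        total_docs = sum(self.doc_counts.values())
        best_department, best_score = None, -math.inf
        for department, n_docs in self.doc_counts.items():
            counts = self.word_counts[department]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)  # prior
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_department, best_score = department, score
        return best_department


# Invented training examples: (user message, department).
EXAMPLES = [
    ("my invoice amount is wrong", "billing"),
    ("I was charged twice this month", "billing"),
    ("the internet connection is down", "technical"),
    ("my router does not connect", "technical"),
]

router = NaiveBayesRouter(EXAMPLES)
print(router.route("the router is down"))
```

With only four training sentences this is a toy; a production system would need far more labeled messages and probably a stronger model.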
