Aditu.eus, a speech recognition service for the Basque language
2020/06/01 Leturia Azkarate, Igor - Computer scientist and researcher, Elhuyar Hizkuntza eta Teknologia. Source: Elhuyar aldizkaria
Speech is one of the oldest forms of human communication and one of the traits that most distinguishes us from animals. Written communication came later and, besides conveying messages, made it possible to preserve knowledge. Speech is the most natural way for people to communicate with each other, yet the way we have communicated with computers is written: we type commands, programs and texts on the keyboard, and the computer shows the results as text on the screen. The reason is that computers were not able to understand human speech.
In recent years, as speech recognition systems have been created and improved, speech-based input has gradually spread: dialogue agents, automatic subtitling, smart speakers, dictation systems... But the Basque language has been absent from these systems.
A year ago in this magazine we described the speech-technology solutions for accessibility developed at Elhuyar: the web page reading tool, Digital Reader, Wikispeech, Viajde... All of them are based on speech synthesis, that is, the technology by which a computer produces speech from a given text. We also discussed how speech recognition technologies can contribute to accessibility and inclusion (controlling computers and other machines by voice, dictation systems, automatic subtitling...). Such systems existed in other languages, but not in Basque. Well, in March we launched a speech recognition service for Basque, suitable for accessibility and other uses: Aditu.eus.
Aditu.eus, Elhuyar's speech recognizer
Aditu is a speech recognizer based on deep neural networks. Its name is very appropriate. The Basque verb aditu has two main senses: first, to listen, and second, to understand. That is exactly what Aditu does: it listens to and understands what we say to it (it gives back in writing what it has heard). In addition, as a noun or adjective, aditu chiefly means expert, wise or learned, which also describes the system.
It is offered as a service or web platform at https://aditu.eus. It recognizes Basque and Spanish (both are needed today if Basque society, institutions and other agents are to use it). We also plan to add more languages in the future.
We can upload an audio or video file to the platform, or give it a link to online audio or video (for example, on EITB, YouTube, Facebook, Instagram...), and Aditu automatically and immediately transcribes what is said there. The result is offered in several formats: the text of the transcription, a subtitle file, and a transcription with word timings (for example, to know the exact second at which a word is said in the video). The transcription or subtitles can be corrected or modified before downloading, using Aditu's online editing interface. In addition, it can transcribe simultaneously, in real time, what we say into the microphone of a computer or mobile phone.
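To illustrate how a transcription with word timings relates to a subtitle file, here is a minimal Python sketch that groups timed words into SubRip (SRT) cues. It is not Aditu's code or output format: the `Word` structure, the `words_to_srt` function and the `max_words` and `max_gap` parameters are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with its timing (hypothetical structure)."""
    text: str
    start: float  # seconds from the beginning of the recording
    end: float


def format_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7, max_gap=1.0):
    """Group timed words into cues and return them as SRT text.

    A new cue starts when the current one already holds max_words words
    or when the silence before the next word is longer than max_gap seconds.
    """
    cues = []
    current = []
    for word in words:
        too_long = len(current) >= max_words
        long_pause = bool(current) and word.start - current[-1].end > max_gap
        if current and (too_long or long_pause):
            cues.append(current)
            current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{format_time(cue[0].start)} --> {format_time(cue[-1].end)}"
        text = " ".join(w.text for w in cue)
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)
```

For example, the words "kaixo" (0.0 to 0.4 s), "mundua" (0.5 to 1.0 s) and "agur" (3.0 to 3.5 s) would yield two cues, because the two-second pause before "agur" exceeds the assumed `max_gap`.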
Besides the web service, we offer tailor-made solutions for companies and institutions. Using the API, the service can be integrated into the client's workflow, application, CMS, etc. Simultaneous transcription is also available through the API, for integration into a virtual assistant, live subtitling at events, etc. And, if desired, the system can also be installed on the client's own premises.
Its possible uses are many: accessibility (subtitling, dictation, giving commands to the computer); automatic subtitling of documentaries and programs for audiovisual companies, television and radio; transcription of interview recordings for journalists; taking minutes of plenary sessions or other meetings, or live subtitling of public sessions; creating subtitles for conferences or courses; interaction between people and machines; to the centers utilization of the virtual interviews of
The quality of Aditu's transcription, that is, its accuracy rate, is generally good, but it varies considerably with the quality of the audio recording, the quality of the microphones, echo, background noise or loud music, the register, whether the speaker uses the standard language or one of its variants, the volume, the speed, etc. In optimal conditions, the accuracy rate can exceed 95%. It obtains its best results with conferences, plenary sessions, news programs, documentaries, reports, etc. In contrast, it performs worse with Basque dialects, spontaneous and informal speech, films... In addition, the results are always somewhat worse with simultaneous transcription. Even so, in most cases it is perfectly usable.
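The article does not say how this rate is measured. One common way to quantify recognition quality is word accuracy, taken as one minus the word error rate, where errors are the word substitutions, insertions and deletions needed to turn the recognizer's output into a reference transcription. The Python sketch below assumes that definition; the function names are made up for this example, and it is not a description of how Elhuyar evaluates Aditu.

```python
def word_errors(reference: str, hypothesis: str):
    """Return (edit distance in words, number of reference words).

    Uses the standard Levenshtein dynamic programme over whitespace-split
    words, keeping only the previous row of the table.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion: reference word missing
                current[j - 1] + 1,      # insertion: extra recognized word
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1], len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy, assumed here to be 1 - (errors / reference words)."""
    errors, total = word_errors(reference, hypothesis)
    if total == 0:
        raise ValueError("the reference transcription is empty")
    return 1 - errors / total
```

With the reference "gaur eguraldi ona dago Donostian" and the output "gaur eguraldi on dago Donostian", one word out of five is wrong, so the word accuracy under this definition is 0.8. Note that the value can drop below zero when the output contains many extra words.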
Many options for the future
The launch of Aditu is a milestone for Elhuyar and for the Basque language, but it is not the end of the road; it is the beginning. We must keep improving Aditu so that it recognizes speech better in informal conversations, poor-quality audio, dialects, films... or, why not, verses.
Moreover, if we combine speech recognition with the other language and speech technologies we are developing for Basque (machine translation, chatbots, speech synthesis...), think of what could be done: smart speakers, simultaneous speech-to-speech translation (imitating the original voices, if desired)... We look to the future with enthusiasm, so that Basque is on a par with other languages in technologies and services. At Elhuyar we will keep working on it.