Automatic dubbing, a new option in Aditu.eus
Elhuyar has been working on language and speech technologies for Basque for more than 20 years, and we have recently released one of the most significant results of that work. In recent years we have offered services based on advanced automatic transcription, machine translation and speech synthesis technologies, among others. This time we have combined all of them to add fully automatic dubbing of audio and video to the Aditu.eus platform.
Five years ago, just the day before the COVID-19 lockdown, we presented Aditu.eus, the first Basque platform for automatic transcription and subtitling of audio and video, which worked in Basque or Spanish. Since then, we have kept adding features to Aditu: shortly afterwards we added several advanced editing functions; a year later, we began to offer transcription of bilingual Basque-Spanish content and translation of subtitles using our machine translation platform Elia.eus; later, we enabled transcription in four more languages (Catalan, Galician, English and French)... Separately, at the beginning of 2023 we launched the neural TTS service, a speech synthesis service that offers a range of languages and voices.
By combining this speech synthesis technology with the existing automatic transcription and translation technologies, we have turned Aditu.eus into a dubbing platform. After uploading an audio or video file, you first ask the platform to create subtitles, then to translate the subtitles, and finally to create audio from those translated subtitles. In this way, with just three clicks, you get a dubbed audio or video. It is the first automatic dubbing platform in the Basque Country, and it works in six languages: Basque, Spanish, French, Catalan, Galician and English. With Aditu, we can dub audiovisual content into any of them.
We have done all this, as with all our tools, using our own technology. The transcription, translation and speech generation technologies behind Aditu.eus were developed at Orai NLP Technologies, the center created by Elhuyar, with special attention to the quality of the results for content in Basque. As a result, dubbing work can now be done much more easily, with technology that works well in Basque and with our guarantee of confidentiality.
Choice of voices, editing...
When Aditu.eus subtitles a piece of content automatically, it identifies the speakers in it and, if we want, we can name each of them. Then, when we create a dubbing from the translated subtitles, the system asks which voice we want to use for each of those speakers. It offers a selection of male and female voices created by us, the same voices available in the neural TTS service.
But it also lets us imitate the speaker's original voice. The technology that does this was created specifically for this purpose: given a text and a short speech sample (which can even be in another language), it synthesizes the text in an imitation of that voice, even though the model never saw the voice during training.
As we mentioned, a dubbing can be created in Aditu.eus fully automatically, with just three clicks. But just as automatic transcription and translation can contain errors that the platform lets you correct by hand, this last phase, creating the dubbed audio, can also produce unwanted results, and these can be edited too. For example, the translation of a sentence may, for whatever reason, be considerably longer than the original. To fit the translated audio into the same time interval as the original (because other things are said before and after it), there is no choice but to speed up the speech, which sometimes does not sound good. For this reason, the dubbing editing screen shows how many characters per second each sentence is read at, whether the speech had to be sped up, and by how much. Values that exceed the appropriate limits are flagged in red. In cases like this, we can shorten the text and regenerate the audio for that sentence. Or we can split a sentence and shift the parts in time, to fit them to the pauses within the sentence and synchronize them with the lip movements, and then regenerate the audio for that sentence.
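The two figures shown on the dubbing editing screen can be illustrated with a little arithmetic. The following is a minimal sketch, not Aditu.eus's actual implementation: the `Segment` class, the function names and the two thresholds (`MAX_CPS`, `MAX_SPEEDUP`) are all assumptions chosen for illustration, since the article does not state the limits the platform uses.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the platform's real limits are not given in the article.
MAX_CPS = 20.0      # assumed maximum comfortable reading rate, characters per second
MAX_SPEEDUP = 1.3   # assumed maximum acceptable speed-up factor


@dataclass
class Segment:
    text: str                # translated sentence to be synthesized
    start: float             # start of the original time slot, in seconds
    end: float               # end of the original time slot, in seconds
    natural_duration: float  # length of the synthesized audio at normal speed, in seconds


def slot_duration(seg: Segment) -> float:
    """Time available for the sentence in the original recording."""
    duration = seg.end - seg.start
    if duration <= 0:
        raise ValueError("segment must have a positive duration")
    return duration


def speedup_factor(seg: Segment) -> float:
    """How much the speech must be accelerated to fit the slot (1.0 = no change)."""
    return max(1.0, seg.natural_duration / slot_duration(seg))


def chars_per_second(seg: Segment) -> float:
    """Reading rate of the sentence once any speed-up has been applied."""
    played_duration = seg.natural_duration / speedup_factor(seg)
    return len(seg.text) / played_duration


def needs_review(seg: Segment) -> bool:
    """True when either figure exceeds its (assumed) limit, i.e. it would be shown in red."""
    return chars_per_second(seg) > MAX_CPS or speedup_factor(seg) > MAX_SPEEDUP
```

For instance, a 50-character translation whose synthesized audio lasts 3 seconds but must fit a 2-second slot needs a speed-up of 1.5 and is read at 25 characters per second, so under these assumed limits it would be flagged. Shortening the text lowers both figures at once, which is why that is the first remedy suggested above.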
For many things, but not for everything
The speech produced in the dubbing, as you will know if you have ever tried the neural TTS platform, is of very good quality. It follows the phonetic rules, intonation, etc. of Basque (or of the language in question) and sounds very natural, so much so that it does not come across as synthetic; in a blind listening test we would often be unable to tell whether it is natural or synthetic. The imitation of the original voice is also of very good quality. But the speech has a neutral intonation, without expressiveness. So it is not well suited to certain types of content: films, fiction, informal podcasts and, in general, any context where speech is not neutral, serious or formal.
Despite these limitations, dubbing in Aditu.eus has numerous applications: formal interviews, documentaries, voice-overs, corporate or marketing videos, educational materials... We can dub them into many languages very easily and with very good results. In this way we can, for example, widen the reach of our content and take it to the global market; or dubbing can be used in schools to bring interesting teaching material into Basque; or to dub videos created in Basque into other languages, as a provisional solution for newly arrived immigrant students who do not yet speak Basque; among many other uses.
But I would like to end by mentioning a real use case that we are particularly happy about. You are probably familiar with Teknopolis, the television programme produced by Elhuyar, which has two editions, in Basque and in Spanish, on ETB1 and ETB2. Most of the recordings are made in both languages, but when the interviewees do not speak Basque, the interview obviously has to be done in Spanish only. Until now, the team had to do the transcription, translation and dubbing themselves. Well, now they do it with automatic dubbing in Aditu.eus! In those segments, a symbol and a caption on screen indicate that the voice is synthetic. Great, right?