Mycroft.eus: A Smart Speaker in Basque
2023/03/03 Leturia Azkarate, Igor - Computer scientist and researcher, Elhuyar Hizkuntza eta Teknologia. Source: Elhuyar aldizkaria
Smart speakers are becoming much more common in our homes. These devices obey and respond to our spoken commands, but they do not do so in Basque. That is why at Orai, Elhuyar’s centre for artificial intelligence and language and speech technologies, we undertook the Mycroft.eus project in collaboration with the Talaios and Skura cooperatives. Our intention was to develop a smart speaker with characteristics different from those of the technology giants: Basque-speaking, built on free software, protective of privacy, focused on local services…
Amazon's Echo and Google's Nest (formerly Home) smart speakers, for example, can be a great help: just by speaking, without moving from where we are, we can play music, ask the time, set alarms and reminders… Unfortunately, once again, the tech giants have not brought Basque to these devices, forcing us, our children included, to use Spanish or French in our own homes.
However, at Orai, the centre created by Elhuyar for research in language technologies, we have the technologies needed to develop a smart speaker in Basque: for some years now we have been developing ASR (Automatic Speech Recognition) and TTS (Text To Speech) technologies for Basque speech recognition and synthesis. There are also free-software smart speaker projects that anyone can freely adapt, with the necessary modifications and extensions. Probably the best known and most advanced is Mycroft.AI.
With all these foundations in place, in 2020 we undertook the project of developing a smart speaker in Basque, in collaboration with two cooperatives experienced in free software and hardware: Talaios and Skura. I have had the honour and pleasure of leading the Mycroft.eus project. It has been supported by the Building the Future programme of the Provincial Council of Gipuzkoa, the Hazitek programme of the Basque Government and the ELG (European Language Grid) of the European Commission, where it was one of the 10 projects selected from over 100 submitted, and the only one from the Spanish State.
The aim of the Mycroft.eus project, therefore, was to build a smart speaker in Basque, based on Mycroft.AI and using Orai’s Basque ASR and TTS technologies. But we also wanted to give it other characteristics that we consider important and that similar devices on the market lack: first, being free software; second, guaranteeing privacy; third, its approach to voice and gender; and finally, local or nearby services.
What has been done in the project
Today, after a long period of work, we can say that the software development phase is complete. We have translated into Basque the core of Mycroft (messages, phrases, texts...) and its linguistic module (which interprets and says numbers, times, dates, etc.), integrated Orai’s Basque ASR and TTS technologies (by creating plugins) and translated more than 40 of its skills (volume, time, reminders, etc.).
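To give an idea of what a linguistic module has to do when it "says" numbers, here is a minimal sketch in Python that spells out the numbers 0 to 99 in Basque, where counting goes in twenties (hogei, berrogei, hirurogei, laurogei). It is an illustration only: the function name pronounce_number_eu and the data tables are assumptions made for this example, not the code of Mycroft's linguistic module.

```python
# Illustrative sketch only; not the code of Mycroft's linguistic module.
# Basque counts in twenties: 30 is "twenty and ten" (hogeita hamar).

# Words for 0-19.
UNITS = [
    "zero", "bat", "bi", "hiru", "lau", "bost", "sei", "zazpi",
    "zortzi", "bederatzi", "hamar", "hamaika", "hamabi", "hamahiru",
    "hamalau", "hamabost", "hamasei", "hamazazpi", "hemezortzi",
    "hemeretzi",
]

# Words for the multiples of twenty below 100.
SCORES = {20: "hogei", 40: "berrogei", 60: "hirurogei", 80: "laurogei"}


def pronounce_number_eu(n: int) -> str:
    """Return the Basque words for an integer between 0 and 99."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    if n < 20:
        return UNITS[n]
    score = (n // 20) * 20   # nearest multiple of twenty below n
    rest = n - score         # remainder, between 0 and 19
    if rest == 0:
        return SCORES[score]
    # "eta" (and) is shortened to "-ta" and attached to the score word.
    return f"{SCORES[score]}ta {UNITS[rest]}"


if __name__ == "__main__":
    for value in (7, 21, 30, 50, 99):
        print(value, pronounce_number_eu(value))
```

A real module also has to handle larger numbers, ordinals, times and dates, and the reverse direction (turning spoken words back into values).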
As for the other characteristics, we can also say that the objectives have been achieved. Being free software was guaranteed from the start, since Mycroft itself, on which the project is based, is free software, and the developments made have been uploaded to the Mycroft repository for anyone to use or improve.
As for privacy, Mycroft detects the wake word (“Hey, Mycroft!”) on the device itself, without sending anything to any server; for the commands that follow, we do not store the audio and sentences sent to Orai’s ASR and TTS servers, nor, of course, do we do business with them.
With regard to voice and gender, UNESCO’s report I’d Blush If I Could warns that commercial smart speakers reinforce and spread gender biases (devices that are at our service and exist to meet our demands are given female names, personalities and voices, and even respond politely to aggressive requests, including abuse and sexual harassment) and recommends measures ranging from not making the voice female by default to developing gender-neutral voices. In the case of Mycroft, although it is a fictional name that is not in common use, it is a man's name (Mycroft is Sherlock Holmes’ clever brother), and its personality is neutral and gives little response to harassment or abuse. In addition, we have set a male voice as the default, and we are carrying out research to synthesize a gender-neutral voice.
Finally, with regard to the local dimension, we have developed around fifty new skills: to read the news from Basque websites, to listen to local radio stations, to listen to Basque music, to ask which towns have festivities coming up…
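The privacy behaviour described above (wake word detected on the device, only the command that follows sent to a server) can be illustrated with a small simulation in Python. This is a sketch under stated assumptions: "audio" is represented as plain text, and the Assistant class, the remote_asr callback and the sent_to_server list are hypothetical names invented for this example; they are not Mycroft's or Orai's APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

WAKE_WORD = "hey mycroft"


@dataclass
class Assistant:
    # Stand-in for a call to a speech recognition server (hypothetical).
    remote_asr: Callable[[str], str]
    # Demonstration-only record of what left the device, so the example
    # can be checked; it is not part of any real system.
    sent_to_server: List[str] = field(default_factory=list)

    def detect_wake_word(self, chunk: str) -> bool:
        # Runs locally: the chunk is inspected on the device and dropped.
        return chunk.strip().lower() == WAKE_WORD

    def run(self, chunks: Iterable[str]) -> List[str]:
        transcripts = []
        awake = False
        for chunk in chunks:
            if not awake:
                # While asleep, nothing is sent anywhere.
                awake = self.detect_wake_word(chunk)
                continue
            # Only the command right after the wake word goes to the server.
            self.sent_to_server.append(chunk)
            transcripts.append(self.remote_asr(chunk))
            awake = False  # go back to sleep after one command
        return transcripts


if __name__ == "__main__":
    assistant = Assistant(remote_asr=lambda audio: audio.upper())
    heard = ["private chat", "hey mycroft", "zer ordu da", "more private chat"]
    print(assistant.run(heard))
    print(assistant.sent_to_server)
```

In this simulation, only "zer ordu da" (the command spoken right after the wake word) ever reaches the stand-in server; everything said before and after stays on the device.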
In addition, the project has brought several adaptations and improvements to our Basque ASR and TTS technologies. For example, we have improved the recognizer so that it works better with poor-quality audio or with background noise or music (that is, the conditions we can expect with a speaker at home), and we have developed a much more natural neural synthesis technology (almost indistinguishable from real human voices).
And now what?
So, now that the software development of Mycroft.eus is complete, what is the current state of the project? Is everything done? No. The project's aim was, and still is, to make a smart speaker in Basque available to Basque society, which also means solving the hardware side: integrating it into a device, distributing it and putting it on sale. And this is much more complex for us, because we have experience in language technologies and software development, but less in hardware.
Mycroft in Basque is ready to be installed on a computer with a microphone, a speaker and the Linux operating system, or on a device specially prepared for it, for example one based on a Raspberry Pi, which we discussed in last September's article. There we talked about the Google AIY Voice Kit, for which it is also ready (a Raspberry Pi with a microphone and speaker, housed in a cardboard box, offered by Google so that people can develop and hack using its ASR and TTS, and widely used by the Mycroft community). But not everyone is able to carry out such an installation and, in the case of an ordinary computer, it is not practical either.
Easier and more suitable than designing and producing our own hardware is to integrate it into the device marketed by the Mycroft.AI company itself. But the problems that COVID-19 caused in the production of digital devices (chip shortages, price increases, etc.) have affected them too. The launch of the device was delayed by several years, and they have not been able to sell it at the price that they, or we, would have liked...
Just over a year ago, they made the first prototype of the device, the Mark II DevKit, available for developers to purchase. Inside a methacrylate housing it came with a Raspberry Pi, a board, a speaker, a microphone, a screen, a camera, lights and buttons. We bought one, then set it up and tested it so that Mycroft in Basque would work properly on it. And a few months ago, at last, they released the final device, the Mark II, and we bought one to integrate and test Basque on it. To run on this latest device, the software has undergone many changes, and for it to work in Basque too we have to adapt the developments we had made so far; that is what we are working on now.
We also want to change the name of the device and its wake word, replacing them with something more natural and easier for Basque speakers to pronounce. In addition, we want to extend development beyond the initial group of three developers to the Basque developer community and to Basque companies, in order to develop more skills together, strengthen the local character and make the device more attractive.
Finally, beyond the smart speaker, we are working on bringing voice-based virtual assistants to other environments, such as mobile phones. And we are also starting to develop industry-oriented applications.
Mycroft.eus is a fine, ambitious and necessary project. We have developed a smart speaker in Basque, and we hope to be in a position to market the device soon. In addition, Basque speech recognition and synthesis technologies have been advanced, a collaboration between Orai, Skura and Talaios has begun that promises to continue in the future, and new paths have been opened in speech interfaces.