Overcoming limits by voice

2002/02/01 | Anega, Iker | Jubete, Juan Jose | Lopez de Ipina, Karmele | Source: Elhuyar aldizkaria

In recent years we have experienced a small invasion of new technologies. Computers, the Internet and mobile phones can be found everywhere; like many other technological devices, they have already become common tools of daily life. In fact, having a computer at home is becoming ever easier.

There is no doubt that technology will continue to change our way of life. Throughout history, technology has had an enormous influence on the human environment, in most cases improving the quality of life but in some cases, unfortunately, worsening it (the arms industry, for example). Either way, attitudes toward this fact are rarely lukewarm.

In the past, electricity, the telephone, television, radio and other technological advances became part of our lives and radically changed everyday life. At first, most people thought these technologies would be used only by a privileged few. They were clearly wrong! Time has shown that society needs a period of adaptation to accept any new technology, but that the technology eventually becomes a common tool.

Unfortunately, the pace at which a technology appears often does not match the time users need to adapt to it, and users are left lost and intimidated. The learning process can be complex and laborious.

Recently, to overcome the barriers created by new technologies, support interfaces have begun to appear. These interfaces aim for natural communication between machines and users and try to imitate human forms of communication. Since the voice is the most natural medium people use to communicate, it can be an ideal instrument for communicating with a computer, especially because the voice somehow softens the machine and makes the user feel more comfortable.

The helplessness that many people feel when facing new technologies (because they cannot control them) is what people with reduced mobility (people with disabilities, the elderly) often feel with everyday things. For example, opening a door takes little effort for a person in good physical condition. If the person has a physical problem that reduces their mobility, however, that door can become an insurmountable barrier.

For these people, technology has often been very useful in making life easier: motorized wheelchairs, adapted keyboards, special lifts, etc. For them the voice can also be a very useful tool for overcoming daily limits.

So, why not control the wheelchair by voice? And why not control the common elements of the house (television, doors, lights, windows, etc.) using voice commands? Is that science fiction? No, it is a technology called home automation, also known as domotics. Home automation encompasses the design, control and development of intelligent buildings. So let's apply home automation to build a voice-controlled house for people with reduced mobility. Finally, why not use Basque to control the little house of our dreams?

The AHOTSDOMO system has been developed to bring this dream closer to reality. With this system, the commonly used elements of a home can be controlled by voice in Basque.

AHOTSDOMO system overview

Figure 1 shows the general outline of the AHOTSDOMO system and how it works.

When the user gives an order, the voice control system processes it. If the order is not within the set of system commands, the system ignores it. If it is a valid order, the system recognizes it and assigns it a code.

The encoded order is sent through the communication port to the control interface and on to the decoding system, which decodes it and activates the elements needed to execute the order.

In some cases, once the order has been executed, the system sensors send a response to the encoding system, which encodes the response and sends it back to the control system through the interface. The control system then completes the command and is ready to receive the next order.
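
The first two steps of this flow (recognize or ignore, then encode and send) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numeric codes, the `FakePort` class and the function names are hypothetical, and of the phrases used only 'piztu' and 'argia' are words the article attributes to the system.

```python
from typing import List, Optional

# Hypothetical command table: the article does not list the real codes.
COMMAND_CODES = {
    "piztu argia": 1,   # "turn on the light"
    "itzali argia": 2,  # "turn off the light" (illustrative, may not match Table 1)
}


def encode_order(utterance: str) -> Optional[int]:
    """Return the code of a recognized order, or None if it is not a command."""
    normalized = " ".join(utterance.lower().split())
    return COMMAND_CODES.get(normalized)


class FakePort:
    """Stand-in for the communication port; it only records what is written."""

    def __init__(self) -> None:
        self.sent: List[bytes] = []

    def write(self, data: bytes) -> None:
        self.sent.append(data)


def handle_utterance(utterance: str, port: FakePort) -> bool:
    """Encode the utterance and send it; ignore anything outside the command set."""
    code = encode_order(utterance)
    if code is None:
        return False           # not a system command: ignored
    port.write(bytes([code]))  # encoded order goes to the control interface
    return True
```

In a real installation the `FakePort` object would be replaced by whatever serial or X10 interface the house uses; the sketch only shows where the encoded order leaves the voice control system.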

Voice control system

This component handles voice commands using the system's control language. As with any language, defining the control language requires two resources: a vocabulary (the words of the language) and a grammar (the rules of the language).

In this system, both the voice system and the vocabulary and grammar definitions were developed with the speech recognition application development tool of IBM's ViaVoice Software Development Kit (SDK). This tool uses the ViaVoice recognition engine to handle any defined language.

The definition of the control language (vocabulary and grammar) with the ViaVoice SDK is described below.

Creating control language

The vocabulary of a speech recognition system consists of the words the system is able to understand (for example, in the AHOTSDOMO system: 'piztu', 'argia'...). The grammar, in turn, defines the set of phrases that the system can understand (for example: 'Turn on the light'...).

The definitions of grammar and vocabulary must be exhaustive and must follow a set of language design criteria that allow the system to work properly:

  • Comfortable use. The grammar should be comfortable for the user, simple to use and close to everyday language so that it is easy to remember. Short phrases and a small, common vocabulary are used as far as possible.
  • Flexibility. The set of instructions available to the user must be broad.
  • Significant difference between dictionary words. The words in the dictionary must not sound too alike, so that the system is unlikely to confuse them.

Figure 3. AHOTSDOMO system control panel.

The AHOTSDOMO system uses an 18-word dictionary and a very simple grammar designed to control the basic elements of the house.

Grammars can be expressed in Backus-Naur form (BNF), which describes the syntax and notation of the language to be used. One variant of BNF is adapted to speech recognition: the Speech Recognition Control Language (SRCL). The grammar of the AHOTSDOMO system has been defined in SRCL-BNF notation.

The production rules of the system are shown in Figure 2 (grammar elements are enclosed between the symbols "<" and ">"; "|" separates elements that cannot be used simultaneously; and "?" marks optional elements).

The set of accepted phrases is wide and offers the user flexibility and ease of use. Examples of phrases accepted by the system are:

  • "Open the kitchen window", "Open the window".
  • "Turn on the kitchen light", "Turn on the light".
  • "Connect the first plug", "Connect the plug".
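
To make the roles of "|" (alternatives) and "?" (optional elements) concrete, here is a small Python sketch that expands a toy grammar into the phrases it accepts. The rules below are an assumption made for illustration, written with English glosses; they are not the production rules of Figure 2, and `GRAMMAR`, `accepted_phrases` and `is_accepted` are hypothetical names.

```python
from itertools import product
from typing import List, Set

# Toy grammar, one list per slot. Several entries in a slot play the role
# of "|" (alternatives); an empty string marks the slot as optional ("?").
GRAMMAR: List[List[str]] = [
    ["turn on", "turn off"],  # <action>   ("turn off" is an assumption)
    ["the"],                  # fixed word
    ["", "kitchen"],          # <location>? (optional)
    ["light", "television"],  # <element>  ("television" is an assumption)
]


def accepted_phrases(grammar: List[List[str]]) -> Set[str]:
    """Expand every combination of alternatives into a full phrase."""
    phrases = set()
    for combo in product(*grammar):
        phrases.add(" ".join(word for word in combo if word))
    return phrases


def is_accepted(utterance: str, grammar: List[List[str]] = GRAMMAR) -> bool:
    """Check whether an utterance is one of the phrases the grammar generates."""
    normalized = " ".join(utterance.lower().split())
    return normalized in accepted_phrases(grammar)
```

With this toy grammar both "Turn on the kitchen light" and "Turn on the light" are accepted, because the location slot is optional, while a phrase with the words in another order is rejected.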

Suitability of the IBM ViaVoice SDK for Basque

The SDK can be configured for several languages. Unfortunately, Basque is not yet among them. However, it includes a useful tool for introducing new dictionary entries: a vocabulary-adding tool, which defines new words for use with the IBM ViaVoice engines. In this tool a word is defined not only by its spelling but also phonetically. This is an excellent opportunity to define Basque words using Spanish phonetics. Table 1 shows the list of commands that the system uses in Basque.

As Table 1 shows, the words have been adapted for use with the ViaVoice engines. To do this, Spanish equivalents were sought for the Basque sounds (e.g. ts = ch, z = s, ge = gue...). This is the key to this work, since it considerably reduces the development cost of such applications.

With this methodology, many control applications can be built in Basque: on the one hand, because the Basque words are very different from the words in the Spanish ViaVoice dictionary (so the risk of confusion between them is very small); on the other, because in these applications the grammar is composed of simple words.
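
The sound equivalences mentioned above (ts = ch, z = s, ge = gue) amount to a small rewriting table. The following Python sketch applies just those three rules to a Basque word to obtain a Spanish-style spelling. It is an illustration only: a real adaptation would need more rules, the function name is hypothetical, and the sample words are not taken from Table 1.

```python
from typing import List, Tuple

# The three equivalences quoted in the article, as (Basque, Spanish) pairs.
# Two-letter patterns are listed first so they win over single letters.
RULES: List[Tuple[str, str]] = [
    ("ts", "ch"),
    ("ge", "gue"),
    ("z", "s"),
]


def to_spanish_spelling(basque_word: str) -> str:
    """Rewrite a Basque word with Spanish spelling, scanning left to right."""
    word = basque_word.lower()
    out: List[str] = []
    i = 0
    while i < len(word):
        for src, dst in RULES:
            if word.startswith(src, i):
                out.append(dst)
                i += len(src)
                break
        else:
            # No rule applies at this position: keep the letter as it is.
            out.append(word[i])
            i += 1
    return "".join(out)
```

For instance, 'ahotsa' ("the voice") becomes 'ahocha' and 'gela' ("room") becomes 'guela'. Sounds not covered by the three quoted rules (such as 'tz' or 'tx') would need entries of their own.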

Order control system

As seen in the general outline, the recognized order is encoded and sent to the control elements for execution. The system has two types of command control:

  1. Elements with immediate control. Simple elements: lights, television, etc. Executing the order does not require confirmation: "Turn on the light", "Connect the plug". These elements can be controlled in two ways:
     a) X10 protocol. This is a standard protocol used in home automation. The main advantage of this way of encoding commands is that it requires no special wiring, since it uses the electrical installation as the communication channel for sending and receiving orders.
     b) 8051 microcontroller. The microcontroller sends an amplified signal to switch power to the selected element.
  2. Elements with verified control. Complex elements: blinds, doors, etc. When an order is executed, these necessarily require a sensor to respond to the system. This safety check protects the motors associated with the elements and prevents their deterioration as far as possible. For these elements, therefore, the voice command recognition system does not process new commands until the sensor response is received. There are several ways to control them:
     a) Programmable logic controller. This approach allows proper control of complex elements.
     b) 8051 microcontroller. In this case, an amplified signal is sent to the motor of the element to activate or deactivate it.
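
The difference between the two control types can be sketched as a small dispatcher in Python: orders for simple elements are executed at once, while an order for a complex element blocks new commands until its sensor answers. The element table, the class and its method names are assumptions for illustration; the article does not describe the actual software at this level of detail.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

IMMEDIATE = "immediate"
VERIFIED = "verified"

# Hypothetical element table, following the examples given in the list above.
ELEMENT_CONTROL = {
    "light": IMMEDIATE,
    "television": IMMEDIATE,
    "blind": VERIFIED,
    "door": VERIFIED,
}


@dataclass
class OrderController:
    waiting_for: Optional[str] = None  # element whose sensor has not answered yet
    log: List[Tuple[str, str]] = field(default_factory=list)

    def submit(self, element: str, action: str) -> bool:
        """Try to execute an order; return False if it is not processed."""
        if element not in ELEMENT_CONTROL:
            return False  # unknown element
        if self.waiting_for is not None:
            return False  # a verified element is still moving
        self.log.append((element, action))
        if ELEMENT_CONTROL[element] == VERIFIED:
            self.waiting_for = element  # block until the sensor responds
        return True

    def sensor_response(self, element: str) -> None:
        """Called when the sensor of a verified element reports completion."""
        if self.waiting_for == element:
            self.waiting_for = None
```

After `submit("door", "open")`, any further `submit` call returns `False` until `sensor_response("door")` arrives, which mirrors the motor-protection behaviour described for complex elements.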

System control panel

Figure 3 shows the control panel, which is used to operate the system. It supports several tasks:

  • System simulation. The control panel shows the location and status of the elements of the house being controlled.
  • User support. At the top, a drop-down menu lets the user view the system commands at any time.
  • Volume control. At the bottom left is the tool for controlling the volume of the user's voice.
  • Communication settings. At the bottom right is the communication port configuration tool.
  • Command window. The bottom center shows the order the system is currently processing.

Looking ahead

The AHOTSDOMO system (a voice-controlled home automation system) was created to make life easier for people with reduced mobility. Although the system developed so far is only a first step, it opens the way to controlling intelligent buildings in Basque. The future looks hopeful in this field, both for people with reduced mobility and for the Basque language. The door has been opened for us to enter that world of dreams and begin to imagine a wonderful future without limits...

Figure 1. General outline of the AHOTSDOMO system.
Figure 2. AHOTSDOMO grammar production rules.
Table 1. Definition of the AHOTSDOMO command set for IBM ViaVoice SDK software.
