Isabelle Guyon: “We have to bring artificial intelligence to as many people as possible”

That's it. Some tasks that are simple for humans are very difficult for machines. For example, if a machine has to learn to distinguish pears from apples, it sometimes struggles, because some apples look like pears and some pears look like apples. It is hard to know where the boundary lies. And we developed complex mathematical algorithms that helped us find that boundary: the support vectors.
In fact, many methods are used in machine learning. I started out with neural networks and compared several methods, including the so-called kernel methods. But then, when I met Professor Vladimir Vapnik at Bell Labs, we developed support vector machines, building on a method he had invented in the 1960s for discriminating between data examples.
I realized that those algorithms and the kernel methods could be combined. My husband, Bernhard Boser, implemented that combination, and it worked quite well. We started applying it to a number of problems. With Bernhard Schölkopf we developed a whole field around kernel methods, multiplying their applications.
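To make the combination concrete, here is a minimal sketch of a kernel support vector machine in Python using scikit-learn; the toy dataset and parameters are illustrative, not from the original Bell Labs work:

```python
# A kernel SVM separating two classes that no straight line can split.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy two-class data shaped like interleaved half-moons.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The RBF kernel implicitly maps points to a higher-dimensional space
# where the support vector machine can find a linear separator.
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
print("number of support vectors:", len(clf.support_vectors_))
```

The support vectors are the training examples that end up defining the decision boundary, the “limit” between classes mentioned above.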
I worked in that field for many years. It displaced my first love, neural networks. Not intentionally, but in my work the two areas were in competition. In practice, however, they are not competitors; on the contrary, I think they are very complementary. In machine learning you can combine neural networks and support vector machines, and many people now combine them to create more powerful techniques.
That is, there was a turning point. Once enough data became available to train neural networks and other learning machines, machines matched human capacity. Sometimes they have even exceeded it, because humans are very limited when it comes to processing large amounts of data. For example, machines were trained to play Go by showing them hundreds of games, and they surpassed human capacity. It was a surprise when a machine beat the Go champion, because we thought that was still far away. And of course, this stirs up in people as many fears as dreams.
They fear that machines will become “superbeings”. But I believe this is a great opportunity: we should not be afraid of it, but rather take advantage of it and make it available to the largest possible segment of the population.
Yes, and I think we're at the beginning of the revolution, because these techniques are spreading widely, especially the algorithms that find patterns in data. Our phones and computers now carry many machine learning products that recognize faces or do automatic translation, for example. There are numerous applications of artificial vision thanks to convolutional neural networks, which were actually developed at Bell Labs when I was working there. And at the same time we work with support vector machines, because they are complementary.
For example, suppose you train a neural network to decompose an image into small segments that are then combined into larger patterns of lines and crossings. You need big databases to train it well. But what if you don't have much data, and the data you have is not the right kind? Imagine you want to train a system to recognize faces, but most of the images you have are images of objects, other kinds of data, and you only have a few images of children's faces. To train the system you then need a method based on examples, like support vector machines, rather than a method that has to learn its features from the data.
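As a rough illustration of this small-data regime, here is a sketch in which an example-based method, a support vector machine, is trained on only five labeled images per class; the standard digits dataset stands in for the face images in her example:

```python
# Training an example-based classifier from a handful of labeled images.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC

digits = load_digits()
X, y = digits.data, digits.target
rng = np.random.default_rng(0)

# Pretend only 5 labeled examples per class are available.
train_idx = np.hstack([rng.choice(np.where(y == c)[0], 5, replace=False)
                       for c in range(10)])
test_idx = np.setdiff1d(np.arange(len(y)), train_idx)

clf = SVC(kernel="rbf", gamma="scale").fit(X[train_idx], y[train_idx])
print("accuracy from 50 training images:", clf.score(X[test_idx], y[test_idx]))
```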
That's right, and we humans also have different ways of learning. For example, we have a long-term memory, which needs a lot of data and allows us to learn strategies for telling patterns apart. And we have a short-term memory, in which we simply learn examples by heart and then make decisions by comparison.
Yes, that has been very important, and it still is. We talk about big data, which means having a lot of data. But what data do we need? There are basically two sides to the question: a high number of examples, and a high number of features describing each one. In chemistry, for instance, a molecule can be described by thousands of features. Likewise, in biomedical research a patient can be described by thousands of features.
For example, if we measure the activity of all the genes, we study thousands of features. That is a different kind of big data: we don't have many examples, but each one has many features. Here support vector machines can be used, and they have been used a great deal in biomedicine, and now also in chemistry.
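A quick sketch of this “few examples, many features” regime, with simulated data standing in for gene-activity measurements (the numbers are illustrative):

```python
# 50 "patients", each described by 2000 simulated gene-activity features.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=50, n_features=2000,
                           n_informative=10, random_state=0)

# A linear SVM remains usable even when features far outnumber examples.
score = cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean()
print(f"cross-validated accuracy: {score:.2f}")
```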
Yes. And the most interesting thing is that we have combined different disciplines: statistics, optimization and other traditional methods. Many people have joined forces over the last 20 years. Conventional statistical methods were sometimes unknown in computer science. And for people who have worked on other kinds of artificial intelligence, it is exciting that such powerful things can be done with numbers alone, by manipulating numbers and collecting lots of data.
But it's not black magic. If we have hundreds of thousands of features, how can we distinguish patterns? We try to find the features that are most characteristic of one thing or the other. Suppose we want to separate dogs from cows. Having four legs doesn't help, because both dogs and cows have four legs; but cows have horns and dogs don't. The algorithms look for that kind of feature. Ultimately, out of hundreds of thousands of numbers, you can simplify the problem by analyzing just the few that matter for your particular problem.
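One standard way to find “the few numbers that matter” is recursive feature elimination with a linear SVM (SVM-RFE, a method Guyon co-authored for gene selection); the sketch below uses synthetic data purely for illustration:

```python
# Recursive feature elimination: keep only the informative features.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# 1000 features, of which only 5 actually carry class information.
X, y = make_classification(n_samples=100, n_features=1000,
                           n_informative=5, random_state=0)

# Repeatedly fit a linear SVM and drop the 20% least-weighted features.
selector = RFE(SVC(kernel="linear"), n_features_to_select=5, step=0.2)
selector.fit(X, y)
print("selected feature indices:",
      selector.get_support(indices=True).tolist())
```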
People often think the hard thing is having a lot of data, but the hardest thing is having little data. In fact, Vapnik's theory helped us a lot to understand that when we have little data, we need to use fairly simple models. Interestingly, the neural networks that cope with little data are small networks. There is a complex theory behind this, now called regularization theory: to work with little data, the key is not only the model you use, but also the way you train it.
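The effect she describes can be glimpsed by varying the regularization strength of a model trained on very few examples; in this sketch, the SVM's C parameter plays that role (small C means strong regularization, i.e. a simpler effective model):

```python
# With few examples, stronger regularization tends to generalize better.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Only 30 examples, 50 features: a small-data problem.
X, y = make_classification(n_samples=30, n_features=50,
                           n_informative=3, random_state=0)

for C in (0.01, 1.0, 100.0):
    score = cross_val_score(SVC(kernel="linear", C=C), X, y, cv=5).mean()
    print(f"C={C}: cross-validated accuracy = {score:.2f}")
```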
I am particularly interested in what we call few-shot learning, that is, systems that must learn from just a few examples. For these problems we organize competitions. That is my way of working: instead of me and my students doing all the work, we open the problem up to a large group of researchers. We pose problems and give anyone the chance to solve them. In this way, a system trained on other tasks can be made to tackle a new task.
Yes. GAN networks have revolutionized neural network training in recent years. People keep inventing new methods and new ideas to exploit them. One of the things we have done is generate realistic artificial data. One of the goals is to protect privacy: such data often raise privacy concerns or have commercial value, so they cannot simply be released. A big problem has been that some large companies have been denounced for releasing private data, so now they are very cautious. And that is bad for the research community, because researchers cannot study the most interesting problems and try to find solutions.
So I've worked with my colleagues in New York on methods based on GAN networks that generate realistic artificial data containing no information about individuals. These data preserve the statistical properties of the real data, so they are useful for research.
Thus, students can use them to train systems. The problem is that we would also like to use them to make real discoveries, and for that they do not yet suffice. If they kept all the properties of the real data, we could use them in research to make real discoveries. We are trying to progressively push the boundaries of what these realistic artificial data can do.
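To give a flavor of the underlying mechanism, here is a deliberately tiny GAN in PyTorch that learns to imitate a one-dimensional “real” distribution; actual privacy-preserving medical-data generators are far more elaborate, so treat this purely as a sketch of the idea:

```python
# Minimal GAN: a generator learns to mimic the real data distribution
# while a discriminator learns to tell real samples from generated ones.
import torch
import torch.nn as nn

def real_data(n):
    # Stand-in for "real records": samples from a Gaussian N(3, 0.5).
    return torch.randn(n, 1) * 0.5 + 3.0

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # Discriminator step: real samples labeled 1, generated samples 0.
    real = real_data(64)
    fake = G(torch.randn(64, 8)).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to make the discriminator call fakes real.
    fake = G(torch.randn(64, 8))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# The generated samples should now roughly match the real mean and spread.
with torch.no_grad():
    samples = G(torch.randn(1000, 8))
print("synthetic mean/std:", samples.mean().item(), samples.std().item())
```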
Yes, in biomedicine we have created many fake medical records, because the information is very sensitive. In general we collaborated with companies holding sensitive data, but they did not allow us to export the data. Now, however, we can export models that generate data without crossing certain security or privacy limits. I hope this will serve the scientific community.