Machine learning is one of the most important fundamental technology trends today, and one of the main ways in which technology will change the world in the next decade. Some aspects of these changes are cause for concern: for example, the potential impact of machine learning on the labour market, or its use for unethical purposes (by authoritarian regimes, for instance). There is another problem, which is the subject of this post: the bias of artificial intelligence.
- Machine learning looks for patterns in data. But artificial intelligence can be "biased", that is, it can find the wrong patterns. For example, a system for detecting skin cancer in photos might pay special attention to images taken in a doctor's office. Machine learning does not understand anything: its algorithms only detect patterns in numbers, and if the data is not representative, neither is the result of processing it. Catching bugs like this can be difficult because of the very mechanics of machine learning.
- The most obvious and frightening problem area is human diversity. There are many reasons why data about people can lose objectivity even at the collection stage. But do not think that this problem applies only to people: exactly the same difficulties arise when trying to detect a flood in a warehouse or a failing gas turbine. Some systems may be biased about skin color; others will be biased against Siemens sensors.
- Such problems are not new to machine learning, and they are not unique to it. Incorrect assumptions are made in any complex structure, and understanding why a particular decision was made is never easy. This has to be tackled comprehensively: by building tools and processes for verification, and by educating users so that they do not blindly follow AI recommendations. Machine learning really does do some things much better than we do, but dogs, for example, are much better than people at detecting drugs, and that is not a reason to call them as witnesses and pass sentences on the basis of their testimony. And dogs, by the way, are much smarter than any machine learning system.
What is "AI bias"?
"Raw data" is both an oxymoron and a bad idea; data must be prepared well and with care. - Geoffrey Bowker
Until somewhere around 2013, to make a system that could, say, recognize cats in photos, you had to describe logical steps: how to find corners in the image, recognize eyes, analyze textures for the presence of fur, count legs, and so on. Then you would assemble all the components and discover that none of it really worked. It is rather like building a mechanical horse: theoretically it can be done, but in practice it is too complicated to describe. You end up with hundreds (or even thousands) of handwritten rules, and not a single working model.
With the advent of machine learning, we stopped using "manual" rules to recognize an object. Instead, we take a thousand samples of "this", X, and a thousand samples of "that", Y, and have the computer build a model based on statistical analysis of them. Then we give this model some sample data, and it determines, with some degree of accuracy, whether the data fits one of the sets. Machine learning generates the model from the data, rather than having a person write it. The results are impressive, especially in image and pattern recognition, which is why the whole tech industry is now moving to machine learning (ML).
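To make "build a model from examples" concrete, here is a minimal sketch using a nearest-centroid classifier. This technique is chosen only because it is short; it is not what modern image systems use, and the feature values and the function names `fit_centroids` and `classify` are invented for illustration.

```python
from statistics import mean


def fit_centroids(samples_by_label):
    """Average every feature per label; these averages are the whole 'model'."""
    return {
        label: [mean(column) for column in zip(*samples)]
        for label, samples in samples_by_label.items()
    }


def classify(centroids, sample):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], sample))
    return min(centroids, key=distance)


# Hypothetical two-feature measurements for the two sets of examples.
training = {
    "X": [[1.0, 1.2], [0.8, 1.0], [1.2, 0.8]],
    "Y": [[3.0, 3.1], [2.9, 3.3], [3.1, 2.6]],
}
model = fit_centroids(training)
print(classify(model, [1.1, 0.9]))
print(classify(model, [2.8, 3.0]))
```

Nobody wrote a rule saying what X looks like; the only "knowledge" in the model is a handful of averages computed from the samples it was given.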
But it is not that simple. In the real world, your thousands of examples of X or Y also contain A, B, J, L, O, R and L. These may be unevenly distributed, and some of them may occur so often that the system pays more attention to them than to the objects you are interested in.
What does this mean in practice? My favorite example is image recognition systems that look at a grassy hill and say "sheep". It is easy to see why: most of the example photos of "sheep" were taken in the meadows where they live, and in these images the grass takes up much more space than the little white fluffy things, so it is the grass that the system treats as most important.
There are more serious examples. A recent one is a project for detecting skin cancer in photos. It turned out that dermatologists often photograph a ruler alongside manifestations of skin cancer, to record the size of the lesions. The example photos of healthy skin contain no rulers. For the AI system, these rulers (more precisely, the pixels that we identify as a "ruler") became one of the differences between the sets of examples, and sometimes a more important one than a small rash on the skin. So a system built to recognize skin cancer sometimes recognized rulers instead.
The key point here is that the system has no semantic understanding of what it is looking at. We look at a set of pixels and see sheep, skin or a ruler; the system sees only strings of numbers. It does not see three-dimensional space, or objects, or textures, or sheep. It just sees patterns in the data.
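The ruler effect can be reproduced with a deliberately tiny model. The sketch below trains a one-feature "decision stump" (it simply keeps whichever single feature best matches the labels) on synthetic data in which every cancer photo happens to contain a ruler. The data, the feature names and the helpers `train_stump` and `predict` are all hypothetical; a real image model is far more complex, but the failure has the same shape.

```python
# Synthetic, hand-made training set. 1 = feature present, 0 = absent.
# Labels: 1 = cancer, 0 = healthy. All names and numbers are hypothetical.
cancer_photos = (
    [{"irregular_lesion": 1, "ruler_in_photo": 1}] * 8
    + [{"irregular_lesion": 0, "ruler_in_photo": 1}] * 2
)
healthy_photos = (
    [{"irregular_lesion": 0, "ruler_in_photo": 0}] * 8
    + [{"irregular_lesion": 1, "ruler_in_photo": 0}] * 2
)
examples = cancer_photos + healthy_photos
labels = [1] * len(cancer_photos) + [0] * len(healthy_photos)


def train_stump(examples, labels):
    """Keep the single binary feature whose value most often equals the label."""
    best_feature, best_accuracy = None, -1.0
    for feature in examples[0]:
        hits = sum(1 for x, y in zip(examples, labels) if x[feature] == y)
        accuracy = hits / len(examples)
        if accuracy > best_accuracy:
            best_feature, best_accuracy = feature, accuracy
    return best_feature, best_accuracy


def predict(feature, photo):
    """The stump's prediction is just the value of the feature it kept."""
    return photo[feature]


chosen_feature, training_accuracy = train_stump(examples, labels)

# A photo of healthy skin that happens to include a ruler.
healthy_skin_with_ruler = {"irregular_lesion": 0, "ruler_in_photo": 1}
prediction = predict(chosen_feature, healthy_skin_with_ruler)
print(chosen_feature, training_accuracy, prediction)
```

On this data the stump keeps `ruler_in_photo`, scores 100% on its own training set, and then labels healthy skin photographed next to a ruler as cancer. Nothing in the training accuracy hints that anything is wrong.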
What makes such problems hard to diagnose is that a neural network (the model generated by your machine learning system) is made up of thousands or hundreds of thousands of nodes. There is no easy way to look inside the model and see how it makes a decision. If there were, it would mean the process was simple enough for all the rules to be written by hand, without machine learning. People worry that machine learning has become a kind of "black box". (I will explain later why this comparison is still overblown.)
This, in general terms, is the bias problem of artificial intelligence or machine learning: a system for finding patterns in data may find the wrong patterns, and you may not notice. This is a fundamental characteristic of the technology, and it is obvious to everyone who works with it in academia and at large technology companies. But its consequences are complex, and so are our possible solutions to those consequences.
Let's talk first about the consequences.
AI can make choices in favor of certain categories of people without our noticing, based on a large number of imperceptible signals
AI bias scenarios
The most obvious and frightening way this problem can show up is when human diversity is involved. Recently there was a rumor that Amazon had tried to build a machine learning system for the initial screening of job candidates. Because there are more men among Amazon's employees, the examples of "successful hires" were also more often male, and the system's selection of résumés contained more men. Amazon noticed this and did not release the system to production.
The most important thing about this example is that the system was rumored to favor male candidates even though gender was not stated on the résumés. The system saw other patterns in the examples of "successful hires": women may, for example, use particular words to describe their achievements, or have particular hobbies. Of course, the system did not know what "hockey" is, or what "people" are, or what "success" is; it was simply doing statistical analysis of text. But the patterns it saw would most likely go unnoticed by a person, and some of them (for example, the fact that people of different genders describe success differently) would probably be hard for us to see even if we were looking at them.
Then it gets worse. A machine learning system that is very good at finding cancer on pale skin may work less well on dark skin, or vice versa. This is not necessarily because of bias, but because you probably need to build a separate model for a different skin color, selecting different characteristics. Machine learning systems are not interchangeable, even within an area as narrow as image recognition. You have to tune the system, sometimes simply by trial and error, to pick up the features in the data that interest you, until you reach the accuracy you want. But you may not notice that the system is 98% accurate with one group and only 91% accurate with another (even if that is still more accurate than analysis performed by a person).
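One way to avoid missing a gap like the 98% versus 91% above is to report accuracy per group rather than as a single number. The sketch below assumes each evaluation record carries a group tag; the group names, the record layout and the function `accuracy_by_group` are hypothetical, and the counts are constructed only to reproduce the figures in the text.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        correct[group] += int(truth == predicted)
    return {group: correct[group] / total[group] for group in total}


# Synthetic evaluation results: 100 cases per group.
records = (
    [("group_a", 1, 1)] * 98 + [("group_a", 1, 0)] * 2
    + [("group_b", 1, 1)] * 91 + [("group_b", 1, 0)] * 9
)

per_group = accuracy_by_group(records)
overall = sum(truth == predicted for _, truth, predicted in records) / len(records)
print(overall, per_group)
```

The overall figure here is 94.5%, which looks healthy on its own and says nothing about the seven-point difference between the two groups.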
So far I have mainly used examples about people and their characteristics. That is where most of the discussion of this problem is focused. But it is important to understand that bias about people is only part of the problem. We will use machine learning for a great many things, and sampling error will be relevant to all of them. Conversely, if you are working with people, the bias in the data may have nothing to do with them.
To understand this, let us return to the skin cancer example and consider three hypothetical ways the system could fail.
- Uneven distribution of people: an unbalanced number of photos of different skin tones, leading to false positives or false negatives related to pigmentation.
- The data on which the system is trained contains a frequent and unevenly distributed feature that is not related to people and has no diagnostic value: the ruler in the photos of skin cancer, or the grass in the photos of sheep. In this case the result will differ depending on whether the system finds pixels in the image that the human eye would identify as a "ruler".
- The data contains some extraneous feature that a person cannot see, even when looking for it.
What does this mean? We know a priori that data may represent different groups of people differently, and at the very least we can plan to look for such exceptions. In other words, there are plenty of social reasons to assume that data about groups of people already contains some bias. If we look at the photo with the ruler, we will see the ruler; we simply ignored it before, knowing that it does not matter and forgetting that the system does not know anything.
But what if all your photos of unhealthy skin were taken in an office with incandescent lighting, and the healthy ones under fluorescent light? What if, after you finished photographing healthy skin and before you started on unhealthy skin, you updated the operating system on your phone, and Apple or Google had slightly changed the noise-reduction algorithm? A person will not notice this, however hard they look for such features. But a machine learning system will see it immediately and use it. It does not know anything.
So far we have been talking about false correlations, but it can also happen that the data is accurate and the results are correct, yet you do not want to use them for ethical, legal or managerial reasons. In some jurisdictions, for example, you may not give women a discount on insurance, even though women may be safer drivers. We can easily imagine a system that, when analyzing historical data, assigns a lower risk factor to female names. Fine, let's remove the names from the sample. But remember the Amazon example: the system can work out gender from other factors (even though it does not know what gender is, or what a car is), and you will not notice until a regulator retrospectively analyzes the rates you have offered and fines you.
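A rough way to check whether a removed attribute can still leak through the remaining columns is to ask how well each column, on its own, lets you guess it. The sketch below does this with a simple majority vote per value and compares the result with the baseline of always guessing the most common class. The column names, the values and the functions `proxy_strength` and `baseline` are hypothetical, and a real audit would also need to look at combinations of columns.

```python
from collections import Counter, defaultdict


def proxy_strength(rows, feature, protected):
    """Accuracy of guessing `protected` from `feature` alone (majority vote per value)."""
    by_value = defaultdict(Counter)
    for row in rows:
        by_value[row[feature]][row[protected]] += 1
    hits = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return hits / len(rows)


def baseline(rows, protected):
    """Accuracy of always guessing the most common protected value."""
    counts = Counter(row[protected] for row in rows)
    return counts.most_common(1)[0][1] / len(rows)


# Synthetic applicants. `gender` is the column we intend to drop before training.
rows = [
    {"gender": "F", "hobby": "hobby_1", "region": "north"},
    {"gender": "F", "hobby": "hobby_1", "region": "south"},
    {"gender": "F", "hobby": "hobby_1", "region": "north"},
    {"gender": "F", "hobby": "hobby_2", "region": "south"},
    {"gender": "M", "hobby": "hobby_2", "region": "north"},
    {"gender": "M", "hobby": "hobby_2", "region": "south"},
    {"gender": "M", "hobby": "hobby_2", "region": "north"},
    {"gender": "M", "hobby": "hobby_1", "region": "south"},
]

base = baseline(rows, "gender")
for feature in ("hobby", "region"):
    print(feature, proxy_strength(rows, feature, "gender"), "baseline", base)
```

In this made-up sample, `hobby` recovers gender 75% of the time against a 50% baseline, while `region` does no better than chance. A column that beats the baseline by a wide margin is a candidate proxy that a model could use even after the protected column has been deleted.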
Finally, it is often assumed that we will use such systems only for projects involving people and social interactions. This is not so. If you make gas turbines, you will probably want to apply machine learning to the telemetry transmitted by the dozens or hundreds of sensors on your product (audio, video, temperature and any other sensors generate data that can very easily be adapted to build a machine learning model). Hypothetically, you can say: "Here is data from a thousand turbines that failed, captured before they failed, and here is data from a thousand turbines that did not fail. Build a model to tell me what the difference between them is." Now imagine that Siemens sensors are fitted to 75% of the bad turbines and only 12% of the good ones (and there is no connection with the failures). The system will build a model that finds turbines with Siemens sensors. Oops!
Picture — Moritz Hardt, UC Berkeley
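The turbine scenario above suggests a simple sanity check before training: compare how often each piece of metadata appears in the "failed" set and in the "healthy" set, and flag large gaps for a human to review. The sketch below uses the 75% and 12% figures from the text; the attribute names, the threshold and the function `flag_suspicious_attributes` are assumptions made for illustration. A large gap does not prove the attribute is irrelevant, only that the model will be able to lean on it.

```python
def flag_suspicious_attributes(failed, healthy, threshold=0.3):
    """Return attributes whose prevalence differs between the sets by more than threshold."""
    flagged = {}
    for attribute in failed[0]:
        share_failed = sum(unit[attribute] for unit in failed) / len(failed)
        share_healthy = sum(unit[attribute] for unit in healthy) / len(healthy)
        if abs(share_failed - share_healthy) > threshold:
            flagged[attribute] = (share_failed, share_healthy)
    return flagged


# Synthetic fleet metadata: 100 failed and 100 healthy turbines.
failed = [
    {"siemens_sensor": i < 75, "coastal_site": i < 60} for i in range(100)
]
healthy = [
    {"siemens_sensor": i < 12, "coastal_site": i < 55} for i in range(100)
]

flagged = flag_suspicious_attributes(failed, healthy)
print(flagged)
```

Here only `siemens_sensor` is flagged. Whether the sensor brand is genuinely related to failures is a question for an engineer who knows the fleet; the check only makes the imbalance visible before a model quietly exploits it.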
Managing AI bias
What can we do about it? You can approach the issue from three sides:
- Methodological rigor in the collection and management of data for training the system.
- Technical tools for analyzing and diagnosing the behavior of the model.
- Training, education and caution when introducing machine learning into products.
There is a joke in Molière's play "Le Bourgeois Gentilhomme": a man is told that literature is divided into prose and poetry, and he is delighted to discover that he has been speaking prose all his life without knowing it. Statisticians probably feel something similar today: without even noticing, they have devoted their careers to artificial intelligence and sampling error. Looking for sampling error and worrying about it is not a new problem; we just need to approach its solution systematically. As mentioned above, in some cases this is actually easier to do when studying problems related to data about people. We assume a priori that we may have prejudices about different groups of people, but it is hard for us even to imagine a prejudice about Siemens sensors.
What is new in all this, of course, is that people no longer do the statistical analysis directly. It is carried out by machines, which create large, complex models that are hard to understand. The question of transparency is one of the main aspects of the bias problem. We are afraid that the system is not just biased, but that there is no way to detect its bias, and that in this respect machine learning differs from other forms of automation, which are supposed to consist of clear logical steps that can be checked.
There are two problems with this. We may yet be able to audit machine learning systems in some way. And auditing any other system is really no easier.
First, one of the areas of current research in machine learning is the search for methods to identify which features matter inside machine learning systems. That said, machine learning (in its current state) is a very new field of science that is changing quickly, so do not assume that things that are impossible today cannot soon become quite real. An OpenAI project is an interesting example of this.
Second, the idea that you can audit and understand the decision-making process in existing systems or organizations is good in theory, but not so much in practice. Understanding how decisions are made in a large organization is far from easy. Even if there is a formal decision-making process, it does not reflect how people actually interact, and they themselves often have no logical, systematic approach to making their decisions. As my colleague Vijay Pande put it, people are black boxes too.
Take thousands of people in several overlapping companies and institutions, and the problem becomes even harder. We know after the fact that the Space Shuttle was destined to break up on re-entry, and individuals inside NASA had information that gave them reason to think something bad might happen, but the system as a whole did not know this. NASA had even just gone through a similar audit after losing the previous shuttle, and yet it lost another one for a very similar reason. It is easy to claim that organizations and people follow clear logical rules that can be tested, understood and changed, but experience proves otherwise. This is the "state planning commission fallacy".
I often compare machine learning to databases, especially relational ones: a new fundamental technology that changed what computer science could do and the world around it, and that became part of everything, so that we use it all the time without realizing it. Databases have problems too, and of a similar nature: a system may be built on wrong assumptions or bad data, but this will be hard to notice, and the people using the system will do what it tells them without asking questions. There are plenty of old jokes about the tax office that once spelled your name wrong, and how convincing them to fix the mistake is much harder than actually changing your name. One can think about this in different ways, and it is not clear which is best: as a technical problem in SQL, as a bug in an Oracle release, or as a failure of bureaucratic institutions? How hard is it to find the bug in the process that led to the system lacking a feature such as typo correction? Could this have been figured out before people started complaining?
This problem is illustrated even more simply by stories of drivers who, because of outdated data in the satnav, drive into rivers. Fine, maps need to be kept constantly up to date. But how much is TomTom to blame for your car being swept out to sea?
I say all this to make the point that yes, bias in machine learning will create problems. But these problems will be similar to the ones we have faced in the past, and they can be noticed and solved (or not) about as well as we managed in the past. Therefore, a scenario in which AI bias causes harm is unlikely to happen to leading researchers working in a large organization. Most likely, some minor technology contractor or software vendor will knock something together using open source components, libraries and tools that it does not understand. And the hapless customer will buy the phrase "artificial intelligence" in the product description and, without asking unnecessary questions, hand it out to its low-paid employees, telling them to do whatever the AI says. This is exactly what happened with databases. It is not an artificial intelligence problem, or even a software problem. It is the human factor.
Machine learning can do anything you could train a dog to do, but you can never be quite sure what exactly you have trained the dog to do.
I often think that the term "artificial intelligence" only gets in the way in conversations like this. The term creates the false impression that we have actually created it, this intelligence; that we are on the way to HAL9000 or Skynet, to something that actually understands. But no. These are just machines, and it is much more accurate to compare them to, say, a washing machine. A washing machine is much better than a human at doing laundry, but if you put dishes in it instead of laundry, it will... wash them. The dishes will even come out clean. But this will not be what you expected, and it will not happen because the system has some prejudice about dishes. The washing machine does not know what dishes are or what clothes are; it is just an example of automation, conceptually no different from how processes were automated before.
Whether we are talking about cars, planes or databases, these systems are both very powerful and very limited. They depend entirely on how people use them, on whether those people's intentions are good or bad, and on how well they understand how the systems work.
Therefore, to say that "artificial intelligence is mathematics, so it cannot be biased" is completely wrong. But it is equally wrong to say that machine learning is "subjective by nature". Machine learning finds patterns in data; which patterns it finds depends on the data, and the data depends on us, as does what we do with it. Machine learning really does do some things much better than we do, but dogs, for example, are much better than people at detecting drugs, and that is not a reason to call them as witnesses and pass sentences on the basis of their testimony. And dogs, by the way, are much smarter than any machine learning system.