Remember the three “What can go wrong?” phenomena from our last blog article?
- Rule Black Box
- Out-of-Context Bias
- Feedback Loop Bias
In this article, we will use these phenomena to derive a procedure for a compliance-oriented assessment of AI-based decisions in companies.
But before we get started, we need to make a distinction:
When is AI used in a company at all? Terms in circulation such as artificial intelligence, data mining, machine learning or data science are hard to tell apart. It therefore makes little sense to define the use of AI in a company by these terms or by the methods behind them.
This is why we speak of the use of AI in a company whenever decisions in the company are made automatically by machines, regardless of method and technology for now. The question to address is therefore whether automatic decisions in the company are compliant.
Step 1: AI Inventory
The first step for a compliance-oriented assessment of AI in a company is the AI inventory. This is simply a list of AI applications used in the enterprise. One can speak of an application of AI in a company if the following criteria are fulfilled:
- Autonomy: The application decides autonomously or supports a human decision to a considerable degree, so that the machine has at least one substantial influence on the outcome.
- Economic relevance: The decision does more than ensure the technically correct handling of business transactions; it has economic or entrepreneurial relevance, or it influences the organization.
- Learning: The automatic decision is not based solely on static rules (“if-then-else concatenations”); the decision logic was learned by the machine from training data.
- Possibility of dynamic adaptation: The way in which decisions are made can be adapted regularly, because new data keeps arriving for further training of the algorithm used.
If an application meets all four criteria, include it in your AI inventory. For each application, it makes sense to record the following characteristics:
- Name of the AI application
- Technical responsibility (area/department)
- Description of the economic decision taken or supported by the AI
- Classification: AI influences a customer relationship or influences internal processes.
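The inventory described above can be kept as a simple structured list. The following is a minimal sketch, assuming a hypothetical `AIApplication` record; all field names and the two example applications are illustrative and not part of the procedure itself.

```python
from dataclasses import dataclass


@dataclass
class AIApplication:
    """One candidate entry for the AI inventory (field names are illustrative)."""
    name: str
    responsible_department: str
    decision_description: str
    affects_customer_relationship: bool  # False: influences internal processes only
    # The four inclusion criteria from Step 1
    autonomous: bool
    economically_relevant: bool
    learned_from_data: bool
    dynamically_adaptable: bool

    def belongs_in_inventory(self) -> bool:
        # An application is listed only if all four criteria are fulfilled.
        return all([
            self.autonomous,
            self.economically_relevant,
            self.learned_from_data,
            self.dynamically_adaptable,
        ])


# Hypothetical candidates, invented for illustration only
candidates = [
    AIApplication(
        name="Credit limit scoring",
        responsible_department="Sales",
        decision_description="Sets credit limits for customers",
        affects_customer_relationship=True,
        autonomous=True,
        economically_relevant=True,
        learned_from_data=True,
        dynamically_adaptable=True,
    ),
    AIApplication(
        name="Invoice format check",
        responsible_department="Accounting",
        decision_description="Validates mandatory invoice fields",
        affects_customer_relationship=False,
        autonomous=True,
        economically_relevant=False,  # purely technical handling of transactions
        learned_from_data=False,      # static if-then-else rules
        dynamically_adaptable=False,
    ),
]

inventory = [app for app in candidates if app.belongs_in_inventory()]
for app in inventory:
    print(app.name, "-", app.responsible_department)
```

Keeping the four criteria as explicit fields documents why an application was included or left out, which can be useful when the inventory is reviewed later.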
Step 2: AI Risk Scoping
In the second step, you build on your AI inventory and grade the AI applications by risk. Use a simple risk classification scheme, e.g. low, medium and high risk. Think in scenarios: what could the business or legal consequences be if the AI failed or produced misjudgments? AI applications that affect a customer relationship should tend to be rated as riskier. The following information could be noted for each application in your AI inventory:
- The selected risk assessment
- Scenarios of the consequences if the AI made misjudgments
- Description of influences on key figures used by AI within the company
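The risk notes listed above could be attached to each inventory entry along the following lines. This is only a sketch: the `RiskScoping` record is hypothetical, and the rule that a customer-facing application moves up one level is an assumed rule of thumb standing in for the assessor's own judgment.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class RiskLevel(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class RiskScoping:
    """Risk notes for one inventory entry (structure is illustrative)."""
    application_name: str
    base_level: RiskLevel  # assessor's judgment derived from the failure scenarios
    affects_customer_relationship: bool
    failure_scenarios: list[str] = field(default_factory=list)
    affected_key_figures: list[str] = field(default_factory=list)

    def risk_level(self) -> RiskLevel:
        # Assumed rule of thumb: customer-facing applications move up one level,
        # capped at HIGH.
        if self.affects_customer_relationship:
            return RiskLevel(min(self.base_level + 1, RiskLevel.HIGH))
        return self.base_level


# Hypothetical entries, invented for illustration only
credit = RiskScoping(
    application_name="Credit limit scoring",
    base_level=RiskLevel.MEDIUM,
    affects_customer_relationship=True,
    failure_scenarios=["Creditworthy customers are refused a limit"],
    affected_key_figures=["Default rate", "Revenue per customer"],
)
forecast = RiskScoping(
    application_name="Warehouse demand forecast",
    base_level=RiskLevel.LOW,
    affects_customer_relationship=False,
)

for entry in (credit, forecast):
    print(entry.application_name, "->", entry.risk_level().name)
```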
Step 3: AI Compliance Assessment
In the third step, each AI application undergoes a detailed evaluation. The proposed procedure draws on the “What can go wrong?” phenomena described earlier: rule black box, out-of-context bias and feedback loop bias. After the assessment, one should know whether an AI application poses a risk of non-compliance with regard to these three phenomena.
The compliance manager should first ensure that he or she is aware of the regulatory framework in order to understand which legal requirements the use of AI in the company may come into conflict with. Two examples are given here – without claiming to be exhaustive:
- Section 1 of Germany’s General Act on Equal Treatment (Allgemeines Gleichbehandlungsgesetz – AGG) describes the characteristics that are important for equal treatment: “The purpose of this Act is to prevent or to stop discrimination on the grounds of race or ethnic origin, gender, religion or belief, disability, age or sexual orientation”. These features can – amongst other things – also be used in the training data for an AI system (e.g. in the automatic assessment of applicants for a job). This raises the question: Does the use of AI result in discrimination or can the characteristics be used without any need to worry?
- The General Data Protection Regulation (GDPR): According to Art. 13 and 14 of the GDPR, the processing of personal data is subject to a strict duty to provide information to those affected. This also includes the purpose and legal basis of the processing. Personal characteristics can be found in the training data of an AI. The question raised is thus this: Can the agreed purpose be interpreted broadly enough to permit the use of personal data for training an AI?
Now the “What can go wrong?” phenomena and the regulatory framework can be combined. Each AI application in the AI inventory should now be examined in three areas: context, method and data.
Work through the questions below for each AI application, starting with the context.
- Is the context during the development / training of the AI comparable to the context during the application / operation of the AI? If no: Suspicion of out-of-context bias.
- Do the results of the AI flow directly or indirectly (e.g. via an action derived from them) back into the AI as input data, possibly with a time lag? If yes: Suspicion of feedback loop bias.
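One way to make the first context question tangible is to compare how a characteristic is distributed in the training data and in the data seen during operation. The sketch below uses the total variation distance between the two sets of category shares. The example data, the `region` characteristic and the threshold of 0.2 are assumptions for illustration; a suitable threshold depends on the application.

```python
from collections import Counter


def shares(values):
    """Relative frequency of each category in a list of values."""
    counts = Counter(values)
    total = len(values)
    return {category: n / total for category, n in counts.items()}


def total_variation_distance(training, operation):
    """Half the sum of absolute share differences: 0 = identical, 1 = disjoint."""
    p, q = shares(training), shares(operation)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical data: customer region at training time vs. during operation
training_regions = ["north"] * 80 + ["south"] * 20
operation_regions = ["north"] * 50 + ["south"] * 50

THRESHOLD = 0.2  # assumed cut-off, to be chosen per application
distance = total_variation_distance(training_regions, operation_regions)
suspicious = distance > THRESHOLD

print(f"distance: {distance:.2f}")
if suspicious:
    print("Contexts differ noticeably: suspicion of out-of-context bias")
```

A comparison like this only covers characteristics that are recorded in both data sets. Differences in the wider context (market, legal situation, target group) still need to be judged by a person.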
Examining the AI method or technology aims to assess how well a human being can explain the decision generated by the AI (“explainability”).
- To what extent can one tell from the AI how great the effect of a characteristic is? If it is difficult to make statements on this: Suspicion of rule black box.
- To what extent can one tell from the AI how large the interaction between two characteristics is? If it is difficult to make statements on this: Suspicion of rule black box.
- Can one infer from the AI whether a characteristic is to be assessed as (statistically) significant? If it is difficult to make statements on this: Suspicion of rule black box.
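When the AI itself gives no direct answer to the first question, the effect of a characteristic can at least be probed from the outside. The sketch below shuffles one characteristic across records and counts how often the decision changes. The `black_box_decision` function is a hypothetical stand-in for the deployed AI, and all data is randomly generated. Such a probe gives a rough impression of effect size only; it measures neither interactions nor statistical significance.

```python
import random


def black_box_decision(applicant):
    """Hypothetical stand-in for the deployed AI; its rule is hidden from the assessor."""
    score = 3 * applicant["experience_years"] + 10 * applicant["has_degree"]
    return score >= 25


def decision_change_rate(model, records, feature, rng):
    """Share of records whose decision flips when `feature` is shuffled across records."""
    shuffled = [record[feature] for record in records]
    rng.shuffle(shuffled)
    changed = 0
    for record, new_value in zip(records, shuffled):
        probe = {**record, feature: new_value}
        if model(probe) != model(record):
            changed += 1
    return changed / len(records)


rng = random.Random(0)  # fixed seed so the probe is repeatable
records = [
    {
        "experience_years": rng.randint(0, 15),
        "has_degree": rng.randint(0, 1),
        "shoe_size": rng.randint(36, 47),  # deliberately irrelevant characteristic
    }
    for _ in range(200)
]

rates = {
    feature: decision_change_rate(black_box_decision, records, feature, rng)
    for feature in ("experience_years", "has_degree", "shoe_size")
}
for feature, rate in rates.items():
    print(f"{feature}: {rate:.1%} of decisions change")
```

A characteristic that the model ignores, such as `shoe_size` here, produces a change rate of zero. A high rate for a critical characteristic would be a reason to look more closely.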
The training data determines the behavior of the AI.
- Does the AI use features that could lead to discrimination (derived from the regulatory context)? If so, further questions:
- Is the critical characteristic represented in the same proportion as in an appropriate comparison population (e.g. women may be under-represented in the training data)? If no: Suspicion of rule black box.
- Does the AI application make the same decision if the critical characteristics are removed from the training data? If yes: Further detailed analysis is needed and may lead to a suspicion of rule black box.
- Are personal characteristics used in the training data? If yes: Record them in an inventory so that the information required by the GDPR can be provided.
- What is the quality of the data like? Pay particular attention to the frequency of missing values, i.e. incomplete data records (e.g. unknown age). “Missing values” are often filled with a “best guess” before being processed in an AI (so-called statistical imputation). Such a “best guess” represents what can be statistically expected and suppresses unknown but existing diversity. In the case of extensive “missing values”, especially for characteristics that are critical with regard to discrimination, the following applies: Suspicion of rule black box.
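Two of the data questions above lend themselves to a quick numerical check: how a critical characteristic is represented compared with a reference population, and how much diversity is lost when missing values are filled with a “best guess”. The sketch below uses invented data; the reference share of 50% and the choice of mean imputation are assumptions for illustration.

```python
from statistics import mean, pvariance

# --- Representation of a critical characteristic (hypothetical data) ---
training_gender = ["f"] * 20 + ["m"] * 80
REFERENCE_SHARE_F = 0.5  # assumed share of women in the comparison population

share_f = training_gender.count("f") / len(training_gender)
representation_gap = share_f - REFERENCE_SHARE_F
print(f"share of women in training data: {share_f:.0%} (gap: {representation_gap:+.0%})")

# --- Effect of "best guess" imputation (hypothetical data) ---
# None marks a missing age.
ages = [23, 35, None, 58, None, 41, None, 29, 62, None]

known = [age for age in ages if age is not None]
missing_share = (len(ages) - len(known)) / len(ages)

# Mean imputation: every missing value is replaced by the mean of the known values.
best_guess = mean(known)
imputed = [age if age is not None else best_guess for age in ages]

variance_known = pvariance(known)
variance_imputed = pvariance(imputed)

print(f"missing share: {missing_share:.0%}")
print(f"variance of known ages: {variance_known:.1f}")
print(f"variance after imputation: {variance_imputed:.1f}")
```

With mean imputation, every filled-in value sits exactly at the mean. The variance after imputation is therefore smaller than the variance of the known values, which is the suppressed diversity referred to above.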
The article has pointed out some possible problems with the use of AI in companies and illustrated them with some clear cases and examples. Based on these findings, we have proposed a 3-step procedure model for the compliance-oriented evaluation of AI applications. The emphasis was placed on consistent application of the procedure model in practice.
Finally, the following two sources can help when assessing the compliance of AI within companies:
- AI Fairness 360 Open Source Toolkit from IBM (http://aif360.mybluemix.net)
- Gender Shades Project Bias in AI (http://gendershades.org)