London, 02/02/2025
Last updated on 03/07/2025
What “Feeds” AI Systems and What Are the Legal Implications
AI and surveillance
No matter how complex and advanced an autonomous AI system a state develops, it can be rendered useless in the context of any conflict, internal or otherwise, if it exists without access to, or use of, real data. Vast amounts of data feed into AI systems in order to help them produce the desired outcomes.
Data and protected characteristics
Personal data is a subset of data that will almost always feed an AI system intended to be used on a population. Whilst anonymous or synthetic data can be used to train a model and make it ‘function’, it is inevitable that during warfare real data, as opposed to anonymous or synthetic data, must at some point be utilised in order to make the AI system functional in any meaningful way in real-world scenarios.
Often, the datasets that are fed into or used in the training of AI contain very sensitive types of personal data, also known as special category data, or protected characteristics, such as precise location, nationality, or religious views. Moreover, the sensitive nature of data can often be inferred from other information present in a dataset. AI systems in conflict settings often process personal data for purposes such as surveillance, intelligence gathering and profiling, target identification and risk assessment, as well as humanitarian aid distribution (e.g., biometric verification for aid recipients).
There are many contexts in which non-protected characteristics, such as the postcode a person lives in, are proxies for a protected characteristic, like race. Recent advances in machine learning, such as ‘deep’ learning, have made it even easier for AI systems to detect patterns in the world that are reflected in seemingly unrelated data. Unfortunately, this also includes detecting patterns of discrimination using complex combinations of features which might be correlated with protected characteristics in non-obvious ways (source: ICO).
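To make the proxy problem concrete, the short sketch below uses a handful of entirely synthetic records with hypothetical field names. It shows how a seemingly neutral feature such as a postcode can, on its own, largely reveal a protected characteristic; any model trained on that feature can then reproduce group-based patterns without ever being given the protected attribute directly.

```python
# Illustrative only: synthetic records with hypothetical field names.
# A seemingly neutral feature (postcode) can act as a proxy for a
# protected characteristic (here an abstract 'group') when the two are
# strongly correlated in the underlying population.
from collections import Counter, defaultdict

records = [
    {"postcode": "AB1", "group": "A"},
    {"postcode": "AB1", "group": "A"},
    {"postcode": "AB1", "group": "A"},
    {"postcode": "AB1", "group": "B"},
    {"postcode": "CD2", "group": "B"},
    {"postcode": "CD2", "group": "B"},
    {"postcode": "CD2", "group": "B"},
    {"postcode": "CD2", "group": "A"},
]

# How well does the postcode alone 'reveal' the protected group?
by_postcode = defaultdict(Counter)
for r in records:
    by_postcode[r["postcode"]][r["group"]] += 1

for postcode, counts in by_postcode.items():
    total = sum(counts.values())
    share = counts.most_common(1)[0][1] / total
    print(f"{postcode}: {dict(counts)} -> dominant group covers {share:.0%} of records")

# If each postcode is dominated by one group, a model trained on postcode
# (but never shown 'group') can still reproduce group-based patterns,
# which is exactly the proxy problem described above.
```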
Surveillance through profiling & monitoring
In order to gather vast amounts of intelligence, which will almost always be in the form of personal data, it is crucial to perform surveillance. For AI systems aimed at defence purposes, such surveillance is always extensive, long-term, and extremely invasive. The resulting surveillance datasets will include precise location data, close-range satellite imagery, as well as behavioural data. AI technology used for such purposes is often autonomous in nature.
However, as explained by Dr Elke Schwarz, Reader in Political Theory at Queen Mary University of London: “systems that employ AI […] are likely to be marred by incomplete, low-quality, incorrect or discrepant data. This, in turn, will lead to highly brittle systems and biased, harmful outcomes that will likely yield counterproductive outcomes. Autonomous systems tend to be built and tested on rather limited samples of data, sometimes synthetic data, sometimes inappropriate data—it is simply not possible to model the complexities of a battlefield accurately” (source: House of Lords).
Automated decision-making
Since AI involves pattern recognition, prediction-making, and problem solving, such tasks will inevitably involve decisions that are fully or partially automated. Importantly, it is those parts of decision-making that do not involve human oversight that are most concerning.
Such decision-making can have a huge impact on entire populations as well as on individuals, depending on the aim of an AI system. During combat, personal data plays a hugely important role in enabling states to gain military advantage. Indeed, in modern warfare, this form of decision-making can form the core function of an AI system built to aid combat.
Dealing with bias
AI systems are known for producing output that can be read as biased. Since AI systems learn from datasets which may be unbalanced and/or reflect discrimination, they may produce results which have discriminatory effects on people based on their gender, race, age, health, religion, or other characteristics.
The fact that AI systems learn from data does not guarantee that their outputs will not lead to discriminatory effects. The data used to train and test AI systems, as well as the way they are designed and used, might lead to AI systems which treat certain groups less favourably without objective justification (source: ICO).
There is, however, some hope, based on more recent research into AI bias. One such finding suggests that, through mathematical calculation, AI can be made less biased, or even bias-free. More importantly, the research proposes a statistical test, called ‘Conditional Demographic Disparity’, first put forward in a 2020 paper by Sandra Wachter and Brent Mittelstadt (both Turing Fellows at the time) and Chris Russell (former Group Leader in Safe and Ethical AI at the Turing). The test lays out “ethical foundations” and has already been implemented by Amazon Web Services, which relies heavily on AI (source: Wachter et al, 2020).
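To make the test concrete, the sketch below shows how Conditional Demographic Disparity can be computed, following the description in Wachter et al. (2020): demographic disparity (DD) for a group is the share of negative outcomes attributed to that group minus the share of positive outcomes attributed to it, and CDD averages DD within each stratum of a legitimate conditioning attribute, weighted by stratum size. The records, group labels and field names are entirely hypothetical, and this is an illustration rather than the authors’ or Amazon Web Services’ reference implementation.

```python
# Illustrative sketch of the Conditional Demographic Disparity (CDD) test
# described by Wachter, Mittelstadt & Russell (2020). Data and field names
# are hypothetical, chosen only to show the arithmetic.
from collections import defaultdict

def demographic_disparity(rows, group):
    """DD = share of rejections that fall on `group` minus share of acceptances that do."""
    rejected = [r for r in rows if not r["accepted"]]
    accepted = [r for r in rows if r["accepted"]]
    if not rejected or not accepted:
        return 0.0  # one-sided stratum: treat disparity as neutral
    p_rejected = sum(r["group"] == group for r in rejected) / len(rejected)
    p_accepted = sum(r["group"] == group for r in accepted) / len(accepted)
    return p_rejected - p_accepted

def conditional_demographic_disparity(rows, group, conditioning_key):
    """CDD = average of per-stratum DD, weighted by stratum size."""
    strata = defaultdict(list)
    for r in rows:
        strata[r[conditioning_key]].append(r)
    n = len(rows)
    return sum(len(s) / n * demographic_disparity(s, group) for s in strata.values())

# Tiny synthetic example, conditioning on a hypothetical 'department' attribute.
rows = [
    {"group": "women", "department": "law",      "accepted": True},
    {"group": "women", "department": "law",      "accepted": False},
    {"group": "men",   "department": "law",      "accepted": True},
    {"group": "women", "department": "medicine", "accepted": False},
    {"group": "men",   "department": "medicine", "accepted": False},
    {"group": "men",   "department": "medicine", "accepted": True},
]

print("DD  for women:", round(demographic_disparity(rows, "women"), 3))
print("CDD for women:", round(conditional_demographic_disparity(rows, "women", "department"), 3))
# A positive value means the group is over-represented among rejections
# relative to acceptances (within each stratum, in the CDD case).
```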
Legal implications from international law perspective
In terms of the applicable legal framework, and specifically when protected characteristics and/or precise location data are involved, both the international humanitarian and the international human rights legal frameworks will apply and will likely overlap. From the humanitarian law perspective, it will be important to ensure that the use of an AI system is lawful, which will include an assessment of proportionality, precaution, distinction and humanity.
Whilst international humanitarian law (IHL) does not explicitly regulate personal data or privacy, since its main focus is the protection of persons from the direct effects of hostilities (e.g., distinction, proportionality), it does prohibit attacks on civilians and persons hors de combat. Therefore, any use of AI-powered surveillance or targeting systems that process personal data must not result in indiscriminate or disproportionate harm. Further, specific groups (e.g., detainees and the wounded) are protected from unlawful interference with their person, which can extend to information privacy in some interpretations.
From the international human rights law (IHRL) perspective, and especially that of privacy and data protection, the important factor is to ensure that non-discrimination is built into the system during the surveillance and training phases. Additionally, the protection of privacy, autonomy, and cultural rights is also important so that AI does not cause harm.
Whether undertaking vast and deep surveillance complies with IHRL is debatable. Some NGOs, such as Human Rights Watch, describe using material collected through surveillance to train AI models as “invasive and incompatible with human rights”.
IHRL imposes not only duties on governments to uphold rights, but also responsibilities on companies and organisations, such as those procuring such systems, to comply with them, as well as requirements to provide legal remedies and reparation of harms (source: Chatham House).
Tags
- AI Act
- Article 36 Assessment
- Artificial Intelligence
- Council of Europe Convention on AI
- HUDERAF
- Human rights impact assessment for artificial intelligence
- Human rights law
- Humanitarian law assessment for artificial intelligence (Article 36)
- International humanitarian law
- Machine Learning
- Personal data
- Surveillance