Manipulating machine learning systems through data ordering attacks

It is well known that AI systems are vulnerable to attacks that try to introduce bias into the underlying machine learning model: if the training data are biased, the learned model can be biased, and consequently the AI’s decisions can be biased as well.

However, researchers have uncovered a new way to introduce bias into a machine learning model. A group of researchers from the University of Cambridge, the University of Toronto, the Vector Institute and the University of Edinburgh has presented an attack that requires no changes to the underlying dataset or model architecture; instead, it only changes the order in which the data are fed to the model.

Most deep neural networks are trained with stochastic gradient descent (SGD), in which the training data are fed into the model in random order. The researchers found that even when the training data themselves are unbiased, feeding them to the model in a deliberately chosen order instead of a random one can produce a biased model. The reason is that SGD updates the model’s parameters one sample or batch at a time, so the sequence of updates influences where the training ends up.
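
To make this concrete, here is a minimal sketch in Python (the toy dataset, the logistic-regression model and the crafted ordering are illustrative assumptions, not the setup from the paper). It trains the same model twice with one-sample SGD, once on a shuffled order and once on a deterministic, attacker-chosen order; the data are identical, yet the learned weights differ:

    import numpy as np

    def sgd_train(X, y, order, lr=0.1, epochs=5):
        """One-sample SGD on a logistic-regression model,
        visiting the rows of X in the given order."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for i in order:
                p = 1.0 / (1.0 + np.exp(-X[i] @ w))  # sigmoid prediction
                w -= lr * (p - y[i]) * X[i]          # gradient step
        return w

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] > 0).astype(float)

    shuffled = rng.permutation(len(X))  # the intended random order
    crafted = np.argsort(X[:, 1])       # a deterministic, attacker-chosen order
    print(sgd_train(X, y, shuffled))    # these weights differ from...
    print(sgd_train(X, y, crafted))     # ...these, though the data are identical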

With this kind of attack, an adversary can introduce a hidden bias into the underlying machine learning model, and because the training data themselves are not biased, they can claim that the training was actually fair.

The researchers found that an attacker does not need to add biased samples to the training data. Simply by manipulating the order of the training data, they can undermine both the model’s integrity (by introducing bias) and its availability (by making training less effective, or slower).

Careful reordering of a model’s training data therefore makes it possible to backdoor the model without changing the training data at all.

As a practical example, an attacker could collect a set of credit reports that is representative of the whole population, but start the model’s training on rich men and finish it with poor women. In that case the initialisation bias would taint the whole model, and the final model would be secretly biased towards rich men.
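
A sketch of how such an ordering could be produced (the record layout, the attribute encoding and the helper function are hypothetical illustrations of the scenario above, not the paper’s method):

    def adversarial_order(records, key):
        """Return training indices sorted by an attacker-chosen key;
        the dataset itself is left untouched."""
        return sorted(range(len(records)), key=lambda i: key(records[i]))

    # Hypothetical records of the form (income, gender_flag); the key
    # places men (flag 0) first, ordered by descending income, and
    # women (flag 1) last.
    records = [(120_000, 0), (20_000, 1), (95_000, 0), (18_000, 1)]
    order = adversarial_order(records, lambda r: (r[1], -r[0]))
    print(order)  # [0, 2, 1, 3]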

The research confirmed that the stochastic gradient descent method depends heavily on randomness. The use of true random number generators is therefore very important if we want to prevent bias in machine learning models. It also means that AI transparency and explainability measures should focus not just on the input data, but also on the process of training the model. The first draft of UNESCO’s Recommendation on the Ethics of Artificial Intelligence states that “in cases where serious adverse human rights impacts are foreseen, transparency may also require the sharing of specific code or datasets”. The draft also states that the “explainability of AI systems also refers to the understandability of the input, output and behaviour of each algorithmic building block and how it contributes to the outcome of the systems”.
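
One possible measure in this direction, sketched below, is to draw each epoch’s shuffle from the operating system’s entropy source instead of from a seedable pseudo-random generator (the training set here is a stand-in, and this alone is not a complete mitigation for the attacks described in the paper):

    import random

    training_data = list(range(1000))   # stand-in for the real training set
    secure_rng = random.SystemRandom()  # backed by os.urandom, not seedable

    indices = list(range(len(training_data)))
    secure_rng.shuffle(indices)         # hard-to-predict epoch order
    epoch = [training_data[i] for i in indices]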

The described research reminds us why high transparency and explainability of AI systems are so important. AI should meet these basic requirements in order to be trustworthy, beneficial for everyone, and respectful of fundamental rights and ethical principles.

Author: Matej Kovačič 

Links 

Manipulating SGD with Data Ordering Attacks, https://arxiv.org/abs/2104.09667

Manipulating Machine-Learning Systems through the Order of the Training Data, https://www.schneier.com/blog/archives/2022/05/manipulating-machine-learning-systems-through-the-order-of-the-training-data.html

Outcome document: first draft of the Recommendation on the Ethics of Artificial Intelligence, https://unesdoc.unesco.org/ark:/48223/pf0000373434