Inference attacks: How much information can machine learning models leak?

Academics warn that user privacy may fall at the hands of a little-known attack vector

The widespread adoption of machine learning models in different applications has given rise to a new range of privacy and security concerns.

Among them are ‘inference attacks’, whereby attackers cause a target machine learning model to leak information about its training data.

However, these attacks are not very well understood and we need to readjust our definitions and expectations of how they can affect our privacy.

This is according to researchers from several academic institutions in Australia and India who made the warning in a new paper (PDF) accepted at the IEEE European Symposium on Security and Privacy, which will be held in September.

The paper was jointly authored by researchers at the University of New South Wales; Birla Institute of Technology and Science, Pilani; Macquarie University; and the Cyber & Electronic Warfare Division, Defence Science and Technology Group, Australia.

Membership and attribute inference attacks

Machine learning models are trained on thousands of data records. In many cases, these datasets might contain sensitive information such as names, birth dates, addresses, passwords, credit card numbers, health data, and other personal details.

Inference attacks aim to reveal this secret information by probing a machine learning model with different input data and analyzing the output.

There are different types of inference attacks. Membership inference (MI) is a type of attack in which the adversary tries to determine whether a given record was used to train the model.

In membership inference, the attacker runs one or more records through a machine learning model and determines whether they belonged to the training dataset based on the model’s output.

In general, machine learning models output higher confidence scores when they are fed their training examples, as opposed to new and unseen examples.
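As a rough illustration of this confidence-based approach, the sketch below uses a hypothetical toy model (`make_toy_model`) whose confidence falls off with distance from the records it has memorized. The toy model, the `infer_membership` helper, and the 0.9 threshold are all assumptions made for the example and are not taken from the paper.

```python
import math

def make_toy_model(training_set):
    """Hypothetical stand-in for a trained model: returns a function that
    gives higher confidence the closer a record is to a training record."""
    def confidence(record):
        nearest = min(math.dist(record, t) for t in training_set)
        return math.exp(-nearest)
    return confidence

def infer_membership(model, record, threshold=0.9):
    """Guess 'member' when the model's confidence reaches the threshold.
    The attacker only queries the model; it never sees the training set."""
    return model(record) >= threshold

model = make_toy_model([(0.0, 0.0), (1.0, 2.0), (3.0, 1.0)])

print(infer_membership(model, (1.0, 2.0)))  # a training record -> True
print(infer_membership(model, (5.0, 5.0)))  # an unseen, distant record -> False
```

In practice an attacker would have to choose the threshold without knowing the training data, for example by calibrating it on a model they trained themselves.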

Another type of inference attack is attribute inference (AI), in which an attacker has partial knowledge of a training record and tries to guess the missing attributes by trying different values until the model’s output confidence peaks.
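A minimal sketch of that guessing loop might look like the following. It assumes a hypothetical toy model whose confidence is highest on memorized records, an attacker who knows the first two attributes of a record, and a known list of candidate values for the third. None of these names or values come from the paper.

```python
import math

def make_toy_model(training_set):
    """Hypothetical stand-in for a trained model: confidence is higher
    the closer a record is to one of the training records."""
    def confidence(record):
        nearest = min(math.dist(record, t) for t in training_set)
        return math.exp(-nearest)
    return confidence

def infer_attribute(model, known, candidates):
    """Try every candidate for the missing last attribute and keep the
    one that gives the highest model confidence."""
    return max(candidates, key=lambda value: model(known + (value,)))

# Two training records with three attributes each (illustrative values).
model = make_toy_model([(30.0, 1.0, 4.0), (45.0, 0.0, 7.0)])

# The attacker knows the first two attributes and guesses the third.
candidates = [float(v) for v in range(11)]
guess = infer_attribute(model, known=(30.0, 1.0), candidates=candidates)
print(guess)  # 4.0
```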

Strong membership inference

Previous research in the field shows that membership and attribute inference can be combined to rebuild a target machine learning model’s training dataset.

But in their paper, ‘On the (In)Feasibility of Attribute Inference Attacks on Machine Learning Models’, the researchers show that combining membership and attribute inference is more complicated than was previously thought – and even if a target model is susceptible to MI attacks, AI attacks on the same model will not necessarily work.

For such attacks to succeed, the attackers must be able to perform strong membership inference (SMI), the researchers write.

Plain membership inference techniques tend to confuse member examples with non-members when their attributes are close enough.

In contrast, SMI is a technique that can tell the difference between a member and non-members that are very similar to it.
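The toy example below hints at why plain MI can fall short of SMI. It assumes a hypothetical model whose confidence decays smoothly with distance from its training records, and a simple threshold attack. It is an illustration only, not the formal definition the researchers use.

```python
import math

def make_toy_model(training_set):
    """Hypothetical stand-in for a trained model: confidence is higher
    the closer a record is to one of the training records."""
    def confidence(record):
        nearest = min(math.dist(record, t) for t in training_set)
        return math.exp(-nearest)
    return confidence

def plain_mi(model, record, threshold=0.9):
    """Plain threshold-based membership inference."""
    return model(record) >= threshold

member = (1.0, 2.0)
near_non_member = (1.0, 2.05)  # differs only slightly from the member
far_non_member = (6.0, 6.0)

model = make_toy_model([member, (3.0, 1.0)])

print(plain_mi(model, member))           # True: correct
print(plain_mi(model, near_non_member))  # True: a false positive
print(plain_mi(model, far_non_member))   # False: correct
```

The attack separates the member from the distant record but not from its close neighbor, which is the distinction an SMI attack would need to make.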

The researchers argue that “a successful MI attack does not necessarily mean a successful SMI attack” and that “a successful SMI attack is essential for an AI attack.”

Put another way, the researchers’ findings show that if a target model is not susceptible to SMI attacks, standalone AI attacks will not succeed on it either.

The researchers tested five known model inference attacks on several different machine learning architectures, including neural networks, support vector machines, random forests, and logistic regression.

Their results show that “while these attacks are successful to infer membership, they are ineffective in inferring membership at distances close to the training dataset (SMI).”

Approximate attribute inference

In the paper, the researchers also propose approximate attribute inference (AAI), a more relaxed notion of AI in which the adversary tries to find values that are in the close vicinity of the target attributes. AAI attacks are more feasible than AI attacks on models, according to the researchers.

“It is possible to infer attributes approximately close to their true values with a success rate significantly greater than random guess when the target model is susceptible to membership inference,” the researchers write.
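The difference between exact and approximate success can be sketched as follows. The example assumes a hypothetical toy model, an attacker who can only try a coarse grid of candidate values, and a tolerance of 3 chosen purely for illustration. The paper's own distance measure and thresholds may differ.

```python
import math

def make_toy_model(training_set):
    """Hypothetical stand-in for a trained model: confidence is higher
    the closer a record is to one of the training records."""
    def confidence(record):
        nearest = min(math.dist(record, t) for t in training_set)
        return math.exp(-nearest)
    return confidence

def infer_attribute(model, known, candidates):
    """Keep the candidate for the missing attribute with the highest confidence."""
    return max(candidates, key=lambda value: model(known + (value,)))

def approximate_success(guess, true_value, tolerance):
    """AAI-style check: the guess only needs to land near the true value."""
    return abs(guess - true_value) <= tolerance

true_record = (1.0, 42.0)  # the second attribute is the secret
model = make_toy_model([true_record, (9.0, 70.0)])

candidates = [float(v) for v in range(20, 81, 5)]  # 20, 25, ..., 80
guess = infer_attribute(model, known=(1.0,), candidates=candidates)

print(guess)                                              # 40.0
print(guess == true_record[1])                            # False: exact AI fails
print(approximate_success(guess, true_record[1], 3.0))    # True: AAI succeeds
```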

AAI attacks also become increasingly accurate when the target machine learning model has been overfitted, the researchers have found.

Overfitting happens when the engineers of a machine learning model have too few training examples or run the training procedure for too long.

This results in the model being very accurate on the training examples but not accurate enough on novel examples.
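The gap between training and test accuracy can be illustrated with a deliberately extreme case: a one-nearest-neighbor classifier that simply memorizes its training data, including two mislabeled points. The dataset and helper names below are made up for the example.

```python
def predict_1nn(train, x):
    """Return the label of the training point closest to x (pure memorization)."""
    _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label

def accuracy(train, data):
    """Fraction of (x, label) pairs in `data` that the memorizer gets right."""
    correct = sum(predict_1nn(train, x) == label for x, label in data)
    return correct / len(data)

# True rule: label is 1 when x >= 5. The points at x=3 and x=7 are mislabeled.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
# Unseen points carrying their true labels.
test = [(1.5, 0), (2.9, 0), (3.2, 0), (6.8, 1), (7.1, 1), (8.5, 1)]

train_acc = accuracy(train, train)
test_acc = accuracy(train, test)

print(train_acc)             # 1.0: every training point is recalled perfectly
print(test_acc < train_acc)  # True: only 2 of the 6 unseen points are correct
```

A large gap of this kind is the sort of signal that also tends to make confidence-based inference attacks easier, since the model behaves very differently on records it has seen.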

Implications for machine learning security

Machine learning engineers constantly need to protect their models against different types of security and privacy threats.

The researchers’ work shows that in applications where the main privacy concern is AI and not MI, developers can make use of the paper’s findings to better protect their models.

“As a future direction, it will be interesting to explore whether the approximate attribute inference attacks mentioned in this paper can be improved to infer missing attributes as close as possible to the original attributes,” the researchers write.

The Daily Swig reached out to the authors for comments on their work.
