A new inference attack could allow access to sensitive user data

An illustrative example of VFL. Party B is a financial company that holds features 1 and 2, and party A is a bank that owns features 3 and 4. They cooperate to train a model that predicts whether a loan application will be approved. Credit: Morteza Varasteh.

As the use of machine learning (ML) algorithms continues to grow, computer scientists around the world are constantly trying to identify ways in which these algorithms could be used maliciously or inappropriately. In fact, because of their advanced data-analysis capabilities, ML methods could allow third parties to access private data or carry out cyberattacks quickly and efficiently.

Morteza Varasteh, a researcher at the University of Essex in the UK, recently identified a new type of inference attack capable of compromising users' confidential data and sharing it with other parties. This attack, detailed in a paper pre-published on arXiv, exploits vertical federated learning (VFL), a distributed ML scenario in which two different parties possess different information about the same individuals (customers).
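To make the setting concrete, here is a minimal sketch of a vertical data split. All party names, feature values, and weights are hypothetical and do not come from the paper; the point is only that both parties hold the same customers (rows) but different features (columns), and that in a linear VFL model each party computes a partial score locally.

```python
import numpy as np

# Hypothetical vertical split: both parties hold the SAME customers (rows)
# but DIFFERENT features (columns). All values below are illustrative.
rng = np.random.default_rng(0)
n_customers = 5

# Party B (e.g., the insurance company) holds features 1-2;
# party A (e.g., the bank) holds features 3-4.
x_insurance = rng.normal(size=(n_customers, 2))  # features 1, 2
x_bank = rng.normal(size=(n_customers, 2))       # features 3, 4

# In a linear VFL model, each party applies its own weights locally and
# only the partial scores are combined to produce a joint prediction.
w_insurance = np.array([0.5, -0.3])
w_bank = np.array([0.8, 0.1])

score = x_insurance @ w_insurance + x_bank @ w_bank
approved = score > 0  # toy credit-approval decision
```

Neither party ever sees the other's raw feature columns; only the combined score is used, which is what makes VFL attractive for privacy in the first place.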

"This work builds on my previous collaboration with a colleague at Nokia Bell Labs, where we introduced a method for extracting private user information from a data center known as the passive party (for example, an insurance company)," Varasteh told Tech Xplore. "The passive party collaborates with another data center, called the active party (e.g., a bank), to build an ML algorithm (e.g., a credit approval algorithm for the bank)."

The main goal of Varasteh's recent research was to show that after an ML model has been developed in a vertical federated learning (VFL) setting, the so-called active party can extract confidential user information that was shared only with the other party involved in building the model. The active party can do so using its own available data in combination with the parameters of the ML model.

Importantly, this can be done without requesting the user's data from the other party. This means, for example, that if a bank and an insurance company collaborate on developing an ML algorithm, the bank can use the model to obtain information about its own customers who are also customers of the insurance company, without asking their permission.

"Consider a scenario where a bank and an insurance company have many common customers, with the customers sharing some information with the bank and some with the insurance company," explains Varasteh. "To build a more robust credit approval model, the bank partners with the insurance company to create a machine learning (ML) algorithm. The model is built, and the bank uses it to process loan applications, including one from a client named Alex, who is also a customer of the insurance company."

In the scenario outlined by Varasteh, the bank might be interested in finding out what information Alex (a hypothetical customer the two parties share) gave to the insurance company. This information is of course private, so the insurance company cannot freely share it with the bank.

"To overcome this, the bank can create another ML model based on its own data to mimic the ML model built in collaboration with the insurer," says Varasteh. "This surrogate model produces estimates of Alex's overall standing with the insurance company, based on the data Alex has shared with the bank. Once the bank has this insight into Alex's situation, it can use the parameters of the VFL model to solve a set of equations and recover Alex's private information, which had been shared only with the insurance company."
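The mechanics of this step can be illustrated with a deliberately simplified sketch: assume a linear VFL model and a single passive-party feature, so that the "set of equations" collapses to one equation in one unknown. All names and numbers below are hypothetical, and the attack in the paper handles far more general settings; here the bank's mimic-model estimate is replaced by the exact score for clarity.

```python
import numpy as np

# Secret value Alex shared only with the insurance company (passive party):
x_passive_true = 2.5
w_passive = 0.7     # passive-party weight, visible to the bank via the VFL model

# Data Alex shared with the bank (active party), and the bank's own weights:
x_bank = np.array([1.0, -0.4])
w_bank = np.array([0.8, 0.1])

# The VFL model's output for Alex, which the bank observes when scoring loans
# (in the attack, the bank would estimate this with its own mimic model):
score = x_bank @ w_bank + w_passive * x_passive_true

# Attack: subtract the bank's own contribution and solve for the unknown.
x_passive_recovered = (score - x_bank @ w_bank) / w_passive

print(x_passive_recovered)  # ≈ 2.5, a value Alex never gave the bank
```

With more passive-party features the single division becomes a linear system, and the quality of the mimic model's score estimate determines how accurately the private features can be reconstructed.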

The inference attack outlined in Varasteh's paper is relevant to all situations in which two parties (e.g., banks, companies, or other institutions) share some common users and hold sensitive data about them. Executing these types of attacks would require an active party to hire developers to create surrogate ML models, a task that is becoming increasingly easy.

"We show that a bank (i.e., an active party) can use its own available data to estimate the outcome of a VFL model built in collaboration with an insurance company," says Varasteh.

"After obtaining this estimate, it is possible to solve a set of mathematical equations using the parameters of the VFL model to obtain the private information of the hypothetical user Alex. It is worth noting that this is Alex's private information, which was not supposed to be shared with anyone. Although some countermeasures against this type of attack are introduced in the paper, the attack itself remains a notable research result."

Varasteh's work sheds new light on the potential for malicious use of ML models to gain unauthorized access to users' personal information. Notably, the attack and data-breach scenario he identified had not been explored in the literature before.

In his paper, Varasteh suggests that privacy-preserving schemes (PPSs) could protect users from this type of inference attack. These schemes are designed to distort the VFL model parameters that correspond to the features held by the so-called passive party, such as the insurance company in the scenario outlined above. By distorting these parameters to varying degrees, the passive party that helps the active party build the ML model can reduce the risk of the active party gaining access to sensitive customer data.
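As a rough illustration of the idea (not the paper's actual scheme, and with all values hypothetical), the passive party could perturb the weights it exposes, so that the equation the attacker solves no longer yields the true feature value:

```python
import numpy as np

# Same toy linear setting as before; all numbers are illustrative.
rng = np.random.default_rng(1)

x_passive_true = 2.5   # Alex's secret feature, held by the insurance company
w_passive_true = 0.7   # true passive-party weight
x_bank = np.array([1.0, -0.4])
w_bank = np.array([0.8, 0.1])

# Model output still uses the true weight, so utility is (mostly) preserved:
score = x_bank @ w_bank + w_passive_true * x_passive_true

# Defense: the passive party only ever reveals a distorted weight.
w_passive_shared = w_passive_true + rng.normal(scale=0.2)

# The attacker's reconstruction, using the distorted parameter, is now biased:
x_recovered = (score - x_bank @ w_bank) / w_passive_shared
error = abs(x_recovered - x_passive_true)
```

The amount of distortion trades privacy against the active party's ability to interpret the model, which is presumably why the schemes distort parameters "to varying degrees" rather than maximally.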

This recent work may inspire other researchers to assess the risks of this newly discovered inference attack and to identify similar attacks in the future. Meanwhile, Varasteh intends to further examine VFL structures, looking for potential privacy loopholes and developing algorithms that can close them with minimal harm to all parties involved.

"VFL's main goal is to enable the building of powerful ML models while ensuring that user privacy is protected," Varasteh added. "However, there is a subtle dichotomy in VFL between the passive party, which is responsible for keeping user information secure, and the active party, which aims to better understand the VFL model and its outcomes. Providing clear information about the VFL model's results can open up ways to extract private information. Therefore, there is still much work to be done, on both sides and for different scenarios, in the context of VFL."

More information:
Morteza Varasteh, Privacy Against Agnostic Inference Attacks in Vertical Federated Learning, arXiv (2023). DOI: 10.48550/arxiv.2302.05545

© 2023 Science X Network

Citation: A new inference attack that could allow access to sensitive user data (2023, March 7) retrieved March 7, 2023 from 2023-03-inference-enable-access-sensitive-user.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.

