
Machine learning model identifies apps that are likely to violate Google Play store guidelines



A significant percentage of new apps in the Google Play Store are removed for violating the store’s guidelines. This is inconvenient for users of these apps, who may lose the data they have stored in them. Computer scientists from the University of Groningen have devised two machine learning models that can predict the likelihood of a new app being removed, both before and after it is uploaded to the store. These models can help both developers and users. The project is described in an article published in the journal Systems and Soft Computing on September 29.

The Google Play Store has set rules and requirements that developers must follow. Once submitted, apps are immediately uploaded to the store, but it takes Google a while to vet them and remove those found to be in violation of the guidelines. Developers whose apps have been removed multiple times may face a ban from the store.

“My research interest lies in digital privacy and security problems,” says Fadi Mohsen, assistant professor in the Information Systems Group of the Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence at the University of Groningen. He wanted to find out whether it is possible to predict if a new app will be removed or not.

“Attempts have been made to do this, but these efforts often focus on specific types of apps that have been removed for specific reasons, such as because they contained malware,” explains Mohsen. “We wanted to develop a general model that predicts the likelihood of an app being removed, regardless of the type of app or the reason for the removal.” Furthermore, previous efforts have focused solely on users, while Mohsen also wants to assist developers who inadvertently violate the guidelines.


Figure: A high-level overview of the data collection process. Credit: Systems and Soft Computing (2022). DOI: 10.1016/j.sasc.2022.200045

The first step was to collect a large dataset of both removed and non-removed apps: “We collected the metadata, including descriptions, that developers provide to the store, from about two million apps. We then downloaded the source code of half of these apps.”

Mohsen and his colleagues then tracked the status of these apps in the store for six months to see which ones were removed. “In our sample, this was the case for 56% of them.” It took them 26 months to complete the dataset used to create the machine learning models.
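The labeling step described above can be illustrated with a small sketch. The article does not describe the authors' tooling, so everything here is an assumption made for illustration: the app identifiers, the check dates, and the rule that an app counts as removed if it was no longer listed at the most recent check.

```python
from datetime import date

# Hypothetical observation log: app id -> list of (check date, still listed?).
# None of these apps or dates come from the study.
observations = {
    "com.example.notes": [(date(2021, 1, 1), True), (date(2021, 7, 1), True)],
    "com.example.flash": [(date(2021, 1, 1), True), (date(2021, 7, 1), False)],
    "com.example.scan": [(date(2021, 1, 1), True), (date(2021, 7, 1), False)],
}


def label_removed(checks):
    """Return 1 if the app was unlisted at the most recent check, else 0."""
    latest = max(checks, key=lambda check: check[0])
    return 0 if latest[1] else 1


# One binary label per app, plus the share of apps that ended up removed.
labels = {app: label_removed(checks) for app, checks in observations.items()}
removed_share = sum(labels.values()) / len(labels)
print(labels, removed_share)
```

A binary label of this kind is what a classifier would later be trained to predict; the share of removed apps is the figure the researchers report as 56% for their own sample.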

The algorithm they used is called Extreme Gradient Boosting. “This is the best machine learning algorithm for these types of problems,” explains Mohsen. The algorithm was used to generate two predictive models: one for developers and one for users. The model for users is based on 47 features, and on a test dataset it predicted the removal of a given app with 79.2% accuracy. As some of these features, such as ratings in the app store, are not available before an app is submitted to the store, the developer model is based on only 37 features, and as a result its accuracy is slightly lower: 76.2%.
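To make the idea of two models with different feature sets more concrete, here is a minimal sketch. It is not the authors' code and does not use the XGBoost library: it is a toy pure-Python gradient boosting routine on decision stumps, standing in for Extreme Gradient Boosting. The three features (permission count, description length, store rating) and all data values are invented for illustration, and accuracy is measured on the training rows rather than on a held-out test set as in the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(rows, residuals):
    """Pick the one-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [e for r, e in zip(rows, residuals) if r[j] <= t]
            right = [e for r, e in zip(rows, residuals) if r[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((e - lm) ** 2 for e in left) + sum((e - rm) ** 2 for e in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return None if best is None else best[1:]


def train(rows, labels, rounds=20, lr=0.5):
    """Boost stumps on the log-loss gradient (label minus predicted probability)."""
    p = min(max(sum(labels) / len(labels), 1e-6), 1 - 1e-6)
    base = math.log(p / (1 - p))
    scores = [base] * len(rows)
    stumps = []
    for _ in range(rounds):
        residuals = [y - sigmoid(s) for y, s in zip(labels, scores)]
        stump = fit_stump(rows, residuals)
        if stump is None:
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        scores = [s + lr * (lm if r[j] <= t else rm) for s, r in zip(scores, rows)]
    return base, lr, stumps


def predict(model, row):
    base, lr, stumps = model
    score = base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)
    return 1 if sigmoid(score) >= 0.5 else 0


def accuracy(model, rows, labels):
    return sum(predict(model, r) == y for r, y in zip(rows, labels)) / len(rows)


# Invented rows: (permission count, description length, store rating); 1 = removed.
user_rows = [(12, 40, 2.1), (15, 30, 1.8), (10, 55, 2.5), (4, 180, 1.5),
             (3, 200, 4.5), (4, 180, 4.1), (2, 220, 4.7), (5, 150, 3.9)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# The developer model only sees features known before submission (no rating).
dev_rows = [row[:2] for row in user_rows]

user_acc = accuracy(train(user_rows, labels), user_rows, labels)
dev_acc = accuracy(train(dev_rows, labels), dev_rows, labels)
print(user_acc, dev_acc)
```

In this toy data, two apps look identical before submission and differ only in their store rating, so the developer model cannot tell them apart. That mirrors, in miniature, why a model without post-submission features can be expected to score somewhat lower.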

“We can now predict the future of an app with reasonable accuracy,” says Mohsen. The next step is to develop an interface through which developers and users can assess apps for their risk of removal. “This is valuable for developers, as they can be banned from the Google Play Store if they repeatedly violate the guidelines,” says Mohsen, “but also for users: when they generate data using an app, they will lose it if the app is suddenly removed.”

Other researchers will also benefit from this study. “The rich dataset we created for our paper has been made publicly available through the Dutch repository Dataverse.nl,” says Mohsen. This means that anyone can try to improve on the results obtained by Mohsen and his colleagues. “We’re looking forward to the competition, to find out whether others can beat us. That will bring further benefits to users and developers.”




More information:
Fadi Mohsen et al., Early Detection of Violations of Mobile Apps: A Data-Driven Predictive Modeling Approach, Systems and Soft Computing (2022). DOI: 10.1016/j.sasc.2022.200045

Data set: dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/H0YJFT

Citation: Machine learning model identifies apps that are likely to violate Google Play store guidelines (2022, October 13) retrieved October 16, 2022 from https://techxplore.com/news/2022-10-machine-apps-violate-google-guidelines.html

