A Deep Dive Inside DREBIN: An Explorative Analysis beyond Android Malware Detection Scores

Kevin Allix Jacques Klein N. Daoudi + 1 author

Abstract

Machine learning advances have been extensively explored for implementing large-scale malware detection. When reported in the literature, performance evaluation of machine learning based detectors generally focuses on highlighting the ratio of samples that are correctly or incorrectly classified, overlooking essential questions on why/how the learned models can be demonstrated as reliable. In the Android ecosystem, several recent studies have highlighted how evaluation setups can carry biases related to datasets or evaluation methodologies. Nevertheless, there is little work attempting to dissect the produced model to provide some understanding of its intrinsic characteristics. In this work, we fill this gap by performing a comprehensive analysis of a state-of-the-art Android malware detector, namely DREBIN, which constitutes today a key reference in the literature. Our study mainly targets an in-depth understanding of the classifier characteristics in terms of (1) which features actually matter among the hundreds of thousands that DREBIN extracts, (2) whether the high scores of the classifier are dependent on the dataset age, and (3) whether DREBIN’s explanations are consistent within malware families, among others. Overall, our tentative analysis provides insights into the discriminatory power of the feature set used by DREBIN to detect malware. We expect our findings to bring about a systematisation of knowledge for the community.

Related Scientific Articles

An Analysis of Machine Learning-Based Android Malware Detection Approaches

S. Karpagam R. Kavitha R. Srinivasan + 1 other

1 August 2022

Despite the fact that Android apps are rapidly expanding throughout the mobile ecosystem, Android malware continues to emerge. Malware operations are on the rise, particularly on Android phones, which make up 72.2 percent of all smartphone sales. Credential theft, eavesdropping, and malicious advertising are just some of the ways hackers attack cell phones. Many researchers have looked into Android malware detection from various perspectives and presented hypotheses and methodologies. Machine learning (ML)-based techniques have proven effective in identifying these attacks because they can build a classifier from a set of training cases, eliminating the need for explicit signature definition in malware detection. This paper provides a detailed examination of machine-learning-based Android malware detection approaches. According to present research, machine learning and genetic algorithms are a powerful and promising solution for identifying Android malware. In this brief study of Android apps, we go through the Android system architecture, security mechanisms, and malware categorization.

Android malware analysis in a nutshell

W. El-shafai Mohanned Ahmed Iman M. Almomani

5 July 2022

This paper offers a comprehensive analysis model for Android malware. The model presents the essential factors affecting the results of vision-based Android malware analysis. Current Android malware analysis and solutions might consider one or some of these factors while building their malware predictive systems. However, this paper comprehensively highlights these factors and their impacts through a deep empirical study. The study comprises 22 CNN (Convolutional Neural Network) algorithms: 21 well-known ones and one proposed algorithm. Additionally, several types of files are considered before converting them to images, and two benchmark Android malware datasets are utilized. Finally, comprehensive evaluation metrics are measured to assess the produced predictive models from the security and complexity perspectives, thereby guiding researchers and developers to plan and build efficient malware analysis systems that meet their requirements and resources. The results reveal that some factors might significantly impact the performance of the malware analysis solution. For example, from a security perspective, the accuracy, F1-score, precision, and recall are improved by 131.29%, 236.44%, 192%, and 131.29%, respectively, when changing one factor and fixing all other factors under study. Similar results are observed in the case of complexity assessment, including testing time, CPU usage, storage size, and pre-processing speed, proving the importance of the proposed Android malware analysis model.

Can We Trust Your Explanations? Sanity Checks for Interpreters in Android Malware Analysis

Ting Liu Yang Liu Xiaofei Xie + 3 others

13 August 2020

With the rapid growth of Android malware, many machine learning-based malware analysis approaches have been proposed to mitigate this severe phenomenon. However, such classifiers are opaque and non-intuitive, and analysts find it difficult to understand the reasons behind their decisions. For this reason, a variety of explanation approaches have been proposed to interpret predictions by identifying important features. Unfortunately, the explanation results obtained in the malware analysis domain generally do not reach a consensus, which leaves analysts unsure whether they can trust such results. In this work, we propose principled guidelines to assess the quality of five explanation approaches by designing three critical quantitative metrics to measure their stability, robustness, and effectiveness. Furthermore, we collect five widely-used malware datasets and apply the explanation approaches on them in two tasks, including malware detection and familial identification. Based on the generated explanation results, we conduct a sanity check of such explanation approaches in terms of the three metrics. The results demonstrate that our metrics can assess the explanation approaches and help us obtain the knowledge of most typical malicious behaviors for malware analysis.

Android malware analysis and detection: A systematic review

Sukhdip Singh Gulshan Shrivastava Anuradha Dahiya

25 October 2023

Android malware has emerged as a significant threat, which includes exposure of confidential information, misrepresentation of facts and execution of applications without the knowledge of the users. Malware analysis plays an essential role in dealing with the unlawful behaviour of such malicious applications. Android malware analysis involves examining and understanding malware behaviour and its characteristics. It also includes potential adversarial impacts on Android devices. This paper presents a quick understanding and a holistic view of malware detection and analysis. The current investigation conducted a systematic literature review (SLR) to recognize the salient shifts in malware detection by examining a range of scholarly journals and conference papers. The SLR investigated 99 articles published between the years 2018 and 2023. The key observation of this SLR is that static analysis is the most implemented approach for detecting Android malware; Apktool and Androguard are the most frequently used tools. This study also conceded that deep learning and machine learning models have more potential to analyse the malicious behaviour of malware. Certain challenges are faced in Android malware analysis, namely obfuscation techniques, dynamic code loading, and issues related to experimented datasets. Further, this study focuses on the following areas: the definition of the sample set, data optimisation and processing, feature extraction, machine learning application, and classifier validation. This investigation differs from previous analyses of Android malware detection by emphasizing additional methods based on machine learning.

Android Malware Category and Family Classification Using Static Analysis

Cong-Danh Nguyen Nghi Hoang Khoa Cam Nguyen Tan + 1 other

11 January 2023

In recent years, Android malware has grown rapidly, challenging malware analysts. Consequently, there has been a lot of research on detecting and classifying Android malware based on machine learning. Classifying malware into families is an essential goal of Android malware classification. This paper proposes the application of machine learning and deep learning methods in classifying malware families and categories based on many different datasets to evaluate and select suitable methods for each dataset. This work demonstrates that, with the Drebin and CICMaldroid2020 datasets classified by family and category, respectively, machine learning models trained and evaluated after feature extraction and selection achieve high accuracy and a low false positive rate. We also compare our results with several previous studies to highlight our findings.
