A Hypergraph-Based Android Malware Detection System and Method

By constructing a hybrid association structure for APKs and APIs using a hypergraph neural network, the problem of insufficient mining of high-order associations in existing technologies is solved, achieving more accurate Android malware detection and more efficient computation, and supporting the fusion of multimodal data.

CN117150489B · Active · Publication Date: 2026-06-30 · UNIV OF ELECTRONICS SCI & TECH OF CHINA

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA
Filing Date: 2023-08-28
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

Existing technologies fail to fully exploit the high-order relationships between APKs and APIs when detecting Android malware, resulting in information loss and excessive computational overhead. Furthermore, graph neural network models lack sufficient granularity in modeling these relationships, making it impossible to accurately describe multi-dimensional relationships.

Method used

APKs and APIs are modeled as a hybrid association structure: a hypergraph and, via the clique expansion algorithm, a corresponding simple graph. A hypergraph convolutional network extracts high-order features from the hypergraph, and a graph convolutional network extracts low-order features from the simple graph. An attention mechanism then fuses the two sets of features to classify the APK nodes.

Benefits of technology

It can more accurately describe the multi-dimensional relationships between APKs, reduce computational overhead, improve detection accuracy and efficiency, and support the fusion and expansion of multimodal data.

✦ Generated by Eureka AI based on patent content.

Figure CN117150489B_ABST

Abstract

This invention belongs to the field of malicious code detection and provides an Android malware detection system and method based on a hypergraph. It primarily addresses the problems of existing heterogeneous graph-based Android malware detection methods, such as coarse granularity, high computational overhead, and neglect of high-order relationships between applications during application relationship modeling. The main scheme includes: statically analyzing Android APK files to obtain their API call information; constructing a hypergraph describing the relationships between APKs based on their API call relationships; obtaining a simple graph corresponding to the hypergraph through clique expansion; extracting high-order and low-order relationship features between APKs through hypergraph convolution and graph convolution based on the obtained hypergraph, simple graph, and initial node features, and fusing them through an attention mechanism; training and learning the detection model on the fused features using node classification tasks; and using the trained detection model to detect and identify Android malware.

Description

Technical Field

[0001] This invention belongs to the field of malicious code analysis. It is a method for analyzing and classifying Android software based on static analysis, a hybrid association structure, and a hypergraph convolutional neural network. The object of analysis is Android software, and the method automatically outputs whether the software is malicious.

Background Technology

[0002] With the increasing prevalence of mobile devices such as tablets and smartphones, attacks targeting mobile devices are on the rise. Mobile malware is one of the most dangerous threats, causing various security incidents and economic losses. This poses a serious threat to users, society, and the nation. Therefore, research on Android malware and its detection technologies has significant practical implications.

[0003] In recent years, with the widespread application of deep learning technology, the use of deep learning to detect Android malware has emerged, significantly improving the effectiveness of malware detection. Researchers have proposed various detection algorithms and solutions using different deep neural network models, among which graph neural networks have demonstrated good performance. A hypergraph is a graph structure that uses hyperedges to connect two or more vertices, enabling it to model higher-order data correlations. These characteristics give hypergraphs a stronger ability to characterize and mine nonlinear higher-order relationships between data samples compared to general graph structures, allowing for more accurate modeling of multi-dimensional relationships and richer semantic learning. Furthermore, hypergraph structures are more flexible in handling multimodal and heterogeneous data, facilitating multimodal fusion and expansion.

[0004] In the paper "HinDroid: An Intelligent Android Malware Detection System Based on Structured Heterogeneous Information Network," Hou et al. represent Android applications and APIs as nodes, with relationships between nodes represented by edges, forming a structured heterogeneous network. They then use a meta-path-based approach to describe the semantic relevance of applications and APIs, and employ multi-kernel learning to aggregate different similarities. This method has the following problems:

[0005] (1) It analyzes only the most direct relationship between APKs based on API calls, without fully exploring the higher-order relationships between APKs and APIs. This ignores the multi-dimensional relationships between APKs that arise from API calls. For example, the type and classification information of the various applications that use the same API (such as APIi) cannot be obtained from the simple graph, because the low-order relationships reflected in a simple graph are all binary relationships between pairs of APKs.

[0006] (2) It uses a general graph structure to represent the direct call relationship between APKs and APIs, without distinguishing between the different APIs called by an APK. As a result, some original information is lost when the relationship between APKs and APIs is established. For example, APK1 and APK2 both call API1, which is related to network status; APK3 and APK4 both call API2, which is related to file operations. The final simple graph shows only that APK1 and APK2 share an API and that APK3 and APK4 share an API, without showing which API is involved.
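
The information loss described in (2) can be illustrated with a minimal Python sketch. It only restates the example above and is not taken from either cited system; the dictionary layout and variable names are assumptions made for illustration.

```python
# Which APKs call which API, following the example in paragraph [0006].
callers = {
    "API1": {"APK1", "APK2"},   # network-status API
    "API2": {"APK3", "APK4"},   # file-operation API
}

# Plain-graph view: only "these two APKs share some API" survives.
pair_edges = {tuple(sorted(members)) for members in callers.values()}

# Hypergraph view: one hyperedge per API, so the API identity is kept.
hyperedges = {api: frozenset(members) for api, members in callers.items()}

print(pair_edges)   # pairs of APKs, with no record of the API behind each pair
print(hyperedges)   # the same groupings, each still keyed by its API
```

In the plain-graph view the two edges are indistinguishable in kind, whereas each hyperedge remains attached to the API that produced it.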

[0007] In the paper "GDroid: Android malware detection and classification with graph convolutional network", Gao et al. constructed a heterogeneous graph containing APK-API edges and API-API edges by mining information on APK calls to APIs and API usage patterns, and then used a graph neural network to train on and classify the information in the heterogeneous graph. Like HinDroid, this method has the following problems:

[0008] (1) The relationship between APKs is constructed based solely on the similarity of functions between APIs, so the associations formed by different APIs are represented at a relatively coarse granularity. For example, the final graph shows how many APIs with the same usage pattern two APKs have in common, but does not distinguish which APIs caused the association.

[0009] (2) Only the direct call relationship between APK and API is considered, but the mining of higher-order relationships between them is insufficient and the mining of multi-dimensional relationships between APKs is ignored. For example, the type and classification information of various applications using the same API (such as APIi) cannot be obtained from the simple graph. The lower-order relationships reflected in the simple graph are all binary, reflecting the relationship between APK and APK.

[0010] (3) This method involves multiple large-scale matrix multiplications when calculating the relationships between APKs, which incurs a large time and computational overhead.

Summary of the Invention

[0011] To address the aforementioned issues, this invention aims to propose an Android malware detection method based on a hypergraph neural network. This method makes a preliminary exploration of malware detection using hypergraphs. It offers finer-grained representation of the association structures formed by different APIs, with each API having a corresponding separate hyperedge representation. Furthermore, it integrates high-order and low-order associations for richer semantic learning. This method utilizes hypergraphs to more accurately describe the relationships between APKs associated with multiple different APIs (multivariate relationships between APKs), preventing losses caused by forcibly converting multivariate relationships into binary relationships. Simultaneously, it simplifies the association modeling process, avoiding multiple large-scale matrix multiplication calculations.

[0012] To achieve the above objectives, the present invention adopts the following technical solution:

[0013] This invention provides an Android malware detection system based on a hypergraph, comprising the following modules:

[0014] Program analysis module: Used to perform static analysis on Android APK files and extract their API call information;

[0015] Hypergraph and Simple Graph Construction Module: Used to represent APKs and extracted APIs using a hypergraph structure, mapping APKs to nodes and APIs to hyperedges. All APKs that call a given API are connected by the hyperedge representing that API. The module also uses clique expansion to transform the constructed hypergraph into a simple graph.

[0016] Feature extraction module: Used for spatial domain-based learning in the hybrid association structure detection model using hypergraph convolutional neural network, that is, extracting high-order and low-order features of hypergraph and simple graph from the node-to-node neighbor information aggregation method, and then fusing these two parts of features as features of APK node;

[0017] The classification module is used to train and learn the node classification task based on the mixed association relationships between APKs and APIs using a hypergraph convolutional neural network. Specifically, it trains and learns the fused features obtained from the feature extraction module, which combine high-order and low-order associations, and then uses classifiers such as MLPs for final detection and classification. The trained intelligent classifier is then used to detect and identify Android malware, outputting the classification results.
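
As a rough illustration of the classification module, the sketch below runs a two-layer MLP over fused node features and computes a cross-entropy loss on the training nodes only. The patent names MLPs only as one possible classifier; the layer sizes, the label encoding (0 = benign, 1 = malicious), the random stand-in features, and all function names here are assumptions.

```python
import numpy as np

def mlp_logits(Z, W1, b1, W2, b2):
    """Two-layer MLP: ReLU hidden layer followed by a linear output layer."""
    return np.maximum(Z @ W1 + b1, 0.0) @ W2 + b2

def masked_cross_entropy(logits, labels, mask):
    """Mean cross-entropy over the nodes selected by the boolean mask."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = log_probs[np.arange(len(labels)), labels]
    return -picked[mask].mean()

rng = np.random.default_rng(0)
Z = rng.normal(size=(6, 8))             # stand-in for fused APK node features
labels = np.array([0, 1, 0, 1, 1, 0])   # assumed encoding: 0 benign, 1 malicious
train_mask = np.array([True, True, True, True, False, False])

W1, b1 = rng.normal(size=(8, 16)) * 0.1, np.zeros(16)
W2, b2 = rng.normal(size=(16, 2)) * 0.1, np.zeros(2)

logits = mlp_logits(Z, W1, b1, W2, b2)
loss = masked_cross_entropy(logits, labels, train_mask)   # training objective
pred = logits.argmax(axis=1)                              # predicted class per APK node
```

Because all APK nodes live in one graph, the loss is restricted to training nodes with a mask, while predictions are produced for every node.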

[0018] In the above technical solution, the specific steps for constructing the hypergraph and simple graph are as follows:

[0019] Step a1: Hypergraph construction. Android APK files are mapped to nodes in the hypergraph, and the API call information obtained from the program analysis module is mapped to hyperedges. APKs that call the same API are connected by the hyperedge for that API. For example, when APK1 and APK2 both call API1, and APK2, APK3, and APK4 all call API2, the constructed hypergraph has node set N = {APK1, APK2, APK3, APK4} and hyperedge set HE = {he1 = (APK1, APK2), he2 = (APK2, APK3, APK4)}. Different hyperedges distinguish the associations between APKs caused by different API calls.

[0020] Step a2: Simple graph transformation. The hypergraph obtained in step a1 is transformed into a simple graph using the clique expansion algorithm, representing the low-order associations between APKs.
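
Steps a1 and a2 can be sketched in plain Python as follows. The input format (a dictionary from APK name to the set of APIs it calls) and the function names are illustrative assumptions; the data reproduces the APK1 to APK4 example from step a1.

```python
from itertools import combinations

def build_hypergraph(api_calls):
    """Map APKs to nodes and each API to one hyperedge over its callers.

    api_calls: dict mapping APK name -> set of API names it calls
    (a hypothetical input format, not prescribed by the patent).
    """
    nodes = sorted(api_calls)
    hyperedges = {}
    for apk in nodes:
        for api in api_calls[apk]:
            hyperedges.setdefault(api, set()).add(apk)
    return nodes, hyperedges

def clique_expansion(hyperedges):
    """Replace every hyperedge by pairwise edges among all of its members."""
    edges = set()
    for members in hyperedges.values():
        for u, v in combinations(sorted(members), 2):
            edges.add((u, v))
    return edges

api_calls = {
    "APK1": {"API1"},
    "APK2": {"API1", "API2"},
    "APK3": {"API2"},
    "APK4": {"API2"},
}
nodes, hyperedges = build_hypergraph(api_calls)
simple_edges = clique_expansion(hyperedges)
```

Here the hyperedge for API2 joins three APKs at once, while its clique expansion yields the three pairwise edges among them; the simple graph no longer records that those edges came from a single API.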

[0021] In the above technical solution, the feature extraction module is implemented through the following specific steps:

[0022] Step b1: Initialize APK node features. Encode the API call information related to the APK using one-hot encoding as the initial features of the APK node. Expand the node features to other features that can represent the information of the APK node itself. Specifically, this involves feature encoding of more program analysis results, embedding program code using natural language processing, or embedding the application's function call graph or control flow graph using graph embedding.

[0023] Step b2: Low-order association feature extraction. For the simple graph obtained in the graph construction module and the initial node features of APK in step b1 above, a graph convolutional neural network is used to process them to obtain low-order association features.

[0024] Step b3: Extraction of higher-order association features. For the hypergraph obtained in the graph construction module and the initial features of the APK nodes in step b1 above, use a hypergraph convolutional neural network to process them to obtain higher-order association features.

[0025] Step b4: Feature fusion. The low-order and high-order association features obtained in steps b2 and b3 above are fused using an attention mechanism to obtain the final features for node classification.
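
The patent does not specify the form of the attention mechanism used for fusion. One common choice, shown here purely as an assumption, scores each of the two feature views per node with a small learnable projection and combines them with softmax weights. The parameters W, b, and q and the random inputs are hypothetical stand-ins.

```python
import numpy as np

def attention_fuse(z_low, z_high, W, b, q):
    """Fuse two (n_nodes, d) feature matrices with per-node attention weights."""
    views = np.stack([z_low, z_high], axis=1)         # (n, 2, d)
    scores = np.tanh(views @ W + b) @ q               # (n, 2): one score per view
    scores = scores - scores.max(axis=1, keepdims=True)
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # softmax over the two views
    fused = (alpha[..., None] * views).sum(axis=1)    # (n, d) weighted sum
    return fused, alpha

rng = np.random.default_rng(0)
n, d, h = 4, 8, 16
z_low = rng.normal(size=(n, d))    # stand-in for low-order (simple graph) features
z_high = rng.normal(size=(n, d))   # stand-in for high-order (hypergraph) features
W, b, q = rng.normal(size=(d, h)), np.zeros(h), rng.normal(size=h)

fused, alpha = attention_fuse(z_low, z_high, W, b, q)
```

Each row of alpha holds the two weights assigned to the low-order and high-order views of one APK node; they are positive and sum to 1, so the fused feature is a convex combination of the two views.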

[0026] This invention also provides a method for detecting Android malware based on a hypergraph. The method involves statically analyzing APK files to extract their API call information; mapping the call relationships between APKs and APIs to nodes and hyperedges in a hypergraph to construct the hypergraph; transforming the hypergraph into a simple graph through clique expansion; obtaining high-order and low-order association features of nodes by spatial domain-based learning on the hypergraph and simple graph; obtaining APK node features by fusing the high-order and low-order association features of nodes; and learning the node features using a hypergraph convolutional neural network.

[0027] The above technical solution also includes the following steps:

[0028] S1: Perform static analysis on the Android APK file to obtain API call information;

[0029] S2: Represent the APKs and the extracted APIs using a hypergraph structure, mapping APKs to nodes and APIs to hyperedges. All APKs that call a given API are connected by the hyperedge representing that API. The clique expansion algorithm is then used to transform the constructed hypergraph into a simple graph.

[0030] S3: Perform spatial-domain learning with the hybrid association structure detection model based on hypergraph convolutional neural networks; that is, extract the high-order features of the hypergraph and the low-order features of the simple graph by aggregating information from neighboring nodes, and then fuse the two sets of features as the features of the APK nodes.

[0031] S4: Train the hypergraph convolutional neural network on the node classification task over the hybrid association relationships between APKs and APIs. Specifically, the fused features obtained in step S3, which combine high-order and low-order associations, are learned, and a classifier such as an MLP performs the final detection and classification. The trained classifier is then used to detect and identify Android malware and to output the classification results.

[0032] In the above technical solution, step S2 includes the following steps:

[0033] Step a1: Hypergraph construction. Android APK files are mapped to nodes in the hypergraph, and the API call information obtained from the program analysis module is mapped to hyperedges. APKs that call the same API are connected by the hyperedge for that API. For example, when APK1 and APK2 both call API1, and APK2, APK3, and APK4 all call API2, the constructed hypergraph has node set N = {APK1, APK2, APK3, APK4} and hyperedge set HE = {he1 = (APK1, APK2), he2 = (APK2, APK3, APK4)}. Different hyperedges distinguish the associations between APKs caused by different API calls.

[0034] Step a2: Simple graph transformation. The hypergraph obtained in step a1 is transformed into a simple graph using the clique expansion algorithm, representing the low-order associations between APKs.

[0035] In the above technical solution, step S3 includes the following steps:

[0036] Step b1: Initialize APK node features. Encode the API call information related to the APK using one-hot encoding as the initial features of the APK node. Expand the node features to other features that can represent the information of the APK node itself. Specifically, this involves feature encoding of more program analysis results, embedding program code using natural language processing, or embedding the application's function call graph or control flow graph using graph embedding.

[0037] Step b2: Low-order association feature extraction. For the simple graph obtained in the graph construction module and the initial node features of APK in step b1 above, a graph convolutional neural network is used to process them to obtain low-order association features.

[0038] Step b3: Extraction of higher-order association features. For the hypergraph obtained in the graph construction module and the initial features of the APK nodes in step b1 above, use a hypergraph convolutional neural network to process them to obtain higher-order association features.

[0039] Step b4: Feature fusion. The low-order and high-order association features obtained in steps b2 and b3 above are fused using an attention mechanism to obtain the final features for node classification.

[0040] Compared with the prior art, the beneficial effects of this invention are as follows:

[0041] First, this invention is the first method to use hypergraphs for malware detection. It can strongly characterize and mine the non-linear high-order relationships between APKs. Using hypergraphs also provides finer granularity in representing the relationships formed by different APIs, thereby describing the relationships between APKs associated with different APIs more accurately and preventing the loss of original information that occurs when multi-dimensional relationships are converted into binary relationships.

[0042] Second, to obtain the low-order associations, this invention uses clique expansion to transform the hypergraph into a simple graph, thus avoiding the computational and time overhead caused by large-scale matrix multiplication operations.

[0043] Third, this invention uses a hybrid association structure detection model based on hypergraph convolutional neural networks to learn Android APKs and their features, integrating high-order and low-order association relationships, enabling richer semantic learning.

[0044] Fourth, the hypergraph utilized in this invention can achieve multimodal and heterogeneous data representation through its flexible hyperedge expansion, facilitating multimodal fusion and expansion. This allows researchers to subsequently add features such as permissions, interfaces, and Android dynamic link libraries to enable richer semantic learning in the model, thereby improving the accuracy of model detection.

Attached Figure Description

[0045] Figure 1 is a schematic diagram of the overall workflow of the present invention.

Detailed Implementation

[0046] To make the objectives, technical solutions, and advantages of this invention clearer, examples are provided below for illustration.

[0047] This invention provides an Android malware detection system based on a hypergraph, comprising the following modules:

[0048] Program analysis module: Used to perform static analysis on Android APK files and extract their API call information;

[0049] Hypergraph and Simple Graph Construction Module: Used to represent APKs and extracted APIs using a hypergraph structure, mapping APKs to nodes and APIs to hyperedges. All APKs that call a given API are connected by the hyperedge representing that API. The module also uses clique expansion to transform the constructed hypergraph into a simple graph.

[0050] Feature extraction module: Used for spatial domain-based learning in the hybrid association structure detection model using hypergraph convolutional neural network, that is, extracting high-order and low-order features of hypergraph and simple graph from the node-to-node neighbor information aggregation method, and then fusing these two parts of features as features of APK node;

[0051] The classification module is used to train and learn the node classification task based on the mixed association relationships between APKs and APIs using a hypergraph convolutional neural network. Specifically, it trains and learns the fused features obtained from the feature extraction module, which combine high-order and low-order associations, and then uses classifiers such as MLPs for final detection and classification. The trained intelligent classifier is then used to detect and identify Android malware, outputting the classification results.

[0052] In the above technical solution, the specific steps for constructing the hypergraph and simple graph are as follows:

[0053] Step a1: Hypergraph construction. Android APK files are mapped to nodes in the hypergraph, and the API call information obtained from the program analysis module is mapped to hyperedges. APKs that call the same API are connected by the hyperedge for that API. For example, when APK1 and APK2 both call API1, and APK2, APK3, and APK4 all call API2, the constructed hypergraph has node set N = {APK1, APK2, APK3, APK4} and hyperedge set HE = {he1 = (APK1, APK2), he2 = (APK2, APK3, APK4)}. Different hyperedges distinguish the associations between APKs caused by different API calls.

[0054] Step a2: Simple graph transformation. The hypergraph obtained in step a1 is transformed into a simple graph using the clique expansion algorithm, representing the low-order associations between APKs.

[0055] In the above technical solution, the feature extraction module is implemented through the following specific steps:

[0056] Step b1: Initialize APK node features. Encode the API call information related to the APK using one-hot encoding as the initial features of the APK node. Expand the node features to other features that can represent the information of the APK node itself. Specifically, this involves feature encoding of more program analysis results, embedding program code using natural language processing, or embedding the application's function call graph or control flow graph using graph embedding.

[0057] Step b2: Low-order association feature extraction. For the simple graph obtained in the graph construction module and the initial node features of APK in step b1 above, a graph convolutional neural network is used to process them to obtain low-order association features.

[0058] Step b3: Extraction of higher-order association features. For the hypergraph obtained in the graph construction module and the initial features of the APK nodes in step b1 above, use a hypergraph convolutional neural network to process them to obtain higher-order association features.

[0059] Step b4: Feature fusion. The low-order and high-order association features obtained in steps b2 and b3 above are fused using an attention mechanism to obtain the final features for node classification.

[0060] This invention also provides a method for detecting Android malware based on a hypergraph. The method involves statically analyzing APK files to extract their API call information; mapping the call relationships between APKs and APIs to nodes and hyperedges in a hypergraph to construct the hypergraph; transforming the hypergraph into a simple graph through clique expansion; obtaining high-order and low-order association features of nodes by spatial domain-based learning on the hypergraph and simple graph; obtaining APK node features by fusing the high-order and low-order association features of nodes; and learning the node features using a hypergraph convolutional neural network.

[0061] The above technical solution also includes the following steps:

[0062] S1: Perform static analysis on the Android APK file to obtain API call information;

[0063] S2: Represent the APKs and the extracted APIs using a hypergraph structure, mapping APKs to nodes and APIs to hyperedges. All APKs that call a given API are connected by the hyperedge representing that API. The clique expansion algorithm is then used to transform the constructed hypergraph into a simple graph.

[0064] S3: Perform spatial-domain learning with the hybrid association structure detection model based on hypergraph convolutional neural networks; that is, extract the high-order features of the hypergraph and the low-order features of the simple graph by aggregating information from neighboring nodes, and then fuse the two sets of features as the features of the APK nodes.

[0065] S4: Train the hypergraph convolutional neural network on the node classification task over the hybrid association relationships between APKs and APIs. Specifically, the fused features obtained in step S3, which combine high-order and low-order associations, are learned, and a classifier such as an MLP performs the final detection and classification. The trained classifier is then used to detect and identify Android malware and to output the classification results.

[0066] In the above technical solution, step S2 includes the following steps:

[0067] Step a1: Hypergraph construction. Android APK files are mapped to nodes in the hypergraph, and the API call information obtained from the program analysis module is mapped to hyperedges. APKs that call the same API are connected by the hyperedge for that API. For example, when APK1 and APK2 both call API1, and APK2, APK3, and APK4 all call API2, the constructed hypergraph has node set N = {APK1, APK2, APK3, APK4} and hyperedge set HE = {he1 = (APK1, APK2), he2 = (APK2, APK3, APK4)}. Different hyperedges distinguish the associations between APKs caused by different API calls.

[0068] Step a2: Simple graph transformation. The hypergraph obtained in step a1 is transformed into a simple graph using the clique expansion algorithm, representing the low-order associations between APKs.

[0069] In the above technical solution, step S3 includes the following steps:

[0070] Step b1: Initialize APK node features. Encode the API call information related to the APK using one-hot encoding as the initial features of the APK node. Expand the node features to other features that can represent the information of the APK node itself. Specifically, this involves feature encoding of more program analysis results, embedding program code using natural language processing, or embedding the application's function call graph or control flow graph using graph embedding.

[0071] Step b2: Low-order association feature extraction. For the simple graph obtained in the graph construction module and the initial node features of APK in step b1 above, a graph convolutional neural network is used to process them to obtain low-order association features.

[0072] Step b3: Extraction of higher-order association features. For the hypergraph obtained in the graph construction module and the initial features of the APK nodes in step b1 above, use a hypergraph convolutional neural network to process them to obtain higher-order association features.

[0073] Step b4: Feature fusion. The low-order and high-order association features obtained in steps b2 and b3 above are fused using an attention mechanism to obtain the final features for node classification.

[0074] Example

[0075] Step 1: Construct an experimental dataset. Taking CICMalDroid2020 as an example, randomly select 4000 Benign samples, 1000 Adware samples, 1000 Banking samples, 1000 Riskware samples, and 1000 SMS samples to construct an experimental dataset with a 1:1 ratio of benign and malicious samples.

[0076] Step 2: Decompile the Android APK samples in the dataset to obtain their valid classes.dex files; analyze the classes.dex files using Androguard to extract their API call information.

[0077] Step 3: Based on the API call information obtained in Step 2, encode it using one-hot encoding as the initial feature of the corresponding APK node. If the APK calls the API, the corresponding position is 1; otherwise, it is 0.
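
The encoding in Step 3 can be sketched as below. The input dictionary, the placeholder API names, and the function name are assumptions for illustration; in practice the API sets would come from the Androguard analysis in Step 2.

```python
def encode_api_features(api_calls):
    """Return (vocab, features): one 0/1 vector per APK over all observed APIs.

    api_calls: dict mapping APK name -> set of API names it calls
    (a hypothetical format for the output of the static analysis step).
    """
    vocab = sorted(set().union(*api_calls.values()))
    index = {api: j for j, api in enumerate(vocab)}
    features = {}
    for apk, apis in api_calls.items():
        row = [0] * len(vocab)
        for api in apis:
            row[index[api]] = 1   # position is 1 if the APK calls this API
        features[apk] = row
    return vocab, features

api_calls = {
    "APK1": {"API1"},
    "APK2": {"API1", "API2"},
    "APK3": {"API2"},
}
vocab, features = encode_api_features(api_calls)
```

Under this encoding, stacking the rows gives a matrix with one row per APK and one column per API, which has the same layout as the hypergraph incidence matrix when every API is given its own hyperedge.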

[0078] Step 4: Hypergraph Construction. Further analyze the API call information of the APK and construct a hypergraph based on this. Specifically, map the APK to the node in the hypergraph and the API to the hyperedge in the hypergraph. Each hyperedge represents a different API. If multiple APK nodes call the same API, connect the multiple APK nodes with the hyperedge corresponding to that API.

[0079] Step 5: Transform the hypergraph constructed in Step 4 into a simple graph using the clique expansion algorithm to represent the low-order associations between applications.

[0080] Step 6: Using the hypergraph obtained in Step 4 and the simple graph obtained in Step 5, together with the APK node features initialized in Step 3, extract the high-order and low-order association features between APKs through hypergraph convolution and graph convolution, respectively.
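
Step 6 does not fix the exact convolution operators. The sketch below assumes two widely used normalized forms: a hypergraph convolution layer computing relu(Dv^-1/2 H De^-1 H^T Dv^-1/2 X Θ) with unit hyperedge weights, and a graph convolution layer computing relu(D^-1/2 (A + I) D^-1/2 X Θ). The toy incidence matrix, the shared random weights, and the function names are illustrative assumptions.

```python
import numpy as np

def inv_power(deg, power):
    """Elementwise deg**power, using 0 where deg == 0 (isolated nodes, empty edges)."""
    out = np.zeros_like(deg, dtype=float)
    mask = deg > 0
    out[mask] = deg[mask].astype(float) ** power
    return out

def hypergraph_conv(X, H, theta):
    """One hypergraph convolution layer on incidence matrix H (nodes x hyperedges)."""
    dv_s = inv_power(H.sum(axis=1), -0.5)   # node degrees
    de_i = inv_power(H.sum(axis=0), -1.0)   # hyperedge degrees
    G = (dv_s[:, None] * H * de_i[None, :]) @ (H.T * dv_s[None, :])
    return np.maximum(G @ X @ theta, 0.0)

def graph_conv(X, A, theta):
    """One graph convolution layer on adjacency matrix A with added self-loops."""
    A_hat = A + np.eye(A.shape[0])
    d_s = inv_power(A_hat.sum(axis=1), -0.5)
    G = d_s[:, None] * A_hat * d_s[None, :]
    return np.maximum(G @ X @ theta, 0.0)

# Toy data: rows are APK1..APK4, columns are the hyperedges for API1 and API2.
H = np.array([[1, 0], [1, 1], [0, 1], [0, 1]], dtype=float)
A = ((H @ H.T) > 0).astype(float)   # clique expansion: APKs sharing any API
np.fill_diagonal(A, 0.0)
X = H.copy()                        # 0/1 API-call features as initial node features

rng = np.random.default_rng(0)
theta = rng.normal(size=(2, 3))
z_high = hypergraph_conv(X, H, theta)   # high-order association features
z_low = graph_conv(X, A, theta)         # low-order association features
```

The two outputs have the same shape, one row per APK node, which is what allows them to be fused in the next step.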

[0081] Step 7: The high-order and low-order association features obtained in Step 6 are fused using an attention mechanism to obtain the hybrid association features of the nodes.

[0082] Step 8: Model training. The experimental data is randomly split into a training set and a test set at a ratio of 7:3 while keeping the distribution of sample types unchanged. The samples in the training set are used to train the detection model, thereby obtaining a trained malware detection classifier.
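
A 7:3 split that preserves the sample type distribution can be sketched with a per-class shuffle, as below. The class sizes follow Step 1; the function name, the seed, and the use of index lists are assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_ratio=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class at the same ratio."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        cut = round(len(idxs) * train_ratio)
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return train, test

# Class sizes from Step 1: 4000 benign and 4 x 1000 malicious samples.
labels = (["Benign"] * 4000 + ["Adware"] * 1000 + ["Banking"] * 1000
          + ["Riskware"] * 1000 + ["SMS"] * 1000)
train_idx, test_idx = stratified_split(labels)
```

With these sizes the training set holds 5600 samples (2800 benign and 700 of each malicious type) and the test set holds the remaining 2400. In a node classification setting, the index lists would typically be turned into training and test masks over the APK nodes.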

[0083] Step 9: Use the trained malware detection classifier to detect and identify unknown Android malware in the test set.

Claims

1. A hypergraph-based Android malware detection system, characterized in that it includes the following modules: Program analysis module: Used to perform static analysis on Android APK files and extract their API call information; Hypergraph and Simple Graph Construction Module: Used to represent APKs and extracted APIs using a hypergraph structure, mapping APKs to nodes and APIs to hyperedges. All APKs that call a given API are connected by the hyperedge representing that API. The module also uses clique expansion to transform the constructed hypergraph into a simple graph. Feature extraction module: Used for spatial domain-based learning in the hybrid association structure detection model using a hypergraph convolutional neural network, that is, extracting the high-order features of the hypergraph and the low-order features of the simple graph by aggregating information from neighboring nodes, and then fusing these two sets of features as the features of the APK nodes; Classification module: Used to train the hypergraph convolutional neural network on the node classification task over the hybrid association relationships between APKs and APIs. That is, it learns the fused features, which combine high-order and low-order associations, obtained from the feature extraction module, uses the MLP classifier to perform the final detection and classification, uses the trained classifier to detect and identify Android malware, and outputs the classification results. The feature extraction module is implemented through the following steps: Step b1: Initialize APK node features. Encode the API call information related to the APK using one-hot encoding as the initial features of the APK node. Expand the node features to other features that can represent the information of the APK node itself. Specifically, this involves feature encoding of more program analysis results, embedding program code using natural language processing, or embedding the application's function call graph or control flow graph using graph embedding. Step b2: Low-order association feature extraction. For the simple graph obtained in the graph construction module and the initial node features of APK in step b1 above, a graph convolutional neural network is used to process them to obtain low-order association features. Step b3: Extraction of higher-order association features. For the hypergraph obtained in the graph construction module and the initial features of the APK nodes in step b1 above, use a hypergraph convolutional neural network to process them to obtain higher-order association features. Step b4: Feature fusion. The low-order and high-order association features obtained in steps b2 and b3 above are fused using an attention mechanism to obtain the final features for node classification.

2. The Android malware detection system based on a hypergraph according to claim 1, characterized in that the specific steps for constructing the hypergraph and the simple graph are as follows:
Step a1: Hypergraph construction. Android APK files are mapped to nodes in the hypergraph, and the API call information obtained from the program analysis module is mapped to hyperedges in the hypergraph, so that APKs that call the same API are connected by a hyperedge. When APK1 and APK2 both call API1, and APK2, APK3, and APK4 all call API2, the constructed hypergraph is: hypergraph node set N = {APK1, APK2, APK3, APK4}, hypergraph hyperedge set HE = {he1 = (APK1, APK2), he2 = (APK2, APK3, APK4)}. Different hyperedges distinguish the associations between APKs caused by different API calls.
Step a2: Simple graph transformation. The hypergraph obtained in step a1 is transformed into a simple graph using the clique expansion algorithm, representing the low-order associations between APKs.
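The example in steps a1 and a2 can be worked through in a few lines of NumPy. This is a minimal sketch, assuming the static-analysis output is available as a mapping from each APK to the set of APIs it calls; the names `apk_calls`, `build_incidence`, and `clique_expansion` are hypothetical. The incidence matrix stores the hypergraph, and clique expansion joins every pair of APKs that share at least one hyperedge.

```python
import numpy as np

# Hypothetical static-analysis result for the example in the claim.
apk_calls = {
    "APK1": {"API1"},
    "APK2": {"API1", "API2"},
    "APK3": {"API2"},
    "APK4": {"API2"},
}

def build_incidence(apk_calls):
    """Return (apks, apis, H) with H[i, j] = 1 if APK i calls API j."""
    apks = sorted(apk_calls)
    apis = sorted(set().union(*apk_calls.values()))
    H = np.zeros((len(apks), len(apis)), dtype=int)
    for i, apk in enumerate(apks):
        for j, api in enumerate(apis):
            if api in apk_calls[apk]:
                H[i, j] = 1
    return apks, apis, H

def clique_expansion(H):
    """Adjacency matrix linking nodes that share at least one hyperedge."""
    A = (H @ H.T > 0).astype(int)
    np.fill_diagonal(A, 0)                   # no self-loops in the simple graph
    return A

apks, apis, H = build_incidence(apk_calls)
A = clique_expansion(H)

# Recover each hyperedge as the list of APKs in one column of H.
hyperedges = {
    api: [apks[i] for i in np.flatnonzero(H[:, j])]
    for j, api in enumerate(apis)
}
print(hyperedges)  # {'API1': ['APK1', 'APK2'], 'API2': ['APK2', 'APK3', 'APK4']}
print(A)
```

In the resulting simple graph, APK3 and APK4 are linked by an ordinary edge, but the fact that their link comes from the same API2 hyperedge that also contains APK2 is no longer visible. That lost grouping is the high-order information the hypergraph retains.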

3. A method for detecting Android malware based on a hypergraph, characterized in that it comprises the following steps:
S1: Perform static analysis on the Android APK files to obtain API call information.
S2: Represent the APKs and the extracted APIs as a hypergraph, mapping APKs to nodes and APIs to hyperedges, so that all APKs that call a given API are connected by the hyperedge representing that API; also use clique expansion to transform the constructed hypergraph into a simple graph.
S3: Perform spatial-domain learning in the hybrid association structure detection model using a hypergraph convolutional neural network; that is, extract high-order features from the hypergraph and low-order features from the simple graph by aggregating information between neighboring nodes, and then fuse the two sets of features as the features of each APK node.
S4: Train the node classification task over the hybrid association relationship between APKs and APIs using the hypergraph convolutional neural network; that is, train on the fused features, which combine the high-order and low-order associations obtained in step S3, use an MLP classifier to perform the final detection and classification, use the trained classifier to detect and identify Android malware, and output the classification results.
Step S3 includes the following steps:
Step b1: Initialize APK node features. The API call information of each APK is one-hot encoded as the initial features of the APK node. The node features may be extended with other features that represent information about the APK node itself; specifically, feature encodings of further program analysis results, embeddings of program code obtained with natural language processing, or graph embeddings of the application's function call graph or control flow graph.
Step b2: Low-order association feature extraction. The simple graph obtained in step S2 and the initial APK node features from step b1 are processed by a graph convolutional neural network to obtain low-order association features.
Step b3: High-order association feature extraction. The hypergraph obtained in step S2 and the initial APK node features from step b1 are processed by a hypergraph convolutional neural network to obtain high-order association features.
Step b4: Feature fusion. The low-order and high-order association features obtained in steps b2 and b3 are fused using an attention mechanism to obtain the final features for node classification.
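The two ends of this method, the initial node encoding of step b1 and the final MLP classification, can be illustrated with a short NumPy sketch. It assumes the API call information is available as one set of API names per APK and that class 0 means benign and class 1 means malicious. The helper names, the two-layer MLP shape, and the random untrained weights are hypothetical, and the indicator features stand in for the fused features that would normally feed the classifier. Because an APK usually calls several APIs, the "one-hot" encoding is realized here as one indicator column per API.

```python
import numpy as np

def one_hot_api_features(apk_calls, api_vocab):
    """Encode each APK as a 0/1 vector with one column per API in api_vocab."""
    index = {api: j for j, api in enumerate(api_vocab)}
    X = np.zeros((len(apk_calls), len(api_vocab)))
    for i, calls in enumerate(apk_calls):
        for api in calls:
            X[i, index[api]] = 1.0
    return X

def mlp_classify(Z, W1, b1, W2, b2):
    """Two-layer MLP with softmax output; returns per-class probabilities."""
    hidden = np.maximum(Z @ W1 + b1, 0.0)            # ReLU hidden layer
    logits = hidden @ W2 + b2
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    return np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Hypothetical API call sets for four APKs.
apk_calls = [{"API1"}, {"API1", "API2"}, {"API2"}, {"API2"}]
api_vocab = ["API1", "API2"]
X = one_hot_api_features(apk_calls, api_vocab)

# Untrained random parameters, for shape illustration only.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)

probs = mlp_classify(X, W1, b1, W2, b2)   # X stands in for the fused features
labels = probs.argmax(axis=1)             # assumed: 0 = benign, 1 = malicious
```

With trained weights, `labels` would be the per-APK detection result; here the values carry no meaning because the parameters are random.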

4. The method for detecting Android malware based on a hypergraph according to claim 3, characterized in that step S2 includes the following steps:
Step a1: Hypergraph construction. Android APK files are mapped to nodes in the hypergraph, and the API call information obtained in step S1 is mapped to hyperedges in the hypergraph, so that APKs that call the same API are connected by a hyperedge. When APK1 and APK2 both call API1, and APK2, APK3, and APK4 all call API2, the constructed hypergraph is: hypergraph node set N = {APK1, APK2, APK3, APK4}, hypergraph hyperedge set HE = {he1 = (APK1, APK2), he2 = (APK2, APK3, APK4)}. Different hyperedges distinguish the associations between APKs caused by different API calls.
Step a2: Simple graph transformation. The hypergraph obtained in step a1 is transformed into a simple graph using the clique expansion algorithm, representing the low-order associations between APKs.