Method for extracting privacy policies embedded in mobile application APK files

A technology for extracting privacy policies embedded in mobile applications, applied in special data processing applications, text database clustering/classification, unstructured text data retrieval, etc., and addressing problems such as slow extraction speed.

Active Publication Date: 2021-07-06
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0008] However, few existing studies address the extraction of privacy policies embedded in mobile applications. The main existing method analyzes the application file structure through static analysis, preprocesses the input samples, extracts the required information, and generates an Activity tree diagram. Then, based on the Activity tree diagram and its hierarchy, an Activity traversal script written according to a traversal strategy is run. Its main task is to find the page where the privacy agreement is located, and it obtains the privacy policy file inside the mobile application by matching keywords against the text of the controls on each page.
[0009] Because the accuracy of the Activity tree diagram decreases with depth, and judging privacy policy links by matching control text may miss some links, the success rate of privacy policy discovery still has room for improvement.
In addition, this method requires two steps, static analysis and automated testing, for every input sample. Because the steps are complex and automated testing is time-consuming, extraction is slow.



Embodiment Construction

[0045] To help those of ordinary skill in the art understand and implement the present invention, the invention is described in further detail below with reference to the accompanying drawings.

[0046] The present invention targets Android applications. It combines static and dynamic detection of the privacy policy links embedded in a mobile application's APK file to automatically discover and extract the privacy policy link contained in the APK file. The overall workflow is shown in Figure 1. First, the APK file of the mobile application under test is statically analyzed, and the set of all URL links contained in the APK file is obtained through decompilation and rule matching. A privacy policy crawler extracts the features of each page and feeds them into a classification model to train the privacy policy page judgment model; the features extracted from the obtained pages are then judged through the privacy polic...
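The rule-matching part of the static stage can be illustrated with a minimal sketch. It assumes the APK has already been decompiled into a directory, and it uses a single hypothetical rule (a regular expression for http/https URLs); the function name `extract_urls`, the pattern, and the list of scanned file suffixes are illustrative assumptions, not the rules used in the patent.

```python
import re
from pathlib import Path

# Hypothetical rule: any http(s) URL that appears in a text file.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>\\)]+")

# File types assumed to be worth scanning in decompiled output.
TEXT_SUFFIXES = {".smali", ".xml", ".html", ".js", ".json", ".txt"}


def extract_urls(decompiled_dir):
    """Return the sorted set of URLs found under a decompiled APK directory."""
    urls = set()
    for path in Path(decompiled_dir).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for match in URL_PATTERN.findall(text):
            # Drop punctuation that commonly trails a URL in prose or code.
            urls.add(match.rstrip(".,;"))
    return sorted(urls)
```

Returning a de-duplicated, sorted list keeps the later crawling step deterministic and avoids fetching the same page twice.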


Abstract

The invention discloses a method for extracting privacy policies embedded in mobile application APK (Android Package) files and belongs to the field of Android mobile terminal application software analysis and detection. The method comprises the following steps. First, an APK file to be tested is selected for decompilation and rule matching to obtain all URL (Uniform Resource Locator) links; the content of each web page is crawled, and the feature words of its text are extracted. Meanwhile, feature words collected from a number of web pages are used to train a binary classification model in advance. The feature words of the pages from the APK file under test are input into the trained binary classification model one by one, and the output is checked for a privacy policy page. If one exists, the privacy policy is output and the process ends. Otherwise, automated dynamic testing is performed: the request addresses in the traffic are monitored, the corresponding URL links are extracted, the page contents are crawled to extract feature words, and the feature words are input into the binary classification model for judgment, until a privacy policy page is found or the set traversal depth is exceeded. By combining dynamic and static testing, the method improves both the extraction efficiency and the success rate for privacy policies.
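The control flow in the abstract (static results first, dynamic testing only as a fallback, bounded by a traversal depth) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `find_privacy_policy` and all of its parameters are hypothetical, and the crawler, the trained binary classification model, and the traffic monitor are passed in as plain callables or iterables.

```python
def find_privacy_policy(static_urls, fetch_features, is_privacy_policy,
                        dynamic_url_batches, max_depth=3):
    """Return the first URL judged to be a privacy policy page, or None.

    All parameters are hypothetical stand-ins:
      static_urls         - URLs obtained by decompilation and rule matching
      fetch_features      - callable: URL -> feature words of the crawled page
      is_privacy_policy   - callable: feature words -> bool (the trained model)
      dynamic_url_batches - iterable yielding, per traversal level, the URLs
                            seen in monitored traffic during dynamic testing
      max_depth           - the set traversal depth for the dynamic stage
    """
    seen = set()

    def check(urls):
        for url in urls:
            if url in seen:
                continue  # never crawl or classify the same URL twice
            seen.add(url)
            if is_privacy_policy(fetch_features(url)):
                return url
        return None

    # Stage 1: judge the pages behind the statically extracted URLs.
    hit = check(static_urls)
    if hit is not None:
        return hit

    # Stage 2: fall back to dynamic testing, bounded by max_depth.
    for depth, urls in enumerate(dynamic_url_batches, start=1):
        if depth > max_depth:
            break
        hit = check(urls)
        if hit is not None:
            return hit
    return None
```

Because the dynamic batches are consumed lazily, the slower automated testing stage is never started when the static stage already finds a privacy policy page.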

Description

Technical field

[0001] The invention belongs to the field of Android mobile terminal application software analysis and detection, and relates to a method for extracting a privacy policy embedded in the APK file of a mobile application.

Background technique

[0002] Static analysis refers to scanning program files, without running the program, by means such as lexical analysis or syntax analysis to generate the decompiled code of the program, and then reading the decompiled code to understand what the program does. It is essentially static text analysis, so it is highly efficient.

[0003] Common static decompilation tools include apktool, baksmali, and dex2jar, among which apktool is the most commonly used decompilation tool for static analysis. It is written in Java and can decompile and recompile APK files. It can also install a specific framework-res framework and clean up the folder left by the previous decompilation, among other functions.

[0004] After the sam...
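As an illustration of the decompilation step, the sketch below drives apktool from Python. It assumes the `apktool` executable is installed and on the PATH; the helper names `build_apktool_command` and `decompile` are hypothetical. Only the `d` (decode) subcommand and the `-o` (output directory) and `-f` (force) options of apktool are used.

```python
import subprocess


def build_apktool_command(apk_path, output_dir, force=True):
    """Build the argument list for decoding an APK with apktool."""
    command = ["apktool", "d"]
    if force:
        # -f removes an existing output directory before decoding.
        command.append("-f")
    command += ["-o", output_dir, apk_path]
    return command


def decompile(apk_path, output_dir):
    """Run apktool (assumed to be on PATH); raises CalledProcessError
    if apktool exits with a non-zero status."""
    subprocess.run(build_apktool_command(apk_path, output_dir), check=True)
```

Keeping command construction separate from execution makes the command easy to inspect or log without apktool being installed.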


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/56, G06F21/62, G06F16/35, G06F16/951, G06F16/955, G06F40/284
CPC: G06F21/563, G06F21/566, G06F21/6245, G06F16/951, G06F16/955, G06F40/284, G06F16/35, G06F2221/033
Inventor: 郭燕慧, 徐国爱, 徐国胜, 张淼, 王皓月
Owner: BEIJING UNIV OF POSTS & TELECOMM