Method for detecting sensitive information based on D-S evidence theory

The method applies D-S evidence theory to sensitive information detection. It addresses problems such as inconsistent results between algorithms, low recall, and low precision, and helps prevent level-skipping storage and leakage of sensitive information.

Active Publication Date: 2012-04-25
THE PLA INFORMATION ENG UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the low recall and precision of any single sensitive information detection algorithm (such as those based on the vector model, Boolean model, or probability model) and the inconsistency of results between algorithms. Based on how well each algorithm detects e-government sensitive information, a method based on evidence theory is proposed to integrate the various detection algorithms.



Examples


Embodiment 1

[0027] Example 1: see Figure 1 and Figure 2. This embodiment describes a specific implementation of the present invention with reference to the drawings. Before the implementation is described in detail, some concepts involved in the present invention are explained.

[0028] Sensitive information: information that the user needs and cares about and judges to be meaningful, characterized by query requests (such as keywords) and related descriptive information. Files containing sensitive information are called sensitive files.

[0029] Information retrieval module: retrieves the text required by the user from the local resource database and submits the retrieval results to the user interface module.

[0030] Keywords: the keywords used in this document are taken from the glossary of keywords for sensitive government information in the e-government system.

[...

Embodiment 2

[0118] Example 2: see Figure 1. In this embodiment, the sensitive information detection method based on D-S evidence theory includes the following steps:

[0119] Step 1): convert the format of the documents to be detected in the database and preprocess them as data objects to extract index items;

[0120] Step 2): create index information from the index items obtained in step 1), assign corresponding weights to the keywords, and store them in the database;

[0121] Step 3): run the vector-based, Boolean model-based, probability model-based, and regular expression-based detection algorithms, or any two or three of them, on a document collection with known sensitivity levels, and calculate the weight of each algorithm;

[0122] Step 4): use the algorithms described in step 3) to detect the target document, and use the evidence theory synthesis rule to calculate the t...
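
The synthesis rule named in step 4) is commonly Dempster's rule of combination. The following is a minimal Python sketch of that rule, not the patented procedure. It assumes a two-hypothesis frame ("sensitive" / "normal") and assumes that the algorithm weights from step 3) enter the fusion through Shafer discounting; the source text is truncated before it states how the weights are used. The names `SENSITIVE`, `NORMAL`, `THETA`, `discount`, and `combine` are hypothetical.

```python
from itertools import product

# Hypothetical frame of discernment: a document is sensitive or normal.
SENSITIVE = frozenset({"sensitive"})
NORMAL = frozenset({"normal"})
THETA = SENSITIVE | NORMAL  # the whole frame, i.e. "uncertain"


def discount(mass, weight):
    """Shafer discounting (an assumption, not taken from the source):
    scale every focal mass by the algorithm weight in [0, 1] and move
    the remaining mass to THETA."""
    out = {focal: weight * m for focal, m in mass.items() if focal != THETA}
    out[THETA] = 1.0 - sum(out.values())
    return out


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose
    focal elements are frozensets."""
    fused = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence accumulates on the intersection.
            fused[inter] = fused.get(inter, 0.0) + ma * mb
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {focal: m / (1.0 - conflict) for focal, m in fused.items()}


# Illustrative masses from two detection algorithms (made-up numbers).
vector_mass = {SENSITIVE: 0.7, NORMAL: 0.1, THETA: 0.2}
boolean_mass = {SENSITIVE: 0.6, NORMAL: 0.3, THETA: 0.1}
fused = combine(discount(vector_mass, 0.9), discount(boolean_mass, 0.8))
```

Because the rule is associative and commutative, a third or fourth algorithm can be folded in by calling `combine` again on the fused result.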

Embodiment 3

[0123] Example 3: see Figure 1. This embodiment of the sensitive information detection method based on D-S evidence theory differs from Embodiment 2 as follows:

[0124] Before step 2), the method also acquires the keyword weights. The weights are obtained with the TFIDF weighting strategy, specifically through the vector space-based sensitive information detection algorithm, in the following steps:

[0125] Step (1): according to the TFIDF weighting strategy, document $d_j$ is expressed as a vector of weights $W_j = \langle w_{1j}, w_{2j}, \ldots, w_{Mj} \rangle$, where $w_{ij}$ is the weight of index item $t_i$ in document $d_j$.

[0126] The specific calculation formula can be expressed as:

[0127]

[0128] where $tf(t_i, d_j)$ is the number of times the word $t_i$ appears in document $d_j$; $N$ is the number of all texts to be clustered; and $df(t_i)$ is the number of documents that contain the word $t_i$.

[0129] Step (2), express the query p as a vector of weights to calculate the simi...
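
The weighting in step (1) can be illustrated with a short Python sketch. The formula itself is not reproduced in this text, so the sketch assumes the textbook form w_ij = tf(t_i, d_j) * log(N / df(t_i)); the patent may use a normalized or smoothed variant. The function name `tfidf_vectors` and the token-list input format are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(documents, index_terms):
    """Return one weight vector per document, ordered like index_terms.

    documents: list of token lists. index_terms: list of index items.
    Assumes w_ij = tf(t_i, d_j) * log(N / df(t_i)); this textbook
    variant is an assumption, not the formula from the source.
    """
    n_docs = len(documents)
    counts = [Counter(doc) for doc in documents]
    # df(t): number of documents that contain term t at least once.
    df = {t: sum(1 for c in counts if c[t] > 0) for t in index_terms}
    vectors = []
    for c in counts:
        vectors.append([
            c[t] * math.log(n_docs / df[t]) if df[t] else 0.0
            for t in index_terms
        ])
    return vectors


# Made-up toy corpus: "budget" occurs in every document, so its weight is 0.
docs = [["budget", "secret", "secret"], ["budget", "report"]]
weights = tfidf_vectors(docs, ["budget", "secret"])
```

A term that appears in every document receives weight zero under this variant, which is why keyword glossaries for detection tend to favor terms that are rare across the collection.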


Abstract

The invention relates to a method for detecting sensitive information in an e-government system. The method, based on D-S evidence theory, fuses several detection algorithms: one based on a regular expression model, one based on a vector space, one based on a Boolean model, and one based on a probability model. D-S evidence theory is first used to fuse the values that each algorithm obtains for the different keywords in the same query, and is then used to fuse the trust values obtained by the different algorithms, yielding the sensitivity degree of the detected object. The method draws on the strengths of each algorithm for information detection in the e-government system, addresses the low recall and precision of a single algorithm and the inconsistency of detection results between algorithms, and effectively prevents sensitive information in the e-government system from being stored or leaked in a level-skipping manner.
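
The two-stage fusion described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions: each piece of evidence is a triple (sensitive, not sensitive, uncertain) that sums to 1, and both stages use the closed form of Dempster's rule for that two-hypothesis frame. The names `combine`, `fuse`, and `sensitivity_degree`, and the sample numbers, are hypothetical.

```python
from functools import reduce


def combine(m1, m2):
    """Dempster's rule for masses given as (sensitive, normal, uncertain)."""
    s1, n1, u1 = m1
    s2, n2, u2 = m2
    # Conflict: one source says sensitive while the other says normal.
    conflict = s1 * n2 + n1 * s2
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    norm = 1.0 - conflict
    return (
        (s1 * s2 + s1 * u2 + u1 * s2) / norm,
        (n1 * n2 + n1 * u2 + u1 * n2) / norm,
        (u1 * u2) / norm,
    )


def fuse(masses):
    """Fold a non-empty sequence of masses with Dempster's rule."""
    return reduce(combine, masses)


def sensitivity_degree(keyword_masses_by_algorithm):
    """Stage 1: fuse the per-keyword masses inside each algorithm.
    Stage 2: fuse the resulting per-algorithm masses."""
    per_algorithm = [fuse(m) for m in keyword_masses_by_algorithm.values()]
    return fuse(per_algorithm)


# Made-up evidence: two keywords per algorithm for a single query.
evidence = {
    "vector": [(0.7, 0.1, 0.2), (0.5, 0.2, 0.3)],
    "boolean": [(0.6, 0.3, 0.1), (0.4, 0.1, 0.5)],
}
degree = sensitivity_degree(evidence)  # (sensitive, normal, uncertain)
```

The first component of the returned triple plays the role of the sensitivity degree; how the patent maps it to a sensitivity level is not stated in this text.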

Description

Technical field

[0001] The invention relates to a sensitive information detection method of an electronic government affair system, in particular to a sensitive information detection method based on D-S evidence theory. It belongs to the field of computer security.

Background technique

[0002] The Internet is an important infrastructure for informatization and an important strategic resource of the country. Actively using the Internet for e-government construction can not only save resources, save costs, but also improve efficiency and expand service coverage. It has important strategic significance for the e-government and informatization construction of a developing country like China. However, the use of the open Internet to carry out e-government construction is faced with security threats and risks such as computer viruses, network attacks, information leakage, and identity counterfeiting, and information security should be highly valued. Government affairs applications b...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F21/00, G06F21/60
Inventor: 陈性元, 杜学绘, 夏春涛, 陈华城, 王超, 曹利峰, 孙奕, 李炳龙, 张东巍, 赵艳杰
Owner: THE PLA INFORMATION ENG UNIV