Method for detecting sensitive information based on D-S evidence theory
This technology applies D-S (Dempster-Shafer) evidence theory to sensitive information detection. It addresses problems such as inconsistent results across detection algorithms, low recall, and low precision, and helps prevent leapfrog storage and leakage.
Examples
Embodiment 1
[0027] Embodiment 1: see Figure 1 and Figure 2. This embodiment is described with reference to the drawings. Before the specific implementation of the invention is described in detail, the concepts it relies on are defined here.
[0028] Sensitive information: information that the user needs and cares about and judges to be meaningful, characterized by query requests (such as keywords) and related descriptive information. Files containing sensitive information are called sensitive files.
[0029] Information retrieval module: retrieves the text required by the user from the local resource database and submits the retrieval results to the user interface module.
[0030] Keywords: the keywords used in this document are drawn from the keyword glossary of sensitive government information in the e-government system.
[...
Embodiment 2
[0118] Embodiment 2: see Figure 1. In this embodiment, the sensitive information detection method based on D-S evidence theory includes the following steps:
[0119] Step 1) Convert the format of the documents to be detected in the database and preprocess them as data objects to extract index items;
[0120] Step 2) Create index information from the index items obtained in step 1), assign corresponding weights to the keywords, and store them in the database;
[0121] Step 3) Run the vector-based, Boolean model-based, probability model-based, and regular expression-based detection algorithms, or any two or three of them, on a document collection with known sensitivity levels, and calculate the weight of each algorithm;
[0122] Step 4) Use the algorithms described in step 3) to detect the target document, and use the evidence theory synthesis rule to calculate the t...
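The combination in step 4) can be illustrated with a minimal Python sketch of Dempster's rule for two detectors. This is not the patent's implementation: the two-class frame (`sensitive`, `normal`), the helper `discounted_mass`, the use of the step 3) algorithm weight as a reliability discount, and all numeric scores are assumptions made for illustration.

```python
from itertools import product

# Hypothetical two-class frame of discernment.
SENSITIVE = frozenset({"sensitive"})
NORMAL = frozenset({"normal"})
EITHER = SENSITIVE | NORMAL  # mass left uncommitted (ignorance)


def discounted_mass(p_sensitive, reliability):
    """Build a mass function from a detector score.

    Assumption: the algorithm weight acts as a reliability factor, and
    the remaining (1 - reliability) is assigned to the whole frame.
    """
    return {
        SENSITIVE: reliability * p_sensitive,
        NORMAL: reliability * (1.0 - p_sensitive),
        EITHER: 1.0 - reliability,
    }


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict: evidence cannot be combined")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Illustrative scores and weights only, not values from the source.
vector_evidence = discounted_mass(p_sensitive=0.8, reliability=0.9)
regex_evidence = discounted_mass(p_sensitive=0.6, reliability=0.7)
fused = combine(vector_evidence, regex_evidence)
is_sensitive = fused[SENSITIVE] > fused[NORMAL]
```

Because Dempster's rule is associative, a third or fourth detector could be folded in by calling `combine` again on the fused result.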
Embodiment 3
[0123] Embodiment 3: see Figure 1. The sensitive information detection method based on D-S evidence theory in this embodiment differs from Embodiment 2 as follows:
[0124] Before step 2), the method also acquires the keyword weights. The weights are obtained with the TFIDF weighting strategy, specifically through the vector space-based sensitive information detection algorithm. The steps are as follows:
[0125] Step (1) According to the TFIDF weighting strategy, document $d_j$ is expressed as a vector of weights $W_j = \langle w_{1j}, w_{2j}, \ldots, w_{Mj} \rangle$, where $w_{ij}$ denotes the weight of index item $t_i$ in document $d_j$.
[0126] The specific calculation formula can be expressed as:
[0127]
[0128] where $tf(t_i, d_j)$ is the number of times the word $t_i$ appears in document $d_j$; $N$ is the total number of texts to be clustered; and $df(t_i)$ is the number of documents containing the word $t_i$;
[0129] Step (2), express the query p as a vector of weights to calculate the simi...
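The weighting in steps (1) and (2) can be sketched in Python as follows. The formula in [0127] is not reproduced in this text, so the sketch assumes the common form $w_{ij} = tf(t_i, d_j) \cdot \log(N / df(t_i))$, which matches the definitions of $tf$, $N$, and $df$ given above but may differ from the patent's exact formula (for example in normalization). Cosine similarity, the function names, and the sample documents are likewise assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, idf table, one weight vector per document).

    docs is a list of token lists. Assumed weight: tf * log(N / df).
    """
    n_docs = len(docs)
    vocab = sorted({t for doc in docs for t in doc})
    # df(t): number of documents that contain term t.
    df = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, idf, vectors


def query_vector(query, vocab, idf):
    """Weight the query terms with the same idf table as the documents."""
    tf = Counter(query)
    return [tf[t] * idf[t] for t in vocab]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical tokenized documents and keyword query.
docs = [
    ["budget", "secret", "report"],
    ["secret", "personnel", "file"],
    ["public", "notice", "report"],
]
vocab, idf, vectors = tfidf_vectors(docs)
q = query_vector(["secret", "file"], vocab, idf)
scores = [cosine(q, v) for v in vectors]
```

With these sample inputs the second document shares both query terms and scores highest, while the third shares none and scores zero.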