Marine big data sensitivity evaluation system and a marine big data sensitivity prevention method for confidentiality requirements

A sensitivity evaluation system and method, applied in the field of digital data protection, that addresses problems such as interference between classification results, inaccurate classification, and slow recognition speed. Its stated effects are improved efficiency and accuracy, faster recognition, and easier open sharing of data.

Active Publication Date: 2019-05-24
OCEAN UNIV OF CHINA

AI Technical Summary

Problems solved by technology

[0007] The sensitive data dictionary matching method has the following defects: 1. Low recognition accuracy. Dictionary matching is a pattern matching method, so the quality of the data dictionary determines the accuracy of sensitive data recognition; when the dictionary is incomplete or built incorrectly, recognition accuracy drops. 2. Classification result interference. With dictionary matching, the same data item can match several data dictionaries, and because traditional data dictionaries cannot perform weighted calculations, these matches interfere with one another and produce inaccurate classification results.
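The interference problem described in [0007] can be illustrated with a minimal sketch. The dictionaries, terms, and weights below are hypothetical and are not taken from the patent; the sketch only shows how one term can land in two dictionaries, and how per-term weights (which plain dictionary matching lacks) would separate the candidates.

```python
# Hypothetical category dictionaries; the terms are illustrative only.
DICTIONARIES = {
    "hydrology": {"salinity", "temperature", "depth"},
    "navigation": {"depth", "route", "position"},
}


def match_categories(tokens):
    """Plain dictionary matching: return every category that contains a token."""
    token_set = set(tokens)
    return sorted(cat for cat, terms in DICTIONARIES.items() if terms & token_set)


def weighted_scores(tokens, weights):
    """Sum assumed per-term weights per category so ambiguous matches can be ranked.

    `weights` maps (category, term) pairs to a float; missing pairs count as 0.
    """
    scores = {}
    for cat, terms in DICTIONARIES.items():
        scores[cat] = sum(weights.get((cat, t), 0.0) for t in tokens if t in terms)
    return scores
```

With these assumed dictionaries, `match_categories(["depth"])` returns both categories, which is the ambiguity the paragraph describes; `weighted_scores` shows one way a weighting could rank them.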
[0008] The manual identification method for sensitive data has the following defects: 1. Slow identification. Because processing is manual, combing through a large amount of data takes far longer than machine identification, and the processing personnel must be highly qualified. 2. Inconsistent judging standards. Because identification depends mainly on subjective human judgment, different people may apply different standards to the same data, and even the same person may reach different results at different times, which leads to discrepancies in the identification results.
A self-built corpus may not be comprehensive enough when the amount of training data is insufficient.
In step 2, the sensitive features of the basic corpus must still be identified and classified manually, which is highly subjective: different people may judge the same data by different criteria, and even the same person may identify the same data differently at different times, which leads to discrepancies in the sensitive data identification results.
In step 3, the category of the target data is obtained through a weighted ranking of sensitive words, but the sensitivity of the data is not quantified.



Examples


Embodiment 1

[0055] This embodiment provides a marine big data sensitivity assessment system for confidentiality requirements, which mainly includes the following parts:

[0056] The data feature extraction module extracts data samples and metadata from structured confidential marine raw data, and uses natural language processing techniques to extract keywords, subject words, and associated words from unstructured confidential marine documents. These are used to build a corpus, on which feature extraction is performed to form a sensitive feature library. The module also extracts data attribute features from the marine big data targeted for processing, in preparation for the next step of matching against the sensitive feature library to find sensitive data.
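As a rough illustration of the keyword side of this module, the sketch below builds a small feature library from term frequencies. The source does not say which natural language processing technique is used, so plain frequency counting, the stopword list, and the function names are all assumptions made for the example.

```python
import re
from collections import Counter

# Illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "of", "and", "in", "a", "to", "for"}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword tokens in the text.

    Tokenization is a simple ASCII-letter regex, which is an assumption;
    it would not segment Chinese-language documents.
    """
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(top_n)]


def build_feature_library(documents, top_n=3):
    """Union the keywords of every document into one set of sensitive features."""
    library = set()
    for doc in documents:
        library.update(extract_keywords(doc, top_n))
    return library
```

Frequency counting is only a stand-in here; subject words and associated words would need topic or co-occurrence analysis that this sketch does not attempt.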

[0057] The sensitive feature matching module matches the features of the target data in the marine big data against the sensitive feature library and analyzes the similarity between the target data set and the ...
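The similarity measure is not given in the visible text, so the following is only a sketch under the assumption that features are sets of terms compared with Jaccard similarity. The library entry names and the helper functions are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two term collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(features, library):
    """Return (entry_name, score) for the library entry most similar to features.

    `library` maps a hypothetical entry name to its set of feature terms.
    Returns (None, 0.0) when nothing overlaps.
    """
    best_name, best_score = None, 0.0
    for name, terms in library.items():
        score = jaccard(features, terms)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

A threshold on the returned score could then decide whether an attribute is treated as sensitive; the threshold value would have to come from the assessment model, which this sketch does not cover.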

Embodiment 2

[0063] This embodiment provides a prevention method based on the marine big data sensitivity assessment system for confidentiality requirements described in Embodiment 1, namely desensitization of sensitive data. Data desensitization means transforming certain sensitive information according to desensitization rules so that sensitive private data is reliably protected.
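A minimal sketch of rule-based desensitization follows. The field names (`vessel_id`, `latitude`, `longitude`) and the two rules (masking the middle of an identifier, coarsening a coordinate) are assumptions chosen for illustration; the source does not list its actual desensitization rules.

```python
def mask_middle(value, keep=2, fill="*"):
    """Keep the first and last `keep` characters and mask everything between."""
    if len(value) <= 2 * keep:
        return fill * len(value)
    return value[:keep] + fill * (len(value) - 2 * keep) + value[-keep:]


def coarsen_coordinate(value, decimals=1):
    """Reduce coordinate precision by rounding to `decimals` places."""
    return round(float(value), decimals)


# Hypothetical mapping from field name to desensitization rule.
RULES = {
    "vessel_id": mask_middle,
    "latitude": coarsen_coordinate,
    "longitude": coarsen_coordinate,
}


def desensitize(record, rules=RULES):
    """Apply the matching rule to each field; fields without a rule pass through."""
    return {key: rules[key](val) if key in rules else val for key, val in record.items()}
```

For example, `desensitize({"vessel_id": "AB123456", "latitude": 36.0671, "depth_m": 52})` masks the identifier to `AB****56`, rounds the latitude to `36.1`, and leaves `depth_m` unchanged.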

[0064] As shown in Figure 1, the method of this embodiment includes the following steps:

[0065] Step 1: Establish a sensitive feature database.

[0066] Starting from the relevant regulations on the scope of secrets in marine work, the secret information already formed, and internal business information, reverse analysis combining machine and manual methods is used to extract features from the resulting sets of secret information. The document metadata of the secret information text, the keywords, subject words, and associated words of the text content, and the metadata of the original sensitive data set are ...


Abstract

The invention discloses a marine big data sensitivity evaluation system and a marine big data sensitivity prevention method for confidentiality requirements. The method comprises the following steps: Step 1, establishing a sensitive feature library; Step 2, matching and analyzing a to-be-processed data set; Step 3, constructing a sensitivity assessment model; Step 4, calculating attribute sensitivity; Step 5, protecting sensitive data; and Step 6, dynamically optimizing the sensitive feature library. The method is used to identify and protect sea-related data with different data sensitivities and value intensities from various marine service systems.

Description

Technical field

[0001] The invention belongs to the technical field of data identification and processing, and in particular relates to a marine big data sensitivity assessment system and prevention method for confidentiality requirements.

Background technique

[0002] Data is an important asset of organizations, enterprises and individuals, and data leakage will cause great losses to data owners. In recent years, data leakage has become the source of the black industry chain of information trafficking. The security protection of sensitive data in the big data environment has extremely high theoretical research value and engineering practice significance. How to protect sensitive data has become the focus of big data security.

[0003] Existing technologies mostly adopt: access control, sensitive data confidentiality technology based on data distortion (retaining certain statistical characteristics), data encryption, and restricted release (not releasing certain attributes ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/62, G06F21/60
Inventor: 王晓东, 罗祥裕, 解玮玮, 魏志强, 王雪
Owner: OCEAN UNIV OF CHINA