Sensitive data detection method and device based on machine learning

Machine learning is applied to sensitive data detection in the field of computer systems. It addresses problems such as incomplete detection and an inefficient data retrieval process, strengthening detection capability and improving work efficiency.

Pending Publication Date: 2021-03-16
CHINA ZHESHANG BANK

AI Technical Summary

Problems solved by technology

As a result, detection is easily left incomplete.
If manual observa...



Examples


Embodiment Construction

[0032] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, and are not intended to limit the present invention.

[0033] An embodiment of the present invention provides a machine-learning-based method for detecting sensitive data when data is exported from a production environment. In a specific implementation, the security management department explains and defines the categories of sensitive fields; the fields used as examples in this method must be set according to the actual scenario. The method includes the following steps:

[0034] 1. In banking business, database table fields can generally be divided into two types: purely numeric and text. Purely numeric fields include amounts, ID numbers, and mobile phone numbers; text fields include names and addresses. Regul...
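The split into purely numeric and text fields described in step 1 might be sketched as follows. This is only an illustration under stated assumptions: the function name `column_kind`, the sample values, and the choice to treat an 18-character ID number ending in `X` as numeric are not taken from the patent text.

```python
import re

# Assumption: "purely numeric" covers signed or decimal amounts and
# 18-character ID numbers whose last character may be X.
NUMERIC = re.compile(r"[+-]?\d+(\.\d+)?|\d{17}[\dXx]")


def column_kind(values):
    """Return 'numeric' if every non-empty sample looks numeric, else 'text'."""
    samples = [str(v).strip() for v in values if v is not None and str(v).strip()]
    if samples and all(NUMERIC.fullmatch(s) for s in samples):
        return "numeric"
    # Empty columns and anything containing free text fall through to 'text'.
    return "text"


print(column_kind(["13800138000", "13912345678"]))
print(column_kind(["Zhang San", "Li Si"]))
```

Sampling a limited number of rows per column, rather than scanning the full table, would keep such a check cheap on large tables.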


Abstract

The invention discloses a machine-learning-based sensitive data detection method and device. When data is synchronized from a production environment to a development and test environment, sensitive fields must be desensitized. Following machine learning principles, the problem of identifying sensitive data table fields is converted into a text classification problem, and NLP techniques are applied to train a model that identifies sensitive fields. The method is combined with conventional detection means, and a self-learning algorithm continuously improves the recognition effect. This reduces the risk that sensitive fields are missed and leak to external environments, reduces manual intervention, and improves work efficiency.
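Treating field sensitivity as a text classification problem can be illustrated with a very small classifier. The sketch below is not the patented model: it swaps in a multinomial naive Bayes classifier over character bigrams as a stand-in, and the class name `NaiveBayesFieldClassifier`, the labels, and the toy training samples are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def bigrams(text):
    """Split a field value into lowercase character bigrams."""
    text = text.lower()
    return [text[i:i + 2] for i in range(len(text) - 1)] or [text]


class NaiveBayesFieldClassifier:
    """Multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = bigrams(text)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each bigram.
            score = math.log(count / total)
            for gram in bigrams(text):
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled samples; real labels would follow the sensitive
# field categories defined by the security management department.
train = [
    ("12 Example Road, Hangzhou", "address"),
    ("88 Sample Street, Ningbo", "address"),
    ("Example Trading Co., Ltd.", "work_unit"),
    ("Sample Technology Co., Ltd.", "work_unit"),
    ("order shipped", "not_sensitive"),
    ("payment pending", "not_sensitive"),
]
model = NaiveBayesFieldClassifier().fit(*zip(*train))
print(model.predict("Demo Logistics Co., Ltd."))
```

A self-learning loop of the kind the abstract mentions could, under the same assumptions, append manually confirmed samples to `train` and call `fit` again.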

Description

Technical field

[0001] The invention belongs to the field of computer systems, and in particular relates to a machine-learning-based sensitive data detection method and device.

Background

[0002] The banking industry has very strict data security requirements. Sensitive fields must be desensitized before any data is exported.

[0003] Sensitive fields are generally desensitized through scripts submitted by developers. However, given the large number of data tables and fields, developers may not cover all sensitive fields. A traditional detection method is regular expression matching. Regular expressions are a rule-based matching technique and are limited by their specific rules. For example, they identify highly regular fields such as mobile phone numbers and card numbers well, but are weak at recognizing less regular content such as work units and home addresses. Data fetching positions often involve a la...
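The regular expression approach described in the background can be sketched as below. The patterns are illustrative assumptions rather than rules from the patent: an 11-digit mainland China mobile number, an 18-character resident ID number, and a 16 to 19 digit card number. The function name `regex_label` and the 0.9 match threshold are likewise hypothetical.

```python
import re

# Illustrative patterns only; production rules would be defined by the
# security management department.
PATTERNS = {
    "mobile_number": re.compile(r"1[3-9]\d{9}"),
    "id_number": re.compile(r"\d{17}[\dXx]"),
    "card_number": re.compile(r"\d{16,19}"),  # assumed length range
}


def regex_label(values, threshold=0.9):
    """Return the first label whose pattern matches enough samples, else None."""
    samples = [str(v).strip() for v in values if str(v).strip()]
    if not samples:
        return None
    for label, pattern in PATTERNS.items():
        hits = sum(1 for s in samples if pattern.fullmatch(s))
        if hits / len(samples) >= threshold:
            return label
    return None


print(regex_label(["13800138000", "13912345678"]))
print(regex_label(["No. 1 Example Road", "Example Trading Co."]))
```

The second call returns `None`, which reflects the weakness noted above: free-form content such as addresses and work units has no fixed shape for a rule to match.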


Application Information

IPC(8): G06F21/62, G06Q40/02, G06N3/04
CPC: G06F21/6245, G06Q40/02, G06N3/045
Inventor 臧铖, 陈嘉俊, 屠轲, 占可非
Owner CHINA ZHESHANG BANK