
A mining method and device for data regular expressions

A technology for regular expressions and data, applied in the field of data processing

Active Publication Date: 2016-12-28
SHENZHEN AUDAQUE DATA TECH
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

There are many methods and techniques for applying regular expressions, but few ways to generate an effective regular expression from data. For example, Sergei Savchenko proposed a regular expression mining method based on intelligent finite automata, but this method has significant limitations: it imposes distribution requirements on the data, and the size of the data set can only be between 30 and 50.
[0004] At present, in the field of data processing, there is no mining method that can extract the essential structure of massive data containing erroneous records and express that structure as a regular expression.

Examples

Embodiment Construction

[0031] To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here only explain the present invention and do not limit it.

[0032] The present invention provides a method and device for mining regular expressions of data. Storing the acquired data in a dictionary tree (trie) structure makes it possible to mine massive data. Data nodes are upgraded according to a pre-established regular expression rule table, and branches are then merged according to the number of upgraded child nodes and shared characters. At the same time, interference branches are identified and deleted, and finally the generated rule tree is converted into a string format for output. The invention realizes the mining of regular...
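The first steps of paragraph [0032] (dictionary tree storage, node upgrade, branch merging) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the source does not list the contents of its rule table or its merging criteria, so `RULE_TABLE`, `Node`, `insert`, `upgrade`, `merge_into`, and `upgrade_tree` are hypothetical names, and the sketch merges every pair of sibling branches that upgrade to the same token rather than applying a child-count condition.

```python
import re
import string

# Hypothetical rule table (the source does not give its actual rules):
# each entry maps a set of literal characters to a regex token.
RULE_TABLE = [
    (string.digits, r"\d"),
    (string.ascii_lowercase, "[a-z]"),
    (string.ascii_uppercase, "[A-Z]"),
]

def upgrade(ch):
    """Return the regex token for a character, or the escaped literal."""
    for chars, token in RULE_TABLE:
        if ch in chars:
            return token
    return re.escape(ch)

class Node:
    def __init__(self):
        self.children = {}  # key (character or token) -> Node
        self.count = 0      # records passing through this node
        self.terminal = 0   # records ending at this node

def insert(root, record):
    """Store one record in the dictionary tree, character by character."""
    node = root
    node.count += 1
    for ch in record:
        node = node.children.setdefault(ch, Node())
        node.count += 1
    node.terminal += 1

def merge_into(dst, src):
    """Add the counts of src to dst and merge their subtrees recursively."""
    dst.count += src.count
    dst.terminal += src.terminal
    for key, child in src.children.items():
        merge_into(dst.children.setdefault(key, Node()), child)

def upgrade_tree(node):
    """Return a copy with upgraded keys; siblings with equal tokens are merged."""
    out = Node()
    out.count, out.terminal = node.count, node.terminal
    for ch, child in node.children.items():
        upgraded_child = upgrade_tree(child)
        token = upgrade(ch)
        if token in out.children:
            merge_into(out.children[token], upgraded_child)
        else:
            out.children[token] = upgraded_child
    return out

root = Node()
for record in ["a1", "b2", "c3"]:
    insert(root, record)
rule_tree = upgrade_tree(root)
```

In this example the raw tree has three first-level branches (`a`, `b`, `c`), while the upgraded tree has a single `[a-z]` branch followed by a single `\d` branch, each carrying the combined record count.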

Abstract

The invention provides a data regular expression mining method. The method includes: acquiring data and storing it in a dictionary tree structure; upgrading nodes according to regular expression rules; merging branches according to the number of child nodes; identifying interference branches and deleting them; and converting the rule tree into a string format for output. Storing the acquired data in a dictionary tree structure makes it possible to mine massive data; the data nodes are upgraded, branches are merged, interfering branches are deleted, and the generated rule tree is finally converted into a string format for output. The invention realizes the mining of regular expressions from massive data that contains erroneous data. The resulting rule tree tolerates erroneous data during mining and can be used to check the data and find the erroneous records. In addition, the present invention also provides a data regular expression mining device.
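The last steps in the abstract (deleting interference branches, converting the rule tree to a string, and checking data against the result) can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented procedure: the source does not say how interference branches are identified, so the sketch assumes a hypothetical support threshold `min_support`, and the helper names `token`, `build`, `prune`, `to_regex`, and `mine` are invented for the example.

```python
import re

def token(ch):
    """Hypothetical upgrade rule: digits and letters become character classes."""
    if ch.isdigit():
        return r"\d"
    if ch.isalpha():
        return "[A-Za-z]"
    return re.escape(ch)

def new_node():
    return {"count": 0, "end": 0, "kids": {}}

def build(records):
    """Build a rule tree whose edges are already upgraded tokens."""
    root = new_node()
    for record in records:
        node = root
        node["count"] += 1
        for ch in record:
            node = node["kids"].setdefault(token(ch), new_node())
            node["count"] += 1
        node["end"] += 1
    return root

def prune(node, total, min_support):
    """Assumed criterion: a branch is interference if too few records use it."""
    node["kids"] = {t: k for t, k in node["kids"].items()
                    if k["count"] / total >= min_support}
    if node["end"] / total < min_support:
        node["end"] = 0
    for kid in node["kids"].values():
        prune(kid, total, min_support)

def to_regex(node):
    """Serialize the rule tree as a regular expression string."""
    alts = [t + to_regex(k) for t, k in sorted(node["kids"].items())]
    if not alts:
        return ""
    body = alts[0] if len(alts) == 1 else "(?:" + "|".join(alts) + ")"
    if node["end"] > 0:  # some records legitimately stop at this node
        body = "(?:" + body + ")?"
    return body

def mine(records, min_support=0.2):
    root = build(records)
    prune(root, len(records), min_support)
    return to_regex(root)

records = ["2016-12-28", "2015-01-03", "2014-07-19", "2013-11-30", "N/A"]
pattern = mine(records, min_support=0.3)
# Records that do not match the mined pattern are flagged as suspect.
suspect = [r for r in records if not re.fullmatch(pattern, r)]
```

With these five records, the `N/A` branch is supported by only one record in five, falls below the assumed threshold, and is pruned. The mined pattern then describes the date-shaped records, and `suspect` contains only `N/A`, which corresponds to the abstract's use of the result to check the data and find erroneous records.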

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to a data regular expression mining method and device.

Background technique

[0002] Data mining refers to the process of extracting information that is previously unknown but valuable to users from large amounts of incomplete, vague, and erroneous data. The data mining process usually includes preprocessing the data, running data mining algorithms, and displaying the mining results. Early data mining was carried out serially on a stand-alone node. In a stand-alone data mining system, the amount of data that can be mined and the load the algorithm can bear depend on the performance of that single execution node. Because current data mining systems must handle massive data, serial processing on a single node can only support a small amount of data, and its performance is low. Later, with the development of data mining technology, the current min...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/9024, G06F16/90344, G06F16/2465, G06F16/322, G06F16/2246, G06F16/00
Inventor: 王明兴, 贾西贝
Owner: SHENZHEN AUDAQUE DATA TECH