Data extraction method, device and system and computer readable storage medium

A data extraction technology applied in the field of data processing. It addresses problems such as error-prone hand-written regular expressions, over-sensitive matching, and low extraction efficiency, and aims to improve both matching efficiency and extraction efficiency.

Pending Publication Date: 2021-11-16
SHANGHAI GUAN AN INFORMATION TECH

AI Technical Summary

Problems solved by technology

[0003] When regular expressions are used to extract data in the prior art, the wide variety of data types means the set of regular expressions must cover all of them. Regular expressions are also inflexible, overly sensitive, and prone to writing errors, so some data may fail to match, which lowers the success rate of data extraction. In addition, prior-art data extraction loops through the regular expressions one by one. In the worst case, the time complexity of processing a single data item is O(n), so extraction takes too long and data extraction efficiency drops.
[0004] There is currently no effective solution to the problems of low data extraction success rate and low extraction efficiency in the prior art.
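
The prior-art loop described in [0003] can be sketched as follows. This is a minimal illustration, not code from the patent: the three patterns and the function name `extract_linear` are hypothetical, and n is assumed to be the number of regular expressions. When only the last pattern matches, or none does, every pattern is tried, which is the O(n) worst case.

```python
import re

# Hypothetical patterns for illustration; the source lists no concrete expressions.
PATTERNS = [
    re.compile(r"\d{4}-\d{2}-\d{2}"),        # ISO-style date
    re.compile(r"[\w.]+@[\w.]+\.\w+"),       # simple e-mail address
    re.compile(r"\d{1,3}(?:\.\d{1,3}){3}"),  # dotted IPv4-like address
]


def extract_linear(text, patterns):
    """Try each pattern in a fixed order.

    Returns (matched text or None, number of patterns tried).
    """
    for tried, pattern in enumerate(patterns, start=1):
        match = pattern.search(text)
        if match:
            return match.group(0), tried
    # No pattern matched: all n patterns were tried.
    return None, len(patterns)


if __name__ == "__main__":
    print(extract_linear("on 2021-11-16", PATTERNS))
    print(extract_linear("host 10.0.0.1 up", PATTERNS))
    print(extract_linear("nothing here", PATTERNS))
```

Because the order is fixed, a frequently matching pattern placed late in the list pays the full scan cost on every item.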

Embodiment Construction

[0027] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0028] Regular expressions in the prior art are often written by hand. Because data types are complex and regular expressions are overly sensitive and easy to get wrong, they frequently fail to match the data they are meant to extract. This leads to low data extraction efficiency and poor parsing performance.

[0029] In order to improve data extraction efficiency a...

Abstract

The embodiment of the invention discloses a data extraction method. The method comprises the steps: obtaining a target to-be-extracted data set and a newest regular expression set; when it is judged that the same regular expressions exist in the current regular expression set and the newest regular expression set, assigning the weight values of the same regular expressions in the current regular expression set to the same regular expressions in the newest regular expression set; sorting the regular expressions in the newest regular expression set according to the weight values from large to small; and matching the to-be-extracted data in the target to-be-extracted data set with the sorted regular expressions to obtain a data extraction result, and adding one to a weight value of the regular expression matched with the data. By dynamically changing the weight value of the regular expression, the regular expression with a higher matching success rate is enabled to preferentially extract the to-be-extracted data, and the data extraction efficiency is improved.
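
The steps in the abstract can be sketched roughly as below. This is one possible reading, not the patent's implementation. The function names (`merge_weights`, `sort_by_weight`, `extract_all`), the starting weight of 0 for new expressions, and the choice to stop at the first matching expression are all assumptions. The set is sorted once per batch, as the abstract describes, and the updated weights take effect the next time the set is sorted.

```python
import re


def merge_weights(current, newest):
    """Carry weights over from the current set to the newest set.

    current: dict mapping pattern string -> weight.
    newest:  iterable of pattern strings.
    Expressions present in both sets keep their current weight;
    expressions that are new start at 0 (an assumption).
    """
    return {pattern: current.get(pattern, 0) for pattern in newest}


def sort_by_weight(weights):
    """Return the pattern strings ordered by weight, largest first."""
    return sorted(weights, key=weights.get, reverse=True)


def extract_all(dataset, weights):
    """Match each item against the sorted patterns and update the weights.

    Stops at the first pattern that matches an item (an assumption) and
    adds one to that pattern's weight. Returns one result per item,
    with None where no pattern matched.
    """
    ordered = sort_by_weight(weights)
    results = []
    for item in dataset:
        hit = None
        for pattern in ordered:
            match = re.search(pattern, item)
            if match:
                weights[pattern] += 1
                hit = match.group(0)
                break
        results.append(hit)
    return results


if __name__ == "__main__":
    DATE = r"\d{4}-\d{2}-\d{2}"
    EMAIL = r"[\w.]+@[\w.]+\.\w+"
    current = {DATE: 5, r"old": 9}   # weights learned so far
    newest = [EMAIL, DATE]           # latest expression set
    weights = merge_weights(current, newest)
    print(sort_by_weight(weights))
    print(extract_all(["due 2021-11-16", "mail a@b.com", "none"], weights))
    print(weights)
```

Expressions that match often rise to the front of the order, so typical items are resolved after fewer attempts, while expressions dropped from the newest set lose their accumulated weight.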

Description

Technical field

[0001] The present invention relates to the field of data processing, in particular to a data extraction method, device, system and computer-readable storage medium.

Background

[0002] A regular expression is a logical formula for string operations: specific predefined characters, and combinations of them, form a "rule string" that expresses a filtering logic for strings. A regular expression is a text pattern that describes one or more character strings to match when searching text, and it plays a pivotal role in extracting useful information from massive data sets.

[0003] When regular expressions are used to extract data in the prior art, the wide variety of data types means the set of regular expressions must cover all of them, and because regular expressions are inflexible, overly sensitive, and prone to writing errors, when these reg...
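
As a small illustration of the "rule string" idea and of the sensitivity problem raised in the background, the sketch below uses a hypothetical date pattern and helper name (`find_dates`), neither of which comes from the patent. The same rule string that finds dates written with hyphens finds nothing when the separator changes.

```python
import re

# Hypothetical rule string: four digits, hyphen, two digits, hyphen, two digits.
DATE_RULE = r"\d{4}-\d{2}-\d{2}"


def find_dates(text):
    """Return every substring of text that matches DATE_RULE."""
    return re.findall(DATE_RULE, text)


if __name__ == "__main__":
    print(find_dates("created 2021-11-16, updated 2021-12-01"))
    # A small format change (slashes instead of hyphens) defeats the rule.
    print(find_dates("created 2021/11/16"))
```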

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/903
CPC: G06F16/90344, G06F16/90348
Inventor: 魏畅, 邓宇超, 胡绍勇
Owner SHANGHAI GUAN AN INFORMATION TECH