A Multi-source Heterogeneous Data Acquisition Method

A multi-source heterogeneous data acquisition technology, applicable to database retrieval, electronic digital data processing, and special-purpose data processing applications, that addresses the time-consuming and labor-intensive nature of data collection.

Active Publication Date: 2020-12-18
BEIJING TONGTECH CO LTD +3
Cites: 3 | Cited by: 0

AI Technical Summary

Problems solved by technology

[0004] The invention provides a multi-source heterogeneous data acquisition method that addresses the time-consuming and labor-intensive nature of data acquisition.

Examples

Embodiment 1

[0077] As shown in Figure 1, the flowchart of the multi-source heterogeneous data acquisition method includes:

[0078] Step 100: Establish a keyword table;

[0079] Step 101: Obtain the collection content of each data source, and create a corresponding collection grammar;

[0080] Step 102: Establish data collection rules according to the collection grammar;

[0081] Step 103: Associate the data collection rules with the corresponding keywords in the keyword table to perform multi-source heterogeneous data collection.

[0082] The principle of the above technical solution is as follows: for multi-source data acquisition, the present invention first determines the keyword table of the multi-source heterogeneous data. The keyword table is determined by the data source, and includes the keywords of the data output by the data source and the keywords of the carrier device of the data source. ...
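
Steps 100 to 103 can be sketched in Python as follows. This is only an illustrative reading of the flow, not the patented implementation: the `CollectionRule` class, the function names, the dictionary layout of a data source, and the SQL-like grammar template are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CollectionRule:
    """Hypothetical rule shape: one source, its grammar, and bound keywords."""
    source: str
    grammar: str
    keywords: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Fill the grammar template with the keywords bound in Step 103.
        quoted = ", ".join(repr(k) for k in self.keywords)
        return self.grammar.format(keywords=quoted)


def build_keyword_table(sources: Dict[str, dict]) -> Dict[str, List[str]]:
    # Step 100: per source, combine the keywords of its output data
    # with the keywords of its carrier device.
    return {
        name: list(info["data_keywords"]) + list(info["device_keywords"])
        for name, info in sources.items()
    }


def create_grammar(name: str, info: dict) -> str:
    # Step 101: one grammar per source, based on the content collected from it.
    # The SQL-like template is an assumption made for illustration.
    fields = ", ".join(info["content"])
    return f"SELECT {fields} FROM {name} WHERE keyword IN ({{keywords}})"


def build_rules(sources: Dict[str, dict]) -> List[CollectionRule]:
    keyword_table = build_keyword_table(sources)              # Step 100
    rules = []
    for name, info in sources.items():
        grammar = create_grammar(name, info)                  # Step 101
        rule = CollectionRule(source=name, grammar=grammar)   # Step 102
        rule.keywords = keyword_table[name]                   # Step 103
        rules.append(rule)
    return rules
```

A source described as `{"content": [...], "data_keywords": [...], "device_keywords": [...]}` yields one rule whose `render()` output is the grammar with that source's keywords filled in.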

Embodiment 2

[0085] As an embodiment of the present invention:

[0086] Establishing the keyword table includes:

[0087] Obtain the data sources of the multi-source heterogeneous data, and determine the data source keywords;

[0088] Obtain the type features of the data content in the multi-source heterogeneous data, and determine the type feature keywords;

[0089] Determine the words adjacent to the data source keywords and type feature keywords, and use these adjacent words as supplementary words;

[0090] Establish a three-dimensional heterogeneous keyword table from the data source keywords, type feature keywords, and supplementary words.

[0091] The principle of the above technical solution is as follows: when constructing the keyword table, the present invention first determines the data source of the multi-source heterogeneous data, then determines the type feature of...
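
The construction in [0087] to [0090] can be sketched as below. The text does not define "adjacent words", so this example assumes they are the tokens immediately before or after a keyword in sample text from the source; the function names and the dictionary with three axes are likewise hypothetical.

```python
def adjacent_words(keyword, tokens):
    """Return the tokens directly before or after each occurrence of keyword."""
    neighbours = set()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            if i > 0:
                neighbours.add(tokens[i - 1])
            if i + 1 < len(tokens):
                neighbours.add(tokens[i + 1])
    return neighbours


def build_keyword_table(source_keywords, type_keywords, tokens):
    """Build a table with three axes: source, type feature, supplementary."""
    base = set(source_keywords) | set(type_keywords)
    supplementary = set()
    for kw in base:
        supplementary |= adjacent_words(kw, tokens)
    # Supplementary words are only those not already used as keywords.
    supplementary -= base
    return {
        "source": sorted(source_keywords),
        "type": sorted(type_keywords),
        "supplementary": sorted(supplementary),
    }
```

Under a different reading (for example, adjacency as semantic similarity), only `adjacent_words` would need to change.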

Embodiment 3

[0094] Establishing the keyword table further includes:

[0095] Preprocess the keywords in the keyword table;

[0096] Based on the preprocessing, determine the part of speech of each keyword;

[0097] Divide the keywords based on part of speech, and determine the divided characters;

[0098] Using a preset keyword length screening algorithm, compute the length of each keyword after character division;

[0099] Compare the length of each keyword after character division with a preset ideal length, and determine the degree of difference between the two;

[0100] Delete the keywords whose degree of difference exceeds a preset threshold, and determine the keyword table from the keywords that remain.

[0101] The principle...
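
The screening in [0095] to [0100] can be sketched as follows. This is a minimal illustration under stated assumptions: the toy `POS` lookup stands in for a real part-of-speech tagger, length is counted in content-word tokens, and the default `ideal_length` and `max_difference` values are placeholders, since the text gives none of these details.

```python
# Toy part-of-speech lookup (assumption); unknown words default to NOUN.
POS = {"the": "DET", "a": "DET", "of": "ADP", "in": "ADP"}
CONTENT_POS = {"NOUN", "VERB", "ADJ"}


def divide(keyword):
    """Preprocess a keyword, then keep only its content-word tokens."""
    tokens = keyword.strip().lower().split()
    return [t for t in tokens if POS.get(t, "NOUN") in CONTENT_POS]


def filter_keywords(keywords, ideal_length=2, max_difference=1):
    """Drop keywords whose divided length is too far from the ideal length."""
    kept = []
    for kw in keywords:
        length = len(divide(kw))
        difference = abs(length - ideal_length)
        if difference <= max_difference:
            kept.append(kw)
    return kept
```

With these defaults, a long phrase such as "the record of the daily network billing system log" is dropped, while "billing record" and "log" are kept.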

Abstract

The invention provides a multi-source heterogeneous data acquisition method. The method comprises the steps of: establishing a keyword table; acquiring the acquisition content of each data source and creating a corresponding acquisition grammar; establishing a data acquisition rule according to the acquisition grammar; and associating the data acquisition rule with the corresponding keyword in the keyword table. The method has the following beneficial effects. Constructing the keyword table of the data source improves the comprehensiveness of data acquisition. The acquisition grammar is determined on the basis of an abstract syntax tree through a custom reflection rule, so the grammar is flexible and variable, and the custom reflection rule allows it to meet data acquisition requirements. Because the data acquisition rule is constructed according to the acquisition grammar, data acquisition can be carried out dynamically. Because the data acquisition rule is associated with the keywords of the keyword table, data acquisition is more comprehensive, rule gaps are better addressed through dynamic updating, and the acquired data is more accurate.

Description

Technical field

[0001] The invention relates to the technical field of data collection, in particular to a multi-source heterogeneous data collection method.

Background

[0002] At present, with the rapid development of Internet technology, enterprises, governments, and various organizations and groups need to collect data from various data sources based on their own data collection needs. For example: in manufacturing, they collect production, procurement, sales order, service, and financial data; in government, they collect data on industry and commerce, taxation, human resources, and civil affairs; in telecommunications, they collect data on network services, call billing systems, and customer service systems. Data assets are formed by collecting data from the various production links for analysis.

[0003] However, in the prior art, when the data is retrieved and collected through the comprehensive data processing and analysis system, because th...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/90, G06F40/253
CPC: G06F16/90, G06F40/253
Inventor: 张春林, 李利军, 李春青, 常江波, 尚雪松
Owner: BEIJING TONGTECH CO LTD