A Multi-source Heterogeneous Data Acquisition Method
A multi-source heterogeneous data acquisition method, applicable to fields such as database retrieval, electronic digital data processing, and special-purpose data processing, which addresses the problem that data collection is time-consuming and labor-intensive.
Examples
Embodiment 1
[0077] As shown in Figure 1, the flowchart of a multi-source heterogeneous data acquisition method includes:
[0078] Step 100: Establishing a keyword table;
[0079] Step 101: Obtaining the collection content of each data source, and creating a corresponding collection grammar;
[0080] Step 102: Establishing data collection rules according to the collection grammar;
[0081] Step 103: Associating the data collection rules with the corresponding keywords in the keyword table to perform multi-source heterogeneous data collection.
[0082] The principle of the above technical solution is as follows: for multi-source data acquisition, the present invention first determines the keyword table of the multi-source heterogeneous data. The keyword table is determined by the data source, and the keyword table of each data source includes the keywords of the data output by that data source and the keywords of the carrier device of that data source. ...
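Steps 100 to 103 can be illustrated with a minimal sketch. The names `CollectionRule` and `build_rules`, the example data sources, and the choice of a regular expression or an SQL string as the "collection grammar" are all assumptions made for illustration; the source does not specify how grammars or rules are represented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CollectionRule:
    """Hypothetical container for one data collection rule."""
    source: str                      # name of the data source
    grammar: str                     # collection grammar for that source (assumed to be a string)
    keywords: List[str] = field(default_factory=list)


def build_rules(keyword_table: Dict[str, List[str]],
                grammars: Dict[str, str]) -> List[CollectionRule]:
    """Steps 102 and 103: create one rule per grammar and attach the source's keywords."""
    rules = []
    for source, grammar in grammars.items():
        keywords = list(keyword_table.get(source, []))
        rules.append(CollectionRule(source, grammar, keywords))
    return rules


# Step 100: keyword table per data source (data keywords plus carrier device keywords).
keyword_table = {
    "sensor_feed": ["temperature", "gateway"],
    "order_db": ["invoice", "sql_server"],
}

# Step 101: one collection grammar per data source (formats are illustrative only).
grammars = {
    "sensor_feed": r"temperature=(\d+)",
    "order_db": "SELECT * FROM invoices",
}

rules = build_rules(keyword_table, grammars)
for rule in rules:
    print(rule.source, rule.keywords)
```

Keeping the grammar and the keywords on the same rule object means a collector can look up, for any keyword, which sources and grammars apply to it.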
Embodiment 2
[0085] As an embodiment of the present invention, the keyword table is established as follows.
[0086] Establishing the keyword table includes:
[0087] Obtaining the data source of the multi-source heterogeneous data, and determining the data source keywords;
[0088] Obtaining the type features of the data content in the multi-source heterogeneous data, and determining the type feature keywords;
[0089] Determining the adjacent words of the data source keywords and type feature keywords, and using the adjacent words as supplementary words;
[0090] Establishing a three-dimensional heterogeneous keyword table from the data source keywords, type feature keywords, and supplementary words.
[0091] The principle of the above technical solution is as follows: when constructing the keyword table, the present invention first determines the data source of the multi-source heterogeneous data, then determines the type feature of...
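The construction in paragraphs [0087] to [0090] might look like the sketch below. It rests on two assumptions that the source does not confirm: "adjacent words" are taken to be the tokens immediately before and after a keyword in some sample text, and "three-dimensional" is taken to mean three axes (source, type, supplementary). The function names are hypothetical.

```python
from typing import Dict, List


def adjacent_words(tokens: List[str], keywords: List[str]) -> List[str]:
    """Collect tokens directly next to any keyword, skipping keywords and duplicates."""
    keyset = set(keywords)
    found: List[str] = []
    for i, token in enumerate(tokens):
        if token not in keyset:
            continue
        for j in (i - 1, i + 1):
            if 0 <= j < len(tokens):
                neighbor = tokens[j]
                if neighbor not in keyset and neighbor not in found:
                    found.append(neighbor)
    return found


def build_keyword_table(source_keywords: List[str],
                        type_keywords: List[str],
                        tokens: List[str]) -> Dict[str, List[str]]:
    """Assemble a table with three axes: source, type, and supplementary words."""
    supplementary = adjacent_words(tokens, source_keywords + type_keywords)
    return {
        "source": list(source_keywords),
        "type": list(type_keywords),
        "supplementary": supplementary,
    }


# Sample text standing in for a description of the data source (illustrative only).
tokens = "hourly temperature readings from gateway devices".split()
table = build_keyword_table(["gateway"], ["temperature"], tokens)
print(table)
```

A real system would probably filter out stop words such as "from" before treating neighbors as supplementary words; that step is omitted here to keep the sketch short.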
Embodiment 3
[0094] Establishing the keyword table further includes:
[0095] Preprocessing the keywords in the keyword table;
[0096] Determining the part of speech of each keyword according to the preprocessing;
[0097] Dividing the keywords based on the part of speech, and determining the divided characters;
[0098] Calculating, based on a preset keyword length screening algorithm, the length of each keyword after character division;
[0099] Comparing the length of each keyword after character division with a preset ideal length, and determining the degree of difference between the two;
[0100] Deleting, based on the degree of difference and a preset difference-degree threshold, the keywords whose difference is relatively large, and determining the keyword table from the remaining keywords.
[0101] The principle...
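The length screening in paragraphs [0098] to [0100] can be sketched as follows. The source does not give the screening algorithm, the ideal length, or the threshold, so everything concrete here is an assumption: division splits on non-alphanumeric characters as a stand-in for part-of-speech-based division, length is the total character count of the segments, and the degree of difference is the relative deviation from the ideal length.

```python
import re
from typing import List


def segment(keyword: str) -> List[str]:
    """Stand-in for part-of-speech-based division: split on non-alphanumeric characters."""
    parts = re.split(r"[^0-9A-Za-z]+", keyword.strip().lower())
    return [p for p in parts if p]


def difference_degree(keyword: str, ideal_length: int) -> float:
    """Relative deviation of the segmented keyword length from the ideal length (assumed metric)."""
    length = sum(len(part) for part in segment(keyword))
    return abs(length - ideal_length) / ideal_length


def filter_keywords(keywords: List[str],
                    ideal_length: int = 8,
                    max_difference: float = 0.5) -> List[str]:
    """Keep keywords whose degree of difference does not exceed the threshold."""
    return [k for k in keywords
            if difference_degree(k, ideal_length) <= max_difference]


candidates = ["gateway", "temperature", "id",
              "multi source heterogeneous acquisition"]
kept = filter_keywords(candidates)
print(kept)  # ['gateway', 'temperature']
```

With an ideal length of 8 and a threshold of 0.5, "id" (2 characters) and the 35-character phrase both deviate too far and are dropped, while "gateway" (7) and "temperature" (11) are kept.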
