Data extraction method

A data extraction method, applied in the field of communication, that addresses the problem of limited obtainable information.

Active Publication Date: 2019-05-10
深圳市科联汇通科技有限公司
AI Technical Summary

Problems solved by technology

The existing technology lacks effective cleaning and scientific analysis methods for this type of data, so the information that can be obtained from it is limited.

Examples

Embodiment Construction

[0041] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below.

[0042] An embodiment of the present invention provides a data cleaning method. The data includes first-class data and second-class data: the first-class data is directly published data, and the second-class data is comment data on the first-class data. As shown in Figure 1, the method includes:

[0043] S101. Acquire a data set, where the data set includes first-class data and second-class data.

[0044] S102. Preprocess the data set to obtain a data network set {d_i}. Each data network element in the set is recorded in the form d_i = {V, E}, where V is the user identification and E represents the comment relationship between second-class data published by one user identification and first-class data published by another user identification. Each vertex includes user identification, title and T...
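
The record form d_i = {V, E} in S102 can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the excerpt does not say how the data set is partitioned into networks, so the sketch builds one network per first-class post; the names Vertex and build_network are hypothetical; the third vertex field follows the abstract (user identifier, title, content); and giving a commenter's vertex the title of the post it comments on is a guess, not something the source states.

```python
from typing import NamedTuple


class Vertex(NamedTuple):
    """One vertex: user identifier, title, content (three parts, per the abstract)."""
    user_id: str
    title: str
    content: str


def build_network(post, comments):
    """Build one data network d_i = {V, E} from a post and its comments.

    Assumption: one network per first-class post; the source does not
    specify how the data set is split into networks.
    """
    author = Vertex(post["user"], post["title"], post["content"])
    vertices = [author]
    edges = []
    for comment in comments:
        # E links second-class data from one user to first-class data
        # from *another* user, so self-comments are skipped.
        if comment["user"] == post["user"]:
            continue
        # Assumption: a comment vertex reuses the title of the commented post.
        v = Vertex(comment["user"], post["title"], comment["content"])
        vertices.append(v)
        edges.append((v.user_id, author.user_id))
    return {"V": vertices, "E": edges}


post = {"user": "alice", "title": "Best phone", "content": "Which phone should I buy?"}
comments = [
    {"user": "bob", "content": "Look at battery life first."},
    {"user": "alice", "content": "Thanks, everyone."},
]
network = build_network(post, comments)
```

Under these assumptions, a data network set {d_i} would simply be a list of such dictionaries, one per post.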

Abstract

The invention provides a data extraction method. The data comprises first-class data and second-class data, where the first-class data is directly published data and the second-class data is comment data on the first-class data. The data extraction method comprises the following steps: obtaining a data set comprising the first-class data and the second-class data; preprocessing the data set to obtain a data network set {d_i}, wherein each data network element in the set is recorded in the form d_i = {V, E}, V is a user identifier, E represents the comment relation between second-class data published by one user identifier and first-class data published by another user identifier, and each vertex comprises three parts of data, namely the user identifier, a title and content; obtaining a theme vector set according to the titles of the vertices in the data network set; obtaining the relevancy between the identifier of each vertex in the data network set and each vector in the theme vector set to obtain a relevancy set; and performing data extraction according to the relevancy set. The method can effectively extract target users and important data related to the themes.
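
The last three steps of the abstract (theme vector set, relevancy set, extraction) can be sketched as follows. This is a rough illustration under stated assumptions, not the patented computation: the excerpt does not say how theme vectors or relevancy are calculated, so the sketch swaps in bag-of-words term counts and cosine similarity, scores each vertex by its content, and extracts users whose score meets a hypothetical threshold. All function names and the threshold value are invented for the example.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts (assumed representation, not from the source)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def theme_vectors(vertices):
    """One theme vector per distinct vertex title."""
    return {title: bow(title) for title in sorted({v["title"] for v in vertices})}


def relevancy_set(vertices, themes):
    """Relevancy of each (user, theme) pair; keeps the best score per pair."""
    rel = {}
    for v in vertices:
        for title, vec in themes.items():
            key = (v["user_id"], title)
            rel[key] = max(rel.get(key, 0.0), cosine(bow(v["content"]), vec))
    return rel


def extract_users(rel, threshold=0.3):
    """Users with at least one theme relevancy at or above the threshold."""
    return sorted({user for (user, _), score in rel.items() if score >= threshold})


vertices = [
    {"user_id": "alice", "title": "phone battery", "content": "my phone battery drains fast"},
    {"user_id": "bob", "title": "phone battery", "content": "battery life on this phone is great"},
    {"user_id": "carol", "title": "phone battery", "content": "nice weather today"},
]
themes = theme_vectors(vertices)
rel = relevancy_set(vertices, themes)
targets = extract_users(rel)
```

With this toy input, the two users whose comments mention the theme terms are extracted and the off-topic commenter is not. A different vector model or relevancy measure would slot into the same three-function shape.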

Description

Technical field

[0001] The invention relates to the communication field, in particular to a data extraction method.

Background

[0002] In the field of data analysis, it is often necessary to clean and extract data. Common interactive websites, such as Zhihu and Baidu Tieba, hold a large amount of data in which users comment on one another's posts. This type of data reflects users' personal preferences and can also be used to study current events and social phenomena, so it carries rich social information. It can be widely used in advertising target-user research, hot-issue research, public opinion supervision and other fields. However, the prior art lacks effective cleaning and scientific analysis methods for such data, so the information that can be obtained is limited.

Contents of the invention

[0003] In order to solve the above technical problems, the present invention provides a data extraction method.

[0004] The present invention is realized by ...

Application Information

IPC(8): G06F16/215
Inventor: 金涛江浩
Owner: 深圳市科联汇通科技有限公司