Internet big data analysis and extraction method

An extraction method in the field of big data technology that addresses degraded business analysis results, redundant data processing, and poor user experience, while achieving high computing efficiency and improved accuracy.

Pending Publication Date: 2022-03-01
CHINA INFORMATION CONSULTING & DESIGNING INST CO LTD

AI Technical Summary

Problems solved by technology

[0004] Existing intelligent big data processing systems have at least the following disadvantages: existing data technology lacks analysis of unstructured data, so a large amount of useful information is lost and business analysis results suffer; existing data analysis and extraction relies too heavily on manual feature extraction, which gives low accuracy, poor computing efficiency, and slow responses to user requests, degrading the user experience; and different services usually use different data processing and feature extraction methods, which leads to a large amount of redundant data processing and to data features that are incompatible across services.

Embodiment Construction

[0052] Referring to Figure 1, the present invention provides an Internet big data analysis and extraction method comprising the following steps:

[0053] S1. According to the characteristics of the data, divide the data objects into different parts and types, then analyze them further to obtain the range of data to be extracted;

[0054] S2. Determine the causal relationship between variables by specifying the dependent variable and the independent variable, establish a regression model, solve the model's parameters from the measured data, and then evaluate whether the regression model fits the measured data well. If the fit is good, the range of data to be extracted can be narrowed further according to the independent variable. The similarity matching algorithm can be applied to, for example, data cleaning, user input error correction, recommendation statistics, plagiarism detection systems, automatic scoring systems, web search and DN...
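The regression step in S2 can be illustrated with a minimal sketch. The source does not say which regression model, goodness-of-fit measure, or narrowing rule is used, so everything here is an assumption: a one-variable least-squares line, the coefficient of determination R² as the fit check, and a simple interval filter on the independent variable. The names `fit_line`, `r_squared`, `narrow_range`, the `min_r2` threshold, and the sample points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on the measured data."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def narrow_range(points, x_low, x_high, min_r2=0.9):
    """Assumed narrowing rule: if the fit is good, keep only the points whose
    independent variable lies in [x_low, x_high]; otherwise return them unchanged."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    slope, intercept = fit_line(xs, ys)
    if r_squared(xs, ys, slope, intercept) < min_r2:
        return points
    return [(x, y) for x, y in points if x_low <= x <= x_high]


# Hypothetical measured data: (independent variable, dependent variable).
points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
slope, intercept = fit_line(*zip(*points))
narrowed = narrow_range(points, 2, 4)
```

Because the fit is checked before filtering, a poorly fitting model leaves the extraction range untouched, which matches the conditional wording of S2.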

Abstract

The invention provides an Internet big data analysis and extraction method comprising the following steps: step 1, dividing data objects into different parts and types according to the characteristics of the data to obtain the range of data to be extracted; step 2, establishing a regression model, solving each parameter of the model from the measured data, evaluating whether the regression model fits the measured data, and, if it does, further narrowing the range of data to be extracted according to the independent variable; step 3, dividing the data into more than two aggregation classes (clusters) according to its characteristic attributes and grouping the data to be captured, so that the elements in each class share the same characteristics; step 4, calculating the degree of similarity between two pieces of data using a similarity matching method; step 5, using word frequency as a statistical index to indicate the data segment information fed back by the data; and step 6, obtaining the data analysis result. The method runs automatically using a representation learning algorithm based on embedding mapping, and its computing efficiency is high.
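Steps 3 to 5 of the abstract can be sketched roughly as follows. The abstract names neither the clustering method nor the similarity measure, so this sketch makes assumptions: grouping by one shared attribute stands in for clustering, normalized Levenshtein edit distance stands in for the similarity matching method, and word frequency is counted on whitespace-separated tokens. All function names and sample records are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_attribute(records, key):
    """Step 3 (simplified): records sharing an attribute value form one class."""
    classes = defaultdict(list)
    for record in records:
        classes[record[key]].append(record)
    return dict(classes)


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Step 4 (assumed measure): 1.0 for identical strings, 0.0 for fully different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(a, b) / longest


def word_frequencies(text):
    """Step 5: count how often each lowercase token occurs."""
    return Counter(text.lower().split())


# Hypothetical sample records.
records = [
    {"source": "news", "text": "big data analysis"},
    {"source": "blog", "text": "big data analysys"},
    {"source": "news", "text": "data extraction from big data"},
]
classes = group_by_attribute(records, "source")
score = similarity(records[0]["text"], records[1]["text"])
freqs = word_frequencies(records[2]["text"])
```

A high `score` between two records suggests near-duplicates, which is where the applications listed in the description, such as data cleaning and input error correction, would come in.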

Description

Technical field

[0001] The invention belongs to the technical field of big data, and in particular relates to a method for analyzing and extracting Internet big data.

Background technique

[0002] Big data refers to a collection of data that cannot be captured, managed, and processed by conventional software tools within a certain period of time. It is a massive, fast-growing, and diverse information asset that requires new processing models in order to deliver stronger decision-making power, insight and discovery, and process optimization capabilities.

[0003] At present, many web crawlers are used to grab relevant information from public websites and then structure and store it. This process may be disturbed by a large amount of useless information, such as expired information and phishing website information, so the accuracy and practicality of the data are low. Therefore, it is necessary to study the Internet data extraction method in depth to solve the problem of im...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/958, G06F16/953, G06F16/906
CPC: G06F16/958, G06F16/953, G06F16/906
Inventor: 陈大海, 张冰, 徐浩, 葛卫春
Owner: CHINA INFORMATION CONSULTING & DESIGNING INST CO LTD