Big data processing method based on machine learning

A machine-learning-based big data processing technology, applied in digital data processing, special data processing applications, text database querying, etc. It addresses problems such as algorithm failure, non-standard (irregular) text, and traditional methods that cannot be fully applied to the analysis and computation of social text.

Active Publication Date: 2019-11-05
贵州航天云网科技有限公司

AI Technical Summary

Problems solved by technology

The huge amount of data makes it difficult for traditional stand-alone machine learning and data mining algorithms to complete their calculations within an acceptable time, resulting in algorithm failure.
Taking instant messaging data as an example: because it is updated in real time and is highly variable, a sharp increase in data volume complicates natural language processing and machine learning. The overall architecture of current parallel computing environments is not suited to efficient parallel processing of text data; in particular, process management and cache management do not match the storage and distributed computing architecture required by text mining algorithms. Moreover, non-standard text makes it harder for users to understand information and discover events, which eventually leads to a serious imbalance in the data distribution.
In addition, Chinese words lack a semantic representation, so traditional methods are not fully applicable to the analysis and computation of social text. This makes it difficult to realize intelligent information processing, human-computer interaction and automatic question answering.



Embodiment Construction

[0028] A detailed description of one or more embodiments of the invention is provided below, together with the accompanying Figure 1, which illustrates the principles of the invention. The invention is described in connection with such embodiments, but it is not limited to any embodiment. The scope of the invention is limited only by the claims, and the invention encompasses numerous alternatives, modifications and equivalents. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. These details are provided for the purpose of example, and the invention may be practiced according to the claims without some or all of them.

[0029] One aspect of the present invention provides a method for processing big data based on machine learning. Figure 1 is a flowchart of a machine learning-based big data processing method according to an embodiment of the present...



Abstract

The invention provides a big data processing method based on machine learning, comprising: given a search sentence, filtering the words of the initial search with a common stop-word list and retaining the meaningful search terms; representing the vocabulary as semantic vectors using a semantic block model; on the basis of the semantic vectors, using cosine similarity to find, for each initial search term, the several most similar words in the rest of the vocabulary as extended search terms; replacing the corresponding terms in the initial search sentence with the extended search terms and taking the newly generated search word sequence as an extended search sentence; and, by permuting and combining the extended search terms, obtaining extended search sentences with different forms of expression. The invention improves the MapReduce parallel framework to better meet the needs of text data mining. To address the non-standard character of social text, it uses semantic vectors to represent and analyze text data effectively, which makes it suitable for the mining, analysis and computation of social text at various scales.
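The query-expansion steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the stop-word list, the toy three-dimensional vectors in `VECTORS` and the function names are all hypothetical, and the hand-written vectors merely stand in for the output of the semantic block model, which the abstract does not specify. The sketch also keeps each original term as one of the options when forming combinations, which is an assumption about how the extended sentences are enumerated.

```python
import math
from itertools import product

# Hypothetical stop-word list and toy semantic vectors (illustrative only;
# a real system would obtain the vectors from a trained model).
STOP_WORDS = {"the", "a", "of", "in", "for", "to"}
VECTORS = {
    "cheap": [0.9, 0.1, 0.0],
    "low-cost": [0.85, 0.15, 0.05],
    "budget": [0.8, 0.2, 0.1],
    "phone": [0.1, 0.9, 0.1],
    "mobile": [0.15, 0.85, 0.1],
    "handset": [0.2, 0.8, 0.2],
    "weather": [0.0, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_terms(sentence):
    """Step 1: drop stop words and keep the meaningful search terms."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]


def nearest_terms(term, k=2):
    """Step 2: the k vocabulary words most similar to `term` by cosine similarity."""
    if term not in VECTORS:
        return []
    scored = [
        (cosine(VECTORS[term], vec), other)
        for other, vec in VECTORS.items()
        if other != term
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def expand_query(sentence, k=2):
    """Steps 3-4: substitute extended terms and enumerate every combination."""
    terms = filter_terms(sentence)
    # Each position may hold the original term or one of its k neighbours.
    options = [[t] + nearest_terms(t, k) for t in terms]
    return [" ".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    for query in expand_query("the cheap phone"):
        print(query)
```

With two neighbours per term, a two-term query yields 3 × 3 = 9 candidate sentences. The number of combinations grows exponentially with query length, so a practical system would likely cap `k` or prune low-similarity candidates.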

Description

Technical field

[0001] The invention relates to big data mining, and in particular to a machine learning-based big data processing method.

Background

[0002] Big data, especially social network data, contains huge commercial and social value. Managing and using these data effectively, and mining their value, will have a major impact on enterprises and individuals. At the same time, while big data brings new development opportunities, it also brings many technical challenges. Traditional information processing and computing technologies struggle to handle big data effectively. Processing large-scale social network data faces major technical difficulties at multiple levels, such as data storage and algorithm analysis. The huge amount of data makes it difficult for traditional stand-alone machine learning and data mining algorithms to complete calculations within an acceptable time, resulting in algorithm fai...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27, G06F16/33, G06N3/04
CPC: G06F40/30, G06N3/045
Inventor: not disclosed (inventor name withheld at the applicant's request)
Owner: 贵州航天云网科技有限公司