Big data processing method based on machine learning

This technology combines big data processing and machine learning and is applied in digital data processing, special data processing applications, text database querying, and similar fields. It addresses the problems that traditional methods cannot be fully applied to the analysis and calculation of social text, that data distribution is unbalanced, and that information intelligence, human-computer interaction, and automatic question answering are difficult to realize.

Active Publication Date: 2019-01-15
贵州航天云网科技有限公司
Cites: 5 | Cited by: 7

AI Technical Summary

Problems solved by technology

The sheer volume of data makes it difficult for traditional stand-alone machine learning and data mining algorithms to complete their calculations within an acceptable time, causing the algorithms to fail.
Taking instant messaging data as an example: because such data is updated in real time and is highly variable, a dramatic increase in data volume makes natural language processing and machine learning more complicated. The overall architecture of current parallel computing environments is not suited to efficient parallel processing of text data; process management and cache management in particular do not match the storage and distributed computing architecture that text mining algorithms require. Moreover, non-standard text makes it harder for users to understand information and discover events, which eventually leads to a serious imbalance in the data distribution.
In addition, Chinese words lack a semantic representation, so traditional methods are not fully applicable to the analysis and calculation of social text, and information intelligence, human-computer interaction, and automatic question answering are difficult to realize.

Embodiment Construction

[0028] A detailed description of one or more embodiments of the invention is provided below, together with the accompanying Figure 1, which illustrates the principles of the invention. The invention is described in connection with these embodiments, but it is not limited to any embodiment. The scope of the invention is limited only by the claims, and the invention encompasses numerous alternatives, modifications, and equivalents. In the following description, numerous specific details are set forth to provide a thorough understanding of the invention. These details are provided as examples, and the invention may be practiced according to the claims without some or all of them.

[0029] One aspect of the present invention provides a method for processing big data based on machine learning. Figure 1 is a flowchart of a machine learning-based big data processing method according to an embodiment of the present...

Abstract

The invention provides a big data processing method based on machine learning, comprising the following steps. Given a retrieval sentence, the words in the initial retrieval are filtered using a universal stop word list, and the meaningful retrieval words are retained. A semantic block model is used to represent the lexical semantic vectors. On the basis of these semantic vectors, cosine similarity is used to find, for each initial search term, the closest word among the other words, which serves as an extended search term. The extended search terms replace the corresponding original search terms in the initial search, and the newly generated sequence of search terms is used as the extended search. Permutations and combinations of the extended search terms then yield extended search sentences with different expressions. The invention improves the MapReduce parallel framework so that it is better adapted to the needs of text data mining. To address the irregularity of social text, semantic vectors are used to represent and analyze the text data effectively, making the method suitable for social text mining analysis and calculation at various scales.
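The query expansion steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the stop word list, the three-dimensional toy vectors, and the function names (`filter_stop_words`, `nearest_word`, `expand_query`) are all assumptions made for illustration, and the vectors are hand-written rather than produced by the semantic block model. The sketch also keeps the original term alongside its nearest neighbour, so the original sentence appears among the outputs.

```python
from itertools import product
from math import sqrt

# Hypothetical stop word list and toy word vectors (assumptions for illustration;
# the patent derives its vectors from a "semantic block model" not shown here).
STOP_WORDS = {"the", "a", "of", "for", "in", "to"}
VECTORS = {
    "cheap": (0.9, 0.1, 0.0),
    "affordable": (0.85, 0.15, 0.05),
    "phone": (0.1, 0.9, 0.1),
    "smartphone": (0.12, 0.88, 0.15),
    "review": (0.0, 0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_stop_words(query):
    """Lower-case the query, split on whitespace, and drop stop words."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]


def nearest_word(word):
    """Return the vocabulary word most similar to `word`, or None if it has no vector."""
    if word not in VECTORS:
        return None
    candidates = [w for w in VECTORS if w != word]
    return max(candidates, key=lambda w: cosine(VECTORS[word], VECTORS[w]))


def expand_query(query):
    """Build every combination of original terms and their nearest neighbours."""
    options = []
    for term in filter_stop_words(query):
        neighbour = nearest_word(term)
        options.append([term] if neighbour is None else [term, neighbour])
    return [" ".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    for sentence in expand_query("a cheap phone"):
        print(sentence)
```

With these toy vectors, "cheap" pairs with "affordable" and "phone" with "smartphone", so a two-term query yields four sentence variants. A real system would take one or more neighbours per term from a trained embedding and would probably apply a similarity threshold before accepting a replacement.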

Description

Technical field

[0001] The invention relates to big data mining, and in particular to a machine learning-based big data processing method.

Background

[0002] Big data, especially social network data, holds huge commercial and social value. Managing and using this data effectively, and mining its value, will have a major impact on enterprises and individuals. At the same time, while big data brings new development opportunities, it also brings many technical challenges. Traditional information processing and computing technology struggles to handle big data effectively. Processing large-scale social network data faces major technical difficulties at multiple levels, including data storage and algorithm analysis. The huge amount of data makes it difficult for traditional stand-alone machine learning and data mining algorithms to complete calculations within an acceptable time, resulting in algorithm failur...

Application Information

IPC(8): G06F17/27, G06F16/33, G06N3/04
CPC: G06F40/30, G06N3/045
Inventor: name withheld at the inventor's request (不公告发明人)
Owner: 贵州航天云网科技有限公司