Parallel Data Analysis Methods

A data analysis technology applied to unstructured text data retrieval, text database clustering/classification, semantic analysis, and related fields. It addresses the problems that traditional methods do not fully apply to the analysis and computation of social text, that data distribution is unbalanced, and that information intelligence, human-computer interaction, and automatic question answering are difficult to achieve.

Active Publication Date: 2020-05-12
CITY CLOUD TECH HANGZHOU CO LTD
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The sheer volume of data makes it difficult for traditional stand-alone machine learning and data mining algorithms to complete their calculations within an acceptable time, so the algorithms fail in practice.
Take instant messaging data as an example. Because it is updated in real time and is highly variable, a sharp increase in data volume complicates natural language processing and machine learning. The overall architecture of current parallel computing environments is not suited to efficient parallel processing of text data; in particular, their process management and cache management do not match the storage and distributed computing architecture that text mining algorithms require. Moreover, non-standard text makes it harder for users to understand information and discover events, which eventually leads to a serious imbalance in the data distribution.
In addition, Chinese words lack a semantic representation, so traditional methods are not fully applicable to the analysis and computation of social text, and information intelligence, human-computer interaction, and automatic question answering are difficult to realize.

Examples

Embodiment Construction

[0019] A detailed description of one or more embodiments of the invention is provided below, together with the accompanying figures that illustrate the principles of the invention. The invention is described in connection with these embodiments, but it is not limited to any embodiment. The scope of the invention is limited only by the claims, and the invention encompasses numerous alternatives, modifications, and equivalents. In the following description, numerous specific details are set forth to provide a thorough understanding of the invention. These details are provided as examples, and the invention may be practiced according to the claims without some or all of them.

[0020] One aspect of the present invention provides a parallelized data analysis method. Figure 1 is a flowchart of a parallelized data analysis method according to an embodiment of the present invention.

[0021] The present inventio...

Abstract

The invention provides a parallel data analysis method. The method comprises the following steps: establishing a Map/Reduce parallel computing environment for text data analysis; starting a text mining process on each node of the business cluster; and organizing the parallel processes that run independently on each node into parallel programs. The method improves the MapReduce parallel framework from the perspectives of process management and buffer management so that it better meets the needs of text data mining. To handle the irregularity of social text, semantic vectors are used to represent and analyze the text data effectively, which makes the method suitable for social text mining analysis and computation at various scales.
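The map and reduce steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented framework: worker processes from the standard-library `multiprocessing.Pool` stand in for cluster nodes, plain term counting stands in for the text mining process, and the function names (`map_count_terms`, `reduce_merge`, `parallel_term_counts`) are hypothetical. It does not model the process-management or buffer-management changes the abstract mentions.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool


def map_count_terms(shard):
    """Map step (hypothetical): count whitespace-separated terms in one shard."""
    counts = Counter()
    for text in shard:
        counts.update(text.lower().split())
    return counts


def reduce_merge(left, right):
    """Reduce step (hypothetical): merge two partial term counts."""
    return left + right


def parallel_term_counts(documents, n_workers=2):
    """Shard the documents, run the map step in independent worker
    processes (standing in for cluster nodes), then reduce the results."""
    shards = [documents[i::n_workers] for i in range(n_workers)]
    with Pool(processes=n_workers) as pool:
        partials = pool.map(map_count_terms, shards)
    return reduce(reduce_merge, partials, Counter())


if __name__ == "__main__":
    docs = ["big data mining", "social text mining", "parallel text analysis"]
    print(parallel_term_counts(docs).most_common(3))
```

Each worker sees only its own shard, so the map step needs no shared state; all coordination happens in the final merge.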

Description

Technical field

[0001] The invention relates to big data mining, in particular to a parallel data analysis method.

Background

[0002] Big data, especially social network data, contains huge commercial and social value. Managing and using these data effectively, and mining their value, will have a huge impact on enterprises and individuals. At the same time, while big data brings new development opportunities, it also brings many technical challenges. Traditional information processing and computing technologies struggle to process big data effectively. The effective processing of large-scale social network data faces major technical difficulties at multiple levels, such as data storage and algorithm analysis. The sheer volume of data makes it difficult for traditional stand-alone machine learning and data mining algorithms to complete their calculations within an acceptable time, so the algorithms fail in practice. Taking instan...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/332, G06F16/35, G06F9/48, G06F40/289, G06F40/30
CPC: G06F9/4881, G06F40/289, G06F40/30
Inventor: not disclosed
Owner: CITY CLOUD TECH HANGZHOU CO LTD