Method and system for efficient correlated topic model data processing

A data processing technology for topic models, applied in the fields of electrical digital data processing, special data processing applications, and program control design. It addresses problems of existing approaches such as increased system storage load, bottlenecks in the solution method, and difficulty handling large-scale data.

Inactive Publication Date: 2008-07-23
INST OF SOFTWARE - CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0004] Although the correlated topic model is, functionally, an ideal means of high-level text representation, its use is still largely confined to small amounts of data, and it is difficult to apply to large-scale data in real environments. The fundamental reason is a serious bottleneck in its solution method: first, the classic implementation relies on conventional serial computing, in which each step of the computing task must be performed in sequence and the result of one step is the starting point of the next.
In this way, at any point in time, all computing tasks can only b



Examples


Embodiment approach

[0054] The network topology of the present invention is a computer cluster, as shown in Figure 1. It consists of two basic components: one master control node and several computing nodes. The single master control node can be an ordinary PC and is mainly responsible for interface interaction, data distribution, and summarizing results. There are multiple computing nodes (in principle, their number is unlimited), and they may be computers of different types. The computing nodes carry the main computing workload of the solving task. The master control node and the computing nodes are connected through the network. Data is transmitted only between the master control node and each computing node; the computing nodes do not communicate with one another.
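The star-shaped topology described in [0054] can be sketched in a few lines. This is only an illustrative model under stated assumptions: the class names `MasterNode` and `ComputingNode`, the method names, and the round-robin distribution rule are hypothetical and are not taken from the patent, which does not say how documents are divided.

```python
from dataclasses import dataclass, field


@dataclass
class ComputingNode:
    """A computing node; it only ever holds data handed to it by the master."""
    name: str
    documents: list = field(default_factory=list)


@dataclass
class MasterNode:
    """The single master control node (hypothetical sketch)."""
    nodes: list

    def links(self):
        # Every network link has the master at one end: a star topology
        # with no link between two computing nodes.
        return [("master", node.name) for node in self.nodes]

    def distribute(self, documents):
        # ASSUMPTION: round-robin split; the patent does not specify the rule.
        for i, doc in enumerate(documents):
            self.nodes[i % len(self.nodes)].documents.append(doc)

    def summarize(self):
        # Result summary: here just the number of documents held per node.
        return {node.name: len(node.documents) for node in self.nodes}


master = MasterNode([ComputingNode(f"node{i}") for i in range(3)])
master.distribute([f"doc{j}" for j in range(7)])
print(master.links())
print(master.summarize())
```

Because every link includes the master, adding a computing node adds exactly one link, which is consistent with the statement that the number of computing nodes is in principle unlimited.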

[0055] The process flow of the method of the present invention is shown in Figure 2: the vertical direction represents sequential steps, and the horizontal direction represents ...


Abstract

The invention discloses an efficient correlated topic model data processing method and system. In the task initialization phase, the master control node first provides an initial model M0 and synchronizes it to all computing nodes; it then divides the task set and distributes the parts to the computing nodes. In the task execution phase, the data is processed over multiple rounds. In each round, the working threads of each computing node first perform local parallel computation to obtain the topic distributions and model statistics for that node's document subset; these are then sent to the master control node, which collects them and judges whether the data processing results have converged. The system of the invention comprises a master control node and a plurality of computing nodes, which together form a cluster computer system. The invention can greatly improve computation speed and expand the scope of computation.
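The two phases in the abstract can be illustrated with a small single-process simulation. It is a sketch under loud assumptions, not the patented method: the per-document step below is a toy mixture-of-unigrams update standing in for correlated topic model inference, the "nodes" are plain list shards visited one after another instead of networked machines, and all function names (`local_step`, `node_round`, `update_model`, `run_master`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def local_step(doc, model):
    """Toy per-document step (ASSUMPTION: mixture-of-unigrams stand-in,
    not correlated-topic-model inference)."""
    scores = []
    for topic in model:
        s = 1.0
        for w in doc:
            s *= topic[w]
        scores.append(s)
    total = sum(scores)
    theta = [s / total for s in scores]          # topic distribution of this document
    stats = [[0.0] * len(topic) for topic in model]
    for t, weight in enumerate(theta):
        for w in doc:
            stats[t][w] += weight                # expected topic-word counts
    return theta, stats


def add_stats(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def node_round(docs, model, threads=2):
    """One round on one computing node: worker threads handle documents in parallel."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(lambda d: local_step(d, model), docs))
    stats = [[0.0] * len(topic) for topic in model]
    for _, s in results:
        stats = add_stats(stats, s)
    return [theta for theta, _ in results], stats


def update_model(stats, smoothing=0.01):
    """Master-side update: normalize the collected statistics into a new model."""
    model = []
    for row in stats:
        total = sum(row) + smoothing * len(row)
        model.append([(c + smoothing) / total for c in row])
    return model


def run_master(documents, model0, n_nodes=2, max_rounds=50, tol=1e-6):
    # Initialization phase: split the task set; every node starts from model0.
    shards = [documents[i::n_nodes] for i in range(n_nodes)]
    model = model0
    for rounds in range(1, max_rounds + 1):
        # Execution phase: nodes are simulated sequentially in this sketch.
        node_results = [node_round(shard, model) for shard in shards]
        stats = [[0.0] * len(topic) for topic in model]
        for _, s in node_results:
            stats = add_stats(stats, s)          # master collects node statistics
        new_model = update_model(stats)
        change = max(abs(a - b)
                     for ra, rb in zip(new_model, model)
                     for a, b in zip(ra, rb))
        model = new_model                        # synchronized to nodes next round
        if change < tol:                         # convergence judged on the master
            break
    thetas = [node_thetas for node_thetas, _ in node_results]
    return model, thetas, rounds


# Documents are lists of word ids over a 4-word vocabulary (made-up toy data).
docs = [[0, 0, 1], [0, 1, 1], [2, 3, 3], [2, 2, 3]]
model0 = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
model, thetas, rounds = run_master(docs, model0)
print(rounds, model)
```

Only the summed statistics and the updated model cross the boundary between master and nodes, which matches the requirement that computing nodes never talk to each other. In CPython the worker threads show the structure rather than a real speedup for pure-Python arithmetic.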

Description

Technical field

[0001] The invention relates to a text representation method and system, in particular to an efficient data processing method and system based on latent-topic text representation, and belongs to the field of computer information retrieval.

Background art

[0002] Computer information retrieval is part of the essential infrastructure of the information society; the services it provides range from basic web search, through information filtering and classification, to various kinds of advanced data mining. In computer information retrieval, how text is represented is a fundamental issue. First, the objects processed are mainly text, and other types of information generally depend on text or on accompanying text. Furthermore, the text representation method is a prerequisite for computer information retrieval services, because the basic m...


Application Information

IPC(8): G06F17/30
CPC: G06F9/5027, G06F9/5066, G06F2209/5017
Inventors: 李文波, 孙乐
Owner: INST OF SOFTWARE - CHINESE ACAD OF SCI