SMO parallel processing method orientated at multi-core cluster

A parallel-processing and cluster technique, applied in fields such as concurrent instruction execution and machine execution devices, that addresses problems such as large storage requirements, high cost, and slow response speed.

Status: Inactive. Publication Date: 2014-04-09
COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI
Cites: 3; Cited by: 0

AI Technical Summary

Problems solved by technology

Sogou's current classification workload requires handling the following sample volume: each sample file has 10,000 lines, there are 2 sample files (2 categories) per day, and the script is set to learn from 10 days of sample files, giving 200,000 samples. The size of the kernel matrix is the square of the number of samples, that is, 4*10^10 elements for the kernel matrix alone. Assuming 4 bytes per element and taking the symmetry of the matrix into account, storage needs to be greater than 4 * 4*10^10 / 2 bytes = 80 GB. Storing the samples therefore requires a large amount of space at high cost, and the storage response speed is slow when dealing with large-scale data.
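The storage estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the variable names and the `triangle_bytes` helper are illustrative and do not come from the patent. The helper counts the upper triangle including the diagonal, which is one reason the requirement is stated as "greater than" 80 GB.

```python
# Figures taken from the text: 10,000 lines per file, 2 files per day, 10 days.
lines_per_file = 10_000
files_per_day = 2
days = 10
bytes_per_element = 4  # assumed element size, as stated in the text

n_samples = lines_per_file * files_per_day * days
n_elements = n_samples ** 2                         # full kernel matrix
half_bytes = n_elements * bytes_per_element // 2    # estimate used in the text


def triangle_bytes(n, element_size):
    """Bytes needed for the upper triangle of an n x n symmetric matrix,
    diagonal included (hypothetical helper, not part of the patent)."""
    return n * (n + 1) // 2 * element_size


print(n_samples, n_elements, half_bytes / 1e9)
print(triangle_bytes(n_samples, bytes_per_element) / 1e9)
```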

Detailed Description of the Embodiments

[0021] The technical solutions of the present invention will be described in further detail below with reference to the accompanying drawings and embodiments.

[0022] Figure 1 is a flowchart of SMO parallel processing oriented to a multi-core cluster according to an embodiment of the present invention. Every step is executed by the processes. This embodiment comprises the following steps:

[0023] Step S101: obtain the local problem parameters of the data to be classified; obtain the global parameters from the local problem parameters; assign initial values to the local problem parameters and the algorithm parameters according to the global parameters.

[0024] All processes perform initialization: each assigns initial values to the algorithm parameters, setting the multipliers to α_i = 0 and the intermediate variables to E_i = -y_i, where y_i is the label of sample i and E_i is the local problem parameter.
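The initialization in [0024] can be sketched as follows. This is only an illustration under stated assumptions: the contiguous block partitioning in `split_labels` and both function names are hypothetical, and the processes are simulated as list entries rather than real cluster processes.

```python
def split_labels(labels, n_procs):
    """Split the labels into n_procs contiguous blocks.
    The block layout is an assumption made for this sketch."""
    block = -(-len(labels) // n_procs)  # ceiling division
    return [labels[r * block:(r + 1) * block] for r in range(n_procs)]


def init_local_state(local_labels):
    """Per-process initialization: alpha_i = 0 and E_i = -y_i."""
    alpha = [0.0 for _ in local_labels]
    E = [-float(y) for y in local_labels]
    return alpha, E


# Five labelled samples shared between two simulated processes.
labels = [1, -1, 1, 1, -1]
states = [init_local_state(part) for part in split_labels(labels, 2)]
```

Each entry of `states` holds the multipliers and the E values for one process's share of the samples, so no process needs the labels held by another to initialize.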

[0025] α i = ...

Abstract

The invention relates to an SMO parallel processing method oriented to a multi-core cluster. The method comprises the following steps: initial values are assigned to the local problem parameters according to the global parameters, and initial values are assigned to the algorithm parameters; a local first boundary and a local second boundary are calculated from the initial values of the local problem parameters; a global first boundary and a global second boundary are obtained from the local first and second boundaries; while the difference between the global first boundary and the global second boundary is not smaller than a preset accuracy, the first multiplier corresponding to the global first boundary and the second multiplier corresponding to the global second boundary are calculated iteratively; the local problem parameters are updated with multiple threads after each iteration; when the preset number of iterations is reached, a local solution for the data to be classified is calculated from the local sample multipliers, a global solution is obtained from the local solutions, and the data classification is complete. The method addresses the high cost, high error rate, and slow response speed of conventional data classification.
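The boundary step of the abstract can be sketched in a few lines. The sketch rests on an assumption the source does not confirm: that the "first" and "second" boundaries correspond to the b_up and b_low thresholds of the commonly used modified SMO algorithm, where b_up is the minimum E_i over one index set and b_low is the maximum over another. The function names, the penalty parameter `C`, and the tolerance `tol` are hypothetical, and the processes are simulated as list entries.

```python
def local_bounds(alpha, y, E, C):
    """Return ((b_up, i_up), (b_low, i_low)) for one process's samples.
    Indices are local to the process; -1 means no candidate was found."""
    up = (float("inf"), -1)
    low = (float("-inf"), -1)
    for i, (a, yi, e) in enumerate(zip(alpha, y, E)):
        free = 0 < a < C
        in_up = free or (yi == 1 and a == 0) or (yi == -1 and a == C)
        in_low = free or (yi == 1 and a == C) or (yi == -1 and a == 0)
        if in_up and e < up[0]:
            up = (e, i)
        if in_low and e > low[0]:
            low = (e, i)
    return up, low


def global_bounds(per_process):
    """Reduce the local bounds to global ones as (value, rank, local index)."""
    ups = [(b, rank, i) for rank, ((b, i), _) in enumerate(per_process)]
    lows = [(b, rank, i) for rank, (_, (b, i)) in enumerate(per_process)]
    return min(ups), max(lows)


def needs_iteration(b_up, b_low, tol):
    """Keep iterating while the gap is not smaller than the preset accuracy."""
    return (b_low - b_up) >= tol


# Two simulated processes right after initialization (alpha = 0, E = -y).
parts_y = [[1, -1], [1, 1, -1]]
local = [
    local_bounds([0.0] * len(y), y, [-float(v) for v in y], C=1.0)
    for y in parts_y
]
g_up, g_low = global_bounds(local)
keep_going = needs_iteration(g_up[0], g_low[0], tol=1e-3)
```

In a real cluster the reduction in `global_bounds` would be a collective operation across processes; here a plain `min` and `max` over a list stand in for it. The rank and local index returned with each bound identify which process holds the two multipliers to be updated next.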

Description

Technical field

[0001] The invention relates to data classification, and in particular to an SMO parallel processing method oriented to a multi-core cluster.

Background

[0002] The problem to be solved by the present invention is the parallelization of text classification, which both speeds up text classification and distributes and stores the data more reasonably and effectively. The technical problem comes from Sogou. Sogou Search is an interactive Chinese search engine, and classification technology is one of its core technologies. Because the scale of data on the modern Internet keeps growing, the volume of samples that need to be classified is huge. Storing all of this data in one place and on one disk clearly cannot cope with future growth in data. Therefore, there is an urgent need for a technical solution to realize the distributed storage and parallel classification of these ...

Application Information

IPC(8): G06F9/38
Inventor: 迟学斌, 高原, 王珏, 单桂华, 田东, 刘俊
Owner COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI