Support vector machine training method based on Spark framework

This support vector machine training method, applied to SVM training on the Spark framework, addresses problems such as high computational complexity, SVDD running time that grows with the size of the training set, and the increasingly urgent need for parallel solution methods. Its effects are a faster solution process and fewer computing instruction cycles.

Active Publication Date: 2018-06-05
北京寄云鼎城科技有限公司
6 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, with the exponential growth in data volume, the memory and CPU of a single machine can no longer meet demand, and the need for parallelized solution methods is becoming increasingly urgent.
The SMO algorithm must solve many quadratic programming subproblems to fit a support vector data description (SVDD), so its computational complexity is high, and the running time of SVDD rises sharply as the number of training samples grows.
The memory needed to store the kernel matrix K also grows rapidly with the number of training points N: the matrix has N × N entries, the square of the number of samples. Applying SVDD directly to data anomaly detection therefore leads to excessive computation and memory overflow.
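To make the quadratic growth concrete, here is a minimal back-of-the-envelope sketch. It assumes a dense matrix of 8-byte floating-point entries; the function name `kernel_matrix_bytes` and the sample counts are illustrative and do not come from the patent.

```python
def kernel_matrix_bytes(n_samples: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed to hold a dense n_samples x n_samples kernel matrix."""
    return n_samples * n_samples * bytes_per_entry


# Illustrative sample counts (assumed, not taken from the source).
for n in (10_000, 100_000, 1_000_000):
    gib = kernel_matrix_bytes(n) / 2**30
    print(f"N = {n:>9,}: {gib:,.1f} GiB")
```

Multiplying the sample count by ten multiplies the storage by one hundred, which is why a single machine runs out of memory long before it runs out of samples.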

Examples

Detailed Description of Embodiments

[0033] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples illustrate the present invention but are not intended to limit its scope.

[0034] Referring to Figure 1, which is a flow chart of a support vector machine training method based on the Spark framework according to an embodiment of the present invention, the method includes:

[0035] S1. Obtain a training sample set, and distribute and store all sample vectors in the training sample set in the data nodes of the Spark framework.

[0036] Specifically, after the training sample set is received, its sample vectors are stored in a distributed manner across the data nodes of the Spark framework.

[0037] As shown in Figure 2, Apache Spark is a fast, general-purpose engine designed for distributed memory comput...

Abstract

The invention provides a support vector machine training method based on the Spark framework. The method comprises the steps of: acquiring a training sample set and storing all of its sample vectors in the data nodes of the Spark framework in a distributed manner; extracting from the training sample set the sample vector V2 that most severely violates the KKT conditions, and at the same time selecting the sample vector V1 whose distance to the sphere center differs most from that of V2; performing an iterative optimization calculation on the sample vectors V1 and V2 to obtain the updated sample vectors V1new and V2new; broadcasting V1new and V2new to the Spark data nodes and computing, in each data node, the difference produced by the update of V1 and V2, from which the updated sphere center is calculated; and updating the sphere center distance of each sample vector in the data nodes, together with the sphere diameter. By applying the Spark distributed computing framework, the method distributes the computation-intensive work of a single machine across the worker nodes; when the data volume grows, the system can be scaled out horizontally, and storage is no longer limited by a single machine.
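The loop described in the abstract can be sketched in plain Python, with ordinary lists standing in for Spark data nodes. This is a minimal sketch under stated assumptions, not the patented implementation: the linear kernel, the uniform initial weights, the maximal-violating-pair selection rule, and all names (`kernel`, `train_svdd`, `nodes`, `f`) are illustrative choices that the abstract does not specify.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def kernel(u: Vector, v: Vector) -> float:
    """Linear kernel (an assumption; the abstract does not name a kernel)."""
    return sum(a * b for a, b in zip(u, v))


def train_svdd(partitions: List[List[Vector]], C: float = 1.0,
               tol: float = 1e-6, max_iter: int = 10_000
               ) -> Tuple[List[float], float]:
    """Return (center, squared radius); each inner list plays one data node."""
    n = sum(len(p) for p in partitions)
    assert C * n >= 1.0, "need C >= 1/N so that sum(alpha) = 1 is feasible"
    all_x = [x for p in partitions for x in p]
    # Per-node state: weights alpha_i and f_i = sum_j alpha_j * K(x_j, x_i).
    nodes = [{"x": p,
              "alpha": [1.0 / n] * len(p),
              "f": [sum(kernel(xj, xi) for xj in all_x) / n for xi in p]}
             for p in partitions]

    for _ in range(max_iter):
        # score_i = K_ii - 2*f_i equals the squared distance to the center,
        # up to a constant shared by all samples, so it suffices for ranking.
        v2 = v1 = None  # each is (score, node id, local index)
        for nid, node in enumerate(nodes):
            for i, x in enumerate(node["x"]):
                score = kernel(x, x) - 2.0 * node["f"][i]
                if node["alpha"][i] < C and (v2 is None or score > v2[0]):
                    v2 = (score, nid, i)  # farthest sample whose weight can grow
                if node["alpha"][i] > 0.0 and (v1 is None or score < v1[0]):
                    v1 = (score, nid, i)  # nearest sample whose weight can shrink
        if v1 is None or v2 is None or v2[0] - v1[0] < tol:
            break  # no pair violates the KKT conditions by more than tol

        (s2, n2, i2), (s1, n1, i1) = v2, v1
        x1, x2 = nodes[n1]["x"][i1], nodes[n2]["x"][i2]
        a1, a2 = nodes[n1]["alpha"][i1], nodes[n2]["alpha"][i2]
        eta = kernel(x1, x1) + kernel(x2, x2) - 2.0 * kernel(x1, x2)
        if eta <= 1e-12:
            break  # degenerate pair (coincident points); this sketch just stops
        total = a1 + a2  # the pair update keeps sum(alpha) = 1
        lo, hi = max(0.0, total - C), min(C, total)
        a2_new = min(hi, max(lo, a2 + (s2 - s1) / (2.0 * eta)))
        a1_new = total - a2_new
        nodes[n1]["alpha"][i1], nodes[n2]["alpha"][i2] = a1_new, a2_new

        # "Broadcast" the pair and its weight changes; every node then
        # updates its own f_i locally, with no kernel matrix stored.
        d1, d2 = a1_new - a1, a2_new - a2
        for node in nodes:
            node["f"] = [f + d1 * kernel(x1, x) + d2 * kernel(x2, x)
                         for f, x in zip(node["f"], node["x"])]

    # Driver-side reduction: explicit center (valid for the linear kernel only).
    center = [sum(a * x[k] for nd in nodes for a, x in zip(nd["alpha"], nd["x"]))
              for k in range(len(all_x[0]))]
    const = sum(a * f for nd in nodes for a, f in zip(nd["alpha"], nd["f"]))
    dist2 = [(a, kernel(x, x) - 2.0 * f + const)
             for nd in nodes for a, x, f in zip(nd["alpha"], nd["x"], nd["f"])]
    on_sphere = ([d for a, d in dist2 if 0.0 < a < C]
                 or [d for a, d in dist2 if a > 0.0])
    return center, max(on_sphere)


# Four corners of a square plus its midpoint, split over two simulated nodes.
parts = [[(0.0, 0.0), (2.0, 0.0), (1.0, 1.0)], [(0.0, 2.0), (2.0, 2.0)]]
center, r2 = train_svdd(parts)
```

For this toy input the smallest enclosing sphere is centered at (1, 1) with squared radius 2, and the sketch should converge to approximately those values. The point of the design is that only the two selected vectors and their weight changes travel between driver and nodes in each iteration, so per-node memory stays proportional to the number of local samples rather than to N squared.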

Description

Technical Field

[0001] The present invention relates to the field of computer technology and, more specifically, to a support vector machine training method based on the Spark framework.

Background

[0002] Since its introduction, the support vector machine (SVM) has been widely used in information security, image processing, pattern recognition, fault diagnosis, anomaly detection and other fields. In 1999, Tax, Schölkopf, Duin et al. proposed two one-class SVM algorithms, namely the hyperplane-based and the hypersphere-based one-class SVM. Of these, support vector data description (SVDD) is a one-class classification method based on hyperspheres: its goal is to use the training data to describe a hypersphere that serves as the discriminant model for classification.

[0003] The software packages currently in common use for SVM pattern recognition and regression are Python's scikit-learn and LIBSVM, developed by Professor Chih-Jen Lin (林智仁) in Taiwan. Among them, Sciki...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06K9/00
CPC: G06V10/95, G06F18/2411, G06F18/214
Inventors: 许千帆, 王宇, 陈玫
Owner: 北京寄云鼎城科技有限公司