Target-domain classifier training method, sample recognition method, terminal and storage medium

A target-domain classifier training method and a sample recognition method, applicable to terminals and storage media, that address the high cost of speech recognition.

Inactive Publication Date: 2018-05-11
NUBIA TECHNOLOGY CO LTD

AI Technical Summary

Problems solved by technology

[0005] The technical problem to be solved by the present invention is that, in the prior art, speech recognition can only be achieved by creating a separate target-domain classifier for each region with different speech habits, which makes speech recognition costly. To address this problem, a new target domain classifier training scheme is provided, comprising a target domain classifier training method, a sample identification method, a terminal and a storage medium.



Examples


First embodiment

[0056] In the prior art, creating a classifier for a target domain requires collecting a large amount of sample data and labeling it. Only then can the classifier used to classify and identify unclassified samples in the target domain be trained on the labeled samples, which makes training expensive. In speech recognition especially, because speech habits differ between regions, it is difficult to create a separate speech classifier for each region. To address these problems, this embodiment provides a target domain classifier training method, which is described below with reference to Figure 1:

[0057] S102. Perform feature extraction on the training samples to obtain training data.

[0058] In this embodiment, the training samples include two parts: one part is the samples from the source domain, which are called auxiliary samples here, and all samples i...

Second embodiment

[0080] The target domain classifier training method and sample recognition method provided in the foregoing embodiments are applicable to handwritten digit image samples or voice samples. Building on the first embodiment, this embodiment continues the description of the target domain classifier training method and sample identification method with reference to Figure 3:

[0081] S302. Perform feature extraction on the training samples to obtain training data.

[0082] In this embodiment, the terminal uses the RBM (restricted Boltzmann machine) algorithm to extract features from the training samples. The principle of the RBM algorithm is briefly introduced below with reference to Figure 4, which is a schematic diagram of the RBM model:

[0083] RBM generally consists of a two-layer structure of visible layer unit v and hidden layer unit h. The connection weight between the visible layer unit and the hidden layer unit is represented by w, the visible layer bias is represented by b, and the hidden layer...
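The two-layer structure described in [0083] can be illustrated with a minimal sketch. The paragraph is truncated in this listing, so several details below are assumptions rather than the patent's own choices: the hidden-layer bias symbol `c`, binary units with sigmoid activations, and training by one-step contrastive divergence (CD-1). The class name `TinyRBM` and the toy `samples` are hypothetical. After training, the hidden-unit activation probabilities serve as the extracted features.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """Binary RBM: visible units v, hidden units h, weights w, biases b and c."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible, self.n_hidden = n_visible, n_hidden
        # w[i][j] connects visible unit i to hidden unit j.
        self.w = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible-layer bias
        self.c = [0.0] * n_hidden   # hidden-layer bias (symbol is an assumption)

    def hidden_probs(self, v):
        """p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * w_ij)."""
        return [sigmoid(self.c[j] + sum(v[i] * self.w[i][j]
                                        for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        """p(v_i = 1 | h) = sigmoid(b_i + sum_j h_j * w_ij)."""
        return [sigmoid(self.b[i] + sum(h[j] * self.w[i][j]
                                        for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def cd1_update(self, v0, lr=0.1):
        """One CD-1 update on a single sample (assumed training rule)."""
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)  # reconstruction of the input
        h1 = self.hidden_probs(v1)
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(self.n_hidden):
            self.c[j] += lr * (h0[j] - h1[j])


# Hypothetical toy data: four binary training samples with four inputs each.
samples = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
rbm = TinyRBM(n_visible=4, n_hidden=2)
for _ in range(200):
    for v in samples:
        rbm.cd1_update(v)

# Feature extraction: each sample becomes a vector of hidden probabilities.
features = [rbm.hidden_probs(v) for v in samples]
```

The pure-Python loops keep the sketch dependency-free. A practical implementation would more likely use a vectorized array library and mini-batches.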

Third embodiment

[0139] This embodiment firstly provides a storage medium, where one or more computer programs are stored. In an example of this embodiment, a target domain database construction program is stored in the storage medium, and the program can be read, compiled, and executed by a processor, so as to implement the flow of the target domain classifier training method in the foregoing embodiments. In other examples of this embodiment, a sample identification program may be stored in the storage medium, and the sample identification program may be executed by a processor to implement the sample identification method in the foregoing embodiments.

[0140] In addition, this embodiment also provides a terminal. Referring to Figure 6, the terminal 60 includes a processor 61, a memory 62 and a communication bus 63. The communication bus 63 is used to realize the connection and communication between the processor 61 and the memory 62, and the memory 62 is used as a computer-reada...



Abstract

The invention discloses a target-domain classifier training method, a sample recognition method, a terminal and a storage medium. The invention aims to solve the problem that creating a classifier under existing prior-art schemes is costly. In the target-domain classifier training method provided by the invention, the training data undergo at least two rounds of iterative classification, and after each round the classification weight vector of each sample in the training data is adjusted according to that round's classification result. After N rounds of iterative classification, the classification weight vector of each sample matches the actual data distribution of the target domain. The invention further provides the sample recognition method, the terminal and the storage medium. Because the data distribution of the source samples in the target domain is the same as that of the sample data to be recognized, a classifier belonging to the target domain is established by combining these source samples with a large number of labeled auxiliary samples from the source domain, and the unlabeled samples in the target domain are then classified. The technical scheme does not require a large number of source samples in the target domain, is easy to implement, and has a low implementation cost.
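The iterative reweighting summarized in the abstract can be sketched as follows. The text shown here does not state the weight update rule, so the sketch borrows the well-known TrAdaBoost rule as an assumption: misclassified auxiliary (source-domain) samples are down-weighted, and misclassified source samples from the target domain are up-weighted. The helper names `train_stump`, `stump_predict` and `reweight`, the one-dimensional features and the toy data are all hypothetical.

```python
import math


def stump_predict(stump, x):
    """Predict 1 when x lies on the positive side of the threshold, else 0."""
    threshold, polarity = stump
    return 1 if polarity * (x - threshold) >= 0 else 0


def train_stump(xs, ys, weights):
    """Pick the (threshold, polarity) pair with the lowest weighted error."""
    best_stump, best_err = None, float("inf")
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            stump = (threshold, polarity)
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(stump, x) != y)
            if err < best_err:
                best_stump, best_err = stump, err
    return best_stump


def reweight(aux, tgt, n_iters=5):
    """aux, tgt: lists of (feature, label). Returns sample weights and stumps."""
    n, m = len(aux), len(tgt)
    xs = [x for x, _ in aux + tgt]
    ys = [y for _, y in aux + tgt]
    weights = [1.0] * (n + m)
    # Fixed shrink factor for auxiliary samples (TrAdaBoost rule, assumed).
    beta_aux = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / n_iters))
    stumps = []
    for _ in range(n_iters):
        total = sum(weights)
        p = [w / total for w in weights]
        stump = train_stump(xs, ys, p)
        miss = [int(stump_predict(stump, x) != y) for x, y in zip(xs, ys)]
        # Weighted error measured on the target-domain samples only.
        eps = sum(p[i] * miss[i] for i in range(n, n + m)) / sum(p[n:])
        eps = min(max(eps, 1e-6), 0.499)  # keep beta_t finite and below 1
        beta_t = eps / (1.0 - eps)
        for i in range(n):
            weights[i] *= beta_aux ** miss[i]   # shrink misfit auxiliary data
        for i in range(n, n + m):
            weights[i] *= beta_t ** (-miss[i])  # boost hard target samples
        stumps.append(stump)
    return weights, stumps


# Hypothetical data: the auxiliary sample (0.3, 1) conflicts with the
# target-domain samples, whose labels change between 0.35 and 0.65.
aux = [(0.1, 0), (0.2, 0), (0.3, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
tgt = [(0.25, 0), (0.35, 0), (0.65, 1)]
weights, stumps = reweight(aux, tgt, n_iters=5)
```

In this toy run the conflicting auxiliary sample ends up with a smaller weight than the other auxiliary samples, so later classifiers follow the target-domain boundary. This is the intended effect of the weight vector coming to match the target-domain distribution.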

Description

Technical field

[0001] The present invention relates to the computer field, and more specifically to a target domain classifier training method, a sample identification method, a terminal and a storage medium.

Background technique

[0002] In the field of speech recognition, a large and comprehensive speech database is the basis for accurate recognition. To realize speech recognition for a given language, a speech database must be established for that language. However, there are currently 5,651 identified languages in the world, and if a comprehensive database is to be established for each of them, cost is the first consideration. It is very difficult to build a database even for Chinese, the most widely spoken language, because pronunciation differs between regions: for some words, only the pronunciation differs between regions, such as the southwest area and the southeast coastal area. Pronunciation is...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/22, G10L15/06, G10L15/08
CPC: G10L15/063, G10L15/08, G10L15/22
Inventor: 刘赣
Owner: NUBIA TECHNOLOGY CO LTD