Language model training method and system, mobile terminal and storage medium

A language model training technology, applied in natural language data processing, speech analysis, speech recognition and related fields. It addresses the low training efficiency and poor scalability of language models, improving training efficiency and accuracy as well as recognition efficiency.

Active Publication Date: 2020-05-22
XIAMEN KUAISHANGTONG TECH CORP LTD

AI Technical Summary

Problems solved by technology

[0004] The purpose of the embodiment of the present invention is to provide a language model training method, system, mobile term


Examples


Embodiment 1

[0045] Referring to FIG. 1, which is a flow chart of the language model training method provided in the first embodiment of the present invention, the method includes the steps:

[0046] Step S10, obtaining a training text and a training vocabulary, classifying the training text to obtain a plurality of language modules, and constructing a language dictionary corresponding to each language module according to the training vocabulary;

[0047] Wherein, the text language of the training text can be set according to requirements; for example, it can be Chinese, English, Korean or Japanese. The training vocabulary and the training text can be obtained from a database, and the training vocabulary includes noun vocabulary, verb vocabulary, adjective vocabulary, adverb vocabulary, etc.;

[0048] Specifically, in this step, a classifier can be used to classify the training text; the classifier classifies the text in the training text according to its word attributes, so ...
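The classification in step S10 can be sketched as follows. The excerpt does not specify the classifier, so this minimal sketch assumes a plain lookup of each word's attribute in the training vocabulary; the names `TRAINING_VOCABULARY` and `classify_training_text` and the sample words are hypothetical and not taken from the patent.

```python
from collections import defaultdict

# Hypothetical training vocabulary: word -> word attribute (part of speech).
TRAINING_VOCABULARY = {
    "cat": "noun", "dog": "noun",
    "runs": "verb", "sleeps": "verb",
    "quick": "adjective", "quietly": "adverb",
}

def classify_training_text(sentences, vocabulary):
    """Group the words of the training text into language modules by word attribute.

    Assumption: the classifier is a simple vocabulary lookup; words that are
    not in the training vocabulary are skipped.
    """
    modules = defaultdict(list)
    for sentence in sentences:
        for word in sentence.lower().split():
            attribute = vocabulary.get(word)
            if attribute is not None:
                modules[attribute].append(word)
    return dict(modules)

modules = classify_training_text(
    ["The quick cat runs", "The dog sleeps quietly"], TRAINING_VOCABULARY
)
```

Here `modules` holds one word list per attribute (noun, verb, adjective, adverb), which corresponds to the language modules of this step. A trained part-of-speech tagger could replace the lookup without changing the interface.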

Embodiment 2

[0063] Referring to FIG. 2, which is a flow chart of the language model training method provided in the second embodiment of the present invention, the method includes the steps:

[0064] Step S11, obtaining a training text and a training vocabulary, classifying the training text to obtain a plurality of language modules, and constructing a language dictionary corresponding to each language module according to the training vocabulary;

[0065] Wherein, the training text is classified to obtain a noun module, a verb module, an adjective module and an adverb module; preferably, in other embodiments, the language modules can also include state-word modules, etc., according to the different text attributes in the training text;

[0066] Specifically, in this step, the language modules and the language dictionaries are in one-to-one correspondence; therefore, by constructing dictionaries based on the training vocabulary, a dictionary of nouns, a dictionary of verbs, a dictionary of adj...
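The one-to-one relationship between language modules and language dictionaries can be illustrated with a short sketch. It assumes the training vocabulary is available as (word, attribute) pairs and represents each dictionary as a set of words; `MODULES` and `build_language_dictionaries` are hypothetical names used only for illustration.

```python
# Hypothetical module names, following the four modules named in [0065].
MODULES = ("noun", "verb", "adjective", "adverb")

def build_language_dictionaries(training_vocabulary, modules=MODULES):
    """Return exactly one dictionary (a set of words) per language module."""
    dictionaries = {module: set() for module in modules}
    for word, attribute in training_vocabulary:
        # Entries whose attribute has no module are ignored in this sketch.
        if attribute in dictionaries:
            dictionaries[attribute].add(word)
    return dictionaries

dictionaries = build_language_dictionaries([
    ("cat", "noun"), ("dog", "noun"), ("runs", "verb"),
    ("quick", "adjective"), ("quietly", "adverb"), ("oh", "interjection"),
])
```

Because the result is keyed by module name, every module has exactly one dictionary, even if that dictionary is empty.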

Embodiment 3

[0102] Referring to FIG. 3, which is a schematic structural diagram of the language model training system 100 provided by the third embodiment of the present invention, the system includes: a text classification module 10, a model training module 11, a phoneme matching module 12 and a probability calculation module 13, wherein:

[0103] The text classification module 10 is configured to obtain a training text and a training vocabulary, classify the training text to obtain a plurality of language modules, and construct a language dictionary corresponding to each language module according to the training vocabulary;

[0104] The model training module 11 is configured to perform model training on the module language model in the language module according to the language dictionary, and train the training text to obtain a text language model.

[0105] Wherein, the model training module 11 is also used for: extracting the language text corresponding to the language module in the training t...
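As an illustration of what the model training module 11 might produce, the sketch below trains a text language model on the training text. The excerpt does not state the model type, so an add-one smoothed bigram model is assumed here; the class `BigramTextLanguageModel` and the sample sentences are hypothetical.

```python
import math
from collections import Counter

class BigramTextLanguageModel:
    """Toy text language model (assumption: add-one smoothed bigrams)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def log_probability(self, sentence):
        """Sum of log P(word | previous word) over the sentence."""
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total

model = BigramTextLanguageModel(["the cat sleeps", "the dog runs"])
seen = model.log_probability("the cat sleeps")
scrambled = model.log_probability("sleeps cat the")
```

A sentence that follows the word order seen in training scores higher than a scrambled one, which is the property the later probability calculation relies on.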



Abstract

The invention provides a language model training method and system, a mobile terminal and a storage medium. The method comprises: obtaining a training text and a training vocabulary, classifying the training text to obtain a plurality of language modules, and constructing a language dictionary corresponding to each language module according to the training vocabulary; performing model training on the module language model in each language module according to the language dictionary, and training on the training text to obtain a text language model; obtaining speech to be recognized, performing phoneme recognition on it to obtain a phoneme string, and matching the phoneme string against the module language model to obtain a phoneme matching result; and performing probability calculation on the phoneme matching result through the text language model, and outputting the sentence corresponding to the maximum probability value. By classifying the training text and constructing the language dictionaries, the method improves the training efficiency and accuracy of the language model, and the training design based on the module language models and the training text allows the language model to be extended effectively.
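The recognition stage described in the abstract (phoneme matching followed by selection of the maximum-probability sentence) can be sketched as below. This is a minimal illustration under stated assumptions: the pronunciation lexicon `LEXICON`, the phoneme symbols, the table `BIGRAM_LOG_PROBS` and the functions `match_phonemes`, `score` and `best_sentence` are all hypothetical, and a small table of log-probabilities stands in for the trained text language model.

```python
# Hypothetical pronunciation lexicon: phoneme sequence -> candidate words.
# Every key must be non-empty, otherwise the recursion below would not end.
LEXICON = {
    ("AY",): ["I", "eye"],
    ("S", "IY"): ["see", "sea"],
}

# Hypothetical stand-in for the text language model: bigram log-probabilities.
BIGRAM_LOG_PROBS = {("I", "see"): -1.0, ("eye", "sea"): -6.0}

def match_phonemes(phonemes, lexicon):
    """Return every word sequence whose pronunciations concatenate to `phonemes`."""
    if not phonemes:
        return [[]]
    results = []
    for pronunciation, words in lexicon.items():
        n = len(pronunciation)
        if tuple(phonemes[:n]) == pronunciation:
            for rest in match_phonemes(phonemes[n:], lexicon):
                for word in words:
                    results.append([word] + rest)
    return results

def score(words):
    """Sum bigram log-probabilities; unseen word pairs get a low default."""
    return sum(BIGRAM_LOG_PROBS.get(pair, -10.0) for pair in zip(words, words[1:]))

def best_sentence(phonemes, lexicon, score_fn):
    """Output the candidate sentence with the maximum score, or None if no match."""
    candidates = match_phonemes(phonemes, lexicon)
    if not candidates:
        return None
    return " ".join(max(candidates, key=score_fn))

result = best_sentence(["AY", "S", "IY"], LEXICON, score)
```

The phoneme string yields four homophone candidates, and the scoring step picks the one the language model considers most likely. Exhaustive enumeration is used for clarity only; a practical decoder would prune candidates during the search.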

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and in particular relates to a language model training method, system, mobile terminal and storage medium.

Background technique

[0002] Speech recognition research has a history of several decades. Speech recognition technology mainly comprises four parts: acoustic modeling, language modeling, pronunciation dictionary construction, and decoding. Each part can be a separate research direction. Compared with images and text, speech data is also much harder to collect and label, so building a complete language model training system is a very time-consuming and extremely difficult task, which greatly hinders the development of speech recognition technology.

[0003] In the existing language model training process, the language model can only be trained according to the pre-stored vocabulary and sentence patterns in the database, and the vocabul...


Application Information

IPC(8): G10L15/06, G06F40/216, G06F40/242, G06F40/284, G06K9/62
CPC: G10L15/063, G10L2015/0633, G10L2015/025, G06F18/24323
Inventor: 张广学, 肖龙源, 蔡振华, 李稀敏, 刘晓葳
Owner XIAMEN KUAISHANGTONG TECH CORP LTD