Language model re-evaluation method based on long and short memory network

A long short-term memory (LSTM) language model technique, applied in speech analysis, speech recognition, instruments, etc. It addresses three problems: existing language models cannot remember historical information, the performance improvement they bring is limited, and re-evaluation with a high-order language model has little effect. It achieves strong learning ability, improved performance, and good memory of history.

Active Publication Date: 2017-06-06

INST OF ACOUSTICS CHINESE ACAD OF SCI +1

7 Cites, 20 Cited by

Problems solved by technology

[0004] In speech recognition systems, re-evaluating with a high-order n-gram language model has little effect, and re-evaluating the M candidates with a feed-forward neural network language model or a recurrent neural network language model yields only limited performance improvement, because none of these language models retains historical information well.
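One way to read this limitation, sketched in standard language-model notation. The symbols w_t, n, h_t, and f are introduced here for illustration and do not come from the patent text.

```latex
% Chain rule over a sentence w_1 ... w_T
P(w_1,\dots,w_T) = \prod_{t=1}^{T} P(w_t \mid w_1,\dots,w_{t-1})

% n-gram and feed-forward models truncate the history to the last n-1 words
P(w_t \mid w_1,\dots,w_{t-1}) \approx P(w_t \mid w_{t-n+1},\dots,w_{t-1})

% a recurrent model summarizes the whole history in a state h_t
h_t = f(h_{t-1}, w_{t-1}), \qquad
P(w_t \mid w_1,\dots,w_{t-1}) \approx P(w_t \mid h_t)
```

A plain recurrent state tends to lose information about distant words during training (the vanishing-gradient problem), which is the gap that the gated memory cell of an LSTM is designed to narrow.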

Embodiment Construction

[0037] The present invention will be described in detail below in conjunction with the accompanying drawings and preferred embodiments.

[0038] The data sets used in this experiment are as follows:

[0039] Training set: The training data include the Chinese text corpora provided by the Linguistic Data Consortium (LDC), namely Call-Home, Call-Friend and Call-HKUST, together with self-collected natural spoken dialogue data; these are collectively referred to as the CTS (Conversational Telephone Speech) corpus. The other part of the training data is text downloaded from the Internet, collectively referred to as the general corpus.

[0040] Development set: Self-collected telephone channel dataset.

[0041] Test set: The data set (86305) provided by the National 863 High-tech Program in 2005 and the partial data (LDC) of natural spoken telephone conversations collected by the University of Hong Kong in 2004.

[0042] 1. Training phase

[0043] 1) Use the CTS corpus to train the trigram language...

Abstract

The invention provides a language model re-evaluation method based on a long short-term memory (LSTM) network. The method includes the following steps:

Step 100: input the language information to be recognized and pre-process it.

Step 101: perform first-pass decoding on the pre-processed information with an n-gram language model, then select the M best candidate results.

Step 102: introduce the recognition result of the first-pass decoding into the M best candidate results as historical sentence information.

Step 103: re-evaluate the selected M best candidate results with the n-gram language model.

Step 104: re-evaluate the M best candidate results, with the historical sentence information introduced, using a neural network language model trained with an LSTM structure.

Step 105: combine the re-evaluation result of the n-gram language model with the re-evaluation result of the LSTM-based neural network language model, and select the best result as the final recognition result for the input language information.
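Steps 103 to 105 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the abstract does not say how the two re-evaluation results are combined, so the linear interpolation of log-scores with weight `lam` is an assumption, and `toy_ngram_score` and `toy_lstm_score` are hypothetical stand-ins for trained models.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer maps (history_words, candidate_words) to a log-score (higher is better).
Scorer = Callable[[Sequence[str], Sequence[str]], float]


def rescore_nbest(
    candidates: Sequence[Sequence[str]],
    history: Sequence[str],
    ngram_score: Scorer,
    lstm_score: Scorer,
    lam: float = 0.5,
) -> Tuple[List[str], List[Tuple[float, List[str]]]]:
    """Re-rank M candidates by interpolating two language-model scores.

    `lam` weights the LSTM score; (1 - lam) weights the n-gram score.
    The interpolation rule is an assumption made for this sketch.
    """
    scored = []
    for words in candidates:
        s_ngram = ngram_score((), words)       # step 103: sentence only
        s_lstm = lstm_score(history, words)    # step 104: history is visible
        combined = lam * s_lstm + (1.0 - lam) * s_ngram  # step 105
        scored.append((combined, list(words)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[0][1], scored


# Hypothetical stand-in scorers, for demonstration only.
def toy_ngram_score(history: Sequence[str], words: Sequence[str]) -> float:
    # Flat cost per word; ignores the history entirely.
    return -1.0 * len(words)


def toy_lstm_score(history: Sequence[str], words: Sequence[str]) -> float:
    # Same flat cost, with a bonus for words already seen in the history.
    seen = set(history)
    return sum(-1.0 + (0.5 if w in seen else 0.0) for w in words)


if __name__ == "__main__":
    nbest = [["cold", "me", "later"], ["call", "me", "later"]]
    best, ranked = rescore_nbest(
        nbest, ["please", "call"], toy_ngram_score, toy_lstm_score
    )
    print(best)
```

In a real system the two scorers would wrap a trained n-gram model and a trained LSTM language model, and `lam` would typically be tuned on a development set.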

Description

Technical field

[0001] The invention relates to the field of speech recognition. It is a method for re-evaluating recognition results with a long short-term memory network language model, thereby improving speech recognition performance.

Background art

[0002] A language model describes, in mathematical form, the constraints between words in a language, and it plays a significant role in speech recognition. In speech recognition systems for telephone conversations in particular, a colloquial language model can often significantly improve system performance. However, language models are sensitive to domain and time, real-life telephone-conversation corpora are limited, and real speech varies widely in quality, so speech recognition rates are usually low. To improve the performance of the speech recognition system, a post-processing stage is generally added, that is, the first-pass decoding not only ...

Application Information

IPC(8): G10L15/06

CPC: G10L15/063

Inventor: 张鹏远, 左玲云, 潘接林, 颜永红

Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI
