Chinese spoken language semantic comprehension method and system

A semantic understanding technology for spoken language, applied in fields such as digital data processing, special data processing applications, and instruments. It addresses the problems that a language model cannot take both characters and words into account, that language model training requires a large amount of data, and that Chinese word segmentation performs poorly. It achieves improved comprehension performance, avoids word segmentation errors, and reduces cost.

Pending Publication Date: 2019-11-29

AISPEECH CO LTD


Problems solved by technology

[0014] The invention aims to solve at least the following problems in the prior art: a large amount of manually labeled data is required; word vectors can only express the characteristics of a single word and contribute very little to sentence-level generalization; language models rely on a large amount of high-quality unlabeled text, so the training data is huge and the training time is very long; language models cannot consider characters and words at the same time; and Chinese word segmentation performs poorly.

Embodiment approach

[0056] In one implementation, inputting the hidden layer vector of the spoken language audio into the semantic understanding model includes:

[0057] Performing domain classification based on the feature sequence corresponding to the hidden layer vector of the speech audio;

[0058] Predicting the semantic slot category for the hidden vector corresponding to each character in the speech audio;

[0059] Determining the semantics of the speech audio according to the domain classification and the semantic slot categories.
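The last step, combining a domain label with per-character slot categories, can be illustrated with a minimal sketch. It assumes the slot categories are encoded as BIO tags (`B-name`, `I-name`, `O`), which is a common convention but is not stated in the patent text; the function name `decode_semantics` and the example labels are hypothetical.

```python
def decode_semantics(chars, slot_tags, domain):
    """Merge a domain label and per-character BIO slot tags into one frame.

    Assumption: tags follow the BIO scheme; the patent does not specify one.
    """
    if len(chars) != len(slot_tags):
        raise ValueError("chars and slot_tags must have the same length")

    slots = {}
    current_name, current_chars = None, []

    def flush():
        # Store the slot value collected so far, if any, and reset the buffer.
        nonlocal current_name, current_chars
        if current_name is not None:
            slots.setdefault(current_name, []).append("".join(current_chars))
        current_name, current_chars = None, []

    for ch, tag in zip(chars, slot_tags):
        if tag.startswith("B-"):
            flush()
            current_name, current_chars = tag[2:], [ch]
        elif tag.startswith("I-") and current_name == tag[2:]:
            current_chars.append(ch)
        else:
            # "O" tags and inconsistent "I-" tags close the open slot.
            flush()
    flush()
    return {"domain": domain, "slots": slots}


# Hypothetical utterance: "play Jay Chou's songs"
frame = decode_semantics(
    list("播放周杰伦的歌"),
    ["O", "O", "B-artist", "I-artist", "I-artist", "O", "O"],
    "music",
)
print(frame)  # {'domain': 'music', 'slots': {'artist': ['周杰伦']}}
```

In a real system the domain and the tags would come from classifiers running on the hidden vectors; here they are passed in directly to keep the sketch self-contained.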

[0060] In this embodiment, the input sentence is encoded through the modeled neural network:

[0061]

[0062] where h'_t is the reverse hidden vector in the neural network, [symbol missing] is the forward hidden vector in the neural network, and e_t is the bidirectional language model feature at the position of the t-th character (including the bidirectional language model hidden layer vectors corresponding to the current character and to the word to which the current character is divi...
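Because the equation in [0061] is not reproduced here, the following is only a generic sketch of bidirectional encoding, not the patent's formula. It assumes a toy elementwise tanh recurrent cell with fixed scalar weights (`w_h`, `w_e`), whereas the patent's network presumably uses learned parameters; all names are hypothetical.

```python
import math


def rnn_step(h_prev, e_t, w_h=0.5, w_e=1.0):
    """Toy recurrent cell: h_t = tanh(w_h * h_prev + w_e * e_t), elementwise."""
    return [math.tanh(w_h * h + w_e * e) for h, e in zip(h_prev, e_t)]


def bidirectional_encode(features):
    """Return [forward_t ; backward_t] for every position t."""
    if not features:
        return []
    dim = len(features[0])

    # Forward pass: left to right, each state depends on the previous one.
    forward, h = [], [0.0] * dim
    for e_t in features:
        h = rnn_step(h, e_t)
        forward.append(h)

    # Reverse pass: right to left, each state depends on the following one.
    backward, h = [None] * len(features), [0.0] * dim
    for t in range(len(features) - 1, -1, -1):
        h = rnn_step(h, features[t])
        backward[t] = h

    # Concatenate both directions at each position.
    return [f + b for f, b in zip(forward, backward)]


encoded = bidirectional_encode([[0.0], [1.0]])
print(len(encoded), len(encoded[0]))  # 2 2
```

Each output vector sees context from both sides of its position, which is what lets a per-character slot prediction depend on the whole sentence.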


Abstract

The embodiment of the invention provides a Chinese spoken language semantic comprehension method. The method comprises the steps of: obtaining a generalized unlabeled text sequence training set, and performing forward prediction and reverse prediction on the training set in sequence to train a character-level bidirectional language model and a word-level bidirectional language model; receiving spoken language audio input by a user, and carrying out sequence word segmentation to obtain a character sequence and a word sequence; decoding the character sequence and the word sequence with the character-level bidirectional language model and the word-level bidirectional language model respectively to obtain character-level hidden layer vectors and word-level hidden layer vectors; performing vector alignment on the hidden layer vectors of the character sequence and the word sequence to obtain the hidden layer vector of the spoken language audio to be input into the semantic comprehension model; and inputting the hidden layer vector of the spoken language audio into the semantic comprehension model to determine the semantics of the spoken language audio. The embodiment of the invention further provides a Chinese spoken language semantic comprehension system. The embodiment of the invention has good generalization ability, combines word and character sequences, and improves the performance of Chinese semantic comprehension.
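The vector alignment step can be sketched as follows. This is a minimal illustration that assumes alignment means concatenating each character's hidden vector with the hidden vector of the word that contains it; the abstract does not spell out the operation, and the function name and toy vectors are hypothetical.

```python
def align_char_word_vectors(words, char_vectors, word_vectors):
    """Give every character its own vector plus the vector of its word.

    Assumption: alignment is per-character concatenation [char ; word].
    `words` is the segmented sentence; `char_vectors` has one vector per
    character and `word_vectors` has one vector per word.
    """
    if len(words) != len(word_vectors):
        raise ValueError("need exactly one word vector per word")
    if sum(len(w) for w in words) != len(char_vectors):
        raise ValueError("need exactly one character vector per character")

    aligned = []
    pos = 0  # index of the current character in the whole sentence
    for word, w_vec in zip(words, word_vectors):
        for _ in word:
            aligned.append(list(char_vectors[pos]) + list(w_vec))
            pos += 1
    return aligned


# Toy example: two words covering five characters.
words = ["播放", "周杰伦"]
char_vectors = [[1.0], [2.0], [3.0], [4.0], [5.0]]
word_vectors = [[10.0], [20.0]]
print(align_char_word_vectors(words, char_vectors, word_vectors))
# [[1.0, 10.0], [2.0, 10.0], [3.0, 20.0], [4.0, 20.0], [5.0, 20.0]]
```

Working at character granularity keeps one output position per character, so a segmentation error changes only the word half of the affected vectors rather than the sequence length.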

Description

Technical field

[0001] The invention relates to the field of intelligent voice interaction, and in particular to a method and system for semantic understanding of spoken Chinese.

Background

[0002] Semantic understanding plays an important role in intelligent voice interaction. The following approaches are commonly used:

[0003] 1. Spoken language semantic understanding based on deep learning and supervised learning: natural text or speech recognition text must be manually annotated with semantics, and the semantic understanding model is trained in a data-driven manner with a deep neural network.

[0004] 2. Spoken language semantic understanding based on deep learning, supervised learning, and pre-trained word vectors: natural text or speech recognition text must be manually annotated with semantics, and at the same time external pre-trained word vectors are used to initialize the input laye...


Application Information


IPC(8): G06F17/27

Inventors: 朱苏, 徐华, 俞凯, 张瑜

Owner: AISPEECH CO LTD
