
Punctuation mark recognition model construction method and device

A technique for building a punctuation mark recognition model, applicable to character recognition, pattern recognition, and speech recognition, that aims to improve punctuation recognition accuracy.

Active Publication Date: 2022-02-15
ALIBABA DAMO (HANGZHOU) TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] This application provides a method for constructing a punctuation mark recognition model, addressing the problem that existing models achieve high recognition accuracy only in the domains covered by the parallel corpus.



Examples


Embodiment 1

[0063] Please refer to Figure 1, which is a flow chart of an embodiment of the method for building a punctuation mark recognition model of the present application. The method is executed by a device for building a punctuation mark recognition model. This device is usually deployed on the server side, but it is not limited to the server side and can be any device capable of implementing the method. The method for building a punctuation mark recognition model provided in this embodiment includes:

[0064] Step S101: Obtain a first text set, a first speech data set, and a set of correspondences between second speech data and second text.

[0065] The first text set includes a plurality of first texts with punctuation marks. For example, the first text reads, "Drugs are in short supply, patients cannot use them, and the market has been manipulated against regulations, leading to high prices... In recent years, the problem of domestic shortage of dr...
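The first text set in [0065] consists of punctuated text. The excerpt does not say how such text becomes training examples, so the following is only a minimal sketch of one common approach: each word is labeled with the punctuation mark that follows it. The label inventory `PUNCT`, the "no punctuation" label `O`, the whitespace tokenization, and the function name `to_labeled_tokens` are all assumptions made for illustration, not details taken from the application.

```python
from typing import List, Tuple

PUNCT = ",.?!"   # assumed label inventory; the source does not list one
NO_PUNCT = "O"   # assumed label for "no punctuation after this word"


def to_labeled_tokens(text: str) -> List[Tuple[str, str]]:
    """Pair each whitespace-separated word with the mark that follows it."""
    pairs: List[Tuple[str, str]] = []
    for raw in text.split():
        word = raw.rstrip(PUNCT)            # the word without trailing marks
        trailing = raw[len(word):]          # the trailing marks, possibly empty
        label = trailing[0] if trailing else NO_PUNCT
        if word:
            pairs.append((word, label))
        elif pairs:
            # A stand-alone mark such as "..." is attached to the previous word.
            pairs[-1] = (pairs[-1][0], label)
    return pairs


example = "Drugs are in short supply, patients cannot use them."
labels = to_labeled_tokens(example)
```

In this sketch the words alone would form the model input and the labels would form the prediction target. Unsegmented text such as Chinese would need a tokenizer in place of `str.split`.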

Embodiment 2

[0093] The above embodiment provides a method for building a punctuation mark recognition model. Correspondingly, the present application also provides a device for building a punctuation mark recognition model, which corresponds to the embodiment of the above method. The parts of this embodiment that are the same as in the first embodiment are not described again; please refer to the corresponding parts of the first embodiment.

[0094] A device for building a punctuation mark recognition model provided by the present application includes: a data acquisition unit, a pre-training unit, and an optimization unit.

[0095] The data acquisition unit is used to obtain the first text set, the first speech data set, and the set of correspondences between the second speech data and the second text; the pre-training unit is used to learn, according to the first text set, the network parameters of the text processing module incl...

Embodiment 3

[0108] In the foregoing embodiments, a method for constructing a punctuation mark recognition model is provided, and correspondingly, the present application also provides an electronic device. The device corresponds to an embodiment of the method described above. Since the device embodiment is basically similar to the method embodiment, the description is relatively simple, and for related parts, please refer to part of the description of the method embodiment. The device embodiments described below are illustrative only.

[0109] An electronic device in this embodiment includes a processor and a memory. The memory stores a program implementing the method for building a punctuation mark recognition model; after the device is powered on and runs the program through the processor, it executes the following steps: obtain the first text set, the first speech data set, and the set of correspondences between the second speech data and the s...


Abstract

The invention discloses a method, device, and equipment for constructing a punctuation mark recognition model. The method comprises the following steps: acquiring a first text set, a first speech data set, and a set of correspondences between second speech data and second text; learning, from the first text set, the network parameters of a text processing module included in the model; learning, from the first speech data set, a first network parameter of a speech processing module included in the model; and, starting from the first network parameter, training the speech processing module on the correspondence set to obtain a second network parameter of the speech processing module. With this approach, the model achieves consistent recognition accuracy across general domains, while the speech processing module learns more effectively from a small amount of parallel data covering only a few domains. Once acoustic information is introduced, the model can make better use of the speaker's intent and produce punctuation that better matches spoken language.
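The three learning steps in the abstract can be read as a staged schedule: each stage updates one module from one data set, and the last stage starts from the speech module's earlier parameters. The sketch below records only that ordering and data flow. The stage names, the `Stage` and `run_plan` helpers, and the trainer signature are hypothetical, and the actual architectures and training objectives are not given in this excerpt, so the trainers are left as caller-supplied functions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class Stage:
    name: str    # stage label (hypothetical, for this sketch only)
    module: str  # module whose parameters this stage updates
    data: str    # data set that drives the update


# Ordering taken from the abstract; the names themselves are assumptions.
PLAN: List[Stage] = [
    Stage("pretrain_text", "text_processing", "first_text_set"),
    Stage("pretrain_speech", "speech_processing", "first_speech_set"),
    Stage("finetune_speech", "speech_processing", "correspondence_set"),
]

# A trainer receives the module's current parameters (None if the module
# has not been trained yet) and the data set name, and returns new parameters.
Trainer = Callable[[Optional[Any], str], Any]


def run_plan(plan: List[Stage], trainers: Dict[str, Trainer]) -> Dict[str, Any]:
    """Run the stages in order, threading each module's parameters forward."""
    params: Dict[str, Any] = {}
    for stage in plan:
        start = params.get(stage.module)
        params[stage.module] = trainers[stage.name](start, stage.data)
    return params


# Stand-in trainers that only record where their parameters came from.
demo_trainers: Dict[str, Trainer] = {
    "pretrain_text": lambda start, data: ("text", data),
    "pretrain_speech": lambda start, data: ("first", data),
    "finetune_speech": lambda start, data: ("second", start, data),
}
result = run_plan(PLAN, demo_trainers)
```

In this reading, the final stage receives the "first network parameter" as its starting point and leaves the text processing module's parameters untouched, which matches the abstract's wording but may differ from the full description.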

Description

Technical field

[0001] The present application relates to the technical field of speech processing, in particular to a method, device and equipment for building a punctuation mark recognition model, a speech transcription system, and a speech interaction system.

Background

[0002] A speech transcription system is a speech processing system that transcribes speech into text. The system can automatically produce meeting minutes, which improves meeting efficiency, lets meetings serve their purpose, avoids wasting human, material and financial resources, and reduces meeting costs.

[0003] To make reading easier for users, the text output by a real-time speech transcription system usually carries punctuation marks. Spoken punctuation prediction is the task of identifying punctuation marks in speech-transcribed text. A typical spoken punctuation prediction method is to predict the punctuation marks that may appear in the speech ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/06, G10L15/02, G10L15/22, G10L15/26, G06V30/10
CPC: G10L15/063, G10L15/02, G10L15/22, G10L15/26
Inventors: 陈梦喆, 陈谦
Owner ALIBABA DAMO (HANGZHOU) TECH CO LTD