
Punctuation mark recognition model construction method and device

A punctuation mark recognition model technology, applied in the fields of character recognition, character and pattern recognition, and speech recognition, that improves punctuation recognition accuracy.

Active Publication Date: 2022-04-22
ALIBABA DAMO (HANGZHOU) TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] This application provides a method for constructing a punctuation mark recognition model, to solve the problem that existing prior-art models achieve high recognition accuracy only in the fields covered by the parallel corpus.



Examples


Example 1

[0063] Refer to Figure 1, which is a flow chart of an embodiment of the method for building a punctuation mark recognition model of the present application. The method is executed by a device for building a punctuation mark recognition model. The device is usually deployed on a server, but it is not limited to a server and can be any device capable of implementing the method. The method for building a punctuation mark recognition model provided in this embodiment includes:

[0064] Step S101: Obtain a first text set, a first speech data set, and a set of correspondences between second speech data and second text.

[0065] The first text set includes a plurality of first texts with punctuation marks. For example, the first text reads, "Drugs are in short supply, patients cannot use them, and the market has been manipulated against regulations, leading to high prices... In recent years, the problem of domestic shortage of dr...
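As an illustration of how punctuated first texts could serve as training supervision, the following minimal sketch turns a punctuated sentence into (token, label) pairs, where each label names the punctuation mark that follows the token. The label names, the whitespace tokenization, and the function `text_to_labeled_tokens` are assumptions made for this example; the source does not specify a labeling scheme.

```python
# Hypothetical label scheme; the source does not define one.
PUNCT_LABELS = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}
NO_PUNCT = "O"


def text_to_labeled_tokens(text):
    """Split punctuated text into (token, label) pairs.

    Each label names the punctuation mark that directly follows the
    token, or NO_PUNCT when no mark follows it.
    """
    marks = "".join(PUNCT_LABELS)
    pairs = []
    for raw in text.split():  # assumption: whitespace-separated tokens
        word = raw.rstrip(marks)
        if not word:
            continue  # skip tokens that consist only of punctuation
        trailing = raw[len(word):]
        # Use the first trailing mark as the label for this token.
        label = PUNCT_LABELS[trailing[0]] if trailing else NO_PUNCT
        pairs.append((word.lower(), label))
    return pairs


# Prefix of the example first text quoted in [0065].
example = ("Drugs are in short supply, patients cannot use them, "
           "and the market has been manipulated against regulations")
pairs = text_to_labeled_tokens(example)
```

Stripping the marks from the tokens gives the unpunctuated input a model would see, while the labels give the targets it would be trained to predict.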

Example 2

[0093] The above embodiment provides a method for building a punctuation mark recognition model. Correspondingly, the present application also provides a device for building a punctuation mark recognition model, which corresponds to the method embodiment above. Parts of this embodiment that are the same as in the first embodiment are not described again; refer to the corresponding parts of the first embodiment.

[0094] A device for building a punctuation mark recognition model provided by the present application includes: a data acquisition unit, a pre-training unit, and an optimization unit.

[0095] The data acquisition unit is used to obtain the first text set, the first speech data set, and the set of correspondences between the second speech data and the second text; the pre-training unit is used to learn, according to the first text set, the network parameters of the text processing module incl...

Example 3

[0108] The foregoing embodiments provide a method for constructing a punctuation mark recognition model. Correspondingly, the present application also provides an electronic device, which corresponds to the method embodiment described above. Since the device embodiment is basically similar to the method embodiment, it is described only briefly; for related details, refer to the corresponding description of the method embodiment. The device embodiments described below are illustrative only.

[0109] An electronic device in this embodiment includes a processor and a memory. The memory stores a program that implements the method for building a punctuation mark recognition model. After the device is powered on and the processor runs the program, the device executes the following steps: obtain the first text set and the first speech data set, and the set of correspondences between the second speech data and the s...


Abstract

The application discloses a method, device and equipment for building a punctuation mark recognition model. The method includes: obtaining a first text set, a first speech data set, and a set of correspondences between second speech data and second text; learning, according to the first text set, the network parameters of the text processing module included in the model; learning, according to the first speech data set, the first network parameters of the speech processing module included in the model; and training the speech processing module, according to the correspondence set and starting from the first network parameters, to obtain the second network parameters of the speech processing module. With this processing, the model achieves relatively consistent recognition accuracy across general fields. At the same time, the speech processing module can be learned better from a small amount of parallel data that covers few fields. With acoustic information introduced, the model can make better use of the speaker's own intention, so the predicted punctuation marks better match spoken language.
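The order of the training stages described in the abstract can be sketched as follows. This is only a schematic under stated assumptions: the class `PunctuationModel`, the stage functions, and the integer "parameter versions" are hypothetical placeholders that record which module each stage updates, not real network training. The abstract does not say whether the text processing module is also updated in the final stage; this sketch leaves it untouched there.

```python
from dataclasses import dataclass, field


@dataclass
class PunctuationModel:
    # Placeholder "parameters": a version counter per module, not real weights.
    text_params: int = 0
    speech_params: int = 0
    history: list = field(default_factory=list)


def pretrain_text_module(model, first_text_set):
    """Stage 1: learn text-module parameters from punctuated text."""
    model.text_params += 1
    model.history.append(("text-pretrain", len(first_text_set)))


def pretrain_speech_module(model, first_speech_set):
    """Stage 2: learn the speech module's first network parameters."""
    model.speech_params += 1
    model.history.append(("speech-pretrain", len(first_speech_set)))


def finetune_speech_module(model, correspondence_set):
    """Stage 3: train the speech module on (speech, text) pairs,
    starting from the first parameters, to get the second parameters."""
    if model.speech_params < 1:
        raise ValueError("speech module must be pre-trained first")
    model.speech_params += 1
    model.history.append(("speech-finetune", len(correspondence_set)))


def build_model(first_text_set, first_speech_set, correspondence_set):
    model = PunctuationModel()
    pretrain_text_module(model, first_text_set)
    pretrain_speech_module(model, first_speech_set)
    finetune_speech_module(model, correspondence_set)
    return model


# Toy inputs: two punctuated texts, three unlabeled audio clips,
# and one (audio, text) pair standing in for the correspondence set.
model = build_model(
    ["Hello, world.", "How are you?"],
    [b"\x00", b"\x01", b"\x02"],
    [(b"\x00", "Hello, world.")],
)
```

The point of the staging is that the large text and speech sets need not be parallel; only the final stage uses paired data, which can be small.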

Description

Technical field

[0001] The present application relates to the technical field of speech processing, in particular to a method, device and equipment for building a punctuation mark recognition model, a speech transcription system, and a speech interaction system.

Background art

[0002] A speech transcription system is a speech processing system that transcribes speech into text. The system can automatically form meeting minutes to improve meeting efficiency, give full play to meeting functions, avoid waste of human, material and financial resources, reduce meeting costs, and achieve human resource efficiency.

[0003] For the convenience of users to read, the text output by the real-time speech transcription system is usually text with punctuation marks. Spoken punctuation prediction is the task of identifying punctuation marks in speech-transcribed text. A typical spoken punctuation prediction method is to predict the punctuation marks that may appear in the speech ...
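To make the task concrete, the following minimal sketch shows the output side of spoken punctuation prediction: given unpunctuated transcribed tokens and one predicted label per token, it rebuilds punctuated text. The label names, the mapping `LABEL_TO_MARK`, and the function `restore_punctuation` are assumptions for illustration, and the labels here are written by hand rather than produced by a model.

```python
# Hypothetical label set; "O" means no punctuation follows the token.
LABEL_TO_MARK = {"O": "", "COMMA": ",", "PERIOD": ".", "QUESTION": "?"}


def restore_punctuation(tokens, labels):
    """Attach the predicted punctuation mark, if any, after each token."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return " ".join(tok + LABEL_TO_MARK[lab] for tok, lab in zip(tokens, labels))


# Unpunctuated transcript and hand-written labels standing in for predictions.
tokens = ["how", "are", "you", "i", "am", "fine"]
labels = ["O", "O", "QUESTION", "O", "O", "PERIOD"]
restored = restore_punctuation(tokens, labels)
```

Here `restored` is the string `how are you? i am fine.`, which is the kind of readable text a transcription system would present to the user.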


Application Information

Patent Type & Authority: Patents (China)
IPC (8): G10L15/06, G10L15/02, G10L15/22, G10L15/26, G06V30/10
CPC: G10L15/063, G10L15/02, G10L15/22, G10L15/26
Inventor: 陈梦喆, 陈谦
Owner: ALIBABA DAMO (HANGZHOU) TECH CO LTD