English punctuation mark adding method, system and device based on data enhancement

This technology, applied to adding punctuation marks to English text, addresses the poor performance of existing methods at restoring punctuation in speech transcripts. Its stated benefits are saved manpower, lower cost, and better results.

Pending Publication Date: 2020-12-01
SHENZHEN RAISOUND TECH

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to provide a method, system and device for adding English punctuation marks based on data enhancement. It addresses the problem that existing text-feature-based approaches to English punctuation prediction are not effective enough at restoring punctuation in speech transcripts.


Examples

Embodiment Construction

[0032] To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0033] The terms "first", "second", "third" and the like in the description and claims of the present invention and the above drawings are used to distinguish different objects, rather than to describe a specific order. Furthermore, the terms "include" and "have", as well as any variations thereof, are intended to cover a non-exclus...

Abstract

The invention discloses a method, system and device for adding English punctuation marks based on data enhancement. The method comprises: acquiring text information and preprocessing it to obtain training data; performing data enhancement on the training data to obtain enhanced data, where the enhancement comprises random deletion, random replacement, random replacement with syllable-similar (similar-sounding) words, and random insertion; combining the original data and the enhanced data into a training data set; and training a model on the training data set to obtain a prediction model that adds English punctuation marks to input text. Because the enhancement simulates real data, the trained prediction model is more robust and performs better in a speech recognition system without increasing the amount of computation. Compared with manually labeling a large volume of real text with punctuation, the approach saves manpower and reduces cost.
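The four enhancement operations and the merging step named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `(word, label)` representation, the `SOUND_ALIKE` table, the function names, the probability `p`, and the rule that punctuated words are never deleted are all assumptions made here so that labels stay aligned with words.

```python
import random

# Hypothetical table of similar-sounding words; the source does not provide one.
SOUND_ALIKE = {
    "their": ["there"], "there": ["their"],
    "to": ["two", "too"], "two": ["to", "too"], "too": ["to", "two"],
    "write": ["right"], "right": ["write"],
}

def augment(pairs, vocab, rng, p=0.1):
    """Return a noisy copy of a list of (word, label) pairs.

    label is the punctuation mark that follows the word ("O" for none).
    Each word is hit by at most one operation, with total probability p
    split evenly across the four operations.
    """
    out = []
    for word, label in pairs:
        r = rng.random()
        if r >= p:
            out.append((word, label))                # left unchanged
        elif r < p / 4:
            # Random deletion. Assumption: only unpunctuated words are
            # dropped, so no punctuation label is ever lost.
            if label != "O":
                out.append((word, label))
        elif r < p / 2:
            out.append((rng.choice(vocab), label))   # random replacement
        elif r < 3 * p / 4:
            alts = SOUND_ALIKE.get(word)             # similar-sounding swap
            out.append((rng.choice(alts) if alts else word, label))
        else:
            out.append((rng.choice(vocab), "O"))     # random insertion
            out.append((word, label))
    return out

def build_training_set(sentences, vocab, copies=1, seed=0, p=0.1):
    """Combine the original sentences with `copies` augmented versions."""
    rng = random.Random(seed)
    data = list(sentences)
    for _ in range(copies):
        data.extend(augment(s, vocab, rng, p) for s in sentences)
    return data

if __name__ == "__main__":
    sentence = [("i", "O"), ("went", "O"), ("there", ","), ("too", ".")]
    for s in build_training_set([sentence], vocab=["the", "a", "of"], p=0.5):
        print(s)
```

In this sketch the noise touches only the words, never the punctuation labels, so the model would see error-laden input paired with the correct punctuation targets. That is one plausible reading of how simulated recognition errors could make the predictor more robust.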

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, in particular to a method, system and device for adding English punctuation marks based on data enhancement.

Background

[0002] In recent years, driven by machine learning, the rapid development of deep learning, and the accumulation of large corpora, speech recognition technology has advanced by leaps and bounds. With these breakthroughs, speech recognition converts spoken language into text, which is then passed to related products as instructions or used as the final result. This can greatly improve the efficiency of human-computer interaction, and its importance to computing and to everyday life has become increasingly prominent. The application fields of products built on speech recognition technology have become more and more extensive, su...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/284, G06F40/154, G06F16/35, G06F16/951, G06K9/62, G10L15/26
CPC: G06F40/284, G06F40/154, G06F16/35, G06F16/951, G06F18/24, G06F18/214
Inventor: 黄石磊, 刘轶, 王昕
Owner: SHENZHEN RAISOUND TECH