
Noise-inserted corpus generation method and device, equipment and readable storage medium

A noise-insertion technique for corpus generation, applicable to noise-inserted corpus generation methods, devices, equipment, and readable storage media. It addresses the problems that speech noise such as slips of the tongue is rarely present in training corpora and that labeling such noise manually is very costly.

Status: Inactive
Publication Date: 2021-09-14
NTT DOCOMO INC

AI Technical Summary

Problems solved by technology

"Noise" such as slips of the tongue, pauses, and hesitation may occur during spoken-language translation, resulting in wrong translation results.
Existing corpora for training neural networks usually contain little or no text with this kind of noise.
If noise insertion and labeling are performed manually, a huge labeling cost is incurred.

Embodiment Construction

[0035] The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings in those embodiments. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in the present disclosure, without creative effort, shall fall within the protection scope of the present disclosure.

[0036] "First", "second" and similar words used in the present disclosure do not indicate any order, quantity or importance, but are only used to distinguish different components. Likewise, "comprising" or "comprises" and similar words mean that the elements or items appearing before the word include the elements or items listed after the word and their equivalents, and do not exclude other elements or items. Words such as "connected" or "connected" ar...

Abstract

The invention provides a noise-inserted corpus generation method, device, equipment, and readable storage medium. The method comprises: acquiring a to-be-processed corpus that comprises at least one word; for a word in the at least one word, obtaining feature information of the word; determining, based on the feature information, noise corresponding to the word; and inserting the noise corresponding to the word into the to-be-processed corpus to generate a noise-inserted corpus.
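
The four steps in the abstract (acquire corpus, obtain per-word features, determine noise from the features, insert the noise) can be sketched roughly as follows. This is only an illustrative sketch, not the patented method: the source does not say which features are used or how noise is chosen, so the `WordFeatures` fields, the `FILLERS` list, the length threshold, and the `p_noise` probability are all assumptions made for the example.

```python
import random
from dataclasses import dataclass

# Hypothetical filler inventory; the source does not list concrete noise tokens.
FILLERS = ["uh", "um", "er"]


@dataclass
class WordFeatures:
    """Assumed feature set; the source only says 'feature information'."""
    position: int  # index of the word in the corpus
    length: int    # number of characters in the word


def get_features(words, i):
    """Step 2: obtain feature information for the word at index i."""
    return WordFeatures(position=i, length=len(words[i]))


def determine_noise(word, feats, rng, p_noise=0.3):
    """Step 3: decide which noise tokens (if any) go before the word.

    Assumed rule: with probability p_noise, long words get a false-start
    fragment (a slip of the tongue) and short words get a filler (hesitation).
    """
    if rng.random() >= p_noise:
        return []
    if feats.length >= 6:
        return [word[:2] + "-"]
    return [rng.choice(FILLERS)]


def insert_noise(corpus, seed=0, p_noise=0.3):
    """Steps 1 and 4: take a corpus and return it with noise inserted."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    words = corpus.split()     # whitespace tokenization, for illustration only
    noisy = []
    for i, word in enumerate(words):
        feats = get_features(words, i)
        noisy.extend(determine_noise(word, feats, rng, p_noise))
        noisy.append(word)
    return " ".join(noisy)
```

Calling `insert_noise("please translate this sentence", p_noise=1.0)` places one noise token before every word, while `p_noise=0.0` returns the corpus unchanged. A trained model could stand in for the rule in `determine_noise`; the fixed rule here only keeps the sketch self-contained.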

Description

Technical field

[0001] The present disclosure relates to the field of natural language processing based on artificial intelligence technology, and more specifically, to a noise-inserted corpus generation method, device, equipment, and readable storage medium.

Background

[0002] Natural language processing (NLP) is one of the important application fields of artificial intelligence technology. It enables computers to read text as humans do and understand the meaning behind the text, thereby completing specific applications such as machine translation, automatic question answering, information retrieval, sentiment analysis, and automatic text summarization. As a branch of natural language processing, machine translation uses neural networks to translate between different languages, such as Chinese to English and English to Chinese.

[0003] Before using the neural network for natural language p...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/284, G06F40/58, G06N3/02, G06N3/08
CPC: G06F40/284, G06F40/58, G06N3/02, G06N3/08
Inventors: 张斯曼, 李安新, 陈岚, 村上聪一朗
Owner: NTT DOCOMO INC