Text data generation method, computer equipment and storage medium

A text data generation method, computer equipment and storage medium in the field of text technology. It addresses the problems that labeled grammatical error data is scarce and covers few error types, so that a detection model has difficulty learning sufficiently general knowledge, and it achieves the effect of diverse generation results.

Pending Publication Date: 2022-04-15
IFLYTEK CO LTD +2

AI Technical Summary

Problems solved by technology

[0003] Because labeled data is scarce, a grammatical error detection model is prone to overfitting on small-scale training sets, which seriously restricts its improvement.
On the other hand, the small scale of the labeled data also means that it contains few types of grammatical errors. In extreme cases, most of the errors in the data may be misuses of the particles "的", "得" and "地" (all romanized as "de"). A model trained on so few error types has difficulty learning sufficiently general knowledge.

Examples

Embodiment Construction

[0019] The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained from them by persons of ordinary skill in the art without creative effort fall within the scope of protection of this application.

[0020] The flow charts shown in the drawings are illustrative only. They do not necessarily include all contents and operations/steps, nor must the operations/steps be performed in the order described. For example, some operations/steps can be decomposed, combined or partly combined, so the actual order of execution may change according to the actual situation.

[0021] Embodiments of the present application provide a method for generating text data, computer equipment, and a ...

Abstract

The embodiments of the invention provide a text data generation method, computer equipment and a storage medium. The method comprises: obtaining a third text without grammatical defects; inputting the third text into a generation model to generate a first generated text corresponding to the third text and a target probability indicating that the first generated text has grammatical defects, the generation model being trained on a first text without grammatical defects and a second text with grammatical defects; and determining the first generated text as a target text according to its target probability. Because the generation model learns the grammatical error patterns of the second text, the trained model can exploit the diversity of its generated results to automatically construct a large number of texts containing grammatical defects.
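The flow in the abstract (clean text in, generated text plus defect probability out, then selection by probability) can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_generation_model` is a hypothetical rule-based stand-in for the trained generation model, its scores are made up, and the `threshold` of 0.5 is an assumed selection rule that the source does not specify.

```python
import random
from typing import Callable, List, Tuple

# Any callable mapping a clean text to
# (generated_text, probability_that_generated_text_has_a_grammar_defect).
GenerationModel = Callable[[str], Tuple[str, float]]


def toy_generation_model(text: str, rng: random.Random) -> Tuple[str, float]:
    """Hypothetical stand-in for the trained generation model.

    It swaps one random pair of adjacent words and reports a made-up score.
    The model in the source is learned from defect-free and defective text.
    """
    words = text.split()
    if len(words) < 2:
        return text, 0.0  # nothing to perturb, so no defect was introduced
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    generated = " ".join(words)
    # Assumed scoring rule: unchanged output scores 0.0, changed output 0.9.
    probability = 0.0 if generated == text else 0.9
    return generated, probability


def build_defective_corpus(
    clean_texts: List[str],
    model: GenerationModel,
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> List[Tuple[str, str]]:
    """Keep (clean, generated) pairs whose defect probability passes the threshold."""
    targets = []
    for clean in clean_texts:
        generated, probability = model(clean)
        if probability >= threshold:
            targets.append((clean, generated))
    return targets


rng = random.Random(0)
corpus = build_defective_corpus(
    ["she reads books every day", "ok"],
    lambda t: toy_generation_model(t, rng),
)
for clean, defective in corpus:
    print(clean, "->", defective)
```

The one-word input is dropped because the stand-in cannot perturb it and reports a probability of 0.0. With a learned model, the same filter would discard generations that the model itself judges likely to be grammatical.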

Description

Technical field

[0001] The present application relates to the technical field of natural language processing, and in particular to a method for generating text data, computer equipment and storage media.

Background technique

[0002] The goal of the grammatical error detection task is to detect possible grammatical errors in text. These errors can be divided into categories such as "typos", "incomplete components", "redundant components", "improper collocations", "improper words" and "improper word order". Grammatical error detection requires text annotated with grammatical errors for model training. However, existing Chinese corpora annotated with grammatical errors are small in scale, difficult to obtain, and expensive to label.

[0003] Due to the lack of enough labeled data, the grammatical error detection model is prone to overfitting on small-scale training sets, which seriously restricts the improvement of the grammatical error dete...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/253, G06F40/289, G06F40/30, G06K9/62
Inventor: 呼啸, 巩捷甫, 宋巍, 盛志超, 王士进, 陈志刚, 胡国平, 秦兵, 刘挺
Owner: IFLYTEK CO LTD