Training data generation method and device, electronic equipment and readable medium

A technology for generating training data for models, applied in the field of machine learning. It addresses the low accuracy of generated text summaries and the low versatility of summary generation models, with the aim of making training data more targeted and improving model versatility, prediction accuracy, and performance.

Pending Publication Date: 2021-11-16
BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD
Cites: 0; Cited by: 2

AI Technical Summary

Problems solved by technology

[0004] Embodiments of the present invention provide a method, device, electronic device, and computer-readable storage medium for generating training data, so as to solve, or partially solve, the problem in the related art that generated text summaries have low accuracy because the model has low versatility.

Examples

Detailed Description of the Embodiments

[0111] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0112] As an example, automatic text summarization can effectively compress and refine document information and help users retrieve the relevant information they need from massive amounts of information. It avoids the excessive redundant and one-sided information that users may obtain when searching through search engines, reduces the amount of document information users have to read, and effectively addresses information overload.

[0113] A text summary can be obtained by inputting the corresponding text into a summary generation model. The summary generation model often takes plain text as input, and the text summary corresponding to ...

Abstract

The embodiments of the invention provide a training data generation method and device, electronic equipment, and a readable medium. The method comprises the following steps: processing a prediction text set with a summary generation model and determining a summary evaluation value for each prediction text in the set; extracting the prediction texts with low summary evaluation values to form a preset text set; performing text similarity matching between the preset text set and at least one candidate text set, so as to extract from the candidate text set the texts whose similarity meets a preset condition as target texts; determining the target text summary corresponding to each target text; and using the target texts and the target text summaries as training data for the summary generation model. In this way, once the preset text set on which the current summary generation model predicts relatively poorly has been extracted, it can be matched against a candidate text set to screen texts in a targeted manner, which makes the training data more targeted.
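
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the source does not specify the summary evaluation value, the similarity measure, the thresholds, or how target text summaries are determined. The sketch assumes a unigram-overlap F1 score as the evaluation value, `difflib.SequenceMatcher` as the similarity measure, and a candidate set that already carries a summary for each text. The names `unigram_f1`, `select_training_data`, `score_threshold`, and `similarity_threshold` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple


def unigram_f1(candidate: str, reference: str) -> float:
    """Assumed stand-in for the summary evaluation value: unigram-overlap F1."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Clipped count of words shared by the generated and reference summaries.
    overlap = sum(min(cand.count(w), ref.count(w)) for w in set(cand))
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def select_training_data(
    model: Callable[[str], str],
    prediction_set: List[Tuple[str, str]],  # (text, reference summary)
    candidate_set: List[Tuple[str, str]],   # (text, summary); summaries assumed available
    score_threshold: float = 0.5,           # hypothetical cutoff for a "low" evaluation value
    similarity_threshold: float = 0.6,      # hypothetical "preset condition" on similarity
) -> List[Tuple[str, str]]:
    # Step 1: score the model on each prediction text and keep the poorly summarized ones.
    preset_texts = [
        text
        for text, reference in prediction_set
        if unigram_f1(model(text), reference) < score_threshold
    ]
    # Step 2: keep candidate texts that are similar to at least one poorly summarized text.
    training_data = []
    for cand_text, cand_summary in candidate_set:
        if any(
            SequenceMatcher(None, cand_text, weak).ratio() >= similarity_threshold
            for weak in preset_texts
        ):
            # Step 3: the target text and its summary become a new training pair.
            training_data.append((cand_text, cand_summary))
    return training_data
```

Under these assumptions, the returned pairs concentrate on the kinds of text the current model handles worst, which is the targeted screening the abstract describes. A production system would likely replace the character-level similarity with a retrieval or embedding-based match.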

Description

Technical field

[0001] The present invention relates to the technical field of machine learning, in particular to a method for generating training data, a device for generating training data, an electronic device, and a computer-readable medium.

Background

[0002] With the explosive growth of text information, people are exposed to massive amounts of text every day, such as news, meeting minutes, blogs, chats, reports, papers, and Weibo posts. Extracting the important content from this text has therefore become an urgent need, and automatic text summarization, a technology that lets users obtain information more quickly and accurately, has emerged as an efficient solution.

[0003] Among them, the model used to generate a summary often uses the title (or text summary) as the output of the mo...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/34, G06F16/33, G06N20/00
CPC: G06F16/345, G06F16/334, G06N20/00
Inventors: 杨鹏, 涂曼姝, 龚能
Owner: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD