
Corpus generation method and device and computer equipment

A corpus generation technology, applied in the field of network technology, that addresses problems such as poor error correction and thereby improves both the error correction effect and the user experience.

Pending Publication Date: 2021-02-26
ALIBABA GRP HLDG LTD
Cites: 0; Cited by: 2

AI Technical Summary

Problems solved by technology

[0005] The error correction effect of an actual error correction model is largely determined by the quality and quantity of its training samples. The higher the quality and the greater the number of training samples, the better the error correction effect of the model; conversely, the lower the quality and the smaller the number, the worse the effect.




Embodiment Construction

[0062] To enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below in conjunction with the drawings in those embodiments.

[0063] Some of the procedures described in the specification and claims of this application and in the above-mentioned drawings include multiple operations that appear in a specific order. It should be clearly understood, however, that these operations may be executed out of the order in which they appear in this document, or executed in parallel. Operation sequence numbers such as 101 and 102 are used only to distinguish different operations, and the sequence numbers themselves do not represent any execution order. In addition, these procedures may include more or fewer operations, and these operations may be executed sequentially or in parallel. It should be noted that the descrip...


Abstract

The embodiments of the invention provide a corpus generation method and device, and computer equipment. According to the embodiments, the method comprises: obtaining a positive sample in a target domain and determining a replacement word corresponding to at least one correct word in the positive sample; replacing the at least one correct word with the corresponding replacement word to obtain a negative sample corresponding to the positive sample; and generating a first error correction parallel corpus of the target domain based at least on the positive sample and the negative sample. In this way, a large number of high-quality first error correction parallel corpora in the target domain can be obtained quickly.
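
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how replacement words are determined, so the sketch assumes a small hand-written confusion table (`CONFUSIONS`), whitespace tokenization, and random selection of which words to corrupt. The function names `make_negative` and `build_parallel_corpus` are hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple

# Assumption: a hand-written table mapping a correct word to plausible
# erroneous forms. The source does not specify how replacement words are found.
CONFUSIONS: Dict[str, List[str]] = {
    "wireless": ["wirless", "wireles"],
    "headphones": ["headfones", "hedphones"],
}


def make_negative(
    positive: List[str],
    confusions: Dict[str, List[str]],
    rng: random.Random,
) -> Optional[List[str]]:
    """Replace at least one correct word with a replacement word.

    Returns None when no word of the positive sample has a known replacement.
    """
    candidates = [i for i, word in enumerate(positive) if word in confusions]
    if not candidates:
        return None
    # Corrupt between one and all of the replaceable positions.
    count = rng.randint(1, len(candidates))
    negative = list(positive)
    for i in rng.sample(candidates, count):
        negative[i] = rng.choice(confusions[positive[i]])
    return negative


def build_parallel_corpus(
    positives: List[str],
    confusions: Dict[str, List[str]],
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Pair each generated negative sample with its positive sample."""
    rng = random.Random(seed)
    corpus: List[Tuple[str, str]] = []
    for sentence in positives:
        negative = make_negative(sentence.split(), confusions, rng)
        if negative is not None:
            # (erroneous text, corrected text) pair for training a corrector.
            corpus.append((" ".join(negative), sentence))
    return corpus


if __name__ == "__main__":
    queries = ["wireless headphones for running", "red shoes"]
    for wrong, right in build_parallel_corpus(queries, CONFUSIONS):
        print(f"{wrong!r} -> {right!r}")
```

Each output pair holds a negative sample and the positive sample it was derived from, which is the shape of data an error correction model would be trained on. Positive samples that contain no replaceable word are skipped in this sketch rather than emitted unchanged.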

Description

Technical field

[0001] The embodiments of this application relate to the field of network technology, and in particular to a corpus generation method and device, and computer equipment.

Background

[0002] With the rapid development of Internet technology, users increasingly make purchases through online shopping malls. To search for a desired product, a user generally enters text information describing the product in the search box of the user terminal; the user terminal then performs a product search based on that text information and displays the matching products to the user.

[0003] However, when the user enters the text information of the product to be searched, the text may be entered incorrectly because of limited knowledge or typing errors during input. To improve the user experience, search error correction can be applied to the incorrect text information entered by the user. The correct text information that m...


Application Information

IPC(8): G06F40/232, G06F40/30, G06F16/33
CPC: G06F16/3332
Inventor: 刘恒友, 李辰, 包祖贻, 黄睿, 徐光伟, 李林琳
Owner: ALIBABA GRP HLDG LTD