Text processing method and device, medium and computing equipment

A text processing technology applied in the field of text translation. It addresses the high quality requirements placed on uploaded texts and the high cost of alignment.

Pending Publication Date: 2020-02-18
网易有道信息技术(北京)有限公司

AI Technical Summary

Problems solved by technology

[0006] Therefore, existing techniques for generating parallel corpora place high quality requirements on uploaded texts, require manual intervention for alignment, and incur high alignment costs.



Embodiment Construction

[0044] The principle and spirit of the present invention will be described below with reference to several exemplary embodiments. It should be understood that these embodiments are given only to enable those skilled in the art to better understand and implement the present invention, rather than to limit the scope of the present invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

[0045] Those skilled in the art know that the embodiments of the present invention can be implemented as a system, apparatus, device, method or computer program product. Therefore, the present invention can be implemented entirely in hardware, entirely in software (including firmware, resident software, microcode, etc.), or as a combination of hardware and software.

[0046] According to the embodiments of the present invention, a text processing method, d...


Abstract

An embodiment of the invention provides a text processing method. The method comprises: obtaining a source text and a target text; determining a segmented paragraph pair according to the first paragraph number a of the source text and the second paragraph number b of the target text, the segmented paragraph pair comprising a first paragraph serial number for the source text and a second paragraph serial number for the target text; segmenting the source text and the target text according to the segmented paragraph pair to obtain a plurality of sub-source texts and a plurality of sub-target texts in one-to-one correspondence with the sub-source texts; and aligning the sub-source texts and the sub-target texts using a predetermined alignment algorithm. Because the two texts are first divided into sub-texts and the sub-texts are then aligned, cascading errors caused by non-standard texts during subsequent paragraph alignment and sentence alignment are reduced, the text alignment quality is improved, and the quality requirement on the two texts is lowered.
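
The steps in the abstract can be sketched in Python as follows. This is a minimal illustration, not the patented procedure: the abstract does not say how the segmented paragraph pairs are derived from a and b, so the sketch reads a and b as paragraph counts and simply places cut points at proportional positions. The names `split_pairs`, `segment`, `process`, the chunk count `k`, and the `align` callback are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_pairs(a: int, b: int, k: int) -> List[Tuple[int, int]]:
    """Pick cut points (source index, target index) for texts with a and b
    paragraphs. ASSUMPTION: cuts are placed proportionally so that roughly
    k chunks result; the source document does not specify this rule."""
    pairs: List[Tuple[int, int]] = []
    for i in range(1, k):
        s = i * a // k  # cut position in the source text
        t = i * b // k  # matching cut position in the target text
        inside = 0 < s < a and 0 < t < b
        increasing = not pairs or (s > pairs[-1][0] and t > pairs[-1][1])
        if inside and increasing:
            pairs.append((s, t))
    return pairs


def segment(paragraphs: Sequence[T], cuts: Sequence[int]) -> List[List[T]]:
    """Split a list of paragraphs into consecutive chunks at the cut indices."""
    bounds = [0, *cuts, len(paragraphs)]
    return [list(paragraphs[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


def process(
    source: Sequence[str],
    target: Sequence[str],
    k: int,
    align: Callable[[List[str], List[str]], object],
) -> List[object]:
    """Segment both texts with the same cut pairs, then align each
    sub-source text with its corresponding sub-target text."""
    pairs = split_pairs(len(source), len(target), k)
    src_chunks = segment(source, [s for s, _ in pairs])
    tgt_chunks = segment(target, [t for _, t in pairs])
    # Both lists have len(pairs) + 1 chunks, so zip pairs them one-to-one.
    return [align(s, t) for s, t in zip(src_chunks, tgt_chunks)]
```

Here `align` stands in for the "predetermined alignment algorithm"; any paragraph or sentence aligner with that signature could be passed in. Because each chunk is aligned independently, a misalignment inside one chunk cannot propagate into the next, which is the cascading-error benefit the abstract describes.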

Description

Technical field

[0001] Embodiments of the present invention relate to the field of text translation, and more specifically, embodiments of the present invention relate to a text processing method, device, medium, and computing device.

Background technique

[0002] This section is intended to provide a background or context for implementations of the invention that are recited in the claims. The descriptions herein are not admitted to be prior art by inclusion in this section.

[0003] In the field of translation, it is often necessary to use an alignment algorithm to generate a series of parallel sentence pairs, that is, to align texts in two different languages at the sentence level to obtain parallel sentences, thereby providing a large amount of corpus for automatic translation.

[0004] Existing common alignment algorithms include double alignment algorithm and direct sentence alignment algorithm. Among them, the direct sentence alignment algorithm is to divide the t...


Application Information

IPC(8): G06F40/45
Inventor 付凯, 陈旻, 黄瑾, 段亦涛
Owner 网易有道信息技术(北京)有限公司