Text processing method and device, medium and computing equipment

A text processing technology, applied in the field of text translation, that addresses the problems of high alignment cost and high quality requirements on uploaded text.
CN110807334A · Pending · Publication Date: 2020-02-18 · 网易有道信息技术(北京)有限公司

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
网易有道信息技术(北京)有限公司
Publication Date
2020-02-18


Abstract

An embodiment of the invention provides a text processing method. The method comprises: obtaining a source text and a target text; determining a segmentation paragraph pair according to the first paragraph number a of the source text and the second paragraph number b of the target text, the segmentation paragraph pair comprising a first paragraph serial number for the source text and a second paragraph serial number for the target text; segmenting the source text and the target text according to the segmentation paragraph pair to obtain a plurality of sub-source texts and a plurality of sub-target texts in one-to-one correspondence with the plurality of sub-source texts; and aligning the plurality of sub-source texts and the plurality of sub-target texts using a predetermined alignment algorithm. With the method, device, medium and computing equipment, the two texts are first divided into a plurality of sub-texts and the sub-texts are then aligned, so that cascading errors caused by non-standard texts during subsequent paragraph alignment and sentence alignment can be reduced, text alignment quality is improved, and the quality requirement on the two texts is lowered.
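The steps in the abstract can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the patented method: the abstract does not say how the segmentation paragraph pairs are derived from a and b, so the proportional rule in `split_points` is an assumption, and `align_positionally` is a hypothetical stand-in for the unspecified "predetermined alignment algorithm". All function names are invented for illustration.

```python
from typing import Callable, List, Tuple

Paragraphs = List[str]
Aligner = Callable[[Paragraphs, Paragraphs], List[Tuple[str, str]]]


def split_points(a: int, b: int, chunks: int) -> List[Tuple[int, int]]:
    """Return segmentation paragraph pairs (source index, target index).

    ASSUMPTION: cut points are placed proportionally to the paragraph
    counts a and b. The source text does not specify the actual rule.
    """
    if chunks < 1:
        raise ValueError("chunks must be at least 1")
    # Never request more chunks than the shorter text has paragraphs,
    # so that every sub-text is non-empty.
    chunks = max(1, min(chunks, a, b))
    return [(i * a // chunks, i * b // chunks) for i in range(1, chunks)]


def segment(paragraphs: Paragraphs, cuts: List[int]) -> List[Paragraphs]:
    """Split a paragraph list at the given cut indices."""
    bounds = [0, *cuts, len(paragraphs)]
    return [paragraphs[s:e] for s, e in zip(bounds, bounds[1:])]


def align_positionally(src: Paragraphs, tgt: Paragraphs) -> List[Tuple[str, str]]:
    """Hypothetical placeholder aligner: pair paragraphs by position."""
    return list(zip(src, tgt))


def process(
    source: Paragraphs,
    target: Paragraphs,
    chunks: int,
    aligner: Aligner = align_positionally,
) -> List[List[Tuple[str, str]]]:
    """Segment both texts at paired cut points, then align each sub-text pair."""
    pairs = split_points(len(source), len(target), chunks)
    sub_sources = segment(source, [p[0] for p in pairs])
    sub_targets = segment(target, [p[1] for p in pairs])
    # Sub-texts correspond one-to-one, so an alignment error in one
    # pair cannot propagate into the next pair.
    return [aligner(s, t) for s, t in zip(sub_sources, sub_targets)]
```

Because each sub-source text is aligned only against its own sub-target text, a misalignment stays confined to one pair, which is the cascading-error reduction the abstract describes. Any real aligner with the same signature could be passed as `aligner`.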

Description

Technical field

[0001] Embodiments of the present invention relate to the field of text translation and, more specifically, to a text processing method, device, medium, and computing device.

Background

[0002] This section is intended to provide a background or context for implementations of the invention that are recited in the claims. The descriptions herein are not admitted to be prior art by inclusion in this section.

[0003] In the field of translation, an alignment algorithm is often needed to generate a series of parallel sentence pairs, that is, to align texts in two different languages at the sentence level to obtain parallel sentences, thereby providing a large corpus for automatic translation.

[0004] Common existing alignment algorithms include the double alignment algorithm and the direct sentence alignment algorithm. Among them, the direct sentence alignment algorithm is to divide the t...
