Statement segmentation method and device, storage medium, processor and terminal equipment

A sentence segmentation technology involving a storage medium and a processor, applied in natural language data processing, semantic analysis, electronic digital data processing, etc. It addresses problems such as large translation deviations caused by the inability to effectively segment sentences and/or sentence pairs, and achieves a wide range of applications.

Pending Publication Date: 2020-11-10
ALIBABA GRP HLDG LTD
Cites: 0 | Cited by: 3

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a sentence segmentation method and device, a storage medium, a processor, and a terminal device, to at least solve the problem...


Examples

Embodiment 1

[0026] According to an embodiment of the present invention, an embodiment of a sentence segmentation method is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

[0027] The sentence segmentation method provided in Embodiment 1 of the present application can be executed in a mobile terminal, a computer terminal or a similar computing device. Figure 1 shows a hardware structural block diagram of a computer terminal (or mobile device) for implementing the sentence segmentation method. As shown in Figure 1, the computer terminal 10 (or mobile device 10) may include one or more processors 102 (shown as 102a, 102b, ..., 102n in the figure) (the processor 102 ma...

Embodiment 2

[0069] According to an embodiment of the present invention, a sentence segmentation device for implementing the above sentence segmentation method is also provided. Figure 5 is a schematic diagram of an optional sentence segmentation device according to an embodiment of the present invention. As shown in Figure 5, the device includes an acquisition unit 51, a training unit 53, and a segmentation unit 55, wherein:

[0070] The acquisition unit 51 is configured to acquire training data, wherein the training data are bilingual sentence pairs to be used, formed by segmenting initial bilingual sentence pairs at least based on the word alignment relationship;

[0071] The training unit 53 is configured to obtain a sentence segmentation model by training on the training data;

[0072] The segmentation unit 55 is configured to segment the sentence to be segmented using the sentence segmentation model.
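The division of work among the three units can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the text does not specify how split points are derived from the word alignment, nor what kind of model is trained. Here a split point is accepted only when no alignment link crosses it, and the "model" is a simple per-token break frequency. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Links = Set[Tuple[int, int]]  # (source index, target index) word alignment links


def consistent_split_points(n_src: int, n_tgt: int, links: Links) -> List[Tuple[int, int]]:
    """Return cuts (i, j) such that no alignment link crosses the cut.

    Assumption: a cut is usable when every link lies entirely before it
    or entirely after it on both sides.
    """
    return [
        (i, j)
        for i in range(1, n_src)
        for j in range(1, n_tgt)
        if all((s < i) == (t < j) for s, t in links)
    ]


def acquire_training_data(pairs):
    """Acquisition step: mark source-side break positions found via alignment."""
    data = []
    for src, tgt, links in pairs:
        breaks = {i for i, _ in consistent_split_points(len(src), len(tgt), links)}
        data.append((src, breaks))
    return data


def train_model(data) -> Dict[str, float]:
    """Training step (toy): estimate how often a break follows each token."""
    seen, broke = Counter(), Counter()
    for tokens, breaks in data:
        for pos, tok in enumerate(tokens[:-1]):
            seen[tok] += 1
            if pos + 1 in breaks:
                broke[tok] += 1
    return {tok: broke[tok] / seen[tok] for tok in seen}


def segment(tokens: List[str], model: Dict[str, float], threshold: float = 0.5) -> List[List[str]]:
    """Segmentation step: cut after tokens whose break frequency reaches the threshold."""
    segments, current = [], []
    for tok in tokens:
        current.append(tok)
        if model.get(tok, 0.0) >= threshold:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


# Illustrative data: adjective-noun order is swapped inside each half,
# so the only cut that no link crosses is the one between the halves.
src = ["red", "car", "blue", "house"]
tgt = ["voiture", "rouge", "maison", "bleue"]
links = {(0, 1), (1, 0), (2, 3), (3, 2)}

training_data = acquire_training_data([(src, tgt, links)])
model = train_model(training_data)
result = segment(["green", "car", "old", "house"], model)
```

Note that this sketch needs no punctuation to find a cut; a real model would generalize from context rather than memorize individual tokens.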

[0073] The above-mentioned sentence segmentation can use the acquisition unit ...

Embodiment 3

[0082] Embodiments of the present invention may provide a computer terminal, and the computer terminal may be any computer terminal device in a group of computer terminals. Optionally, in this embodiment, the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.

[0083] Optionally, in this embodiment, the foregoing computer terminal may be located in at least one network device among multiple network devices of the computer network.

[0084] Optionally, the above-mentioned computer device includes: a processor; and a memory, connected to the processor, for providing the processor with instructions for the following processing steps: obtaining training data, wherein the training data are bilingual sentence pairs to be used, formed by segmenting initial bilingual sentence pairs at least based on the word alignment relationship; obtaining the sentence segmentation model through training data train...

Abstract

The invention discloses a statement segmentation method and device, a storage medium, a processor and terminal equipment. The method comprises the steps of: acquiring training data, wherein the training data are bilingual sentence pairs to be used, formed by segmenting initial bilingual sentence pairs at least based on a word alignment relation; training on the training data to obtain a statement segmentation model; and segmenting the to-be-segmented statement by adopting the statement segmentation model. The method and the device solve the technical problem in related technologies of very large translation deviation caused by the inability to effectively segment statements and/or sentence pairs during text translation.

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, in particular to a sentence segmentation method and device, a storage medium, a processor and a terminal device.

Background technique

[0002] In related technologies, machine translation refers to the translation of text from one natural language into another natural language by means of a computer program. Currently, machine translation often includes two steps: 1) first segment the original text and the translation of the bilingual sentence pair according to punctuation to form clauses; 2) align the clauses to obtain bilingual clause pairs that are mutual translations. But this approach has obvious defects: 1) after segmentation according to punctuation, there may be no clause pairs that are complete mutual translations; 2) sentences that do not have punctuation but are still very long cannot be p...
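The punctuation-based first step described in the background, and the two defects attributed to it, can be reproduced with a few lines. This is a minimal sketch: the function name, the punctuation set and the example sentences are assumptions, not taken from the application.

```python
import re

# Assumed set of clause-ending punctuation (ASCII and full-width forms).
_PUNCT = re.compile(r"[,.;:!?，。；：！？]")


def split_on_punctuation(text: str) -> list:
    """Split text into clauses at punctuation marks, dropping empty pieces."""
    return [clause.strip() for clause in _PUNCT.split(text) if clause.strip()]


# Defect 1: source and target may be punctuated differently, so the
# clauses do not pair up one-to-one as mutual translations.
src_clauses = split_on_punctuation("He said that he would come tomorrow.")
tgt_clauses = split_on_punctuation("他说，他明天会来。")

# Defect 2: a long sentence without punctuation is never split at all.
long_clauses = split_on_punctuation(
    "the committee that reviewed the proposal last week decided to postpone the vote"
)
```

Here the source yields one clause while the target yields two, and the unpunctuated sentence comes back as a single clause regardless of its length.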

Application Information

IPC(8): G06F40/58, G06F40/30, G06F40/289, G06F16/35
CPC: G06F16/35
Inventor: 陆军, 施杨斌, 赵宇, 骆卫华
Owner: ALIBABA GRP HLDG LTD