Text segmentation method and device, electronic equipment and storage medium

A text segmentation technology, applied in the fields of electrical digital data processing, instruments, and computing. It addresses redundant processing workflows, errors in sentence generation, and insufficient part-of-speech coverage, with the effect of ensuring accuracy and processing efficiency.

Pending Publication Date: 2021-05-28
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

The prior-art workflow is redundant and heavyweight. Word segmentation relies on a large lexicon and a word segmentation algorithm. After segmentation, sentences must be generated from the part of speech of each segmented word, which in turn relies on a large part-of-speech model. Combining the words into sentences may then go wrong because of insufficient part-of-speech coverage or part-of-speech conflicts, leading to errors in sentence generation.

Examples

Detailed Description of the Embodiments

[0028] Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0029] Figure 1 is a schematic diagram of the text segmentation method provided by the first embodiment of the present disclosure. As shown in Figure 1, the method includes:

[0030] S101: Divide the text to be processed based on punctuation marks to obtain L first clauses; L is an integer greater than or equal to 1;

[0031] S102: Determine M clauses to be output based on ...
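
Step S101 can be illustrated with a minimal Python sketch. The punctuation set and the function name `split_on_punctuation` are assumptions made for illustration; the source does not say which punctuation marks are used or how empty pieces are handled.

```python
import re
from typing import List

# Assumed punctuation set (full-width and ASCII); not specified in the source.
PUNCTUATION = "，。！？；,.!?;"
_SPLIT_RE = re.compile("[" + re.escape(PUNCTUATION) + "]+")


def split_on_punctuation(text: str) -> List[str]:
    """Sketch of S101: cut the text at punctuation marks.

    Returns the L first clauses, with surrounding whitespace removed
    and empty pieces dropped (both are choices of this sketch).
    """
    pieces = (piece.strip() for piece in _SPLIT_RE.split(text))
    return [piece for piece in pieces if piece]


if __name__ == "__main__":
    for clause in split_on_punctuation("Hello, world! How are you?"):
        print(clause)
```

In this sketch, text that contains no punctuation comes back as a single clause, which matches the case L = 1.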

Abstract

The invention discloses a text segmentation method and device, electronic equipment, and a storage medium, and relates to the field of information processing. In the specific implementation scheme, the text to be processed is divided based on punctuation marks to obtain L first clauses, where L is an integer greater than or equal to 1. M clauses to be output are then determined based on the L first clauses and taken as the segmentation result of the text to be processed, where M is an integer greater than or equal to 1. Determining the M clauses to be output based on the L first clauses includes: when the length of the i-th first clause among the L first clauses is greater than a preset length threshold, processing the i-th first clause based on a matching rule to obtain clauses to be output, where i is an integer greater than or equal to 1 and less than or equal to L.
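
The length-threshold step in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not define the matching rule, so the rule is passed in as a callable, and `split_before_connectives` is a hypothetical example rule invented here. The names `determine_output_clauses` and `max_length` are also assumptions.

```python
import re
from typing import Callable, List

# Hypothetical matching rule: cut a long clause before a few English
# connectives. The real matching rule is not given in the abstract.
_CONNECTIVE_RE = re.compile(r"\s+(?=(?:and|but|because)\b)")


def split_before_connectives(clause: str) -> List[str]:
    """Example rule only: split at whitespace that precedes a connective."""
    return [part for part in _CONNECTIVE_RE.split(clause) if part]


def determine_output_clauses(
    first_clauses: List[str],
    max_length: int,
    rule: Callable[[str], List[str]],
) -> List[str]:
    """Turn the L first clauses into the M clauses to be output.

    A clause longer than max_length is processed by the matching rule;
    shorter clauses are passed through unchanged (an assumption, since
    the abstract only describes the over-length case).
    """
    output: List[str] = []
    for clause in first_clauses:
        if len(clause) > max_length:
            output.extend(rule(clause))
        else:
            output.append(clause)
    return output


if __name__ == "__main__":
    clauses = [
        "I went home",
        "she stayed at the office because the report was late",
    ]
    print(determine_output_clauses(clauses, 20, split_before_connectives))
```

Passing the rule as a parameter keeps the threshold check separate from whatever rule set an implementation actually uses.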

Description

Technical field

[0001] The present disclosure relates to the field of information processing, in particular to the field of text information processing.

Background

[0002] In the prior art, text is usually segmented by first performing word segmentation and then generating clauses according to part of speech. This workflow is redundant and heavyweight. Word segmentation relies on a large lexicon and a word segmentation algorithm. After segmentation, sentences must be generated from the part of speech of each segmented word, which in turn relies on a large part-of-speech model. Combining the words into sentences may then go wrong because of insufficient part-of-speech coverage or part-of-speech conflicts, leading to errors in sentence generation.

Summary of the invention

[0003] The disclosure provides a text segmentation method, device, electronic equipment and storage medium.

[0004] According t...

Application Information

IPC(8): G06F40/211, G06F40/289
CPC: G06F40/211, G06F40/289
Inventor: 常炎隆
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD