Method and equipment for segmenting texts

A text segmentation technique, applied in the field of text segmentation, that makes analysis and comparison easier and saves time.

Active Publication Date: 2017-10-03
CANON KK
Cites: 13; Cited by: 3
AI Technical Summary

Problems solved by technology

However, as shown in Figure 3, the prior-art method identified only 4 fragments.

Examples

First embodiment

[0042] Figure 4 is a flowchart illustrating a method for segmenting text including a plurality of sentences according to the first embodiment of the present invention.

[0043] As shown in Figure 4, in extraction step 410, multiple pieces of evidence and multiple inferences are extracted from the text.

[0044] In some examples, evidence and inferences may be entities or named entities.

[0045] In one embodiment, the extracting step 410 may include identifying evidence and/or inferences from the text according to a predefined vocabulary. This identification can be performed by any suitable method known in the art. For example, a vocabulary can be predefined by users, or through experimentation, based on what is discussed in the text. A vocabulary may include all entities, or the common entities, of evidence and/or inferences that may exist in such a text. Evidence and/or inferences can be identified from text by, for example, searching and matching entiti...
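The vocabulary-based identification described in [0045] could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patent's implementation: the vocabularies, the sentence splitter, and the function name `extract_terms` are all hypothetical, and the matching is plain whole-word lookup.

```python
import re

# Hypothetical vocabularies for illustration; the source only says they are predefined.
EVIDENCE_VOCAB = {"nodule", "opacity", "effusion"}
INFERENCE_VOCAB = {"pneumonia", "tumor"}


def split_sentences(text):
    """Naive sentence splitter (an assumption): break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_terms(text, vocabulary):
    """Return (sentence_index, term) pairs for every vocabulary term found."""
    hits = []
    for index, sentence in enumerate(split_sentences(text)):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        # Sort matches so the output order is deterministic.
        for term in sorted(vocabulary & words):
            hits.append((index, term))
    return hits


report = (
    "A nodule is seen in the left lung. "
    "There is a small effusion. "
    "Findings suggest pneumonia."
)
evidence_hits = extract_terms(report, EVIDENCE_VOCAB)
inference_hits = extract_terms(report, INFERENCE_VOCAB)
```

Recording the sentence index with each match keeps the link between an extracted term and the sentence boundaries that the later segmentation step works on.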

Second embodiment

[0100] This embodiment involves applying the text segmentation method of the first embodiment to display text in a better manner.

[0101] Figure 12 is a flowchart illustrating a method for displaying text according to a second embodiment of the present invention.

[0102] As shown in Figure 12, first, in step 1210, the text is segmented into multiple segments by using the text segmentation method of the first embodiment.

[0103] Then, in step 1220, the segmented segments are displayed by associating each segment with an inference.

[0104] Take the medical imaging report shown in Figure 1 as an example of text to be segmented and displayed. As discussed above, this report can be split into five segments, as shown in Figure 10.

[0105] Each segment is then associated with an inference, and the text is displayed using multiple pages, each page having a label describing the corresponding inference. In pages with the Inferences tab, the findings and diagnoses in the...
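As a rough picture of step 1220, the sketch below groups already-segmented text into labeled pages, one page per inference. It assumes each segment arrives as an `(inference, text)` pair; the function names `build_pages` and `render_pages` and the sample segments are hypothetical, and the plain-text rendering stands in for whatever display the embodiment actually uses.

```python
def build_pages(segments):
    """Group (inference, text) pairs into pages keyed by inference label.

    Pages keep the order in which each inference first appears.
    """
    pages = {}
    for inference, text in segments:
        pages.setdefault(inference, []).append(text)
    return pages


def render_pages(pages):
    """Render each page as a label line followed by its indented segments."""
    lines = []
    for label, texts in pages.items():
        lines.append(f"[{label}]")
        lines.extend(f"  {text}" for text in texts)
    return "\n".join(lines)


# Hypothetical segments, as if produced by step 1210.
segments = [
    ("pneumonia", "Patchy opacity in the right lower lobe."),
    ("effusion", "Small left pleural effusion."),
    ("pneumonia", "Findings are consistent with pneumonia."),
]
pages = build_pages(segments)
rendered = render_pages(pages)
```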

Third embodiment

[0110] This embodiment involves applying the text segmentation method of the first embodiment to link text across multiple documents.

[0111] Figure 15 is a flowchart showing a method for linking texts according to the third embodiment of the present invention.

[0112] As shown in Figure 15, first, in step 1510, each of the texts is segmented into multiple segments by using the text segmentation method of the first embodiment.

[0113] Then, in step 1520, each segment is associated with an inference.

[0114] Then, in step 1530, segments associated with the same inference are linked together. Linking can be accomplished by any technique known in the art. For example, linking across documents can be implemented based on tags.
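Tag-based linking as in steps 1520 and 1530 might look like the following minimal sketch. It assumes the inference label itself serves as the tag and that each document is already a list of `(inference, text)` segments; `link_segments` and the sample document IDs are hypothetical names, not taken from the patent.

```python
from collections import defaultdict


def link_segments(documents):
    """Link segments that share an inference across documents.

    documents: {doc_id: [(inference, segment_text), ...]}
    Returns {inference: [(doc_id, segment_index), ...]}.
    """
    links = defaultdict(list)
    for doc_id, segments in documents.items():
        for index, (inference, _text) in enumerate(segments):
            # The inference label acts as the tag that ties segments together.
            links[inference].append((doc_id, index))
    return dict(links)


# Hypothetical reports, each already segmented and tagged.
documents = {
    "report_2016": [("pneumonia", "Right lower lobe opacity."),
                    ("effusion", "No pleural effusion.")],
    "report_2017": [("effusion", "Small left effusion."),
                    ("pneumonia", "Opacity has resolved.")],
}
links = link_segments(documents)
```

Looking up `links["pneumonia"]` then gives every segment about that inference, regardless of which document it came from.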

[0115] This embodiment links text fragments of the same inference across documents. In one example, multiple text fragments in multiple radiology reports for the same patient are linked together if they relate to the same physiological disorde...

Abstract

The invention provides a method and equipment for segmenting text. A method for segmenting a text containing a plurality of sentences comprises the following steps: extracting a plurality of pieces of evidence and a plurality of inferences from the text; for each inference, determining a preferential position for each piece of evidence on the basis of the text and/or a segmentation history, wherein the preferential position indicates the most likely position of that evidence in the sequence of evidence used to make the inference; and, on the basis of the preferential positions, selecting one or more of the boundaries between consecutive sentences in the text as fragment boundaries, so as to segment the text into a plurality of fragments. With the method and equipment provided by the invention, segmentation is more accurate.
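To make the notion of a preferential position concrete, here is a small sketch that estimates it from a segmentation history alone. The abstract does not say how the position is computed, so everything here is an assumption: the history is modeled as `(inference, evidence_sequence)` records, the preferential position is taken to be the index at which an evidence term most often appeared, and ties go to the smaller index. The function name `preferential_positions` is hypothetical.

```python
from collections import Counter, defaultdict


def preferential_positions(history):
    """Estimate each evidence term's most frequent position per inference.

    history: list of (inference, [evidence, ...]) records from earlier
    segmentations (an assumed data layout).
    Returns {inference: {evidence: position}}.
    """
    counts = defaultdict(lambda: defaultdict(Counter))
    for inference, sequence in history:
        for position, evidence in enumerate(sequence):
            counts[inference][evidence][position] += 1

    result = {}
    for inference, per_evidence in counts.items():
        result[inference] = {}
        for evidence, counter in per_evidence.items():
            # Highest count wins; ties are broken by the smaller position.
            best = min(counter, key=lambda p: (-counter[p], p))
            result[inference][evidence] = best
    return result


# Hypothetical history: evidence order observed in three earlier reports.
history = [
    ("pneumonia", ["opacity", "effusion"]),
    ("pneumonia", ["opacity", "effusion"]),
    ("pneumonia", ["effusion", "opacity"]),
]
positions = preferential_positions(history)
```

The sketch stops at position estimation; how those positions are turned into fragment boundaries is not described in the visible text, so no boundary rule is shown.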

Description

Technical field

[0001] The present invention relates to methods and devices for segmenting text, and in particular to methods and devices for segmenting text into parts according to topics.

Background

[0002] In the prior art, several methods for segmenting text into segments have been proposed. For example, US application publication US2014/0052753A1 (METHOD, DEVICE AND SYSTEM FOR PROCESSING PUBLIC OPINION TOPICS) discloses a method for determining whether a topic of public opinion meets an alarm condition, which includes using lexical features (such as concepts) to segment text.

[0003] However, those prior-art methods have some disadvantages, such as low accuracy. The reason for the low accuracy may be that the mapping between segmented text fragments and concepts is sometimes inconsistent. For example, in the case of segmenting medical imaging reports, such as radiology reports, physicians often write more than one diagnosis for a body part in...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
CPC: G06F40/205, G06F16/355, G06N5/045, G06N20/00, G06F40/295
Inventor: 黄耀海, 胡钦谙, 郭瑞山
Owner: CANON KK