Method and device for detecting key paragraph in text

A method and device for detecting key paragraphs in a text. They address inaccurate relationship extraction and long distances between entities in long texts, reducing computational complexity and difficulty while improving accuracy.

Active Publication Date: 2019-12-03
苏州美能华智能科技有限公司
Cites: 5; Cited by: 14
AI Technical Summary

Problems solved by technology

[0005] When related technologies extract key information from long texts, relationship extraction is inaccurate, either because a large number of paragraphs contain no entity information or because the entities in a long text are far apart. To solve this problem, this application provides a method and device for detecting key paragraphs in a text.

Examples

Embodiment Construction

[0067] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with aspects of the present application as recited in the appended claims.

[0068] In order to facilitate the understanding of this application, some terms involved in this application are explained below.

[0069] PDF: Portable Document Format, a common electronic file format.

[0070] OCR: Optical Character Recognition, a recognition technology that converts inf...

Abstract

The invention discloses a method and a device for detecting key paragraphs in a text. The method comprises the following steps: segmenting and copying a to-be-detected text to obtain paragraph sets; inputting each paragraph set into a label prediction model to obtain a label for each paragraph; removing the invalid preset number of words at the head end and the preset number of words at the tail end of each paragraph set, and splicing the labels of the remaining valid text together to obtain the label of each original paragraph corresponding to the paragraph set; and screening out the original paragraphs labeled B or I as key paragraphs. Segmenting the to-be-detected text yields short paragraph sets, and the label prediction model predicts the paragraphs in each set. Segmentation loses preceding-text information at the starting boundary of a paragraph set and following-text information at its ending boundary; to reduce this loss, the boundaries are optimized with an overlapping operation. This improves paragraph prediction accuracy and greatly reduces the computational complexity of paragraph label prediction.
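The steps in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the source: the overlap is counted in whole paragraphs rather than words, the labels follow the usual B/I/O tagging convention, and `toy_predict` is a hypothetical stand-in for the label prediction model, which the abstract does not describe. All function and parameter names (`split_with_overlap`, `label_paragraphs`, `core`, `overlap`) are invented for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical model interface: one label ("B", "I" or "O") per paragraph.
Labeler = Callable[[Sequence[str]], List[str]]


def split_with_overlap(
    paragraphs: Sequence[str], core: int, overlap: int
) -> List[Tuple[List[str], int, int]]:
    """Cut the text into paragraph sets of `core` paragraphs each and copy up
    to `overlap` neighbouring paragraphs onto the head and tail of every set.
    Returns (paragraph_set, copied_at_head, copied_at_tail) triples."""
    sets = []
    total = len(paragraphs)
    for start in range(0, total, core):
        end = min(total, start + core)
        lo = max(0, start - overlap)
        hi = min(total, end + overlap)
        sets.append((list(paragraphs[lo:hi]), start - lo, hi - end))
    return sets


def label_paragraphs(
    paragraphs: Sequence[str], predict: Labeler, core: int = 4, overlap: int = 1
) -> List[str]:
    """Predict labels per paragraph set, drop the labels of the copied head
    and tail paragraphs, and splice the rest into one label per original
    paragraph."""
    labels: List[str] = []
    for paragraph_set, head, tail in split_with_overlap(paragraphs, core, overlap):
        predicted = predict(paragraph_set)
        if len(predicted) != len(paragraph_set):
            raise ValueError("the model must return one label per paragraph")
        labels.extend(predicted[head:len(predicted) - tail])
    return labels


def key_paragraphs(paragraphs: Sequence[str], labels: Sequence[str]) -> List[str]:
    """Keep only the paragraphs labeled B or I."""
    return [p for p, label in zip(paragraphs, labels) if label in ("B", "I")]


def toy_predict(paragraph_set: Sequence[str]) -> List[str]:
    """Stand-in model (an assumption): a paragraph containing "KEY" is B when
    it opens a run of key paragraphs and I when it continues one."""
    labels, previous_is_key = [], False
    for paragraph in paragraph_set:
        is_key = "KEY" in paragraph
        labels.append(("I" if previous_is_key else "B") if is_key else "O")
        previous_is_key = is_key
    return labels


sample = ["intro", "KEY a", "KEY b", "filler", "KEY c", "end"]
sample_labels = label_paragraphs(sample, toy_predict, core=2, overlap=1)
sample_keys = key_paragraphs(sample, sample_labels)
```

With `overlap=0` the second paragraph set starts at "KEY b" without seeing "KEY a", so the toy model has no preceding context at that boundary. This is the kind of boundary information loss the overlapping operation is meant to reduce.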

Description

Technical field

[0001] The invention belongs to the technical field of computers, and relates to a method and a device for detecting key paragraphs in a text.

Background technique

[0002] With the continuous development of Internet applications, a large amount of data is generated every day. Many applications need to extract the key information from this data and to structure the extracted information.

[0003] When extracting key information from data, the common practice is to first identify the keys in the text to generate a set of candidate keys, then use relationship extraction to search for possible relationships between similar keys, and finally associate the keys into structured information for storage.

[0004] At present, most research on key extraction from text is based on short texts. When the text data is long text, the set of candidate keys generated by key recognition ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06N3/04
CPC: G06N3/045, Y02D10/00
Inventor: 熊玉竹, 周以晴, 侯绍东
Owner: 苏州美能华智能科技有限公司