Visual document atlas construction method

A document graph construction method applied in the field of knowledge graphs. It addresses the problem that key information in unstructured text is hard to access, and lets readers grasp a document efficiently by reducing its complexity.

Pending Publication Date: 2020-08-28
SHANGHAI DATATOM INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

One common approach is to produce a text summary, that is, to condense the original text and state its main content concisely. However, this natural-language form of expression does not let readers obtain the information they need intuitively and clearly.


Examples


Embodiment 1

[0029] Step S1: preprocess the input text:

[0030] S11: Extract keywords from the input text using the TextRank algorithm. Only the nouns in the text take part in the TextRank weight calculation; a threshold is then set to filter the candidate keywords.

[0031] The specific process is as follows: let the threshold be T. If a word's TextRank weight is greater than T, the word is kept as a keyword; otherwise it is filtered out.
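The noun filter and threshold test in S11 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the text has already been segmented and tagged as (word, tag) pairs in which noun tags start with "n", and the function name textrank_keywords, the co-occurrence window, the damping factor, and the iteration count are all hypothetical choices.

```python
from collections import defaultdict


def textrank_keywords(tagged_words, threshold, window=2, damping=0.85, iterations=50):
    """Return {word: weight} for nouns whose TextRank weight exceeds `threshold`.

    `tagged_words` is a list of (word, pos_tag) pairs. Treating any tag that
    starts with "n" as a noun is an assumption of this sketch.
    """
    # Only nouns participate in the weight calculation.
    nouns = [word for word, tag in tagged_words if tag.startswith("n")]

    # Undirected co-occurrence graph: nouns within `window` positions are linked.
    neighbors = defaultdict(set)
    for i, word in enumerate(nouns):
        for j in range(i + 1, min(i + window + 1, len(nouns))):
            if nouns[j] != word:
                neighbors[word].add(nouns[j])
                neighbors[nouns[j]].add(word)

    # Iterative TextRank update on the unweighted graph.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        updated = {}
        for word in neighbors:
            rank = sum(scores[v] / len(neighbors[v]) for v in neighbors[word])
            updated[word] = (1 - damping) + damping * rank
        scores = updated

    # Keep a word only if its weight is strictly greater than the threshold T.
    return {word: score for word, score in scores.items() if score > threshold}
```

A noun that never co-occurs with a different noun gets no node in this sketch and is therefore never returned; a production version would need to decide how to score such isolated words.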

[0032] S12: Split the input text into sentences. Sentence splitting is needed because the subsequent dependency parsing operates on one sentence at a time. The text is split at punctuation marks such as commas, full stops, exclamation marks, and question marks.
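The punctuation-based splitting in S12 might look like the sketch below. The function name split_sentences and the exact delimiter set (ASCII and full-width commas, full stops, exclamation marks, and question marks) are assumptions; the source lists these punctuation classes only as examples.

```python
import re

# Delimiters assumed for this sketch: ASCII and full-width forms of the
# comma, full stop, exclamation mark, and question mark.
_DELIMITERS = re.compile(r"[,，.。!！?？]+")


def split_sentences(text):
    """Split `text` into single-sentence units at the delimiters above."""
    return [part.strip() for part in _DELIMITERS.split(text) if part.strip()]
```

Note that splitting on the ASCII full stop also breaks decimal numbers and abbreviations such as "3.14" or "e.g."; text containing them would need extra handling before this step.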

[0033] S13: Perform word segmentation, part-of-speech tagging, named entity recognition, and dependency parsing on each single sentence: named entity recognition extracts words with specifi...


Abstract

The invention discloses a visual document graph construction method comprising the following steps. S1: extract keywords from the input text, split the input text into sentences, perform word segmentation on each single sentence produced by the splitting, and then perform part-of-speech tagging, named entity recognition, and dependency parsing on the resulting words. S2: formulate relation extraction rules and, based on these rules, extract triple data from each single sentence obtained in S1, where each triple consists of two entity words and one relation word. S3: import the triple data obtained in S2 into a graph database to form a document graph. S4: perform graph mining operations on the data in the graph database and visualize the document graph based on the mining results. The method extracts the key information of a text and maps the document into a visual graph based on semantic associations, helping the user grasp the semantic content of the article efficiently.
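Steps S3 and S4 can be illustrated with a small in-memory stand-in. The abstract does not name the graph database or the mining operations, so everything here is an assumption: a plain adjacency dictionary replaces the graph database, degree counting stands in for graph mining, and the names build_document_graph and degree_ranking are hypothetical.

```python
from collections import defaultdict


def build_document_graph(triples):
    """Load (entity, relation, entity) triples into an adjacency dictionary.

    The dictionary is only a stand-in for the graph database in S3.
    """
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
        graph[tail]  # touch the key so tail-only entities also get a node
    return dict(graph)


def degree_ranking(graph):
    """Rank entities by total degree, one simple example of a mining step."""
    degree = defaultdict(int)
    for head, edges in graph.items():
        degree[head] += 0  # make sure isolated nodes appear in the ranking
        for _relation, tail in edges:
            degree[head] += 1
            degree[tail] += 1
    # Highest degree first; ties are broken alphabetically for stable output.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))
```

Entities with the highest degree are natural candidates for the visually prominent nodes when the graph is drawn.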

Description

Technical field

[0001] The invention belongs to the technical field of knowledge graphs, and in particular relates to a method for constructing a visual document graph.

Background

[0002] In the era of the Internet information explosion, it is difficult for people to quickly and accurately obtain the main content of documents from unstructured text. For long documents such as reports and papers, information overload is even more serious. It is therefore necessary to apply a kind of "dimension reduction" to such texts. One common approach is to produce a text summary, that is, to condense the original text and state its main content concisely. However, this natural-language form of expression does not let readers obtain the information they need intuitively and clearly. In 2012, Google proposed the concept of the knowledge graph. A knowledge graph is essentially a ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/36, G06F40/211, G06F40/295, G06F40/30
CPC: G06F40/211, G06F40/295, G06F40/30, G06F16/367
Inventors: 许青青, 谢赟, 吴新野, 韩欣
Owner: SHANGHAI DATATOM INFORMATION TECH CO LTD