Method for establishing mapping knowledge domain based on book catalogue

A knowledge graph and book catalog technology, applied in the field of knowledge graph generation, that addresses the problems of existing knowledge graphs: sparse relationships between nodes, a rigid structure, poor scalability, and few levels.

Active Publication Date: 2014-04-16
ZHEJIANG UNIV
6 Cites, 31 Cited by

AI Technical Summary

Problems solved by technology

Existing knowledge graphs have relatively few levels, sparse relationships between nodes, a relatively fixed structure, and poor scalability, and they are built manually.

Method used



Examples


Embodiment

[0180] The concrete steps of this example are described in detail below, in conjunction with the method of the present invention:

[0181] 1) 10,000 computer books were processed by optical character recognition (OCR). In the digitized catalog structure, the entries were divided by length into long entries and short entries, with 9 Chinese characters as the boundary;

[0182] 2) As shown in Figure 1 and Figure 2, the short entries are used directly as one batch of candidate nodes. The long entries are tagged with the open-source natural language processing tool FudanNLP to obtain a part-of-speech array, and another batch of candidate nodes is then extracted using conjunction, punctuation, and part-of-speech rules;
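Steps 1) and 2) can be illustrated with a minimal Python sketch. It is not the patented implementation: it does not call FudanNLP, and it replaces the part-of-speech rules with a plain delimiter split. The names `LENGTH_BOUNDARY`, `SPLIT_PATTERN`, `count_chinese_chars`, `is_long_entry`, and `candidate_nodes` are hypothetical, the delimiter set is an assumed stand-in for the conjunction and punctuation rules, and treating an entry with more than 9 Chinese characters as long is one assumed reading of the boundary.

```python
import re

# Boundary from the text: 9 Chinese characters. Treating "more than 9" as
# long is an assumption; the source does not say which side 9 falls on.
LENGTH_BOUNDARY = 9

# Hypothetical delimiter set standing in for the conjunction and punctuation
# rules. "以及" is listed first so it is matched before the single character "及".
SPLIT_PATTERN = re.compile(r"以及|[、，,；;/与和及]")


def count_chinese_chars(entry: str) -> int:
    """Count characters in the basic CJK Unified Ideographs block."""
    return sum(1 for ch in entry if "\u4e00" <= ch <= "\u9fff")


def is_long_entry(entry: str) -> bool:
    """Classify a catalog entry as long or short by its Chinese character count."""
    return count_chinese_chars(entry) > LENGTH_BOUNDARY


def candidate_nodes(entries: list[str]) -> list[str]:
    """Short entries become candidates directly; long entries are split first."""
    candidates = []
    for entry in entries:
        if not is_long_entry(entry):
            candidates.append(entry.strip())
            continue
        # Split the long entry on the assumed delimiters and drop empty pieces.
        for piece in SPLIT_PATTERN.split(entry):
            piece = piece.strip()
            if piece:
                candidates.append(piece)
    return candidates


if __name__ == "__main__":
    catalog = ["操作系统", "进程的创建、撤销与阻塞以及线程的基本概念"]
    print(candidate_nodes(catalog))
```

A plain character split like this would wrongly cut words that merely contain a conjunction character (for example 和 inside 和谐), which is presumably why the method relies on part-of-speech tags rather than on delimiters alone.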

[0183] Long entries undergo word segmentation, part-of-speech tagging, and splitting and merging of words, as follows:

[0184] Use natural language processing ...


Abstract

The invention discloses a method for establishing a mapping knowledge domain (knowledge graph) based on book catalogues. The method comprises the following steps: catalogue pages are extracted from digitized books, and the catalogue items are divided into long items and short items by length; the long items are part-of-speech tagged with a natural language processing tool to obtain part-of-speech arrays, and candidate nodes are extracted according to conjunction, punctuation, and part-of-speech rules; the long items and the short items are verified against Baidu Encyclopedia and Hudong Encyclopedia; hierarchical (parent-child) relations and parallel relations are derived from the catalogue structure and serve as the framework of the mapping knowledge domain; the parallel relations are divided into strong and weak ones, each of which serves as an increment that supplements the hierarchical relations; a suffix-based noisy-data mining algorithm selects nodes from the items that failed encyclopedia verification, and these nodes are added to the mapping knowledge domain; finally, the weights of the relations in the supplemented mapping knowledge domain are calculated and ranked so that noise can be screened out. Compared with existing mapping knowledge domains, the one established by this method has richer nodes, better expandability, and higher accuracy.
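The framework step in the abstract, where hierarchical and parallel relations are read off the catalogue structure, can be sketched as follows. This is an illustrative sketch only: the input format (a list of `(level, title)` pairs in catalogue order, with level 1 as the top) and the function name `build_skeleton` are assumptions, and the sketch does not distinguish strong from weak parallel relations or compute relation weights, since the abstract does not give those rules.

```python
from collections import defaultdict


def build_skeleton(entries):
    """Derive parent-child and parallel (sibling) relations from a catalogue.

    entries: list of (level, title) pairs in catalogue order (assumed format).
    Returns (parent_child, parallel), each a list of (title, title) pairs.
    """
    parent_child = []
    children = defaultdict(list)  # parent title (or None) -> child titles
    stack = []                    # ancestors of the current entry

    for level, title in entries:
        # Pop entries at the same or a deeper level; what remains on top
        # of the stack is the nearest ancestor, i.e. the parent.
        while stack and stack[-1][0] >= level:
            stack.pop()
        parent = stack[-1][1] if stack else None
        if parent is not None:
            parent_child.append((parent, title))
        children[parent].append(title)
        stack.append((level, title))

    # Entries that share a parent are treated as parallel to each other.
    parallel = []
    for siblings in children.values():
        for i in range(len(siblings)):
            for j in range(i + 1, len(siblings)):
                parallel.append((siblings[i], siblings[j]))
    return parent_child, parallel


if __name__ == "__main__":
    toc = [
        (1, "操作系统"),
        (2, "进程管理"),
        (3, "进程调度"),
        (3, "进程同步"),
        (2, "内存管理"),
    ]
    hierarchy, siblings = build_skeleton(toc)
    print(hierarchy)
    print(siblings)
```

Keying siblings by parent title keeps the sketch short, but it merges identically titled parents; a fuller version would key on the position of each entry within its catalogue.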

Description

Technical field

[0001] The present invention relates to generating knowledge graphs by means of artificial intelligence, data mining, and related computational methods, and in particular to a method for constructing a knowledge graph based on book catalogs.

Background technique

[0002] With the rapid development and popularization of computers, people need to obtain information and learn knowledge more conveniently and clearly, and to analyze and mine the relationships between pieces of knowledge and how they evolve. This creates a growing need for a knowledge graph that is rich in content, highly accurate, and easy to expand, and how to construct such a knowledge graph has become a hot research topic.

[0003] Current Chinese knowledge graphs include HowNet, the Hudong Encyclopedia (interactive encyclopedia) knowledge tree, and the CNKI classification, but each of them has its own limitations and problems.

[0004] HowNet was developed by Mr. Dong Zhendong of the Chinese Academy of Sciences. It describes the conc...

Claims


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/9017, G06F16/90328, G06F16/90332
Inventor: 鲁伟明, 张萌, 魏宝刚, 庄越挺
Owner: ZHEJIANG UNIV