Method and system for scientific and technical literature retrieval

A scientific and technical literature retrieval technology, applied in the fields of information retrieval and data mining. It addresses the problems of existing methods (ignoring language habits, ignoring semantic relevance, and low retrieval efficiency for scientific and technological literature) and achieves the effect of improving retrieval efficiency.

Inactive Publication Date: 2014-11-26
NORTHEAST DIANLI UNIVERSITY
Cites: 3. Cited by: 5.

AI Technical Summary

Problems solved by technology

Retrieval methods based on statistical word frequency perform statistics and matching mechanically. They ignore the actual meaning of each word and the semantic correlation between words, and they do not take the language habits of individual languages into account, so the retrieval results are not ideal.
In scientific and technological literature in particular, common theory and method terms appear frequently, but these high-frequency terms do not characterize a document well, so retrieval of such literature based on statistical word frequency is inefficient.

Examples

Embodiment Construction

[0043] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below through specific embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

[0044] As mentioned in the background technology section, scientific and technological documents differ from web pages, microblogs, news, and the like. They are structured documents and may contain general theory and method terms that are common to many fields, so high-frequency words may not represent scientific literature well. Through extensive research and practice, the inventor has found that the title of a scientific or technological document is a high-level summary of its content. The efficiency of retrieving the ti...


Abstract

The invention provides a determiner-based method for retrieving Chinese scientific and technical literature. The method comprises the following steps. First, the semantic relevancy between the feature vector of the search entry and the feature vector of each scientific and technical literature title in a data set is calculated. Then, based on the Chinese character 'de' ('of'), the search entry is analyzed so that the literature titles whose prefix is the same as that of the search entry are found, and the semantic relevancy of those titles is corrected. Finally, the documents whose titles have the highest semantic relevancy are returned as the retrieval results. Because the method combines the semantic relevancy between the search entry and the literature titles with the relations between words in Chinese grammar, it improves the retrieval efficiency for Chinese scientific and technical literature.
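The three steps in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say how feature vectors are built or how relevancy is corrected, so the character-count vectors, the cosine measure, the multiplicative `boost`, and all function names here are hypothetical.

```python
import math
from collections import Counter

DE = "的"  # the Chinese particle "de" ("of")


def title_vector(text):
    """Hypothetical feature vector: character counts, skipping the particle."""
    return Counter(ch for ch in text if ch != DE and not ch.isspace())


def cosine(a, b):
    """Cosine similarity, used here as a stand-in for semantic relevancy."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def prefix_before_de(text):
    """Return the text before the first "de", or None if there is no "de"."""
    head, sep, _ = text.partition(DE)
    return head if sep else None


def retrieve(query, titles, top_k=3, boost=1.5):
    """Score titles, correct same-prefix titles, and return the top_k titles."""
    q_vec = title_vector(query)
    q_prefix = prefix_before_de(query)
    scored = []
    for title in titles:
        score = cosine(q_vec, title_vector(title))           # step 1: relevancy
        if q_prefix is not None and prefix_before_de(title) == q_prefix:
            score *= boost                                    # step 2: assumed correction
        scored.append((score, title))
    scored.sort(key=lambda pair: pair[0], reverse=True)       # step 3: rank
    return [title for _, title in scored[:top_k]]
```

With a larger `boost`, a title that shares the prefix before 'de' with the search entry can outrank a title that merely shares more characters, which is the effect the correction step is meant to have.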

Description

Technical field

[0001] The present invention relates to the fields of information retrieval and data mining, and in particular to a retrieval method for scientific and technological documents.

Background technique

[0002] With the development of information technology and computers, the number of electronic documents has grown at an unprecedented rate, and electronic documents are gradually replacing traditional paper publications. Electronic document retrieval has become an effective way to obtain information.

[0003] Existing electronic document retrieval methods are usually based on statistical word frequency. When search keywords are entered, the search results are sorted by how often the keywords appear in each electronic file. These methods all perform statistics and matching mechanically, ignore the actual meaning of the word itself and the semantic relevance between words, and they do not combine the languag...
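For comparison, the word-frequency ranking that paragraph [0003] attributes to existing methods can be sketched in a few lines. This is a minimal illustration, assuming documents are held as plain strings in a dictionary; the function name and data layout are hypothetical.

```python
def rank_by_keyword_frequency(keyword, documents):
    """Sort document names by how often the keyword occurs, most frequent first."""
    counts = {name: text.count(keyword) for name, text in documents.items()}
    return sorted(counts, key=counts.get, reverse=True)
```

A document that discusses the same topic with different wording gets a count of zero under this scheme, which is the purely mechanical matching the background section criticizes.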


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30, G06F17/27
CPC: G06F16/3344, G06F40/30
Inventor: 郭晓利, 曲朝阳, 潘峰, 娄建楼, 孙慧宇
Owner: NORTHEAST DIANLI UNIVERSITY