Keyword weight calculation method based on position of word

A keyword weight calculation method, applied in the fields of calculation, electrical digital data processing, and special data processing applications. It addresses the problem that factors such as word position are not considered when keyword weights are calculated, and it facilitates keyword analysis as well as readers' understanding and memory of article content.

Inactive Publication Date: 2016-03-23
WUHAN UNIV OF TECH
2 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

[0004] At present, keyword weights are mostly calculated from word frequency; the position of a word is not considered as an influence factor in the weight calculation.



Embodiment Construction

[0033] The invention is further described below with reference to a specific embodiment:

[0034] As shown in Figure 1, the keyword weight calculation method based on word position includes the following steps:

[0035] Document preprocessing: If the provided document is in a format such as PDF, it must first be preprocessed to obtain its text information. A suitable text-parsing tool can be used to parse the pages of the PDF document, yielding the data for every page. Table-of-contents pages and page paragraphs are then identified from table-of-contents and paragraph features, and the data are stored in an organized form for later processing steps such as word segmentation.
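The identification and storage step in [0035] might be sketched as follows. This is only an illustration under stated assumptions: the page text is assumed to have been extracted already by some PDF tool, and the names `TOC_LINE`, `is_toc_page`, `split_paragraphs`, and `preprocess`, the dot-leader heuristic, and the `min_ratio` threshold are all hypothetical and not taken from the patent.

```python
import re

# Assumed heuristic (not from the patent): a table-of-contents line
# ends with dot leaders followed by a page number, e.g. "Chapter 1 ...... 3".
TOC_LINE = re.compile(r"\.{3,}\s*\d+\s*$")


def is_toc_page(page_text, min_ratio=0.5):
    """Return True if at least min_ratio of the non-empty lines look like TOC entries."""
    lines = [ln for ln in page_text.splitlines() if ln.strip()]
    if not lines:
        return False
    hits = sum(1 for ln in lines if TOC_LINE.search(ln))
    return hits / len(lines) >= min_ratio


def split_paragraphs(page_text):
    """Split a page on blank lines and collapse whitespace inside each paragraph."""
    blocks = re.split(r"\n\s*\n", page_text)
    return [" ".join(block.split()) for block in blocks if block.strip()]


def preprocess(pages):
    """Store TOC page numbers and per-page paragraphs for later steps."""
    store = {"toc_pages": [], "paragraphs": {}}
    for number, text in enumerate(pages, start=1):
        if is_toc_page(text):
            store["toc_pages"].append(number)
        else:
            store["paragraphs"][number] = split_paragraphs(text)
    return store
```

Keeping paragraphs keyed by page number preserves the page and paragraph position of every piece of text, which the later position-based weighting would need.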

[0036] Keyword extraction: Keywords are extracted from the text information obtained by document preprocessing. By comparison against the existing keyword table, keywords are extracted from each paragraph of each page of the document in units of...
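A minimal sketch of matching paragraphs against an existing keyword table is given below. It assumes single-word keywords and a simple regular-expression tokenizer; a real system for Chinese text would need a proper word segmenter. The extraction unit is truncated in the text above, so this sketch simply works paragraph by paragraph. The function names and the shape of the result are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_keywords(paragraphs_by_page, keyword_table):
    """Return one record per paragraph that contains at least one table keyword."""
    table = {keyword.lower() for keyword in keyword_table}
    hits = []
    for page, paragraphs in sorted(paragraphs_by_page.items()):
        for index, paragraph in enumerate(paragraphs):
            counts = Counter(tok for tok in tokenize(paragraph) if tok in table)
            if counts:
                hits.append(
                    {"page": page, "paragraph": index, "counts": dict(counts)}
                )
    return hits
```

Each record keeps the page number and paragraph index alongside the counts, so both frequency and position remain available to the weighting step.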



Abstract

The present invention provides a keyword weight calculation method based on the position of a word. The method comprises the following steps: document preprocessing, in which the provided document is preprocessed to obtain text information; keyword extraction, in which keywords are extracted from the text information obtained by preprocessing; influence factor acquisition, in which the weight factors of the extracted keywords are acquired, on the one hand by acquiring a basic weight and on the other hand by identifying words in the first sentence of an abstract and an article chapter; weighted calculation, in which the acquired influence factors are used as weight calculation factors to perform the final weight calculation; and keyword weight table output, in which the final keyword weight table is output. With this method, the weight parameter of a word can be calculated accurately, which facilitates keyword analysis and helps readers understand and memorize the content of articles.
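The weighted calculation step can be illustrated with a small sketch. The patent's actual formula does not appear in this excerpt, so everything here is an assumption: the basic weight is taken to be relative word frequency, and the position factors are applied as hypothetical additive bonuses (`abstract_bonus`, `first_sentence_bonus`) to a multiplier. The function name `keyword_weights` is also hypothetical.

```python
from collections import Counter


def keyword_weights(tokens, abstract_words, first_sentence_words,
                    abstract_bonus=0.5, first_sentence_bonus=0.3):
    """Return (word, weight) rows sorted by descending weight.

    Assumed formula (not the patent's):
        weight = relative_frequency * (1 + position bonuses)
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return []
    table = []
    for word, count in counts.items():
        base = count / total          # basic weight: relative frequency
        factor = 1.0                  # position multiplier
        if word in abstract_words:
            factor += abstract_bonus
        if word in first_sentence_words:
            factor += first_sentence_bonus
        table.append((word, base * factor))
    # Highest weight first; ties broken alphabetically for a stable table.
    table.sort(key=lambda row: (-row[1], row[0]))
    return table
```

With these assumed bonuses, a word that occurs less often but appears in prominent positions can approach or overtake a more frequent word, which is the effect the position factor is meant to capture.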

Description

Technical field

[0001] The invention relates to the technical field of digital publications, in particular to a method for calculating keyword weights based on word positions.

Background technique

[0002] A keyword is a word or phrase that a user enters when searching and that summarizes, to the greatest extent, the information the user is looking for; it is a generalization and condensation of information. In the publishing industry, keywords usually refer to the core and main content of an article.

[0003] At present, in published articles, the position of a sentence in the article can reflect the importance of that sentence. Similarly, the position of a word in the article can reflect the importance of that word. In many cases, important words appear in the abstract or in the first sentence of a paragraph, so the position of a word can be used as a factor in the weight calculation.

[0004] At present, m...
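The observation in [0003] suggests collecting two sets of position-marked words: those in the abstract and those in the first sentence of each paragraph. The sketch below shows one way to do this. It assumes that a sentence ends at the first `.`, `!`, `?` or their full-width equivalents, which mishandles abbreviations and decimals, and that words can be found with a simple regular expression. The function names are hypothetical.

```python
import re

# Assumed sentence terminators, including full-width CJK punctuation.
SENTENCE_END = re.compile(r"[.!?。！？]")


def first_sentence(paragraph):
    """Return the text up to and including the first sentence terminator."""
    text = paragraph.strip()
    match = SENTENCE_END.search(text)
    return text if match is None else text[: match.end()]


def words(text):
    """Return the set of lowercased word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def position_word_sets(abstract, paragraphs):
    """Return (abstract words, words in the first sentence of each paragraph)."""
    abstract_words = words(abstract)
    first_sentence_words = set()
    for paragraph in paragraphs:
        first_sentence_words |= words(first_sentence(paragraph))
    return abstract_words, first_sentence_words
```

The two sets can then be used as membership tests when a position factor is applied to each keyword's basic weight.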


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 刘永坚, 白立华, 杨朝阳, 李文忠, 杨慧, 朱驰风
Owner: WUHAN UNIV OF TECH