Text keyword extraction method, apparatus and device, and storage medium

A keyword extraction technique, applied in the field of text keyword extraction, that addresses the problem of inaccurate document keyword extraction and improves the accuracy of both keyword extraction and document similarity measurement

Inactive Publication Date: 2018-09-21
GCI SCI & TECH +1
8 Cites, 34 Cited by

AI Technical Summary

Problems solved by technology

[0004] In view of the above problems, the purpose of the present invention is to provide a method, apparatus, device and storage medium for extracting text keywords that solve the problem of inaccurate document keyword extraction, so that the similarity between different documents can be measured more accurately.

Examples

Second embodiment

[0078] On the basis of the first embodiment, the at least two texts to be matched include a first text and a second text; after step S50, the method further includes:

[0079] S60. Acquire the keywords of the first text and the keywords of the second text, and use regular expression matching to generate a character string;

[0080] S70. Generate a first vector matrix according to the weight of each keyword in the first text and the character string;

[0081] S80. Generate a second vector matrix according to the weight of each keyword in the second text and the character string;

[0082] S90. Calculate the similarity between the first text and the second text according to the first vector matrix and the second vector matrix.
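Steps S60 to S90 can be illustrated with a minimal Python sketch. It is one possible reading, not the patented implementation: the source does not say how the character string is built or which similarity measure is used. The sketch assumes the string is the de-duplicated keywords of both texts joined by a separator, that each "vector matrix" is a plain weight vector aligned to that string, and that similarity is cosine similarity. The function names (`build_keyword_string`, `weight_vector`, `cosine_similarity`) and the `|` separator are hypothetical.

```python
import math
import re


def build_keyword_string(keywords_a, keywords_b, sep="|"):
    """S60 (assumed reading): join the de-duplicated keywords of both texts.

    Assumes no keyword contains the separator character.
    """
    seen = []
    for kw in list(keywords_a) + list(keywords_b):
        if kw not in seen:
            seen.append(kw)
    return sep.join(seen)


def weight_vector(weights, keyword_string, sep="|"):
    """S70/S80 (assumed reading): one weight per term of the shared string.

    A keyword absent from this text contributes a weight of 0.0.
    """
    terms = re.split(re.escape(sep), keyword_string)
    return [weights.get(term, 0.0) for term in terms]


def cosine_similarity(u, v):
    """S90 (assumed measure): cosine of the angle between two weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical keyword weights for two texts.
first = {"bridge": 0.6, "steel": 0.4}
second = {"bridge": 0.5, "concrete": 0.5}

shared = build_keyword_string(first, second)
similarity = cosine_similarity(
    weight_vector(first, shared), weight_vector(second, shared)
)
```

Aligning both vectors to the same shared string is what makes them comparable: each position refers to the same keyword in both texts.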

[0083] In this embodiment, keywords of the first text and the second text are extracted from all documents to be matched; for example, keywords of the tender text and keywords of the bid text are obtained. In this embodiment, regular expressions have t...

Abstract

The invention discloses a text keyword extraction method. The method comprises the steps of: performing word segmentation on at least two to-be-matched texts to obtain at least one segmented word corresponding to each text; calculating, according to a preset value assignment rule, a word frequency score, a part-of-speech score and a position score of each segmented word in each text; calculating a comprehensive weight value of each segmented word in each text from its word frequency score, part-of-speech score and position score; calculating a weight of each segmented word in each text from its comprehensive weight value; and extracting the keywords of each text according to the weight of each segmented word in that text. The invention further discloses a text keyword extraction apparatus and device, and a storage medium. The method addresses the problem of inaccurate document keyword extraction, so that the similarity between different documents is measured more accurately.
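The scoring pipeline in the abstract can be sketched in Python as follows. This is a hedged illustration only: the abstract does not disclose the preset value assignment rule, how the three scores are combined, or how the final weight is derived from the comprehensive weight value. The sketch assumes pre-segmented input, a relative-frequency score, lookup tables for part-of-speech and position scores, a linear combination with coefficients `alpha`, `beta` and `gamma`, and normalization so the weights in a text sum to 1. All names and numeric values (`Token`, `POS_SCORE`, `ZONE_SCORE`, `keyword_weights`, `extract_keywords`) are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    pos: str   # part-of-speech tag, e.g. "n" (noun) or "v" (verb)
    zone: str  # where the token occurs: "title", "lead" or "body"


# Hypothetical assignment rules; the source does not give the actual values.
POS_SCORE = {"n": 1.0, "v": 0.6}
DEFAULT_POS_SCORE = 0.2
ZONE_SCORE = {"title": 1.0, "lead": 0.7, "body": 0.3}


def keyword_weights(tokens, alpha=0.5, beta=0.3, gamma=0.2):
    """Return a normalized weight for each distinct word in one text."""
    if not tokens:
        return {}
    counts = Counter(t.word for t in tokens)
    max_count = max(counts.values())

    # For each word keep the best part-of-speech and position score seen.
    pos_score, zone_score = {}, {}
    for t in tokens:
        p = POS_SCORE.get(t.pos, DEFAULT_POS_SCORE)
        z = ZONE_SCORE.get(t.zone, 0.0)
        pos_score[t.word] = max(pos_score.get(t.word, 0.0), p)
        zone_score[t.word] = max(zone_score.get(t.word, 0.0), z)

    # Comprehensive weight value: assumed linear combination of the scores.
    composite = {
        w: alpha * counts[w] / max_count
        + beta * pos_score[w]
        + gamma * zone_score[w]
        for w in counts
    }

    # Final weight: assumed normalization over all words of the text.
    total = sum(composite.values())
    return {w: c / total for w, c in composite.items()}


def extract_keywords(tokens, top_k=3):
    """Return the top_k (word, weight) pairs, highest weight first."""
    weights = keyword_weights(tokens)
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]
```

In practice the tokens would come from a word segmenter and part-of-speech tagger applied to each to-be-matched text; the sketch takes them as given so that it stays self-contained.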

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a method, apparatus, device and storage medium for extracting text keywords.

Background

[0002] With the development of science and technology and the maturing of the legal system, engineering projects of a certain scale now need to select suitable companies or units through bidding, and companies or units participating in bidding need to prepare their bids well to improve their competitiveness. Research on the matching degree of bidding documents has therefore become an important direction for the value-added business of bidding intermediaries. Document matching presupposes the extraction of document keywords, which is an important topic in natural language processing.

[0003] However, the inventor found in the process of implementing the present invention that in the prior art, when measuring the similarity between differe...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F17/27
CPC: G06F40/289
Inventor: 杜翠凤
Owner: GCI SCI & TECH