Keyword extraction method suitable for word text

A keyword extraction method in the field of text analysis. It addresses the problem that keywords extracted by existing methods are unsatisfactory and fail to meet the requirements of keyword extraction, and it improves extraction accuracy.

Active Publication Date: 2020-07-10
EISOO SOFTWARE
12 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

[0018] For example, for a file named "Winning Beautiful! Dirk Nowitzki assisted the opening of the Dallas Cowboys season" (a sports news item taken from Baidu News), the five keywords extracted by the above method are: Dallas, game, season, Cowboys, team. The keywords of this news item should clearly include "Nowitzki", but "Nowitzki" appears only when 10 keywords are extracted. Keywords extracted in this way are unsatisfactory and do not meet the practical requirements of keyword extraction.



Examples


Embodiment

[0059] As shown in Figure 1, the present invention provides a keyword extraction method applicable to word text, comprising the following steps:

[0060] S1: Obtain the word text and extract its text content;

[0061] S2: Use the TFIDF algorithm and the TextRank algorithm, respectively, to extract a set number of keywords;

[0062] S3: Obtain the text name and the text title, and perform word segmentation on them;

[0063] S4: Construct the text feature vector and input it into the trained keyword extraction model;

[0064] S5: Use the keyword extraction model to screen the keywords extracted by the TextRank algorithm a second time, obtain the final keyword set, and complete the text keyword extraction.
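The candidate generation in S2 and the feature construction in S4 can be sketched as below. This is only an illustration under stated assumptions, not the patented implementation: the source does not say which TF-IDF variant is used, how the TextRank graph is built, or what the feature vector contains. The smoothed IDF, the co-occurrence window, and the four features (TF-IDF score, TextRank score, presence in the text name, presence in the text title) are all hypothetical choices, as are the function names.

```python
import math
from collections import Counter, defaultdict


def tfidf_scores(doc_tokens, corpus):
    """S2 (sketch): TF-IDF score for each distinct token of one document."""
    tf = Counter(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for word, count in tf.items():
        df = sum(1 for other in corpus if word in other)
        # Smoothed IDF; this particular variant is an assumption.
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[word] = count / len(doc_tokens) * idf
    return scores


def textrank_scores(tokens, window=2, damping=0.85, iterations=30):
    """S2 (sketch): unweighted TextRank over a word co-occurrence graph."""
    neighbours = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != word:
                neighbours[word].add(tokens[j])
                neighbours[tokens[j]].add(word)
    rank = {word: 1.0 for word in neighbours}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbour's old rank.
        rank = {
            word: (1 - damping) + damping * sum(
                rank[n] / len(neighbours[n]) for n in neighbours[word]
            )
            for word in neighbours
        }
    return rank


def top_k(scores, k):
    """Return the k highest-scoring words, ties broken alphabetically."""
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ordered[:k]]


def feature_vector(word, tfidf, textrank, name_tokens, title_tokens):
    """S4 (sketch): hypothetical features for one candidate keyword."""
    return [
        tfidf.get(word, 0.0),
        textrank.get(word, 0.0),
        float(word in name_tokens),   # word occurs in the segmented text name (S3)
        float(word in title_tokens),  # word occurs in the segmented text title (S3)
    ]
```

With features like these, a word such as "Nowitzki" that appears in the file name could be favoured even when its frequency-based scores are low, which is the situation described in paragraph [0018].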

[0065] As shown in Figure 2, the present invention is divided into two main parts: the first trains the model, and the second uses the model to extract keywords. Figure 1 shows the main logic of the model training phase and the keyword extraction phase of the model a...
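The two phases can be illustrated with a small classifier. The source does not name the model type, so the logistic regression below is purely an assumption chosen for brevity; `train_model`, `extract_keywords`, the learning rate, and the 0.5 threshold are all hypothetical. Each feature row stands for one candidate keyword, and a label of 1 marks a word that a human accepted as a keyword.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_model(features, labels, lr=0.5, epochs=500):
    """Phase 1 (sketch): fit a logistic regression by stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def extract_keywords(candidates, features, model, threshold=0.5):
    """Phase 2 (sketch): keep only the candidates the trained model accepts."""
    weights, bias = model
    kept = []
    for word, x in zip(candidates, features):
        score = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
        if score >= threshold:
            kept.append(word)
    return kept
```

In this sketch the candidates passed to `extract_keywords` would be the words proposed by the TextRank algorithm, matching the second screening described in S5.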


Abstract

The invention relates to a keyword extraction method suitable for word text. The method comprises the following steps: S1, obtaining the word text and extracting its text content; S2, using a TFIDF algorithm and a TextRank algorithm, respectively, to extract a set number of keywords; S3, obtaining the text name and the text title, and performing word segmentation; S4, constructing a text feature vector and inputting it into the trained keyword extraction model; S5, using the keyword extraction model to screen the keywords extracted by the TextRank algorithm a second time, obtaining the final keyword set, and completing the text keyword extraction. Compared with the prior art, the method has advantages such as high accuracy and a high recall rate.

Description

Technical field

[0001] The invention relates to the field of text analysis, in particular to a keyword extraction method suitable for word text.

Background technique

[0002] Keyword extraction is the key to technologies such as information retrieval, text classification and clustering, and automatic abstract generation, and is an important means to quickly obtain document topics. Keywords are traditionally defined as a set of words or phrases that summarize the subject matter of a document. Keywords have very important applications in many fields, such as automatic summarization of documents, web page information extraction, classification and clustering of documents, search engines, etc. However, in most cases, the text does not directly provide keywords, so it is necessary to design a keyword extraction method.

[0003] In the field of text analysis, the technologies for extracting text keywords mainly include: TextRank algorithm, TFIDF algorithm and LDA topic model. ...
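For reference, the commonly cited textbook forms of two of the techniques named in [0003] are given below. The source does not state which variants it uses, so these formulas are background under that assumption, not a description of the patented method.

```latex
% TF-IDF of term t in document d, with N documents in the corpus
% and df(t) the number of documents that contain t
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log \frac{N}{\mathrm{df}(t)}

% TextRank score of vertex V_i, with damping factor d (typically 0.85)
% and edge weight w_{ji} from V_j to V_i
WS(V_i) = (1 - d) + d \sum_{V_j \in \mathrm{In}(V_i)}
          \frac{w_{ji}}{\sum_{V_k \in \mathrm{Out}(V_j)} w_{jk}} \, WS(V_j)
```

TF-IDF ranks a word by how frequent it is in the document relative to the corpus, while TextRank ranks it by its position in a co-occurrence graph. Both depend on how often and where the word occurs in the body text, which is consistent with the problem described in [0018], where a word that is central to the title is ranked low.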


Application Information

IPC(8): G06F40/216, G06F40/289, G06F16/35
CPC: G06F16/35, G06F40/284, G06F16/313, G06F40/30, G06F40/216
Inventor: 张校源, 陈骁, 马祥祥
Owner EISOO SOFTWARE