Keyword extraction method, keyword extraction device, and computer-readable storage medium

A keyword extraction method and extraction device, applied in the field of computer software applications, that address problems such as word frequency being an insufficient measure of a word's importance, and keyword extraction results that are confined to the article's own wording and are imprecise.

Active Publication Date: 2021-03-09
BEIJING DAJIA INTERNET INFORMATION TECH CO LTD
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] The keywords extracted by these two algorithms are all high-frequency words. However, an important word may not appear many times, so measuring a word's importance by word frequency alone is not comprehensive enough. In addition, these methods may select several frequently occurring synonyms from an article as keywords, which produces redundant keywords. They can also return only words contained in the article and cannot perform semantic abstraction, so the extraction results are confined to the article's wording and are imprecise.

Examples

Embodiment Construction

[0062] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with aspects of the present application as recited in the appended claims.

[0063] FIG. 1 is a flowchart of a keyword extraction method according to an exemplary embodiment; the method specifically includes steps S101-S104.

[0064] Most of the keyword extraction algorithms in the prior art are statistical algorithms based on word frequency. Among the extracted keywords, there are many words with repeated semantics, and the content of the article cannot be well repr...

Abstract

The present application relates to a keyword extraction method, a keyword extraction device, and a computer-readable storage medium. The keyword extraction method includes: calculating the correlation between the text vector of the target text and each candidate word in the candidate vocabulary; extracting K candidate words from the N candidate words whose correlation is greater than the correlation threshold, and generating a candidate word joint vector for the target text, where N and K are both natural numbers greater than 1; calculating a first similarity and a second similarity, respectively; and using, as keywords of the target text, the K candidate words corresponding to the candidate word joint vectors whose first similarity is greater than the second similarity. By comparing the consistency between the article title and the extracted keywords, and by introducing auxiliary titles for semantic discrimination, the unsupervised problem is turned into a supervised one, which improves the accuracy of keyword extraction.
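The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how the correlation, the joint vector, or the two similarities are computed, so everything below is an assumption made for illustration: cosine similarity stands in for the correlation, the joint vector is taken as the mean of the K word vectors, the first similarity is measured against the vector of the article's own title, and the second against the vector of an auxiliary title. The names `extract_keywords`, `title_vec`, and `aux_title_vec` are hypothetical and do not come from the patent.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Element-wise mean; an assumed way to build the candidate word joint vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def extract_keywords(text_vec, title_vec, aux_title_vec, vocab, threshold, k):
    """Return K-word combinations that match the title better than the auxiliary title.

    vocab maps each candidate word to its vector.
    """
    # Correlation between the text vector and every candidate word
    # (cosine similarity is an assumption; the abstract does not name a measure).
    shortlisted = [w for w, vec in vocab.items() if cosine(text_vec, vec) > threshold]

    accepted = []
    # Draw K words from the N shortlisted words and build a joint vector for each draw.
    for combo in combinations(sorted(shortlisted), k):
        joint = mean_vector([vocab[w] for w in combo])
        first = cosine(joint, title_vec)       # assumed: similarity to the article's title
        second = cosine(joint, aux_title_vec)  # assumed: similarity to an auxiliary title
        if first > second:
            accepted.append((first - second, combo))

    # Order by margin so the most title-consistent combination comes first.
    accepted.sort(reverse=True)
    return [combo for _, combo in accepted]


if __name__ == "__main__":
    toy_vocab = {"cat": [1.0, 0.0], "kitten": [0.9, 0.1], "pet": [0.8, 0.2], "car": [0.0, 1.0]}
    print(extract_keywords([1.0, 0.1], [1.0, 0.0], [0.0, 1.0], toy_vocab, 0.5, 2))
```

Enumerating every K-combination grows quickly with N, so a practical implementation would likely sample or otherwise restrict the combinations; the exhaustive loop here only keeps the sketch short.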

Description

Technical field

[0001] The present application belongs to the field of computer software applications, and in particular relates to a keyword extraction method, a keyword extraction device, and a computer-readable storage medium.

Background technique

[0002] When performing natural language processing or news recommendation, it is usually necessary to extract keywords from articles, or to make personalized recommendations based on keywords, so different keyword extraction algorithms are used.

[0003] Most existing keyword extraction algorithms are based on word-frequency statistics, such as those based on TF-IDF or TextRank. TF-IDF-based algorithms generally assign different weights to TF (term frequency) and IDF (inverse document frequency); the product of TF and IDF is the TF-IDF (term frequency-inverse document frequency) feature, and the words with the highest TF-IDF values are selected as keywords. The algori...
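As a rough illustration of the TF-IDF selection described in the background, the following Python sketch scores each word of one document by the product of its term frequency and inverse document frequency and keeps the highest-scoring words. The function name `tfidf_keywords`, the unsmoothed logarithmic IDF, and the alphabetical tie-break are assumptions chosen for brevity; the patent text does not specify a particular weighting.

```python
from collections import Counter
from math import log


def tfidf_keywords(docs, doc_index, top_n=3):
    """Return the top_n words of docs[doc_index] ranked by TF * IDF.

    docs is a list of tokenized documents (lists of words).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each word appears.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    target = docs[doc_index]
    term_counts = Counter(target)
    total_terms = len(target)

    # TF is the relative frequency in the target document;
    # IDF is log(N / df), an assumed unsmoothed variant.
    scores = {
        word: (count / total_terms) * log(n_docs / doc_freq[word])
        for word, count in term_counts.items()
    }

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


if __name__ == "__main__":
    corpus = [
        ["cat", "cat", "sat", "mat"],
        ["dog", "sat", "mat"],
        ["dog", "ran"],
    ]
    print(tfidf_keywords(corpus, 0, top_n=2))
```

Because the ranking depends only on counts, a word that is central to the article but appears once can score below an incidental word that is repeated, and every returned keyword must already occur in the text.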

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/9535, G06F40/289
CPC: G06F40/289
Inventor: 刘永起
Owner: BEIJING DAJIA INTERNET INFORMATION TECH CO LTD