Keyword extraction method and device and storage medium

A keyword extraction technology, applied in the field of data processing, that can address the problem of high computational complexity

Pending Publication Date: 2020-04-28
BEIJING XIAOMI MOBILE SOFTWARE CO LTD

AI Technical Summary

Problems solved by technology

These two methods both rely on a large amount of labeled corpus, have high com



Embodiment Construction

[0081] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with aspects of the invention as recited in the appended claims.

[0082] An embodiment of the present disclosure provides a keyword extraction method. Figure 1 is a flowchart of a keyword extraction method according to an exemplary embodiment. As shown in Figure 1, the method includes the following steps:

[0083] In step 101, the original document is received, and a plurality of candidate phrases are extracted from t...


Abstract

The invention relates to a keyword extraction method, a device, and a storage medium. The method comprises: receiving an original document, extracting a plurality of candidate phrases from it, and forming a candidate set from the extracted candidate phrases; obtaining the association degree between each candidate phrase in the candidate set and the original document; obtaining the divergence degree of each candidate phrase in the candidate set; and selecting at least one candidate phrase from the candidate set as a key phrase based on the association degree and the divergence degree, and forming a key phrase set of the original document from the selected key phrases. Candidate phrases with a high association degree with the original document can thus be selected as key phrases, so the extracted key phrases are highly similar to the original document and extraction accuracy improves. Candidate phrases with a high divergence degree can likewise be selected, so each newly extracted key phrase differs substantially from the key phrases already in the key phrase set, which improves the diversity of the key phrases.
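
The selection loop described in the abstract can be illustrated with a minimal sketch. This excerpt does not say how the association degree or the divergence is computed, so the code below makes its own assumptions: bag-of-words cosine similarity stands in for the association degree, and divergence is taken as one minus the highest similarity to any phrase already selected. The function names, the stopword list, and the `trade_off` weight are all hypothetical and are not taken from the patent.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "and", "in", "to", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_candidates(document, max_len=2):
    """Collect n-grams (up to max_len words) that neither start nor end with a stopword."""
    tokens = tokenize(document)
    candidates = []
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            phrase = " ".join(gram)
            if phrase not in candidates:
                candidates.append(phrase)
    return candidates


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def select_key_phrases(document, k=3, trade_off=0.5):
    """Greedily pick up to k phrases, weighing association against divergence."""
    doc_vec = Counter(t for t in tokenize(document) if t not in STOPWORDS)
    candidates = extract_candidates(document)
    vectors = {c: Counter(c.split()) for c in candidates}
    # Assumed association degree: similarity between phrase and document.
    relevance = {c: cosine(vectors[c], doc_vec) for c in candidates}
    selected = []
    while candidates and len(selected) < k:
        def score(c):
            # Assumed divergence: 1 minus the highest similarity
            # to any phrase already in the key phrase set.
            max_sim = max((cosine(vectors[c], vectors[s]) for s in selected),
                          default=0.0)
            return trade_off * relevance[c] + (1 - trade_off) * (1.0 - max_sim)
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Calling `select_key_phrases(text, k=5)` returns at most five phrases. Raising `trade_off` toward 1 favors phrases close to the document, while lowering it favors phrases that differ from those already chosen.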

Description

Technical field

[0001] The present disclosure relates to the technical field of data processing, and in particular to a keyword extraction method, device, and storage medium.

Background

[0002] With the explosive growth of Internet text data, related services often need to extract keywords that summarize the core viewpoints of an article, in order to support functions such as accurate recommendation and key annotation. This kind of task is highly subjective, and usable annotated corpora are difficult to obtain, so traditional methods suffer from low accuracy and consume a lot of computing time.

[0003] In related technologies, keywords are obtained in two ways: the first is keyword extraction (for words that appear in the text), and the second is keyword generation (for words that do not appear in the text).

[0004] There are many ways to implement keyword extraction in Method 1, in...


Application Information

IPC(8): G06F40/284, G06F16/35
CPC: G06F16/35, G06F40/30, G06F40/279, G06F40/284, Y02D10/00, G06F40/289
Inventor: 过群, 鲁骁, 孟二利, 王斌, 史亮, 齐保元, 纪鸿旭
Owner: BEIJING XIAOMI MOBILE SOFTWARE CO LTD