Keyword extraction method, apparatus and device, and storage medium

A keyword extraction method applied in the field of text processing. It addresses the low accuracy of existing keyword extraction, its poor performance, and its failure to consider the semantic features of candidate words, and it improves the accuracy and recall rate of extraction.

Inactive Publication Date: 2019-12-20
TENCENT TECH (SHENZHEN) CO LTD
8 Cites, 35 Cited by

AI Technical Summary

Problems solved by technology

However, extracting keywords with this method requires first segmenting the text into words, so the quality of the extracted keywords depends heavily on the accuracy of word segmentation: when segmentation is poor, keyword extraction accuracy is low. In addition, the method does not consider the semantic features of the candidate words and may perform poorly at keyword extraction in specialized domains.




Embodiment Construction

[0030] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0031] It should be noted that the terms "first" and "second" in the description and claims of the present invention and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances such that the embodiments of the invention described herein can be practiced in sequences other than those illustrate...


Abstract

The invention relates to the technical field of text processing, in particular to a keyword extraction method, apparatus and device, and a storage medium. The method includes: obtaining a corpus text to be extracted; inputting the corpus text to be extracted into a text labeling model for character-type labeling to obtain a label corresponding to each character in the corpus text, wherein the text labeling model is obtained by supervised training of a preset neural network model on training corpus text with sample labels, and the preset neural network model comprises a semantic representation model, a fully connected layer connected to the semantic representation model, a conditional random field connected to the fully connected layer, and an output layer connected to the conditional random field; obtaining the characters corresponding to a preset label in the corpus text to be extracted; and determining the keywords of the corpus text to be extracted according to the characters corresponding to the preset label. The method can improve the accuracy and recall rate of keyword extraction.
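
As a rough illustration of the last two steps (collecting the characters that carry a preset label and assembling them into keywords), the sketch below assumes a hypothetical B/I/O label set, where "B" marks the first character of a keyword, "I" a continuation character, and "O" any other character. The abstract does not state which labels the model emits, so the label names and the function `extract_keywords` are assumptions for illustration only, and the labels are supplied by hand rather than produced by a trained model.

```python
from typing import List


def extract_keywords(text: str, labels: List[str]) -> List[str]:
    """Group consecutive characters labeled B/I into keywords.

    Assumes a hypothetical B/I/O scheme with one label per character;
    the source does not specify the actual label set.
    """
    if len(text) != len(labels):
        raise ValueError("expected exactly one label per character")

    keywords: List[str] = []
    current: List[str] = []  # characters of the keyword being built

    for ch, tag in zip(text, labels):
        if tag == "B":
            # A new keyword starts; flush any keyword in progress.
            if current:
                keywords.append("".join(current))
            current = [ch]
        elif tag == "I" and current:
            # Continue the keyword that is currently open.
            current.append(ch)
        else:
            # "O", or an "I" with no open keyword: close and reset.
            if current:
                keywords.append("".join(current))
            current = []

    if current:
        keywords.append("".join(current))

    # Drop repeated keywords while preserving first-seen order.
    seen = set()
    unique: List[str] = []
    for kw in keywords:
        if kw not in seen:
            seen.add(kw)
            unique.append(kw)
    return unique


if __name__ == "__main__":
    sample_text = "ab cd ab"
    sample_labels = ["B", "I", "O", "B", "I", "O", "B", "I"]
    print(extract_keywords(sample_text, sample_labels))
```

Because the labels are assigned per character, this step needs no prior word segmentation, which is the dependency the summary above identifies as a weakness of earlier methods.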

Description

Technical field

[0001] The present invention relates to the technical field of text processing, in particular to a keyword extraction method, device, equipment and storage medium.

Background technique

[0002] With the development of the Internet, the amount of online text information has exploded, and it is increasingly difficult to manually obtain the required text information. Therefore, how to quickly and effectively summarize the key information of texts in a certain field or topic has become an important issue.

[0003] In order to effectively deal with massive text data, researchers have done a lot of research in the directions of text classification, text clustering, automatic summarization and information retrieval, and these researches all involve the problem of how to obtain the keywords in the text. Keywords are the refinement of text topic information, which highly summarizes the main content of the text and can help users quickly understand the gist of the tex...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F16/35
CPC: G06F16/35
Inventor 智绪浩
Owner TENCENT TECH (SHENZHEN) CO LTD