Subject heading classification model creation method and device and storage medium

A subject word classification model and related technology, applied in the field of data processing, that addresses the poor accuracy of subject word classification models, their complex creation process, and their high creation cost, thereby reducing creation costs, improving accuracy, and simplifying the creation process.

Active Publication Date: 2017-11-07
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a method for creating a subject classification model, a creation device, and a storage medium that can accurately create a subject classification model, the creation process is si

Example Embodiment

[0031] Please refer to the drawings, in which the same reference symbols denote the same components. The principle of the present invention is illustrated by its implementation in a suitable computing environment. The following description is based on the illustrated specific embodiments of the present invention and should not be regarded as limiting other specific embodiments that are not described in detail herein.

[0032] In the following description, unless otherwise stated, the specific embodiments of the present invention are described with reference to steps and symbolic representations of operations performed by one or more computers. It will therefore be understood that these steps and operations, several of which are referred to as being executed by a computer, include the manipulation by a computer processing unit of electronic signals that represent data in a structured form. This manipulation transforms the data or maintains it in a locati...

Abstract

The invention provides a method for creating a subject heading classification model. The method comprises: obtaining multiple model training documents and extracting the label terms of the model training documents; obtaining, based on a similarity algorithm, the core theme phrases corresponding to the label terms; obtaining, based on a map content library, a first model training document collection corresponding to the core theme phrases; classifying the model training documents based on a machine learning algorithm; obtaining, based on the map content library, the subject type identification of all the model training documents corresponding to the label terms, and determining, according to that subject type identification, a second model training document collection corresponding to the label terms; and creating a subject heading classification model for the label terms by taking the model training documents that appear in both the first and the second model training document collections as positive samples and the other model training documents in the map content library as negative samples. The invention further provides a device for creating a subject heading classification model and a storage medium.
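The final sample-selection step of the abstract can be read as a pair of set operations. The following is a minimal sketch under that reading, not the patented implementation: the function name `select_samples` and the document IDs are hypothetical, and documents are assumed to be identified by hashable IDs.

```python
from typing import Hashable, Iterable, Set, Tuple


def select_samples(
    first_collection: Iterable[Hashable],
    second_collection: Iterable[Hashable],
    library: Iterable[Hashable],
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Split a document library into positive and negative samples.

    Positives: documents present in BOTH collections for a label term.
    Negatives: every other document in the library.
    """
    positives = set(first_collection) & set(second_collection)
    negatives = set(library) - positives
    return positives, negatives


# Hypothetical document IDs, for illustration only.
library_ids = {"d1", "d2", "d3", "d4", "d5"}
first = {"d1", "d2", "d3"}   # assumed result of the core-theme-phrase lookup
second = {"d2", "d3", "d4"}  # assumed result of the subject-type lookup

positives, negatives = select_samples(first, second, library_ids)
```

Requiring a document to appear in both collections keeps only the documents on which the two independent signals agree, which is one plausible reason for the improved accuracy the record claims.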

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to a creation method, a creation device, and a storage medium for a subject heading classification model.

Background

[0002] In an Internet content distribution system, articles need to be classified by keywords, that is, words that represent the main content characteristics of an article, so that users can quickly and conveniently understand the content of the article through the keywords.

[0003] Existing article subject words are generally tag words that appear in the article, and the tag word extraction algorithm requires that an article's tag words actually appear in its text, which greatly limits the abstraction and generalization ability of article subject words. For example, the tag word "black technology" may not appear in an article describing a specific black technology, so that the subject word of the article cann...
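The limitation described in [0003] can be illustrated with a minimal sketch. It assumes a deliberately naive extractor that matches candidate tags against the article text; the function `extract_tags`, the sample article, and the candidate list are all hypothetical and are not taken from the patent.

```python
from typing import List


def extract_tags(article_text: str, candidate_tags: List[str]) -> List[str]:
    """Return the candidate tags that literally occur in the article text."""
    lowered = article_text.lower()
    return [tag for tag in candidate_tags if tag.lower() in lowered]


# Hypothetical article about a specific piece of "black technology".
article = (
    "A new foldable phone uses a graphene battery "
    "that charges in five minutes."
)
candidates = ["graphene battery", "foldable phone", "black technology"]

tags = extract_tags(article, candidates)
```

Because the abstract tag "black technology" never occurs in the text, an extractor of this kind cannot return it, however well it describes the article. A classifier trained per label term does not have this constraint.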

Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F16/355, G06F18/214
Inventor: 孙子荀
Owner: TENCENT TECH (SHENZHEN) CO LTD