Text label extracting method and device

A text tag extraction technology, applied in the Internet field, that addresses the inability of existing tags to meet retrieval needs of different granularities or to provide news tag subscriptions of different granularities.

Active Publication Date: 2016-11-23
SHENZHEN TENCENT COMP SYST CO LTD

AI Technical Summary

Problems solved by technology

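Text tags extracted by the traditional keyword-based method have no hierarchy and cannot meet retrieval needs of different granularities, nor can they provide news tag subscriptions of different granularities.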

Detailed description of the embodiments

[0027] In one embodiment, as shown in Figure 1, a method for extracting text labels is provided, including the following steps:

[0028] In step S110, category prediction is performed on the text to be extracted using a text classification model to obtain the target category of the text.

[0029] Specifically, the text classification model is a mathematical model for classifying text. Different training methods yield different text classification models, and a model such as a maximum entropy model or a decision tree model can be selected as required. The model is trained offline. At online prediction time, the trained model calculates the probability that the text belongs to each category, and the category with the highest probability is taken as the target category of the text. The type of each cate...
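
The prediction step in [0029] (score every category, then keep the most probable one) can be illustrated with a minimal Python sketch. It assumes a maximum-entropy-style log-linear scorer with a softmax. The categories, feature weights, and function names below are hypothetical toy values chosen for illustration, not taken from the patent; in practice the weights would come from offline training.

```python
import math

# Hypothetical toy weights standing in for an offline-trained
# maximum-entropy-style (log-linear) model. Values are invented.
WEIGHTS = {
    "sports":  {"match": 2.0, "team": 1.5, "goal": 1.0},
    "finance": {"stock": 2.0, "market": 1.5, "bank": 1.0},
    "tech":    {"phone": 2.0, "software": 1.5, "chip": 1.0},
}

def category_probabilities(tokens):
    """Return P(category | text) as a softmax over summed feature weights."""
    scores = {
        cat: sum(weights.get(tok, 0.0) for tok in tokens)
        for cat, weights in WEIGHTS.items()
    }
    max_score = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {cat: math.exp(s - max_score) for cat, s in scores.items()}
    total = sum(exp_scores.values())
    return {cat: e / total for cat, e in exp_scores.items()}

def predict_target_category(tokens):
    """Pick the category with the highest predicted probability."""
    probs = category_probabilities(tokens)
    return max(probs, key=probs.get)

tokens = "the team won the match with a late goal".split()
probs = category_probabilities(tokens)
target = predict_target_category(tokens)
print(target, probs)
```

Any classifier that outputs per-category probabilities (for example a decision tree, which the text also mentions) could replace `category_probabilities` without changing the final arg-max step.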

Abstract

The invention relates to a text label extracting method. The text label extracting method comprises the following steps: category prediction is performed on a to-be-extracted text through a text categorization model, and a target category of the text is obtained; topic prediction is performed on the to-be-extracted text through a topic clustering model, and a predicted topic is obtained; if the predicted topic is in a default topic set, a target topic corresponding to the predicted topic is acquired, keyword extraction is performed on the to-be-extracted text, target keywords of the text are obtained, and the target category, the target topic and the target keywords are taken as labels of the text. The text labels have different levels to meet multi-granularity retrieval requirements, and multi-granularity recommended articles can be provided according to different labels. Besides, the invention provides a text label extracting device.
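
The flow described in the abstract can be sketched roughly as follows in Python. The three models are replaced by trivial stand-in functions, and the topic ids, topic names, stopword list, and all function names are hypothetical. The sketch also assumes that keywords are always extracted and that the topic label is simply omitted when the predicted topic is not in the default topic set; the excerpt here does not say what happens in that case.

```python
from collections import Counter

# Hypothetical default topic set: predicted topic id -> target topic name.
DEFAULT_TOPICS = {7: "basketball playoffs", 12: "smartphone launches"}
STOPWORDS = {"the", "a", "in", "of", "and", "to", "with"}

def predict_category(text):
    # Stand-in for the text classification model.
    return "sports" if "playoffs" in text else "other"

def predict_topic(text):
    # Stand-in for the topic clustering model; returns a topic id.
    return 7 if "playoffs" in text else 99

def extract_keywords(text, k=3):
    # Stand-in keyword extractor: the k most frequent non-stopword tokens.
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    return [w for w, _ in Counter(words).most_common(k)]

def extract_labels(text):
    """Combine category, topic, and keywords into multi-level labels."""
    labels = {"category": predict_category(text)}
    topic_id = predict_topic(text)
    if topic_id in DEFAULT_TOPICS:  # keep the topic only if it is in the default set
        labels["topic"] = DEFAULT_TOPICS[topic_id]
    labels["keywords"] = extract_keywords(text)
    return labels

labels = extract_labels(
    "the warriors beat the rockets in the playoffs and the warriors advance"
)
print(labels)
```

The resulting dictionary has one coarse label (category), one medium-grained label (topic), and several fine-grained labels (keywords), which is the hierarchy the abstract relies on for multi-granularity retrieval.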

Description

Technical field

[0001] The invention relates to the technical field of the Internet, and in particular to a method and device for extracting text labels.

Background

[0002] With the development of Internet technology, people can read various types of text through the Internet. A tag is a keyword strongly correlated with the text; it briefly describes and classifies the text content for easy retrieval and sharing.

[0003] The traditional tag extraction method is keyword-based: it extracts keywords from the words that appear in the article and uses the extracted keywords as the article's tags. Text tags extracted this way have no hierarchy and cannot meet retrieval needs of different granularities, nor can they provide news tag subscriptions of different granularities.

Summary of the invention

[0004] Based on this, it is necessary to provide a method and device for extracting text tags to address the above problems, which can meet ...

Application Information

IPC(8): G06F17/30
CPC: G06F16/313, G06F16/35
Inventors: 胡燊, 刘安安, 王迪
Owner: SHENZHEN TENCENT COMP SYST CO LTD