Method for extracting theme tag from text set and electronic equipment

A text and topic technology, applied in the field of natural language processing, that addresses two difficulties: extracting topics in a targeted manner and determining the number of clusters in advance. Its stated effects are simple, convenient calculation and a wide application range.

Pending Publication Date: 2022-03-01
EMOTIBOT TECH LTD
AI Technical Summary

Problems solved by technology

However, with this method the topic split must be determined from the result of the vector representation, so it is difficult to operate in a targeted manner.
The traditional K-means clustering algorithm also requires the number of clusters to be fixed in advance, which is difficult when the number of partitions in the data set is uncertain.

Examples

Embodiment Construction

[0043] The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.

[0044] Like numbers and letters denote similar items in the following figures, so that once an item is defined in one figure, it does not require further definition and explanation in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second" and the like are only used to distinguish descriptions, and cannot be understood as indicating or implying relative importance.

[0045] Figure 1 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. The electronic device 100 may be used to execute the method for extracting topic tags from a text set provided in the embodiment of the present application. As shown in Figure 1, the electronic device 100 includes: one or more processors 102, and one ...

Abstract

The invention provides a method for extracting topic tags from a text set, and electronic equipment. The method comprises the following steps: converting each text in the text set into a text vector; taking each text vector as a cluster at the bottom layer, performing hierarchical clustering from the bottom up, and determining a topic tag for each cluster at every layer; for any word, obtaining the cluster set corresponding to that word from the clusters whose topic tag contains the word, where the cluster set comprises at least one cluster and each cluster comprises at least one text; using the cluster sets corresponding to the different words and the keyword to be extracted, finding the target cluster set to which the keyword maps; and, from the topic tag of each cluster in the target cluster set, obtaining the topic tags extracted from the text set that are related to the keyword. With this scheme, the extraction of topic tags is simpler and more convenient.
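The steps in the abstract can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the bag-of-words vectors, the cosine similarity, the "most frequent words" tagging rule, and all function names (`vectorize`, `topic_tag`, `build_hierarchy`, `index_by_tag_word`, `tags_for_keyword`) are assumptions introduced here, since the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Assumed text vector: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def topic_tag(vec, top_n=2):
    """Hypothetical tagging rule: the top_n most frequent words."""
    ranked = sorted(vec.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


def build_hierarchy(texts, top_n=2):
    """Bottom-up clustering; returns every cluster from every layer."""
    active, all_clusters = [], []
    for i, text in enumerate(texts):
        vec = vectorize(text)
        leaf = {"members": frozenset([i]), "vec": vec,
                "tag": topic_tag(vec, top_n)}
        active.append(leaf)
        all_clusters.append(leaf)
    while len(active) > 1:
        # Merge the most similar pair of currently active clusters.
        _, i, j = max((cosine(a["vec"], b["vec"]), i, j)
                      for i, a in enumerate(active)
                      for j, b in enumerate(active) if i < j)
        vec = active[i]["vec"] + active[j]["vec"]
        merged = {"members": active[i]["members"] | active[j]["members"],
                  "vec": vec, "tag": topic_tag(vec, top_n)}
        active = [c for k, c in enumerate(active) if k not in (i, j)]
        active.append(merged)
        all_clusters.append(merged)
    return all_clusters


def index_by_tag_word(clusters):
    """Map each word to the set of clusters whose topic tag contains it."""
    index = defaultdict(list)
    for cid, cluster in enumerate(clusters):
        for word in cluster["tag"]:
            index[word].append(cid)
    return index


def tags_for_keyword(keyword, clusters, index):
    """Topic tags of the target cluster set that the keyword maps to."""
    return [clusters[cid]["tag"] for cid in index.get(keyword.lower(), [])]


texts = ["cat cat dog", "cat cat mouse",
         "stock stock market", "stock stock bond"]
clusters = build_hierarchy(texts)
index = index_by_tag_word(clusters)
print(tags_for_keyword("cat", clusters, index))
```

Because every layer of the hierarchy keeps its own tag, a keyword lookup returns tags at several granularities, and no cluster count has to be chosen beforehand.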

Description

Technical field

[0001] The present application relates to the technical field of natural language processing, and in particular to a method and electronic equipment for extracting topic tags from a text set.

Background technique

[0002] The topic is the central idea of a text; it summarizes and reflects the main body and core of the text content. A topic tag can briefly summarize the main content of a text in a small number of words. In an era of information overload and rapid data growth, enterprises accumulate massive text resources. When the amount of text is large and the sources are diverse, the data set contains content of different fields and types, and in practice one faces the problem of obtaining the topics of the texts from multiple angles and of understanding the relative relationships between these texts.

[0003] For the topic tag extraction of a large amount of text, the topic model can be used directly, such as the common Latent Dirichlet Allocation (LDA...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06K9/62
CPC: G06F16/35, G06F18/231
Inventor: 简仁贤, 马永宁, 任钊立
Owner: EMOTIBOT TECH LTD