Method for extracting subject term/sequence in subject hierarchical modeling

A topic word extraction method applied in semantic analysis, character and pattern recognition, instruments, and related fields. It addresses two problems of existing approaches: the extracted topic words/sequences cannot highlight the differences between upper-level and lower-level topics, and they cannot reflect the correlation between upper-level and lower-level topics.

Pending Publication Date: 2022-07-15
同方知网数字出版技术股份有限公司

AI Technical Summary

Problems solved by technology

[0003] 1) c-tf-idf: the topic words/sequences extracted by this method cannot reflect the correlation between upper-level and lower-level topics;
[0004] 2) Word/sequence frequency, TextRank, and top-n selection based on semantic similarity: the topic words/sequences extracted by these methods cannot highlight the differences between upper-level and lower-level topics or between topics at the same level.
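
To make the first limitation concrete, the following is a minimal sketch of class-based TF-IDF, assuming the formulation popularized by BERTopic (term frequency within a topic multiplied by log(1 + average tokens per topic / total term frequency)). The function names `c_tf_idf` and `top_n` and the toy documents are illustrative assumptions, not part of the patent. Note that every topic is scored against all other topics in one flat pool, so nothing in the score refers to a parent or child topic.

```python
import math
from collections import Counter


def c_tf_idf(topic_docs):
    """Score terms per topic. topic_docs maps topic -> list of tokenized docs."""
    # Treat all documents of a topic as one long document and count its terms.
    tf = {
        topic: Counter(tok for doc in docs for tok in doc)
        for topic, docs in topic_docs.items()
    }
    # Total frequency of each term across all topics.
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    # Average number of tokens per topic.
    avg_len = sum(total.values()) / len(tf)
    return {
        topic: {w: c * math.log(1 + avg_len / total[w]) for w, c in counts.items()}
        for topic, counts in tf.items()
    }


def top_n(scores, n=3):
    """Return the n highest-scoring words, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:n]]


# Toy data (assumed): two topics, no hierarchy information is used anywhere.
topic_docs = {
    "machine_learning": [["model", "training", "data"], ["neural", "model", "data"]],
    "deep_learning": [["neural", "network", "model"], ["neural", "layer", "training"]],
}
scores = c_tf_idf(topic_docs)
ml_top = top_n(scores["machine_learning"], 1)
dl_top = top_n(scores["deep_learning"], 1)
```

Because the score depends only on the flat set of topics, the same words would be selected regardless of which topic is the parent of which, which is the gap described in [0003].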



Embodiment Construction

[0016] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail below with reference to the embodiments and accompanying drawings.

[0017] As shown in Figure 1, the flow of the method for extracting topic words/sequences in hierarchical topic modeling includes:

[0018] 1) Obtain the model output of hierarchical topic modeling: this mainly includes the hierarchical relationships between topics and the collection of documents corresponding to each topic;

[0019] 2) Extract the topic words/sequences from the model output: the topic words/sequences are extracted mainly according to correlation and difference;

[0020] 3) Present the hierarchical topic representation: present the topics to the user in the form of topic words/sequences.

[0021] Step 1) above mainly obtains the hierarchical topic results that have already been mined (the hdbscan method is used here, and other ...
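
The three steps above can be sketched as follows. The excerpt does not give the patent's scoring formula, so the score used here is purely a hypothetical stand-in: a difference term (a word's relative frequency in the topic minus its mean relative frequency in sibling topics) plus a weighted correlation term (the word's relative frequency in the parent topic's documents). The names `rel_freq`, `score_topic`, `top_words`, the weight `beta`, and the toy hierarchy are all assumptions for illustration.

```python
from collections import Counter


def rel_freq(docs):
    """Relative frequency of each token over a list of tokenized documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def score_topic(topic, parent, children_of, docs_of, beta=0.5):
    """Hypothetical score: difference from siblings + beta * correlation with parent."""
    p_topic = rel_freq(docs_of[topic])
    p_parent = rel_freq(docs_of[parent])
    siblings = [s for s in children_of[parent] if s != topic]
    p_sibs = [rel_freq(docs_of[s]) for s in siblings]
    scores = {}
    for w, p in p_topic.items():
        sib_mean = sum(ps.get(w, 0.0) for ps in p_sibs) / len(p_sibs) if p_sibs else 0.0
        difference = p - sib_mean            # distinguishes the topic from same-level topics
        correlation = p_parent.get(w, 0.0)   # ties the topic to its upper-level topic
        scores[w] = difference + beta * correlation
    return scores


def top_words(scores, n=2):
    """Return the n highest-scoring words, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:n]]


# Step 1 (assumed model output): topic hierarchy and documents per topic.
children_of = {"ai": ["vision", "nlp"]}
docs_of = {
    "vision": [["image", "model", "pixel"], ["image", "detection", "model"]],
    "nlp": [["text", "model", "token"], ["text", "translation", "model"]],
}
docs_of["ai"] = docs_of["vision"] + docs_of["nlp"]  # parent covers its children

# Step 2: score words for each lower-level topic.
vision_scores = score_topic("vision", "ai", children_of, docs_of)
nlp_scores = score_topic("nlp", "ai", children_of, docs_of)

# Step 3: present each topic as its top words.
vision_repr = top_words(vision_scores, 1)
nlp_repr = top_words(nlp_scores, 1)
```

In this toy data, "model" is equally frequent in both child topics, so its difference term is zero and it ranks below topic-specific words such as "image" or "text". A real implementation would substitute the patent's own correlation and difference measures for the assumed ones.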


Abstract

The invention discloses a method for extracting topic words/sequences in hierarchical topic modeling. The method comprises the following steps: acquiring the model output of hierarchical topic modeling; extracting topic words/sequences according to the model output; and displaying the word/sequence representation of each topic. With this method, topics can be better represented while mining a hierarchical topic structure: the correlation between upper-level and lower-level topics is reflected, the differences between upper-level and lower-level topics and between topics at the same level are highlighted, and the topics are hierarchical in both structure and representation.

Description

Technical field

[0001] The present invention relates to the technical field of topic word extraction, in particular to a method for extracting topic words/sequences in hierarchical topic modeling.

Background art

[0002] Hierarchical topic modeling refers to mining not only topic sets, but also the hierarchical relationships between topics during topic modeling. The results of topic mining are finally presented to users in the form of topic words/sequences, and the representation of a topic directly determines the user's understanding and control of the topic. Conventional topic word/sequence extraction methods include c-tf-idf, TextRank, calculating word/sequence frequency, and selecting top-n words/sequences based on semantic similarity. Unlike a single-level topic representation, a hierarchical topic representation should not only reflect the correlation between upper-level and lower-level topics, but also highlight the differences between th...


Application Information

IPC(8): G06F40/30, G06F40/216, G06K9/62
CPC: G06F40/30, G06F40/216, G06F18/22
Inventor: 冯晓燕, 吴晨
Owner: 同方知网数字出版技术股份有限公司