Topic analysis method and device and storage medium

A topic analysis method, applied in the fields of instruments, electrical digital data processing, and computing. It addresses two problems: topics and articles are not in a one-to-one relationship, and existing approaches cannot fully summarize the core content of a text.

Active Publication Date: 2020-03-10
HUNAN ANTVISION SOFTWARE

AI Technical Summary

Problems solved by technology

Topics and articles are not in a one-to-one relationship, but a clustering algorithm assumes that each text has only one topic, so it cannot fully summarize the core content of the entire text.

Embodiment Construction

[0059] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0060] Figure 1 is a schematic flow chart of the topic analysis method in an embodiment of the present invention. Referring to Figure 1, the embodiment provides a topic analysis method that includes:

[0061] S101: Obtain the text corpora to be processed, and obtain the word segmentation result and the corresponding parts of speech for each text corpus to be processed.
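
Step S101 can be illustrated with a minimal sketch. The source does not say which segmenter or tag set is used, so this example swaps in forward maximum matching over a tiny hypothetical lexicon; `LEXICON`, `segment_with_pos`, and the tags `n`, `v`, `a`, `x` are all assumptions made for illustration, not the method of the invention.

```python
# Hypothetical toy lexicon mapping words to part-of-speech tags
# (n = noun, v = verb, a = adjective). A real system would use a
# full dictionary or a trained segmenter instead.
LEXICON = {
    "公司": "n",
    "发布": "v",
    "新": "a",
    "产品": "n",
    "新产品": "n",
}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def segment_with_pos(sentence):
    """Forward maximum matching: at each position, take the longest lexicon word."""
    result = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shrink one character at a time.
        for length in range(min(MAX_WORD_LEN, len(sentence) - i), 0, -1):
            word = sentence[i:i + length]
            if word in LEXICON:
                result.append((word, LEXICON[word]))
                i += length
                break
        else:
            # No lexicon word starts here: emit the single character as unknown ("x").
            result.append((sentence[i], "x"))
            i += 1
    return result


if __name__ == "__main__":
    for word, tag in segment_with_pos("公司发布新产品"):
        print(word, tag)
```

Each text corpus then becomes a list of `(word, part_of_speech)` pairs, which is the shape the later dependency analysis step would consume.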

[0062] It should be noted that the text corpus is a collection gathered by data capture, and its sentences may be ill-formed or contain special symbols. Therefore, text containing special symbols needs to be processed.

[0063] In an implementation manner of the present invention, sentence division processing is performed on the text corpus according to punctuation marks, so as to remove specific punctuation marks c...
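
The sentence division described in [0062] and [0063] might look like the sketch below. The paragraph above is truncated, so the actual punctuation marks and special symbols are not known; `SENTENCE_DELIMITERS`, `SPECIAL_SYMBOLS`, and `split_sentences` are hypothetical names, and both character sets are assumptions chosen only to make the example concrete.

```python
import re

# Assumed sentence delimiters: common Chinese and ASCII sentence-ending marks.
# The ASCII period is left out on purpose so decimals such as "3.5" stay intact.
SENTENCE_DELIMITERS = r"[。！？!?；;\n]+"

# Assumed definition of a "special symbol": any character that is not a word
# character (this includes CJK characters), whitespace, or common in-sentence
# punctuation.
SPECIAL_SYMBOLS = re.compile(r"[^\w\s，,、：:]")


def split_sentences(text):
    """Split text at sentence delimiters and strip special symbols from each piece."""
    sentences = []
    for piece in re.split(SENTENCE_DELIMITERS, text):
        cleaned = SPECIAL_SYMBOLS.sub("", piece).strip()
        if cleaned:  # skip pieces that were only symbols or whitespace
            sentences.append(cleaned)
    return sentences


if __name__ == "__main__":
    print(split_sentences("今天天气很好！★★我们去公园吧。#话题#"))
```

Cleaning after splitting keeps the delimiters available for the split itself; pieces that end up empty are discarded.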

Abstract

The invention discloses a topic analysis method, which comprises the steps of: obtaining text corpora to be processed, and obtaining a word segmentation result and the corresponding parts of speech for each text corpus to be processed; obtaining filtered text corpora; analyzing the word segmentation result and the corresponding parts of speech of each filtered text corpus by dependency parsing to obtain the syntactic components of the segmented words, the dependency relationships between the segmented words, and the dependency pairs corresponding to each text corpus; obtaining a topic corresponding to each text corpus according to the combined sentence pattern structures and the dependency pairs; and obtaining similar topics and sorting them according to the number of similar topics. The invention also discloses a topic analysis device and a storage medium. Syntactic analysis is applied on top of word segmentation to analyze the grammatical structure of a text sentence and the dependency relationships between the segmented words, and a fluent and accurate topic is extracted according to a plurality of preset common Chinese combined sentence pattern structures, so that topics can be analyzed from massive texts.
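
The last two steps of the abstract, deriving a topic from dependency pairs and ranking topics by count, can be sketched as follows. This is an illustration under several assumptions, not the patented procedure: dependency pairs are represented as hypothetical `(head, relation, dependent)` triples with `SBV` (subject-verb) and `VOB` (verb-object) labels, a single subject-verb-object pattern stands in for the preset combined sentence pattern structures, and "similar topics" is simplified to exact string equality.

```python
from collections import Counter


def extract_topic(dependency_pairs):
    """Build a subject-verb-object topic from (head, relation, dependent) triples.

    Returns None when the sentence does not match the assumed pattern.
    """
    subjects = {head: dep for head, rel, dep in dependency_pairs if rel == "SBV"}
    objects = {head: dep for head, rel, dep in dependency_pairs if rel == "VOB"}
    for verb, subject in subjects.items():
        if verb in objects:
            return subject + verb + objects[verb]
    return None


def rank_topics(topics):
    """Count identical topics and return (topic, count) pairs, most frequent first."""
    counts = Counter(topic for topic in topics if topic is not None)
    return counts.most_common()


if __name__ == "__main__":
    # Hand-written parses standing in for the output of a dependency parser.
    parsed_corpora = [
        [("发布", "SBV", "公司"), ("发布", "VOB", "新产品")],
        [("发布", "SBV", "公司"), ("发布", "VOB", "新产品")],
        [("上涨", "SBV", "股价")],  # no object, so no topic is produced
        [("举办", "SBV", "学校"), ("举办", "VOB", "运动会")],
    ]
    topics = [extract_topic(pairs) for pairs in parsed_corpora]
    print(rank_topics(topics))
```

A fuller version would match several sentence patterns and merge near-duplicate topics with a similarity measure before counting.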

Description

Technical field

[0001] The present invention relates to the field of topic analysis and processing, in particular to a topic analysis method, device and storage medium.

Background art

[0002] With the rapid development of information technology, the Internet has become the main channel through which people obtain and distribute information. Because network information is large in volume, comes from a wide range of sources, and spreads quickly, it is increasingly difficult for ordinary netizens to find the information they want quickly and accurately. Therefore, how to quickly, accurately and comprehensively analyze and extract the hot topics that netizens care about from massive network information has become a very active research direction.

[0003] At present, text is still the main way of expressing online topics, and the technical means of discovering topics from text is still limited to the lexical level, that is, relying on keywor...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F40/216, G06F40/253, G06F40/289
Inventor: 耿雪芹, 王晓斌, 焦梦姝, 黄三伟
Owner: HUNAN ANTVISION SOFTWARE