Electric power public opinion abstract extraction optimization method and system based on topic clustering

A summarization and clustering technology applied to the extraction and optimization of power public opinion summaries. It addresses problems such as incoherent semantics, incomplete consideration of factors, and an unclear theme of the original text, and it improves summary quality while reducing interference factors.

Pending Publication Date: 2020-05-01
DAREWAY SOFTWARE

AI Technical Summary

Problems solved by technology

However, existing methods all suffer from problems such as incomplete consideration of factors, highly redundant information in the generated abstract, an unclear theme of the original text, and incoherent semantics.
In summary, the automatic generation of summaries for electric power public opinion lacks an effective solution.

Examples

Embodiment 1

[0039] Embodiment 1. This embodiment provides a method for extracting and optimizing power public opinion summaries based on topic clustering.

[0040] As shown in Figure 1, the method includes:

[0041] S1: Obtain the power industry news text from which a summary is to be extracted.

[0042] S2: Cluster the power industry news text in units of sentences, then apply Latent Dirichlet Allocation (hereinafter the LDA topic model) to the clustering results to extract the subject words of the power text.
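The subject-word extraction in S2 can be illustrated with a minimal sketch. It assumes the sentence clusters have already been formed and tokenised, and it fits LDA with a small collapsed Gibbs sampler written in plain Python. The function names, the hyperparameters `alpha` and `beta`, the number of topics, and the sample tokens are all illustrative assumptions and are not taken from the patent text.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=100, seed=0):
    """Fit LDA by collapsed Gibbs sampling over tokenised documents.

    Returns (topic_word, doc_topic): per-topic word counts and
    per-document topic counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    doc_topic = [[0] * n_topics for _ in docs]

    # Random initial topic assignment for every token.
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, assign[d]):
            topic_word[k][w] += 1
            topic_total[k] += 1
            doc_topic[d][k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                doc_topic[d][k] -= 1
                # Conditional topic weights given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                topic_word[k][w] += 1
                topic_total[k] += 1
                doc_topic[d][k] += 1
    return topic_word, doc_topic


def subject_words(topic_word, n=3):
    """Return up to n highest-count words for each topic."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]


# Hypothetical sentence clusters, each flattened into a token list.
clusters = [
    ["outage", "grid", "repair", "outage", "storm"],
    ["tariff", "price", "reform", "tariff"],
    ["grid", "storm", "outage", "repair"],
    ["price", "tariff", "reform", "policy"],
]
topic_word, doc_topic = lda_gibbs(clusters)
print(subject_words(topic_word))
```

A production system would more likely rely on an established LDA implementation and a Chinese word segmenter; the sampler here only shows how per-topic word counts give rise to subject words.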

[0043] In the power industry news text, the words that have the same or similar semantics as the text subject words are counted together with their frequencies, and they are merged with the text subject words to obtain the high-frequency word set under the topic corresponding to the power industry news text ...
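A minimal sketch of this counting step follows. The excerpt does not say how semantic similarity is decided, so the sketch assumes a hypothetical lookup table `similar_to` that maps a word to the subject word it resembles; the function name and the sample data are likewise illustrative.

```python
from collections import Counter


def high_frequency_words(tokens, topic_words, similar_to):
    """Count subject words and their similar words in a token list.

    similar_to is an assumed lookup table: word -> subject word it resembles.
    Returns a dict mapping each retained word to its frequency.
    """
    topic_words = set(topic_words)
    related = set(topic_words)
    for word, target in similar_to.items():
        if target in topic_words:
            related.add(word)

    counts = Counter(t for t in tokens if t in related)
    # Subject words stay in the merged set even if they never occur.
    for w in topic_words:
        counts.setdefault(w, 0)
    return dict(counts)


tokens = ["outage", "blackout", "grid", "tariff", "outage", "weather"]
similar_to = {"blackout": "outage", "price": "tariff", "rain": "weather"}
freq = high_frequency_words(tokens, ["outage", "tariff"], similar_to)
print(freq)
```

Here "blackout" is counted because the table links it to the subject word "outage", while "weather" and "grid" are ignored because they match no subject word.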

Embodiment 2

[0167] Embodiment 2. This embodiment provides a power public opinion summary extraction and optimization system based on topic clustering.

[0168] The system includes:

[0169] An acquisition module, configured to acquire the power industry news text from which a summary is to be extracted.

[0170] A clustering module, configured to cluster the power industry news text in units of sentences, and to apply Latent Dirichlet Allocation to the clustering results to extract the subject words of the power text.

[0171] A statistics module, configured to count, in the power industry news text, the words that have the same or similar semantics as the text subject words together with their frequencies, and to merge them with the text subject words to obtain, for the power industry news text, the high-frequency words under the topi...

Embodiment 3

[0174] Embodiment 3. This embodiment provides an electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor. When the processor executes the computer instructions, the steps of the method described in Embodiment 1 are carried out.

Abstract

The invention discloses a method and system for extracting and optimizing power public opinion abstracts based on topic clustering. The method comprises the following steps: obtaining the power industry news text from which an abstract is to be extracted; clustering that text in units of sentences; extracting subject terms from the clustering result using Latent Dirichlet Allocation (LDA) to obtain the subject terms of the power text; counting, in the power industry news text, the words that have the same or similar semantics as the text subject terms together with their frequencies, and merging them with the text subject terms to obtain a high-frequency word set under the topic corresponding to the power industry news text; constructing a text network graph for the power industry news text; extracting a candidate abstract sentence group based on the text network graph and the high-frequency word set; eliminating redundancy in the candidate abstract sentence group to obtain a preliminary abstract; and optimizing the preliminary abstract to obtain and output the final abstract.
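The graph-based extraction and redundancy elimination steps can be sketched as follows. The abstract does not specify how the text network graph is weighted or how redundancy is measured, so this sketch substitutes a TextRank-style ranking over Jaccard word overlap, a simple boost for sentences that contain high-frequency words, and a greedy overlap filter. All function names, the damping factor, and the thresholds are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard overlap between two token lists."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_sentences(sentences, high_freq, damping=0.85, n_iter=50):
    """Score sentences on a similarity graph, boosted by high-frequency words."""
    n = len(sentences)
    # Edge weights of the sentence graph (no self-loops).
    sim = [
        [jaccard(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(n_iter):
        scores = [
            (1 - damping) / n
            + damping
            * sum(sim[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    boosted = []
    for sent, score in zip(sentences, scores):
        words = set(sent)
        coverage = sum(1 for w in words if w in high_freq) / max(len(words), 1)
        boosted.append(score * (1 + coverage))
    return boosted


def select_summary(sentences, scores, k=2, max_overlap=0.5):
    """Greedily keep top-scoring sentences that do not overlap too much."""
    order = sorted(range(len(sentences)), key=lambda i: -scores[i])
    chosen = []
    for i in order:
        if all(jaccard(sentences[i], sentences[j]) <= max_overlap for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return sorted(chosen)  # restore original sentence order


# Hypothetical tokenised sentences; the first two are near-duplicates.
sentences = [
    ["grid", "outage", "hits", "city"],
    ["grid", "outage", "hits", "city", "today"],
    ["tariff", "reform", "announced"],
    ["weather", "fine"],
]
scores = rank_sentences(sentences, high_freq={"outage", "tariff"})
summary = select_summary(sentences, scores)
print(summary)
```

Because the first two sentences overlap heavily, at most one of them survives the filter, which mirrors the role of redundancy elimination in producing the preliminary abstract.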

Description

Technical field

[0001] The present disclosure relates to the technical field of public opinion summary extraction, and in particular to a method and system for extracting and optimizing electric power public opinion summaries based on topic clustering.

Background

[0002] The statements in this section merely describe background art related to the present disclosure and do not necessarily constitute prior art.

[0003] The power industry bears on the national economy and people's livelihoods, and power-related events often receive extensive attention from the general public and the media. Automatic summary extraction for electric power public opinion text starts from text crawled from web pages and uses data processing technology to extract or generate a content summary that describes the core information of the article. The application of this technology can enable power companies to timely control powe...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F16/33, G06F16/35, G06F16/34, G06F16/9535
CPC: G06F16/3344, G06F16/35, G06F16/345, G06F16/9535
Inventors: 史玉良, 张晖, 管永明, 吕梁, 胥鹏飞, 刘智勇, 李娜娜
Owner: DAREWAY SOFTWARE