Semantic frame-based power grid defect text mining method

A semantic frame and text mining technology, applied in semantic analysis, digital data processing, natural language data processing and related fields. It addresses the heavy, time-consuming and labor-intensive workload of manual defect classification and statistics, and the difficulty of verifying the correctness of that work. It is easy to apply and achieves high statistical accuracy.

Status: Inactive
Publication Date: 2016-09-21
ZHEJIANG UNIV
Cites: 3; Cited by: 8

AI Technical Summary

Problems solved by technology

Power grid companies must classify, analyze and compile statistics on equipment defects every year, and this work often relies on manual effort. The workload is heavy, time-consuming and labor-intensive, and because of subjective factors and differences in experience, the correctness of the classification and statistics is difficult to verify.



Embodiment Construction

[0022] The specific implementation steps of the present invention are further described below in conjunction with examples:

[0023] Step 1: Word segmentation. The defect text is segmented into words using a Hidden Markov Model (HMM).
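As an illustration of HMM-based segmentation in general, the sketch below tags each character as B (begin), M (middle), E (end) or S (single-character word) and decodes the best tag path with the Viterbi algorithm. The source states only that an HMM is used; the BMES tag set, the probability values, and the Latin-letter toy input standing in for the characters of a defect text are all assumptions made for this example.

```python
import math

STATES = "BMES"  # Begin / Middle / End of a multi-character word, Single-character word
NEG_INF = float("-inf")
UNSEEN = math.log(1e-6)  # assumed smoothing value for unseen (state, character) pairs

# Hypothetical toy parameters; real values would be estimated from a segmented corpus.
START = {"B": math.log(0.6), "S": math.log(0.4), "M": NEG_INF, "E": NEG_INF}
TRANS = {
    "B": {"M": math.log(0.3), "E": math.log(0.7)},
    "M": {"M": math.log(0.3), "E": math.log(0.7)},
    "E": {"B": math.log(0.6), "S": math.log(0.4)},
    "S": {"B": math.log(0.6), "S": math.log(0.4)},
}
EMIT = {
    "B": {"o": math.log(0.5), "l": math.log(0.5)},
    "M": {"i": math.log(0.34), "e": math.log(0.33), "a": math.log(0.33)},
    "E": {"l": math.log(0.5), "k": math.log(0.5)},
    "S": {},
}


def emit(state, ch):
    """Log-probability of a state emitting a character, with a floor for unseen pairs."""
    return EMIT[state].get(ch, UNSEEN)


def viterbi(text):
    """Return the most likely BMES tag string for text."""
    if not text:
        return ""
    # score[t][s] is the best log-probability of any tag path ending in state s at position t.
    score = [{s: START[s] + emit(s, text[0]) for s in STATES}]
    back = [{}]
    for ch in text[1:]:
        row, ptr = {}, {}
        for s in STATES:
            best_prev, best = None, NEG_INF
            for p in STATES:
                cand = score[-1][p] + TRANS[p].get(s, NEG_INF)
                if cand > best:
                    best_prev, best = p, cand
            row[s] = best + emit(s, ch)
            ptr[s] = best_prev
        score.append(row)
        back.append(ptr)
    # The last character must close a word, so only E or S may end the path.
    tags = [max("ES", key=lambda s: score[-1][s])]
    for ptr in reversed(back[1:]):
        tags.append(ptr[tags[-1]])
    return "".join(reversed(tags))


def segment(text):
    """Cut text after every character tagged E or S."""
    words, current = [], ""
    for ch, tag in zip(text, viterbi(text)):
        current += ch
        if tag in "ES":
            words.append(current)
            current = ""
    return words


print(segment("oilleak"))
```

With these toy parameters the input is split into two words. In practice the start, transition and emission probabilities would be learned from segmented text rather than written by hand.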

[0024] Step 2: Word frequency feature extraction. Compute word frequency statistics on the segmentation results, sort the words from high frequency to low frequency, and remove stop words such as symbols, personal names, and place names.
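A minimal sketch of the word frequency step, assuming the segmentation results are available as lists of words. The sample records and the stop-word set are invented for illustration and do not come from the patent.

```python
from collections import Counter


def word_frequencies(segmented_texts, stop_words):
    """Count words across all records, skip stop words, and sort from high to low frequency."""
    counts = Counter(
        word
        for words in segmented_texts
        for word in words
        if word not in stop_words
    )
    return counts.most_common()


# Hypothetical segmented defect records.
records = [
    ["transformer", "oil", "leak", ","],
    ["oil", "leak", "Hangzhou"],
    ["oil", "."],
]
# Hypothetical stop words: symbols and a place name.
stop_words = {",", ".", "Hangzhou"}

for word, count in word_frequencies(records, stop_words):
    print(word, count)
```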

[0025] Step 3: Co-occurrence feature extraction. The four slots Pb, Ps, A, and C rarely all appear together; most semantic frames in defect texts have missing slots. The non-core slots Pb and C are often missing, whereas the core slots Ps and A are always present (except in a few special cases).
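One way such co-occurrence statistics could be gathered is sketched below, assuming each defect text has already been annotated as a mapping from slot label to filled value. The slot labels are treated as opaque names, and the annotated frames and their placeholder values are invented for illustration.

```python
from collections import Counter

SLOTS = ("Pb", "Ps", "A", "C")


def cooccurrence_stats(frames):
    """Return (counts of each filled-slot combination, presence rate of each slot)."""
    patterns = Counter()
    presence = Counter()
    for frame in frames:
        filled = tuple(slot for slot in SLOTS if frame.get(slot))
        patterns[filled] += 1
        presence.update(filled)
    total = len(frames) or 1  # avoid division by zero on empty input
    rates = {slot: presence[slot] / total for slot in SLOTS}
    return patterns, rates


# Hypothetical annotated frames; the values are placeholders.
frames = [
    {"Ps": "x1", "A": "y1"},
    {"Pb": "w1", "Ps": "x2", "A": "y2"},
    {"Ps": "x3", "A": "y3", "C": "z1"},
    {"Ps": "x4", "A": "y4"},
]

patterns, rates = cooccurrence_stats(frames)
print(patterns.most_common())
print(rates)
```

On this toy data the core slots Ps and A have a presence rate of 1.0, while Pb and C each appear in a quarter of the frames, which mirrors the kind of regularity the step describes.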

[0026] Step 4: Lexeme (word position) feature extraction. The order in which the four slots appear is highly regular; the most typical sequences are Pb-Ps-A-C and Pb-Ps-C-A.
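The ordering regularity can be measured with a short sketch like the following, assuming each defect text has been reduced to the sequence of slot labels in the order they occur. The input sequences are invented, and the pairwise precedence count is an extra illustration rather than something the source describes.

```python
from collections import Counter
from itertools import combinations


def order_stats(label_sequences):
    """Return (counts of full slot orderings, counts of 'X appears before Y' pairs)."""
    patterns = Counter("-".join(seq) for seq in label_sequences)
    precedes = Counter()
    for seq in label_sequences:
        # combinations() yields pairs in positional order, so first occurs before second.
        for first, second in combinations(seq, 2):
            precedes[(first, second)] += 1
    return patterns, precedes


# Hypothetical slot-label sequences, one per defect text.
sequences = [
    ["Pb", "Ps", "A", "C"],
    ["Pb", "Ps", "C", "A"],
    ["Ps", "A"],
    ["Pb", "Ps", "A", "C"],
]

patterns, precedes = order_stats(sequences)
print(patterns.most_common())
print(precedes[("Ps", "A")], precedes[("A", "C")], precedes[("C", "A")])
```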

[0027] Step 5: Build ontolog...


Abstract

The invention relates to a semantic frame-based power grid defect text mining method. Characteristic information in power grid defect texts is mined to meet a variety of automatic defect statistics requirements. Existing statistical methods rely on manual labor, which is time-consuming and laborious, and their accuracy is strongly affected by subjective judgment. The method comprises four steps: first, extracting syntactic structure knowledge from the defect text and constructing an ontology dictionary; second, filling semantic slots with keywords from the defect text using a semantic slot filling method; third, integrating the unordered slots into a semantic frame using a semantic association algorithm; and finally, combining word strings to simplify the semantic frame. Once the semantic frame is constructed, automatic defect statistics for different requirements can be produced conveniently. The method achieves high accuracy in defect text statistics and is convenient to apply.
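The slot filling and statistics steps can be pictured with the rough sketch below. It assumes the ontology dictionary is a plain mapping from a word to a slot label and that slot filling is a dictionary lookup over the segmented words. The dictionary entries, including which word goes into which slot, are invented for illustration, since this excerpt does not define what Pb, Ps, A and C denote. The semantic association algorithm and the word string combination step are not reproduced here.

```python
from collections import Counter

# Hypothetical ontology dictionary: word -> slot label (assignments are invented).
ONTOLOGY = {
    "transformer": "Pb",
    "breaker": "Pb",
    "bushing": "Ps",
    "contact": "Ps",
    "leaking": "A",
    "overheating": "A",
    "severe": "C",
}


def fill_slots(words, ontology):
    """Place every word found in the ontology dictionary into its slot."""
    frame = {}
    for word in words:
        slot = ontology.get(word)
        if slot is not None:
            frame.setdefault(slot, []).append(word)
    return frame


def count_by_slot(frames, slot):
    """Count frames by the content of one slot, skipping frames where it is missing."""
    return Counter(" ".join(frame[slot]) for frame in frames if slot in frame)


# Hypothetical segmented defect records.
records = [
    ["transformer", "bushing", "is", "leaking"],
    ["breaker", "contact", "overheating", "severe"],
    ["bushing", "leaking"],
]

frames = [fill_slots(words, ONTOLOGY) for words in records]
print(frames[0])
print(count_by_slot(frames, "A"))
```

Once every record is reduced to a frame, a statistic for a particular requirement becomes a simple count over one slot or a combination of slots.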

Description

Technical field

[0001] The invention belongs to the technical field of power systems, and in particular relates to a text mining method for power grid defects based on a semantic frame.

Background technique

[0002] In the course of equipment operation and maintenance management, power grid enterprises record information such as equipment failures, defects, maintenance, and defect elimination in Chinese. This information is stored in the information management system as text. It not only reflects the health history of individual pieces of power equipment, but also contains rich reliability information about similar equipment. Turning Chinese text into reliability information that is easy to use requires sophisticated information mining techniques and processes. At present, these information mining problems have not been fully resolved. The classification, analysis and statistics of equipment...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
CPC: G06F40/216, G06F40/30
Inventor: 曹靖, 陈陆燊, 邱剑, 王慧芳
Owner: ZHEJIANG UNIV