Power grid equipment word segmentation dictionary and fault case library construction method
A power grid equipment word segmentation dictionary technology, applied in the fields of neural learning methods, neural architectures, and semantic tool creation. It addresses problems such as low retrieval and browsing efficiency, insufficient support for maintenance decision-making, and insufficient mining of related information. Its effects are to facilitate intuitive understanding, improve application value, and improve the accuracy of word segmentation.
Examples
Embodiment 1
[0054] Case texts describing power grid equipment faults and defects contain many technical terms that are usually missing from the dictionaries of general-purpose word segmentation tools. If such a tool is used to segment power grid text, many technical terms will be segmented incorrectly, which undermines the reliability of subsequent word vector training and text classification. Before word segmentation, it is therefore important to extend the general-domain dictionary of a mature word segmentation tool with domain-specific words, building a power grid word segmentation dictionary that improves the accuracy of the subsequent steps.
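The effect of extending the dictionary can be illustrated with a minimal sketch. It uses forward maximum matching, a simple dictionary-based segmenter chosen here only for illustration; the patent does not say which segmentation tool is used. The vocabularies `GENERAL_DICT` and `GRID_TERMS` and the sample sentence are hypothetical.

```python
def forward_max_match(text, dictionary, max_len=8):
    """Segment text by greedily taking the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical general-purpose vocabulary and power grid additions.
GENERAL_DICT = {"变压器", "油", "色谱", "分析"}
GRID_TERMS = {"主变压器", "油色谱分析"}

sentence = "主变压器油色谱分析"  # "main transformer oil chromatography analysis"
general_tokens = forward_max_match(sentence, GENERAL_DICT)
domain_tokens = forward_max_match(sentence, GENERAL_DICT | GRID_TERMS)

print(general_tokens)  # terms are broken into fragments
print(domain_tokens)   # terms are kept whole
```

With only the general vocabulary, the two technical terms are split into five fragments; with the domain terms added, they survive as two tokens, which is the behavior the paragraph above argues for.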
[0055] A semi-supervised method that combines automatic labeling by a named entity recognition model with manual screening is used to construct the power grid domain dictionary. The process is shown in Figure 2. Solving the professional compliance of the identified...
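The combination of automatic labeling and manual screening might be sketched as follows. This is an illustration under assumptions, not the patent's procedure: the recognizer output is stubbed as `(span, label)` pairs, the label names `EQUIP` and `DEFECT` are hypothetical, and the reviewer is modeled as a callback.

```python
from collections import Counter


def propose_candidates(tagged_sentences, base_dict):
    """Collect recognized entity spans missing from the base dictionary, most frequent first."""
    counts = Counter(
        term
        for sentence in tagged_sentences
        for term, label in sentence
        if label != "O" and term not in base_dict
    )
    return [term for term, _ in counts.most_common()]


def screen(candidates, accept):
    """Keep only the candidates approved by the manual reviewer callback."""
    return {term for term in candidates if accept(term)}


# Hypothetical recognizer output: (span, label) pairs per sentence; "O" means no entity.
tagged = [
    [("主变压器", "EQUIP"), ("发生", "O"), ("渗漏油", "DEFECT")],
    [("主变压器", "EQUIP"), ("渗漏油", "DEFECT"), ("严重", "O")],
    [("变压器", "EQUIP"), ("某某站", "EQUIP")],
]
base_dict = {"变压器", "发生", "严重"}

candidates = propose_candidates(tagged, base_dict)
# The reviewer rejects a span that is not a genuine technical term.
approved = screen(candidates, accept=lambda term: term != "某某站")
grid_dict = base_dict | approved
```

Sorting candidates by frequency is a convenience for the reviewer and is an assumption of this sketch; the source does not describe how candidates are ordered or filtered.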
Embodiment 2
[0066] Further, in step b), the fault and defect cases handled in practice are usually written by hand and stored as rich-text files (PDF, Word, and similar formats) that contain tables, pictures, text, and labels. Before the text information is extracted, the stored pictures, file names, and author information are extracted, and noise such as labels and typos is filtered out. The extracted and filtered text is then imported into the word segmentation tool loaded with the power grid word segmentation dictionary described above for precise word segmentation, which completes the text preprocessing.
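A minimal sketch of this preprocessing step is given below. It assumes that "labels" refers to markup tags left over from rich-text export and that typos are corrected from a curated table; the record layout, the field names, and the `TYPO_FIXES` entries are hypothetical.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")  # markup tags left over from rich-text export
SPACE_RE = re.compile(r"\s+")

# Hypothetical typo table; a real one would be curated from the case corpus.
TYPO_FIXES = {"变压气": "变压器", "短路器": "断路器"}


def clean_case_text(raw):
    """Strip markup tags, apply known typo fixes, and normalize whitespace."""
    text = TAG_RE.sub("", raw)
    for wrong, right in TYPO_FIXES.items():
        text = text.replace(wrong, right)
    return SPACE_RE.sub(" ", text).strip()


def split_metadata(record):
    """Separate file name, author, and pictures from the cleaned body text."""
    meta = {key: record.get(key) for key in ("file_name", "author", "pictures")}
    return meta, clean_case_text(record.get("body", ""))


record = {
    "file_name": "case_001.docx",
    "author": "Zhang",
    "pictures": ["fig1.png"],
    "body": "<p>1号主变压气  油温异常</p>\n<b>短路器</b>拒动",
}
meta, body = split_metadata(record)
print(body)
```

The cleaned `body` is what would then be passed to the segmenter that uses the power grid dictionary.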
Embodiment 3
[0068] Further, step c) includes the following steps:
[0069] c-1) The purpose of extracting information from power grid equipment fault text is to obtain meaningful information about equipment faults and defects by analyzing and processing unstructured text data, and to turn it into structured data that supports accurate retrieval of the content later. Because fault descriptions vary widely, a unified attribute template is used for attribute extraction. The attribute types are divided into digital state quantity attributes, phrase state quantity attributes, and sentence state quantity attributes. Digital state quantity attributes are to be extracted with a rule-based method, phrase state quantity attributes are to be extracted with an entity matching method based on grammatical rules, and sentence state quantity attributes are to be classified using a distr...
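The first two attribute types can be sketched as below. This is a simplified illustration under assumptions: the unit list, the attribute names, and the phrase lexicon are hypothetical, and plain lexicon lookup stands in for the grammar-rule-based entity matching that the text names. Sentence-type attributes are omitted because the source description is truncated.

```python
import re

# Rule for digital state quantities: a number followed by a known unit.
NUMERIC_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(kV|MVA|°C|%)")

# Hypothetical phrase lexicon: attribute name -> known phrase values.
PHRASE_LEXICON = {
    "equipment_type": ["transformer", "circuit breaker", "disconnector"],
    "defect_type": ["oil leakage", "overheating", "partial discharge"],
}


def extract_numeric(text):
    """Return (value, unit) pairs found by the numeric rule."""
    return [(float(value), unit) for value, unit in NUMERIC_RE.findall(text)]


def extract_phrases(text):
    """Return, per attribute, the lexicon phrases that occur in the text."""
    lowered = text.lower()
    return {
        attr: [phrase for phrase in phrases if phrase in lowered]
        for attr, phrases in PHRASE_LEXICON.items()
    }


case = ("The 110 kV transformer showed oil leakage; "
        "top-oil temperature reached 85.5°C at 60% load.")
structured = {"numeric": extract_numeric(case), **extract_phrases(case)}
print(structured)
```

Filling one fixed template per case, as `structured` does here, is what makes the extracted records comparable and searchable across differently worded fault descriptions.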