Corpus processing method and device, storage medium and processor

A corpus processing method and device, storage medium, and processor, applied in the field of corpus processing. The technology addresses the low efficiency of the term acquisition process in the prior art and improves the efficiency of term extraction.

Status: Pending
Publication Date: 2021-04-09
STATE GRID BEIJING ELECTRIC POWER +2

AI Technical Summary

Problems solved by technology

[0006] Embodiments of the present invention provide a corpus processing method, device, storage medium, and processor to at least solve the technical problem of low efficiency in the process of acquiring terms in the prior art.


Examples

Detailed Description of the Embodiments

[0022] To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in those embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, shall fall within the protection scope of the present invention.

[0023] It should be noted that the terms "first" and "second" in the description and claims of the present invention and in the above drawings are used to distinguish similar objects, and not necessarily to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate ...

Abstract

The invention discloses a corpus processing method and device, a storage medium and a processor. The method comprises: acquiring words and sentences to be recognized; processing the words and sentences with a new word discovery model to recognize at least one candidate corpus, the new word discovery model being a corpus model obtained by training a deep learning model; and determining a target corpus from the at least one candidate corpus, the target corpus being new vocabulary identified from the words and sentences. The invention solves the technical problem of the low efficiency of the term acquisition process in the prior art.
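The three steps named in the abstract (acquire text, recognize candidates, determine the target corpus) can be sketched as a small pipeline. This is only an illustrative sketch under stated assumptions: the patent's trained deep learning model is not described here, so `ngram_candidates` is a hypothetical frequency-based stand-in for it, and `select_targets` assumes that a target is simply a candidate missing from a known vocabulary, which the source does not specify.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set


def ngram_candidates(sentences: Iterable[str], min_len: int = 2,
                     max_len: int = 4, min_count: int = 2) -> List[str]:
    """Hypothetical stand-in for the trained new word discovery model.

    Returns every character n-gram that occurs at least `min_count` times.
    A real system would call the trained deep learning model here instead.
    """
    counts: Counter = Counter()
    for sentence in sentences:
        for n in range(min_len, max_len + 1):
            for i in range(len(sentence) - n + 1):
                counts[sentence[i:i + n]] += 1
    return [gram for gram, count in counts.items() if count >= min_count]


def select_targets(candidates: Iterable[str], known_vocab: Set[str]) -> List[str]:
    """Assumed selection rule: keep candidates absent from the known vocabulary."""
    return sorted(c for c in set(candidates) if c not in known_vocab)


def process_corpus(sentences: Iterable[str],
                   recognize: Callable[[Iterable[str]], List[str]],
                   known_vocab: Set[str]) -> List[str]:
    """Step 1: `sentences` are the acquired words and sentences.
    Step 2: `recognize` yields at least one candidate corpus.
    Step 3: the target corpus (new vocabulary) is chosen from the candidates.
    """
    candidates = recognize(sentences)
    return select_targets(candidates, known_vocab)
```

Passing the recognizer in as a parameter keeps the pipeline independent of the model, so the stand-in can be swapped for a trained model without changing the surrounding steps.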

Description

Technical field

[0001] The present invention relates to the field of charging, and in particular to a corpus processing method, device, storage medium and processor.

Background art

[0002] A term is the designation of a general concept in a particular field of expertise. In the vertical field of electric power, unregistered words are a major problem when performing word segmentation on an unprocessed raw corpus. Unregistered (out-of-vocabulary) words are words that are not included in the word segmentation vocabulary but must still be segmented out, including various proper nouns (person names, place names, business names, etc.), abbreviations, and new vocabulary. Most unregistered words are technical terms in the field of electric power, so term discovery is an urgent problem to be solved. Term discovery directly affects the quality of the corpus. A term must first appear as a complete language unit, and it must have the characteristics of frequent appearance, cl...
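To make the unregistered-word problem concrete, the sketch below segments text with forward maximum matching against a fixed vocabulary. This is a generic dictionary-based technique chosen for illustration, not a method described in the patent; the function names and the example term 充电桩 ("charging pile") are hypothetical.

```python
from typing import List, Set


def forward_max_match(text: str, vocab: Set[str], max_len: int = 4) -> List[str]:
    """Greedy dictionary segmentation: at each position take the longest
    vocabulary entry, falling back to a single character if none matches."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in vocab:
                tokens.append(piece)
                i += n
                break
    return tokens


def unregistered_pieces(tokens: List[str], vocab: Set[str]) -> List[str]:
    """Tokens the segmenter emitted only because nothing in the vocabulary matched."""
    return [t for t in tokens if t not in vocab]


vocab = {"检查", "设备"}
tokens = forward_max_match("检查充电桩设备", vocab)
# The unregistered term is broken into single characters:
# tokens == ["检查", "充", "电", "桩", "设备"]
```

Once the term is added to the vocabulary, the same call returns it as one unit, which is why discovering such terms first improves the quality of the segmented corpus.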

Application Information

IPC(8): G06F40/289, G06N3/04, G06N3/08
CPC: G06F40/289, G06N3/08, G06N3/045
Inventor: 尚颖, 张晔, 马薇, 黄松, 徐光兵, 李彦龙, 梁卫泉, 丁勇, 王端瑞, 侯本忠, 张永强, 闫丽飞
Owner: STATE GRID BEIJING ELECTRIC POWER