Chinese named entity recognition method and device based on vocabulary enhancement and multiple features

A named entity recognition technique with vocabulary enhancement, applied in the field of information extraction. It addresses the low recognition accuracy and recall of existing methods, improving both while avoiding recognition errors and characterizing characters more fully.

Pending Publication Date: 2022-02-08
HUAZHONG UNIV OF SCI & TECH +1

AI Technical Summary

Problems solved by technology

[0005] In view of the above defects or improvement needs of the prior art, the present invention proposes a Chinese named entity recognition method and device based on vocabulary enh



Examples


Embodiment Construction

[0026] To make the object, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here are only used to explain the present invention, not to limit it. The features, operations, or characteristics described in the specification can be combined in any appropriate manner to form various embodiments. Likewise, the steps or actions in the method description can be reordered or adjusted in a manner obvious to those skilled in the art. The sequences shown in the specification and drawings therefore serve only to describe a particular embodiment clearly and do not imply a required order, unless it is explicitly stated that a certain sequence must be followed. In addition, the technical features involved in the various embodiments of the prese...


Abstract

The invention discloses a Chinese named entity recognition method and device based on vocabulary enhancement and multiple features, belonging to the technical field of information extraction. The method comprises the following steps: extracting character features of an input sequence by combining a bidirectional long short-term memory network with a convolutional neural network; introducing the vocabulary information corresponding to each character through string pattern matching, and extracting vocabulary features as a word-frequency-weighted average; extracting pre-training features with a pre-trained model; using a gating mechanism to control how the vocabulary features enhance the character features; linearly concatenating the vocabulary-enhanced character features with the pre-training features to construct multiple features; obtaining context features from the contextual correlation of the multiple features; and combining label decoding with the context features to predict the optimal label sequence for the input sequence. As a result, the character features of the Chinese sequence are extracted more fully, the extracted vocabulary features are richer and are not affected by Chinese word segmentation errors, and the multi-feature combination strategy improves the entity recognition metrics.
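The vocabulary-enhancement steps named in the abstract can be illustrated with a minimal sketch. This is not the patent's implementation: the function names, the `max_len` window, the restriction to words of at least two characters, the toy lexicon and embeddings, and in particular the parameter-free sigmoid gate are all assumptions made for illustration. The source only states that words are matched by string matching, averaged with word-frequency weights, and applied to character features through a gating mechanism.

```python
import math
from typing import Dict, List


def match_words(sentence: str, lexicon: Dict[str, int], max_len: int = 4) -> List[List[str]]:
    """For each character position, list the lexicon words that cover it.

    Assumption: only words of 2..max_len characters are considered.
    """
    matched: List[List[str]] = [[] for _ in sentence]
    for start in range(len(sentence)):
        last_end = min(start + max_len, len(sentence))
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                for pos in range(start, end):
                    matched[pos].append(word)
    return matched


def weighted_word_feature(words: List[str], lexicon: Dict[str, int],
                          embeddings: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the word embeddings, weighting each word by its frequency."""
    total = sum(lexicon[w] for w in words)
    if total == 0:
        return [0.0] * dim  # no matched words: zero vocabulary feature
    feature = [0.0] * dim
    for w in words:
        weight = lexicon[w] / total
        for i in range(dim):
            feature[i] += weight * embeddings[w][i]
    return feature


def gated_enhance(char_vec: List[float], word_vec: List[float]) -> List[float]:
    """Add the vocabulary feature to the character feature through a gate.

    Assumption: an elementwise sigmoid gate without learned weights; a real
    model would learn the gate parameters during training.
    """
    enhanced = []
    for c, w in zip(char_vec, word_vec):
        gate = 1.0 / (1.0 + math.exp(-(c + w)))
        enhanced.append(c + gate * w)
    return enhanced


if __name__ == "__main__":
    # Hypothetical lexicon mapping each word to its corpus frequency.
    lexicon = {"南京": 5, "南京市": 3, "市长": 2, "长江": 4, "长江大桥": 1, "大桥": 3}
    # Toy 2-dimensional embeddings, purely for demonstration.
    embeddings = {w: [float(len(w)), 1.0] for w in lexicon}
    sentence = "南京市长江大桥"
    for ch, words in zip(sentence, match_words(sentence, lexicon)):
        word_vec = weighted_word_feature(words, lexicon, embeddings, 2)
        print(ch, words, gated_enhance([0.0, 0.0], word_vec))
```

Because every lexicon word covering a character contributes to that character's feature, no single segmentation of the sentence has to be chosen up front, which is how a matching-based approach can sidestep word segmentation errors.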

Description

Technical field

[0001] The invention belongs to the technical field of information extraction, and more specifically relates to a method and device for recognizing Chinese named entities based on vocabulary enhancement and multiple features.

Background

[0002] From a research perspective, named entity recognition methods fall into two categories. The first is based on traditional methods, mainly dictionary- and template-based methods, unsupervised learning methods, and feature-based supervised learning methods. The second is based on deep learning; according to the form of the input, this category can be further divided into character-level and word-level methods.

[0003] The earliest research methods relied on constructing lexical information and template rules. In most of these methods, industry experts manually constructed special dictionaries or templates based on the characteristics of the data set for matching and recognition. Generally speaking...


Application Information

IPC(8): G06F40/216, G06F40/295, G06F40/30, G06F16/33, G06N3/04, G06N3/08
CPC: G06F40/295, G06F40/30, G06F40/216, G06F16/3344, G06N3/08, G06N3/044, G06N3/045
Inventor: 袁凌, 徐志鹏, 李国徽, 胡记伟, 胡小飞
Owner HUAZHONG UNIV OF SCI & TECH