
Chinese name recognition method based on recurrent neural network

A recognition method based on a recurrent neural network, applied in the fields of natural language processing, deep learning, and named entity recognition. It addresses the limited breadth of existing Chinese name recognition, and it increases generalization ability, broadens the range of recognized names, and reduces complexity.

Status: Inactive; Publication Date: 2016-08-17
DALIAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

Current Chinese name recognition systems mainly recognize Chinese names and rarely cover Japanese names, transliterated foreign names, or transliterated names of ethnic minorities. The breadth of Chinese name recognition urgently needs to be improved.



Examples


Detailed Description of the Embodiments

[0035] Specific embodiments of the present invention are further described below with reference to the accompanying drawings and the technical solution.

[0036] Figure 1 shows the preprocessing for the Chinese name recognition model, the word vector training, and the training process of the Chinese name recognition model.

[0037] Figure 2 shows the post-processing procedure. The present invention is described in detail below with reference to Figure 1.

[0038] The invention is illustrated in detail below with a specific example, using the 1998 "People's Daily" corpus as the data set.

[0039] Step 1, data preprocessing of the 1998 "People's Daily" corpus. The specific sub-steps are as follows:

[0040] Process the corpus with the nihao word segmentation tool to obtain a word dictionary. Then use the word dictionary to convert each segmented word into a number and assign it a classification label. Finally, each word has a number and ...
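The dictionary-building and digitizing sub-step described in [0040] can be sketched roughly as follows. This is only an illustration under stated assumptions: the input is taken to be already segmented (the segmentation tool itself is not called), the `<UNK>` entry, the function names `build_word_dictionary` and `digitize`, and the two-label tag set `PER` / `O` are all hypothetical, since the excerpt is truncated before it specifies the label scheme.

```python
from collections import Counter

def build_word_dictionary(segmented_sentences, min_count=1):
    """Map each word to an integer ID; ID 0 is reserved for unknown words (assumption)."""
    counts = Counter(word for sentence in segmented_sentences for word in sentence)
    vocab = {"<UNK>": 0}
    # More frequent words receive smaller IDs.
    for word, count in counts.most_common():
        if count >= min_count:
            vocab[word] = len(vocab)
    return vocab

def digitize(sentence, labels, vocab):
    """Pair each word's ID with its classification label."""
    unk_id = vocab["<UNK>"]
    return [(vocab.get(word, unk_id), label) for word, label in zip(sentence, labels)]

# Toy segmented corpus standing in for the 1998 "People's Daily" data.
corpus = [
    ["江泽民", "主席", "访问", "日本"],
    ["主席", "发表", "讲话"],
]
vocab = build_word_dictionary(corpus)

# Hypothetical labels: PER marks a person name, O marks everything else.
encoded = digitize(["江泽民", "主席", "出席", "会议"], ["PER", "O", "O", "O"], vocab)
print(encoded)
```

Words that never appeared in the dictionary, such as "出席" in the usage example, fall back to the reserved unknown ID rather than raising an error.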



Abstract

The invention provides a Chinese name recognition method based on a recurrent neural network. The method includes the following steps: S1, corpus preprocessing; S2, word vector training, in which the word2vec tool is used to train word vectors; S3, Chinese name recognition model training, in which the data processed in S1 and the word vectors trained in S2 are used to train a neural network model; S4, name recognition and post-processing, in which the model trained in S3 performs name recognition on the test corpus, the names recognized by the model are post-processed with a context rule and a diffusion algorithm, and the final names are obtained. The method effectively lowers the complexity of feature selection in Chinese name recognition and, through the word vectors, makes full use of the rich syntactic and grammatical information contained in Chinese text, which improves the generalization ability of the model. Japanese names and transliterated foreign names are recognized at the same time, which widens the scope of Chinese name recognition.
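To make step S3 more concrete, the following is a minimal sketch of the forward pass of a recurrent tagger over a sequence of word vectors. It assumes a simple Elman-style recurrence (h_t = tanh(W_xh x_t + W_hh h_{t-1}), y_t = softmax(W_hy h_t)); the excerpt does not state the patent's actual architecture, dimensions, or training procedure. The class name `ElmanTagger`, the random untrained weights, and the toy vectors standing in for word2vec embeddings are all hypothetical.

```python
import math
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

class ElmanTagger:
    """Untrained Elman-style RNN that emits label probabilities per token."""

    def __init__(self, embed_dim, hidden_dim, num_labels, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.w_xh = random_matrix(hidden_dim, embed_dim, rng)   # input -> hidden
        self.w_hh = random_matrix(hidden_dim, hidden_dim, rng)  # hidden -> hidden
        self.w_hy = random_matrix(num_labels, hidden_dim, rng)  # hidden -> labels

    def forward(self, word_vectors):
        hidden = [0.0] * self.hidden_dim
        outputs = []
        for x in word_vectors:
            # The new hidden state depends on the current word and the previous state.
            pre = [a + b for a, b in zip(matvec(self.w_xh, x), matvec(self.w_hh, hidden))]
            hidden = [math.tanh(p) for p in pre]
            outputs.append(softmax(matvec(self.w_hy, hidden)))
        return outputs

# Toy 3-dimensional vectors in place of trained word2vec embeddings.
toy_vectors = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.2, 0.2, 0.2]]
tagger = ElmanTagger(embed_dim=3, hidden_dim=4, num_labels=2)
probs = tagger.forward(toy_vectors)
print(probs)
```

Because the hidden state carries information from earlier words, each token's label distribution depends on its left context. That dependence is what lets a recurrent model use context in place of hand-selected features.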

Description

Technical field

[0001] The invention relates to natural language processing, deep learning, named entity recognition, and related fields, and in particular to a method for recognizing Chinese names, Japanese names, and transliterated foreign names in Chinese text.

Background

[0002] With the rapid development of Internet technology and the rapid growth of new information, the need to extract useful information from massive data is becoming more and more urgent. How to obtain useful information and knowledge quickly and effectively from large-scale, unstructured text has become a research hotspot in natural language processing. Compared with English and other languages, Chinese text lacks word delimiters, which makes named entity recognition more difficult. Nevertheless, named entity recognition has an important impact in areas such as information extraction, machine translation, and text classification. In the named e...


Application Information

IPC(8): G06F17/27, G06N3/02
CPC: G06F40/242, G06F40/295, G06N3/02
Inventors: 黄德根, 徐新峰
Owner: DALIAN UNIV OF TECH