Named entity identification method for Vietnamese

A named entity recognition and entity labeling technique, applicable to natural language data processing, instruments, and electrical digital data processing, that addresses the need for higher recognition accuracy.

Pending Publication Date: 2020-06-26
GUILIN UNIV OF ELECTRONIC TECH

AI Technical Summary

Problems solved by technology

The recognition accuracy of existing named entity recognition methods still needs to be improved.

Examples

Embodiment

[0023] Referring to Figure 1, a named entity recognition method for Vietnamese comprises the following steps:

[0024] 1) Model training. The model is trained as follows:

[0025] 1-1) Data input: The model used for training has a six-layer structure, connected sequentially from top to bottom: an input layer, a Bidirectional Encoder Representations from Transformers (BERT) layer, a gated recurrent unit (GRU) layer, a conditional random field (CRF) layer, a dictionary correction layer, and an output layer. The data set is a text file comprising a training set, a test set, and a verification set. The text of the training set and verification set is divided into two columns, words and labels. Entity labels follow the BIO scheme, with the types person name (PER), place name (LOC), and organization name (ORG), plus O for everything else; the label of the first word of each entity is prefixed with the letter B, and the label of each non-first word is prefixed with the letter I. The verification set text only contains...
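The two-column BIO layout described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the hyphenated label spelling (`B-PER`, `I-LOC`), the whitespace column separator, the blank-line sentence separator, the sample sentence, and the function names `read_two_column` and `extract_entities` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical sample in the two-column "word label" layout.
# The hyphen in "B-PER" and the whitespace separator are assumptions.
SAMPLE = """Nguyễn B-PER
Văn I-PER
An I-PER
sống O
ở O
Hà B-LOC
Nội I-LOC
"""

def read_two_column(text: str) -> List[List[Tuple[str, str]]]:
    """Parse word/label lines; blank lines are assumed to separate sentences."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        # The label is the last whitespace-separated field on the line.
        word, label = line.rsplit(None, 1)
        current.append((word, label))
    if current:
        sentences.append(current)
    return sentences

def extract_entities(sentence: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Group B-/I- labelled words into (entity text, entity type) pairs."""
    entities = []
    words, etype = [], None
    for word, label in sentence:
        if label.startswith("B-"):
            # A B- label always starts a new entity, closing any open one.
            if words:
                entities.append((" ".join(words), etype))
            words, etype = [word], label[2:]
        elif label.startswith("I-") and words and label[2:] == etype:
            # An I- label of the same type continues the open entity.
            words.append(word)
        else:
            # O, or an I- label with no matching open entity, ends the span.
            if words:
                entities.append((" ".join(words), etype))
            words, etype = [], None
    if words:
        entities.append((" ".join(words), etype))
    return entities

if __name__ == "__main__":
    for sent in read_two_column(SAMPLE):
        print(extract_entities(sent))
```

In this sketch an `I-` label that does not follow a matching `B-` label is treated like `O`; a real pipeline might prefer to repair such sequences instead.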

Abstract

The invention discloses a named entity recognition method for Vietnamese, comprising the following steps: 1) model training; and 2) data dictionary construction. The model training comprises 1-1) data input, 1-2) BERT layer training, 1-3) GRU layer training, and 1-4) CRF layer training. The data dictionary construction comprises 2-1) data dictionary correction and 2-2) result verification. The method achieves high accuracy in Vietnamese named entity recognition.

Description

Technical field

[0001] The invention relates to the field of computer application technology, specifically natural language processing, and in particular to a named entity recognition method for Vietnamese.

Background art

[0002] With the rapid development of Internet technology and the continued deepening of research in natural language processing, the available information resources have been greatly enriched, and people urgently need to obtain useful information from massive amounts of unstructured text. It is in this context that named entity recognition technology emerged. Named entity recognition is a basic task in natural language processing; its purpose is to identify named entities such as the names of people, places, and organizations in text. It is a task that research in natural language processing must address. Named entity recognition, as the basic work in information extraction, question answering system, machine trans...

Application Information

IPC(8): G06F40/295
Inventor: 黄永忠, 田磊, 廖显文, 吴辉文, 庄浩宇
Owner: GUILIN UNIV OF ELECTRONIC TECH