Model training method and device and named entity recognition method and device

Named entity and model training technology, applied to neural learning methods, biological neural network models, instruments, etc.

Active Publication Date: 2020-08-11
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
2 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The sparsity of training data poses a major challenge for model training.



Embodiment Construction

[0079] The solutions provided in this specification will be described below in conjunction with the accompanying drawings.

[0080] Figure 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. A word segmentation sequence containing multiple word segments is input into the first recurrent neural network, which outputs a hidden vector for each word segment. Based on each hidden vector, the probability distribution of each word segment over the categories can be determined, and from these distributions the classification result of each word segment is obtained, that is, the category label to which each word segment corresponds. Categories can be represented by labels. SOS is the start symbol of the word segmentation sequence, and EOS is the end symbol of the word segmentation sequence.
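The scenario in paragraph [0080] can be illustrated with a minimal sketch. It is not the patent's implementation: the Elman-style tanh cell, the label set `LABELS`, the dimensions, the example sentence, and the random (untrained) weights and embeddings are all assumptions chosen for illustration, so the printed labels carry no meaning. The sketch only shows the data flow: each word segment yields a hidden vector, the hidden vector is mapped to a probability distribution over categories, and the most probable category becomes the label.

```python
import math
import random

def rnn_step(x, h, W_x, W_h):
    """One Elman-style recurrent step: h' = tanh(W_x x + W_h h)."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(W_x[j], x))
                  + sum(w * hi for w, hi in zip(W_h[j], h)))
        for j in range(len(h))
    ]

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def tag_sequence(embeddings, W_x, W_h, W_out, labels):
    """Return (label, probabilities) for every word segment in order."""
    hidden = [0.0] * len(W_h)  # initial hidden vector
    results = []
    for x in embeddings:
        hidden = rnn_step(x, hidden, W_x, W_h)
        scores = [sum(w * hi for w, hi in zip(row, hidden)) for row in W_out]
        probs = softmax(scores)
        best = max(range(len(labels)), key=probs.__getitem__)
        results.append((labels[best], probs))
    return results

rng = random.Random(0)

def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

LABELS = ["O", "PER", "ORG", "LOC"]  # hypothetical category labels
EMB, HID = 3, 4                      # hypothetical dimensions

# SOS and EOS mark the start and end of the word segmentation sequence.
tokens = ["SOS", "Alice", "joined", "Alipay", "EOS"]
embeddings = rand_matrix(len(tokens), EMB)  # stand-in word embeddings

W_x = rand_matrix(HID, EMB)
W_h = rand_matrix(HID, HID)
W_out = rand_matrix(len(LABELS), HID)

tagged = tag_sequence(embeddings, W_x, W_h, W_out, LABELS)
for token, (label, probs) in zip(tokens, tagged):
    print(token, label, [round(p, 3) for p in probs])
```

In a trained system the weights would come from the training procedure described in the abstract, and the embeddings from a learned lookup table rather than random draws.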

[0081] Named entities (Entity), also known as entity words, have the nature ...


Abstract

The embodiments of the invention provide a model training method and device and a named entity recognition method and device. During model training, the method comprises: replacing a first named entity in a first sample sequence with a first preset character to obtain a second sample sequence, and determining from the second sample sequence a text fragment containing the first preset character; using a first recurrent neural network to recursively determine hidden vectors of a plurality of segmented words in the second sample sequence, and determining a representation vector of the text fragment; constructing a Gaussian distribution from the representation vector through a variational auto-encoder, and determining a global hidden vector for the text fragment; using the first recurrent neural network, with the global hidden vector as the initial hidden vector, to recursively determine decoding hidden vectors of the segmented words in the text fragment, and determining predicted values of those segmented words; and determining a prediction loss value based on the difference between the segmented words in the text fragment and their predicted values, together with the distribution difference, and updating the first recurrent neural network and the variational auto-encoder in the direction that reduces the prediction loss value.
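Two steps of the abstract lend themselves to a short sketch: replacing the named entity with a preset character, and drawing the global hidden (implicit) vector from a Gaussian distribution. Everything concrete below is an assumption rather than the patent's specification: the preset character `[UNK]`, the fixed-size window used to pick the text fragment, the function names, and the reading of "distribution difference" as the KL divergence to a standard normal, which is the usual variational auto-encoder choice.

```python
import math
import random

MASK = "[UNK]"  # hypothetical first preset character

def mask_entity(tokens, start, end, mask=MASK):
    """Replace the named entity tokens[start:end] with one preset character."""
    return tokens[:start] + [mask] + tokens[end:]

def fragment_around(tokens, mask=MASK, window=1):
    """Text fragment: the preset character plus `window` tokens on each side."""
    i = tokens.index(mask)
    return tokens[max(0, i - window): i + window + 1]

def sample_global_vector(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def prediction_loss(reconstruction_loss, mu, log_var):
    """Reconstruction term plus the assumed distribution-difference term."""
    return reconstruction_loss + kl_to_standard_normal(mu, log_var)

first_sample = ["Alice", "works", "at", "Alipay", "in", "Hangzhou"]
second_sample = mask_entity(first_sample, 3, 4)  # mask the entity "Alipay"
fragment = fragment_around(second_sample)

# In the method, mu and log_var would be computed from the fragment's
# representation vector; the values here are placeholders.
mu, log_var = [0.0, 0.0], [0.0, 0.0]
z = sample_global_vector(mu, log_var, random.Random(0))

print(second_sample)
print(fragment)
print(kl_to_standard_normal(mu, log_var))
```

The sampled vector `z` plays the role of the initial hidden vector for the decoding pass. The reconstruction term passed to `prediction_loss` would come from comparing the decoder's predicted values with the actual segmented words in the fragment.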

Description

Technical field

[0001] One or more embodiments of this specification relate to the technical field of natural language processing, and in particular to methods and devices for model training and named entity recognition.

Background technique

[0002] In the field of natural language processing technology, the classification of named entities (Entity) in text sequences is an important direction of research. Named entities have the nature of nouns in the part of speech, including person names, organization names, place names, and all other entity categories identified by names. Broader named entities also include categories such as numbers, dates, currencies, addresses, and more. Accurate recognition of the categories of named entities can improve the accuracy and effectiveness of natural language processing.

[0003] Usually, a training set is used to train a model for recognizing named entities, and after the model is trained, a test set is used to test the model. A major...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/284, G06F40/295, G06N3/04, G06N3/08
CPC: G06F40/284, G06F40/295, G06N3/049, G06N3/084, G06N3/044, G06N3/045
Inventor: 李扬名, 李小龙, 姚开盛
Owner: ALIPAY (HANGZHOU) INFORMATION TECH CO LTD