Comprehensive position coding method for vocabulary sequence data

A sequence-data encoding method applied in the field of natural language processing, addressing the problems that absolute and relative position encodings represent data features that cannot replace each other and that prior attempts to fuse the two encodings have been ineffective.

Active Publication Date: 2021-03-09
NANJING UNIV OF POSTS & TELECOMM
Cites: 3 | Cited by: 0

AI Technical Summary

Problems solved by technology

On the one hand, the data features represented by these two encodings cannot replace each other. After the absolute position encoding is added to the vocabulary encoding, the model can identify the absolute position of each word in the vocabulary sequence. Whether in machine translation or in other natural language processing tasks, such an optimized model can effectively predict at test time which words should appear at each position in the sentence sequence and, combining the words that have already appeared, effectively predict which specific word should occupy the next position. After the relative position encoding is incorporated into the calculation of binary relations, the model can effectively learn the influence of the distance between any two words on their relationship, or compute that relationship comprehensively from the encodings of the two words together with their relative position encoding, and, in the final test, select the most suitable word among the candidates according to the distance between the current position and the preceding words. On the other hand, however, previous efforts to fuse the two encodings have not yielded much result.
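To make the interplay of the two encodings concrete, the following minimal PyTorch sketch shows where each one enters the computation: the absolute position code is added to the vocabulary code before the sequence enters the model, while the relative position code contributes to the pairwise (binary-relation) scores. All names, sizes, and the clipped-distance lookup are illustrative assumptions, not the patent's exact formulation.

```python
import torch

seq_len, dim, max_rel = 6, 8, 4

tok = torch.nn.Embedding(100, dim)                 # vocabulary codes
abs_pos = torch.nn.Embedding(seq_len, dim)         # absolute position codes
rel_bias = torch.nn.Embedding(2 * max_rel + 1, 1)  # relative position codes

ids = torch.randint(0, 100, (seq_len,))
# before the model: word code + absolute position code
x = tok(ids) + abs_pos(torch.arange(seq_len))

# inside the model: pairwise (binary-relation) scores get a relative term
scores = x @ x.T / dim ** 0.5
pos = torch.arange(seq_len)
rel = (pos[None, :] - pos[:, None]).clamp(-max_rel, max_rel) + max_rel
scores = scores + rel_bias(rel).squeeze(-1)        # one bias per distance
attn = torch.softmax(scores, dim=-1)               # (seq_len, seq_len) relations
```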

Method used



Examples


Embodiment 1

[0074] In order to enable those skilled in the art to better understand the solution of the present invention, the technical solution of the present invention is described clearly and completely below in conjunction with an application example in the neural machine translation model Transformer. The description emphasizes the complete calculation process of the position coding method and the calculation details of the model, which can be understood with reference to the attached figures 1 and 2.

[0075] The calculation process of the Transformer model in the training phase and the testing phase is as follows. All inputs, outputs and intermediate variables are marked with their size and shape (as a superscript) when they first appear and are written without the superscript thereafter; variables that carry no superscript are scalars:

[0076] Each parallel corpus pair, that is, a source input and a target input, is a pair of vocabulary sequences...
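As a hedged illustration of such a pair entering training, the sketch below builds a toy source/target pair, shifts the target right behind a start symbol, and adds trainable absolute position codes to the vocabulary codes. The tiny vocabularies and the `<s>` start-token convention are assumptions following common Transformer practice, not details taken from the patent.

```python
import torch

# toy vocabularies; "<s>" marks the start of the target input (assumption)
voc_src = {"ich": 0, "bin": 1, "hier": 2}
voc_tgt = {"<s>": 0, "i": 1, "am": 2, "here": 3}
dim = 8

src_ids = torch.tensor([voc_src[w] for w in ["ich", "bin", "hier"]])
# target input is the target sentence shifted right behind the start symbol
tgt_ids = torch.tensor([voc_tgt[w] for w in ["<s>", "i", "am"]])

emb_src = torch.nn.Embedding(len(voc_src), dim)   # source vocabulary codes
emb_tgt = torch.nn.Embedding(len(voc_tgt), dim)   # target vocabulary codes
pos = torch.nn.Embedding(16, dim)                 # trainable absolute codes

src_in = emb_src(src_ids) + pos(torch.arange(len(src_ids)))
tgt_in = emb_tgt(tgt_ids) + pos(torch.arange(len(tgt_ids)))
```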

Embodiment 2

[0143] The testing process of the trained Transformer model is as follows:

[0144] S1': The test data is in the same form as the training data, that is, a pair of source input and target input vocabulary sequences.

[0145] S2': As in the training process of the model, the source input sentence is fed into the encoder of the trained Transformer model to obtain the final encoder output X_N.

[0146] S3': For the final output of the Transformer model generated in the previous round, extract the index of the element with the largest value in each row, locate the corresponding vocabulary in the target input vocabulary code dictionary VOC_tgt, and sequentially extract their feature vectors to form the Transformer model output Pred^{i×dim}.

[0147] S4': Take the feature vector representing the start symbol as the first row, shift the row numbers of all rows of Pred^{i×dim} down by 1, and obtain Test_Input^{(i+1)×dim}, which is input ...
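A minimal sketch of the greedy test-time loop described in S1'–S4' follows. The `encoder` and `decoder` below are toy stand-ins, not the trained model (their real interfaces are not given in this excerpt); only the argmax-and-feed-back structure is taken from the text.

```python
import torch

def greedy_decode(encoder, decoder, src_in, tgt_emb, start_id=0, max_len=10):
    x_n = encoder(src_in)                        # S2': final encoder output X_N
    ids = [start_id]                             # S4': start symbol is row one
    for _ in range(max_len):
        test_input = tgt_emb(torch.tensor(ids))  # rows looked up in VOC_tgt
        out = decoder(test_input, x_n)           # one score row per position
        ids.append(int(out[-1].argmax()))        # S3': largest element per row
    return ids

# toy stand-ins so the sketch runs end to end; they are NOT the trained model
dim, vocab = 8, 50
tgt_emb = torch.nn.Embedding(vocab, dim)
proj = torch.nn.Linear(dim, vocab)
encoder = lambda s: s                  # identity in place of the real encoder
decoder = lambda t, mem: proj(t)       # ignores the encoder memory entirely
print(greedy_decode(encoder, decoder, torch.randn(5, dim), tgt_emb))
```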



Abstract

The invention discloses a comprehensive position coding method for vocabulary sequence data, which comprises the following steps: before a vocabulary sequence is input into a model, adding to each vocabulary, in addition to the vocabulary's own code, a code for its absolute position in the sequence; when the binary relation between every two vocabularies is calculated in the deep learning model, adding a code for the relative position of the two vocabularies in the sequence; and optimizing the numerical values of the two position codes, continuously adjusting them during training. On the basis of absolute position coding of the original positions of vocabularies, the method further codes the distance between any two vocabularies and combines the two codes, so that the seriality of the data is effectively reflected when language source data with serialization characteristics is input into a deep learning model in parallel for calculation. Compared with existing position coding methods, the method enables current mainstream machine translation models to achieve higher translation precision with a lower error rate.
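The abstract's last step, optimizing the numerical values of the two position codes during training, amounts to treating both codes as trainable parameters. The short sketch below uses a placeholder loss purely to show both tables receiving gradient updates; the actual model, loss, and table sizes are assumptions, not specified by the patent.

```python
import torch

# both position codes as trainable tables (sizes are arbitrary assumptions)
abs_pos = torch.nn.Parameter(torch.randn(16, 8) * 0.02)   # absolute codes
rel_pos = torch.nn.Parameter(torch.randn(9) * 0.02)       # relative codes
opt = torch.optim.SGD([abs_pos, rel_pos], lr=0.1)

loss = abs_pos.pow(2).mean() + rel_pos.pow(2).mean()  # placeholder loss only
loss.backward()
opt.step()   # the numerical values of both codes are adjusted during training
```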

Description

Technical field

[0001] The invention relates to a comprehensive position coding method for vocabulary sequence data, which belongs to the technical field of natural language processing.

Background technique

[0002] In natural language processing tasks, the most common source data unit used as input is a sentence, that is, a sequence of words, which inherently carries temporal/spatial/logical ordering attributes. Naturally, when using neural network models to process sequences, people first think of recurrent neural networks: deep learning models with the ability to process serialized data. From the perspective of semantic analysis, however, lexical sequences cannot be processed strictly in spatial order, because the relationships between words are not fully consistent with their spatial order in the sequence, and a simple recurrent neural network cannot handle long-term dependencies. Deep learning models such as LSTM and attention-based bidirectional ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F40/242, G06F40/284, G06F40/58, G06N3/08
CPC: G06F40/242, G06F40/284, G06F40/58, G06N3/08
Inventors: 柳林青, 徐小龙
Owner: NANJING UNIV OF POSTS & TELECOMM