Statement sequence labeling method and device, electronic equipment and storage medium

A sentence sequence labeling technology, applied in neural learning methods, semantic analysis, electrical digital data processing, etc. It addresses the propagation of word segmentation errors, the low accuracy of sequence labeling, and the low prediction accuracy of sequence labeling models, and achieves enhanced character representation information and improved labeling accuracy.

Pending Publication Date: 2022-02-18
BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD

AI Technical Summary

Problems solved by technology

[0003] In the prior art, a sequence labeling framework based on machine learning and deep learning is usually used as the main framework of an end-to-end Point of Interest (POI) analysis model, which abstracts POI analysis into a sequence labeling process. When this framework is used to label Chinese text, two methods are common. The first performs word segmentation on the POI text and then labels the sequence with words as the labeling units. The second skips the word segmentation step and labels the sequence directly with characters (Chinese characters, English letters) as the labeling units. With the first method, existing word segmentation makes errors, and those errors propagate to the subsequent sequence labeling stage, so labeling accuracy is low. With the second method, the prior lexical information is discarded in the hope that the model can identify latent lexical information by itself; this makes the sequence labeling model harder to train and lowers the prediction accuracy of the trained model. A method that can improve sequence labeling accuracy is therefore urgently needed.
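The error propagation described for the first method can be illustrated with a small sketch. The POI text, the label names (`NAME`, `SUFFIX`) and the BIO tagging scheme below are assumptions chosen for illustration; none of them come from the patent text. The sketch expands word-level labels into per-character tags and compares a correct segmentation with an erroneous one.

```python
def word_labels_to_char_bio(words, labels):
    """Expand word-level labels into one BIO tag per character."""
    tags = []
    for word, label in zip(words, labels):
        for i in range(len(word)):
            # First character of a word opens the span, the rest continue it.
            tags.append(("B-" if i == 0 else "I-") + label)
    return tags


# Hypothetical POI text "北京大学东门" and hypothetical label names.
labels = ["NAME", "SUFFIX"]
correct_words = ["北京大学", "东门"]   # intended segmentation
wrong_words = ["北京", "大学东门"]     # a segmentation error

correct_tags = word_labels_to_char_bio(correct_words, labels)
wrong_tags = word_labels_to_char_bio(wrong_words, labels)

# Character positions whose tag changes because of the segmentation error.
mismatches = [i for i, (a, b) in enumerate(zip(correct_tags, wrong_tags)) if a != b]
print(correct_tags)
print(wrong_tags)
print(mismatches)
```

A single misplaced word boundary changes the tags of three of the six characters here, which is the kind of downstream damage the paragraph above attributes to word-level labeling.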


Examples

Embodiment

[0041] Referring to Figure 1, an embodiment of the present invention provides a sentence sequence labeling method, the method comprising:

[0042] S101. Perform word segmentation on the target sentence using a preset domain dictionary to obtain a word segmentation set;

[0043] S102. For each character in the target sentence, obtain, based on the word segmentation set, the vector of the character in each of the N vocabulary sets corresponding to the word segmentation set, and obtain the word vector of the character from its vectors in the vocabulary sets, where N is an integer greater than 1;

[0044] S103. Input the vector of each character in each vocabulary set into the pre-trained sequence labeling model, and perform sequence labeling on the target sentence.
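Steps S101 and S102 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the excerpt does not say what the N vocabulary sets are, so the sketch assumes N = 4 sets keyed by where the character sits in a matched word (B = begin, M = middle, E = end, S = single), assumes that a character's vector in a set is the mean of the embeddings of the words in that set, and assumes the word vector is the concatenation of the four means. The dictionary, the embeddings and all function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

SET_KEYS = ("B", "M", "E", "S")  # assumed N = 4 vocabulary sets


def match_dictionary(sentence: str, dictionary: Set[str], max_len: int = 4) -> List[Tuple[int, int]]:
    """S101 (sketch): all dictionary words found in the sentence, as (start, end) spans."""
    spans = []
    for start in range(len(sentence)):
        for end in range(start + 1, min(len(sentence), start + max_len) + 1):
            if sentence[start:end] in dictionary:
                spans.append((start, end))
    return spans


def char_vocab_sets(sentence: str, spans: List[Tuple[int, int]]) -> List[Dict[str, List[str]]]:
    """S102 (sketch): for each character, group matched words by the character's position in them."""
    groups = [{key: [] for key in SET_KEYS} for _ in sentence]
    for start, end in spans:
        word = sentence[start:end]
        if end - start == 1:
            groups[start]["S"].append(word)
            continue
        groups[start]["B"].append(word)
        groups[end - 1]["E"].append(word)
        for i in range(start + 1, end - 1):
            groups[i]["M"].append(word)
    return groups


def mean_vector(words: List[str], embeddings: Dict[str, List[float]], dim: int) -> List[float]:
    """Mean embedding of the words in one set; a zero vector if the set is empty."""
    if not words:
        return [0.0] * dim
    return [sum(embeddings[w][k] for w in words) / len(words) for k in range(dim)]


def char_word_vector(group: Dict[str, List[str]], embeddings: Dict[str, List[float]], dim: int) -> List[float]:
    """Concatenate the per-set vectors into one word vector for the character."""
    vec: List[float] = []
    for key in SET_KEYS:
        vec.extend(mean_vector(group[key], embeddings, dim))
    return vec


# Hypothetical domain dictionary and 2-dimensional word embeddings.
sentence = "北京大学东门"
dictionary = {"北京", "北京大学", "大学", "东门"}
embeddings = {"北京": [1.0, 0.0], "北京大学": [0.0, 1.0], "大学": [1.0, 1.0], "东门": [0.5, 0.5]}

spans = match_dictionary(sentence, dictionary)
groups = char_vocab_sets(sentence, spans)
vectors = [char_word_vector(g, embeddings, dim=2) for g in groups]
print(vectors[0])
```

In step S103 the per-character vectors would be passed to the pre-trained sequence labeling model. The model is not sketched here because the excerpt does not describe its architecture.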

[0045] In step S101, the domain dictionary must first be preprocessed. The preprocessing of the domain dictionary may include the steps of ...


Abstract

The invention discloses a statement sequence labeling method. The method comprises: performing word segmentation on a target statement using a preset domain dictionary to obtain a segmented word set; for each character in the target statement, obtaining, based on the segmented word set, a vector of the character in each of N groups of vocabulary sets, and obtaining a word vector of the character from its vectors in the vocabulary sets; and inputting the vector of each character in each group of vocabulary sets into a pre-trained sequence labeling model to perform sequence labeling on the target statement. Segmenting the target statement with a preset domain dictionary improves word segmentation accuracy. Deriving each character's vectors in the vocabulary sets, and from them its word vector, effectively enhances the representation information of each character in the target statement. On this basis, the accuracy of the sequence labels output by the sequence labeling model is also improved.

Description

Technical field

[0001] The invention relates to the technical field of language processing, in particular to a sentence sequence labeling method, device and electronic equipment.

Background art

[0002] With the rapid development of artificial intelligence, machine learning and deep learning are being applied in more and more fields. Because they can achieve a certain degree of understanding and reasoning, machines can adapt themselves and become more intelligent.

[0003] In the prior art, a sequence labeling framework based on machine learning and deep learning is usually used as the main framework of an end-to-end Point of Interest (POI) analysis model, which abstracts POI analysis into a sequence labeling process. When this framework is used to label Chinese text, two methods are common. The first method is to perform word segmentation processing on the POI text first, and use ...

Application Information

IPC(8): G06F40/242, G06F40/279, G06F40/30, G06N3/04, G06N3/08
CPC: G06F40/242, G06F40/279, G06F40/30, G06N3/04, G06N3/08
Inventors: 刘旭东, 罗京
Owner: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD