Formulaic language recognition system and method fusing associated information

A recognition technology in the field of formulaic language recognition systems that fuse associated information. It addresses three problems of existing methods: the inability to effectively recognize the various types of formulaic language, poor portability, and poor generalization ability.

Pending Publication Date: 2022-04-12
NORTHEAST DIANLI UNIVERSITY

AI Technical Summary

Problems solved by technology

Recognition methods based on statistics and rules rely on preset standards and have poor portability. When faced with complex text, they cannot effectively identify the various types of formulaic language.
With the rise of machine learning in natural language processing, some scholars have tried classifiers such as random forests and support vector machines to identify formulaic language. However, this approach places high demands on feature selection: a feature set that effectively reflects the characteristics of formulaic language must be chosen, which leads to poor generalization ability.

Examples


Detailed Description of the Embodiments

[0080] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0081] Referring to Figure 1 and Figure 2, a formulaic language recognition system that fuses associated information includes:

[0082] The basic feature extraction module uses the embedding layer in Torch to generate word embedding vectors as part-of-speech features and takes feature vectors trained with the GloVe word vector technique as semantic features. The late-fused part-of-speech and semantic features serve as the basic features of the model.
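
As a rough illustration of the feature fusion in [0082], the sketch below concatenates two per-token vectors. It is plain Python rather than Torch, the lookup tables hold toy values instead of a trained embedding layer or real GloVe vectors, and treating late fusion as simple concatenation (with the part-of-speech table keyed by tag) is an assumption. The names `pos_table`, `glove_table`, and `fuse_features` are hypothetical.

```python
# Toy stand-ins (assumption): a real system would use a trained
# torch.nn.Embedding and pretrained GloVe vectors; all values are made up.
pos_table = {"VERB": [0.1, 0.2], "ADP": [0.3, 0.4], "NOUN": [0.5, 0.6]}
glove_table = {
    "take": [1.0, 0.0, 0.5],
    "into": [0.0, 1.0, 0.5],
    "account": [0.2, 0.2, 0.9],
}

def fuse_features(tokens, pos_tags):
    """Concatenate the part-of-speech vector and the semantic vector of each token."""
    fused = []
    for tok, tag in zip(tokens, pos_tags):
        # List concatenation: 2 part-of-speech dims + 3 semantic dims = 5 dims.
        fused.append(pos_table[tag] + glove_table[tok])
    return fused

features = fuse_features(["take", "into", "account"], ["VERB", "ADP", "NOUN"])
```

Each token ends up with one basic feature vector whose length is the sum of the two input dimensions.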

[0083] The associated information extraction module uses the mutual information between words and the dependency syntactic relationship of...
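
The mutual information between words mentioned in [0083] can be illustrated with pointwise mutual information (PMI) over adjacent word pairs. The text available here does not state the exact formula, so the use of PMI, the adjacent-pair window, the toy corpus, and the name `pmi_table` are all assumptions.

```python
import math
from collections import Counter

def pmi_table(sentences):
    """Return PMI for each adjacent word pair: log2(p(x, y) / (p(x) * p(y)))."""
    word_counts = Counter()
    pair_counts = Counter()
    for sent in sentences:
        word_counts.update(sent)
        pair_counts.update(zip(sent, sent[1:]))
    n_words = sum(word_counts.values())
    n_pairs = sum(pair_counts.values())
    table = {}
    for (x, y), count in pair_counts.items():
        p_xy = count / n_pairs
        p_x = word_counts[x] / n_words
        p_y = word_counts[y] / n_words
        table[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return table

# Hypothetical toy corpus, already tokenized.
corpus = [
    ["as", "a", "matter", "of", "fact"],
    ["as", "a", "matter", "of", "course"],
    ["a", "fact", "of", "life"],
]
scores = pmi_table(corpus)
```

In this toy corpus the pair ("matter", "of"), which recurs as a unit, scores higher than the pair ("a", "fact"), which is the kind of signal that could flag a candidate formulaic sequence.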


Abstract

The invention discloses a formulaic language recognition system and method that fuse associated information. The system comprises a basic feature extraction module, which uses the embedding layer in Torch to generate word embedding vectors as part-of-speech features, takes feature vectors trained with the GloVe word vector technique as semantic features, and uses the late-fused part-of-speech and semantic features as the basic features of the model; an associated information extraction module, which uses the mutual information between words and the dependency syntactic relationships of sentences as associated information for identifying formulaic language; and a label representation module, which represents the labels. In the method, feature vectors are represented with word embedding, associated information that characterizes formulaic language is fused in, deeper semantic features are obtained with a graph convolutional neural network, and finally, taking the dependencies among tags into account, a conditional random field model is used for tag decoding, thereby identifying formulaic language.
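
The final step in the abstract, tag decoding with a conditional random field, is commonly carried out with the Viterbi algorithm. The sketch below is a minimal pure-Python version under assumed inputs: the BIO tag set, the emission scores, and the transition scores are invented for illustration, and in the described system the emission scores would come from the graph convolutional network. The name `viterbi_decode` is hypothetical.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions[t][j]: score of tag j at position t.
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(tags)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            # Best previous tag for landing on tag j at this position.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[i] for i in path]

tags = ["B", "I", "O"]
# Assumed scores; the large negative value forbids the transition O -> I.
transitions = [
    [0.0, 1.0, 0.0],     # from B
    [0.0, 1.0, 0.0],     # from I
    [0.0, -100.0, 0.0],  # from O
]
emissions = [
    [0.2, 0.0, 1.0],  # position 0 prefers O
    [0.0, 1.5, 0.1],  # position 1 prefers I
    [0.0, 1.0, 0.2],  # position 2 prefers I
]
decoded = viterbi_decode(emissions, transitions, tags)
```

Picking the best tag at each position independently would give O, I, I here, which violates the tagging scheme. Because the decoder accounts for the dependencies among tags, it returns B, I, I instead.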

Description

Technical field

[0001] The invention relates to formulaic language recognition, and in particular to a formulaic language recognition system and method that fuse associated information.

Background

[0002] Formulaic language is a multi-word combination with specific functions and semantics that is generally recognized, stored, and retrieved as a whole. Studies have shown that most human language expression is formulaic in nature. Formulaic language recognition, also known as "multi-word expression recognition", is a basic task in natural language processing. It has a wide range of applications and has important theoretical and practical significance for computer-aided language teaching and machine translation.

[0003] In recent years, research on formulaic language at home and abroad has been on the rise. Scholars have obtained many research results on formulaic language with the help of corpus technology and computer application programs AntCo...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/289, G06F40/30, G06F40/268, G06F40/211, G06N3/04, G06N3/08
Inventors: 鲍松彬, 郑育杰, 王敬东, 孟凡奇
Owner: NORTHEAST DIANLI UNIVERSITY