
Low-resource text recognition algorithm based on semantic elements

A text recognition technology based on semantic elements, applied in semantic analysis, text database clustering/classification, unstructured text data retrieval, and related fields. It addresses problems such as oversized models, difficulty meeting accuracy requirements, and the inability to deploy offline on mobile terminals, and it achieves high interpretability and reduced learning difficulty.

Pending Publication Date: 2020-12-25
河南合众伟奇云智科技有限公司

AI Technical Summary

Problems solved by technology

In this scenario, there are relatively few language expressions, but the accuracy required of the classification model is high.
At present, those skilled in the art mostly use neural network language models such as BERT for processing. Such a model can converge and generalize with relatively little data, but it struggles to meet the accuracy requirements, and it is also too large to deploy offline on mobile terminals.


Examples


Embodiment Construction

[0021] In order to better understand the present invention, the content of the present invention is further clearly described below with reference to the embodiments and the accompanying drawings, but the protection content of the present invention is not limited to the following embodiments. In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without one or more of these details.

[0022] As shown in Figure 1 and Figure 2, the implementation process of the present invention includes the following steps:

[0023] 1. In step S1, a text sentence is acquired, the text sentence is encoded, and an encoded sentence tensor representation E is obtained.

[0024] Step S1 first uses LSTM or Transformer to encode text sentences. The LSTM or Transformer algorithm is a relatively popular sequen...
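Step S1 can be pictured with a minimal sketch, assuming a plain tanh recurrent cell as a simplified stand-in for the LSTM or Transformer encoder named above. The function name `encode_sentence`, the embedding table, and the random untrained weights are hypothetical and serve only to show the shape of the tensor representation E: one vector per token.

```python
import math
import random


def encode_sentence(tokens, dim=4, seed=0):
    """Return E as a list of per-token hidden vectors (T rows, dim columns).

    A plain tanh recurrent cell stands in for the LSTM/Transformer encoder;
    all weights are random and untrained (illustration only).
    """
    rng = random.Random(seed)
    vocab = sorted(set(tokens))
    # Hypothetical embedding table and recurrent weights.
    embed = {w: [rng.uniform(-1, 1) for _ in range(dim)] for w in vocab}
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]

    hidden = [0.0] * dim
    encoded = []
    for token in tokens:
        x = embed[token]
        # New hidden state mixes the current token with the previous state.
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(dim))
                + sum(w_rec[i][j] * hidden[j] for j in range(dim))
            )
            for i in range(dim)
        ]
        encoded.append(hidden)
    return encoded


E = encode_sentence(["turn", "on", "the", "light"])
print(len(E), len(E[0]))  # number of tokens, vector size per token
```

In a real system the stand-in cell would be replaced by a trained LSTM or Transformer layer; only the output shape (tokens by hidden size) carries over from this sketch.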


Abstract

The invention provides a low-resource text recognition algorithm based on semantic elements and belongs to the technical field of natural language understanding. The method comprises the steps of: obtaining a text sentence and encoding it to obtain an encoded sentence tensor representation; performing semantic element recognition on the sentence tensor representation to obtain a semantic element recognition result; scaling the sentence tensor representation by the semantic element recognition result; processing the scaled sentence tensor representation with mean pooling to obtain a semantic element vector representation; processing the sentence tensor representation with mean pooling to obtain a sentence vector representation; splicing the sentence vector representation and the semantic element vector representation to obtain the final sentence representation; and processing the final sentence representation to obtain the final text type probability. By introducing a semantic element recognition task, the invention gives the model the ability to recognize different semantic elements and greatly reduces the learning difficulty of the instruction text classification task.
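The steps listed in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not say how the semantic element recognition result is computed or how the final representation is turned into probabilities, so the sketch assumes a per-token sigmoid score and a linear layer followed by softmax. All names (`classify`, `element_weights`, `class_weights`) and all numbers are hypothetical.

```python
import math


def mean_pool(rows):
    """Average a T x d matrix over the token axis, giving a d-sized vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def classify(E, element_weights, class_weights):
    # Semantic element recognition (assumed form): one score in (0, 1) per token.
    scores = [sigmoid(sum(w * h for w, h in zip(element_weights, row))) for row in E]
    # Scale each token vector by its recognition score.
    scaled = [[s * h for h in row] for s, row in zip(scores, E)]
    element_vec = mean_pool(scaled)      # semantic element vector representation
    sentence_vec = mean_pool(E)          # sentence vector representation
    final = sentence_vec + element_vec   # splicing = concatenation, size 2d
    # Assumed classifier: linear layer + softmax over text types.
    logits = [sum(w * f for w, f in zip(row, final)) for row in class_weights]
    return softmax(logits)


# Toy encoded sentence: 3 tokens, hidden size 2 (made-up values).
E = [[0.2, -0.1], [0.9, 0.4], [0.0, 0.3]]
element_weights = [1.0, -1.0]
class_weights = [
    [0.5, 0.1, -0.3, 0.2],
    [-0.4, 0.6, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
probs = classify(E, element_weights, class_weights)
print(probs)
```

The concatenated vector has twice the hidden size, which is why each row of `class_weights` has four entries for a hidden size of two.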

Description

Technical field

[0001] The invention belongs to the technical field of natural language understanding, and in particular relates to a low-resource text recognition algorithm based on semantic elements.

Background technique

[0002] In recent years, deep learning models have achieved remarkable results in many natural language understanding tasks. However, deep learning-based methods often require a relatively large amount of labeled data. Moreover, the application of natural language understanding has relatively strong scene-based characteristics, and it is impossible to directly develop applications using public corpus resources. In many fields and different scenarios, the cost of corpus labeling is relatively high.

[0003] Voice command control is a human-computer interaction method that uses a voice control system. The usual implementation method is to use voice recognition technology to convert voice information into text, and then use text classification technology to ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/30, G06F40/211, G06N3/04, G06F16/35
CPC: G06F40/30, G06F40/211, G06F16/35, G06N3/044, G06N3/045
Inventor 付勇井友鼎杜创胜王旭峰甘志芳王顺智
Owner 河南合众伟奇云智科技有限公司