Gastroscope video part identification network structure based on Transformer

A Transformer-based network structure for video recognition that addresses the poor accuracy of recognizing specific sites in gastroscope video. It assists gastroscope shooting and diagnosis, produces more accurate classification results, and improves classification precision.

Pending Publication Date: 2021-07-27
ZHONGSHAN HOSPITAL FUDAN UNIV
Cites: 4

AI Technical Summary

Problems solved by technology

[0004] To address the poor accuracy of recognizing specific sites during gastric endoscopy, a Transformer-based gastroscope video part recognition network structure is proposed. On the basis of feature extraction by a convolutional neural network, the relationships between video frames in a time sequence are fused through a Transformer structure.


Figure 1: Gastroscope video part identification network structure based on Transformer.


Embodiment Construction

[0011] The present invention is described in detail below in conjunction with the accompanying drawings and specific embodiments. This embodiment is implemented on the premise of the technical solution of the present invention; a detailed implementation and specific operation process are given, but the protection scope of the present invention is not limited to the following embodiments.

[0012] Figure 1 shows a schematic diagram of the Transformer-based gastroscope video part recognition network structure. Video images are collected in real time and input to the recognition network. They first enter the feature-extraction stage, which follows a CNN structure: a 2D convolution kernel extracts features from each frame independently, i.e., the kernel slides over each individual frame, passing through four convolutional layers (Block1~Block4) before a final dimensionality-reducing feature extraction...
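The pipeline described above, per-frame CNN features followed by attention-based fusion across frames, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the patent does not disclose feature dimensions, the number of sampled frames, or attention hyperparameters, so the CNN backbone is replaced here by a stand-in feature matrix and the Transformer block is reduced to a single head of scaled dot-product self-attention over time.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over frames.

    X: (T, d) matrix of per-frame features (one row per frame).
    Returns (T, d): each frame's feature mixed with temporal context.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)  # (T, T) frame-to-frame weights
    return A @ V

# Hypothetical sizes: 8 sampled frames, 16-dim features, 5 digestive-tract classes.
T, d, C = 8, 16, 5
X = rng.standard_normal((T, d))  # stand-in for Block1~Block4 CNN outputs
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

fused = temporal_self_attention(X, Wq, Wk, Wv)   # inter-frame information aggregated
video_feat = fused.mean(axis=0)                  # pool over time -> one video vector
logits = video_feat @ rng.standard_normal((d, C))  # video-level class scores
print(fused.shape, logits.shape)
```

This captures the key design choice the abstract claims: unlike single-frame 2D-CNN classification, the attention matrix lets every frame weigh information from every other frame before the video-level decision is made.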



Abstract

The invention relates to a Transformer-based gastroscope video part recognition network structure. On the basis of feature extraction by a convolutional neural network, the relationships between video frames in a time sequence are fused through a Transformer structure, improving the accuracy of video recognition. Compared with 2D-CNN classification, which attends only to the information in a single picture, and 3D-CNN networks, which carry a relatively high parameter count and attend only to local temporal-channel information, this structure aggregates inter-frame information using the Transformer's attention mechanism, making classification results more accurate and effectively improving classification precision for gastroscope video recognition. The position of the gastroscope is located in real time during endoscopic examination, and the category of the digestive-tract site in the video is accurately recognized. The structure assists doctors in gastroscope shooting and diagnosis, improves the overall quality of gastroscope video, and supports sampling for subsequent pathology examination; it therefore has practical significance and meets real functional requirements.

Description

technical field

[0001] The invention relates to video recognition technology, and in particular to a Transformer-based gastroscope video part recognition network structure.

Background technique

[0002] At present, existing approaches to gastroscope video recognition are basically built on fully convolutional network models that classify single-frame images, such as the DenseNet and EfficientNet families. These methods use convolutional layers to extract features, then use the extracted features to obtain a classification result for a single video frame. However, images of the stomach and digestive tract share highly similar characteristics, and it is difficult to learn the temporal characteristics of video data and the global characteristics of digestive-tract organs from a single frame; such methods therefore fall short when judging the overall category of a gastroscope video, leading to poor classification accuracy for gastroscope videos.
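The single-frame baseline the background critiques can be sketched as follows: classify each frame independently, then aggregate by majority vote. All names and sizes here are hypothetical illustrations, not from the patent; the point is that this scheme discards frame order and temporal context entirely, which is the limitation the invention targets.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)

def classify_frame(feat, W):
    # single-frame classifier: argmax over per-class scores
    return int(np.argmax(feat @ W))

# Hypothetical sizes: 10 frames, 16-dim CNN features, 5 digestive-tract classes.
T, d, C = 10, 16, 5
frames = rng.standard_normal((T, d))   # stand-in for per-frame CNN features
W = rng.standard_normal((d, C))        # stand-in linear classifier head

per_frame = [classify_frame(f, W) for f in frames]
# Majority vote over frame labels; shuffling the frames gives the same answer,
# so no temporal (timing) information is used.
video_label = Counter(per_frame).most_common(1)[0][0]
```

Because the vote is invariant to frame order, visually similar sites that are only distinguishable by their position in the examination sequence are easily confused, which motivates the temporal fusion described in the embodiment.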

Claims


Application Information

IPC(8): G06T7/00; G06K9/46; G06K9/62; G06N3/04; G06N3/08
CPC: G06T7/0012; G06N3/08; G06T2207/10068; G06T2207/20081; G06T2207/30092; G06V10/44; G06N3/045; G06F18/241
Inventor: 诸炎, 李全林, 周平红, 张丹枫, 耿子寒
Owner ZHONGSHAN HOSPITAL FUDAN UNIV