Gastroscope video part identification network structure based on Transformer

A network structure and video technology, applied in the field of video recognition, that addresses poor recognition accuracy, yields more accurate classification results, and assists gastroscope imaging and diagnosis.
CN113177940A · Pending · Publication Date: 2021-07-27 · ZHONGSHAN HOSPITAL FUDAN UNIV

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
ZHONGSHAN HOSPITAL FUDAN UNIV
Publication Date
2021-07-27

Abstract

The invention relates to a Transformer-based network structure for recognizing anatomical parts in gastroscope video. Building on features extracted by a convolutional neural network, a Transformer structure fuses the temporal relationships between video frames, which improves video recognition accuracy. A 2D CNN classifier can attend only to the information in a single image, and a 3D CNN has a relatively large number of parameters and can attend only to local temporal information. By contrast, this structure uses the Transformer's attention mechanism to aggregate information across frames, so the classification result is more accurate and classification accuracy in gastroscope video recognition is effectively improved. The position of the gastroscope is located in real time during endoscopic examination, and the category of the digestive tract part shown in the video is accurately recognized. The structure assists the doctor in gastroscope imaging and diagnosis, improves the overall quality of gastroscope video capture, supports sampling for subsequent pathological examination, and is of considerable practical significance.
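The idea of aggregating information across frames with attention can be illustrated with a minimal sketch. It is not the patented network: it assumes a single attention head, random vectors standing in for per-frame CNN features, mean pooling over frames, and a linear classifier, and it omits positional encoding, layer normalization, and feed-forward layers. The names `self_attention`, `classify_clip`, `NUM_FRAMES`, `FEAT_DIM`, and `NUM_PARTS` are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def self_attention(frames, wq, wk, wv):
    """Single-head scaled dot-product self-attention across frames.

    Each output vector is a weighted mix of the value vectors of ALL
    frames, so every frame can draw on the whole clip.
    """
    q = [matvec(wq, f) for f in frames]
    k = [matvec(wk, f) for f in frames]
    v = [matvec(wv, f) for f in frames]
    scale = math.sqrt(len(q[0]))
    fused = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        weights = softmax(scores)
        fused.append([sum(w * vj[d] for w, vj in zip(weights, v))
                      for d in range(len(v[0]))])
    return fused


def classify_clip(frames, wq, wk, wv, w_cls):
    """Fuse frames with attention, mean-pool over time, then classify."""
    fused = self_attention(frames, wq, wk, wv)
    n = len(fused)
    pooled = [sum(f[d] for f in fused) / n for d in range(len(fused[0]))]
    return softmax(matvec(w_cls, pooled))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Assumed sizes: 8 frames, 16-dim features, 5 digestive tract part classes.
rng = random.Random(0)
NUM_FRAMES, FEAT_DIM, NUM_PARTS = 8, 16, 5
# Random vectors stand in for CNN features; weights are untrained.
frames = [[rng.gauss(0, 1) for _ in range(FEAT_DIM)] for _ in range(NUM_FRAMES)]
wq, wk, wv = (random_matrix(FEAT_DIM, FEAT_DIM, rng) for _ in range(3))
w_cls = random_matrix(NUM_PARTS, FEAT_DIM, rng)

probs = classify_clip(frames, wq, wk, wv, w_cls)
print(len(probs), round(sum(probs), 6))
```

With untrained random weights the probabilities carry no meaning. The sketch only shows the data flow: per-frame features go in, attention mixes them across time, and one class distribution comes out for the whole clip.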

Description

Technical field

[0001] The invention relates to video recognition technology, in particular to a Transformer-based gastroscope video part recognition network structure.

Background technology

[0002] At present, existing work on gastroscope video recognition is mostly based on fully convolutional network models that classify single-frame images, such as the DenseNet and EfficientNet families of models. These methods use convolutional layers to extract features and then use the extracted features to obtain a classification result for a single video frame. However, images of the stomach and digestive tract share many common features, and it is difficult to learn the temporal characteristics of video data and the global characteristics of the digestive tract organs from a single video frame. Such methods are therefore weak at judging the overall category of a gastroscope video, which leads to poor classification accuracy for gastroscop...
