Medical visual question and answer method based on global visual information intervention

A medical visual question answering technique based on global visual information, applied in neural learning methods, instruments, and biological neural network models. It addresses overfitting and weak model robustness, which reduce model accuracy, by increasing the penalty on the auxiliary branches and preserving the dominant role of the main branch, thereby improving accuracy.

Pending Publication Date: 2022-07-29
DALIAN UNIV

AI Technical Summary

Problems solved by technology

[0005] The above studies preserve the generalization ability of the model well when the dataset is small, but most of them still suffer from overfitting, which lowers the accuracy of the model on the test set.
In addition, the proposed models tend to rely heavily on linguistic bias to answer questions, regardless of the image, which harms the robustness of the model. (Linguistic bias means that the model overuses superficial correlations between question words and answer words in the training set to generate answers.)
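Linguistic bias can be illustrated with a question-only baseline that never looks at the image. The sketch below is not part of the patented method; the function names and the toy training pairs are hypothetical, and keying on the first word of the question is an assumed simplification.

```python
from collections import Counter, defaultdict


def first_word(question):
    """Return the lower-cased first word of a question, or '' if it is empty."""
    words = question.lower().split()
    return words[0] if words else ""


def fit_question_prior(train_pairs):
    """Map each question's first word to its most frequent training answer."""
    counts = defaultdict(Counter)
    for question, answer in train_pairs:
        counts[first_word(question)][answer] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def answer_without_image(prior, question, default="unknown"):
    """Answer from the question alone; the image is never consulted."""
    return prior.get(first_word(question), default)


# Hypothetical toy training set (not taken from any real dataset).
train_pairs = [
    ("Is there a fracture?", "no"),
    ("Is the heart enlarged?", "no"),
    ("Is there pleural effusion?", "yes"),
    ("What organ is shown?", "lung"),
    ("What organ is shown?", "lung"),
    ("What modality is this?", "ct"),
]
prior = fit_question_prior(train_pairs)

# Any "Is ...?" question gets "no", whatever the image shows.
biased_answer = answer_without_image(prior, "Is there a mass?")
```

A model that behaves like this baseline can score well when test answers follow the training distribution, yet it fails whenever the correct answer depends on the image content.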


Examples

Embodiment 1

[0063] This embodiment uses Windows as the development environment, PyCharm as the development platform, and Python as the development language. The medical visual question answering method based on global visual information intervention of the present invention is used to predict answers for medical images and their related questions.

[0064] In this embodiment, the medical visual question answering method based on global visual information intervention includes the following steps:

[0065] Step 1: Load the pretrained weights of the MAML and CDAE encoders from the MMQ network into the medical visual question answering network shown in Figure 1;

[0066] Step 2: Input the 'medical image-medical question' pairs from the training set into the medical visual question answering network from Step 1 for training;

[0067] Step 3: Take the required medical image and the corresponding medical question as input, load the model trained and save...
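Step 1 above can be sketched in plain Python, with parameter dictionaries standing in for framework state dicts. This is a minimal illustration under stated assumptions: the helper `load_encoder_weights`, the prefixes `maml_encoder` and `cdae_encoder`, and all parameter names and values are hypothetical and do not come from the patent or from the MMQ code.

```python
def load_encoder_weights(network_state, pretrained, prefix):
    """Copy pretrained encoder parameters into the network state under `prefix`.

    Parameters missing from the network, or whose sizes differ, are skipped.
    Returns the list of network keys that were overwritten.
    """
    loaded = []
    for name, value in pretrained.items():
        key = f"{prefix}.{name}"
        if key in network_state and len(network_state[key]) == len(value):
            network_state[key] = list(value)
            loaded.append(key)
    return loaded


# Hypothetical VQA network state: two image encoders plus a classifier head.
network_state = {
    "maml_encoder.conv1.weight": [0.0, 0.0],
    "cdae_encoder.conv1.weight": [0.0, 0.0, 0.0],
    "classifier.weight": [0.0],
}

# Hypothetical pretrained encoder checkpoints.
maml_pretrained = {"conv1.weight": [0.1, 0.2], "extra.bias": [1.0]}
cdae_pretrained = {"conv1.weight": [0.3, 0.4, 0.5]}

loaded = load_encoder_weights(network_state, maml_pretrained, "maml_encoder")
loaded += load_encoder_weights(network_state, cdae_pretrained, "cdae_encoder")
```

Only the encoder slots are overwritten here; the rest of the network keeps its initial values and is trained in Step 2.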

Abstract

The invention provides a medical visual question answering method based on global visual information intervention, belonging to the technical fields of computer vision and medical artificial intelligence. A global visual information branch supplements the global visual information missing from the main branch and strengthens the contribution of the image to the answer prediction score, so that the model attends more to the features of the medical image when predicting an answer and does not depend excessively on the language features of the corresponding question. A forward compensation branch strengthens the contribution of the multimodal fusion information in the main branch to the answer prediction score, suppressing the interference additionally introduced by the global visual information branch. In addition, a nonlinear fusion variant fuses the prediction scores of the branches into a final score and, combined with a multi-branch loss fusion module, increases the penalty on the global visual information branch and the forward compensation branch. This preserves the dominant role of the main branch during answer prediction and effectively improves the accuracy of the whole model.
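The abstract gives neither the fusion formula nor the loss weights, so the following is only a sketch under explicit assumptions: the nonlinear fusion is taken to be the log-sigmoid of the summed branch scores, and the multi-branch loss is taken to be the cross-entropy of the fused scores plus weighted cross-entropy terms for the two auxiliary branches. All function names, weights, and scores are hypothetical.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def fuse_scores(main, visual, compensation):
    """Assumed nonlinear fusion: log-sigmoid of the per-answer score sum."""
    return [log_sigmoid(m + v + c) for m, v, c in zip(main, visual, compensation)]


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, computed with log-sum-exp."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]


def multi_branch_loss(main, visual, compensation, target,
                      visual_weight=1.0, compensation_weight=1.0):
    """Assumed loss: fused loss plus weighted auxiliary-branch penalties."""
    fused = fuse_scores(main, visual, compensation)
    return (cross_entropy(fused, target)
            + visual_weight * cross_entropy(visual, target)
            + compensation_weight * cross_entropy(compensation, target))


# Hypothetical per-answer scores for three candidate answers.
main = [2.0, 0.5, -1.0]          # main branch (multimodal fusion)
visual = [1.0, 0.0, -0.5]        # global visual information branch
compensation = [0.5, 0.2, 0.0]   # forward compensation branch

fused = fuse_scores(main, visual, compensation)
predicted = fused.index(max(fused))
loss = multi_branch_loss(main, visual, compensation, target=0)
```

Under these assumptions, raising `visual_weight` or `compensation_weight` penalizes errors in the auxiliary branches more heavily, which is one way to read the abstract's statement that the increased penalty keeps the main branch dominant.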

Description

Technical field

[0001] The invention belongs to the technical fields of computer vision and medical artificial intelligence, and in particular relates to a medical visual question answering method based on global visual information intervention.

Background art

[0002] Clinics receive, store, and examine large volumes of medical images, such as radiology and pathology images, on a daily basis, putting increasing pressure on doctors to analyze the images. Medical Visual Question Answering (VQA) systems aim to jointly analyze multimodal content from medical images and natural language and to provide correct answers to a given clinical question about a medical image. Such a system can provide doctors with valuable additional insights to reduce misdiagnosis, and it can also help patients understand their medical images without a doctor.

[0003] Although this field has great potential in the medical industry, it is still in its infancy and there is relatively little research related ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06V10/774, G06V10/82, G06N3/04, G06N3/08
CPC: G06V10/774, G06V10/82, G06N3/08, G06N3/045
Inventor: 周东生, 彭培爔, 张强
Owner: DALIAN UNIV