Medical visual question answering method based on global visual information intervention

A global visual information technique, applied in neural learning methods, instruments, biological neural network models, etc., that addresses problems of model accuracy, overfitting, and model robustness. It increases the penalty on the auxiliary branches, preserves the dominant status of the main branch, and improves accuracy.

Pending Publication Date: 2022-07-29

DALIAN UNIV



AI Technical Summary

Problems solved by technology

[0005] The above studies maintain good generalization when the data set is small, but most of them still suffer from overfitting, which lowers the accuracy of the model on the test set.

In addition, the proposed models tend to rely heavily on linguistic bias when answering questions. That is, the model overuses the superficial correlations between question words and answer words in the training set and generates answers without regard to the image, which harms the robustness of the model.
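To make the notion of linguistic bias concrete, the following is a minimal sketch of a question-only predictor that never looks at an image. It is not part of the patent: the function names, the two-word question prefix, and the toy question-answer pairs are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def question_prefix(question, n_words=2):
    """Return the first n_words of the question, lower-cased."""
    return " ".join(question.lower().split()[:n_words])


def fit_question_only_prior(train_pairs, n_words=2):
    """Map each question prefix to its most frequent training answer."""
    counts = defaultdict(Counter)
    for question, answer in train_pairs:
        counts[question_prefix(question, n_words)][answer] += 1
    return {prefix: c.most_common(1)[0][0] for prefix, c in counts.items()}


def answer_without_image(prior, question, n_words=2, default="unknown"):
    """Answer from question wording alone; the image is never consulted."""
    return prior.get(question_prefix(question, n_words), default)


# Hypothetical toy training pairs (illustrative only, not from the patent).
train = [
    ("Is there a fracture?", "no"),
    ("Is there an effusion?", "no"),
    ("Is there a mass?", "yes"),
    ("What modality is shown?", "x-ray"),
]
prior = fit_question_only_prior(train)
print(answer_without_image(prior, "Is there a nodule?"))
```

A model that behaves like this predictor can score well whenever the test answers follow the same question-answer statistics as the training set, yet it gives the same answer for every image, which is the robustness problem described above.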


Examples


Embodiment 1

[0063] In this embodiment, Windows is used as the development environment, PyCharm as the development platform, and Python as the development language. The medical visual question answering method based on global visual information intervention of the present invention is used to predict answers for medical images and their related questions.

[0064] In this embodiment, the medical visual question answering method based on global visual information intervention includes the following steps:

[0065] Step 1: Load the pretrained weights of the MAML and CDAE encoders in the MMQ network into the medical visual question answering network shown in Figure 1;

[0066] Step 2: Input the 'medical image-medical question' pair in the training set into the medical visual question answering network in step 1 for training;

[0067] Step 3: Take the required medical image and the corresponding medical question as input, load the model trained and save...
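Step 1 above can be pictured as a selective copy of encoder weights into the larger network. The sketch below is framework-agnostic and models weight tables as plain dictionaries. The parameter names, the `maml.` and `cdae.` prefixes, and the function `load_encoder_weights` are assumptions for illustration, not identifiers from the patent or from the MMQ code.

```python
def load_encoder_weights(network_state, pretrained_state,
                         prefixes=("maml.", "cdae.")):
    """Copy only encoder entries from pretrained_state into a copy of
    network_state. Entries outside the given prefixes, or entries that
    are absent from the network, are left untouched or skipped."""
    merged = dict(network_state)
    loaded = []
    for name, value in pretrained_state.items():
        # str.startswith accepts a tuple of candidate prefixes.
        if name.startswith(prefixes) and name in merged:
            merged[name] = value
            loaded.append(name)
    return merged, loaded


# Hypothetical parameter tables (scalars stand in for weight tensors).
network_state = {"maml.conv1": 0.0, "cdae.enc1": 0.0, "classifier.fc": 0.0}
pretrained_state = {"maml.conv1": 1.5, "cdae.enc1": 2.5,
                    "classifier.fc": 9.9, "maml.extra": 3.0}

merged, loaded = load_encoder_weights(network_state, pretrained_state)
print(loaded)
```

Only the two encoders receive pretrained values here; the remaining parameters keep their initial values and would be learned during the training in Step 2.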


Abstract

The invention provides a medical visual question answering method based on global visual information intervention, belonging to the technical field of computer vision and medical artificial intelligence. A global visual information branch supplements the global visual information missing from the main branch and increases the contribution of the image to the answer prediction score, so that the model pays more attention to the features of the medical image when predicting the answer and does not depend excessively on the language features of the corresponding question. A forward compensation branch is proposed to increase the contribution of the multimodal fusion information in the main branch to the answer prediction score, which suppresses the interference additionally introduced by the global visual information branch. In addition, a nonlinear fusion variant fuses the prediction scores of the branches into a final score and is combined with a multi-branch loss fusion module that increases the penalty on the global visual information branch and the forward compensation branch. This preserves the dominant role of the main branch during answer prediction and effectively improves the accuracy of the whole model.
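The abstract does not give the fusion formula or the loss weights, so the following is only a sketch under stated assumptions. It fuses three per-answer score lists with a log-sigmoid of their sum, a form commonly used for nonlinear score fusion in bias-reduction work, and adds weighted cross-entropy terms for the two auxiliary branches. The function names, the choice of log-sigmoid, and the weights `visual_weight` and `compensation_weight` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse_scores(main, visual, compensation):
    """Assumed nonlinear fusion: log(sigmoid(sum of branch scores))
    computed independently for each candidate answer."""
    return [math.log(sigmoid(m + v + c))
            for m, v, c in zip(main, visual, compensation)]


def softmax_cross_entropy(scores, target_index):
    """Cross-entropy of softmax(scores) against one correct answer."""
    peak = max(scores)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_norm - scores[target_index]


def multi_branch_loss(main, visual, compensation, target_index,
                      visual_weight=1.0, compensation_weight=1.0):
    """Loss on the fused score plus weighted penalties on the global
    visual branch and the forward compensation branch (assumed form)."""
    fused = fuse_scores(main, visual, compensation)
    return (softmax_cross_entropy(fused, target_index)
            + visual_weight * softmax_cross_entropy(visual, target_index)
            + compensation_weight
            * softmax_cross_entropy(compensation, target_index))


# Hypothetical scores for three candidate answers.
main = [2.0, 0.5, -1.0]
visual = [0.5, 0.1, 0.0]
compensation = [1.0, 0.2, -0.5]

fused = fuse_scores(main, visual, compensation)
print(fused.index(max(fused)))
print(multi_branch_loss(main, visual, compensation, target_index=0))
```

In this sketch, raising `visual_weight` or `compensation_weight` raises the penalty on the corresponding auxiliary branch, which is one way to read the abstract's statement that the main branch keeps its dominant role.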

Description

Technical field

[0001] The invention belongs to the technical field of computer vision and medical artificial intelligence, in particular to a medical visual question answering method based on global visual information intervention.

Background technique

[0002] Clinics receive, store and examine large volumes of medical images, such as radiology and pathology images, on a daily basis, putting increasing pressure on doctors to analyze the images. Medical Visual Question Answering (VQA) systems aim to jointly analyze multimodal content from medical images and natural language and provide correct answers for a given clinical question corresponding to medical images. Not only can it provide doctors with valuable additional insights to reduce misdiagnosis, it can also help patients understand their medical images without a doctor.

[0003] Although this field has great potential in the medical industry, it is still in its infancy and there is relatively little research related ...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06V10/774, G06V10/82, G06N3/04, G06N3/08

CPC: G06V10/774, G06V10/82, G06N3/08, G06N3/045

Inventors: 周东生, 彭培爔, 张强

Owner: DALIAN UNIV
