Hallucination processing method, apparatus, device, and medium for a multimodal large model

By constructing a hallucination assessment framework and applying a differentiated reweighting strategy based on semantic segmentation, the method addresses the shortcomings of multimodal large models in hallucination assessment and mitigation, enabling fine-grained quantitative evaluation and targeted hallucination mitigation, and improving the accuracy and reliability of the model's output.

CN122242584A · Pending · Publication Date: 2026-06-19 · HANGZHOU INST FOR ADVANCED STUDY UCAS

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: HANGZHOU INST FOR ADVANCED STUDY UCAS
Filing Date: 2026-05-25
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing hallucination evaluation for multimodal large models fails to cover fine-grained hallucination types and adapts poorly to different scenarios, and existing inference-time interventions lack fine-grained differentiation. As a result, the models produce inaccurate outputs in complex visual-semantic tasks.

Method used

A hallucination assessment framework is constructed. Key attention heads are located by comparing assessment samples against ground-truth labels. The attention weights of those heads are then adjusted using semantic segmentation and a differentiated reweighting strategy to mitigate hallucinations in the multimodal large model.
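The head-location step can be illustrated with a minimal sketch. The listing does not state how heads are scored, so everything here is an assumption: each assessment sample is taken to carry the model's prediction, a ground-truth label, and a hypothetical per-head statistic (`image_attention`, the share of attention a head places on image tokens). Heads are ranked by how much that statistic differs between hallucinated and faithful samples; the function name `locate_key_heads` and the ranking criterion are illustrative, not the patented procedure.

```python
from statistics import mean

def locate_key_heads(samples, top_k=2):
    """Rank heads by the gap in a per-head statistic between faithful and
    hallucinated samples. The criterion is an assumption for illustration."""
    # A sample counts as hallucinated when its prediction contradicts the label.
    hallucinated = [s for s in samples if s["prediction"] != s["label"]]
    faithful = [s for s in samples if s["prediction"] == s["label"]]
    if not hallucinated or not faithful:
        return []  # no contrast available, so no head can be singled out

    gaps = {}
    for head in hallucinated[0]["image_attention"]:
        h_mean = mean(s["image_attention"][head] for s in hallucinated)
        f_mean = mean(s["image_attention"][head] for s in faithful)
        gaps[head] = f_mean - h_mean

    # Heads with the largest absolute gap are treated as "key" heads.
    ranked = sorted(gaps, key=lambda h: abs(gaps[h]), reverse=True)
    return ranked[:top_k]

# Toy data: heads are (layer, head) pairs; all values are made up.
samples = [
    {"prediction": "yes", "label": "yes",
     "image_attention": {(0, 0): 0.6, (0, 1): 0.5, (1, 0): 0.7}},
    {"prediction": "yes", "label": "no",
     "image_attention": {(0, 0): 0.2, (0, 1): 0.5, (1, 0): 0.6}},
    {"prediction": "no", "label": "no",
     "image_attention": {(0, 0): 0.6, (0, 1): 0.4, (1, 0): 0.7}},
    {"prediction": "no", "label": "yes",
     "image_attention": {(0, 0): 0.2, (0, 1): 0.4, (1, 0): 0.5}},
]
key_heads = locate_key_heads(samples, top_k=2)
```

In this toy data, head `(0, 0)` attends far less to the image on hallucinated samples, so it ranks first. A real framework would likely score heads per hallucination type and evaluation dimension rather than with a single statistic.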

Benefits of technology

The technology enables fine-grained quantitative evaluation of multimodal large models across different evaluation dimensions, accurately locates the causes of hallucinations, effectively mitigates hallucinations during inference, makes hallucination mitigation more adaptable across scenarios, and improves the accuracy and reliability of the output.


Figure CN122242584A_ABST

Abstract

This application discloses a method, apparatus, device, and medium for hallucination processing in a multimodal large model, relating to the fields of artificial intelligence and information processing technology. Based on a hallucination assessment framework, the attention weights of key attention heads are reweighted in a differentiated manner according to the semantic segmentation of the multimodal sequence. This allows reasonable control over how attention weight is allocated to different semantic segments, mitigating hallucinations produced during inference at the level of the model's underlying mechanism while preserving the normal recognition and reasoning capabilities of the multimodal large model, and thereby improving the scenario adaptability of hallucination mitigation. The method includes: constructing a hallucination assessment framework; using the framework to locate key attention heads in the multimodal large model that trigger hallucinations; and adjusting the attention weights of the key attention heads according to the semantic segmentation of the multimodal sequence to mitigate hallucinations during the inference stage of the multimodal large model.
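The reweighting step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the segment names (`system`, `image`, `instruction`), the scale factors, and the helper names `reweight_row` and `reweight_key_heads` are all hypothetical, and the abstract does not specify how segments are defined or which factors are used. The sketch scales each token's attention weight by a factor chosen for its semantic segment, renormalizes so the row still sums to 1, and leaves non-key heads untouched.

```python
def reweight_row(weights, segments, factors):
    """Scale each token's weight by its segment's factor, then renormalize.
    Segments without an entry in `factors` keep a factor of 1.0."""
    if len(weights) != len(segments):
        raise ValueError("weights and segments must have the same length")
    scaled = [w * factors.get(seg, 1.0) for w, seg in zip(weights, segments)]
    total = sum(scaled)
    if total <= 0:
        return list(weights)  # degenerate case: fall back to the original row
    return [s / total for s in scaled]

def reweight_key_heads(attention, key_heads, segments, factors):
    """Apply segment-wise reweighting only to the located key heads."""
    return {
        head: reweight_row(row, segments, factors) if head in key_heads else list(row)
        for head, row in attention.items()
    }

# Toy example: one attention row per head over four tokens (values made up).
segments = ["system", "image", "image", "instruction"]
factors = {"image": 2.0, "instruction": 0.5}  # hypothetical per-segment factors
attention = {
    (0, 0): [0.1, 0.2, 0.2, 0.5],
    (0, 1): [0.25, 0.25, 0.25, 0.25],
}
adjusted = reweight_key_heads(attention, {(0, 0)}, segments, factors)
```

Here only head `(0, 0)` is treated as a key head, so its attention shifts toward the image tokens while head `(0, 1)` is returned unchanged. Restricting the adjustment to key heads is what lets the rest of the model's recognition and reasoning behavior stay intact.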