Hallucination processing method, apparatus, device, and medium for a multimodal large model

By constructing a hallucination assessment framework and a differentiated reweighting strategy based on semantic segmentation of the multimodal sequence, the method addresses the shortcomings of multimodal large models in hallucination assessment and mitigation, enabling fine-grained quantitative evaluation and targeted hallucination mitigation, and improving the accuracy and reliability of the model's output.

CN122242584A · Pending · Publication Date: 2026-06-19 · HANGZHOU INST FOR ADVANCED STUDY UCAS

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: HANGZHOU INST FOR ADVANCED STUDY UCAS
Filing Date: 2026-05-25
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing hallucination evaluation for multimodal large models struggles to cover fine-grained hallucination types and lacks scene adaptability, and existing inference-time interventions lack fine-grained distinction. As a result, models produce inaccurate outputs in complex visual-semantic tasks.

Method used

A hallucination assessment framework is constructed. Key attention heads are located by comparing the model's outputs on assessment samples with ground-truth labels. Semantic segmentation of the multimodal sequence and a differentiated reweighting strategy are then used to adjust attention weights, mitigating hallucinations in the multimodal large model.
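The head-location step can be illustrated with a minimal sketch. The source does not state how heads are scored, so everything here is an assumption for illustration: each sample carries a hypothetical per-head score (`head_scores`, for example the attention mass a head places on image tokens), a sample counts as hallucinated when the model mentions something absent from the ground-truth labels, and heads are ranked by the gap between their mean score on clean and on hallucinated samples. The function names are hypothetical.

```python
from statistics import mean


def is_hallucinated(predicted, ground_truth):
    """Assumed criterion: the model mentioned something not in the labels."""
    return not set(predicted) <= set(ground_truth)


def rank_heads(samples, top_k=2):
    """Rank attention heads by (mean clean score - mean hallucinated score).

    Each sample is a dict with:
      'predicted'    : objects the model mentioned
      'ground_truth' : objects actually present (the labels)
      'head_scores'  : {(layer, head): float}, a hypothetical per-head statistic
    Returns the top_k heads with the largest gap, largest first.
    """
    halluc = [s for s in samples
              if is_hallucinated(s["predicted"], s["ground_truth"])]
    clean = [s for s in samples
             if not is_hallucinated(s["predicted"], s["ground_truth"])]
    if not halluc or not clean:
        raise ValueError("need both hallucinated and clean samples")

    heads = list(samples[0]["head_scores"])
    gap = {
        h: mean(s["head_scores"][h] for s in clean)
           - mean(s["head_scores"][h] for s in halluc)
        for h in heads
    }
    return sorted(gap, key=gap.get, reverse=True)[:top_k]
```

A real assessment framework would presumably score heads per evaluation dimension and over many samples; the mean-gap ranking above is only one simple way to turn a label comparison into a list of candidate heads.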

Benefits of technology

The method enables fine-grained quantitative evaluation of multimodal large models across different evaluation dimensions, accurately locates the causes of hallucinations, mitigates hallucinations during inference, improves the model's scene adaptability in hallucination mitigation, and improves the accuracy and reliability of its outputs.

✦ Generated by Eureka AI based on patent content.

Figure CN122242584A_ABST

Abstract

This application discloses a method, apparatus, device, and medium for hallucination processing in a multimodal large model, relating to the fields of artificial intelligence and information processing technology. Based on a hallucination assessment framework, the attention weights of key attention heads are reweighted in a differentiated manner according to the semantic segmentation of the multimodal sequence. This keeps the attention weight allocated to each semantic segment under reasonable control, mitigating hallucinations generated during inference at the level of the model's underlying mechanism while preserving the model's normal recognition and reasoning capabilities, and thereby improving its scene adaptability in hallucination mitigation. The method includes: constructing a hallucination assessment framework; using the framework to locate the key attention heads in the multimodal large model that trigger hallucinations; and adjusting the attention weights of those key attention heads according to the semantic segmentation of the multimodal sequence, to mitigate hallucinations during the inference stage of the multimodal large model.
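The differentiated reweighting step can be sketched as follows. The abstract does not name the semantic segments or the scaling rule, so both are assumptions here: the segment labels (`"system"`, `"image"`, `"text"`), the per-segment scale factors, and the choice to renormalize each attention row so it still sums to 1 are illustrative only, and the function names are hypothetical.

```python
def reweight_attention(weights, segments, scale):
    """Scale one attention row per semantic segment, then renormalize.

    weights  : attention weights over the sequence (non-negative floats)
    segments : segment label for each position, same length as weights
    scale    : {segment label: factor}; unlisted segments keep factor 1.0
    """
    if len(weights) != len(segments):
        raise ValueError("weights and segments must have the same length")
    scaled = [w * scale.get(seg, 1.0) for w, seg in zip(weights, segments)]
    total = sum(scaled)
    if total <= 0:
        raise ValueError("reweighted attention row has no mass")
    return [w / total for w in scaled]


def apply_to_key_heads(attn, segments, scale, key_heads):
    """Reweight only the key heads; every other head is copied unchanged.

    attn      : {(layer, head): attention row}
    key_heads : set of (layer, head) pairs flagged by the assessment step
    """
    return {
        head: reweight_attention(row, segments, scale)
        if head in key_heads else list(row)
        for head, row in attn.items()
    }


# Usage: boost image tokens and damp text tokens on one flagged head.
segments = ["system", "image", "image", "text"]
attn = {(0, 1): [0.1, 0.2, 0.3, 0.4], (1, 0): [0.25, 0.25, 0.25, 0.25]}
adjusted = apply_to_key_heads(attn, segments,
                              scale={"image": 2.0, "text": 0.5},
                              key_heads={(0, 1)})
```

Restricting the change to the flagged heads reflects the abstract's stated goal of mitigating hallucinations while preserving normal recognition and reasoning. In an actual model the same operation would be applied to attention tensors inside the forward pass, not to Python lists.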