Method and device for evaluating large model based on multi-end interaction verification, equipment and medium

A multi-terminal interactive verification method uses cyclical interactive verification between a first and a second evaluation terminal to address the lack of viewpoint fusion in large model evaluation. This improves the accuracy and objectivity of the evaluation, so that the evaluation results better reflect the actual security status of the model.

CN121920557B · Active · Publication Date: 2026-06-26 · BEIJING UNIV OF POSTS & TELECOMM

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: BEIJING UNIV OF POSTS & TELECOMM
Filing Date: 2026-03-26
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

In existing technologies, large model evaluation lacks effective fusion of multi-end perspectives. As a result, the evaluation results are insufficiently objective and accurate and do not truly reflect the actual security status of the model.

Method used

A multi-terminal interactive verification method is adopted. A first and a second evaluation terminal each evaluate the responses that the large model outputs when its state changes, and the global cognitive energy is determined from their two evaluation results. While the global cognitive energy is greater than the convergence threshold, the two evaluation terminals update their evaluation results by referring to each other's views through cyclical interactive verification. Once the global cognitive energy is no longer greater than the convergence threshold, the evaluation result of the large model is determined from the most recent evaluation results.

Benefits of technology

It improves the accuracy of large model evaluation, ensures that the evaluation results are more consistent with the actual risk status of the model, avoids the one-sidedness of a single judgment, and realizes the integration and correction of evaluation viewpoints.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN121920557B_ABST
Patent Text Reader

Abstract

The application provides a large model evaluation method and device based on multi-end interaction verification, as well as equipment and a medium, and belongs to the technical field of large model evaluation. The method comprises the following steps. A plurality of test instructions in a target test instruction sequence are input into a to-be-tested large model in a preset order, and an implicit risk metric is determined when the state of the to-be-tested large model changes. A first evaluation end and a second evaluation end evaluate the answers output by the to-be-tested large model when the state changes, yielding a first evaluation result and a second evaluation result. The global cognitive energy is determined based on the first evaluation result and the second evaluation result, and a convergence threshold is determined according to the implicit risk metric. The evaluation results of the first evaluation end and the second evaluation end are then exchanged and the global cognitive energy is recalculated, repeatedly, until the global cognitive energy is not greater than the convergence threshold. The evaluation result of the to-be-tested large model is determined based on the most recently determined first and second evaluation results. The application can improve the accuracy of the large model evaluation result.
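
The exchange-and-recalculate loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the abstract does not define how the global cognitive energy, the convergence threshold, or the update of each evaluation end is computed. Here each evaluation result is assumed to be a single risk score in [0, 1], the energy is assumed to be the squared disagreement between the two scores, the threshold is assumed to shrink as the implicit risk metric grows, and each end is assumed to move a fixed fraction toward the other's score. All function names and parameters (`global_cognitive_energy`, `convergence_threshold`, `interactive_verification`, `mix`, `base`, `max_rounds`) are hypothetical.

```python
from typing import Tuple


def global_cognitive_energy(first: float, second: float) -> float:
    """Assumed definition: squared disagreement between the two scores."""
    return (first - second) ** 2


def convergence_threshold(implicit_risk: float, base: float = 0.01) -> float:
    """Assumed mapping: a higher implicit risk metric demands closer agreement."""
    return base / (1.0 + implicit_risk)


def interactive_verification(
    first: float,
    second: float,
    implicit_risk: float,
    mix: float = 0.25,
    max_rounds: int = 100,
) -> Tuple[float, float, int]:
    """Exchange results between the two ends until the energy is not
    greater than the threshold; return (final score, energy, rounds)."""
    threshold = convergence_threshold(implicit_risk)
    energy = global_cognitive_energy(first, second)
    rounds = 0
    while energy > threshold and rounds < max_rounds:
        # Each end sees the other's result and moves part of the way toward it.
        # The tuple assignment makes the update simultaneous for both ends.
        first, second = (
            first + mix * (second - first),
            second + mix * (first - second),
        )
        energy = global_cognitive_energy(first, second)
        rounds += 1
    # Assumed aggregation: average of the last two evaluation results.
    final_score = (first + second) / 2.0
    return final_score, energy, rounds


if __name__ == "__main__":
    score, energy, rounds = interactive_verification(0.2, 0.8, implicit_risk=1.0)
    print(f"score={score:.3f} energy={energy:.6f} rounds={rounds}")
```

With `mix=0.25` the gap between the two scores halves on every round, so the assumed energy falls by a factor of four per exchange and the loop terminates. The `max_rounds` guard is an added safety measure for update rules that do not contract; it is not something the abstract mentions.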