A prompt word sensitivity analysis method for a large language model

By employing an instance-level prompt word sensitivity analysis method that uses the prompt word sensitivity analysis index PSS together with decoding confidence, the method addresses the problem of accurately measuring the prompt word sensitivity of large language models. This enables finer-grained analysis and a deeper understanding of the model's underlying mechanisms, thereby improving the accuracy of the analysis and the guidance it provides.

CN119474254B · Active · Publication Date: 2026-06-26 · SHANGHAI ARTIFICIAL INTELLIGENCE INNOVATION CENT

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SHANGHAI ARTIFICIAL INTELLIGENCE INNOVATION CENT
Filing Date: 2024-09-13
Publication Date: 2026-06-26

Application Information

IPC: G06F16/33; G06F16/332; G06N5/04

CPC: G06F16/3344; G06F16/3329; G06N5/041

AI Technical Summary

Technical Problem

Existing technologies struggle to accurately measure the sensitivity of large language models to prompt words, and lack consideration for real-world application scenarios and user needs, making it impossible to delve into their underlying mechanisms.

Method used

An instance-level prompt word sensitivity analysis method is employed. By acquiring instance data and calculating the prompt word sensitivity analysis index PSS and the decoding confidence, the method comprehensively evaluates the prompt word sensitivity of a large language model and reveals its underlying mechanism.

Benefits of technology

It improves the accuracy of prompt word sensitivity analysis for large language models, enables a more comprehensive analysis of the model's prompt word sensitivity in both objective and subjective evaluations, and provides guidance for model development.

Abstract

The application relates to a prompt word sensitivity analysis method of a large language model, characterized in that the method comprises the following steps: S1, acquiring instance data, each instance corresponding to a prompt word set, the prompt word set comprising multiple-choice question prompt words, inputting the instance data into a large language model to be detected, and analyzing a prompt word sensitivity analysis index PSS based on the answer of the large language model; S2, inputting the multiple-choice question prompt words in the instance data into the large language model to be detected, calculating decoding confidence according to the probability of the maximum probability token corresponding to the multiple-choice question prompt word predicted by the large language model, and calculating average decoding confidence based on the decoding confidence of the multiple-choice question prompt words corresponding to multiple instances; and S3, analyzing the overall prompt word sensitivity of the large language model to be detected based on the prompt word sensitivity analysis index PSS and the average decoding confidence. Compared with the prior art, the application has the advantages of improving the accuracy of prompt word sensitivity analysis of the large language model.

Description

Technical Field

[0001] This invention relates to the technical field of prompt word sensitivity analysis for large language models, and in particular to a prompt word sensitivity analysis method for large language models.

Background Technology

[0002] Large language models are highly sensitive to subtle changes in prompt words. Even minor modifications, such as adding a few spaces, can significantly alter the performance of a large language model. This sensitivity makes it difficult for developers of large language models to determine whether changes in test set scores are due to performance improvements or prompt word selection when evaluating the model. Furthermore, users often need to fine-tune the prompt words multiple times to obtain higher-quality output.

[0003] Traditional prompt sensitivity analysis techniques primarily analyze model prompt sensitivity at the dataset level, focusing on the overall performance trend under different prompt formats. For example, the PromptBench benchmark [1] evaluates the performance of several mainstream large language models under different prompt words. Pezeshkpour et al. [2] found that even small grammatical or lexical changes can lead to a significant drop in the performance of large language models. These studies reveal the high sensitivity of large language models to prompt words, laying the foundation for further understanding and mitigation of this problem.

[0004] However, existing methods primarily analyze prompt sensitivity at the dataset level, typically based on score variations across different prompt word templates on the same dataset. This evaluation method is inaccurate and fails to reflect the model's true prompt word sensitivity. Furthermore, existing analyses of large language models do not involve subjective evaluation and lack consideration for practical application scenarios and user needs. Existing work also fails to delve into the root causes of prompt sensitivity in large language models, limiting our understanding of its underlying mechanisms.

Summary of the Invention

[0005] The purpose of this invention is to provide a method for analyzing the sensitivity of prompt words in large language models in order to improve the accuracy of prompt word sensitivity analysis.

[0006] The objective of this invention can be achieved through the following technical solutions:

[0007] A method for prompt word sensitivity analysis of a large language model, comprising the following steps:

[0008] S1. Obtain instance data comprising multiple instances, each instance corresponding to a prompt word set that includes multiple-choice question prompt words. Input the instance data into the large language model under test, and compute the prompt word sensitivity analysis index PSS based on the answers of the large language model.

[0009] S2. Input the multiple-choice question prompt words in the instance data into the large language model under test. Calculate the decoding confidence from the probability of the highest-probability token that the large language model predicts for each multiple-choice question prompt word. Calculate the average decoding confidence from the decoding confidences of the multiple-choice question prompt words corresponding to the multiple instances.

[0010] S3. Analyze the overall prompt word sensitivity of the large language model under test based on the prompt word sensitivity analysis index PSS and the average decoding confidence.

[0011] Furthermore, the prompt word sensitivity analysis index PSS is the average sensitivity of all instances in the instance data.

[0012] Furthermore, the sensitivity is:

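[0013] $$S = \frac{1}{C(|P|,2)} \sum_{i<j} \left| Y(p_i) - Y(p_j) \right|$$

[0014] where S is the sensitivity of an instance, P is the prompt word set of the instance, Y(p) represents the performance metric under prompt word p, C(|P|,2) represents the number of prompt word pairs in the same instance, and i and j represent two different prompt word indices.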
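The definitions above (a performance metric Y(p), the pair count C(|P|,2), and two distinct prompt word indices i and j) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names and the toy correctness values are hypothetical, and the sensitivity is taken to be the mean absolute difference of Y over all prompt word pairs of an instance.

```python
from itertools import combinations
from math import comb
from typing import Sequence


def instance_sensitivity(metrics: Sequence[float]) -> float:
    """Mean of |Y(p_i) - Y(p_j)| over all prompt word pairs i < j of one instance."""
    n_pairs = comb(len(metrics), 2)  # C(|P|, 2)
    if n_pairs == 0:
        raise ValueError("an instance needs at least two prompt words")
    total = sum(abs(a - b) for a, b in combinations(metrics, 2))
    return total / n_pairs


def pss(per_instance_metrics: Sequence[Sequence[float]]) -> float:
    """PSS: average of the instance sensitivities over all instances."""
    if not per_instance_metrics:
        raise ValueError("instance data is empty")
    values = [instance_sensitivity(m) for m in per_instance_metrics]
    return sum(values) / len(values)


# Hypothetical correctness values (1 = correct, 0 = wrong): 3 instances x 4 prompt words.
toy = [
    [1, 1, 1, 1],  # every prompt word gives the same result
    [1, 0, 1, 0],  # 4 of the 6 pairs disagree
    [0, 0, 0, 1],  # 3 of the 6 pairs disagree
]
toy_pss = pss(toy)
```

Working per instance rather than per dataset means that two prompt words with identical dataset-level accuracy can still yield a non-zero PSS when they succeed on different instances.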

[0015] Furthermore, the performance metric under prompt word p includes the correctness of the large language model's answer when a true value is given in the instance data.

[0016] Furthermore, the correctness of the large language model's answer is the similarity between the large language model's answer and the given true value.

[0017] Furthermore, the performance metric under prompt word p includes the score of the large language model's answer when no true value is given in the instance data.

[0018] Furthermore, the decoding confidence level corresponding to an instance is:

[0019]

[0020] where Probability(t_next | p) represents the probability of the token that the model predicts with the highest probability under prompt word p in the prompt word set P.
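The text defines Probability(t_next | p), but the way these per-prompt probabilities are combined over the prompt word set P is not spelled out in the surviving text. The sketch below therefore makes an explicit assumption: the decoding confidence of an instance is the plain average of Probability(t_next | p) over the prompt words p in P, and the average decoding confidence is the mean over instances. The function names and the toy next-token distributions are hypothetical.

```python
from typing import Mapping, Sequence

Distribution = Mapping[str, float]  # next-token probabilities, e.g. {"A": 0.7, "B": 0.3}


def max_token_probability(next_token_probs: Distribution) -> float:
    """Probability(t_next | p): probability of the most likely next token."""
    if not next_token_probs:
        raise ValueError("empty next-token distribution")
    return max(next_token_probs.values())


def instance_decoding_confidence(distributions: Sequence[Distribution]) -> float:
    """ASSUMPTION: average of Probability(t_next | p) over the prompt words p in P."""
    if not distributions:
        raise ValueError("prompt word set is empty")
    return sum(max_token_probability(d) for d in distributions) / len(distributions)


def average_decoding_confidence(instances: Sequence[Sequence[Distribution]]) -> float:
    """Mean of the instance decoding confidences over all instances."""
    if not instances:
        raise ValueError("instance data is empty")
    return sum(instance_decoding_confidence(i) for i in instances) / len(instances)


# Hypothetical multiple-choice instance with two prompt words.
instance = [
    {"A": 0.7, "B": 0.2, "C": 0.1},  # top token "A" with probability 0.7
    {"A": 0.4, "B": 0.5, "C": 0.1},  # top token "B" with probability 0.5
]
confidence = instance_decoding_confidence(instance)
```

In the multiple-choice setting the next token is typically the option letter, so a single decoding step is enough to read off the confidence.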

[0021] Furthermore, in the process of analyzing the overall prompt word sensitivity of the large language model under test, the sensitivity analysis index PSS in objective evaluation represents the probability of inconsistency in correctness between any two prompts in the same instance.

[0022] Furthermore, in the process of analyzing the overall prompt word sensitivity of the large language model under test, the sensitivity analysis index PSS in subjective evaluation represents the difference in average response quality between two prompts in the same instance.

[0023] Furthermore, the average decoding confidence is positively correlated with the sensitivity of the prompt words.

[0024] Compared with the prior art, the present invention has the following beneficial effects:

[0025] This invention provides a prompt sensitivity metric, PSS, for measuring the prompt word sensitivity of large language models at the instance level. This enables a more comprehensive and detailed analysis of prompt word sensitivity in both objective and subjective evaluations. Furthermore, the invention analyzes the mechanism of prompt word sensitivity in large language models and proposes a methodology for analyzing it from the perspective of decoding confidence, thereby improving the accuracy of prompt word sensitivity analysis.

Attached Figure Description

[0026] Figure 1 is a flowchart of the present invention;

[0027] Figure 2 is a graph showing the experimental results on prompt word sensitivity differences in this invention.

Detailed Implementation

[0028] The present invention will now be described in detail with reference to the accompanying drawings and specific embodiments. These embodiments are implemented based on the technical solution of the present invention, providing detailed implementation methods and specific operating procedures. However, the scope of protection of the present invention is not limited to the following embodiments.

[0029] This invention introduces the ProSA framework, which focuses on instance-level analysis, and proposes a method for prompt word sensitivity analysis of large language models. The flowchart of the method is shown in Figure 1. The invention quantifies the average difference in the responses of large language models to different prompt variants of the same instance and uses decoding confidence to reveal the underlying mechanisms of prompt sensitivity, with the aim of comprehensively evaluating and analyzing the prompt sensitivity of large language models. ProSA includes a novel sensitivity metric, PSS, and can analyze the prompt word sensitivity of large language models in both objective and subjective evaluations. Furthermore, the framework uses decoding confidence to explore the root causes of prompt sensitivity, aiming to provide guidance for developing more robust and user-responsive large language models.

[0030] The method of the present invention includes the following steps:

[0031] S1. Obtain instance data comprising multiple instances, each instance corresponding to a prompt word set that includes multiple-choice question prompt words. Input the instance data into the large language model under test, and compute the prompt word sensitivity analysis index PSS based on the answers of the large language model.

[0032] S2. Input the multiple-choice question prompt words in the instance data into the large language model under test. Calculate the decoding confidence from the probability of the highest-probability token that the large language model predicts for each multiple-choice question prompt word. Calculate the average decoding confidence from the decoding confidences of the multiple-choice question prompt words corresponding to the multiple instances.

[0033] S3. Analyze the overall prompt word sensitivity of the large language model under test based on the prompt word sensitivity analysis index PSS and the average decoding confidence.
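Steps S1 to S3 can be read as a small evaluation loop. The sketch below is one possible arrangement under stated assumptions, not the patented implementation: the `Model` callable interface, the `Instance` container, and `stub_model` are hypothetical stand-ins, correctness is taken as an exact match of the option letter, and the per-instance decoding confidence is assumed to be the average top-token probability over the prompt word set.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, List, Tuple

# Hypothetical model interface: prompt -> (answer letter, next-token distribution).
Model = Callable[[str], Tuple[str, Dict[str, float]]]


@dataclass
class Instance:
    prompts: List[str]  # prompt word set P: variants of one multiple-choice question
    truth: str          # true value (correct option letter)


def analyze(model: Model, data: List[Instance]) -> Tuple[float, float]:
    """Return (PSS, average decoding confidence); each instance needs >= 2 prompts."""
    sensitivities, confidences = [], []
    for inst in data:
        outputs = [model(p) for p in inst.prompts]
        # S1: correctness Y(p) per prompt, then mean pairwise absolute difference.
        y = [1.0 if answer == inst.truth else 0.0 for answer, _ in outputs]
        pairs = list(combinations(y, 2))
        sensitivities.append(sum(abs(a - b) for a, b in pairs) / len(pairs))
        # S2: top-token probability per prompt, averaged over P (assumption).
        top = [max(dist.values()) for _, dist in outputs]
        confidences.append(sum(top) / len(top))
    # S3 takes these two aggregates as its inputs.
    return sum(sensitivities) / len(data), sum(confidences) / len(data)


def stub_model(prompt: str) -> Tuple[str, Dict[str, float]]:
    """Toy stand-in: flips its answer when the prompt ends with two spaces."""
    if prompt.endswith("  "):
        return "B", {"A": 0.45, "B": 0.55}
    return "A", {"A": 0.9, "B": 0.1}


data = [
    Instance(["Q1? A or B", "Q1? A or B  ", "Answer Q1: A or B"], truth="A"),
    Instance(["Q2? A or B", "Answer Q2: A or B", "Pick one for Q2: A or B"], truth="A"),
]
pss_value, avg_confidence = analyze(stub_model, data)
```

Replacing `stub_model` with a wrapper around a real inference API would only require returning the decoded answer and the first-step token probabilities in the same shape.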

[0034] This invention designs an instance-level prompt word sensitivity analysis index (PSS), defined as follows:

[0035] $$S = \frac{1}{C(|P|,2)} \sum_{i<j} \left| Y(p_i) - Y(p_j) \right|, \qquad \mathrm{PSS} = \frac{1}{N} \sum_{n=1}^{N} S_n$$

[0036] For a given instance, Y(p) represents the performance metric for a prompt word p in the prompt word set P. For instances with a given true value, Y(p) refers to the correctness of the large language model's answer. For tasks without a clear true value, the answer typically has a score representing the quality of generation, and Y(p) refers to that score, which lies in the range [0, 1]. |Y(p_i) - Y(p_j)| represents the absolute difference in performance metrics between prompt p_i and prompt p_j. C(|P|, 2) represents the number of prompt pairs in the same instance. PSS is the average of the instance sensitivities S_n over all N instances in the same dataset.
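The two cases for Y(p) described above (correctness when a true value is given, a quality score in [0, 1] otherwise) can be captured in a single helper. This is a hedged sketch: the function name and parameters are hypothetical, and correctness is assumed here to be a case-insensitive exact match, which is only one possible similarity measure.

```python
from typing import Optional


def performance_metric(answer: str,
                       truth: Optional[str] = None,
                       judge_score: Optional[float] = None) -> float:
    """Y(p) for one prompt word p.

    With a true value: correctness (ASSUMED: exact match after normalization).
    Without one: an externally supplied quality score that must lie in [0, 1].
    """
    if truth is not None:
        return 1.0 if answer.strip().upper() == truth.strip().upper() else 0.0
    if judge_score is None:
        raise ValueError("need either a true value or a quality score")
    if not 0.0 <= judge_score <= 1.0:
        raise ValueError("quality score must lie in [0, 1]")
    return judge_score
```

For example, `performance_metric(" b ", truth="B")` counts as correct, while `performance_metric("free-form answer", judge_score=0.73)` passes the score through unchanged.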

[0037] Due to differences in task type and evaluation method, PSS has different meanings in objective and subjective evaluation. In objective evaluation, PSS represents the probability of inconsistency in correctness between any two prompts in the same instance. In subjective evaluation, PSS represents the difference in average response quality between two prompts in the same instance.
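A small worked example may make the two readings concrete. The values below are hypothetical and assume the sensitivity of an instance is the mean absolute difference of Y over all prompt pairs. In the objective case Y is 0 or 1, so each term is 1 exactly when the two prompts disagree in correctness, and the sensitivity equals the fraction of disagreeing pairs. In the subjective case Y is a quality score, and the sensitivity is the mean score gap between two prompts.

```latex
% Objective evaluation: correctness for |P| = 4 prompts (hypothetical values)
\begin{align*}
Y &= (1, 1, 0, 1), \qquad C(4,2) = 6 \\
S &= \tfrac{1}{6}\bigl(|1-1| + |1-0| + |1-1| + |1-0| + |1-1| + |0-1|\bigr)
   = \tfrac{3}{6} = 0.5
\end{align*}

% Subjective evaluation: quality scores for |P| = 3 prompts (hypothetical values)
\begin{align*}
Y &= (0.8, 0.6, 0.7), \qquad C(3,2) = 3 \\
S &= \tfrac{1}{3}\bigl(|0.8-0.6| + |0.8-0.7| + |0.6-0.7|\bigr)
   = \tfrac{0.4}{3} \approx 0.133
\end{align*}
```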

[0038] Compared to statistical analysis at the dataset level, PSS provides a more accurate and intuitive representation of cue sensitivity.

[0039] Furthermore, this invention uses token probabilities to calculate the decoding confidence of the large language model in a multiple-choice setting. The decoding confidence is defined as follows for one instance:

[0040]

[0041] Here, Probability(t_next | p) represents the probability of the token that the model predicts with the highest probability under prompt word p in the prompt word set P. The average decoding confidence of a large language model is the average of the decoding confidence over the different instances. The average decoding confidence in this invention is positively correlated with the sensitivity of the prompt words.

[0042] The key point of this invention is the framework for sensitivity analysis of prompt words in large language models, which can be specifically divided into the following two points:

[0043] (1) This invention provides a metric, PSS, for measuring the prompt word sensitivity of a large language model at the instance level.

[0044] This enables a more comprehensive and detailed analysis of the prompt word sensitivity of large language models.

[0045] (2) This invention is the first to analyze the mechanism of prompt word sensitivity in large language models and proposes a methodology for analyzing prompt word sensitivity from the perspective of the decoding confidence of large language models.

[0046] The present invention has the following advantages:

[0047] (1) Compared with dataset-level prompt word sensitivity analysis indices, the instance-level analysis index PSS of the present invention characterizes the prompt word sensitivity of large language models more comprehensively and in finer detail.

[0048] (2) This invention is the first to analyze the prompt word sensitivity of a large language model from the perspective of interpretability. Experiments were conducted on four objective datasets and two subjective datasets.

[0049] First, on four datasets (CommonsenseQA [3], ARC-Challenge [4], MATH [5], and HumanEval [6]), eight models from the InternLM2 series [7], the Llama3 series [8], the Qwen1.5 series [9], and Mistral-7B [10] were used to calculate the PSS over 12 prompt words. The experimental results, shown in Figure 2, indicate significant differences in prompt word sensitivity among the models.

[0050] In addition, experiments were conducted on three sets of prompt words using five large language models on two subjective datasets: LC AlpacaEval 2.0 [11] and Arena Hard Auto [12]. The results are shown in Table 1.

[0051] Table 1 Experimental Results

[0052]
Model                  LC AlpacaEval 2.0    Arena Hard Auto
InternLM2-20B-chat     0.022                0.249
Llama3-8B-instruct     0.013                0.266
Llama3-70B-instruct    0.016                0.258
Qwen1.5-14B-chat       0.022                0.249
Qwen1.5-72B-chat       0.036                0.250

[0053] It is evident that different models exhibit varying prompt word sensitivity on different subjective benchmarks, and that every model shows greater prompt word sensitivity on Arena Hard Auto than on LC AlpacaEval 2.0.
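The comparison between the two benchmarks can be checked directly from the values in Table 1. The short script below is only an arithmetic aid; the dictionary layout and variable names are illustrative choices, and the numbers are copied from the table without modification.

```python
# Values copied from Table 1: (LC AlpacaEval 2.0, Arena Hard Auto) per model.
table_1 = {
    "InternLM2-20B-chat": (0.022, 0.249),
    "Llama3-8B-instruct": (0.013, 0.266),
    "Llama3-70B-instruct": (0.016, 0.258),
    "Qwen1.5-14B-chat": (0.022, 0.249),
    "Qwen1.5-72B-chat": (0.036, 0.250),
}

# Column means across the five models.
lc_mean = sum(lc for lc, _ in table_1.values()) / len(table_1)
arena_mean = sum(arena for _, arena in table_1.values()) / len(table_1)

# True when every model scores higher on Arena Hard Auto than on LC AlpacaEval 2.0.
arena_always_higher = all(arena > lc for lc, arena in table_1.values())

print(f"LC AlpacaEval 2.0 mean: {lc_mean:.4f}")
print(f"Arena Hard Auto mean:   {arena_mean:.4f}")
print(f"Arena Hard Auto higher for every model: {arena_always_higher}")
```

The column means come out at roughly 0.022 and 0.254, so the gap between the benchmarks is much larger than the spread between models within either column.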

[0054] The references cited in the above experiment are as follows:

[0055] [1] PromptBench: Towards evaluating the robustness of large language models on adversarial prompts

[0056] [2] Large language models sensitivity to the order of options in multiple-choice questions

[0057] [3] CommonsenseQA: A question answering challenge targeting commonsense knowledge

[0058] [4] Think you have solved question answering? Try ARC, the AI2 reasoning challenge

[5] Measuring mathematical problem solving with the MATH dataset

[0059] [6] Evaluating Large Language Models Trained on Code

[0060] [7] InternLM2 technical report

[0061] [8] Llama 3 Model Card

[0062] [9] Qwen Technical Report

[0063] [10] Mistral 7B

[0064] [11] Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators

[12] From Live Data to High-Quality Benchmarks: The Arena-Hard Pipeline

[0065] The preferred embodiments of the present invention have been described in detail above. It should be understood that those skilled in the art can make numerous modifications and variations based on the concept of the present invention without creative effort. Therefore, all technical solutions that can be obtained by those skilled in the art based on the concept of the present invention through logical analysis, reasoning, or limited experimentation on the basis of existing technology should be within the scope of protection defined by the claims.

Claims

1. A method for prompt word sensitivity analysis of a large language model, characterized in that the method includes the following steps: S1. Obtain instance data comprising multiple instances, each instance corresponding to a prompt word set that includes multiple-choice question prompt words; input the instance data into the large language model under test, and compute the prompt word sensitivity analysis index PSS based on the answers of the large language model. S2. Input the multiple-choice question prompt words in the instance data into the large language model under test; calculate the decoding confidence from the probability of the highest-probability token that the large language model predicts for each multiple-choice question prompt word; calculate the average decoding confidence from the decoding confidences of the multiple-choice question prompt words corresponding to the multiple instances. S3. Analyze the overall prompt word sensitivity of the large language model under test based on the prompt word sensitivity analysis index PSS and the average decoding confidence; the prompt word sensitivity analysis index PSS is the average sensitivity of all instances in the instance data; the sensitivity of an instance is $S = \frac{1}{C(|P|,2)} \sum_{i<j} |Y(p_i) - Y(p_j)|$, where S is the sensitivity of a single instance, Y(p) represents the performance metric under prompt word p, C(|P|,2) represents the number of prompt word pairs in the same instance, and i and j represent two different prompt word indices; the decoding confidence for an instance is computed from Probability(t_next | p), which represents the probability of the token that the model predicts with the highest probability under prompt word p in the prompt word set P; the average decoding confidence is positively correlated with the sensitivity of the prompt words.

2. The method for analyzing the sensitivity of prompt words in a large language model according to claim 1, characterized in that the performance metric under prompt word p includes the correctness of the large language model's answer when a true value is given in the instance data.

3. The method for analyzing the sensitivity of prompt words in a large language model according to claim 2, characterized in that the correctness of the large language model's answer is defined as the similarity between the large language model's answer and the given true value.

4. The method for analyzing the sensitivity of prompt words in a large language model according to claim 1, characterized in that the performance metric under prompt word p includes the score given to the large language model's answer when no true value is given in the instance data.

5. The method for analyzing the sensitivity of prompt words in a large language model according to claim 1, characterized in that, in the process of analyzing the overall prompt word sensitivity of the large language model under test, the sensitivity analysis index PSS in objective evaluation represents the probability of inconsistency in correctness between any two prompts in the same instance.

6. The method for analyzing the sensitivity of prompt words in a large language model according to claim 1, characterized in that, in the process of analyzing the overall prompt word sensitivity of the large language model under test, the sensitivity analysis index PSS in subjective evaluation represents the difference in average response quality between two prompts in the same instance.