Expandable large language model jailbreaking attack method, device, medium and product

The method generates jailbreak attack prompts that meet the format requirements by updating the prompt template and adjusting the feedback parameters of the large language model. This addresses the narrow scope of security boundary assessment in large language model jailbreak attacks, and makes the method highly scalable and effective across jailbreak tasks.

CN119884311B · Active · Publication Date: 2026-06-26 · HANGZHOU HIGH-TECH ZONE (BINJIANG) INSTITUTE OF BLOCKCHAIN & DATA SECURITY +1

Cites: 2 · Cited by: 0

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: HANGZHOU HIGH-TECH ZONE (BINJIANG) INSTITUTE OF BLOCKCHAIN & DATA SECURITY
Filing Date: 2024-12-26
Publication Date: 2026-06-26

Application Information

Patent Timeline

26 Dec 2024: Application
26 Jun 2026: Publication (CN119884311B)

IPC: G06F16/3329; G06F40/186

AI Tagging

Technology Topics

Linguistic model · Algorithm

Technical Efficacy Phrases

Meet execution needs · Solve the problem of narrow assessment scope




AI Technical Summary

Technical Problem

Existing jailbreak attack methods assess the security boundary of large language models over only a narrow range, and effective assessment methods are lacking.

Method used

The method obtains a first prompt corresponding to the jailbreak task, updates the prompt template according to the role description and the format requirements, generates a second prompt that meets the format requirements, and uses the target large language model to generate second answer data. Feedback parameters are used to adjust the prompt template so that it suits different jailbreak tasks.

Benefits of technology

The method makes large language model jailbreak attacks highly scalable, avoids the narrow assessment of content security boundaries that fixed algorithms and processes cause, and adapts to a variety of jailbreak task scenarios.

✦ Generated by Eureka AI based on patent content.

Smart Images

Figure CN119884311B_ABST

Patent Text Reader

Abstract

The application relates to an expandable large language model jailbreak attack method, device, medium and product. The method comprises the following steps: obtaining a first prompt corresponding to a jailbreak task, and generating first answer data for the first prompt according to a question template; updating the written content of a preset first prompt template according to a role description and/or a scene description corresponding to the jailbreak task and a preset format requirement; rewriting the first prompt, using the first answer data as an example and combining it with the role description and/or the scene description in the first prompt template, to obtain a second prompt that meets the format requirement; and obtaining second answer data generated by a target large language model based on the second prompt. The method addresses the narrow range over which the security boundary of a large language model is evaluated when it faces jailbreak attacks.
