Live broadcast goods advertising violation clue identification method based on two-stage semantic retrieval

By combining segmented indexing and knowledge graphs with a two-stage semantic retrieval method, the problem of dilution and missed detection of violations in long-time multimodal live-streaming e-commerce videos is solved, achieving high-precision identification of violation clues and evidence generation, and supporting automated compliance review.

CN122309807A · Pending · Publication Date: 2026-06-30 · HARBIN INST OF TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: HARBIN INST OF TECH
Filing Date: 2026-03-20
Publication Date: 2026-06-30

Application Information

Patent Timeline

20 Mar 2026: Application
30 Jun 2026: Publication (CN122309807A)

IPC: G06F16/783; G06F16/78; G06F16/71; G06V20/40; G06V20/62; G06V10/74; G06V10/75; G06N3/0455; G06N5/022; G06N5/02; G06N5/04; G06Q30/0241

AI Tagging

Technology Topics

Semantic filtering; Engineering

Similar Technology Patents

A method and apparatus for identifying an emotional cause segment
CN122287649A: Linguistic model; Semantic filtering
Radio map construction and non-cooperative radiated source positioning method based on agent interaction
CN122248526A: Position fixation; Transmission monitoring; Semantic filtering; Noise (radio)
A relationship-guided full-induction multi-modal knowledge graph reasoning method and system
CN122264091A: Digital data information retrieval; Semantic analysis; Semantic filtering; Message delivery
Object detection and coordinate output method based on visual large language model
CN122289221A: Semantic filtering; Linguistic model
Enhanced segment analysis and quality control for content distribution
US20260181212A1: Selective content distribution; Semantic filtering; Video recognition

AI Technical Summary

Technical Problem

Existing technologies struggle to detect illegal content in long-duration, multimodal live-streaming e-commerce videos. Video length and the mix of modalities dilute violation signals and cause missed detections and false positives, and the results lack a traceable evidence chain.

Method used

It adopts a framework of segmented indexing + knowledge graph + two-stage semantic retrieval + large-model reasoning. Through four stages (video index construction, coarse-grained retrieval of prohibited words, fine-grained matching of advertising law, and compliance determination by a large model), it achieves deep alignment and traceable mapping between multimodal content and legal semantics.

Benefits of technology

It achieves high-precision, explainable, and traceable automated violation detection for live-streaming e-commerce advertisements, can accurately locate violation clues in long videos and generate evidence chains, and supports automated compliance review by regulatory authorities or platforms.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

This invention discloses a method for identifying violations in live-streaming e-commerce advertisements based on two-stage semantic retrieval, comprising the following steps: Step 1, video index construction; Step 2, coarse-grained retrieval of prohibited words; Step 3, fine-grained matching of advertising laws; Step 4, compliance determination by a large-scale model. This invention introduces for the first time a two-stage semantic retrieval mechanism combining video knowledge graph and prohibited word retrieval, and video content and legal vector retrieval. First, coarse-grained screening is performed using a prohibited word database, followed by fine-grained matching and reasoning using legal provisions, achieving hierarchical violation identification and semantic filtering. Finally, the recalled video content, detection requests, and matched legal provisions are input together into a large-scale language model to obtain traceable violation determination results.

Description

Technical Field

[0001] This invention relates to the fields of artificial intelligence and digital content compliance review technology, specifically a method for identifying clues of violations in live-streaming e-commerce advertisements based on two-stage semantic retrieval.

Background Technology

[0002] In recent years, advertising formats have rapidly evolved from static text and images to long-duration, multimodal video ads, with live-streaming e-commerce being a prime example. Live-streaming e-commerce videos typically last tens of minutes to several hours, involving multiple modalities such as host announcements, product demonstrations, subtitles, background images, and audience comments. The content is highly dense and contextually relevant, often employing misleading language through absolute statements, medical terminology, value-driven narratives, or visual cues. Illegal and non-compliant practices are often covert, and because these expressions are frequently subtly distributed within long time segments, traditional content moderation techniques struggle to detect them effectively. While retrieval enhancement methods can augment the knowledge coverage of large language models through external knowledge bases, existing methods primarily target short-text question answering and document-level retrieval tasks, and have not yet addressed multimodal feature alignment and legal semantic modeling for long-video scenarios.

[0003] The characteristics of long-form live-streaming e-commerce advertisements present the following key technical challenges: First, live-streaming e-commerce videos are typically long, with semantic content changing significantly over time, containing multiple semantic units such as product introductions, efficacy descriptions, pricing information, and user feedback. Illegal semantics are often scattered across different time segments, and their illegal tendencies only become apparent when combined with the context. Existing methods often employ whole-segment or windowed text analysis, failing to perform segmented modeling and semantic fragment-level indexing for long videos, leading to diluted semantic expression and inaccurate localization. Second, live-streaming videos simultaneously contain multiple modalities, including audio, subtitles, and visual images, with semantic complementarity and redundancy existing between these modalities. Traditional detection methods often utilize only a single modality, ignoring the correlation between visual scenes, character actions, and audio content, resulting in missed or misjudged violations. Third, there is a lack of traceable chains of evidence for violations. Existing models often only output a classification result of "violation exists" or "violation does not exist," failing to pinpoint the time, text location, or visual image where the illegal content appeared, resulting in a lack of interpretability and legal evidentiary value in the review results. Especially in advertising law enforcement and content arbitration, the ability to trace evidence is a core indicator.

[0004] To address this, the technical solution proposed in this invention introduces for the first time a framework of "segmented indexing + knowledge graph + two-stage semantic retrieval + large-scale model reasoning" to achieve deep alignment between multimodal advertising content and regulatory semantics, segmented processing of long-sequence videos, and the establishment of semantic associations and traceable mappings between each video segment and prohibited sensitive words. This enables semantic alignment and reasoning between regulatory semantics and video expression, and generates interpretable evidence chains. This method is the first to integrate "prohibited word multimodal retrieval + video knowledge graph + regulatory knowledge tracing" in the field of advertising compliance, making it particularly suitable for automated review and evidence generation in long-video scenarios such as live-streaming e-commerce. It achieves high-precision, interpretable, and traceable automated advertising compliance detection.

Summary of the Invention

[0005] The purpose of this invention is to provide a method for identifying illegal clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval, so as to solve the problems mentioned in the background art.

[0006] To achieve the above objectives, the present invention provides the following technical solution: a method for identifying illegal clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval, comprising the following steps:

[0007] Step 1: Video Index Construction: Input a long video of a live-streaming e-commerce advertisement, segment the video and perform multimodal information extraction, including speech transcription, visual frame sampling to generate descriptions, subtitle OCR extraction and entity relation extraction, to form a video knowledge graph and a vector index of the text;

[0008] Step 2, Coarse-grained search of prohibited words: Input the prohibited word database document, and after semantic embedding encoding, search for semantically similar segments and their graph nodes in the video knowledge graph and video text vector database to obtain preliminary candidate video content;

[0009] Step 3: Fine-grained matching of advertising law: Based on the structured advertising law JSON file, a legal database is built. The recalled video content text and knowledge graph entity information are input into the legal vector database for semantic matching to obtain the corresponding legal provisions.

[0010] Step 4: Compliance Determination of the Large Model: Input the video clips recalled in Step 2, the legal provisions hit in Step 3, and the advertising compliance detection request into the large language model, and output the violation category, evidence chain, legal provision number, and determination reason.

[0011] Preferably, the specific implementation of video segmentation and multimodal information extraction in step one includes: dividing the long live-streaming e-commerce video into several segments of 30-second duration; extracting 5 keyframes from each video segment using uniform interval sampling and recording their timestamps; transcribing the audio into speech text using the Distil-Whisper ASR model; generating keyframe scene descriptions using the Qwen2-VL-7B visual language model; and extracting scene text at a rate of one frame per second using the open-source OCR tool EasyOCR; the multimodal semantic representation of each video segment is composed of speech text, visual description, and OCR text.

[0012] Preferably, the specific implementation of the knowledge graph and text vector index in step one includes: merging the multimodal semantic text of all video segments to form a text block set, and encoding it through a text encoding model to obtain a text vector library; extracting entities and relations from the text block set through a large language model to construct an advertising semantic knowledge graph, where the graph entity nodes store the unique identifier, timestamp, and entity description of the video segment and vectorize the description, and the relation edges store the logical relation description and vectorize it.

[0013] Preferably, the specific implementation of the coarse-grained retrieval of prohibited words in step two includes: using the embedding-3-large text embedding model to semantically encode the prohibited word library and generate a set of word vectors; calculating the semantic similarity between the video text, graph nodes and prohibited words based on cosine similarity in the video text vector library and the knowledge graph entity node vector library, setting a threshold to filter and obtain two preliminary candidate sets; introducing a structured rearrangement mechanism based on graph edges to perform one-hop neighbor expansion and rearrangement score calculation on the graph candidate sets, and taking the union of the two candidate sets to form a comprehensive candidate video segment set.

[0014] Preferably, the structured rearrangement score is calculated by a formula (not reproduced in this text) whose terms are the structural relevance weighting coefficient, the one-hop neighbor set of a node, the neighbor node vectors, and the violation word vectors.

[0015] Preferably, the specific implementation of fine-grained matching of advertising law in step three includes: constructing a set of legal blocks in terms of clauses and encoding them to generate a legal vector library based on a structured advertising law JSON file containing clause numbers, violation descriptions and legal basis; vectorizing the candidate video clip text and the corresponding graph entity node descriptions after concatenation; calculating semantic similarity in the legal vector library; retaining the top K laws with the highest similarity; and forming a triple mapping of query request-video clip-legal clause.

[0016] Preferably, the specific implementation of the compliance determination of the large model in step four includes: concatenating the violation detection request text, candidate video content text, and the set of matched legal clauses into contextual prompts, inputting them into the large language model to perform compliance reasoning; the model output results include violation determination results, violation category, evidence chain, matching legal clause number and clause summary, and determination reason, wherein the evidence chain includes the corresponding video clip, timestamp, voice text, screen description and OCR text.

[0017] Preferably, the violation categories include five major categories: issues involving guidance, politically sensitive information, absolute and exaggerated terms, sensitive and prohibited words, and suspected medical terms, with each category containing corresponding subcategories.

[0018] Preferably, step one also includes encoding the regulatory documents to generate a regulatory vector library, providing basic data support for subsequent fine-grained matching of advertising laws.

[0019] Compared with the prior art, the beneficial effects of the present invention are:

[0020] This invention introduces for the first time a two-stage semantic retrieval mechanism combining video knowledge graph and prohibited word retrieval, as well as video content and legal vector retrieval. First, a coarse-grained screening is performed using a prohibited word database, followed by fine-grained matching and reasoning using legal provisions. This achieves hierarchical violation identification and semantic filtering. Finally, the recalled video content, detection requests, and matched legal clauses are input into a large language model to obtain traceable violation determination results. The overall process consists of four stages: video index construction, coarse-grained prohibited word retrieval, fine-grained matching using advertising law, and compliance determination using a large language model. By combining video multimodal information retrieval, legal knowledge base, semantic association of knowledge graph, and visual evidence localization, it achieves fine-grained analysis and automatic identification of potential violations in live-streaming e-commerce advertising content. It can accurately locate violation clues in the implicit expressions of advertising content, forming a video evidence chain and providing automated compliance review capabilities for regulatory authorities or platforms.

Attached Figure Description

[0021] Figure 1 is the overall flowchart of the system of the present invention;

[0022] Figure 2 is a detailed flowchart of step one of the present invention;

[0023] Figure 3 is a detailed flowchart of step two of the present invention;

[0024] Figure 4 is a detailed flowchart of step three of the present invention;

[0025] Figure 5 is a detailed flowchart of step four of the present invention;

[0026] Figure 6 is an example diagram of the present invention.

Detailed Implementation

[0027] The technical solutions of the present invention will be clearly and completely described below with reference to the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present invention.

[0028] Referring to Figures 1-6, this invention provides a method for identifying illegal clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval, comprising the following steps:

[0029] Step 1: Video Index Building Phase

[0030] This stage aims to extract multimodal information from long videos and build a structured index to provide data support for subsequent retrieval and matching. Specifically, it includes the following sub-steps:

[0031] 101 Video Segmentation and Multimodal Preprocessing:

[0032] The long live-streaming e-commerce video is divided into 30-second segments to obtain a set of video clips \(S = \{s_1, s_2, \dots, s_N\}\), where \(N\) is the total number of video segments. Uniform interval sampling is used to extract 5 keyframes from each video segment, and the timestamp of each keyframe is recorded, forming the keyframe image set \(F_i = \{f_{i,k}\}\), where \(k\) is the keyframe index, \(1 \le k \le 5\).

[0033] Audio processing: The Distil-Whisper ASR model is used to transcribe the audio portion of each video segment into speech text \(A_i\), where \(A_i\) denotes the speech transcription result of the \(i\)-th video clip.

[0034] Visual description generation: The Qwen2-VL-7B visual language model is used to analyze each keyframe image and generate a brief scene description \(D_{i,k}\), the text describing the scene in the \(k\)-th keyframe of the \(i\)-th video clip.

[0035] OCR text extraction: To fully extract the text embedded in the video frames, the open-source OCR tool EasyOCR is used to perform optical character recognition at a sampling interval of 1 s (one frame per second), yielding the set of text recognition results \(O_i\), which denotes all OCR results of the \(i\)-th video clip at that sampling rate.

[0036] Multimodal semantic integration: The multimodal semantic representation of each video segment is composed of speech text, visual description, and OCR text, i.e. \(T_i = \{A_i, D_i, O_i\}\) with \(D_i = \{D_{i,k}\}\). This is used for subsequent semantic retrieval and knowledge graph construction.
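The timing side of sub-step 101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Segment`, `split_video`, and `ocr_sample_times` are hypothetical, and placing each keyframe at the midpoint of five equal sub-intervals is an assumption, since the text only says "uniform interval sampling". The ASR, visual-language, and OCR model calls are deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    index: int                   # 1-based segment number
    start: float                 # segment start time in seconds
    end: float                   # segment end time in seconds
    keyframe_times: List[float]  # timestamps (seconds) of the sampled keyframes


def split_video(duration_s: float, segment_s: float = 30.0, keyframes: int = 5) -> List[Segment]:
    """Split [0, duration_s) into fixed-length segments with evenly spaced keyframes."""
    segments = []
    start = 0.0
    index = 1
    while start < duration_s:
        end = min(start + segment_s, duration_s)  # the last segment may be shorter
        step = (end - start) / keyframes
        # Assumption: one keyframe at the midpoint of each equal sub-interval.
        times = [start + step * (k + 0.5) for k in range(keyframes)]
        segments.append(Segment(index, start, end, times))
        start = end
        index += 1
    return segments


def ocr_sample_times(segment: Segment, interval_s: float = 1.0) -> List[float]:
    """Timestamps at which frames would be grabbed for OCR (one frame per interval)."""
    times = []
    t = segment.start
    while t < segment.end:
        times.append(t)
        t += interval_s
    return times
```

For a 75-second video this yields three segments, the last one 15 seconds long, and each full segment contributes 30 OCR frames.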

[0037] 102 Text Blocking and Knowledge Graph Construction:

[0038] The multimodal semantic texts of all video clips are merged to form a set of text blocks \(B = \{b_1, b_2, \dots, b_M\}\). Each text block \(b_m\) is encoded by a text encoding model to obtain its text vector representation \(x_m\), and the vectors are collected into the video text vector library.

[0039] A large language model performs entity and relation extraction on the text block set \(B\) to construct an advertising semantic knowledge graph \(G_h = (V_h, E_h)\), where \(V_h\) is the set of entity nodes and \(E_h\) is the set of relation edges:

[0040] Each node in the entity node set \(V_h\) stores the unique identifier (id) of the corresponding video segment, a timestamp (used for locating the segment later), and a description field characterizing the entity type (such as product, effect, or scenario). The description field is vectorized to obtain the node vector, which is used for subsequent node-level semantic retrieval;

[0041] Each edge in the relation edge set \(E_h\) has a description field characterizing the logical relationship in the advertising semantics (such as promotion, inclusion, or implication). The description field is vectorized to obtain the edge vector, which is used for the structured rearrangement of subsequently recalled videos.
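One possible in-memory layout for the graph described above is sketched below. The class and field names are hypothetical, and `toy_embed` is only a stand-in word counter so the example runs without a model; the source uses a text encoding model for the vectors and a large language model to extract the entities and relations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Vector = List[float]


@dataclass
class EntityNode:
    name: str
    segment_id: int                 # unique identifier of the source video segment
    timestamp: Tuple[float, float]  # (start, end) in seconds, for later localization
    description: str                # entity description, e.g. its type and context
    vector: Vector = field(default_factory=list)


@dataclass
class RelationEdge:
    source: str
    target: str
    description: str                # logical relation, e.g. "promotes", "contains"
    vector: Vector = field(default_factory=list)


class KnowledgeGraph:
    def __init__(self, embed: Callable[[str], Vector]):
        self.embed = embed
        self.nodes: Dict[str, EntityNode] = {}
        self.edges: List[RelationEdge] = []

    def add_node(self, node: EntityNode) -> None:
        node.vector = self.embed(node.description)  # vectorize the description field
        self.nodes[node.name] = node

    def add_edge(self, edge: RelationEdge) -> None:
        edge.vector = self.embed(edge.description)
        self.edges.append(edge)

    def one_hop_neighbors(self, name: str) -> List[str]:
        """Names of nodes directly connected to `name`, ignoring edge direction."""
        found: List[str] = []
        for e in self.edges:
            if e.source == name and e.target not in found:
                found.append(e.target)
            elif e.target == name and e.source not in found:
                found.append(e.source)
        return found


def toy_embed(text: str) -> Vector:
    """Stand-in encoder: counts of a few hand-picked words. Not a real embedding model."""
    vocab = ["cream", "cure", "guaranteed", "price"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]
```

Storing the segment id and timestamp on every node is what lets a later hit on a node be traced back to a position in the video.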

[0042] At the same time, the regulatory documents are encoded to generate a regulatory vector library, providing a foundation for subsequent regulatory matching.

[0043] Step 2: Coarse-grained search stage for prohibited words

[0044] This stage utilizes a prohibited keyword database to initially screen video content and quickly locate potentially prohibited segments. It includes the following sub-steps:

[0045] 201 Semantic Embedding Encoding of the Violation Thesaurus: Collect manually annotated violation thesaurus documents and use the embedding-3-large text embedding model to perform semantic embedding encoding on each violation word \(w_j\) in the violation thesaurus \(W\), generating the set of word vectors \(\{z_j\}\), where \(z_j\) is the embedding of \(w_j\).

[0046] 202 Semantic Similarity Retrieval and Initial Candidate Set Screening:

[0047] In the video text vector library, the semantic similarity between each video text vector \(x_m\) and each violation word vector \(z_j\) is calculated based on cosine similarity, using the following formula: \(\mathrm{sim}(x_m, z_j) = \dfrac{x_m \cdot z_j}{\lVert x_m \rVert \, \lVert z_j \rVert}\). Set a similarity threshold t1 and retain all video clips whose similarity is greater than t1 to form the preliminary text-recall candidate set of violation videos.

[0048] In the entity node vector set of the video knowledge graph, the same cosine similarity calculation is used to compute the semantic similarity between each node vector and each violation word vector. Set a similarity threshold t2 and retain all video clips whose similarity is greater than t2 to form the preliminary graph-recall candidate set of violation videos.
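The threshold screening in sub-step 202 can be illustrated with plain Python. The function names are hypothetical. Keeping an item when its best match over all violation words exceeds the threshold is equivalent to "some violation word exceeds the threshold"; recording that best score for later sorting is an assumption of this sketch.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def screen(item_vectors: Dict[int, List[float]],
           word_vectors: List[List[float]],
           threshold: float) -> Dict[int, float]:
    """Return {item_id: best similarity} for items whose best match exceeds threshold."""
    hits = {}
    for item_id, vec in item_vectors.items():
        best = max(cosine(vec, w) for w in word_vectors)
        if best > threshold:
            hits[item_id] = best
    return hits
```

The same function can serve both recalls: called with the video text vectors and t1 it gives the text-recall set, and called with the node vectors and t2 it gives the graph-recall set.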

[0049] 203 Structured Rearrangement and Candidate Set Fusion:

[0050] To make full use of the structural characteristics of the knowledge graph, a structured rearrangement mechanism based on graph edges is introduced to reorder the graph-recall candidate set. For each entity node in that set, its one-hop neighbor node set is expanded and the structured rearrangement score is calculated (the formula is not reproduced in this text); the score is defined in terms of the structural relevance weighting coefficient, the one-hop neighbor set of the node, and the vector representation of each neighboring node.

[0051] The text-recall candidate set is sorted by semantic similarity score, the graph-recall candidate set is sorted by structured rearrangement score, and the union of the two sets is taken to form the comprehensive candidate video clip set \(H\). This provides high-confidence semantic input for the subsequent regulatory matching stage.
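The source formula for the structured rearrangement score is not legible in this text, so the sketch below uses an assumed form that is merely consistent with the listed ingredients (a weighting coefficient, the one-hop neighbor set, neighbor vectors, and violation word vectors): the node's own best similarity plus a weighted mean of its neighbors' best similarities. The patent's actual formula may differ. All function names, and the default `lam=0.5`, are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0.0 or nb == 0.0 else dot / (na * nb)


def best_match(vec: Sequence[float], word_vectors: List[List[float]]) -> float:
    return max(cosine(vec, w) for w in word_vectors)


def rerank_score(node: str,
                 node_vectors: Dict[str, List[float]],
                 neighbors: Dict[str, List[str]],
                 word_vectors: List[List[float]],
                 lam: float = 0.5) -> float:
    """ASSUMED form: own similarity + lam * mean similarity of one-hop neighbors."""
    own = best_match(node_vectors[node], word_vectors)
    nbrs = neighbors.get(node, [])
    if not nbrs:
        return own
    nbr_mean = sum(best_match(node_vectors[u], word_vectors) for u in nbrs) / len(nbrs)
    return own + lam * nbr_mean


def fuse(text_hits: Dict[int, float],
         graph_scores: Dict[str, float],
         node_to_segment: Dict[str, int]) -> Dict[int, Dict[str, float]]:
    """Union of text-recalled and graph-recalled segment ids, keeping each score."""
    fused: Dict[int, Dict[str, float]] = {}
    for seg, score in text_hits.items():
        fused.setdefault(seg, {})["text"] = score
    for node, score in graph_scores.items():
        entry = fused.setdefault(node_to_segment[node], {})
        entry["graph"] = max(score, entry.get("graph", float("-inf")))
    return fused
```

Under this assumed form, a node whose neighbors also resemble violation words is ranked above an isolated node with the same own similarity, which is one way graph structure could sharpen the coarse recall.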

[0052] Step 3: Fine-grained matching stage of advertising law

[0053] This stage involves precisely matching candidate video clips with advertising law provisions to determine the legal basis for potential violations. This includes the following sub-steps:

[0054] 301 Construction of the regulatory knowledge base:

[0055] Import the structured advertising law JSON file, which contains core information such as clause numbers, violation descriptions, and legal basis. Then build a set of legal blocks organized by clause. Each regulatory block l is encoded using a text encoding model to generate the regulatory text vector library, forming a structured legal knowledge base.
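As a rough illustration of sub-step 301, the snippet below turns a JSON array into one text block per clause. The key names `clause_no`, `violation_description`, and `legal_basis` are hypothetical; the source only says the file contains clause numbers, violation descriptions, and legal basis, without giving a schema.

```python
import json
from typing import Dict, List


def load_legal_blocks(json_text: str) -> List[Dict[str, str]]:
    """Turn a JSON array of clauses into one text block per clause."""
    blocks = []
    for item in json.loads(json_text):
        # Assumed schema: each item has these three string fields.
        text = (f"{item['clause_no']}: {item['violation_description']} "
                f"({item['legal_basis']})")
        blocks.append({"clause_no": item["clause_no"], "text": text})
    return blocks
```

Each block's `text` would then be passed to the text encoding model, and the clause number kept alongside the vector so a match can be reported by clause.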

[0056] 302 Semantic Matching and Regulatory Mapping:

[0057] For each video clip in the comprehensive candidate video clip set, the clip text and the description of its corresponding knowledge graph entity node are concatenated to obtain a concatenated text, which is vectorized and used to query the regulatory vector library.

[0058] The semantic similarity between the concatenated text vector and each legal vector in the legal vector library is calculated based on cosine similarity. The top K legal provisions with the highest similarity are retained to form a triple mapping of query request, video clip content, and corresponding legal provisions, where Q is the violation detection query request and R is the matched set of legal provisions. The matched set provides a rule-based basis for the subsequent large language model determination.
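A minimal sketch of the top-K matching in sub-step 302 follows. The function `match_clauses` and the dictionary layout of the returned triple are hypothetical, and `embed` stands for whichever text encoding model is used; here it is passed in as a callable so the example stays self-contained.

```python
import heapq
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0.0 or nb == 0.0 else dot / (na * nb)


def match_clauses(query: str,
                  clip_id: int,
                  clip_text: str,
                  node_description: str,
                  embed: Callable[[str], List[float]],
                  legal_vectors: Dict[str, List[float]],
                  k: int = 3) -> dict:
    """Concatenate clip text and node description, then keep the k closest clauses."""
    joined = f"{clip_text} {node_description}"
    vec = embed(joined)
    scored = [(cosine(vec, lv), clause) for clause, lv in legal_vectors.items()]
    top = heapq.nlargest(k, scored)  # highest similarity first
    # The triple: query request, video clip, matched legal clauses.
    return {"query": query, "clip": clip_id, "clauses": [clause for _, clause in top]}
```

Returning the clip id inside the triple keeps the link from each matched clause back to a specific segment, which the evidence chain relies on.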

[0059] Step 4: Compliance determination stage of the large model

[0060] This stage uses a large language model to perform final semantic reasoning, outputting interpretable and traceable violation judgment results, specifically including the following sub-steps:

[0061] 401 Model Input Construction:

[0062] The violation detection request text Q, the video content text of the comprehensive candidate video clip set H, and the set of hit legal provisions R from the triple mapping are concatenated to form a contextual prompt in the following format: "Violation detection request: {Q}; Candidate video content: {H}; Matched legal clause: {R}; Please determine whether the video violates the relevant provisions of the Advertising Law. If it does, please specify the violation category, the corresponding video segment and timestamp, the matching legal clause number and article summary, and the reason for the determination."
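The prompt format quoted above maps directly onto a string template. In this sketch the template text is taken from the paragraph; how clips and clauses are rendered into the {H} and {R} slots (the `start`, `end`, `text`, `clause_no`, and `summary` keys and the " | " separator) is an assumption.

```python
from typing import Dict, List

PROMPT_TEMPLATE = (
    "Violation detection request: {Q}; "
    "Candidate video content: {H}; "
    "Matched legal clause: {R}; "
    "Please determine whether the video violates the relevant provisions of the "
    "Advertising Law. If it does, please specify the violation category, the "
    "corresponding video segment and timestamp, the matching legal clause number "
    "and article summary, and the reason for the determination."
)


def build_prompt(request: str, clips: List[Dict[str, str]], clauses: List[Dict[str, str]]) -> str:
    """Fill the template with the request, recalled clips, and matched clauses."""
    # Assumed rendering: "[start-end] text" per clip, "number: summary" per clause.
    h = " | ".join(f"[{c['start']}-{c['end']}] {c['text']}" for c in clips)
    r = " | ".join(f"{c['clause_no']}: {c['summary']}" for c in clauses)
    return PROMPT_TEMPLATE.format(Q=request, H=h, R=r)
```

Carrying the timestamps into the prompt gives the model what it needs to cite a specific segment in its answer.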

[0063] 402 Compliance Reasoning and Output:

[0064] Input the constructed contextual prompt into the large language model and perform compliance inference. The model output includes:

[0065] Violation determination results: {No violation, suspected violation, confirmed violation};

[0066] Violation Categories: {1. Issues involving guidance: a. stability and unity; b. identity discrimination; c. vulnerable groups; d. using charity for personal gain. 2. Politically sensitive information: a. security and dignity; b. authoritative image; c. politically sensitive language. 3. Absolute and exaggerated language: a. absolute language; b. exaggerated statements; c. false and misleading statements. 4. Sensitive and prohibited words: a. red-line content; b. standard language. 5. Suspected medical terminology: a. confusing categories; b. treatment and cure; c. medical institutions.};

[0067] Chain of evidence: corresponding video clips, timestamps, audio text, on-screen descriptions, and OCR text;

[0068] Matching regulatory clauses: regulatory clause number and clause summary;

[0069] Reason for judgment: Based on the detailed explanation of violations in the Advertising Law.
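The five output fields listed above can be captured in a small record type. This is only a sketch: the source does not say the model replies in JSON, so the JSON parsing, the `Finding` class, and its field names are all assumptions made for illustration. The three allowed verdict values come from the text.

```python
import json
from dataclasses import dataclass

VERDICTS = {"no violation", "suspected violation", "confirmed violation"}


@dataclass
class Finding:
    verdict: str         # one of VERDICTS
    category: str        # e.g. "3a" for absolute terms
    evidence: dict       # clip id, timestamp, speech text, visual description, OCR text
    clause_no: str       # matched legal clause number
    clause_summary: str
    reason: str


def parse_finding(raw: str) -> Finding:
    """Parse a JSON reply (assumed format) and check the verdict value."""
    finding = Finding(**json.loads(raw))
    if finding.verdict.lower() not in VERDICTS:
        raise ValueError(f"unexpected verdict: {finding.verdict!r}")
    return finding
```

Rejecting unknown verdict values early is a cheap guard against free-form model replies slipping into a review workflow.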

[0070] Example:

[0071] To verify the effectiveness of the method of this invention, a total of 1681 live-streaming e-commerce video advertisement samples were tested. Each video may correspond to multiple violation categories, and the samples all included violation categories marked by advertising regulatory authorities as a reference. Experimental results show that the method of this invention can effectively identify illegal content in video clips. Among them, the number of samples whose predicted categories are consistent with the manual annotation results is 1480, and the overall recognition accuracy is approximately 88%.

[0072] Furthermore, to analyze the recognition performance under different violation categories, the recognition accuracy was statistically analyzed for five high-frequency violation types, and the results are shown in the table below. This method achieves high detection performance in categories such as "absolute terms," "exaggerated statements," and "false inducements," indicating that this method can effectively adapt to common violation expressions in live-streaming e-commerce videos.

[0073]
Violation category | Number of samples | Number of correct predictions | Accuracy (%)
Exaggerated statements | 1011 | 894 | 88.4
Absolute terms | 938 | 864 | 92.1
False inducement | 447 | 410 | 91.7
Confusing categories | 162 | 130 | 80.2
Treatment and cure | 78 | 70 | 89.7
Overall | 1681 | 1480 | 88.0
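The reported accuracies can be recomputed from the sample and correct-prediction counts as a quick consistency check. The counts below are copied from the table; the helper name `accuracy_pct` is hypothetical.

```python
# (samples, correct predictions) per category, as reported in the table
ROWS = {
    "Exaggerated statements": (1011, 894),
    "Absolute terms": (938, 864),
    "False inducement": (447, 410),
    "Confusing categories": (162, 130),
    "Treatment and cure": (78, 70),
    "Overall": (1681, 1480),
}


def accuracy_pct(samples: int, correct: int) -> float:
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / samples, 1)


if __name__ == "__main__":
    for name, (samples, correct) in ROWS.items():
        print(f"{name}: {accuracy_pct(samples, correct)}")
```

The five per-category sample counts sum to 2636, more than the 1681 videos, which agrees with the statement that one video may carry several violation categories.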

[0074] Example: see Figure 6.

[0075] Input: A 15-minute live-streaming e-commerce video.

[0076] Annotation: The advertising regulatory authorities determined that the video violated categories 3a (Absolute and exaggerated terms: absolute terms) and 3b (Absolute and exaggerated terms: exaggerated statements).

[0077] Output:

[0078] Violation determination result: Violation confirmed.

[0079] Violation Category: 3b: Absolute and Exaggerated Terms - Exaggerated Statements

[0080] Clue result: "Full-term, guaranteed income is 100% secure." (1:03-1:06)

[0081] Matching legal provision: Article 25, Paragraph 1 of the Advertising Law

[0082] Reason for judgment: The advertisement's promise that "future, lifelong, orderly income is 100% guaranteed" constitutes a guarantee of investment returns, which is an exaggerated statement and misleads consumers into believing that the returns are risk-free.

[0083] Violation determination result: Violation confirmed.

[0084] Violation Category: 3a: Absolute and Exaggerated Terms - Absolute Terms

[0085] Clue result: "No variables whatsoever." (13:12-13:16)

[0086] Matching legal provision: Article 9, Paragraph 3 of the Advertising Law

[0087] Reason for judgment: The use of absolute terms such as "there are no variables" implies that the product's returns are completely certain and excludes all uncertainties, which violates the rule against the use of absolute terms.

[0088] Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A method for identifying illegal clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval, characterized in that: Includes the following steps: Step 1: Video Index Construction: Input a long video of a live-streaming e-commerce advertisement, segment the video and perform multimodal information extraction, including speech transcription, visual frame sampling to generate descriptions, subtitle OCR extraction and entity relation extraction, to form a video knowledge graph and a vector index of the text; Step 2, Coarse-grained search of prohibited words: Input the prohibited word database document, and after semantic embedding encoding, search for semantically similar segments and their graph nodes in the video knowledge graph and video text vector database to obtain preliminary candidate video content; Step 3: Fine-grained matching of advertising law: Based on the structured advertising law JSON file, a legal database is built. The recalled video content text and knowledge graph entity information are input into the legal vector database for semantic matching to obtain the corresponding legal provisions. Step 4: Compliance Determination of the Large Model: Input the video clips recalled in Step 2, the legal clauses hit in Step 3, and the advertising compliance detection request into the large language model, and output the violation category, evidence chain, legal clause number, and determination reason.

2. The method for identifying illegal clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval according to claim 1, characterized in that: The specific implementation of video segmentation and multimodal information extraction in step one includes: dividing the long live-streaming e-commerce video into several segments of 30 seconds each; extracting 5 keyframes from each segment using uniform interval sampling and recording their timestamps; transcribing the audio into speech text using the Distil-Whisper ASR model; generating keyframe descriptions using the Qwen2-VL-7B visual language model; and extracting text from the screen using the open-source OCR tool EasyOCR at a rate of one frame per second; the multimodal semantic representation of each video segment is composed of speech text, visual description, and OCR text.

3. The method for identifying violation clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval according to claim 1, characterized in that the construction of the knowledge graph and the text vector index in Step 1 specifically comprises: merging the multimodal semantic text of all video segments into a set of text blocks and encoding the set with a text encoding model to obtain a text vector library; and extracting entities and relations from the set of text blocks with a large language model to construct an advertising semantic knowledge graph, wherein each entity node of the graph stores the unique identifier of the video segment, the timestamp, and the entity description, which are vectorized, and each relation edge stores the logical relation description, which is vectorized.
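
One possible in-memory layout for the graph of claim 3 is sketched below. All class and field names are hypothetical, the entity and relation tuples are assumed to have been extracted already (the large-language-model extraction step is not shown), and `toy_embed` is a stand-in for a real text encoding model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Vector = List[float]


@dataclass
class EntityNode:
    name: str
    segment_id: str    # unique identifier of the source video segment
    timestamp: float   # seconds from the start of the video
    description: str
    vector: Vector     # embedding of the description


@dataclass
class RelationEdge:
    source: str
    target: str
    description: str   # logical relation description
    vector: Vector     # embedding of the description


@dataclass
class AdGraph:
    nodes: Dict[str, EntityNode] = field(default_factory=dict)
    edges: List[RelationEdge] = field(default_factory=list)

    def neighbors(self, name: str) -> List[str]:
        """One-hop neighbors of a node, treating edges as undirected."""
        out = []
        for e in self.edges:
            if e.source == name:
                out.append(e.target)
            elif e.target == name:
                out.append(e.source)
        return out


def toy_embed(text: str) -> Vector:
    """Placeholder embedding: character counts folded into 8 buckets."""
    vec = [0.0] * 8
    for ch in text.lower():
        vec[ord(ch) % 8] += 1.0
    return vec


def build_graph(entities: List[Tuple[str, str, float, str]],
                relations: List[Tuple[str, str, str]],
                embed: Callable[[str], Vector] = toy_embed) -> AdGraph:
    graph = AdGraph()
    for name, seg_id, ts, desc in entities:
        graph.nodes[name] = EntityNode(name, seg_id, ts, desc, embed(desc))
    for src, dst, desc in relations:
        graph.edges.append(RelationEdge(src, dst, desc, embed(desc)))
    return graph
```

Storing a vector on both nodes and edges keeps the graph searchable by the same similarity measure as the text vector library.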

4. The method for identifying violation clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval according to claim 1, characterized in that the coarse-grained retrieval of prohibited words in Step 2 specifically comprises: semantically encoding the prohibited-word library with the embedding-3-large text embedding model to generate a set of word vectors; calculating, by cosine similarity, the semantic similarity between the prohibited words and the video text in the video text vector library and between the prohibited words and the graph nodes in the knowledge graph entity node vector library, and filtering by a set threshold to obtain two preliminary candidate sets; and introducing a graph-edge-based structured re-ranking (rearrangement) mechanism that performs one-hop neighbor expansion and re-ranking score calculation on the graph candidate set, and taking the union of the two candidate sets to form a comprehensive set of candidate video segments.
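
The threshold filtering and union described in claim 4 can be sketched as follows. This is an illustration under assumptions: the threshold value 0.8 is arbitrary (the claim only says a threshold is set), the vectors are assumed to be precomputed, the one-hop expansion and re-ranking are omitted, and all function and parameter names are hypothetical.

```python
import math
from typing import Dict, List, Set

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(index: Dict[str, Vector], word_vectors: List[Vector],
           threshold: float) -> Set[str]:
    """Keys whose vector is close enough to at least one prohibited word."""
    return {key for key, vec in index.items()
            if any(cosine(vec, w) >= threshold for w in word_vectors)}


def coarse_candidates(text_index: Dict[str, Vector],
                      node_index: Dict[str, Vector],
                      node_to_segment: Dict[str, str],
                      word_vectors: List[Vector],
                      threshold: float = 0.8) -> Set[str]:
    text_hits = recall(text_index, word_vectors, threshold)   # segment ids
    node_hits = recall(node_index, word_vectors, threshold)   # node ids
    graph_hits = {node_to_segment[n] for n in node_hits}      # map to segments
    return text_hits | graph_hits                             # union of both sets
```

Taking the union rather than the intersection favors recall: a segment is kept if either the raw text or a graph entity tied to it resembles a prohibited word.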

5. The method for identifying violation clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval according to claim 4, characterized in that the structured re-ranking (rearrangement) score is calculated by a formula [formula and symbols not reproduced in the source text], in which the defined quantities are: the structural relevance weighting coefficient; the set of one-hop neighbors of a node; the vector of a neighbor node; and the vector of a prohibited word.
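
Because the formula of claim 5 is not reproduced in the text, the following sketch shows only one hypothetical form that uses the four quantities the claim names: the node's own similarity to the prohibited-word vector plus a weighted mean of its one-hop neighbors' similarities. The additive form, the use of the mean, the default weight 0.5, and every name below are assumptions, not the patented formula.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank_score(node_vec: Vector,
                 neighbor_vecs: List[Vector],
                 word_vec: Vector,
                 weight: float = 0.5) -> float:
    """Hypothetical score: own similarity + weight * mean neighbor similarity."""
    own = cosine(node_vec, word_vec)
    if not neighbor_vecs:
        return own  # no one-hop neighbors: fall back to the node's own score
    neighbor_mean = sum(cosine(v, word_vec) for v in neighbor_vecs) / len(neighbor_vecs)
    return own + weight * neighbor_mean
```

Under this assumed form, a node surrounded by neighbors that also resemble the prohibited word is ranked above an equally similar but isolated node.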

6. The method for identifying violation clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval according to claim 1, characterized in that the fine-grained matching of advertising law in Step 3 specifically comprises: constructing, from a structured advertising law JSON file containing clause numbers, violation descriptions, and legal bases, a set of legal blocks with one block per clause, and encoding the set to generate a legal vector library; and vectorizing the text of the candidate video segments and the descriptions of the corresponding graph entity nodes, calculating semantic similarity in the legal vector library, retaining the top K legal provisions with the highest similarity, and forming a triple mapping of query request, video segment, and legal clause.
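
The top-K matching and triple mapping of claim 6 might look like the sketch below. It assumes the legal vector library is a plain dictionary from clause number to a precomputed vector; the clause identifiers in the usage are placeholders rather than real article numbers, and all function names are hypothetical.

```python
import heapq
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_clauses(query_vec: Vector, law_index: Dict[str, Vector],
                  k: int) -> List[str]:
    """Clause numbers of the k most similar legal blocks, best first."""
    return heapq.nlargest(k, law_index,
                          key=lambda clause: cosine(query_vec, law_index[clause]))


def build_triples(request: str,
                  candidates: Dict[str, Vector],
                  law_index: Dict[str, Vector],
                  k: int = 3) -> List[Tuple[str, str, str]]:
    """(query request, video segment id, clause number) for every match."""
    return [(request, seg_id, clause)
            for seg_id, vec in candidates.items()
            for clause in top_k_clauses(vec, law_index, k)]
```

Each triple keeps the link between a request, the segment that triggered it, and a candidate clause, which is what the determination step later needs as context.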

7. The method for identifying violation clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval according to claim 1, characterized in that the compliance determination by the large model in Step 4 specifically comprises: concatenating the violation detection request text, the candidate video content text, and the set of matched legal clauses into a contextual prompt, and inputting the prompt into the large language model to perform compliance reasoning; wherein the model output includes the violation determination result, the violation category, the evidence chain, the matched legal clause number and clause summary, and the determination reason, and the evidence chain includes the corresponding video segment, timestamp, speech text, screen description, and OCR text.
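
A minimal sketch of the prompt assembly and output check of claim 7 follows. The prompt wording, the dictionary field names, and the JSON keys in `REQUIRED_KEYS` are all hypothetical; the claim lists which items the output contains but does not prescribe a serialization. The call to the large language model itself is not shown.

```python
import json
from typing import Dict, List

# Hypothetical key names for the output items listed in the claim.
REQUIRED_KEYS = ("is_violation", "category", "evidence_chain",
                 "clause_number", "clause_summary", "reason")


def build_prompt(request: str, clips: List[Dict], clauses: List[Dict]) -> str:
    """Concatenate request, candidate content, and matched clauses."""
    clip_lines = [
        f"[{c['segment_id']} @ {c['timestamp']}s] speech: {c['speech']} | "
        f"screen: {c['visual']} | OCR: {c['ocr']}"
        for c in clips
    ]
    clause_lines = [f"{c['clause_number']}: {c['summary']}" for c in clauses]
    return "\n".join([
        "Request: " + request,
        "Candidate video content:", *clip_lines,
        "Matched legal clauses:", *clause_lines,
        "Answer as JSON with keys: " + ", ".join(REQUIRED_KEYS),
    ])


def parse_verdict(raw: str) -> Dict:
    """Parse the model's JSON reply and verify that every item is present."""
    verdict = json.loads(raw)
    missing = [k for k in REQUIRED_KEYS if k not in verdict]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return verdict
```

Validating the reply before it is stored guards against a model response that omits the evidence chain or the clause number.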

8. The method for identifying violation clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval according to claim 1, characterized in that the violation categories comprise five major categories: guidance-related issues, politically sensitive information, absolute and exaggerated terms, sensitive and prohibited words, and suspected medical terms; and each major category contains corresponding subcategories.

9. The method for identifying violation clues in live-streaming e-commerce advertisements based on two-stage semantic retrieval according to claim 1, characterized in that Step 1 further comprises encoding the regulatory documents to generate a regulatory vector library.