Fine-grained harmful meme detection method and system based on information theory semantic decompression

By constructing a semantic decompression knowledge base and a risk-aware routing mechanism, the problems of metaphor understanding and high computational overhead in multimodal harmful meme detection are solved, achieving efficient and accurate fine-grained detection.

CN122241343A · Pending · Publication Date: 2026-06-19 · DALIAN UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
DALIAN UNIV OF TECH
Filing Date
2026-02-12
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing multimodal harmful meme detection methods suffer from insufficient generalization ability, high computational overhead, and difficulty in understanding new and emerging events when faced with rapidly changing internet content. In particular, they are unable to effectively decompress the obscure semantics in memes and perform real-time detection.

Method used

We employ an information theory-based semantic decompression method. By constructing a semantic decompression knowledge base and a risk-aware routing mechanism, we extract features using a pre-trained CLIP model, assess risk using a three-head calibration network, and obtain explicit definitions through cross-modal retrieval for fine-grained harmful meme detection.

Benefits of technology

It significantly improves the ability to understand metaphors and emerging hate symbols, enhances detection accuracy, and greatly reduces computational overhead and inference latency, making it suitable for real-time detection scenarios.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides a fine-grained harmful meme detection method and system based on information-theoretic semantic decompression, belonging to the field of multimodal information processing and internet content security. The method includes: extracting image and text features of the meme to be detected and fusing them into a joint vector; calculating the risk value of the meme from the joint vector and classifying the meme as low-risk or high-risk; for a high-risk meme, using its joint vector as a query vector to retrieve external explicit semantic explanations from a pre-built semantic decompression knowledge base, and concatenating them, as the semantic decompression context, with the original features of the high-risk meme to construct an enhanced input prompt; and outputting the fine-grained category of the meme based on either the original low-risk meme or the enhanced input prompt. The invention significantly improves the model's ability to understand metaphors and emerging hate symbols, enhances fine-grained detection accuracy, and significantly reduces the system's computational overhead and inference latency.

Description

Technical Field

[0001] This invention belongs to the field of multimodal information processing and Internet content security technology, and specifically relates to a fine-grained harmful meme detection method and system based on information-theoretic semantic decompression.

Background Technology

[0002] With the rapid development of social media platforms, user-generated multimodal content has exploded, with internet memes becoming one of the main forms of online communication. However, memes are often used to spread prejudice, sarcasm, and hate speech, harming social groups or cultural identities. Therefore, fine-grained harmful meme detection has become a research hotspot in the field of multimodal content security. Specifically, early fully supervised multimodal models were the main approach, typically employing a two-stream architecture to encode images and text separately, then using attention mechanisms or feature fusion for binary classification prediction. Many improved methods are based on this, such as introducing target decoupling or data augmentation techniques. However, these methods heavily rely on large-scale labeled data, and their generalization ability and adaptability are limited when faced with rapidly evolving meme content.

[0003] In recent years, with the rise of large multimodal models (LMMs) such as CLIP and BLIP, retrieval-enhanced zero-shot detection methods have gradually become mainstream. These methods (such as EVOLVER and MIND) attempt to retrieve similar historical memes from the training set and prompt the large model to generate rationales based on analogical relationships for classification. These methods have achieved certain results in closed-set testing, leveraging the prior knowledge of the large model to reduce reliance on labeled data to some extent.

[0004] However, existing retrieval augmentation methods have several significant problems. First, the existing "retrieval-by-example" paradigm is limited. It relies mainly on visual or superficial similarity, whereas memes are inherently characterized by extremely high "semantic compression": they often compress complex social biases (such as using "cotton" as a symbol of slavery) into simple visual symbols. When faced with novel, sudden events that have no visual precedent, simply retrieving similar images cannot reveal the underlying metaphor, so the model is easily misled by superficial similarity and ignores the true semantic logic. Second, to understand these metaphors, existing methods often rely on large models to generate lengthy chains of thought (CoT) or to conduct multi-agent debates, which incurs huge computational overhead. Their inference latency is usually 5 to 10 times that of direct classification, making it difficult to meet the needs of real-time content moderation. Finally, existing methods lack a mechanism for predicting meme risk and apply the same complex retrieval and inference to every sample, which wastes computational resources.

[0005] To address the aforementioned issues, a method is urgently needed that can effectively "decompress" the implicit cultural and emotional semantics within memes, introducing explicit definitions to replace vague analogies and thereby solving the metaphor comprehension challenge. At the same time, an efficient risk routing mechanism is required to significantly reduce inference latency while maintaining detection accuracy, so as to suit large-scale real-time detection scenarios.

Summary of the Invention

[0006] To improve the accuracy of fine-grained harmful meme detection and solve the detection problems caused by semantic ambiguity and high computational cost in existing technologies, this invention provides a fine-grained harmful meme detection method and system based on information theory semantic decompression.

[0007] The first aspect of this invention provides a fine-grained harmful meme detection method based on information-theoretic semantic decompression, comprising:

[0008] S1: Encode the image features and text features of the meme to be detected separately, and fuse the encoded visual feature vector and text feature vector into a joint vector;

[0009] S2: Based on the joint vector of the memes to be detected, obtain the potential risk value of the memes to be detected;

[0010] S3: Compare the risk value with a preset threshold. If the risk value is less than the preset threshold, it is determined to be a low-risk meme. The low-risk meme is then input into the classifier for direct classification to obtain the fine-grained category to which the low-risk meme belongs. Otherwise, it is determined to be a high-risk meme, triggering the next semantic decompression retrieval operation.

[0011] S4: Using a cross-modal retrieval model, the joint vector of the high-risk meme is used as the query vector. The retrieval is performed in a pre-built semantic decompression knowledge base to obtain the Top-K "term-definition" pairs with high mutual information with the high-risk meme. The "term-definition" pairs contain explicit explanations of obscure hate symbols.

[0012] S5: The retrieved Top-K "term-definition" pairs are used as the semantic decompression context and concatenated with the image and text features of the original high-risk memes to construct enhanced input prompts;

[0013] S6: Reason about the enhanced input prompts using a multimodal large model to obtain fine-grained harmful category labels for high-risk memes.
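The control flow of steps S1 to S6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `detect`, `Result`, and the callables passed in (`encode`, `risk`, `classify_direct`, `retrieve`, `reason`) are hypothetical names standing in for the encoders, the risk network, the classifier, the retrieval model, and the multimodal large model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
TermDef = Tuple[str, str]  # ("term", "definition")


@dataclass
class Result:
    label: str
    route: str              # "direct" (low risk) or "decompress" (high risk)
    context: List[TermDef]  # retrieved term-definition pairs, empty if low risk


def detect(
    image: object,
    text: str,
    encode: Callable[[object, str], Vector],              # S1 (hypothetical)
    risk: Callable[[Vector], float],                      # S2 (hypothetical)
    classify_direct: Callable[[object, str], str],        # S3, low-risk path
    retrieve: Callable[[Vector, int], List[TermDef]],     # S4 (hypothetical)
    reason: Callable[[object, str, List[TermDef]], str],  # S5 + S6
    threshold: float,
    top_k: int = 3,
) -> Result:
    z = encode(image, text)               # S1: joint vector
    r = risk(z)                           # S2: potential risk value
    if r < threshold:                     # S3: low risk, classify directly
        return Result(classify_direct(image, text), "direct", [])
    context = retrieve(z, top_k)          # S4: Top-K term-definition pairs
    label = reason(image, text, context)  # S5/S6: enhanced prompt + reasoning
    return Result(label, "decompress", context)
```

Because retrieval and large-model reasoning are only reached on the high-risk branch, the expensive callables are never invoked for memes whose risk value falls below the threshold.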

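[0014] Furthermore, in step S1, an image encoder and a text encoder from a pre-trained CLIP or SigLIP model extract features from the meme to be detected, and the features are fused into a joint vector; the formulas are as follows:

[0015] $$v = E_{\text{img}}(I)$$

[0016] $$t = E_{\text{txt}}(T)$$

[0017] $$z = \mathrm{Fuse}(v, t)$$

[0018] where $I$ and $T$ are the image and the text of the meme to be detected, $E_{\text{img}}$ and $E_{\text{txt}}$ are the image encoder and the text encoder, $\mathrm{Fuse}(\cdot,\cdot)$ denotes the fusion operation, $v$ is the visual feature vector, $t$ is the text feature vector, and $z$ is the joint vector.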

[0019] Furthermore, in step S2, the potential risk value of the meme to be detected is obtained as follows:

[0020] Construct a semantic reference system that includes a set of harmless anchors and a set of harmful anchors;

[0021] The extracted meme joint vectors are mapped onto the semantic reference system;

[0022] Calculate the relative distance between the joint meme vector and the harmless anchor set and the harmful anchor set;

[0023] Based on the joint vector of memes and the relative distance, three risk scores are obtained through a three-head calibration network, and the weighted sum of the three risk scores is used as the final risk value.

[0024] Furthermore, the harmless anchor point set and the harmful anchor point set are constructed as follows:

[0025] The pre-trained CLIP model, without any fine-tuning, is used to perform preliminary risk scoring on all meme samples in the training set, and the samples are ranked by risk score. The top N% of meme samples are selected as the high-risk candidate set, and the bottom N% are selected as the low-risk candidate set. The K-Means clustering algorithm is applied to the two candidate sets separately; the cluster center vectors computed from the high-risk candidate set are used as the harmful anchor set, and those computed from the low-risk candidate set are used as the harmless anchor set.
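The anchor construction described above can be sketched in plain Python. This is an illustrative sketch only: the function names (`kmeans`, `build_anchor_sets`), the parameters `top_percent` and `k`, and the use of basic Lloyd iterations with random initialization are assumptions; the patent does not specify the K-Means variant, the value of N, or the number of clusters.

```python
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def _mean(vectors: Sequence[Vector]) -> Vector:
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def _sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Vector], k: int, iters: int = 20, seed: int = 0) -> List[Vector]:
    """Plain Lloyd's algorithm; returns k cluster-center vectors."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(list(points), k)]
    for _ in range(iters):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: _sq_dist(p, centers[c]))
            groups[nearest].append(p)
        # Keep the previous center if a cluster ends up empty.
        centers = [_mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers


def build_anchor_sets(
    vectors: Sequence[Vector],
    scores: Sequence[float],
    top_percent: float,
    k: int,
) -> Tuple[List[Vector], List[Vector]]:
    """Rank by preliminary risk score, take the top/bottom N%, cluster each side.

    Returns (harmful_anchors, harmless_anchors).
    """
    order = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    n = max(k, int(len(order) * top_percent / 100))  # at least k candidates per side
    high_risk = [vectors[i] for i in order[:n]]
    low_risk = [vectors[i] for i in order[-n:]]
    return kmeans(high_risk, k), kmeans(low_risk, k)
```

Using cluster centers rather than individual samples means each anchor averages over many candidates, which is one way to limit the influence of a single outlier on later distance calculations.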

[0026] Furthermore, the relative distance $d$ is calculated based on the following formula:

[0027]

[0028] where $\cos(\cdot,\cdot)$ represents cosine similarity, and $\mathcal{A}_{\text{harm}}$ and $\mathcal{A}_{\text{safe}}$ are the harmful anchor set and the harmless anchor set, respectively.

[0029] Furthermore, the three-head calibration network comprises three parallel branches:

[0030] A zero-shot head, which calculates the risk score $s_{\text{zs}}$ of the meme to be detected using frozen pre-trained CLIP model parameters, in order to retain general semantic knowledge;

[0031] A global head, which uses a multilayer perceptron (MLP) trained on the full training set to map the input vector $x = [z; d]$, obtained by concatenating the joint vector $z$ and the relative distance $d$, to the risk score $s_{\text{g}}$;

[0032] A harm-aware head, which uses an MLP trained on a balanced dataset obtained by resampling the training set, so as to capture sparse, long-tailed hate signals, and maps the same concatenated input vector $x = [z; d]$ to the risk score $s_{\text{h}}$.

[0033] Furthermore, the three risk scores and the final risk value obtained by the three-head calibration network are calculated based on the following method:

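[0034] The risk score $s_{\text{zs}}$ is calculated as:

[0035] $$s_{\text{zs}} = \frac{\exp\left(z^{\top} t_{\text{harm}} / \tau\right)}{\exp\left(z^{\top} t_{\text{harm}} / \tau\right) + \exp\left(z^{\top} t_{\text{safe}} / \tau\right)}$$

[0036] where $t_{\text{harm}}$ is the CLIP text-encoded feature vector of the harmful-category prompt; $t_{\text{safe}}$ is the CLIP text-encoded feature vector of the harmless-category prompt; and $\tau$ is the temperature coefficient;

[0037] The risk score $s_{\text{g}}$ is calculated as:

[0038] $$s_{\text{g}} = \sigma\left(W_2^{g}\,\phi\left(W_1^{g} x + b_1^{g}\right) + b_2^{g}\right), \quad x = [z; d]$$

[0039] where $W_1^{g}$ and $b_1^{g}$ are the weight matrix and bias term of the fully connected layer, $W_2^{g}$ and $b_2^{g}$ are the weights and bias of the output layer, $\phi$ is the ReLU activation function, and $\sigma$ is the Sigmoid activation function;

[0040] The risk score $s_{\text{h}}$ is calculated as:

[0041] $$s_{\text{h}} = \sigma\left(W_2^{h}\,\phi\left(W_1^{h} x + b_1^{h}\right) + b_2^{h}\right), \quad x = [z; d]$$

[0042] where $W_1^{h}$ and $b_1^{h}$ are the weight matrix and bias term of the fully connected layer, and $W_2^{h}$ and $b_2^{h}$ are the weights and bias of the output layer;

[0043] The final risk value $R$ is calculated as follows:

[0044] $$R = \lambda_1 s_{\text{zs}} + \lambda_2 s_{\text{g}} + \lambda_3 s_{\text{h}}$$

[0045] where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are hyperparameter weights.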
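The three risk scores and their weighted sum can be illustrated with a small dependency-free sketch. The function names, the single hidden layer, and the use of a dot product as the similarity in the zero-shot head are assumptions made for illustration; in practice the MLP parameters would come from training, which is not shown here.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
MlpParams = Tuple[List[List[float]], List[float], List[float], float]  # W1, b1, w2, b2


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x: float) -> float:
    # Numerically stable form for large negative inputs.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def zero_shot_score(z: Vector, t_harm: Vector, t_safe: Vector, tau: float) -> float:
    """Two-way softmax over similarities to the harmful / harmless prompt vectors."""
    a = dot(z, t_harm) / tau
    b = dot(z, t_safe) / tau
    m = max(a, b)  # subtract the max before exponentiating for stability
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb)


def mlp_score(x: Vector, w1: List[List[float]], b1: List[float],
              w2: List[float], b2: float) -> float:
    """sigmoid(w2 . relu(W1 x + b1) + b2) with one hidden layer."""
    hidden = [max(0.0, dot(row, x) + bias) for row, bias in zip(w1, b1)]
    return sigmoid(dot(w2, hidden) + b2)


def final_risk(z: Vector, d: float, t_harm: Vector, t_safe: Vector, tau: float,
               global_params: MlpParams, harm_params: MlpParams,
               weights: Tuple[float, float, float]) -> float:
    x = list(z) + [d]                       # concatenated input [z; d]
    s_zs = zero_shot_score(z, t_harm, t_safe, tau)
    s_g = mlp_score(x, *global_params)      # global head
    s_h = mlp_score(x, *harm_params)        # harm-aware head (same form, own parameters)
    l1, l2, l3 = weights
    return l1 * s_zs + l2 * s_g + l3 * s_h
```

If the three weights sum to 1, the final risk value stays in the interval [0, 1] because each head outputs a value in that range; the patent itself only states that the weights are hyperparameters.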

[0046] Furthermore, the semantic decompression knowledge base is constructed based on the following method:

[0047] Obtain a hate dictionary for the target language;

[0048] Filter out generic insults from the hate dictionary and retain hate words targeting specific groups;

[0049] A committee composed of multiple large language models (such as GPT-4, Claude, and Gemini) generates explicit explanatory definitions for each retained hate word, which describe the target of the word and its metaphorical meaning.

[0050] The generated definitions undergo consistency verification and manual review to form a knowledge base containing key-value pairs of "Term" and "Definition," which are then encoded into knowledge vectors using an encoder.

[0051] Furthermore, in step S4, the Top-K "term-definition" pairs are retrieved as follows: calculate the alignment probability between the joint vector $z$ of the high-risk meme and every knowledge vector in the semantic decompression knowledge base, and select the Top-K "term-definition" pairs with the highest probabilities as the semantic decompression context; the alignment probability $p_j$ is calculated as follows:

[0052] $$p_j = \sigma\left(\alpha\, z^{\top} k_j + \beta\right)$$

[0053] where $\alpha$ and $\beta$ are the learnable scale and bias parameters of the cross-modal retrieval model, $k_j$ is the $j$-th knowledge vector in the semantic decompression knowledge base, and $\sigma$ is the Sigmoid function.
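A minimal sketch of the Top-K retrieval step is given below. It assumes a sigmoid applied to a scaled and shifted dot product, in the style of SigLIP; the function names and the in-memory list of `(term, definition, vector)` triples are hypothetical, and the scale and bias would normally be taken from the trained retrieval model.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
KnowledgeEntry = Tuple[str, str, Vector]  # (term, definition, knowledge vector)


def alignment_probability(z: Vector, k: Vector, scale: float, bias: float) -> float:
    """sigmoid(scale * <z, k> + bias), computed in a numerically stable way."""
    logit = scale * sum(a * b for a, b in zip(z, k)) + bias
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def retrieve_top_k(z: Vector, knowledge: List[KnowledgeEntry],
                   scale: float, bias: float, top_k: int) -> List[Tuple[str, str, float]]:
    """Return the top_k (term, definition, probability) triples, best first."""
    scored = [
        (term, definition, alignment_probability(z, vec, scale, bias))
        for term, definition, vec in knowledge
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_k]
```

For a positive scale the sigmoid is monotonic in the dot product, so the ranking is the same as ranking by raw similarity; the probability is still useful if a minimum-confidence cutoff is wanted in addition to Top-K.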

[0054] A second aspect of the present invention provides a fine-grained harmful meme detection system based on information-theoretic semantic decompression, comprising:

[0055] The feature extraction module is used to extract image features and text features of the meme to be detected and fuse them into a joint vector;

[0056] The risk-aware routing module is used to calculate the risk value of the meme to be detected based on the joint vector of the meme to be detected, and to classify the meme to be detected into a low-risk meme or a high-risk meme based on the risk value.

[0057] The semantic decompression retrieval module is used to use the joint vector of high-risk memes as the query vector, retrieve external explicit semantic explanations from the pre-built semantic decompression knowledge base, and use them as semantic decompression context to splice and fuse with the image and text features of the original high-risk memes to construct enhanced input prompts.

[0058] A multimodal reasoning module is used to output a fine-grained category based on the original low-risk meme or an enhanced input prompt.

[0059] The beneficial effects of this invention are as follows: By introducing a semantic decompression mechanism from an information theory perspective, this invention uses external explicit definitions to "decompress" obscure visual symbols into specific hate semantics, which significantly improves the model's ability to understand metaphors and newly emerging hate symbols and enhances fine-grained detection accuracy. At the same time, through a risk-aware routing mechanism, only high-risk samples are subjected to expensive retrieval and enhanced inference, which greatly reduces the system's computational overhead and inference latency while ensuring performance.

Description of the Drawings

[0060] Figure 1 is a schematic diagram of the overall architecture and process of a fine-grained harmful meme detection method based on information theory semantic decompression provided in an embodiment of the present invention.

[0061] Figure 2 is a schematic diagram of the structure of the three-head calibration network in an embodiment of the present invention.

[0062] Figure 3 is a schematic diagram of a definition generation prompt template used to construct a semantic decompression knowledge base in an embodiment of the present invention.

Detailed Implementation

[0063] To make the objectives, technical solutions, and advantages of the present invention clearer, the specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0064] As shown in Figure 1, this embodiment provides a fine-grained harmful meme detection method based on information-theory semantic decompression. Specifically, it is a detection method that uses external explicit definitions to enhance a multimodal large model's understanding of obscure hate symbols. The method can, to some extent, solve the metaphor understanding problem caused by "semantic compression" in memes and the high computational overhead caused by relying directly on large-model reasoning, thereby improving the accuracy and efficiency of fine-grained detection. The detection method includes constructing a detection framework, constructing a semantic knowledge base, and using the framework to perform risk assessment and semantically enhanced reasoning on the collected meme images and text to obtain the detection results; the specific steps are as follows:

[0065] S1. Semantic compression refers to the process by which memes compress complex socio-historical backgrounds into simple visual symbols. This embodiment performs the reverse process of semantic decompression. First, a semantic decompression knowledge base is constructed. The construction process is as follows:

[0066] S1.1 Obtain a hate dictionary for the target language. For English, it is based on the HurtLex dictionary, and for Chinese, it is based on the slang list of the STATE ToxiCN dataset.

[0067] S1.2. To obtain high-quality explicit definitions, this embodiment designs the "definition generation prompt template" shown in Figure 3. The template first determines whether a word in the hate dictionary is a racially discriminatory, insulting, or identity-based term. If so, it proceeds to the second stage, in which a committee of large language models (such as GPT-4, Gemini, and Claude) is asked to generate a concise definition of no more than 30 words for the word, explaining its offensive meaning and its target. If the word is a general term (such as scientific terminology), NULL is returned.

[0068] S1.3. Cross-validate and manually review the generated definitions to form a semantic decompression knowledge base containing "Term-Definition" key-value pairs, and encode each entry into a knowledge vector $k_j$ using an encoder:

[0069]
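Steps S1.2 and S1.3 can be pictured with the following sketch. The patent does not specify how cross-validation between committee members is performed, so the majority rule, the `needs_review` flag, and all names here (`Entry`, `build_entries`, the `committee` callables that return a definition or `None` for NULL) are assumptions for illustration only; the 30-word limit is the one stated in S1.2.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

MAX_WORDS = 30  # definition length limit stated in S1.2


@dataclass
class Entry:
    term: str
    definitions: List[str]  # candidate definitions that passed the length check
    needs_review: bool      # True when the committee did not fully agree


def build_entries(
    terms: Sequence[str],
    committee: Sequence[Callable[[str], Optional[str]]],
) -> List[Entry]:
    """Collect committee answers per term and keep terms a majority defined."""
    entries: List[Entry] = []
    for term in terms:
        answers = [member(term) for member in committee]
        kept = [a.strip() for a in answers if a and len(a.split()) <= MAX_WORDS]
        if len(kept) * 2 <= len(committee):
            continue  # majority returned NULL or an over-long answer: drop the term
        entries.append(Entry(term, kept, needs_review=len(kept) < len(committee)))
    return entries
```

Entries flagged with `needs_review` would be the natural candidates for the manual review mentioned in S1.3, after which each accepted term-definition pair is encoded into a knowledge vector.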

[0070] S2. Using the image encoder and text encoder of a pre-trained CLIP model, obtain the visual feature vector $v$ and the text feature vector $t$ of the meme to be detected, and fuse the two into a joint vector $z$. The formulas are as follows:

[0071] $$v = E_{\text{img}}(I)$$

[0072] $$t = E_{\text{txt}}(T)$$

[0073] $$z = \mathrm{Fuse}(v, t)$$

[0074] S3. Based on the joint vector of the meme to be detected, obtain the potential risk value of the meme to be detected; specifically as follows:

[0075] S3.1 Construct a semantic reference system containing a set of harmless anchors and a set of harmful anchors.

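[0076] First, the pre-trained CLIP model, without any fine-tuning, is used to perform preliminary risk scoring on all meme samples in the training set. The risk scoring formula is as follows:

[0077] $$r_i = \frac{\exp\left(z_i^{\top} t_{\text{harm}} / \tau\right)}{\exp\left(z_i^{\top} t_{\text{harm}} / \tau\right) + \exp\left(z_i^{\top} t_{\text{safe}} / \tau\right)}$$

[0078] where $r_i$ is the risk score of training-set meme sample $i$; $z_i$ is the joint vector of training-set meme sample $i$; $t_{\text{harm}}$ is the CLIP text-encoded feature vector of the harmful-category prompt; $t_{\text{safe}}$ is the CLIP text-encoded feature vector of the harmless-category prompt; and $\tau$ is the temperature coefficient;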

[0079] All samples were ranked using risk scores, and the top N% of meme samples were selected as the high-risk candidate set, while the bottom N% of meme samples were selected as the low-risk candidate set.

[0080] The K-Means clustering algorithm is applied to the two candidate sets separately; the cluster center vectors computed from the high-risk candidate set are used as the harmful anchor set, and those computed from the low-risk candidate set are used as the harmless anchor set.

[0081] The anchor reference system generated by this clustering method has stronger semantic robustness compared to random sampling, and can effectively reduce the interference of single outlier samples on distance calculation.

[0082] S3.2 Calculate the relative distance $d$ between the meme joint vector $z$ and the reference system formed by the two anchor sets:

[0083]

[0084] where $\cos(\cdot,\cdot)$ represents cosine similarity, and $\mathcal{A}_{\text{harm}}$ and $\mathcal{A}_{\text{safe}}$ are the harmful anchor set and the harmless anchor set, respectively.
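Since the exact formula is not reproduced in this text, the sketch below shows only one plausible form of a relative distance built from cosine similarity to the two anchor sets: the highest similarity to any harmful anchor minus the highest similarity to any harmless anchor. That specific form, and the function names, are assumptions and should not be read as the patent's formula.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def relative_distance(z: Vector,
                      harmful_anchors: Sequence[Vector],
                      harmless_anchors: Sequence[Vector]) -> float:
    """ASSUMED form: max similarity to harmful anchors minus max to harmless ones.

    Positive values mean z sits closer to the harmful side of the reference system.
    """
    nearest_harm = max(cosine(z, a) for a in harmful_anchors)
    nearest_safe = max(cosine(z, a) for a in harmless_anchors)
    return nearest_harm - nearest_safe
```

Under this assumed form the value lies in [-2, 2], and it is appended to the joint vector as one extra input feature for the trainable heads.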

[0085] S3.3. Based on the joint vector of memes and relative distance, three risk scores are obtained through a three-head calibration network, and the weighted sum of the three risk scores is used as the final risk value; as detailed below:

[0086] The three-head calibration network comprises three parallel branches:

[0087] 1. Zero-shot head: calculates the risk score $s_{\text{zs}}$ of the meme to be detected using frozen CLIP model parameters, aiming to preserve the open-world generalization ability of the CLIP model and prevent catastrophic forgetting:

[0088] $$s_{\text{zs}} = \frac{\exp\left(z^{\top} t_{\text{harm}} / \tau\right)}{\exp\left(z^{\top} t_{\text{harm}} / \tau\right) + \exp\left(z^{\top} t_{\text{safe}} / \tau\right)}$$

[0089] where $t_{\text{harm}}$ is the CLIP text-encoded feature vector of the "harmful" category prompt (such as "harmful content"); $t_{\text{safe}}$ is the CLIP text-encoded feature vector of the "harmless" category prompt (such as "harmless content"); and $\tau$ is the temperature coefficient.

[0090] 2. Global head: includes a multilayer perceptron (MLP) trained on the full training set with a class-weighted binary cross-entropy (BCE) loss to capture the global data manifold; it maps the input vector $x = [z; d]$, obtained by concatenating the meme joint vector $z$ and the relative distance $d$, to the risk score $s_{\text{g}}$:

[0091] $$s_{\text{g}} = \sigma\left(W_2^{g}\,\phi\left(W_1^{g} x + b_1^{g}\right) + b_2^{g}\right)$$

[0092] where $W_1^{g}$ and $b_1^{g}$ are the weight matrix and bias term of the fully connected layer, $W_2^{g}$ and $b_2^{g}$ are the weights and bias of the output layer, $\phi$ is the ReLU activation function, and $\sigma$ is the Sigmoid activation function.

[0093] 3. Harm-aware head: contains an MLP trained on a resampled subset of the training set (retaining all harmful samples and randomly downsampling harmless samples), designed to improve sensitivity to sparse hate signals; it maps the same concatenated input vector $x = [z; d]$ to the risk score $s_{\text{h}}$. The calculation formula has the same form as that of the global head, with its own parameters; only the sampling strategy of the training data differs.

[0094] The final risk value $R$ is obtained by weighted summation of the risk scores output by the three heads:

[0095] $$R = \lambda_1 s_{\text{zs}} + \lambda_2 s_{\text{g}} + \lambda_3 s_{\text{h}}$$

[0096] where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are hyperparameter weights.
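The two training-data strategies that distinguish the trainable heads can be sketched as follows. The names `class_weighted_bce` and `harm_aware_subset`, the single `pos_weight` factor on the harmful class, and the `harmless_keep_ratio` parameter are illustrative assumptions; the patent states only that the loss is class-weighted and that harmless samples are randomly downsampled while all harmful samples are kept.

```python
import math
import random
from typing import List, Sequence, Tuple


def class_weighted_bce(probs: Sequence[float], labels: Sequence[int],
                       pos_weight: float) -> float:
    """Mean binary cross-entropy with the harmful class (label 1) up-weighted."""
    eps = 1e-7
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def harm_aware_subset(samples: Sequence[object], labels: Sequence[int],
                      harmless_keep_ratio: float, seed: int = 0
                      ) -> List[Tuple[object, int]]:
    """Keep every harmful sample and a random fraction of the harmless ones."""
    rng = random.Random(seed)
    harmful = [(s, y) for s, y in zip(samples, labels) if y == 1]
    harmless = [(s, y) for s, y in zip(samples, labels) if y == 0]
    n_keep = int(len(harmless) * harmless_keep_ratio)
    subset = harmful + rng.sample(harmless, n_keep)
    rng.shuffle(subset)
    return subset
```

The global head would be fitted with the weighted loss on all samples, while the harm-aware head would be fitted on the output of `harm_aware_subset`, so the two heads see different class balances even though they share the same architecture.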

[0097] S4. Set a threshold $\theta$. If $R \ge \theta$, the meme is identified as high-risk and path A is triggered, initiating the semantic decompression retrieval operation; otherwise, the meme is identified as low-risk and path B is triggered, in which the low-risk meme is input directly into the classifier for classification. Through this mechanism, this embodiment can reduce retrieval overhead by approximately 45%.
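One rough way to read the overhead figure is through a simple cost model. The model below is an illustration and is not taken from the patent: it assumes a routing cost $C_{\text{route}}$ paid for every meme, a retrieval cost $C_{\text{ret}}$ paid only on path A, and a fraction $\rho$ of memes routed to path A.

```latex
\mathbb{E}[C_{\text{routed}}] = C_{\text{route}} + \rho\, C_{\text{ret}},
\qquad
\text{relative saving}
= \frac{C_{\text{ret}} - \mathbb{E}[C_{\text{routed}}]}{C_{\text{ret}}}
= 1 - \rho - \frac{C_{\text{route}}}{C_{\text{ret}}}
```

Under this model, if the routing cost is negligible compared with retrieval, a saving of about 45% corresponds to roughly 55% of memes being sent to path A.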

[0098] S5. When semantic decompression retrieval is triggered, calculate the alignment probability between the joint vector $z$ of the high-risk meme and every knowledge vector $k_j$ in the semantic decompression knowledge base, and select the Top-K definitions with the highest probabilities as the semantic decompression context. The alignment probability $p_j$ is as follows:

[0099] $$p_j = \sigma\left(\alpha\, z^{\top} k_j + \beta\right)$$

[0100] where $\alpha$ and $\beta$ are the learnable scale and bias parameters of the cross-modal retrieval model (such as the SigLIP model), and $\sigma$ is the Sigmoid function.

[0101] S6. The retrieved knowledge, i.e., the semantic decompression context, is injected into the prompt to construct the final enhanced input prompt, formatted as follows: "You are a harmful meme detection expert... Meme content to be detected: [Image] + [Text]. Reference knowledge: [Semantic decompression context]. Please classify the meme into one of the [category list] based on the information above."
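Assembling the enhanced prompt from the retrieved pairs might look like the sketch below. The function name and the bullet layout of the reference knowledge are assumptions; the full expert instruction is elided in the text above, so it is passed in by the caller rather than hard-coded.

```python
from typing import Sequence, Tuple


def build_enhanced_prompt(
    instruction: str,                    # e.g. the expert role text; supplied by the caller
    ocr_text: str,                       # text extracted from the meme
    context: Sequence[Tuple[str, str]],  # retrieved (term, definition) pairs
    categories: Sequence[str],
    image_placeholder: str = "[Image]",  # stands in for the attached image
) -> str:
    """Fill the prompt slots: meme content, reference knowledge, category list."""
    knowledge = "\n".join(f"- {term}: {definition}" for term, definition in context)
    category_list = ", ".join(categories)
    return (
        f"{instruction}\n"
        f"Meme content to be detected: {image_placeholder} + {ocr_text}\n"
        f"Reference knowledge:\n{knowledge}\n"
        f"Please classify the meme into one of [{category_list}] "
        "based on the information above."
    )
```

For a low-risk meme the same function could be called with an empty `context`, which leaves the reference-knowledge section blank, although the embodiment routes such memes to a direct classifier instead.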

[0102] S7. The final enhanced input Prompt is fed into the multimodal large model for inference to obtain fine-grained harmful category labels for high-risk memes. The results on datasets such as MAMI clearly show that by introducing the interpretation of "Dishwasher," the model successfully identifies the originally veiled hate intent.

[0103] This invention also provides a fine-grained harmful meme detection system based on information theory semantic decompression, including a feature extraction module, a risk-aware routing module, a semantic decompression retrieval module, and a multimodal reasoning module.

[0104] The feature extraction module is used to obtain the visual feature vector $v$ and the text feature vector $t$ of the meme to be detected and to fuse the two into a joint vector $z$. Preferably, the image encoder and text encoder use pre-trained CLIP (or SigLIP) models. For English datasets, CLIP-ViT-Base-Patch16 is preferred; for Chinese datasets, Chinese-CLIP-ViT-Base-Patch16 is preferred.

[0105] The risk-aware routing module (as shown in Figure 2) is used to calculate the potential risk value of the meme to be detected based on its joint vector. Based on the risk value, the meme is classified as low-risk or high-risk, and the subsequent processing path is determined dynamically. This module includes a semantic reference system construction unit, a relative distance calculation unit, and a three-head calibration network unit. The semantic reference system construction unit pre-stores the harmless anchor set ($\mathcal{A}_{\text{safe}}$) and the harmful anchor set ($\mathcal{A}_{\text{harm}}$). The relative distance calculation unit is used to calculate the relative distance between the input meme joint vector and the two anchor sets. The three-head calibration network unit includes a zero-shot head, a global head, and a harm-aware head, used for comprehensive risk assessment.

[0106] The semantic decompression retrieval module retrieves the Top-K most relevant "term-definition" pairs from a pre-built semantic decompression knowledge base for high-risk memes. These pairs are then used as semantic decompression context and fused with the image and text features of the original high-risk meme to construct enhanced input prompts. This module employs SigLIP as a cross-modal retrieval model, mapping the joint vector of the high-risk meme to a query vector and calculating its alignment probability with all knowledge vectors in the semantic decompression knowledge base.

[0107] The multimodal inference module is used for final classification based on the original low-risk meme content or semantically enhanced input prompts. This module is implemented based on a Large Language Model (LLM) or a Multimodal Large Model (LMM), preferably using a large model such as Qwen2.5-7B or GPT-4o. Inference is performed by constructing a prompt that includes task instructions, meme content, and retrieved definitions. For low-risk memes, the input to the multimodal inference module only includes the original image and OCR text; for high-risk memes, the input includes the original image, OCR text, and explicit definition text provided by the semantic decompression retrieval module. Based on the input prompt, the multimodal inference module outputs the fine-grained category to which the meme belongs (e.g., stereotype, humiliation, incitement to violence, etc.).

[0108] Experimental verification

[0109] The proposed method was validated on three publicly available benchmark datasets: HarMeme (English, COVID-19 related), MAMI (English, sexism related), and ToxiCN MM (Chinese, mixed hate). Model training and testing were carried out on two NVIDIA L20 (48 GB) GPUs, with the implementation based on the PyTorch framework. The Adam optimizer was used for training the risk router. Model training included: acquiring a dataset of harmful memes with fine-grained labels; constructing the anchor sets of the semantic reference system from the dataset; training the global head and the harm-aware head in the risk-aware routing module to accurately predict the risk score of memes; constructing the semantic decompression knowledge base and encoding it using a pre-trained SigLIP model; and performing zero-shot or few-shot inference by integrating the retrieved context through prompt engineering, without fine-tuning the large model of the multimodal inference module.

[0110] Table 1 shows the F1 score comparison of the present invention (DeMeme) in the zero-shot setting. The comparison methods include general multimodal large models (such as Qwen2.5-VL) and domain-specific detection frameworks (such as Evolver and MIND).

[0111] Table 1

[0112]

[0113] Experimental results show that the proposed DeMeme method achieves significant performance improvements on all datasets. This demonstrates that retrieving explicit definitions through semantic decompression can effectively overcome metaphorical barriers and concept drift problems in memes. Furthermore, compared to example-based retrieval methods such as Evolver, this invention not only achieves higher accuracy but also significantly improves inference efficiency due to the introduction of a risk routing mechanism.

[0114] This invention also provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the steps in the fine-grained harmful meme detection method provided in the above embodiments, including risk assessment, semantic retrieval, and multimodal reasoning.

[0115] This invention also provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the various processes of the fine-grained harmful meme detection method provided in this invention and achieves the same technical effect.

[0116] Finally, it should be noted that the above embodiments are intended to illustrate the technical solutions of the present invention and do not constitute any limitation on the present invention. Those skilled in the art should fully understand that modifications to the technical solutions described in the foregoing embodiments or equivalent substitutions for any part or all of the technical features are entirely feasible. Such modifications or substitutions, as long as they do not depart from the scope of protection defined by the claims of the present invention, should be considered reasonable extensions of the present invention.

Claims

1. A fine-grained harmful meme detection method based on information-theory semantic decompression, characterized in that it comprises:

S1: Encode the image features and text features of the meme to be detected separately, and fuse the encoded visual feature vector and text feature vector into a joint vector;

S2: Based on the joint vector of the meme to be detected, obtain the potential risk value of the meme to be detected;

S3: Compare the risk value with a preset threshold. If the risk value is less than the preset threshold, the meme is determined to be a low-risk meme and is input into the classifier for direct classification to obtain the fine-grained category to which it belongs. Otherwise, the meme is determined to be a high-risk meme, triggering the next semantic decompression retrieval operation;

S4: Using a cross-modal retrieval model, the joint vector of the high-risk meme is used as the query vector, and retrieval is performed in a pre-built semantic decompression knowledge base to obtain the Top-K "term-definition" pairs with high mutual information with the high-risk meme, the "term-definition" pairs containing explicit explanations of obscure hate symbols;

S5: The retrieved Top-K "term-definition" pairs are used as the semantic decompression context and concatenated with the image and text features of the original high-risk meme to construct an enhanced input prompt;

S6: Reason about the enhanced input prompt using a multimodal large model to obtain the fine-grained harmful category label of the high-risk meme.

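2. The fine-grained harmful meme detection method based on information-theory semantic decompression according to claim 1, characterized in that, in step S1, an image encoder and a text encoder from a pre-trained CLIP or SigLIP model extract features from the meme to be detected, and the features are fused into a joint vector; the formulas are as follows: $v = E_{\text{img}}(I)$, $t = E_{\text{txt}}(T)$, $z = \mathrm{Fuse}(v, t)$, where $I$ and $T$ are the image and the text of the meme to be detected, $E_{\text{img}}$ and $E_{\text{txt}}$ are the image encoder and the text encoder, $\mathrm{Fuse}(\cdot,\cdot)$ denotes the fusion operation, $v$ is the visual feature vector, $t$ is the text feature vector, and $z$ is the joint vector.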

3. The fine-grained harmful meme detection method based on information-theory semantic decompression according to claim 1, characterized in that, in step S2, the potential risk value of the meme to be detected is obtained as follows:
constructing a semantic reference system that includes a harmless anchor set and a harmful anchor set;
mapping the extracted joint vector of the meme onto the semantic reference system;
calculating the relative distance between the joint vector of the meme and the harmless anchor set and the harmful anchor set;
based on the joint vector of the meme and the relative distance, obtaining three risk scores through a three-head calibration network, and using the weighted sum of the three risk scores as the final risk value.

4. The fine-grained harmful meme detection method based on information-theory semantic decompression according to claim 3, characterized in that the harmless anchor set and the harmful anchor set are constructed as follows: a pre-trained CLIP model without further training is used to perform preliminary risk scoring on all meme samples in the training set, and all samples are ranked by risk score; the top N% of meme samples are selected as the high-risk candidate set, and the bottom N% of meme samples are selected as the low-risk candidate set; the K-Means clustering algorithm is applied to each of the two candidate sets, and the resulting cluster center vectors of the low-risk candidate set and of the high-risk candidate set are used as the harmless anchor set and the harmful anchor set, respectively.
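The anchor construction described above can be sketched in plain Python as follows. This is an illustrative sketch only: the function names, the deterministic initialization from the first k points, and the fixed iteration count are assumptions, and a production system would more likely rely on an existing K-Means implementation. The sketch assumes k does not exceed the number of candidates in each set.

```python
import math
from typing import List, Tuple

Vector = List[float]


def split_candidates(scores: List[float], n_percent: float) -> Tuple[List[int], List[int]]:
    """Return indices of the top N% (high-risk) and bottom N% (low-risk) samples."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = max(1, int(len(scores) * n_percent / 100))
    return order[:n], order[-n:]


def kmeans(points: List[Vector], k: int, iters: int = 20) -> List[Vector]:
    """Plain Lloyd's K-Means; initial centers are the first k points (assumed)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def build_anchor_sets(
    vectors: List[Vector], scores: List[float], n_percent: float, k: int
) -> Tuple[List[Vector], List[Vector]]:
    """Return (harmless_anchors, harmful_anchors) as cluster centers."""
    high_idx, low_idx = split_candidates(scores, n_percent)
    harmful = kmeans([vectors[i] for i in high_idx], k)
    harmless = kmeans([vectors[i] for i in low_idx], k)
    return harmless, harmful
```

For example, with four two-dimensional vectors, N = 50 and k = 1, each anchor set reduces to the mean of its two candidate vectors.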

5. The fine-grained harmful meme detection method based on information-theory semantic decompression according to claim 3, characterized in that the relative distance is calculated by a formula based on the cosine similarity between the joint vector and the anchors of the harmful anchor set and of the harmless anchor set.
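The exact relative-distance formula is not legible in this text, so the following Python sketch shows only one plausible reading and should not be taken as the claimed formula: it assumes the relative distance is the highest cosine similarity to any harmful anchor minus the highest cosine similarity to any harmless anchor. The function names `cosine` and `relative_distance` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relative_distance(
    joint_vector: Sequence[float],
    harmful_anchors: List[Sequence[float]],
    harmless_anchors: List[Sequence[float]],
) -> float:
    """ASSUMED form: max similarity to harmful anchors minus max to harmless ones."""
    sim_harmful = max(cosine(joint_vector, a) for a in harmful_anchors)
    sim_harmless = max(cosine(joint_vector, a) for a in harmless_anchors)
    return sim_harmful - sim_harmless
```

Under this assumed form, a positive value means the meme lies closer to the harmful anchors than to the harmless ones, and a negative value means the opposite.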

6. The fine-grained harmful meme detection method based on information-theory semantic decompression according to claim 3, characterized in that the three-head calibration network comprises three parallel branches:
a zero-shot head, which uses frozen pre-trained CLIP model parameters to calculate a first risk score of the meme to be detected;
a full-parameter head, which uses a perceptron trained on the full dataset to map the input vector, obtained by concatenating the joint vector and the relative distance, to a second risk score;
a harm-sensitive head, which uses a perceptron trained on a balanced dataset, obtained by resampling the training set, to map the input vector, obtained by concatenating the joint vector and the relative distance, to a third risk score.

7. The fine-grained harmful meme detection method based on information-theory semantic decompression according to claim 6, characterized in that the three risk scores and the final risk value obtained by the three-head calibration network are calculated as follows:
the risk score of the zero-shot head is calculated from the joint vector, the text-encoded feature vector of the harmful-category prompt in CLIP, the text-encoded feature vector of the harmless-category prompt in CLIP, and a temperature coefficient;
the risk score of the full-parameter head is calculated from the weight matrix and bias term of the fully connected layer, the weights and bias of the output layer, the ReLU activation function, the Sigmoid activation function, and the input vector obtained by concatenating the joint vector and the relative distance;
the risk score of the harm-sensitive head is calculated in the same form, using its own weight matrix and bias term of the fully connected layer and its own weights and bias of the output layer;
the final risk value is the weighted sum of the three risk scores, the three weights being hyperparameters.
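A minimal Python sketch of the three-head calibration network is given below. The exact formulas are not legible in this text, so the forms used here are assumptions consistent with the listed terms: the zero-shot head is taken to be a two-way softmax over temperature-scaled cosine similarities, each perceptron head is taken to be one ReLU hidden layer followed by a Sigmoid output, and the default temperature of 0.07 and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def zero_shot_score(z, t_harmful, t_harmless, tau: float = 0.07) -> float:
    """ASSUMED form: softmax over two similarities, equal to sigmoid of their scaled gap."""
    return sigmoid((cosine(z, t_harmful) - cosine(z, t_harmless)) / tau)


def mlp_score(x: Sequence[float], W1: List[List[float]], b1: List[float],
              w2: List[float], b2: float) -> float:
    """ASSUMED form: Sigmoid(w2 . ReLU(W1 x + b1) + b2)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)


def final_risk(s_zero: float, s_full: float, s_harm: float,
               alpha: float, beta: float, gamma: float) -> float:
    """Weighted sum of the three head scores; the weights are hyperparameters."""
    return alpha * s_zero + beta * s_full + gamma * s_harm
```

In use, the input `x` to both perceptron heads would be the joint vector with the relative distance appended, for example `list(z) + [d]`; the two heads share this form but carry separately trained parameters.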

8. The fine-grained harmful meme detection method based on information-theory semantic decompression according to claim 1, characterized in that the semantic decompression knowledge base is constructed as follows:
obtaining a hate dictionary for the target language;
filtering out generic insults from the hate dictionary and retaining hate words that target specific groups;
generating, by a committee composed of multiple large language models, an explicit explanatory definition for each retained hate word, the definition describing the target of the word and its metaphorical meaning;
subjecting the generated definitions to consistency verification and manual review to form a knowledge base containing key-value pairs of "terms" and "definitions", which are then encoded into knowledge vectors using an encoder.

9. The fine-grained harmful meme detection method based on information-theory semantic decompression according to claim 1, characterized in that, in step S4, the Top-K "term-definition" pairs are retrieved as follows: calculating the alignment probability between the joint vector of the high-risk meme and every knowledge vector in the semantic decompression knowledge base, and selecting the Top-K "term-definition" pairs with the highest probabilities as the semantic decompression context; the alignment probability is calculated from the learnable scale and bias parameters of the cross-modal retrieval model, the knowledge vector in the semantic decompression knowledge base, and the joint vector.
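The retrieval step can be sketched in Python as shown below. The alignment-probability formula is not legible in this text; the sketch assumes a SigLIP-style pairwise sigmoid, that is, the logistic function applied to a scaled cosine similarity plus a bias. The function names, the knowledge-base layout as `(term, definition, vector)` triples, and the default scale and bias values are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

KnowledgeEntry = Tuple[str, str, Sequence[float]]  # (term, definition, knowledge vector)


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def alignment_probability(z, k_vec, scale: float, bias: float) -> float:
    """ASSUMED form: sigmoid(scale * cos(z, k) + bias)."""
    return sigmoid(scale * cosine(z, k_vec) + bias)


def retrieve_top_k(
    z: Sequence[float],
    knowledge: List[KnowledgeEntry],
    k: int,
    scale: float = 10.0,   # placeholder; learnable in the claimed model
    bias: float = -5.0,    # placeholder; learnable in the claimed model
) -> List[Tuple[str, str]]:
    """Return the k (term, definition) pairs with the highest alignment probability."""
    scored = [
        (alignment_probability(z, vec, scale, bias), term, definition)
        for term, definition, vec in knowledge
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(term, definition) for _, term, definition in scored[:k]]
```

Since the sigmoid is monotonic, a positive scale leaves the ranking identical to ranking by cosine similarity; the scale and bias matter when the probability itself is thresholded or used during training.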

10. A fine-grained harmful meme detection system based on information-theory semantic decompression, characterized in that it comprises:
a feature extraction module, used to extract image features and text features of the meme to be detected and fuse them into a joint vector;
a risk-aware routing module, used to calculate the risk value of the meme to be detected based on its joint vector, and to classify the meme to be detected as a low-risk meme or a high-risk meme based on the risk value;
a semantic decompression retrieval module, used to take the joint vector of the high-risk meme as the query vector, retrieve external explicit semantic explanations from the pre-built semantic decompression knowledge base, and concatenate and fuse them, as the semantic decompression context, with the image and text features of the original high-risk meme to construct an enhanced input prompt;
a multimodal reasoning module, used to output a fine-grained category based on the original low-risk meme or the enhanced input prompt.