A small sample event detection method based on dynamic semantic meta-body driving

By using a few-sample event detection method driven by dynamic semantic meta-body and leveraging multimodal feature fusion and knowledge graph enhancement, we can achieve accurate event type localization and precise extraction of trigger words in few-sample scenarios. This solves the problems of insufficient semantic modeling and weak generalization ability of traditional models in resource-constrained scenarios, and improves detection accuracy and semantic modeling depth.

CN120561694B · Active · Publication Date: 2026-06-16 · BEIJING INST OF COMP TECH & APPL


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: BEIJING INST OF COMP TECH & APPL
Filing Date: 2025-06-04
Publication Date: 2026-06-16

Patent Timeline

04 Jun 2025: Application
16 Jun 2026: Publication (CN120561694B)

IPC: G06F18/241; G06F18/25; G06F18/22; G06F40/30; G06N5/022; G06N3/0455; G06N3/0464; G06N3/08

CPC: G06F18/241; G06F18/253; G06F18/22; G06F40/30; G06N5/022; G06N3/0455; G06N3/0464; G06N3/08

AI Tagging

Application Domain

Semantic analysis; Knowledge representation


AI Technical Summary

⚠Technical Problem

In small sample scenarios, event detection models have weak generalization ability, insufficient semantic modeling, and are difficult to transfer to new event types. Furthermore, traditional methods rely on large-scale labeled data, and trigger word extraction depends on local features, leading to semantic fragmentation and lack of reasoning logic.

⚗Method used

A few-sample event detection method driven by dynamic semantic meta-bodies is adopted. Event type meta-bodies are generated by a multimodal feature fusion network, and the trigger word semantic meta-body library is enhanced by a knowledge graph. A two-layer constraint decoding mechanism is introduced to achieve accurate event type localization and accurate extraction of trigger words.

🎯Benefits of technology

It improves the accuracy of event classification and trigger word extraction under small sample conditions, reduces the dependence on large-scale labeled data, enhances the depth of semantic modeling and cross-level semantic collaborative mining capabilities, and solves the semantic modeling bottleneck of traditional models in resource-constrained scenarios.

✦ Generated by Eureka AI based on patent content.

Figure CN120561694B_ABST

Abstract

The present application relates to a small sample event detection method driven by dynamic semantic meta-bodies, and belongs to the fields of artificial intelligence, big data, and natural language processing. Aiming at problems such as weak generalization and insufficient semantic modeling in small sample event detection, the application adopts a progressive framework of event classification followed by trigger word extraction. By constructing an event type meta-body anchoring mechanism and a knowledge graph-enhanced semantic meta-body library, it effectively reduces trigger word bias and deeply mines the class prior knowledge, contextual semantics, and entity association information implied in limited samples. It also designs a double-layer semantic interaction decoding mechanism for the support set and query set, which realizes feature complementation and knowledge transfer between the two types of data, so that structured reasoning can be carried out on unlabeled data, significantly improving the generalization ability and semantic modeling depth of event detection in small sample situations.

Description

Technical Field

[0001] This invention belongs to the fields of artificial intelligence, big data, and natural language processing, and specifically relates to a small sample event detection method based on dynamic semantic metabody driving.

Background Technology

[0002] With the rapid development of artificial intelligence and natural language processing technologies, event detection has become an important research direction in the field of natural language processing. The goal of event detection is to extract core elements of events from natural language text and identify event types. As a prerequisite for event extraction, high-performance models can provide crucial support for downstream tasks such as event relationship analysis and argument extraction. In vertical fields such as emergency management and financial monitoring, constructing structured intelligence reports through event detection can significantly improve the information conversion efficiency of unstructured text, providing decision-making systems with real-time and accurate event feature capture capabilities.

[0003] Compared to general domains, vertical domain event detection faces the dual challenges of extremely scarce labeled data and strong reliance on domain knowledge. Traditional supervised learning methods rely only on shallow text feature modeling, which has three core defects in small sample scenarios. First, the generalization ability of event types is weak, and new types require massive labeled data. Second, the semantic representation of trigger words is one-sided and lacks deep constraints from syntactic structure and knowledge graph. Third, argument association reasoning is lacking and cannot utilize the structured prior of event ontology, resulting in the detection accuracy decreasing as the sample size decreases.

[0004] This invention proposes a few-sample event detection method driven by dynamic semantic meta-entities. It constructs a three-level processing model: event type meta-entity anchoring, trigger word semantic meta-entity evolution, and two-layer constraint decoding. Specifically, it generates event type meta-entities containing deep semantic information through a multimodal feature fusion network, builds a dynamically evolving trigger word semantic meta-entity library using knowledge graph enhancement technology, and achieves accurate trigger word localization through two-layer constraints of event type priors and entity relationship structures. This method transforms event detection in low-resource scenarios into a knowledge transfer problem at the meta-entity level, breaking through the dependence of traditional methods on large-scale labeled data. Even with extremely limited available data, it can effectively capture core event features through cross-modal transmission and dynamic adaptation of semantic meta-entities, significantly improving the accuracy and generalization ability of event classification and trigger word extraction under small-sample conditions.

Summary of the Invention

[0005] (I) Technical Problems to Be Solved

[0006] The technical problem to be solved by this invention is how to provide a few-shot event detection method based on dynamic semantic metabody driving, so as to overcome the problems of weak generalization and insufficient semantic modeling in few-shot event detection.

[0007] (II) Technical Solution

[0008] To address the aforementioned technical problems, this invention proposes a few-sample event detection method driven by dynamic semantic metabody, which includes three steps:

[0009] Step 1: The event type metabody anchoring module constructs a multimodal feature fusion network to jointly encode the support set and query set; it uses a bidirectional Transformer to extract text semantic features, combines a dependency tree convolutional network to obtain syntactic structure information, and introduces a domain knowledge encoder to inject event ontology domain knowledge, generating a three-dimensional composite feature vector containing lexical semantics, syntactic structure, and domain knowledge, and then constructs the event type metabody. Through dynamic metric learning, it calculates the similarity between the query instance and each event type metabody to accurately anchor the event type.

[0010] Step 2: The trigger word semantic meta-body construction module constructs a knowledge graph-enhanced trigger word semantic meta-body library based on support set data for the anchored event types. It uses a bidirectional attention mechanism to encode the context and part-of-speech syntactic features of the trigger words, and integrates semantic primitives, semantic role frameworks, and domain ontology entity association rules into it with the help of graph attention network to form a three-dimensional meta-body representation with deep semantic constraints. Sequence labeling is performed through conditional random fields, and the meta-body parameters are dynamically updated through prototype gradient correction algorithm to adapt to cross-domain event variants.

[0011] Step 3: The dual-layer constraint decoding module introduces a dual-layer constraint decoding mechanism. The first layer generates a trigger word candidate filtering matrix based on the event type metabody and filters invalid cross-category candidates according to the category-trigger word mapping graph. The second layer constructs a dynamic bipartite graph by the co-occurrence relationship between trigger words and arguments, uses a graph convolutional network to learn ternary constraint relationships, and combines conditional random fields to achieve structured and accurate annotation of trigger word boundaries.

[0012] (III) Beneficial Effects

[0013] This invention proposes a few-sample event detection method based on dynamic semantic metabody driving, the main advantages of which are reflected in the following aspects:

[0014] (1) A progressive processing architecture for event classification and trigger word extraction was designed. First, the event type was anchored through a multimodal feature fusion network, and then trigger words were extracted based on a category-specific semantic metabody library. This mechanism enables the model to make full use of the implicit category prior knowledge and contextual structure information in the limited samples, avoiding the semantic fragmentation problem caused by over-reliance on local trigger word features. By jointly encoding the support set and query set through the multimodal feature fusion network, three-dimensional composite features of text semantics, syntactic structure, and domain keywords are extracted to generate semantic metabody representing the event type. Combined with dynamic metric learning, the accurate positioning of event types under small sample conditions is achieved, reducing the dependence on large-scale labeled data, fully mining the contextual semantic information in the limited data, and solving the problem that traditional models have insufficient generalization ability in few sample event types and are difficult to transfer to new event types.

[0015] (2) To address the shortcomings of traditional pipeline models that over-rely on local features of trigger words, a category-specific trigger word semantic metabody library is proposed. First, the target category is anchored through event type metabody, then the contextual semantics, part-of-speech syntax, semantic primitives, and entity association knowledge of trigger words (such as the co-occurrence patterns of trigger words and arguments in knowledge graphs) are integrated to form a three-dimensional metabody representation. This mechanism enables the model to learn event type-specific trigger word semantic features, avoiding the separation between global semantics and trigger words, and improving the collaborative mining capability of cross-level semantics in small sample scenarios. A deep interactive coding model for support sets and query sets is designed. Through cross-modal transmission of dynamic semantic metabody, the two types of data achieve bidirectional flow of semantic information during the feature extraction stage. The support set provides event type prototypes and trigger word semantic constraints for the query set, while the query set feeds back into the support set to optimize the domain adaptability of the metabody representation. This bidirectional association mechanism breaks through the semantic modeling bottleneck of traditional models in resource-constrained scenarios, enabling the model to efficiently transfer learned event type discrimination knowledge and trigger word structure rules to unknown domains.

[0016] (3) A hierarchical decoding framework is introduced in the decoding stage, which incorporates prior constraints on event types and structural constraints from knowledge graphs. The first layer filters out invalid cross-category trigger word candidates through the event type metabodies, and the second layer models the co-occurrence patterns of trigger words and arguments using entity relationships in the knowledge graph. This mechanism addresses the blindness of trigger word extraction in traditional, purely data-driven methods. By integrating domain knowledge and structured semantic rules, it moves from purely probabilistic prediction to knowledge-guided prediction, improving the accuracy and reliability of trigger word boundary labeling under small sample conditions. Filtering invalid cross-category trigger word candidates with the event type metabodies eliminates semantically irrelevant interference items. A co-occurrence relationship network of trigger words and argument entities is constructed through the knowledge graph. The trigger word information of the support set is injected as prior knowledge into the decoding process of the query set, realizing bidirectional semantic flow between the two types of data and yielding rich vector representations that contain syntactic structures and entity constraints. This solves the problems of shallow semantic modeling and lack of reasoning logic in small sample scenarios.

Attached Figure Description

[0017] Figure 1 is a general framework diagram of the present invention.

Detailed Implementation

[0018] To make the objectives, contents, and advantages of the present invention clearer, the specific embodiments of the present invention will be described in further detail below with reference to the accompanying drawings and examples.

[0019] This invention proposes a few-sample event detection method based on dynamic semantic metabody driving, which mainly solves three problems.

[0020] First, traditional supervised training models have insufficient generalization ability in event types with limited training samples and are difficult to transfer to newly emerging event types. This invention constructs an event type metabody anchoring mechanism, uses a multimodal feature fusion network to jointly encode support sets and query sets, extracts three-dimensional composite features of text semantics, syntactic structure, and domain keywords to generate event type metabodies, and combines dynamic metric learning to achieve accurate positioning of event types under small sample sizes, reduce dependence on large-scale labeled data, and fully explore the contextual semantic information in limited data.

[0021] Secondly, traditional event detection models adopt a pipeline mode of trigger word extraction-event classification, which over-relies on local lexical features and severs the connection between the global semantics of the event and the trigger words. This invention proposes a progressive framework of event classification-trigger word extraction. First, the event type is anchored through the event type metabody. Then, a knowledge graph-enhanced trigger word semantic metabody library is built for the target category. The context, part-of-speech syntax, semantic primitives and entity association knowledge of the trigger words are integrated to form a three-dimensional metabody representation. This enables the model to learn the trigger word semantic features specific to the event type and improves the collaborative mining ability of global semantics and local features in small sample scenarios.

[0022] Third, traditional meta-learning methods encode support sets and query sets independently, lacking deep semantic interaction and resulting in insufficient feature representation. This invention designs a two-layer constraint decoding mechanism: the first layer filters cross-category invalid trigger word candidates based on event type meta-entities; the second layer constructs a co-occurrence relationship network between trigger words and argument entities through a knowledge graph, introduces trigger word information from the support set as prior knowledge to guide query set decoding, realizes bidirectional semantic transmission between the two, and obtains rich vector representations containing syntactic structure and entity constraints, solving the problems of shallow semantic modeling and lack of reasoning in small sample scenarios.

[0023] This invention addresses the problems of weak generalization and insufficient semantic modeling in few-sample event detection by proposing a few-sample event detection method driven by dynamic semantic meta-entities. This method is based on a hierarchical framework driven by dynamic semantic meta-entities and includes three parts: (1) an event type meta-entity anchoring module, (2) a trigger word semantic meta-entity construction module, and (3) a two-layer constraint decoding module. The event type meta-entity anchoring module jointly encodes the support set and query set to identify event types; the trigger word semantic meta-entity construction module combines a knowledge graph to construct a dynamic meta-entity library; and the two-layer constraint decoding module filters invalid candidates and accurately labels trigger word boundaries based on graph convolution and conditional random fields. This method includes three steps:

[0024] Step 1: The event type metabody anchoring module constructs a multimodal feature fusion network to jointly encode the support set and query set. It extracts textual semantic features using a bidirectional Transformer, obtains syntactic structure information using a dependency tree convolutional network, and introduces a domain knowledge encoder to inject event ontology domain knowledge, generating a three-dimensional composite feature vector containing lexical semantics, syntactic structure, and domain knowledge. This vector is then used to construct the event type metabody. Through dynamic metric learning, the similarity between query instances and each event type metabody is calculated to accurately anchor the event type.

[0025] Step 2: The trigger word semantic meta-body construction module constructs a knowledge graph-enhanced trigger word semantic meta-body library based on support set data for the anchored event types. It uses a bidirectional attention mechanism to encode the context and part-of-speech syntactic features of trigger words, and integrates semantic primitives, semantic role frameworks, and domain ontology entity association rules by means of a graph attention network, forming a three-dimensional meta-body representation with deep semantic constraints. Sequence labeling is performed using conditional random fields, and the meta-body parameters are dynamically updated through a prototype gradient correction algorithm to adapt to cross-domain event variants.

[0026] Step 3: The two-layer constraint decoding module introduces a two-layer constraint decoding mechanism. The first layer generates a trigger word candidate filtering matrix based on the event type metabody, and filters invalid cross-category candidates according to the category-trigger word mapping graph; the second layer constructs a dynamic bipartite graph of the co-occurrence relationship between trigger words and arguments, uses a graph convolutional network to learn ternary constraint relationships, and combines conditional random fields to achieve structured and accurate annotation of trigger word boundaries.

[0027] The event type metabody anchoring module consists of two parts: (1) a multimodal joint encoding module and (2) a dynamic metric classification module.

[0028] The multimodal joint encoding module is used to construct a multimodal feature fusion network to jointly encode the support set and query set: it uses a bidirectional Transformer to extract text semantic features, combines a dependency tree convolutional network to obtain syntactic structure information, and introduces a domain knowledge encoder to inject event ontology domain knowledge to generate a three-dimensional composite feature vector containing lexical semantics, syntactic structure and domain knowledge, and then constructs an event type metabody;

[0029] The dynamic metric classification module is used to learn and calculate the similarity between query instances and event type meta-entities through dynamic metric learning, and predict which event type a sentence in the query set belongs to by combining contextual semantic similarity; the event classification model selects labeled data belonging to that event type as the exclusive support set for the trigger word extraction model.

[0030] The query set contains sentences to be classified. These sentences, along with sentences from the support set, are input into a multimodal feature fusion network for joint encoding. This invention utilizes a bidirectional Transformer to extract semantic features from the text, combines this with a dependency tree convolutional network to obtain syntactic structure information, and injects event ontology domain knowledge through a domain knowledge encoder, thereby generating a three-dimensional composite feature vector containing lexical semantics, syntactic structure, and domain knowledge. After constructing event type meta-entities based on these feature vectors, a dynamic metric learning mechanism is used to calculate the similarity between the query instance and each event type meta-entity. For example, the query set sentence "A certain team won the league championship," after multimodal feature fusion encoding, is compared with each event type meta-entity for similarity calculation. This accurately anchors the event type to which the sentence belongs, rather than simply comparing the similarity between the original sentences.
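The anchoring step in [0030] can be illustrated with a minimal sketch. It assumes, purely for illustration, that each event type metabody is the mean of its support feature vectors and that the metric is plain cosine similarity; the patent instead builds metabodies with a cross-modal interactive encoder and a learned metric. The function names, event type labels, and toy vectors are hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (assumed stand-in for a metabody)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def anchor_event_type(query_vec, support):
    """support maps an event type to the list of its composite feature vectors."""
    prototypes = {t: mean_vector(vs) for t, vs in support.items()}
    scores = {t: cosine(query_vec, p) for t, p in prototypes.items()}
    return max(scores, key=scores.get), scores

# Toy 3-dimensional composite features (hypothetical values).
support = {
    "Win": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "Attack": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
label, scores = anchor_event_type([0.85, 0.15, 0.05], support)
```

The query is compared against one representative per event type rather than against individual support sentences, which is the point the paragraph makes about not comparing raw sentences directly.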

[0031] For the multimodal joint encoding module, the input is:

[0032]

[0033] In the formula, the inputs are the textual features, syntactic features, and domain knowledge features of the support set sentences, the corresponding three features of the query set sentences, and the event type metabody. This invention employs a bidirectional Transformer fusion network to encode these inputs: a text semantic encoder, a dependency tree convolutional network, and a domain knowledge encoder extract the text features, syntactic features, and domain knowledge features respectively:

[0034]

[0035] The three-dimensional features are integrated into a composite feature representation through a feature fusion mechanism:

[0036]

[0037] where the fusion coefficients are learnable attention weights. For each query instance, ... composite feature vectors are generated, where K is the number of labeled instances for each event type in the support set, and N is the number of event types.
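A minimal sketch of the fusion in [0035] to [0037], assuming the three feature vectors share one dimension and that the learnable attention weights are normalized with a softmax. Both are assumptions, since the patent's fusion formula is not reproduced in this text; `fuse` and its `logits` parameter are hypothetical names.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(text_vec, syn_vec, know_vec, logits):
    """Attention-weighted sum of three same-dimension feature vectors.

    logits are the (assumed) unnormalized attention weights, one per modality.
    """
    w_text, w_syn, w_know = softmax(logits)
    return [w_text * t + w_syn * s + w_know * k
            for t, s, k in zip(text_vec, syn_vec, know_vec)]

# With equal logits every modality contributes one third.
fused = fuse([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], logits=[0.0, 0.0, 0.0])
```

In a trained model the logits would be learned parameters, so the balance between lexical, syntactic, and domain knowledge evidence can shift per task.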

[0038] Then, a three-dimensional feature correlation matrix is generated as the event type metabody through a cross-modal interactive encoder:

[0039]

[0040] In the formula, the terms are the meta-parameter of the n-th event type and the textual, syntactic, and domain knowledge feature sequences, respectively, where L is the sequence length and d is the feature dimension; e_n denotes the n-th event type, and k indexes the k-th labeled instance.

[0041] The dynamic metric classification module then calculates the final matching score using a multi-dimensional similarity aggregation function:

[0042]

[0043] In the formula, the three prototypes are the text, syntax, and knowledge dimension prototypes of the n-th event metabody, respectively. Cosine is the cosine similarity function used to calculate semantic matching degree; DTW is the dynamic time warping algorithm used to capture the sequence similarity of syntactic structures; and KG-Match is the knowledge graph alignment function used to evaluate domain knowledge consistency. The concatenation operator joins the features, and MLP stands for multilayer perceptron.
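To make the DTW term concrete, the sketch below implements the classic dynamic time warping distance on one-dimensional numeric sequences and then combines three per-dimension signals into one score. Several simplifications are assumed: the patent's syntactic feature sequences are L×d matrices rather than scalars, the aggregation there is an MLP over concatenated features (a weighted sum is used here as a stand-in), and the cosine and KG-Match values are passed in as precomputed numbers. `match_score` and its `weights` are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def match_score(cos_sim, dtw_dist, kg_match, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of three signals (linear stand-in for the MLP aggregation)."""
    dtw_sim = 1.0 / (1.0 + dtw_dist)  # map a distance to a similarity in (0, 1]
    w_c, w_d, w_k = weights
    return w_c * cos_sim + w_d * dtw_sim + w_k * kg_match

same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])   # warping absorbs the repeat
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
score = match_score(cos_sim=1.0, dtw_dist=same_shape, kg_match=1.0)
```

DTW is useful for the syntactic dimension because two sentences can share a dependency pattern while differing in length, which a position-by-position comparison would penalize.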

[0044] This invention employs a knowledge graph-enhanced multimodal meta-learning framework for trigger word extraction, constructing a three-dimensional semantic representation system for trigger words. Given the already anchored event type e_n, the model needs to accurately locate the event trigger words in the query sentence q.

[0045] Given the event type e_n of the query sentence q, K instances of the same event type are first selected from the support set to build a type-aware support set:

[0046]

[0047] where the label set is the trigger word label space of event e_n. This lets the model focus on the trigger word features of the specific event type and avoids interference from irrelevant information.
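The type-aware support set amounts to a filter on the anchored event type. A minimal sketch follows, assuming support instances are dictionaries with an `event_type` field and that the first K matches are kept. The patent does not state how the K instances are chosen, and the field names and sentences are hypothetical.

```python
def build_type_aware_support(support, event_type, k):
    """Keep at most k support instances labeled with the anchored event type."""
    same_type = [inst for inst in support if inst["event_type"] == event_type]
    return same_type[:k]

support = [
    {"tokens": ["Team", "won", "the", "cup"], "event_type": "Win"},
    {"tokens": ["Rebels", "attacked", "the", "base"], "event_type": "Attack"},
    {"tokens": ["She", "won", "gold"], "event_type": "Win"},
    {"tokens": ["They", "clinched", "the", "title"], "event_type": "Win"},
]
typed = build_type_aware_support(support, "Win", k=2)
```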

[0048] To comprehensively capture the semantic, syntactic, and domain-specific features of trigger words, the model employs a multimodal encoder:

[0049]

[0050]

[0051] where the output is the fused feature. The dependency tree convolutional network (DepTreeCNN) captures the syntactic structure of the sentence, and the domain knowledge encoder maps entities to the domain knowledge space. The input to DepTreeCNN is the syntactic analysis result obtained by processing the query sentence q, including information such as the sentence's dependency syntax tree; from it, DepTreeCNN produces a syntactic feature representation. The input to KG-Embed, the knowledge graph embedding layer of the domain knowledge encoder, is the entity-related information mentioned in the query sentence q; KG-Embed maps these entities to the domain knowledge space to obtain a domain knowledge feature representation.

[0052] To enable information exchange between the query set and the support set, a cross-modal interactive encoder is designed:

[0053]

[0054] The bidirectional attention mechanism enables query sentences to focus on key trigger word features in the support set, while the support set instance can also provide contextual clues for the query.

[0055] For the target event type e_n, its trigger word semantic metabody is constructed:

[0056]

[0057]

[0058]

[0059]

[0060] In the formula, the four components are the textual semantic prototype, the syntactic structure prototype, the domain knowledge prototype, and the knowledge graph prototype.

[0061] The trigger word semantic metabody constructed for the target event type e_n is a collection of these feature prototypes. It integrates text semantics, syntactic structure, domain knowledge, and knowledge graph information to characterize the semantic features of trigger words for this event type.

[0062] The textual semantic prototype is obtained by applying an attention pooling operation (AttnPool) to the textual features of the K same-type event instances in the support set and averaging the results. It represents the typical characteristics of this event type at the textual semantic level. The textual feature of the k-th same-type instance in the support set is the representation obtained after text feature extraction, for example a vector produced by a model such as BERT.

[0063] The syntactic structure prototype is obtained by processing the syntactic features of the K same-type event instances in the support set with a graph neural network (GNN). It reflects the typical syntactic pattern of the trigger words for this event type. The syntactic feature of the k-th same-type instance may be, for example, a syntactic structure vector produced by a dependency tree convolutional network (DepTreeCNN).

[0064] The knowledge graph prototype is computed by a graph attention network (GAT) from the knowledge graph and the relationships between trigger words, and reflects the association characteristics of the event type's trigger words within the knowledge graph. Here, the knowledge graph is the one related to the target event type e_n, containing the entities related to this event type and the relationships between them, and the relation term covers the relationships involving the trigger words of e_n.

[0065] The domain knowledge prototype is built from the same-type event instances in the support set: key features are extracted from the domain-related knowledge contained in their text using specific models or methods. For example, suppose the target event type is "financial event", the specific event type to be studied and detected; subsequent analysis, such as identifying relevant trigger words and determining whether the event has occurred, is organized around it. Relevant textual materials in the financial field, such as financial regulations and industry reports, are collected; domain knowledge is extracted from them, relationships between entities are determined through relation extraction, and the resulting knowledge is organized and encoded to form the domain knowledge prototype. This prototype provides a domain-level basis for determining the trigger words of "financial events".

[0066] Graph attention networks update node representations in the following ways:

[0067]

[0068] Among them, the attention coefficient is .

[0069] W is a weight matrix. In the graph attention network computation, it applies a linear transformation to the node features, mapping each node's original feature vector to a new feature space for the subsequent calculation of attention coefficients and updating of node representations. h_i and h_j are the feature vectors of node i and node j, respectively. These feature vectors are numerical representations of node information, including the attributes and semantic information of the nodes in graph structures such as knowledge graphs, and are used to calculate the attention relationships between nodes. N_i is the set of neighboring nodes of node i, that is, the nodes directly connected to node i in the graph. When a new representation of node i is calculated, the information of each node j in N_i is taken into account.
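For orientation, the widely used single-head graph attention formulation is shown below. It is offered as an assumption about what the omitted formulas express, not as the patent's exact equations: W, h_i, h_j, and N_i are as defined in [0069], while the attention vector a, the LeakyReLU scoring function, and the output nonlinearity σ come from the standard formulation and do not appear in this text.

```latex
\alpha_{ij} =
  \frac{\exp\!\left(\mathrm{LeakyReLU}\!\left(a^{\top}\,[\,W h_i \,\Vert\, W h_j\,]\right)\right)}
       {\sum_{k \in N_i} \exp\!\left(\mathrm{LeakyReLU}\!\left(a^{\top}\,[\,W h_i \,\Vert\, W h_k\,]\right)\right)},
\qquad
h_i' = \sigma\!\left(\sum_{j \in N_i} \alpha_{ij}\, W h_j\right)
```

Under this form the coefficients α_ij over the neighbors of node i sum to 1, so the updated representation is a nonlinear transform of a convex combination of the transformed neighbor features.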

[0070] Sequence labeling is performed using Conditional Random Fields (CRFs), taking into account multi-dimensional features:

[0071]

[0072]

[0073]

[0074]

[0075]

[0076]

[0077] In the formula, the individual terms represent text features, syntactic features, knowledge features, interaction features, and metabody matching features; the combined term is composed of these component features.

[0078] The transition matrix elements in the CRF give the score for moving from label y_{i-1} to label y_i, reflecting the dependencies between labels. For example, in a trigger word labeling task, they score how plausible it is for the label of the previous word (e.g., "non-trigger word") to be followed by the label of the current word (e.g., "trigger word"), helping the model account for the plausibility of the label sequence. Y_e is the trigger word label space for event e, i.e., the set of label categories, and L is the sentence length. The set of all possible label sequences for a sentence is the search space: the model looks for the label sequence in this set that maximizes the objective function, which is the labeling that best matches the trigger words in the sentence. i is the position index in the sequence. In sequence labeling, a sentence is treated as a sequence of words; i ranges from 1 to L (the number of words) and visits each position in turn for feature calculation and labeling decisions. y_i is the label category at the i-th position, drawn from the trigger word label space of event e_n, which contains labels such as "is a trigger word", "is not a trigger word", or more fine-grained trigger word type labels. The model computes the most suitable label for each position i to complete the sequence labeling.
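Finding the label sequence that maximizes a CRF score is normally done with Viterbi decoding. The sketch below shows that search on a toy sentence. It assumes the per-position emission scores have already been computed (in the patent they would combine the text, syntactic, knowledge, interaction, and metabody matching features) and uses a hypothetical two-label space, "O" and "B-TRG"; all numbers are made up.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions[i][y]   : score of label y at position i
    transitions[p][y] : score of moving from label p to label y
    """
    best = [{y: emissions[0][y] for y in labels}]
    back = []
    for i in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            # Best previous label for ending at y in position i.
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[i][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

labels = ["O", "B-TRG"]
emissions = [
    {"O": 2.0, "B-TRG": 0.0},   # "Team"
    {"O": 0.5, "B-TRG": 2.0},   # "won"
    {"O": 2.0, "B-TRG": 0.5},   # "cup"
]
transitions = {
    "O": {"O": 0.5, "B-TRG": 0.0},
    "B-TRG": {"O": 0.0, "B-TRG": -2.0},  # discourage two adjacent triggers
}
path = viterbi(emissions, transitions, labels)
```

The negative B-TRG to B-TRG transition is an example of the label dependency described in [0078]: it makes a sequence with two adjacent trigger labels less likely to win even when individual emission scores favor it.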

[0079] To adapt to cross-domain event variations, a dynamic update mechanism for meta-body parameters is designed:

[0080]

[0081] Here, the update draws on a set of trigger word nodes related to meta-learning, for example a series of task samples used for meta-learning. Based on these samples, the model updates the parameters of the trigger word semantic metabody to adapt to cross-domain event variations. The remaining coefficient is the learning rate.
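The update formula itself is not legible in this text, so the following is only an illustrative sketch of a learning-rate-scaled parameter update of the kind described above. It assumes a plain gradient step and a squared-distance loss between a prototype vector and the task samples; the name `update_prototype` and both assumptions are hypothetical and are not taken from the patent.

```python
import numpy as np

def update_prototype(prototype, task_samples, lr=0.1):
    """One gradient step pulling a prototype toward new-task samples.

    Assumed loss: mean squared distance between the prototype and the
    task samples; its gradient is 2 * (prototype - mean(task_samples)).
    """
    grad = 2.0 * (prototype - task_samples.mean(axis=0))
    return prototype - lr * grad

# Hypothetical 2-d prototype and two samples from a new domain.
prototype = np.array([0.0, 0.0])
task_samples = np.array([[1.0, 1.0], [3.0, 1.0]])
updated = update_prototype(prototype, task_samples, lr=0.1)
print(updated)
```

A small learning rate moves the prototype only part of the way toward the new domain, so knowledge from earlier tasks is not overwritten in a single step.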

[0082] The meta-learning loss function is:

[0083]

[0084] Here, the first term is the loss function associated with the support set S. In few-shot event detection the support set provides the key supervision: this loss measures the difference between the model's predictions on the support set and the ground truth, helping the model learn the features of the support set data. ξ is the learning rate.

[0085] The second term is the loss function associated with a specific task; it evaluates the model's performance on that task and guides transfer learning between different tasks.

[0086] The total loss function includes multi-dimensional constraints:

[0087]

[0088] Here, the first two terms are loss functions calculated from text features and syntactic features, respectively; the MMD (maximum mean discrepancy) term ensures that the feature distribution of the query set is aligned with that of the support set; and the final term is the knowledge graph alignment loss.
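As an illustration of how an MMD term can measure the mismatch between support-set and query-set feature distributions, the sketch below computes the standard biased estimate of squared MMD with an RBF kernel. The kernel choice, the `gamma` value, and the toy feature arrays are assumptions made for this example; the patent text does not specify them.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """Pairwise RBF kernel values between rows of a and rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

def mmd2(x, y, gamma=1.0):
    """Biased estimate of squared maximum mean discrepancy."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

# Hypothetical support features and two query sets: one close, one far.
support = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
query_near = support + 0.1
query_far = support + 3.0
print(mmd2(support, query_near), mmd2(support, query_far))
```

The value is near zero when the two feature sets are distributed alike and grows as they drift apart, which is why minimizing it aligns the query features with the support features.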

[0089] This invention employs a two-layer constraint decoding mechanism for trigger word extraction, constructing an event-type-aware structured annotation framework. After obtaining the event type e_n and its trigger word semantic metabody, the model localizes trigger words accurately through the following steps:

[0090] First, a candidate filter matrix for trigger words is generated based on the event type metabody:

[0091]

[0092] Here, the inputs are the fused features of the i-th token in the query sentence and the knowledge graph relevance between event type e_n and candidate word v_j; the remaining parameters are a weight matrix and a bias vector. The sigmoid function normalizes the score to the interval [0, 1].

[0093] This filtering matrix uses a type-trigger word mapping graph to filter out invalid cross-category candidates:

[0094]

[0095] Here, the probability term is the probability that a given trigger word occurs under event type e_n, and the value it is compared against is the probability threshold.
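The first decoding layer described above can be sketched as a sigmoid scoring step followed by a threshold mask. This is a hedged illustration only: the function names, the way token features and knowledge graph relevance are concatenated, the parameter shapes, and the threshold value are all assumptions, since the exact formulas are not legible in this text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def candidate_filter(token_feats, kg_relevance, W, b):
    """Score each token as a trigger candidate, in [0, 1].

    token_feats:  (L, d) fused features of the query tokens.
    kg_relevance: (L,) assumed knowledge-graph relevance of each token
                  to the anchored event type.
    W, b:         (d + 1,) weights and scalar bias (hypothetical shapes).
    """
    x = np.concatenate([token_feats, kg_relevance[:, None]], axis=1)
    return sigmoid(x @ W + b)

def filter_by_type(scores, type_trigger_prob, threshold=0.5):
    """Zero out candidates whose type-conditional probability is too low."""
    keep = type_trigger_prob >= threshold
    return np.where(keep, scores, 0.0)

# Hypothetical 3-token sentence with 2-d fused features.
token_feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
kg_relevance = np.array([0.9, 0.1, 0.5])
W = np.array([0.5, -0.5, 2.0])
b = -1.0
scores = candidate_filter(token_feats, kg_relevance, W, b)
type_trigger_prob = np.array([0.8, 0.6, 0.2])  # assumed P(trigger | e_n)
filtered = filter_by_type(scores, type_trigger_prob, threshold=0.5)
print(scores, filtered)
```

Separating the scoring from the masking keeps the learned filter differentiable while the type-trigger mapping acts as a hard constraint on which candidates survive.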

[0096] The co-occurrence relationship between trigger words and arguments is modeled as a dynamic bipartite graph, whose two node sets are the set of candidate trigger word nodes and the set of candidate argument nodes, and whose edge set carries weights indicating the strength of co-occurrence.

[0097] Event structure triples are introduced to represent the event type, trigger word, and role relationship, and the constraint relationships are learned through a graph convolutional network:

[0098]

[0099] Here, the first matrix is the adjacency matrix with self-loops added. The adjacency matrix A describes the connections between nodes in the graph: if node i and node j are connected by an edge, A_ij is usually 1 or the weight of that edge; if they are not connected, A_ij = 0. The other symbols are the degree matrix, the node representation at a given layer, and a trainable weight matrix. The triplet scoring function is defined as follows:

[0100]

[0101] Here, g is a multilayer perceptron and Embed denotes the embedded representation of entities and relations.
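The graph convolution described above (an adjacency matrix with self-loops, a degree matrix, layer-wise node representations, and a trainable weight matrix) matches the ingredients of the widely used symmetric-normalized GCN layer. The sketch below assumes that standard form and a ReLU activation; neither is confirmed by the legible text, and the toy graph and the name `gcn_layer` are hypothetical.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_tilde = A + np.eye(A.shape[0])   # add self-loops
    deg = A_tilde.sum(axis=1)          # (weighted) degrees incl. self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    A_norm = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# Hypothetical bipartite graph: nodes 0-1 are trigger candidates,
# nodes 2-3 are argument candidates; edge weights are co-occurrence
# strengths, and edges only connect the two groups.
A = np.array([[0.0, 0.0, 1.0, 0.5],
              [0.0, 0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.0]])
H = np.eye(4)             # one-hot initial node features
W = np.ones((4, 2))       # placeholder for a trained weight matrix
out = gcn_layer(A, H, W)
print(out.shape)
```

Each output row mixes a node's own features with those of its co-occurring neighbors, which is how trigger candidates pick up evidence from the arguments they appear with.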

[0102] The model is optimized using the following multi-task loss function:

[0103]

[0104]

[0105] In the formula, the four terms are the conditional random field sequence labeling loss, the attention network loss, the knowledge graph alignment loss, and the type consistency loss. In the type consistency loss, one symbol denotes the actual labels, C is the number of labels, and the remaining symbol is the predicted probability. The two-layer constraint decoding mechanism filters invalid candidates through the event type metabody and uses the graph structure to learn the deep relationship between trigger words and arguments, achieving structured and accurate annotation of trigger word boundaries.
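A loss defined over actual labels, C label classes, and predicted probabilities is most naturally read as a cross-entropy, and a multi-task objective is commonly a weighted sum of its parts. The sketch below assumes both readings; the weights, the function names, and the example values are hypothetical and are not taken from the patent.

```python
import numpy as np

def type_consistency_loss(y_true, p_pred, eps=1e-12):
    """Cross-entropy over C labels: -sum_c y_c * log(p_c)."""
    return float(-(y_true * np.log(p_pred + eps)).sum())

def multitask_loss(l_crf, l_attn, l_kg, l_type,
                   weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four losses; the weights are hypothetical."""
    parts = (l_crf, l_attn, l_kg, l_type)
    return sum(w * part for w, part in zip(weights, parts))

# Hypothetical one-hot label over C = 3 event types and a prediction.
y_true = np.array([0.0, 1.0, 0.0])
p_pred = np.array([0.2, 0.7, 0.1])
l_type = type_consistency_loss(y_true, p_pred)
total = multitask_loss(1.0, 2.0, 3.0, l_type)
print(l_type, total)
```

Keeping the weights explicit makes it possible to tune how strongly the type prediction constrains the sequence labeling and graph-based terms.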

[0106] This invention presents a small-sample event detection method driven by dynamic semantic metabodies. It employs a progressive framework of event classification followed by trigger word extraction. By constructing an event type metabody anchoring mechanism and a knowledge graph-enhanced semantic metabody library, it reduces trigger word bias and mines the implicit category prior knowledge, contextual semantics, and entity association information in limited samples. It also designs a two-layer semantic interaction decoding mechanism for support sets and query sets, enabling feature complementarity and knowledge transfer between the two types of data. This allows structured reasoning on unlabeled data and improves both the generalization ability of event detection and the depth of semantic modeling in small-sample scenarios.

[0107] This invention proposes a few-sample event detection method based on dynamic semantic metabody driving, the main advantages of which are reflected in the following aspects:

[0108] (1) A progressive processing architecture for event classification and trigger word extraction was designed. First, the event type was anchored through a multimodal feature fusion network, and then trigger words were extracted based on a category-specific semantic metabody library. This mechanism enables the model to make full use of the implicit category prior knowledge and contextual structure information in the limited samples, avoiding the semantic fragmentation problem caused by over-reliance on local trigger word features. By jointly encoding the support set and query set through the multimodal feature fusion network, three-dimensional composite features of text semantics, syntactic structure, and domain keywords are extracted to generate semantic metabody representing the event type. Combined with dynamic metric learning, the accurate positioning of event types under small sample conditions is achieved, reducing the dependence on large-scale labeled data, fully mining the contextual semantic information in the limited data, and solving the problem that traditional models have insufficient generalization ability in few sample event types and are difficult to transfer to new event types.

[0109] (2) To address the shortcomings of traditional pipeline models that over-rely on local features of trigger words, a category-specific trigger word semantic metabody library is proposed. First, the target category is anchored through event type metabody, then the contextual semantics, part-of-speech syntax, semantic primitives, and entity association knowledge of trigger words (such as the co-occurrence patterns of trigger words and arguments in knowledge graphs) are integrated to form a three-dimensional metabody representation. This mechanism enables the model to learn event type-specific trigger word semantic features, avoiding the separation between global semantics and trigger words, and improving the collaborative mining capability of cross-level semantics in small sample scenarios. A deep interactive coding model for support sets and query sets is designed. Through cross-modal transmission of dynamic semantic metabody, the two types of data achieve bidirectional flow of semantic information during the feature extraction stage. The support set provides event type prototypes and trigger word semantic constraints for the query set, while the query set feeds back into the support set to optimize the domain adaptability of the metabody representation. This bidirectional association mechanism breaks through the semantic modeling bottleneck of traditional models in resource-constrained scenarios, enabling the model to efficiently transfer learned event type discrimination knowledge and trigger word structure rules to unknown domains.

[0110] (3) A hierarchical decoding framework is introduced in the decoding stage, incorporating prior constraints from event types and structural constraints from the knowledge graph. The first layer filters out invalid cross-category trigger word candidates using the event type metabody, and the second layer models the co-occurrence patterns of trigger words and arguments using entity relationships in the knowledge graph. This mechanism addresses the blindness of trigger word extraction in traditional, purely data-driven methods. By integrating domain knowledge and structured semantic rules, it moves from pure probability prediction to knowledge-guided prediction, improving the accuracy and reliability of trigger word boundary labeling under small-sample conditions. Filtering invalid cross-category candidates with the event type metabody eliminates semantically irrelevant interference. A co-occurrence network of trigger words and argument entities is constructed from the knowledge graph, and the trigger word information of the support set is injected as prior knowledge into the decoding of the query set. This realizes bidirectional semantic flow between the two types of data and yields rich vector representations containing syntactic structure and entity constraints, addressing the shallow semantic modeling and lack of reasoning logic in small-sample scenarios.

[0111] The above description is only a preferred embodiment of the present invention. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the technical principles of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.

Claims

1. A few-sample event detection method based on dynamic semantic metabody driving, characterized in that the method includes three steps: Step 1: The event type metabody anchoring module constructs a multimodal feature fusion network to jointly encode the support set and query set; it uses a bidirectional Transformer to extract text semantic features, combines a dependency tree convolutional network to obtain syntactic structure information, and introduces a domain knowledge encoder to inject event ontology domain knowledge, generating a three-dimensional composite feature vector containing lexical semantics, syntactic structure, and domain knowledge, and then constructs the event type metabody. Through dynamic metric learning, it calculates the similarity between the query instance and each event type metabody to accurately anchor the event type. Step 2: The trigger word semantic metabody construction module constructs, for the anchored event types, a knowledge graph-enhanced trigger word semantic metabody library based on support set data. It uses a bidirectional attention mechanism to encode the context and part-of-speech syntactic features of the trigger words, and integrates semantic primitives, semantic role frameworks, and domain ontology entity association rules into it by means of a graph attention network to form a three-dimensional metabody representation with deep semantic constraints. Sequence labeling is performed through conditional random fields, and the metabody parameters are dynamically updated through a prototype gradient correction algorithm to adapt to cross-domain event variants. Step 3: The dual-layer constraint decoding module introduces a dual-layer constraint decoding mechanism. The first layer generates a trigger word candidate filtering matrix based on the event type metabody and filters invalid cross-category candidates according to the category-trigger word mapping graph. The second layer constructs a dynamic bipartite graph from the co-occurrence relationship between trigger words and arguments, uses a graph convolutional network to learn ternary constraint relationships, and combines conditional random fields to achieve structured and accurate annotation of trigger word boundaries.

2. The few-sample event detection method based on dynamic semantic metabody driving as described in claim 1, characterized in that, in step one, the event type metabody anchoring module includes a multimodal joint coding module and a dynamic metric classification module; the multimodal joint coding module is used to construct a multimodal feature fusion network, jointly encode the support set and query set, extract text semantic features using a bidirectional Transformer, obtain syntactic structure information by combining a dependency tree convolutional network, and introduce a domain knowledge encoder to inject event ontology domain knowledge, generating a three-dimensional composite feature vector containing lexical semantics, syntactic structure, and domain knowledge, and then constructing an event type metabody; the dynamic metric classification module is used to calculate the similarity between query instances and event type metabodies through dynamic metric learning, and to predict which event type a sentence in the query set belongs to by combining contextual semantic similarity; the event classification model selects the labeled data belonging to that event type as the exclusive support set for the trigger word extraction model.

3. The few-sample event detection method based on dynamic semantic metabody driving as described in claim 2, characterized in that the input to the multimodal joint coding module comprises the text features, syntactic features, and domain knowledge features of the support set sentences, the text features, syntactic features, and domain knowledge features of the query set sentences, and the event type metabody; the input is encoded by a bidirectional Transformer fusion network, in which a text semantic encoder, a dependency tree convolutional network, and a domain knowledge encoder extract the text features, syntactic features, and domain knowledge features, respectively; the three-dimensional features are integrated into a composite feature representation through a feature fusion mechanism with learnable attention weights; for each query instance, composite feature vectors are generated, where K is the number of labeled data items for each event type in the support set and N is the number of event types; then, a three-dimensional feature correlation matrix is generated as the event type metabody through a cross-modal interactive encoder, where the encoder's parameter term represents the meta-parameter of the n-th event type and its inputs are the text, syntactic, and domain knowledge feature sequences, L being the sequence length and d the feature dimension; e_n denotes the n-th event type, and k denotes the k-th labeled data item.

4. The few-sample event detection method based on dynamic semantic metabody driving as described in claim 3, characterized in that the dynamic metric classification module calculates the final matching score using a multi-dimensional similarity aggregation function, wherein the prototype terms in the formula represent the text, syntax, and knowledge dimension prototypes of the n-th event metabody, respectively; Cosine is the cosine similarity function used to calculate the degree of semantic matching; DTW is the dynamic time warping algorithm used to capture the sequence similarity of syntactic structures; KG-Match is the knowledge graph alignment function used to evaluate domain knowledge consistency; the concatenation symbol denotes the feature concatenation operation; and MLP denotes a multilayer perceptron.

5. The few-sample event detection method based on dynamic semantic metabody driving as described in claim 4, characterized in that, in step two, the trigger word semantic metabody construction module constructing, for the anchored event types, a knowledge graph-enhanced trigger word semantic metabody library based on the support set data includes: given the event type e_n of the query sentence q, first selecting K instances of the same event type from the support set to build a type-aware support set, in which the label set is the trigger word label space of event e_n, allowing the model to focus on the trigger word features of the specific event type; to comprehensively capture the semantic, syntactic, and domain-specific features of trigger words, the model employs a multimodal encoder that produces fused features, in which the dependency tree convolutional network (DepTreeCNN) captures the syntactic structure of the sentence and the domain knowledge encoder maps entities to the domain knowledge space; the syntactic analysis result obtained by processing the query sentence q contains the sentence's dependency syntax tree information, and DepTreeCNN takes it as input to capture the syntactic structure of the sentence and obtain a syntactic feature representation; the entity-related information of the query sentence q is taken as input by KG-Embed, the knowledge graph embedding layer of the domain knowledge encoder, which maps entities to the domain knowledge space to obtain a domain knowledge feature representation; to enable information exchange between the query set and the support set, a cross-modal interactive encoder is designed, whose bidirectional attention mechanism enables the query sentence to focus on key trigger word features in the support set while the support set instances provide contextual clues for the query; for the target event type e_n, its trigger word semantic metabody is constructed from a text semantic prototype, a syntactic structure prototype, a domain knowledge prototype, and a knowledge graph prototype; the trigger word semantic metabody constructed for the target event type e_n is a collection of multiple feature prototypes integrating text semantics, syntactic structure, domain knowledge, and knowledge graph information, used to characterize the semantic features of trigger words for this event type; the text semantic prototype is obtained by applying an attention pooling operation (AttnPool) to the text features of the K same-type event instances in the support set and averaging the results, and represents the typical characteristics of this event type at the text semantic level, where the text feature of the k-th same-type instance in the support set is the representation obtained after text feature extraction; the syntactic structure prototype is obtained by processing the syntactic features of the K same-type event instances in the support set with a graph neural network (GNN), and reflects the typical syntactic patterns of trigger words for this event type, where the syntactic feature is the syntactic feature representation of the k-th same-type instance in the support set; the knowledge graph prototype is calculated with a graph attention network (GAT) from the knowledge graph and the relationships of the trigger words, and reflects the association characteristics of the event type's trigger words in the knowledge graph, where the knowledge graph is the one related to the target event type e_n, containing the entities related to this event type and the relationships between those entities, and the relationship term denotes the relationship between the trigger words of the target event type e_n and that knowledge graph; and the domain knowledge prototype is constructed, based on the same-type event instances in the support set, by extracting key features from the domain-related knowledge contained in the text using specific models or methods.

6. The few-sample event detection method based on dynamic semantic metabody driving as described in claim 5, characterized in that the graph attention network updates node representations using attention coefficients, wherein: W is a weight matrix that linearly transforms node features in the graph attention network computation, mapping the original feature vector of each node to a new feature space to facilitate the subsequent calculation of attention coefficients and the update of node representations; h_i and h_j are the feature vectors of node i and node j, respectively, which are numerical representations of node information, including the relevant attributes and semantic information of the nodes in the graph structure, and are used to calculate the attention relationship between nodes; and N_i is the set of neighboring nodes of node i, i.e., the nodes directly connected to node i in the graph structure; when a new representation of node i is calculated, the information of each node j in its neighbor set N_i is taken into account.

7. The few-sample event detection method based on dynamic semantic metabody driving as described in claim 5, characterized in that performing sequence labeling through conditional random fields includes: performing sequence labeling using conditional random fields (CRFs), taking multi-dimensional features into account, wherein in the formula the feature terms represent, in order, the text features, the syntactic features, the knowledge features, the interaction features, and the metabody matching features, and one term is itself composed of four component terms; the transition term is an element of the CRF transition matrix, scoring the move from label y_{i-1} to label y_i and reflecting the dependencies between labels; Y_e is the trigger word label space for event e, i.e., the set of label categories, and L is the sentence length, which together define the set of all possible label combinations for a sentence; the model searches this set for the label sequence that maximizes the objective function, which is the labeling that best matches the trigger words in the sentence; i is the position index in the sequence: in sentence-level sequence labeling the sentence is regarded as a sequence of words, and i runs from 1 to L, visiting each position in turn for feature calculation and the labeling decision; y_i is the label category at the i-th position in the sequence; the set of label categories contains the different labels related to the trigger word, and the model calculates the most suitable label for each position i to complete the sequence labeling.

8. The few-sample event detection method based on dynamic semantic metabody driving as described in claim 7, characterized in that dynamically updating the metabody parameters through the prototype gradient correction algorithm to adapt to cross-domain event variants includes: designing a dynamic update mechanism for the metabody parameters to adapt to cross-domain event variations, wherein the update draws on a set of trigger word nodes related to meta-learning, representing a series of task samples used for meta-learning, on the basis of which the model updates the parameters of the trigger word semantic metabody to adapt to cross-domain event variations, and the remaining coefficient is the learning rate; the meta-learning loss function is defined in terms of a loss function associated with the support set S and a loss function associated with a specific task, wherein, in few-sample event detection, the support set provides key information and its loss measures the difference between the model's predictions on the support set and the ground truth, helping the model learn the features of the support set data; ξ is the learning rate; and the task-related loss function is used to evaluate the model's performance on a specific task and guide the model's transfer learning between different tasks; the total loss function includes multi-dimensional constraints, comprising a loss function calculated from text features, a loss function calculated from syntactic features, an MMD (maximum mean discrepancy) term that ensures the query set is aligned with the feature distribution of the support set, and a knowledge graph alignment loss.

9. The few-sample event detection method based on dynamic semantic metabody driving as described in claim 8, characterized in that the first layer generating a trigger word candidate filtering matrix based on the event type metabody and filtering invalid cross-category candidates according to the category-trigger word mapping graph includes: first, generating a candidate filter matrix for trigger words based on the event type metabody, wherein the inputs are the fused features of the i-th token in the query sentence and the knowledge graph relevance between event type e_n and candidate word v_j, the remaining parameters are a weight matrix and a bias vector, and the sigmoid function normalizes the score to the interval [0, 1]; this filtering matrix uses a type-trigger word mapping graph to filter out invalid cross-category candidates, wherein the probability term is the probability that a given trigger word occurs under event type e_n, and the value it is compared against is the probability threshold.

10. The few-sample event detection method based on dynamic semantic metabody driving as described in claim 9, characterized in that the second layer constructing a dynamic bipartite graph from the co-occurrence relationship between trigger words and arguments, using a graph convolutional network to learn ternary constraint relationships, and combining conditional random fields to achieve structured and accurate annotation of trigger word boundaries includes: modeling the co-occurrence relationship between trigger words and arguments as a dynamic bipartite graph, whose two node sets are the set of candidate trigger word nodes and the set of candidate argument nodes, and whose edge set carries weights indicating the strength of co-occurrence; introducing event structure triples to represent the event type, trigger word, and role relationship, and learning the constraint relationships through a graph convolutional network, wherein the first matrix is the adjacency matrix with self-loops added; the adjacency matrix A describes the connections between nodes in the graph structure: if node i and node j are connected by an edge, A_ij is usually 1 or the weight of that edge, and if they are not connected, A_ij = 0; the other symbols are the degree matrix, the node representation at a given layer, and a trainable weight matrix; the triplet scoring function is defined with g being a multilayer perceptron and Embed denoting the embedded representation of entities and relations; the model is optimized using a multi-task loss function whose four terms are the conditional random field sequence labeling loss, the attention network loss, the knowledge graph alignment loss, and the type consistency loss, wherein, in the type consistency loss, one symbol denotes the actual labels, C is the number of labels, and the remaining symbol is the predicted probability; the two-layer constraint decoding mechanism filters invalid candidates through the event type metabody and uses the graph structure to learn the deep relationship between trigger words and arguments, achieving structured and accurate annotation of trigger word boundaries.