An event-oriented argument extraction method for military field

By combining a multi-labeled entity classifier and a multi-task learning mechanism, using multi-labeled entity-guided attention and the BERT model, an AMR graph is constructed, which solves the problem of insufficient accuracy and recall in event argument extraction in the military field, and achieves higher accuracy and robustness in event argument extraction.

CN119886125B · Active · Publication Date: 2026-06-12 · BEIJING INFORMATION SCI & TECH UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: BEIJING INFORMATION SCI & TECH UNIV
Filing Date: 2024-12-31
Publication Date: 2026-06-12

Application Information


IPC: G06F40/279; G06F40/205; G06F16/35; G06N3/047; G06N3/096; G06N3/098; G06N3/0455; G06N3/042; G06F40/30

CPC: G06F40/279; G06F40/205; G06F16/35; G06N3/047; G06N3/096; G06N3/098; G06N3/0455; G06N3/042

AI Tagging

Application Domain

Semantic analysis; Biological models

Technical Efficacy Phrases

Improve accuracy; Improve the extraction effect


Abstract

The application discloses an event argument extraction method for the military field and relates to the technical field of event argument extraction. The method comprises the following steps: step one, acquire data; step two, define annotation rules; step three, introduce a multi-label entity classifier that annotates each entity as a single-label or multi-label type; step four, introduce a multi-label entity-guided attention mechanism; step five, perform sequence labeling; step six, perform joint training; step seven, encode using the BERT model; step eight, capture context and encode the document; step nine, construct a global AMR graph and a local AMR graph, where an AMR guidance module stimulates the interaction between concepts in the document and an information fusion module fuses the dual-stream representations. By combining a multi-label entity classifier with a multi-task learning mechanism, the method improves the recognition of event arguments and provides higher accuracy and recall, especially in military-field applications.


Description

Technical Field

[0001] This invention relates to the field of event argument extraction technology, and in particular to an event argument extraction method for the military field.

Background Technology

[0002] In the field of natural language processing, event argument extraction is a core task, aiming to identify and extract information about participating elements related to a specific event from text content. In many applications, such as news text analysis and information extraction, accurate extraction of event arguments is crucial for understanding event information within text.

[0003] There are three main methods for event argument extraction: rule-based methods, statistical methods, and hybrid methods that combine the two. Rule-based methods often require a large amount of manually labeled data and cannot effectively handle the diversity of language. Statistical methods, such as Hidden Markov Models (HMMs) and Conditional Random Fields (CRFs), while better able to capture complex language features, still face challenges in terms of accuracy and recall.

[0004] In recent years, deep learning-based models, such as BERT, BiLSTM-CRF, and Transformer, have made significant progress. However, these models still face some challenges in handling domain-specific event extraction tasks. For example, although BERT can capture contextual information, it may still fail to fully solve the problem of event argument extraction in specific domains, such as the military domain. At the same time, the application of multi-label entity recognition and multi-task learning has also shown its potential to improve model performance. However, how to effectively combine these methods to meet the needs of specific domains remains an urgent problem to be solved. To this end, we propose an event argument extraction method for the military domain.

Summary of the Invention

[0005] The purpose of this invention is to provide an event argument extraction method for the military field. This event argument extraction method combines a multi-label entity classifier and a multi-task learning mechanism, and by improving the ability to identify event arguments, it provides higher accuracy and recall, especially in military applications.

[0006] To achieve the above objectives, the present invention provides the following technical solution: a method for extracting event arguments in the military field, comprising the following specific steps:

[0007] Step 1: Data Acquisition: Acquire a certain amount of data and process the acquired data;

[0008] Step 2: Define annotation rules: Standardize the annotation of controversial corpora, ensure annotation quality and reduce annotation difficulty, and annotate events that are consistent with the information analysis topic and have clear semantics;

[0009] Step 3: Introduce a multi-label entity classifier: label entities as single-label or multi-label types;

[0010] Step 4: Introduce a multi-labeled entity-guided attention mechanism: Input sentences with multi-labeled entities into a document-level encoder, use a multi-labeled entity-guided attention mechanism to distinguish entity importance, and combine sentence-level contextual information and document-level features;

[0011] Step 5: Sequence Labeling: Input the fused features obtained by the fusion mechanism into a Conditional Random Field (CRF) layer for sequence labeling, thereby extracting event arguments;

[0012] Step 6: Joint Training: Train jointly, with event argument extraction as the main task and named entity boundary detection as the auxiliary task;

[0013] Step 7: Encode using the BERT model: Improve the performance of event argument extraction by combining the parameter sharing mechanism of multi-task learning with named entity boundary information;

[0014] Step 8: Capture context and encode the document: Use global and local encoders to capture different ranges of context and encode the document;

[0015] Step 9: Construct global AMR and local AMR graphs: Stimulate the interaction between concepts in the document through the AMR guidance module, and fuse the two-stream representations using the information fusion module.
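The sequence labeling in step 5 can be illustrated with a small decoding sketch. The patent does not specify how its CRF layer is decoded; the code below assumes standard Viterbi decoding over per-token emission scores and tag-to-tag transition scores. The tag names (`O`, `B-ARG`, `I-ARG`), the scores, and the function name `viterbi` are hypothetical and chosen only for illustration.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions:   list of {tag: score}, one dict per token (assumed given
                 by the fused features; values here are made up).
    transitions: {(prev_tag, cur_tag): score}; missing pairs score 0.0.
    """
    # Best score of any path ending in each tag at the first token.
    best = {t: emissions[0][t] for t in tags}
    back = []  # back[i][cur] = best previous tag for token i + 1
    for em in emissions[1:]:
        new_best, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + em[cur]
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


tags = ["O", "B-ARG", "I-ARG"]
# Hypothetical constraint: an argument cannot continue directly after "O".
transitions = {("O", "I-ARG"): -10.0}
emissions = [
    {"O": 0.1, "B-ARG": 2.0, "I-ARG": 0.5},
    {"O": 0.2, "B-ARG": 0.1, "I-ARG": 1.5},
    {"O": 2.0, "B-ARG": 0.3, "I-ARG": 0.4},
]
decoded = viterbi(emissions, transitions, tags)
```

The transition table is what distinguishes a CRF from independent per-token classification: a strongly negative score for `O` followed by `I-ARG` keeps the decoder from emitting an argument continuation without a beginning.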

[0016] Preferably, when defining the annotation rules in step two, the data is also preprocessed through lemmatization and a trusted trigger word set.

[0017] Preferably, in step four, a multi-encoding mechanism is used to dynamically fuse sentence-level features and document-level features of the text.

[0018] Preferably, in step four, an improved first-layer graph attention network model is used to process the instance embedding initialization matrix and the relation embedding initialization matrix to obtain the first-layer instance embedding optimization matrix.

[0019] Preferably, in step nine, the two stream representations are fused through an information fusion module, and the boundary information is enhanced through boundary loss.

[0020] Preferably, the calculation formula for the multi-label entity-guided attention mechanism is:

[0021] $\mathrm{Attention}(t,e)=\mathrm{Softmax}(\mathrm{LeakyReLU}(W_{att}\cdot[t;e]))$

[0022] Where $W_{att}$ is a linear transformation matrix, and $[t;e]$ represents the concatenation of the trigger word $t$ and the entity $e$.
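As a rough illustration of the attention formula above, the following sketch scores each entity against a trigger and normalizes the scores with a softmax. It assumes that the linear transformation maps the concatenated vector to a single score and that the softmax runs over the entities of one document; the patent does not state either point. The vectors, the weight values, and the function names are hypothetical.

```python
import math


def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU with an assumed slope of 0.01 for negative inputs."""
    return x if x > 0 else negative_slope * x


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [v / total for v in exps]


def entity_attention(trigger, entities, w_att):
    """Attention weights of one trigger over a list of entity vectors.

    w_att is treated as a single row of the transformation matrix, so the
    product with the concatenation [t; e] is a scalar (an assumption).
    """
    scores = []
    for e in entities:
        concat = trigger + e  # list concatenation plays the role of [t; e]
        raw = sum(w * x for w, x in zip(w_att, concat))
        scores.append(leaky_relu(raw))
    return softmax(scores)


trigger = [1.0, 0.0]
entities = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
w_att = [0.5, 0.5, 1.0, -1.0]  # made-up weights, not learned values
weights = entity_attention(trigger, entities, w_att)
```

The weights sum to one, so they can be used directly to form a weighted combination of entity representations.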

[0023] Preferably, the loss function for the multi-task learning is:

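[0024] $\mathrm{Loss} = \mathrm{Loss}_{main} + \lambda\cdot\mathrm{Loss}_{aux}$

[0025] Here, $\mathrm{Loss}_{main}$ is the main task loss for event argument extraction, $\mathrm{Loss}_{aux}$ is the auxiliary task loss for named entity boundary detection, and $\lambda$ is the coefficient that weights the auxiliary loss.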
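A minimal numeric sketch of the combined loss follows. It assumes that both task losses are cross-entropy values and that the weighting coefficient is 0.5; the patent specifies neither the form of the individual losses nor the value of the coefficient, and the probabilities below are invented.

```python
import math


def cross_entropy(probs, gold_index):
    """Negative log-likelihood of the gold class (assumed loss form)."""
    return -math.log(probs[gold_index])


def multitask_loss(loss_main, loss_aux, lam=0.5):
    """Main-task loss plus the weighted auxiliary-task loss."""
    return loss_main + lam * loss_aux


# Hypothetical predictions: argument-role probabilities for one token and
# boundary / non-boundary probabilities for the same token.
loss_main = cross_entropy([0.7, 0.2, 0.1], gold_index=0)
loss_aux = cross_entropy([0.4, 0.6], gold_index=1)
total = multitask_loss(loss_main, loss_aux, lam=0.5)
```

Setting the coefficient to zero recovers single-task training on argument extraction alone, which makes it a convenient knob for ablation.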

[0026] Preferably, the formula for fusing global and local context information in the dual-stream coding is as follows:

[0027] FusedRepresentation=Concat(GlobalEncoder(X),LocalEncoder(X))

[0028] In the formula, Concat means concatenating the outputs of the global encoder and the local encoder, and GlobalEncoder(X) and LocalEncoder(X) represent the results of global and local encoding of the document, respectively.
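The concatenation of the two streams can be sketched as follows. The encoders here are deliberately simple stand-ins, since the patent does not define them: the hypothetical `global_encoder` averages over every token in the document, and the hypothetical `local_encoder` averages over a small window around each token. Only the final concatenation step corresponds directly to the formula.

```python
def global_encoder(x):
    """Stand-in global stream: every token sees the document-wide mean."""
    dim = len(x[0])
    mean = [sum(tok[d] for tok in x) / len(x) for d in range(dim)]
    return [list(mean) for _ in x]


def local_encoder(x, window=1):
    """Stand-in local stream: each token sees the mean of its neighbours."""
    dim = len(x[0])
    out = []
    for i in range(len(x)):
        ctx = x[max(0, i - window): min(len(x), i + window + 1)]
        out.append([sum(tok[d] for tok in ctx) / len(ctx) for d in range(dim)])
    return out


def fused_representation(x, window=1):
    """Concatenate the global and local vectors of each token."""
    g = global_encoder(x)
    loc = local_encoder(x, window)
    return [gv + lv for gv, lv in zip(g, loc)]


# Four tokens with two-dimensional embeddings (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [3.0, 0.0]]
fused = fused_representation(x)
```

Concatenation doubles the feature dimension per token, so any layer that consumes the fused representation has to be sized accordingly.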

[0029] Preferably, the formula for constructing the AMR graph is:

[0030] AMRGraph=ConstructAMR(X,Interactions)

[0031] In the formula, the ConstructAMR function is used to construct global and local AMR graphs based on the concepts in the document, and Interactions represents the interaction information between concepts.
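The patent does not describe the internals of ConstructAMR, so the following is only a data-structure sketch under stated assumptions: concepts are given together with the index of the sentence they come from, interactions are given as (source, relation, target) triples, the global graph keeps every interaction, and the local graph keeps only interactions between concepts of the same sentence. The concept names and relation labels are hypothetical, and a real system would obtain them from an AMR parser.

```python
from collections import defaultdict


def construct_amr(concepts, interactions):
    """Build adjacency lists for a global and a local graph.

    concepts:     {concept_id: sentence_index}
    interactions: list of (source, relation, target) triples
    """
    global_graph = defaultdict(list)
    local_graph = defaultdict(list)
    for src, rel, dst in interactions:
        global_graph[src].append((rel, dst))
        # Assumed definition of "local": both concepts share a sentence.
        if concepts[src] == concepts[dst]:
            local_graph[src].append((rel, dst))
    return dict(global_graph), dict(local_graph)


concepts = {"attack-01": 0, "fleet": 0, "port": 2}
interactions = [
    ("attack-01", ":ARG0", "fleet"),     # same sentence
    ("attack-01", ":location", "port"),  # two sentences apart
]
global_graph, local_graph = construct_amr(concepts, interactions)
```

Under this reading, the cross-sentence edge to `port` appears only in the global graph, which is how distant concepts would be brought into contact.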

[0032] The technical effects and advantages of this invention are as follows:

[0033] This method classifies entities in documents into multi-labeled entities and single-labeled entities to address the labeling confusion caused by inconsistent representations of the same military entity in different sentences. Secondly, it employs a multi-task learning model, using named entity boundary detection as an auxiliary task to integrate role information into the main task of event argument extraction. The auxiliary task learns entity boundary information to guide and optimize the main task. Accurate identification of entity boundaries directly promotes the correct assignment of argument roles, thereby improving the accuracy of event argument extraction. Finally, it utilizes an AMR-based and two-stream coding model to capture semantic features. The AMR graph reflects the long-range dependencies between arguments and triggers, and the combination of global and local encoders in two-stream coding fully leverages contextual information to enhance the extraction effect of event arguments. This method aims to improve the accuracy and robustness of military event argument extraction, especially in complex text environments.

[0034] By employing a document-level argument extraction method based on multi-labeled entities, a multi-labeled entity classifier can be easily introduced into the model to label entities in sentences as either single-labeled or multi-labeled. Next, sentences with multi-labeled entities are input into a document-level encoder, where a multi-labeled entity-guided attention mechanism is used to distinguish the importance of different entities in the document. Finally, a fusion mechanism is used to dynamically fuse sentence-level contextual information and document-level features, and the resulting fused features are input into a CRF layer for sequence labeling to extract event arguments.
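The single-label versus multi-label distinction can be illustrated with a small rule-based stand-in. The classifier in the patent is presumably a learned component; the sketch below instead derives the label from annotated data, assuming an entity counts as multi-label when it carries more than one distinct role across the sentences of a document. The entity names and roles are invented.

```python
from collections import defaultdict


def classify_entities(annotations):
    """Mark each entity as single-label or multi-label.

    annotations: iterable of (sentence_index, entity, role) triples.
    """
    roles = defaultdict(set)
    for _, entity, role in annotations:
        roles[entity].add(role)
    return {
        entity: "multi-label" if len(found) > 1 else "single-label"
        for entity, found in roles.items()
    }


annotations = [
    (0, "destroyer", "Attacker"),
    (3, "destroyer", "Target"),   # same entity, different role later on
    (1, "harbor", "Place"),
    (4, "harbor", "Place"),
]
entity_types = classify_entities(annotations)
```

Entities flagged as multi-label are the ones whose sentences would be routed to the document-level encoder in the description above.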

[0035] By using a document-level ship event argument extraction model based on a multi-label entity-guided attention mechanism, the event argument extraction task and the named entity boundary detection task can be trained jointly. By adapting to the different tasks, the model obtains shared representations, uses entity boundary information to enhance its understanding of event arguments, guides the generation of a shared semantic encoding, and improves the accuracy of event argument extraction.

[0036] By employing an event argument extraction process based on dual-stream coding and AMR, global and local encoders with different attention receptive fields can capture context of different ranges in the document to be extracted and encode the document. The AMR guidance module constructs the global AMR graph and the local AMR graph to stimulate the interaction between concepts in the document, especially concepts that are far apart. The two stream representations are fused using an information fusion module, and boundary information is enhanced through boundary loss. Finally, candidate spans are predicted using a classification module.

Attached Figure Description

[0037] Figure 1 is a flowchart of the event argument extraction method of the present invention.

[0038] Figure 2 is a diagram illustrating the architecture of the document-level ship event argument extraction model based on a multi-label entity-guided attention mechanism of the present invention.

[0039] Figure 3 is a diagram illustrating the architecture of the document-level ship event argument extraction model based on multi-task assistance of the present invention.

[0040] Figure 4 is a flowchart of the event argument extraction process based on dual-stream coding and AMR of the present invention.

Detailed Implementation

[0041] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0042] As shown in Figures 1-4, the present invention provides an event argument extraction method for the military domain, comprising the following specific steps:
Step 1: Data Acquisition: Acquire a certain amount of data and process it;
Step 2: Define Annotation Rules: Standardize the annotation of controversial corpora, ensure annotation quality and reduce annotation difficulty, and annotate events that conform to the information analysis theme and have clear semantics;
Step 3: Introduce a Multi-Label Entity Classifier: Label entities as single-label or multi-label types;
Step 4: Introduce a Multi-Label Entity-Guided Attention Mechanism: Input sentences with multi-label entities into a document-level encoder, use the multi-label entity-guided attention mechanism to distinguish entity importance, and combine sentence-level contextual information and document-level features;
Step 5: Sequence Labeling: Input the fused features obtained by the fusion mechanism into a Conditional Random Field (CRF) layer for sequence labeling to extract event arguments;
Step 6: Joint Training: Train jointly, with event argument extraction as the main task and named entity boundary detection as the auxiliary task;
Step 7: Encoding with the BERT Model: Improve the performance of event argument extraction by combining named entity boundary information with the parameter sharing mechanism of multi-task learning;
Step 8: Capturing Context and Encoding the Document: Use global and local encoders to capture context of different ranges and encode the document;
Step 9: Constructing the Global AMR Graph and Local AMR Graph: Stimulate the interaction between concepts in the document through the AMR guidance module, and fuse the dual-stream representations through the information fusion module.
The overall process of the method is therefore: Step 1, acquire data; Step 2, define annotation rules; Step 3, introduce the multi-label entity classifier; Step 4, introduce the multi-label entity-guided attention mechanism; Step 5, perform sequence labeling; Step 6, perform joint training; Step 7, encode with the BERT model; Step 8, capture context and encode the document; Step 9, construct the global AMR graph and local AMR graph.

[0043] Furthermore, in step two, when defining the annotation rules, the data is preprocessed using lemmatization and a reliable trigger word set. This lemmatization and trigger word set approach allows for a more accurate grasp of the data when defining the annotation rules, thereby ensuring annotation quality and reducing annotation difficulty.
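The preprocessing described here can be sketched as a lookup-based lemmatizer followed by a membership test against the trigger word set. The lemma table, the trigger set, and the English example sentence are all hypothetical; the patent does not list its trigger words, and a real pipeline would use a proper lemmatizer for the corpus language.

```python
# Hypothetical lemma table and trusted trigger set, for illustration only.
LEMMAS = {
    "attacked": "attack",
    "attacking": "attack",
    "deployed": "deploy",
    "ships": "ship",
}
TRUSTED_TRIGGERS = {"attack", "deploy"}


def lemmatize(token):
    """Lower-case the token and map it to its lemma when one is known."""
    lowered = token.lower()
    return LEMMAS.get(lowered, lowered)


def find_triggers(tokens):
    """Return (position, lemma) for every token whose lemma is trusted."""
    hits = []
    for i, tok in enumerate(tokens):
        lemma = lemmatize(tok)
        if lemma in TRUSTED_TRIGGERS:
            hits.append((i, lemma))
    return hits


tokens = "Two ships attacked the port".split()
hits = find_triggers(tokens)
```

Matching on lemmas rather than surface forms lets one entry in the trigger set cover every inflection of the word, which keeps the set small and easier to review by annotators.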

[0044] Furthermore, in step four, a multi-encoding mechanism is specifically used to dynamically fuse sentence-level and document-level features of the text. By utilizing a multi-encoding mechanism to dynamically fuse sentence-level and document-level features of the text, information extraction efficiency can be improved through multi-channel retrieval: First, multiple encoding methods provide multiple retrieval pathways. When information needs to be extracted, it can be retrieved through any one or more encoding clues. Second, it flexibly responds to interference. When one retrieval clue is interfered with or forgotten, other encoding clues may still help extract information.

[0045] Furthermore, in step four, the improved first-layer graph attention network model is used to process the instance embedding initialization matrix and the relation embedding initialization matrix to obtain the first-layer instance embedding optimization matrix.

[0046] Furthermore, in step nine, the two stream representations are fused through the information fusion module, and boundary information is enhanced through boundary loss. Enhancing boundary information through boundary loss has the following effects:

It can improve data quality. First, it accurately locates boundaries: boundary loss guides the model to more accurately determine the boundary position of the target object, reducing boundary ambiguity and errors, thereby improving the accuracy of data labeling and overall data quality. Second, it enhances data consistency: by emphasizing boundary information, the features of the data at the boundaries become clearer and more consistent, reducing internal contradictions and ambiguities, improving data reliability and stability, and facilitating subsequent analysis and processing.

It can optimize model performance. First, it improves feature learning ability: increased boundary information helps the model better learn local features of the data, especially subtle features at the boundaries, enabling the model to capture richer feature information, thereby improving the model's ability to represent data and the effect of feature learning. Second, it improves model convergence speed: clear boundary information can provide more effective guidance for model training, helping the model converge to the optimal solution faster, reducing training time and computational resource consumption, and improving the efficiency of model training.

It can improve classification and recognition performance. First, it enhances category discrimination: in classification tasks, boundary information can highlight the differences between different categories, especially in the boundary region, making it easier for the model to distinguish similar categories and improving classification accuracy and recall. Second, it enhances the ability to recognize small targets: for small targets or targets with unclear boundaries, adding boundary information can make the model pay more attention to the boundary features of the target, improving the detection and recognition ability of small targets and reducing missed detections and false detections.

It can enhance data security. First, it detects anomalies and attacks: in some security-sensitive data processing scenarios, clear boundary information helps to promptly detect anomalies and potential attacks in the data; by monitoring and analyzing the boundaries, it improves data security and risk prevention capabilities. Second, it protects privacy information: when processing data involving privacy, boundary loss can better define the scope and boundaries of privacy data, allowing for more effective encryption and protection measures to prevent the leakage of privacy information and enhance data privacy protection capabilities.

It can assist in data visualization and understanding. It clearly displays the data structure: adding boundary information can make the internal structure and boundary features of the data more clearly displayed during visualization, allowing for a more intuitive understanding of the data's distribution and characteristics, providing stronger support for data analysis and decision-making.

[0047] It should be further explained that the calculation formula for the multi-label entity-guided attention mechanism is as follows:

[0048] $\mathrm{Attention}(t,e)=\mathrm{Softmax}(\mathrm{LeakyReLU}(W_{att}\cdot[t;e]))$

[0049] Where $W_{att}$ is a linear transformation matrix, and $[t;e]$ represents the concatenation of the trigger word $t$ and the entity $e$. Introducing the multi-label entity-guided attention mechanism has the following effects:

It can improve the effectiveness of feature extraction. First, focusing on key information: it can guide the model to automatically focus on key regions or features in the data, ignoring irrelevant interference information, thereby extracting effective features relevant to the task more accurately. Second, enhancing feature representation: by highlighting key features, the model learns more discriminative feature representations, enriching the semantic information of features and improving the ability of features to characterize the essence of the data.

It can improve model performance and efficiency. First, improving model accuracy: effectively utilizing the attention mechanism can enable the model to better understand the data, thereby making more accurate predictions in tasks such as classification and regression, improving performance indicators such as model accuracy and recall. Second, accelerating model convergence: since the attention mechanism can help the model quickly locate key information, it reduces the search space and interference factors in the model during the learning process, which can usually accelerate the training convergence speed of the model and save training time and computing resources.

It can improve the model's adaptability and robustness. First, adapting to different data distributions: when faced with data of different distributions, the guided attention mechanism can dynamically adjust the allocation of attention according to the characteristics of the data, enabling the model to better adapt to changes in the data and improving the model's generalization ability. Second, enhancing anti-interference ability.

It can promote multimodal data fusion. First, aligning information from different modalities: when processing multimodal data, the attention mechanism can be used to align information between different modalities and find the correlations and correspondences between them, achieving more effective multimodal data fusion. Second, improving the fusion effect: by focusing on and integrating the key parts of different modal data, the advantages of each modality can be fully utilized, improving the performance and effect of the multimodal fusion model.

[0050] Furthermore, the loss function for multi-task learning is:

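[0051] $\mathrm{Loss} = \mathrm{Loss}_{main} + \lambda\cdot\mathrm{Loss}_{aux}$

[0052] Here, $\mathrm{Loss}_{main}$ is the main task loss for event argument extraction, $\mathrm{Loss}_{aux}$ is the auxiliary task loss for named entity boundary detection, and $\lambda$ is the coefficient that weights the auxiliary loss.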

[0053] Furthermore, the formula for calculating the fusion of global and local context information in dual-stream coding is as follows:

[0054] FusedRepresentation=Concat(GlobalEncoder(X),LocalEncoder(X))

[0055] In the formula, Concat means concatenating the outputs of the global encoder and the local encoder, and GlobalEncoder(X) and LocalEncoder(X) represent the results of global and local encoding of the document, respectively.

[0056] Furthermore, the formula for constructing the AMR graph is:

[0057] AMRGraph=ConstructAMR(X,Interactions)

[0058] In the formula, the ConstructAMR function is used to construct global and local AMR graphs based on the interaction of concepts in the document, and Interactions represents the interaction information between concepts. AMR graphs have the following benefits:

They facilitate knowledge extraction and integration. First, knowledge extraction is convenient: AMR graphs provide a standardized semantic representation. By constructing AMR graphs from a large amount of text, various knowledge, such as entities, events, and attributes, can be easily extracted. Second, knowledge integration is efficient: text data from different sources can be represented in a unified semantic way by constructing AMR graphs. These AMR graphs can then be integrated to achieve knowledge fusion and sharing, providing a rich knowledge foundation for subsequent knowledge reasoning and application.

They support various natural language processing tasks. First, machine translation assistance: in machine translation, AMR graphs can serve as an intermediate representation, converting source language text into AMR graphs and then generating target language text from the AMR graphs based on the characteristics of the target language, helping to improve the accuracy and fluency of translation. Second, information retrieval optimization: by converting text into AMR graphs, the user's query intent can be understood more accurately, and semantic matching and retrieval can be performed on the graph, improving the relevance and accuracy of retrieval results. Third, language variant adaptation: languages in different regions or fields may differ in vocabulary, sentence structure, and so on, but the semantics they express are often similar. AMR graphs focus on semantic-level representation, which can ignore these superficial language differences to a certain extent and adapts well to various language variants. Fourth, cross-language processing advantages: because AMR graphs have a relatively unified semantic representation, they have great advantages in cross-language processing. Texts in different languages can be converted into AMR graphs, and various processing and analysis can then be performed at the AMR graph level to achieve cross-language information exchange and knowledge sharing.

[0059] By employing a multi-label entity classifier and a multi-task learning mechanism, the model's ability to identify event arguments in the military domain was improved. In the multi-label entity classifier, the classification of multi-label entities and single-label entities can more accurately capture event arguments. In the multi-task learning, the auxiliary task of named entity boundary detection is combined to significantly improve the extraction effect of event arguments in the main task. The multi-task learning framework is used to fuse these two tasks. Then, by constructing a document-level abstract semantic graph, a comprehensive modeling of text semantics is achieved, which significantly reduces the complexity of model training.

[0060] Figure 2 shows a document-level argument extraction method based on multi-label entities. A multi-label entity classifier is introduced into the model to label entities in sentences as either single-label or multi-label. Next, sentences with multi-label entities are input into a document-level encoder, where a multi-label entity-guided attention mechanism is used to distinguish the importance of different entities in the document. Finally, a fusion mechanism is used to dynamically fuse sentence-level contextual information and document-level features. The resulting fused features are then input into a CRF layer for sequence labeling to extract event arguments.

[0061] Figure 3 shows a document-level ship event argument extraction model based on a multi-label entity-guided attention mechanism. The model is trained jointly on the event argument extraction task and the named entity boundary detection task. By adapting to the different tasks, it acquires shared representations, uses entity boundary information to enhance the understanding of event arguments, guides the generation of a shared semantic encoding, and improves the accuracy of event argument extraction.

[0062] Figure 4 shows an event argument extraction pipeline based on dual-stream encoding and AMR (Abstract Meaning Representation). It uses global and local encoders with different attention receptive fields to capture context across different ranges of the document to be extracted and encodes the document. An AMR guidance module constructs global and local AMR graphs to stimulate interactions between concepts in the document, especially those that are far apart. An information fusion module fuses the two stream representations, and boundary information is enhanced through boundary loss. Finally, a classification module predicts candidate spans.

[0063] Working principle of this invention:

[0064] Annotation rules are defined to improve annotation quality, and data is preprocessed through lemmatization and a trusted trigger word set. An improved first-layer graph attention network model is used to process the instance embedding initialization matrix and relation embedding initialization matrix to obtain the first-layer instance embedding optimization matrix. A multi-label entity classifier is used to classify entities into single-label and multi-label classes. Sentences with multi-label entities are input into a document-level encoder, and a multi-label entity-guided attention mechanism is used to distinguish the importance of different entities. A multi-encoding mechanism is used to dynamically fuse sentence-level features and document-level features of the text. Sequence labeling is performed through a CRF layer, and encoding is performed using a BERT model. Multi-task learning is used to utilize entity boundaries to improve the performance of event argument extraction. Global and local encoders are used to capture semantic information at different locations in the text. Global and local AMR graphs are constructed, and the two stream representations are fused through an information fusion module.

[0065] The event argument extraction method of the present invention will be described below with reference to specific embodiments:

[0066] The model in this paper was experimentally validated on the non-public Chinese military event dataset CEER. This dataset contains 13,000 data points, including 7,000 training samples, 1,500 validation samples, and 4,500 test samples. The dataset defines 28 event types, with each event type corresponding to 4-8 arguments.

[0067] In the experimental setup, the CIME model proposed in this paper uses a hidden size of 768 and a feed-forward dimension of 1024. The model uses a three-layer graph convolutional network with a dropout rate of 0.1, a batch size of 16, and a learning rate of 1e-4. It is trained for 100 epochs, and the best-performing model is saved.
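For reference, the reported settings can be collected in a small configuration object. This is only a convenience sketch: the class name `CIMEConfig` and all field names are hypothetical, and the values 768 and 1024 are read here as layer dimensions rather than layer counts, which is an assumption about the text.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CIMEConfig:
    """Hyperparameters as reported in the text; field names are invented."""
    hidden_size: int = 768        # read as a dimension (assumption)
    feedforward_size: int = 1024  # read as a dimension (assumption)
    gcn_layers: int = 3
    dropout: float = 0.1
    batch_size: int = 16
    learning_rate: float = 1e-4
    epochs: int = 100


config = CIMEConfig()
settings = asdict(config)  # plain dict, convenient for logging a run
```

Freezing the dataclass prevents accidental changes to the settings during a run, which helps when the best checkpoint out of 100 epochs has to be traced back to its configuration.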

[0068] In the field of document-level event extraction, the DCFEE model proposed by Yang et al. assumes that document-level event arguments are mainly concentrated in the central sentence of the article. This model first extracts event arguments from the central sentence and then examines the surrounding sentences to supplement further argument information. There are two versions of the DCFEE model, DCFEE-S and DCFEE-M. DCFEE-S extracts only a single event record at a time, while DCFEE-M can generate multiple potential event argument combinations at the same time. In addition, the Doc2EDAG model uses two Transformer encoders to obtain sentence-level and entity-level feature representations and fuse contextual information, thereby supporting the simultaneous extraction of multiple events. Greedy-Dec is a variant of Doc2EDAG that continues to extract other events after completing the extraction of one event. This paper uses the Micro F1 score to evaluate the performance of the models, and the results are shown in Table 1.
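The Micro F1 metric pools true positives, predictions, and gold items over all documents before computing precision and recall. The sketch below assumes that an extracted argument counts as correct only when its (event type, role, argument) triple matches a gold triple exactly; the patent does not state its matching criterion, and the example triples are invented.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over per-document sets of extracted triples.

    gold, pred: lists with one set of (event_type, role, argument)
    triples per document, aligned by position.
    """
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


gold = [
    {("Attack", "Attacker", "destroyer"), ("Attack", "Place", "harbor")},
    {("Deploy", "Agent", "fleet")},
]
pred = [
    {("Attack", "Attacker", "destroyer")},
    {("Deploy", "Agent", "fleet"), ("Deploy", "Place", "strait")},
]
score = micro_f1(gold, pred)
```

In this example two of three predictions are correct and two of three gold triples are found, so precision and recall are both 2/3 and the score is 2/3. Because the counts are pooled, frequent event types weigh more heavily than rare ones.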

[0069] Table 1 Comparison of Micro F1 values for the models

[0070]

[0071] As can be seen, the CIME model outperforms the baseline models. Its advantage is mainly attributed to the abstract processing of the relationships between arguments when constructing the document-level abstract semantic graph, which enables the model to more effectively capture and understand the complex interactions between event arguments within a document.

[0072] The results of the ablation experiments on the model are shown in Table 2. It can be seen that this abstraction process significantly enhances the model's ability to understand and analyze document-level events.

[0073] Table 2. Results of Model Ablation Experiment

[0074]

[0075] First, semantic boundary detection is removed from the CIME model, so that only an abstract semantic graph oriented towards event arguments is constructed. This weakens the model's ability to understand text semantics, and the model's Micro F1 score decreases by 1 point.

[0076] Next, this paper modifies the argument extraction method of the dataset and the CIME model. Event classifications in the dataset are standardized into a unified template, which is then used as the model's input. A tree structure based on this unified template is then employed for event argument extraction. This method frees the model from dependence on specific event types, simplifies the extraction process, and solves the problem of having to extract arguments separately according to event type in practical applications, thereby improving the model's Micro F1 score to 89.7.

[0077] Event extraction is a crucial step in transforming unstructured text into structured data, helping to filter out irrelevant information and extract valuable data. As a core part of this process, the extraction of event arguments directly impacts the accuracy of event information extraction. Therefore, this paper proposes a novel document-level event extraction method, CIME, which combines event argument extraction and entity detection tasks. First, it utilizes a multi-task learning framework to fuse these two tasks; second, it achieves comprehensive semantic modeling of the text by constructing a document-level abstract semantic graph; finally, it integrates text semantics and argument entity information to enhance the overall performance of the model. Experimental results show that this method achieves good results on the dataset and significantly reduces the complexity of model training. Future work will focus on further simplifying the application of abstract semantic graphs in CIME and exploring its application potential in resource-constrained environments.

[0078] Finally, it should be noted that the above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A method for extracting event arguments in the military field, characterized in that the specific steps are as follows:
Step 1, Data Acquisition: acquire a certain amount of data and process the acquired data;
Step 2, Define Annotation Rules: standardize the annotation of controversial corpora, ensure annotation quality and reduce annotation difficulty, and annotate events that are consistent with the information analysis topic and have clear semantics;
Step 3, Introduce a Multi-Label Entity Classifier: label entities as single-label or multi-label types;
Step 4, Introduce a Multi-Label Entity-Guided Attention Mechanism: input sentences with multi-label entities into the document-level encoder; use the multi-label entity-guided attention mechanism to distinguish entity importance; combine sentence-level contextual information and document-level features, and use an improved first-layer graph attention network model to process the instance embedding initialization matrix and the relation embedding initialization matrix to obtain the first-layer instance embedding optimization matrix; the calculation formula for the multi-label entity-guided attention mechanism is as follows:
Attention(t, e) = Softmax(LeakyReLU(W_att · [t; e]))
where W_att is a linear transformation matrix, and [t; e] represents the concatenation of the trigger word and the entity;
Step 5, Sequence Labeling: the obtained fusion features are input into the conditional random field layer using a fusion mechanism for sequence labeling to extract event arguments;
Step 6, Joint Training: joint training is conducted with event argument extraction as the main task and named entity boundary detection as the auxiliary task;
Step 7, Encode Using the BERT Model: improve the performance of event argument extraction by combining the parameter sharing mechanism of multi-task learning with named entity boundary information;
Step 8, Capture Context and Encode the Document: use global and local encoders to capture different ranges of context and encode the document;
Step 9, Construct Global and Local AMR Graphs: the AMR guidance module facilitates interactions between concepts in the document, and the information fusion module fuses the two-stream representations; the information fusion module further enhances boundary information through a boundary loss; the formula for fusing global and local contextual information in the two-stream representations is as follows:
FusedRepresentation = Concat(GlobalEncoder(X), LocalEncoder(X))
where Concat denotes concatenating the outputs of the global encoder and the local encoder, and GlobalEncoder(X) and LocalEncoder(X) represent the results of global and local encoding of the document, respectively.
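For illustration only, the two formulas recited in claim 1 can be sketched in plain Python. Several points are assumptions rather than part of the claim: the linear transformation is taken to be a single weight row so that each trigger-entity pair yields one scalar score, the softmax is taken over the candidate entities, the LeakyReLU slope is 0.01, and the fusion concatenates the global and local vectors token by token. All vectors and weights below are made up.

```python
import math


def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU; the slope value is an assumption."""
    return x if x > 0 else negative_slope * x


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [value / total for value in exps]


def entity_attention(trigger, entities, w_att):
    """Attention weights over entities for one trigger word.

    Each score is LeakyReLU(w_att . [t; e]); the softmax then normalizes
    the scores across the candidate entities (an assumed reading).
    """
    scores = []
    for entity in entities:
        concat = trigger + entity  # list concatenation stands for [t; e]
        if len(concat) != len(w_att):
            raise ValueError("w_att must match the length of [t; e]")
        scores.append(leaky_relu(sum(w * x for w, x in zip(w_att, concat))))
    return softmax(scores)


def fuse(global_repr, local_repr):
    """Concatenate global and local encodings for each token."""
    return [g + l for g, l in zip(global_repr, local_repr)]


# Toy two-dimensional embeddings, for illustration only.
trigger = [0.5, -0.2]
entities = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]
w_att = [0.3, 0.1, 0.8, -0.4]
weights = entity_attention(trigger, entities, w_att)
fused = fuse([[1.0, 2.0]], [[3.0]])
print(weights, fused)
```

With a full weight matrix instead of a single row, each pair would yield a vector, and a further projection or pooling step would be needed before the softmax.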

2. The event argument extraction method for the military field according to claim 1, characterized in that, in Step 2, when the annotation rules are defined, the data is also preprocessed through lemmatization and a trusted trigger-word set.

3. The event argument extraction method for the military field according to claim 1, characterized in that, in Step 4, a multi-encoding mechanism is used to dynamically fuse the sentence-level features and document-level features of the text.

4. The event argument extraction method for the military field according to claim 1, characterized in that the loss function for the multi-task learning is: Loss = Loss_main + λ·Loss_aux, where Loss_main is the main-task loss for event argument extraction, and Loss_aux is the auxiliary-task loss for named entity boundary detection.
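As a minimal sketch of the loss in claim 4, the function below combines the two task losses. The claim does not define λ; it is treated here as a non-negative weight on the auxiliary task, and the default of 0.5 and the loss values are illustrative assumptions.

```python
def multitask_loss(loss_main, loss_aux, lam=0.5):
    """Loss = Loss_main + lam * Loss_aux.

    loss_main: event argument extraction loss (main task).
    loss_aux:  named entity boundary detection loss (auxiliary task).
    lam:       assumed non-negative weight; 0.5 is an arbitrary default.
    """
    if lam < 0:
        raise ValueError("lam is assumed to be non-negative")
    return loss_main + lam * loss_aux


# Made-up loss values showing how lam shifts the balance between tasks.
for lam in (0.0, 0.5, 1.0):
    print(lam, multitask_loss(1.2, 0.4, lam=lam))
```

Setting the weight to 0 recovers training on the main task alone, which makes it easy to compare against a run without the boundary-detection signal.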

5. The event argument extraction method for the military field according to claim 4, characterized in that the formula for constructing the AMR graph is: AMRGraph = ConstructAMR(X, Interactions), where the ConstructAMR function constructs the global and local AMR graphs based on the concepts in the document, and Interactions represents the interaction information between concepts.