An explainable question and answer pair large-scale generation method and system for vertical fields

By constructing a cross-modal heterogeneous meteorological knowledge graph and using a heat diffusion equation to simulate semantic information propagation, the shortcomings of existing question-and-answer systems in multimodal information processing are addressed. Logically complete and interpretable meteorological question-and-answer pairs are generated, improving the coverage and reasoning ability of meteorological knowledge.

CN121980000B · Active · Publication Date: 2026-06-16 · XIAMEN SHIBAO NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
XIAMEN SHIBAO NETWORK TECH CO LTD
Filing Date
2026-04-07
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

When dealing with complex meteorological issues, existing question-answering systems struggle to perform deep semantic alignment and joint reasoning between multimodal information and text entities. This results in insufficient coherence and completeness of the reasoning process in the generated question-answer pairs, especially when citing specific areas or moments in images or videos as supporting evidence. Existing methods are inadequate in organizing and presenting structured evidence chains.

Method used

We construct a cross-modal heterogeneous meteorological knowledge graph, learn node embedding vectors through a heterogeneous graph neural network and perform semantic community partitioning, simulate semantic information propagation using the heat diffusion equation, generate semantic diffusion equipotential surfaces, construct a hierarchical semantic space domain, and combine a pre-trained large language model to generate interpretable question-answer pairs.

Benefits of technology

It has achieved complete preservation of the diverse information features of meteorological knowledge, improved the completeness and accuracy of knowledge coverage, enhanced the interpretability and credibility of question answering, optimized the structured organization and retrieval efficiency of knowledge graphs, and improved the reasoning coherence and logic of complex meteorological problems.

✦ Generated by Eureka AI based on patent content.

Figure CN121980000B_ABST

Abstract

The application provides a method and system for large-scale generation of explainable question-answer pairs for vertical fields, and relates to the technical field of data processing. The method comprises the following steps: step 1, constructing a cross-modal heterogeneous meteorological knowledge graph; modeling the cross-modal heterogeneous meteorological knowledge graph with a heterogeneous graph neural network, learning the embedding vector of each node to obtain a knowledge graph with node embedding vectors, and performing semantic community division on the entities based on the embedding vectors to generate an optimized knowledge graph. The application realizes large-scale generation of explainable meteorological question-answer pairs in vertical fields by means of knowledge graph optimization, cone-shaped semantic space construction, structured evidence chain retrieval, and a pre-trained large language model, and improves the explainability and reliability of the question-answer results.

Description

Technical Field

[0001] This invention relates to the field of data processing technology, and in particular to a method and system for large-scale generation of interpretable question-answer pairs for vertical domains.

Background Technology

[0002] In the field of meteorological services, how to generate question-answer pairs with explanatory paths at scale based on a professional knowledge base is a research direction for improving the interpretability of intelligent question-answering systems. Existing methods can be further optimized in their approach to organizing fragmented entity information into a coherent explanatory chain when dealing with complex problems involving multi-hop reasoning.

[0003] Taking typhoon forecast consultation as an example, when a fisherman in a coastal area asks via mobile phone, "The typhoon suddenly changed direction before making landfall; is this related to changes in the subtropical high?", existing question-answering systems can usually retrieve the typhoon entity, the subtropical high entity, the time point of the change, and their respective attribute descriptions from a meteorological knowledge graph. In practice, however, existing methods leave room for improvement when organizing these retrieved entity and relationship fragments into an explanatory question-answer pair that reflects a complete causal chain. Specifically, answering the above question ideally requires presenting a complete reasoning chain from "weakening of the subtropical high" to "changes in guiding airflow" and then to "shift in typhoon path." Existing methods, however, focus on matching and extracting textual entities and mostly treat the multimodal information that is widespread in meteorological documents, such as keyframes of satellite cloud images showing the evolution of high-pressure morphology and weather-map image regions describing changes in flow field distribution, as supplementary information. They fail to perform deep semantic alignment and joint reasoning between visual entities and spatial relationships on the one hand and textual entities on the other. Because limited attention is paid to the semantic distribution patterns and cross-modal radiation propagation paths between core concepts and peripheral phenomena in meteorological knowledge, the key links in the generated answers, namely how changes in high-pressure morphology affect the distribution of the surrounding flow field and how flow field adjustments affect the typhoon's movement trend, are sometimes difficult to integrate naturally into the answer content. As a result, the generated question-answer pairs still have room for improvement in demonstrating the coherence and completeness of the reasoning process, especially when specific areas in images or specific moments in videos are needed as supporting evidence. Existing methods are insufficient in organizing and presenting structured evidence chains.

Summary of the Invention

[0004] This invention provides a method and system for large-scale generation of interpretable question-answer pairs for vertical domains, which can locate knowledge content that is highly relevant to queries and quickly build a logically complete structured chain of evidence.

[0005] To solve the above-mentioned technical problems, the technical solution of the present invention is as follows:

[0006] In a first aspect, a method for large-scale generation of interpretable question-answer pairs for vertical domains is provided, the method comprising:

[0007] Step 1: Construct a cross-modal heterogeneous meteorological knowledge graph; perform heterogeneous graph neural network modeling on the cross-modal heterogeneous meteorological knowledge graph, learn the embedding vector of each node, obtain a knowledge graph with node embedding vectors, and perform semantic community partitioning of entities based on the embedding vectors to generate an optimized knowledge graph;

[0008] Step 2: Perform gradient field analysis on the embedding vectors of each node in the optimized knowledge graph to determine the semantic gradient of each node in different semantic community directions, and identify mutation nodes as semantic boundary anchors; with the semantic boundary anchors as vertices and the distribution centroid of the core entity group with the highest semantic relevance in the optimized knowledge graph as the base point, construct a hierarchical cone-shaped semantic space domain.

[0009] Step 3: Simulate the propagation process of semantic information in the cone-shaped semantic space domain according to the thermal diffusion equation, generate meteorological semantic diffusion equipotential surfaces radiating from the core area to the periphery, and establish a semantic diffusion path spectrum based on the hierarchical division of the meteorological semantic diffusion equipotential surfaces.

[0010] Step 4: Map the user query request to the cone-shaped semantic space domain, calculate the spatial distance between the query request vector and each meteorological semantic diffusion equipotential surface, determine the diffusion path level to which the user query request belongs, retrieve entities and inter-entity association information that match the diffusion path level from the optimized knowledge graph, and construct a structured evidence chain representing the reasoning path.

[0011] Step 5: After fusing the structured evidence chain with the user query request, input it into the pre-trained large language model to generate interpretable meteorological question-answer pairs.

[0012] In a second aspect, a system for large-scale generation of interpretable question-answer pairs for vertical domains is provided, the system comprising:

[0013] The knowledge graph generation module is used to construct a cross-modal heterogeneous meteorological knowledge graph. It performs heterogeneous graph neural network modeling on the cross-modal heterogeneous meteorological knowledge graph, learns the embedding vector of each node, obtains a knowledge graph with node embedding vectors, and performs semantic community partitioning of entities based on the embedding vectors to generate an optimized knowledge graph.

[0014] The spatial domain construction module is used to perform gradient field analysis on the embedding vectors of each node in the optimized knowledge graph, determine the semantic gradient of each node in different semantic community directions, and identify mutation nodes as semantic boundary anchors. With the semantic boundary anchors as vertices and the distribution centroid of the core entity group with the highest semantic relevance in the optimized knowledge graph as the base point, a hierarchical cone-shaped semantic spatial domain is constructed.

[0015] The spectrum generation module is used to simulate the propagation process of semantic information in the cone-shaped semantic space domain according to the thermal diffusion equation, generate meteorological semantic diffusion equipotential surfaces radiating from the core area to the periphery, and establish a semantic diffusion path spectrum based on the hierarchical division of the meteorological semantic diffusion equipotential surfaces.

[0016] The evidence chain acquisition module is used to map user query requests to a cone-shaped semantic space domain, calculate the spatial distance between the query request vector and each meteorological semantic diffusion equipotential surface, determine the diffusion path level to which the user query request belongs, retrieve entities and inter-entity association information that match the diffusion path level from the optimized knowledge graph, and construct a structured evidence chain representing the reasoning path.

[0017] The question-answer pair generation module is used to integrate the structured evidence chain with the user query request and input it into the pre-trained large language model to generate interpretable meteorological question-answer pairs.

[0018] The above-described solution of the present invention has at least the following beneficial effects:

[0019] This invention constructs a cross-modal heterogeneous meteorological knowledge graph, unifying and integrating multi-source and multi-format data from the meteorological field, including text, images, tables, and videos. This solves the problem of insufficient utilization of textual information and multimodal data, fully preserving the diverse information characteristics of meteorological expertise and enhancing the completeness and accuracy of knowledge coverage. Employing a heterogeneous graph neural network for feature learning and semantic classification of the knowledge graph fully explores the complex relationships between meteorological entities, improving the accuracy of entity semantic expression. This enables the orderly classification and structured organization of a large number of meteorological entities, optimizing the overall structure of the knowledge graph and improving retrieval efficiency. By performing semantic gradient analysis on node feature vectors, semantic boundary nodes are identified, and a hierarchical cone-shaped semantic space is constructed. This reflects the hierarchical distribution characteristics of meteorological knowledge from the core to the periphery and from the high to the low levels, clarifying the boundaries and transitional relationships between different semantic units and improving the problems of ambiguous semantic structure expression and unclear reasoning levels.

[0020] By simulating the transmission process of semantic information using the thermal diffusion equation, a semantic diffusion equipotential surface and diffusion path system are formed. This system can automatically extract and standardize the representation of the inherent connections and transmission logic between meteorological knowledge, making question-and-answer reasoning no longer limited to simple entity matching and improving the coherence and logic of reasoning for complex meteorological problems. By mapping user queries to a cone-shaped semantic space and matching the corresponding diffusion path level, highly relevant knowledge content can be located, and a logically complete structured evidence chain can be quickly constructed. By combining structured evidence chains with a large language model to generate question-and-answer content, the generated questions and answers have strong interpretability and effectively improve the credibility of professional questions and answers in vertical fields.

Attached Figure Description

[0021] Figure 1 is a flowchart of a method for large-scale generation of interpretable question-answer pairs for vertical domains, as provided in an embodiment of the present invention.

[0022] Figure 2 is a schematic diagram of a system for large-scale generation of interpretable question-answer pairs for vertical domains, as provided in an embodiment of the present invention.

Detailed Implementation

[0023] Exemplary embodiments of the present disclosure will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

[0024] As shown in Figure 1, an embodiment of the present invention proposes a method for large-scale generation of interpretable question-answer pairs for vertical domains, the method comprising the following steps:

[0025] Step 1: Construct a cross-modal heterogeneous meteorological knowledge graph; perform heterogeneous graph neural network modeling on the cross-modal heterogeneous meteorological knowledge graph, learn the embedding vector of each node, obtain a knowledge graph with node embedding vectors, and perform semantic community partitioning of entities based on the embedding vectors to generate an optimized knowledge graph;

[0026] Step 2: Perform gradient field analysis on the embedding vectors of each node in the optimized knowledge graph to determine the semantic gradient of each node in different semantic community directions, and identify mutation nodes as semantic boundary anchors; with the semantic boundary anchors as vertices and the distribution centroid of the core entity group with the highest semantic relevance in the optimized knowledge graph as the base point, construct a hierarchical cone-shaped semantic space domain.

[0027] Step 3: Simulate the propagation process of semantic information in the cone-shaped semantic space domain according to the thermal diffusion equation, generate meteorological semantic diffusion equipotential surfaces radiating from the core area to the periphery, and establish a semantic diffusion path spectrum based on the hierarchical division of the meteorological semantic diffusion equipotential surfaces.

[0028] Step 4: Map the user query request to the cone-shaped semantic space domain, calculate the spatial distance between the query request vector and each meteorological semantic diffusion equipotential surface, determine the diffusion path level to which the user query request belongs, retrieve entities and inter-entity association information that match the diffusion path level from the optimized knowledge graph, and construct a structured evidence chain representing the reasoning path.

[0029] Step 5: After fusing the structured evidence chain with the user query request, input it into the pre-trained large language model to generate interpretable meteorological question-answer pairs.
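
The text above does not fix how the thermal diffusion equation of Step 3 is discretized. As a minimal sketch, assuming the semantic space is approximated by a small undirected graph, that diffusion is computed with the graph Laplacian and explicit Euler steps, and that equipotential levels are obtained by binning the resulting potential with fixed thresholds, the idea could look as follows. The function names, step size `alpha`, step count, and thresholds are illustrative assumptions, not parameters taken from the patent.

```python
import numpy as np

def diffuse_heat(adjacency, sources, alpha=0.1, steps=50):
    """Explicit-Euler heat diffusion u <- u - alpha * L u on a graph.

    alpha is an assumed step size; it must stay small relative to the
    largest node degree for the update to remain stable.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    sources = list(sources)
    u = np.zeros(adjacency.shape[0])
    u[sources] = 1.0  # core entities start with unit "semantic heat"
    for _ in range(steps):
        u = u - alpha * (laplacian @ u)
        u[sources] = 1.0  # hold the core at a fixed potential
    return u

def equipotential_levels(potential, thresholds=(0.75, 0.5, 0.25)):
    """Bin potentials into levels: 0 is the hottest band, larger is further out."""
    thresholds = np.asarray(thresholds)
    return np.array([int(np.sum(p < thresholds)) for p in potential])

# Toy path graph 0-1-2-3-4 with node 0 as the core entity.
adjacency = np.zeros((5, 5))
for i in range(4):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0

potential = diffuse_heat(adjacency, sources=[0])
levels = equipotential_levels(potential)
print(potential.round(3), levels)
```

In this toy setting the potential falls off with graph distance from the core node, so the level index grows from the core toward the periphery, which is the layering that the semantic diffusion path spectrum relies on.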

[0030] In this embodiment of the invention, a cross-modal heterogeneous meteorological knowledge graph is constructed to unify and integrate multi-source and multi-form data in the meteorological field, including text, images, tables, and videos. This solves the problem of relying solely on textual information and insufficient utilization of multimodal data, fully preserving the diverse information features of meteorological professional knowledge and enhancing the completeness and accuracy of knowledge coverage. The use of heterogeneous graph neural networks for feature learning and semantic classification of the knowledge graph fully explores the complex relationships between meteorological entities, improves the accuracy of entity semantic expression, achieves orderly classification and structured organization of a large number of meteorological entities, and optimizes the overall structure and retrieval efficiency of the knowledge graph. By performing semantic gradient analysis on node feature vectors, semantic boundary nodes are identified and a hierarchical cone-shaped semantic space is constructed. This reflects the hierarchical distribution characteristics of meteorological knowledge from the core to the periphery and from the high level to the low level, clarifies the boundaries and transitional relationships between different semantic units, and improves the problems of vague expression of knowledge semantic structure and unclear reasoning levels. By simulating the transmission process of semantic information using the thermal diffusion equation, a semantic diffusion equipotential surface and diffusion path system are formed. This system can automatically extract and standardize the representation of the inherent connections and transmission logic between meteorological knowledge, making question-and-answer reasoning no longer limited to simple entity matching and improving the coherence and logic of reasoning for complex meteorological problems. 
By mapping user queries to a cone-shaped semantic space and matching the corresponding diffusion path level, highly relevant knowledge content can be located, and a logically complete structured evidence chain can be quickly constructed. By combining structured evidence chains with a large language model to generate question-and-answer content, the generated questions and answers have strong interpretability and effectively improve the credibility of professional questions and answers in vertical fields.
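
The query-matching step can be illustrated with a small sketch. It assumes that each equipotential level is summarized by the centroid of its member node embeddings, that "spatial distance" is Euclidean distance to that centroid, and that the evidence chain keeps edges whose endpoints both lie at or inside the matched level. None of these choices is stated in the text above; the entity names reuse the typhoon example from the background section, and the 2-dimensional vectors are toy values.

```python
import numpy as np

def assign_level(query_vec, node_vecs, node_levels):
    """Return the level whose member centroid is nearest to the query."""
    node_levels = np.asarray(node_levels)
    distances = {}
    for level in sorted(set(node_levels.tolist())):
        centroid = node_vecs[node_levels == level].mean(axis=0)
        distances[level] = float(np.linalg.norm(query_vec - centroid))
    return min(distances, key=distances.get), distances

def evidence_chain(level, node_names, node_levels, edges):
    """Keep (head, relation, tail) edges whose endpoints are at or inside the level."""
    allowed = {n for n, l in zip(node_names, node_levels) if l <= level}
    return [(h, r, t) for h, r, t in edges if h in allowed and t in allowed]

node_names = ["subtropical high", "guiding airflow", "typhoon path"]
node_levels = [0, 1, 2]
node_vecs = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])  # toy embeddings
edges = [
    ("subtropical high", "weakening changes", "guiding airflow"),
    ("guiding airflow", "shifts", "typhoon path"),
]

query = np.array([1.9, 0.1])  # hypothetical embedding of the user query
level, _ = assign_level(query, node_vecs, node_levels)
chain = evidence_chain(level, node_names, node_levels, edges)
print(level, chain)
```

A query that lands near the outermost level pulls in the whole chain from the core concept outward, while a query near the core retrieves only core-level facts.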

[0031] In a preferred embodiment of the present invention, the process of constructing a cross-modal heterogeneous meteorological knowledge graph is as follows:

[0032] Step 100a involves multimodal segmentation of the original meteorological technical document to obtain meteorological text blocks, meteorological image blocks, meteorological table blocks, and meteorological video keyframe blocks. Intramodal information extraction is performed on each of these blocks: meteorological entities and relationships between them are extracted from the text blocks; image meteorological entities and visual relationships between them are extracted from the image blocks; table meteorological entities and row / column relationships within the table are extracted from the table blocks; and video meteorological entities, spatiotemporal relationships, and dynamic evolution relationships between them are extracted from the video keyframe blocks. Semantic alignment and cross-modal association are then performed on the extracted meteorological entities across different modalities to construct a cross-modal heterogeneous meteorological knowledge graph, specifically including:

[0033] The system reads raw meteorological technical documents stored locally or in the cloud, traverses all pages and attachments of each document page by page, and physically segments the document content according to the storage format and structural characteristics of the data in the physical file. When a data block is identified as plain text or as a run of continuous text paragraphs, it is classified as a meteorological text block; when a data block contains pixel matrices, image file identifiers, or visualized meteorological charts, it is classified as a meteorological image block; when a data block is organized in a row-column grid structure and contains meteorological observation values or classification identifiers, it is classified as a meteorological table block; when a data block is identified as a continuous frame sequence or a video file with a timeline, key frames are extracted either at fixed time intervals of 15 seconds or at points where the frame content changes significantly, and the extracted key frames are classified as meteorological video keyframe blocks.
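
The segmentation rules in [0033] can be sketched as a simple dispatch plus two keyframe selectors. This is only an illustration: the parser tags (`paragraph`, `bitmap`, `grid`, `video`), the `RawBlock` type, and the change threshold are hypothetical, and "changes significantly" is approximated here by the mean absolute pixel difference against the last kept frame. Only the 15-second interval comes from the text.

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical parser tags mapped to the four block types named in the text.
BLOCK_TYPES = {
    "paragraph": "meteorological_text_block",
    "bitmap": "meteorological_image_block",
    "grid": "meteorological_table_block",
    "video": "meteorological_video_keyframe_block",
}

@dataclass
class RawBlock:
    kind: str              # tag assigned by an upstream document parser (assumed)
    payload: object = None

def classify(block):
    """Map a raw data block to its modality-specific block type."""
    return BLOCK_TYPES[block.kind]

def fixed_interval_times(duration_s, interval_s=15.0):
    """Timestamps in seconds sampled every interval_s, starting at 0."""
    count = int(duration_s // interval_s) + 1
    return [i * interval_s for i in range(count)]

def change_keyframes(frames, threshold=0.5):
    """Indices of frames that differ strongly from the last kept frame."""
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        diff = np.mean(np.abs(frames[i] - frames[kept[-1]]))
        if diff > threshold:  # threshold is an assumed tuning value
            kept.append(i)
    return kept

print(classify(RawBlock("grid")))
print(fixed_interval_times(50))
```

Comparing against the last kept frame rather than the immediately preceding one avoids missing slow drifts that never exceed the threshold between adjacent frames.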

[0034] Intramodal information extraction is performed separately on the segmented meteorological text blocks, meteorological image blocks, meteorological table blocks, and meteorological video keyframe blocks.

For the meteorological text blocks, syntactic analysis and entity recognition are performed to extract the specific meteorological elements, weather phenomena, and geographical regions named in the text as meteorological entities. After entity extraction, three types of relationships are extracted based on the subject-verb-object structure and logical connectors of the sentences: subordinate relationships, in which a specific meteorological element belongs to a specific weather system, such as a meteorological element being a component of a high-pressure system; causal relationships, which capture the temporal and logical association of weather phenomenon A leading to weather change B, such as cold air moving southward leading to the formation of a front; and spatiotemporal relationships, which capture the specific time and geographical location of an entity, such as a heavy rainstorm occurring in a certain area at a certain time on a certain day.

For the meteorological image blocks, object detection and semantic segmentation are performed to identify meteorological systems in the image that exhibit specific shapes, colors, or textures as meteorological entities. After the entities are identified, three types of relationships are extracted by calculating the pixel coordinates, contour area, and relative position distance of the entities in the image coordinate system: containment relationships, in which one entity lies completely within the boundary of another (e.g., a cloud cluster containing multiple smaller cloud clusters); adjacency relationships, which describe the spatial proximity between two entities and are determined by calculating the distance between their center points (e.g., a high-pressure area adjacent to a low-pressure area); and spatial layout relationships, which describe the distribution pattern of multiple entities within the overall image area (e.g., a weather system distributed in a banded pattern).

For the meteorological table blocks, the header structure and cell contents are analyzed, and the field items corresponding to the header and the specific cell values are defined as table meteorological entities. The row and column correspondence rules inside the table are then analyzed, and two types of relationships are extracted: row-column correspondence, which captures the one-to-one correspondence between entities in the columns of the same row, such as the value in a certain column of a certain row corresponding to the temperature at a certain time; and time-series and numerical relationships, which capture how an entity in the same column changes over time or how different values compare in size, such as the value in a certain column gradually increasing over time.

For the meteorological video keyframe blocks, object tracking across consecutive frames and motion analysis are performed to extract dynamic weather systems that change position or shape over time as video meteorological entities. Two types of relationships are then extracted: spatiotemporal correspondence, which establishes the correspondence between entities on a continuous time axis by analyzing the changes in their coordinate positions across keyframes; and dynamic evolution, which captures the generation, development, dissipation, or morphological transformation of an entity by identifying changes in its outline, color, or structure across consecutive keyframes.
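
For the image modality, the containment and adjacency relationships described in [0034] can be sketched from bounding boxes alone. This is a simplified illustration: it assumes each detected entity is reduced to an axis-aligned box in pixel coordinates, and the 80-pixel adjacency cutoff is an arbitrary assumption, since the text only says that adjacency is determined from the distance between center points. The `Region` type and relation labels are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
import math

@dataclass(frozen=True)
class Region:
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)

def contains(outer, inner):
    """True when inner lies completely within the boundary of outer."""
    return (outer.x_min <= inner.x_min and outer.y_min <= inner.y_min
            and outer.x_max >= inner.x_max and outer.y_max >= inner.y_max)

def image_relations(regions, adjacency_px=80.0):
    """Emit (head, relation, tail) triples for each pair of regions."""
    triples = []
    for a, b in combinations(regions, 2):
        if contains(a, b):
            triples.append((a.name, "contains", b.name))
        elif contains(b, a):
            triples.append((b.name, "contains", a.name))
        elif math.dist(a.center, b.center) <= adjacency_px:
            triples.append((a.name, "adjacent_to", b.name))
    return triples

regions = [
    Region("cloud cluster", 0, 0, 100, 100),
    Region("small cloud cluster", 10, 10, 30, 30),
    Region("high-pressure area", 200, 0, 260, 60),
    Region("low-pressure area", 270, 0, 330, 60),
]
relations = image_relations(regions)
print(relations)
```

Containment is checked before adjacency so that a nested entity is not also reported as merely adjacent to its container.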

[0035] After entities and relations have been extracted within each modality, the extracted meteorological entities are transformed into a common feature space so that entities from different modalities can be semantically aligned. The process is as follows.

First, a pre-trained text embedding model is constructed to generate feature vectors for text entities. A meteorological domain-specific text corpus is selected as the training dataset; after word segmentation and stop word removal, a meteorological domain-specific vocabulary is built, and the model's embedding layer is initialized from this vocabulary with an embedding dimension of 1024. The model is trained with stochastic gradient descent, using the cross-entropy loss function as the optimization objective. Training is iterated for 50 rounds, and after each round the embedding accuracy of the model is calculated on a validation set. Training is stopped when the validation accuracy stabilizes above 92%, yielding the pre-trained text embedding model. Inputting text entities into this model generates the corresponding feature vectors.

Next, a convolutional neural network model is constructed and trained to extract feature vectors from image entities. Various meteorological images, including satellite cloud images and weather maps, are selected as the training dataset. After image size normalization (uniformly adjusted to 224×224 pixels) and grayscale preprocessing, the dataset is divided into training and validation sets at a ratio of 8:2. The network contains an input layer, three convolutional layers, two pooling layers, and an output layer with a dimension of 1024. It is trained with the Adam optimizer at a learning rate of 0.001, using the classification loss function as the optimization objective, and its performance is validated every 10 epochs. Training stops when the model achieves a feature extraction accuracy of over 90% on the validation set, yielding the trained image feature extraction model. Inputting image entities into this model generates the corresponding feature vectors.

Then, a visual feature extraction model is constructed and trained to generate feature vectors for table entities. Various types of table data from the meteorological field are collected as the training dataset. The tables are structured, visual features such as the row and column layout and the distribution of cell values are extracted, and each table is converted to a grayscale image and normalized in size. The model is a convolutional neural network with the same structure as the image entity feature extraction model; only the input layer is adjusted to fit the grayscale table images. It is trained with the same parameters as the image model (learning rate 0.001, Adam optimizer, classification loss function) for 45 rounds, until the feature extraction accuracy on the validation set stabilizes above 88%, yielding the trained table entity visual feature extraction model. Inputting table entities into this model generates the corresponding feature vectors.

Finally, a visual feature extraction model is constructed and trained to generate feature vectors for video entities. Meteorological video data is selected, and keyframes are extracted as training samples and given the same preprocessing as image entities. The model adopts the basic structure of the image entity convolutional neural network and adds a temporal feature extraction layer to capture the temporal correlation between video keyframes. The dataset is divided into training and validation sets at a ratio of 7:3. The model is trained with the Adam optimizer at a learning rate of 0.0008, using the temporal loss function as the optimization objective, for 55 epochs. Training is stopped when the feature extraction accuracy on the validation set exceeds 89%, yielding the trained video entity visual feature extraction model. Inputting video entities into this model generates the corresponding feature vectors.

Using the trained feature extraction models for each modality, feature vectors are generated for text entities, image entities, table entities, and video entities, ensuring that the entity vectors of all modalities lie in a feature space of the same dimension.

[0036] Based on the generated entity feature vectors, for any two meteorological entities from different modalities, the semantic similarity between their feature vectors is calculated using cosine similarity. The calculated result ranges from -1 to 1, with a higher value indicating a higher degree of semantic similarity between the two entities. To achieve semantic alignment between entities from different modalities, a cosine similarity threshold of 0.85 is preset. The calculated similarity value is compared with the preset threshold of 0.85. If the calculated similarity value is greater than or equal to 0.85, the two entities from different modalities are determined to refer to the same semantic entity, thus completing cross-modal semantic alignment. If the calculated similarity value is less than 0.85, the two entities are determined to be different semantic entities, and no correspondence is established. After completing cross-modal semantic alignment of all meteorological entities, a cross-modal heterogeneous meteorological knowledge graph is constructed. All meteorological entities, after semantic alignment and deduplication, are used as nodes in the knowledge graph. Node types include text entity nodes, image entity nodes, table entity nodes, and video entity nodes. Logical relationships, visual relationships, row-column relationships, spatiotemporal relationships, and dynamic evolution relationships extracted from each modality are used as edges in the knowledge graph. Simultaneously, cross-modal association edges are established between semantically aligned entities from different modalities. Through the combination of these nodes and edges, a cross-modal heterogeneous meteorological knowledge graph containing multiple node types and edge types is formed, achieving structured integration and unified representation of multimodal meteorological data.
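
The alignment rule in [0036] reduces to a pairwise cosine similarity check against the 0.85 threshold. The sketch below assumes entities arrive as `(name, modality, vector)` tuples and uses 3-dimensional toy vectors in place of the 1024-dimensional embeddings; the function names and the `cross_modal` edge label are illustrative.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity in [-1, 1]; higher means more similar."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def cross_modal_edges(entities, threshold=0.85):
    """Link entities from different modalities whose similarity >= threshold."""
    edges = []
    for i in range(len(entities)):
        for j in range(i + 1, len(entities)):
            name_i, mod_i, vec_i = entities[i]
            name_j, mod_j, vec_j = entities[j]
            if mod_i != mod_j and cosine(vec_i, vec_j) >= threshold:
                edges.append((name_i, "cross_modal", name_j))
    return edges

entities = [
    ("subtropical high", "text", np.array([1.0, 0.0, 0.0])),
    ("high-pressure region", "image", np.array([0.9, 0.1, 0.0])),
    ("typhoon eye", "video", np.array([0.0, 1.0, 0.0])),
]
aligned = cross_modal_edges(entities)
print(aligned)
```

The all-pairs loop is quadratic in the number of entities; for a large graph an approximate nearest-neighbor index would be the usual substitute, though the text does not prescribe one.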

[0037] This embodiment, by performing multimodal block processing on the original meteorological technical documents, can independently parse data in different forms such as text, images, tables, and video keyframes, ensuring the information integrity of various types of data; by extracting information from each modality of data, it can fully extract the entities and relationships contained in each type of data, improving the comprehensiveness of information extraction; by calculating vector cosine similarity to achieve semantic alignment of meteorological entities in different modalities, it can objectively determine the correspondence between entities and improve the accuracy of entity matching; by establishing cross-modal association relationships to form a cross-modal heterogeneous meteorological knowledge graph, it can integrate multi-source heterogeneous meteorological data into a unified structured knowledge system, solving the problem of scattered multimodal data and the inability to use them jointly.

[0038] In another preferred embodiment of the present invention, a heterogeneous graph neural network is used to model the cross-modal heterogeneous meteorological knowledge graph, the embedding vector of each node is learned, a knowledge graph with node embedding vectors is obtained, and semantic community partitioning of entities is performed based on the embedding vectors to generate an optimized knowledge graph, including:

[0039] Step 100b involves formalizing the cross-modal heterogeneous meteorological knowledge graph into a heterogeneous graph containing multiple node types and edge types, initializing feature vectors for each node, and inputting the heterogeneous graph with initialized feature vectors into a heterogeneous graph attention network to aggregate the neighbor node information of each node. Specifically, this includes converting the cross-modal heterogeneous meteorological knowledge graph constructed in step 100a into structured graph data, organizing the node types according to the text entity nodes, image entity nodes, table entity nodes, and video entity nodes obtained in step 100a, and organizing the edge types according to the logical relationship edges, visual relationship edges, spatiotemporal relationship edges, and cross-modal association edges extracted and established in step 100a, thereby constructing a heterogeneous graph containing multiple node types and edge types. Based on the entity feature annotation standard in the meteorological field, a heterogeneous graph attention network adapted to multiple types of nodes and edges is designed. The network structure consists of four core layers: an input layer, an attention weighting layer, a feature fusion layer, and an output layer. The input layer receives the heterogeneous graph node feature vectors initialized in step 100b and performs a unified dimensional mapping on the input vectors of different types of nodes to ensure that all node feature vectors are converted into a unified 1024-dimensional feature space vector. 
The attention weighting layer sets up multiple heterogeneous attention calculation units, each corresponding to a relationship type (logical relationship, visual relationship, spatiotemporal relationship, cross-modal association relationship), which can independently calculate the attention weights of neighbor nodes and the center node under different relationship types to achieve differentiated information aggregation. The feature fusion layer adopts a concatenation and normalization fusion method to fuse the attention-weighted neighbor aggregation features with the initial feature vector of the center node, preserving the complete information of the node's own features and the neighbor aggregation features. The output layer performs dimensionality reduction mapping on the fused features through a single-layer fully connected neural network and outputs the node embedding vector of the current network layer. The output dimension is consistent with the input dimension of the input layer and supports multi-layer iterative propagation.

[0040] The network training process collects a multimodal entity association sample set in the meteorological domain. The sample set includes all entity nodes, neighbor nodes, and corresponding relationship types within the cross-modal heterogeneous meteorological knowledge graph constructed in step 100a. It is divided into training, validation, and test sets in a 7:2:1 ratio. The model parameters of the heterogeneous graph attention network are initialized, with input layer mapping weights, attention-weighted fully connected layer parameters, feature fusion layer weights, and output layer mapping weights all initialized using Gaussian randomization. The Adam optimizer is used to train the network, with an initial learning rate of 0.001 and a weight decay coefficient of 0.0005. The cross-entropy loss function is used as the optimization objective during training, and the network is iterated for 40 rounds. After each round of training, the entity association accuracy of the model is calculated using the validation set. Training is stopped when the validation set accuracy remains stable above 93% for five consecutive rounds, resulting in the trained heterogeneous graph attention network. This network can achieve meteorological entity neighbor information aggregation and semantic feature extraction. The network structure and parameters are preset and adjusted based on the characteristics of meteorological domain data. The initialized heterogeneous graph is input into the heterogeneous graph attention network constructed and trained above. 
Inside the heterogeneous graph attention network, neighbor information aggregation is performed on each node according to the hierarchical propagation structure, that is, all nodes in the heterogeneous graph are traversed, with each node as the center node; for each center node, all its directly associated neighbor nodes are found and obtained, and the corresponding relationship type is recorded (the relationship type is the relationship corresponding to the various edges extracted or established in step 100a); based on the node connection relationship, preliminary neighbor information aggregation is performed on each center node, and the feature vectors of all neighbor nodes (the feature vectors are the initialized vectors) are summarized to form the preliminary aggregation result. Thus, the construction of the heterogeneous graph, feature initialization and the first layer of neighbor aggregation are completed.
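The stopping rule of the training procedure (at most 40 rounds, stop once validation accuracy has stayed above 93% for five consecutive rounds) can be illustrated as follows. This is a sketch under assumptions: the function name and the list-of-accuracies input are hypothetical, and "remains stable above 93%" is read here as a strict comparison in each of the five rounds.

```python
ACCURACY_TARGET = 0.93  # validation accuracy level from the description
PATIENCE_ROUNDS = 5     # consecutive rounds the level must be held
MAX_ROUNDS = 40         # iteration budget from the description


def stopping_round(val_accuracies, target=ACCURACY_TARGET,
                   patience=PATIENCE_ROUNDS, max_rounds=MAX_ROUNDS):
    """Return the 1-based round at which training would stop.

    `val_accuracies` holds one validation accuracy per completed round
    (hypothetical input). If the target is never held long enough, the
    last available round within the budget is returned.
    """
    streak = 0
    for round_no, accuracy in enumerate(val_accuracies[:max_rounds], start=1):
        # Count consecutive rounds above the target; any dip resets the count.
        streak = streak + 1 if accuracy > target else 0
        if streak >= patience:
            return round_no
    return min(len(val_accuracies), max_rounds)
```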

[0041] Step 101b involves calculating the attention weight of each neighbor node for the center node under different relation types during the aggregation process using an attention mechanism. The feature vectors of the neighbor nodes are then weighted and aggregated based on these attention weights to obtain aggregated features. Specifically, this includes: during the aggregation process of the heterogeneous graph attention network constructed in step 100b, an attention mechanism is introduced to achieve differentiated weighting of different neighbor nodes under different relation types. For each center node determined in step 100b and each of its neighbor nodes, the attention weight between the center node and the neighbor node is calculated based on the relation type between the nodes (the relation type extracted or established in step 100a). The attention weight is calculated using a normalized exponential function. The attention mapping layer is a single-layer fully connected neural network built into the heterogeneous graph attention network, used to map the concatenated vectors into a single numerical value representing the attention intensity. The larger the mapped value, the higher the semantic correlation between the neighbor node and the center node under the corresponding relation type and the greater the neighbor node's contribution to the semantic expression of the center node; the lower the correlation, the smaller the contribution. The initial feature vectors of the center node and of the neighbor nodes are the feature vectors initialized in step 100b. Normalization with the normalized exponential function yields the standardized attention weights of all neighbor nodes relative to the center node, with weight values ranging from 0 to 1. Based on the calculated standardized attention weights, the feature vector of each neighbor node (the vector initialized in step 100b) is weighted, i.e., weighted neighbor feature = neighbor node feature vector × corresponding attention weight. The weighted feature vectors of all neighbors of the center node are summed to obtain the neighbor aggregation feature of the center node. Through this weighted summation, the aggregation feature integrates the feature information of different neighbor nodes according to their importance: neighbor nodes with higher importance (higher weight) account for a larger proportion of the aggregation feature, while neighbor nodes with lower importance (lower weight) account for a smaller proportion, reflecting the difference in importance of different neighbor nodes under different relation types.
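The weighting and summation in step 101b can be illustrated with a small sketch. It assumes the attention mapping layer is a plain linear map over the concatenated center and neighbor vectors; the `weights` and `bias` parameters are hypothetical stand-ins for trained values, and in the described network one such parameter set would exist per relation type.

```python
import math


def attention_score(center, neighbor, weights, bias=0.0):
    """Map the concatenated [center, neighbor] vector to one scalar.

    `weights` and `bias` stand in for the single-layer attention mapping;
    their values are hypothetical, not taken from the description.
    """
    concatenated = list(center) + list(neighbor)
    return sum(w * x for w, x in zip(weights, concatenated)) + bias


def softmax(scores):
    """Normalized exponential function; outputs lie in (0, 1] and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_neighbors(center, neighbors, weights):
    """Return (attention weights, attention-weighted sum of neighbor vectors)."""
    scores = [attention_score(center, n, weights) for n in neighbors]
    alphas = softmax(scores)
    aggregated = [0.0] * len(center)
    for alpha, neighbor in zip(alphas, neighbors):
        for i, value in enumerate(neighbor):
            aggregated[i] += alpha * value  # weighted neighbor feature
    return alphas, aggregated
```

A neighbor with a higher mapped score receives a larger weight, so its features dominate the aggregated vector, which is the behavior the paragraph describes.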

[0042] Step 102b involves fusing the aggregated features with the feature vector of the central node to update the embedding vector of the central node in the current layer. After several iterations, the final embedding vector of each node with fused structural semantics is obtained. The final embedding vectors are then clustered, and nodes whose embedding vector distance is less than the cluster radius are grouped into the same semantic community. Each node is labeled with its semantic community identifier, ultimately resulting in an optimized knowledge graph with semantic community labels. Specifically, this includes:

[0043] The neighbor aggregation features and the initial feature vector of the central node are fused by concatenation to retain more feature information (the feature vectors are all vectors initialized in step 100b and processed in step 101b). The fused feature vector is used as the output embedding vector of the central node in the current network layer. The neighbor information aggregation in step 100b and the attention weight calculation and weighted aggregation process in step 101b are repeated to perform multi-layer iteration in the heterogeneous graph attention network. Each layer recalculates the attention weights and updates the node embedding vector based on the embedding vector output by the previous layer. After multi-layer propagation iteration, each node will fuse global graph structure information, local neighbor information and multimodal semantic information (multimodal semantic information comes from multimodal entity extraction and alignment in step 100a) to finally form the final embedding vector of each node. Clustering is performed on the final embedding vectors of all nodes to achieve grouping and regularization of semantic entities. A clustering radius of 0.6 is preset. A clustering radius of 0.6 can effectively group nodes with high semantic similarity and close semantic association into the same semantic community, avoiding the splitting of semantically similar nodes and overly fine division of semantic communities due to an excessively small radius. It can also prevent nodes with different semantics from being misclassified into the same community due to an excessively large radius, which would affect the accuracy of semantic hierarchical regularization. Nodes with Euclidean distance less than the preset clustering radius are grouped into the same semantic community. Each node is labeled with the corresponding semantic community number to complete the semantic community division. 
The nodes labeled with semantic community identifiers are reintegrated into the original cross-modal heterogeneous meteorological knowledge graph constructed in step 100a, keeping the original edge relationships established in step 100a unchanged. The original cross-modal heterogeneous meteorological knowledge graph is replaced with the integrated nodes and edge relationships to finally form an optimized knowledge graph containing semantic community labels. This graph performs semantic hierarchical regularization on the extracted entities.
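The description fixes the clustering radius (0.6) and the distance measure (Euclidean) but not the clustering procedure itself. The sketch below therefore swaps in a simple greedy, seed-based grouping purely for illustration: each node joins the first community whose seed node lies closer than the radius, otherwise it starts a new community. The function names and the dictionary input are assumptions.

```python
import math

CLUSTER_RADIUS = 0.6  # preset clustering radius from the description


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def assign_communities(embeddings, radius=CLUSTER_RADIUS):
    """Label each node with a semantic community number.

    `embeddings` maps node id -> final embedding vector (hypothetical
    layout). Greedy seed-based grouping is an assumed stand-in for the
    unspecified clustering algorithm.
    """
    seeds = []   # seed vector of each community; index = community number
    labels = {}
    for node_id, vector in embeddings.items():
        for community, seed in enumerate(seeds):
            if euclidean(vector, seed) < radius:
                labels[node_id] = community
                break
        else:
            # No existing community is close enough: start a new one.
            seeds.append(vector)
            labels[node_id] = len(seeds) - 1
    return labels
```

The resulting labels would be written back onto the nodes while the edges from step 100a stay unchanged.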

[0044] This embodiment, by formalizing the meteorological knowledge graph as a heterogeneous graph and processing it with a heterogeneous graph attention network, can adapt to the structural features of multiple types of nodes and edges, and fully explore the heterogeneous association information between meteorological entities. Calculating and weighting the neighbor nodes based on the attention mechanism and then aggregating them highlights the role of key neighbor nodes, improving the accuracy and relevance of node embedding vectors. The final node embedding vectors obtained through multi-level iterative updates can simultaneously integrate the structural and semantic information of the knowledge graph, providing a stable and reliable data foundation for subsequent semantic segmentation. Clustering based on the embedding vector distance to divide semantic communities enables automated classification and organization of meteorological entities, clarifying the semantic affiliation of different entities and optimizing the organizational structure of the knowledge graph. Generating an optimized knowledge graph with semantic community labels reduces the complexity of subsequent knowledge retrieval and reasoning, improving the efficiency of meteorological knowledge utilization.

[0045] In a preferred embodiment of the present invention, step 2 includes:

[0046] Step 200: For each node in the optimized knowledge graph, calculate the difference between the embedding vector of the node and the center vector of each adjacent semantic community to obtain a difference vector, and use the projection of the difference vector in a preset direction as the semantic gradient of the node pointing to that adjacent semantic community. An adjacent semantic community is a semantic community to which a neighboring node directly connected to the node through an edge belongs. Specifically, this includes: traversing each node in the optimized knowledge graph and obtaining the final embedding vector of each node one by one, this vector being the final embedding vector obtained after multiple iterations of the heterogeneous graph attention network in step 102b. At the same time, for each node, find all the neighboring nodes directly connected to it through an edge and record the semantic community label of each neighboring node, this label being the semantic community number assigned after the semantic community clustering in step 102b, and thereby determine the adjacent semantic communities of the node, that is, the semantic communities to which its neighboring nodes belong. If multiple neighboring nodes belong to the same semantic community, that semantic community is counted only once as an adjacent semantic community. For each adjacent semantic community, the final embedding vectors of all nodes in the community are collected, and the center vector of the community is calculated. The center vector of a semantic community is the geometric mean of the embedding vectors of all nodes in that semantic community. If an adjacent semantic community contains only a single node, the final embedding vector of that node is the center vector of the semantic community.

[0047] For each node, calculate the difference between its final embedding vector and the center vector of each adjacent semantic community to obtain the difference vector for each adjacent semantic community. The number of difference vectors equals the number of adjacent semantic communities of the node, ensuring that each adjacent semantic community has a corresponding difference vector. The preset radial direction is the direction from the current node to the center of the adjacent semantic community. Project each difference vector onto this preset radial direction using the vector dot product, i.e., semantic gradient = difference vector · preset radial direction unit vector. The preset radial direction unit vector is obtained by subtracting the node's final embedding vector from the adjacent semantic community center vector and then normalizing the result, i.e., preset radial direction unit vector = (adjacent semantic community center vector - node's final embedding vector) / Euclidean distance between the adjacent semantic community center vector and the node's final embedding vector. The projection result is the semantic gradient of the node towards its adjacent semantic community. The sign of the semantic gradient indicates the gradient direction (a positive value points towards the adjacent semantic community, a negative value points away from it), and the absolute value indicates the gradient strength. The larger the gradient strength, the looser the semantic connection between the node and its adjacent semantic community, the more obvious the node's deviation from the center of the semantic community, and the greater the difference between the node's own semantic features and the core semantic features of the adjacent semantic community. The smaller the gradient strength, the closer the semantic connection between the node and its adjacent semantic community, the slighter the node's deviation from the center of the semantic community, and the smaller the difference between the node's own semantic features and the core semantic features of the adjacent semantic community. The range of gradient strength values is consistent with the projection result of the semantic gradient and changes dynamically with the position of the node and its semantic connection to the adjacent semantic community.
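The projection in step 200 can be written out as a short sketch. Two readings are assumptions here and are marked in the code: the community center is computed as the component-wise arithmetic mean (embedding components may be negative, so a literal geometric mean is not always defined), and the difference vector is oriented from the node towards the center, the same orientation as the radial direction. Under that reading the gradient strength equals the Euclidean distance between the node and the community center.

```python
import math


def centroid(vectors):
    """Component-wise arithmetic mean (assumed reading of the center vector)."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def semantic_gradient(node_vec, center_vec):
    """Projection of the difference vector onto the radial unit vector.

    Assumption: the difference vector is taken as center - node, so the
    projection is non-negative and equals the node-to-center distance.
    """
    diff = [c - n for c, n in zip(center_vec, node_vec)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        return 0.0  # node coincides with the community center
    unit = [d / norm for d in diff]  # preset radial direction unit vector
    return sum(d * u for d, u in zip(diff, unit))


def node_gradients(node_vec, adjacent_communities):
    """Return one semantic gradient per adjacent semantic community.

    `adjacent_communities` maps community label -> list of member embedding
    vectors (hypothetical layout).
    """
    return {label: semantic_gradient(node_vec, centroid(members))
            for label, members in adjacent_communities.items()}
```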

[0048] Step 201: Traverse all nodes in the optimized knowledge graph and identify nodes whose semantic gradient vectors simultaneously point to two or more different semantic communities and whose gradient magnitudes all exceed a preset threshold as semantic boundary anchors. These semantic boundary anchors are located within the common boundary region of the multiple semantic communities they point to. Specifically, this includes: traversing all nodes in the optimized knowledge graph one by one, and comprehensively verifying the semantic gradient-related information of each node. The verification includes the number of semantic gradient vectors corresponding to the node, the unique identifier of the semantic community pointed to by each semantic gradient vector, and the magnitude of each semantic gradient vector, i.e., the gradient strength value. The preset gradient threshold is 0.3. This threshold is determined based on the semantic community boundary features in the meteorological field and can effectively distinguish semantic boundary nodes from internal nodes. The selection of semantic boundary anchor points must meet two core conditions simultaneously. First, the number of semantic gradient vectors corresponding to the node is not less than two, and each semantic gradient vector points to a different semantic community. That is, the direct neighbor nodes of the node belong to at least two different semantic communities, and each different semantic community corresponds to an independent semantic gradient vector. Second, the semantic gradient magnitude (gradient strength value) of the node pointing to all different semantic communities exceeds the preset gradient threshold of 0.3, and there is no case where the semantic gradient magnitude in any pointing direction is less than 0.3.
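The two screening conditions of step 201 reduce to a simple predicate. The sketch below assumes the per-node gradients are already available as a mapping from adjacent community label to signed gradient value; that layout and the function names are hypothetical, while the threshold of 0.3 and the "at least two communities, all above threshold" rule come from the description.

```python
GRADIENT_THRESHOLD = 0.3  # preset gradient threshold from the description


def is_boundary_anchor(gradients, threshold=GRADIENT_THRESHOLD):
    """Check both screening conditions for one node.

    `gradients` maps adjacent community label -> signed semantic gradient,
    so distinct keys are distinct communities (hypothetical layout).
    """
    if len(gradients) < 2:
        return False  # condition 1: at least two different communities
    # Condition 2: every gradient magnitude must exceed the threshold.
    return all(abs(value) > threshold for value in gradients.values())


def select_boundary_anchors(all_gradients, threshold=GRADIENT_THRESHOLD):
    """Return the ids of all nodes that qualify as semantic boundary anchors."""
    return [node_id for node_id, gradients in all_gradients.items()
            if is_boundary_anchor(gradients, threshold)]
```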

[0049] Nodes that simultaneously meet both of the above screening criteria are formally defined as semantic boundary anchors. The spatial location of a semantic boundary anchor is the common boundary region of the multiple semantic communities it points to. The specific positioning logic and judgment criteria are as follows. First, calculate the Euclidean distance between the final embedding vector of the semantic boundary anchor and the center vector of each adjacent semantic community; then calculate the sum of the Euclidean distances between all adjacent semantic community center vectors, and calculate the ratio of the distance from the node to each adjacent semantic community center vector to the corresponding community center distance. If this ratio is between 1/3 and 2/3, and the difference between the node's distances to the center vectors of any two adjacent semantic communities does not exceed 0.2 (this distance difference is set based on the semantic feature differences of meteorological entities and can effectively determine that a node is located in the middle region between communities), then the node is considered a semantic boundary anchor. This means that the embedding vector of the node is located in the intermediate transition region between the center vectors of multiple adjacent semantic communities. Second, from the perspective of semantic feature dimensions, for each dimension of the node's final embedding vector, the mean difference between its value and the values of the corresponding dimension of all nodes in each adjacent semantic community is within ±0.1. This means that the node neither completely matches the core semantic features of any one semantic community nor completely deviates from the core semantic features of any adjacent semantic community. Instead, it possesses some semantic features of multiple adjacent semantic communities. Based on the above spatial location determination and semantic feature analysis, the node can be used as the core anchor point for dividing the spatial range of different semantic communities.

[0050] Step 202: Calculate the geometric mean of the embedding vectors of all nodes in the core entity group with the highest semantic relevance in the optimized knowledge graph to obtain the coordinates of the base point. Using the embedding vector of each semantic boundary anchor point as the vertex coordinates, connect the base point and all vertices to form a cone-shaped skeleton. Specifically, this includes: in the optimized knowledge graph, selecting the core entity group with the highest semantic relevance. The selection criteria must simultaneously meet three indicators. First, the number of nodes in the semantic community is not less than the preset node number threshold, which is 20, to ensure that the core entity group has sufficient semantic representativeness. Second, the edge connection density between nodes in the community is within the preset density range (edge connection density = actual number of edges in the community / maximum number of edges that nodes in the community can connect; semantic communities with a density ≥ 0.6 are selected). Third, the average similarity of the embedding vectors of nodes in the community is not lower than the preset similarity threshold (average similarity ≥ 0.8, with similarity calculated using the cosine similarity method of step 100a). The semantic community that meets the above three indicators is determined as the core entity group. The entities in the core entity group are all core meteorological entities in the meteorological field and are the core carriers of the semantic expression of the entire knowledge graph.
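The three selection indicators for the core entity group can be checked as in the following sketch. Two details are assumptions, flagged in the code: the maximum number of edges is taken as n(n-1)/2 (an undirected graph without parallel edges), and the average similarity is taken as the mean cosine similarity over all node pairs. The function names are hypothetical.

```python
import math
from itertools import combinations

MIN_NODES = 20            # preset node number threshold
MIN_DENSITY = 0.6         # preset edge connection density
MIN_AVG_SIMILARITY = 0.8  # preset average similarity threshold


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))


def edge_density(node_count, edge_count):
    """Actual edges divided by the maximum possible edges.

    Assumption: undirected graph without parallel edges, so the maximum
    is n * (n - 1) / 2.
    """
    max_edges = node_count * (node_count - 1) / 2
    return edge_count / max_edges if max_edges else 0.0


def average_pairwise_similarity(vectors):
    """Mean cosine similarity over all node pairs (assumed averaging rule)."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


def is_core_entity_group(vectors, edge_count):
    """True when a community satisfies all three indicators at once."""
    node_count = len(vectors)
    return (node_count >= MIN_NODES
            and edge_density(node_count, edge_count) >= MIN_DENSITY
            and average_pairwise_similarity(vectors) >= MIN_AVG_SIMILARITY)
```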

[0051] Collect the final embedding vectors of all nodes within the core entity group. These vectors are the final embedding vectors of the nodes generated in step 102b. Calculate the geometric mean of all vectors. This result serves as the base point coordinates of the cone-shaped semantic skeleton. The base point coordinates are the central reference point of the cone-shaped semantic space domain, and their values directly determine the center position of the cone-shaped semantic space. Extract all semantic boundary anchors determined in step 201, and use the final embedding vector of each semantic boundary anchor as the vertex coordinates of the cone-shaped skeleton. Using the base point coordinates as the central starting point, connect the base point coordinates with the vertex coordinates of each semantic boundary anchor by straight lines to form multiple radial skeleton edges. All skeleton edges, together with the base point and the vertices of the semantic boundary anchors, constitute a stable cone-shaped semantic skeleton. This skeleton is used to limit the overall semantic space range of the optimized knowledge graph and clarify the boundary contours of different semantic communities.
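A minimal sketch of the skeleton construction follows. It assumes the mean of the core entity group is computed component-wise as an arithmetic mean, which is one possible reading of the text, and it stores each skeleton edge as a start point, an end point, and a Euclidean length. The dictionary layout and function names are hypothetical.

```python
import math


def base_point(core_vectors):
    """Component-wise arithmetic mean of the core entity group embeddings.

    Assumption: the mean named in the text is read as this centroid.
    """
    count = len(core_vectors)
    return [sum(component) / count for component in zip(*core_vectors)]


def build_cone_skeleton(core_vectors, anchor_vectors):
    """Connect the base point to every semantic boundary anchor.

    `anchor_vectors` maps anchor id -> final embedding vector (hypothetical
    layout). Each radial skeleton edge is stored with its Euclidean length.
    """
    base = base_point(core_vectors)
    edges = {}
    for anchor_id, vertex in anchor_vectors.items():
        length = math.sqrt(sum((v - b) ** 2 for v, b in zip(vertex, base)))
        edges[anchor_id] = {"start": base, "end": list(vertex), "length": length}
    return {"base": base, "edges": edges}
```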

[0052] Step 203: Nodes in the optimized knowledge graph, excluding those in the core entity group and semantic boundary anchors, are treated as ordinary nodes. The Euclidean distance between the embedding vector of each ordinary node and the coordinates of the base point is calculated to obtain the embedding distance value for each ordinary node. The minimum and maximum embedding distance values of all ordinary nodes are obtained, and the interval between the minimum and maximum distance values is divided equally according to a preset number of radial levels to obtain several continuous and non-overlapping hierarchical distance intervals. The lower and upper limits of each hierarchical distance interval constitute the boundary threshold of that level. Each ordinary node is traversed, and its embedding distance value is compared with the boundary threshold of each hierarchical distance interval. The ordinary node is assigned to the radial level corresponding to the hierarchical distance interval to which its embedding distance value belongs. This forms a cone-shaped semantic space domain in the embedding space, centered on the base point, bounded by vertices, and distributed radially in layers. Specifically, this includes:

[0053] All nodes in the optimized knowledge graph are classified and filtered, excluding all nodes within the core entity group determined in step 202 and all semantic boundary anchors determined in step 201. The remaining nodes are uniformly defined as ordinary nodes. Ordinary nodes are non-core, non-boundary conventional entity nodes in the meteorological field and are the main components of the cone-shaped semantic space domain. Their proportion determines the main scale of the cone-shaped semantic space domain. For each ordinary node, the Euclidean distance between its final embedding vector generated in step 102b and the base point coordinates is calculated. This distance is the embedding distance value of the ordinary node. The embedding distance value is calculated for all ordinary nodes one by one, and the distance value of each ordinary node is recorded so that no node is omitted. The minimum and maximum embedding distance values of all ordinary nodes are then determined to establish the numerical distribution range of the ordinary node embedding distances. The preset number of radial levels is 5, which is set based on the semantic hierarchy distribution characteristics of meteorological entities. This allows for precise semantic stratification while avoiding structural redundancy caused by too many levels. The numerical range from the minimum to the maximum distance value is divided equally according to the preset number of radial levels. The division formula is: interval length = (maximum distance value - minimum distance value) / preset number of radial levels. After the uniform interval length is obtained from this formula, 5 consecutive, end-to-end, non-overlapping hierarchical distance intervals are generated in sequence. The lower and upper limits of each hierarchical distance interval together constitute the boundary thresholds of that level. The boundary thresholds for each level are defined as follows. The lower limit of the first radial level is the minimum distance value, and the upper limit of the first radial level is the sum of the minimum distance value and the interval length. This level follows the assignment rule of greater than or equal to the lower limit and less than the upper limit. For the second to fourth radial levels, the lower limit is the upper limit of the previous level, and the upper limit is the sum of the lower limit of the level and the interval length. Each of these levels follows the assignment rule of greater than or equal to its lower limit and less than its upper limit. The lower limit of the fifth radial level is the upper limit of the fourth radial level, and the upper limit of the fifth radial level is the maximum distance value. This level follows the assignment rule of greater than or equal to its lower limit and less than or equal to its upper limit.

[0054] Iterate through all ordinary nodes, comparing the embedding distance value of each ordinary node with the boundary thresholds of each level's distance interval one by one. If the embedding distance value of an ordinary node is greater than or equal to the lower limit of a certain level and less than the upper limit of that level (less than or equal to the upper limit in the case of the fifth level), then the ordinary node is assigned to the corresponding radial level. After all ordinary nodes have been assigned to levels, a cone-shaped semantic space domain is finally formed, centered on the base point coordinates, bounded by the semantic boundary anchor points, and distributed radially in layers. The core entity group is located at the center of the cone-shaped semantic space, i.e., the region where the base point coordinates are located. The semantic boundary anchor points are located at the boundary of the cone-shaped semantic space, i.e., the region where the skeleton vertices are located. Ordinary nodes are distributed across the radial levels in order of their embedding distance values from near to far, realizing the orderly, layered, and regular arrangement of meteorological entities in the semantic space and completing the construction of the entire cone-shaped semantic space domain.
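The equal-width level division and the assignment rules of step 203 can be sketched as follows. The function names are hypothetical; the five levels, the interval length formula, and the half-open intervals with a closed last level follow the description. When all distances are equal, the interval length is zero and this sketch places every node in the last level, which is a choice made here rather than something the description specifies.

```python
RADIAL_LEVELS = 5  # preset number of radial levels from the description


def level_boundaries(distances, levels=RADIAL_LEVELS):
    """Return levels + 1 boundary values from the minimum to the maximum distance."""
    lowest, highest = min(distances), max(distances)
    step = (highest - lowest) / levels  # interval length
    # The last boundary is set to the maximum itself to avoid rounding drift.
    return [lowest + k * step for k in range(levels)] + [highest]


def assign_level(distance, bounds):
    """Return the 1-based radial level of one embedding distance value.

    Levels other than the last use lower <= distance < upper; the last
    level also includes its upper limit. `distance` is assumed to lie
    within [bounds[0], bounds[-1]].
    """
    levels = len(bounds) - 1
    for k in range(levels - 1):
        if bounds[k] <= distance < bounds[k + 1]:
            return k + 1
    return levels


def assign_all(distances, levels=RADIAL_LEVELS):
    """Map node id -> radial level for a dict of embedding distance values."""
    bounds = level_boundaries(list(distances.values()), levels)
    return {node_id: assign_level(d, bounds) for node_id, d in distances.items()}
```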

[0055] This embodiment, by calculating the semantic gradient of nodes and selecting semantic boundary anchors, can delineate the boundaries of different semantic communities, clarify the semantic affiliation of meteorological entities, and improve the accuracy of semantic segmentation of the knowledge graph. By using the core entity group as the base point and the semantic boundary anchors as vertices to form a conical skeleton, it can stably constrain the overall semantic space structure of meteorological knowledge, ensuring the regularity of the semantic space distribution. Ordinary nodes are allocated by radially equidistant layering, achieving an orderly hierarchical arrangement of meteorological entities in the semantic space and reducing the structural complexity of the knowledge graph. The constructed conical semantic space domain can intuitively present the core status, semantic affiliation, and spatial distribution of meteorological entities, improving the efficiency and accuracy of meteorological knowledge retrieval, reasoning, and analysis. The structural design combining semantic boundary anchors with radial layering is compatible with the heterogeneous features of multimodal meteorological entities.

[0056] In a preferred embodiment of the present invention, step 3 includes:

[0057] Step 300: The base point in the conical semantic space domain is set as the heat diffusion source point. The embedding distance from each node in the conical semantic space domain to the base point is used as the spatial location variable. The heat diffusion coefficient and propagation time parameters are set, and the heat diffusion equation is solved to obtain the semantic information concentration value at any location within the conical semantic space domain. Specifically, this includes the following. The base point in the conical semantic space domain, i.e., the geometric mean coordinates calculated in step 202 from the embedding vectors of all nodes in the core entity group, is set as the unique heat diffusion source point of the heat diffusion model. This source point corresponds to the core region with the highest semantic density and the closest correlation in the meteorological knowledge graph; it is the starting point of the entire semantic information diffusion and the location with the highest semantic energy and concentration. The spatial location of each node relative to the base point within the conical semantic space domain is then extracted, and the Euclidean embedding distance from each node to the base point calculated in step 203 is used as the spatial location variable of the heat diffusion model, denoted $r$. This variable describes how far a node lies from the core source point in the semantic space and directly determines the semantic information concentration after diffusion. The smaller $r$ is, the closer the node is to the core source point (base point), the less the semantic information decays when it spreads to the node, and the higher the corresponding semantic information concentration value. The larger $r$ is, the farther the node is from the core source point, the greater the attenuation during diffusion, and the lower the corresponding semantic information concentration value.
Based on the semantic propagation law of multimodal knowledge in the meteorological field, and combined with the spatial distribution characteristics of the semantic boundary anchor points obtained in step 201, the key parameters required by the heat diffusion equation are preset. All parameters are fixed values determined by the meteorological knowledge graph data. The heat diffusion coefficient, denoted $D$, controls the speed and range of semantic information diffusion in the semantic space and is set to the fixed value $D = 1.0$; a larger value means faster dissemination and wider coverage, while a smaller value means slower dissemination and higher concentration. The propagation time parameter, denoted $t$, controls the stability of semantic information diffusion, ensuring a stable and uniform concentration distribution throughout the cone-shaped semantic space domain, and is set to the fixed value $t = 1.0$. With this propagation time, the semantic information completes its diffusion from the core to the boundary: the time is neither so short that diffusion is insufficient nor so long that the semantic information is excessively diluted.

[0058] After the source point, location variable, and diffusion parameters are determined, a three-dimensional isotropic heat diffusion equation is constructed and solved under the initial and boundary conditions of the conical semantic space domain to obtain the semantic information concentration value. With the base point as the origin of the coordinate system and the embedding distance $r$ from a node to the base point as the spatial location variable, the three-dimensional spherically symmetric heat diffusion equation is:

$$\frac{\partial u(r,t)}{\partial t} = D\left(\frac{\partial^{2} u(r,t)}{\partial r^{2}} + \frac{2}{r}\,\frac{\partial u(r,t)}{\partial r}\right)$$

where $D$ is the heat diffusion coefficient and $u(r,t)$ is the semantic information concentration value at distance $r$ from the base point in the semantic space at diffusion time $t$; a higher concentration value indicates that the semantic information at that location is denser and more important. At the initial diffusion time $t = 0$, all semantic information is concentrated at the base point, which is expressed mathematically as:

$$u(r,0) = Q\,\delta(r)$$

where $Q$ is the initial semantic concentration constant at the base point, $Q = 1.0$, and $\delta(r)$ is the Dirac delta function: when $r = 0$ its value is infinite, when $r \neq 0$ its value is 0, and its integral over the whole space equals 1. This expresses that the semantic information is completely concentrated at the source point at the initial moment. The diffusion of semantic information is confined within the cone-shaped semantic space domain: at $r = R_{\max}$, the maximum embedding distance obtained in step 203, the gradient of the semantic information concentration along the outward normal of the boundary is 0, i.e., $\partial u / \partial r = 0$ at $r = R_{\max}$. The boundary is therefore regarded as adiabatic, with no overflow or loss of semantic information.

[0059] The above heat diffusion equation is solved analytically by the method of separation of variables. Under the given initial and boundary conditions, this yields the formula for the semantic information concentration $u(r,t)$ at any position $r$ and any time $t$. Substituting the heat diffusion coefficient $D = 1.0$, the propagation time $t = 1.0$, and the initial semantic concentration constant $Q = 1.0$ into this formula allows direct calculation of the semantic information concentration value for any node or spatial location within the cone-shaped semantic space domain. The formula shows that the semantic information concentration value is negatively correlated with the embedding distance $r$ from the node to the base point: the closer to the base point, the higher the concentration value, the higher the semantic core degree, and the greater the entity importance; the farther from the base point, the lower the concentration value, the sparser the semantics, and the closer the entity lies to the semantic edge region.
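As a rough numerical illustration of this negative correlation, the sketch below evaluates the standard free-space fundamental solution of the three-dimensional heat equation. This is an assumption made purely for illustration: it ignores the adiabatic boundary and is not necessarily the closed-form expression used in this embodiment. The function names `heat_kernel_3d` and `relative_concentration` are hypothetical.

```python
import math


def heat_kernel_3d(r, t=1.0, D=1.0, Q=1.0):
    """Free-space point-source solution of the 3D heat equation.

    u(r, t) = Q / (4*pi*D*t)^(3/2) * exp(-r^2 / (4*D*t))
    Assumption: unbounded domain (the adiabatic boundary is not modeled).
    """
    if t <= 0 or D <= 0:
        raise ValueError("t and D must be positive")
    return Q / (4.0 * math.pi * D * t) ** 1.5 * math.exp(-r * r / (4.0 * D * t))


def relative_concentration(r, t=1.0, D=1.0):
    """Concentration divided by its value at the base point (r = 0)."""
    if t <= 0 or D <= 0:
        raise ValueError("t and D must be positive")
    return math.exp(-r * r / (4.0 * D * t))


if __name__ == "__main__":
    # Concentration falls monotonically as the embedding distance grows.
    for r in (0.0, 1.0, 2.0, 3.0):
        print(r, heat_kernel_3d(r), relative_concentration(r))
```

Under this assumed kernel the ratio to the base-point value depends only on $r^2/(4Dt)$, which makes the monotonic decay with distance explicit.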

[0060] Step 301: Extract all spatial point sets with equal semantic information concentration values in the cone-shaped semantic space domain, and fit each spatial point set into a continuous surface as a meteorological semantic diffusion equipotential surface; sort all meteorological semantic diffusion equipotential surfaces by their corresponding semantic information concentration values from high to low, and take the concentration values corresponding to adjacent equipotential surfaces as the concentration boundaries of each level. Specifically, this includes the following. First, all nodes in the cone-shaped semantic space domain are traversed one by one, and the semantic information concentration value of each node is read. After the traversal, the concentration values are deduplicated to obtain all unique concentration values, each recorded as $c_k$. For each unique concentration value $c_k$, all nodes within the cone-shaped semantic space domain whose semantic information concentration value equals $c_k$ are selected and defined as an isoconcentration spatial point set, denoted $P_k$. All nodes within the same set $P_k$ have identical semantic information concentration values and differ only in spatial location; each $P_k$ corresponds to a unique $c_k$, and no node belongs to two or more isoconcentration point sets at the same time. A surface fitting operation is performed separately for each isoconcentration point set, using radial basis function interpolation: an interpolation function is constructed whose value at every discrete node (every node in the isoconcentration point set) equals the semantic information concentration value of that node, while the function remains continuous and smooth throughout the entire conical semantic space domain, so that a complete isoconcentration surface is fitted.
First, the spatial coordinates of all nodes in the isoconcentration spatial point set $P_k$ are extracted. Let $P_k$ contain $n$ nodes, with the spatial coordinates of the $i$-th node denoted $\mathbf{p}_i = (x_i, y_i, z_i)$, where $x_i$, $y_i$, $z_i$ are the three-dimensional coordinate components of the node in the cone-shaped semantic space domain, and let the semantic information concentration value corresponding to each node be $c_k$. The interpolation function $f$ must therefore satisfy $f(\mathbf{p}_i) = c_k$ at every node $\mathbf{p}_i$. The Gaussian radial basis function is selected as the interpolation kernel to construct the three-dimensional spatial interpolation function:

$$f(\mathbf{p}) = \sum_{j=1}^{n} w_j\,\varphi\left(\lVert \mathbf{p} - \mathbf{p}_j \rVert\right)$$

where $w_j$ are the interpolation coefficients, to be determined by solving a system of linear equations; $\varphi$ is the Gaussian radial basis function, whose width parameter is preset to $\sigma = 0.8$ so that the fitted surface is smooth and fits the discrete nodes; and $\mathbf{p}$ is any point in space. Substituting the coordinates of all discrete nodes into the interpolation function according to the interpolation conditions yields the system of linear equations:

$$\sum_{j=1}^{n} w_j\,\varphi\left(\lVert \mathbf{p}_i - \mathbf{p}_j \rVert\right) = f(\mathbf{p}_i), \qquad i = 1, \dots, n$$

where $\mathbf{p}_i$ is the spatial coordinate of the $i$-th discrete node (and $\mathbf{p}_j$ similarly), and $f(\mathbf{p}_i)$ is the value of the interpolation function at the $i$-th discrete node. Gaussian elimination is used to solve the system and obtain the values of all interpolation coefficients $w_j$.
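The interpolation procedure above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: the Gaussian kernel is taken as exp(-d^2 / (2*sigma^2)), which is one common convention and may differ from the exact form used in this embodiment, and the function names `gaussian_rbf`, `solve_linear_system`, and `fit_rbf` are hypothetical.

```python
import math


def gaussian_rbf(dist, sigma=0.8):
    """Gaussian kernel; the 2*sigma^2 scaling is an assumed convention."""
    return math.exp(-dist * dist / (2.0 * sigma * sigma))


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def fit_rbf(points, values, sigma=0.8):
    """Return f(p) = sum_j w_j * phi(||p - p_j||) with f(p_i) = values[i]."""
    kernel = [[gaussian_rbf(math.dist(p, q), sigma) for q in points]
              for p in points]
    weights = solve_linear_system(kernel, values)

    def interpolant(p):
        return sum(w * gaussian_rbf(math.dist(p, q), sigma)
                   for w, q in zip(weights, points))

    return interpolant


if __name__ == "__main__":
    nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    f = fit_rbf(nodes, [0.6] * len(nodes))
    print([f(p) for p in nodes])  # each value should be close to 0.6
```

For larger point sets a library routine for dense linear solves would normally replace the hand-written elimination; it is written out here only because the text names Gaussian elimination explicitly.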

[0061] Substituting the solved interpolation coefficients into the interpolation function yields a complete continuous interpolation function. Within the conical semantic space domain, a sufficient number of sampling points are selected, and the corresponding concentration values are calculated using the interpolation function. All sampling points are then smoothed so that the fitted surface remains continuous within the conical semantic space domain, without breaks or abrupt changes, and conforms to the overall contour of the domain, ultimately forming a closed isoconcentration surface. This closed surface is formally defined as the meteorological semantic diffusion equipotential surface, denoted $S_k$, where $S_k$, the isoconcentration spatial point set $P_k$, and the unique concentration value $c_k$ are in one-to-one correspondence; that is, each isoconcentration spatial point set $P_k$ corresponds to one meteorological semantic diffusion equipotential surface $S_k$. Furthermore, the semantic information concentration values at all locations on the same equipotential surface $S_k$ are identical and equal to the concentration value $c_k$ of the corresponding isoconcentration spatial point set, i.e., $f(S_k) = c_k$, where $f$ is the interpolation function. The equipotential surface thus intuitively presents the equal-value distribution of semantic information in space.

[0062] After all meteorological semantic diffusion equipotential surfaces have been fitted, the equipotential surfaces generated in the cone semantic space domain are sorted from high to low by their corresponding semantic information concentration values. Based on the semantic information concentration range of 0.01 to 1.0 calculated in step 300, the equipotential surface concentration values are set to 1.0, 0.8, 0.6, 0.4, 0.2, and 0.01. After sorting, the equipotential surface with a concentration value of 1.0 lies in the inner region near the base point, the equipotential surface with a concentration value of 0.01 lies in the outer region near the semantic boundary, and the remaining equipotential surfaces are arranged between them in descending order of concentration. In the sorted order, the region between every two adjacent equipotential surfaces is treated as an independent semantic diffusion level, giving 5 semantic diffusion levels in total, so that the entire cone semantic space domain forms a multi-layered, ordered semantic diffusion structure from the inside outward. Each semantic diffusion level is then given explicit concentration boundaries: for each of the first to fifth levels, the concentration value of the equipotential surface on the inner side of the level, closer to the base point, is taken as the inner layer concentration boundary, and the concentration value of the equipotential surface on the outer side, farther from the base point, is taken as the outer layer concentration boundary.
The boundary values of each level are as follows: the first diffusion level has an inner layer concentration boundary of 1.0 and an outer layer concentration boundary of 0.8; the second, 0.8 and 0.6; the third, 0.6 and 0.4; the fourth, 0.4 and 0.2; and the fifth, 0.2 and 0.01.

[0063] Step 302: Define the concentration value range between concentration boundaries as the concentration coverage interval of the corresponding level, and establish a semantic diffusion path spectrum running from the base point outward through each diffusion level to the outermost meteorological semantic diffusion equipotential surface. The semantic diffusion path spectrum includes multiple diffusion paths from the base point to the nodes covered by each diffusion level. Specifically, based on the five semantic diffusion levels and their concentration boundaries, the concentration coverage intervals are divided, nodes are assigned to levels, and the semantic diffusion path spectrum is constructed, completing the progression from concentration boundaries to path spectrum. The process is as follows. Taking the concentration boundaries of each semantic diffusion level determined in step 301 as the basis, the range of semantic information concentration values between the inner and outer concentration boundaries of each level is defined as the concentration coverage interval of that level, so that each diffusion level has exactly one concentration coverage interval and the interval matches the boundary values given above. The concentration coverage intervals of the five diffusion levels are divided as follows.
The concentration coverage interval of the first diffusion level is greater than or equal to 0.8 and less than 1.0, corresponding to an inner layer concentration boundary of 1.0 and an outer layer concentration boundary of 0.8, and covers all semantic information concentration values within this range; that of the second diffusion level is greater than or equal to 0.6 and less than 0.8, corresponding to an inner layer concentration boundary of 0.8 and an outer layer concentration boundary of 0.6; that of the third diffusion level is greater than or equal to 0.4 and less than 0.6, corresponding to an inner layer concentration boundary of 0.6 and an outer layer concentration boundary of 0.4; that of the fourth diffusion level is greater than or equal to 0.2 and less than 0.4, corresponding to an inner layer concentration boundary of 0.4 and an outer layer concentration boundary of 0.2; and that of the fifth diffusion level is greater than or equal to 0.01 and less than 0.2, corresponding to an inner layer concentration boundary of 0.2 and an outer layer concentration boundary of 0.01. All concentration coverage intervals are arranged sequentially from the base point outward; adjacent intervals are contiguous and non-overlapping, with no gaps or repeated concentration values, forming a complete, layered semantic diffusion hierarchy.

[0064] The semantic information concentration value of every node within the cone-shaped semantic space domain, including core entity group nodes, semantic boundary anchors, and ordinary nodes, is traversed one by one; the values range from 0.01 to 1.0. The concentration value of each node is compared with the concentration coverage intervals of the five diffusion levels to determine which interval it falls into. The comparison rules are as follows: if the node concentration value is greater than or equal to 0.8 and less than 1.0, the node is assigned to the first diffusion level; if it is greater than or equal to 0.6 and less than 0.8, the node is assigned to the second diffusion level; if it is greater than or equal to 0.4 and less than 0.6, the node is assigned to the third diffusion level; if it is greater than or equal to 0.2 and less than 0.4, the node is assigned to the fourth diffusion level; and if it is greater than or equal to 0.01 and less than 0.2, the node is assigned to the fifth diffusion level. Each node is assigned to exactly one diffusion level, with no cross-level assignment. In this way, all nodes are assigned to diffusion levels in an orderly manner according to their semantic information concentration values and their distance from the base point, further refining the multi-layer diffusion structure that radiates outward from the base point.
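The comparison rules above amount to a lookup over half-open intervals, which can be sketched as follows. The names `LEVEL_INTERVALS` and `assign_diffusion_level` are hypothetical. The sketch follows the intervals exactly as stated, so it returns `None` for any value outside them, including a value of exactly 1.0; how such a value should be handled is left open here.

```python
# (level, lower bound inclusive, upper bound exclusive), as stated in the text.
LEVEL_INTERVALS = [
    (1, 0.8, 1.0),
    (2, 0.6, 0.8),
    (3, 0.4, 0.6),
    (4, 0.2, 0.4),
    (5, 0.01, 0.2),
]


def assign_diffusion_level(concentration):
    """Return the diffusion level whose interval contains the concentration.

    Intervals are half-open [lower, upper), so each value matches at most
    one level. Returns None when no stated interval contains the value.
    """
    for level, lower, upper in LEVEL_INTERVALS:
        if lower <= concentration < upper:
            return level
    return None


if __name__ == "__main__":
    for c in (0.85, 0.6, 0.59, 0.2, 0.01):
        print(c, assign_diffusion_level(c))
```

Using half-open intervals is what guarantees that adjacent levels never both claim a boundary value such as 0.8 or 0.6.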

[0065] With the base point determined in step 202 as the sole starting point of semantic information diffusion, and taking into account the spatial distribution and node membership of each diffusion level, a complete semantic diffusion path spectrum is constructed. It starts from the base point, passes in turn through the concentration coverage interval of each diffusion level and the meteorological semantic diffusion equipotential surface corresponding to each level, and ends at the outermost meteorological semantic diffusion equipotential surface. The semantic diffusion path spectrum contains multiple independent diffusion paths, each starting from the base point, passing from the inside outward through the concentration coverage intervals of the corresponding diffusion levels, and ultimately reaching a node covered by its level. From the base point, paths are generated directly to all nodes in the first diffusion level, one path per node, so that all first-level nodes are covered. Likewise, paths are generated from the base point through the first and second levels in turn, reaching all nodes in the second level. The process continues in the same way: paths start from the base point, pass through all inner levels in turn, and reach the nodes in the third, fourth, and fifth levels, so that each path connects the base point with its target node. All paths together constitute the complete semantic diffusion path spectrum; each path corresponds to a logical route along which semantic information propagates from the core region (base point) to a specific node, showing how semantic information propagates from the core to the edge.
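One simple way to represent the path spectrum described above is a mapping from each node to the ordered list of levels its path crosses. The sketch below is illustrative only: the function name `build_path_spectrum`, the `level_N` waypoint labels, and the example node names are hypothetical, and the text does not prescribe a particular data structure.

```python
def build_path_spectrum(node_levels, base="base_point"):
    """Build one diffusion path per node.

    node_levels maps a node identifier to its diffusion level (1 = innermost).
    Each path starts at the base point, passes through every level from the
    innermost up to the node's own level, and ends at the node itself.
    """
    spectrum = {}
    for node, level in node_levels.items():
        if level < 1:
            raise ValueError(f"invalid level {level} for node {node!r}")
        waypoints = [f"level_{i}" for i in range(1, level + 1)]
        spectrum[node] = [base] + waypoints + [node]
    return spectrum


if __name__ == "__main__":
    example = {"cold_front": 1, "temperature_drop": 2, "humidity": 3}
    for node, path in build_path_spectrum(example).items():
        print(node, "<-", " -> ".join(path))
```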

[0066] This embodiment combines a cone-shaped semantic space domain with a thermal diffusion model, enabling the quantitative description of the diffusion process of meteorological semantic information using physical laws. This transforms the semantic distribution from an abstract structure into a computable and quantifiable physical model, enhancing the rigor of knowledge graph representation. By solving the thermal diffusion equation to obtain semantic information concentration values, core and peripheral semantics can be intuitively distinguished, with concentration levels closely matching node importance, facilitating rapid identification of key entities in meteorological knowledge. Constructing meteorological semantic diffusion equipotential surfaces and setting concentration boundaries allows for multi-level and standardized division of the semantic space, making its structure clearer. Establishing a semantic diffusion path spectrum can fully present the propagation path and relationships of semantic information from the core to the periphery.

[0067] In a preferred embodiment of the present invention, step 4 includes:

[0068] Step 400: Input the user query request into the pre-trained language encoder for vectorization processing to obtain the query request vector corresponding to the user query request; project the query request vector onto the embedding space of the cone semantic space domain to obtain the mapping coordinates of the query request vector in the cone semantic space domain, specifically including:

[0069] The Transformer was selected as the basic architecture of the language encoder. A network with 12 encoder layers and 6 attention heads was built; the input and output dimensions were both set to 512, and the hidden layer dimension was set to 1024. A position encoding module was added to the input layer to preserve sequence order information, and a normalization layer and a linear mapping layer were added to the output layer so that the output vector has a fixed dimension and an even distribution. A meteorological training dataset was constructed from text data such as meteorological observation reports, forecast documents, academic papers, popular science articles, and disaster early warning information. The data underwent deduplication, noise reduction, meteorological-specific word segmentation, and stop-word filtering, while retaining relevant function words and professional terms. It was then converted into standard sequence data and divided into training, validation, and test sets in an 8:1:1 ratio. The encoder was pre-trained by self-supervised learning on two core tasks: masked language modeling and sentence order prediction. The masked language modeling task trains the encoder's semantic understanding and prediction ability for meteorological vocabulary: 15% of the words in the input text are randomly masked (replaced with a special symbol), and the encoder predicts the masked words from the contextual semantics. The sentence order prediction task trains the encoder's ability to capture logical relationships in meteorological texts.
Specifically, two semantically related meteorological texts (such as the passage of a cold front and a drop in temperature) are shuffled and input into the encoder, which judges whether the two texts are in their original order. The learning rate is set to 1e-5, the batch size to 32, and the number of training epochs to 50. The Adam optimizer is used to optimize the encoder parameters and reduce the training loss. After each training epoch, the encoder is evaluated on the validation set, with semantic similarity accuracy as the core validation metric; this metric judges whether the similarity between the encoder's vectors for similar meteorological texts meets the standard. When the semantic similarity accuracy on the validation set has not improved for 5 consecutive epochs, pre-training is stopped and all encoder parameters at that point are saved, yielding the pre-trained language encoder.
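The stopping rule described above (stop once the validation metric has not improved for 5 consecutive epochs) can be sketched as follows. The function name `early_stopping_epoch` and the sample accuracy values are hypothetical; "improved" is assumed to mean strictly greater than the best value seen so far.

```python
def early_stopping_epoch(val_accuracies, patience=5):
    """Return the 1-based epoch at which training stops, or None.

    Training stops at the first epoch that completes `patience` consecutive
    epochs without a strict improvement over the best accuracy so far.
    """
    best = float("-inf")
    stale = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            best = acc
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


if __name__ == "__main__":
    history = [0.70, 0.75, 0.80, 0.80, 0.79, 0.80, 0.78, 0.80, 0.81]
    print(early_stopping_epoch(history))
```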

[0070] User query samples in the meteorological field were collected, covering types such as weather forecast queries, meteorological disaster queries, meteorological data queries, and meteorological terminology queries. Each sample was labeled with a corresponding semantic tag (such as precipitation query, typhoon query, or temperature query). These data were used to fine-tune the encoder: the parameters of the first 8 layers were frozen, and only the last 4 layers and the output layer were trained. The fine-tuning learning rate was set to 5e-6, the batch size to 16, and the number of training epochs to 20, with the cross-entropy loss function used to calculate the training loss. After each training epoch, the encoding accuracy of the encoder was tested on a test set to ensure that it was not less than 95%. After fine-tuning, the final encoder parameters were saved, yielding a pre-trained language encoder that is adapted to meteorological domain knowledge and can extract the semantics of user queries.

[0071] The user's natural language query request (such as the probability and magnitude of precipitation in a city over the next three days, the current typhoon warning level and impact range of a region, or the average temperature of a region over the past month) is fed into the constructed and trained pre-trained language encoder for vectorization encoding. The encoder first performs word segmentation on the query request: using a meteorological-specific word segmentation tool, it separates the core words, modifiers, and meteorological terms in the query text to clarify the core intent of the query. Then, through the encoder's attention mechanism, semantic features are extracted from all segmented words to obtain the semantic vector of each word. At the same time, semantic relationships between words are identified, such as the temporal relationship between precipitation probability and the next three days and the spatial relationship between a typhoon warning and a region, which resolves ambiguity in the query text. Finally, the encoder's global feature fusion module integrates the semantic features of all words into a continuous numerical vector with a fixed dimension of 512. This vector is the query request vector uniquely corresponding to the user's query request. Each dimension of the vector corresponds to a semantic feature of the query request, and the vector as a whole represents the core semantic intent of the user's query.

[0072] After obtaining the query request vector corresponding to the user's query request, the vector is projected into the embedding space corresponding to the previously constructed cone semantic space domain. Through vector space alignment and coordinate normalization transformation, the user's query request is mapped from semantic features to spatial location. The specific process is as follows: First, the query request vector is subjected to Min-Max normalization processing, and the value of each dimension of the vector is mapped to the interval [0, 1]. The 512 dimensions of the query request vector are normalized one by one to obtain the normalized query request vector.

[0073] The normalized query request vector is projected onto the 3D embedding space of the cone semantic space domain through a preset spatial projection matrix, transforming the high-dimensional semantic vector into low-dimensional spatial coordinates. The spatial projection matrix is a 3×512 matrix preset when the cone semantic space is constructed, and its dimensions are chosen to suit the projection. The three rows of the matrix correspond to the x-axis, y-axis, and z-axis of the cone semantic space, respectively; the parameters of each row are the projection coefficients of one coordinate axis and control the component of the normalized query request vector along that axis. The 512 columns correspond one-to-one to the 512 dimensions of the normalized query request vector, so that the two satisfy the dimension-matching requirement of matrix multiplication and the 512-dimensional query request vector can be transformed into three-dimensional coordinates in the cone semantic space. The normalized 512-dimensional query request vector is multiplied by the 3×512 spatial projection matrix following the rules of matrix multiplication: for each row of the projection matrix, the value of each dimension of the query request vector is multiplied by the parameter in the corresponding column and the products are summed, giving the coordinate components along the x-axis, y-axis, and z-axis. This yields the three-dimensional mapping coordinates of the query request vector in the cone semantic space. These coordinates correspond to a unique spatial point in the cone semantic space whose position matches the semantic features of the query request.
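The normalization and projection steps can be sketched as follows. This is a minimal illustration under stated assumptions: Min-Max normalization is taken over the components of the single query vector (the text does not say which minimum and maximum are used), and the function names `min_max_normalize` and `project_to_3d` as well as the toy projection matrix are hypothetical.

```python
def min_max_normalize(vector):
    """Scale the components of one vector linearly into [0, 1].

    Assumption: min and max are taken over this vector's own components.
    A constant vector is mapped to all zeros to avoid division by zero.
    """
    lo, hi = min(vector), max(vector)
    if hi == lo:
        return [0.0] * len(vector)
    return [(v - lo) / (hi - lo) for v in vector]


def project_to_3d(vector, projection_matrix):
    """Multiply a 3 x d projection matrix by a d-dimensional vector.

    Each row yields one coordinate: the sum of products of the row's
    coefficients with the corresponding vector components.
    """
    if len(projection_matrix) != 3:
        raise ValueError("projection matrix must have exactly 3 rows")
    if any(len(row) != len(vector) for row in projection_matrix):
        raise ValueError("row length must equal the vector dimension")
    return tuple(sum(w * v for w, v in zip(row, vector))
                 for row in projection_matrix)


if __name__ == "__main__":
    query = [float(i) for i in range(512)]       # stand-in 512-d query vector
    normalized = min_max_normalize(query)
    matrix = [[1.0 / 512] * 512 for _ in range(3)]  # toy 3 x 512 matrix
    print(project_to_3d(normalized, matrix))
```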

[0074] Step 401: Based on the mapped coordinates, calculate the Euclidean distance between the query request vector and each meteorological semantic diffusion equipotential surface in the cone semantic space domain, obtaining the spatial distance value between the query request vector and each equipotential surface; sort the spatial distance values of all meteorological semantic diffusion equipotential surfaces, select the equipotential surface with the smallest spatial distance value as the target equipotential surface, and take the diffusion path level corresponding to the target equipotential surface in the semantic diffusion path spectrum as the diffusion path level of the user query request, specifically including:

[0075] The reference coordinates of each meteorological semantic diffusion equipotential surface are determined, with the geometric center point of each equipotential surface used as its reference coordinates. The three-dimensional coordinates of all nodes on the equipotential surface are collected, and the averages of their x-coordinates, y-coordinates, and z-coordinates are calculated; the point formed by these three averages is taken as the reference coordinates of the equipotential surface. Using the Euclidean distance formula, the spatial distance between the mapping coordinates of the query request vector and the reference coordinates of each meteorological semantic diffusion equipotential surface within the cone-shaped semantic space domain is calculated one by one. Each equipotential surface corresponds to a unique spatial distance value. The smaller the distance value, the closer the semantics of the query request are to the semantic range represented by the equipotential surface, and the higher the semantic matching degree.

[0076] All spatial distance values obtained above are sorted in ascending order using the bubble sort method, which compares the distance values pairwise to ensure a correct ordering. After sorting, the meteorological semantic diffusion equipotential surface with the smallest spatial distance value is selected as the target equipotential surface, i.e., the one closest to the semantics of the user's query request. Because the equipotential surfaces were previously sorted in descending order of semantic information concentration value, each equipotential surface corresponds to a unique semantic range; selecting the target equipotential surface therefore amounts to matching the user's query semantics with a meteorological semantic range. After the target equipotential surface is identified, the semantic diffusion path spectrum is queried to find the diffusion path level to which the target equipotential surface belongs. Because the semantic diffusion path spectrum is constructed from the sorted equipotential surfaces, each equipotential surface corresponds to a unique diffusion path level, and no equipotential surface belongs to multiple levels. The diffusion path level corresponding to the target equipotential surface can therefore be located directly and is taken as the diffusion path level of the current user query request. This completes the matching between the user query request and the semantic diffusion path level.
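The selection of the target equipotential surface can be sketched as follows. The function names `centroid` and `nearest_surface`, the surface identifiers, and the sample coordinates are hypothetical. The text sorts all distances with bubble sort; this sketch simply takes the minimum, which selects the same surface without producing the full ordering.

```python
import math


def centroid(points):
    """Average the x, y, and z coordinates of a surface's nodes."""
    if not points:
        raise ValueError("a surface needs at least one node")
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def nearest_surface(query_xyz, surfaces):
    """Return (surface_id, distance) for the closest reference point.

    surfaces maps a surface identifier to the list of (x, y, z) coordinates
    of the nodes on that surface; the reference point is their centroid.
    """
    distances = {sid: math.dist(query_xyz, centroid(pts))
                 for sid, pts in surfaces.items()}
    best = min(distances, key=distances.get)
    return best, distances[best]


if __name__ == "__main__":
    example = {
        "S1": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
        "S2": [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)],
    }
    print(nearest_surface((1.5, 0.0, 0.0), example))
```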

[0077] Step 402: Based on the diffusion path hierarchy, retrieve the entity nodes covered by the corresponding diffusion path hierarchy and the associated edges between entity nodes from the optimized knowledge graph. Sort the entity nodes according to their original document sources and organize the associated edges according to semantic relationship types to form a structured evidence chain containing entity node sequences and associated relationship paths. Entity nodes include text entities, image entities, table entities, and video entities, specifically including:

[0078] The retrieval scope is the diffusion path hierarchy determined in step 401, which has a clearly defined semantic information concentration range. Only the entity nodes and associated edges within this range are retrieved, which avoids retrieving knowledge from irrelevant levels and improves retrieval efficiency. During retrieval, a dedicated retrieval interface of the meteorological multimodal knowledge graph is invoked, the identifier of the diffusion path hierarchy is input, and all entity nodes and their associated edges within that hierarchy are extracted. Entity nodes include text entities, image entities, table entities, and video entities, comprehensively covering the multimodal knowledge types of the meteorological field. Text entities include textual knowledge such as meteorological terminology, meteorological observation data, meteorological event descriptions, and meteorological disaster warnings. Image entities include knowledge in the form of images, such as satellite cloud images, meteorological radar images, meteorological disaster site images, and meteorological element change curves. Table entities include knowledge in the form of tables, such as meteorological observation data tables, meteorological statistical reports, and meteorological element comparison tables. Video entities include knowledge in the form of videos, such as meteorological forecast videos, meteorological science videos, meteorological disaster early warning videos, and meteorological observation process videos. Associated edges are the semantic relationships connecting the entity nodes, mainly causal, subordinate, associative, and comparative relationships, for example the causal relationship between the passage of a cold front and a drop in temperature, the subordinate relationship between a typhoon and a typhoon warning level, and the associative relationship between precipitation and humidity. Each associated edge corresponds to the semantic relationship between two entity nodes.

[0079] All retrieved entity nodes are sorted in order according to the original document source corresponding to each entity node. The sorting rules take into account both document type and publication time. Specifically, they are first sorted by the type of original document, in the following order: meteorological observation report source, meteorological forecast document source, meteorological academic paper source, meteorological popular science article source, and meteorological disaster early warning information source. For entity nodes with the same type of document source, they are sorted from newest to oldest according to the document publication time to ensure that the sequence of entity nodes is orderly and that the original document source of each entity node is traceable. All retrieved associated edges are categorized and organized according to the semantic relationship types between nodes. First, all associated edges are divided into four main categories: causal relationship, subordinate relationship, association relationship, and contrast relationship. Each semantic relationship category is further subdivided into specific relationship types. For example, causal relationships are divided into causing, triggering, and resulting; subordinate relationships are divided into belonging, containing, and belonging to; association relationships are divided into related, accompanying, and influencing; and contrast relationships are divided into higher than, lower than, and superior to. During the organization process, associated edges of the same semantic relationship type are grouped together, clarifying the two entity nodes connected by each associated edge and the specific semantic relationship between them, forming an association relationship set. 
At the same time, association relationship paths are constructed to clarify the semantic transmission relationship between entity nodes. After the entity nodes are sorted and the associated edges are organized, the ordered entity node sequence is integrated with the association relationship paths organized by semantic relationship type. Throughout the integration, the ordered entity node sequence serves as the basic framework, following the sorting logic and semantic association rules of the entity nodes, and each association relationship path is mapped to the relevant entity nodes in the sequence. Specifically, the two entity nodes connected by each association relationship path are first located and their positions in the ordered sequence are confirmed. The association relationship path is then embedded between the two nodes, marking the semantic association type between them (e.g., causal relationship, subordinate relationship, association relationship, etc.). Simultaneously, each entity node in the sequence is individually verified to ensure that it connects to other relevant entities through at least one association relationship path. Effective connections are established between entity nodes to eliminate isolated nodes, gradually building a complete semantic association network. After fusion, a structured evidence chain containing multimodal information is formed. This structured evidence chain is based on ordered multimodal entity nodes, covering the various knowledge carriers of the meteorological field, such as text entities, image entities, table entities, and video entities. All entity nodes retain the ordering by original document source. At the same time, the evidence chain embeds association paths organized according to semantic relationship types. 
Each path clearly marks the specific semantic relationship between entity nodes, clarifying both the type of association and the logical transmission relationship between entities. Ultimately, this structured evidence chain can completely and accurately present all meteorological knowledge related to the user's query request, clearly reflecting the inherent logical relationship between various multimodal entities.
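The ordering and grouping rules of this paragraph can be sketched as follows. The class names, field names, document type labels, and dates are hypothetical. The sketch only reports isolated nodes, whereas the embodiment goes further and establishes connections for them.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

# Document type priority as listed in the paragraph (labels are assumptions).
DOC_TYPE_ORDER = [
    "observation_report", "forecast_document", "academic_paper",
    "popular_science_article", "disaster_warning",
]

@dataclass(frozen=True)
class EntityNode:
    name: str
    modality: str    # "text", "image", "table", or "video"
    doc_type: str
    published: date

@dataclass(frozen=True)
class Edge:
    head: str
    relation: str    # "causal", "subordinate", "associative", or "comparative"
    tail: str

def build_evidence_chain(nodes, edges):
    # Sort by document type priority, then newest publication first.
    ordered = sorted(
        nodes,
        key=lambda n: (DOC_TYPE_ORDER.index(n.doc_type), -n.published.toordinal()),
    )
    # Group edges by semantic relationship type.
    grouped = defaultdict(list)
    for edge in edges:
        grouped[edge.relation].append(edge)
    # Report nodes that no edge touches (isolated nodes).
    connected = {e.head for e in edges} | {e.tail for e in edges}
    isolated = [n.name for n in ordered if n.name not in connected]
    return {"nodes": ordered, "edges_by_relation": dict(grouped), "isolated": isolated}

nodes = [
    EntityNode("temperature drop", "text", "forecast_document", date(2025, 3, 1)),
    EntityNode("cold front passage", "text", "observation_report", date(2025, 1, 10)),
    EntityNode("satellite cloud image", "image", "observation_report", date(2025, 2, 5)),
]
edges = [Edge("cold front passage", "causal", "temperature drop")]
chain = build_evidence_chain(nodes, edges)
```

Here the two observation-report nodes come first, newest first, and the satellite cloud image is flagged as isolated because no edge references it.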

[0080] This embodiment, by vectorizing user query requests and mapping them to a cone-shaped semantic space domain, can transform natural language queries into computable spatial coordinates, achieving a unified expression of query semantics and meteorological knowledge space, and improving the accuracy and stability of semantic matching. By using Euclidean distance to match the target meteorological semantic diffusion equipotential surface and locate the diffusion path hierarchy, it can quickly lock the knowledge range most relevant to the user query semantics, reduce interference from irrelevant knowledge, and improve the efficiency and targeting of knowledge retrieval. Based on the diffusion path hierarchy, multimodal entity nodes and their relationships are retrieved from the optimized knowledge graph and organized into a structured evidence chain according to source and relationship type, which can simultaneously integrate meteorological knowledge in multiple forms such as text, images, tables, and videos.

[0081] In a preferred embodiment of the present invention, step 5 includes:

[0082] Step 500: Traverse each entity node and each associated edge in the structured evidence chain. Convert each entity node into a first description statement according to its entity type and entity name. Convert each associated edge into a second description statement according to its corresponding relationship type and the head and tail entities it connects. Concatenate all the first and second description statements according to the order in which the entity nodes appear in the evidence chain to obtain the natural language evidence text corresponding to the structured evidence chain, specifically including:

[0083] All entity nodes and association paths within the structured evidence chain are traversed sequentially, following the hierarchical structure and semantic logic of the evidence chain. During the traversal, standardized natural language conversion is performed on both entity nodes and association paths, with separate conversion rules for the two types of information. For each entity node, a corresponding first description statement is generated based on its entity type and name. The entity types are fixed as four categories: text entities, image entities, table entities, and video entities, consistent with the multimodal entity types retrieved previously. During conversion, the first description statement of a text entity must state that it is a text entity and give its name; likewise, the first description statements of image entities, table entities, and video entities must state the respective entity type and name, so that each first description statement clearly and accurately reflects the entity's category and core information.

[0084] For each association path, based on the semantic relationship type (causal, subordinate, associative, comparative) and the two connected entity nodes, a corresponding second description statement is generated. The statement content must clearly and directly express the logical relationship between the two entities. For example, if the association path is cold front passage (text entity), causal relationship, temperature drop (text entity), then the second description statement is: a cold front passage will lead to a temperature drop. If the association path is typhoon (text entity), subordinate relationship, typhoon warning level (text entity), then the second description statement is: the typhoon warning level belongs to typhoon-related meteorological information. If the association path is satellite cloud image (image entity), associative relationship, precipitation area (text entity), then the second description statement is: there is a relationship between the satellite cloud image and the precipitation area, and the range of the precipitation area can be determined through the satellite cloud image.

[0085] After conversion, all first and second description statements are arranged sequentially according to the order in which entity nodes appear in the structured evidence chain. During the arrangement, the logical connection of the association paths is taken into account. When an association path connects two entity nodes, the second description statement corresponding to that path is inserted between the first description statements of the two entity nodes to ensure that the order of the statements is consistent with the semantic logic of the evidence chain. Periods are used as separators between adjacent statements to avoid confusion. At the same time, the overall text is formatted, and redundant spaces and repetitive expressions are removed. Finally, a natural language evidence text that completely corresponds to the structured evidence chain is obtained.
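A minimal sketch of the conversion in step 500 follows. The statement wording and the relation templates are assumptions made for illustration, since the embodiment gives examples but no fixed template. Each second description statement is placed directly after the earlier of its two nodes, so that it falls between the first description statements of the two connected nodes.

```python
def first_statement(name, modality):
    """First description statement: entity type plus entity name."""
    return f"{modality.capitalize()} entity: {name}"

# Hypothetical templates, one per semantic relationship type.
RELATION_TEMPLATES = {
    "causal": "{head} leads to {tail}",
    "subordinate": "{tail} belongs to {head}",
    "associative": "{head} is associated with {tail}",
    "comparative": "{head} is compared with {tail}",
}

def evidence_text(nodes, edges):
    """nodes: ordered list of (name, modality); edges: list of (head, relation, tail)."""
    position = {name: i for i, (name, _) in enumerate(nodes)}
    # Index second statements by the position of the earlier connected node.
    after = {}
    for head, relation, tail in edges:
        idx = min(position[head], position[tail])
        sentence = RELATION_TEMPLATES[relation].format(head=head, tail=tail)
        after.setdefault(idx, []).append(sentence)
    statements = []
    for i, (name, modality) in enumerate(nodes):
        statements.append(first_statement(name, modality))
        statements.extend(after.get(i, []))
    # Periods separate adjacent statements.
    return ". ".join(statements) + "."

nodes = [("cold front passage", "text"), ("temperature drop", "text")]
edges = [("cold front passage", "causal", "temperature drop")]
text = evidence_text(nodes, edges)
```

For this input the evidence text names the cold front entity, states the causal relationship, and then names the temperature drop entity.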

[0086] Step 501: Concatenate the natural language evidence text and the user query request according to a preset prompt word template. Add an evidence start identifier before the natural language evidence text and an evidence end identifier after it, and add a query start identifier before the user query request, to obtain the concatenated text. Specifically, this includes: defining the preset prompt word concatenation structure, with the concatenation order fixed as the user query request (with its query start identifier) first and the natural language evidence text (with its evidence start and end identifiers) second; adding an evidence start identifier at the beginning of the natural language evidence text to mark the start boundary of the evidence text; adding an evidence end identifier at the end of the natural language evidence text to mark its end boundary; and adding a query start identifier at the beginning of the user's original query request to mark the start boundary of the user's question. After the identifiers are added, the two parts are joined in the preset order. During concatenation, the sentences are connected smoothly and no redundant content is added. The resulting concatenated text is the enhanced prompt word that integrates the user's query intent and the meteorological evidence information.
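The concatenation in step 501 can be sketched as below. The identifier strings and the newline separator are assumptions, since the embodiment does not specify their literal form.

```python
# Hypothetical identifier strings; the embodiment does not fix their spelling.
QUERY_START = "[QUERY]"
EVIDENCE_START = "[EVIDENCE_START]"
EVIDENCE_END = "[EVIDENCE_END]"

def build_prompt(query, evidence):
    """Query (with start identifier) first, delimited evidence text second."""
    return (
        f"{QUERY_START} {query.strip()}\n"
        f"{EVIDENCE_START} {evidence.strip()} {EVIDENCE_END}"
    )

prompt = build_prompt(
    "Will the temperature drop tomorrow?",
    "Text entity: cold front passage. cold front passage leads to temperature drop.",
)
```

Stripping surrounding whitespace before joining keeps the boundaries unambiguous and avoids adding redundant characters.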

[0087] Step 502: The concatenated text is used as an enhanced prompt word that incorporates evidence information. This enhanced prompt word is input into a pre-trained large language model to perform semantic understanding and context modeling. Based on the natural language evidence text in the enhanced prompt word, the corresponding answer text is generated, resulting in the original response text output by the pre-trained large language model. Specifically, this includes:

[0088] The Transformer architecture was selected as the basic architecture for the pre-trained large language model. A network structure containing 24 encoder layers and 12 attention heads was built. The input layer dimension was set to 512 dimensions, the hidden layer dimension to 2048 dimensions, and the output layer dimension to 512 dimensions. A positional encoding module was added to the model input layer to capture the sequence order information of the text and avoid order confusion during semantic understanding. A layer normalization module and a linear mapping layer were added to the output layer. In addition, an attention mechanism optimization module was added to enhance the model's attention to meteorological terminology and entity relationships, thereby improving the accuracy of semantic understanding. Various types of text data related to the meteorological field were collected to construct a dataset for model training. The data covers meteorological observation reports, short-term and medium-to-long-term meteorological forecast documents, meteorological academic papers, meteorological popular science articles, meteorological disaster early warning information, meteorological question-and-answer samples, and meteorological multimodal entity description texts, ensuring that the dataset can cover various semantic and question-and-answer scenarios in the meteorological field. The collected raw text data undergoes multi-step preprocessing. The first step is deduplication, removing duplicate text data to avoid redundant information interference during training. The second step is noise reduction, removing text containing garbled characters, invalid characters, and irrelevant content while retaining core semantic information. 
The third step is word segmentation, using meteorological-specific word segmentation tools to accurately identify meteorological terms such as cold front, warm front, pressure field, precipitation probability, and typhoon path, avoiding missegmentation of meteorological terms by general word segmentation tools. The fourth step is stop word removal, filtering and removing general stop words that do not contribute semantically, while retaining meteorological-specific function words to avoid deleting key semantic terms and ensure semantic integrity. The fifth step is text annotation, mapping answers to evidence sources for meteorological question-and-answer samples and annotating semantic relationships in meteorological texts. After preprocessing, all text data is converted into sequence data conforming to the model input format and divided into training, validation, and test sets in an 8:1:1 ratio for model training, performance validation, and accuracy testing, respectively.
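The 8:1:1 division into training, validation, and test sets could be done along the following lines. Shuffling with a fixed seed before splitting is an assumption, as the paragraph states only the ratio.

```python
import random

def split_dataset(samples, seed=0):
    """Split samples into training, validation, and test sets in an 8:1:1 ratio."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # assumed: shuffle before splitting
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train_set = items[:n_train]
    val_set = items[n_train:n_train + n_val]
    test_set = items[n_train + n_val:]  # remainder goes to the test set
    return train_set, val_set, test_set

train_set, val_set, test_set = split_dataset(range(100))
```

With 100 samples this gives 80, 10, and 10 items, and every sample lands in exactly one of the three sets.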

[0089] A combination of self-supervised and supervised learning was used to pre-train the constructed model. The core objective was to enable the model to master meteorological terminology, semantic association patterns, question-and-answer logic, and the correspondence between evidence and answers. The specific pre-training process was as follows: three core pre-training tasks were selected: masked language modeling, sentence order prediction, and question-and-answer matching. The masked language modeling task was used to train the model's semantic understanding and prediction ability of meteorological vocabulary. 15% of the words in the input text were randomly masked, and the model was asked to predict the masked words based on the context semantics, focusing on improving the prediction accuracy of meteorological terminology. The sentence order prediction task was used to train the model's ability to capture the logical relationships in meteorological text. Two related meteorological texts were input into the model in shuffled order, and the model was asked to judge whether the text order was correct. The question-and-answer matching task was used to train the model's ability to recognize the association between query requests, evidence texts, and answer texts. The user query, evidence text, and answer text were combined, and the model was asked to judge the degree of matching among the three. The learning rate was set to 1e-5, the batch size to 64, and the training epochs to 80. The AdamW optimizer was used to optimize the model parameters and reduce the loss value during the training process. After each round of training, the performance of the model is verified using a validation set. Semantic understanding accuracy and question-answering matching accuracy are used as the core verification metrics. 
When the two metrics on the validation set no longer improve for 5 consecutive rounds, pre-training is stopped, and all parameters of the model at this point are saved to obtain the large language model that has been pre-trained.
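The stopping rule can be sketched as follows. It reads "no longer improve" as meaning that neither metric exceeds its previous best value for 5 consecutive rounds, which is an interpretation rather than something the text states. The accuracy values are hypothetical.

```python
def should_stop(history, patience=5):
    """history: list of (understanding_accuracy, qa_matching_accuracy) per round.

    Returns True once neither metric has improved on its previous best
    for `patience` consecutive rounds.
    """
    best = [float("-inf"), float("-inf")]
    stale = 0
    for metrics in history:
        improved = False
        for i, value in enumerate(metrics):
            if value > best[i]:
                best[i] = value
                improved = True
        stale = 0 if improved else stale + 1
        if stale >= patience:
            return True
    return False

still_improving = [(0.80, 0.70), (0.82, 0.71), (0.83, 0.72)]
plateau = [(0.90, 0.85)] * 6  # first round sets the best, then 5 stale rounds
```

Calling `should_stop(plateau)` returns `True`, while `should_stop(still_improving)` returns `False`.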

[0090] We collected user query samples, corresponding evidence texts, and standard answers in the meteorological field, covering various types such as weather forecast queries, meteorological disaster queries, meteorological data queries, and meteorological terminology queries. Each sample was labeled with the query request, evidence text, standard answer, and the correspondence between evidence and answer. During fine-tuning training, the parameters of the first 16 layers of the model were frozen (preserving the basic semantic features and meteorological professional knowledge learned in the pre-training stage), and only the last 8 layers of the encoder and output layer were trained to avoid overfitting caused by overtraining. The learning rate was set to 5e-6 (lower than the pre-training learning rate to avoid parameter oscillation), the batch size was 32, and the training epochs were 30. The cross-entropy loss function was used to calculate the training loss. After each training epoch, the model's performance was tested using a test set, focusing on testing the accuracy, coherence, and professionalism of the model's answers generated based on evidence text, ensuring that the answer generation accuracy was not less than 96%, and that the output text conformed to the professional expression standards of the meteorological field. After fine-tuning, the final model parameters were saved, resulting in a pre-trained large language model adapted to the meteorological field and capable of accurately generating answers based on evidence text.

[0091] The enhanced prompts obtained in step 501 are input into the pre-trained large language model that has been constructed and trained. The answer generation process is then initiated. The model first performs global semantic understanding of the enhanced prompts. Through an attention mechanism, it quickly identifies the query start identifier, evidence start identifier, and evidence end identifier, distinguishing between the user's query request and the natural language evidence text. The model performs contextual association modeling on the natural language evidence text and the user's query request, deeply mining the core intent of the user's query, such as querying the precipitation situation in a certain region or the impact range of a certain meteorological disaster. It accurately locates the relevant evidence content in the natural language evidence text, including the corresponding entity node information and the logical relationship between them, ensuring that the answer generation strictly relies on the evidence text and does not deviate from the core information. After the modeling is completed, the model generates an answer text that accurately matches the user's query based on the meteorological knowledge, entity relationships, and logical structure in the evidence text, combined with the professional expression standards in the meteorological field. The answer content must completely cover the core query needs, be logically clear, and be rigorously expressed, while avoiding the introduction of external irrelevant knowledge, ensuring that the answer content is completely consistent with the source of evidence. After generation, the model outputs the original response text containing the answer content. The original response text only retains the answer-related content and does not contain any extra formatting information or redundant expressions.

[0092] Step 503: Extract the answer text portion from the original response text output by the pre-trained large language model, and extract from the structured evidence chain, as evidence source annotations, the original document location of the entity node referenced by each answer text fragment and the original document source of each associated edge; associate and store the answer text with the corresponding evidence source annotations to generate an interpretable meteorological question-and-answer pair containing the answer content and the original document source annotation corresponding to each piece of answer content. For video entities, the original document location also includes the timestamp and spatial region of the video keyframe; for image entities, it includes the region coordinates in the image. Specifically, this includes:

[0093] Text segmentation and semantic extraction algorithms are used to extract the valid answer text, and redundant spaces, irrelevant expressions, formatting marks, and other invalid content are removed. During parsing, the complete answer text is split into multiple key fragments, each corresponding to one core knowledge point. For example, the statement "Moderate to heavy rain in a certain city in the next three days, temperature will drop by 5 to 8℃, and precipitation will be concentrated on the night of the 17th" is split into three key fragments: "Moderate to heavy rain in a certain city in the next three days", "Temperature will drop by 5 to 8℃ in a certain city in the next three days", and "Precipitation will be concentrated on the night of the 17th in a certain city". For each key fragment of the answer, the process traces back to the previously constructed structured evidence chain. A semantic matching algorithm is used to locate the entity nodes and association paths referenced by the fragment: the key fragment is semantically compared with the entity node names and association path descriptions in the structured evidence chain, and the semantic similarity is calculated. When the semantic similarity is not less than 85%, the answer fragment is determined to reference the corresponding entity node and association path. After the match is located, the corresponding original document source information is extracted. This information is divided into original document location and original document type. 
The original document location uses different annotation standards depending on the entity type, as follows: For text entities, the original document location is annotated with the corresponding original document name, page number, and paragraph position; for table entities, it is annotated with the corresponding original document name, page number, and table number; for image entities, it is annotated with the corresponding original document name, image number, and region coordinates in the image (using two-dimensional coordinates); and for video entities, it is annotated with the corresponding original document name, video number, keyframe timestamp, and spatial region (using the same two-dimensional coordinates as for image entities).
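The per-type location annotations and the 85% similarity threshold can be illustrated with the sketch below. The field names, the rectangle representation of a region, and the precomputed similarity scores are assumptions. The similarity function is passed in as a parameter because the embodiment does not name a specific semantic matching algorithm.

```python
# Required annotation fields per entity type, following the paragraph above.
REQUIRED_FIELDS = {
    "text": ("document", "page", "paragraph"),
    "table": ("document", "page", "table_number"),
    "image": ("document", "image_number", "region"),
    "video": ("document", "video_number", "keyframe_timestamp", "region"),
}

def make_location(modality, **fields):
    """Build a source-location annotation, checking the fields the type requires."""
    required = REQUIRED_FIELDS[modality]
    missing = [name for name in required if name not in fields]
    if missing:
        raise ValueError(f"{modality} location is missing: {missing}")
    return {"modality": modality, **{name: fields[name] for name in required}}

def match_fragment(fragment, candidates, similarity, threshold=0.85):
    """Return the candidates whose similarity to the fragment reaches the threshold."""
    return [name for name in candidates if similarity(fragment, name) >= threshold]

# Hypothetical video annotation; region assumed as (x_min, y_min, x_max, y_max).
loc = make_location(
    "video", document="forecast_briefing", video_number=2,
    keyframe_timestamp=12.5, region=(10, 20, 200, 160),
)

# Hypothetical precomputed similarity scores standing in for a semantic model.
scores = {"temperature drop": 0.91, "typhoon": 0.12}
matched = match_fragment(
    "temperature will drop by 5 to 8℃", scores, lambda frag, name: scores[name]
)
```

Validating required fields at construction time makes an incomplete annotation, such as an image location without region coordinates, fail early instead of reaching the stored question-and-answer pair.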

[0094] The original document source of the association path is marked as the intersection of the original document sources corresponding to the two entity nodes it connects. After extraction, the extracted answer text (including all key fragments) is associated one by one with the original document source annotations corresponding to each answer content, clarifying the entity node, association path, and original document location corresponding to each key fragment of the answer, avoiding annotation confusion and association errors. Finally, the answer text and evidence source annotations are associated and stored according to a fixed structure. The storage structure follows the format of key fragments of the answer and corresponding evidence source annotations. Multiple key fragments are arranged in the original order of the answer text. Finally, an interpretable meteorological question-and-answer pair is generated that contains both the answer content and the original document source annotations corresponding to each answer content. This question-and-answer pair can not only clearly present the answer queried by the user, but also allow the user to quickly trace the original source of each answer fragment.
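The storage structure and the intersection rule for association paths can be sketched as follows. The dictionary layout, document names, and fragment texts are hypothetical.

```python
def edge_sources(head_sources, tail_sources):
    """Source of an association path: documents shared by both connected nodes."""
    return set(head_sources) & set(tail_sources)

def build_qa_pair(question, fragments):
    """fragments: list of (fragment_text, annotations) in original answer order."""
    return {
        "question": question,
        "answer": [
            {"fragment": text, "sources": annotations}
            for text, annotations in fragments
        ],
    }

# Hypothetical document sources of the two nodes connected by one path.
shared = edge_sources({"report_A", "forecast_B"}, {"forecast_B", "paper_C"})

qa = build_qa_pair(
    "What is the weather outlook for the next three days?",
    [
        ("Moderate to heavy rain in the next three days", ["forecast_B, page 2, paragraph 1"]),
        ("Temperature will drop by 5 to 8℃", ["forecast_B, page 2, paragraph 3"]),
    ],
)
```

Keeping the fragments in a list preserves the original order of the answer text, and each fragment carries its own source annotations.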

[0095] This embodiment transforms structured evidence chains into standardized natural language evidence text, enabling pre-trained large language models to more efficiently understand meteorological knowledge structures and logical relationships between entities, thus improving the accuracy and coherence of answer generation. By using identifiers to separate evidence and queries to construct enhanced prompts, the model's generation scope is effectively constrained, preventing the introduction of irrelevant information or erroneous inferences, thereby improving the reliability of meteorological question-and-answer systems. Answer generation based on a pre-trained large language model in the meteorological domain, combined with multimodal entity and relational path information, outputs complete, logically clear, and professionally accurate responses. Matching the answer text with corresponding original document source annotations, including fine-grained location information such as image region coordinates and video timestamps, ensures that the question-and-answer results are interpretable and traceable, enhancing the credibility of the meteorological question-and-answer system.

[0096] As shown in Figure 2, embodiments of the present invention also provide a system for scalable generation of interpretable question-answer pairs for vertical domains, comprising:

[0097] The knowledge graph generation module is used to construct a cross-modal heterogeneous meteorological knowledge graph. It performs heterogeneous graph neural network modeling on the cross-modal heterogeneous meteorological knowledge graph, learns the embedding vector of each node, obtains a knowledge graph with node embedding vectors, and performs semantic community partitioning of entities based on the embedding vectors to generate an optimized knowledge graph.

[0098] The spatial domain construction module is used to perform gradient field analysis on the embedding vectors of each node in the optimized knowledge graph, determine the semantic gradient of each node in different semantic community directions, and identify mutation nodes as semantic boundary anchors. With the semantic boundary anchors as vertices and the distribution centroid of the core entity group with the highest semantic relevance in the optimized knowledge graph as the base point, a hierarchical cone-shaped semantic spatial domain is constructed.

[0099] The spectrum generation module is used to simulate the propagation process of semantic information in the cone-shaped semantic space domain according to the thermal diffusion equation, generate meteorological semantic diffusion equipotential surfaces radiating from the core area to the periphery, and establish a semantic diffusion path spectrum based on the hierarchical division of the meteorological semantic diffusion equipotential surfaces.

[0100] The evidence chain acquisition module is used to map user query requests to a cone-shaped semantic space domain, calculate the spatial distance between the query request vector and each meteorological semantic diffusion equipotential surface, determine the diffusion path level to which the user query request belongs, retrieve entities and inter-entity association information that match the diffusion path level from the optimized knowledge graph, and construct a structured evidence chain representing the reasoning path.

[0101] The question-answer pair generation module is used to integrate the structured evidence chain with the user query request and input it into the pre-trained large language model to generate interpretable meteorological question-answer pairs.

[0102] It should be noted that this system is a system corresponding to the above method. All implementation methods in the above method embodiments are applicable to this embodiment and can achieve the same technical effect.

[0103] The above description represents the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principles of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.

Claims

1. A method for scalable generation of interpretable question-answer pairs for vertical domains, characterized in that the method includes: Step 1: Construct a cross-modal heterogeneous meteorological knowledge graph; perform heterogeneous graph neural network modeling on the cross-modal heterogeneous meteorological knowledge graph, learn the embedding vector of each node, obtain a knowledge graph with node embedding vectors, and perform semantic community partitioning of entities based on the embedding vectors to generate an optimized knowledge graph; Step 2: For each node in the optimized knowledge graph, calculate the difference between the embedding vector of the corresponding node and the center vector of the adjacent semantic community to obtain the difference vector; use the projection of the difference vector in a preset direction as the semantic gradient of the corresponding node pointing to the semantic community adjacent to the corresponding node, where the adjacent semantic community refers to the semantic community to which the neighboring nodes directly connected to the corresponding node through an edge belong; traverse all nodes in the optimized knowledge graph, and determine the nodes whose semantic gradient vectors point to two or more different semantic communities and whose gradient magnitudes all exceed a preset threshold as semantic boundary anchors. 
The semantic boundary anchors are located in the common boundary region of the multiple semantic communities pointed to by the semantic boundary anchors; calculate the geometric mean of the embedding vectors of all nodes in the core entity group with the highest semantic relevance in the optimized knowledge graph to obtain the base point coordinates; use the embedding vector of each semantic boundary anchor as the vertex coordinates, and connect the base point and all vertices to form a cone skeleton; Nodes in the optimized knowledge graph, excluding those in the core entity group and semantic boundary anchors, are treated as ordinary nodes. Ordinary nodes are assigned to different levels of the cone skeleton based on their embedding distance from the base point, so as to form a cone semantic space domain with the base point as the center, the vertices as the boundaries, and radially layered distribution in the embedding space. Step 3: Simulate the propagation process of semantic information in the cone-shaped semantic space domain according to the thermal diffusion equation, generate meteorological semantic diffusion equipotential surfaces radiating from the core area to the periphery, and establish a semantic diffusion path spectrum based on the hierarchical division of the meteorological semantic diffusion equipotential surfaces. Step 4: Map the user query request to the cone semantic space domain, calculate the spatial distance between the query request vector and each meteorological semantic diffusion equipotential surface, determine the diffusion path level to which the user query request belongs, retrieve entities and inter-entity association information that match the diffusion path level from the optimized knowledge graph, and construct a structured evidence chain representing the reasoning path. 
Step 5: After fusing the structured evidence chain with the user query request, input it into the pre-trained large language model to generate interpretable meteorological question-answer pairs.
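The boundary-anchor test in Step 2 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that community centers are arithmetic means of member embeddings and that the preset projection direction is the unit vector from the node toward the community center, so that the gradient magnitude reduces to the Euclidean distance to that center. The function and parameter names (`find_boundary_anchors`, `community_of`, `threshold`) are hypothetical.

```python
import math


def centroid(vectors):
    """Arithmetic mean of equal-length vectors (assumed community center)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def find_boundary_anchors(embeddings, community_of, edges, threshold):
    """Return nodes adjacent to >= 2 communities whose gradient magnitudes
    toward all of those communities exceed the threshold.

    Assumption: the gradient magnitude toward a community is the Euclidean
    distance from the node embedding to the community center.
    """
    # Group embeddings by community and compute one center per community.
    members = {}
    for node, comm in community_of.items():
        members.setdefault(comm, []).append(embeddings[node])
    centers = {comm: centroid(vecs) for comm, vecs in members.items()}

    # Build an undirected neighbor map from the edge list.
    neighbors = {node: set() for node in embeddings}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    anchors = []
    for node, vec in embeddings.items():
        # Adjacent communities are those of directly connected neighbors.
        adjacent = {community_of[n] for n in neighbors[node]}
        magnitudes = [math.dist(vec, centers[c]) for c in adjacent]
        if len(adjacent) >= 2 and all(m > threshold for m in magnitudes):
            anchors.append(node)
    return sorted(anchors)
```

In this sketch a node that touches two communities but sits close to one community center is not reported as an anchor, because the claim requires every gradient magnitude to exceed the threshold.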

2. The method for scalable generation of interpretable question-answer pairs for vertical domains according to claim 1, characterized in that constructing the cross-modal heterogeneous meteorological knowledge graph comprises: processing the original meteorological technical documents into multimodal blocks to obtain meteorological text blocks, meteorological image blocks, meteorological table blocks, and meteorological video keyframe blocks; performing intramodal information extraction on each of the meteorological text blocks, meteorological image blocks, meteorological table blocks, and meteorological video keyframe blocks, wherein meteorological entities and the relationships between them are extracted from the meteorological text blocks, image meteorological entities and the visual relationships between image entities are extracted from the meteorological image blocks, table meteorological entities and the row and column relationships within tables are extracted from the meteorological table blocks, and video meteorological entities, spatiotemporal relationships, and dynamic evolution relationships between video entities are extracted from the meteorological video keyframe blocks; and performing semantic alignment and cross-modal association on the meteorological entities extracted from the different modalities to construct the cross-modal heterogeneous meteorological knowledge graph.

3. The method for scalable generation of interpretable question-answer pairs for vertical domains according to claim 2, characterized in that performing heterogeneous graph neural network modeling on the cross-modal heterogeneous meteorological knowledge graph, learning the embedding vector of each node to obtain a knowledge graph with node embedding vectors, and performing semantic community partitioning of entities based on the embedding vectors to generate an optimized knowledge graph comprises: The cross-modal heterogeneous meteorological knowledge graph is formalized into a heterogeneous graph containing multiple node types and multiple edge types, and a feature vector is initialized for each node. The heterogeneous graph with initialized feature vectors is then input into a heterogeneous graph attention network to aggregate the neighbor node information of each node. During aggregation, the attention mechanism is used to calculate the attention weight of each neighbor node with respect to the central node under each relation type, and the feature vectors of the neighbor nodes are weighted and aggregated according to the attention weights to obtain the aggregated features. The aggregated features are fused with the feature vector of the central node to update the embedding vector of the central node in the current layer. After several iterations, the final embedding vector of each node, which fuses structural and semantic information, is obtained. The final embedding vectors are clustered, and nodes whose embedding vector distance is less than the cluster radius are assigned to the same semantic community. Each node is labeled with the identifier of the semantic community to which it belongs, finally yielding an optimized knowledge graph with semantic community labels.
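The radius-based community assignment at the end of this claim can be sketched as follows. The claim does not name a clustering algorithm, so this example assumes a simple greedy "leader" scheme: each embedding joins the first community whose leader lies within the cluster radius, and otherwise founds a new community. The function name `radius_cluster` and the greedy strategy are illustrative assumptions, not the claimed procedure.

```python
import math


def radius_cluster(embeddings, radius):
    """Assign each embedding a community id using greedy leader clustering.

    Assumption: a node joins the first community whose leader (the first
    member of that community) is closer than `radius`; otherwise it starts
    a new community. Results depend on input order.
    """
    leaders = []  # one representative embedding per community
    labels = []   # community id for each input embedding, in order
    for vec in embeddings:
        for community_id, leader in enumerate(leaders):
            if math.dist(vec, leader) < radius:
                labels.append(community_id)
                break
        else:
            # No existing leader is close enough: open a new community.
            leaders.append(vec)
            labels.append(len(leaders) - 1)
    return labels
```

A density-based or centroid-based method could be substituted without changing the interface; the only property taken from the claim is that nodes closer than the cluster radius end up with the same community label.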

4. The method for scalable generation of interpretable question-answer pairs for vertical domains according to claim 3, characterized in that assigning the ordinary nodes to different level intervals of the cone-shaped skeleton according to their embedding distance from the base point comprises: Calculate the Euclidean distance between the embedding vector of each ordinary node in the optimized knowledge graph and the coordinates of the base point to obtain the embedding distance value of each ordinary node. Obtain the minimum and maximum distance values among all ordinary node embedding distance values, and divide the interval from the minimum to the maximum distance value equally according to the preset number of radial levels to obtain several continuous and non-overlapping level distance intervals, where the lower limit and upper limit of each level distance interval constitute the boundary thresholds of that level. Iterate through each ordinary node, compare the embedding distance value of the ordinary node with the boundary thresholds of each level distance interval, and assign the ordinary node to the radial level corresponding to the level distance interval to which its embedding distance value belongs.
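The equal-width radial layering described in this claim can be sketched as below. The sketch assumes that embedding distances have already been computed, that levels are numbered from 0 at the innermost interval, and that the node at the maximum distance is placed in the outermost level. The name `assign_radial_levels` is hypothetical.

```python
def assign_radial_levels(distances, num_levels):
    """Map each node to a radial level using equal-width distance intervals.

    `distances` maps node id -> embedding distance to the base point.
    Level 0 is the interval nearest the base point (an assumed convention).
    """
    lo = min(distances.values())
    hi = max(distances.values())
    width = (hi - lo) / num_levels

    levels = {}
    for node, d in distances.items():
        if width == 0:
            # All nodes are equidistant from the base point.
            level = 0
        else:
            # Clamp so the maximum distance falls in the last level.
            level = min(int((d - lo) / width), num_levels - 1)
        levels[node] = level
    return levels
```

For distances 1.0, 2.0, 3.0, and 5.0 with four levels, the interval width is 1.0 and the nodes land in levels 0, 1, 2, and 3 respectively.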

5. The method for scalable generation of interpretable question-answer pairs for vertical domains according to claim 4, characterized in that Step 3 comprises: The base point in the cone-shaped semantic space domain is set as the heat diffusion source point. The embedding distance from each node in the cone-shaped semantic space domain to the base point is used as the spatial position variable. The heat diffusion coefficient and propagation time parameters are set, and the heat diffusion equation is solved to obtain the semantic information concentration value at any position in the cone-shaped semantic space domain. In the cone-shaped semantic space domain, all spatial point sets with equal semantic information concentration values are extracted, and each spatial point set is fitted into a continuous surface as a meteorological semantic diffusion equipotential surface. All meteorological semantic diffusion equipotential surfaces are sorted from high to low according to their corresponding semantic information concentration values, and the concentration values corresponding to adjacent meteorological semantic diffusion equipotential surfaces are taken as the concentration boundaries of each level. The concentration range between concentration boundaries is defined as the concentration coverage interval of the corresponding level, so as to establish a semantic diffusion path spectrum from the base point outward through each diffusion level to the outermost meteorological semantic diffusion equipotential surface. The semantic diffusion path spectrum contains multiple diffusion paths from the base point to the nodes covered by each diffusion level.
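The concentration computation and level division in this claim can be sketched under simplifying assumptions. The claim does not state boundary conditions or dimensionality, so this example assumes an instantaneous unit point source at the base point and uses the one-dimensional fundamental solution of the heat equation, u(r, t) = exp(-r^2 / (4Dt)) / sqrt(4*pi*D*t), with the embedding distance r as the spatial variable. The names `concentration` and `diffusion_level` and the level numbering are illustrative.

```python
import math


def concentration(r, diffusion_coeff, time):
    """Semantic information concentration at distance r from the base point.

    Assumption: 1D fundamental solution of the heat equation for a unit
    point source released at the base point at time zero.
    """
    denom = math.sqrt(4.0 * math.pi * diffusion_coeff * time)
    return math.exp(-r * r / (4.0 * diffusion_coeff * time)) / denom


def diffusion_level(value, boundaries):
    """Return the diffusion level of a concentration value.

    `boundaries` holds the concentration values of the equipotential
    surfaces sorted from high to low. Level 0 lies inside the innermost
    surface; each boundary the value falls below adds one level.
    """
    return sum(1 for b in boundaries if value < b)
```

Because the assumed solution decreases monotonically with r, each equipotential surface corresponds to a single radius, and sorting surfaces by concentration from high to low is the same as sorting them by radius from the base point outward.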

6. The method for scalable generation of interpretable question-answer pairs for vertical domains according to claim 5, characterized in that Step 4 comprises: The user query request is input into a pre-trained language encoder for vectorization to obtain the query request vector corresponding to the user query request, and the query request vector is projected into the embedding space of the cone-shaped semantic space domain to obtain the mapped coordinates of the query request vector in the cone-shaped semantic space domain. Based on the mapped coordinates, the Euclidean distance between the query request vector and each meteorological semantic diffusion equipotential surface in the cone-shaped semantic space domain is calculated to obtain the spatial distance value between the query request vector and each meteorological semantic diffusion equipotential surface. The spatial distance values of all meteorological semantic diffusion equipotential surfaces are sorted, and the meteorological semantic diffusion equipotential surface corresponding to the smallest spatial distance value is selected as the target equipotential surface. The diffusion path level corresponding to the target equipotential surface in the semantic diffusion path spectrum is determined as the diffusion path level of the user query request. Based on the diffusion path level, the entity nodes covered by that diffusion path level and the associated edges between those entity nodes are retrieved from the optimized knowledge graph. The entity nodes are sorted according to their original document sources, and the associated edges are organized according to their semantic relationship types, to form a structured evidence chain containing entity node sequences and associated relationship paths. The entity nodes include text entities, image entities, table entities, and video entities.
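Selecting the target equipotential surface can be sketched as follows. The claim does not specify how the distance from a point to a fitted surface is computed. This example assumes each equipotential surface is a sphere centered on the base point, so the distance from the mapped query coordinates to a surface of radius r is the absolute difference between the query's distance to the base point and r. The name `nearest_surface` is hypothetical.

```python
import math


def nearest_surface(query_coords, base_point, surface_radii):
    """Return (index, distance) of the equipotential surface nearest the query.

    Assumption: surfaces are spheres around the base point, so the distance
    from the query to a surface is | ||query - base|| - radius |.
    """
    query_radius = math.dist(query_coords, base_point)
    distances = [abs(query_radius - r) for r in surface_radii]
    # Pick the surface with the smallest spatial distance value.
    index = min(range(len(distances)), key=distances.__getitem__)
    return index, distances[index]
```

The returned index would then be looked up in the semantic diffusion path spectrum to obtain the diffusion path level of the query; that lookup depends on how the spectrum is stored and is not shown.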

7. The method for scalable generation of interpretable question-answer pairs for vertical domains according to claim 6, characterized in that Step 5 comprises: Traverse each entity node and each associated edge in the structured evidence chain, convert each entity node into a first description statement according to its entity type and entity name, and convert each associated edge into a second description statement according to its relationship type and the head and tail entities it connects. Concatenate all the first and second description statements in the order in which the entity nodes appear in the evidence chain to obtain the natural language evidence text corresponding to the structured evidence chain. Concatenate the natural language evidence text and the user query request according to a preset prompt template: an evidence start identifier is added before the natural language evidence text, an evidence end identifier is added after the natural language evidence text, and a query start identifier is added before the user query request, to obtain the concatenated text. The concatenated text is used as an enhanced prompt that incorporates evidence information. The enhanced prompt is input into the pre-trained large language model, which performs semantic understanding and context modeling and generates the corresponding answer text based on the natural language evidence text in the enhanced prompt, yielding the original response text output by the pre-trained large language model. Extract the answer text portion from the original response text, and extract from the structured evidence chain the original document location of the entity node referenced by each answer text fragment and the original document source of each associated edge as evidence source annotations. Associate and store the answer text with the corresponding evidence source annotations to generate an interpretable meteorological question-answer pair containing the answer content and the original document source annotation corresponding to each piece of answer content.
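The construction of the evidence-enhanced prompt can be sketched as below. The claim does not give the wording of the description statements or the identifiers, so the statement formats and the marker strings `[EVIDENCE_START]`, `[EVIDENCE_END]`, and `[QUERY_START]` are hypothetical placeholders. The sketch also assumes that each edge statement is emitted directly after the statement of its head entity. The call to the language model is omitted.

```python
# Hypothetical identifiers; the claim only requires that such markers exist.
EVIDENCE_START = "[EVIDENCE_START]"
EVIDENCE_END = "[EVIDENCE_END]"
QUERY_START = "[QUERY_START]"


def describe_node(node):
    """First description statement: built from entity type and entity name."""
    return f"{node['type']} entity: {node['name']}."


def describe_edge(edge):
    """Second description statement: built from relation, head, and tail."""
    return f"{edge['head']} {edge['relation']} {edge['tail']}."


def build_prompt(nodes, edges, query):
    """Assemble the enhanced prompt from an evidence chain and a user query.

    Statements follow the order in which entity nodes appear in the chain;
    each edge statement is placed after its head entity (an assumption).
    """
    statements = []
    for node in nodes:
        statements.append(describe_node(node))
        for edge in edges:
            if edge["head"] == node["name"]:
                statements.append(describe_edge(edge))
    evidence_text = " ".join(statements)
    return "\n".join(
        [EVIDENCE_START, evidence_text, EVIDENCE_END, QUERY_START, query]
    )
```

Keeping the evidence between explicit start and end markers makes it straightforward to trace answer fragments back to the entity nodes and edges that were supplied, which is what the evidence source annotation step relies on.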

8. The method for scalable generation of interpretable question-answer pairs for vertical domains according to claim 7, characterized in that the original document location further includes: for video entities, the timestamps of the video keyframes and the spatial regions within them; and for image entities, the coordinates of regions within the image.

9. A system for scalable generation of interpretable question-answer pairs for vertical domains, the system implementing the method as described in any one of claims 1 to 8, characterized in that the system comprises:
A knowledge graph generation module, used to construct a cross-modal heterogeneous meteorological knowledge graph; perform heterogeneous graph neural network modeling on the cross-modal heterogeneous meteorological knowledge graph to learn the embedding vector of each node, obtaining a knowledge graph with node embedding vectors; and perform semantic community partitioning of entities based on the embedding vectors to generate an optimized knowledge graph.
A spatial domain construction module, used to calculate, for each node in the optimized knowledge graph, the difference between the embedding vector of the node and the center vector of the adjacent semantic community to obtain a difference vector, and to use the projection of the difference vector in a preset direction as the semantic gradient of the node pointing to that adjacent semantic community, where the adjacent semantic community is the semantic community to which a neighboring node directly connected to the node belongs. All nodes in the optimized knowledge graph are traversed, and nodes whose semantic gradient vectors point to two or more different semantic communities and whose gradient magnitudes all exceed a preset threshold are determined as semantic boundary anchors; each semantic boundary anchor is located in the common boundary region of the semantic communities it points to. The geometric mean of the embedding vectors of all nodes in the core entity group with the highest semantic relevance in the optimized knowledge graph is calculated to obtain the base point coordinates. The embedding vector of each semantic boundary anchor is used as vertex coordinates, and the base point and all vertices are connected to form a cone-shaped skeleton. Nodes in the optimized knowledge graph other than those in the core entity group and the semantic boundary anchors are treated as ordinary nodes; according to the embedding distance of each ordinary node from the base point, the ordinary nodes are assigned to different level intervals of the cone-shaped skeleton to form a cone-shaped semantic space domain with the base point as the center, the vertices as the boundaries, and a radially layered distribution in the embedding space.
A spectrum generation module, used to simulate the propagation process of semantic information in the cone-shaped semantic space domain according to the heat diffusion equation, generate meteorological semantic diffusion equipotential surfaces radiating from the core area to the periphery, and establish a semantic diffusion path spectrum based on the hierarchical division of the meteorological semantic diffusion equipotential surfaces.
An evidence chain acquisition module, used to map the user query request to the cone-shaped semantic space domain, calculate the spatial distance between the query request vector and each meteorological semantic diffusion equipotential surface, determine the diffusion path level to which the user query request belongs, retrieve entities and inter-entity association information that match the diffusion path level from the optimized knowledge graph, and construct a structured evidence chain representing the reasoning path.
A question-answer pair generation module, used to fuse the structured evidence chain with the user query request and input the result into a pre-trained large language model to generate interpretable meteorological question-answer pairs.