Method and device for constructing insurance claim rate prediction model based on causal graph federated learning
Using causal graph federated learning, together with feature alignment of structured and unstructured data and blockchain technology, the method constructs an insurance odds prediction model that supports multi-party privacy-preserving participation. The model addresses the accuracy and stability issues in cross-institutional data collaboration and achieves dynamic, adaptive insurance odds prediction.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING MICROCHIP EDGE COMPUTING RES INST
- Filing Date
- 2026-03-04
- Publication Date
- 2026-06-12
AI Technical Summary
Existing insurance odds prediction methods have poor accuracy in cross-institutional data collaboration, lack causal explanations, exhibit poor stability when data distribution changes, and lack automated performance monitoring and iterative triggering mechanisms.
The method employs causal graph-based federated learning. Each participant acquires local structured policy data and unstructured claims text, extracts numerical and semantic features, and aligns and fuses them in a shared latent space. A neural network prediction model is built from the global causal graph adjacency matrix distributed over the blockchain and trained locally; parameter updates are privacy-protected and aggregated through the blockchain.
It improves the interpretability and accuracy of insurance odds predictions and enables privacy protection and dynamic adaptation capabilities through cross-institutional data collaboration.
Description
Technical Field
[0001] This application relates to the field of insurance odds prediction technology, specifically to a method and apparatus for constructing an insurance odds prediction model based on causal graph federated learning.
Background Technology
[0002] In the field of InsurTech, cross-institutional data collaboration is crucial for improving the accuracy of risk models. However, traditional methods (statistical models or general federated learning-based risk prediction methods) face serious challenges. First, the data samples of a single institution are limited and cover a single modality, making it difficult to characterize risks comprehensively. Second, while federated learning can alleviate data silos, its general framework lacks an effective mechanism for integrating structured data and unstructured text in insurance scenarios, and risk models are often black boxes that cannot provide the causal explanations required by regulators. Third, existing solutions mostly rely on statistical correlation and ignore the causal structure between risk factors, so risk models become unstable when the data distribution changes. Finally, risk models lack automated performance monitoring and iteration triggering mechanisms, making it difficult to adapt to dynamically changing business environments.
Summary of the Invention
[0003] To address this issue, this application provides a method and apparatus for constructing an insurance odds prediction model based on causal graph federated learning, in order to solve the problem of poor accuracy of existing insurance odds prediction methods in cross-institutional data collaboration.
[0004] To achieve the above objectives, this application provides the following technical solution:
[0005] In a first aspect, a method for constructing an insurance odds prediction model based on causal graph federated learning includes:
[0006] Step 1: Obtain local structured policy data and unstructured claims text, extract numerical features from the structured policy data, and extract semantic features from the unstructured claims text;
[0007] Step 2: Align the numerical features and semantic features in the shared latent space, and perform concatenation or weighted fusion to obtain the fused multimodal features;
[0008] Step 3: Train a pre-built neural network prediction model using the multimodal features and the corresponding payout labels, and calculate the local parameter update amount; the neural network prediction model is constructed based on the global causal graph adjacency matrix distributed by the blockchain;
[0009] Step 4: Perform privacy protection processing on the local parameter update amount and upload it to the blockchain; after verification by the smart contract on the blockchain, call the secure aggregation protocol to aggregate the local parameter update amounts of each participant to obtain the global update amount;
[0010] Step 5: Obtain the global update amount from the blockchain, and update the parameters of the neural network prediction model according to the global update amount to obtain the trained insurance odds prediction model.
[0011] Preferably, in step 1, when extracting numerical features from the structured policy data, numerical features characterizing policy attributes are extracted from the structured policy data using numerical coding and feature engineering methods.
[0012] Preferably, in step 1, when extracting semantic features from the unstructured claims text, a large language model based on the Transformer architecture, which has been fine-tuned using insurance domain corpus, is used to extract text features that represent the deep semantics of the text from the unstructured claims text.
[0013] Preferably, step 2 specifically includes:
[0014] Step 201: Construct a shared encoder network to map the numerical features and textual features in the policy samples to the same latent semantic space;
[0015] Step 202: Using the contrastive learning loss function, construct positive sample pairs by combining numerical features and text features from the same policy sample, and construct negative sample pairs by combining numerical features and text features from different policy samples.
[0016] Step 203: By training the shared encoder network, maximize the mutual information between positive sample pairs and minimize the similarity between negative sample pairs to achieve semantic alignment between numerical features and text features.
[0017] Step 204: After completing feature alignment, concatenate the numerical and semantic features mapped to the latent semantic space, or fuse them by weighting, to obtain the fused multimodal features.
[0018] Preferably, in step 3, the process of generating the global causal graph adjacency matrix specifically includes:
[0019] Based on multimodal features, a local causal graph is constructed using a causal discovery algorithm, and an adjacency matrix is extracted from the local causal graph.
[0020] The adjacency matrix is uploaded to the blockchain; the smart contract on the blockchain calls the privacy-preserving aggregation protocol to aggregate the adjacency matrices of each participant to generate a global causal graph adjacency matrix.
[0021] Preferably, in step 3, the neural network prediction model includes a main task regression layer and a causal consistency auxiliary task module. The main task regression layer is used to output the predicted insurance odds value, and the causal consistency auxiliary task module is used to constrain the internal feature relationship of the model to be consistent with the adjacency matrix of the global causal graph.
[0022] Preferably, in step 4, the secure aggregation protocol employs a secure multi-party computation or a privacy-preserving federated averaging algorithm.
[0023] Preferably, the method further includes: predicting insurance odds based on the insurance odds prediction model to obtain predicted values and causal contribution vectors of each risk factor; calculating the error between the predicted value and the actual value to obtain the local prediction error; signing the local prediction error and submitting it to the blockchain oracle; the blockchain oracle verifying and calculating the global average error, and if the average error exceeds a threshold, automatically triggering the smart contract to start a new round of federated learning iteration of the neural network prediction model.
[0024] Preferably, the calculation process of the causal contribution vector is as follows: using a gradient-based attribution method or SHAP value method, combined with the global causal graph adjacency matrix, the contribution of each risk factor in the input features to the predicted value is calculated to form a causal contribution vector.
[0025] In a second aspect, an apparatus for constructing an insurance odds prediction model based on causal graph federated learning includes:
[0026] The feature extraction module is used to acquire local structured policy data and unstructured claims text, and extract numerical features from the structured policy data and semantic features from the unstructured claims text.
[0027] The feature fusion module is used to align the numerical features and the semantic features in the shared latent space, and perform concatenation or weighted fusion to obtain the fused multimodal features;
[0028] The training module is used to train a pre-built neural network prediction model using the multimodal features and the payout labels corresponding to the multimodal features, and to calculate the local parameter update amount; the neural network prediction model is constructed based on the global causal graph adjacency matrix distributed by the blockchain;
[0029] The privacy processing module is used to perform privacy protection processing on the local parameter update amount and upload it to the blockchain; after verification by the smart contract on the blockchain, the secure aggregation protocol is called to aggregate the local parameter update amounts of each participant to obtain the global update amount;
[0030] The parameter update module is used to obtain the global update amount from the blockchain and update the parameters of the neural network prediction model according to the global update amount to obtain the trained insurance odds prediction model.
[0031] Compared with the prior art, this application has at least the following beneficial effects:
[0032] 1. This application provides a method for constructing an insurance odds prediction model based on causal graph federated learning, including: acquiring local structured policy data and unstructured claims text, and extracting numerical and semantic features respectively; aligning the numerical and semantic features in a shared latent space to obtain fused multimodal features; training a pre-built neural network prediction model using the multimodal features and corresponding claims labels, and calculating local parameter update amounts; performing privacy protection processing on the parameter update amounts and uploading them to the blockchain; after verification by the smart contract on the blockchain, aggregating the local parameter update amounts of each participant to obtain a global update amount; updating the parameters of the neural network prediction model according to the global update amount to obtain the trained insurance odds prediction model. This application, while protecting data privacy, combines multimodal feature alignment, causal discovery learning, and the blockchain's trusted collaborative mechanism to construct an insurance odds prediction model that supports multi-party privacy protection participation and causal awareness, thereby improving the interpretability and accuracy of insurance odds prediction in cross-institutional data collaboration.
[0033] 2. Insurance odds are predicted with the insurance odds prediction model to obtain the predicted value and the causal contribution vector of each risk factor; the error between the predicted value and the actual value is calculated to obtain the local prediction error; the local prediction error is signed and submitted to the blockchain oracle; the blockchain oracle verifies the submissions and calculates the global average error. If the average error exceeds the threshold, the smart contract is automatically triggered to start a new round of federated learning iteration of the neural network prediction model, thereby improving the dynamic adaptive capability of the insurance odds prediction model.
Attached Figure Description
[0034] To more intuitively illustrate the prior art and this application, exemplary drawings are provided below. It should be understood that the specific shapes and structures shown in the drawings should not generally be regarded as limiting conditions for implementing this application; for example, based on the technical concept disclosed in this application and the exemplary drawings, those skilled in the art are able to easily make conventional adjustments or further optimizations to the addition / reduction / classification, specific shapes, positional relationships, connection methods, size ratios, etc. of certain units (components).
[0035] Figure 1 is a flowchart of the method for constructing an insurance odds prediction model based on causal graph federated learning provided in Embodiment 1 of this application.
[0036] Figure 2 is a schematic diagram of the interaction structure between the blockchain and the method for constructing an insurance odds prediction model based on causal graph federated learning provided in Embodiment 1 of this application.
[0037] Figure 3 is a flowchart of the generation process of the global causal graph adjacency matrix provided in Embodiment 1 of this application.
Detailed Implementation
[0038] The present application will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0039] In the description of this application: unless otherwise stated, "a plurality of" means two or more. The terms "first," "second," "third," etc., in this application are intended to distinguish the objects referred to and do not have any special meaning in terms of technical connotation (e.g., they should not be construed as an emphasis on importance or order). Expressions such as "including," "comprising," and "having" also mean "not limited to" (certain units, components, materials, steps, etc.).
[0040] The terms used in this application, such as "upper," "lower," "left," "right," and "middle," are generally used to indicate the general relative positional relationship for the purpose of intuitive understanding by referring to the accompanying drawings, and are not absolute limitations on the positional relationship in the actual product.
[0041] Embodiment 1
[0042] Referring to Figure 1 and Figure 2, this embodiment provides a method for constructing an insurance odds prediction model based on causal graph federated learning. It is suitable for scenarios that involve multiple institutions and have high data privacy protection requirements, enabling cross-institutional joint risk modeling and prediction. The method includes:
[0043] S1: Obtain local structured policy data and unstructured claims text, and extract numerical features from the structured policy data and semantic features from the unstructured claims text;
[0044] Specifically, each insurance institution collects local structured policy data and unstructured claims text. For the structured policy data, numerical features representing policy attributes are extracted through numerical encoding and feature engineering methods (e.g., missing value handling, normalization, feature crossing). For the unstructured claims text, semantic encoding is performed using a pre-trained large language model based on the Transformer architecture that has been fine-tuned on insurance-domain corpora to adapt it to claims text, thereby extracting text features that represent the deep semantics of the text.
[0045] More specifically, each insurance institution cleans and preprocesses its locally collected structured policy data, including handling missing values and detecting and correcting outliers. Feature engineering is then applied to the preprocessed data, using different encoding methods for different types of fields: one-hot or embedding encoding is used for categorical fields (such as vehicle type and region), while numerical fields (such as coverage amount and vehicle age) are standardized. Finally, all encoded features are concatenated to form numerical features representing the objective attributes and risk status of the policy.
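The encoding described in this paragraph can be illustrated with a minimal sketch. It uses only the Python standard library; the field names (`vehicle_type`, `region`, `coverage`, `vehicle_age`) and the helper names are hypothetical, and missing-value and outlier handling are assumed to have been done beforehand.

```python
from statistics import mean, pstdev

def one_hot(value, categories):
    """Return a one-hot list for `value` over a fixed category order."""
    return [1.0 if value == c else 0.0 for c in categories]

def standardize(column):
    """Z-score a numeric column; a constant column maps to zeros."""
    mu, sigma = mean(column), pstdev(column)
    return [0.0 if sigma == 0 else (x - mu) / sigma for x in column]

def encode_policies(records, categorical, numerical):
    """Concatenate one-hot categorical fields and standardized numeric fields."""
    cats = {f: sorted({r[f] for r in records}) for f in categorical}
    nums = {f: standardize([float(r[f]) for r in records]) for f in numerical}
    rows = []
    for i, r in enumerate(records):
        row = []
        for f in categorical:
            row.extend(one_hot(r[f], cats[f]))
        for f in numerical:
            row.append(nums[f][i])
        rows.append(row)
    return rows

# Hypothetical policy records, already cleaned.
policies = [
    {"vehicle_type": "sedan", "region": "north", "coverage": 100000, "vehicle_age": 2},
    {"vehicle_type": "suv", "region": "south", "coverage": 300000, "vehicle_age": 6},
    {"vehicle_type": "sedan", "region": "south", "coverage": 200000, "vehicle_age": 4},
]
features = encode_policies(policies, ["vehicle_type", "region"], ["coverage", "vehicle_age"])
```

Each row of `features` is one policy's numerical feature vector: two one-hot blocks followed by two standardized values.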
[0046] More specifically, each insurance institution preprocesses its locally collected unstructured claims text, including word segmentation, stop word removal, and stemming or lemmatization. The preprocessed text is then input into a pre-trained deep learning semantic encoding model. This model obtains a context-aware vector representation of each word or sentence in the text and, through pooling operations, aggregates these representations into fixed-dimensional text features that characterize the overall semantics, sentiment, and detailed description of the claims event.
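The pooling step can be sketched as follows. The source does not state which pooling operation is used, so mean and max pooling are shown as two common choices; the token vectors are assumed to come from the semantic encoder, which is not modeled here.

```python
def mean_pool(token_vectors):
    """Average equal-length token vectors into one fixed-size vector."""
    if not token_vectors:
        raise ValueError("at least one token vector is required")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dim)]

def max_pool(token_vectors):
    """Take the per-dimension maximum over equal-length token vectors."""
    if not token_vectors:
        raise ValueError("at least one token vector is required")
    dim = len(token_vectors[0])
    return [max(vec[d] for vec in token_vectors) for d in range(dim)]

# Two hypothetical 2-dimensional token embeddings from one claims text.
tokens = [[1.0, 2.0], [3.0, 6.0]]
text_feature = mean_pool(tokens)
```

Whatever the length of the text, the pooled output has the same dimension as a single token vector, which is what makes the text features fixed-dimensional.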
[0047] S2: Align numerical and semantic features in the shared latent space and perform concatenation or weighted fusion to obtain fused multimodal features;
[0048] Specifically, S2 includes:
[0049] S201: Construct a shared encoder network to map numerical features and textual features in policy samples to the same latent semantic space;
[0050] More specifically, this step constructs a shared encoder network that encodes the numerical features and the text features separately and maps them to the same latent space.
[0051] S202: Using the contrastive learning loss function, numerical features and text features from the same policy sample are combined to construct positive sample pairs, and numerical features and text features from different policy samples are combined to construct negative sample pairs.
[0052] More specifically, this step uses a contrastive learning loss function. For each policy sample, its numerical features and its own text features form a positive sample pair, while its numerical features and the text features of other samples in the same batch form negative sample pairs.
[0053] S203: By training a shared encoder network, the mutual information between positive sample pairs is maximized and the similarity between negative sample pairs is minimized, so as to achieve semantic alignment between numerical features and text features.
[0054] More specifically, this step achieves semantic alignment of features from the two modalities by maximizing the mutual information between positive sample pairs and minimizing the similarity between negative sample pairs during training. A contrastive loss function such as InfoNCE is used during training.
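As an illustration of the contrastive objective named above, the following is a minimal sketch of an InfoNCE loss over one batch, in the numeric-to-text direction only. Cosine similarity and the temperature value are assumptions; the source names InfoNCE only as an example and does not fix these details. Embeddings are assumed to be non-zero vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(num_emb, txt_emb, temperature=0.1):
    """Average InfoNCE loss: pair (i, i) is positive, pairs (i, j != i) are negative."""
    n = len(num_emb)
    total = 0.0
    for i in range(n):
        logits = [cosine(num_emb[i], txt_emb[j]) / temperature for j in range(n)]
        peak = max(logits)  # subtract the maximum for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / n

# Hypothetical latent vectors for a batch of two policy samples.
numeric = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]
swapped_text = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = info_nce(numeric, aligned_text)
loss_swapped = info_nce(numeric, swapped_text)
```

The loss is small when each sample's numerical embedding is closest to its own text embedding and large when it is closer to another sample's text, which is the behavior that drives the two modalities into alignment.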
[0055] S204: After feature alignment is completed, the numerical and semantic features mapped to the latent semantic space are concatenated or weighted and fused to obtain the fused multimodal features.
[0056] More specifically, this step combines the mapped and aligned features by concatenation, element-wise addition, or weighted fusion to form a unified multimodal feature representation.
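The two fusion options can be sketched as below. The fusion weight `alpha` is a hypothetical parameter; the source does not say whether the weight is fixed or learned.

```python
def concat_fuse(z_num, z_txt):
    """Concatenate the aligned numerical and text vectors."""
    return list(z_num) + list(z_txt)

def weighted_fuse(z_num, z_txt, alpha=0.5):
    """Convex combination of two aligned vectors of equal dimension."""
    if len(z_num) != len(z_txt):
        raise ValueError("weighted fusion needs vectors of equal dimension")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(z_num, z_txt)]

fused_concat = concat_fuse([1.0, 2.0], [3.0, 4.0])
fused_weighted = weighted_fuse([1.0, 2.0], [3.0, 4.0], alpha=0.5)
```

Concatenation doubles the feature dimension and keeps both modalities intact, while weighted fusion keeps the dimension and relies on the two vectors already living in the same latent space.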
[0057] S3: Train a pre-built neural network prediction model using multimodal features and the corresponding payout labels, and calculate the local parameter update amount; the neural network prediction model is constructed based on the global causal graph adjacency matrix distributed by the blockchain;
[0058] Specifically, referring to Figure 3, the process of generating the global causal graph adjacency matrix includes:
[0059] S301: Based on multimodal features, a local causal graph is constructed using a causal discovery algorithm, and an adjacency matrix is extracted from the local causal graph;
[0060] More specifically, in this embodiment, each dimension of the multimodal features is treated as a node representing a different risk factor. This step employs either a score-based or a constraint-based causal discovery algorithm to learn the conditional independence and causal dependencies between risk factor variables from local data. If a constraint-based causal discovery algorithm is used, an undirected skeleton graph of the variables is constructed through a series of conditional independence tests, and causal directions are inferred using rules such as V-structures. If a score-based causal discovery algorithm is used, the optimal directed acyclic graph structure is searched for by optimizing a score function that measures the fit between the graph structure and the data, resulting in a graph structure that represents the causal topological relationships between variables.
[0061] Then, by optimizing the scoring function or performing a series of conditional independence tests, the causal direction between variables is inferred, and a local causal graph in the form of a directed acyclic graph is constructed. The output causal graph structure is post-processed: directed acyclic graph constraints are applied, potential directed cycles are eliminated by removing or reversing edges so that the graph is acyclic, and edge directions are verified and corrected using predefined domain knowledge rules. The processed directed acyclic graph that satisfies the constraints is formally defined as the local causal graph, and its structural information is encoded as an adjacency matrix for storage and transmission, where each matrix element indicates whether one risk factor has a direct causal effect on another.
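The cycle-elimination post-processing can be sketched as follows. This is one possible strategy under stated assumptions: the causal discovery output is taken to be a matrix of edge strengths where a non-zero entry at row i, column j means an edge from factor i to factor j (the index convention is an assumption), and each cycle is broken by removing its weakest edge. Edge reversal and domain-rule checks mentioned in the text are not shown.

```python
def find_cycle(adj):
    """Return the edges (i, j) of one directed cycle, or None if the graph is acyclic."""
    n = len(adj)
    state = [0] * n  # 0 = unvisited, 1 = on the current DFS path, 2 = finished
    path = []

    def dfs(u):
        state[u] = 1
        path.append(u)
        for v in range(n):
            if adj[u][v] == 0:
                continue
            if state[v] == 1:  # back edge closes a cycle
                nodes = path[path.index(v):] + [v]
                return list(zip(nodes, nodes[1:]))
            if state[v] == 0:
                found = dfs(v)
                if found:
                    return found
        path.pop()
        state[u] = 2
        return None

    for start in range(n):
        if state[start] == 0:
            found = dfs(start)
            if found:
                return found
    return None

def enforce_dag(weights):
    """Drop the weakest edge of each cycle until none remain; return a 0/1 adjacency matrix."""
    adj = [row[:] for row in weights]
    while True:
        cycle = find_cycle(adj)
        if cycle is None:
            break
        i, j = min(cycle, key=lambda e: abs(adj[e[0]][e[1]]))
        adj[i][j] = 0
    return [[1 if w != 0 else 0 for w in row] for row in adj]

# Hypothetical edge strengths with a cycle 0 -> 1 -> 2 -> 0.
raw = [[0, 0.9, 0], [0, 0, 0.8], [0.2, 0, 0]]
local_adjacency = enforce_dag(raw)
```

In the example, the weak edge from factor 2 back to factor 0 is removed, leaving the chain 0 → 1 → 2 as the local causal graph.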
[0062] S302: Upload the adjacency matrix to the blockchain; the smart contract on the blockchain calls the privacy-preserving aggregation protocol to aggregate the adjacency matrices of each participant to generate a global causal graph adjacency matrix.
[0063] More specifically, participating institutions encrypt their adjacency matrices using homomorphic encryption or secure multi-party computation protocols and upload the encrypted data to the blockchain network. After a predetermined number of valid encrypted submissions are collected, the smart contract on the blockchain triggers a predefined privacy-preserving aggregation function, which performs a weighted average or majority voting operation on all valid adjacency matrices within the ciphertext domain to generate an encrypted global causal graph adjacency matrix. The encrypted global causal graph adjacency matrix is written to the blockchain and distributed synchronously to all participating nodes (i.e., participating institutions).
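The two aggregation rules can be sketched on plaintext matrices as below. This only illustrates the arithmetic; the encryption layer and the smart contract are omitted, and the per-institution weights and the 0.5 threshold are hypothetical choices not given in the source.

```python
def majority_vote(matrices):
    """Keep an edge when more than half of the participants report it."""
    n = len(matrices[0])
    k = len(matrices)
    return [[1 if 2 * sum(m[i][j] for m in matrices) > k else 0
             for j in range(n)] for i in range(n)]

def weighted_average(matrices, weights, threshold=0.5):
    """Keep an edge when its weighted support exceeds the threshold."""
    n = len(matrices[0])
    total = sum(weights)
    result = []
    for i in range(n):
        row = []
        for j in range(n):
            support = sum(w * m[i][j] for w, m in zip(weights, matrices)) / total
            row.append(1 if support > threshold else 0)
        result.append(row)
    return result

# Three hypothetical local adjacency matrices over two risk factors.
a = [[0, 1], [0, 0]]
b = [[0, 1], [0, 0]]
c = [[0, 0], [1, 0]]
global_by_vote = majority_vote([a, b, c])
global_by_weight = weighted_average([a, b, c], weights=[1, 1, 4])
```

The two rules can disagree: voting follows the two institutions that agree, while weighting (for example by local sample size) lets the heavily weighted third institution decide. Neither rule by itself guarantees that the aggregated graph is acyclic.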
[0064] Each institution obtains the global causal graph adjacency matrix from the blockchain and decrypts it, then builds a local neural network prediction model and initializes the model parameters. The neural network prediction model includes a main task regression layer for outputting predicted insurance odds values, and a causal consistency auxiliary task module for constraining the model's internal feature relationships to be consistent with the global causal graph adjacency matrix. It should be noted that the neural network prediction model is implemented using a fully connected neural network, and the number and dimensions of hidden layers can be configured according to the local data scale of each institution.
[0065] Specifically, step S3 trains the neural network prediction model using the local multimodal feature data and the corresponding payout labels. The total loss function, which combines the prediction loss (such as mean squared error) with the causal consistency constraint loss defined with respect to the global causal graph adjacency matrix, weighted by hyperparameters, is optimized to adjust the model parameters. The parameter update amount after training (i.e., the local parameter update amount) is then calculated.
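The original symbols for this loss were lost from the text, so the following is only a hedged reconstruction of its likely shape. All symbols are assumed: \theta for the model parameters, A_{\text{global}} for the global causal graph adjacency matrix, \alpha and \beta for the weighting hyperparameters, and N for the number of local samples. The weighted-sum form is inferred from the mention of hyperparameters, and the exact form of the causal consistency term is not specified in the source.

```latex
\mathcal{L}_{\text{total}}(\theta)
  = \alpha \, \mathcal{L}_{\text{pred}}(\theta)
  + \beta \, \mathcal{L}_{\text{causal}}(\theta; A_{\text{global}}),
\qquad
\mathcal{L}_{\text{pred}}(\theta)
  = \frac{1}{N} \sum_{n=1}^{N} \bigl( \hat{y}_n(\theta) - y_n \bigr)^2,
\qquad
\Delta\theta = \theta_{\text{after}} - \theta_{\text{before}}
```

Here \Delta\theta stands for the local parameter update amount, taken as the difference between the parameters after and before local training.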
[0066] S4: Perform privacy protection processing on the local parameter update and upload it to the blockchain; after verification by the smart contract on the blockchain, call the secure aggregation protocol to aggregate the local parameter update of each participant to obtain the global update;
[0067] Specifically, the local parameter update amount undergoes privacy protection processing (e.g., adding differential privacy noise) and is then encrypted and uploaded to the blockchain network. The smart contract on the blockchain verifies the validity of the submission and then invokes a secure aggregation protocol (e.g., secure multi-party computation or a privacy-preserving federated averaging algorithm) to aggregate the data while protecting the original updates from each participant, thus calculating the global update amount, which is written to the blockchain.
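A minimal sketch of the two privacy mechanisms mentioned here follows. It is an illustration under stated assumptions, not the protocol of this application: differential privacy is shown as norm clipping plus Gaussian noise, and secure aggregation is shown as pairwise additive masking over fixed-point integers, where a single seeded generator stands in for the pairwise key agreement a real protocol would use. `MODULUS`, `SCALE`, and all function names are hypothetical.

```python
import math
import random

MODULUS = 2 ** 32   # masks and sums live in the integers modulo this value
SCALE = 10 ** 6     # fixed-point scale for quantizing float updates

def clip_and_noise(update, clip_norm, sigma, rng):
    """Clip the update to an L2 norm bound, then add Gaussian noise."""
    norm = math.sqrt(sum(x * x for x in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * factor + rng.gauss(0.0, sigma * clip_norm) for x in update]

def quantize(update):
    """Map floats to fixed-point residues modulo MODULUS."""
    return [round(x * SCALE) % MODULUS for x in update]

def mask_updates(updates, seed=0):
    """Add pairwise masks that cancel when all masked updates are summed."""
    rng = random.Random(seed)
    n, dim = len(updates), len(updates[0])
    masked = [quantize(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            for d in range(dim):
                r = rng.randrange(MODULUS)
                masked[i][d] = (masked[i][d] + r) % MODULUS
                masked[j][d] = (masked[j][d] - r) % MODULUS
    return masked

def aggregate(masked):
    """Sum masked updates, undo the fixed-point encoding, and average."""
    dim = len(masked[0])
    totals = [sum(m[d] for m in masked) % MODULUS for d in range(dim)]
    signed = [t - MODULUS if t >= MODULUS // 2 else t for t in totals]
    return [s / SCALE / len(masked) for s in signed]

# Three hypothetical local parameter updates of dimension two.
updates = [[0.1, -0.2], [0.3, 0.4], [-0.1, 0.1]]
global_update = aggregate(mask_updates(updates, seed=42))
```

Each masked vector looks random on its own, yet the masks cancel in the sum, so the aggregator recovers only the average update. In practice the clipping and noise step would be applied to each update before masking.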
[0068] S5: Obtain the global update value from the blockchain and update the parameters of the neural network prediction model based on the global update value to obtain the trained insurance odds prediction model.
[0069] Specifically, participating institutions synchronously obtain the global update amount from the blockchain and update their local model parameters, thereby obtaining the trained insurance odds prediction model.
[0070] The method for constructing an insurance odds prediction model based on causal graph federated learning provided in this embodiment further includes: predicting insurance odds with the insurance odds prediction model to obtain the predicted value and the causal contribution vector of each risk factor; calculating the error between the predicted value and the actual value to obtain the local prediction error; signing the local prediction error and submitting it to the blockchain oracle; the blockchain oracle verifies the submissions and calculates the global average error, and if the average error exceeds the threshold, it automatically triggers the smart contract to start a new round of federated learning iteration of the neural network prediction model.
[0071] Specifically, the updated local model (i.e., the insurance odds prediction model) is used to perform forward propagation on newly input policy data and output the final predicted insurance odds value. At the same time, a gradient-based attribution method or the SHAP value method is used, combined with the global causal graph adjacency matrix information synchronized from the blockchain, to calculate the contribution of each risk factor in the input features to the predicted value, forming a causal contribution vector. The model is then used to make predictions on local validation or test set data, and the error between the predicted and actual values is calculated to obtain the local prediction error. The local prediction error is digitally signed using the private key of the participating institution to ensure the integrity of the data and the credibility of its source.
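As a rough illustration of a gradient-based attribution, the sketch below computes gradient-times-input contributions using central finite differences. The linear stand-in model and its weights are hypothetical. How the global causal graph adjacency matrix is combined with the attribution is not specified in the source, so that step is deliberately left out.

```python
def gradient_times_input(predict, x, eps=1e-6):
    """Per-feature contribution: numerical gradient of the prediction times the input value."""
    contributions = []
    for i in range(len(x)):
        upper = list(x)
        lower = list(x)
        upper[i] += eps
        lower[i] -= eps
        grad = (predict(upper) - predict(lower)) / (2 * eps)
        contributions.append(grad * x[i])
    return contributions

# Hypothetical stand-in for the trained model: a fixed linear predictor.
WEIGHTS = [0.5, -2.0, 0.0]
BIAS = 0.1

def linear_model(x):
    return BIAS + sum(w * xi for w, xi in zip(WEIGHTS, x))

contribution_vector = gradient_times_input(linear_model, [2.0, 1.0, 3.0])
```

For a linear model the contributions reduce to weight times input, which makes the result easy to check; for a neural network the same function gives a local, first-order attribution around the given input.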
[0072] Each participating institution submits its signed local prediction error to an oracle smart contract deployed on the blockchain. The oracle contract verifies the validity of the digital signatures of all submitted data and calculates the global average prediction error for the current round of federated learning. The calculated global average error is then compared with a preset accuracy threshold. If the global average error exceeds the threshold, the oracle automatically triggers a new round of federated learning, causing each institution to return to S3 to perform model iteration and optimization.
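The trigger logic of the oracle can be sketched as below. Signature verification and the smart contract call are omitted, an unweighted mean is assumed for the global average, and the function name and threshold value are hypothetical.

```python
def should_retrain(local_errors, threshold):
    """Return the global average error and whether it exceeds the threshold."""
    if not local_errors:
        raise ValueError("at least one local prediction error is required")
    global_average = sum(local_errors) / len(local_errors)
    return global_average, global_average > threshold

# Hypothetical local prediction errors reported by three institutions.
average_error, trigger = should_retrain([0.10, 0.20, 0.30], threshold=0.15)
```

When `trigger` is true, each institution would return to S3 for another round of local training; otherwise the current model stays in service.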
[0073] This embodiment provides a method for constructing an insurance odds prediction model based on causal graph federated learning. It combines multimodal feature alignment, causal discovery learning, and blockchain trusted collaboration mechanism to construct an insurance odds prediction model that supports multi-party privacy protection participation, causal perception modeling, and adaptive iterative optimization. This improves the interpretability and accuracy of insurance odds prediction and provides an effective solution for cross-institutional intelligent risk control in the insurance industry.
[0074] Embodiment 2
[0075] This embodiment provides an apparatus for constructing an insurance odds prediction model based on causal graph federated learning, including:
[0076] The feature extraction module is used to acquire local structured policy data and unstructured claims text, and extract numerical features from the structured policy data and semantic features from the unstructured claims text.
[0077] The feature fusion module is used to align the numerical features and the semantic features in the shared latent space, and to perform concatenation or weighted fusion to obtain the fused multimodal features;
[0078] The training module is used to train a pre-built neural network prediction model using the multimodal features and the payout labels corresponding to the multimodal features, and to calculate the local parameter update amount; the neural network prediction model is constructed based on the global causal graph adjacency matrix distributed by the blockchain;
[0079] The privacy processing module is used to perform privacy protection processing on the local parameter update amount and upload it to the blockchain; after verification by the smart contract on the blockchain, the secure aggregation protocol is called to aggregate the local parameter update amounts of each participant to obtain the global update amount;
[0080] The parameter update module is used to obtain the global update amount from the blockchain and update the parameters of the neural network prediction model according to the global update amount to obtain the trained insurance odds prediction model.
[0081] For the specific implementation of each module in the apparatus for constructing an insurance odds prediction model based on causal graph federated learning, refer to the above description of the method for constructing an insurance odds prediction model based on causal graph federated learning, which will not be repeated here.
[0082] The technical features of the above embodiments can be combined in any way (as long as there is no contradiction in the combination of these technical features). For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; these embodiments not explicitly written should also be considered to be within the scope of this specification.
Claims
1. A method for constructing an insurance odds prediction model based on causal graph federated learning, characterized in that the method includes: Step 1: Obtain local structured policy data and unstructured claims text, extract numerical features from the structured policy data, and extract semantic features from the unstructured claims text; Step 2: Align the numerical features and semantic features in the shared latent space, and perform concatenation or weighted fusion to obtain the fused multimodal features; Step 3: Train a pre-built neural network prediction model using the multimodal features and the corresponding payout labels, and calculate the local parameter update amount; the neural network prediction model is constructed based on the global causal graph adjacency matrix distributed by the blockchain; Step 4: Perform privacy protection processing on the local parameter update amount and upload it to the blockchain; after verification by the smart contract on the blockchain, call the secure aggregation protocol to aggregate the local parameter update amounts of each participant to obtain the global update amount; Step 5: Obtain the global update amount from the blockchain, and update the parameters of the neural network prediction model according to the global update amount to obtain the trained insurance odds prediction model.
2. The method for constructing an insurance odds prediction model based on causal graph federated learning according to claim 1, characterized in that, in step 1, numerical features representing policy attributes are extracted from the structured policy data through numerical encoding and feature engineering methods.
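By way of non-limiting illustration, the numerical encoding of claim 2 might be sketched as follows. The claim does not name a specific encoding; min-max scaling and one-hot encoding are assumed here, and the field names `age`, `premium`, and `coverage_type` are hypothetical.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over a fixed category list."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max(value, lo, hi):
    """Scale a numeric value into [0, 1] given an assumed known range."""
    return (value - lo) / (hi - lo) if hi > lo else 0.0


def encode_policy(policy, categories, age_range, premium_range):
    """Build a numerical feature vector from one structured policy record.

    The record layout (age, premium, coverage_type) is hypothetical.
    """
    scaled = [
        min_max(policy["age"], *age_range),
        min_max(policy["premium"], *premium_range),
    ]
    return scaled + one_hot(policy["coverage_type"], categories)


features = encode_policy(
    {"age": 40, "premium": 1500.0, "coverage_type": "auto"},
    categories=["auto", "health"],
    age_range=(20, 60),
    premium_range=(1000.0, 2000.0),
)
```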
3. The method for constructing an insurance odds prediction model based on causal graph federated learning according to claim 1, characterized in that, in step 1, a large language model based on the Transformer architecture, fine-tuned on an insurance-domain corpus, is used to extract from the unstructured claims text semantic features that represent the deep semantics of the text.
4. The method for constructing an insurance odds prediction model based on causal graph federated learning according to claim 1, characterized in that step 2 specifically includes: Step 201: constructing a shared encoder network to map the numerical features and the semantic features of the policy samples to the same latent semantic space; Step 202: using a contrastive learning loss function, constructing positive sample pairs from the numerical features and semantic features of the same policy sample, and constructing negative sample pairs from the numerical features and semantic features of different policy samples; Step 203: training the shared encoder network to maximize the mutual information between positive sample pairs and minimize the similarity between negative sample pairs, thereby achieving semantic alignment between the numerical features and the semantic features; Step 204: after feature alignment is complete, concatenating the numerical features and semantic features mapped to the latent semantic space, or fusing them by weighting, to obtain the fused multimodal features.
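By way of non-limiting illustration, steps 202 to 204 might be sketched as follows. The claim does not name a particular contrastive loss; the InfoNCE loss (whose minimization maximizes a lower bound on mutual information) is assumed here, as are the cosine similarity, the `temperature` parameter, and the fusion `weight`. The shared encoder itself is omitted: the inputs are taken to be already-encoded vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(numeric, semantic, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    Row i of `numeric` and row i of `semantic` come from the same policy
    sample (positive pair); all other row combinations are negatives.
    """
    n = len(numeric)
    total = 0.0
    for i in range(n):
        logits = [cosine(numeric[i], semantic[j]) / temperature for j in range(n)]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / n


def fuse(numeric_vec, semantic_vec, weight=None):
    """Concatenate when weight is None, otherwise take a weighted sum."""
    if weight is None:
        return list(numeric_vec) + list(semantic_vec)
    return [weight * a + (1.0 - weight) * b for a, b in zip(numeric_vec, semantic_vec)]


aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
misaligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch the loss is small when each numerical vector is closest to the semantic vector of its own policy sample and large when the pairing is swapped, which is the behaviour step 203 trains the encoder toward.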
5. The method for constructing an insurance odds prediction model based on causal graph federated learning according to claim 1, characterized in that, in step 3, the process of generating the global causal graph adjacency matrix specifically includes: constructing a local causal graph from the multimodal features using a causal discovery algorithm, and extracting an adjacency matrix from the local causal graph; uploading the adjacency matrix to the blockchain; and calling, by the smart contract on the blockchain, a privacy-preserving aggregation protocol to aggregate the adjacency matrices of all participants to generate the global causal graph adjacency matrix.
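By way of non-limiting illustration, the aggregation of local adjacency matrices might be sketched as follows. The claim does not specify the aggregation rule; a majority vote over edges with a hypothetical `threshold` is assumed here, and the privacy-preserving protocol and on-chain execution are omitted for brevity.

```python
def aggregate_adjacency(local_matrices, threshold=0.5):
    """Keep edge i -> j when more than `threshold` of participants report it.

    Each matrix is a square list of lists with 0/1 entries, where
    matrix[i][j] == 1 means "factor i causes factor j". Self-loops are dropped.
    """
    size = len(local_matrices[0])
    count = len(local_matrices)
    result = [[0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            support = sum(m[i][j] for m in local_matrices) / count
            if i != j and support > threshold:
                result[i][j] = 1
    return result


participant_a = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
participant_b = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
participant_c = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
global_adjacency = aggregate_adjacency([participant_a, participant_b, participant_c])
```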
6. The method for constructing an insurance odds prediction model based on causal graph federated learning according to claim 1, characterized in that, in step 3, the neural network prediction model includes a main task regression layer and a causal consistency auxiliary task module, wherein the main task regression layer is used to output the predicted insurance odds value, and the causal consistency auxiliary task module is used to constrain the internal feature relationships of the model to be consistent with the global causal graph adjacency matrix.
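By way of non-limiting illustration, the combination of the main regression task and the causal consistency auxiliary task might be sketched as a single training loss. The form of the penalty is an assumption: here a hypothetical matrix of learned feature-interaction weights is penalized wherever the global causal graph has no edge, and `lam` is a hypothetical weighting coefficient.

```python
def mse(predictions, targets):
    """Mean squared error of the main regression task."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


def causal_consistency_penalty(interaction_weights, adjacency):
    """Sum of squared interaction weights on pairs with no causal edge."""
    size = len(adjacency)
    return sum(
        interaction_weights[i][j] ** 2
        for i in range(size)
        for j in range(size)
        if adjacency[i][j] == 0
    )


def total_loss(predictions, targets, interaction_weights, adjacency, lam=0.1):
    """Main task loss plus the weighted causal consistency penalty."""
    return mse(predictions, targets) + lam * causal_consistency_penalty(
        interaction_weights, adjacency
    )


weights = [[0.0, 0.5], [0.2, 0.0]]
adjacency = [[0, 1], [0, 0]]  # only the edge 0 -> 1 exists
penalty = causal_consistency_penalty(weights, adjacency)
loss = total_loss([1.0, 2.0], [1.0, 4.0], weights, adjacency, lam=0.5)
```

In this sketch the weight 0.5 on the permitted edge 0 -> 1 is not penalized, while the weight 0.2 on the absent edge 1 -> 0 is.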
7. The method for constructing an insurance odds prediction model based on causal graph federated learning according to claim 1, characterized in that, in step 4, the secure aggregation protocol employs secure multi-party computation or a privacy-preserving federated averaging algorithm.
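By way of non-limiting illustration, one possible secure aggregation scheme is pairwise additive masking, sketched below. The claim does not prescribe this scheme. The constants `MODULUS` and `SCALE` are hypothetical, and the seeded `random.Random` is only a stand-in for the pairwise key agreement a real deployment would need; it is not cryptographically secure.

```python
import random

MODULUS = 2 ** 32   # hypothetical ring size for masked arithmetic
SCALE = 10 ** 4     # hypothetical fixed-point scale for float updates


def encode(update):
    """Convert float updates to fixed-point integers modulo MODULUS."""
    return [round(x * SCALE) % MODULUS for x in update]


def mask_updates(updates, seed=0):
    """Add pairwise masks that cancel out when all participants are summed."""
    rng = random.Random(seed)  # stand-in for pairwise shared secrets
    count, dim = len(updates), len(updates[0])
    masked = [encode(u) for u in updates]
    for i in range(count):
        for j in range(i + 1, count):
            pair_mask = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] = (masked[i][k] + pair_mask[k]) % MODULUS
                masked[j][k] = (masked[j][k] - pair_mask[k]) % MODULUS
    return masked


def aggregate(masked):
    """Sum masked updates, decode, and return the federated average."""
    count, dim = len(masked), len(masked[0])
    average = []
    for k in range(dim):
        total = sum(m[k] for m in masked) % MODULUS
        if total >= MODULUS // 2:  # map back to a signed value
            total -= MODULUS
        average.append(total / SCALE / count)
    return average


local_updates = [[0.1, -0.2], [0.3, 0.4], [-0.1, 0.1]]
global_update = aggregate(mask_updates(local_updates))
```

The aggregator in this sketch only ever sees masked vectors, yet the masks cancel in the sum, so the decoded result equals the plain average of the local updates.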
8. The method for constructing an insurance odds prediction model based on causal graph federated learning according to claim 1, characterized in that the method further comprises: predicting insurance odds based on the insurance odds prediction model to obtain a predicted value and a causal contribution vector of each risk factor; calculating the error between the predicted value and the actual value to obtain a local prediction error; signing the local prediction error and submitting it to a blockchain oracle; and verifying the submissions and calculating a global average error by the blockchain oracle, wherein, if the global average error exceeds a threshold, the smart contract is automatically triggered to start a new round of federated learning iteration of the neural network prediction model.
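By way of non-limiting illustration, the monitoring and triggering logic of claim 8 might be sketched as follows. The claim does not specify the signature scheme; an HMAC with a per-participant key is used here purely as a stand-in for a digital signature, and the participant identifiers, keys, and thresholds are hypothetical. The oracle and the smart contract are modeled as ordinary functions.

```python
import hashlib
import hmac


def sign_error(error, key):
    """Produce a keyed digest of the reported local prediction error."""
    message = f"{error:.6f}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(error, signature, key):
    """Check that the signature matches the reported error."""
    return hmac.compare_digest(sign_error(error, key), signature)


def should_retrain(reports, keys, threshold):
    """Average the verified errors and compare against the threshold.

    `reports` maps a participant id to an (error, signature) pair.
    Reports that fail verification are ignored.
    """
    valid = [
        error
        for participant, (error, signature) in reports.items()
        if verify(error, signature, keys[participant])
    ]
    if not valid:
        return False
    return sum(valid) / len(valid) > threshold


keys = {"insurer_a": b"key-a", "insurer_b": b"key-b"}
reports = {
    "insurer_a": (0.2, sign_error(0.2, keys["insurer_a"])),
    "insurer_b": (0.4, sign_error(0.4, keys["insurer_b"])),
}
trigger = should_retrain(reports, keys, threshold=0.25)
```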
9. The method for constructing an insurance odds prediction model based on causal graph federated learning according to claim 8, characterized in that the causal contribution vector is calculated as follows: using a gradient-based attribution method or the SHAP value method, combined with the global causal graph adjacency matrix, the contribution of each risk factor in the input features to the predicted value is calculated to form the causal contribution vector.
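By way of non-limiting illustration, a gradient-based variant of the causal contribution vector might be sketched as follows. Several simplifications are assumed: the predictor is linear, so its gradient with respect to each input equals the corresponding weight; the attribution is gradient times input; and the adjacency matrix is used only to zero out factors with no directed path to the outcome node. The claim does not specify how the adjacency matrix is combined with the attribution.

```python
def ancestors_of(adjacency, target):
    """Return all nodes with a directed path to `target`.

    adjacency[i][j] == 1 means there is an edge i -> j.
    """
    size = len(adjacency)
    found = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in range(size):
            if adjacency[parent][node] and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def causal_contributions(weights, inputs, adjacency, target):
    """Gradient-times-input attribution, kept only for causal ancestors."""
    causes = ancestors_of(adjacency, target)
    return [
        w * x if index in causes else 0.0
        for index, (w, x) in enumerate(zip(weights, inputs))
    ]


# Nodes 0..2 are risk factors and node 3 is the outcome: 0 -> 1 -> 3.
graph = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
contribution_vector = causal_contributions([0.5, 2.0, 3.0], [2.0, 1.0, 1.0], graph, target=3)
```

In this sketch factor 2 receives zero contribution despite its large weight, because the graph contains no path from it to the outcome.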
10. A device for constructing an insurance odds prediction model based on causal graph federated learning, characterized in that it comprises: a feature extraction module, used to acquire local structured policy data and unstructured claims text, and to extract numerical features from the structured policy data and semantic features from the unstructured claims text; a feature fusion module, used to align the numerical features and the semantic features in a shared latent space, and to perform concatenation or weighted fusion to obtain fused multimodal features; a training module, used to train a pre-built neural network prediction model using the multimodal features and the payout labels corresponding to the multimodal features, and to calculate a local parameter update amount, wherein the neural network prediction model is constructed based on a global causal graph adjacency matrix distributed by a blockchain; a privacy processing module, used to perform privacy protection processing on the local parameter update amount and upload it to the blockchain, wherein, after verification by a smart contract on the blockchain, a secure aggregation protocol is called to aggregate the local parameter update amounts of all participants to obtain a global update amount; and a parameter update module, used to obtain the global update amount from the blockchain and update the parameters of the neural network prediction model according to the global update amount to obtain the trained insurance odds prediction model.