An underwriting data verification method and device

By quantifying and calculating the characteristic data and similarity of insured users, the underwriting process is automated, solving the problems of low efficiency and accuracy of manual underwriting and improving the efficiency and accuracy of underwriting.

CN114742136B · Active · Publication Date: 2026-06-30 · CHINA INSURANCE TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: CHINA INSURANCE TECH CO LTD
Filing Date: 2022-03-24
Publication Date: 2026-06-30

Application Information

IPC: G06F18/22; G06Q40/08; G06F18/24

AI Tagging

Technology Topics

Feature vector Source Data Verification

AI Technical Summary

Technical Problem

The current insurance underwriting process relies on manual review, which leads to low efficiency and accuracy.

Method used

By acquiring the characteristic data of insured users, quantifying it into feature vectors, matching it with standard feature vectors, calculating cosine similarity, and determining the underwriting result based on similarity and threshold, automated underwriting is achieved.

Benefits of technology

The method improves the efficiency and accuracy of underwriting and enables automated risk assessment of insured users.

✦ Generated by Eureka AI based on patent content.

Abstract

This application provides a method and apparatus for verifying underwriting data. The method includes: acquiring feature data of a target insured user; quantifying the feature data of the target insured user to obtain a target feature vector; matching a standard feature vector of a standard body associated with the target insured user based on the age and / or gender of the target insured user; determining the target similarity between the target feature vector and the standard feature vector; and determining whether the target insured user passes verification based on the relationship between the target similarity and a preset similarity threshold. This solution solves the problem of low efficiency and accuracy in existing underwriting processes that require manual intervention, effectively improving underwriting accuracy and efficiency, and enabling automated underwriting, thus automating the entire underwriting system.

Description

Technical Field

[0001] This application belongs to the field of data processing technology, and in particular relates to a method and apparatus for verifying underwriting data.

Background Technology

[0002] Insurance verification (i.e., underwriting) has always been a crucial part of the insurance acceptance process. It involves checking the risk and compliance of policyholders to ensure the validity of their insurance policies. This helps insurance companies mitigate risk and also reduces the risk that policyholders face when making claims.

[0003] Currently, risk verification for insured users is generally conducted manually, with underwriters reviewing and judging each item one by one. This results in low efficiency and accuracy in underwriting.

[0004] There is currently no effective solution to the above problems.

Summary of the Invention

[0005] The purpose of this application is to provide a method and apparatus for verifying underwriting data, so as to improve the efficiency and accuracy of underwriting.

[0006] This application provides a method and apparatus for verifying underwriting data, which is implemented as follows:

[0007] A method for verifying underwriting data, the method comprising:

[0008] Obtain characteristic data of the target insured users;

[0009] The feature data of the target insured user are quantified to obtain the target feature vector;

[0010] Based on the age and / or gender of the target insured user, a standard feature vector of a standard body associated with the target insured user is matched;

[0011] Determine the target similarity between the target feature vector and the standard feature vector;

[0012] Based on the relationship between the target similarity and the preset similarity threshold, it is determined whether the target insured user has passed the verification.

[0013] In one implementation, determining the target similarity between the target feature vector and the standard feature vector includes:

[0014] Calculate the cosine similarity between the target feature vector and the standard feature vector:

[0015] The calculated cosine similarity is used as the target similarity between the target feature vector and the standard feature vector.

[0016] In one implementation, the target feature vector and the standard feature vector are divided into multiple groups according to dimensions, and each group includes at least two feature factors.

[0017] In one implementation, determining the target similarity between the target feature vector and the standard feature vector includes:

[0018] Calculate the cosine similarity between the target feature vector and the standard feature vector corresponding to each of the multiple groups to obtain multiple intermediate similarities;

[0019] Obtain the weight value corresponding to each group, and use it as the weight value of each intermediate similarity;

[0020] The target similarity is calculated based on each intermediate similarity and its corresponding weight value.

[0021] In one implementation, intermediate similarity is calculated according to the following formula:

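[0022] $$\mathrm{Sim}(P_a\,Age_{ix})_a = \frac{\sum_{o=1}^{m} \mathrm{NatF}_o(P_a\,Age_{ix}) \times \mathrm{StdF}_o(P_a\,Age_{ix})}{\sqrt{\sum_{o=1}^{m} \mathrm{NatF}_o(P_a\,Age_{ix})^2} \times \sqrt{\sum_{o=1}^{m} \mathrm{StdF}_o(P_a\,Age_{ix})^2}}$$

[0023] where $\mathrm{Sim}(P_a\,Age_{ix})_a$ denotes the intermediate similarity of group a, that is, the cosine similarity between the target feature vector $\mathrm{NatV}(P_a\,Age_{ix})_v$ and the standard feature vector $\mathrm{StdV}(P_a\,Age_{ix})_v$; $\mathrm{StdF}_o(P_a\,Age_{ix})$ denotes the feature factors of group $P_a$ for the standard body of age stage i and gender x; $\mathrm{NatF}_o(P_a\,Age_{ix})$ denotes the feature factors of group a for the target insured user of age stage i and gender x; and m denotes the number of feature factors contained in group a.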

[0024] In one implementation, target similarity is calculated according to the following formula:

[0025] $$FS(Age_{ix}) = \sum_{a=1}^{v} \mathrm{Sim}(P_a\,Age_{ix})_a \times \mathrm{Weight}_a$$

[0026] where $FS(Age_{ix})$ denotes the final target similarity, i denotes the age group and x denotes the gender; $\mathrm{Sim}(P_a\,Age_{ix})_a$ denotes the intermediate similarity of group a; $\mathrm{Weight}_a$ denotes the weight value corresponding to group a; and v denotes the number of groups.

[0027] In one implementation, the weight value for each dimension is determined according to the following formula:

[0028]

[0029] where $\mathrm{Weight}_a$ denotes the weight value corresponding to group a and v denotes the number of groups; the terms $w_{ak}$ and $wz$ express the relevance of group a itself and the relevance of group a to the other groups k; and $adj_a$ denotes the adjustment value.

[0030] In one implementation, determining whether the target insured user passes verification based on the relationship between the target similarity and a preset similarity threshold includes:

[0031] If the target similarity is less than the similarity threshold, the verification is determined to have failed.

[0032] If the target similarity is greater than or equal to the similarity threshold, the verification is deemed successful.

[0033] In one implementation, after determining that the verification failed when the target similarity is less than the similarity threshold, the method further includes:

[0034] Calculate the relative difference between the target similarity and the similarity threshold;

[0035] Based on the relationship between the similarity difference and the preset comparison value, it is determined whether the target insured user has passed the verification.

[0036] A device for verifying underwriting data, comprising:

[0037] The acquisition module is used to acquire the characteristic data of the target insured user;

[0038] The quantization module is used to quantize the feature data of the target insured user to obtain the target feature vector;

[0039] The matching module is used to match the standard feature vector of the standard body associated with the target insured user based on the age and / or gender of the target insured user;

[0040] A determination module is used to determine the target similarity between the target feature vector and the standard feature vector;

[0041] The verification module is used to determine whether the target insured user passes the verification based on the relationship between the target similarity and a preset similarity threshold.

[0042] A terminal device includes a processor and a memory for storing processor-executable instructions, wherein the processor implements the steps of the method described above when executing the instructions.

[0043] A computer-readable storage medium having a computer program / instructions stored thereon, which, when executed by a processor, implement the steps of the above-described method.

[0044] The underwriting data verification method and apparatus provided in this application quantify the feature data of the target insured user to obtain a target feature vector, then match it with the standard feature vector of the corresponding standard body, determine the similarity between the two, and thus determine whether the target insured user meets the requirements of the standard body, thereby realizing the verification of the insured user. This solution solves the problem of low efficiency and accuracy in existing underwriting processes that require manual intervention, achieving a significant improvement in underwriting accuracy and efficiency.

Attached Figure Description

[0045] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0046] Figure 1 is a flowchart of one embodiment of the underwriting data verification method provided in this application;

[0047] Figure 2 is a schematic diagram of the division of feature factors according to dimensions provided in this application;

[0048] Figure 3 is a schematic diagram of the method for determining the weight value of each dimension provided in this application;

[0049] Figure 4 is a flowchart of one embodiment of the insurance verification method provided in this application;

[0050] Figure 5 is a hardware structure block diagram of an electronic device for the underwriting data verification method provided in this application;

[0051] Figure 6 is a schematic diagram of the module structure of an underwriting data verification device provided in this application.

Detailed Implementation

[0052] To enable those skilled in the art to better understand the technical solutions in this application, the technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of this application.

[0053] Because current underwriting is generally manual, with low efficiency and accuracy, this example quantifies the insured user's data and sets up corresponding quantitative data for a standard body. The similarity between the two is then compared to determine whether the target user meets the underwriting requirements.

[0054] Specifically, by quantifying the feature vector of each insured user and the standard feature vector of the insured object, the similarity between the feature vector of each natural insured user and the standard feature vector can be calculated to assess the risk level of the insured user, thereby meeting the needs of automated underwriting and improving underwriting efficiency and accuracy.

[0055] Figure 1 is a flowchart of one embodiment of the underwriting data verification method provided in this application. Although this application provides method operation steps or apparatus structures as shown in the following embodiments or figures, more or fewer operation steps or module units may be included in the method or apparatus based on conventional or non-inventive effort. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to the execution order or module structure described in the embodiments and figures of this application. When the method or module structure is applied in actual devices or terminal products, it can be executed sequentially or in parallel according to the method or module structure shown in the embodiments or figures (e.g., in a parallel processor or multi-threaded processing environment, or even a distributed processing environment).

[0056] As shown in Figure 1, the underwriting data verification method may include the following steps:

[0057] Step 101: Obtain the characteristic data of the target insured user;

[0058] For example, the characteristic data of the aforementioned target insured users can be obtained from historical data, such as the user's past insurance records, claims records, health records, etc., as the user's characteristic data.

[0059] Step 102: Quantize the feature data of the target insured user to obtain the target feature vector;

[0060] Since this involves verifying the insured user, the feature data may include, but is not limited to, one or more of the following: age, gender, BMI, occupation, income, region, medical history, past insurance policies, claims history, exercise habits, smoking, alcohol abuse, etc. Quantifying these feature data will yield a feature vector reflecting the target insured user's situation.
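The quantification in Step 102 can be pictured with a minimal sketch. The patent does not specify how each factor is encoded, so the field names, the `OCCUPATION_RISK` table and the numeric codes below are illustrative assumptions only.

```python
from typing import Dict, List

# Hypothetical occupation risk codes; the source does not define any encoding.
OCCUPATION_RISK = {"office": 1.0, "driver": 2.0, "miner": 3.0}


def quantize(features: Dict[str, object]) -> List[float]:
    """Turn raw feature data into a numeric feature vector (illustrative encoding)."""
    return [
        float(features["age"]),
        1.0 if features["gender"] == "M" else 0.0,         # assumed binary gender code
        float(features["bmi"]),
        OCCUPATION_RISK.get(features["occupation"], 2.0),  # assumed default for unknown jobs
        1.0 if features["smoker"] else 0.0,
        float(features["claims_last_year"]),
    ]


applicant = {
    "age": 34,
    "gender": "F",
    "bmi": 22.5,
    "occupation": "office",
    "smoker": False,
    "claims_last_year": 1,
}
target_vector = quantize(applicant)
```

Whatever encoding is chosen, the same one has to be applied to the standard body so that the two vectors are comparable position by position.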

[0061] Step 103: Based on the age and / or gender of the target insured user, match the standard feature vector of the standard body associated with the target insured user;

[0062] Considering that insured users differ in gender, age, and physical condition, to determine their characteristics more accurately, they can be pre-grouped according to age (in years) and gender. For example, they could be grouped in 5-year intervals, giving 21 groups for males: $Age_M[0,4]$, $Age_M[5,9]$, …, $Age_M[100,\infty)$; and 21 groups for females: $Age_F[0,4]$, $Age_F[5,9]$, …, $Age_F[100,\infty)$. Each group is denoted as $Age_{ix}$, where i is the group number, x is a gender identifier, M refers to male, and F refers to female.

[0063] That is, characteristic indicators are generated for each age range and gender, and the generated indicator characteristics are denoted as $F_i(Age_{ix}) = F_i\,Age[a,b]$, where a and b represent the lower and upper limits of age, x is the gender identifier, and i is the group number.

[0064] However, it is worth noting that the age groups grouped by age and gender mentioned above are only an example. In actual implementation, other age group division modes can also be used. For example, they do not necessarily have to be divided according to fixed age intervals. They can be determined by analysis of similar age groups, thereby flexibly implementing age group settings.
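As a small sketch of the fixed 5-year grouping used in the example (21 groups per gender, with everyone aged 100 or over in the last group), the helpers below map an age to its group number and label. The function names and the label format are hypothetical.

```python
def age_group_index(age: int) -> int:
    """Return the 0-based 5-year group number: [0,4] -> 0, ..., [95,99] -> 19, [100,inf) -> 20."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return min(age // 5, 20)


def age_group_label(age: int, gender: str) -> str:
    """Build a readable label such as 'AgeM[35,39]' (label format is an assumption)."""
    i = age_group_index(age)
    low = i * 5
    bounds = f"[{low},{low + 4}]" if i < 20 else "[100,inf)"
    return f"Age{gender}{bounds}"
```

A non-uniform grouping, as the text allows, would replace the integer division with a lookup over explicit age boundaries.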

[0065] The standard feature vector for a standard body can be pre-defined based on historical data analysis, or it can be generated in real-time by statistically analyzing historical data when verification is required. For example, the values of each feature factor corresponding to each age group and gender can be determined based on historical data analysis, thereby forming the standard feature vector for that age group and gender. This vector is then associated and stored with the age group and gender. When there is an underwriting verification requirement, the age and gender of the target insured user are determined, and the corresponding standard body is matched. Alternatively, when there is an underwriting requirement, the age and gender of the target insured user are determined, and then, based on the age and gender of the target insured user, the corresponding age group and gender of the successfully underwritten data are obtained from historical data. This data is then aggregated and analyzed to obtain the corresponding standard body. The specific method used to form the standard feature vector of the standard body can be set according to actual needs, and this application does not limit this.
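One way to picture the pre-computed variant is sketched below. It assumes, purely for illustration, that the standard feature vector of an age group and gender is the per-factor mean over historical, successfully underwritten records. The patent leaves the aggregation method open, and `build_standard_vectors` and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, Iterable, List, Tuple

Key = Tuple[int, str]  # (age group number, gender identifier)


def build_standard_vectors(
    history: Iterable[Tuple[int, str, List[float]]]
) -> Dict[Key, List[float]]:
    """Average historical feature vectors per (age group, gender). Mean is an assumption."""
    buckets: Dict[Key, List[List[float]]] = defaultdict(list)
    for age_group, gender, vector in history:
        buckets[(age_group, gender)].append(vector)
    # zip(*vectors) walks the vectors column by column, one column per feature factor.
    return {
        key: [fmean(column) for column in zip(*vectors)]
        for key, vectors in buckets.items()
    }


history = [
    (6, "F", [22.0, 1.0, 0.0]),
    (6, "F", [24.0, 1.0, 1.0]),
    (7, "M", [26.0, 2.0, 0.0]),
]
standard = build_standard_vectors(history)
```

The resulting dictionary plays the role of the stored association between age group, gender and standard feature vector; the real-time variant would run the same aggregation on demand for a single key.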

[0066] Step 104: Determine the target similarity between the target feature vector and the standard feature vector;

[0067] In implementation, determining the similarity between the target feature vector and the standard feature vector can be achieved by determining the cosine similarity between them, and using the determined cosine similarity as the target similarity. That is, determining the target similarity between the target feature vector and the standard feature vector can include: calculating the cosine similarity between the target feature vector and the standard feature vector; and using the calculated cosine similarity as the target similarity between the target feature vector and the standard feature vector.
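The cosine similarity itself is a standard quantity: the dot product of the two vectors divided by the product of their Euclidean norms. A minimal sketch follows; returning 0.0 for an all-zero vector is an assumption made here to avoid a division by zero, not something the source specifies.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for a zero vector
    return dot / (norm_u * norm_v)
```

Because cosine similarity depends only on direction, factors with large numeric ranges (such as insured amount) can dominate the result unless the quantification step scales them.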

[0068] To achieve fine-grained control over feature vectors, considering that the target insured users are essentially people, and that the segmentation of human feature data can be multi-dimensional, the target feature vector and the standard feature vector can be divided into multiple groups according to dimensions. Each group includes at least two feature factors. Accordingly, when determining the target similarity between the target feature vector and the standard feature vector, the cosine similarity between the target feature vector and the standard feature vector in each of the multiple groups can be calculated first to obtain multiple intermediate similarities. Then, the weight value corresponding to each group is obtained as the weight value of each intermediate similarity. Based on each intermediate similarity and its corresponding weight value, the target similarity is calculated.

[0069] For example, underwriting risk can be categorized according to multiple dimensions such as basic physical condition, occupation, financial capacity, medical history, and lifestyle habits. Specifically, the characteristic factors can be divided into: BMI underwriting risk, occupational underwriting risk, underwriting risk based on policy history and claims record, financial capacity risk, medical history underwriting risk, and lifestyle underwriting risk. When further refining the grouping, let's assume the total characteristic factors include: age, BMI, gender, geographic representation, medical history, product type, occupation, history of policy surrenders, history of policy lapses, insured amount, claims history within one year, claims history within two years, claims history within five years, claims history exceeding five years, dangerous sports habits, smoking, and alcohol abuse.

[0070] These characteristic factors can be categorized as follows:

[0071] 1) Underwriting risks related to medical history, including: age, gender, medical history, and regional identifiers;

[0072] 2) Underwriting risks related to the policy's claims history, including: product type, claims history within one year, claims history within two years, claims history within five years, claims history exceeding five years, records of policy cancellation, and records of policy lapse.

[0073] 3) Occupational underwriting risks, including: age, gender, regional identifier, and occupation;

[0074] 4) Lifestyle-related underwriting risks, including: dangerous sports habits, smoking, alcohol abuse, age, and gender;

[0075] 5) Financial capability risks, including: age, gender, region, product type, insured amount, history of policy surrender, and history of policy lapse.

[0076] 6) BMI underwriting risks, including: age, BMI, and gender.

[0077] However, it is worth noting that the feature factors listed in the above example, as well as the classification method of the feature factors, are only an exemplary description. In actual implementation, other feature factors can be selected or classified according to other classification methods, depending on the needs and circumstances. This application does not limit this.

[0078] The aforementioned intermediate similarity can be calculated using the following formula:

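[0079] $$\mathrm{Sim}(P_a\,Age_{ix})_a = \frac{\sum_{o=1}^{m} \mathrm{NatF}_o(P_a\,Age_{ix}) \times \mathrm{StdF}_o(P_a\,Age_{ix})}{\sqrt{\sum_{o=1}^{m} \mathrm{NatF}_o(P_a\,Age_{ix})^2} \times \sqrt{\sum_{o=1}^{m} \mathrm{StdF}_o(P_a\,Age_{ix})^2}}$$

[0080] where $\mathrm{Sim}(P_a\,Age_{ix})_a$ denotes the intermediate similarity of group a, that is, the cosine similarity between the target feature vector $\mathrm{NatV}(P_a\,Age_{ix})_v$ and the standard feature vector $\mathrm{StdV}(P_a\,Age_{ix})_v$; $\mathrm{StdF}_o(P_a\,Age_{ix})$ denotes the feature factors of group $P_a$ for the standard body of age stage i and gender x; $\mathrm{NatF}_o(P_a\,Age_{ix})$ denotes the feature factors of group a for the target insured user of age stage i and gender x; and m denotes the number of feature factors contained in group a.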

[0081] The target similarity mentioned above can be calculated using the following formula:

[0082] $$FS(Age_{ix}) = \sum_{a=1}^{v} \mathrm{Sim}(P_a\,Age_{ix})_a \times \mathrm{Weight}_a$$

[0083] where $FS(Age_{ix})$ denotes the final target similarity, i denotes the age group and x denotes the gender; $\mathrm{Sim}(P_a\,Age_{ix})_a$ denotes the intermediate similarity of group a; $\mathrm{Weight}_a$ denotes the weight value corresponding to group a; and v denotes the number of groups.
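A minimal sketch of the group-wise calculation follows. It assumes the target similarity is the weighted sum of the per-group cosine similarities, which is how the symbol definitions read. The group names, vectors and weights are made up for illustration, and the weights are simply supplied by the caller rather than derived from the relevance-based weight formula.

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def target_similarity(
    target_groups: Dict[str, Sequence[float]],
    standard_groups: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted sum over groups of the intermediate (per-group) cosine similarities."""
    total = 0.0
    for group, weight in weights.items():
        intermediate = cosine(target_groups[group], standard_groups[group])
        total += weight * intermediate
    return total


# Two hypothetical groups with equal weights.
target = {"bmi": [1.0, 0.0], "lifestyle": [3.0, 4.0]}
standard = {"bmi": [1.0, 0.0], "lifestyle": [4.0, 3.0]}
weights = {"bmi": 0.5, "lifestyle": 0.5}
final_similarity = target_similarity(target, standard, weights)
```

If the weights sum to 1 and all feature values are non-negative, the final similarity stays in the range 0 to 1, which keeps it comparable with a single preset threshold.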

[0084] Considering the interrelationships between dimensions, the weight of each dimension can be determined by taking into account their correlation. For example, the weight value for each dimension can be determined using the following formula:

[0085]

[0086] where $\mathrm{Weight}_a$ denotes the weight value corresponding to group a and v denotes the number of groups; the terms $w_{ak}$ and $wz$ express the relevance of group a itself and the relevance of group a to the other groups k; and $adj_a$ denotes the adjustment value.

[0087] Step 105: Determine whether the target insured user has passed the verification based on the relationship between the target similarity and the preset similarity threshold.

[0088] During verification, if the target similarity is less than the similarity threshold, the verification is determined to have failed; if the target similarity is greater than or equal to the similarity threshold, the verification is determined to have passed.

[0089] Furthermore, to make finer distinctions and to raise the pass rate of verification and the success rate of insurance applications, a conditional pass can be defined for users who fail the threshold test. To this end, after the verification is determined to have failed because the target similarity is less than the similarity threshold, the relative difference between the target similarity and the similarity threshold can be calculated, and whether the target insured user passes the verification can be determined from the relationship between this relative difference and a preset comparison value.

[0090] That is, when classifying the judgment results, in addition to insured users who meet the standard and those who do not, a conditional standard category can also be set up: a user who does not meet the standard can still be treated as a conditional standard body if certain conditions are met.

[0091] For example, if the similarity is less than the threshold, the relative difference can be calculated. If the relative difference GapValue < 0.5, it can be considered a conditional standard object. If the relative difference GapValue >= 0.5, it can be considered a rejected object.

[0092] The relative difference can be calculated using the following formula:

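[0093] $$GapValue = \frac{thresholdValue - Similarity}{thresholdValue}$$

[0094] where GapValue represents the relative difference (taken here relative to the threshold), Similarity represents the final similarity, and thresholdValue represents the similarity threshold.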
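The three-way outcome (standard, conditional standard, rejected) can be sketched as below. The sketch assumes the relative difference is measured against the threshold, i.e. (thresholdValue - Similarity) / thresholdValue, and uses the 0.5 comparison value from the example. The function name and the returned labels are hypothetical.

```python
def verify(similarity: float, threshold: float, gap_limit: float = 0.5) -> str:
    """Classify an insured user from the target similarity and the preset threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if similarity >= threshold:
        return "pass"  # meets the standard body
    # Relative difference; the choice of denominator is an assumption of this sketch.
    gap_value = (threshold - similarity) / threshold
    return "conditional" if gap_value < gap_limit else "reject"
```

With a threshold of 0.8, for instance, a similarity of 0.6 gives a relative difference of 0.25 and a conditional result, while a similarity of 0.3 gives 0.625 and a rejection.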

[0095] The aforementioned underwriting data verification method can be executed automatically. That is, it can acquire the characteristic data of the target insured user, automatically quantify and compare the data for similarity verification. Through automated processing, the goal of automated underwriting can be achieved.

[0096] The above method will be described below with reference to a specific embodiment. However, it is worth noting that this specific embodiment is only for better illustration of this application and does not constitute an improper limitation of this application.

[0097] In this example, to underwrite efficiently, the data of the standard body (i.e., the insured target), for example age, gender, BMI, occupation, income, region, medical history, past insurance policies, claims history, exercise habits, smoking, and alcohol abuse, are quantified to obtain a characteristic entity. The data of each insured user (which can be called the insured target) are then quantified in the same way, converting age, gender, BMI, occupation, income, region, medical history, past insurance policies, claims history, exercise habits, smoking, alcohol abuse, etc., into a characteristic entity. Finally, the characteristic entity of the insured user is compared with that of the standard body to determine the insured user's risk level.

[0098] That is, by quantifying the feature vector of each insured user and the standard feature vector of the insured object, the similarity (e.g., cosine similarity) between the feature vector of each natural insured object and the standard feature vector is calculated to assess the risk level of the insured user, thereby meeting the needs of automated underwriting and improving underwriting efficiency and accuracy.

[0099] Specifically, the insurance data verification method provided in this example may include the following steps:

[0100] Step 1: Collect insurance application records, claims records, and health records;

[0101] Specifically, n feature factors can be collected for each insured user, where n is the total number of feature factors;

[0102] Step 2: Considering that different ages may involve different medical conditions and physical statuses in underwriting, to determine the characteristics of insured users more accurately, pre-grouping can be done according to age (in years), gender, etc. For example, groups can be formed in 5-year intervals, giving 21 groups for males: $Age_M[0,4]$, $Age_M[5,9]$, …, $Age_M[100,\infty)$; and 21 groups for females: $Age_F[0,4]$, $Age_F[5,9]$, …, $Age_F[100,\infty)$. Each group is denoted as $Age_{ix}$, where i is the group number, x is a gender identifier, M refers to male, and F refers to female.

[0103] That is, characteristic indicators are generated for each age range and gender, and the generated indicator characteristics are denoted as $F_i(Age_{ix}) = F_i\,Age[a,b]$, where a and b represent the lower and upper limits of age, x is the gender identifier, and i is the group number.

[0104] Step 3: Group the feature factors;

[0105] This is primarily because insured individuals are human, and the statistical analysis of their characteristics involves multiple dimensions, including basic physical condition, occupation, financial capacity, medical history, and lifestyle habits. During the underwriting process, each dimension influences the assessment of the underwriting results to varying degrees. Therefore, based on the overall characteristic factors, the characteristic indicators can be further subdivided and grouped according to dimensions such as BMI underwriting risk, occupational underwriting risk, historical claims history underwriting risk, financial capacity risk, medical history underwriting risk, and lifestyle underwriting risk. Each group can be denoted as $P_a\,Age_{ix}$, where a is the number of the refined subgroup.

[0106] For example, the detailed grouping can be done as shown in Figure 2. Grouping can be done in various ways; for example, the total characteristic factors include: age, BMI, gender, region, medical history, product type, occupation, history of policy surrender, history of policy lapse, insured amount, claims history within one year, claims history within two years, claims history within five years, claims history beyond five years, preference for dangerous sports, smoking, and alcoholism.

[0107] These characteristic factors can be categorized as follows:

[0108] 1) Underwriting risks related to medical history, including: age, gender, medical history, and regional identifiers;

[0109] 2) Underwriting risks related to the policy's claims history, including: product type, claims history within one year, claims history within two years, claims history within five years, claims history exceeding five years, records of policy cancellation, and records of policy lapse.

[0110] 3) Occupational underwriting risks, including: age, gender, regional identifier, and occupation;

[0111] 4) Lifestyle-related underwriting risks, including: dangerous sports habits, smoking, alcohol abuse, age, and gender;

[0112] 5) Financial capability risks, including: age, gender, region, product type, insured amount, history of policy surrender, and history of policy lapse.

[0113] 6) BMI underwriting risks, including: age, BMI, and gender.

[0114] However, it is worth noting that the feature factors listed in the above example, as well as the classification method of the feature factors, are only an exemplary description. In actual implementation, other feature factors can be selected or classified according to other classification methods, depending on the needs and circumstances. This application does not limit this.
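The six example groups can be written down as a mapping from group name to factor names, which also makes it visible that factors such as age and gender belong to several groups at once. This is a sketch only: the identifiers are hypothetical English keys for the factors listed above, and the record is assumed to be already quantified.

```python
from typing import Dict, List

# Hypothetical keys for the six example groups and their feature factors.
FACTOR_GROUPS: Dict[str, List[str]] = {
    "medical_history": ["age", "gender", "medical_history", "region"],
    "claims_history": ["product_type", "claims_1y", "claims_2y", "claims_5y",
                       "claims_over_5y", "surrender_record", "lapse_record"],
    "occupation": ["age", "gender", "region", "occupation"],
    "lifestyle": ["dangerous_sports", "smoking", "alcohol_abuse", "age", "gender"],
    "financial": ["age", "gender", "region", "product_type", "insured_amount",
                  "surrender_record", "lapse_record"],
    "bmi": ["age", "bmi", "gender"],
}


def split_into_groups(record: Dict[str, float]) -> Dict[str, List[float]]:
    """Build one feature vector per group from an already quantified record."""
    return {
        group: [float(record[factor]) for factor in factors]
        for group, factors in FACTOR_GROUPS.items()
    }


record = {
    "age": 34, "bmi": 22.5, "gender": 0, "region": 3, "medical_history": 0,
    "product_type": 2, "occupation": 1, "surrender_record": 0, "lapse_record": 0,
    "insured_amount": 50.0, "claims_1y": 1, "claims_2y": 1, "claims_5y": 2,
    "claims_over_5y": 0, "dangerous_sports": 0, "smoking": 0, "alcohol_abuse": 0,
}
groups = split_into_groups(record)
```

Applying the same split to the standard body yields matching per-group vectors, so each pair can be compared independently before the results are weighted.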

[0115] Step 4: Based on the collected data, the determined feature factors, and the pre-defined feature groups, the feature data of the underwriting factors of the standard characteristic entity are calculated and statistically analyzed within each age and gender group $Age_{ix}$. The feature factors are denoted in turn as $\mathrm{StdF}_1(P_a\,Age_{ix}), \mathrm{StdF}_2(P_a\,Age_{ix}), \mathrm{StdF}_3(P_a\,Age_{ix}), \ldots, \mathrm{StdF}_m(P_a\,Age_{ix})$. This allows the feature vector of the characteristic entity for a specific age group and gender to be generated: $\mathrm{StdV}(P_a\,Age_{ix})_v = \big(\mathrm{StdF}_1(P_a\,Age_{ix}), \mathrm{StdF}_2(P_a\,Age_{ix}), \mathrm{StdF}_3(P_a\,Age_{ix}), \ldots, \mathrm{StdF}_m(P_a\,Age_{ix})\big)$, where m is the number of feature factors of the characteristic statistical index in group $P_a$. These form the characteristic data of the standard body.

[0116] Step 5: For insured users, their data can be obtained and then segmented by age and gender into segments $\mathrm{Age}_{ix}$. The data of each feature factor of the insured user are calculated separately for each group, and the feature factors are denoted in turn as $\mathrm{NatF}_1(P_a\,\mathrm{Age}_{ix}), \mathrm{NatF}_2(P_a\,\mathrm{Age}_{ix}), \mathrm{NatF}_3(P_a\,\mathrm{Age}_{ix}), \ldots, \mathrm{NatF}_m(P_a\,\mathrm{Age}_{ix})$. This generates the feature vector of the insured user's entity data for the corresponding age stage and gender, $\mathrm{NatV}(P_a\,\mathrm{Age}_{ix})_v = \big(\mathrm{NatF}_1(P_a\,\mathrm{Age}_{ix}), \mathrm{NatF}_2(P_a\,\mathrm{Age}_{ix}), \mathrm{NatF}_3(P_a\,\mathrm{Age}_{ix}), \ldots, \mathrm{NatF}_m(P_a\,\mathrm{Age}_{ix})\big)$, where m is the number of feature factors of the characteristic statistical indicators in group $P_a$.
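Steps 4 and 5 can be sketched as follows. This is an illustration under stated assumptions, not the method fixed by this application: the 5-year band width (chosen to match bands such as [20, 24]), the use of the arithmetic mean as the statistic for the standard body, and all function names are hypothetical, and the factor values are assumed to be already quantized to numbers.

```python
from statistics import mean


def age_band(age: int, width: int = 5) -> tuple[int, int]:
    """Return an inclusive age band such as (20, 24); the width is an assumption."""
    low = (age // width) * width
    return (low, low + width - 1)


def group_vector(record: dict, factors: list[str]) -> list[float]:
    """NatV for one group: the insured user's factor values in a fixed order."""
    return [float(record[name]) for name in factors]


def standard_vector(records: list[dict], factors: list[str]) -> list[float]:
    """StdV for one group: each factor averaged over the standard-body records
    of one age band and gender (averaging is an assumed statistic)."""
    return [mean(float(r[name]) for r in records) for name in factors]
```

Both vectors use the same factor order, so element o of the insured user's vector always lines up with element o of the standard body's vector.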

[0117] Step 6: Calculate the cosine similarity between the feature factor vector of each insured user and the feature factor vector of the standard body, and denote the result as $\mathrm{Sim}(P_a\,\mathrm{Age}_{ix})_a$, where a is the group label.

[0118] The cosine similarity is calculated using the following formula:

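[0119] $$\mathrm{Sim}(P_a\,\mathrm{Age}_{ix})_a = \frac{\sum_{o=1}^{m} \mathrm{NatF}_o(P_a\,\mathrm{Age}_{ix})\,\mathrm{StdF}_o(P_a\,\mathrm{Age}_{ix})}{\sqrt{\sum_{o=1}^{m} \mathrm{NatF}_o(P_a\,\mathrm{Age}_{ix})^2}\;\sqrt{\sum_{o=1}^{m} \mathrm{StdF}_o(P_a\,\mathrm{Age}_{ix})^2}}$$

[0120] Among them, $\mathrm{NatV}(P_a\,\mathrm{Age}_{ix})_v$ represents the feature factor vector of the insured user, $\mathrm{StdV}(P_a\,\mathrm{Age}_{ix})_v$ represents the feature factor vector of the standard body, $\mathrm{StdF}_o(P_a\,\mathrm{Age}_{ix})$ represents the feature factors of group $P_a$ corresponding to the standard body of age stage i and gender x, $\mathrm{NatF}_o(P_a\,\mathrm{Age}_{ix})$ represents the feature factors of group $P_a$ corresponding to the insured user of age stage i and gender x, and m is the number of feature factors in group $P_a$.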
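The cosine similarity of one group can be sketched in a few lines. The function name and the handling of an all-zero vector (returning 0.0) are assumptions made for illustration; this application does not say how that edge case is treated.

```python
import math
from typing import Sequence


def cosine_similarity(nat: Sequence[float], std: Sequence[float]) -> float:
    """Cosine similarity between an insured user's group vector (nat)
    and the standard body's group vector (std)."""
    if len(nat) != len(std):
        raise ValueError("both vectors must contain the same m feature factors")
    dot = sum(n * s for n, s in zip(nat, std))
    norm = math.sqrt(sum(n * n for n in nat)) * math.sqrt(sum(s * s for s in std))
    if norm == 0.0:
        return 0.0  # assumption: an all-zero vector is treated as dissimilar
    return dot / norm
```

Because cosine similarity depends only on the direction of the vectors, factors with large numeric ranges can dominate the result unless the quantization step scales them comparably.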

[0121] Step 7: Use the cosine similarity of each group as the intermediate similarity, and calculate the final similarity based on the intermediate similarity;

[0122] Specifically, the final similarity can be calculated by assigning weights to each set of feature factor vectors. The final similarity can be calculated using the following formula:

[0123] $$\mathrm{FS}(\mathrm{Age}_{ix}) = \sum_{a=1}^{v} \mathrm{Sim}(P_a\,\mathrm{Age}_{ix})_a \times \mathrm{Weight}_a$$

[0124] Among them, $\mathrm{FS}(\mathrm{Age}_{ix})$ represents the final similarity, $\mathrm{Sim}(P_a\,\mathrm{Age}_{ix})_a$ represents the intermediate similarity of the feature factor vectors of group a, $\mathrm{Weight}_a$ represents the weight value of the feature factor vectors of group a, and v represents the number of rows in the feature matrix. The number of intermediate similarities is equal to the number of groups of feature factor vectors.
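A minimal sketch of combining the intermediate similarities, assuming the final similarity is a plain weighted sum over the groups. The function name, group names, and weights are hypothetical.

```python
def final_similarity(intermediate: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of per-group intermediate similarities (assumed form)."""
    if intermediate.keys() != weights.keys():
        raise ValueError("every group needs exactly one similarity and one weight")
    return sum(intermediate[group] * weights[group] for group in intermediate)
```

For example, two groups with intermediate similarities 1.0 and 0.5 and weights 0.4 and 0.6 combine to 0.4 + 0.3 = 0.7.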

[0125] In implementation, a weight can be set for each group: BMI underwriting risk, occupational underwriting risk, policy claims history underwriting risk, financial capability risk, medical history underwriting risk, and lifestyle underwriting risk. If the feature factors are divided into a groups, then a intermediate similarities are generated. In actual business scenarios, the risk emphasis may vary by product type. For example, accident insurance places greater emphasis on the insured's lifestyle and occupational risks, because an insured person who practices extreme sports or works as a criminal police officer has a higher risk of accidents than one who does not practice extreme sports or who has an ordinary occupation. Higher weights can therefore be assigned to lifestyle and occupational risks for accident insurance. Similarly, medical products pay greater attention to medical history and claims history risks, so higher weights can be assigned to those risks for medical products. The weight of each group for each product can be set according to actual product needs, and this application does not limit this.
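Product-specific weighting can be expressed as a lookup table. The numbers below are invented purely for illustration and are not taken from this application; the requirement that each product's weights sum to 1 is also an assumption.

```python
# Hypothetical weights: accident insurance emphasizes lifestyle and occupation,
# medical products emphasize medical history and claims history.
PRODUCT_WEIGHTS: dict[str, dict[str, float]] = {
    "accident": {
        "bmi": 0.05, "occupation": 0.30, "claims_history": 0.10,
        "financial_capability": 0.10, "medical_history": 0.15, "lifestyle": 0.30,
    },
    "medical": {
        "bmi": 0.10, "occupation": 0.10, "claims_history": 0.30,
        "financial_capability": 0.10, "medical_history": 0.30, "lifestyle": 0.10,
    },
}


def weights_for(product: str) -> dict[str, float]:
    """Return the group weights of a product, checking that they sum to 1."""
    weights = PRODUCT_WEIGHTS[product]
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError(f"weights for {product!r} must sum to 1")
    return weights
```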

[0126] The weights of each feature factor group vector for different products can be determined by, for example, the method shown in Figure 3: obtain past underwriting conclusion data; decompose the underwriting conclusion data and perform statistical analysis on each conclusion; aggregate the conclusions by underwriting group dimension to classify the conclusion data; analyze the conclusion statistics for the different dimensions on different products to abstract the weight data; and then manually calibrate the weight data to obtain the weight value of each dimension (i.e., each group).
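The statistical part of this process can be illustrated with a deliberately simple stand-in. The sketch below is not the weight formula of this application: it assumes each past underwriting conclusion has already been attributed to one group dimension, uses the share of conclusions per dimension as the raw weight, adds an optional manual adjustment, and renormalizes. All names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def weights_from_conclusions(
    conclusions: Iterable[str],
    adjustments: Optional[dict[str, float]] = None,
) -> dict[str, float]:
    """Derive group weights from past conclusions (assumed share-based scheme).

    conclusions: one group name per past underwriting conclusion.
    adjustments: optional manual calibration added to the raw share per group.
    """
    counts = Counter(conclusions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one past conclusion is required")
    adjustments = adjustments or {}
    raw = {g: counts[g] / total + adjustments.get(g, 0.0) for g in counts}
    norm = sum(raw.values())
    return {g: value / norm for g, value in raw.items()}
```

With no adjustments the result depends only on the historical statistics; a non-zero adjustment shifts weight toward the calibrated group before renormalization.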

[0127] The weights for each dimension can be calculated by arranging the statistical data into a matrix. This takes the interactions between dimensions into account and yields more accurate weight values. For example, as shown in Table 1 below, a matrix $w_{ak}$ with a rows and k columns can be constructed. By performing a division operation on the item counts for the data elements in row a and column k, the following table is obtained:

[0128] Table 1

[0129]

[0130]

[0131] Then, the weight value for each dimension is calculated using the following formula:

[0132]

[0133] Among them, $w_{ak}$ represents the element in row a, column k; $wz_k$ represents the vector-normalized array of matrix $w_{ak}$; and $adj_a$ represents a manually adjusted value. If $adj_a$ is 0, the result is calculated entirely based on the system simulation; otherwise, the weight has been manually adjusted.

[0134] Step 8: After the final similarity is determined, it can be compared with the preset similarity threshold. If it is greater than the similarity threshold, the insured user can be determined to meet the standard. If it is less than the similarity threshold, the insured user can be determined not to meet the standard.

[0135] Furthermore, when classifying the results, a conditional standard can be set up in addition to the two classes of insured users who meet the standard and those who do not. That is, an insured user who does not meet the standard can still be treated as a conditional standard body if certain conditions are met.

[0136] That is, if the similarity is less than the threshold, the relative difference can be calculated. If the relative difference GapValue < 0.5, it can be regarded as a conditional standard object. If the relative difference GapValue >= 0.5, it can be regarded as a rejection object.

[0137] The relative difference can be calculated using the following formula:

[0138]

[0139] Where GapValue represents the relative difference, Similarity represents the final similarity, and thresholdValue represents the similarity threshold.
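The three-way decision can be sketched as below. The exact expression for GapValue is not reproduced in this text, so the form `(thresholdValue - Similarity) / thresholdValue` used here is an assumption, as are the function name and the returned labels. A similarity equal to the threshold is treated as meeting the standard.

```python
def classify(similarity: float, threshold: float, gap_limit: float = 0.5) -> str:
    """Classify an insured user as standard, conditional standard, or rejected."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if similarity >= threshold:
        return "standard"
    # Assumed form of the relative difference; the source formula is not shown.
    gap_value = (threshold - similarity) / threshold
    return "conditional_standard" if gap_value < gap_limit else "reject"
```

With a hypothetical threshold of 0.8, a similarity of 0.6 gives a relative difference of 0.25 (conditional standard), while 0.3 gives 0.625 (reject).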

[0140] Specifically, insurance verification can be performed according to, for example, the process shown in Figure 4:

[0141] 1) Obtain data on the natural insured subject (i.e., the insured user), and then abstract the features into specific data, that is, perform feature calculation;

[0142] 2) Obtain data on the standard insured subject (i.e., the standard body), and then perform feature simulation;

[0143] 3) Group the features according to their dimensions and then vectorize them;

[0144] 4) Calculate the cosine similarity between the features of the natural insured subject and those of the standard insured subject;

[0145] 5) Introduce the weight data set for each dimension and calculate the final similarity;

[0146] 6) Determine whether the final similarity is less than the preset standard threshold. If it is less, output the risk level; if it is greater, determine that the underwriting is approved.
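The six steps above can be strung together in a compact, self-contained sketch. It assumes the features have already been quantized and grouped into vectors, that the final similarity is a weighted sum, and that a similarity equal to the threshold counts as approved; all names and sample values are hypothetical.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; an all-zero vector yields 0.0 (assumption)."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def verify(
    nat_groups: dict[str, list[float]],
    std_groups: dict[str, list[float]],
    weights: dict[str, float],
    threshold: float,
) -> tuple[float, str]:
    """Steps 3 to 6: per-group cosine similarity, weighted sum, threshold check."""
    sims = {g: cosine(nat_groups[g], std_groups[g]) for g in weights}
    final = sum(sims[g] * weights[g] for g in weights)
    status = "approved" if final >= threshold else "risk_level_required"
    return final, status


if __name__ == "__main__":
    std = {"bmi": [22.0, 21.5, 1.0], "lifestyle": [0.0, 1.0, 0.0, 22.0, 1.0]}
    nat = {"bmi": [23.0, 27.0, 1.0], "lifestyle": [1.0, 1.0, 0.0, 23.0, 1.0]}
    print(verify(nat, std, {"bmi": 0.5, "lifestyle": 0.5}, threshold=0.9))
```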

[0147] The following is a specific example to illustrate this:

[0148] Assuming a standard feature entity StdEntity is defined, and the target age group is [20, 24], its specific features are shown in Table 2 below:

[0149] Table 2

[0150]

[0151] Assuming insured user A1 falls in the age group [20, 24], the specific characteristics are shown in Table 3 below:

[0152] Table 3

[0153]

[0154] Assuming insured user A2 falls in the age group [20, 24], the specific characteristics are shown in Table 4 below:

[0155] Table 4

[0156]

[0157]

[0158] Assuming insured user A3 falls in the age group [20, 24], the specific characteristics are shown in Table 5 below:

[0159] Table 5

[0160]

[0161]

[0162] Based on the insurance verification method given in the example above, calculations were performed on insured users A1, A2, and A3 respectively, and the results are shown in Table 6 below:

[0163] Table 6

[0164]

| Insured user | A1 | A2 | A3 |
| --- | --- | --- | --- |
| Similarity | 0.777087 | 0.956183 | 0.556244 |
| Risk level | Conditional standard object | Standard body | Rejection object |

[0165] In the example above, by verifying the risk of the insured, the risk level of the insured can be determined. The premium can then be adjusted or additional charges can be added based on the risk level, so as to provide a reasonable insurance liability plan for a specific customer group.

[0166] In the example above, by analyzing standard body data and insured user data, and dividing them by age group and gender, specific characteristic factors are abstracted, and the vector similarity between them and the standard body is calculated to determine the risk level of the insured user. At the same time, when determining the similarity, the characteristic factors are grouped and calculated according to dimensions, and a weight factor is set for each dimension to obtain the final similarity result. By setting the weight value based on the dimensions, the accuracy of the verification results can be improved as needed, and the controllability of the verification process can be enhanced.

[0167] The method embodiments provided in this application can be executed in a mobile terminal, computer terminal, or similar computing device. Taking operation on an electronic device as an example, Figure 5 is a hardware structure block diagram of an electronic device for the underwriting data verification method provided in this application. As shown in Figure 5, the electronic device 10 may include one or more (only one is shown in the figure) processors 02 (a processor 02 may include, but is not limited to, a processing device such as a microcontroller unit (MCU) or a field-programmable gate array (FPGA)), a memory 04 for storing data, and a transmission module 06 for communication functions. Those skilled in the art will understand that the structure shown in Figure 5 is for illustrative purposes only and does not limit the structure of the electronic device described above. For example, the electronic device 10 may include more or fewer components than shown in Figure 5, or have a configuration different from that shown in Figure 5.

[0168] The memory 04 can be used to store software programs and modules of application software, such as the program instructions / modules corresponding to the underwriting data verification method in this embodiment. The processor 02 executes various functional applications and data processing by running the software programs and modules stored in the memory 04, thereby realizing the application's underwriting data verification method. The memory 04 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 04 may further include memory remotely located relative to the processor 02, and these remote memories can be connected to the electronic device 10 via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.

[0169] The transmission module 06 is used to receive or send data via a network. Specific examples of the network described above may include a wireless network provided by the communication provider of the electronic device 10. In one example, the transmission module 06 includes a Network Interface Controller (NIC), which can connect to other network devices via a base station to communicate with the Internet. In another example, the transmission module 06 may be a Radio Frequency (RF) module, used for wireless communication with the Internet.

[0170] At the software level, the aforementioned underwriting data verification device can, for example, be as shown in Figure 6 and may include:

[0171] The acquisition module 601 is used to acquire the characteristic data of the target insured user;

[0172] Quantization module 602 is used to quantize the feature data of the target insured user to obtain a target feature vector;

[0173] Matching module 603 is used to match the standard feature vector of the standard body associated with the target insured user based on the age and / or gender of the target insured user;

[0174] The determining module 604 is used to determine the target similarity between the target feature vector and the standard feature vector;

[0175] The verification module 605 is used to determine whether the target insured user passes the verification based on the relationship between the target similarity and the preset similarity threshold.
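One way to picture the five modules is as pluggable steps of a single object. This is a sketch only: the class name, field names, and callable signatures are hypothetical, and the comparison uses greater than or equal to for a pass.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence

Vector = Sequence[float]


@dataclass
class UnderwritingVerifier:
    quantize: Callable[[Mapping[str, object]], Vector]  # quantization module 602
    match_standard: Callable[[int, str], Vector]        # matching module 603
    similarity: Callable[[Vector, Vector], float]       # determining module 604
    threshold: float                                    # used by verification module 605

    def verify(self, applicant: Mapping[str, object]) -> bool:
        """applicant is the feature data obtained by acquisition module 601."""
        target = self.quantize(applicant)
        standard = self.match_standard(applicant["age"], applicant["gender"])
        return self.similarity(target, standard) >= self.threshold
```

Passing the steps in as callables keeps the module boundaries visible and lets each step, for instance the similarity measure, be replaced independently.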

[0176] In one implementation, the determining module 604 can specifically be used to calculate the cosine similarity between the target feature vector and the standard feature vector, and to use the calculated cosine similarity as the target similarity between the target feature vector and the standard feature vector.

[0177] In one implementation, the target feature vector and the standard feature vector can be divided into multiple groups according to dimensions, and each group includes at least two feature factors.

[0178] In one implementation, the determining module 604 can be specifically used to calculate the cosine similarity between the target feature vector and the standard feature vector corresponding to each of the multiple groups, to obtain multiple intermediate similarities; obtain the weight value corresponding to each group as the weight value of each intermediate similarity; and calculate the target similarity based on each intermediate similarity and the weight value corresponding to each intermediate similarity.

[0179] In one implementation, the determining module 604 can specifically calculate the intermediate similarity according to the following formula:

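[0180] $$\mathrm{Sim}(P_a\,\mathrm{Age}_{ix})_a = \frac{\sum_{o=1}^{m} \mathrm{NatF}_o(P_a\,\mathrm{Age}_{ix})\,\mathrm{StdF}_o(P_a\,\mathrm{Age}_{ix})}{\sqrt{\sum_{o=1}^{m} \mathrm{NatF}_o(P_a\,\mathrm{Age}_{ix})^2}\;\sqrt{\sum_{o=1}^{m} \mathrm{StdF}_o(P_a\,\mathrm{Age}_{ix})^2}}$$

[0181] Among them, $\mathrm{Sim}(P_a\,\mathrm{Age}_{ix})_a$ represents the intermediate similarity of group a, $\mathrm{NatV}(P_a\,\mathrm{Age}_{ix})_v$ represents the target feature vector, $\mathrm{StdV}(P_a\,\mathrm{Age}_{ix})_v$ represents the standard feature vector, $\mathrm{StdF}_o(P_a\,\mathrm{Age}_{ix})$ represents the feature factors of group $P_a$ corresponding to the standard body of age stage i and gender x, $\mathrm{NatF}_o(P_a\,\mathrm{Age}_{ix})$ represents the feature factors of group $P_a$ corresponding to the target insured user of age stage i and gender x, and m represents the number of feature factors contained in group a.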

[0182] In one implementation, the determining module 604 can specifically calculate the target similarity according to the following formula:

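[0183] $$\mathrm{FS}(\mathrm{Age}_{ix}) = \sum_{a=1}^{v} \mathrm{Sim}(P_a\,\mathrm{Age}_{ix})_a \times \mathrm{Weight}_a$$

[0184] Among them, $\mathrm{FS}(\mathrm{Age}_{ix})$ represents the final target similarity, where i represents the age group and x represents the gender; $\mathrm{Sim}(P_a\,\mathrm{Age}_{ix})_a$ represents the intermediate similarity of group a; $\mathrm{Weight}_a$ represents the weight value corresponding to group a; and v represents the number of groups.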

[0185] In one implementation, the determining module 604 can specifically determine the weight value corresponding to each dimension according to the following formula:

[0186]

[0187] Among them, $\mathrm{Weight}_a$ represents the weight value corresponding to group a, v represents the number of groups, $w_{ak}$ represents the relevance of group a itself, $wz_k$ represents the relevance of group a to other groups, and $adj_a$ represents the adjustment value.

[0188] In one implementation, the verification module 605 may specifically determine that the verification has failed if the target similarity is less than the similarity threshold, and determine that the verification has passed if the target similarity is greater than or equal to the similarity threshold.

[0189] In one implementation, after determining that the verification has failed because the target similarity is less than the similarity threshold, the verification module 605 may calculate the relative difference between the target similarity and the similarity threshold, and determine whether the target insured user has passed the verification based on the relationship between the relative difference and a preset comparison value.

[0190] This application also provides a specific implementation of an electronic device capable of implementing all steps of the underwriting data verification method in the above embodiments. The electronic device specifically includes: a processor, a memory, a communication interface, and a bus; wherein the processor, memory, and communication interface communicate with each other via the bus; the processor is used to call a computer program in the memory, and when the processor executes the computer program, it implements all steps of the underwriting data verification method in the above embodiments. For example, when the processor executes the computer program, it implements the following steps:

[0191] Step 1: Obtain the characteristic data of the target insured user;

[0192] Step 2: Quantize the feature data of the target insured user to obtain the target feature vector;

[0193] Step 3: Based on the age and / or gender of the target insured user, match the standard feature vector of the standard body associated with the target insured user;

[0194] Step 4: Determine the target similarity between the target feature vector and the standard feature vector;

[0195] Step 5: Determine whether the target insured user has passed verification based on the relationship between the target similarity and the preset similarity threshold.

[0196] As described above, this application embodiment quantifies the feature data of the target insured user to obtain a target feature vector, then matches it with the standard feature vector of the corresponding standard body, determines the similarity between the two, and thus determines whether the target insured user meets the requirements of the standard body, thereby realizing the verification of the insured user. This solution solves the problem of low efficiency and accuracy in existing underwriting processes that require manual processing, achieving a significant improvement in underwriting accuracy and efficiency.

[0197] Embodiments of this application also provide a computer-readable storage medium capable of implementing all steps of the underwriting data verification method in the above embodiments. The computer-readable storage medium stores a computer program that, when executed by a processor, implements all steps of the underwriting data verification method in the above embodiments. For example, when the processor executes the computer program, it implements the following steps:

[0198] Step 1: Obtain the characteristic data of the target insured user;

[0199] Step 2: Quantize the feature data of the target insured user to obtain the target feature vector;

[0200] Step 3: Based on the age and / or gender of the target insured user, match the standard feature vector of the standard body associated with the target insured user;

[0201] Step 4: Determine the target similarity between the target feature vector and the standard feature vector;

[0202] Step 5: Determine whether the target insured user has passed verification based on the relationship between the target similarity and the preset similarity threshold.

[0203] As described above, this application embodiment quantifies the feature data of the target insured user to obtain a target feature vector, then matches it with the standard feature vector of the corresponding standard body, determines the similarity between the two, and thus determines whether the target insured user meets the requirements of the standard body, thereby realizing the verification of the insured user. This solution solves the problem of low efficiency and accuracy in existing underwriting processes that require manual processing, achieving a significant improvement in underwriting accuracy and efficiency.

[0204] The acquisition, storage, use, and processing of data in this application all comply with relevant national laws and regulations. The various embodiments in this specification are described in a progressive manner; similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, for hardware + program embodiments, since they are basically similar to the method embodiments, the description is relatively simple; relevant parts can be referred to the descriptions in the method embodiments.

[0205] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired result. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired result. In some embodiments, multitasking and parallel processing are possible or may be advantageous.

[0206] While this application provides the method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps listed in the embodiments is merely one possible execution order among many and does not represent the only execution order. In actual device or client product execution, the methods shown in the embodiments or drawings can be executed sequentially or in parallel (e.g., in a parallel processor or multi-threaded processing environment).

[0207] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. Specifically, a computer can be, for example, a personal computer, a laptop computer, an in-vehicle human-machine interaction device, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or any combination of these devices.

[0208] While this specification provides method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive means. The order of steps listed in the embodiments is merely one possible execution order among many and does not represent the only execution order. In actual device or end product execution, the methods shown in the embodiments or drawings may be executed sequentially or in parallel (e.g., in a parallel processor or multi-threaded processing environment, or even a distributed data processing environment). The terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, product, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, product, or apparatus. Without further limitations, the presence of other identical or equivalent elements in the process, method, product, or apparatus that includes said elements is not excluded.

[0209] For ease of description, the above devices are described in terms of function, divided into various modules. Of course, in implementing the embodiments of this specification, the functions of each module can be implemented in one or more software and / or hardware components, or a module that performs the same function can be implemented by a combination of multiple sub-modules or sub-units. The device embodiments described above are merely illustrative. For example, the division of units is only a logical functional division; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces, or indirect coupling or communication connection between devices or units, and may be electrical, mechanical, or other forms.

[0210] This application is described with reference to flowcharts and / or block diagrams of methods, devices (systems), and computer program products according to embodiments of this application. It will be understood that each process and / or block in the flowcharts and / or block diagrams, and combinations of processes and / or blocks in the flowcharts and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0211] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including an instruction device, which implements the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0212] These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0213] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.

[0214] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0215] Computer-readable media include permanent and non-permanent, removable and non-removable media that can store information using any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0216] Those skilled in the art will understand that the embodiments of this specification can be provided as methods, systems, or computer program products. Therefore, the embodiments of this specification can take the form of entirely hardware embodiments, entirely software embodiments, or embodiments combining software and hardware aspects. Furthermore, the embodiments of this specification can take the form of computer program products implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0217] The embodiments described in this specification can be described in the general context of computer-executable instructions, such as program modules, that are executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a specific task or implement a specific abstract data type. The embodiments of this specification can also be practiced in distributed computing environments where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.

[0218] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, system embodiments are basically similar to method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions in the method embodiments. In the description of this specification, the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., refer to specific features, structures, materials, or characteristics described in connection with that embodiment or example, which are included in at least one embodiment or example of the embodiments in this specification. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described can be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification and the features of different embodiments or examples.

[0219] The above description is merely an embodiment of the present specification and is not intended to limit the embodiments of the present specification. For those skilled in the art, various modifications and variations can be made to the embodiments of the present specification. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principle of the embodiments of the present specification should be included within the scope of the claims of the embodiments of the present specification.

Claims

1. A method for verifying underwriting data, characterized in that the method comprises: obtaining feature data of a target insured user; quantifying the feature data of the target insured user to obtain a target feature vector; matching, based on the age and / or gender of the target insured user, a standard feature vector of a standard body associated with the target insured user; determining a target similarity between the target feature vector and the standard feature vector; and determining, based on the relationship between the target similarity and a preset similarity threshold, whether the target insured user has passed the verification; wherein the target feature vector and the standard feature vector are divided into multiple groups according to dimensions, and each group includes at least two feature factors; determining the target similarity between the target feature vector and the standard feature vector comprises: calculating the cosine similarity between the target feature vector and the standard feature vector corresponding to each of the multiple groups to obtain multiple intermediate similarities; obtaining the weight value corresponding to each group and using it as the weight value of each intermediate similarity; and calculating the target similarity based on each intermediate similarity and its corresponding weight value; the weight value for each group is determined according to the following formula: wherein $\mathrm{Weight}_a$ represents the weight value corresponding to group a, v represents the number of groups, $w_{ak}$ represents the degree of relevance of group a itself, $wz_k$ represents the degree of relevance of group a to other groups, and $adj_a$ represents an adjustment value.

2. The method according to claim 1, characterized in that determining the target similarity between the target feature vector and the standard feature vector comprises: calculating the cosine similarity between the target feature vector and the standard feature vector; and using the calculated cosine similarity as the target similarity between the target feature vector and the standard feature vector.
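A minimal sketch of the whole-vector cosine similarity named in claim 2 is shown below. The function name and the handling of zero vectors are assumptions of this sketch; the claim does not address that edge case.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumed behaviour: cosine similarity is undefined for a zero vector.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Because cosine similarity depends only on direction, a target vector that is a positive scalar multiple of the standard vector scores 1.0 regardless of magnitude.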

3. The method according to claim 1, characterized in that the intermediate similarity is calculated using a formula in which the terms represent, respectively: the intermediate similarity of a group; the target feature vector; the standard feature vector; the feature factors of that group for the standard body of the corresponding age group and gender; the feature factors of that group for the target insured user of the corresponding age group and gender; and the number of feature factors contained in the group.
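The formula for this claim is not reproduced in the text. Since claim 1 states that each intermediate similarity is a cosine similarity computed within one group, a plausible form is the standard cosine expression below. All symbol names are chosen here for illustration and are not taken from the patent: $s_k^{(a,g)}$ is the intermediate similarity of group $k$ for age group $a$ and gender $g$, $x_{k,i}^{(a,g)}$ and $y_{k,i}^{(a,g)}$ are the $i$-th feature factors of group $k$ for the target insured user and the standard body respectively, and $n_k$ is the number of feature factors in group $k$.

```latex
s_k^{(a,g)} =
\frac{\sum_{i=1}^{n_k} x_{k,i}^{(a,g)} \, y_{k,i}^{(a,g)}}
     {\sqrt{\sum_{i=1}^{n_k} \bigl(x_{k,i}^{(a,g)}\bigr)^{2}} \;
      \sqrt{\sum_{i=1}^{n_k} \bigl(y_{k,i}^{(a,g)}\bigr)^{2}}}
```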

4. The method according to claim 1, characterized in that the target similarity is calculated using a formula in which the terms represent, respectively: the final target similarity; the age group; the gender; the intermediate similarity of a group; the weight value corresponding to that group; and the number of groups.
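One way to combine the intermediate similarities with their weights is a plain weighted sum, sketched below. This is an assumption: the claim lists the terms of the formula but its exact form is not reproduced in the text, and a normalized variant would be equally consistent with the wording. The weights are supplied by the caller because the weight formula of claim 1 is likewise not reproduced.

```python
from typing import Dict


def target_similarity(
    intermediate: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Combine per-group similarities into one score (assumed: weighted sum)."""
    if intermediate.keys() != weights.keys():
        raise ValueError("every group needs exactly one weight")
    return sum(weights[group] * intermediate[group] for group in intermediate)
```

If the weights sum to 1, the result stays within the range spanned by the intermediate similarities, which keeps it comparable with a fixed similarity threshold.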

5. The method according to claim 1, characterized in that determining, based on the relationship between the target similarity and the preset similarity threshold, whether the target insured user passes verification comprises: if the target similarity is less than the similarity threshold, determining that the verification has failed; and if the target similarity is greater than or equal to the similarity threshold, determining that the verification has succeeded.

6. The method according to claim 5, characterized in that, after determining that the verification has failed because the target similarity is less than the similarity threshold, the method further comprises: calculating the relative difference between the target similarity and the similarity threshold; and determining, based on the relationship between the relative difference and a preset comparison value, whether the target insured user passes verification.
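The threshold check of claim 5 and the follow-up check of claim 6 could be sketched together as below. Two points are assumptions of this sketch rather than statements of the claims: the relative difference is taken as the shortfall divided by the threshold, and a user who falls short of the threshold is treated as passing when that relative difference does not exceed the comparison value. The claims do not state the formula or the direction of the comparison.

```python
def passes_verification(
    similarity: float, threshold: float, comparison_value: float
) -> bool:
    """Threshold check with an assumed near-miss tolerance."""
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    if similarity >= threshold:
        return True  # claim 5: at or above the threshold passes
    # Claim 6 (assumed form): relative shortfall against the threshold.
    relative_difference = (threshold - similarity) / threshold
    return relative_difference <= comparison_value
```

With a threshold of 0.85 and a comparison value of 0.05, a similarity of 0.84 falls short by roughly 1.2% and passes under these assumptions, while a similarity of 0.5 does not.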

7. A device for verifying underwriting data, characterized in that the device comprises: an acquisition module for acquiring feature data of a target insured user; a quantization module for quantifying the feature data of the target insured user to obtain a target feature vector; a matching module for matching, based on the age and/or gender of the target insured user, a standard feature vector of a standard body associated with the target insured user; a determination module for determining a target similarity between the target feature vector and the standard feature vector; and a verification module for determining, based on the relationship between the target similarity and a preset similarity threshold, whether the target insured user passes verification; wherein the target feature vector and the standard feature vector are each divided into multiple groups according to dimension, each group including at least two feature factors; wherein determining the target similarity between the target feature vector and the standard feature vector comprises: calculating, for each of the multiple groups, the cosine similarity between the corresponding portions of the target feature vector and the standard feature vector, to obtain multiple intermediate similarities; obtaining the weight value corresponding to each group and using it as the weight value of the corresponding intermediate similarity; and calculating the target similarity based on each intermediate similarity and its corresponding weight value; and wherein the weight value of each group is determined according to a formula in which the terms represent, respectively: the weight value corresponding to the group, the number of groups, the degree of relevance of the group to itself, the degree of relevance of the group to the other groups, and an adjustment value.

8. A terminal device, comprising a processor and a memory for storing processor-executable instructions, characterized in that, when the processor executes the instructions, the steps of the method according to any one of claims 1 to 6 are implemented.

9. A computer-readable storage medium having a computer program/instructions stored thereon, characterized in that, when the computer program/instructions are executed by a processor, the steps of the method according to any one of claims 1 to 6 are implemented.