BERT work order dispatch model training method, device, and storage medium based on multi-factor decision-making

By preprocessing the original work order data and applying a two-layer filtering mechanism, combined with a large natural language model and dual quality checks, the training of the dispatch model is optimized. This addresses the insufficient multi-factor decision-making of existing dispatch systems and improves dispatch accuracy and generalization ability.

CN122264352A · Pending · Publication Date: 2026-06-23 · JIANGSU LIYUN TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
JIANGSU LIYUN TECHNOLOGY CO LTD
Filing Date
2026-02-10
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

In existing dispatch systems, the dispatch model training method does not fully integrate decision factors such as address information, event type, and involved unit. The training logic is coarse and offers no targeted error optimization for multi-factor decision-making scenarios. Furthermore, the model generalizes poorly to niche event types and special address areas.

Method used

The original work order data is preprocessed, and a two-layer filtering mechanism and a multi-factor decision association layer are constructed. Combined with a large natural language model and dual quality checks, data expansion and model optimization are carried out to generate an optimized model.

Benefits of technology

The system improves the dispatch model's ability to learn multi-factor correlation features, reduces the mis-dispatch rate in scenarios with logical changes, enhances generalization to niche event types and special address areas, and achieves high-accuracy dispatch output.



Abstract

This invention discloses a training method and apparatus for a BERT dispatch model based on multi-factor decision-making. The method includes: acquiring raw work order data and preprocessing it; filtering out work orders with logical changes to obtain a training dataset; performing initial model training based on the training dataset to obtain an initial dispatch model; constructing a test dataset and performing multi-factor decision error analysis; performing data augmentation; performing secondary model training to generate an optimized model; and performing iterative optimization and model deployment. By standardizing multi-dimensional decision factors, adding a cleaning step for data based on outdated standards, refining fusion training, conducting specialized error analysis, and expanding data with controllable quality, the invention improves the dispatch model's ability to learn multi-factor correlation features and to adapt to logical change scenarios. It also designs an optimized filtering mechanism for logical change work orders, so that the model quickly reaches the preset dispatch accuracy standard.

Description

Technical Field

[0001] This invention relates to the field of work order dispatch model training technology, specifically to a BERT work order dispatch model training method, device, and storage medium based on multi-factor decision-making.

Background Technology

[0002] In existing dispatch systems, the accuracy of dispatch decisions depends on a comprehensive assessment of core factors such as address information, event type, and involved units. The training quality of the dispatch model directly determines the accuracy of the correlation analysis of these factors.

[0003] Currently, order dispatch model training methods based on pre-trained models generally suffer from the following key problems:

[0004] First, the work order data processing only focuses on the request text and does not specifically address the hierarchical standardization of address information, the accuracy of event type classification, and the consistency of matching of involved units, which makes the model unable to effectively learn the correlation features of multiple decision factors;

[0005] Second, the model training logic is crude, and the fusion mechanism of multi-factor features is not clearly defined. Training based on only a single text feature makes it difficult to support accurate order dispatch decisions.

[0006] Third, the model training lacks specific analysis for multi-factor decision-making errors, and fails to accurately locate and optimize subdivided error types such as "address matching deviation", "event type misjudgment" and "error in associating involved units".

[0007] Fourth, there is an uneven distribution of work order data. Existing data expansion methods do not combine multi-factor features to generate data that conforms to business logic, resulting in poor generalization ability of the model in scenarios such as niche event types and special address areas.

[0008] BERT (Bidirectional Encoder Representations from Transformers) possesses powerful contextual semantic understanding and multi-feature fusion capabilities, demonstrating excellent performance in natural language processing and multi-dimensional decision-making tasks. Secondary training based on BERT pre-trained models to construct dispatch models is an effective way to improve dispatch accuracy. However, how to fully leverage the advantages of the BERT model and solve the aforementioned problems of existing training methods in multi-factor decision-making dispatch scenarios through targeted data processing, refined multi-factor fusion training logic, and precise error analysis and optimization mechanisms has become a pressing technical challenge.

Summary of the Invention

[0009] Existing dispatch model training methods fail to fully integrate decision factors such as address information, event type, and involved units, have coarse training logic, lack targeted error optimization, and cannot effectively handle dispatch logic change scenarios. To address these problems, this invention aims to provide a BERT dispatch model training method, device, and storage medium based on multi-factor decision-making. This is achieved through standardized processing of multi-dimensional decision factors, an added cleaning step for data based on outdated standards, refined fusion training, specialized error analysis, and quality-controlled data expansion. In addition, an optimized filtering mechanism for dispatch logic change work orders is designed. Together these enhance the dispatch model's ability to learn multi-factor correlation features and its adaptability to logic change scenarios, so that the model quickly reaches the preset dispatch accuracy standard.

[0010] To achieve the above objectives, in a first aspect, embodiments of the present invention provide a BERT dispatch model training method based on multi-factor decision-making, comprising:

[0011] Obtain the original work order data and preprocess the original work order data;

[0012] Based on a two-layer filtering mechanism of keyword indexing and logical change feature matching, the preprocessed original work order data is subjected to logical change work order filtering to obtain a training dataset.

[0013] The initial dispatch model is obtained by performing the first model training based on the training dataset.

[0014] Construct a test dataset and perform multi-factor decision error analysis on the initial dispatch model based on the test dataset to obtain the analysis results;

[0015] Based on the analysis results, data augmentation is performed using a large natural language model and a dual quality check method.

[0016] The initial dispatch model is trained a second time based on the expanded data to generate an optimized model.

[0017] As a specific implementation of this application, the original work order data includes multiple core fields, including the request content, address information, event type, name of the involved unit, dispatch basis, and dispatch standard version; the original work order data is preprocessed as follows:

[0018] Construct a keyword dictionary and standardized expression template for requests, and use the keyword dictionary and standardized expression template to match and correct the content of requests in the original work order data;

[0019] Address segmentation and hierarchical calibration algorithms are used to perform hierarchical splitting and standardization of the address information from province to city to district to street to community / segment, generating standardized address features;

[0020] The event types in the original work order data are encoded and calibrated according to the preset event type classification system;

[0021] A whitelist of involved entities is constructed, and the whitelist of involved entities is used to perform fuzzy matching and standardized correction on the names of involved entities in the original work order data;

[0022] Construct a current dispatch standard version library and a valid dispatch basis dictionary to filter out work orders in the original work order data that are based on outdated dispatch standards or contain invalid dispatch basis.

[0023] Filter out redundant work orders in the original work order data whose number of missing core fields exceeds a preset threshold, whose format is incorrect, or which are submitted repeatedly.

[0024] As one specific implementation of this application, the training dataset is obtained as follows:

[0025] Construct a dispatch logic change filtering layer; the dispatch logic change filtering layer includes a first layer and a second layer; the first layer is a keyword index filtering layer, and the second layer is a logic change feature matching layer;

[0026] The keyword indexing and filtering layer is used to perform keyword matching on the input raw work order data, and work orders that match the keywords are intercepted.

[0027] The logical change feature matching layer is used to perform cosine similarity matching on the original work order data, and work orders with cosine similarity reaching the threshold are intercepted.

[0028] The original work order data remaining after logical change work order filtering forms the training dataset.

[0029] As a specific implementation of this application, a dispatch logic change filtering layer is constructed, specifically as follows:

[0030] Collect historical work orders that change the dispatch logic, extract core keywords related to the dispatch logic changes, and build a keyword index library based on the core keywords to form a keyword index filtering layer;

[0031] Multi-dimensional features are extracted from the historical work orders with logical changes, and a logical change feature library is constructed based on the multi-dimensional features to form a logical change feature matching layer; the multi-dimensional features include event type, address region, type of involved unit, and semantic features of the request.

[0032] As a specific implementation of this application, the initial model training is performed based on the training dataset to obtain the initial dispatch model, specifically as follows:

[0033] The standardized request text, address features, event type codes, and involved unit features in the training dataset are concatenated to generate multi-dimensional fusion input features;

[0034] Configure the training model; the training model includes a BERT pre-trained model and a multi-factor decision association layer;

[0035] The multi-dimensional fused input features are divided into a training set and a validation set;

[0036] The training model is trained based on the training set and validation set to obtain an initial dispatch model; wherein, the training process adopts an early stopping mechanism.

[0037] As a specific implementation of this application, a test dataset is constructed, and a multi-factor decision error analysis is performed on the initial dispatch model based on the test dataset to obtain the analysis results, specifically:

[0038] Select work order data that did not participate in training within a preset time period to construct a test dataset containing multi-dimensional core fields;

[0039] The test dataset is filtered using the keyword indexing filtering layer and the logical change feature matching layer.

[0040] The filtered test dataset is input into the initial dispatch model for accuracy testing, and the test data with dispatch errors are selected.

[0041] Based on multiple decision factors, the test data of dispatch errors are labeled with error types to obtain various subdivided error data; the error types include address matching deviation, event type misjudgment, error in association of involved units, and multi-factor crossover error;

[0042] By performing feature statistics on various subdivided error data, the multidimensional feature distribution patterns of the error data are obtained.

[0043] As a specific implementation of this application, based on the analysis results, a large natural language model and a dual quality check method are used to augment the data, and the initial dispatch model is trained a second time based on the augmented data to generate an optimized model, specifically as follows:

[0044] Perform data cleaning and repair on various types of erroneous data;

[0045] Based on the multi-dimensional feature distribution pattern of the erroneous data, the training set is augmented using a large natural language model and a dual quality check method to obtain an optimized training dataset.

[0046] Based on the optimized training dataset, the feature fusion and model training processes are repeated to generate an optimized model.

[0047] As a preferred implementation of this application, after generating the optimized model, the method further includes:

[0048] For each model obtained from training, repeat the multi-factor decision error analysis, data augmentation, and secondary training.

[0049] The optimized model is deployed jointly with the logical change work order filtering to perform work order dispatch.

[0050] In a second aspect, embodiments of the present invention also provide a BERT dispatch model training device based on multi-factor decision-making, including a processor, an input device, an output device, and a memory, wherein the processor, input device, output device, and memory are interconnected, the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method described in the first aspect.

[0051] In a third aspect, embodiments of the present invention also provide a computer-readable storage medium storing a computer program, the computer program including program instructions, which, when executed by a processor, cause the processor to perform the method described in the first aspect.

[0052] The advantages of implementing the BERT dispatch model training scheme based on multi-factor decision-making provided in the embodiments of the present invention are as follows:

[0053] 1. By limiting the data to valid work orders within the past year and supplementing it with cleaning steps to remove outdated dispatching standards and invalid criteria, the timeliness and validity of the training data are ensured, and invalid data is prevented from interfering with model training.

[0054] 2. A two-layer filtering mechanism of "keyword indexing + logical change feature matching" is designed. It retains the lightweight advantage of keyword indexing (controlling computing costs) and uses feature matching to compensate for the incomplete coverage of keyword indexing. It can comprehensively and accurately intercept work orders with logical changes, significantly reducing the mis-dispatch rate in logical change scenarios. At the same time, a dynamic update channel is set up so that the filtering mechanism adapts to subsequent logical change requirements.

[0055] 3. By standardizing the processing of core decision factors such as address information, event type, and involved units, multi-dimensional fusion features are constructed, which solves the problem that existing methods focus only on the request text and ignore key decision factors, and provides a more comprehensive feature foundation for the model;

[0056] 4. The training logic of the multi-factor fusion model has been refined. By adding a multi-factor decision association layer and attention mechanism, the cross-association learning of core factors has been strengthened, and the model's ability to understand the core logic of order dispatch decision has been improved.

[0057] 5. Based on multi-factor decision-making dimensions, detailed error analysis and attribution are performed, making model optimization more targeted and effectively solving dispatching errors in specific scenarios such as "address matching deviation" and "event type misjudgment";

[0058] 6. The data augmentation method of "large natural language model + dual quality verification" is adopted to ensure the effectiveness of the augmented data and business consistency. At the same time, combined with multi-factor feature constraints, the generalization ability of the model in unbalanced scenarios such as niche event types and special address areas is improved.

[0059] 7. The iterative optimization mechanism ensures that the model can continuously improve the accuracy of multi-factor decision-making, ultimately achieving high-accuracy dispatch output and significantly improving dispatch efficiency and resource allocation rationality.

Attached Figure Description

[0060] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the accompanying drawings used in the description of the specific embodiments or the prior art will be briefly introduced below.

[0061] Figure 1 is a flowchart of the BERT dispatch model training method based on multi-factor decision-making provided in the first embodiment of the present invention;

[0062] Figure 2 is a flowchart of the BERT dispatch model training method based on multi-factor decision-making provided in the second embodiment of the present invention;

[0063] Figure 3 is a structural diagram of the BERT dispatch model training device based on multi-factor decision-making provided in an embodiment of the present invention.

Detailed Implementation

[0064] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0065] It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.

[0066] Please refer to Figure 1, which shows the BERT dispatch model training method based on multi-factor decision-making provided in the first embodiment of the present invention. In this embodiment, the method flow is implemented based on the following software structure:

[0067] M1, original work order data access;

[0068] M2, multi-dimensional data preprocessing;

[0069] M3, dispatch logic change work order filtering;

[0070] M4, multi-factor feature fusion;

[0071] M5, BERT base model and multi-factor decision association layer;

[0072] M6, multi-factor decision error analysis and attribution;

[0073] M7, quality-controllable data expansion (including the large natural language model and dual verification);

[0074] M8, model iterative optimization;

[0075] M9, linked deployment of the final dispatch model and the filtering layer;

[0076] M10, supporting standards / dictionaries / feature libraries (including the dispatch standard version library, the whitelist library of involved units, the logical change feature library, etc.).

[0077] As shown in Figure 1, the training steps of the BERT dispatch model based on multi-factor decision-making in this embodiment mainly include:

[0078] S1, Data Access and Preprocessing: M1 obtains valid original work order data from the past year, and M2 performs multi-dimensional cleaning and standardization processing, including content calibration, address standardization, and event type standardization, filtering out invalid and non-standard data to obtain a clean training dataset.

[0079] S2, Logic Change Work Order Filtering: Through the "keyword index + logic change feature matching" two-layer filtering mechanism built by M3, work orders with logic changes are intercepted to avoid such work orders interfering with model training and subsequent work order dispatch; at the same time, the filtering keywords and feature library are dynamically updated by M10 to ensure the timeliness of filtering.

[0080] S3, First Model Training: Qualified data filtered by S2 is fused with multi-dimensional features by M4 and input into M5 (configured with BERT base model + multi-factor decision association layer). The training set and validation set are divided proportionally, and the first training is completed using an early stopping mechanism to generate the initial order dispatch model.

[0081] S4, Error Analysis and Attribution: The initial or intermediate model is evaluated on the test dataset. M6 selects the data with dispatch errors, subdivides the error types, and mines the feature distribution patterns to complete the error attribution analysis.

[0082] S5, Data Augmentation and Secondary Training: Based on the error attribution results of S4, data augmentation with controllable quality is carried out through M7 using the "large natural language model + dual quality check" method. After the augmented data is added to the training dataset, the feature fusion and model training process of S3 is repeated to generate an optimized model.

[0083] S6, Iterative Optimization: By coordinating and repeating the S4-S5 process through M8, the model performance is continuously optimized until the overall order dispatch accuracy and the accuracy of various sub-error types of the model reach the preset standards.

[0084] S7, Linked Deployment and Order Dispatch: The final model that meets the standards is deployed in conjunction with the M3 filtering layer. M9 implements the complete process of "filtering logic change work orders first, and then accurately dispatching orders" and outputs the order dispatch results.
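The S4 to S6 loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: `evaluate`, `augment`, and `retrain` are hypothetical callables standing in for M6, M7, and the S3 training process, and the target accuracy and round limit are assumed values that the text does not specify.

```python
from typing import Callable, Dict, List, Tuple


def iterate_until_target(
    model: object,
    train_data: List[dict],
    evaluate: Callable[[object], Dict[str, float]],
    augment: Callable[[Dict[str, float]], List[dict]],
    retrain: Callable[[List[dict]], object],
    target: float = 0.95,   # assumed "preset standard"
    max_rounds: int = 5,    # assumed safety limit on iterations
) -> Tuple[object, List[Dict[str, float]]]:
    """Repeat error analysis (S4) and augmentation plus retraining (S5)
    until the overall accuracy and every per-error-type accuracy reach
    `target`, or until `max_rounds` evaluations have been made."""
    history: List[Dict[str, float]] = []
    for _ in range(max_rounds):
        metrics = evaluate(model)                    # S4: test and attribute errors
        history.append(metrics)
        if all(value >= target for value in metrics.values()):
            break                                    # S6: all standards met
        train_data = train_data + augment(metrics)   # S5: targeted augmentation
        model = retrain(train_data)                  # S5: repeat the S3 training
    return model, history
```

The stopping condition checks every metric rather than only the overall accuracy, mirroring the requirement in S6 that each sub-error type also reaches its preset standard.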

[0085] Referring again to Figure 2, the second embodiment of the present invention provides a BERT dispatch model training method based on multi-factor decision-making, which includes the following steps:

[0086] S201: Obtain the original work order data and perform multi-dimensional preprocessing on it.

[0087] In practice, valid original work order data from the past year is obtained. This original work order data includes at least core fields such as the request content, address information, event type, involved unit, dispatch basis, and dispatch standard version. The original work order data undergoes multi-dimensional cleaning and standardization processing, specifically including:

[0088] Content calibration of requests: Construct a dictionary of request keywords and standardized expression templates, and match and correct vague, incomplete or ambiguous request content to ensure that the request information is clear and consistent;

[0089] Address information standardization: An address segmentation and hierarchical calibration algorithm is used to perform hierarchical segmentation and standardization of address information from province to city to district to street to community / segment, eliminating invalid addresses, filling in missing levels, and generating standardized address features. It should be noted that the address segmentation and hierarchical calibration algorithm in this embodiment adopts a dictionary-based matching method. By constructing a complete address hierarchical dictionary, the address information is initially segmented and the hierarchical information is completed to ensure accurate address hierarchical segmentation.

[0090] Event type standardization: Refer to the preset event type classification system (such as primary categories such as public services, facility maintenance, and emergency response, and subdivided secondary categories) to encode and calibrate the event types in the work order to ensure that the same type of event uses a unified classification identifier;

[0091] Matching and verification of involved entities: Construct a whitelist of involved entities, perform fuzzy matching and standardized correction on the names of involved entities in work orders, eliminate invalid entity information, and ensure that the information of involved entities is accurate and associative.

[0092] Cleaning outdated standards and invalid basis: Build a version library of current dispatch standards and a dictionary of valid dispatch basis, and filter out work orders based on outdated dispatch standards (non-current versions) or containing invalid dispatch basis;

[0093] Non-standard data removal: Filter out redundant work orders with missing core fields exceeding a preset threshold (such as two or more missing core fields), incorrect format, or duplicate submissions, to obtain a clean training dataset containing standardized multi-dimensional features.
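Three of the preprocessing steps above (dictionary-based address splitting, whitelist fuzzy matching of involved units, and removal of work orders with too many missing core fields) can be sketched in Python as follows. The dictionaries, field names, and the fuzzy-matching cutoff are hypothetical miniature examples introduced only for illustration; the sketch splits addresses but does not fill in missing levels, and it uses the standard-library `difflib` as one possible fuzzy matcher, which the text does not name.

```python
import difflib
from typing import Dict, List, Optional

# Hypothetical miniature libraries; a real system would load full dictionaries.
ADDRESS_LEVELS = ["province", "city", "district", "street", "community"]
ADDRESS_DICT: Dict[str, List[str]] = {
    "province": ["Jiangsu"],
    "city": ["Nanjing", "Suzhou"],
    "district": ["Gulou", "Xuanwu"],
    "street": ["Zhongshan Road", "Beijing Road"],
    "community": ["Garden Community"],
}
UNIT_WHITELIST = ["Nanjing Water Supply Company", "Gulou Urban Management Bureau"]
CORE_FIELDS = ["request", "address", "event_type", "unit", "basis", "standard_version"]


def split_address(raw: str) -> Dict[str, Optional[str]]:
    """Dictionary-based split: for each level, take the first dictionary
    entry that occurs in the raw address, or None if no entry matches."""
    return {
        level: next((name for name in ADDRESS_DICT[level] if name in raw), None)
        for level in ADDRESS_LEVELS
    }


def normalize_unit(name: str, cutoff: float = 0.6) -> Optional[str]:
    """Map a free-text unit name to its closest whitelist entry, or None."""
    matches = difflib.get_close_matches(name, UNIT_WHITELIST, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def too_many_missing(order: dict, max_missing: int = 1) -> bool:
    """True when two or more core fields are missing (threshold from the text's example)."""
    missing = sum(1 for field in CORE_FIELDS if not order.get(field))
    return missing > max_missing
```

For example, `split_address("Jiangsu Nanjing Gulou Zhongshan Road No. 5")` yields a value for every level except `community`, which stays `None` and would be a candidate for the level-completion step.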

[0094] S202, based on a two-layer filtering mechanism of keyword indexing and logical change feature matching, the preprocessed original work order data is subjected to logical change work order filtering processing to obtain the training dataset.

[0095] In this implementation, a two-layer filtering mechanism of "keyword indexing + logical change feature matching" is designed to intercept work orders involving dispatch logic changes, thereby preventing the model from incorrectly dispatching such work orders. The specific process is as follows:

[0096] 1. Construct a dispatch logic change filtering layer

[0097] The dispatch logic change filtering layer comprises a first layer and a second layer. The first layer is a keyword index filtering layer, and the second layer is a logic change feature matching layer. The construction process of the first layer can be as follows: collect historical dispatch logic change work orders, extract core keywords related to the dispatch logic change (such as "logic adjustment," "process update," "standard change," etc.), and construct a keyword index library based on these core keywords to form the keyword index filtering layer. The construction process of the second layer can be as follows: extract multi-dimensional features from the historical dispatch logic change work orders, and construct a logic change feature library based on these multi-dimensional features to form the logic change feature matching layer. These multi-dimensional features include event type, address region, type of involved unit, and semantic features of the request.

[0098] 2. Using the keyword indexing and filtering layer, keyword matching is performed on the input raw work order data, and work orders that match the keywords are intercepted.

[0099] Specifically, a keyword index library is used to match keywords in the input work orders. Work orders that match the keywords will be directly intercepted and will not enter the subsequent model dispatch process.

[0100] 3. Using the aforementioned logical change feature matching layer, cosine similarity matching is performed on the original work order data, and work orders with a cosine similarity reaching the threshold are intercepted.

[0101] Specifically, a lightweight feature extraction network (TextCNN) is first used to extract semantic features from the work order's request text. A cosine similarity algorithm then calculates the similarity between the extracted features and the features of historical logical change work orders in the logical change feature library. A similarity threshold is set (e.g., ≥0.8). When the similarity of an input work order reaches the threshold, it is determined to be a logical change work order and intercepted. It should be noted that TextCNN can improve the accuracy of feature matching. The similarity threshold can be adjusted dynamically according to the actual business scenario: lowering it increases filtering coverage, while raising it increases filtering precision.

[0102] 4. Based on the original work order data after logical change work order filtering, a training dataset is formed.
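Steps 2 to 4 above can be sketched in Python as follows. This is a minimal illustration, assuming that each work order already carries a precomputed feature vector (the embodiment extracts it with TextCNN; plain lists of floats stand in here) and that the field names `request` and `features` are hypothetical. The keywords and the 0.8 threshold are the examples given in the text.

```python
import math
from typing import Dict, List, Sequence

# Example keywords taken from the description of the first layer.
KEYWORDS = ["logic adjustment", "process update", "standard change"]


def keyword_hit(text: str, keywords: Sequence[str] = KEYWORDS) -> bool:
    """First layer: cheap substring match against the keyword index."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_logic_change(text: str, features: Sequence[float],
                    library: List[Sequence[float]], threshold: float = 0.8) -> bool:
    """Intercept if a keyword matches, or if the similarity to any historical
    logic-change feature vector reaches the threshold (second layer)."""
    if keyword_hit(text):
        return True
    return any(cosine(features, ref) >= threshold for ref in library)


def build_training_set(orders: List[Dict], library: List[Sequence[float]],
                       threshold: float = 0.8) -> List[Dict]:
    """Keep only work orders that pass both filtering layers."""
    return [
        order for order in orders
        if not is_logic_change(order["request"], order["features"], library, threshold)
    ]
```

Running the keyword layer first means the more expensive similarity comparison is computed only for work orders that the keyword index did not already intercept.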

[0103] Furthermore, this embodiment also includes filter layer updates, specifically:

[0104] Establish a dynamic update channel for the filtering mechanism. When new changes occur in the order dispatch logic, promptly add new logic change keywords and feature samples to the corresponding database to ensure the timeliness and coverage of the filtering layer.

[0105] S203, Perform the first model training based on the training dataset to obtain the initial dispatch model.

[0106] Step S203 can be understood as the first model training phase, which may include:

[0107] 1. Feature Fusion: The standardized request text, address features, event type encoding, and involved unit features from the clean training dataset are concatenated to generate multi-dimensional fused input features;

[0108] 2. Model Configuration: A BERT pre-trained model (such as BERT-base) is selected as the base model. A multi-factor decision association layer is added to the model output layer to learn the cross-correlation features of address, event type, and involved unit. The model output is the prediction result of the dispatch object (such as processing department, staff). In this embodiment, the multi-factor decision association layer is implemented using an attention mechanism. By calculating the attention weights of address features, event type features, and involved unit features, the influence of key features on dispatch decision is strengthened.

[0109] 3. Training Execution: The multi-dimensional fused input features are divided into training and validation sets in an 8:2 ratio and input into the configured BERT model for fine-tuning (secondary training of the pre-trained model). The training process adopts an early stopping mechanism with a preset number of rounds of 3-10. The order dispatch accuracy or F1 score on the validation set is monitored in real time. When the performance metric has not improved for the preset number of rounds, training is stopped, and the model from the best-performing round is retained as the initial order dispatch model.
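The control flow of the three steps above can be sketched in Python without any deep learning framework. Several points are assumptions made for illustration: the fields are joined with BERT's `[SEP]` token (the text only says "concatenated"), the attention weights are a plain softmax over per-factor scores, the preset number of rounds is interpreted as the early-stopping patience, and `run_epoch` is a hypothetical callable that trains one round and returns the validation accuracy or F1 score.

```python
import math
import random
from typing import Callable, Dict, List, Sequence, Tuple


def fuse(order: Dict[str, str]) -> str:
    """Concatenate the standardized fields into one input string."""
    fields = [order["request"], order["address"], order["event_type"], order["unit"]]
    return " [SEP] ".join(fields)


def split_8_2(samples: List, seed: int = 0) -> Tuple[List, List]:
    """Shuffle a copy of the samples and split it 8:2 into train / validation."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def attention_weights(scores: Sequence[float]) -> List[float]:
    """Softmax over relevance scores of the address, event type, and unit features."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def train_with_early_stopping(run_epoch: Callable[[int], float],
                              max_epochs: int = 50,
                              patience: int = 3) -> Tuple[int, float]:
    """Stop once the validation metric has not improved for `patience`
    consecutive rounds; return the best round and its metric."""
    best_epoch, best_score, stale = -1, float("-inf"), 0
    for epoch in range(max_epochs):
        score = run_epoch(epoch)
        if score > best_score:
            # A real training loop would save a model checkpoint here.
            best_epoch, best_score, stale = epoch, score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch, best_score
```

Returning the best round rather than the last one corresponds to retaining the best-performing model as the initial dispatch model.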

[0110] S204. Construct a test dataset and perform multi-factor decision error analysis on the initial dispatch model based on the test dataset to obtain the analysis results.

[0111] Step S204 can be understood as the multi-factor decision error analysis stage, which may specifically include:

[0112] 1. Model Testing: Construct a test dataset containing multi-dimensional core fields (all of which are valid work order data within the past year that have not been used for training). First, filter out work orders with logical changes through the filter layer constructed in step S202. Then, input the remaining test data into the initial dispatch model for accuracy testing and statistically analyze the overall dispatch accuracy of the model.

[0113] 2. Error Filtering and Classification: Test data with dispatch errors are filtered out, and error types are further subdivided and labeled based on multiple decision factors. Specific error types include: ① Address matching deviation (e.g., dispatching a work order from street A to street B); ② Event type misjudgment (e.g., classifying an emergency response work order as a routine public service); ③ Incorrect association of involved units (e.g., dispatching a work order involving unit A to unit B); ④ Multi-factor cross error (e.g., both address and event type matching have deviations).

[0114] 3. Error Attribution: Perform feature statistics on various subdivided error data to uncover the multi-dimensional feature distribution patterns of error data (such as high error rates in work orders from specific remote address areas and insufficient learning of association features for niche event types).
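The four error types listed above can be assigned mechanically once each dispatch target is described by its decision factors. The sketch below assumes that the predicted and the correct dispatch target are each available as a dict with the hypothetical keys `address`, `event_type`, and `unit`; the source does not specify how the labeling is performed.

```python
def classify_error(predicted, actual):
    """Label a mis-dispatched work order by the decision factors that disagree.

    `predicted` and `actual` are dicts with the (hypothetical) keys "address",
    "event_type" and "unit". Returns None when all factors agree.
    """
    labels = {
        "address": "address_matching_deviation",
        "event_type": "event_type_misjudgment",
        "unit": "involved_unit_association_error",
    }
    wrong = [key for key in labels if predicted[key] != actual[key]]
    if not wrong:
        return None
    if len(wrong) > 1:
        # Two or more factors deviate at the same time.
        return "multi_factor_cross_error"
    return labels[wrong[0]]
```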

[0115] S205. Based on the analysis results, data augmentation is performed using a large natural language model and a dual quality check method. The initial dispatch model is then trained a second time on the augmented data to generate an optimized model.

[0116] Step S205 can be understood as the data optimization and secondary training stage, which may specifically include:

[0117] 1. Data Correction: Clean and repair the various subdivided erroneous data marked in step S204.

[0118] 2. Quality-controlled data augmentation: For data categories that are under-represented in the training dataset (such as niche event types or work orders from specific address areas), simulated data is generated with a dedicated "large natural language model + multi-factor constraints" method.

[0119] First, a prompt template that includes the address hierarchy, event type encoding, and involved unit attributes is constructed to guide a large natural language model (such as DeepSeek or Qwen) to generate simulated data that conforms to the feature distribution. In this embodiment, the large model can also be fine-tuned on multi-dimensional feature samples from real error data to improve the consistency between the generated data and the target feature distribution.

[0120] After generation, the data undergoes dual quality checks: ① Automatic check: the semantic similarity between the generated data and the real error data is calculated as a cosine similarity, and generated data that does not reach the threshold of 0.85 is filtered out; ② Manual sampling check: 10% of the generated data is sampled and reviewed to ensure that it conforms to the order dispatch business logic.

[0121] 3. Secondary training: The verified simulation data is added to the training dataset to obtain the optimized training dataset; the feature fusion and model training process in step S203 is repeated, and the same early stopping mechanism is used to retain the optimal model.
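The dual quality check in the augmentation step can be sketched as below. The sketch assumes that every sample already carries an embedding under the hypothetical key `vector`, and that a generated sample passes when its best similarity to any real error sample reaches the threshold; the source gives the 0.85 threshold and the 10% sampling rate but not the embedding model or the aggregation rule.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def automatic_check(generated, reference_vectors, threshold=0.85):
    """Keep generated samples whose best similarity to a real error sample meets the threshold."""
    kept = []
    for sample in generated:
        best = max(cosine_similarity(sample["vector"], ref) for ref in reference_vectors)
        if best >= threshold:
            kept.append(sample)
    return kept


def manual_sample(samples, rate=0.10, seed=0):
    """Draw the subset that goes to human review (at least one item if any exist)."""
    k = max(1, round(len(samples) * rate)) if samples else 0
    return random.Random(seed).sample(samples, k)
```

Only samples that pass `automatic_check` would be handed to `manual_sample`, so reviewers see data that has already cleared the similarity filter.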

[0122] S206. For each model obtained from training, multi-factor decision error analysis, data augmentation, and secondary training are performed repeatedly.

[0123] In practice, steps S204 to S205 are repeated: each newly trained model undergoes multi-factor decision error analysis, quality-controlled data augmentation, and secondary training until the overall order dispatch accuracy reaches the preset standard (e.g., ≥90%) and the accuracy for each subdivided error type is ≥85%. The iteration then stops, and the final order dispatch model is obtained.
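The stopping rule for this iteration can be written as a small loop. In the sketch below, `analyze` and `augment_and_retrain` are hypothetical callables standing in for steps S204 and S205, and `max_iterations` is an added safety cap that the source does not mention.

```python
def meets_standard(overall_acc, per_type_acc, overall_min=0.90, per_type_min=0.85):
    """True when the overall accuracy and every per-type accuracy reach their thresholds."""
    return overall_acc >= overall_min and all(
        acc >= per_type_min for acc in per_type_acc.values()
    )


def iterate_until_standard(model, analyze, augment_and_retrain, max_iterations=10):
    """Repeat analysis and retraining until the preset standard is met.

    analyze(model) returns (overall accuracy, {error type: accuracy}).
    augment_and_retrain(model, per_type_acc) returns the next model.
    Returns the final model and the number of retraining rounds performed.
    """
    for i in range(max_iterations):
        overall, per_type = analyze(model)
        if meets_standard(overall, per_type):
            return model, i
        model = augment_and_retrain(model, per_type)
    return model, max_iterations
```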

[0124] S207. The optimized model is combined with logical change work order filtering to achieve coordinated deployment and work order dispatch.

[0125] In practice, the final model is deployed together with the filtering layer built in step S202, so that the complete process is "filter logical change work orders first, then dispatch accurately".
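The "filter first, then dispatch" order can be made explicit in a thin wrapper. This is a minimal sketch: `is_logic_change` and `model_predict` are hypothetical callables for the filtering layer and the final model, and the `intercepted` status is an assumption, since the source does not say how intercepted work orders are handled afterwards.

```python
def dispatch(order, is_logic_change, model_predict):
    """Run the filtering layer first; only non-intercepted orders reach the model."""
    if is_logic_change(order):
        # Intercepted orders are not sent to the model at all.
        return {"status": "intercepted", "target": None}
    return {"status": "dispatched", "target": model_predict(order)}
```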

[0126] The advantages of implementing the BERT dispatch model training scheme based on multi-factor decision-making provided in the embodiments of the present invention are as follows:

[0127] 1. By limiting the data to valid work orders within the past year and supplementing it with cleaning steps to remove outdated dispatching standards and invalid criteria, the timeliness and validity of the training data are ensured, and invalid data is prevented from interfering with model training.

[0128] 2. A two-layer filtering mechanism of "keyword indexing + logical change feature matching" is designed. It retains the lightweight advantage of keyword indexing (controlling computing costs) while using feature matching to compensate for the incomplete coverage of the keyword index. It can comprehensively and accurately intercept work orders with logical changes, significantly reducing the erroneous dispatch rate in logical change scenarios. In addition, a dynamic update channel is provided so that the filtering mechanism can adapt to subsequent logical change requirements.

[0129] 3. By standardizing core decision-making factors such as address information, event type, and involved units, multi-dimensional fused features are constructed. This addresses the problem that existing methods focus only on the request text and ignore key decision-making factors, and it provides a more comprehensive feature foundation for the model.

[0130] 4. The training logic of the multi-factor fusion model is refined. Adding a multi-factor decision association layer with an attention mechanism strengthens cross-association learning of the core factors and improves the model's ability to understand the core logic of dispatch decisions.

[0131] 5. Detailed error analysis and attribution are performed along the multi-factor decision dimensions, making model optimization more targeted and effectively addressing dispatch errors in specific scenarios such as "address matching deviation" and "event type misjudgment".

[0132] 6. The data augmentation method of "large natural language model + dual quality verification" is adopted to ensure the effectiveness of the augmented data and business consistency. At the same time, combined with multi-factor feature constraints, the generalization ability of the model in unbalanced scenarios such as niche event types and special address areas is improved.

[0133] 7. The iterative optimization mechanism ensures that the model can continuously improve the accuracy of multi-factor decision-making, ultimately achieving high-accuracy dispatch output and significantly improving dispatch efficiency and resource allocation rationality.

[0134] Based on the same inventive concept, embodiments of the present invention provide a BERT dispatch model training device based on multi-factor decision-making, comprising:

[0135] The data access and preprocessing unit is used to acquire the original work order data and preprocess the original work order data.

[0136] The logical change work order filtering unit is used to perform logical change work order filtering on the preprocessed original work order data based on a two-layer filtering mechanism of keyword index and logical change feature matching to obtain a training dataset.

[0137] The model training unit is used to perform initial model training based on the training dataset to obtain the initial dispatch model.

[0138] The error analysis and attribution unit is used to construct a test dataset and perform multi-factor decision error analysis on the initial dispatch model based on the test dataset to obtain the analysis results.

[0139] The data augmentation and secondary training unit is used to augment the data based on the analysis results using a large natural language model and a dual quality check method, and to perform secondary model training on the initial dispatch model based on the augmented data to generate an optimized model.

[0140] The iterative optimization unit is used to repeatedly perform multi-factor decision error analysis, data augmentation, and secondary training for each trained model.

[0141] The coordinated deployment and dispatch unit is used to combine the optimized model with the logical change work order filtering process to achieve coordinated deployment and dispatch.

[0142] In specific implementation, the original work order data includes multiple core fields, including the request content, address information, event type, name of the involved unit, dispatch basis, and dispatch standard version; the data access and preprocessing unit is specifically used for:

[0143] Construct a keyword dictionary and standardized expression template for requests, and use the keyword dictionary and standardized expression template to match and correct the content of requests in the original work order data;

[0144] Address segmentation and hierarchical calibration algorithms are used to perform hierarchical splitting and standardization of the address information from province to city to district to street to community / segment, generating standardized address features;

[0145] The event types in the original work order data are encoded and calibrated according to the preset event type classification system;

[0146] A whitelist of involved entities is constructed, and the whitelist of involved entities is used to perform fuzzy matching and standardized correction on the names of involved entities in the original work order data;

[0147] Construct a current dispatch standard version library and a valid dispatch basis dictionary to filter out work orders in the original work order data that are based on outdated dispatch standards or contain invalid dispatch basis.

[0148] Filter out redundant work orders in the original work order data that have core fields exceeding preset thresholds, incorrect formats, or are submitted repeatedly.
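Two of the preprocessing operations above, whitelist-based fuzzy matching of unit names and removal of outdated, invalid, or repeated work orders, can be sketched with the standard library. The field names, the 0.8 similarity cutoff, and the rule that two orders with the same request text and address count as a repeated submission are assumptions made for illustration.

```python
import difflib


def normalize_unit(name, whitelist, cutoff=0.8):
    """Map a raw unit name to its closest whitelist entry, or None if none is close enough."""
    matches = difflib.get_close_matches(name, whitelist, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def filter_orders(orders, current_versions, valid_bases):
    """Drop orders with outdated standards, invalid dispatch bases, or repeated submissions."""
    seen, kept = set(), []
    for order in orders:
        if order["standard_version"] not in current_versions:
            continue  # based on an outdated dispatch standard
        if order["dispatch_basis"] not in valid_bases:
            continue  # invalid dispatch basis
        key = (order["request_text"], order["address"])
        if key in seen:
            continue  # repeated submission (assumed duplicate criterion)
        seen.add(key)
        kept.append(order)
    return kept
```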

[0149] Specifically, the logical change work order filtering unit is used for:

[0150] Construct a dispatch logic change filtering layer; the dispatch logic change filtering layer includes a first layer and a second layer; the first layer is a keyword index filtering layer, and the second layer is a logic change feature matching layer;

[0151] The keyword indexing and filtering layer is used to perform keyword matching on the input raw work order data, and work orders that match the keywords are intercepted.

[0152] The logical change feature matching layer is used to perform cosine similarity matching on the original work order data, and work orders with cosine similarity reaching the threshold are intercepted.

[0153] The original work order data after filtering based on logical change work orders is used to form a training dataset.

[0154] Specifically, the dispatch logic change filtering layer is constructed as follows:

[0155] Collect historical work orders that change the dispatch logic, extract core keywords related to the dispatch logic changes, and build a keyword index library based on the core keywords to form a keyword index filtering layer;

[0156] Multi-dimensional features are extracted from the historical work orders with logical changes, and a logical change feature library is constructed based on the multi-dimensional features to form a logical change feature matching layer; the multi-dimensional features include event type, address region, type of involved unit, and semantic features of the request.
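The two-layer filtering layer can be sketched as a small class: a keyword lookup runs first, and cosine similarity against the logical change feature library runs only if no keyword matches. The sketch assumes that each work order has already been converted to a feature vector and that keyword matching is a plain substring test; the threshold value is left to the caller because this passage does not state it.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class LogicChangeFilter:
    def __init__(self, keywords, feature_library, threshold):
        self.keywords = set(keywords)                 # layer 1: keyword index library
        self.feature_library = list(feature_library)  # layer 2: logical change feature library
        self.threshold = threshold                    # hypothetical value chosen by the caller

    def intercepts(self, text, vector):
        """Return True if the work order should be intercepted."""
        # Layer 1: cheap keyword lookup.
        if any(keyword in text for keyword in self.keywords):
            return True
        # Layer 2: similarity to features of historical logical change orders.
        return any(cosine(vector, ref) >= self.threshold for ref in self.feature_library)

    def update(self, keywords=(), features=()):
        """Register newly observed logical change keywords or feature vectors."""
        self.keywords.update(keywords)
        self.feature_library.extend(features)
```

Running the keyword layer first keeps the common case cheap, and the similarity layer catches orders that use wording absent from the keyword index.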

[0157] In specific implementation, the model training unit is used for:

[0158] The standardized appeal text, address features, event type codes, and involved unit features in the training dataset are concatenated to generate multi-dimensional fusion input features;

[0159] Configure the training model; the training model includes a BERT pre-trained model and a multi-factor decision association layer;

[0160] The multi-dimensional fused input features are divided into a training set and a validation set;

[0161] The training model is trained based on the training set and validation set to obtain an initial dispatch model; wherein, the training process adopts an early stop mechanism.
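The idea behind an attention-based multi-factor decision association layer can be illustrated in a few lines. The sketch below assumes scaled dot-product attention in which a text representation (for example the BERT sentence vector) acts as the query over the address, event type, and involved unit feature vectors; the source states only that an attention mechanism is used, and a real layer would add learned projections inside a deep learning framework.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def factor_attention(query, factors):
    """Scaled dot-product attention of one query vector over named factor vectors.

    Returns ({factor name: attention weight}, attention-weighted sum of the factors).
    """
    d = len(query)
    names = list(factors)
    scores = [
        sum(q * x for q, x in zip(query, factors[name])) / math.sqrt(d)
        for name in names
    ]
    weights = softmax(scores)
    fused = [
        sum(w * factors[name][i] for w, name in zip(weights, names))
        for i in range(d)
    ]
    return dict(zip(names, weights)), fused
```

Factors that align with the query receive larger weights, which is how such a layer would strengthen the influence of the key features on the dispatch decision.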

[0162] In practical implementation, the error analysis and attribution unit is specifically used for:

[0163] Select work order data that did not participate in training within a preset time period to construct a test dataset containing multi-dimensional core fields;

[0164] The test dataset is filtered using the keyword indexing filtering layer and the logical change feature matching layer.

[0165] The filtered test dataset is input into the initial dispatch model for accuracy testing, and test data with dispatch errors are filtered out.

[0166] Based on multiple decision factors, the test data of dispatch errors are labeled with error types to obtain various subdivided error data; the error types include address matching deviation, event type misjudgment, error in association of involved units, and multi-factor crossover error;

[0167] By performing feature statistics on various subdivided error data, the multidimensional feature distribution patterns of the error data are obtained.
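The feature statistics in the last step can be as simple as an error rate per value of one field. The sketch below assumes that each test record is a dict carrying the grouping field and a boolean `is_error` flag; both names are hypothetical.

```python
from collections import Counter


def error_rate_by(records, field):
    """Return {field value: error rate} over the test records."""
    totals, errors = Counter(), Counter()
    for record in records:
        totals[record[field]] += 1
        if record["is_error"]:
            errors[record[field]] += 1
    return {value: errors[value] / totals[value] for value in totals}
```

Grouping by address region or event type in this way would surface patterns such as a high error rate for a remote address area.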

[0168] In specific implementation, the data augmentation and secondary training units are used for:

[0169] Perform data cleaning and repair on various types of erroneous data;

[0170] Based on the multi-dimensional feature distribution pattern of the erroneous data, the training set is augmented using a large natural language model and a dual quality check method to obtain an optimized training dataset.

[0171] Based on the optimized training dataset, the feature fusion and model training processes are repeated to generate an optimized model.
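For the augmentation step, the method embodiment describes a prompt template that carries the address hierarchy, event type encoding, and involved unit attributes. One possible shape for such a template is sketched below; the wording of the prompt and all parameter names are hypothetical, and the call to the large natural language model itself is deliberately left out.

```python
PROMPT_TEMPLATE = (
    "Generate {n} simulated work order requests.\n"
    "Address hierarchy: {address}\n"
    "Event type code: {event_type_code}\n"
    "Involved unit attributes: {unit_attributes}\n"
    "Each request must be consistent with all three constraints."
)


def build_prompt(address_levels, event_type_code, unit_attributes, n=5):
    """Fill the template with the multi-factor constraints for one under-represented category."""
    return PROMPT_TEMPLATE.format(
        n=n,
        address=" > ".join(address_levels),
        event_type_code=event_type_code,
        unit_attributes=", ".join(unit_attributes),
    )
```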

[0172] It should be noted that the specific workflow of this embodiment is described in the foregoing method embodiment section, and will not be repeated here.

[0173] Furthermore, as shown in Figure 3, another embodiment of the present invention also provides a BERT dispatch model training device based on multi-factor decision-making, which may include: one or more processors 101, one or more input devices 102, one or more output devices 103, and a memory 104. The processors 101, input devices 102, output devices 103, and memory 104 are interconnected via a bus 105. The memory 104 is used to store a computer program, the computer program including program instructions, and the processor 101 is configured to invoke the program instructions to execute the method described in the above-described method embodiment.

[0174] It should be understood that, in this embodiment of the invention, the processor 101 may be a central processing unit (CPU), but it may also be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor.

[0175] Input device 102 may include a keyboard, etc., and output device 103 may include a display (LCD, etc.), a speaker, etc.

[0176] The memory 104 may include read-only memory and random access memory, and provides instructions and data to the processor 101. A portion of the memory 104 may also include non-volatile random access memory. For example, the memory 104 may also store device type information.

[0177] In specific implementations, the processor 101, input device 102, and output device 103 described in the embodiments of the present invention can execute the implementation methods described in the embodiments of the BERT dispatch model training method based on multi-factor decision-making provided in the embodiments of the present invention, which will not be repeated here.

[0178] Accordingly, embodiments of the present invention provide a computer-readable storage medium storing a computer program, the computer program including program instructions, which, when executed by a processor, implement the above-described BERT dispatch model training method based on multi-factor decision-making.

[0179] The computer-readable storage medium can be an internal storage unit of the system described in any of the foregoing embodiments, such as the system's hard disk or memory. The computer-readable storage medium can also be an external storage device of the system, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or Flash Card. Furthermore, the computer-readable storage medium can include both internal storage units and external storage devices. The computer-readable storage medium is used to store the computer program and other programs and data required by the system. The computer-readable storage medium can also be used to temporarily store data that has been output or will be output.

[0180] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0181] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. For instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, or may be electrical, mechanical or other forms of connection.

[0182] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of the embodiments of the present invention, depending on actual needs.

[0183] Furthermore, the functional units in the various embodiments of this invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated units can be implemented in hardware or as software functional units. When each module is used, user information is collected and stored only with the user's full authorization and in compliance with relevant laws and regulations, so as to protect the security and privacy of user data and strictly prohibit unauthorized access. Data processing is conducted within the scope stipulated by law and does not exceed the purpose and scope authorized by the user. Users have the right to access, correct, and delete their personal data, and to restrict or refuse its processing. Applicable laws and regulations must be strictly complied with, and compliance reviews must be conducted.

[0184] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0185] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in the present invention, and these modifications or substitutions should all be covered within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.

Claims

1. A training method for a BERT dispatch model based on multi-factor decision-making, characterized in that it comprises: obtaining original work order data and preprocessing the original work order data; performing logical change work order filtering on the preprocessed original work order data, based on a two-layer filtering mechanism of keyword indexing and logical change feature matching, to obtain a training dataset; performing initial model training based on the training dataset to obtain an initial dispatch model; constructing a test dataset and performing multi-factor decision error analysis on the initial dispatch model based on the test dataset to obtain analysis results; and, based on the analysis results, performing data augmentation using a large natural language model and a dual quality check method, and performing secondary model training on the initial dispatch model based on the augmented data to generate an optimized model.

2. The method as described in claim 1, characterized in that, The original work order data includes several core fields, including the request content, address information, event type, name of the involved unit, dispatch basis, and dispatch standard version; the original work order data is preprocessed as follows: Construct a keyword dictionary and standardized expression template for requests, and use the keyword dictionary and standardized expression template to match and correct the content of requests in the original work order data; Address segmentation and hierarchical calibration algorithms are used to perform hierarchical splitting and standardization of the address information from province to city to district to street to community / segment, generating standardized address features; The event types in the original work order data are encoded and calibrated according to the preset event type classification system; A whitelist of involved entities is constructed, and the whitelist of involved entities is used to perform fuzzy matching and standardized correction on the names of involved entities in the original work order data; Construct a current dispatch standard version library and a valid dispatch basis dictionary to filter out work orders in the original work order data that are based on outdated dispatch standards or contain invalid dispatch basis. Filter out redundant work orders in the original work order data that have core fields exceeding preset thresholds, incorrect formats, or are submitted repeatedly.

3. The method as described in claim 1, characterized in that performing logical change work order filtering on the preprocessed original work order data, based on the two-layer filtering mechanism of keyword indexing and logical change feature matching, to obtain a training dataset specifically comprises: constructing a dispatch logic change filtering layer, wherein the dispatch logic change filtering layer includes a first layer and a second layer, the first layer being a keyword index filtering layer and the second layer being a logical change feature matching layer; using the keyword index filtering layer to perform keyword matching on the input original work order data and intercepting work orders that match the keywords; using the logical change feature matching layer to perform cosine similarity matching on the original work order data and intercepting work orders whose cosine similarity reaches the threshold; and forming the training dataset from the original work order data that remains after logical change work order filtering.

4. The method as described in claim 3, characterized in that, Construct a dispatch logic change filtering layer, specifically as follows: Collect historical work orders that change the dispatch logic, extract core keywords related to the dispatch logic changes, and build a keyword index library based on the core keywords to form a keyword index filtering layer; Multi-dimensional features are extracted from the historical work orders with logical changes, and a logical change feature library is constructed based on the multi-dimensional features to form a logical change feature matching layer; the multi-dimensional features include event type, address region, type of involved unit, and semantic features of the request.

5. The method as described in claim 1, characterized in that, The initial model is trained based on the aforementioned training dataset to obtain the initial dispatch model, specifically as follows: The standardized appeal text, address features, event type codes, and involved unit features in the training dataset are concatenated to generate multi-dimensional fusion input features; Configure the training model; the training model includes a BERT pre-trained model and a multi-factor decision association layer; The multi-dimensional fused input features are divided into a training set and a validation set; The training model is trained based on the training set and validation set to obtain an initial dispatch model; wherein, the training process adopts an early stop mechanism.

6. The method as described in claim 3, characterized in that constructing a test dataset and performing multi-factor decision error analysis on the initial dispatch model based on the test dataset to obtain the analysis results specifically comprises: selecting work order data within a preset time period that did not participate in training to construct a test dataset containing multi-dimensional core fields; filtering the test dataset using the keyword index filtering layer and the logical change feature matching layer; inputting the filtered test dataset into the initial dispatch model for accuracy testing, and extracting the test data with dispatch errors; labeling the test data with dispatch errors with error types based on multiple decision factors to obtain various subdivided error data, wherein the error types include address matching deviation, event type misjudgment, error in association of involved units, and multi-factor cross error; and performing feature statistics on the various subdivided error data to obtain the multi-dimensional feature distribution patterns of the error data.

7. The method as described in claim 6, characterized in that performing data augmentation using a large natural language model and a dual quality check method based on the analysis results, and performing secondary model training on the initial dispatch model based on the augmented data to generate an optimized model, specifically comprises: performing data cleaning and repair on the various subdivided error data; augmenting the training set using the large natural language model and the dual quality check method, based on the multi-dimensional feature distribution patterns of the error data, to obtain an optimized training dataset; and repeating the feature fusion and model training processes based on the optimized training dataset to generate the optimized model.

8. The method according to any one of claims 1-7, characterized in that, after the optimized model is generated, the method further comprises: for each model obtained from training, repeatedly performing multi-factor decision error analysis, data augmentation, and secondary training; and combining the optimized model with logical change work order filtering to achieve coordinated deployment and work order dispatch.

9. A training device for a BERT dispatch model based on multi-factor decision-making, characterized in that, The system includes a processor, an input device, an output device, and a memory, which are interconnected. The memory is used to store a computer program, which includes program instructions. The processor is configured to invoke the program instructions to execute the method as described in claim 8.

10. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to perform the method as described in claim 8.