Marketing strategy optimization method, device, equipment, storage medium and program product

By constructing a historical sample set and training a counterfactual estimation model to generate proxy labels, marketing strategies are optimized. This solves the problems of high cost and low efficiency caused by relying on manual annotation in existing technologies, and achieves more efficient and accurate marketing strategy optimization.

CN122199054A · Pending · Publication Date: 2026-06-12 · CHINA MOBILE ONLINE SERVICES CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
CHINA MOBILE ONLINE SERVICES CO LTD
Filing Date
2026-02-03
Publication Date
2026-06-12


Abstract

Embodiments of this application disclose a marketing strategy optimization method, apparatus, device, storage medium, and program product. They address the high marketing cost, low strategy optimization efficiency, and low marketing conversion rate that arise because marketing opportunity identification and recommended script generation depend heavily on large-scale, high-quality manually annotated data, and because it is difficult to accurately identify and annotate potential marketable users who were not recommended a service but would actually have handled it. The method comprises: constructing a historical sample set based on the historical interaction logs generated in the marketing process; training a counterfactual estimation model based on the historical sample set to obtain a trained counterfactual estimation model; generating counterfactual proxy labels for target samples through the trained counterfactual estimation model; determining a multi-objective fitness function for evaluating the marketing strategy based on the counterfactual proxy labels; and optimizing the marketing strategy to be optimized based on the multi-objective fitness function to obtain an optimized marketing strategy.

Description

Technical Field

[0001] This application relates to the field of artificial intelligence technology, and in particular to a marketing strategy optimization method, apparatus, device, storage medium, and program product.

Background Technology

[0002] In the field of AI-driven marketing, especially in human-machine collaborative customer service marketing scenarios, to improve marketing efficiency and user experience, AI models can typically identify marketing opportunities and generate recommended scripts based on real-time conversations between users and human agents. Human agents can then combine the conversation context and user intent to flexibly apply these recommended scripts to interact with users, thereby significantly improving marketing conversion efficiency while enhancing user experience.

[0003] Currently, existing technologies for identifying marketing opportunities and generating recommended scripts using AI models involve the following steps: First, manual annotation of massive amounts of historical dialogue content is required to identify potential marketable users and excellent recommended scripts. Then, this annotated data is used to train and fine-tune the opportunity identification model and the script generation model to obtain the trained models. Finally, based on real-time dialogue between users and human agents, the trained opportunity identification model and script generation model are used to identify marketing opportunities and generate recommended scripts.

[0004] However, the aforementioned existing technologies have the following shortcomings. First, marketing opportunity identification and recommended script generation rely heavily on large-scale, high-quality manually labeled data. This entails a large workload and high human resource costs, which can lead to high marketing costs and long cycles. Second, when identifying potential marketable users, it is difficult for human annotators to accurately identify and label users who were not recommended a service but would actually have handled it. Furthermore, the labeling process is affected by subjective human factors, making it difficult to maintain consistent quality. This may affect the accuracy of the AI model's identification and generation results, ultimately leading to low marketing conversion rates.

[0005] Therefore, there is an urgent need for a method that can continuously optimize marketing strategies without manual annotation, in order to reduce marketing costs, improve the efficiency of strategy optimization, and ensure the marketing conversion rate of marketing strategies.

Summary of the Invention

[0006] This application provides a marketing strategy optimization method to solve the problems in the prior art where the marketing opportunity identification and recommendation script generation process heavily relies on large-scale, high-quality manually labeled data, and it is difficult to accurately identify and label potential marketable users who are not recommended but are actually available for marketing, resulting in high marketing costs, low strategy optimization efficiency and low marketing conversion rate.

[0007] This application also provides a marketing strategy optimization device, an electronic device, a computer-readable storage medium, and a computer program product.

[0008] The embodiments of this application adopt the following technical solutions: In a first aspect, embodiments of this application provide a marketing strategy optimization method, including: constructing a historical sample set based on historical interaction logs generated during the marketing process; training a counterfactual estimation model based on the historical sample set to obtain a trained counterfactual estimation model; generating counterfactual proxy labels for target samples through the trained counterfactual estimation model; determining a multi-objective fitness function for evaluating marketing strategies based on the counterfactual proxy labels; and optimizing the marketing strategy to be optimized based on the multi-objective fitness function to obtain an optimized marketing strategy. Each sample in the historical sample set includes at least: user covariates, an actual business processing result, a historical recommendation decision marker, a customer service adoption marker, and a user complaint marker. The counterfactual estimation model is used to estimate the potential business processing result, potential adoption result, and potential complaint result of a target sample under the assumption that a marketing recommendation is executed. A target sample is a sample whose historical recommendation decision marker indicates that no recommendation was made.

[0009] In a second aspect, embodiments of this application provide a marketing strategy optimization device, including a construction module, a training module, a generation module, a determination module, and an optimization module, wherein: the construction module is used to construct a historical sample set based on historical interaction logs generated during the marketing process; the training module is used to train a counterfactual estimation model based on the historical sample set to obtain a trained counterfactual estimation model; the generation module is used to generate counterfactual proxy labels for target samples through the trained counterfactual estimation model; the determination module is used to determine a multi-objective fitness function for evaluating marketing strategies based on the counterfactual proxy labels; and the optimization module is used to optimize the marketing strategy to be optimized based on the multi-objective fitness function to obtain an optimized marketing strategy. Each sample in the historical sample set includes at least: user covariates, an actual business processing result, a historical recommendation decision marker, a customer service adoption marker, and a user complaint marker. The counterfactual estimation model is used to estimate the potential business processing result, potential adoption result, and potential complaint result of a target sample under the assumption that a marketing recommendation is executed. A target sample is a sample whose historical recommendation decision marker indicates that no recommendation was made.

[0010] In a third aspect, embodiments of this application provide an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the marketing strategy optimization method described above.

[0011] In a fourth aspect, embodiments of this application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the marketing strategy optimization method described above.

[0012] In a fifth aspect, embodiments of this application provide a computer program product, including a computer program which, when executed by a processor, implements the steps of the marketing strategy optimization method described above.

[0013] The above-described technical solutions adopted in the embodiments of this application can achieve the following beneficial effects: The method provided in this application constructs a multi-dimensional historical sample set based on historical interaction logs, including user covariates, actual business processing results, historical recommendation decision markers, customer service adoption markers, and user complaint markers. This sample set is then used to train a counterfactual estimation model to generate counterfactual proxy labels for target samples that have not been recommended. In this way, the business processing, adoption, and complaint results of potential users can be objectively estimated without relying on manual annotation. This can significantly reduce marketing costs, shorten the optimization cycle, improve strategy optimization efficiency, and enhance the accuracy of opportunity identification and recommendation.

Description of the Drawings

[0014] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings:
Figure 1 is a schematic diagram of the implementation flow of a marketing strategy optimization method provided in an embodiment of this application;
Figure 2 is a schematic diagram of the implementation flow of a method for determining a counterfactual proxy label provided in an embodiment of this application;
Figure 3 is a schematic diagram of the implementation flow of a method for determining a multi-objective fitness function provided in an embodiment of this application;
Figure 4 is a schematic diagram of the implementation flow of a method for optimizing a marketing strategy to be optimized provided in an embodiment of this application;
Figure 5 is a schematic diagram of an application flow of the marketing strategy optimization method provided in an embodiment of this application;
Figure 6 is a schematic diagram of the implementation flow of a method for generating dynamic marketing decisions and recommended marketing scripts provided in an embodiment of this application;
Figure 7 is a schematic diagram of the specific structure of a marketing strategy optimization device provided in an embodiment of this application;
Figure 8 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0015] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0016] It should be understood that the training and prediction processes of the AI models involved in the various embodiments of this specification all adhere to multiple legal and compliance principles, including legal data sources, compliant data content, compliant data governance, compliant training objectives and schemes, compliant training processes, compliant training environments and tools, and compliant ethical verification of training results, and comply with the requirements of Article 5 of the Patent Law. These principles are as follows. Data source legitimacy: All datasets used for AI model training were obtained through legal means, covering three categories: publicly authorized data, data authorized by partners, and self-collected compliant data. Publicly authorized data comes from compliant data sources following open-source licenses such as Apache 2.0, with complete copyright attribution and authorization scope clearly marked, and no unauthorized open-source code or data reuse. Data authorized by partners is covered by formal data usage agreements that clearly define the scope, duration, and confidentiality obligations, with a complete authorization chain. For self-collected data involving personal information, strict informed consent procedures have been followed, and anonymization processes (including but not limited to field masking, feature anonymization, and differential privacy technology applications) have been implemented to remove personally identifiable information, fully complying with the requirements of relevant laws and regulations such as the "Interim Measures for the Administration of Generative Artificial Intelligence Services" and the "Personal Information Protection Law."

[0017] Data content compliance: The AI model's dataset undergoes multiple rounds of screening and cleaning to remove all content that may violate social morality or harm public interests. It contains no obscene, pornographic, violent, or discriminatory content, no information that endangers national or public safety, and does not involve the illegal acquisition or use of genetic resources. For data in sensitive fields (such as healthcare and finance), an additional privacy-preserving computation module (including federated learning and secure multi-party computation technologies) ensures that the data is "usable but not visible," avoiding compliance risks during the transmission of raw data and ensuring that the data application scenarios and uses comply with public order and good morals and with industry regulatory requirements.

[0018] Data governance norms: A complete data traceability system is established during the AI model training process to automatically record the source, collection time, annotation process, cleaning rules, and permission allocation of training data, generating traceable compliance reports to ensure that the data is verifiable throughout its entire lifecycle. The dataset annotation process for AI models is completed by a professional human R&D team, clearly defining the proportion of human creative contributions and avoiding reliance on AI-generated data that has not undergone substantial human modification, thus meeting the examination requirements for "human main contributions" in AI patent applications.

[0019] Training objective and scheme compliance: The AI model training aims to optimize corporate marketing strategies through artificial intelligence technology, significantly reducing excessive user disruption and potential risks while improving marketing efficiency. The training scheme and final output do not violate any mandatory provisions of laws or administrative regulations, do not harm public interests or the legitimate rights and interests of others, and pose no potential risk of being used for illegal activities, privacy violations, or disruption of public safety. The training strictly adheres to the ethical principle of "intelligence for good."

[0020] Training process compliance: A closed-loop training framework is adopted to ensure compliance and controllability of the training process. The specific process is as follows: First, training samples are obtained through compliant data sources. After the aforementioned data cleaning and desensitization, they are input into the neural network model to generate preliminary training results. Second, an expert system is introduced to verify the preliminary results. Based on preset rules and human expert experience, the feasibility of the results is evaluated, and outputs that may pose ethical risks or compliance hazards are corrected (such as removing decision-making logic that violates public order and good morals, and adjusting model parameters that do not comply with safety regulations). Finally, the loss function weights are dynamically optimized based on expert system feedback to strengthen the model's learning of compliant results and to avoid overfitting to erroneous or non-compliant labels, forming a closed-loop control of "data input - model training - expert verification - parameter optimization - result feedback" to ensure that the entire training process complies with the ethical review requirements of Article 5 of the Patent Law.

[0021] Training environment and tool compliance: AI model training is implemented using nationally licensed chips and a compliant training platform. All open-source frameworks and components used in the training process have obtained their corresponding licenses, and copyright statements and patent citation information are fully retained, with no instances of infringement or reuse. The training environment is built using virtual devices (containers / virtual machines) with fixed random seeds and initial parameter configurations to ensure the reproducibility of the training process. Furthermore, through access control and operation log recording, risks such as data leakage and parameter tampering during training are prevented, ensuring the security and compliance of the training process.

[0022] Training results ethical verification compliance: After the model is trained, it undergoes additional third-party ethical compliance assessment and algorithm filing review to verify that the model output does not violate social morality or harm public interests. For potentially sensitive scenarios (such as public services and intelligent decision-making), a special result verification mechanism is established to ensure that the model always complies with Article 5 of the Patent Law and relevant laws and regulations in practical applications.

[0023] In summary, the data and training process used in the AI ​​model of this specification strictly comply with the relevant provisions of Article 5 of the Patent Law and the Patent Examination Guidelines (2023 Edition), and there are no violations of laws, social ethics, public interests, or illegal use of genetic resources. It fully meets the compliance requirements for patent authorization.

[0024] To address the problems in existing technologies where the process of identifying marketing opportunities and generating recommendation scripts heavily relies on large-scale, high-quality manually labeled data, and where it is difficult to accurately identify and label potential marketable users who are not recommended but are actually available for marketing, resulting in high marketing costs, low strategy optimization efficiency, and low marketing conversion rates, this application provides a marketing strategy optimization method.

[0025] The execution subject of this method can be various types of computing devices, or it can be an application or app installed on the computing device. The computing device can be a user terminal such as a mobile phone, tablet computer, or smart wearable device, or it can be a server.

[0026] For ease of description, this application uses a server as the execution subject of the method in its embodiments to illustrate the method. Those skilled in the art will understand that this embodiment uses a server as an example to describe the method, which is merely an illustrative example and does not limit the scope of protection of the corresponding claims.

[0027] Specifically, the implementation flow of the method provided in this embodiment is shown in Figure 1 and includes the following steps. Step 102: Construct a historical sample set based on the historical interaction logs generated during the marketing process.

[0028] In some implementations, historical interaction logs within a preset period generated during the marketing process can be extracted from a log database to construct a historical sample set. For example, historical interaction logs from the past 7 days can be extracted from the log database to construct a historical sample set.
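The time-window extraction described above can be sketched as follows. This is only an illustration: the record layout (a `timestamp` field on in-memory dictionaries) and the function name `select_recent_logs` are assumptions made for the example, and a real system would query the log database directly.

```python
from datetime import datetime, timedelta

def select_recent_logs(logs, now, window_days=7):
    """Keep only log records whose timestamp falls within the last `window_days` days."""
    cutoff = now - timedelta(days=window_days)
    return [rec for rec in logs if cutoff <= rec["timestamp"] <= now]

# Hypothetical in-memory records standing in for rows read from the log database.
now = datetime(2026, 1, 15)
logs = [
    {"session_id": "s1", "timestamp": datetime(2026, 1, 14)},
    {"session_id": "s2", "timestamp": datetime(2026, 1, 2)},
]
recent = select_recent_logs(logs, now)
```

Here only the record from the preceding day falls inside the 7-day window; the older record is excluded.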

[0029] Each sample in the historical sample set includes at least: user covariates, an actual business processing result, a historical recommendation decision marker, a customer service adoption marker, and a user complaint marker.

[0030] Optionally, the user covariates can include user profile features (such as age, plan type, and duration of service), dynamic behavioral features (such as data usage in the past 30 days and recent complaint records), features of the current conversation, and other user feature information.

[0031] The actual business processing result indicates whether the user ultimately handled the recommended service. It is a binary variable: a value of 1 indicates that the user subscribed to the recommended service after the marketing recommendation was executed, and a value of 0 indicates that the user did not subscribe to the recommended service after the marketing recommendation was executed.

[0032] The historical recommendation decision marker identifies whether a marketing recommendation was executed during the marketing process. It is a binary variable: a value of 1 indicates that a marketing recommendation was executed during the marketing process, and a value of 0 indicates that no marketing recommendation was executed.

[0033] The customer service adoption marker identifies whether a human customer service representative or agent adopted and used the generated recommended marketing script. It is a binary variable: a value of 1 indicates that the representative or agent adopted and used the generated script, and a value of 0 indicates that the representative or agent did not adopt or use it.

[0034] The user complaint marker identifies whether the user filed a complaint after the marketing process. It is also a binary variable: a value of 1 indicates that the user filed a complaint after the marketing process, and a value of 0 indicates that the user did not.

[0035] All of the above samples are compiled to form a structured historical sample set of N samples, where N represents the number of samples in the historical sample set.
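One possible in-memory representation of such a sample is sketched below. The class name `HistoricalSample`, the field names, and the helper `build_sample_set` are hypothetical and chosen only for this example; the application does not prescribe a storage format.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class HistoricalSample:
    covariates: Dict[str, float]  # user profile, behavioral, and conversation features
    handled: int                  # actual business processing result
    recommended: int              # historical recommendation decision marker
    adopted: int                  # customer service adoption marker
    complained: int               # user complaint marker

    def __post_init__(self):
        # All four markers are binary variables.
        for name in ("handled", "recommended", "adopted", "complained"):
            if getattr(self, name) not in (0, 1):
                raise ValueError(f"{name} must be 0 or 1")

def build_sample_set(rows: List[dict]) -> List[HistoricalSample]:
    """Compile raw rows into a structured historical sample set."""
    return [HistoricalSample(**row) for row in rows]
```

Validating the markers at construction time keeps malformed log rows out of the sample set before any model sees them.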

[0036] In some implementations, to avoid insufficient learning of specific sample regions during subsequent model training due to class imbalance in the historical sample set, data augmentation can be performed on the samples in which a marketing recommendation was executed when their proportion in the historical sample set is detected to be lower than a preset threshold (e.g., 20%).

[0037] Optionally, the Synthetic Minority Over-sampling Technique (SMOTE) can be used for data augmentation to improve the coverage and density of these samples in the historical sample set. Specifically, for each such sample, one neighbor can be randomly selected from its K nearest neighbor samples; the two samples are then linearly interpolated in the feature space at a random ratio to generate a new synthetic sample of the same class.
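The interpolation step of SMOTE can be illustrated with a short standard-library sketch. It assumes numeric feature vectors and Euclidean distance; the function name `smote_sample` and its parameters are illustrative, and a production system would more likely use an established library implementation.

```python
import math
import random

def smote_sample(minority, k=3, rng=None):
    """Generate one synthetic feature vector from a list of minority-class vectors."""
    rng = rng or random.Random()
    base = rng.choice(minority)
    # K nearest neighbors of `base` among the other minority samples.
    others = [p for p in minority if p is not base]
    neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
    neighbor = rng.choice(neighbors)
    ratio = rng.random()  # random interpolation ratio in [0, 1)
    # Linear interpolation between the base sample and the chosen neighbor.
    return [b + ratio * (n - b) for b, n in zip(base, neighbor)]
```

Each synthetic vector lies on the line segment between two real minority samples, which is why the technique densifies the minority region rather than simply duplicating records.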

[0038] For example, suppose a historical sample set containing 10,000 conversation records is extracted from the log database, and statistics show that only 1,500 of these records triggered marketing recommendations, i.e., 15%, which is below the preset threshold of 20%. In this case, to balance the historical sample set, the target sampling multiple for SMOTE can be set so that the number of samples in which a marketing recommendation was executed reaches 50% of the original total sample size, i.e., 5,000. Correspondingly, 3,500 new synthetic positive samples are generated from the 1,500 original positive samples. The proportion of such samples thus rises to approximately 37% (5,000 of 13,500), which can effectively alleviate the class imbalance problem.
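The arithmetic in this example can be checked with a few lines of code. The function name `share_after_oversampling` is an illustrative assumption; the figures are those given in the paragraph above.

```python
def share_after_oversampling(total, positives, target_positives):
    """Return the number of synthetic samples needed and the resulting positive share."""
    synthetic = target_positives - positives
    new_total = total + synthetic  # synthetic samples enlarge the whole set
    return synthetic, target_positives / new_total

synthetic, share = share_after_oversampling(
    total=10_000, positives=1_500, target_positives=5_000
)
```

With these inputs, 3,500 synthetic samples are required and the positive share becomes 5,000 / 13,500, or about 37%. The share stays below 50% because the synthetic samples also increase the denominator.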

[0039] Step 104: Train the counterfactual estimation model based on the historical sample set to obtain the trained counterfactual estimation model.

[0040] The counterfactual estimation model is used to estimate the potential business outcome, potential adoption outcome, and potential complaint outcome of the target sample when executing marketing recommendations. The target sample consists of samples whose historical recommendation decisions were marked as not recommended.

[0041] Potential business processing outcome refers to the outcome representation of whether the business corresponding to the target sample will be successfully processed (or the probability of processing) given the user covariates of the target sample and assuming that a marketing recommendation is performed on the target sample.

[0042] In some implementations, the potential business processing result can be a binary result, for example, using 0 to represent that the potential business processing result is not processed, and using 1 to represent that the potential business processing result is processed. Alternatively, the potential business processing result can also be an estimate in the form of a probability, for example, outputting a potential business processing probability of 90%.

[0043] Potential adoption outcome refers to the outcome representation of customer service's adoption (or the probability of adoption) of the recommended marketing messages pushed by the system, given the user covariates of the target sample and assuming that marketing recommendations are performed on the target sample.

[0044] In some implementations, the potential adoption result can be a binary result (0 indicates non-adoption, 1 indicates adoption); or it can be a probability result.

[0045] Potential complaint outcome refers to the outcome representation of whether users will file a complaint (or the likelihood of a complaint occurring) given the user covariates of a target sample and assuming that a marketing recommendation is performed on that target sample.

[0046] In some implementations, the potential complaint result can be a binary result (0 indicates no complaint, 1 indicates a complaint); or it can be in the form of risk / probability.

[0047] In some implementations, when training a counterfactual estimation model based on a historical sample set, compliance preprocessing can first be performed on the historical sample set, including the following: 1) Field minimization: Only retain the fields necessary for training, namely user covariates, actual business processing results, historical recommendation decision tags, customer service adoption tags, and user complaint tags; unnecessary sensitive fields are not included in the model or are desensitized before being included in the model.

[0048] 2) De-identification and desensitization: Replace information that can directly identify a user (such as phone number, ID number, precise address, etc.) with irreversible identifiers through hashing, masking, or mapping tables; the profile features involved in user covariates can be expressed in interval / bucket format (such as age group, consumption level) to reduce the risk of privacy leakage.

[0049] 3) Legality and Usage Constraints: Verify the source of the sample, its authorization status, and the user's refusal to engage in marketing / withdrawal of consent; directly remove or mark as unusable samples that are not allowed to be reached or used for modeling to prevent the model from learning patterns that may lead to unauthorized reach.

[0050] Through the above processing, the training process is technically guaranteed to have a controllable data range, controllable access, and controllable use.
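The three preprocessing measures can be sketched together as follows. All field names (`user_id`, `opted_out`, `age`, and so on), the salted SHA-256 scheme, and the ten-year age buckets are assumptions made for illustration; the application only names hashing, masking, mapping tables, and interval encoding as options.

```python
import hashlib

# Field minimization: only these fields are kept for training.
REQUIRED_FIELDS = {"covariates", "handled", "recommended", "adopted", "complained"}

def pseudonymize(user_id: str, salt: str) -> str:
    """Replace a direct identifier with a salted one-way hash."""
    return hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()

def age_bucket(age: int) -> str:
    """Express age as a coarse interval instead of an exact value."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

def preprocess(record: dict, salt: str):
    # Legality and usage constraint: drop users who refused marketing or withdrew consent.
    if record.get("opted_out"):
        return None
    cleaned = {k: v for k, v in record.items() if k in REQUIRED_FIELDS}
    cleaned["covariates"] = dict(record["covariates"])  # copy, leave the input untouched
    cleaned["covariates"]["age_group"] = age_bucket(cleaned["covariates"].pop("age"))
    cleaned["user_key"] = pseudonymize(record["user_id"], salt)
    return cleaned
```

Returning `None` for opted-out users means such records never reach the training set, which matches the intent of removing samples that are not allowed to be used for modeling.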

[0051] After the above compliance preprocessing is completed, the non-target samples, i.e., the samples other than the target samples, can be identified from the historical sample set based on the historical recommendation decision markers. Since the target samples are those whose historical recommendation decision marker indicates "not recommended," the non-target samples are the set of all samples whose marker indicates that a recommendation was made. This set reflects marketing recommendation actions that actually occurred historically and provides observable supervision information related to the execution of marketing recommendations, such as customer service adoption markers and user complaint markers, making it suitable for supervised training of the counterfactual estimation model.

[0052] After identifying the non-target samples, a counterfactual estimation model can be trained based on the user covariates, actual business processing results, customer service acceptance markers, and user complaint markers of the non-target samples, resulting in a trained counterfactual estimation model. Specifically, in some implementations, this training includes: using the user covariates of the non-target samples as model input features, and using the actual business processing results, customer service acceptance markers, and user complaint markers as supervision signals; and optimizing the model parameters using supervised learning so that the model can output probability or risk estimates related to the execution of marketing recommendations given the user covariates.
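A minimal sketch of this supervised training is given below, assuming a plain logistic regression fitted by gradient descent on the log loss. The application does not specify the model family, so the learner, the toy data, and all names (`train_logistic`, the dictionary keys) are illustrative assumptions.

```python
import math

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp within range
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, epochs=500, lr=0.5):
    """Fit weights and a bias by gradient descent on the mean log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return lambda x: sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Toy samples; "x" holds a single hypothetical covariate.
samples = [
    {"x": [0.1], "recommended": 1, "handled": 0, "adopted": 0, "complained": 1},
    {"x": [0.2], "recommended": 1, "handled": 0, "adopted": 1, "complained": 1},
    {"x": [0.8], "recommended": 1, "handled": 1, "adopted": 1, "complained": 0},
    {"x": [0.9], "recommended": 1, "handled": 1, "adopted": 1, "complained": 0},
    {"x": [0.5], "recommended": 0, "handled": 0, "adopted": 0, "complained": 0},
]
# Only samples in which a recommendation was executed supply supervision signals.
non_target = [s for s in samples if s["recommended"] == 1]
inputs = [s["x"] for s in non_target]
models = {
    outcome: train_logistic(inputs, [s[outcome] for s in non_target])
    for outcome in ("handled", "adopted", "complained")
}
```

The sample whose recommendation marker is 0 is excluded from training and would later be scored by the fitted models, for example `models["handled"]([0.5])`.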

[0053] Optionally, to prevent the model from producing effects detrimental to the public interest when applied, such as unfair discrimination against specific groups or inducing a high number of complaints, one or a combination of the following technical constraints can be introduced during the training phase of the counterfactual estimation model: 1) Risk priority constraint: in the selection of the loss function or threshold, a higher weight is given to the risk corresponding to the user complaint marker, so that the model tends to reduce the potential complaint risk; 2) Compliance hard constraints: compliance conditions such as user rejection of marketing, blacklisting, and protection of minors are treated as hard rules that cannot be overridden by the model; hard-rule filtering is applied first during both training and inference, and the model outputs probability results only after the compliance check has passed; 3) Auditability and traceability: training data versions, feature versions, model versions, and evaluation reports are saved to ensure that the model output is traceable, interpretable, and auditable.
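The first two constraints can be sketched as follows. The weighting factor of 3.0, the flag names (`refused_marketing`, `blacklisted`, `is_minor`), and both function names are assumptions for illustration; the application does not fix a particular weight or rule schema.

```python
import math

def weighted_log_loss(y_true, p_pred, positive_weight=3.0):
    """Log loss in which complaint cases (y = 1) count `positive_weight` times as much."""
    eps = 1e-12
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(positive_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def gated_estimate(user, model):
    """Apply hard compliance rules first; the model is consulted only if they pass."""
    if user.get("refused_marketing") or user.get("blacklisted") or user.get("is_minor"):
        return None
    return model(user["features"])
```

Weighting complaint cases more heavily penalizes underestimating complaint risk, while the gate ensures that no model output, however favorable, can override a compliance rule.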

[0054] The method described in this application can effectively reduce user complaints and resource waste caused by indiscriminate marketing, improve the work efficiency and professionalism of customer service personnel, and ultimately improve the communication service experience for a wide range of end users. It has positive social benefits and is in the public interest.

[0055] In some implementations, the counterfactual estimation model can adopt a composable model structure, specifically including a trained potential business processing probability estimation model, a potential adoption probability estimation model, and a potential complaint risk estimation model; and the training of the counterfactual estimation model based on the historical sample set can be further refined into the process of training the above three sub-models separately and then combining them.

[0056] First, regarding the training of the potential business processing probability estimation model: user covariates from non-target samples and actual business processing results from non-target samples can be input into the potential business processing probability estimation model for training, resulting in a trained model. Specifically, the model's input is the user covariates, and its output is the estimated probability of business processing under the condition that a marketing recommendation is executed. The training process can be implemented as binary classification probability learning, for example, by using the actual business processing results as supervision labels and fitting the output probability by minimizing the log loss. After training, given user covariates as input, the model outputs the potential business processing probability, that is, the probability of business processing for the given user covariates when a marketing recommendation is executed (i.e., when the recommendation decision label is 1).

[0057] Secondly, regarding the training of the potential adoption probability estimation model: user covariates from non-target samples and customer service adoption tags from non-target samples can be input into the potential adoption probability estimation model for training, resulting in a trained model. Since customer service adoption tags are strongly correlated with "whether a marketing recommendation occurred" in the business chain, and in some implementations can only be stably observed in samples where the recommendation actually occurred, using non-target samples for training ensures the authenticity and consistency of the supervision signal. After training, given user covariates as input, the model outputs the potential adoption probability, that is, the probability of customer service adoption for the given user covariates when a marketing recommendation is executed (i.e., when the recommendation decision label is 1).

[0058] Finally, regarding the training of the potential complaint risk estimation model: user covariates from non-target samples and user complaint tags from non-target samples can be input into the potential complaint risk estimation model for training, resulting in a trained model. Like customer service adoption tags, user complaint tags are strongly correlated with "whether a marketing recommendation occurred" in the business chain, and in some implementations they are stably observable only in samples where the recommendation actually occurred. Therefore, non-target samples can be used for training. After training, given user covariates as input, the model outputs the potential complaint risk probability, that is, the probability of a complaint for the given user covariates when a marketing recommendation is executed (i.e., when the recommendation decision label is 1).

[0059] After the three sub-models mentioned above have been trained, a trained counterfactual estimation model is formed based on the trained potential business processing probability estimation model, the trained potential adoption probability estimation model, and the trained potential complaint risk estimation model. Specifically, in some implementations, the trained counterfactual estimation model can exist in the form of a model set or a cascaded inference pipeline: when any sample's user covariate is input, the three sub-models are called respectively to output the corresponding potential business processing probability, potential adoption probability, and potential complaint risk probability, and these three are used as the joint output of the counterfactual estimation model for direct use when generating counterfactual proxy labels for the target sample. Optionally, to balance performance and interpretability, the aforementioned potential business processing probability estimation model, potential adoption probability estimation model, and potential complaint risk estimation model can be trained using gradient boosting tree models, such as XGBoost or LightGBM. Each model can have its own training hyperparameters set according to its task characteristics, such as tree depth, learning rate, and number of leaves, and early stopping or cross-validation can be performed on the validation set to obtain a trained model with stronger generalization ability.
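The composable structure described above can be sketched as follows. To keep the sketch self-contained, a small logistic regression trained by gradient descent on the log loss is substituted for the gradient boosting tree models mentioned in this embodiment; the class name, function names, outcome keys, and toy data are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class LogisticModel:
    """Stand-in sub-model (NOT a boosted tree): logistic regression on log loss."""

    def __init__(self, n_features: int, lr: float = 0.5, epochs: int = 500):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr
        self.epochs = epochs

    def predict_proba(self, x: Sequence[float]) -> float:
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "LogisticModel":
        n = len(X)
        for _ in range(self.epochs):
            grad_w = [0.0] * len(self.w)
            grad_b = 0.0
            for xi, yi in zip(X, y):
                err = self.predict_proba(xi) - yi  # gradient of log loss w.r.t. logit
                for j, xij in enumerate(xi):
                    grad_w[j] += err * xij
                grad_b += err
            self.w = [wj - self.lr * g / n for wj, g in zip(self.w, grad_w)]
            self.b -= self.lr * grad_b / n
        return self


def train_counterfactual_model(
    X: Sequence[Sequence[float]], outcomes: Dict[str, List[int]]
) -> Dict[str, LogisticModel]:
    """Train one sub-model per outcome on non-target samples only."""
    n_features = len(X[0])
    return {name: LogisticModel(n_features).fit(X, y) for name, y in outcomes.items()}


def joint_output(models: Dict[str, LogisticModel], x: Sequence[float]) -> Dict[str, float]:
    """Call every sub-model and return the three probabilities together."""
    return {name: model.predict_proba(x) for name, model in models.items()}


# Toy non-target samples with a single covariate (illustrative data only).
X_train = [[0.0], [0.2], [0.8], [1.0]]
models = train_counterfactual_model(
    X_train,
    {"handled": [0, 0, 1, 1], "adopted": [0, 1, 1, 1], "complained": [1, 0, 0, 0]},
)
```

Calling `joint_output(models, covariates)` on the covariates of a target sample then yields the three probabilities that serve as the joint output of the counterfactual estimation model.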

[0060] It should be noted that, in this embodiment, the output of the trained counterfactual estimation model is only used for offline evaluation and strategy optimization to reduce unnecessary marketing outreach, lower complaint risks, and improve service quality; the model output is not used to carry out any behavior that violates laws and regulations, social ethics, or harms public interests. During the system's online execution phase, a compliance verification module can be set up to pre-filter the outreach targets and recommended content. For users who are not allowed to be reached or businesses that are not allowed to be recommended, marketing recommendations will be directly prohibited, technically ensuring that the implementation boundaries of this solution comply with public interests and compliance requirements.

[0061] Step 106: Generate counterfactual proxy labels for the target samples using the trained counterfactual estimation model. The counterfactual proxy label includes the potential business processing result, the potential adoption result, and the potential complaint result of the target sample when a marketing recommendation is executed.

[0062] In some embodiments, user covariates of the target sample can be input into a trained counterfactual estimation model to obtain the potential business processing results, potential adoption results, and potential complaint results of the target sample when performing marketing recommendations.

[0063] Optionally, potential business processing results, potential adoption results, and potential complaint results can be represented in binary form (e.g., 0 / 1), or in probability or risk value form (e.g., probability / risk with a value of 0 to 1) to meet different assessment accuracy requirements.

[0064] Subsequently, the potential business processing result, potential adoption result, and potential complaint result obtained can be taken as a whole and determined as the counterfactual proxy label of the target sample, which represents the expected performance of the target sample in the counterfactual scenario.

[0065] As shown in Figure 2, in an optional implementation, considering that the potential business processing results predicted directly by the potential business processing probability estimation model may not be accurate enough due to model errors or sample bias, a correction step based on causal inference can be added on top of the potential business processing results predicted in step 106 above to obtain more accurate and robust potential business processing results. The specific process is as follows: Step 202: Calculate the recommendation propensity score of the user covariates based on the historical recommendation decision labels in the historical sample set.

[0066] The recommendation propensity score characterizes the probability that the historical marketing strategy executes a marketing recommendation, given the user covariates.

[0067] In some embodiments, when calculating the recommendation propensity score of user covariates, a dedicated classification model, such as logistic regression or a gradient boosting tree, can first be trained on the entire historical sample set, using user covariates as input features and historical recommendation decision labels as prediction targets. The user covariates for which the score is to be calculated are then input into the trained classification model, and the output value of the model is the recommendation propensity score.

[0068] Step 204: Based on the recommendation propensity score and the actual business processing results of the non-target samples in the historical sample set (i.e., the samples other than the target samples), the potential business processing results are corrected to obtain the corrected potential business processing results.

[0069] In one alternative implementation, a doubly robust estimation method can be employed, which combines the recommendation propensity score with the actual business processing results observed in non-target samples to calibrate the predicted potential business processing results, thereby obtaining the corrected potential business processing results. The calculation method of the doubly robust estimation method is as follows:

[0070] In the formula, the quantities involved are: the potential business processing result after correction by the doubly robust estimation method; the potential business processing probability; the actual business processing result observed in non-target samples; and the recommendation propensity score. For the target sample, for which no marketing recommendation was executed, the above formula can be simplified.
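For reference, a standard form of the doubly robust estimator that is consistent with the four quantities listed above can be written as follows. The symbols are notation assumed here for illustration and may differ from the notation of this application: $\hat{\mu}_1(X_i)$ is the potential business processing probability, $Y_i$ the observed business processing result, $T_i$ the historical recommendation decision label, and $\hat{e}(X_i)$ the recommendation propensity score.

```latex
\hat{Y}^{\mathrm{DR}}_i
  = \hat{\mu}_1(X_i)
  + \frac{T_i \,\bigl(Y_i - \hat{\mu}_1(X_i)\bigr)}{\hat{e}(X_i)},
\qquad
T_i = 0 \;\Longrightarrow\; \hat{Y}^{\mathrm{DR}}_i = \hat{\mu}_1(X_i).
```

Under this assumed form, the residual term is active only for samples where a recommendation was executed, so for a sample with $T_i = 0$ the corrected value reduces to the model estimate.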

[0071] In another alternative implementation, when correcting potential business processing results based on causal inference, the following steps may also be adopted: (1) First, among the non-target samples (i.e., the samples for which a marketing recommendation was executed), find all approximate samples whose features are similar to those of the target sample (i.e., a sample for which no marketing recommendation was executed), where the approximate samples are the samples with similar recommendation propensity scores.

[0072] (2) After obtaining all approximate samples, a weighted average can be calculated over the approximate samples based on their recommendation propensity scores, and the weighted average result can be used as the corrected potential business processing result. The specific calculation method is as follows:

[0073] In the formula, the quantities involved are: the corrected potential business processing result; the actual business processing results observed in non-target samples; the preset Gaussian kernel function together with its independent variable; and the preset bandwidth parameter.
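One common way to realize such a kernel-weighted correction is a Nadaraya-Watson style average, sketched below. It is an assumption of this sketch that the Gaussian kernel is evaluated on the propensity score difference divided by the bandwidth and that the averaged quantity is the observed business processing result of each approximate sample; the function names are hypothetical.

```python
import math
from typing import Sequence


def gaussian_kernel(u: float) -> float:
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_corrected_outcome(
    target_score: float,
    neighbor_scores: Sequence[float],
    neighbor_outcomes: Sequence[float],
    bandwidth: float,
) -> float:
    """Average observed outcomes of non-target samples, weighted by how close
    their propensity scores are to the target sample's propensity score."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((target_score - s) / bandwidth) for s in neighbor_scores]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no approximate sample carries any weight")
    return sum(w * y for w, y in zip(weights, neighbor_outcomes)) / total
```

A smaller bandwidth concentrates the weight on samples whose propensity scores are nearly identical to that of the target sample, while a larger bandwidth averages over a wider neighborhood.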

[0074] Step 206: The potential adoption results and potential complaint results generated by the trained counterfactual estimation model, together with the corrected potential business processing results, are determined as the counterfactual proxy labels of the target sample.

[0075] In this embodiment of the application, the corrected potential business processing result may be the result obtained by either of the two correction methods described above. Together with the potential adoption result and the potential complaint result, it forms the counterfactual proxy label of the target sample, which can be expressed as the triple (corrected potential business processing result, potential adoption result, potential complaint result).

[0076] Step 108: Determine a multi-objective fitness function for evaluating marketing strategies based on the counterfactual proxy labels.

[0077] In this embodiment of the application, in order to evaluate the overall effectiveness of the marketing strategy, a multi-objective fitness function can be determined based on the counterfactual proxy label. This multi-objective fitness function can be composed of at least one of the following functions: maximizing the counterfactual conversion rate function, maximizing the counterfactual adoption rate function, and minimizing the counterfactual complaint rate function.

[0078] The counterfactual conversion rate function to be maximized for marketing strategy C can be expressed as follows:

[0079] In the formula, a very small positive number is used to prevent calculation errors when the denominator is zero; N denotes the number of samples; and the remaining symbol denotes the recommendation decision label.

[0080] The counterfactual adoption rate function to be maximized for marketing strategy C can be expressed as follows:

[0081] In the formula, a very small positive number, which can for example take the value $10^{-6}$, is used to prevent calculation errors when the denominator is zero.

[0082] In this embodiment, since complaints are sparse, high-risk events, the 95th-percentile upper confidence bound (UB95) of the complaint rate can be used as a risk measure. Based on this, the counterfactual complaint rate function to be minimized for marketing strategy C can be expressed as follows:

[0083] Here, Quantile denotes the quantile function, which describes a position in the data distribution, and B is the number of bootstrap resamples.
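The three objective functions can be sketched as follows. The ratio form used for the conversion and adoption rates (sum of proxy labels over recommended samples, divided by the number of recommended samples plus a small constant) and the bootstrap procedure for UB95 are assumptions consistent with the quantities named above, not formulas quoted from this application; all function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

EPS = 1e-6  # small constant guarding against a zero denominator


def rate(decisions: Sequence[int], labels: Sequence[float], eps: float = EPS) -> float:
    """Mean proxy label over the samples for which the strategy recommends."""
    numerator = sum(t * y for t, y in zip(decisions, labels))
    return numerator / (sum(decisions) + eps)


def quantile(values: Sequence[float], q: float) -> float:
    """Empirical quantile with linear interpolation, q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def complaint_ub95(
    decisions: Sequence[int],
    complaints: Sequence[float],
    n_boot: int = 200,  # B, the number of bootstrap resamples (assumed default)
    seed: int = 0,
) -> float:
    """95th percentile of the complaint rate over bootstrap resamples."""
    n = len(decisions)
    if n == 0:
        return 0.0
    rng = random.Random(seed)
    rates: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        rates.append(rate([decisions[i] for i in idx], [complaints[i] for i in idx]))
    return quantile(rates, 0.95)


def fitness(
    decisions: Sequence[int],
    conversions: Sequence[float],
    adoptions: Sequence[float],
    complaints: Sequence[float],
) -> Tuple[float, float, float]:
    """(conversion rate to maximize, adoption rate to maximize, UB95 to minimize)."""
    return (
        rate(decisions, conversions),
        rate(decisions, adoptions),
        complaint_ub95(decisions, complaints),
    )
```

Using an upper percentile rather than the mean makes a strategy that recommends to very few samples look riskier when its complaint estimate is unstable, which suits sparse events.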

[0084] Finally, the multi-objective fitness function used to evaluate marketing strategy C can be defined as:

[0085] In some embodiments, to scientifically and comprehensively quantify the overall effect of any marketing strategy, when determining the multi-objective fitness function, in addition to considering counterfactual proxy labels, the true proxy labels of non-target samples can also be considered, where the true proxy labels are the actual observations of the non-target samples. Specifically, a complete proxy label dataset can be constructed first based on the counterfactual proxy labels and the true proxy labels, and then the multi-objective fitness function can be determined based on the complete proxy label dataset.

[0086] Figure 3 illustrates the process for determining a multi-objective fitness function based on a complete proxy label dataset, as provided in this embodiment of the application. The process specifically includes the following steps: (1) Extract historical interaction logs generated during the marketing process within a preset period from the log database to construct a historical sample set.

[0087] This step is the same as step 102 above; reference may be made to the relevant description of step 102, which is not repeated here.

[0088] (2) The counterfactual estimation model is trained based on the historical sample set to obtain the trained counterfactual estimation model. The counterfactual estimation model includes a potential business processing probability estimation model, a potential adoption probability estimation model, and a potential complaint risk estimation model.

[0089] This step is the same as step 104 above; reference may be made to the relevant description of step 104, which is not repeated here.

[0090] (3) Determine whether the marketing recommendation was implemented based on the historical recommendation decision markers in the historical sample set, or determine whether the historical recommendation decision marker is 1.

[0091] If the historical strategy executed a marketing recommendation, or if the historical recommendation decision is marked as 1, then proceed to step (4). Conversely, if the historical strategy did not execute a marketing recommendation, or if the historical recommendation decision is marked as 0 (i.e., not equal to 1), then proceed to step (5).

[0092] (4) When it is determined that a marketing recommendation was executed historically, or that the historical recommendation decision label is 1, the actual observation results are directly used as the true proxy labels; that is, the true proxy labels of the business processing result, the adoption result, and the complaint result are set equal to the corresponding observed values.

[0093] Here, the three true proxy labels denote, respectively, the business processing result, the adoption result, and the complaint result.

[0094] (6) Construct a complete proxy label dataset based on the real proxy labels and counterfactual proxy labels determined in steps (4) and (5) above.

[0095] (7) Determine the multi-objective fitness function based on the complete proxy label dataset.

[0096] The implementation steps for determining the multi-objective fitness function based on the complete proxy label dataset are similar to those described above for the counterfactual conversion rate function, the counterfactual adoption rate function, and the counterfactual complaint rate function. The only difference lies in the values of the historical recommendation decision labels.
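The construction of the complete proxy label dataset can be sketched as follows: samples with a historical recommendation decision label of 1 keep their observed results, and the remaining samples receive counterfactual proxy labels from the trained model. The record field names and the callable signature are assumptions made for this sketch.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# (business processing result, adoption result, complaint result)
Label = Tuple[float, float, float]


def build_complete_labels(
    samples: Sequence[Dict[str, object]],
    counterfactual_model: Callable[[Sequence[float]], Label],
) -> List[Dict[str, object]]:
    """Merge true proxy labels and counterfactual proxy labels into one dataset."""
    complete: List[Dict[str, object]] = []
    for sample in samples:
        if sample["recommended"] == 1:
            # Recommendation was executed: use the actual observations.
            label: Label = (
                float(sample["handled"]),
                float(sample["adopted"]),
                float(sample["complained"]),
            )
            source = "observed"
        else:
            # No recommendation was executed: estimate the outcome from covariates.
            label = counterfactual_model(sample["covariates"])
            source = "counterfactual"
        complete.append({"label": label, "source": source})
    return complete
```

Keeping the `source` field alongside each label is a design choice of this sketch that makes it possible to audit which labels were observed and which were estimated.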

[0097] Step 110: Optimize the marketing strategy to be optimized based on the multi-objective fitness function to obtain the optimized marketing strategy.

[0098] In this embodiment, the marketing strategy to be optimized can be optimized using the Non-dominated Sorting Genetic Algorithm III (NSGA-III) based on the multi-objective fitness function. Because NSGA-III introduces a reference-point-based selection mechanism, it can better maintain the diversity and distribution of the population on the Pareto front when dealing with multi-objective problems with three or more objectives, such as maximizing the counterfactual conversion probability, maximizing the counterfactual adoption probability, and minimizing the counterfactual complaint probability in this embodiment, thus avoiding premature convergence of the solution set to a local region.

[0099] As shown in Figure 4, in some embodiments, the marketing strategy to be optimized can be optimized based on the multi-objective fitness function through the following steps: (1) The marketing strategies to be optimized are encoded as strategy individuals to form a strategy population.

[0100] In this embodiment, each marketing strategy to be optimized can be encoded as an individual so that it can be operated on by the evolutionary algorithm. The marketing strategy includes triggering decision parameters and prompt word parameters.

[0101] Optionally, to achieve end-to-end joint optimization, a hybrid chromosome encoding method can be adopted, which encodes the marketing strategy to be optimized into a hybrid genotype chromosome, which includes three types of parameters: discrete genes, continuous genes, and read-only genes.

[0102] Discrete genes are used to represent category or switch parameters, such as rule enable/disable switches and script template categories.

[0103] Continuous genes are used to represent adjustable numerical parameters, such as opportunity trigger thresholds and generation parameters.

[0104] Read-only genes are used to represent hard compliance rules, such as "users who have complained in the past 7 days are prohibited from triggering this rule". After initialization, their values are locked and any mutation operations are prohibited, thereby ensuring that all marketing strategies maintain compliance constraints during the evolution process.

[0105] After obtaining the hybrid genotype chromosome corresponding to the marketing strategy to be optimized, a preset number of strategy individuals that conform to the coding rules in the hybrid chromosome coding method can be randomly generated as the strategy population.

[0106] Optionally, to accelerate convergence, in some embodiments, the best-performing strategy in the current production environment can also be encoded as an individual and added to the strategy population as an elite individual.
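A hybrid chromosome with discrete, continuous, and read-only genes, together with random population initialization and elite seeding, can be sketched as follows. The gene names, value ranges, and the choice to place the elite individual in the first slot are assumptions introduced for illustration.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class GeneSpec:
    name: str
    kind: str                                  # "discrete", "continuous", or "read_only"
    choices: tuple = ()                        # allowed values of a discrete gene
    bounds: Tuple[float, float] = (0.0, 1.0)   # range of a continuous gene
    fixed: object = None                       # locked value of a read-only gene


# Hypothetical schema; none of these gene names come from the application.
SCHEMA = [
    GeneSpec("rule_switch", "discrete", choices=(0, 1)),
    GeneSpec("script_style", "discrete",
             choices=("formal", "semi_conversational", "conversational")),
    GeneSpec("trigger_threshold", "continuous", bounds=(0.0, 1.0)),
    GeneSpec("block_recent_complainers", "read_only", fixed=True),
]


def random_individual(schema: Sequence[GeneSpec], rng: random.Random) -> Dict[str, object]:
    """Draw one strategy individual that conforms to the coding rules."""
    genes: Dict[str, object] = {}
    for spec in schema:
        if spec.kind == "discrete":
            genes[spec.name] = rng.choice(spec.choices)
        elif spec.kind == "continuous":
            low, high = spec.bounds
            genes[spec.name] = rng.uniform(low, high)
        elif spec.kind == "read_only":
            genes[spec.name] = spec.fixed  # locked at initialization
        else:
            raise ValueError(f"unknown gene kind: {spec.kind}")
    return genes


def init_population(
    schema: Sequence[GeneSpec],
    size: int,
    rng: random.Random,
    elite: Optional[Dict[str, object]] = None,
) -> List[Dict[str, object]]:
    """Random population, optionally seeded with the current production strategy."""
    population = [random_individual(schema, rng) for _ in range(size)]
    if elite is not None and population:
        population[0] = dict(elite)  # keeps the population size unchanged
    return population
```

Because read-only genes are always copied from the schema, every randomly generated individual satisfies the hard compliance rules from the first generation onward.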

[0107] (2) Perform the following iterative optimization operation on the strategy individuals in the strategy population, and terminate the iterative optimization operation when the number of iterations reaches the preset threshold or no new non-dominated strategy individual is generated for M consecutive generations: First, based on the multi-objective fitness function, the counterfactual conversion probability to be maximized, the counterfactual adoption probability to be maximized, and the counterfactual complaint probability to be minimized are calculated for each strategy individual in the strategy population.

[0108] Iterative optimization is performed on each strategy individual in the strategy population. In each iteration, the fitness of each strategy individual is evaluated based on the multi-objective fitness function, and the values of the objective functions, namely the counterfactual conversion probability to be maximized, the counterfactual adoption probability to be maximized, and the counterfactual complaint probability to be minimized, are calculated.

[0109] Secondly, based on the counterfactual conversion probability, counterfactual adoption probability, and counterfactual complaint probability of each strategy individual, non-dominated sorting is performed on the strategy individuals to screen out the non-dominated strategy individuals.

[0110] After obtaining the fitness values of all strategy individuals (including the counterfactual conversion probability to be maximized, the counterfactual adoption probability to be maximized, and the counterfactual complaint probability to be minimized), the core mechanism of the NSGA-III algorithm is used to select strategy individuals, as follows: First, based on the three fitness values of each individual, non-dominated sorting is performed on the strategy individuals, and all strategy individuals are divided into multiple levels (fronts). The first front consists of the non-dominated strategy individuals in the strategy population, that is, those not dominated by any other strategy individual.

[0111] Subsequently, NSGA-III further selects outstanding strategy individuals from the current strategy population and its offspring through a reference-point-based association and normalization selection mechanism, in preparation for generating the next generation of the strategy population.
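The division of strategy individuals into fronts can be sketched as follows. This covers only the non-dominated sorting step; the reference-point-based association and normalization of NSGA-III are not included. Each fitness value is assumed to be a triple (conversion, adoption, complaint), with the first two maximized and the last minimized, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Fitness = Tuple[float, float, float]  # (conversion, adoption, complaint)


def dominates(a: Fitness, b: Fitness) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2]
    better = a[0] > b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and better


def non_dominated_sort(points: Sequence[Fitness]) -> List[List[int]]:
    """Return fronts as lists of indices; fronts[0] is the non-dominated set."""
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)  # peel off the current front and repeat
    return fronts
```

This peeling approach is quadratic per front and is chosen here for readability; faster bookkeeping variants exist but produce the same fronts.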

[0112] Among them, the reference point-based association and normalization selection mechanism can maintain the diversity of the strategy population in the three-dimensional target space while taking into account convergence, thus avoiding premature convergence of the search to a local region.

[0113] Then, crossover and mutation operations are performed on the non-dominated policy individuals to generate a new generation of policy individuals.

[0114] Among them, crossover refers to randomly selecting two parent strategy individuals with a certain probability and exchanging parts of their chromosomes to integrate the advantageous features of different marketing strategies.

[0115] Mutation refers to the random perturbation of certain gene loci in a strategy individual with a certain probability.

[0116] To accommodate business strategy flexibility and compliance requirements, in some implementations, the following key mechanisms can be introduced during the crossover and mutation phases to ensure the optimization efficiency and compliance of marketing strategies: First, a hard-rule immutable bit mechanism, which automatically skips read-only gene loci during crossover and mutation, ensuring that offspring strategies always satisfy the compliance constraints.

[0117] Secondly, an adaptive Poisson jump mutation operator mechanism. To achieve efficient adaptive exploration of the discrete policy space, in some embodiments, an adaptive Poisson jump mutation operator can be provided to dynamically control the mutation intensity parameter of the mutation process.

[0118] Specifically, as the number of iterations of the optimization operation increases, the mutation intensity parameter is dynamically decayed, and the decayed mutation intensity parameter is used as the parameter of a Poisson distribution. The decay is calculated from the initial mutation intensity parameter, a preset attenuation coefficient that governs the dynamic decay process, and the current generation number. During each mutation, an integer k is sampled from this Poisson distribution, and k discrete gene loci are randomly selected and mutated simultaneously. This enables large-scale exploration in the early stages of evolution and small-scale, fine-grained convergence in the later stages, improving optimization efficiency and solution accuracy.
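A Poisson jump mutation with decaying intensity can be sketched as follows. The exponential decay form is an assumption of this sketch, since only the inputs to the decay (initial intensity, attenuation coefficient, generation number) are named here; the Poisson sampler uses Knuth's multiplication method, and all function and parameter names are hypothetical.

```python
import math
import random
from typing import Dict, Sequence


def decayed_intensity(lambda0: float, alpha: float, generation: int) -> float:
    """ASSUMED decay form: lambda_t = lambda0 * exp(-alpha * t)."""
    return lambda0 * math.exp(-alpha * generation)


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Knuth's method; adequate for the small intensities used here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def poisson_jump_mutation(
    individual: Dict[str, object],
    discrete_choices: Dict[str, Sequence[object]],
    lambda0: float,
    alpha: float,
    generation: int,
    rng: random.Random,
) -> Dict[str, object]:
    """Mutate k ~ Poisson(lambda_t) discrete genes at once.

    Only genes listed in discrete_choices can change, so read-only genes
    (which are never listed there) are skipped automatically.
    """
    lam = decayed_intensity(lambda0, alpha, generation)
    names = list(discrete_choices)
    k = min(sample_poisson(lam, rng), len(names))
    child = dict(individual)
    for name in rng.sample(names, k):
        alternatives = [v for v in discrete_choices[name] if v != child[name]]
        if alternatives:
            child[name] = rng.choice(alternatives)
    return child
```

Early generations have a large intensity and therefore tend to flip several discrete genes per mutation, while late generations usually change one gene or none.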

[0119] Finally, the strategy population is updated based on the new generation of strategy individuals to obtain the updated strategy population.

[0120] In this embodiment of the application, after obtaining a new generation of strategy individuals, the strategy population can be updated based on the new generation of strategy individuals to obtain an updated strategy population. Then, the above iterative optimization operation is repeated for the updated strategy population until the number of iterations reaches a preset threshold or no new non-dominated strategy individuals are generated for M consecutive generations. At this point, the iterative optimization operation is terminated, and the final strategy population is obtained.

[0121] (3) When the iterative optimization operation is terminated, the marketing strategy corresponding to the non-dominated solution set in the final strategy population is determined as the optimized marketing strategy.

[0122] Among them, the non-dominated solution set in the final strategy population contains a series of optimal marketing strategies that achieve different balance points on multiple business objectives.

[0123] Figure 5 illustrates an application flow of the marketing strategy optimization method provided in this embodiment. The method is executed by a human-machine collaborative marketing strategy optimization system, which mainly consists of an online inference module, an offline optimization module, a strategy library, and a log database. The process specifically includes the following steps: First, obtain the real-time conversation between customer service and the target user, and then send the real-time conversation to the online inference module.

[0124] After receiving the real-time dialogue content, the online inference module selects the optimal marketing strategy from the optimized marketing strategies stored in the strategy library, loads this optimal strategy, and activates it. Then, based on this optimal marketing strategy and the real-time dialogue content between customer service and the target user, it generates a prompt word strategy and a trigger decision strategy that match the real-time dialogue content. When the trigger decision strategy is to execute a marketing recommendation, it generates a recommended marketing script based on the prompt word strategy using a script generation model, and recommends the marketing script to customer service so that customer service can conduct business marketing to the target user based on the recommended marketing script.

[0125] Afterwards, the interaction logs generated during the business marketing process can be stored in a log database. The interaction logs include information such as dialogue content, recommended marketing scripts, user covariates, actual business processing results, customer service acceptance markers, and user complaint markers.

[0126] Furthermore, the offline optimization module can be deployed in a background analysis environment for periodic optimization of marketing strategies. Specifically, this module can execute the marketing strategy optimization method provided in the embodiments of this application, construct a counterfactual evaluation environment based on historical logs, then execute a multi-objective evolutionary algorithm to optimize the strategy parameters of the marketing strategy, and finally output a Pareto optimal candidate strategy set, and synchronize the results to the strategy library.

[0127] The strategy library can include marketing strategies currently in effect online, strategies in shadow testing, and marketing strategies optimized by the offline optimization module.

[0128] In some embodiments, as shown in Figure 6, the online inference module performs dynamic marketing decisions and generates recommended marketing scripts in a real-time production environment, which can be achieved through the following steps: (1) Based on the real-time dialogue content and the user covariates of the target user, the business opportunity probability of the target user is determined by the business opportunity identification model, where the business opportunity probability characterizes the potential conversion probability of the target user in the marketing process.

[0129] Determining the probability of business opportunities for target users is mainly to assess the potential of specific business marketing opportunities in the current conversation in real time and automatically during the dialogue between customer service and users, so as to provide a quantitative basis for subsequent triggering decisions.

[0130] In some embodiments, an automatic speech recognition module can be used to transcribe the real-time voice stream between customer service and the target user into a structured text sequence in real time. This text sequence contains complete semantic information of the current conversation. Then, the static and dynamic behavioral characteristics of the target user are queried and concatenated in real time from the user profile database and the business system to form user covariates.

[0131] Static profiles may include, for example, user age, plan type, duration of service, and spending level. Dynamic behavioral characteristics may include, for example, recent data usage, call minutes, complaint history, and recently viewed or inquired about services.

[0132] Subsequently, the processed real-time dialogue text and user covariates are input into a pre-trained business opportunity identification model. This model can be a deep semantic understanding model, such as a pre-trained model based on the Transformer architecture, like BERT or its variants. The model has been fine-tuned with supervision on large amounts of historical dialogue text and the corresponding business processing records. Its learning objective is to accurately determine whether a dialogue contains potential demand for, or interest that can be aroused in, a specific business, such as "upgrading data plans" or "activating international roaming."

[0133] For example, suppose in a conversation, a user says, "I seem to be using up my data really fast this month; even watching videos is a bit choppy." Meanwhile, the business marketing system determines that the user's current data plan is low-tier and their average data usage over the past three months exceeds 90%. In this case, the business opportunity identification model, based on the conversation content and user covariates, analyzes the user's business opportunity probability and obtains it as 0.82. Therefore, it can be considered that there is a strong marketing opportunity to upgrade the data plan for this user.

[0134] It should be noted that the above-described methods for determining the probability of a target user's business opportunity through a business opportunity identification model are merely illustrative examples of embodiments of this application and do not impose any limitations on the embodiments of this application.

[0135] (2) Generate a triggering decision strategy based on the probability of business opportunities and the strategy triggering parameters in the optimal marketing strategy.

[0136] In this embodiment of the application, after the business opportunity probability is calculated, a strategy function can be applied to the business opportunity probability and the strategy triggering parameters in the optimal marketing strategy to generate a binary marketing decision, where 1 indicates that a marketing recommendation is triggered.

[0137] (3) When the triggering decision strategy is to execute marketing recommendations, the marketing strategy is subject to compliance hard rule verification.

[0138] (4) If the verification passes, a recommended marketing script will be generated using the script generation model based on the prompt word strategy. The script will then be sent to customer service personnel.

[0139] (5) Finally, output the marketing recommendation results and record the execution log.
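Steps (1) to (5) above can be sketched as a single function. The opportunity identification model, script generation model, and compliance check are passed in as callables because their concrete interfaces are not specified here; the simple threshold rule in `trigger_decision` and all names and dictionary keys are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Mapping, Optional


def trigger_decision(opportunity_prob: float, threshold: float) -> int:
    """Hypothetical strategy function: 1 triggers a marketing recommendation."""
    return 1 if opportunity_prob >= threshold else 0


def handle_turn(
    dialogue: str,
    user: Mapping[str, object],
    strategy: Mapping[str, object],
    opportunity_model: Callable[[str, Mapping[str, object]], float],
    script_model: Callable[[str, str], str],
    compliance_check: Callable[[Mapping[str, object]], bool],
    log: List[Dict[str, object]],
) -> Optional[str]:
    """Return a recommended script for customer service, or None."""
    prob = opportunity_model(dialogue, user)                            # step (1)
    decision = trigger_decision(prob, float(strategy["trigger_threshold"]))  # step (2)
    script: Optional[str] = None
    if decision == 1 and compliance_check(user):                        # step (3)
        script = script_model(str(strategy["prompt"]), dialogue)        # step (4)
    log.append({"prob": prob, "decision": decision, "script": script})  # step (5)
    return script
```

In this sketch the execution log is written whether or not a script is produced, so that suppressed recommendations remain visible to later offline optimization.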

[0140] Optionally, in practical applications, to ensure the effectiveness of marketing results, it is usually necessary to periodically fine-tune the opportunity identification model and the script generation model and then deploy the fine-tuned models to the actual marketing environment, which may degrade the generalization ability of the models. To avoid this problem, in some embodiments, the strategy parameters of the prompt word strategy and the strategy triggering parameters of the triggering decision strategy can be completely decoupled from the model parameters of the underlying models, where the underlying models include the opportunity identification model and the script generation model.

[0141] The strategy parameters of the prompt word strategy can include the style type parameter, tone type parameter, and script template of the recommended marketing script.

[0142] Optionally, the style types of the recommended marketing script can include formal, semi-conversational, and fully conversational, and the tone types can include enthusiastic, professional, and caring.

[0143] The script template can be combined with several recommended marketing script examples. For example, Example 1: "Hello, it seems you're running out of data..."

[0144] The final output prompt strategy can be in the format of [system command] + [user profile] + [business rule] + [example of recommended marketing script].
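
The four-part prompt format can be illustrated with a small helper. The function name `build_prompt`, the section separators, and the sample profile and rule texts are assumptions made for illustration; only the ordering of the four parts and the example script fragment come from the description above.

```python
def build_prompt(system_command, user_profile, business_rule, examples):
    """Join the four prompt parts in the order given in the description."""
    example_text = "\n".join(
        f"Example {i}: {text}" for i, text in enumerate(examples, 1)
    )
    sections = [
        ("system command", system_command),
        ("user profile", user_profile),
        ("business rule", business_rule),
        ("example of recommended marketing script", example_text),
    ]
    return "\n\n".join(f"[{name}]\n{body}" for name, body in sections)


prompt = build_prompt(
    system_command="Style: semi-conversational. Tone: caring.",
    user_profile="Data usage is close to the monthly limit.",   # hypothetical
    business_rule="Do not recommend to users who complained in the past 7 days.",
    examples=["Hello, it seems you're running out of data..."],
)
```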

[0145] Opportunity identification models are used to determine the potential conversion probability of target users in the marketing process based on user covariates of target users and dialogue content related to target users. The script generation model is used to generate recommended marketing scripts based on prompt word strategies.

[0146] The strategy parameters of the prompt word strategy include discrete style parameters that characterize the style of the recommended marketing script and continuous generation parameters that characterize the content length of the recommended marketing script.

[0147] The strategy triggering parameters for triggering decision-making strategies include discrete rule switch vectors that indicate the activation status of marketing business rules, and continuous feature weight vectors that characterize the importance of dialogue content and the importance of user covariates, respectively.

[0148] The discrete rule switch vector controls whether marketing rules that depend on discrete conditions, such as customer profile and recent trajectory, are enabled; one example is a rule that prohibits triggering for users who have complained in the past 7 days.

[0149] The continuous feature weight vector can include weight vectors of continuous features such as sentiment score, package matching degree, user's recent bills, and data consumption, and is used to adjust the importance of continuous features in triggering decisions.
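
One possible in-memory encoding of the strategy triggering parameters is sketched below. The class `TriggerParams`, the `threshold` field, the veto semantics of an enabled rule, and the weighted-sum score are all assumptions; the application only states that the parameters consist of a discrete rule switch vector and a continuous feature weight vector.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TriggerParams:
    rule_switches: Dict[str, bool]      # discrete rule switch vector
    feature_weights: Dict[str, float]   # continuous feature weight vector
    threshold: float                    # hypothetical decision threshold


def trigger_decision(params, rule_hits, features):
    # Assumption: an enabled rule that fires vetoes the recommendation.
    for rule, enabled in params.rule_switches.items():
        if enabled and rule_hits.get(rule, False):
            return 0
    # Assumption: the weighted sum of continuous features is thresholded.
    score = sum(
        weight * features.get(name, 0.0)
        for name, weight in params.feature_weights.items()
    )
    return 1 if score >= params.threshold else 0


params = TriggerParams(
    rule_switches={"complained_in_past_7_days": True},
    feature_weights={"sentiment_score": 0.5, "package_matching_degree": 0.5},
    threshold=0.6,
)
features = {"sentiment_score": 0.75, "package_matching_degree": 0.75}
decision = trigger_decision(params, {}, features)
vetoed = trigger_decision(params, {"complained_in_past_7_days": True}, features)
```

Because these parameters live outside the opportunity identification model and the script generation model, they can be changed by the optimizer without touching any model weights.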

[0150] This method allows for the optimization of marketing strategies without altering the underlying model parameters. Since the optimization process only requires optimizing the marketing strategy itself, without needing to periodically fine-tune the opportunity identification model and the script generation model, the problem of model generalization degradation can be avoided.

[0151] Furthermore, in existing technologies, marketing strategies are often tested on real users to ensure the effectiveness of business marketing. This may trigger numerous complaints or compliance risks, resulting in high trial-and-error costs and failing to meet the stringent regulatory requirements of the telecommunications industry. To avoid this problem, in some implementations, before prompt word strategies and trigger decision strategies matching the real-time dialogue content between customer service and the target user are generated based on the optimal marketing strategy, the optimized marketing strategy can be deployed in a shadow service environment. The real-time dialogue content and the target user's user covariates are synchronously copied to the shadow service environment. Based on the optimized marketing strategy deployed in the shadow service environment (hereinafter referred to as the shadow strategy for ease of description), the real-time dialogue content, and the user covariates, the marketing trigger decision and the generation of recommended marketing scripts are simulated to obtain simulation execution results; the simulated recommended marketing scripts are not pushed to customer service. Based on the simulation execution results, the trigger rate difference, conversion rate difference, and complaint risk difference between the shadow strategy and the current online marketing strategy are calculated. If these three differences meet the preset verification conditions, the shadow strategy is launched into the marketing strategy library. The preset verification conditions are that the shadow strategy's results are stable and that its performance is superior to that of the online marketing strategy.

[0152] The trigger rate difference can be calculated as follows: [formula not preserved in the source text]

[0153] where the formula is expressed in terms of the trigger rate of the current online marketing strategy and the trigger rate of the shadow strategy.

[0154] The conversion rate difference can be calculated as follows: [formula not preserved in the source text]

[0155] where the formula is expressed in terms of the conversion rate of the current online marketing strategy and the conversion rate of the shadow strategy.

[0156] The complaint risk difference can be calculated as follows: [formula not preserved in the source text]

[0157] where the formula is expressed in terms of the complaint risk rate of the shadow strategy and the complaint risk rate of the current online marketing strategy.

[0158] If the differences in trigger rate, conversion rate, and complaint risk meet the preset verification conditions, the shadow strategy is considered to be stable and outperforming the online marketing strategy. Therefore, it can be gradually expanded to a wider range of users and then launched online. Otherwise, it returns to the offline optimization phase to continue evolving, achieving closed-loop learning and secure deployment. In this way, since the marketing strategy does not need to be tested on real users, it avoids triggering a large number of complaints or compliance risks, reducing trial-and-error costs.
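
The shadow verification check can be sketched as follows. Since the exact difference formulas are not legible in this text, the sketch assumes plain signed differences (shadow minus online); the class `StrategyStats`, the function names, and all threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StrategyStats:
    trigger_rate: float
    conversion_rate: float
    complaint_risk_rate: float


def shadow_differences(online, shadow):
    # Assumption: plain signed differences, shadow minus online.
    return {
        "trigger": shadow.trigger_rate - online.trigger_rate,
        "conversion": shadow.conversion_rate - online.conversion_rate,
        "complaint": shadow.complaint_risk_rate - online.complaint_risk_rate,
    }


def passes_verification(diff, max_trigger_shift=0.10,
                        min_conversion_gain=0.0, max_complaint_increase=0.0):
    # Hypothetical conditions: the trigger rate stays stable, conversion is
    # no worse, and complaint risk does not increase.
    return (
        abs(diff["trigger"]) <= max_trigger_shift
        and diff["conversion"] >= min_conversion_gain
        and diff["complaint"] <= max_complaint_increase
    )


online = StrategyStats(0.25, 0.125, 0.05)
shadow = StrategyStats(0.3125, 0.1875, 0.03125)
diff = shadow_differences(online, shadow)
promote = passes_verification(diff)   # True: launch, else back to offline optimization
```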

[0159] The method provided in this application constructs a multi-dimensional historical sample set based on historical interaction logs, including user covariates, actual business processing results, historical recommendation decision markers, customer service adoption markers, and user complaint markers. This sample set is then used to train a counterfactual estimation model to generate counterfactual proxy labels for target samples that have not been recommended. In this way, the business processing, adoption, and complaint results of potential users can be objectively estimated without relying on manual annotation. This can significantly reduce marketing costs, shorten the optimization cycle, improve strategy optimization efficiency, and enhance the accuracy of opportunity identification and recommendation.

[0160] In existing technologies, marketing opportunity identification and recommended script generation rely heavily on large-scale, high-quality manually labeled data, and it is difficult to accurately identify and label potential marketable users who were not recommended but could actually have handled the business. This results in high marketing costs, low strategy optimization efficiency, and low marketing conversion rates. To address these problems, this application provides a marketing strategy optimization device. As shown in Figure 7, the device includes a construction module 701, a training module 702, a generation module 703, a determination module 704, and an optimization module 705. The functions of the modules are as follows: the construction module 701 is used to construct a historical sample set based on historical interaction logs generated during the marketing process; the training module 702 is used to train a counterfactual estimation model based on the historical sample set to obtain the trained counterfactual estimation model; the generation module 703 is used to generate counterfactual proxy labels for the target samples through the trained counterfactual estimation model; the determination module 704 is used to determine, based on the counterfactual proxy labels, a multi-objective fitness function for evaluating marketing strategies; and the optimization module 705 is used to optimize the marketing strategy to be optimized based on the multi-objective fitness function to obtain the optimized marketing strategy. Each sample in the historical sample set includes at least: user covariates, actual business processing results, historical recommendation decision markers, customer service adoption markers, and user complaint markers. The counterfactual estimation model is used to estimate the potential business processing result, potential adoption result, and potential complaint result of a target sample when a marketing recommendation is performed. A target sample is a sample whose historical recommendation decision marker indicates that no recommendation was made.

[0161] Optionally, the training module 702 is used to: determine, based on the historical recommendation decision markers, the non-target samples other than the target samples from the historical sample set; and train the counterfactual estimation model based on the user covariates, actual business processing results, customer service adoption markers, and user complaint markers of the non-target samples to obtain the trained counterfactual estimation model. Optionally, the counterfactual estimation model includes a potential business processing probability estimation model, a potential adoption probability estimation model, and a potential complaint risk estimation model; in that case, the training module 702 is used to: input the user covariates and actual business processing results of the non-target samples into the potential business processing probability estimation model for training to obtain the trained potential business processing probability estimation model; input the user covariates and customer service adoption markers of the non-target samples into the potential adoption probability estimation model for training to obtain the trained potential adoption probability estimation model; input the user covariates and user complaint markers of the non-target samples into the potential complaint risk estimation model for training to obtain the trained potential complaint risk estimation model; and obtain the trained counterfactual estimation model based on the three trained estimation models.
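
The training flow described for module 702 can be sketched with three small estimators fitted only on the non-target (recommended) samples. The sketch assumes logistic regression as the model family, which the application does not prescribe; the dictionary keys, the function names, and the toy data are hypothetical.

```python
import math


def predict(w, x):
    """Logistic prediction; w[0] is the bias term."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the log loss."""
    w = [0.0] * (len(xs[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = predict(w, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(xs) for wj, gj in zip(w, grad)]
    return w


def train_counterfactual_model(samples):
    # Keep only non-target samples, i.e. those that were recommended.
    treated = [s for s in samples if s["recommended"] == 1]
    xs = [s["covariates"] for s in treated]
    # One estimator per outcome: handling, adoption, complaint.
    return {
        head: fit_logistic(xs, [s[head] for s in treated])
        for head in ("handled", "adopted", "complained")
    }


def proxy_labels(model, covariates):
    # Estimated outcomes for a target sample had it been recommended.
    return {head: predict(w, covariates) for head, w in model.items()}


samples = [
    {"covariates": [0.9], "recommended": 1, "handled": 1, "adopted": 1, "complained": 0},
    {"covariates": [0.8], "recommended": 1, "handled": 1, "adopted": 1, "complained": 0},
    {"covariates": [0.2], "recommended": 1, "handled": 0, "adopted": 0, "complained": 1},
    {"covariates": [0.1], "recommended": 1, "handled": 0, "adopted": 1, "complained": 0},
    {"covariates": [0.85], "recommended": 0, "handled": 0, "adopted": 0, "complained": 0},
]
model = train_counterfactual_model(samples)
labels = proxy_labels(model, [0.85])   # proxy labels for the unrecommended sample
```

The last sample is a target sample: its observed outcomes are never used for fitting, and its proxy labels come entirely from the estimators trained on the recommended samples.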

[0162] Optionally, the generation module 703 is used to: input the user covariates of the target sample into the trained counterfactual estimation model to obtain the potential business processing result, potential adoption result, and potential complaint result of the target sample when a marketing recommendation is performed; and determine the potential business processing result, the potential adoption result, and the potential complaint result as the counterfactual proxy labels of the target sample.

[0163] Optionally, the counterfactual proxy labels include the potential business processing result, potential adoption result, and potential complaint result of the target sample when a marketing recommendation is performed; in that case, the generation module 703 is used to: calculate, based on the historical recommendation decision markers in the historical sample set, the recommendation propensity score for the user covariates, where the recommendation propensity score characterizes the probability that the historical marketing strategy executes a marketing recommendation given the user covariates; correct the potential business processing result based on the recommendation propensity score and the actual business processing results of the non-target samples in the historical sample set to obtain the corrected potential business processing result; and determine the potential adoption result and potential complaint result generated by the trained counterfactual estimation model, together with the corrected potential business processing result, as the counterfactual proxy labels of the target sample.

[0164] Optionally, the optimization module 705 is used to: encode the marketing strategy to be optimized into strategy individuals to form a strategy population; perform the following iterative optimization operation on the strategy individuals in the strategy population until the number of iterations reaches a preset threshold or no new non-dominated strategy individual is generated for M consecutive generations: calculate, based on the multi-objective fitness function, the maximum counterfactual conversion probability, the maximum counterfactual adoption probability, and/or the minimum counterfactual complaint probability of each strategy individual in the strategy population; perform non-dominated sorting on the strategy individuals based on these values to select the non-dominated strategy individuals; perform crossover and mutation operations on the non-dominated strategy individuals to generate a new generation of strategy individuals; and update the strategy population based on the new generation of strategy individuals to obtain the updated strategy population; and, when the iterative optimization operation terminates, determine the marketing strategies corresponding to the non-dominated solution set in the final strategy population as the optimized marketing strategy.
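
The iterative optimization performed by module 705 can be sketched as a simplified evolutionary loop with Pareto (non-dominated) selection. This is not a full NSGA-II implementation and is not taken from the application: the tuple encoding of a strategy individual, the one-point crossover, the Gaussian mutation, the `patience` parameter standing in for M, and the `toy_fitness` function are all illustrative assumptions.

```python
import random


def dominates(a, b):
    """True when a is no worse than b on every objective and better on one.

    Fitness tuples are (conversion, adoption, complaint): the first two are
    maximized and the last is minimized.
    """
    no_worse = a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2]
    better = a[0] > b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and better


def non_dominated(population, fitness):
    scores = [fitness(ind) for ind in population]
    return [
        ind for ind, s in zip(population, scores)
        if not any(dominates(t, s) for t in scores)
    ]


def evolve(population, fitness, max_generations=20, patience=3, seed=0):
    rng = random.Random(seed)
    pop_size = len(population)
    front = non_dominated(population, fitness)
    stale = 0
    for _ in range(max_generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.choice(front), rng.choice(front)
            cut = rng.randrange(1, len(p1))
            child = list(p1[:cut] + p2[cut:])            # one-point crossover
            i = rng.randrange(len(child))
            # Gaussian mutation of one gene, clipped to [0, 1].
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.1)))
            children.append(tuple(child))
        new_front = non_dominated(front + children, fitness)
        # Count generations that add no new non-dominated individual.
        stale = stale + 1 if set(new_front) <= set(front) else 0
        front = new_front
        if stale >= patience:
            break
    return front


def toy_fitness(ind):
    # Hypothetical stand-in for the proxy-label-based fitness function.
    return (1.0 - ind[0], ind[1], 0.5 * (1.0 - ind[0]) + 0.25 * ind[2])


init_rng = random.Random(1)
initial = [tuple(init_rng.random() for _ in range(3)) for _ in range(6)]
pareto_front = evolve(initial, toy_fitness)
```

The returned list plays the role of the non-dominated solution set; choosing one strategy from it for deployment is a separate selection step.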

[0165] Optionally, the multi-objective fitness function includes at least one of the following: a function maximizing the counterfactual conversion rate, a function maximizing the counterfactual adoption rate, and a function minimizing the counterfactual complaint rate.
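
One way to read the three objectives is as averages of the counterfactual proxy labels over the samples a candidate strategy would trigger. The sketch below follows that reading as an assumption; the function `strategy_fitness`, the `triggers` callable, and the sample keys are hypothetical and are not defined in the application.

```python
def strategy_fitness(triggers, samples):
    """Return (conversion, adoption, complaint) rates for one strategy.

    `triggers` is a hypothetical callable that says whether the strategy
    would recommend to a sample; each sample carries proxy-label
    probabilities under assumed key names.
    """
    chosen = [s for s in samples if triggers(s)]
    if not chosen:
        return (0.0, 0.0, 0.0)
    n = len(chosen)
    conversion = sum(s["p_handle"] for s in chosen) / n    # to be maximized
    adoption = sum(s["p_adopt"] for s in chosen) / n       # to be maximized
    complaint = sum(s["p_complain"] for s in chosen) / n   # to be minimized
    return (conversion, adoption, complaint)


samples = [
    {"opportunity": 0.875, "p_handle": 0.75, "p_adopt": 0.5, "p_complain": 0.125},
    {"opportunity": 0.625, "p_handle": 0.25, "p_adopt": 0.5, "p_complain": 0.125},
    {"opportunity": 0.25, "p_handle": 0.125, "p_adopt": 0.25, "p_complain": 0.5},
]
fitness = strategy_fitness(lambda s: s["opportunity"] >= 0.5, samples)
```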

[0166] Optionally, the marketing strategy optimization device also includes: a selection module, used to select the optimal marketing strategy from the optimized marketing strategies; a first processing module, used to generate, based on the optimal marketing strategy and the real-time dialogue content between customer service and the target user, a prompt word strategy and a trigger decision strategy that match the real-time dialogue content; a second processing module, used to generate a recommended marketing script based on the prompt word strategy and the script generation model when the trigger decision strategy is to execute a marketing recommendation; and a recommendation module, used to recommend the recommended marketing script to customer service personnel so that they can conduct business marketing to the target user based on it.

[0167] Optionally, the first processing module is used to: determine the business opportunity probability of the target user through the business opportunity identification model based on the real-time dialogue content and the user covariates of the target user, where the business opportunity probability characterizes the potential conversion probability of the target user in the marketing process; and generate the trigger decision strategy based on the business opportunity probability and the strategy triggering parameters of the optimal marketing strategy.

[0168] Optionally, the marketing strategy optimization device is also used to: deploy the optimized marketing strategy in the shadow service environment before the prompt word strategy and the trigger decision strategy are generated based on the optimal marketing strategy and the real-time dialogue content between customer service and the target user; synchronously copy the real-time dialogue content and the target user's user covariates to the shadow service environment; simulate the marketing trigger decision and the generation of recommended marketing scripts based on the optimized marketing strategy deployed in the shadow service environment, the real-time dialogue content, and the user covariates to obtain simulation execution results, where the simulated recommended marketing scripts are not pushed to customer service; calculate, based on the simulation execution results, the trigger rate difference, conversion rate difference, and complaint risk difference between the optimized marketing strategy and the current online marketing strategy; and launch the optimized marketing strategy into the marketing strategy library if the trigger rate difference, conversion rate difference, and complaint risk difference meet the preset verification conditions.

[0169] Optionally, the strategy parameters of the prompt word strategy and the strategy triggering parameters of the trigger decision strategy are completely decoupled from the model parameters of the underlying model. The underlying model includes the business opportunity identification model and the script generation model. The business opportunity identification model is used to determine the potential conversion probability of the target user in the marketing process based on the user covariates of the target user and the dialogue content related to the target user. The script generation model is used to generate recommended marketing scripts based on the prompt word strategy. The strategy parameters of the prompt word strategy include discrete style parameters that characterize the style of the recommended marketing script and continuous generation parameters that characterize the content length of the recommended marketing script. The strategy triggering parameters of the trigger decision strategy include discrete rule switch vectors that indicate the activation status of marketing business rules, and continuous feature weight vectors that characterize the importance of the dialogue content and the importance of the user covariates, respectively.

[0170] The device provided in this application constructs a multi-dimensional historical sample set based on historical interaction logs, including user covariates, actual business processing results, historical recommendation decision markers, customer service adoption markers, and user complaint markers. This sample set is then used to train a counterfactual estimation model to generate counterfactual proxy labels for target samples that have not been recommended. In this way, the business processing, adoption, and complaint results of potential users can be objectively estimated without relying on manual annotation. This can significantly reduce marketing costs, shorten the optimization cycle, improve strategy optimization efficiency, and enhance the accuracy of business opportunity identification and recommendation.

[0171] Figure 8 illustrates the hardware structure of an electronic device according to various embodiments of this application. The electronic device may include a processor 801 and a memory 802 storing computer program instructions. Specifically, the processor 801 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of this application.

[0172] Memory 802 may include mass storage for data or instructions. For example, and not limitingly, memory 802 may include a hard disk drive (HDD), floppy disk drive, flash memory, optical disk, magneto-optical disk, magnetic tape, or Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, memory 802 may include removable or non-removable (or fixed) media. Where appropriate, memory 802 may be internal or external to an electronic device. In a particular embodiment, memory 802 may be a non-volatile solid-state memory.

[0173] In one embodiment, memory 802 may be read-only memory (ROM). In one embodiment, the ROM may be a mask-programmed ROM, a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), an electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.

[0174] The processor 801 reads and executes computer program instructions stored in the memory 802 to implement any of the marketing strategy optimization methods in the above embodiments.

[0175] In one example, the electronic device may also include a communication interface 803 and a bus 810. As shown in Figure 8, the processor 801, the memory 802, and the communication interface 803 are connected through the bus 810 and communicate with each other over it.

[0176] The communication interface 803 is mainly used to realize communication between various modules, devices, units and / or equipment in the embodiments of this application.

[0177] Bus 810 includes hardware, software, or both, coupling the components of the electronic device to each other. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or other suitable buses, or a combination of two or more of these. Where appropriate, bus 810 may include one or more buses. Although specific buses are described and illustrated in embodiments of this application, this application contemplates any suitable bus or interconnect.

[0178] Furthermore, in conjunction with the marketing strategy optimization methods in the above embodiments, this application embodiment can provide a computer-readable storage medium for implementation. This computer-readable storage medium stores computer program instructions; when executed by a processor, these computer program instructions implement any of the marketing strategy optimization methods in the above embodiments.

[0179] It should be clarified that this application is not limited to the specific configurations and processes described above and shown in the figures. For the sake of brevity, detailed descriptions of known methods are omitted here. In the above embodiments, several specific steps are described and shown as examples. However, the method process of this application is not limited to the specific steps described and shown. Those skilled in the art can make various changes, modifications, and additions, or change the order of steps, after understanding the spirit of this application.

[0180] The above description is merely a specific implementation example of this application. Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, modules, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0181] Secondly, those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0182] This application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each process and/or block of the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

[0183] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including an instruction device, which implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

[0184] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

[0185] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.

[0186] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0187] Computer-readable media include both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0188] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0189] The above description is merely an embodiment of this application and is not intended to limit the scope of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of the claims of this application.

Claims

1. A marketing strategy optimization method, characterized in that it comprises: constructing a historical sample set based on historical interaction logs generated during the marketing process; training a counterfactual estimation model based on the historical sample set to obtain a trained counterfactual estimation model; generating counterfactual proxy labels for target samples through the trained counterfactual estimation model; determining, based on the counterfactual proxy labels, a multi-objective fitness function for evaluating marketing strategies; and optimizing the marketing strategy to be optimized based on the multi-objective fitness function to obtain an optimized marketing strategy; wherein each sample in the historical sample set includes at least: user covariates, actual business processing results, historical recommendation decision markers, customer service adoption markers, and user complaint markers; the counterfactual estimation model is used to estimate the potential business processing result, potential adoption result, and potential complaint result of the target sample when a marketing recommendation is performed; and the target sample is a sample whose historical recommendation decision marker indicates that no recommendation was made.

2. The method as described in claim 1, characterized in that the training of the counterfactual estimation model based on the historical sample set to obtain the trained counterfactual estimation model includes: determining, based on the historical recommendation decision markers, non-target samples other than the target samples from the historical sample set; and training the counterfactual estimation model based on the user covariates, the actual business processing results, the customer service adoption markers, and the user complaint markers of the non-target samples to obtain the trained counterfactual estimation model.

3. The method as described in claim 1 or 2, characterized in that the counterfactual estimation model includes a potential business processing probability estimation model, a potential adoption probability estimation model, and a potential complaint risk estimation model; and the training of the counterfactual estimation model based on the historical sample set to obtain the trained counterfactual estimation model includes: inputting the user covariates and the actual business processing results of the non-target samples into the potential business processing probability estimation model for training to obtain the trained potential business processing probability estimation model; inputting the user covariates and the customer service adoption markers of the non-target samples into the potential adoption probability estimation model for training to obtain the trained potential adoption probability estimation model; inputting the user covariates and the user complaint markers of the non-target samples into the potential complaint risk estimation model for training to obtain the trained potential complaint risk estimation model; and obtaining the trained counterfactual estimation model based on the trained potential business processing probability estimation model, the trained potential adoption probability estimation model, and the trained potential complaint risk estimation model.

4. The method according to any one of claims 1 to 3, characterized in that the generating of counterfactual proxy labels for the target samples through the trained counterfactual estimation model includes: inputting the user covariates of the target sample into the trained counterfactual estimation model to obtain the potential business processing result, potential adoption result, and potential complaint result of the target sample when a marketing recommendation is performed; and determining the potential business processing result, the potential adoption result, and the potential complaint result as the counterfactual proxy labels of the target sample.

5. The method according to any one of claims 1 to 4, characterized in that the counterfactual proxy labels include the potential business processing results, the potential adoption results, and the potential complaint results of the target sample when marketing recommendations are performed; and generating counterfactual proxy labels for the target sample through the trained counterfactual estimation model includes: calculating a recommendation propensity score for the user covariates based on the historical recommendation decision markers in the historical sample set, wherein the recommendation propensity score characterizes the probability that the historical marketing strategy executes a marketing recommendation given the user covariates; correcting the potential business processing results based on the recommendation propensity score and the actual business processing results of the non-target samples, that is, the samples in the historical sample set other than the target sample, to obtain corrected potential business processing results; and determining the potential adoption results and potential complaint results generated by the trained counterfactual estimation model, together with the corrected potential business processing results, as the counterfactual proxy labels of the target sample.
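
The score described in claim 5 corresponds to what causal inference calls a propensity score. The claim does not state the correction formula, so the sketch below assumes an augmented inverse propensity weighting (AIPW) style correction. The function names, the `(x, recommended, outcome)` tuple layout, and the clipping value are hypothetical.

```python
from collections import defaultdict


def propensity_scores(samples, clip=0.05):
    """Estimate e(x) = P(recommended | x) per covariate segment.

    `samples` is a list of (x, recommended, outcome) tuples. Clipping keeps the
    later division stable; the clip value is an assumption.
    """
    recommended = defaultdict(int)
    total = defaultdict(int)
    for x, t, _ in samples:
        total[x] += 1
        recommended[x] += int(t)
    return {x: min(max(recommended[x] / total[x], clip), 1 - clip) for x in total}


def corrected_conversion(samples, mu, e):
    """AIPW-style corrected estimate of the mean outcome under recommendation.

    mu[x] is the model's potential business processing probability for segment x.
    The residual of each recommended (non-target) sample is re-weighted by 1 / e[x];
    not-recommended samples contribute only the model estimate.
    """
    total = 0.0
    for x, t, y in samples:
        total += mu[x] + int(t) * (y - mu[x]) / e[x]
    return total / len(samples)
```

With this form, the observed outcomes of recommended samples pull a biased model estimate back toward the observed rate, which is one plausible reading of "correcting" the potential business processing results.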

6. The method as described in claim 1, characterized in that the multi-objective fitness function includes at least one of the following: a function maximizing the counterfactual conversion rate, a function maximizing the counterfactual adoption rate, and a function minimizing the counterfactual complaint rate; and optimizing the marketing strategy to be optimized based on the multi-objective fitness function to obtain the optimized marketing strategy includes: encoding the marketing strategies to be optimized as strategy individuals to form a strategy population; performing the following iterative optimization operation on the strategy individuals in the strategy population until the number of iterations reaches a preset threshold or no new non-dominated strategy individual has been generated for M consecutive generations: calculating, based on the multi-objective fitness function, the maximum counterfactual conversion probability, the maximum counterfactual adoption probability, and/or the minimum counterfactual complaint probability of each strategy individual in the strategy population; performing non-dominated sorting on the strategy individuals, based on the maximum counterfactual conversion probability, the maximum counterfactual adoption probability, and/or the minimum counterfactual complaint probability of each strategy individual, to select the non-dominated strategy individuals; performing crossover and mutation operations on the non-dominated strategy individuals to generate a new generation of strategy individuals; and updating the strategy population with the new generation of strategy individuals to obtain an updated strategy population; and, when the iterative optimization operation terminates, determining the marketing strategies corresponding to the non-dominated solution set in the final strategy population as the optimized marketing strategy.
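
The iterative operation of claim 6 can be sketched as a small evolutionary loop. This is a minimal illustration under stated assumptions, not the claimed implementation: individuals are assumed to be tuples of at least two floats in [0, 1], `fitness` is a hypothetical callable returning `(conversion, adoption, complaint)`, `patience` plays the role of M, and one-point crossover with Gaussian mutation is one choice among many.

```python
import random


def dominates(a, b):
    """True if score a dominates b: maximize conversion and adoption, minimize complaint."""
    no_worse = a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2]
    better = a[0] > b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and better


def non_dominated(population, fitness):
    """Keep the individuals whose score is not dominated by any other score."""
    scores = [fitness(p) for p in population]
    return [p for p, s in zip(population, scores)
            if not any(dominates(o, s) for o in scores)]


def optimize(population, fitness, max_iters=50, patience=5, rng=None):
    """Iterate until max_iters, or until `patience` generations add no new front member."""
    rng = rng or random.Random(0)
    size = len(population)
    seen, stale = set(), 0
    for _ in range(max_iters):
        # Deduplicate the front while preserving order.
        front = list(dict.fromkeys(non_dominated(population, fitness)))
        new = set(front) - seen
        stale = 0 if new else stale + 1
        seen |= new
        if stale >= patience:
            break
        children = []
        while len(front) + len(children) < size:
            a, b = rng.choice(front), rng.choice(front)
            cut = rng.randrange(1, len(a))            # one-point crossover
            child = list(a[:cut] + b[cut:])
            i = rng.randrange(len(child))             # mutate one gene, clamp to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(tuple(child))
        population = front + children
    return non_dominated(population, fitness)
```

Returning the whole non-dominated set, rather than a single winner, matches the claim's statement that the strategies in the final non-dominated solution set are taken as the optimized marketing strategy.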

7. The method as described in claim 1, characterized in that the method further includes: selecting an optimal marketing strategy from the optimized marketing strategies; generating, based on the optimal marketing strategy and the real-time dialogue content between customer service and a target user, a prompt word strategy and a trigger decision strategy that match the real-time dialogue content; when the trigger decision strategy is to execute a marketing recommendation, generating a recommended marketing script based on the prompt word strategy and a script generation model; and recommending the recommended marketing script to the customer service so that the customer service can market the business to the target user based on the recommended marketing script.

8. The method as described in claim 7, characterized in that generating the trigger decision strategy that matches the real-time dialogue content, based on the optimal marketing strategy and the real-time dialogue content between customer service and the target user, includes: determining the business opportunity probability of the target user through a business opportunity identification model, based on the real-time dialogue content and the user covariates of the target user, wherein the business opportunity probability characterizes the potential conversion probability of the target user in the marketing process; and generating the trigger decision strategy based on the business opportunity probability and the strategy triggering parameters in the optimal marketing strategy.
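
Claim 8 does not say how the business opportunity probability and the strategy triggering parameters are combined. The sketch below assumes one simple reading: a probability threshold, plus switchable business rules that block the recommendation when they are both enabled and matched. `TriggerParams`, `threshold`, and `rule_hits` are hypothetical names, and the probability is taken as an input rather than computed by a real business opportunity identification model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerParams:
    threshold: float       # hypothetical: minimum business opportunity probability
    rule_switches: tuple   # on/off flag per marketing business rule


def trigger_decision(prob, rule_hits, params):
    """Return True to execute a marketing recommendation.

    `rule_hits[i]` says whether rule i matched the current dialogue. Treating an
    enabled and matched rule as a veto is an assumption of this sketch.
    """
    blocked = any(on and hit for on, hit in zip(params.rule_switches, rule_hits))
    return prob >= params.threshold and not blocked
```

Because the threshold and switches live in `TriggerParams`, they can be tuned by the optimizer without retraining the model that produces `prob`.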

9. The method as described in claim 7, characterized in that, before generating the prompt word strategy and the trigger decision strategy that match the real-time dialogue content based on the optimal marketing strategy and the real-time dialogue content between customer service and the target user, the method further includes: deploying the optimized marketing strategy in a shadow service environment; synchronously copying the real-time dialogue content and the user covariates of the target user to the shadow service environment; simulating, based on the optimized marketing strategy deployed in the shadow service environment, the real-time dialogue content, and the user covariates, the marketing trigger decision and the generation of recommended marketing scripts to obtain simulation execution results, wherein the generated recommended marketing scripts are not pushed to the customer service; calculating, based on the simulation execution results, the differences in trigger rate, conversion rate, and complaint risk between the optimized marketing strategy and the current online marketing strategy; and, if the trigger rate difference, the conversion rate difference, and the complaint risk difference meet preset verification conditions, uploading the optimized marketing strategy to the marketing strategy library.
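
The verification step of claim 9 reduces to comparing three rates and checking them against preset conditions. The sketch below assumes per-dialogue values in [0, 1] for each metric and hypothetical limit names. Since shadow scripts are never pushed to customer service, the shadow-side conversion and complaint values would have to be simulated estimates rather than observed outcomes.

```python
def rate(values):
    """Mean of a list of 0/1 flags or probabilities; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def shadow_check(shadow, online, limits):
    """Compare shadow-strategy metrics with the current online strategy.

    `shadow` and `online` map "trigger", "conversion", and "complaint" to
    per-dialogue values. The keys of `limits` are hypothetical names for the
    preset verification conditions.
    """
    diffs = {k: rate(shadow[k]) - rate(online[k])
             for k in ("trigger", "conversion", "complaint")}
    ok = (abs(diffs["trigger"]) <= limits["max_trigger_shift"]
          and diffs["conversion"] >= limits["min_conversion_gain"]
          and diffs["complaint"] <= limits["max_complaint_increase"])
    return diffs, ok
```

Only when `ok` is true would the optimized strategy be uploaded to the marketing strategy library.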

10. The method according to any one of claims 7 to 9, characterized in that the strategy parameters of the prompt word strategy and the strategy triggering parameters of the trigger decision strategy are completely decoupled from the model parameters of the underlying models; the underlying models include a business opportunity identification model and a script generation model; the business opportunity identification model is used to determine the potential conversion probability of the target user in the marketing process based on the user covariates of the target user and the dialogue content related to the target user; the script generation model is used to generate recommended marketing scripts based on the prompt word strategy; the strategy parameters of the prompt word strategy include discrete style parameters characterizing the style of the recommended marketing script and continuous generation parameters characterizing the content length of the recommended marketing script; and the strategy triggering parameters of the trigger decision strategy include a discrete rule switch vector indicating the activation status of marketing business rules, and a continuous feature weight vector representing the importance of the dialogue content and the importance of the user covariates, respectively.
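
One way to picture the mixed discrete and continuous parameters of claim 10 is as a small record that can be flattened into a gene vector for an evolutionary optimizer. The `STYLES` vocabulary, the field names, and the encoding below are assumptions for illustration. No model weights appear in the record, which reflects the decoupling the claim describes.

```python
from dataclasses import dataclass

STYLES = ("concise", "friendly", "formal")   # hypothetical style vocabulary


@dataclass(frozen=True)
class Strategy:
    style: str               # discrete style parameter of the prompt word strategy
    max_length: float        # continuous generation parameter (content length)
    rule_switches: tuple     # discrete rule switch vector (0/1 per business rule)
    feature_weights: tuple   # continuous weights: (dialogue content, user covariates)


def encode(s):
    """Flatten a strategy into a tuple of floats (one gene per parameter)."""
    return (float(STYLES.index(s.style)), s.max_length,
            *map(float, s.rule_switches), *s.feature_weights)


def decode(genes, n_rules):
    """Rebuild a strategy from genes; discrete genes are rounded or thresholded."""
    style = STYLES[int(round(genes[0])) % len(STYLES)]
    switches = tuple(int(g >= 0.5) for g in genes[2:2 + n_rules])
    return Strategy(style, genes[1], switches, tuple(genes[2 + n_rules:]))
```

Rounding and thresholding in `decode` let crossover and mutation operate on plain floats while still yielding valid discrete settings.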

11. A marketing strategy optimization device, characterized in that it includes a construction module, a training module, a generation module, a determination module, and an optimization module, wherein: the construction module is used to construct a historical sample set based on historical interaction logs generated during the marketing process; the training module is used to train the counterfactual estimation model based on the historical sample set to obtain the trained counterfactual estimation model; the generation module is used to generate counterfactual proxy labels for the target sample using the trained counterfactual estimation model; the determination module is used to determine a multi-objective fitness function for evaluating marketing strategies based on the counterfactual proxy labels; the optimization module is used to optimize the marketing strategy to be optimized based on the multi-objective fitness function to obtain the optimized marketing strategy; each sample in the historical sample set includes at least: user covariates, actual business processing results, historical recommendation decision markers, customer service adoption markers, and user complaint markers; the counterfactual estimation model is used to estimate the potential business processing results, potential adoption results, and potential complaint results of the target sample when marketing recommendations are performed; and the target sample is a sample whose historical recommendation decision marker indicates that no recommendation was made.