Data processing method, apparatus and device

By acquiring the feature vector of the target information and the frequency information, matching information, and intent-based first information of the candidate dialogues, the target dialogue is determined. This addresses the problem of untimely dialogue updates in risk control scenarios and enables timely and accurate risk control.

CN115222262B · Active · Publication Date: 2026-06-16 · ALIPAY (HANGZHOU) INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
Filing Date: 2022-07-22
Publication Date: 2026-06-16

AI Technical Summary

Technical Problem

In risk control scenarios, rapid changes in black market fraud methods produce a large volume of risk control data that is updated quickly. Existing models cannot be updated in a timely manner, so the dialogue that matches the current scenario cannot be determined accurately, resulting in poor risk control effectiveness.

Method used

By acquiring the feature vector of the target information and combining it with the frequency information and matching information of each candidate dialogue and with the intent type recognized from the target information, the matching degree between each candidate dialogue and the target user is determined, and the target dialogue that matches the target user is output for risk control.

Benefits of technology

In risk control scenarios, the dialogue appropriate to the current situation can be determined promptly and accurately, which ensures timely and accurate risk control and prevents risk control failures caused by untimely model updates.


Abstract

Embodiments of the present specification provide a data processing method, apparatus, and device. The method comprises: upon detecting that a target user has triggered the execution of a target service, determining a first feature vector corresponding to acquired target information, wherein the target information includes information required by the target user to trigger the execution of the target service, and / or interaction information of the target user in response to triggering the execution of the target service; determining a second feature vector corresponding to each candidate dialogue based on frequency information, matching information, and first information of the candidate dialogues to be output; determining, based on the first feature vector and the second feature vector, the matching degree between each candidate dialogue and the target user triggering the execution of the target service; and determining, based on the matching degree, the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service, and outputting the target dialogue.

Description

Technical Field

[0001] The embodiments in this specification relate to the field of data processing technology, and in particular to a data processing method, apparatus, and device.

Background Technology

[0002] With the rapid development of the internet industry, cyber risks have also increased. In risk control scenarios, application service providers can interact with users through customer service personnel before providing services. This allows them to determine whether there are any risks associated with the current business (such as transfers, top-ups, and withdrawals) based on user feedback. To reduce the cost of manual intervention, risk control can be implemented through human-computer interaction. For example, a computer can be trained using historical dialogue scripts to determine a model. Then, based on the trained dialogue script model, it can determine the corresponding dialogue script for the current scenario and interact with the user using this determined script to control the risks of the current business.

[0003] However, as the fraudulent methods employed by the black market evolve, the volume of risk control data becomes large and the data is updated rapidly. Consequently, the data processing pressure of model updates becomes significant, making it impossible to update the model in a timely manner based on the dialogue. This can make it impossible to determine a dialogue that closely matches the current scenario, thus compromising the effectiveness of risk control. Therefore, a solution is needed that can promptly and accurately determine the dialogue matching the current scenario for risk control in risk control scenarios.

Summary of the Invention

[0004] The purpose of the embodiments in this specification is to provide a data processing method, apparatus, and device that can promptly and accurately determine the dialogue matching the current scenario for risk control in risk control scenarios.

[0005] To achieve the above technical solution, the embodiments in this specification are implemented as follows:

[0006] In a first aspect, embodiments of this specification provide a data processing method, comprising: upon detecting that a target user has triggered the execution of a target service, determining a first feature vector corresponding to the target information based on acquired target information, wherein the target information includes information required by the target user to trigger the execution of the target service, and / or interaction information of the target user in response to triggering the execution of the target service; determining a second feature vector corresponding to each candidate dialogue based on frequency information, matching information, and first information of candidate dialogues to be output, wherein the frequency information is determined based on report information within a preset detection period and report information corresponding to the candidate dialogue in the report information, the matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type, and the first information is determined based on intent type obtained by intent recognition processing of the target information; determining the degree of matching between each candidate dialogue and the target user triggering the execution of the target service based on the first feature vector and the second feature vector; determining the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service based on the degree of matching, and outputting the target dialogue.

[0007] In a second aspect, embodiments of this specification provide a data processing apparatus, comprising: a first acquisition module, configured to, upon detecting that a target user has triggered the execution of a target service, determine a first feature vector corresponding to the target information based on acquired target information, wherein the target information includes information required by the target user to trigger the execution of the target service, and / or interaction information of the target user in response to triggering the execution of the target service; a first determination module, configured to determine a second feature vector corresponding to each candidate dialogue based on frequency information, matching information, and first information of candidate dialogues to be output, wherein the frequency information is determined based on report information within a preset detection period and report information corresponding to the candidate dialogue among the report information, the matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type, and the first information is determined based on intent type obtained by intent recognition processing of the target information; a second determination module, configured to determine the degree of matching between each candidate dialogue and the target user triggering the execution of the target service based on the first feature vector and the second feature vector; and a dialogue determination module, configured to determine, based on the degree of matching, the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service, and output the target dialogue.

[0008] In a third aspect, embodiments of this specification provide a data processing device, the data processing device comprising: a processor; and a memory arranged to store computer-executable instructions, wherein, when executed, the executable instructions cause the processor to: upon detecting that a target user has triggered the execution of a target service, determine a first feature vector corresponding to the target information based on acquired target information, wherein the target information includes information required by the target user to trigger the execution of the target service, and / or interaction information of the target user in response to triggering the execution of the target service; determine a second feature vector corresponding to each candidate dialogue based on frequency information, matching information, and first information of candidate dialogues to be output, wherein the frequency information is determined based on report information within a preset detection period and report information corresponding to the candidate dialogue among the report information, the matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type, and the first information is determined based on intent type obtained by intent recognition processing of the target information; determine the degree of matching between each candidate dialogue and the target user triggering the execution of the target service based on the first feature vector and the second feature vector; and determine, based on the degree of matching, the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service, and output the target dialogue.

[0009] In a fourth aspect, embodiments of this specification provide a storage medium for storing computer-executable instructions. When executed, these instructions implement the following process: upon detecting that a target user has triggered the execution of a target service, based on acquired target information, a first feature vector corresponding to the target information is determined. The target information includes information required by the target user to trigger the execution of the target service, and / or interaction information of the target user in response to triggering the execution of the target service. Based on the frequency information, matching information, and first information of candidate dialogues to be output, a second feature vector corresponding to each candidate dialogue is determined. The frequency information is determined based on report information within a preset detection period and report information corresponding to the candidate dialogue within the report information. The matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type. The first information is determined based on intent type obtained by performing intent recognition processing on the target information. Based on the first feature vector and the second feature vector, the matching degree between each candidate dialogue and the target user triggering the execution of the target service is determined. Based on the matching degree, a target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service is determined, and the target dialogue is output.

Attached Figure Description

[0010] To more clearly illustrate the technical solutions in the embodiments or prior art of this specification, the drawings used in the description of the embodiments or prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in this specification. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0011] Figure 1A is a flowchart of an embodiment of a data processing method described in this specification;

[0012] Figure 1B is a schematic diagram of the processing procedure of an embodiment of a data processing method described in this specification;

[0013] Figure 2 is a schematic diagram of the acquisition of target information described in this specification;

[0014] Figure 3 is a schematic diagram of the processing procedure of another embodiment of a data processing method described in this specification;

[0015] Figure 4 is a schematic diagram of the determination of the matching degree described in this specification;

[0016] Figure 5 is a schematic diagram of the structure of an embodiment of a data processing device described in this specification;

[0017] Figure 6 is a schematic diagram of the structure of a data processing device described in this specification.

Detailed Implementation

[0018] This specification provides a data processing method, apparatus, and device through its embodiments.

[0019] To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this specification, and not all embodiments. Based on the embodiments in this specification, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of this specification.

[0020] Example 1

[0021] As shown in Figure 1A and Figure 1B, an embodiment of this specification provides a data processing method. The execution subject of this method can be a terminal device or a server. The terminal device can be a device such as a personal computer, or a mobile terminal device such as a mobile phone or a tablet computer. The server can be an independent server or a server cluster composed of multiple servers.

[0022] This method may specifically include the following steps:

[0023] In S102, when it is detected that the target user has triggered the execution of the target service, a first feature vector corresponding to the target information is determined based on the acquired target information.

[0024] The target service can be any service involving user privacy, property security, etc. For example, the target service can be a resource transfer service, a privacy information update service (such as changing a login password or adding user information), etc. The target information can include the information required for the target user to trigger the execution of the target service, and / or the interaction information of the target user in response to triggering the execution of the target service. For example, assuming the target service is a resource transfer service, the target information can include the authentication information required for the target user to trigger the execution of the resource transfer service, and / or the interaction information of the target user in response to triggering the execution of the resource transfer service. The interaction information can specifically include the target user's feedback in response to questions such as "Did you meet the resource transfer target online?"

[0025] In practice, with the rapid development of the internet industry, network risks have also increased. In risk control scenarios, application service providers can interact with users through customer service personnel before providing services. This allows them to determine whether there are risks associated with the current business (such as transfers, top-ups, and withdrawals) based on user feedback. To reduce the cost of manual intervention, risk control can be implemented through human-computer interaction. For example, a computer can be trained using historical dialogue scripts to determine a model. Then, based on the trained dialogue script model, it can determine the corresponding script for the current scenario and interact with the user using this determined script to control the risks of the current business.

[0026] However, as fraudulent methods employed by the black market evolve, the volume of risk control data becomes large and the update speed becomes rapid. Consequently, the data processing pressure on model updates becomes significant, making it impossible to update the dialogue confirmation model in a timely manner. This can lead to the inability to determine dialogue with a high degree of matching to the current scenario through the dialogue confirmation model, thus compromising the effectiveness of risk control. Therefore, a solution is needed that can promptly and accurately determine dialogue matching the current scenario for risk control. To this end, this specification provides a technical solution that can solve the above problems, as detailed below.

[0027] Taking a resource transfer service within a resource management application installed on an electronic device (i.e., a terminal device or server) as an example, the target user can trigger the launch of the resource management application, which in turn triggers the execution of the resource transfer service. The electronic device can obtain the information required for the target user to trigger the execution of the resource transfer service (such as the target user's authentication information) and use this information as the target information.

[0028] In addition, when the electronic device detects that the target user has triggered the execution of the target service, it can also output a preset prompt message and receive feedback information input by the target user in response to the preset prompt message. The electronic device can determine the preset prompt message and the feedback information input by the target user in response to the preset prompt message as the target information.

[0029] For example, as shown in Figure 2, when the electronic device detects that a target user has triggered a resource transfer service, it can display a prompt page with preset prompt information (i.e., prompt information Q1 and prompt information Q2), and can receive feedback information entered by the target user on the prompt page in response to the preset prompt information. The electronic device can identify prompt information Q1, prompt information Q2, feedback information A1, and feedback information A2 as target information.

[0030] The electronic device can determine a first feature vector corresponding to the acquired target information based on the acquired target information. There can be multiple methods for determining the first feature vector. For example, feature extraction can be performed on the target information using a pre-trained feature extraction model to obtain the first feature vector corresponding to the target information. The feature extraction model can be obtained by training a model constructed with a machine learning algorithm on historical information. Other methods for determining the first feature vector are also possible, and different methods can be selected according to the actual application scenario. The embodiments of this specification do not specifically limit this.

[0031] In S104, based on the frequency information, matching information, and first information of the candidate dialogue to be output, the second feature vector corresponding to each candidate dialogue is determined.

[0032] The candidate dialogue can be used to obtain feedback information from the target user regarding the target service during the interaction process. The feedback information can be any text information, voice information, etc. The frequency information can be determined based on the report information within the preset detection period and the report information corresponding to the candidate dialogue in the report information. The preset detection period can be the past 3 days, the past week, the past month, etc. The report information can be the information provided by a user when reporting a preset service as risky. For example, the report information can include the service information of the preset service entered by the user, the trigger information (such as the trigger time), and other related information. The matching information can be used to characterize the degree of matching between the candidate dialogue and the target user and the preset risk type. There can be multiple preset risk types, and different risk types can be set according to the actual application scenario. For example, risk types can include fraudulent transaction types, loan types, game types, etc. The embodiments of this specification do not specifically limit the risk types. The first information can be determined based on the intent type obtained by performing intent recognition processing on the target information.

[0033] In implementation, the frequency information of each candidate dialogue can be determined based on the report information acquired within a preset detection period and the report information corresponding to each candidate dialogue within that report information. The frequency information can be used to characterize the frequency of occurrence of each candidate dialogue in the report information within the preset detection period; that is, the higher the frequency of occurrence, the higher the output probability of the candidate dialogue. For example, assume there are 5 reports within the preset detection period, of which 3 reports correspond to candidate dialogue 1, 2 reports correspond to candidate dialogue 2, and 1 report corresponds to candidate dialogue 3 (a single report can correspond to more than one candidate dialogue, so these counts can sum to more than 5). Based on these reports, the frequency information of candidate dialogue 1, candidate dialogue 2, and candidate dialogue 3 can be determined respectively. The frequency of occurrence of candidate dialogue 1 is higher than that of candidate dialogue 2 and candidate dialogue 3; therefore, the output probability of candidate dialogue 1 is higher than that of candidate dialogue 2 and candidate dialogue 3.

[0034] Furthermore, a piece of report information can be associated with one or more candidate dialogues. For example, keyword extraction can be performed on the report information, and the corresponding candidate dialogues can be determined based on the extracted keywords. For instance, if the report information is "I met a malicious third party online, and he told me that he would give me a commission within two days after the transaction is completed," the keywords obtained from the keyword extraction can include "met online" and "commission." The keywords can then be matched against the candidate dialogues, and the candidate dialogues corresponding to the report information can be determined based on the matching results. In this instance, the candidate dialogues corresponding to the report information can include candidate dialogue 1 and candidate dialogue 2, where candidate dialogue 1 can be "whether you met the other party online" and candidate dialogue 2 can be "whether the other party promised you a rebate or commission."
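The association between a report and candidate dialogues described above can be sketched as follows. This is only an illustrative sketch, not the method of the embodiments: the keyword table `CANDIDATE_KEYWORDS`, the function `match_candidates`, the third candidate entry, and the rule that a keyword group matches when all of its words appear in the report are assumptions introduced here. A real keyword extraction step could be considerably more sophisticated.

```python
import re

# Hypothetical keyword table: each candidate dialogue is linked to one or more
# keyword groups. A group matches when every word in it occurs in the report.
CANDIDATE_KEYWORDS = {
    "candidate dialogue 1": [{"met", "online"}],         # "whether you met the other party online"
    "candidate dialogue 2": [{"rebate"}, {"commission"}],  # "whether ... promised you a rebate or commission"
    "candidate dialogue 3": [{"password"}],              # invented entry for illustration only
}


def match_candidates(report, table=CANDIDATE_KEYWORDS):
    """Return the candidate dialogues whose keyword groups occur in the report."""
    # Lower-case the report and split it into a set of word tokens.
    tokens = set(re.findall(r"[a-z']+", report.lower()))
    return [
        name
        for name, groups in table.items()
        if any(group <= tokens for group in groups)  # subset test per keyword group
    ]


report = (
    "I met a malicious third party online, and he told me that he would give me "
    "a commission within two days after the transaction is completed"
)
print(match_candidates(report))
```

With the sample report from the paragraph above, the sketch associates the report with candidate dialogue 1 and candidate dialogue 2, so one report contributes to the count of both.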

[0035] The electronic device can determine the degree of matching between each candidate dialogue and the target user and the preset risk type based on the acquired target information (i.e., the information required by the target user to trigger the execution of the target service, and / or the interaction information of the target user in response to triggering the execution of the target service), the content of the candidate dialogues, etc., thus obtaining the matching information for each candidate dialogue. For example, assuming there are 8 preset risk types, the electronic device can use a pre-trained matching degree determination model to determine, from the content of each candidate dialogue, the acquired target information, and each risk type, the degree of matching between each candidate dialogue and the target user and each risk type, and then determine the matching information of each candidate dialogue from these matching degrees. The matching degree determination model can be obtained by training a model constructed with a machine learning algorithm on historical candidate dialogues, historical user target information, and the preset risk types.

[0036] Electronic devices can determine the first information of each candidate dialogue based on the intent type obtained by performing intent recognition processing on the target information. The first information can be used to characterize the degree of matching between each candidate dialogue and the intent type of the target information. That is, the higher the degree of matching between the candidate dialogue and the intent type of the target information, the higher the output probability of the candidate dialogue.

[0037] In practical application scenarios, there can be a variety of methods for determining the frequency information, matching information, and first information of the aforementioned candidate dialogues, and different methods can be selected according to the actual application scenario. The embodiments of this specification do not specifically limit this.

[0038] The electronic device can determine a second feature vector corresponding to each candidate dialogue based on the frequency information, matching information, and first information of the candidate dialogue. There are multiple methods for determining the second feature vector. For example, the frequency information, matching information, and first information of the candidate dialogue can be processed by a pre-trained feature extraction model to obtain the second feature vector corresponding to the candidate dialogue. The feature extraction model can be obtained by training a model constructed with a machine learning algorithm on the frequency information, matching information, and first information of historical candidate dialogues. Other methods for determining the second feature vector are also possible, and different methods can be selected according to the application scenario. The embodiments of this specification do not specifically limit this.

[0039] In S106, based on the first feature vector and the second feature vector, the matching degree between each candidate dialogue and the target user triggering the execution of the target service is determined.

[0040] In implementation, the matching degree between each candidate dialogue and the target user triggering the execution of the target service can be determined based on the similarity between the first feature vector and the second feature vector. There are multiple methods for determining the similarity between the first feature vector and the second feature vector, and different methods can be selected according to the actual application scenario. The embodiments of this specification do not specifically limit this.
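As one possible illustration of the similarity computation, the sketch below uses cosine similarity. The specification does not name a particular similarity measure, so cosine similarity is an assumption made here, as are the function names `cosine_similarity` and `matching_degrees` and the requirement that both vectors have the same dimension.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_u * norm_v)


def matching_degrees(first_vector, second_vectors):
    """Map each candidate dialogue to its matching degree with the first feature vector."""
    return {
        name: cosine_similarity(first_vector, vector)
        for name, vector in second_vectors.items()
    }
```

Other measures, such as a dot product or a learned scoring model, would fit the same interface; the choice depends on how the two feature vectors are produced.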

[0041] In S108, based on the matching degree, the target dialogue that matches the target user triggering the execution of the target service is determined from the candidate dialogues, and the target dialogue is output.

[0042] In implementation, one or more candidate dialogues with a matching degree greater than a preset matching degree threshold can be identified as target dialogues. Alternatively, the candidate dialogue with the highest matching degree among the candidate dialogues can be identified as the target dialogue. After the target dialogue is identified, it can be output so that risk control is performed based on it. For example, the target dialogue can be used to prompt the target user that there may be risks in triggering the execution of the target service. Alternatively, the target user's feedback on the target dialogue can be used to determine whether there are risks in the target user triggering the execution of the target service.
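The two selection rules described above (threshold-based and highest-score) can be sketched as follows. The function names `select_by_threshold` and `select_best`, the example scores, and the threshold value are hypothetical and chosen only for illustration.

```python
def select_by_threshold(degrees, threshold):
    """Return all candidate dialogues whose matching degree exceeds the threshold,
    ordered from highest to lowest matching degree."""
    ranked = sorted(degrees.items(), key=lambda item: item[1], reverse=True)
    return [name for name, degree in ranked if degree > threshold]


def select_best(degrees):
    """Return the single candidate dialogue with the highest matching degree,
    or None when there are no candidates."""
    if not degrees:
        return None
    return max(degrees, key=degrees.get)


# Hypothetical matching degrees for three candidate dialogues.
degrees = {
    "candidate dialogue 1": 0.9,
    "candidate dialogue 2": 0.4,
    "candidate dialogue 3": 0.75,
}
print(select_by_threshold(degrees, 0.7))
print(select_best(degrees))
```

The threshold rule can return several target dialogues or none at all, whereas the highest-score rule always returns exactly one when candidates exist.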

[0043] This specification provides a data processing method. When it is detected that a target user has triggered the execution of a target service, a first feature vector corresponding to the target information is determined based on the acquired target information. The target information may include information required for the target user to trigger the execution of the target service, and / or the target user's interaction information in response to triggering the execution of the target service. Based on the frequency information, matching information, and first information of the candidate dialogues to be output, a second feature vector corresponding to each candidate dialogue is determined. The frequency information is determined based on the report information within a preset detection period and the report information corresponding to the candidate dialogue. The matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type. The first information is determined based on the intent type obtained by performing intent recognition processing on the target information. Based on the first and second feature vectors, the matching degree between each candidate dialogue and the target user triggering the execution of the target service is determined. Based on the matching degree, the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service is determined, and the target dialogue is output. In this way, the target dialogue that matches the target user triggering the execution of the target service can be determined using the second feature vector, which is determined from the frequency information, matching information, and first information of the candidate dialogue, together with the first feature vector, which is determined from the target information. This avoids the situation in which, when the volume of risk control data is large and the data is updated rapidly, the model cannot be updated in time and the dialogue used for risk control therefore cannot be determined promptly and accurately. That is, the target dialogue that matches the target user triggering the execution of the target service can be determined promptly and accurately from the frequency information, matching information, and first information of the candidate dialogues, so that risk control can be carried out promptly and accurately in risk control scenarios using the determined target dialogue.

[0044] Example 2

[0045] As shown in Figure 3, an embodiment of this specification provides a data processing method. The execution subject of this method can be a terminal device or a server. The terminal device can be a personal computer or a mobile terminal device such as a mobile phone or tablet computer. The server can be a standalone server or a server cluster composed of multiple servers. Specifically, the method may include the following steps:

[0046] In S102, when it is detected that the target user has triggered the execution of the target service, a first feature vector corresponding to the target information is determined based on the acquired target information.

[0047] The target information may include the information required by the target user to trigger the execution of the target service, and / or the interaction information of the target user in triggering the execution of the target service.

[0048] In practice, a pre-trained vector determination model can be used to perform feature extraction on the target information to obtain the first feature vector corresponding to the target information. For example, a pre-trained BERT model can be used to perform feature extraction on the target information to obtain the first feature vector corresponding to the target information.

[0049] In S302, a first quantity of report information within the preset detection period is obtained.

[0050] In S304, a second quantity is determined, namely the number of pieces of report information within that period that correspond to each candidate dialogue.

[0051] In practice, the method for determining the reporting information corresponding to the candidate script in the reporting information can be found in the relevant content of S104 of the above embodiment, and will not be repeated here.

[0052] In S306, frequency information for each candidate dialogue is determined based on the first quantity and the second quantity.

[0053] In implementation, assuming there are 5 reports within the preset detection period, of which 3 reports correspond to candidate dialogue 1, 2 reports correspond to candidate dialogue 2, and 1 report corresponds to candidate dialogue 3, then the frequency information of candidate dialogue 1 can be 3 / 5, the frequency information of candidate dialogue 2 can be 2 / 5, and the frequency information of candidate dialogue 3 can be 1 / 5.
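The ratio described in S302 to S306 can be illustrated with a minimal Python sketch. The function name `frequency_information` and the dialogue labels are hypothetical and do not appear in the specification. The counts are the ones given in the example above; since they sum to more than 5, the sketch assumes that a single report may correspond to more than one candidate dialogue.

```python
from fractions import Fraction


def frequency_information(first_quantity, second_quantities):
    """Return second_quantity / first_quantity for each candidate dialogue.

    first_quantity: number of reports within the preset detection period.
    second_quantities: mapping from candidate dialogue to the number of
        those reports that correspond to it.
    """
    if first_quantity <= 0:
        raise ValueError("first_quantity must be positive")
    return {
        dialogue: Fraction(count, first_quantity)
        for dialogue, count in second_quantities.items()
    }


# Counts taken from the worked example: 5 reports in the period.
freq = frequency_information(
    5, {"dialogue_1": 3, "dialogue_2": 2, "dialogue_3": 1}
)
for dialogue, value in freq.items():
    print(dialogue, value)
```

Exact fractions are used here only so that the values can be compared directly with the example; a floating-point ratio would serve equally well as a model input.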

[0054] In S308, based on target information and a pre-trained probability determination model, the first probability corresponding to each preset risk type for the target user is determined.

[0055] The probability determination model can be obtained by training a model constructed by a preset machine learning algorithm based on historical information.

[0056] In practice, for example, assuming there are 3 preset risk types, the target information can be input into a pre-trained probability determination model to obtain the first probability of the target user corresponding to each preset risk type.

[0057] In S310, based on the report information corresponding to each candidate dialogue, the second probability of the candidate dialogue for each preset risk type is determined.

[0058] In implementation and practical applications, the processing method of S310 can vary. The following is one optional implementation method, which can be found in steps one to two below:

[0059] Step one: From the report information that corresponds to the candidate dialogue, obtain the third quantity of report information corresponding to each preset risk type.

[0060] In implementation, keyword extraction can be performed on the report information corresponding to each candidate dialogue, and the third quantity of report information corresponding to each preset risk type can be determined from the extracted keywords. For example, suppose there are 5 pieces of report information within the preset detection period, 3 of which (report 1, report 2, and report 3) correspond to candidate dialogue 1. Assume the keyword extracted from report 1 and report 2 is "commission" and the keyword extracted from report 3 is "game". If the preset risk types include fraudulent transactions and games, then report 1 and report 2 correspond to fraudulent transactions (the keyword "commission" maps to fraudulent transactions) and report 3 corresponds to games (the keyword "game" maps to games). That is, the third quantity for fraudulent transactions is 2, and the third quantity for games is 1.

[0061] Step two: Based on the second and third quantities, determine the second probability of the candidate dialogue for each preset risk type.

[0062] In implementation, the ratio of the third quantity to the second quantity can be determined as the second probability of the candidate dialogue for each preset risk type. For example, if there are 5 pieces of report information within the preset detection period and 3 of them correspond to candidate dialogue 1, the second quantity is 3. Continuing the example above, in which 2 of those 3 reports correspond to fraudulent transactions and 1 corresponds to games, the second probability of candidate dialogue 1 for the fraudulent transaction type can be 2 / 3, and its second probability for the game type can be 1 / 3.
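Steps one and two can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the specification: the keyword-to-risk-type table `KEYWORD_TO_RISK` and the function name `second_probabilities` are hypothetical, and each report is assumed to have already been reduced to a single extracted keyword.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical lookup table, built from the keywords used in the example.
KEYWORD_TO_RISK = {
    "commission": "fraudulent_transaction",
    "game": "game",
}


def second_probabilities(report_keywords, keyword_to_risk=KEYWORD_TO_RISK):
    """Return third_quantity / second_quantity for each preset risk type.

    report_keywords: one extracted keyword per report that corresponds to
        the candidate dialogue, so its length is the second quantity.
    """
    second_quantity = len(report_keywords)
    if second_quantity == 0:
        return {}
    # Third quantity: number of reports that map to each risk type.
    third_quantities = Counter(
        keyword_to_risk[keyword]
        for keyword in report_keywords
        if keyword in keyword_to_risk
    )
    return {
        risk: Fraction(count, second_quantity)
        for risk, count in third_quantities.items()
    }


# Reports 1 to 3 of candidate dialogue 1 from the example.
probs = second_probabilities(["commission", "commission", "game"])
print(probs)
```

Keywords that are absent from the table are simply ignored in this sketch; how unmatched reports should be treated is not stated in the specification.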

[0063] In S312, matching information for each candidate dialogue is determined based on the first probability and the second probability.

[0064] In implementation, the product of the first probability and the second probability can be used to determine the matching information for each candidate dialogue. That is, the matching information for each candidate dialogue can include the probability value corresponding to each candidate dialogue with each preset risk type and target user. For example, assuming there are two risk types, and the first probability includes probability 1 corresponding to the target user and risk type 1, and probability 2 corresponding to the target user and risk type 2, and the second probability includes probability 3 corresponding to candidate dialogue 1 and risk type 1, and probability 4 corresponding to candidate dialogue 1 and risk type 2, then the matching information for the candidate dialogue can include probability 5 and probability 6. Probability 5 can be the product of probability 1 and probability 3, and probability 6 can be the product of probability 2 and probability 4.

[0065] Alternatively, the sum of the products of the first probability and the second probability can be used as the matching information for each candidate dialogue. In other words, the sum of probability 5 and probability 6 can be used as the matching information for candidate dialogue 1.

[0066] The method for determining the matching information of the candidate dialogue is an optional and feasible method. In actual application scenarios, there can be a variety of different methods. Different methods can be selected according to different actual application scenarios. This specification does not specifically limit this method in the embodiments.
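Both variants of S312, the per-risk-type products and their sum, can be sketched in a few lines of Python. The function name `matching_information` and all probability values below are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def matching_information(first_prob, second_prob):
    """Combine user-side and dialogue-side probabilities per risk type.

    first_prob: probability of the target user for each preset risk type.
    second_prob: probability of one candidate dialogue for each risk type.
    Returns the per-risk-type products and their sum.
    """
    per_risk = {
        risk: first_prob[risk] * second_prob.get(risk, 0.0)
        for risk in first_prob
    }
    return per_risk, sum(per_risk.values())


# Hypothetical values standing in for probabilities 1 to 4.
first_prob = {"risk_type_1": 0.5, "risk_type_2": 0.25}   # probabilities 1, 2
second_prob = {"risk_type_1": 0.5, "risk_type_2": 0.5}   # probabilities 3, 4

per_risk, total = matching_information(first_prob, second_prob)
print(per_risk)  # probabilities 5 and 6
print(total)     # the summed variant
```

The per-risk-type form keeps one value per risk type as model input, while the summed form collapses them to a single scalar; the specification leaves the choice open.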

[0067] In S314, based on a pre-trained intent recognition model, intent recognition processing is performed on the target information to obtain the intent type corresponding to the target information, and the degree of matching between each candidate dialogue and the intent type corresponding to the target information is determined.

[0068] The intent recognition model can be obtained by training a model built by machine learning algorithms based on historical information.

[0069] In practice, after obtaining the intent type of the target information, keywords can be extracted from each candidate dialogue, and the degree of matching between each candidate dialogue and the intent type of the target information can be determined based on the extracted keywords and that intent type.

[0070] For example, assuming the intent type of the target information is rebate, and the keyword extracted from the candidate dialogue is "commission", then the similarity between "commission" and rebate can be determined based on a pre-trained keyword matching model, and this similarity can be taken as the degree of matching between the candidate dialogue and the intent type corresponding to the target information.

[0071] In addition, after obtaining the intent type corresponding to the target information, the degree of matching between each candidate dialogue and that intent type can also be received as input from designated staff.

[0072] The method for determining the degree of matching between the candidate dialogue and the intent type corresponding to the target information is an optional and feasible method. In actual application scenarios, there can be a variety of different methods. Different methods can be selected according to different actual application scenarios. This specification does not specifically limit this method in the embodiments.

[0073] In S316, the degree of matching between each candidate dialogue and the intent type corresponding to the target information is determined as the first information of each candidate dialogue.

[0074] In implementation, the first information can be used to characterize the degree of matching between each candidate dialogue and the intent type of the target information. For example, if the degree of matching between the candidate dialogue and the intent type of the target information is higher than a preset threshold, the first information of the candidate dialogue can be determined as "recommended". If the degree of matching is not higher than the preset threshold, the first information of the candidate dialogue can be determined as "not recommended".
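The threshold rule for the first information can be written as a one-line comparison. In the sketch below the function name `first_information` and the default threshold of 0.5 are assumptions; the specification only states that a preset threshold exists.

```python
def first_information(matching_degree, threshold=0.5):
    """Map an intent matching degree to a recommendation label.

    The label is "recommended" only when the degree is strictly higher
    than the threshold, following the wording of the example.
    """
    if matching_degree > threshold:
        return "recommended"
    return "not recommended"


print(first_information(0.8))
print(first_information(0.5))
```

A degree exactly equal to the threshold falls on the "not recommended" side, because the text requires the degree to be higher than the threshold.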

[0075] In S318, based on the frequency information, matching information, and first information of the candidate dialogues to be output, and on a pre-trained second vector extraction model, the first sub-feature vector corresponding to each candidate dialogue is determined.

[0076] In implementation, taking a multilayer perceptron (MLP) as an example of the second vector extraction model, the frequency information, matching information, and first information of a candidate dialogue can be input into the pre-trained MLP to obtain the first sub-feature vector corresponding to that candidate dialogue. An MLP is a feedforward artificial neural network that maps a set of input vectors to a set of output vectors.

[0077] In S320, feature extraction processing is performed on the content of each candidate dialogue to determine the second sub-feature vector corresponding to each candidate dialogue.

[0078] In S322, the second feature vector corresponding to each candidate dialogue is determined based on the first sub-feature vector and the second sub-feature vector.

[0079] In implementation, the first and second sub-feature vectors can be concatenated to obtain the second feature vector corresponding to each candidate dialogue. In this way, the second feature vector combines the first sub-feature vector, determined from the frequency information, matching information, and first information of the candidate dialogue, with the second sub-feature vector, extracted from its content. The resulting second feature vector therefore retains the content of the candidate dialogue itself and also takes into account its external knowledge (i.e., additional information other than the content of the candidate dialogue itself). When a new candidate dialogue is added, its output can be adjusted by updating its external knowledge, without retraining the model or performing other such operations.
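Steps S318 to S322 can be sketched in plain Python as a small feedforward network followed by concatenation. Everything specific here is an assumption made for illustration: the weights in `HYPOTHETICAL_LAYERS` are hand-picked rather than trained, the first information is encoded as 1.0 for "recommended", and the content vector stands in for the output of whatever feature extractor is applied to the dialogue content.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def linear(values, weights, bias):
    """One fully connected layer: weights @ values + bias."""
    return [
        sum(w * v for w, v in zip(row, values)) + b
        for row, b in zip(weights, bias)
    ]


def mlp(values, layers):
    """Feedforward pass with ReLU between layers, none after the last."""
    for index, (weights, bias) in enumerate(layers):
        values = linear(values, weights, bias)
        if index < len(layers) - 1:
            values = relu(values)
    return values


# Hypothetical, hand-picked weights (a real model would be pre-trained).
HYPOTHETICAL_LAYERS = [
    ([[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]], [0.0, 0.0]),  # 3 inputs -> 2 hidden
    ([[1.0, 1.0]], [0.5]),                              # 2 hidden -> 1 output
]


def second_feature_vector(frequency, matching, first_info, content_vector,
                          layers=HYPOTHETICAL_LAYERS):
    """Concatenate the external-knowledge vector with the content vector."""
    first_sub = mlp([frequency, matching, first_info], layers)  # S318
    second_sub = list(content_vector)                           # S320 output
    return first_sub + second_sub                               # S322


vector = second_feature_vector(0.5, 0.375, 1.0, [0.25, 0.75])
print(vector)
```

Because the external knowledge enters only through the three scalar inputs, changing the frequency or matching information of a dialogue changes its second feature vector without touching the weights, which is the property the paragraph above relies on.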

[0080] In S324, based on a pre-trained similarity determination model, the first feature vector, and the second feature vector, the similarity between the first feature vector and the second feature vector is determined, and based on the similarity, the matching degree between each candidate dialogue and the target user triggering the execution of the target service is determined.

[0081] The similarity determination model can be obtained by training a model constructed by a machine learning algorithm based on the first historical feature vector and the second historical feature vector.

[0082] In S108, based on the matching degree, the target script that matches the target user's trigger execution of the target business is determined from the candidate scripts, and the target script is output.
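The similarity and selection steps (S324 and S108) can be illustrated as follows. Cosine similarity is used here purely as a stand-in for the pre-trained similarity determination model, which the specification does not detail; the sketch also assumes that the first and second feature vectors have the same dimension, and the names `cosine_similarity` and `select_target_dialogue` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_target_dialogue(first_vector, second_vectors):
    """Return the best-matching candidate and all matching degrees.

    first_vector: feature vector of the target information.
    second_vectors: mapping from candidate dialogue to its second
        feature vector.
    """
    degrees = {
        dialogue: cosine_similarity(first_vector, vector)
        for dialogue, vector in second_vectors.items()
    }
    target = max(degrees, key=degrees.get)
    return target, degrees


# Hypothetical two-dimensional vectors.
target, degrees = select_target_dialogue(
    [1.0, 0.0],
    {"dialogue_1": [1.0, 0.0], "dialogue_2": [0.0, 1.0]},
)
print(target, degrees)
```

Selecting the single highest matching degree is one possible reading of S108; a deployment could equally apply a minimum threshold or return several top-ranked dialogues.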

[0083] In S326, feedback information from the target user regarding the target dialogue is obtained.

[0084] In S328, the risk score corresponding to the feedback information is determined based on the target dialogue, feedback information, and a pre-trained risk score determination model.

[0085] The risk score determination model can be obtained by training a model constructed by a preset machine learning algorithm based on historical dialogue and historical feedback information.

[0086] In S330, based on the risk scores corresponding to the target dialogue and feedback information, it is determined whether there is a risk in triggering the execution of the target business by the target user.

[0087] During implementation, if the risk score corresponding to the target dialogue and the feedback information indicates that there is a risk in the target user triggering the execution of the target service, a preset prompt message can be output to alert the target user to this risk. Furthermore, if the risk score exceeds a preset risk score, the execution of the target service can be stopped to reduce the probability of risks such as privacy leaks or user financial losses.
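The two-level response to a risk score can be sketched as a pair of threshold checks. Both thresholds and the action labels below are hypothetical: the specification names a preset risk score for stopping the service but does not state how a score "indicates a risk", so a lower prompt threshold is assumed here.

```python
def risk_actions(risk_score, prompt_threshold=0.5, stop_threshold=0.8):
    """Return the actions to take for a given risk score.

    A score above prompt_threshold triggers a prompt to the user; a score
    above stop_threshold additionally stops the target service.
    """
    actions = []
    if risk_score > prompt_threshold:
        actions.append("prompt_user")
    if risk_score > stop_threshold:
        actions.append("stop_service")
    return actions


print(risk_actions(0.9))
print(risk_actions(0.6))
print(risk_actions(0.1))
```

With the default thresholds, stopping the service always implies that the prompt is also shown, which follows the "furthermore" wording of the paragraph above.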

[0088] This specification provides a data processing method. When a target user triggers the execution of a target service, a first feature vector corresponding to the target information is determined based on the acquired target information. The target information may include information required for the target user to trigger the execution of the target service, and / or the target user's interaction information in response to triggering the execution of the target service. Based on the frequency information, matching information, and first information of the candidate dialogues to be output, a second feature vector corresponding to each candidate dialogue is determined. The frequency information is determined based on the report information within a preset detection period and the report information corresponding to the candidate dialogue. The matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type. The first information is determined based on the intent type obtained by performing intent recognition processing on the target information. Based on the first and second feature vectors, the matching degree between each candidate dialogue and the target user triggering the execution of the target service is determined. Based on the matching degree, the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service is determined, and the target dialogue is output. In this way, the target dialogue that matches the target user's trigger execution of the target business can be determined by using the second feature vector determined by the frequency information, matching information, and first information of the candidate dialogue, and the first feature vector determined by the target information. 
This avoids the situation in which, because the volume of risk control data is large and changes quickly, the model cannot be updated in time and the dialogue used for risk control therefore cannot be determined promptly and accurately. That is, the target dialogue that matches the target user triggering the execution of the target service can be determined promptly and accurately from the frequency information, matching information, and first information of the candidate dialogues, so that risk control in the risk control scenario can be carried out promptly and accurately through the determined target dialogue.

[0089] Example 3

[0090] The above describes the data processing method provided in the embodiments of this specification. Based on the same idea, the embodiments of this specification also provide a data processing device, as shown in Figure 5.

[0091] The data processing device includes: a first acquisition module 501, a first determination module 502, a second determination module 503, and a script determination module 504, wherein:

[0092] The first acquisition module 501 is used to determine a first feature vector corresponding to the target information based on the acquired target information when the target user triggers the execution of the target service. The target information includes information required by the target user to trigger the execution of the target service, and / or the interaction information of the target user in response to triggering the execution of the target service.

[0093] The first determining module 502 is used to determine a second feature vector corresponding to each candidate dialogue based on the frequency information, matching information and first information of the candidate dialogue to be output. The frequency information is determined based on the report information within a preset detection period and the report information corresponding to the candidate dialogue in the report information. The matching information is used to characterize the degree of matching between the candidate dialogue and the target user and the preset risk type. The first information is determined based on the intent type obtained by performing intent recognition processing on the target information.

[0094] The second determining module 503 is used to determine the matching degree between each candidate dialogue and the target user triggering the execution of the target service based on the first feature vector and the second feature vector;

[0095] The script determination module 504 is used to determine, based on the matching degree, the target script among the candidate scripts that matches the target user's trigger execution of the target service, and output the target script.

[0096] In the embodiments described in this specification, the device further includes:

[0097] The second acquisition module is used to acquire feedback information from the target user regarding the target dialogue;

[0098] The score determination module is used to determine the risk score corresponding to the feedback information based on the target dialogue, the feedback information, and the pre-trained risk score determination model. The risk score determination model is obtained by training a model constructed by a preset machine learning algorithm based on historical dialogue and historical feedback information.

[0099] The risk determination module is used to determine whether there is a risk in the target user triggering the execution of the target service based on the risk score corresponding to the target dialogue and the feedback information.

[0100] In the embodiments of this specification, the first determining module 502 is used for:

[0101] Based on the frequency information of the candidate dialogue to be output, the matching information, the first information, and the pre-trained second vector extraction model, a first sub-feature vector corresponding to each candidate dialogue is determined;

[0102] The candidate dialogue content is subjected to feature extraction processing to determine the second sub-feature vector corresponding to each candidate dialogue.

[0103] Based on the first sub-feature vector and the second sub-feature vector, a second feature vector corresponding to each candidate speech is determined.

[0104] In the embodiments described in this specification, the device further includes:

[0105] The quantity acquisition module is used to acquire the first quantity of reported information within the preset detection period;

[0106] The quantity determination module is used to determine a second quantity of the reported information corresponding to the candidate dialogue in the reported information;

[0107] The frequency determination module is used to determine the frequency information of each candidate speech based on the first quantity and the second quantity.

[0108] In the embodiments described in this specification, the device further includes:

[0109] The third determining module is used to determine the first probability of the target user corresponding to each of the preset risk types based on the target information and a pre-trained probability determining model. The probability determining model is obtained by training a model constructed by a preset machine learning algorithm based on historical information.

[0110] The fourth determining module is used to determine the second probability of the candidate dialogue and each preset risk type based on the reporting information corresponding to the candidate dialogue in the reporting information;

[0111] The information determination module is used to determine the matching information for each candidate speech based on the first probability and the second probability.

[0112] In the embodiments of this specification, the fourth determining module is used for:

[0113] Obtain the third number of reports corresponding to each preset risk type from the reports that correspond to the candidate dialogue in the reported information;

[0114] Based on the second quantity and the third quantity, a second probability is determined for each candidate speech and each preset risk type.

[0115] In the embodiments described in this specification, the device further includes:

[0116] The type determination module is used to perform intent recognition processing on the target information based on a pre-trained intent recognition model, obtain the intent type corresponding to the target information, and determine the degree of matching between each candidate speech and the intent type corresponding to the target information.

[0117] The fifth determining module is used to determine the degree of matching between each candidate dialogue and the intent type corresponding to the target information as the first information of each candidate dialogue.

[0118] In the embodiments of this specification, the second determining module 503 is used for:

[0119] Based on a pre-trained similarity determination model, the first feature vector, and the second feature vector, the similarity between the first feature vector and the second feature vector is determined, and based on the similarity, the matching degree between each candidate dialogue and the target user triggering the execution of the target service is determined.

[0120] This specification provides a data processing device that, upon detecting that a target user has triggered the execution of a target service, determines a first feature vector corresponding to the acquired target information. The target information may include information required for the target user to trigger the execution of the target service, and / or the target user's interaction information in response to triggering the execution of the target service. Based on the frequency information, matching information, and the first information of the candidate dialogues to be output, a second feature vector is determined for each candidate dialogue. The frequency information is determined based on the report information within a preset detection period and the report information corresponding to the candidate dialogue. The matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type. The first information is determined based on the intent type obtained by performing intent recognition processing on the target information. Based on the first and second feature vectors, the matching degree between each candidate dialogue and the target user's triggering of the target service is determined. Based on the matching degree, the target dialogue among the candidate dialogues that matches the target user's triggering of the target service is determined, and the target dialogue is output. In this way, the target dialogue that matches the target user's trigger execution of the target business can be determined by using the second feature vector determined by the frequency information, matching information, and first information of the candidate dialogue, and the first feature vector determined by the target information. 
This avoids the situation in which, because the volume of risk control data is large and changes quickly, the model cannot be updated in time and the dialogue used for risk control therefore cannot be determined promptly and accurately. That is, the target dialogue that matches the target user triggering the execution of the target service can be determined promptly and accurately from the frequency information, matching information, and first information of the candidate dialogues, so that risk control in the risk control scenario can be carried out promptly and accurately through the determined target dialogue.

[0121] Example 4

[0122] Following the same line of thought, embodiments of this specification also provide a data processing device, as shown in Figure 6.

[0123] Data processing devices can vary considerably due to differences in configuration or performance. They may include one or more processors 601 and memory 602, with memory 602 storing one or more application programs or data. Memory 602 can be temporary or persistent storage. The application programs stored in memory 602 may include one or more modules (not shown), each module including a series of computer-executable instructions for the data processing device. Furthermore, processor 601 may be configured to communicate with memory 602 and execute the series of computer-executable instructions stored in memory 602 on the data processing device. The data processing device may also include one or more power supplies 603, one or more wired or wireless network interfaces 604, one or more input / output interfaces 605, and one or more keyboards 606.

[0124] Specifically, in this embodiment, the data processing device includes a memory and one or more programs, wherein one or more programs are stored in the memory, and one or more programs may include one or more modules, and each module may include a series of computer-executable instructions for the data processing device, and is configured to be executed by one or more processors. The one or more programs include computer-executable instructions for performing the following:

[0125] Upon detecting that a target user has triggered the execution of a target service, a first feature vector corresponding to the acquired target information is determined based on the acquired target information. The target information includes information required by the target user to trigger the execution of the target service, and / or the target user's interaction information in response to triggering the execution of the target service.

[0126] Based on the frequency information, matching information, and first information of the candidate dialogue to be output, a second feature vector is determined for each candidate dialogue. The frequency information is determined based on the report information within a preset detection period and the report information corresponding to the candidate dialogue in the report information. The matching information is used to characterize the degree of matching between the candidate dialogue and the target user and the preset risk type. The first information is determined based on the intent type obtained by performing intent recognition processing on the target information.

[0127] Based on the first feature vector and the second feature vector, the matching degree between each candidate dialogue and the target user triggering the execution of the target service is determined;

[0128] Based on the matching degree, the target script that matches the target user in triggering the execution of the target service is determined from the candidate scripts, and the target script is output.

[0129] Optionally, the method further includes:

[0130] Obtain feedback information from the target user regarding the target dialogue;

[0131] Based on the target dialogue, the feedback information, and the pre-trained risk score determination model, the risk score corresponding to the feedback information is determined. The risk score determination model is obtained by training a model constructed by a preset machine learning algorithm based on historical dialogue and historical feedback information.

[0132] Based on the risk scores corresponding to the target dialogue and the feedback information, it is determined whether there is a risk in the target user triggering the execution of the target service.

[0133] Optionally, determining the second feature vector corresponding to each candidate dialogue based on the frequency information, matching information, and first information of the candidate dialogues to be output includes:

[0134] Based on the frequency information of the candidate dialogue to be output, the matching information, the first information, and the pre-trained second vector extraction model, a first sub-feature vector corresponding to each candidate dialogue is determined;

[0135] The candidate dialogue content is subjected to feature extraction processing to determine the second sub-feature vector corresponding to each candidate dialogue.

[0136] Based on the first sub-feature vector and the second sub-feature vector, a second feature vector corresponding to each candidate speech is determined.

[0137] Optionally, before determining the second feature vector corresponding to each candidate speech based on the frequency information, matching information, and first information of the candidate speech to be output, the method further includes:

[0138] Obtain the first number of reported information within the preset detection period;

[0139] Determine a second number of the reported information that corresponds to the candidate dialogue;

[0140] Based on the first quantity and the second quantity, the frequency information of each candidate speech is determined.

[0141] Optionally, before determining the second feature vector corresponding to each candidate speech based on the frequency information, matching information, and first information of the candidate speech to be output, the method further includes:

[0142] Based on the target information and the pre-trained probability determination model, a first probability is determined for the target user corresponding to each of the preset risk types. The probability determination model is obtained by training a model constructed by a preset machine learning algorithm based on historical information.

[0143] Based on the reporting information corresponding to the candidate dialogue in the reporting information, determine the second probability of the candidate dialogue corresponding to each of the preset risk types;

[0144] Based on the first probability and the second probability, matching information for each candidate phrase is determined.

[0145] Optionally, determining the second probability of the candidate dialogue corresponding to each preset risk type based on the report information corresponding to the candidate dialogue in the report information includes:

[0146] Obtain the third number of reports corresponding to each preset risk type from the reports that correspond to the candidate dialogue in the reported information;

[0147] Based on the second quantity and the third quantity, a second probability is determined for each candidate speech and each preset risk type.

[0148] Optionally, before determining the second feature vector corresponding to each candidate dialogue based on the frequency information, matching information, and first information of the candidate dialogues to be output, the method further includes:

[0149] Process the target information with a pre-trained intent recognition model to obtain the intent type corresponding to the target information, and determine the degree of matching between each candidate dialogue and that intent type;

[0150] Use the degree of matching between each candidate dialogue and the intent type corresponding to the target information as the first information of that candidate dialogue.
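A minimal sketch of this step is given below. The keyword rules in `recognize_intent` merely stand in for the pre-trained intent recognition model, and the relevance table `INTENT_RELEVANCE`, the intent labels, and the dialogue identifiers are hypothetical; the specification does not say how the degree of matching with an intent type is computed.

```python
from typing import Dict


def recognize_intent(target_information: str) -> str:
    """Stand-in for the pre-trained intent recognition model (assumption)."""
    text = target_information.lower()
    if "invest" in text:
        return "investment"
    if "transfer" in text:
        return "transfer_to_stranger"
    return "other"


# Hypothetical degree of matching between each candidate dialogue and each
# intent type; in practice this could come from labels or a learned model.
INTENT_RELEVANCE: Dict[str, Dict[str, float]] = {
    "d1": {"transfer_to_stranger": 0.9, "investment": 0.2},
    "d2": {"transfer_to_stranger": 0.1, "investment": 0.8},
}


def first_information(target_information: str,
                      relevance: Dict[str, Dict[str, float]] = INTENT_RELEVANCE
                      ) -> Dict[str, float]:
    """Return the first information of every candidate dialogue."""
    intent = recognize_intent(target_information)
    return {d: scores.get(intent, 0.0) for d, scores in relevance.items()}


info = first_information("Transfer 5000 to a new payee")
```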

[0151] Optionally, determining the matching degree between each candidate dialogue and the target user's triggering of the target service based on the first feature vector and the second feature vector includes:

[0152] Based on a pre-trained similarity determination model, the first feature vector, and the second feature vector, the similarity between the first feature vector and the second feature vector is determined, and based on the similarity, the matching degree between each candidate dialogue and the target user triggering the execution of the target service is determined.
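To make the matching step concrete, the sketch below scores each candidate dialogue against the first feature vector and picks the best one. Cosine similarity is used here only as a stand-in for the pre-trained similarity determination model, and the selection rule (highest matching degree wins) is an assumption; the vectors and identifiers are made up for the example.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; a placeholder for the learned similarity model."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_target_dialogue(first_vector: Sequence[float],
                           second_vectors: Dict[str, Sequence[float]]
                           ) -> Tuple[str, float]:
    """Return the candidate with the highest matching degree and that degree."""
    degrees = {d: cosine_similarity(first_vector, v)
               for d, v in second_vectors.items()}
    best = max(degrees, key=degrees.get)
    return best, degrees[best]


# Hypothetical feature vectors, for illustration only.
first_vector = [1.0, 0.0, 1.0]
second_vectors = {"d1": [1.0, 0.0, 1.0], "d2": [0.0, 1.0, 0.0]}
best, degree = select_target_dialogue(first_vector, second_vectors)
```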

[0153] This specification provides a data processing device. Upon detecting that a target user has triggered the execution of a target service, the device determines a first feature vector corresponding to the acquired target information. The target information may include information required for the target user to trigger the execution of the target service, and / or the target user's interaction information for triggering the execution of the target service. Based on the frequency information, matching information, and first information of the candidate dialogues to be output, a second feature vector is determined for each candidate dialogue. The frequency information is determined based on the report information within a preset detection period and the part of that report information that corresponds to the candidate dialogue. The matching information characterizes the degree of matching between the candidate dialogue and the target user with respect to the preset risk types. The first information is determined based on the intent type obtained by performing intent recognition processing on the target information. Based on the first and second feature vectors, the matching degree between each candidate dialogue and the target user's triggering of the target service is determined. Based on the matching degree, the target dialogue among the candidate dialogues that matches the target user's triggering of the target service is determined and output. In this way, the target dialogue is determined from the second feature vector, which is built from the frequency information, matching information, and first information of the candidate dialogue, and from the first feature vector, which is built from the target information. This avoids the situation in which, because the volume of risk control data is large and changes quickly, the model cannot be updated in time and the dialogue used for risk control cannot be determined promptly and accurately. That is, because the frequency information, matching information, and first information of the candidate dialogues reflect current reports, the target dialogue that matches the target user's triggering of the target service can be determined promptly and accurately, so that risk control in the risk control scenario can be carried out promptly and accurately through the determined target dialogue.
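The frequency information mentioned above can be illustrated with a short sketch. It assumes the frequency of a candidate dialogue is the share of reports inside the preset detection period that correspond to that dialogue; the specification only says the frequency is based on these two quantities, so the ratio, the field names `dialogue_id` and `reported_at`, and the seven-day period are assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def frequency_information(reports: List[dict],
                          candidate_ids: List[str],
                          period_end: datetime,
                          period: timedelta) -> Dict[str, float]:
    """Share of in-period reports that correspond to each candidate dialogue."""
    period_start = period_end - period
    in_period = [r for r in reports
                 if period_start <= r["reported_at"] <= period_end]
    first_number = len(in_period)  # all reports in the detection period
    frequencies = {}
    for cid in candidate_ids:
        # Reports in the period that correspond to this candidate dialogue.
        second_number = sum(1 for r in in_period if r["dialogue_id"] == cid)
        frequencies[cid] = second_number / first_number if first_number else 0.0
    return frequencies


# Hypothetical report records, for illustration only.
now = datetime(2022, 7, 22)
reports = [
    {"dialogue_id": "d1", "reported_at": now - timedelta(days=1)},
    {"dialogue_id": "d1", "reported_at": now - timedelta(days=2)},
    {"dialogue_id": "d2", "reported_at": now - timedelta(days=3)},
    {"dialogue_id": "d1", "reported_at": now - timedelta(days=30)},  # outside the period
]
freq = frequency_information(reports, ["d1", "d2"], now, timedelta(days=7))
```

Because the frequency is recomputed from recent reports on every request, it tracks changes in reported dialogues without retraining any model, which is the property the paragraph above relies on.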

[0154] Example 5

[0155] This specification also provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the processes of the above data processing method embodiments and achieves the same technical effects; to avoid repetition, they are not described again here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

[0156] This specification provides a computer-readable storage medium. When the stored program is executed, upon detecting that a target user has triggered the execution of a target service, a first feature vector corresponding to the acquired target information is determined. The target information may include information required for the target user to trigger the execution of the target service, and / or the target user's interaction information for triggering the execution of the target service. Based on the frequency information, matching information, and first information of the candidate dialogues to be output, a second feature vector is determined for each candidate dialogue. The frequency information is determined based on the report information within a preset detection period and the part of that report information that corresponds to the candidate dialogue. The matching information characterizes the degree of matching between the candidate dialogue and the target user with respect to the preset risk types. The first information is determined based on the intent type obtained by performing intent recognition processing on the target information. Based on the first and second feature vectors, the matching degree between each candidate dialogue and the target user's triggering of the target service is determined. Based on the matching degree, the target dialogue among the candidate dialogues that matches the target user's triggering of the target service is determined and output. In this way, the target dialogue is determined from the second feature vector, which is built from the frequency information, matching information, and first information of the candidate dialogue, and from the first feature vector, which is built from the target information. This avoids the situation in which, because the volume of risk control data is large and changes quickly, the model cannot be updated in time and the dialogue used for risk control cannot be determined promptly and accurately. That is, because the frequency information, matching information, and first information of the candidate dialogues reflect current reports, the target dialogue that matches the target user's triggering of the target service can be determined promptly and accurately, so that risk control in the risk control scenario can be carried out promptly and accurately through the determined target dialogue.

[0157] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired result. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired result. In some embodiments, multitasking and parallel processing are possible or may be advantageous.

[0158] In the 1990s, an improvement to a technology could be clearly classified as either a hardware improvement (for example, an improvement to the circuit structure of diodes, transistors, switches, etc.) or a software improvement (an improvement to a method flow). However, with technological advancements, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented with a hardware entity module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers can program a digital system themselves to "integrate" it onto a PLD, without needing a chip manufacturer to design and manufacture a dedicated integrated circuit chip. Furthermore, instead of manually making integrated circuit chips, this programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There are many HDLs, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language). Currently, the most commonly used are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can be readily obtained by describing the method flow in one of these hardware description languages and programming it into an integrated circuit.

[0159] The controller can be implemented in any suitable manner. For example, it can take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, and embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory. Those skilled in the art will also recognize that, in addition to implementing the controller in purely computer-readable program code form, the same functionality can be achieved by logically programming the method steps to make the controller take the form of logic gates, switches, ASICs, programmable logic controllers, and embedded microcontrollers. Therefore, such a controller can be considered a hardware component, and the means included therein for implementing various functions can also be considered as structures within the hardware component. Alternatively, the means for implementing various functions can be considered as both software modules implementing the method and structures within the hardware component.

[0160] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. Specifically, a computer can be, for example, a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or any combination of these devices.

[0161] For ease of description, the above apparatus is described in terms of separate functional units. Of course, when implementing one or more embodiments of this specification, the functions of the units can be implemented in the same piece, or in multiple pieces, of software and / or hardware.

[0162] Those skilled in the art will understand that the embodiments of this specification can be provided as methods, systems, or computer program products. Therefore, one or more embodiments of this specification may take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0163] The embodiments described herein are illustrated with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this specification. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flowchart illustrations and / or one or more block diagrams.

[0164] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flowcharts and / or one or more block diagrams.

[0165] These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions, which execute on the computer or other programmable apparatus, provide steps for implementing the functions specified in one or more flowcharts and / or one or more block diagrams.

[0166] In a typical configuration, a computing device includes one or more processors (CPUs), input / output interfaces, network interfaces, and memory.

[0167] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM), and / or non-volatile memory, such as read-only memory (ROM) or flash memory. Memory is an example of computer-readable media.

[0168] Computer-readable media include permanent and non-permanent, removable and non-removable media that can store information using any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0169] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0171] One or more embodiments of this specification can be described in the general context of computer-executable instructions, such as program modules, that are executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a particular task or implement a particular abstract data type. One or more embodiments of this specification can also be practiced in distributed computing environments where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.

[0172] The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are basically similar to the method embodiments, so they are described relatively briefly; for relevant parts, refer to the descriptions of the method embodiments.

[0173] The above description is merely an embodiment of this specification and is not intended to limit this specification. Various modifications and variations can be made to this specification by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this specification should be included within the scope of the claims of this specification.

Claims

1. A data processing method, comprising: upon detecting that a target user has triggered the execution of a target service, determining, based on acquired target information, a first feature vector corresponding to the target information, wherein the target information includes information required by the target user to trigger the execution of the target service, and / or the target user's interaction information for triggering the execution of the target service; determining, based on frequency information, matching information, and first information of candidate dialogues to be output, a second feature vector corresponding to each candidate dialogue, wherein the frequency information is determined based on report information within a preset detection period and the part of that report information that corresponds to the candidate dialogue, the matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type, and the first information is determined based on the intent type obtained by performing intent recognition processing on the target information; determining, based on the first feature vector and the second feature vector, the matching degree between each candidate dialogue and the target user triggering the execution of the target service; and determining, based on the matching degree, the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service, and outputting the target dialogue; wherein the matching information for each candidate dialogue is determined based on a first probability and a second probability, the first probability is the probability that the target user corresponds to each preset risk type, determined based on the target information and a pre-trained probability determination model, and the second probability is the probability that the candidate dialogue corresponds to each preset risk type, determined based on the report information that corresponds to the candidate dialogue.

2. The method according to claim 1, further comprising: obtaining feedback information from the target user regarding the target dialogue; determining, based on the target dialogue, the feedback information, and a pre-trained risk score determination model, a risk score corresponding to the target dialogue and the feedback information, wherein the risk score determination model is obtained by training, on historical dialogues and historical feedback information, a model constructed with a preset machine learning algorithm; and determining, based on the risk score corresponding to the target dialogue and the feedback information, whether there is a risk in the target user triggering the execution of the target service.

3. The method according to claim 2, wherein determining the second feature vector corresponding to each candidate dialogue based on the frequency information, matching information, and first information of the candidate dialogues to be output comprises: determining, based on the frequency information, the matching information, and the first information of the candidate dialogues to be output, and on a pre-trained second vector extraction model, a first sub-feature vector corresponding to each candidate dialogue; performing feature extraction processing on the content of the candidate dialogues to determine a second sub-feature vector corresponding to each candidate dialogue; and determining, based on the first sub-feature vector and the second sub-feature vector, the second feature vector corresponding to each candidate dialogue.

4. The method according to claim 3, further comprising, before determining the second feature vector corresponding to each candidate dialogue based on the frequency information, matching information, and first information of the candidate dialogues to be output: obtaining a first number, namely the number of reports in the report information within the preset detection period; determining a second number, namely the number of reports in the report information that correspond to the candidate dialogue; and determining, based on the first number and the second number, the frequency information of each candidate dialogue.

5. The method according to claim 4, further comprising, before determining the second feature vector corresponding to each candidate dialogue based on the frequency information, matching information, and first information of the candidate dialogues to be output: determining, based on the target information and the pre-trained probability determination model, the first probability that the target user corresponds to each preset risk type, wherein the probability determination model is obtained by training, on historical information, a model constructed with a preset machine learning algorithm; determining, based on the report information that corresponds to the candidate dialogue, the second probability that the candidate dialogue corresponds to each preset risk type; and determining, based on the first probability and the second probability, the matching information for each candidate dialogue.

6. The method according to claim 5, wherein determining the second probability that the candidate dialogue corresponds to each preset risk type based on the report information that corresponds to the candidate dialogue comprises: obtaining a third number, namely the number of reports corresponding to each preset risk type among the report information that corresponds to the candidate dialogue; and determining, based on the second number and the third number, the second probability that each candidate dialogue corresponds to each preset risk type.

7. The method according to claim 6, further comprising, before determining the second feature vector corresponding to each candidate dialogue based on the frequency information, matching information, and first information of the candidate dialogues to be output: processing the target information with a pre-trained intent recognition model to obtain the intent type corresponding to the target information, and determining the degree of matching between each candidate dialogue and the intent type corresponding to the target information; and using the degree of matching between each candidate dialogue and the intent type corresponding to the target information as the first information of that candidate dialogue.

8. The method according to claim 7, wherein determining the matching degree between each candidate dialogue and the target user triggering the execution of the target service based on the first feature vector and the second feature vector comprises: Based on a pre-trained similarity determination model, the first feature vector, and the second feature vector, the similarity between the first feature vector and the second feature vector is determined, and based on the similarity, the matching degree between each candidate dialogue and the target user triggering the execution of the target service is determined.

9. A data processing apparatus, comprising: a first acquisition module, configured to determine, upon detecting that a target user has triggered the execution of a target service, a first feature vector corresponding to acquired target information based on the target information, wherein the target information includes information required by the target user to trigger the execution of the target service, and / or the target user's interaction information for triggering the execution of the target service; a first determining module, configured to determine a second feature vector corresponding to each candidate dialogue based on frequency information, matching information, and first information of candidate dialogues to be output, wherein the frequency information is determined based on report information within a preset detection period and the part of that report information that corresponds to the candidate dialogue, the matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type, and the first information is determined based on the intent type obtained by performing intent recognition processing on the target information; a second determining module, configured to determine, based on the first feature vector and the second feature vector, the matching degree between each candidate dialogue and the target user triggering the execution of the target service; and a dialogue determination module, configured to determine, based on the matching degree, the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service, and to output the target dialogue; wherein the matching information for each candidate dialogue is determined based on a first probability and a second probability, the first probability is the probability that the target user corresponds to each preset risk type, determined based on the target information and a pre-trained probability determination model, and the second probability is the probability that the candidate dialogue corresponds to each preset risk type, determined based on the report information that corresponds to the candidate dialogue.

10. A data processing apparatus, comprising: a processor; and a memory configured to store computer-executable instructions which, when executed, cause the processor to: upon detecting that a target user has triggered the execution of a target service, determine, based on acquired target information, a first feature vector corresponding to the target information, wherein the target information includes information required by the target user to trigger the execution of the target service, and / or the target user's interaction information for triggering the execution of the target service; determine, based on frequency information, matching information, and first information of candidate dialogues to be output, a second feature vector corresponding to each candidate dialogue, wherein the frequency information is determined based on report information within a preset detection period and the part of that report information that corresponds to the candidate dialogue, the matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type, and the first information is determined based on the intent type obtained by performing intent recognition processing on the target information; determine, based on the first feature vector and the second feature vector, the matching degree between each candidate dialogue and the target user triggering the execution of the target service; and determine, based on the matching degree, the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service, and output the target dialogue; wherein the matching information for each candidate dialogue is determined based on a first probability and a second probability, the first probability is the probability that the target user corresponds to each preset risk type, determined based on the target information and a pre-trained probability determination model, and the second probability is the probability that the candidate dialogue corresponds to each preset risk type, determined based on the report information that corresponds to the candidate dialogue.

11. A storage medium for storing computer-executable instructions which, when executed by a processor, perform the following process: upon detecting that a target user has triggered the execution of a target service, determining, based on acquired target information, a first feature vector corresponding to the target information, wherein the target information includes information required by the target user to trigger the execution of the target service, and / or the target user's interaction information for triggering the execution of the target service; determining, based on frequency information, matching information, and first information of candidate dialogues to be output, a second feature vector corresponding to each candidate dialogue, wherein the frequency information is determined based on report information within a preset detection period and the part of that report information that corresponds to the candidate dialogue, the matching information is used to characterize the degree of matching between the candidate dialogue and the target user and a preset risk type, and the first information is determined based on the intent type obtained by performing intent recognition processing on the target information; determining, based on the first feature vector and the second feature vector, the matching degree between each candidate dialogue and the target user triggering the execution of the target service; and determining, based on the matching degree, the target dialogue among the candidate dialogues that matches the target user triggering the execution of the target service, and outputting the target dialogue; wherein the matching information for each candidate dialogue is determined based on a first probability and a second probability, the first probability is the probability that the target user corresponds to each preset risk type, determined based on the target information and a pre-trained probability determination model, and the second probability is the probability that the candidate dialogue corresponds to each preset risk type, determined based on the report information that corresponds to the candidate dialogue.