Intelligent matching model construction method, system and device using deep reinforcement learning
The intelligent matching model construction method based on deep reinforcement learning solves the problem of the inability to share candidate information matching models, and realizes efficient and accurate model updates and optimizations across multiple devices.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: ZHONGSHEN BUSINESS TECH (SHENZHEN) CO LTD
- Filing Date: 2025-02-08
- Publication Date: 2026-06-16
[Figure: CN120067704B_ABST]
Description
Technical Field
[0001] This invention relates to the fields of deep learning and information processing, and in particular to a method, system and device for constructing an intelligent matching model using deep reinforcement learning.
Background Technology
[0002] In the existing resume screening or candidate information screening process, matching models and matching algorithms are often used to achieve a high degree of matching of candidate information, thereby ensuring the efficiency of the candidate screening process.
[0003] However, in practical applications, a matching model or matching algorithm tends to diverge as it is used, adapting to the particular user or to the device on which it is configured. That is, when matching candidate information, the model comes to reflect the personal preferences of the user or of the user of that device. As a result, copies of the same model used by different users on different devices learn differently and eventually become different models.
[0004] Candidate information is, however, usually matched by more than one user or device. Because existing models cannot be shared, other devices handling the same batch of candidate information or the same position must retrain or transfer a model before they can reach the efficiency that a model attains only after long-term use, which makes the matching model inefficient overall.
Summary of the Invention
[0005] To address the problems of the existing technology, embodiments of the present invention provide a method, system, and device for constructing an intelligent matching model using deep reinforcement learning, as follows.
[0006] In one aspect, a method for constructing an intelligent matching model using deep reinforcement learning is provided. This method is applied to a distributed candidate information intelligent matching system, which includes multiple matching devices, each configured with an intelligent matching model. The method includes:
[0007] After any matching device retrieves and matches candidate information through the intelligent matching model, it obtains the matching information, sample information, and operation information corresponding to the matching device.
[0008] Based on the matching information, generate matching transmission parameters;
[0009] Based on the sample information, generate sample transfer parameters;
[0010] Based on the operation information, generate operation transmission parameters;
[0011] The matching transmission parameters, the sample transmission parameters, and the operation transmission parameters are transmitted to the central device respectively;
[0012] Upon receiving the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters, the central device performs filtering through preset deep reinforcement learning.
[0013] When other matching devices are idle, they train the configured intelligent matching model based on the filtered matching transmission parameters, filtered sample transmission parameters, and filtered operation transmission parameters to obtain an updated intelligent matching model.
[0014] Optionally, the method further includes:
[0015] The intelligent matching model is constructed using a wide neural network;
[0016] The intelligent matching model is trained based on preset candidate information and corresponding matching information.
[0017] Optionally, after any matching device retrieves and matches candidate information through the intelligent matching model, the matching information, sample information, and operation information corresponding to the matching device are obtained, including:
[0018] After the user matches candidate information using keywords and description text through the intelligent matching model, the operations performed on the candidate information are recorded. The operations include at least marking, deleting, and adding to the candidate list.
[0019] The sample information is generated based on the keywords, the descriptive text, and the matched candidate information.
[0020] The matching information is generated based on the first parameter matrix of the intelligent matching model;
[0021] Based on the marking operation, the deletion operation, and the addition to the candidate list operation, operation information corresponding to the keyword, the description text, and the matching candidate information is generated respectively.
[0022] Optionally, generating the matching transmission parameters based on the matching information includes:
[0023] Based on the first parameter matrix, the keywords, and the description text, generate corresponding matching transmission parameters;
[0024] The matching transmission parameters are transmitted to the central device.
[0025] Optionally, generating sample transfer parameters based on the sample information includes:
[0026] After the deletion operation, the candidate information is subjected to keyword identification and extraction, and a first sample transmission parameter is generated based on the first extraction result, the keywords, and the description text.
[0027] The candidate information for which the deletion operation was performed in the matched candidate information is subjected to keyword identification and extraction, and a second sample transmission parameter is generated based on the second extraction result, the keywords and the description text;
[0028] The first sample transmission parameter and the second sample transmission parameter are combined into the sample transmission parameter and transmitted to the central device.
[0029] Optionally, generating operation transmission parameters based on the operation information includes:
[0030] Obtain the final candidate information after the operation is performed;
[0031] The intelligent matching model is trained based on the final candidate information, the keywords, and the description text.
[0032] Based on the second parameter matrix corresponding to the trained intelligent matching model, as well as the keywords and the description text, the operation transmission parameters are generated.
[0033] The operation transmission parameters are transmitted to the central device.
[0034] Optionally, upon receiving the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters, the central device performs screening through preset deep reinforcement learning, including:
[0035] Set the first parameter matrix and the second parameter matrix as the parameter matrices of the initial intelligent matching model, respectively;
[0036] The sample transmission parameters and the matching transmission parameters are used as input values for the training samples, and the operation transmission parameters, the keywords, and the descriptive text are used as the expected output.
[0037] The input values are fed to the initial intelligent matching model and to the deep reinforcement learning algorithm, respectively.
[0038] Based on the actual output value and the expected output value, the matching transfer parameter, the sample transfer parameter, and the operation transfer parameter are evaluated, and the matching transfer parameter, sample transfer parameter, and operation transfer parameter whose difference between the actual output value and the expected output value is greater than a preset value are deleted.
[0039] Optionally, the step of training the configured intelligent matching model based on the filtered matching transfer parameters, the filtered sample transfer parameters, and the filtered operation transfer parameters to obtain the updated intelligent matching model includes:
[0040] Set the first parameter matrix as the first influence coefficient;
[0041] Set the second parameter matrix as the second influence coefficient;
[0042] The parameter matrix of the intelligent matching model is adjusted based on the first influence coefficient and the second influence coefficient.
[0043] The sample transmission parameters and the matching transmission parameters are used as input values for training samples, and the operation transmission parameters, the keywords, and the descriptive text are used as expected outputs to train the intelligent matching model, thereby obtaining the updated intelligent matching model.
[0044] In another aspect, a system for constructing an intelligent matching model using deep reinforcement learning is provided. The system includes multiple matching devices, each configured with an intelligent matching model, and a central device, wherein:
[0045] The matching device is used for:
[0046] After retrieving and matching candidate information through the intelligent matching model, the matching information, sample information, and operation information corresponding to the matching device are obtained.
[0047] Based on the matching information, generate matching transmission parameters;
[0048] Based on the sample information, generate sample transfer parameters;
[0049] Based on the operation information, generate operation transmission parameters;
[0050] The matching transmission parameters, the sample transmission parameters, and the operation transmission parameters are transmitted to the central device respectively;
[0051] The central device is used to perform filtering through preset deep reinforcement learning upon receiving the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters.
[0052] Other matching devices are used, when idle, to train the configured intelligent matching model based on the filtered matching transmission parameters, the filtered sample transmission parameters, and the filtered operation transmission parameters, to obtain an updated intelligent matching model.
[0053] In yet another aspect, a device for constructing an intelligent matching model using deep reinforcement learning is provided, the device comprising:
[0054] The acquisition module is used to retrieve and match candidate information through the intelligent matching model, and then acquire matching information, sample information and operation information corresponding to the matching device.
[0055] The processing module is used to generate matching transmission parameters based on the matching information;
[0056] The processing module is also used to generate sample transfer parameters based on the sample information;
[0057] The processing module is also used to generate operation transmission parameters based on the operation information;
[0058] The transmission module is used to transmit the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters to the central device, respectively.
[0059] The device is further used, when in an idle state, to train the configured intelligent matching model based on the filtered matching transmission parameters, filtered sample transmission parameters, and filtered operation transmission parameters to obtain an updated intelligent matching model.
[0060] The present invention has at least the following beneficial effects:
[0061] By generating matching transfer parameters based on matching information, sample transfer parameters based on sample information, and operation transfer parameters based on operation information, the central device performs screening through preset deep reinforcement learning. When other matching devices are idle, they train the configured intelligent matching model based on the screened matching transfer parameters, screened sample transfer parameters, and screened operation transfer parameters to obtain an updated intelligent matching model. This allows each device's intelligent matching model to share its operations, samples, and parameters with other matching devices after completing the candidate information matching process, avoiding the need for other devices to retrain or transfer their models. By sharing operations, samples, and parameters, the model is optimized, which not only improves efficiency but also further ensures the accuracy of the candidate information matching process.
Attached Figure Description
[0062] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0063] Figure 1 is a schematic diagram of a candidate information matching system provided in an embodiment of the present invention;
[0064] Figure 2 is a schematic diagram of a method for constructing an intelligent matching model using deep reinforcement learning, provided in an embodiment of the present invention;
[0065] Figure 3 is a schematic diagram of a method for constructing an intelligent matching model using deep reinforcement learning, provided in an embodiment of the present invention;
[0066] Figure 4 is a schematic diagram of a method for constructing an intelligent matching model using deep reinforcement learning, provided in an embodiment of the present invention;
[0067] Figure 5 is a schematic diagram of an intelligent matching model construction system using deep reinforcement learning, provided in an embodiment of the present invention.
Detailed Implementation
[0068] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of the embodiments of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this invention, and not all embodiments. Based on the embodiments of this invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this invention.
[0069] In practical applications, the candidate information referred to in the method of this embodiment of the invention is typically a resume or similar information describing the candidate's skills, experience, and education. Other forms of candidate information are of course also possible; this embodiment of the invention does not limit the specific form of candidate information.
[0070] Additionally, it should be noted that, as shown in Figure 1, when the method described in this embodiment of the invention is applied to a candidate information matching system, the system configuration must at least meet the following conditions:
[0071] Candidate information is configured in the candidate information server. The central device is configured with at least a large language model. The aforementioned candidate information server and central device are deployed in the cloud to avoid the need for local computing and storage resources. The matching device is deployed locally and deploys an intelligent matching model. This matching device can be a user's personal device, such as a computer.
[0072] In this process, users initiate operations through matching devices, such as searching using keywords;
[0073] The intelligent matching model retrieves information on multiple matching candidates from the candidate information server using keywords input by the user.
[0074] The candidate information server will return the information of multiple candidates to the matching device;
[0075] On the matching device, users can continue to perform operations such as tagging, deleting, and adding to the candidate list for multiple candidate information.
[0076] The matching device can also be deployed in different locations. That is, when an entity (such as a company) has matching devices deployed in multiple locations at the same time, the central device and candidate server can be shared through cloud services.
[0077] As shown in Figure 2, a method for constructing an intelligent matching model using deep reinforcement learning is provided. This method is applied to a distributed candidate information intelligent matching system, which includes multiple matching devices, each configured with an intelligent matching model. The method includes:
[0078] 101. After retrieving and matching candidate information through the intelligent matching model on any matching device, obtain the matching information, sample information and operation information corresponding to the matching device;
[0079] 102. Generate matching transmission parameters based on the matching information;
[0080] 103. Generate sample transfer parameters based on sample information;
[0081] 104. Generate operation transmission parameters based on the operation information;
[0082] 105. Transmit the matching transmission parameters, sample transmission parameters, and operation transmission parameters to the central device respectively;
[0083] 106. Upon receiving the matching transmission parameters, sample transmission parameters, and operation transmission parameters, the central device performs screening through preset deep reinforcement learning.
[0084] 107. When other matching devices are idle, they train the configured intelligent matching model based on the filtered matching transmission parameters, filtered sample transmission parameters, and filtered operation transmission parameters to obtain an updated intelligent matching model.
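As a non-limiting illustration of steps 105 to 107, the hand-off from the central device to the other matching devices can be sketched as follows. The class names `Update`, `MatchingDevice` and `CentralDevice`, the `idle` flag, and the queueing scheme are all hypothetical; the embodiment does not specify how idleness is detected or how updates are queued, and `train_on` merely stands in for local training.

```python
from dataclasses import dataclass, field


@dataclass
class Update:
    """Filtered transmission parameters originating from one matching device."""
    source_id: str
    payload: dict


@dataclass
class MatchingDevice:
    device_id: str
    idle: bool = True
    applied: list = field(default_factory=list)

    def train_on(self, update):
        # Placeholder for step 107: local training on the filtered parameters.
        self.applied.append(update.source_id)


@dataclass
class CentralDevice:
    pending: dict = field(default_factory=dict)  # device_id -> queued updates

    def publish(self, update, devices):
        """Queue a filtered update for every device except the one that produced it."""
        for device in devices:
            if device.device_id != update.source_id:
                self.pending.setdefault(device.device_id, []).append(update)

    def dispatch(self, devices):
        """Deliver queued updates only to devices that are currently idle."""
        for device in devices:
            if device.idle:
                for update in self.pending.pop(device.device_id, []):
                    device.train_on(update)


devices = [MatchingDevice("A"), MatchingDevice("B", idle=False), MatchingDevice("C")]
central = CentralDevice()
central.publish(Update("A", {"keywords": ["python"]}), devices)
central.dispatch(devices)
```

In this sketch a busy device keeps its queued updates until a later `dispatch` call finds it idle, which is one possible reading of "when other matching devices are idle".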
[0085] Optionally, the method also includes:
[0086] The intelligent matching model is constructed using a wide neural network;
[0087] The intelligent matching model is trained based on the preset candidate information and the corresponding matching information.
[0088] The above construction process can be specifically described as follows:
[0089] Through feature extraction, the training text and input text of the intelligent matching model are constructed. In the feature extraction of text data, the bag-of-words model, TF-IDF and word embedding can be used.
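By way of example only, TF-IDF feature extraction over candidate text may be sketched as below. The whitespace tokenizer, the function names `tokenize` and `tf_idf`, and the sample resumes are assumptions made for illustration; the embodiment does not prescribe a particular tokenizer or weighting variant.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a real system would use a proper tokenizer."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(N / df), where df is the number of documents containing the term
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


resumes = [
    "python developer with machine learning experience",
    "java developer with backend experience",
    "python data analyst",
]
vectors = tf_idf(resumes)
```

Terms that occur in fewer resumes, such as "machine" here, receive a higher weight than terms shared by several resumes, such as "python".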
[0090] The constructed model is a broad learning system (BLS), which is structurally flat: the original input is converted into random features in the "feature nodes" and then expanded in width in the "enhancement nodes".
[0091] In BLS, the input data is first transformed into random features through some feature mappings, and then connected to the enhancement nodes through a non-linear activation function to train the model. The feature mapping is the result output by the large language model of the central device. That is, the user input keywords, descriptive text, and candidate information are input into the large language model. The large language model obtains the features corresponding to the user input keywords, descriptive text, and candidate information through recognition and feature extraction, and maps the features to the input layer of the intelligent matching model.
[0092] Random features (nodes) are connected to the output layer along with the output of the enhancement layer. The weights of the output layer are determined by a fast pseudo-inverse of the system equations or an iterative gradient descent training algorithm.
[0093] Given the training dataset $X$ and $n$ feature mappings $\phi_i$, $i = 1, 2, \ldots, n$, the $i$-th mapped feature is:
[0094] $Z_i = \phi_i(X W_{ei} + \beta_{ei}), \quad i = 1, 2, \ldots, n$
[0095] where the weight $W_{ei}$ and the bias term $\beta_{ei}$ are randomly generated matrices.
[0096] $Z^n = [Z_1, Z_2, \ldots, Z_n]$ denotes the set of $n$ groups of feature nodes; $Z^n$ is connected to the enhancement node layer.
[0097] Preferably, the enhancement nodes can be obtained through:
[0098] $H_j = \xi_j(Z^n W_{hj} + \beta_{hj}), \quad j = 1, 2, \ldots, m$, which represents the output of the $j$-th enhancement node, where $\xi_j$ is a non-linear activation function. Furthermore, $H^m = [H_1, H_2, \ldots, H_m]$ denotes the output of the enhancement layer.
[0099] Finally, the output $Y$ of the BLS takes the following form:
[0100] $Y = [Z_1, Z_2, \ldots, Z_n \mid H_1, H_2, \ldots, H_m]\, W^m = [Z^n \mid H^m]\, W^m$
[0101] where $W^m$ denotes the weights connecting the feature node layer and the enhancement node layer to the output layer, which can be calculated via the pseudo-inverse as $W^m = [Z^n \mid H^m]^{+} Y$.
[0102] After training, regularization techniques are employed to enhance the model's generalization ability and prevent overfitting. Regularization limits the complexity of model parameters by adding additional terms (such as L1 or L2 penalty terms) to the loss function, thereby reducing the risk of the model overfitting the training data.
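As one possible illustration of the L2 penalty mentioned above, the output weights may be obtained by ridge regression. The embodiment does not give this formula; the symbols $A$ (the concatenated feature and enhancement node outputs) and $\lambda$ (the regularization strength) are introduced here only for the purpose of the sketch.

```latex
% Assumption: A denotes the concatenated node outputs, A = [Z^n | H^m],
% and \lambda > 0 is a hypothetical L2 regularization strength.
\begin{aligned}
W^m &= \arg\min_{W} \; \lVert A W - Y \rVert_2^2 + \lambda \lVert W \rVert_2^2 \\
    &= \left( A^{\top} A + \lambda I \right)^{-1} A^{\top} Y ,
\qquad
\lim_{\lambda \to 0^{+}} \left( A^{\top} A + \lambda I \right)^{-1} A^{\top} = A^{+} .
\end{aligned}
```

Under this reading, the pseudo-inverse solution is the limiting case of the regularized solution as $\lambda$ approaches zero, and a larger $\lambda$ shrinks the output weights more strongly.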
[0103] Optionally, as shown in Figure 3, after any matching device retrieves and matches candidate information through the intelligent matching model, obtaining the matching information, sample information, and operation information corresponding to the matching device includes:
[0104] 201. After the user matches candidate information using keywords and description text through an intelligent matching model, record the operations performed on the candidate information. The operations include at least the marking operation, the deletion operation, and the addition of the candidate to the candidate list.
[0105] After recording the operations performed on candidate information, the multiple candidate information referenced by the operations are returned to the central device.
[0106] If a marking operation and a candidate list addition operation are performed on candidate information A, the text marked by the user, the flag for adding to the candidate list, and the identifier of candidate information A are returned to the central device.
[0107] 202. Generate sample information based on keywords, description text, and matched candidate information;
[0108] Specifically, the process can be:
[0109] Add keywords, description text, and identifiers of matched candidate information to the data packet, which becomes the sample information;
[0110] 203. Generate matching information based on the first parameter matrix of the intelligent matching model;
[0111] Specifically, after a user performs a search using keywords and description text, they receive information on multiple candidates.
[0112] Users indicate their judgment of the search results by marking, deleting, and adding to the candidate list.
[0113] That is, when a user performs a delete operation on a piece of candidate information, this indicates that the candidate information is not the expected result for the keywords and description text the user entered;
[0114] When a user performs a marking operation on a piece of candidate information or adds it to the candidate list, this indicates that the candidate information is the expected result for the keywords and description text the user entered;
[0115] The intelligent matching model is trained with the user's keywords and description text as the expected output and, as the input, the multiple pieces of candidate information that the user has added to the candidate list once the operations are finished; this yields the first parameter matrix of the intelligent matching model. The first parameter matrix, the keywords, the description text, and the identifiers of the multiple pieces of candidate information are used as the matching information.
[0116] 204. Based on the marking operation, deletion operation, and adding to the candidate list operation, generate operation information corresponding to the keywords, description text, and matching candidate information respectively.
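As a non-limiting sketch of steps 201, 202 and 204, the recorded operations may be turned into sample information and operation information as follows. The `Operation` record, the field names, the operation labels `"mark"`, `"delete"` and `"add_to_list"`, and the JSON packet layout are hypothetical; the embodiment only states which items the data packet contains.

```python
import json
from dataclasses import dataclass


@dataclass
class Operation:
    candidate_id: str
    kind: str          # "mark", "delete", or "add_to_list"
    note: str = ""     # text marked by the user, if any


def build_sample_info(keywords, description, matched_ids):
    """Step 202: keywords, description text, and matched candidate identifiers."""
    return {"keywords": keywords, "description": description, "candidates": matched_ids}


def build_operation_info(keywords, description, operations):
    """Step 204: group candidate identifiers by the operation applied to them."""
    by_kind = {"mark": [], "delete": [], "add_to_list": []}
    for op in operations:
        by_kind[op.kind].append(op.candidate_id)
    return {"keywords": keywords, "description": description, "operations": by_kind}


# Step 201: operations recorded after a search with keywords and description text.
ops = [
    Operation("cand-A", "mark", note="5 years of Python"),
    Operation("cand-A", "add_to_list"),
    Operation("cand-B", "delete"),
]
sample_info = build_sample_info(["python"], "backend engineer", ["cand-A", "cand-B"])
operation_info = build_operation_info(["python"], "backend engineer", ops)
packet = json.dumps({"sample": sample_info, "operation": operation_info})
```

Only candidate identifiers are placed in the packet, in line with paragraph [0109], which adds identifiers of the matched candidate information rather than the candidate information itself.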
[0117] Optionally, generating the matching transmission parameters based on the matching information includes:
[0118] Based on the first parameter matrix, keywords, and description text, generate the corresponding matching transmission parameters;
[0119] The matching transmission parameters are transmitted to the central device.
[0120] Optionally, generating the sample transmission parameters based on the sample information includes:
[0121] Keyword identification and extraction are performed on the candidate information that remains after the deletion operation, and the first sample transmission parameters are generated based on the first extraction results, the keywords, and the description text.
[0122] Keyword identification and extraction are performed on the candidate information, among the matched candidate information, on which the deletion operation was performed, and the second sample transmission parameters are generated based on the second extraction results, the keywords, and the description text.
[0123] The first sample transmission parameters and the second sample transmission parameters are combined into the sample transmission parameters and transmitted to the central device.
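For illustration only, the combination of the two sample transmission parameters may be sketched as below, reading the first parameter as covering the candidate information retained after deletion and the second as covering the deleted candidate information. The vocabulary-based `extract_keywords` helper is a hypothetical stand-in for the keyword identification and extraction step, whose actual technique is not specified.

```python
def extract_keywords(text, vocabulary):
    """Hypothetical extractor: keep the vocabulary terms that occur in the text."""
    tokens = set(text.lower().split())
    return sorted(tokens & vocabulary)


def build_sample_params(candidates, deleted_ids, keywords, description, vocabulary):
    """Combine first (retained) and second (deleted) sample transmission parameters."""
    def part(selected):
        extracted = sorted(
            {kw for text in selected for kw in extract_keywords(text, vocabulary)}
        )
        return {"extracted": extracted, "keywords": keywords, "description": description}

    retained = [text for cid, text in candidates.items() if cid not in deleted_ids]
    deleted = [text for cid, text in candidates.items() if cid in deleted_ids]
    return {"first": part(retained), "second": part(deleted)}


candidates = {
    "cand-A": "Python backend engineer with Django",
    "cand-B": "Java frontend developer",
}
params = build_sample_params(
    candidates, {"cand-B"}, ["python"], "backend engineer",
    vocabulary={"python", "django", "java", "backend", "frontend"},
)
```

Keeping the retained and deleted extractions side by side gives the central device both positive and negative evidence for the same keywords and description text.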
[0124] Optionally, generating the operation transmission parameters based on the operation information includes:
[0125] Obtain the final candidate information after the operation is performed;
[0126] The intelligent matching model is trained based on the final candidate information, keywords, and description text.
[0127] Based on the second parameter matrix corresponding to the trained intelligent matching model, as well as the keywords and descriptive text, the operation transmission parameters are generated.
[0128] The operation transmission parameters are transmitted to the central device.
[0129] Optionally, upon receiving the matching transmission parameters, sample transmission parameters, and operation transmission parameters, the central device performs screening through pre-defined deep reinforcement learning, including:
[0130] Set the first parameter matrix and the second parameter matrix as the parameter matrices of the initial intelligent matching model, respectively;
[0131] The sample transmission parameters and matching transmission parameters are used as input values for the training samples, and the operation transmission parameters, keywords, and descriptive text are used as the expected output.
[0132] The input values are fed to the initial intelligent matching model and to the deep reinforcement learning algorithm, respectively.
[0133] Based on the actual output value and the expected output value, the matching transfer parameter, sample transfer parameter and operation transfer parameter are evaluated, and the matching transfer parameter, sample transfer parameter and operation transfer parameter whose difference between the actual output value and the expected output value is greater than the preset value are deleted.
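The deletion rule in the preceding paragraph can be illustrated with the following minimal sketch. It shows only the threshold comparison between actual and expected output; the deep reinforcement learning algorithm itself is not detailed in the embodiment and is not modelled here. The function `filter_updates`, the `toy_model` stand-in, the numeric encoding of the outputs, and the use of the maximum absolute difference as the error measure are all assumptions.

```python
def filter_updates(updates, predict, threshold):
    """Keep only the updates whose prediction error does not exceed the threshold.

    updates   : list of dicts with "inputs" and "expected" (lists of floats)
    predict   : callable standing in for the initial intelligent matching model
    threshold : the preset value referred to in the text
    """
    kept = []
    for update in updates:
        actual = predict(update["inputs"])
        # Largest element-wise gap between actual and expected output.
        error = max(abs(a - e) for a, e in zip(actual, update["expected"]))
        if error <= threshold:
            kept.append(update)
    return kept


def toy_model(inputs):
    # Stand-in model used only for this example: doubles each input value.
    return [2.0 * x for x in inputs]


updates = [
    {"id": "u1", "inputs": [1.0, 2.0], "expected": [2.0, 4.1]},
    {"id": "u2", "inputs": [1.0, 2.0], "expected": [5.0, 4.0]},
]
kept = filter_updates(updates, toy_model, threshold=0.5)
```

Here the second update is discarded because its expected output differs from the actual output by more than the preset value.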
[0134] Optionally, the configured intelligent matching model is trained based on the filtered matching transmission parameters, the filtered sample transmission parameters, and the filtered operation transmission parameters to obtain an updated intelligent matching model, including:
[0135] Set the first parameter matrix to the first influence coefficient;
[0136] Set the second parameter matrix as the second influence coefficient;
[0137] Adjust the parameter matrix of the intelligent matching model based on the first and second influence coefficients;
[0138] The sample transfer parameters and matching transfer parameters are used as input values for training samples, and the operation transfer parameters, keywords, and descriptive text are used as expected outputs to train the intelligent matching model, thus obtaining the updated intelligent matching model.
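The embodiment does not state how the influence coefficients adjust the parameter matrix. Purely as an assumed example, one simple reading is an element-wise weighted blend of the local parameter matrix with the received first and second parameter matrices. The function `blend_matrices` and the weights `alpha` and `beta` are hypothetical and are not taken from the embodiment.

```python
def blend_matrices(local, first, second, alpha=0.25, beta=0.25):
    """Element-wise convex blend of the local matrix with two received matrices.

    alpha weights the first parameter matrix and beta the second; the local
    matrix keeps the remaining weight 1 - alpha - beta.
    """
    if not (0.0 <= alpha and 0.0 <= beta and alpha + beta <= 1.0):
        raise ValueError("alpha and beta must be non-negative and sum to at most 1")
    keep = 1.0 - alpha - beta
    return [
        [keep * l + alpha * f + beta * s for l, f, s in zip(row_l, row_f, row_s)]
        for row_l, row_f, row_s in zip(local, first, second)
    ]


local = [[1.0, 0.0], [0.0, 1.0]]
first = [[0.0, 0.0], [0.0, 0.0]]
second = [[2.0, 2.0], [2.0, 2.0]]
updated = blend_matrices(local, first, second)
```

After such an adjustment, the model would still be trained on the filtered transmission parameters as described in paragraph [0138].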
[0139] As shown in Figure 4, an intelligent matching model construction system applying deep reinforcement learning is provided. The system includes multiple matching devices, each configured with an intelligent matching model, and a central device, wherein:
[0140] The matching device is used for:
[0141] After retrieving and matching candidate information through the intelligent matching model, the matching information, sample information and operation information corresponding to the matching device are obtained.
[0142] Generate matching transmission parameters based on the matching information;
[0143] Generate sample transfer parameters based on sample information;
[0144] Generate operation transmission parameters based on the operation information;
[0145] The matching transmission parameters, sample transmission parameters, and operation transmission parameters are transmitted to the central device respectively;
[0146] The central device is used to filter the received matching transmission parameters, sample transmission parameters, and operation transmission parameters through preset deep reinforcement learning.
[0147] Other matching devices are used, when idle, to train the configured intelligent matching model based on the filtered matching transmission parameters, the filtered sample transmission parameters, and the filtered operation transmission parameters, to obtain an updated intelligent matching model.
[0148] Optionally, the matching device is used for:
[0149] The intelligent matching model is constructed using a wide neural network;
[0150] The intelligent matching model is trained based on the preset candidate information and the corresponding matching information.
[0151] Optionally, the matching device is used for:
[0152] After the user matches candidate information using keywords and description text through an intelligent matching model, the operations performed on the candidate information are recorded. These operations include at least marking, deleting, and adding to the candidate list.
[0153] Generate sample information based on keywords, description text, and matched candidate information;
[0154] Based on the first parameter matrix of the intelligent matching model, matching information is generated;
[0155] Based on the tagging, deletion, and adding to the candidate list operations, generate operation information corresponding to the keywords, description text, and matching candidate information, respectively.
[0156] Optionally, the matching device is used for:
[0157] Based on the first parameter matrix, keywords, and description text, generate the corresponding matching transmission parameters;
[0158] The matching transmission parameters are transmitted to the central device.
[0159] Optionally, the matching device is used for:
[0160] Keyword identification and extraction are performed on the candidate information that remains after the deletion operation, and the first sample transmission parameters are generated based on the first extraction results, the keywords, and the description text.
[0161] Keyword identification and extraction are performed on the candidate information, among the matched candidate information, on which the deletion operation was performed, and the second sample transmission parameters are generated based on the second extraction results, the keywords, and the description text.
[0162] The first sample transmission parameters and the second sample transmission parameters are combined into the sample transmission parameters and transmitted to the central device.
[0163] Optionally, the matching device is used for:
[0164] Obtain the final candidate information after the operation is performed;
[0165] The intelligent matching model is trained based on the final candidate information, keywords, and description text.
[0166] Based on the second parameter matrix corresponding to the trained intelligent matching model, as well as the keywords and descriptive text, the operation transmission parameters are generated.
[0167] The operation transmission parameters are transmitted to the central device.
[0168] Optionally, the central device is used for:
[0169] Set the first parameter matrix and the second parameter matrix as the parameter matrices of the initial intelligent matching model, respectively;
[0170] The sample transmission parameters and the matching transmission parameters are used as the input values of the training samples, and the operation transmission parameters, the keywords, and the description text are used as the expected output values;
[0171] The input values are fed into the initial intelligent matching model and the deep reinforcement learning algorithm, respectively;
[0172] Based on the actual output values and the expected output values, the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters are evaluated, and any matching transmission parameters, sample transmission parameters, and operation transmission parameters for which the difference between the actual output value and the expected output value is greater than the preset value are deleted.
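The threshold-based deletion in the last step can be illustrated with the following minimal sketch. It covers only the comparison of actual and expected outputs; the deep reinforcement learning algorithm itself is not shown. The `Submission` record, the numeric encoding of the parameters, and the mean-absolute-difference measure are assumptions made for this example, since this application does not define the difference measure.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Submission:
    device_id: str
    inputs: Sequence[float]    # encoded sample + matching transmission parameters
    expected: Sequence[float]  # encoded operation transmission parameters


def mean_abs_diff(actual, expected):
    """Assumed difference measure: mean absolute error between two vectors."""
    return sum(abs(a - e) for a, e in zip(actual, expected)) / len(actual)


def screen(submissions, model: Callable, threshold: float):
    """Keep submissions whose model output stays within `threshold` of the
    expected output; those exceeding the preset value are dropped."""
    kept = []
    for s in submissions:
        actual = model(s.inputs)
        if mean_abs_diff(actual, s.expected) <= threshold:
            kept.append(s)
    return kept


# Stand-in for the initial intelligent matching model (doubles each input).
model = lambda xs: [2 * x for x in xs]
subs = [
    Submission("device-a", [1.0, 2.0], [2.0, 4.0]),  # difference 0.0
    Submission("device-b", [1.0, 2.0], [5.0, 9.0]),  # difference 4.0
]
kept = screen(subs, model, threshold=0.5)
```

Only the parameters from `device-a` survive, because the output for `device-b` deviates from its expected value by more than the preset threshold.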
[0173] Optionally, the matching device is used for:
[0174] Set the first parameter matrix to the first influence coefficient;
[0175] Set the second parameter matrix as the second influence coefficient;
[0176] Adjust the parameter matrix of the intelligent matching model based on the first and second influence coefficients;
[0177] The sample transmission parameters and the matching transmission parameters are used as the input values of the training samples, and the operation transmission parameters, the keywords, and the description text are used as the expected output values to train the intelligent matching model, thereby obtaining the updated intelligent matching model.
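One possible reading of the adjustment based on the two influence coefficients is a weighted blend of the locally configured parameter matrix with the received first and second parameter matrices. The sketch below illustrates that reading only. The scalar weights `alpha` and `beta`, the convex-combination rule, and the function `blend` are assumptions; this application does not state how the influence coefficients are applied.

```python
def blend(local, first, second, alpha=0.3, beta=0.3):
    """Blend three equally shaped matrices (lists of lists of floats).

    alpha weights the first parameter matrix and beta the second;
    the remainder (1 - alpha - beta) stays with the local matrix.
    """
    if alpha < 0 or beta < 0 or alpha + beta > 1:
        raise ValueError("alpha and beta must be non-negative and sum to at most 1")
    w = 1.0 - alpha - beta
    return [
        [w * l + alpha * f + beta * s for l, f, s in zip(l_row, f_row, s_row)]
        for l_row, f_row, s_row in zip(local, first, second)
    ]


adjusted = blend([[1.0, 0.0]], [[0.0, 1.0]], [[1.0, 1.0]], alpha=0.25, beta=0.25)
```

The adjusted matrix would then serve as the starting point for the training step described in the preceding paragraph.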
[0178] As shown in Figure 5, a device for constructing an intelligent matching model using deep reinforcement learning is provided. The device includes:
[0179] The acquisition module is used to retrieve and match candidate information through an intelligent matching model, and then obtain matching information, sample information, and operation information corresponding to the matching device.
[0180] The processing module is used to generate matching transmission parameters based on the matching information;
[0181] The processing module is also used to generate sample transmission parameters based on the sample information;
[0182] The processing module is also used to generate operation transmission parameters based on the operation information;
[0183] The transmission module is used to transmit the matching transmission parameters, sample transmission parameters, and operation transmission parameters to the central device, respectively.
[0184] The processing module is also used to train the configured intelligent matching model when it is in an idle state, based on the filtered matching transmission parameters, the filtered sample transmission parameters, and the filtered operation transmission parameters, to obtain an updated intelligent matching model.
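The idle-state condition on training can be sketched as a simple queue that is drained only while the device is not serving a user. The class `MatchingDevice`, its `busy` flag, and the periodic `tick` call are hypothetical names for this illustration; how idleness is detected is not specified in this application.

```python
from collections import deque


class MatchingDevice:
    """Queues filtered parameter batches and trains on them only when idle."""

    def __init__(self, train_fn):
        self._pending = deque()
        self._train_fn = train_fn  # callable that updates the local model
        self.busy = False          # True while a user is running a match
        self.updates_applied = 0

    def receive(self, filtered_batch):
        """Store a batch of filtered transmission parameters for later training."""
        self._pending.append(filtered_batch)

    def tick(self):
        """Called periodically; trains on at most one queued batch if idle."""
        if self.busy or not self._pending:
            return False
        batch = self._pending.popleft()
        self._train_fn(batch)
        self.updates_applied += 1
        return True


trained_on = []
device = MatchingDevice(trained_on.append)
device.receive({"batch": 1})
device.busy = True
skipped = device.tick()   # device is busy, so nothing is trained
device.busy = False
applied = device.tick()   # device is idle, so the queued batch is consumed
```

Deferring the update in this way keeps training from competing with an active matching session on the same device.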
[0185] Optionally, the processing module is used for:
[0186] The intelligent matching model is constructed using a wide neural network;
[0187] The intelligent matching model is trained based on the preset candidate information and the corresponding matching information.
[0188] Optionally, obtaining the matching information, sample information, and operation information corresponding to the matching device, after any matching device retrieves and matches candidate information through the intelligent matching model, includes:
[0189] After the user matches candidate information using keywords and description text through an intelligent matching model, the operations performed on the candidate information are recorded. These operations include at least marking, deleting, and adding to the candidate list.
[0190] Generate sample information based on keywords, description text, and matched candidate information;
[0191] Based on the first parameter matrix of the intelligent matching model, matching information is generated;
[0192] Based on the marking, deletion, and adding-to-candidate-list operations, generate the operation information corresponding to the keywords, the description text, and the matched candidate information, respectively.
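The recording of user operations and the derivation of operation information from them can be illustrated as follows. The `Operation` enumeration, the `MatchSession` class, and the dictionary layout returned by `operation_info` are assumptions introduced for this sketch and are not part of the described embodiments.

```python
from dataclasses import dataclass, field
from enum import Enum


class Operation(Enum):
    MARK = "mark"
    DELETE = "delete"
    ADD_TO_LIST = "add_to_candidate_list"


@dataclass
class MatchSession:
    keywords: list[str]
    description: str
    matched_ids: list[int]
    log: list[tuple[int, Operation]] = field(default_factory=list)

    def record(self, candidate_id, op):
        """Record one user operation on a candidate returned by the match."""
        if candidate_id not in self.matched_ids:
            raise KeyError(f"candidate {candidate_id} was not matched")
        self.log.append((candidate_id, op))

    def operation_info(self):
        """Group recorded operations by type, tied to the query that produced them."""
        grouped = {op.value: [] for op in Operation}
        for candidate_id, op in self.log:
            grouped[op.value].append(candidate_id)
        return {
            "keywords": self.keywords,
            "description": self.description,
            "operations": grouped,
        }


session = MatchSession(["python"], "backend role", [1, 2, 3])
session.record(1, Operation.MARK)
session.record(2, Operation.DELETE)
session.record(1, Operation.ADD_TO_LIST)
info = session.operation_info()
```

Grouping by operation type keeps the marking, deletion, and adding-to-candidate-list records separate, which matches the "respectively" wording of the step above.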
[0193] Optionally, the processing module is used for:
[0194] Based on the first parameter matrix, the keywords, and the description text, generate the corresponding matching transmission parameters;
[0195] The matching transmission parameters are transmitted to the central device.
[0196] Optionally, the processing module is used for:
[0197] Keyword identification and extraction are performed on the candidate information that remains after the deletion operation, and the first sample transmission parameters are generated based on the first extraction results, the keywords, and the description text;
[0198] Keyword identification and extraction are performed on the candidate information that was deleted from the matched candidate information, and the second sample transmission parameters are generated based on the second extraction results, the keywords, and the description text;
[0199] The first sample transmission parameters and the second sample transmission parameters are combined into the sample transmission parameters and transmitted to the central device.
[0200] Optionally, the processing module is used for:
[0201] Obtain the final candidate information after the operation is performed;
[0202] The intelligent matching model is trained based on the final candidate information, keywords, and description text.
[0203] Based on the second parameter matrix corresponding to the trained intelligent matching model, as well as the keywords and the description text, the operation transmission parameters are generated;
[0204] The operation transmission parameters are transmitted to the central device.
[0205] Optionally, the processing module is used for:
[0206] Set the first parameter matrix to the first influence coefficient;
[0207] Set the second parameter matrix as the second influence coefficient;
[0208] Adjust the parameter matrix of the intelligent matching model based on the first and second influence coefficients;
[0209] The sample transmission parameters and the matching transmission parameters are used as the input values of the training samples, and the operation transmission parameters, the keywords, and the description text are used as the expected output values to train the intelligent matching model, thereby obtaining the updated intelligent matching model.
[0210] The above specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.
[0211] The technical features of the above embodiments can be combined in any way (as long as there is no contradiction in the combination of these technical features). For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; these embodiments not explicitly written should also be considered to be within the scope of this specification.
[0212] The present invention has been described in detail above through general description and specific embodiments. It should be noted that, without departing from the concept of the present invention, various modifications and improvements can be made to these specific embodiments, all of which fall within the scope of protection of this application. Therefore, the scope of protection of this patent application should be determined by the appended claims.
[0213] The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A method for constructing an intelligent matching model using deep reinforcement learning, characterized in that the method is applied to a distributed candidate information intelligent matching system, the system including multiple matching devices, each of which is configured with an intelligent matching model, and the method includes: after any matching device retrieves and matches candidate information through the intelligent matching model, obtaining the matching information, sample information, and operation information corresponding to the matching device; generating matching transmission parameters based on the matching information; generating sample transmission parameters based on the sample information; generating operation transmission parameters based on the operation information; transmitting the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters to the central device, respectively; upon receiving the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters, the central device performs filtering through preset deep reinforcement learning; when other matching devices are idle, they train the configured intelligent matching model based on the filtered matching transmission parameters, the filtered sample transmission parameters, and the filtered operation transmission parameters to obtain an updated intelligent matching model; wherein obtaining the matching information, sample information, and operation information corresponding to the matching device, after any matching device retrieves and matches candidate information through the intelligent matching model, includes: after the user matches candidate information using keywords and description text through the intelligent matching model, recording the operations performed on the candidate information, the operations including at least marking, deleting, and adding to the candidate list; generating the sample information based on the keywords, the description text, and the matched candidate information; generating the matching information based on the first parameter matrix of the intelligent matching model; and generating, based on the marking operation, the deletion operation, and the adding-to-candidate-list operation, operation information corresponding to the keywords, the description text, and the matched candidate information, respectively.
2. The method according to claim 1, characterized in that the method further includes: constructing the intelligent matching model using a wide neural network; and training the intelligent matching model based on preset candidate information and corresponding matching information.
3. The method according to claim 2, characterized in that the step of generating matching transmission parameters based on the matching information includes: generating the corresponding matching transmission parameters based on the first parameter matrix, the keywords, and the description text; and transmitting the matching transmission parameters to the central device.
4. The method according to claim 1, characterized in that the step of generating sample transmission parameters based on the sample information includes: performing keyword identification and extraction on the candidate information that remains after the deletion operation, and generating a first sample transmission parameter based on the first extraction result, the keywords, and the description text; performing keyword identification and extraction on the candidate information for which the deletion operation was performed among the matched candidate information, and generating a second sample transmission parameter based on the second extraction result, the keywords, and the description text; and combining the first sample transmission parameter and the second sample transmission parameter into the sample transmission parameters and transmitting them to the central device.
5. The method according to claim 1, characterized in that the step of generating operation transmission parameters based on the operation information includes: obtaining the final candidate information after the operations are performed; training the intelligent matching model based on the final candidate information, the keywords, and the description text; generating the operation transmission parameters based on the second parameter matrix corresponding to the trained intelligent matching model, as well as the keywords and the description text; and transmitting the operation transmission parameters to the central device.
6. The method according to claim 5, characterized in that the step in which the central device, upon receiving the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters, performs screening through preset deep reinforcement learning includes: setting the first parameter matrix and the second parameter matrix as the parameter matrices of the initial intelligent matching model, respectively; using the sample transmission parameters and the matching transmission parameters as the input values of the training samples, and using the operation transmission parameters, the keywords, and the description text as the expected output values; feeding the input values into the initial intelligent matching model and the deep reinforcement learning algorithm, respectively; and evaluating the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters based on the actual output values and the expected output values, and deleting any matching transmission parameters, sample transmission parameters, and operation transmission parameters for which the difference between the actual output value and the expected output value is greater than a preset value.
7. The method according to claim 6, characterized in that the step of training the configured intelligent matching model based on the filtered matching transmission parameters, the filtered sample transmission parameters, and the filtered operation transmission parameters to obtain the updated intelligent matching model includes: setting the first parameter matrix as the first influence coefficient; setting the second parameter matrix as the second influence coefficient; adjusting the parameter matrix of the intelligent matching model based on the first influence coefficient and the second influence coefficient; and using the sample transmission parameters and the matching transmission parameters as the input values of the training samples, and the operation transmission parameters, the keywords, and the description text as the expected output values, to train the intelligent matching model, thereby obtaining the updated intelligent matching model.
8. A system for constructing an intelligent matching model using deep reinforcement learning, characterized in that the system includes multiple matching devices, each configured with an intelligent matching model, wherein: the matching device is used for: after retrieving and matching candidate information through the intelligent matching model, obtaining the matching information, sample information, and operation information corresponding to the matching device; generating matching transmission parameters based on the matching information; generating sample transmission parameters based on the sample information; generating operation transmission parameters based on the operation information; and transmitting the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters to the central device, respectively; the central device is used to perform filtering through preset deep reinforcement learning upon receiving the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters; other matching devices are used to train the configured intelligent matching model, when idle, based on the filtered matching transmission parameters, the filtered sample transmission parameters, and the filtered operation transmission parameters, to obtain an updated intelligent matching model; and the matching device is specifically used for: after the user matches candidate information using keywords and description text through the intelligent matching model, recording the operations performed on the candidate information, the operations including at least marking, deleting, and adding to the candidate list; generating the sample information based on the keywords, the description text, and the matched candidate information; generating the matching information based on the first parameter matrix of the intelligent matching model; and generating, based on the marking operation, the deletion operation, and the adding-to-candidate-list operation, operation information corresponding to the keywords, the description text, and the matched candidate information, respectively.
9. A device for constructing an intelligent matching model using deep reinforcement learning, characterized in that the device includes: an acquisition module, used to retrieve and match candidate information through an intelligent matching model and then obtain the matching information, sample information, and operation information corresponding to the matching device; a processing module, used to generate matching transmission parameters based on the matching information, to generate sample transmission parameters based on the sample information, and to generate operation transmission parameters based on the operation information; and a transmission module, used to transmit the matching transmission parameters, the sample transmission parameters, and the operation transmission parameters to the central device, respectively; wherein, when in an idle state, the configured intelligent matching model is trained based on the filtered matching transmission parameters, the filtered sample transmission parameters, and the filtered operation transmission parameters to obtain an updated intelligent matching model; and the acquisition module is specifically used for: after the user matches candidate information using keywords and description text through the intelligent matching model, recording the operations performed on the candidate information, the operations including at least marking, deleting, and adding to the candidate list; generating the sample information based on the keywords, the description text, and the matched candidate information; generating the matching information based on the first parameter matrix of the intelligent matching model; and generating, based on the marking operation, the deletion operation, and the adding-to-candidate-list operation, operation information corresponding to the keywords, the description text, and the matched candidate information, respectively.