A semantic-based job information classification and identification method
A semantic model and a pressure reduction model are built under a closed-loop feedback training mechanism to generate job codes automatically. This addresses the interpretation bias and low efficiency of manual review, improves the accuracy and consistency of job codes, and reduces credit risk.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHANGHAI PUDONG DEVELOPMENT BANK
- Filing Date
- 2022-10-26
- Publication Date
- 2026-06-19
Figure CN115640812B_ABST
Description
Technical Field
[0001] This invention relates to the field of natural language processing technology, and in particular to a semantic-based method for classifying and recognizing job information.
Background Technology
[0002] In current financial business scenarios, it is often necessary to predict and assess customer risk. Customer industry and occupation information are crucial indicators for this assessment. By appropriately categorizing customers by industry and occupation, the accuracy of risk assessment can be effectively improved. For example, in banking, different industries and occupations affect customer credit limits; therefore, accurate job codes can enhance the accuracy of credit granting by banks.
[0003] Currently, the common industry practice is to categorize similar industries and job titles according to credit policies, and then manually review the occupational information filled in on the customer application form and assign a code to it. For example, if the applicant's job title is a regular employee in administration or human resources, their job code is 551; if the applicant's job title is the general manager of administration or human resources, their job code is 553. Generally, customers with higher positions will have higher credit limits. Therefore, efficiently and accurately determining a customer's job code is crucial in the customer information review process.
[0004] However, the existing manual review methods have the following main drawbacks:
[0005] Manual review relies on human experience, which can lead to misunderstandings and inconsistent judgment standards. Different reviewers may reach different conclusions when dealing with the same customer information.
[0006] Manual review is prone to confusion because the differences between certain professions are not obvious, making it easy to overlook them. For example, the general manager of a private enterprise and the general manager of a listed company may have the same job title, but their occupational codes are different. Manual review can easily lead to misjudgment if not careful.
[0007] Manual review is inefficient. Internal reviewers need to review a large number of customer applications every day. If they rely solely on manual retrieval of customer industry and occupation information and manual assessment, it will inevitably lead to a long processing time for each application.
Summary of the Invention
[0008] The purpose of this invention is to overcome the shortcomings of the existing technology by providing a semantic-based job information classification and recognition method that can ensure review consistency, reduce the risk of misjudgment, and improve review efficiency.
[0009] The objective of this invention can be achieved through the following technical solution: a semantic-based job information classification and recognition method, comprising the following steps:
[0010] S1. Obtain the information filled in on users' historical forms;
[0011] S2. Based on the historical form information, combined with the corresponding historical job codes, construct a semantic model through closed-loop feedback training;
[0012] S3. Based on the historical form information, further construct a pressure reduction model;
[0013] S4. Input the current user's information into the semantic model and output an initial job code;
[0014] Correct the initial job code using the pressure reduction model and output the final job code.
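The two-stage flow in the steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the record keys, and both toy models are assumptions introduced here. Only the job codes 551 and 553 come from the background example, and the branch rule in the toy correction model is invented purely to show where the correction happens.

```python
from typing import Callable, Dict

# Hypothetical record type; the keys mirror the form fields named in the text.
Record = Dict[str, str]


def classify_job(
    record: Record,
    semantic_model: Callable[[str, str], str],
    correction_model: Callable[[str, Record], str],
) -> str:
    """Stage 1 predicts an initial code from department and job title;
    stage 2 corrects it using the company-level fields of the same record."""
    initial_code = semantic_model(record["department_name"], record["job_title"])
    return correction_model(initial_code, record)


def toy_semantic_model(department: str, title: str) -> str:
    # Stand-in for the trained semantic model (illustrative only).
    return "553" if "general manager" in title.lower() else "551"


def toy_correction_model(initial_code: str, record: Record) -> str:
    # Invented rule for illustration: lower the code for branch-level entities.
    if initial_code == "553" and "branch" in record["company_name"].lower():
        return "551"
    return initial_code
```

Passing the two models in as callables keeps the stages independent, so either one can be retrained or replaced without touching the other.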
[0015] Furthermore, the historical form information includes the company name, enterprise type, registered capital, industry type, department name, and job title.
[0016] Furthermore, the semantic model constructed in step S2 takes the department name and job title as input and outputs the initial job code.
[0017] Furthermore, step S2 specifically includes the following steps:
[0018] S21. Extract the department name and job title data from the user history table to use as input data;
[0019] S22. Select labels from the corresponding historical job codes, combine them with the input data, complete the closed-loop feedback training of the model, and construct the semantic model.
[0020] Furthermore, step S22 specifically includes the following steps:
[0021] S221. According to the set iteration update cycle, obtain the historical job code data corresponding to the input data;
[0022] S222. Select labels from the acquired historical job code data;
[0023] S223. Using the input data and corresponding labels, train the model to obtain a semantic model.
[0024] Furthermore, in step S222, the code with the highest proportion is selected from the historical job code data to be used as a label, while the remaining codes are retained.
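The label selection rule in step S222 can be illustrated with a short sketch. It assumes historical records arrive as (department name, job title, job code) tuples; that layout, the function name, and the tie-breaking behaviour (the code seen first wins) are assumptions, since the text does not specify them.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (department name, job title)


def select_labels(
    history: Iterable[Tuple[str, str, str]],
) -> Tuple[Dict[Key, str], Dict[Key, List[str]]]:
    """Return the majority code per (department, title) as the training label,
    plus the remaining codes retained for the later correction stage."""
    counts: Dict[Key, Counter] = defaultdict(Counter)
    for department, title, code in history:
        counts[(department, title)][code] += 1

    labels: Dict[Key, str] = {}
    remaining: Dict[Key, List[str]] = {}
    for key, counter in counts.items():
        # most_common(1) yields the code with the highest count; ties are
        # resolved in favour of the code encountered first.
        label = counter.most_common(1)[0][0]
        labels[key] = label
        remaining[key] = sorted(code for code in counter if code != label)
    return labels, remaining
```

The retained codes are returned separately because the text uses them as the starting point for building the pressure reduction model.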
[0025] Furthermore, the semantic model specifically adopts the BERT model architecture.
[0026] Furthermore, step S3 specifically includes the following steps:
[0027] S31. For the remaining codes, extract the corresponding company name, enterprise type, registered capital, and industry type data from the historical form information;
[0028] S32. Design the pressure reduction model, and statistically analyze the data extracted in step S31 to determine the expert experience and data that the pressure reduction model requires.
[0029] Furthermore, step S31 specifically extracts the keywords in the corresponding company name, together with the enterprise type, registered capital, and industry type data, from the historical form information.
[0030] Furthermore, keywords in the company name include, but are not limited to, branch, sub-branch, and branch company.
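Keyword extraction from a company name could look like the sketch below. The keyword list uses the English renderings given in the text; matching on English substrings is an assumption made for illustration, and a production system would match the original-language terms. The function name is hypothetical.

```python
from typing import List

# Illustrative keyword list. Longer keywords come first so that "branch"
# is not also reported when it is only part of a longer keyword.
KEYWORDS = ("sub-branch", "branch company", "branch")


def extract_name_keywords(company_name: str) -> List[str]:
    """Return the organisational-level keywords found in a company name."""
    name = company_name.lower()
    found: List[str] = []
    for keyword in KEYWORDS:
        if keyword in name:
            found.append(keyword)
            # Blank out the match so shorter keywords do not re-match it.
            name = name.replace(keyword, " ")
    return found
```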
[0031] Compared with existing technologies, this invention constructs a semantic model and a pressure reduction model based on historical form information and the corresponding historical job code data. The semantic model outputs an initial job code, and the pressure reduction model corrects the initial job code to obtain the final job code. This realizes an intelligent, automatic job code generation solution that can effectively save labor costs, improve approval efficiency, and avoid the inconsistency and misjudgment problems caused by manual review. In addition, the pressure reduction model can prevent excessive credit being granted because of inaccurate job codes, reduce approval risks, and improve the accuracy of the generated job codes.
[0032] When constructing the semantic model, this invention uses the commonly used department names and job titles from historical data as inputs, and uses the code with the highest proportion in the corresponding historical job code data as the label, which helps ensure the accuracy of the semantic model. When constructing the pressure reduction model, it takes the quantitative indicators (enterprise type, registered capital, and industry type) and the company-name keywords corresponding to the remaining codes in the historical job code data, and determines through statistical analysis the expert experience and data that the pressure reduction model requires, which helps ensure the accuracy of the pressure reduction model.
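One plausible form of the statistical analysis described above is a per-code summary of the company-level fields. The sketch below is an assumption about what such a summary might contain; the text does not specify the statistics used, and the record keys and function name are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import median
from typing import Any, Dict, Iterable, List


def summarize_by_code(records: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Group records by job code and summarise enterprise type and capital."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for record in records:
        grouped[record["job_code"]].append(record)

    summary: Dict[str, Dict[str, Any]] = {}
    for code, recs in grouped.items():
        summary[code] = {
            "count": len(recs),
            # How often each enterprise type occurs under this code.
            "enterprise_types": Counter(r["enterprise_type"] for r in recs),
            # Typical registered capital for applicants given this code.
            "median_registered_capital": median(
                r["registered_capital"] for r in recs
            ),
        }
    return summary
```

A reviewer could read such a table to decide, for example, which enterprise types or capital ranges tend to go with a lower code, and turn those observations into correction rules.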
[0033] According to a set iterative update cycle, the present invention updates the historical job code data corresponding to the input data, and then constructs the semantic model and the pressure reduction model in sequence, thereby forming a closed-loop feedback training mechanism that continuously improves the accuracy of the constructed models.
Attached Figure Description
[0034] Figure 1 is a schematic diagram of the method flow of the present invention;
[0035] Figure 2 is a schematic diagram of the application process in an embodiment.
Detailed Implementation
[0036] The present invention will now be described in detail with reference to the accompanying drawings and specific embodiments.
[0037] Example
[0038] As shown in Figure 1, a semantic-based job information classification and recognition method includes the following steps:
[0039] S1. Obtain user history table information, which includes company name, enterprise type, registered capital, industry type, department name, and job title;
[0040] S2. Based on the information entered into the user's historical data table, combined with the corresponding historical job codes, a semantic model is constructed through closed-loop feedback training. Specifically:
[0041] First, extract the department name and job title data from the user history table as input data;
[0042] Then, labels are selected from the corresponding historical job codes and combined with the input data to complete the closed-loop feedback training of the model and construct the semantic model. The input of the semantic model is the department name and job title, and the output is the initial job code.
[0043] In practical applications, to ensure the accuracy of the model construction, the historical job code data corresponding to the input data can be updated according to the set iteration update cycle; labels are selected from the acquired historical job code data (specifically, the code with the highest proportion is selected from the historical job code data as the label, and the remaining codes are retained); the input data and the corresponding labels are used to train the model to obtain the semantic model.
[0044] In this embodiment, the semantic model adopts the BERT model architecture;
[0045] S3. Based on the information filled in the user's historical data table, further construct the pressure reduction model, specifically:
[0046] First, for the remaining codes, extract the corresponding company-name keywords (including but not limited to branch, sub-branch, and branch company), enterprise type, registered capital, and industry type data from the information filled in the user history table.
[0047] Next, the pressure reduction model is designed, and the data extracted in the previous step is statistically analyzed to determine the expert experience and data that the pressure reduction model requires.
[0048] S4. Input the current user information into the semantic model and output the initial job code;
[0049] The initial job code is corrected using the pressure reduction model, and the final job code is output.
[0050] This embodiment applies the above technical solution to credit card approval. As shown in Figure 2, the process mainly includes the following:
[0051] 1) Training data is generated based on historical data from the credit card center, mainly the information filled in by users when applying for credit cards, including company name, type of enterprise, registered capital, industry, department, position, etc.
[0052] 2) The model is trained using the department and job title filled in by the customer as inputs, and the approved job codes as labels. A semantic model based on BERT is selected. This model has general semantic understanding capabilities, and after fine-tuning it can better understand the data in the credit card approval scenario. Because the semantic model takes only two inputs, department and job title, and does not consider data such as company name, enterprise type, registered capital, and industry type, the same department and job title may correspond to multiple job codes in the historical data. When selecting labels, the code with the highest proportion in the historical data is therefore chosen as the label, and the remaining codes are handled by the subsequent pressure reduction model.
[0053] 3) After obtaining the initial job code in step 2), the initial code is subsequently modified using a pressure reduction model based on data such as company name, company type, registered capital, and industry. This technical solution designs the pressure reduction model for two purposes:
[0054] A) Reduce users' credit limits to prevent excessive credit. For example, the job codes for department managers at a bank's head office, branches, and sub-branches should decrease progressively. The same applies to department managers at a company's headquarters, city-level branches, and district / county-level branches. Alternatively, the codes for the same job titles may differ between companies with higher registered capital and companies with lower registered capital, requiring different mapping methods.
[0055] B) Handle the situation in step 2) where the same department and position correspond to multiple job codes. For example, office employees in government departments have special codes, which are different from the codes corresponding to office employees in ordinary companies.
[0056] By statistically analyzing, for each code in the historical data, the keywords in the corresponding company names (such as branch, sub-branch, and branch company) together with the enterprise type, registered capital, and industry type, the expert experience and data required for the pressure reduction model can be obtained.
[0057] 4) Based on the semantic model in step 2) and the pressure reduction model in step 3), the final job code is obtained. When approving credit cards, the reviewers refer to the model results to determine the final code. If the manual judgment differs from the model's result, the record is added to the training set for the weekly model iteration update. This forms a closed-loop feedback training mechanism, allowing the model's accuracy to improve continuously.
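The feedback loop in step 4) can be sketched as a small buffer that collects only the cases where the reviewer disagrees with the model and hands them to the next scheduled retraining run. The class and method names are hypothetical, and the retraining itself is left out; this only illustrates the data flow described above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Sample = Tuple[str, str, str]  # (department name, job title, reviewer's code)


@dataclass
class FeedbackBuffer:
    """Collects reviewer corrections between model iteration updates."""

    pending: List[Sample] = field(default_factory=list)

    def record_review(
        self, department: str, title: str, model_code: str, reviewer_code: str
    ) -> bool:
        """Store the sample if the reviewer overrode the model's code."""
        if reviewer_code != model_code:
            self.pending.append((department, title, reviewer_code))
            return True
        return False

    def drain_for_retraining(self) -> List[Sample]:
        """Return the accumulated corrections and reset the buffer."""
        batch, self.pending = self.pending, []
        return batch
```

In the embodiment's setting, `drain_for_retraining` would be called once per weekly iteration, and the returned batch merged into the training set.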
[0058] In summary, this embodiment designs a semantic model based on the commonly used department and job information in the historical review data of the credit card center, which can intelligently generate initial job codes, greatly saving manpower and improving accuracy.
[0059] On the other hand, based on quantitative indicators such as the enterprise type, registered capital, and industry type, as well as keywords in the company name, a pressure reduction model was designed to follow the semantic model. This model can prevent excessive credit limits caused by inaccurate job codes, reduce credit card approval risks, and apply an accurate downward correction to job codes.
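A rule-based reading of the downward correction might look like the following sketch. Everything specific in it is an assumption: the code ladder, the number of steps per keyword, and the function name are placeholders, since the text states only that codes should decrease from head office to branch to sub-branch and leaves the concrete rules to expert experience and statistics.

```python
from typing import Sequence

# Hypothetical ladder of codes for one job family, ordered lowest to highest.
LADDER = ("level_1", "level_2", "level_3")

# Hypothetical number of steps to move down for each company-name keyword.
# The longer keyword is listed first so it is matched before "branch".
STEPS_DOWN = (("sub-branch", 2), ("branch", 1))


def downgrade(
    initial_code: str, company_name: str, ladder: Sequence[str] = LADDER
) -> str:
    """Lower the initial code according to the organisational level implied
    by the company name; codes outside the ladder are returned unchanged."""
    if initial_code not in ladder:
        return initial_code
    name = company_name.lower()
    steps = 0
    for keyword, step_count in STEPS_DOWN:
        if keyword in name:
            steps = step_count
            break
    # Never go below the lowest code on the ladder.
    index = max(ladder.index(initial_code) - steps, 0)
    return ladder[index]
```

Registered capital and enterprise type could be added as further conditions in the same way, for instance by mapping capital ranges to additional steps.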
[0060] Furthermore, the closed-loop feedback training mechanism can form a virtuous cycle. As the model is continuously applied and more and more data is accumulated, the model accuracy will continue to improve.
[0061] This technical solution can reduce the risk of human error in the approval process and improve approval efficiency; while model-based judgment and recognition can ensure the consistency of judgment standards and reduce deviations caused by human factors.
Claims
1. A semantic-based method for classifying and recognizing job information, characterized in that it includes the following steps:
S1. Obtain user historical data, which includes company name, enterprise type, registered capital, industry type, department name, and job title;
S2. Based on the information filled in the user's historical table, combined with the corresponding historical job codes, construct a semantic model through closed-loop feedback training, where the semantic model takes the department name and job title as input and outputs the initial job code;
Step S2 specifically includes the following steps:
S21. Extract the department name and job title data from the user history table to use as input data;
S22. Select labels from the corresponding historical job codes, combine them with the input data, complete the closed-loop feedback training of the model, and construct the semantic model;
Step S22 specifically includes the following steps:
S221. According to the set iteration update cycle, obtain the historical job code data corresponding to the input data;
S222. Select labels from the acquired historical job code data; specifically, select the code with the highest proportion in the historical job code data as the label, and retain the remaining codes;
S223. Using the input data and corresponding labels, train the model to obtain the semantic model;
S3. Based on the information filled in the user's historical data, further construct the pressure reduction model;
Step S3 specifically includes the following steps:
S31. For the remaining codes, extract the corresponding company name, enterprise type, registered capital, and industry type data from the information filled in the user history table;
S32. Design the pressure reduction model and perform statistical analysis on the data extracted in step S31 to determine the expert experience and data required for the pressure reduction model;
S4. Input the current user information into the semantic model and output the initial job code;
The initial job code is corrected using the pressure reduction model, and the final job code is output.
2. The semantic-based job information classification and recognition method according to claim 1, characterized in that the semantic model specifically adopts the BERT model architecture.
3. The semantic-based job information classification and recognition method according to claim 1, characterized in that step S31 specifically extracts the keywords in the corresponding company name, together with the enterprise type, registered capital, and industry type data, from the information filled in the user history table.
4. The semantic-based job information classification and recognition method according to claim 3, characterized in that the keywords in the company name include, but are not limited to, branch, sub-branch, and branch company.