Data processing methods and related devices

By comparing and classifying credit data and using Kafka + Flink for real-time processing, the method relieves the load on the server when handling large volumes of credit data and improves the accuracy and efficiency of data processing.

CN115170376B · Active · Publication Date: 2026-06-30 · SHENZHEN WEIZHONG TAXATION INFORMATION SERVICE CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHENZHEN WEIZHONG TAXATION INFORMATION SERVICE CO LTD
Filing Date
2022-07-26
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In existing technologies, the amount of credit data collected by servers in real time is large and complex, resulting in high processing requirements. How to effectively classify and process this data has become a challenge.

Method used

The content information of the credit data to be processed is compared with historical data to determine which data matched and which did not. For data that failed to match, the required accuracy and timeliness values are determined according to the data source and data type. Kafka + Flink is used for real-time processing to establish a standardized credit data processing workflow.

Benefits of technology

It enables the classification and processing of credit data, reduces the real-time data processing pressure on the server, and improves the accuracy and efficiency of data processing.

✦ Generated by Eureka AI based on patent content.

Abstract

This application discloses a data processing method and related apparatus. The method includes: acquiring credit information data to be processed; comparing the content information of the credit information data to be processed with the content information of a first set of historical credit information data to be processed; if a match is successful, determining the successfully matched credit information data to be processed as the first set of credit information data to be processed; if a match fails, determining the target accuracy requirement value and the target timeliness requirement value based on the data source and data type of the unmatched credit information data to be processed; comparing the target accuracy requirement value and the target timeliness requirement value to determine the second set of credit information data to be processed from the unmatched credit information data to be processed; processing the first set of credit information data and the second set of credit information data to be processed in real time to obtain the target credit information data; and sending the target credit information data to a user terminal. This application can identify data that needs to be processed in real time and can improve the accuracy of processing new content credit information data to be processed.

Description

Technical Field

[0001] This application relates to the field of data processing technology, specifically to a data processing method and related apparatus.

Background Technology

[0002] Credit data includes corporate credit information reflecting the creditworthiness of enterprises and personal credit information reflecting the creditworthiness of individuals. Servers collect credit data in real time and can process and analyze it. However, the volume and complexity of the credit data collected in real time are substantial. Processing every single piece of data in real time places high demands on the server's processing power. Therefore, how to classify and process the collected credit data remains a problem to be solved.

Summary of the Invention

[0003] This application provides a data processing method and related apparatus for classifying the credit data collected by a server for processing and improving the accuracy of the classification results.

[0004] In a first aspect, embodiments of this application provide a data processing method applied to a server, the method comprising:

[0005] Obtain credit information data to be processed, wherein the credit information data to be processed includes data source, data type, and content information;

[0006] By comparing the content information of the credit data to be processed with the content information of the first historical credit data to be processed, a first comparison result is obtained;

[0007] If the first comparison result is a successful match, then the credit data to be processed that is successfully matched is determined to be the first credit data to be processed;

[0008] If the first comparison result is a match failure, then based on the data source and data type of the credit information to be processed that failed to match, the target accuracy requirement value and the target timeliness requirement value of the credit information to be processed that failed to match are determined; and the target accuracy requirement value and the target timeliness requirement value are compared to obtain a second comparison result; and based on the second comparison result, the second credit information to be processed in the credit information to be processed that failed to match is determined.

[0009] The first credit data to be processed and the second credit data to be processed are processed in real time to obtain the target credit data;

[0010] Send the target credit data to the database.

[0011] In a second aspect, embodiments of this application provide a data processing apparatus applied to a server, the apparatus comprising:

[0012] An acquisition unit is used to acquire credit data to be processed, wherein the credit data to be processed includes data source, data type, and content information;

[0013] The comparison unit is used to compare the content information of the credit information to be processed with the content information of the first historical credit information to be processed to obtain a first comparison result; it is also used to compare the target accuracy requirement value with the target timeliness requirement value to obtain a second comparison result.

[0014] The determining unit is configured to, when the first comparison result is a successful match, determine the successfully matched credit information data to be processed as the first credit information data to be processed; and, when the first comparison result is a failed match, determine the target accuracy requirement value and target timeliness requirement value of the failed-matched credit information data to be processed based on the data source and data type of the credit information data to be processed; and further configured to, when the first comparison result is a failed match, determine the second credit information data to be processed from the failed-matched credit information data to be processed based on the second comparison result.

[0015] The processing unit is used to process the first credit data to be processed and the second credit data to be processed in real time to obtain the target credit data;

[0016] The sending unit is used to send the target credit data to the database.

[0017] In a third aspect, embodiments of this application provide a server including a processor, a memory, a communication interface, and one or more programs, the one or more programs being stored in the memory and configured to be executed by the processor, the programs including instructions for performing the steps of the method in the first aspect of the embodiments of this application.

[0018] In a fourth aspect, embodiments of this application provide a computer storage medium storing a computer program for electronic data interchange, wherein the computer program causes a computer to perform some or all of the steps described in the first aspect of the embodiments of this application.

[0019] As can be seen, in this embodiment, credit data to be processed is acquired, and its content information is compared with the content information of first historical credit data to be processed to obtain a first comparison result. If the first comparison result is a successful match, the successfully matched credit data to be processed is determined as first credit data to be processed. If the first comparison result is a failed match, the target accuracy requirement value and target timeliness requirement value of the credit data to be processed that failed to match are determined based on its data source and data type; the two values are compared to obtain a second comparison result; and second credit data to be processed is determined based on the second comparison result. The first and second credit data to be processed are processed in real time to obtain target credit data, which is sent to the database. The standardized credit data processing flow established by this application can thus classify the collected credit data to be processed and reduce the real-time processing load on the server. Furthermore, the solution can improve the accuracy with which collected credit data to be processed is classified, and the accuracy with which the processing method is determined for credit data in a new data format.

Attached Figure Description

[0020] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0021] Figure 1 is a schematic diagram of a network architecture provided in an embodiment of this application;

[0022] Figure 2 is a schematic diagram illustrating the composition of a server provided in an embodiment of this application;

[0023] Figure 3 is a flowchart illustrating a data processing method provided in an embodiment of this application;

[0024] Figure 4 is a functional unit block diagram of a data processing device provided in an embodiment of this application;

[0025] Figure 5 is a functional unit block diagram of another data processing device provided in an embodiment of this application.

Detailed Implementation

[0026] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present application, and not all embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present application.

[0027] The terms "first," "second," etc., in the specification, claims, and accompanying drawings of this application are used to distinguish different objects, not to describe a specific order. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or apparatus that includes a series of steps or units is not limited to the listed steps or units, but may optionally include steps or units not listed, or may optionally include other steps or units inherent to these processes, methods, products, or apparatuses.

[0028] In this document, the term "embodiment" means that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor is it a separate or alternative embodiment mutually exclusive with other embodiments. It will be explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments.

[0029] This application provides a data processing method and related apparatus. The embodiments of this application are described below with reference to the accompanying drawings.

[0030] Please refer to Figure 1, which is a schematic diagram of a network architecture provided in an embodiment of this application. As shown in Figure 1, the network architecture may include server 100 and a user terminal cluster. The user terminal cluster may include one or more user terminals; the number of user terminals is not limited here. As shown in Figure 1, the multiple user terminals may specifically include user terminal 200a, user terminal 200b, user terminal 200c, ..., user terminal 200n. User terminals 200a, 200b, 200c, ..., 200n can each connect to server 100 via a network, so that each user terminal can interact with server 100 through the network connection.

[0031] The server 100 shown in Figure 1 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.

[0032] The server 100 in this application can be structured as shown in Figure 2, which is an example diagram illustrating the composition of a server provided in an embodiment of this application. The server 100 may include a processor 110, a memory 120, a communication interface 130, and one or more programs 121, wherein the one or more programs 121 are stored in the memory 120 and configured to be executed by the processor 110, and the one or more programs 121 include instructions for performing any step in the following method embodiments.

[0033] The communication interface 130 is used to support communication between the server 100 and other devices. The processor 110 may be, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It can implement or execute various exemplary logic blocks, units, and circuits described in conjunction with the embodiments of this application. The processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a DSP and a microprocessor, etc.

[0034] The memory 120 can be volatile memory or non-volatile memory, or may include both. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate synchronous DRAM (DDR SDRAM), enhanced synchronous DRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).

[0035] In a specific implementation, the processor 110 is used to execute any step performed by the server in the following method embodiments, and when performing data transmission operations such as sending target credit data, it may choose to call the communication interface 130 to complete the corresponding operation.

[0036] It should be noted that the above schematic diagram of the server structure is only an example; an actual server may include more or fewer components, and no specific limitation is imposed here.

[0037] Please refer to Figure 3, which is a flowchart illustrating a data processing method provided in an embodiment of this application. This method can be executed by a server, for example the server 100 shown in Figure 1 or Figure 2. As shown in Figure 3, the data processing method includes:

[0038] S110, Obtain credit data to be processed.

[0039] The credit data to be processed can be all the data collected by the server in real time within a unit of time.

[0040] The credit information to be processed includes data source, data type, and content information.

[0041] In practice, the server can interface with third-party data source systems to collect and store credit data. Different third-party data source systems represent different data sources. Regarding data source system access, it can connect to systems such as invoice data, tax data, business registration data, judicial data, and intellectual property data to collect raw credit data from these systems. This means that data sources can include invoice data, tax data, business registration data, judicial data, and intellectual property data. When collecting credit data, the server can break down the raw credit data into processing data such as business relationships, judicial relationships, and invoice purchase and sale relationships for enterprises or individuals, and store this data in the database. The data types of the credit data to be processed stored in the database include, but are not limited to, user identity information, loan information, and credit card information, etc., and can be configured according to needs without further restrictions.
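The three fields named above (data source, data type, content information) can be pictured as a small record type. The following is only an illustrative sketch: the class name `CreditRecord`, the field names, and the lowercase source identifiers are assumptions made for the example, not structures defined by this application.

```python
from dataclasses import dataclass

# Data sources listed in the description; the identifier spellings are illustrative.
DATA_SOURCES = {
    "invoice",
    "tax",
    "business_registration",
    "judicial",
    "intellectual_property",
}


@dataclass(frozen=True)
class CreditRecord:
    """One piece of credit data to be processed (hypothetical layout)."""

    source: str     # third-party system the record was collected from
    data_type: str  # e.g. "user_identity", "loan", "credit_card"
    content: str    # raw content information

    def __post_init__(self):
        # Reject records whose source is not one of the connected systems.
        if self.source not in DATA_SOURCES:
            raise ValueError(f"unknown data source: {self.source}")


record = CreditRecord(
    source="tax",
    data_type="loan",
    content="Zhang San's tax to be paid in July 2022 is 2,000 yuan",
)
```

Because the data types are described as configurable, the sketch validates only the data source and leaves `data_type` as a free-form string.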

[0042] Furthermore, in practical applications, because servers typically collect credit data frequently in real time, the individual credit data from a given data source may change very little, or not at all, between one collection and the next. Not all collected credit data therefore needs to be treated as credit data to be processed. The server can collect only the credit data that has changed in each data source, thereby reducing its processing load. Alternatively, the server can collect all credit data at once and then deduplicate the collected data against the data already stored in the database. How the server collects credit data from different data sources can be configured according to actual needs, and no further restrictions are imposed here.
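The second option, collecting everything and then deduplicating against stored data, might look roughly like the sketch below. It assumes each record is a `(source, data_type, content)` tuple and that a set of SHA-256 fingerprints stands in for the database lookup; both are illustrative choices, since the application does not specify how deduplication is implemented.

```python
import hashlib


def fingerprint(source, data_type, content):
    """Return a stable digest of one record, used as the deduplication key."""
    joined = "\x1f".join((source, data_type, content))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def drop_known(batch, seen):
    """Return the records in `batch` whose fingerprint is not yet in `seen`.

    `seen` is updated in place, so duplicates inside the same batch and
    records from earlier batches are both filtered out.
    """
    fresh = []
    for record in batch:
        key = fingerprint(*record)
        if key not in seen:
            seen.add(key)
            fresh.append(record)
    return fresh


seen_fingerprints = set()  # stand-in for fingerprints of data already in the database
batch = [
    ("tax", "loan", "record A"),
    ("tax", "loan", "record A"),  # duplicate within the batch
    ("tax", "loan", "record B"),
]
new_records = drop_known(batch, seen_fingerprints)
```

Running `drop_known` a second time on the same batch returns an empty list, which corresponds to a collection round in which nothing changed.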

[0043] S120, compare the content information of the credit data to be processed with the content information of the first historical credit data to be processed to obtain the first comparison result.

[0044] The first historical credit data to be processed refers to historically collected credit data in the database that has been processed in real time.

[0045] In a specific implementation, comparing the content information of the credit data to be processed with the content information of the first historical credit data to be processed includes: performing word segmentation on the content information of the credit data to be processed to obtain a word segmentation result and the features corresponding to it; obtaining the word segmentation result of the first historical credit data to be processed and the features corresponding to it; and comparing the features of the credit data to be processed with the features of the first historical credit data to be processed. Each word segment in the word segmentation result can be marked as a feature, so that the features of the credit data to be processed can be compared with the features of each stored first historical credit data to be processed. This makes it possible to judge whether the database contains first historical credit data to be processed whose data format matches that of the credit data to be processed. For example, if the credit data to be processed is "Zhang San's tax to be paid in July 2022 is 2,000 yuan", the data format expressed by the features of its word segmentation result can be "name" + "time" + "tax to be paid" + "amount".
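As a rough illustration of comparing data formats rather than literal content, the sketch below labels a sentence with the four features from the example and checks whether the resulting format already exists among historical formats. The regular expressions are a stand-in for a real word segmenter and feature tagger, which the application does not specify; the names `FEATURE_PATTERNS`, `data_format`, and `matches_history` are hypothetical.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# (feature label, pattern) pairs; a toy substitute for word segmentation.
FEATURE_PATTERNS = [
    ("name", r"[A-Z][a-z]+ [A-Z][a-z]+(?='s)"),
    ("time", rf"(?:{MONTHS}) \d{{4}}"),
    ("tax to be paid", r"tax to be paid"),
    ("amount", r"\d[\d,]* yuan"),
]


def data_format(content):
    """Return the tuple of feature labels found in `content`."""
    return tuple(
        label for label, pattern in FEATURE_PATTERNS if re.search(pattern, content)
    )


def matches_history(content, known_formats):
    """True if the content yields a non-empty format that is already known."""
    fmt = data_format(content)
    return bool(fmt) and fmt in known_formats


# Formats of data that was previously processed in real time.
history = {data_format("Li Si's tax to be paid in June 2022 is 1,000 yuan")}
incoming = "Zhang San's tax to be paid in July 2022 is 2,000 yuan"
is_match = matches_history(incoming, history)
```

The two sentences differ in every concrete value, yet share the format "name" + "time" + "tax to be paid" + "amount", so the incoming record counts as a successful match.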

[0046] Furthermore, to simplify comparison and reduce storage space, stop words can be removed from each collected credit data to be processed before word segmentation is performed. Stop word removal deletes unimportant words (such as: of, its, for, etc.) and punctuation marks from each credit data to be processed. For example, if the credit data to be processed is "According to Zhang San's income in July 2022, the tax to be paid in that month is 2,000 yuan", the result after stop word removal may be "Zhang San's tax to be paid in July 2022 is 2,000 yuan". It should be noted that stop word removal is based on a stop word dictionary. This can be an existing stop word dictionary, or one trained by the server on the large amount of data stored in the database so that it suits the current application scenario.
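A dictionary-based stop word filter can be sketched as follows. The stop word set here is a tiny hand-written assumption ("of", "its", "for" from the text, plus "the" added for the example), and whitespace splitting stands in for proper word segmentation, so the sketch is not expected to reproduce the example sentence above exactly.

```python
import string

# Hypothetical stop word dictionary; a real one would be much larger.
STOP_WORDS = frozenset({"of", "its", "for", "the"})


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Drop stop words and surrounding punctuation from whitespace-split tokens."""
    kept = []
    for token in text.split():
        word = token.strip(string.punctuation)  # trims leading/trailing marks only
        if word and word.lower() not in stop_words:
            kept.append(word)
    return " ".join(kept)


cleaned = remove_stop_words("The income of Zhang San, for July 2022.")
```

Only leading and trailing punctuation is stripped, so values such as "2,000" keep their internal comma.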

[0047] S130, if the first comparison result is a successful match, determine the successfully matched credit data to be processed as the first credit data to be processed.

[0048] The first credit data to be processed is data that needs to be processed in real time. In this step, a successful match as the first comparison result indicates that the database contains a data format matching that of the current credit data to be processed. The current credit data to be processed can then be processed in the same manner as the first historical credit data to be processed that it matches, that is, in real time.

[0049] For example, if the credit information to be processed is "Zhang San's tax payment of 2000 yuan in July 2022", its various features are represented in the following format: name + time + tax payment + amount. The database stores the data "Li Si's tax payment of 1000 yuan in June 2022". Therefore, the data format of this first historical credit information to be processed is the same as that of the credit information to be processed. Thus, it can be concluded that the credit information to be processed and the first historical credit information to be processed are successfully matched, and this credit information to be processed is the first credit information to be processed that requires real-time processing.

[0050] S140, if the first comparison result is a matching failure, then based on the data source and data type of the credit information to be processed that failed to match, determine the target accuracy requirement value and the target timeliness requirement value of the credit information to be processed that failed to match.

[0051] When the first comparison result is a match failure, it indicates that the first historical credit data to be processed stored in the database contains nothing with the same data format as the current credit data to be processed. In this case, there are two possibilities: the credit data that failed to match is data that needs to be processed offline; or it is data in a new data format that does not yet exist in the database and needs to be processed in real time. To determine the processing method of the credit data that failed to match more accurately, this step S140 and the following steps S150 and S160 are performed on it.

[0052] The target accuracy requirement value is a concrete value representing the likelihood that the credit data to be processed needs to be processed offline, and the target timeliness requirement value is a concrete value representing the likelihood that it needs to be processed in real time.

[0053] S150, compare the target accuracy requirement value and the target timeliness requirement value to obtain a second comparison result.

[0054] Specifically, the target accuracy requirement value and the target timeliness requirement value can be compared to obtain a second comparison result. If the target accuracy requirement value is less than or equal to the target timeliness requirement value, then step S160 is executed to determine the credit data to be processed as the second credit data to be processed; if the target accuracy requirement value is greater than the target timeliness requirement value, then the credit data to be processed is determined as the third credit data to be processed, and the third credit data to be processed is processed offline to obtain offline credit data, which is then sent to the database.
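The comparison rule in this step reduces to a single inequality. A minimal sketch, with the function name `route` and the string labels chosen only for illustration:

```python
def route(accuracy_requirement, timeliness_requirement):
    """Pick the processing path for credit data that failed the first match.

    Accuracy <= timeliness: second credit data to be processed (real time).
    Accuracy >  timeliness: third credit data to be processed (offline).
    """
    if accuracy_requirement <= timeliness_requirement:
        return "real_time"
    return "offline"


decision = route(accuracy_requirement=30, timeliness_requirement=70)
```

Note that a tie goes to real-time processing, since the rule uses "less than or equal to".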

[0055] The second credit data to be processed is credit data that needs to be processed in real time. The third credit data to be processed is credit data that needs to be processed offline.

[0056] S160, based on the second comparison result, determine the second credit data to be processed among the credit data to be processed that failed to match.

[0057] The credit data to be processed that failed to match may include two pieces of second credit data to be processed; or it may include at least one piece of second credit data to be processed.

[0058] S170, the first credit data to be processed and the second credit data to be processed are processed in real time to obtain the target credit data.

[0059] Specifically, real-time processing can be performed jointly by Kafka and Flink. Kafka retrieves the first and second pieces of credit information to be processed from the database, and then Flink performs common data engineering, global aggregation, and other real-time computational tasks to complete the real-time processing of the first and second pieces of credit information to obtain the target credit information.
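The division of labour described here (a message queue feeding a stream processor that maintains aggregates) can be imitated with the standard library alone. The sketch below deliberately does not use the Kafka or Flink APIs: `queue.Queue` stands in for a Kafka topic, and a per-key running total stands in for a keyed aggregation of the kind a Flink job might perform. The record layout `(key, amount)` is an assumption made for the example.

```python
from collections import defaultdict
from queue import Empty, Queue


def run_pipeline(topic):
    """Drain `topic` and emit an updated running total per key for each record."""
    totals = defaultdict(float)  # per-key state, analogous to keyed state
    emitted = []
    while True:
        try:
            key, amount = topic.get_nowait()
        except Empty:
            break  # a real stream never ends; the sketch stops when drained
        totals[key] += amount
        emitted.append((key, totals[key]))
    return emitted


topic = Queue()  # stand-in for a Kafka topic
for message in [("Zhang San", 2000.0), ("Li Si", 1000.0), ("Zhang San", 500.0)]:
    topic.put(message)

updates = run_pipeline(topic)
```

Each incoming record produces one updated result, which is the behaviour that distinguishes this path from batch-style offline processing.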

[0060] S180, send the target credit data to the database.

[0061] It should be understood that the raw credit data collected by the server, the offline credit data obtained through offline processing, and the target credit data obtained through real-time processing are stored in different databases.

[0062] As can be seen, in this embodiment, credit data to be processed is acquired, and its content information is compared with the content information of first historical credit data to be processed to obtain a first comparison result. If the first comparison result is a successful match, the successfully matched credit data to be processed is determined as first credit data to be processed. If the first comparison result is a failed match, the target accuracy requirement value and target timeliness requirement value of the credit data to be processed that failed to match are determined based on its data source and data type; the two values are compared to obtain a second comparison result; and second credit data to be processed is determined based on the second comparison result. The first and second credit data to be processed are processed in real time to obtain target credit data, which is sent to the database. Thus, a standardized credit data processing flow can be established through this application, enabling classified processing of the collected credit data to be processed and reducing the real-time processing load on the server. Furthermore, the solution can improve the accuracy with which collected credit data to be processed is classified, and the accuracy with which the processing method is determined for credit data to be processed in a new data format.

[0063] In one possible example, determining the target accuracy requirement value and target timeliness requirement value of the unmatched credit information data to be processed based on the data source and data type of the unmatched credit information data to be processed includes: matching a first accuracy requirement value and a first timeliness requirement value corresponding to the unmatched credit information data to be processed based on the data source of the unmatched credit information data to be processed; matching a second accuracy requirement value and a second timeliness requirement value corresponding to the unmatched credit information data to be processed based on the data type of the unmatched credit information data to be processed; determining the target accuracy requirement value corresponding to the unmatched credit information data to be processed based on the first accuracy requirement value and the second accuracy requirement value; and determining the target timeliness requirement value corresponding to the unmatched credit information data to be processed based on the first timeliness requirement value and the second timeliness requirement value.

[0064] The first accuracy requirement value and the first timeliness requirement value are concrete values stored in the database for each data source, and the second accuracy requirement value and the second timeliness requirement value are concrete values stored in the database for each data type.

[0065] Specifically, taking loan information from the data source corresponding to tax data as an example, the first accuracy requirement value and the first timeliness requirement value are the values corresponding to the tax data; for example, the first accuracy requirement value is 20 and the first timeliness requirement value is 80. The second accuracy requirement value and the second timeliness requirement value are the values corresponding to loan information in the tax data; for example, the second accuracy requirement value is 40 and the second timeliness requirement value is 60.

[0066] It can be seen that by calculating the target timeliness requirement value and the target accuracy requirement value from the first accuracy requirement value, the first timeliness requirement value, the second accuracy requirement value, and the second timeliness requirement value of the credit data, the processing method for the credit data to be processed that failed to match can be determined more accurately, thereby improving the accuracy of the judgment result.
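The passage above states that the target values are determined from the source-level and type-level values but does not give the combining rule at this point. The sketch below assumes a plain average purely for illustration; the function name `target_values` and the averaging rule are assumptions, and any other combination (a sum, a weighted mean) would fit the same interface.

```python
def target_values(source_values, type_values):
    """Combine (accuracy, timeliness) pairs for a data source and a data type.

    ASSUMPTION: the two levels are averaged; the combining rule is not
    specified in the passage this sketch accompanies.
    """
    source_accuracy, source_timeliness = source_values
    type_accuracy, type_timeliness = type_values
    target_accuracy = (source_accuracy + type_accuracy) / 2
    target_timeliness = (source_timeliness + type_timeliness) / 2
    return target_accuracy, target_timeliness


# Values from the example: tax data (20, 80), loan information in tax data (40, 60).
accuracy, timeliness = target_values((20, 80), (40, 60))
needs_real_time = accuracy <= timeliness
```

With the example values, the target accuracy requirement value stays below the target timeliness requirement value under averaging (and equally under summation), so the unmatched loan record would be routed to real-time processing.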

[0067] In one possible example, the method further includes obtaining a first accurate demand value and a first timeliness demand value corresponding to each of the data sources; obtaining the first accurate demand value and the first timeliness demand value corresponding to each of the data sources includes: obtaining the data sources of historical credit data to be processed, wherein the historical credit data to be processed includes the first historical credit data to be processed and the second historical credit data to be processed; determining the quantity of the first historical credit data to be processed and the second historical credit data to be processed in each of the data sources of the historical credit data to be processed, based on the data sources of the historical credit data to be processed; and determining the first timeliness demand value and the first accurate demand value corresponding to each of the data sources of the historical credit data to be processed, based on the ratio of the quantity of the first historical credit data to the quantity of the second historical credit data to be processed in each of the data sources of the historical credit data to be processed.

[0068] The first set of pending historical credit data, as mentioned above, refers to the historically collected credit data stored in the database that has undergone real-time processing. The second set of pending historical credit data refers to the historically collected credit data stored in the database that has undergone offline processing.

[0069] Specifically, the first accuracy requirement value and the first timeliness requirement value are determined by how the historical credit data to be processed from each data source stored in the database was processed. Taking tax data as an example, the database stores a total of 1000 tax records, including 800 first historical credit data records to be processed and 200 second historical credit data records to be processed. The ratio of first historical records to second historical records is therefore 4:1, and the first timeliness requirement value and the first accuracy requirement value corresponding to the tax data can be assigned accordingly. For example, the first timeliness requirement value corresponding to each credit data record collected from the tax data source is 80, and the first accuracy requirement value is 20.
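The ratio-based assignment in the tax data example can be sketched as follows. This is a minimal illustration, assuming that the values lie on a 0 to 100 scale and equal the percentage shares of real-time and offline records; the text only says the values "can be assigned" from the ratio, so this mapping and the function name `requirement_values` are hypothetical.

```python
def requirement_values(realtime_count: int, offline_count: int) -> tuple[float, float]:
    """Return (timeliness, accuracy) requirement values on a 0-100 scale.

    Assumption: each value is the percentage share of historical records
    that were processed in real time (timeliness) or offline (accuracy).
    """
    total = realtime_count + offline_count
    if total == 0:
        raise ValueError("no historical records for this data source")
    timeliness = 100 * realtime_count / total
    accuracy = 100 * offline_count / total
    return timeliness, accuracy


# Tax data example: 1000 records with a 4:1 ratio of real-time to offline.
timeliness, accuracy = requirement_values(800, 200)
print(timeliness, accuracy)  # 80.0 20.0
```

Under this assumption the two values always sum to 100, which matches the 80 and 20 given for the tax data.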

[0070] As can be seen, in this embodiment, by calculating the ratio of the first historical credit data to be processed in real time and the second historical credit data to be processed offline from the same data source, the probability of the credit data in the data source being processed in real time and offline can be estimated, thereby improving the accuracy of the processing method for the credit data to be processed that fails to match.

[0071] In one possible example, the method further includes obtaining a second accuracy requirement value and a second timeliness requirement value corresponding to each data type in each of the data sources. Obtaining these values includes: obtaining the data type of historical credit information data to be processed in each of the data sources, wherein the historical credit information data to be processed includes first historical credit information data to be processed and second historical credit information data to be processed; determining, based on the data type of the historical credit information data to be processed in each of the data sources, the number of first historical credit information data to be processed and the number of second historical credit information data to be processed in each data type; and determining the second timeliness requirement value and the second accuracy requirement value corresponding to each data type based on the ratio of the number of first historical credit information data to be processed to the number of second historical credit information data to be processed in that data type.

[0072] The first and second historical credit data pending processing are as described above and will not be repeated here.

[0073] Specifically, determining the number of first and second historical credit data to be processed in each of the data sources based on the data types of the historical credit data to be processed means determining the number of first and second historical credit data to be processed included in all historical credit data to be processed in each of the data sources based on the data types in each data source.

[0074] In this example, the second accuracy requirement value and the second timeliness requirement value are determined by how the historical credit data to be processed of each data type stored in the database was processed. Taking loan information as an example, the database stores a total of 600 loan information entries, including 360 first historical credit data entries to be processed and 240 second historical credit data entries to be processed. The ratio of first historical entries to second historical entries is therefore 3:2, so the second timeliness requirement value for loan information whose data source is tax data is 60, and the second accuracy requirement value is 40.

[0075] As can be seen, in this embodiment, by calculating the ratio of the first historical credit data to be processed in real time and the second historical credit data to be processed offline, the probability of real-time and offline processing of the data in this data type can be estimated, thereby improving the accuracy of the processing method for the credit data to be processed that fails to match.
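The per-data-type calculation can be sketched in the same way by counting historical records grouped by data source and data type. The record layout, the `"realtime"` and `"offline"` labels, and the sample records below are hypothetical, and the 0 to 100 percentage scale is an assumption.

```python
from collections import Counter

# Hypothetical historical records: (data_source, data_type, processing_mode).
history = [
    ("tax", "loan", "realtime"),
    ("tax", "loan", "realtime"),
    ("tax", "loan", "realtime"),
    ("tax", "loan", "offline"),
    ("tax", "loan", "offline"),
    ("tax", "invoice", "offline"),
]


def values_by_type(records):
    """Map (source, type) -> (timeliness, accuracy) on a 0-100 scale."""
    realtime = Counter()
    total = Counter()
    for source, data_type, mode in records:
        key = (source, data_type)
        total[key] += 1
        if mode == "realtime":
            realtime[key] += 1
    return {
        key: (100 * realtime[key] / n, 100 * (n - realtime[key]) / n)
        for key, n in total.items()
    }


second_values = values_by_type(history)
print(second_values[("tax", "loan")])  # (60.0, 40.0)
```

Keying on the pair of data source and data type reflects the point that the same data type may be handled differently depending on where it was collected.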

[0076] In one possible example, determining the target accuracy requirement value corresponding to the unmatched credit data to be processed based on the first accuracy requirement value and the second accuracy requirement value, and determining the target timeliness requirement value corresponding to the unmatched credit data to be processed based on the first timeliness requirement value and the second timeliness requirement value, includes: obtaining a first proportion corresponding to the data source of the unmatched credit data to be processed;

[0077] obtaining a second proportion corresponding to the data type of the unmatched credit data to be processed; performing weighted processing on the first accuracy requirement value and the second accuracy requirement value according to the first proportion and the second proportion to obtain the target accuracy requirement value; and performing weighted processing on the first timeliness requirement value and the second timeliness requirement value according to the first proportion and the second proportion to obtain the target timeliness requirement value.

[0078] In practical applications, servers may collect credit data of the same data type from different data sources. However, the uses of credit data from different data sources may differ. Therefore, determining the processing method for unmatched credit data by combining the data source and data type is more reliable.

[0079] The first and second proportions can be set based on experience in data application, and no further restrictions are imposed here. For example, in a data source, if the ratio of the first historical unprocessed credit report data to the second historical unprocessed credit report data for each data type in that data source is relatively small, then the main factor affecting the processing method of each processed credit report in that data source can be considered to be the data source itself. In this case, the rule for setting the first and second proportions is that the first proportion is greater than the second proportion. Alternatively, in a data source, if the ratio of the first historical unprocessed credit report data to the second historical unprocessed credit report data for each data type in that source is relatively large, then the main factor affecting the processing method of each processed credit report in that data source can be considered to be the data type. In this case, the rule for setting the first and second proportions is that the first proportion is less than the second proportion.

[0080] For example, if the data source of the credit information data to be processed is tax data and the data type is loan information, then the corresponding first timeliness requirement value is 80, the first accuracy requirement value is 20, the second timeliness requirement value is 60, and the second accuracy requirement value is 40. Given that the first proportion is 60% and the second proportion is 40%, the target timeliness requirement value for the credit information data to be processed is (80+60)*60% = 84, and the target accuracy requirement value is (20+40)*40% = 24. The target timeliness requirement value is greater than the target accuracy requirement value; therefore, the credit information data to be processed needs to undergo real-time processing.
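The arithmetic of this example and the resulting decision can be sketched as follows. All function names are hypothetical. `paragraph_targets` reproduces the calculation exactly as written in the example. `weighted_targets` shows a conventional weighted average, which is only an assumption about another possible reading of "weighted processing" and is not stated in the source. Proportions are passed as whole percentages so that the results are exact.

```python
def paragraph_targets(t1, a1, t2, a2, p1_pct, p2_pct):
    """Arithmetic as written in the example: a sum of values times one proportion."""
    target_timeliness = (t1 + t2) * p1_pct / 100
    target_accuracy = (a1 + a2) * p2_pct / 100
    return target_timeliness, target_accuracy


def weighted_targets(t1, a1, t2, a2, p1_pct, p2_pct):
    """Assumed alternative: weight source values by p1 and type values by p2."""
    target_timeliness = (t1 * p1_pct + t2 * p2_pct) / 100
    target_accuracy = (a1 * p1_pct + a2 * p2_pct) / 100
    return target_timeliness, target_accuracy


def needs_realtime(target_timeliness, target_accuracy):
    """Real-time processing when accuracy does not exceed timeliness."""
    return target_accuracy <= target_timeliness


# Tax data source (80/20), loan information type (60/40), proportions 60% and 40%.
as_written = paragraph_targets(80, 20, 60, 40, 60, 40)
alternative = weighted_targets(80, 20, 60, 40, 60, 40)
print(as_written, needs_realtime(*as_written))    # (84.0, 24.0) True
print(alternative, needs_realtime(*alternative))  # (72.0, 28.0) True
```

Both readings give a target timeliness requirement value above the target accuracy requirement value, so the record is routed to real-time processing either way.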

[0081] As can be seen, in this embodiment, the first timeliness requirement value and the second timeliness requirement value, as well as the first accuracy requirement value and the second accuracy requirement value, can be weighted respectively, thereby further improving the accuracy and reliability of the classification and judgment of the credit data to be processed, so that the credit data to be processed can be processed in a timely and accurate manner using appropriate processing methods.

[0082] In one possible example, the first and second unprocessed credit data include the credit data of at least one user; the real-time processing of the first and second unprocessed credit data to obtain target credit data includes: obtaining a user ID corresponding to each of the first and second unprocessed credit data; transmitting each of the first and second unprocessed credit data to a corresponding target message queue according to the user ID, wherein one target message queue corresponds to one user ID; obtaining all the first and second unprocessed credit data in the target message queue, and performing the real-time processing to obtain the target credit data.

[0083] Each piece of credit data collected by the server corresponds to a user ID, which can be the name of an individual user or the name of a corporate user. Therefore, each piece of credit data to be processed in real time, both the first and second pieces, also corresponds to a user ID.

[0084] In practical implementation, the database can establish separate datasets for each user ID to facilitate data storage and retrieval. Once the server categorizes the first and second unprocessed credit data from all real-time collected credit data to be processed, it can send all first and second unprocessed credit data for the same user ID to the target message queue corresponding to that user ID. When the target message queue receives all the first and second unprocessed credit data corresponding to that user ID, it can perform batch real-time processing operations to obtain the target credit data. These real-time processing operations include, but are not limited to, data cleaning and data format conversion.
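The per-user routing and batch processing described here can be sketched with in-memory lists standing in for the target message queues. This is only an illustration: it does not use any real message queue API, and the record fields and the `clean` step are hypothetical.

```python
from collections import defaultdict


def clean(record: dict) -> dict:
    """Hypothetical real-time step: trim whitespace from string fields."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def route_by_user(records):
    """Place each record in the queue keyed by its user ID (one queue per user)."""
    queues = defaultdict(list)
    for record in records:
        queues[record["user_id"]].append(record)
    return queues


def process_queues(queues):
    """Process each user's queue as one batch and return the target data per user."""
    return {user_id: [clean(r) for r in batch] for user_id, batch in queues.items()}


records = [
    {"user_id": "A", "type": "loan", "content": " overdue "},
    {"user_id": "B", "type": "tax", "content": "paid"},
    {"user_id": "A", "type": "tax", "content": "paid "},
]
target = process_queues(route_by_user(records))
print(sorted(target))    # ['A', 'B']
print(len(target["A"]))  # 2
```

Grouping by user ID before processing means each user's records are cleaned together in one batch, which is the property the embodiment relies on.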

[0085] As can be seen, in this embodiment, the server can process all first and second unprocessed credit data corresponding to the same user ID in the same batch of collected credit data in real time. Based on this, it is possible to process credit data in batches, and even process data in time-sharing batches, thereby reducing the server's data processing pressure, improving the server's data processing speed, and ensuring the rational utilization of server resources.

[0086] In one possible example, sending the target credit data to the database includes: performing user analysis based on the target credit data and / or historical credit data to obtain a user relationship graph corresponding to the user; performing user analysis based on the target credit data and / or historical credit data to obtain service configuration rules corresponding to the user; and sending at least one of the user relationship graph and the service configuration rules corresponding to the user to the user terminal.

[0087] Historical credit data refers to target credit data obtained after real-time processing. Target credit data and historical credit data include at least one user's basic information, at least one user's credit information, and at least one user's public information. Basic user information includes, but is not limited to: user's identity information, occupation information, and residence information; user's credit information includes, but is not limited to: user's credit card, mortgage, and other loan repayment information; user's public information includes, but is not limited to: social security and housing provident fund information, court information, tax arrears information, and administrative enforcement information.

[0088] User relationship graphs can be used to represent users' investment, financial, or business relationships. For example, a user relationship graph can analyze target credit data and historical credit data from business registration data to obtain investment relationships. These investment relationships can be individual user investment relationships, or corporate investments and equity information in those companies. The target credit data and historical credit data can be data related to "shareholder and capital contribution information" registered with the business registration department.
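One way to picture such a graph is as an adjacency map built from shareholder and capital contribution records. The sketch below is an assumption about a possible representation; the record layout, names, and amounts are hypothetical and do not come from the source.

```python
from collections import defaultdict

# Hypothetical shareholder records: (investor, company, capital contribution).
shareholder_records = [
    ("User A", "Company X", 500_000),
    ("User A", "Company Y", 200_000),
    ("Company X", "Company Z", 1_000_000),
]


def build_investment_graph(records):
    """Adjacency map: investor -> {company: capital contribution}."""
    graph = defaultdict(dict)
    for investor, company, amount in records:
        graph[investor][company] = amount
    return dict(graph)


def reachable_companies(graph, investor):
    """All companies reachable from `investor` by direct or indirect investment."""
    seen, stack = set(), [investor]
    while stack:
        node = stack.pop()
        for company in graph.get(node, {}):
            if company not in seen:
                seen.add(company)
                stack.append(company)
    return seen


graph = build_investment_graph(shareholder_records)
print(sorted(reachable_companies(graph, "User A")))
# ['Company X', 'Company Y', 'Company Z']
```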

[0089] Service configuration rules can be used to represent the service rules that banks configure for users. For example, the database can store preset service plans corresponding to different user scores. After analyzing a user based on their target credit data and historical credit data, a user score corresponding to that user can be obtained. The server can match the user with the corresponding preset service plan based on the user score. The preset service plan includes the user's appropriate credit limit, repayment interest rate, and the service personnel level providing services to the user, etc.
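Matching a user score to a preset service plan can be sketched as a threshold lookup. The thresholds, plan fields, and values below are entirely hypothetical; the source does not specify how scores map to plans.

```python
from bisect import bisect_right

# Hypothetical score thresholds and preset plans; none of these values come from the source.
THRESHOLDS = [600, 700, 800]
PLANS = [
    {"credit_limit": 0, "interest_rate": None, "service_level": "standard"},
    {"credit_limit": 10_000, "interest_rate": 0.12, "service_level": "standard"},
    {"credit_limit": 50_000, "interest_rate": 0.08, "service_level": "senior"},
    {"credit_limit": 200_000, "interest_rate": 0.05, "service_level": "premium"},
]


def match_plan(user_score: int) -> dict:
    """Return the preset plan whose score band contains `user_score`."""
    return PLANS[bisect_right(THRESHOLDS, user_score)]


print(match_plan(720)["credit_limit"])  # 50000
```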

[0090] In practical implementation, taking user A as an example, if the target credit data includes all of user A's credit data, then user A can be analyzed based on the target credit data to obtain user A's user relationship graph and service configuration rules. If the target credit data and historical credit data each include different credit data for user A, then user A can be analyzed based on both the target credit data and historical credit data to obtain user A's user relationship graph and service configuration rules. If there is no target credit data for user A, then user A can be analyzed based on historical credit data to obtain user A's user relationship graph and service configuration rules. If the target credit data is updated from historical credit data, then the target credit data takes precedence, thus ensuring the real-time nature and reliability of the credit data.
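The precedence rule for user A can be sketched as a field-level merge in which target credit data overrides historical credit data. The field names are hypothetical, and treating each user's data as a flat dictionary is an assumption.

```python
def merge_credit_data(historical: dict, target: dict) -> dict:
    """Combine both sources per field; target values override historical ones."""
    return {**historical, **target}


historical = {"mortgage_status": "current", "tax_arrears": "none"}
target = {"mortgage_status": "overdue"}
merged = merge_credit_data(historical, target)
print(merged)  # {'mortgage_status': 'overdue', 'tax_arrears': 'none'}
```

The same function covers the other cases in the paragraph: with no target data the result equals the historical data, and with no historical data it equals the target data.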

[0091] As can be seen, in this embodiment, users can be analyzed by processing target credit data and historical credit data to obtain information such as user relationship graphs and service configuration rules related to the user, which can then be promptly fed back to the customer. This enables the rational use of target credit data and helps customers gain a more timely and comprehensive understanding of the user's credit situation, thereby allowing for more accurate service provision.

[0092] This application can divide the server into functional units based on the above method examples. For example, each function can be divided into its own functional unit, or two or more functions can be integrated into one processing unit. The integrated unit can be implemented in hardware or as a software functional unit. It should be noted that the unit division in this application embodiment is illustrative and only represents one logical functional division; other division methods may be used in actual implementation.

[0093] Figure 4 is a functional unit block diagram of a data processing apparatus provided in an embodiment of this application. The data processing apparatus 300 can be applied to, for example, the server 100 shown in Figure 1 or Figure 2. The data processing apparatus 300 includes:

[0094] The acquisition unit 310 is used to acquire credit information data to be processed, wherein the credit information data to be processed includes data source, data type and content information;

[0095] The comparison unit 320 is used to compare the content information of the credit information to be processed with the content information of the first historical credit information to be processed to obtain a first comparison result; it is also used to compare the target accuracy requirement value and the target timeliness requirement value to obtain a second comparison result.

[0096] The determining unit 330 is configured to: when the first comparison result is a successful match, determine the successfully matched credit information data to be processed as the first credit information data to be processed; when the first comparison result is a failed match, determine the target accuracy requirement value and target timeliness requirement value of the failed-matched credit information data to be processed based on the data source and data type of the credit information data to be processed; and further configured to: when the first comparison result is a failed match, determine the second credit information data to be processed from the failed-matched credit information data to be processed based on the second comparison result.

[0097] The processing unit 340 is used to process the first credit data to be processed and the second credit data to be processed in real time to obtain target credit data;

[0098] The sending unit 350 is used to send the target credit data to the database.
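The decision flow shared by the comparison unit and the determining unit can be sketched as a single classification function. This is a simplified illustration under stated assumptions: content matching is modeled as exact set membership, `target_values` is a hypothetical lookup returning an (accuracy, timeliness) pair, and records that fall into neither real-time group are labeled "offline".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str
    data_type: str
    content: str


def classify(record, first_history_contents, target_values):
    """Return 'first', 'second', or 'offline' for one record."""
    # First comparison: content against historical data processed in real time.
    if record.content in first_history_contents:
        return "first"
    # Failed match: look up target values by data source and data type.
    accuracy, timeliness = target_values(record.source, record.data_type)
    # Second comparison: accuracy requirement against timeliness requirement.
    if accuracy <= timeliness:
        return "second"
    return "offline"


def target_values(source, data_type):
    """Hypothetical lookup table of (accuracy, timeliness) pairs."""
    table = {("tax", "loan"): (24, 84)}
    return table.get((source, data_type), (70, 30))


seen = {"known loan record"}
print(classify(Record("tax", "loan", "known loan record"), seen, target_values))  # first
print(classify(Record("tax", "loan", "new loan record"), seen, target_values))    # second
print(classify(Record("court", "ruling", "new ruling"), seen, target_values))     # offline
```

Records labeled "first" or "second" would then be handed to the processing unit for real-time processing.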

[0099] In one possible example, regarding the determination of the target accuracy requirement value and target timeliness requirement value of the unmatched credit information to be processed based on the data source and data type of the unmatched credit information to be processed, the determining unit 330 is further configured to: match a first accuracy requirement value and a first timeliness requirement value corresponding to the unmatched credit information to be processed based on the data source of the unmatched credit information to be processed; match a second accuracy requirement value and a second timeliness requirement value corresponding to the unmatched credit information to be processed based on the data type of the unmatched credit information to be processed; determine the target accuracy requirement value corresponding to the unmatched credit information to be processed based on the first accuracy requirement value and the second accuracy requirement value; and determine the target timeliness requirement value corresponding to the unmatched credit information to be processed based on the first timeliness requirement value and the second timeliness requirement value.

[0100] In one possible example, the acquisition unit 310 is further configured to: acquire the data source of historical credit data to be processed, wherein the historical credit data to be processed includes the first historical credit data to be processed and the second historical credit data to be processed; the determination unit 330 is further configured to: determine the quantity of the first historical credit data to be processed and the second historical credit data to be processed in each of the data sources of the historical credit data to be processed, based on the data source of the historical credit data to be processed; and determine the first timeliness requirement value and the first accuracy requirement value corresponding to each of the data sources of the historical credit data to be processed, based on the ratio of the quantity of the first historical credit data to the quantity of the second historical credit data to be processed in each of the data sources of the historical credit data to be processed.

[0101] In one possible example, the acquisition unit 310 is further configured to: acquire the data type of historical credit information data to be processed, wherein the historical credit information data to be processed includes the first historical credit information data to be processed and the second historical credit information data to be processed; the determination unit 330 is further configured to: determine, based on the data type of the historical credit information data to be processed, the number of the first historical credit information data to be processed and the number of the second historical credit information data to be processed in each data type; and determine the second timeliness requirement value and the second accuracy requirement value corresponding to each data type based on the ratio of the number of the first historical credit information data to be processed to the number of the second historical credit information data to be processed in that data type.

[0102] In one possible example, the acquisition unit 310 is further configured to: acquire a first proportion corresponding to the data source of the data to be processed that failed to match; acquire a second proportion corresponding to the data type of the data to be processed that failed to match; the determination unit 330 is further configured to: perform weighted processing on the first accurate demand value and the second accurate demand value according to the first proportion and the second proportion to obtain the target accurate demand value; and perform weighted processing on the first timeliness demand value and the second timeliness demand value according to the first proportion and the second proportion to obtain the target timeliness demand value.

[0103] In one possible example, the first credit data to be processed and the second credit data to be processed include the credit data of at least one user; the acquisition unit 310 is further configured to: acquire the user ID corresponding to each of the first credit data to be processed and the second credit data to be processed; the processing unit 340 is configured to: transmit each of the first credit data to be processed and the second credit data to be processed to a corresponding target message queue according to the user ID corresponding to each of the first credit data to be processed and the second credit data to be processed, wherein one target message queue corresponds to one user ID; acquire all the first credit data to be processed and the second credit data to be processed in the target message queue, and perform the real-time processing to obtain the target credit data.

[0104] In one possible example, the processing unit 340 is configured to: perform user analysis based on the target credit data and / or historical credit data to obtain a user relationship graph corresponding to the user; perform user analysis based on the target credit data and / or historical credit data to obtain service configuration rules corresponding to the user; the sending unit 350 is further configured to: send at least one of the user relationship graph and the service configuration rules corresponding to the user to the user terminal.

[0105] Where integrated units are used, the functional unit block diagram of the data processing apparatus 300 provided in the embodiments of this application is shown in Figure 5. In Figure 5, the data processing device 300 includes a communication module 360 and a processing module 370. The processing module 370 controls and manages the operations of the data processing device 300, including, for example, the steps performed by the acquisition unit 310, comparison unit 320, determination unit 330, processing unit 340, and sending unit 350, and/or other processes for performing the techniques described herein. The communication module 360 supports interaction between the data processing device 300 and other devices. As shown in Figure 5, the data processing device 300 may further include a storage module 380, which is used to store the program code and data of the data processing device 300.

[0106] The processing module 370 can be a processor or controller, such as a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA, or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It can implement or execute various exemplary logic blocks, modules, and circuits described in conjunction with the embodiments of this application. The processor can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a DSP and a microprocessor, etc. The communication module 360 can be a transceiver, RF circuitry, or a communication interface, etc. The storage module 380 can be a memory.

[0107] All relevant content in each scenario involved in the above method embodiments can be referenced from the functional descriptions of the corresponding functional modules, and will not be repeated here. The data processing apparatus 300 can perform the steps performed by the server in the data processing method shown in Figure 3.

[0108] This application also provides a computer storage medium storing a computer program for electronic data interchange, which causes a computer to perform some or all of the steps of any of the methods described in the above method embodiments, wherein the computer includes a server.

[0109] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that this application is not limited to the described order of actions, as some steps may be performed in other orders or simultaneously according to this application. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily essential to this application.

[0110] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions in other embodiments.

[0111] In the several embodiments provided in this application, it should be understood that the disclosed apparatus can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units described above is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical or other forms.

[0112] The units described above as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0113] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0114] If the integrated units described above are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable memory. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a memory and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned memory includes various media capable of storing program code, such as USB flash drives, read-only memory (ROM), random access memory (RAM), portable hard drives, magnetic disks, or optical disks.

[0115] Those skilled in the art will understand that all or part of the steps in the various methods of the above embodiments can be implemented by a program instructing related hardware. The program can be stored in a computer-readable storage medium, which may include: flash drive, read-only memory (ROM), random access memory (RAM), disk or optical disk, etc.

[0116] The embodiments of this application have been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this application. The description of the above embodiments is only for the purpose of helping to understand the method and core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.

Claims

1. A data processing method, characterized in that it is applied to a server, the method comprising:
obtaining credit information data to be processed, wherein the credit information data to be processed includes a data source, a data type, and content information;
comparing the content information of the credit information data to be processed with the content information of first historical credit information data to be processed to obtain a first comparison result;
if the first comparison result is a successful match, determining the successfully matched credit information data to be processed as first credit information data to be processed;
if the first comparison result is a failed match, determining, based on the data source and data type of the unmatched credit information data to be processed, a target accuracy requirement value and a target timeliness requirement value of the unmatched credit information data to be processed, including: matching a first accuracy requirement value and a first timeliness requirement value corresponding to the unmatched credit information data to be processed based on the data source of the unmatched credit information data to be processed; matching a second accuracy requirement value and a second timeliness requirement value corresponding to the unmatched credit information data to be processed based on the data type of the unmatched credit information data to be processed; obtaining a first proportion corresponding to the data source of the unmatched credit information data to be processed; obtaining a second proportion corresponding to the data type of the unmatched credit information data to be processed; weighting the first accuracy requirement value and the second accuracy requirement value according to the first proportion and the second proportion to obtain the target accuracy requirement value; and weighting the first timeliness requirement value and the second timeliness requirement value according to the first proportion and the second proportion to obtain the target timeliness requirement value;
comparing the target accuracy requirement value with the target timeliness requirement value to obtain a second comparison result, wherein the second comparison result is that the target accuracy requirement value is less than or equal to the target timeliness requirement value;
determining, based on the second comparison result, second credit information data to be processed from the unmatched credit information data to be processed;
processing the first credit information data to be processed and the second credit information data to be processed in real time to obtain target credit information data; and
sending the target credit information data to a database.

2. The method as described in claim 1, characterized in that the method further includes obtaining a first accuracy requirement value and a first timeliness requirement value corresponding to each of the data sources, wherein obtaining the first accuracy requirement value and the first timeliness requirement value corresponding to each of the data sources includes:
obtaining the data source of historical credit data to be processed, wherein the historical credit data to be processed includes first historical credit data to be processed and second historical credit data to be processed;
determining, based on the data source of the historical credit data to be processed, the quantity of the first historical credit data to be processed and the quantity of the second historical credit data to be processed in each of the data sources of the historical credit data to be processed; and
determining, based on the ratio of the quantity of the first historical credit data to be processed to the quantity of the second historical credit data to be processed in each of the data sources, the first timeliness requirement value and the first accuracy requirement value corresponding to each of the data sources of the historical credit data to be processed.

3. The method as described in claim 1, characterized in that the method further includes obtaining the second accuracy requirement value and the second timeliness requirement value corresponding to each data type in each data source, which includes:
obtaining the data type of historical credit information data to be processed from each data source, wherein the historical credit information data to be processed includes the first historical credit information data to be processed and second historical credit information data to be processed;
determining, based on the data type of the historical credit information data to be processed in each data source, the number of the first historical credit information data to be processed and the number of the second historical credit information data to be processed in each data type of the historical credit information data to be processed; and
determining, based on the ratio of the number of the first historical credit information data to be processed to the number of the second historical credit information data to be processed in each data type, the second timeliness requirement value and the second accuracy requirement value corresponding to that data type.

4. The method as described in claim 1, characterized in that the first credit information data to be processed and the second credit information data to be processed include the credit information data of at least one user, and the real-time processing of the first and second credit information data to be processed to obtain the target credit information data includes:
obtaining the user ID corresponding to each item of the first and second credit information data to be processed;
transmitting, based on that user ID, each item of the first and second credit information data to be processed to a corresponding target message queue, wherein each target message queue corresponds to one user ID; and
obtaining all the first and second credit information data to be processed in the target message queue and performing the real-time processing on them to obtain the target credit information data.
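
The per-user routing of claim 4 can be sketched with in-memory queues. This is a simplified stand-in, assuming one `deque` per user ID in place of a real message queue; the names `route_by_user` and `drain` are hypothetical. In a Kafka deployment, which the technical summary names, using the user ID as the message key would be one comparable approach, but that mapping is an assumption and is not specified by the claims.

```python
from collections import defaultdict, deque


def route_by_user(records):
    """records: iterable of (user_id, payload) pairs.

    Returns a mapping user_id -> queue holding that user's payloads in arrival order.
    """
    queues = defaultdict(deque)
    for user_id, payload in records:
        queues[user_id].append(payload)
    return queues


def drain(queue):
    """Take every item currently in one user's queue, oldest first."""
    items = []
    while queue:
        items.append(queue.popleft())
    return items


queues = route_by_user([("u1", "a"), ("u2", "b"), ("u1", "c")])
batch_u1 = drain(queues["u1"])  # all pending items for user u1, ready for processing
```

Keeping one queue per user means each user's data is processed together and in arrival order, which is the property the claim relies on.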

5. The method as described in claim 1, characterized in that, after sending the target credit information data to the database, the method further includes:
performing user analysis based on the target credit information data and/or historical credit information data to obtain a user relationship graph corresponding to the user;
performing user analysis based on the target credit information data and/or historical credit information data to obtain service configuration rules corresponding to the user; and
sending at least one of the user relationship graph corresponding to the user and the service configuration rules to a user terminal.

6. A data processing apparatus, characterized in that the apparatus is applied to a server and includes:
an acquisition unit, configured to acquire credit information data to be processed, wherein the credit information data to be processed includes a data source, a data type, and content information;
a comparison unit, configured to compare the content information of the credit information data to be processed with the content information of first historical credit information data to be processed to obtain a first comparison result;
a determining unit, configured to determine the successfully matched credit information data to be processed as first credit information data to be processed when the first comparison result is a successful match;
the determining unit being further configured to, when the first comparison result is a matching failure, determine a target accuracy requirement value and a target timeliness requirement value of the unmatched credit information data to be processed based on the data source and data type of the unmatched credit information data to be processed, including: matching a first accuracy requirement value and a first timeliness requirement value corresponding to the unmatched credit information data to be processed based on its data source; matching a second accuracy requirement value and a second timeliness requirement value corresponding to the unmatched credit information data to be processed based on its data type; obtaining a first proportion corresponding to the data source of the unmatched credit information data to be processed; obtaining a second proportion corresponding to the data type of the unmatched credit information data to be processed; weighting the first accuracy requirement value and the second accuracy requirement value according to the first proportion and the second proportion to obtain the target accuracy requirement value; and weighting the first timeliness requirement value and the second timeliness requirement value according to the first proportion and the second proportion to obtain the target timeliness requirement value;
the comparison unit being further configured to, when the first comparison result is a matching failure, compare the target accuracy requirement value with the target timeliness requirement value to obtain a second comparison result, wherein the second comparison result is that the target accuracy requirement value is less than or equal to the target timeliness requirement value;
the determining unit being further configured to determine, based on the second comparison result, second credit information data to be processed from among the unmatched credit information data to be processed;
a processing unit, configured to process the first credit information data to be processed and the second credit information data to be processed in real time to obtain target credit information data; and
a sending unit, configured to send the target credit information data to a database.

7. A server, characterized in that the server includes a processor, a memory, a communication interface, and one or more programs, the one or more programs being stored in the memory and configured to be executed by the processor, and the programs including instructions for performing the steps of the method as described in any one of claims 1-5.

8. A computer-readable storage medium, characterized in that it stores a computer program for electronic data interchange, wherein the computer program causes a computer to perform the steps of the method as described in any one of claims 1-5.