A method for establishing user information views and relationship models based on big data

By constructing user information views and relationship models, and using classification learning algorithms to update association index weights, data silos are eliminated, data interconnection between different systems is achieved, the problem of incomplete reflection of user characteristics is solved, and precision marketing is supported.

CN116127136B · Active · Publication Date: 2026-06-19 · CHINA TELECOM CLOUD TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHINA TELECOM CLOUD TECH CO LTD
Filing Date
2022-11-14
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies struggle to effectively utilize mobile and fixed-line data to establish comprehensive user profiles, resulting in data silos and an inability to accurately reflect user characteristics.

Method used

By constructing a user information view, updating the association index weights using classification learning algorithms, and combining ID-Mapping and user-specific identifiers, data silos are eliminated, a user relationship model is established, and data interconnection and interoperability between systems in different industries is achieved.

Benefits of technology

It provides a complete view of user information, establishes an accurate user relationship model, supports winning over users from other networks, and achieves precise outreach.

Abstract

This invention relates to the field of big data analytics, specifically to a method for establishing user information views and relationship models based on big data. By combining ID-Mapping with user-specific identifiers, fragmented data is linked together, eliminating data silos and incorporating all of an individual's information into the user information view, providing a complete information view of the user, and establishing a user relationship model based on this view. This invention can combine user relationship models with user personal information views to provide a comprehensive description of the user from both macro and micro perspectives.

Description

Technical Field

[0001] This invention relates to the field of big data analytics, and in particular to a method for establishing user information views and relationship models based on big data.

Background Technology

[0002] The home market is one of the key competitive markets in the telecommunications industry. With the development of full-service and bundled packages, the home market is becoming increasingly important. Defining a user's home network is a crucial support measure for business development and attracting customers from other networks, and it has significant practical implications. Furthermore, the diverse behavioral attributes of users reflected in operator data are of great importance in determining a user's home network.

[0003] Meanwhile, outside of the operator's system, users generate massive amounts of data daily on the internet. However, this data is often isolated, not a readily accessible whole, but rather exists as fragmented islands. It is extremely difficult for different industries and systems to share each other's data, and the value created by individual data silos is very limited. Existing technologies provide mapping methods between mobile and fixed-line data, which can link mobile and fixed-line data and establish connections between terminal and broadband account information. However, these methods still have certain drawbacks, such as the limited volume of mobile and fixed-line data, meaning that information connections based on this limited data cannot comprehensively reflect user characteristics.

Summary of the Invention

[0004] To address the aforementioned shortcomings in existing technologies, this invention proposes a method for establishing user information views and relationship models based on big data. This invention constructs a user circle model based on user information views, uses a classification learning algorithm to update the association index weights to improve reliability, and also performs an exclusion operation for special groups, so that the user relationship model more closely resembles reality.

[0005] Furthermore, this invention combines ID-Mapping with user-specific identifiers to connect all fragmented data, eliminate data silos, and incorporate all personal information into the user information view, providing a complete information view for each user. This enables data interconnection and interoperability between different industries, systems, and fields, becoming the core of data interoperability.

[0006] Furthermore, by comprehensively using user information, business information, and online behavior information, nearly all of the identification information of each individual user can be integrated. Moreover, some of these identifiers reveal relationships between users, providing a reference for establishing a user relationship model.

[0007] The objective of this invention can be achieved through the following technical solutions:

[0008] I. Establishing a User Information View

[0009] 1) During the process of establishing the user information view, the different attribute information of users is first classified, and a special user identifier database is established for each type of information;

[0010] 2) In the special identifier database, match associated items for each identifier; that is, establish a mapping network within the special identifier database;

[0011] 3) Using various identifiers as nodes in a complex network, and operating system, IP address, time-series data, and many other user access characteristics as node attributes, the relationships between identifiers are represented by edges. If there is a relationship between identifier nodes, a directed edge is established. By comparing the node attributes of pairwise identifier nodes, an algorithm is used to analyze and match, inferring the probability that the two identifiers belong to the same user. Once an acceptable threshold is reached, the two identifiers are identified as belonging to the same user.
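The matching in step 3) can be illustrated with a minimal sketch. The text does not name the matching algorithm, so the Jaccard similarity, the 0.5 threshold, the union-find grouping, and the sample identifiers and attribute values below are all assumptions chosen for illustration; for brevity the sketch merges identifiers symmetrically rather than modelling directed edges.

```python
from itertools import combinations

# Hypothetical identifier nodes with access features as node attributes.
nodes = {
    "msisdn:A": {"os": {"android"}, "ip": {"10.0.0.5", "10.0.0.9"}, "hours": {8, 9, 21, 22}},
    "broadband:B": {"os": {"android", "windows"}, "ip": {"10.0.0.5", "10.0.0.9"}, "hours": {9, 21, 22, 23}},
    "cookie:C": {"os": {"ios"}, "ip": {"172.16.3.2"}, "hours": {13, 14}},
}

def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def same_user_score(attrs_a, attrs_b):
    """Average the per-attribute Jaccard similarity over shared attribute names."""
    keys = attrs_a.keys() & attrs_b.keys()
    if not keys:
        return 0.0
    return sum(jaccard(attrs_a[k], attrs_b[k]) for k in keys) / len(keys)

def link_identifiers(nodes, threshold=0.5):
    """Group identifiers whose pairwise score reaches the (assumed) threshold."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(nodes, 2):
        if same_user_score(nodes[a], nodes[b]) >= threshold:
            parent[find(a)] = find(b)  # treat both identifiers as one user

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())

groups = link_identifiers(nodes)
```

With this sample data the mobile number and the broadband account score 0.7 and fall into one group, while the third identifier stays on its own. In practice the similarity measure and the threshold would have to be tuned against known same-user pairs.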

[0012] II. Establishment of User Relationship Model

[0013] The user relationship model is built upon user associations. It extracts features from these associations, calculates various association indices to form a user association index matrix, and then establishes the user relationship model based on this matrix. The specific steps are as follows:

[0014] 1) Classify user association methods, and for each type of association method, extract or define association features;

[0015] 2) Normalize each feature and assign it an initial weight. The sum of the weights of all features equals 1.

[0016] 3) Define an association index formula for each type of association, and calculate the association index based on the feature quantities and weights of that type of association;

[0017] 4) Use a classification algorithm to update the weight of each feature in the association index, and recalculate the association index to form a closed loop. Continuously update the weights until the amount of weight updates is less than a certain threshold.

[0018] 5) Using each user as a node in a complex network, the individual user information view as a node attribute (i.e., each node is a sub-network), and the user association index as a directed edge between nodes, a total user relationship view is established.

[0019] 6) In the overall user relationship view, select the necessary parts or remove irrelevant parts based on actual needs or rules derived from exploration, which will result in the final targeted user relationship model.
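Steps 2), 3), and 5) above can be sketched as follows. The text does not give the index formula, so the min-max normalization, the weighted-sum form of the index, the feature values, and the weights are assumptions for illustration. Node attributes (the individual information views) are omitted to keep the example short.

```python
# Hypothetical raw features per directed user pair (three features each).
raw_features = {
    ("u1", "u2"): [600, 12, 20],
    ("u2", "u1"): [300, 6, 10],
    ("u1", "u3"): [0, 0, 0],
}
feature_weights = [0.5, 0.3, 0.2]  # assumed initial weights, summing to 1

def min_max(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def association_indices(raw, weights):
    """Map each (src, dst) pair to the weighted sum of its normalized features."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    pairs = list(raw)
    columns = [min_max([raw[p][j] for p in pairs]) for j in range(len(weights))]
    return {p: sum(w * columns[j][i] for j, w in enumerate(weights))
            for i, p in enumerate(pairs)}

def build_total_view(indices):
    """Adjacency dict of directed edges: view[src][dst] = association index."""
    view = {}
    for (src, dst), idx in indices.items():
        view.setdefault(src, {})[dst] = idx
    return view

indices = association_indices(raw_features, feature_weights)
total_view = build_total_view(indices)
```

Here the edge from u1 to u2 gets an index of 1.0 and the reverse edge 0.5, which shows why the edges are directed: the two directions of a relationship can carry different strengths.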

[0020] This invention relies on operator data, makes full use of user behavior trajectory data to complete user profile attributes, integrates the various IDs associated with user identifiers, and constructs a complex network to establish the mapping relationship between user network identifiers and the user identifiers and service identifiers in telecommunications data. Compared with existing technologies, it can autonomously discover basic associations, supports flexible merging of various associations and splitting of user identifier data, and supports horizontal expansion. In establishing the user relationship model, it comprehensively considers user status, location, communication, and other behavioral attributes, defines an association index, and uses the C&R tree and CHAID decision tree algorithms and logistic regression for modeling, continuously optimizing the calculation methods and parameters to make the user relationship model more accurate. Combining the user relationship model with users' personal information views provides a comprehensive description of users from both macro and micro perspectives, improving comprehensiveness and ease of use compared with existing technologies.

[0021] Compared with the prior art, the present invention has the following beneficial effects:

[0022] 1. Eliminate data silos and provide a complete information view for each user;

[0023] 2. Establish a user relationship model, define user family circles, effectively support winning over users from other networks, and support precise outreach.

Description of the Drawings

[0024] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of this invention or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0025] Figure 1 shows the overall user relationship view;

[0026] Figure 2 shows the user's final social circle relationship model;

[0027] Figure 3 shows the framework for constructing a user relationship network within a family circle.

[0028] Figure 4 shows the user feature vectors of the family circle user relationship network;

[0029] Figures 5A, 5B, 6A, 6B, 7A, 7B, 8A, 8B, 9A, and 9B are schematic diagrams illustrating the process of generating the identifier relationship diagram.

Detailed Implementation

[0030] The present invention will now be described in further detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and not intended to limit it.

[0031] Example 1

[0032] A method for establishing user information views and relationship models based on big data, comprising the following steps:

[0033] Step 100: Extract the user's basic telecommunications identification information from all user-related information tables in the data warehouse, and construct the user telecommunications identification basic information database, the user equipment identification basic information database, and the user network identification information database.

[0034] Step 200: Construct a user information view, which contains the mapping relationship between user network identifier and user identifier and service identifier in telecommunications data;

[0035] Step 300: Using each user in the user information view as a user node, establish a user social circle relationship network model.

[0036] Optionally, in some embodiments, constructing the user information view in step 200 includes:

[0037] Step 201: Obtain user attribute information and classify it, and establish a special user identifier database for each type of information;

[0038] Step 202: Establish a mapping network within the special identifier database to match associated items for each identifier in the special identifier database;

[0039] Step 203: Using various types of identifiers as identifier nodes in a complex network and user access feature data as identifier node attributes, establish the association between identifier nodes.

[0040] Optionally, in some embodiments, establishing the association between the identifier nodes in step 203 further includes: if two identifier nodes are associated, establishing a directed edge between them. Optionally, in some embodiments, the node attributes of each pair of identifier nodes are compared, a matching algorithm is used to calculate the similarity value of the two identifier nodes, and, if the similarity value exceeds a set threshold, the two identifier nodes are identified as belonging to the same user.

[0041] Optionally, in some embodiments, the user access feature data includes at least one of the following: the user's operating system, IP address, time-series data, and other user access data.

[0042] Optionally, in some embodiments, establishing a user social circle relationship network model in step 300 includes:

[0043] Step 301: Classify user association methods, and for each type of association method, extract or define association features;

[0044] Step 302: Extract one or more features based on user association relationships, normalize the features, assign an initial weight to each feature, and the sum of all weights of the features equals 1;

[0045] Step 303: Define an association index formula for each type of association, and calculate the association index based on the feature quantities and weights of that type of association;

[0046] Step 304: Use a classification algorithm to update the weight of each feature in the association index and recalculate the association index; repeat step 304 if the weight update amount is not less than the weight update threshold, otherwise execute step 305.

[0047] Step 305: Using each user as a user node in a complex network, the attributes of the user nodes in the complex network are the individual information view of each user, and the user association index is used as the directed edge between the user nodes to establish a total user relationship view.

[0048] Step 306 yields the user relationship model.

[0049] Optionally, in some embodiments, after step 305, the method further includes: step 306, simplifying the total user relationship view by removing noisy data from it based on specific rules.

[0050] Optionally, in some embodiments, in step 300 the social circle is the network of relationships consisting of all individuals who have bidirectional contact with the user within a certain period of time.

[0051] Optionally, in some embodiments, the classification algorithm may be one or more of the following: decision tree C&R tree algorithm, decision tree CHAID algorithm, and logistic regression algorithm.
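The closed loop of step 304 can be sketched with logistic regression, one of the algorithms listed above. The text does not say how classifier output is turned into feature weights, so the mapping used here (clip negative coefficients to zero, then rescale to sum to 1), the learning rate, the stopping threshold, and the labelled sample pairs are all assumptions. The gradient-descent fit is written out in plain Python only to keep the example self-contained.

```python
import math

def sigmoid(z):
    """Logistic function, with z clamped to avoid overflow in exp."""
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def to_weights(theta):
    """Clip negative coefficients to 0 and rescale so the weights sum to 1."""
    clipped = [max(t, 0.0) for t in theta]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(theta)] * len(theta)
    return [c / total for c in clipped]

def association_index(weights, x):
    """Weighted sum of normalized features."""
    return sum(w * v for w, v in zip(weights, x))

def update_weights(X, y, init_weights, lr=0.5, eps=1e-4, max_rounds=5000):
    """Repeat: one classifier step, new weights, new indices, until the change is small."""
    theta, bias = list(init_weights), 0.0
    weights = to_weights(theta)
    indices = [association_index(weights, x) for x in X]
    n = len(X)
    for _ in range(max_rounds):
        grad, grad_b = [0.0] * len(theta), 0.0
        for x, label in zip(X, y):
            err = sigmoid(association_index(theta, x) + bias) - label
            grad = [g + err * v for g, v in zip(grad, x)]
            grad_b += err
        theta = [t - lr * g / n for t, g in zip(theta, grad)]
        bias -= lr * grad_b / n
        new_weights = to_weights(theta)
        delta = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        indices = [association_index(weights, x) for x in X]  # recalculate the index
        if delta < eps:  # weight update amount below the threshold
            break
    return weights, indices

# Hypothetical normalized features for six user pairs; label 1 = known related pair.
X = [[0.9, 0.5], [0.8, 0.4], [0.7, 0.6], [0.2, 0.5], [0.1, 0.6], [0.3, 0.4]]
y = [1, 1, 1, 0, 0, 0]
weights, indices = update_weights(X, y, [0.5, 0.5])
```

In this sample only the first feature separates related from unrelated pairs, so its weight grows at the expense of the second. A production version would more likely fit the model with an established library and validate the resulting weights on held-out pairs.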

[0052] The following example, using data from a single user, illustrates the process of generating an identifier relationship graph:

[0053]

[0054] The following is a specific method for establishing a user relationship model in this embodiment, taking the establishment of a user relationship model based on operator data as an example:

[0055] From a social perspective, a typical user relationship network is called a social circle. A user's social circle is defined as "all individuals with whom the user has bidirectional contact within a certain period of time." Depending on the nature of different social groups, social circles can be further divided into family circles, work circles, close friend circles, etc. Based on operator data, the identifiable connections between users are mainly: communication connections, geographical connections, and identity connections.

[0056] Communication association refers to the frequency of communication, geographical association refers to the proximity of geographical locations, and identity association refers to the primary and secondary relationship between user numbers.

[0057] Taking the definition of the communication association index as an example, this embodiment analyzes call data from billing details, trains on positive and negative samples, and summarizes and extracts features based on the training results. Combining three features (call duration, call frequency, and number of calls), the communication association index is defined.
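As a rough illustration of how the three features named above might be derived from billing detail records, the sketch below aggregates hypothetical call records per directed user pair. The record layout, the reading of "call frequency" as the number of distinct days with a call, the scaling by column maximum, and the weights are all assumptions; the sample-training step mentioned in the text is not reproduced.

```python
from collections import defaultdict

# Hypothetical call detail records: (caller, callee, day, duration_seconds).
records = [
    ("u1", "u2", "2024-01-01", 120),
    ("u1", "u2", "2024-01-01", 60),
    ("u1", "u2", "2024-01-02", 180),
    ("u2", "u1", "2024-01-02", 90),
    ("u1", "u3", "2024-01-03", 30),
]

def call_features(records):
    """Aggregate per directed pair: total duration, active days, call count."""
    acc = defaultdict(lambda: {"duration": 0, "days": set(), "calls": 0})
    for caller, callee, day, seconds in records:
        f = acc[(caller, callee)]
        f["duration"] += seconds
        f["days"].add(day)
        f["calls"] += 1
    return {pair: (f["duration"], len(f["days"]), f["calls"])
            for pair, f in acc.items()}

def communication_index(features, weights=(0.4, 0.3, 0.3)):
    """Scale each feature by its column maximum, then take the weighted sum."""
    maxima = [max(f[j] for f in features.values()) or 1 for j in range(3)]
    return {pair: sum(w * f[j] / maxima[j] for j, w in enumerate(weights))
            for pair, f in features.items()}

features = call_features(records)
comm_index = communication_index(features)
```

For the sample records, the pair (u1, u2) aggregates to 360 seconds over 2 days and 3 calls and receives the highest index, while the reverse direction (u2, u1) scores 0.35.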

[0058] For example, it is now necessary to establish a user social circle relationship model. Based on the main user associations in the operator's data, various association indices are calculated and a total user relationship view is formed, as shown in Figure 1.

[0059] Since the definition of a social circle requires "bidirectional contact," noisy data with only one-way communication, as well as offline users of no value, are removed.

[0060] For the social circle, service personnel such as delivery drivers and intermediaries represent noise data, affecting the identification of the user group, and therefore need to be excluded. These individuals typically have a large number of contacts and a low average call duration; therefore, they can be excluded based on their call characteristics.

[0061] Users with an association index of 0 should also be removed. The final user social circle relationship model is shown in Figure 2.
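The pruning rules in the three preceding paragraphs can be sketched as a single filter over the directed edges. The cut-off values (more than 200 contacts, average call shorter than 30 seconds) and the sample data are hypothetical, since the text gives no numeric thresholds; the removal of offline users is not modelled here.

```python
def prune_view(edges, contact_count, avg_call_seconds,
               max_contacts=200, min_avg_seconds=30):
    """edges maps (src, dst) -> association index; returns the pruned edge dict."""
    def is_service_like(user):
        # Many contacts combined with short calls suggests couriers, agents, etc.
        return (contact_count.get(user, 0) > max_contacts
                and avg_call_seconds.get(user, float("inf")) < min_avg_seconds)

    kept = {}
    for (src, dst), idx in edges.items():
        if idx <= 0:                       # zero association index
            continue
        if edges.get((dst, src), 0) <= 0:  # no contact in the reverse direction
            continue
        if is_service_like(src) or is_service_like(dst):
            continue
        kept[(src, dst)] = idx
    return kept

# Hypothetical total view and per-user call statistics.
edges = {
    ("u1", "u2"): 0.8, ("u2", "u1"): 0.6,
    ("u1", "u3"): 0.4,
    ("u1", "courier"): 0.2, ("courier", "u1"): 0.3,
    ("u2", "u4"): 0.0, ("u4", "u2"): 0.5,
}
contact_count = {"courier": 950, "u1": 40}
avg_call_seconds = {"courier": 12, "u1": 95}

social_circle = prune_view(edges, contact_count, avg_call_seconds)
```

Only the mutual u1/u2 edges survive: the u1 to u3 edge is one-way, the courier matches the service-personnel pattern, and the u2/u4 pair has a zero index in one direction.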

[0062] The following is a typical application scenario of this embodiment:

[0063] To support the group's efforts to win over users from other networks, a user information view and a user family circle relationship model are established to identify high-potential targets for targeted marketing. The user information view is built on operator data, starting with a user identifier database. A complex network is then constructed on top of this database to create a comprehensive user information view.

[0064] The system extracts users' basic telecommunications identification information from all user-related information tables in the data warehouse, constructing user telecommunications identification basic information databases, user equipment identification basic information databases, and user network identification information databases. By establishing a mapping relationship between user network identifiers and user identifiers and service identifiers in telecommunications data, a user information view is constructed.

[0065] After the user information view is established, each user's personal information view is treated as a node, and a user family circle relationship network is built according to actual business needs. The model framework is shown in Figure 3, and includes five parts: business understanding, model input, model algorithm, model output, and application and evaluation.

[0066] When establishing the family circle user relationship network, the feature quantities used are shown in Figure 4, including fixed network DPI, billing details, CRM, behavioral tags, terminal tags, and external data.

[0067] Based on these characteristics, a user relationship network modeling method is used to build the model, and clustering and decision tree algorithms are employed for analysis to output the final group of users who can be recruited for targeted marketing.

[0068] Example 2

[0069] A computer device, embodied as a general-purpose computing device. Its components may include, but are not limited to: one or more processors or processing units, system memory, and buses connecting the different system components.

[0070] Computer devices typically include a variety of computer system-readable media. These media can be any available media that can be accessed by a computer device, including volatile and non-volatile media, and removable and non-removable media.

[0071] The system memory may include a computer system readable medium in the form of volatile memory, and the memory may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of the embodiments of the present invention.

[0072] The processing unit executes various functional applications and data processing by running programs stored in the system memory, such as implementing a method for establishing a user information view and relationship model based on big data provided in other embodiments of the present invention.

[0073] Example 3: This embodiment of the invention also provides a storage medium containing computer-executable instructions, on which a computer program is stored. When the program is executed by a processor, it implements a method for establishing a user information view and relationship model based on big data, as provided in other embodiments of the invention.

[0074] Note that the above description is merely a preferred embodiment of the present invention and the technical principles employed. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in detail through the above embodiments, the present invention is not limited to the above embodiments, and may include many other equivalent embodiments without departing from the concept of the present invention, the scope of which is determined by the scope of the appended claims.

Claims

1. A method for establishing user information views and relationship models based on big data, characterized in that it comprises:
Step 100: Extract the user's basic telecommunications identification information from all user-related information tables in the data warehouse, and construct the user telecommunications identification basic information database, the user equipment identification basic information database, and the user network identification information database;
Step 200: Construct a user information view, which contains the mapping relationship between user network identifier and user identifier and service identifier in telecommunications data;
Step 300: Using each user in the user information view as a user node, establish a user social circle relationship network model;
wherein step 200 of constructing the user information view includes:
Step 201: Obtain user attribute information and classify it, and establish a special user identifier database for each type of information;
Step 202: Establish a mapping network within the special identifier database to match associated items for each identifier in the special identifier database;
Step 203: Using various identifiers as identifier nodes in a complex network and user access feature data as identifier node attributes, establish the association between identifier nodes;
and wherein step 300 of establishing the user social circle relationship network model includes:
Step 301: Classify user association methods, and for each type of association method, extract or define association features;
Step 302: Extract one or more features based on user association relationships, normalize the features, and assign an initial weight to each feature, the sum of all weights of the features being equal to 1;
Step 303: Define an association index formula for each type of association, and calculate the association index based on the feature quantities and weights of that type of association;
Step 304: Use a classification algorithm to update the weight of each feature in the association index and recalculate the association index; repeat step 304 if the weight update amount is not less than the weight update threshold, otherwise execute step 305;
Step 305: Using each user as a user node in a complex network, with the individual information view of each user as the attributes of the user node and the user association index as the directed edge between user nodes, establish a total user relationship view;
Step 306: Obtain the user relationship model.

2. The method for establishing user information views and relationship models based on big data according to claim 1, wherein step 203 of establishing the association between the identifier nodes further includes: if there is an association between identifier nodes, establishing a directed edge between them; and comparing the node attributes of each pair of identifier nodes, using a matching algorithm to calculate the similarity value between the two identifier nodes, and, if the similarity value exceeds a set threshold, identifying the two identifier nodes as belonging to the same user.

3. The method for establishing user information views and relationship models based on big data according to claim 1, wherein the user access feature data includes at least one of the following: the operating system used for network access, IP address, time-series data, and other user access data.

4. The method for establishing user information views and relationship models based on big data according to claim 1, wherein, after step 305, the method further includes: Step 306: Simplify the total user relationship view by removing noisy data from it based on specific rules.

5. The method for establishing user information views and relationship models based on big data according to claim 1, wherein, in step 300, the social circle is a network of relationships consisting of all individuals who have bidirectional contact with the user within a certain period of time.

6. The method for establishing user information views and relationship models based on big data according to claim 1, wherein, in step 304, the classification algorithm is one or more of the following: the C&R tree decision tree algorithm, the CHAID decision tree algorithm, and the logistic regression algorithm.

7. A computing device comprising a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other via the communication bus; the memory is used to store at least one executable program, which causes the processor to perform the operations corresponding to the method for establishing user information views and relationship models based on big data as described in any one of claims 1-6.

8. A computer storage medium storing at least one executable program, the executable program causing a processor to perform operations corresponding to the user information view and relation model establishment method based on big data as described in any one of claims 1-6.