An occupational label determination method, device, and electronic equipment

The user's application access list is obtained and its applications are divided into two categories: those from which an occupation can be identified and those from which it cannot. The user's occupational label is then determined from these two categories of applications. This addresses the low accuracy of user occupational information in existing technologies and makes user occupational identification more accurate and reliable.

CN117421473B · Active · Publication Date: 2026-06-23 · CHINA TELECOM CORP LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHINA TELECOM CORP LTD
Filing Date
2023-09-20
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing methods for determining users' occupational information have limitations and low accuracy, making it impossible to achieve differentiated and precise online marketing.

Method used

The user's application access list is obtained, and its applications are divided into two categories: those with identifiable occupations and those with unidentifiable occupations. The first occupation tag is determined based on the applications with identifiable occupations, and the second occupation tag is determined based on the applications with unidentifiable occupations. The user's occupation tag is determined by combining the two.

Benefits of technology

It improves the accuracy and reliability of user occupation identification and expands the scope of occupation identification.



Abstract

An embodiment of this application provides an occupational label determination method, device, and electronic equipment. The method comprises: obtaining an application access list of at least one user within a target time period; determining the occupational information and the category of each application; determining a first occupational label of a first user according to first occupational information corresponding to a first application, where the first application includes the occupationally identifiable applications in the application access list; determining a second occupational label of a second user according to second occupational information corresponding to a second application, where the second application includes the occupationally unidentifiable applications in the application access list; and determining the occupational label of the at least one user according to at least one of the first occupational label and the second occupational label. The embodiment can therefore address the limitations and low accuracy of prior-art methods for determining a user's occupational information.

Description

Technical Field

[0001] This invention relates to the field of communication technology, and in particular to a method, apparatus, and electronic device for determining occupational labels.

Background Technology

[0002] Different occupational groups have varying communication service needs in terms of call volume, data usage, network speed, and price. Therefore, a user's occupational attributes play a crucial role in enabling operators to develop differentiated marketing strategies. However, users do not need to provide their occupational information when applying for communication services or phone cards from operators. Operators need to determine a user's occupation by combining industry clusters at offline promotional locations or through corporate group buying, which cannot achieve large-scale, differentiated, and precise online marketing. Furthermore, determining a user's occupational information through industry clusters and corporate group buying has limitations and low accuracy.

[0003] Therefore, it can be seen that the existing methods for determining a user's occupational information have limitations and low accuracy.

Summary of the Invention

[0004] This application provides a method, apparatus, and electronic device for determining occupational labels, in order to address the limitations and low accuracy of existing methods for determining users' occupational information.

[0005] In a first aspect, embodiments of this application provide a method for determining occupational labels, the method comprising:

[0006] Obtain a list of applications accessed by at least one user within a target time period;

[0007] Determine the occupational information corresponding to each application in the application access list;

[0008] Determine the category to which each application in the application access list belongs, wherein the category includes occupationally identifiable applications and occupationally indeterminate applications;

[0009] In the presence of a first application, the first occupational tag of the first user is determined based on the first occupational information corresponding to the first application, wherein the first application includes occupationally identifiable applications in the application access list, and the first user includes the user who accessed the first application among the at least one user.

[0010] In the presence of a second application, the second occupation tag of the second user is determined based on the second occupation information corresponding to the second application, wherein the second application includes applications with undetermined occupations in the application access list, and the second user includes the user who accessed the second application among the at least one user;

[0011] The occupational label of the at least one user is determined based on at least one of the first occupational label and the second occupational label.

[0012] In a second aspect, embodiments of this application provide an occupational label determination device, the device comprising:

[0013] The list acquisition module is used to acquire at least one user's application access list within a target time period;

[0014] The occupational information determination module is used to determine the occupational information corresponding to each application in the application access list;

[0015] The category determination module is used to determine the category to which each application in the application access list belongs, wherein the category includes applications with identifiable occupations and applications with indeterminate occupations;

[0016] The first determining module is used to determine the first occupation tag of the first user based on the first occupation information corresponding to the first application when the first application exists, wherein the first application includes occupation-determinable applications in the application access list, and the first user includes the user who accessed the first application among the at least one user.

[0017] The second determining module is used to determine the second occupation tag of the second user based on the second occupation information corresponding to the second application when a second application exists, wherein the second application includes applications with undetermined occupations in the application access list, and the second user includes the user who accessed the second application among the at least one user.

[0018] The third determining module is used to determine the occupational label of the at least one user based on at least one of the first occupational label and the second occupational label.

[0019] In a third aspect, embodiments of this application provide an electronic device, including a memory, a transceiver, and a processor:

[0020] A memory for storing computer programs; a transceiver for sending and receiving data under the control of the processor; and a processor for reading the computer programs from the memory and executing the occupational label determination method described in the first aspect above.

[0021] In a fourth aspect, embodiments of this application provide a processor-readable storage medium storing a computer program for causing the processor to execute the occupational label determination method described in the first aspect.

[0022] In this embodiment, it is possible to obtain an application access list of at least one user within a target time period and determine the occupational information corresponding to each application in the application access list, thereby determining the category to which each application in the application access list belongs. Then, if a first application exists, a first occupational tag for the first user is determined based on the first occupational information corresponding to the first application. If a second application exists, a second occupational tag for the second user is determined based on the second occupational information corresponding to the second application. Then, an occupational tag for each user is determined based on at least one of the first and second occupational tags. The first application includes applications with identifiable occupations in the application access list, and the first user includes users among the at least one user who accessed the first application. The second application includes applications with indeterminate occupations in the application access list, and the second user includes users among the at least one user who accessed the second application.

[0023] Therefore, in this embodiment of the application, the applications in the user's application access list can be divided into two categories: those with identifiable occupations and those with indeterminate occupations. Thus, an occupational label (i.e., the first occupational label mentioned above) is determined based on the applications with identifiable occupations, and an occupational label (i.e., the second occupational label mentioned above) is determined based on the applications with indeterminate occupations. Then, the user's occupational label is determined based on at least one of the first occupational label and the second occupational label.

[0024] Because the accuracy and reliability of occupation identification differ significantly between applications with identifiable occupations and applications with unidentifiable occupations, the first occupation tag is determined solely from the former and the second occupation tag solely from the latter, so the two types of applications are never mixed. This ensures that the first occupation tag, obtained from the applications with identifiable occupations, has higher accuracy and reliability. Furthermore, by also using the second occupation tag, obtained from the applications with unidentifiable occupations, the user's occupation can be represented by both the first and second occupation tags. This not only improves the accuracy of determining the user's occupation but also expands the scope of occupation identification for the user.

Attached Figure Description

[0025] To illustrate the technical solutions of the embodiments of this application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below illustrate only some embodiments of this application. Those skilled in the art can obtain other drawings based on these drawings without creative effort.

[0026] Figure 1 is a flowchart of the occupational label determination method provided in an embodiment of this application;

[0027] Figure 2 is a flowchart of an example of determining the first score of the first occupational information in an embodiment of this application;

[0028] Figure 3 is a flowchart of an example of determining the second score of the second occupational information corresponding to applications with unidentifiable occupations in an embodiment of this application;

[0029] Figure 4 is a structural block diagram of the occupational label determination device provided in an embodiment of this application;

[0030] Figure 5 is a structural block diagram of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0031] In the embodiments of this application, the term "and/or" describes the relationship between associated objects, indicating that three relationships can exist. For example, A and/or B can represent three cases: A alone, A and B simultaneously, and B alone. The character "/" generally indicates that the preceding and following associated objects have an "or" relationship.

[0032] In the embodiments of this application, the term "multiple" refers to two or more, and other quantifiers are similar.

[0033] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of the embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.

[0034] In a first aspect, embodiments of this application provide a method for determining occupational labels. As shown in Figure 1, the method includes the following steps 101 to 106:

[0035] Step 101: Obtain a list of applications accessed by at least one user within the target time period.

[0036] In step 101, one user may correspond to one application access list, or multiple users may correspond to one application access list. It should be noted that in this embodiment, the user's application access list is obtained with the user's authorization.

[0037] Optionally, step 101, "obtaining a list of applications accessed by at least one user within a target time period," includes:

[0038] Obtain the Uniform Resource Locator (URL) data of the at least one user during the target time period, and parse the application access list of the at least one user based on the URL data.

[0039] A URL is a concise representation of the location and access method of a resource available on the Internet; it is the address of a standard resource on the Internet. Every file on the Internet has a unique URL, which contains information that indicates the file's location and how the browser should handle it.

[0040] A basic URL includes a scheme (or protocol), a server name (or IP address), a path, and a filename, for example "protocol://authority/path?query". The complete syntax of a generic Uniform Resource Identifier (URI) with an authority part looks like this: protocol://username:password@subdomain.domain.top-level-domain:port/directory/filename.extension?parameter=value#fragment. Therefore, by parsing the domain information in the URL, it is possible to determine which specific applications (i.e., apps) the user accessed, thus obtaining a list of applications accessed by each user.

[0041] In addition, the target time period can be the current month or the most recent three months.
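As a rough illustration of step 101, the following Python sketch parses the host out of each URL and maps it to an application name. The table DOMAIN_TO_APP, the function names, and the example domains are hypothetical and not part of this application; the records are assumed to have already been filtered to the target time period.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical domain-to-application table (assumption, for illustration only).
DOMAIN_TO_APP = {
    "example-doctor-app.com": "DoctorApp",
    "example-video.com": "VideoApp",
}


def app_for_url(url):
    """Return the application name for a URL, or None if the host is unknown."""
    host = urlsplit(url).hostname  # lower-cased host, without port or user info
    if host is None:
        return None
    # Try the full host first, then progressively shorter parent domains.
    labels = host.split(".")
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in DOMAIN_TO_APP:
            return DOMAIN_TO_APP[candidate]
    return None


def build_access_lists(records):
    """Build {user_id: sorted list of apps} from (user_id, url) pairs."""
    access = defaultdict(set)
    for user_id, url in records:
        app = app_for_url(url)
        if app is not None:
            access[user_id].add(app)
    return {user: sorted(apps) for user, apps in access.items()}
```

Matching on parent domains is one possible design choice here: it lets subdomains such as "api." or "www." resolve to the same application without listing each one.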

[0042] Step 102: Determine the occupational information corresponding to each application in the application access list.

[0043] Optionally, before determining the occupation information corresponding to each application in the application access list, the method further includes:

[0044] Obtain description information of at least one application, and determine a first correspondence between the at least one application and occupational information based on the description information;

[0045] Determining the occupational information corresponding to each application in the application access list includes:

[0046] Based on the first correspondence, determine the occupational information corresponding to each application in the application access list.

[0047] It should be noted that when determining the occupational information corresponding to each application in the application access list based on the first correspondence, if the occupational information corresponding to a certain application in the application access list exists in the first correspondence, then that occupational information is the occupational information corresponding to that application; however, if the occupational information corresponding to a certain application in the application access list does not exist in the first correspondence, the description information of the application can be crawled from the Internet, and the occupational information corresponding to the application can be determined based on the description information, and then the correspondence between the application and the occupational information can be stored.
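The lookup-with-fallback behaviour described above can be sketched in Python as follows. This is a minimal sketch only: the dictionary stands in for the stored first correspondence, and fetch_description and infer_occupation are hypothetical callables supplied by the caller (for example, a crawler and a keyword extractor), not functions defined by this application.

```python
def occupation_for_app(app, correspondence, fetch_description, infer_occupation):
    """Look up an application's occupational information, filling the table on a miss.

    correspondence: dict standing in for the stored first correspondence
                    (application name -> occupational information).
    fetch_description, infer_occupation: hypothetical callables (assumptions).
    """
    if app in correspondence:
        return correspondence[app]
    description = fetch_description(app)        # e.g. crawl the app-store page
    occupation = infer_occupation(description)  # e.g. keyword extraction
    correspondence[app] = occupation            # store the new correspondence
    return occupation
```

Because the result is stored on a miss, the description of a given application only needs to be fetched once.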

[0048] The first correspondence between applications and occupational information can be stored in a database.

[0049] In addition, the description information of the aforementioned application may include at least one of the description information of the application in the app store and the description information of the application on the official website of the application.

[0050] Furthermore, when determining the first correspondence between an application and occupational information based on description information, keywords can be extracted from the application's description to determine the corresponding occupational information. Keyword extraction can be done in many ways, and the extracted keywords may include, for example, job titles and professional actions.

[0051] For example, the description of the "Medical World Doctor Station" app in an app store is as follows: "Medical World Doctor Station is a leading medical new media and doctor service platform in China, focusing on the professional learning, development, and growth of medical professionals. It provides high-quality content services to doctors by aggregating professional and authoritative medical information, high-quality and cutting-edge expert live broadcasts, practical clinical courses, and targeted professional examination question banks. Medical World Doctor Station also provides doctors with free clinical decision-making tools such as guidelines, medications, and formulas, and is committed to promoting the development of China's medical industry. To date, it has over 2.5 million registered users!"

[0052] Alternatively, for example, the description of the "Medical World Doctor Station" application on the Medical World official website is as follows: "Medical World Doctor Station is a learning and service platform launched by Medical World, focusing on the professional growth of medical professionals. It mainly provides medical information, industry conferences, expert live broadcasts, clinical courses, exam question banks, medical tools, etc., to help doctors to continuously advance conveniently and at low cost, accompanying the most ambitious doctors of this era!"

[0053] Therefore, the keywords in the description information of the "Medical World Doctor Station" application can include "doctor" and "medical practitioner", so the "Medical World Doctor Station" application can correspond to the occupation of "doctor".

[0054] As another example, the keywords in the description information of "HaoDf.com" may include "doctor" and "patient", so the "HaoDf.com" application could correspond to "doctor" or "patient". Because "patient" is not an occupation, the "HaoDf.com" application corresponds only to the occupation of "doctor". However, since it serves more than one user group, it is an application with an unidentifiable occupation.

[0055] As can be seen from the above, in this embodiment of the application, based on at least one of the application description information in the application store and the application description information on the application's official website, the occupational information corresponding to the application can be determined more accurately.

[0056] Step 103: Determine the category to which each application in the application access list belongs.

[0057] The categories include applications with identifiable occupations and applications with unidentifiable occupations.

[0058] Optionally, determining the category to which each application in the access list belongs includes:

[0059] Based on a pre-determined second correspondence between applications and categories, the category to which each application in the access list belongs is determined.

[0060] Here, the second correspondence can also be stored in a database. In one embodiment, the first and second correspondences mentioned above can be recorded together in a database. For example, the database records the occupational information corresponding to each application and whether it is an application with an identifiable occupation or an application with an unidentifiable occupation, as shown in Table 1.

[0061] Table 1. Examples of Correspondence between Applications, Occupational Information, and Categories

[0062]-[0064] (The body of Table 1 is not reproduced in this text.)

[0065] In addition, the correspondence between an application and its category can also be determined based on the application's description information. That is, keywords can be extracted from the application's description information. If the extracted keywords relate to only one user group, the application belongs to the category of applications with an identifiable occupation; if the extracted keywords relate to multiple user groups, the application belongs to the category of applications with an unidentifiable occupation. For example, the keywords in the description information of the "Medical World Doctor Station" application may include "doctor" and "medical practitioner"; therefore, the "Medical World Doctor Station" application corresponds to the occupation of "doctor" and belongs to the category of applications with an identifiable occupation.

[0066] As another example, the keywords in the description information of "HaoDf. Online" may include "doctor" and "patient", so the "HaoDf. Online" application could correspond to "doctor" or "patient". Because "patient" is not an occupation, the "HaoDf. Online" application corresponds only to the occupation of "doctor", but it is an application with an unidentifiable occupation.
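The keyword-based categorisation can be sketched in Python as below. The keyword table, the simple substring matching, and the category strings are assumptions made for illustration; the sketch also interprets "one keyword related to user information" as "keywords pointing to a single user group", which matches the two examples above but is not stated explicitly in the text.

```python
# Hypothetical keyword table: keyword -> (user group, whether the group is an occupation).
KEYWORD_GROUPS = {
    "doctor": ("doctor", True),
    "medical practitioner": ("doctor", True),
    "patient": ("patient", False),
    "teacher": ("teacher", True),
}


def classify_app(description):
    """Return (occupations, category) derived from an application description.

    category is "identifiable" when all matched keywords point to one user group,
    "unidentifiable" when they point to several, and None when no occupation is found.
    Matching is plain substring search, which is crude but enough for a sketch.
    """
    text = description.lower()
    groups = {}
    for keyword, (group, is_occupation) in KEYWORD_GROUPS.items():
        if keyword in text:
            groups[group] = is_occupation
    occupations = sorted(g for g, is_occ in groups.items() if is_occ)
    if not occupations:
        category = None
    elif len(groups) == 1:
        category = "identifiable"
    else:
        category = "unidentifiable"
    return occupations, category
```

With this table, a description mentioning only doctors and medical practitioners yields one user group, while a description mentioning doctors and patients yields two, so the application keeps the occupation "doctor" but falls into the unidentifiable category.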

[0067] Step 104: If a first application exists, determine the first occupation tag of the first user based on the first occupation information corresponding to the first application.

[0068] The first application includes the occupationally identifiable applications in the application access list, and the first user includes the user who accessed the first application among the at least one user.

[0069] Step 105: If a second application exists, determine the second user's second occupation tag based on the second occupation information corresponding to the second application.

[0070] The second application includes applications with undetermined occupations from the application access list, and the second user includes the user who accessed the second application from among the at least one user.

[0071] Step 106: Determine the occupational label of the at least one user based on at least one of the first occupational label and the second occupational label.

[0072] As can be seen from steps 104 and 105 above, if the application access list contains applications with identifiable occupations, then the first occupation tag of the first user accessing these applications can be determined based on the occupation information corresponding to those applications; that is, an occupation tag is determined based on the identifiable occupation applications. Conversely, if the application access list contains applications with unidentifiable occupations, then the second occupation tag of the second user accessing these applications can be determined based on the occupation information corresponding to those applications; that is, an occupation tag is determined based on the unidentifiable occupation applications. Thus, the same user may have both a first and a second occupation tag, or only the first occupation tag, or only the second occupation tag.

[0073] Therefore, optionally, step 106, "determining the occupational label of the at least one user based on at least one of the first occupational label and the second occupational label," includes:

[0074] If the h-th user has both the first occupation tag and the second occupation tag, and each tag has a corresponding score, then one of the following applies: the tag with the larger score is selected as the occupation tag of the h-th user; or the first occupation tag is selected preferentially; or the tag with the larger score is selected as the primary occupation tag of the h-th user and the tag with the smaller score is selected as the secondary occupation tag of the h-th user;

[0075] If only the first occupational tag exists for the h-th user, then the first occupational tag is determined to be the occupational tag of the h-th user;

[0076] If only the second occupational label exists for the h-th user, then the second occupational label is determined to be the occupational label for the h-th user;

[0077] Where h is an integer from 1 to H, and H represents the number of the at least one user.

[0078] Therefore, if a user has a first occupation tag and a second occupation tag, and both have corresponding scores, then the occupation tag with the higher score can be selected as the user's occupation tag. If a user only has a first occupation tag, then that first occupation tag is the user's occupation tag. If a user only has a second occupation tag, then that second occupation tag is the user's occupation tag.
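The selection rules of step 106 can be sketched in Python as follows. The function name, the (label, score) tuple layout, and the strategy names "max_score", "prefer_first", and "primary_secondary" are hypothetical; they simply label the three alternatives described for a user who has both tags.

```python
def final_labels(first=None, second=None, strategy="max_score"):
    """Combine one user's first and second occupation tags.

    first, second: optional (label, score) pairs.
    Returns a list of labels with the primary label first.
    """
    if first is None and second is None:
        return []
    if second is None:
        return [first[0]]   # only the first tag exists
    if first is None:
        return [second[0]]  # only the second tag exists
    if strategy == "prefer_first":
        return [first[0]]
    # Stable sort: on equal scores the first tag stays ahead of the second.
    ranked = sorted([first, second], key=lambda pair: pair[1], reverse=True)
    if strategy == "primary_secondary":
        return [label for label, _ in ranked]
    return [ranked[0][0]]   # "max_score"
```

For example, final_labels(("doctor", 0.4), ("teacher", 0.6), strategy="primary_secondary") returns ["teacher", "doctor"].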

[0079] As can be seen from steps 101 to 106 above, in this embodiment of the application, the applications in the user's application access list can be divided into two categories: those with identifiable occupations and those with indeterminate occupations. Thus, an occupational label (i.e., the first occupational label mentioned above) is determined based on the applications with identifiable occupations, and an occupational label (i.e., the second occupational label mentioned above) is determined based on the applications with indeterminate occupations. Then, the user's occupational label is determined based on at least one of the first occupational label and the second occupational label.

[0080] Because the accuracy and reliability of occupation identification differ significantly between applications with identifiable occupations and applications with unidentifiable occupations, the first occupation tag is determined solely from the former and the second occupation tag solely from the latter, so the two types of applications are never mixed. This ensures that the first occupation tag, obtained from the applications with identifiable occupations, has higher accuracy and reliability. Furthermore, by also using the second occupation tag, obtained from the applications with unidentifiable occupations, the user's occupation can be represented by both the first and second occupation tags. This not only improves the accuracy of determining the user's occupation but also expands the scope of occupation identification for the user.

[0081] The following sections describe the specific process of determining the first occupation label based on the occupationally identifiable applications in step 104, and the specific process of determining the second occupation label based on the occupationally unidentifiable applications in step 105.

[0082] Step 104: The specific process of determining the first occupational label based on the occupationally identifiable applications:

[0083] Optionally, step 104 above, "determining the first occupation tag of the first user based on the first occupation information corresponding to the first application," includes the following steps A-1 to A-2:

[0084] Step A-1: Determine the first score of the first occupational information corresponding to each of the first applications accessed by each of the first users;

[0085] Step A-2: Determine the first occupational tag for each of the first users from the first occupational information based on the first score.

[0086] Therefore, if there are occupationally identifiable applications in the application access list, the first score of the first occupational information corresponding to each occupationally identifiable application can be determined. Based on the first score, the information that can serve as the first occupation tag can be selected from the first occupational information.

[0087] It will be understood that, alternatively, the number of identical or similar occupational information entries in the first occupational information of the first applications accessed by the same first user can be counted, and the occupational information with the largest count can be selected as the first occupational tag of that first user.
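The counting alternative can be sketched in Python with collections.Counter. This sketch only groups identical strings; grouping "similar" occupational information would need an additional normalisation step that is not shown, and the function name is hypothetical.

```python
from collections import Counter


def most_common_occupation(occupations):
    """Return the most frequent occupation among one user's first applications.

    occupations: list with one occupational information entry per first application.
    Returns None for an empty list; on ties, the entry seen first wins.
    """
    if not occupations:
        return None
    return Counter(occupations).most_common(1)[0][0]
```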

[0088] Optionally, step A-1 above, "determining the first score of the first occupation information corresponding to each of the first applications accessed by each first user," includes the following steps A-1.1 to A-1.4:

[0089] Step A-1.1: Obtain the values of J access metrics for each of the first applications accessed by the v1-th first user, where v1 is an integer from 1 to V1, V1 represents the number of the first users, and J is an integer greater than zero;

[0090] Step A-1.2: Normalize the values of the J access metrics of each of the first applications accessed by the v1-th first user to obtain the standard values of the J access metrics of each of the first applications accessed by the v1-th first user;

[0091] Step A-1.3: Determine the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user;

[0092] Step A-1.4: Based on the weights of the J access metrics corresponding to the v1-th first user, the standard values of the J access metrics of the q-th first application accessed by the v1-th first user are weighted and summed to obtain the first score of the first occupational information corresponding to the q-th first application accessed by the v1-th first user, where q is an integer from 1 to Q, and Q represents the number of the first applications accessed by the v1-th first user in the application access list.

[0093] In step A-1.1 above, the user's URL data on the internet can be obtained. This data, combined with the file memory usage of each URL data packet, allows the user's browsing behavior metrics for each application to be calculated and then stored in a database. As one implementation, the J access metrics mentioned above may include, for example, the number of clicks, the traffic, and the number of access days. It is understood that this is merely an example of the J access metrics, and metrics can be added or removed according to actual needs.
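Steps A-1.1 to A-1.4 can be sketched in Python for a single first user as shown below. The text up to this point does not state which normalization or weighting scheme is used, so this sketch assumes min-max normalization for step A-1.2 and the entropy weight method for step A-1.3; both are illustrative assumptions, and all function names are hypothetical.

```python
import math


def min_max_normalize(column):
    """Step A-1.2 (assumed min-max): scale one metric across a user's apps into [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [1.0] * len(column)  # constant metric: nothing to rescale
    return [(v - lo) / (hi - lo) for v in column]


def entropy_weights(columns):
    """Step A-1.3 (assumed entropy weight method): J weights that sum to 1.

    columns: list of J normalized columns, each holding Q values (one per app).
    A metric whose values barely differ across apps receives a weight near zero.
    """
    q = len(columns[0])
    divergences = []
    for col in columns:
        total = sum(col)
        if q < 2 or total == 0:
            divergences.append(0.0)
            continue
        entropy = 0.0
        for z in col:
            p = z / total
            if p > 0:
                entropy -= p * math.log(p)
        divergences.append(max(0.0, 1.0 - entropy / math.log(q)))
    s = sum(divergences)
    if s < 1e-12:  # no metric discriminates: fall back to equal weights
        return [1.0 / len(columns)] * len(columns)
    return [d / s for d in divergences]


def first_scores(metrics):
    """Steps A-1.1 to A-1.4 for one user.

    metrics: {app: [value of metric 1, ..., value of metric J]}.
    Returns {app: weighted sum of the app's standard values}.
    """
    apps = list(metrics)
    j_count = len(next(iter(metrics.values())))
    columns = [min_max_normalize([metrics[a][j] for a in apps]) for j in range(j_count)]
    weights = entropy_weights(columns)
    return {
        app: sum(weights[j] * columns[j][i] for j in range(j_count))
        for i, app in enumerate(apps)
    }
```

Because the weights are computed per user from that user's own standard values, two users with different browsing patterns can end up weighting the same metric differently, which matches the per-user weights described in step A-1.3.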

[0094] For example, if the J access metrics mentioned above include metric 1, metric 2, and metric 3, the first users include user A and user B, and there are 3 occupationally identifiable applications in the application access list (e.g., APP1, APP2, and APP3), then steps A-1.1 to A-1.4 above can proceed as shown in Figure 2, with the details as follows:

[0095] Obtain the following first and second sets of data:

[0096] The first set of data includes: when user A accesses APP1, the statistical values of indicator 1 (A11), indicator 2 (A12), and indicator 3 (A13); when user A accesses APP2, the statistical values of indicator 1 (A21), indicator 2 (A22), and indicator 3 (A23); and when user A accesses APP3, the statistical values of indicator 1 (A31), indicator 2 (A32), and indicator 3 (A33).

[0097] The second set of data includes: when user B accesses APP1, the statistical values of indicator 1 (B11), indicator 2 (B12), and indicator 3 (B13); when user B accesses APP2, the statistical values of indicator 1 (B21), indicator 2 (B22), and indicator 3 (B23); and when user B accesses APP3, the statistical values of indicator 1 (B31), indicator 2 (B32), and indicator 3 (B33).

[0098] Then, the first and second sets of data are normalized to obtain the following data:

[0099] The third set of data includes: when user A accesses APP1, the standard values of indicator 1 (AZ11), indicator 2 (AZ12), and indicator 3 (AZ13); when user A accesses APP2, the standard values of indicator 1 (AZ21), indicator 2 (AZ22), and indicator 3 (AZ23); and when user A accesses APP3, the standard values of indicator 1 (AZ31), indicator 2 (AZ32), and indicator 3 (AZ33).

[0100] The fourth set of data includes: when user B accesses APP1, the standard values of indicator 1 (BZ11), indicator 2 (BZ12), and indicator 3 (BZ13); when user B accesses APP2, the standard values of indicator 1 (BZ21), indicator 2 (BZ22), and indicator 3 (BZ23); and when user B accesses APP3, the standard values of indicator 1 (BZ31), indicator 2 (BZ32), and indicator 3 (BZ33).

[0101] Next, based on the third set of data, determine the weights of indicator 1 (WA1), indicator 2 (WA2), and indicator 3 (WA3) for user A; based on the fourth set of data, determine the weights of indicator 1 (WB1), indicator 2 (WB2), and indicator 3 (WB3) for user B.

[0102] Furthermore, based on the weights WA1, WA2, and WA3 corresponding to user A, the standard values AZ11, AZ12, and AZ13 obtained when user A accesses APP1 are weighted and summed to obtain the first score of occupation 1 corresponding to APP1 for user A: SA1 = AZ11*WA1 + AZ12*WA2 + AZ13*WA3. Similarly, the first score of occupation 2 corresponding to APP2 for user A is SA2 = AZ21*WA1 + AZ22*WA2 + AZ23*WA3, and the first score of occupation 3 corresponding to APP3 for user A is SA3 = AZ31*WA1 + AZ32*WA2 + AZ33*WA3.

[0103] Likewise, for user B, the first score of occupation 1 corresponding to APP1 is SB1 = BZ11*WB1 + BZ12*WB2 + BZ13*WB3, the first score of occupation 2 corresponding to APP2 is SB2 = BZ21*WB1 + BZ22*WB2 + BZ23*WB3, and the first score of occupation 3 corresponding to APP3 is SB3 = BZ31*WB1 + BZ32*WB2 + BZ33*WB3.

[0104] In step A-1.2 above, normalization can be performed using either a positive-indicator processing method or a negative-indicator processing method.

[0105] Specifically, based on the following formula (1), the positive-indicator processing method can be used to normalize the values of the J access metrics of each first application accessed by the v1-th first user:

[0106]

[0107] Here, $X_{dj}$ represents the standard value of the j-th access metric of the d-th first application accessed by the v1-th first user, and $M_{dj}$ represents the value of the j-th access metric when the v1-th first user accesses the d-th first application (i.e., the value before normalization); j is an integer from 1 to J, d is an integer from 1 to D, and D represents the number of first applications accessed by the v1-th first user.

[0108] Alternatively, based on the following formula (2), the negative-indicator processing method can be used to normalize the values of the J access metrics of each first application accessed by the v1-th first user:

[0109]
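Formulas (1) and (2) are not reproduced in this text. As a minimal sketch, assuming they are the usual min-max scalings (an assumption that the surrounding paragraphs do not confirm), the two processing methods could look as follows in Python. The function name `min_max_normalize`, the handling of a constant metric, and the sample values are all hypothetical.

```python
from typing import List


def min_max_normalize(values: List[float], positive: bool = True) -> List[float]:
    """Scale one metric's values across a user's first applications into [0, 1].

    positive=True  -> larger raw values give larger standard values.
    positive=False -> larger raw values give smaller standard values.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a metric with no spread is mapped to all zeros.
        return [0.0 for _ in values]
    if positive:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


# Hypothetical click counts of one user on APP1, APP2, APP3.
clicks = [10.0, 30.0, 20.0]
standard_pos = min_max_normalize(clicks)
standard_neg = min_max_normalize(clicks, positive=False)
```

With this scaling the application with the smallest raw value receives a standard value of zero, which is why the entropy step that follows needs a convention for zero proportions.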

[0110] Optionally, step A-1.3 above, "determining the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user," includes:

[0111] Using the entropy weight method to determine the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user.

[0112] Specifically, using the entropy weight method to determine these weights includes:

[0113] According to the first formula $P_{ij} = X_{ij} / \sum_{i=1}^{n} X_{ij}$, calculate the proportion $P_{ij}$ of the i-th first application among all first applications under the j-th access metric corresponding to the v1-th first user. Here, $X_{ij}$ represents the standard value of the j-th access metric of the i-th first application corresponding to the v1-th first user, where j is an integer from 1 to J, i is an integer from 1 to n, and n represents the number of first applications under the j-th access metric corresponding to the v1-th first user.

[0114] According to the second formula $E_j = -k \sum_{i=1}^{n} P_{ij} \ln(P_{ij})$, calculate the entropy value $E_j$ of the j-th access metric corresponding to the v1-th first user, where $k = 1 / \ln(n)$; when $P_{ij} = 0$, $P_{ij} \ln(P_{ij}) = 0$; when n = 1, k = 0;

[0115] According to the third formula $D_j = 1 - E_j$, calculate the information entropy redundancy $D_j$ of the j-th access metric corresponding to the v1-th first user;

[0116] According to the fourth formula $W_j = D_j / \sum_{j=1}^{J} D_j$, calculate the weight $W_j$ of the j-th access metric corresponding to the v1-th first user.

[0117] It should be noted that if $\sum_{j=1}^{J} D_j = 0$, then $W_j = 1 / J$. For example, if J = 3 and $\sum_{j=1}^{3} D_j = 0$, then $W_j = 1 / 3$.

[0118] For example, in the above example, the weight WA1 of indicator 1 corresponding to user A is determined from the third set of data as follows:

[0119] First, calculate the proportion of APP1 among APP1 to APP3 under indicator 1 for user A: $P_{11} = \mathrm{AZ11} / (\mathrm{AZ11} + \mathrm{AZ21} + \mathrm{AZ31})$; the proportion of APP2 among APP1 to APP3 under indicator 1 for user A: $P_{21} = \mathrm{AZ21} / (\mathrm{AZ11} + \mathrm{AZ21} + \mathrm{AZ31})$; and the proportion of APP3 among APP1 to APP3 under indicator 1 for user A: $P_{31} = \mathrm{AZ31} / (\mathrm{AZ11} + \mathrm{AZ21} + \mathrm{AZ31})$;

[0120] Similarly, obtain the proportions of APP1, APP2, and APP3 among APP1 to APP3 under indicator 2 for user A: $P_{12}$, $P_{22}$, $P_{32}$; and the proportions of APP1, APP2, and APP3 among APP1 to APP3 under indicator 3 for user A: $P_{13}$, $P_{23}$, $P_{33}$;

[0121] Next, calculate the entropy value of indicator 1 corresponding to user A: $E_1 = -k\,(P_{11}\ln(P_{11}) + P_{21}\ln(P_{21}) + P_{31}\ln(P_{31}))$; the entropy value of indicator 2 corresponding to user A: $E_2 = -k\,(P_{12}\ln(P_{12}) + P_{22}\ln(P_{22}) + P_{32}\ln(P_{32}))$; and the entropy value of indicator 3 corresponding to user A: $E_3 = -k\,(P_{13}\ln(P_{13}) + P_{23}\ln(P_{23}) + P_{33}\ln(P_{33}))$;

[0122] Then, calculate the information entropy redundancy of indicator 1 corresponding to user A: $D_1 = 1 - E_1$; of indicator 2: $D_2 = 1 - E_2$; and of indicator 3: $D_3 = 1 - E_3$.

[0123] Finally, calculate the weight of indicator 1 corresponding to user A: $\mathrm{WA1} = D_1 / (D_1 + D_2 + D_3)$; the weight of indicator 2: $\mathrm{WA2} = D_2 / (D_1 + D_2 + D_3)$; and the weight of indicator 3: $\mathrm{WA3} = D_3 / (D_1 + D_2 + D_3)$.
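The entropy weight calculation for one user can be sketched in Python as follows. This is an illustrative sketch rather than the implementation of the embodiment. The function name `entropy_weights`, the treatment of an all-zero metric column, the clamping of tiny negative redundancies caused by floating-point rounding, and the tolerance used to detect a zero sum are assumptions.

```python
import math
from typing import List


def entropy_weights(X: List[List[float]]) -> List[float]:
    """X[i][j] is the standard value of metric j for application i (one user).

    Returns one weight per metric, following proportion -> entropy ->
    redundancy -> weight.
    """
    n, J = len(X), len(X[0])
    k = 1.0 / math.log(n) if n > 1 else 0.0  # k = 0 when n = 1
    redundancy = []
    for j in range(J):
        column = [X[i][j] for i in range(n)]
        total = sum(column)
        entropy = 0.0
        if total > 0:  # assumption: an all-zero column contributes zero entropy
            for x in column:
                p = x / total
                if p > 0:  # P * ln(P) is taken as 0 when P = 0
                    entropy -= k * p * math.log(p)
        # Clamp to avoid tiny negative values from rounding.
        redundancy.append(max(0.0, 1.0 - entropy))
    total_redundancy = sum(redundancy)
    if math.isclose(total_redundancy, 0.0, abs_tol=1e-12):
        return [1.0 / J] * J  # fallback: equal weights
    return [d / total_redundancy for d in redundancy]


# Hypothetical standard values: metric 0 is identical across three
# applications, metric 1 is concentrated on the first application.
weights = entropy_weights([[1.0, 1.0], [1.0, 0.0], [1.0, 0.0]])
```

A metric whose values are spread evenly across applications has entropy close to 1 and therefore receives almost no weight, while a metric that discriminates between applications receives most of it.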

[0124] Optionally, determining the first occupational tag of each first user from the first occupational information based on the first score includes:

[0125] Selecting the maximum value among the first scores corresponding to the v1-th first user, and determining the first occupational information corresponding to that maximum value as the first occupational tag of the v1-th first user;

[0126] where v1 is an integer from 1 to V1, and V1 represents the number of first users.

[0127] For example, if the v1-th first user has only one first score, the occupational information to which that first score belongs is the user's first occupational tag; if the v1-th first user has multiple first scores, the occupational information to which the maximum of these first scores belongs is the user's first occupational tag.
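The weighted sum and the selection of the maximum first score can be sketched as follows. The standard values, the weights, and the helper names `first_scores` and `first_tag` are hypothetical and serve only to illustrate the two steps.

```python
from typing import Dict, List


def first_scores(std_values: Dict[str, List[float]], weights: List[float]) -> Dict[str, float]:
    """Weighted sum of a user's standard metric values, one score per application."""
    return {
        app: sum(x * w for x, w in zip(values, weights))
        for app, values in std_values.items()
    }


def first_tag(scores: Dict[str, float], app_to_occupation: Dict[str, str]) -> str:
    """Occupation of the application with the highest first score."""
    best_app = max(scores, key=lambda app: scores[app])
    return app_to_occupation[best_app]


# Hypothetical standard values of indicators 1 to 3 for user A.
user_a = {
    "APP1": [1.0, 0.5, 0.0],
    "APP2": [0.0, 1.0, 1.0],
    "APP3": [0.5, 0.0, 0.5],
}
weights_a = [0.5, 0.3, 0.2]  # hypothetical WA1, WA2, WA3
occupations = {"APP1": "occupation 1", "APP2": "occupation 2", "APP3": "occupation 3"}

scores_a = first_scores(user_a, weights_a)
tag_a = first_tag(scores_a, occupations)
```

With these numbers APP1 has the highest first score, so occupation 1 becomes the first occupational tag of user A.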

[0128] II. Step 105: The specific process of determining the second occupational tag based on applications whose occupation cannot be determined:

[0129] Optionally, step 105 above, "determining the second user's second occupation tag based on the second occupation information corresponding to the second application," includes the following steps B-1 to B-2:

[0130] Step B-1: Determine the second score of the second occupational information corresponding to each of the second applications accessed by each of the second users;

[0131] Step B-2: Determine the second occupational tag of the second user from the second occupational information based on the second score.

[0132] Therefore, if the application access list contains applications with uncertain occupations, a second score can be determined for the second occupational information corresponding to each such application. Based on the second score, the information that can serve as the second occupational tag is selected from the second occupational information.

[0133] It is understood that, as an alternative, the number of identical or similar entries in the second occupational information of the second applications corresponding to the same second user can be counted, and the occupational information with the largest count selected as the second occupational tag of that second user.
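The counting alternative can be sketched as follows. This sketch counts only identical entries; how "similar" entries would be merged is not specified above and is left out. The function name and the sample occupations are hypothetical.

```python
from collections import Counter
from typing import List


def majority_occupation(occupations: List[str]) -> str:
    """Most frequent occupation among a user's second applications.

    Ties are resolved in favor of the occupation seen first.
    """
    return Counter(occupations).most_common(1)[0][0]


# Hypothetical occupations attached to the second applications of one user.
second_tag_by_count = majority_occupation(["occupation 4", "occupation 5", "occupation 4"])
```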

[0134] Optionally, in step B-1 above, determining the second score of the second occupational information corresponding to each of the second applications accessed by each second user includes the following steps B-1.1 to B-1.3:

[0135] Step B-1.1: Obtain the values of J access metrics for each second user accessing each second application, where J is a positive integer;

[0136] Step B-1.2: Based on the values of the J access metrics of the second users corresponding to the m-th second application, perform clustering on the second users corresponding to the m-th second application to obtain at least two clusters, where m is an integer from 1 to M, and M represents the number of second applications;

[0137] Step B-1.3: Determine the silhouette coefficient of the clusters corresponding to the m-th second application, and use the silhouette coefficient as the second score of the second occupational information corresponding to the m-th second application.

[0138] In step B-1.1 above, the user's internet access URL data can be obtained. From this data, combined with the file size of each URL data packet, metrics of the user's browsing behavior on each application can be calculated and stored in a database. In one implementation, the J access metrics may include, for example, the number of clicks, the traffic volume, and the number of days accessed. It is understood that this is merely an example, and metrics can be added or removed according to actual needs.

[0139] For example, if the J access metrics include indicator 1, indicator 2, and indicator 3, the second users include user A, user B, and user C, the application access list contains two applications with uncertain occupations (e.g., APP4 and APP5), and the clustering uses a binary clustering algorithm, then steps B-1.1 to B-1.3 above can proceed as shown in Figure 3. The details are as follows:

[0140] Obtain the following fifth, sixth, and seventh sets of data:

[0141] The fifth set of data includes: when user A accesses APP4, the statistical values of indicator 1 (A41), indicator 2 (A42), and indicator 3 (A43); and when user A accesses APP5, the statistical values of indicator 1 (A51), indicator 2 (A52), and indicator 3 (A53).

[0142] The sixth set of data includes: when user B accesses APP4, the statistical values of indicator 1 (B41), indicator 2 (B42), and indicator 3 (B43); and when user B accesses APP5, the statistical values of indicator 1 (B51), indicator 2 (B52), and indicator 3 (B53).

[0143] The seventh set of data includes: when user C accesses APP4, the statistical values of indicator 1 (C41), indicator 2 (C42), and indicator 3 (C43); and when user C accesses APP5, the statistical values of indicator 1 (C51), indicator 2 (C52), and indicator 3 (C53).

[0144] Then, the fifth, sixth, and seventh sets of data are grouped by APP, that is, the statistical values of the access indicators of all users of the same APP are grouped together. Users A, B, and C corresponding to APP4 are then clustered to obtain two clusters.

[0145] Similarly, the two clusters corresponding to APP5 can be obtained;

[0146] Next, determine the silhouette coefficient of the clustering result corresponding to APP4 (i.e., the two clusters), and use it as the second score of occupation 4 corresponding to APP4; and determine the silhouette coefficient of the clustering result corresponding to APP5, and use it as the second score of occupation 5 corresponding to APP5.

[0147] In step B-1.2 above, a binary clustering algorithm can be used for clustering. Here, the algorithm uses the KMeans dynamic clustering algorithm (i.e., k-means clustering algorithm). The KMeans dynamic clustering algorithm is a type of unsupervised learning algorithm. The number of clusters needs to be set in advance. Here, it is binary clustering, so the number of clusters is set to 2.

[0148] It should be noted that after clustering, if an application corresponds to two clusters, these two clusters can be divided into a high-frequency cluster and a low-frequency cluster based on the average value of the target metrics of users accessing the application within the cluster. For example, the cluster with the larger average value belongs to the high-frequency cluster, and the other belongs to the low-frequency cluster.

[0149] Alternatively, after clustering, if an application corresponds to more than two clusters, these clusters can be divided into high-frequency and low-frequency clusters based on the average value of the target metric for user access to the application within each cluster. For example, the cluster with the highest average value belongs to the high-frequency cluster, and the others belong to the low-frequency cluster; or, the cluster with the lowest average value belongs to the low-frequency cluster, and the others belong to the high-frequency cluster. Here, the target metric can be an indicator representing the frequency of user use of the app, such as one of the following: number of clicks, data usage, or number of days of access.

[0150] In addition, the KMeans dynamic clustering algorithm calculates the distance between user samples in a multidimensional index space. The distance can be calculated in various ways, such as using Euclidean distance. Finally, based on the pre-set number of clusters, the users are clustered into two clusters. Users within a cluster have high similarity, i.e., they are close to each other, while users between clusters have high dissimilarity, i.e., they are far apart.

[0151] Ultimately, the users of each application with an uncertain occupation are each assigned to either the high-frequency cluster or the low-frequency cluster of that application. Conversely, each user can correspond to the high-frequency or low-frequency clusters of multiple applications with uncertain occupations. For example, as shown in Figure 3, user A corresponds to the high-frequency cluster of APP4 and also to the high-frequency cluster of APP5; user B corresponds to the high-frequency cluster of APP4 and to the low-frequency cluster of APP5; and user C corresponds to the low-frequency cluster of APP4 and to the high-frequency cluster of APP5.
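The binary clustering and the high-frequency/low-frequency split can be sketched with a small pure-Python k-means (k = 2, Euclidean distance). This is a sketch under stated assumptions, not the embodiment's implementation: the deterministic seeding, the function names, the choice of indicator 1 as the target metric, and all metric values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def two_means(points: List[Point], max_iter: int = 50) -> List[int]:
    """Assign each point to cluster 0 or 1 with Lloyd's k-means iteration (k = 2)."""
    # Assumed seeding: the first point and the point farthest from it.
    centers = [points[0], max(points, key=lambda p: math.dist(p, points[0]))]
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def split_high_low(users: Dict[str, Point], target: int = 0) -> Dict[str, str]:
    """Label each user of one application as 'high' or 'low' frequency.

    The cluster with the larger mean of the target metric is the high-frequency one.
    """
    names = list(users)
    labels = two_means([users[name] for name in names])
    means = {}
    for c in (0, 1):
        vals = [users[name][target] for name, lab in zip(names, labels) if lab == c]
        means[c] = sum(vals) / len(vals) if vals else float("-inf")
    high = max(means, key=lambda c: means[c])
    return {name: ("high" if lab == high else "low") for name, lab in zip(names, labels)}


# Hypothetical (clicks, traffic, days) values per user for APP4 and APP5.
app4 = {"A": (120.0, 300.0, 20.0), "B": (100.0, 280.0, 18.0), "C": (5.0, 10.0, 1.0)}
app5 = {"A": (90.0, 200.0, 15.0), "B": (3.0, 8.0, 1.0), "C": (80.0, 210.0, 14.0)}
clusters_app4 = split_high_low(app4)
clusters_app5 = split_high_low(app5)
```

The sample values were chosen so that the resulting assignment matches the one described for Figure 3. In practice the metrics would likely be scaled before clustering so that no single metric dominates the distance.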

[0152] In step B-1.3 above, the silhouette coefficient is an indicator for evaluating the quality of a clustering result, with a value range of [-1, 1]. The closer it is to 1, the better the clustering. The better the clustering of the users of an application with an uncertain occupation, the more clearly the occupational users and non-occupational users of that APP are separated, in other words, the higher the occupational certainty of the APP. Therefore, the silhouette coefficient of the clustering for an APP can be used as the second score of the second occupational information corresponding to that application.

[0153] The silhouette coefficient of a cluster corresponding to a certain second application can be calculated through the following processes (1) to (4):

[0154] (1) Taking each user within a cluster corresponding to a second application as a sample point, all users within all clusters corresponding to that application constitute E sample points. For each sample point e, based on the values of the J access metrics of that sample point for the application, calculate the average distance (e.g., Euclidean distance) between sample point e and all other sample points in the same cluster, denoted as $A_e$, which quantifies the cohesion within a cluster;

[0155] (2) Select a cluster other than the cluster to which sample point e belongs and calculate the average distance between sample point e and all sample points in that cluster; traverse all other clusters in this way, find the minimum of these average distances, and denote it as $B_e$, which quantifies the separation between clusters;

[0156] (3) For sample point e, the silhouette coefficient is calculated using the following formula:

[0157] $S_e = \dfrac{B_e - A_e}{\max(A_e, B_e)}$

[0158] (4) Calculate the silhouette coefficient $S_e$ of each of the E sample points; their average value is the overall silhouette coefficient $S$ of the current clustering result, which measures how tightly the data are clustered. The calculation formula is as follows:

[0159] $S = \dfrac{1}{E} \sum_{e=1}^{E} S_e$

[0160] Based on the above steps, after the clustering is completed, the silhouette coefficient corresponding to each second application can be calculated, and it serves as the second score of the second occupational information corresponding to that second application.
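Processes (1) to (4) can be sketched in Python as follows, using Euclidean distance. The function name, the sample points, and two edge-case conventions (a score of 0 for a sample alone in its cluster, and a score of 0 when both average distances are 0) are assumptions; the sketch also assumes at least two clusters are present.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def silhouette(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Mean silhouette coefficient over all sample points (needs >= 2 clusters)."""
    clusters = set(labels)
    scores: List[float] = []
    for e, (p, lab) in enumerate(zip(points, labels)):
        # (1) Average distance to the other points of the same cluster.
        same = [
            math.dist(p, q)
            for i, (q, other_lab) in enumerate(zip(points, labels))
            if other_lab == lab and i != e
        ]
        if not same:
            scores.append(0.0)  # assumed convention for a singleton cluster
            continue
        a = sum(same) / len(same)
        # (2) Smallest average distance to the points of any other cluster.
        b = min(
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == c)
            / sum(1 for l in labels if l == c)
            for c in clusters
            if c != lab
        )
        # (3) Silhouette coefficient of this sample point.
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    # (4) Average over all E sample points.
    return sum(scores) / len(scores)


# Hypothetical one-dimensional samples forming two well-separated clusters.
score = silhouette([(0.0,), (1.0,), (10.0,), (11.0,)], [0, 0, 1, 1])
```

For well-separated clusters such as these the result is close to 1; overlapping clusters push it toward 0 or below.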

[0161] Optionally, determining the second occupational tag of the second user from the second occupational information based on the second score includes:

[0162] Identifying the second application corresponding to each cluster to which the second user belongs;

[0163] From the second applications corresponding to the clusters to which the v2-th second user belongs, removing the second applications corresponding to target clusters to obtain the candidate applications of the v2-th second user. Here, a target cluster is the cluster whose target parameter is the smallest among the clusters corresponding to the same second application; the target parameter is the value of the target metric corresponding to the users included in a cluster; and the target metric is one of the J access metrics. v2 is an integer from 1 to V2, and V2 represents the number of second users.

[0164] Selecting the maximum value among the second scores corresponding to the candidate applications, and determining the second occupational information of the candidate application with that maximum value as the second occupational tag of the v2-th second user.

[0165] For example, if the target cluster is the low-frequency cluster, then, as shown in Figure 3, after clustering, users A and B belong to the high-frequency cluster of APP4 and user C belongs to its low-frequency cluster; users A and C belong to the high-frequency cluster of APP5 and user B belongs to its low-frequency cluster. Based on this, the applications corresponding to the clusters to which each user belongs are compiled, i.e.:

[0166] User A belongs to two high-frequency clusters, corresponding to APP4 and APP5 respectively;

[0167] User B belongs to a high-frequency cluster corresponding to APP4, and also belongs to a low-frequency cluster corresponding to APP5;

[0168] User C belongs to a low-frequency cluster corresponding to APP4, and also belongs to a high-frequency cluster corresponding to APP5;

[0169] Then, among the applications corresponding to each user, the low-frequency applications are removed. Thus, user A corresponds to APP4 and APP5; user B corresponds to APP4; and user C corresponds to APP5.

[0170] Therefore, for user A, the occupation corresponding to the larger of the second score of APP4 and the second score of APP5 is selected as the second occupational tag; the occupation corresponding to APP4 is selected as the second occupational tag of user B; and the occupation corresponding to APP5 is selected as the second occupational tag of user C.
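The selection in this example can be sketched as follows. The cluster assignments follow the Figure 3 example; the second scores and the function name `second_tag` are hypothetical.

```python
from typing import Dict, Optional


def second_tag(
    user_clusters: Dict[str, str],
    second_score: Dict[str, float],
    app_to_occupation: Dict[str, str],
) -> Optional[str]:
    """Drop low-frequency apps, then pick the occupation of the best-scoring candidate."""
    candidates = [app for app, cluster in user_clusters.items() if cluster == "high"]
    if not candidates:
        return None  # the user is low-frequency on every second application
    best_app = max(candidates, key=lambda app: second_score[app])
    return app_to_occupation[best_app]


clusters = {
    "A": {"APP4": "high", "APP5": "high"},
    "B": {"APP4": "high", "APP5": "low"},
    "C": {"APP4": "low", "APP5": "high"},
}
scores = {"APP4": 0.62, "APP5": 0.48}  # hypothetical silhouette coefficients
occupations = {"APP4": "occupation 4", "APP5": "occupation 5"}

tags = {user: second_tag(c, scores, occupations) for user, c in clusters.items()}
```

Returning `None` when no candidate remains is a choice made for this sketch; the text above does not say how such a user is handled.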

[0171] In summary, the specific implementation of occupational tag determination in this application can be described in the following steps S1 to S7:

[0172] Step S1: Collect the internet access URL data of the user's mobile terminal, parse the user's APP access list from the URL data, and compile statistics on each user's access behavior for each APP in the list.

[0173] Step S2: Look up the type of each APP in the APP access list in the database. The database divides APPs into two main types, those with identifiable occupations and those with unidentifiable occupations, and records a corresponding occupation for each APP. If an APP in the access list has no type stored in the database, its detailed description can be crawled from the internet (e.g., from app stores or the APP's official website). Keywords in the detailed description are then used to determine the APP's corresponding occupation and type, i.e., whether it belongs to the identifiable or the unidentifiable category. The occupation and type of the APP are then added to the database.
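Step S2 can be sketched as a database lookup with a keyword-based fallback. Everything specific here is hypothetical: the keyword table, the occupation names, the in-memory dictionary standing in for the database, and the `fetch_description` callable standing in for the crawler.

```python
from typing import Callable, Dict, Optional, Tuple

AppInfo = Tuple[str, str]  # (occupation, category)

# Hypothetical keyword table: keyword -> (occupation, category).
KEYWORDS: Dict[str, AppInfo] = {
    "ride dispatch": ("driver", "identifiable"),
    "market news": ("finance", "unidentifiable"),
}


def classify_app(
    app: str,
    db: Dict[str, AppInfo],
    fetch_description: Callable[[str], str],
) -> Optional[AppInfo]:
    """Return (occupation, category) for an app, filling the database on a miss."""
    if app in db:
        return db[app]
    text = fetch_description(app).lower()
    for keyword, info in KEYWORDS.items():
        if keyword in text:
            db[app] = info  # remember the result for later lookups
            return info
    return None  # no keyword matched; the app stays unclassified


db = {"APP1": ("occupation 1", "identifiable")}
known = classify_app("APP1", db, lambda app: "")
crawled = classify_app("APP9", db, lambda app: "Ride dispatch tool for drivers")
```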

[0174] Different methods of determining the user's occupation tag are used for the different types of APPs. For APPs with identifiable occupations, steps S3 and S4 can be executed to determine the first occupation tag; an example is shown in Figure 2. For APPs with uncertain occupations, steps S5 and S6 can be executed to determine the second occupation tag; an example is shown in Figure 3.

[0175] Step S3: Based on the access behavior data of each user for the occupation-determinable APPs obtained in step S1, determine the values of the following three indicators for each user's access to each occupation-determinable APP: "number of access clicks", "access traffic", and "number of access days". Then, based on the values of these indicators, use the entropy weight method to calculate the weights of the three indicators for each user.

[0176] The specific calculation method for using the entropy weight method to calculate the weights can be found in the previous text, and will not be repeated here.

[0177] Step S4: For each occupation-determinable APP accessed by a given user, the values of the three indicators are weighted and summed according to the weights of the three indicators corresponding to that user, yielding the first score of the occupation corresponding to each such APP for each user. Then, the occupation with the highest first score for a given user is selected as that user's first occupation tag.

[0178] Step S5: Identify the users corresponding to each APP with an uncertain occupation. Based on the values of the three indicators for the users of each APP, use binary clustering to assign the users of each APP to that APP's high-frequency cluster or low-frequency cluster, and obtain the silhouette coefficient of the clustering for each APP, which is used as the second score of the occupation corresponding to that APP.

[0179] The specific process for calculating the silhouette coefficient can be found in the previous text and will not be repeated here.

[0180] Step S6: Filter out the low-frequency cluster APPs of each user (i.e., the APPs for which the user belongs to the low-frequency cluster), and determine the high-frequency cluster APPs of each user (i.e., the APPs for which the user belongs to the high-frequency cluster). If a user has only one high-frequency cluster APP, the occupation corresponding to that APP is used as the user's second occupation tag. If a user has multiple high-frequency cluster APPs, the second scores of these APPs are compared, and the occupation corresponding to the APP with the highest second score is selected as the user's second occupation tag.

[0181] Step S7: For the same user, select one of the first occupation tag and the second occupation tag as the user's occupation tag; for example, the first occupation tag can be selected. Alternatively, based on the scores corresponding to the two tags, one can be determined as the user's primary occupation tag and the other as the user's secondary occupation tag.
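Step S7 can be sketched as follows. The names `ScoredTag` and `combine_tags` and the two strategy labels are hypothetical. Note that the `by_score` strategy compares a first score (a weighted sum) with a second score (a silhouette coefficient), which assumes the two scores have been put on a comparable scale; the text above does not say how.

```python
from typing import NamedTuple, Optional, Tuple


class ScoredTag(NamedTuple):
    occupation: str
    score: float


def combine_tags(
    first: Optional[ScoredTag],
    second: Optional[ScoredTag],
    strategy: str = "prefer_first",
) -> Tuple[Optional[str], Optional[str]]:
    """Return (primary, secondary) occupation tags for one user."""
    if first is None and second is None:
        return (None, None)
    if first is None:
        return (second.occupation, None)
    if second is None:
        return (first.occupation, None)
    if strategy == "by_score" and second.score > first.score:
        return (second.occupation, first.occupation)
    return (first.occupation, second.occupation)


default_choice = combine_tags(ScoredTag("occupation 1", 0.65), ScoredTag("occupation 4", 0.62))
by_score_choice = combine_tags(
    ScoredTag("occupation 1", 0.65), ScoredTag("occupation 4", 0.90), strategy="by_score"
)
```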

[0182] Currently, in most cases, operators cannot directly obtain users' occupational attribute information from the business acceptance process. Even if users' occupational attribute information is required to be registered during the business acceptance process, their occupation may change over time, causing the user's occupational tag information to lose its value. However, users' occupational characteristics are of great guiding significance for formulating differentiated marketing strategies.

[0183] The embodiments of this application address the current problem of the inability to accurately obtain user occupational information by proposing the aforementioned occupational tag determination method, which can achieve the following technical effects:

[0184] 1. Apps are divided into two main types, those with identifiable occupations and those with indeterminate occupations, and a different occupational scoring method is applied to each type. A first occupational tag is determined based on apps with identifiable occupations, and a second occupational tag is determined based on apps with indeterminate occupations. The user's occupational tag is then determined by combining the first and second occupational tags. Because the accuracy and reliability of occupational identification differ significantly between the two types of apps, determining the first occupational tag solely from apps with identifiable occupations and the second occupational tag solely from apps with indeterminate occupations avoids mixing the two types. This ensures that the first occupational tag has higher accuracy and reliability, while the second occupational tag can be calculated with a different method suited to improving the accuracy and reliability of its result. By using both tags to represent the user's occupation, not only is the accuracy of occupational identification improved, but the range of occupations that can be identified for the user is also broadened.

[0185] 2. For apps with identifiable occupations, the entropy weight method is used to determine the weight of each access behavior indicator. This method objectively reflects the differences in importance among the access behavior indicators and eliminates the influence of subjective factors, thereby improving the accuracy of the weights. The first score of the occupation corresponding to each such app is then determined from the weights of the access indicators, further improving the accuracy and credibility of the first occupational tag.

[0186] 3. For apps with uncertain occupations, instead of directly calculating the occupation rating for each user across multiple apps with uncertain occupations, we first use each app with uncertain occupations as a unit. Based on each user's access behavior metrics for each app, we use a binary clustering algorithm to divide the multiple users corresponding to each app into high-frequency clusters and low-frequency clusters. Conversely, each app with uncertain occupations for a user can be mapped to either a high-frequency cluster or a low-frequency cluster. Then, we filter out the low-frequency cluster apps used by the user and only consider the high-frequency cluster apps used by the user to determine the second occupation label. We compare the silhouette coefficients obtained from clustering each high-frequency cluster app to determine the second occupation label, thereby improving the accuracy and reliability of the second occupation label.

[0187] 4. By using the silhouette coefficient as the second score of the occupation corresponding to each APP with an uncertain occupation, the degree of separation between occupational users and non-occupational users of such APPs can be quantified, and the most likely second occupation tag can be located among the occupations corresponding to these APPs.

[0188] The above describes the occupational label determination method provided in the embodiments of this application. The occupational label determination device provided in the embodiments of this application will be described below with reference to the accompanying drawings.

[0189] Secondly, embodiments of this application provide an occupational label determination device. As shown in Figure 4, the device includes the following modules:

[0190] The list acquisition module 401 is used to acquire at least one user's application access list within a target time period.

[0191] Occupational information determination module 402 is used to determine the occupational information corresponding to each application in the application access list;

[0192] The category determination module 403 is used to determine the category to which each application in the application access list belongs, wherein the category includes applications with identifiable occupations and applications with indeterminate occupations;

[0193] The first determining module 404 is used to determine the first occupation tag of the first user based on the first occupation information corresponding to the first application when the first application exists, wherein the first application includes occupation-determinable applications in the application access list, and the first user includes the user who accesses the first application among the at least one user.

[0194] The second determining module 405 is used to determine the second occupation tag of the second user based on the second occupation information corresponding to the second application when a second application exists, wherein the second application includes applications with undetermined occupations in the application access list, and the second user includes the user who accessed the second application among the at least one user.

[0195] The third determining module 406 is used to determine the occupational label of the at least one user based on at least one of the first occupational label and the second occupational label.
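The cooperation of modules 401 to 406 can be sketched as a small pipeline. This is only an illustration of the data flow: the class name, the callable signatures, the category strings, and the stub behavior in the usage example are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

AccessList = Dict[str, List[str]]  # user -> accessed applications


@dataclass
class OccupationTagDevice:
    acquire_list: Callable[[], AccessList]                            # module 401
    occupation_of: Callable[[str], str]                               # module 402
    category_of: Callable[[str], str]                                 # module 403
    first_tag: Callable[[str, Dict[str, str]], Optional[str]]         # module 404
    second_tag: Callable[[str, Dict[str, str]], Optional[str]]        # module 405
    combine: Callable[[Optional[str], Optional[str]], Optional[str]]  # module 406

    def run(self) -> Dict[str, Optional[str]]:
        result: Dict[str, Optional[str]] = {}
        for user, apps in self.acquire_list().items():
            occupations = {app: self.occupation_of(app) for app in apps}
            first_apps = {a: o for a, o in occupations.items()
                          if self.category_of(a) == "identifiable"}
            second_apps = {a: o for a, o in occupations.items()
                           if self.category_of(a) != "identifiable"}
            # Each tag is computed only when the matching kind of application exists.
            t1 = self.first_tag(user, first_apps) if first_apps else None
            t2 = self.second_tag(user, second_apps) if second_apps else None
            result[user] = self.combine(t1, t2)
        return result


# Stub modules, for illustration only.
device = OccupationTagDevice(
    acquire_list=lambda: {"A": ["APP1", "APP4"], "B": ["APP4"]},
    occupation_of=lambda app: "occupation " + app[-1],
    category_of=lambda app: "identifiable" if app in {"APP1", "APP2", "APP3"} else "indeterminate",
    first_tag=lambda user, occ: next(iter(occ.values())),
    second_tag=lambda user, occ: next(iter(occ.values())),
    combine=lambda t1, t2: t1 or t2,
)
labels = device.run()
```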

[0196] Optionally, the first determining module 404 includes:

[0197] The first determining submodule is used to determine the first score of the first occupation information corresponding to each of the first applications accessed by each of the first users;

[0198] The second determining submodule is used to determine the first occupational tag of each of the first users from the first occupational information based on the first score.

[0199] Optionally, the first determining submodule is specifically used for:

[0200] Obtain the values of J access metrics for each of the first applications accessed by the v1-th first user, where v1 is an integer from 1 to V1, V1 represents the number of the first users, and J is an integer greater than zero;

[0201] Normalize the values of the J access metrics of each of the first applications accessed by the v1-th first user to obtain the standard values of the J access metrics of each of those first applications;

[0202] Determine the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user;

[0203] Based on the weights of the J access metrics corresponding to the v1-th first user, compute a weighted sum of the standard values of the J access metrics of the q-th first application accessed by the v1-th first user to obtain the first score of the first occupational information corresponding to the q-th first application accessed by the v1-th first user, where q is an integer from 1 to Q, and Q represents the number of the first applications accessed by the v1-th first user in the application access list.

[0204] Optionally, when determining the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user, the first determining submodule is specifically used for:

[0205] Using the entropy weight method to determine the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user.
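The scoring steps of paragraphs [0200] to [0205] can be sketched as follows for a single first user. This is a minimal illustration, not the patented implementation: the function names, the choice of min-max normalization, and the handling of constant columns are assumptions, since the text only states that the values are normalized and that the entropy weight method is used.

```python
import math

def min_max_normalize(rows):
    """Column-wise min-max normalization (assumed form of 'normalize').
    A column whose values are all equal maps to 0.0 (assumption)."""
    out_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        out_cols.append([(x - lo) / span if span else 0.0 for x in col])
    return [list(r) for r in zip(*out_cols)]

def entropy_weights(std_rows):
    """Entropy weight method over a Q x J matrix of standard values."""
    q = len(std_rows)
    j = len(std_rows[0])
    divergences = []
    for col in zip(*std_rows):
        total = sum(col)
        if total == 0 or q < 2:
            divergences.append(0.0)  # metric carries no information
            continue
        entropy = 0.0
        for x in col:
            p = x / total
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(q)  # scale entropy into [0, 1]
        divergences.append(1.0 - entropy)
    s = sum(divergences)
    if s == 0:
        return [1.0 / j] * j  # fall back to equal weights (assumption)
    return [d / s for d in divergences]

def first_scores(metrics_by_app):
    """Map each first application of one user to its first score.
    metrics_by_app: {app: [metric_1, ..., metric_J]} (hypothetical layout)."""
    if not metrics_by_app:
        return {}
    apps = list(metrics_by_app)
    std = min_max_normalize([metrics_by_app[a] for a in apps])
    weights = entropy_weights(std)
    return {a: sum(w * x for w, x in zip(weights, row))
            for a, row in zip(apps, std)}
```

For example, `first_scores({"app_a": [10, 5], "app_b": [1, 1], "app_c": [4, 2]})` treats each list as J = 2 hypothetical access metrics; the application that is largest on every metric receives the highest first score.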

[0206] Optionally, the second determining submodule is specifically used for:

[0207] Select the maximum value among the first scores corresponding to the v1-th first user, and determine the first occupation information corresponding to that maximum value as the first occupation tag of the v1-th first user;

[0208] Where v1 is an integer from 1 to V1, and V1 represents the number of the first users.
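The selection in paragraph [0207] amounts to an argmax over one user's first scores. A minimal sketch, assuming the scores and the application-to-occupation mapping are held in plain dictionaries (both hypothetical structures); the tie-breaking rule is not stated in the text, so this sketch simply keeps the first application encountered.

```python
def pick_first_tag(scores, occupation_of):
    """Return the occupation of the highest-scoring first application.

    scores: {app: first_score} for one first user (hypothetical layout)
    occupation_of: {app: occupation information} (hypothetical layout)
    Returns None when the user accessed no first application.
    """
    if not scores:
        return None
    # max() keeps the first key on ties; the source does not define tie-breaking.
    best_app = max(scores, key=scores.get)
    return occupation_of[best_app]
```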

[0209] Optionally, the second determining module 405 includes:

[0210] The third determining submodule is used to determine the second score of the second occupational information corresponding to each of the second applications accessed by each of the second users;

[0211] The fourth determination submodule is used to determine the second occupational tag of the second user from the second occupational information based on the second score.

[0212] Optionally, the third determining submodule is specifically used for:

[0213] Obtain the values of J access metrics for each second user accessing each second application, where J is a positive integer;

[0214] Based on the values of the J access metrics of the second users corresponding to the m-th second application, cluster the second users corresponding to the m-th second application to obtain at least two clusters, where m is an integer from 1 to M, and M represents the number of the second applications;

[0215] Determine the silhouette coefficient corresponding to the m-th second application, and use the silhouette coefficient as the second score of the second occupational information corresponding to the m-th second application.
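The clustering and scoring of paragraphs [0213] to [0215] can be sketched for one second application as below. The text does not name a clustering algorithm, so plain k-means with k = 2 and deterministic seeding is an assumption here, as are all function names; the silhouette coefficient is computed as the mean over all users of (b - a) / max(a, b).

```python
import math

def kmeans(points, k=2, iters=20):
    """Lloyd's k-means, seeded with the first k distinct points (assumption)."""
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(p)
        if len(centroids) == k:
            break
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != l)
        denom = max(a, b)
        total += (b - a) / denom if denom else 0.0
    return total / len(points)

def second_score(metrics_by_user, k=2):
    """Return (second score, {user: cluster id}) for one second application.
    metrics_by_user: {user: (metric_1, ..., metric_J)} (hypothetical layout)."""
    users = list(metrics_by_user)
    points = [tuple(metrics_by_user[u]) for u in users]
    labels = kmeans(points, k)
    return silhouette(points, labels), dict(zip(users, labels))
```

A second application whose users separate cleanly into light and heavy users yields a score close to 1, while an application with no clear usage pattern yields a score near 0 or below.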

[0216] Optionally, the fourth determining submodule is specifically used for:

[0217] Identify the second application corresponding to each cluster to which the second user belongs;

[0218] From the second applications corresponding to the clusters to which the v2-th second user belongs, remove the second application corresponding to a target cluster to obtain the candidate applications corresponding to the v2-th second user, wherein the target cluster is the cluster with the smallest target parameter among the clusters corresponding to the same second application, the target parameter is the value of a target indicator corresponding to the users included in a cluster, the target indicator is one of the J access metrics, v2 is an integer from 1 to V2, and V2 represents the number of the second users;

[0219] Select the maximum value among the second scores corresponding to the candidate applications, and determine the second occupation information corresponding to the candidate application with that maximum value as the second occupation tag of the v2-th second user.
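The candidate filtering and selection of paragraphs [0217] to [0219] can be sketched as follows. The dictionary layouts and names are hypothetical, and how the target parameter is aggregated over a cluster's users (for example, as a mean of one access metric) is an assumption, because the text only says it is the value of the target indicator for the users in the cluster.

```python
def pick_second_tag(user, cluster_of, cluster_param, second_score,
                    occupation_of):
    """Return the second occupation tag of `user`, or None.

    cluster_of[app][user]       -> cluster id of the user within that app
    cluster_param[app][cluster] -> target parameter of that cluster
    second_score[app]           -> second score (silhouette coefficient)
    occupation_of[app]          -> second occupation information
    """
    candidates = []
    for app, membership in cluster_of.items():
        if user not in membership:
            continue  # the user did not access this second application
        # The target cluster has the smallest target parameter for this app.
        target_cluster = min(cluster_param[app], key=cluster_param[app].get)
        if membership[user] != target_cluster:
            candidates.append(app)
    if not candidates:
        return None
    best = max(candidates, key=lambda a: second_score[a])
    return occupation_of[best]
```

Under these assumptions, an application is dropped for a user who falls into its least-engaged cluster, so only applications the user engages with more than minimally can contribute a second occupation tag.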

[0220] Optionally, the third determining module 406 is specifically used for:

[0221] If the h-th user has both the first occupation tag and the second occupation tag, and scores exist for both tags, then: the tag with the larger score is selected as the occupation tag of the h-th user; or the first occupation tag is selected preferentially; or the tag with the larger score is selected as the primary occupation tag of the h-th user and the tag with the smaller score is selected as the secondary occupation tag of the h-th user;

[0222] If only the first occupational tag exists for the h-th user, then the first occupational tag is determined to be the occupational tag of the h-th user;

[0223] If only the second occupational label exists for the h-th user, then the second occupational label is determined to be the occupational label for the h-th user;

[0224] Where h is an integer from 1 to H, and H represents the number of the at least one user.
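The three alternatives of paragraph [0221], together with the single-tag cases of [0222] and [0223], can be sketched as one function. The `Tag` type and the strategy names are hypothetical. The text compares the two scores directly; whether they are rescaled to a common range first is not stated, and this sketch does not rescale them.

```python
from typing import List, NamedTuple, Optional

class Tag(NamedTuple):
    occupation: str
    score: float

def combine(first: Optional[Tag], second: Optional[Tag],
            strategy: str = "max_score") -> List[str]:
    """Return the occupation label(s) of one user, primary label first."""
    if first is None and second is None:
        return []
    if second is None:
        return [first.occupation]      # only the first tag exists
    if first is None:
        return [second.occupation]     # only the second tag exists
    if strategy == "prefer_first":
        return [first.occupation]
    # sorted() is stable, so the first tag wins ties (assumption).
    ranked = sorted([first, second], key=lambda t: t.score, reverse=True)
    if strategy == "max_score":
        return [ranked[0].occupation]
    if strategy == "primary_secondary":
        return [t.occupation for t in ranked]
    raise ValueError(f"unknown strategy: {strategy}")
```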

[0225] Optionally, the device further includes:

[0226] The relationship determination module is used to obtain description information of at least one application and determine a first correspondence between the at least one application and occupational information based on the description information.

[0227] The occupational information determination module 402 is specifically used for:

[0228] Based on the first correspondence, determine the occupational information corresponding to each application in the application access list.

[0229] Optionally, the category determination module 403 is specifically used for:

[0230] Based on a pre-determined second correspondence between applications and categories, the category to which each application in the access list belongs is determined.
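Paragraphs [0226] to [0230] describe two lookups: a first correspondence from application to occupational information and a second correspondence from application to category. A minimal sketch, assuming both correspondences have already been built and are held as dictionaries; the category strings and the decision to skip unmapped applications are assumptions, and the construction of the first correspondence from description information is not shown.

```python
def split_access_list(access_list, occupation_of, category_of):
    """Split one user's access list into first and second applications.

    occupation_of: first correspondence, {app: occupational information}
    category_of:   second correspondence, {app: "identifiable" | "indeterminate"}
    Returns (first_apps, second_apps), each mapping app -> occupational information.
    """
    first_apps, second_apps = {}, {}
    for app in access_list:
        if app not in occupation_of or app not in category_of:
            continue  # unmapped application: skipped in this sketch
        if category_of[app] == "identifiable":
            first_apps[app] = occupation_of[app]
        else:
            second_apps[app] = occupation_of[app]
    return first_apps, second_apps
```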

[0231] The method and apparatus are based on the same concept of the application. Since the methods and apparatus solve problems in similar ways, the implementation of the apparatus and methods can refer to each other, and the repeated parts will not be described again.

[0232] It should be noted that the division of units in the embodiments of this application is illustrative and only represents one logical functional division. In actual implementation, other division methods may be used. Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated units described above can be implemented in hardware or as software functional units.

[0233] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a processor-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0234] It should be noted that the apparatus provided in this application embodiment can implement all the method steps implemented in the above method embodiment and can achieve the same technical effect. Here, the parts that are the same as those in the method embodiment and the beneficial effects will not be described in detail.

[0235] Embodiments of this application also provide an electronic device. As shown in Figure 5, the electronic device includes a memory 520, a transceiver 510, and a processor 500;

[0236] Memory 520 is used to store computer programs;

[0237] Transceiver 510 is used to receive and send data under the control of processor 500;

[0238] The processor 500 is used to read the computer program in the memory 520 and execute the occupational label determination method described in the first aspect above.

[0239] In Figure 5, the bus architecture can include any number of interconnected buses and bridges, which link together various circuits of one or more processors, represented by the processor 500, and of memory, represented by the memory 520. The bus architecture can also link various other circuits such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further herein. The bus interface provides an interface. The transceiver 510 can be multiple elements, including transmitters and receivers, providing a unit for communicating with various other devices over transmission media, including wireless channels, wired channels, optical fibers, etc. The processor 500 is responsible for managing the bus architecture and general processing, and the memory 520 can store data used by the processor 500 during operation.

[0240] The processor 500 can be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a complex programmable logic device (CPLD). The processor 500 can also adopt a multi-core architecture.

[0241] It should be noted that the apparatus provided in this application embodiment can implement all the method steps implemented in the above method embodiment and can achieve the same technical effect. Here, the parts that are the same as those in the method embodiment and the beneficial effects will not be described in detail.

[0242] Embodiments of this application also provide a processor-readable storage medium storing a computer program for causing the processor to execute the occupational label determination method described above.

[0243] The processor-readable storage medium can be any available medium or data storage device that the processor can access, including but not limited to magnetic memory (e.g., floppy disk, hard disk, magnetic tape, magneto-optical disk (MO)), optical memory (e.g., CD, DVD, BD, HVD), and semiconductor memory (e.g., ROM, EPROM, EEPROM, non-volatile memory (NAND FLASH), solid-state drive (SSD)).

[0244] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage and optical storage) containing computer-usable program code.

[0245] This application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-executable instructions. These computer-executable instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0246] These processor-executable instructions may also be stored in a processor-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the processor-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0247] These processor-executable instructions can also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0248] Obviously, those skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. Therefore, if such modifications and variations fall within the scope of the claims of this application and their equivalents, this application also intends to include such modifications and variations.

Claims

1. A method for determining occupational labels, characterized in that the method includes:
obtaining an application access list of at least one user within a target time period;
determining the occupational information corresponding to each application in the application access list;
determining the category to which each application in the application access list belongs, wherein the category includes occupationally identifiable applications and occupationally indeterminate applications;
in the presence of a first application, determining the first occupational tag of the first user based on the first occupational information corresponding to the first application, wherein the first application includes the occupationally identifiable applications in the application access list, and the first user includes the user who accessed the first application among the at least one user;
in the presence of a second application, determining the second occupation tag of the second user based on the second occupation information corresponding to the second application, wherein the second application includes the occupationally indeterminate applications in the application access list, and the second user includes the user who accessed the second application among the at least one user;
determining the occupational label of the at least one user based on at least one of the first occupational tag and the second occupational tag;
wherein the step of determining the first occupation tag of the first user based on the first occupation information corresponding to the first application includes:
determining the first score of the first occupation information corresponding to each of the first applications accessed by each of the first users;
determining the first occupational tag of each of the first users from the first occupational information based on the first score;
wherein the step of determining the first score of the first occupation information corresponding to each of the first applications accessed by each of the first users includes:
obtaining the values of J access metrics for each of the first applications accessed by the v1-th first user, where v1 is an integer from 1 to V1, V1 represents the number of the first users, and J is an integer greater than zero;
normalizing the values of the J access metrics of each of the first applications accessed by the v1-th first user to obtain the standard values of the J access metrics of each of the first applications accessed by the v1-th first user;
determining the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user;
based on the weights of the J access metrics corresponding to the v1-th first user, computing a weighted sum of the standard values of the J access metrics of the q-th first application accessed by the v1-th first user to obtain the first score of the first occupational information corresponding to the q-th first application accessed by the v1-th first user, where q is an integer from 1 to Q, and Q represents the number of the first applications accessed by the v1-th first user in the application access list;
wherein the step of determining the second occupation tag of the second user based on the second occupation information corresponding to the second application includes:
determining the second score of the second occupational information corresponding to each of the second applications accessed by each of the second users;
determining the second occupational tag of the second user from the second occupational information based on the second score;
wherein the step of determining the second score of the second occupational information corresponding to each of the second applications accessed by each of the second users includes:
obtaining the values of J access metrics for each second user accessing each second application, where J is a positive integer;
based on the values of the J access metrics of the second users corresponding to the m-th second application, clustering the second users corresponding to the m-th second application to obtain at least two clusters, where m is an integer from 1 to M, and M represents the number of the second applications;
determining the silhouette coefficient of the clusters corresponding to the m-th second application, and using the silhouette coefficient as the second score of the second occupational information corresponding to the m-th second application.

2. The method according to claim 1, characterized in that the step of determining the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user includes: using the entropy weight method to determine the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user.

3. The method according to claim 1, characterized in that the step of determining the first occupational tag for each of the first users from the first occupational information based on the first score includes: selecting the maximum value among the first scores corresponding to the v1-th first user, and determining the first occupation information corresponding to that maximum value as the first occupation tag of the v1-th first user; where v1 is an integer from 1 to V1, and V1 represents the number of the first users.

4. The method according to claim 1, characterized in that the step of determining the second user's second occupational tag from the second occupational information based on the second score includes: identifying the second application corresponding to each cluster to which the second user belongs; from the second applications corresponding to the clusters to which the v2-th second user belongs, removing the second application corresponding to a target cluster to obtain the candidate applications corresponding to the v2-th second user, wherein the target cluster is the cluster with the smallest target parameter among the clusters corresponding to the same second application, the target parameter is the value of a target indicator corresponding to the users included in a cluster, the target indicator is one of the J access metrics, v2 is an integer from 1 to V2, and V2 represents the number of the second users; selecting the maximum value among the second scores corresponding to the candidate applications, and determining the second occupation information corresponding to the candidate application with that maximum value as the second occupation tag of the v2-th second user.

5. The method according to claim 1, characterized in that determining the occupational label of the at least one user based on at least one of the first occupational label and the second occupational label includes: if the h-th user has both the first occupation tag and the second occupation tag, and scores exist for both tags, then the tag with the larger score is selected as the occupation tag of the h-th user, or the first occupation tag is selected preferentially, or the tag with the larger score is selected as the primary occupation tag of the h-th user and the tag with the smaller score is selected as the secondary occupation tag of the h-th user; if only the first occupational tag exists for the h-th user, then the first occupational tag is determined to be the occupational tag of the h-th user; if only the second occupational label exists for the h-th user, then the second occupational label is determined to be the occupational label of the h-th user; where h is an integer from 1 to H, and H represents the number of the at least one user.

6. The method according to claim 1, characterized in that, before determining the occupation information corresponding to each application in the application access list, the method further includes: obtaining description information of at least one application, and determining a first correspondence between the at least one application and occupational information based on the description information; and determining the occupational information corresponding to each application in the application access list includes: based on the first correspondence, determining the occupational information corresponding to each application in the application access list.

7. The method according to claim 1, characterized in that determining the category to which each application in the access list belongs includes: based on a pre-determined second correspondence between applications and categories, determining the category to which each application in the access list belongs.

8. A device for determining occupational labels, characterized in that the device includes:
a list acquisition module, used to acquire an application access list of at least one user within a target time period;
an occupational information determination module, used to determine the occupational information corresponding to each application in the application access list;
a category determination module, used to determine the category to which each application in the application access list belongs, wherein the category includes applications with identifiable occupations and applications with indeterminate occupations;
a first determining module, used to determine the first occupation tag of the first user based on the first occupation information corresponding to the first application when the first application exists, wherein the first application includes the applications with identifiable occupations in the application access list, and the first user includes the user who accessed the first application among the at least one user;
a second determining module, used to determine the second occupation tag of the second user based on the second occupation information corresponding to the second application when a second application exists, wherein the second application includes the applications with indeterminate occupations in the application access list, and the second user includes the user who accessed the second application among the at least one user;
a third determining module, used to determine the occupational label of the at least one user based on at least one of the first occupational label and the second occupational label;
wherein the first determining module includes:
a first determining submodule, used to determine the first score of the first occupation information corresponding to each of the first applications accessed by each of the first users;
a second determining submodule, used to determine the first occupational tag of each of the first users from the first occupational information based on the first score;
wherein the first determining submodule is specifically used for:
obtaining the values of J access metrics for each of the first applications accessed by the v1-th first user, where v1 is an integer from 1 to V1, V1 represents the number of the first users, and J is an integer greater than zero;
normalizing the values of the J access metrics of each of the first applications accessed by the v1-th first user to obtain the standard values of the J access metrics of each of the first applications accessed by the v1-th first user;
determining the weights of the J access metrics corresponding to the v1-th first user based on the standard values of the J access metrics of each of the first applications accessed by the v1-th first user;
based on the weights of the J access metrics corresponding to the v1-th first user, computing a weighted sum of the standard values of the J access metrics of the q-th first application accessed by the v1-th first user to obtain the first score of the first occupational information corresponding to the q-th first application accessed by the v1-th first user, where q is an integer from 1 to Q, and Q represents the number of the first applications accessed by the v1-th first user in the application access list;
wherein the second determining module includes:
a third determining submodule, used to determine the second score of the second occupational information corresponding to each of the second applications accessed by each of the second users;
a fourth determining submodule, used to determine the second user's second occupational tag from the second occupational information based on the second score;
wherein the third determining submodule is specifically used for:
obtaining the values of J access metrics for each second user accessing each second application, where J is a positive integer;
based on the values of the J access metrics of the second users corresponding to the m-th second application, clustering the second users corresponding to the m-th second application to obtain at least two clusters, where m is an integer from 1 to M, and M represents the number of the second applications;
determining the silhouette coefficient corresponding to the m-th second application, and using the silhouette coefficient as the second score of the second occupational information corresponding to the m-th second application.

9. An electronic device, characterized in that it includes a memory, a transceiver, and a processor: the memory is used to store a computer program; the transceiver is used to send and receive data under the control of the processor; and the processor is used to read the computer program in the memory and execute the occupational label determination method according to any one of claims 1 to 7.

10. A processor-readable storage medium, characterized in that the processor-readable storage medium stores a computer program for causing a processor to perform the occupational label determination method according to any one of claims 1 to 7.