Ranking method, system, device and storage medium for popular diseases
By calculating the information richness value and total pageviews of each disease and combining them with dynamic weights, the method addresses the problem that existing popularity-search techniques are not suitable for ranking popular diseases, and achieves a dynamic ranking that accurately reflects disease popularity.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- 广州赛业百沐生物科技有限公司
- Filing Date
- 2022-04-20
- Publication Date
- 2026-06-23
AI Technical Summary
In existing technologies, the popularity search method is not suitable for sorting popular diseases and cannot effectively reflect the information richness of diseases and the dynamic changes in page views.
By calculating the information richness value and total pageviews of a disease, and combining them with dynamic weights, a popularity score for each disease is determined and the diseases are ranked.
It enables dynamic ranking of popular diseases, reflecting the level of attention and research on these diseases, and providing more accurate ranking results.
[Figure: abstract drawing, CN114822860B_ABST]
Abstract
Description
Technical Field
[0001] This invention relates to the field of Internet technology, and in particular to a method, system, device, and storage medium for sorting popular diseases.
Background Technology
[0002] Big data is increasingly being applied in daily life, and it can reveal macro-level patterns. For example, on some medical websites, users search for information about diseases to learn about, study, or conduct in-depth research. These users may include medical researchers, doctors, patients, or students. How to rank popular diseases and provide this information to users through the website is a problem that needs to be solved. Current technologies, such as trending searches, focus more on the time of the search, and this ranking method is not suitable for ranking popular diseases.
Summary of the Invention
[0003] In view of this, the purpose of this invention is to provide a method, system, device and storage medium for sorting popular diseases, which can sort popular diseases according to the information richness value, number of views and dynamic weight.
[0004] In a first aspect, embodiments of the present invention provide a method for ranking popular diseases, comprising the following steps:
[0005] Determine the information richness value of each disease based on the content information of each disease in the database;
[0006] Count the total number of views for each disease through several channels;
[0007] Based on the information richness of each disease and the total number of views for each disease, a popularity score for each disease is determined according to a preset dynamic weight. Then, each disease is sorted by popularity score and displayed.
[0008] Optionally, the disease content information includes basic disease information, disease ID, disease alternatives, disease tags, disease phenotype, incidence rate information, disease-related genes, disease target drugs, and disease clinical trials. The step of determining the information richness value of each disease based on its content information in the database specifically includes:
[0009] The score allotment (maximum score) of each item in the content information of each disease is determined according to a preset ratio;
[0010] The score for each item in the content information of each disease is determined based on the score allotment, the preset scoring rules, and the content information of each disease.
[0011] The information richness value of each disease is determined by summing the scores of each item in the content information of each disease.
[0012] Optionally, the score for each item in the content information of each disease is determined by the following method:
[0013] Several score tiers are established, and the scoring conditions for each tier are determined based on the score levels.
[0014] The score for each item in the content information of each disease is determined by comparing the content information of each disease with the scoring conditions.
[0015] Optionally, the several methods include the search webpage for popular diseases, and the step of counting the total pageviews for each disease through several methods specifically includes:
[0016] Counting the first pageviews for each disease obtained through the search pages for popular diseases.
[0017] The second pageviews for each disease are obtained through means other than search pages for popular diseases, and the second equivalent pageviews are calculated based on the second pageviews and a preset second conversion factor, wherein the second conversion factor is greater than 1.
[0018] The total number of views for each disease is calculated based on the first number of views and the second equivalent number of views.
[0019] Optionally, the method further includes:
[0020] When the details page corresponding to a disease is marked, the corresponding number of views is determined according to a preset first conversion factor; wherein, the first conversion factor is greater than 1.
[0021] Optionally, the preset dynamic weights include a dynamically changing first weight and a dynamically changing second weight, the sum of the first weight and the second weight being 1. The step of determining the popularity score of each disease according to its information richness and total pageviews based on the preset dynamic weights, and then sorting and displaying the diseases based on their popularity scores, specifically includes:
[0022] The converted pageviews for each disease are determined based on the total pageviews for each disease and a preset conversion factor; the conversion factor is greater than 1.
[0023] The first score is determined based on the converted page views for each disease and the preset first weight;
[0024] The second score is determined based on the information richness value of each disease and a preset second weight;
[0025] A popularity score is determined based on the first score and the second score. The diseases are then ranked and displayed according to their popularity scores.
[0026] Optionally, the first weight and the second weight are set by the following method:
[0027] The first weight increases with the increase of the total number of views, and the second weight decreases with the increase of the total number of views.
[0028] In a second aspect, embodiments of the present invention provide a ranking system for popular diseases, including:
[0029] The first module is used to determine the information richness value of each disease based on the content information of each disease in the database.
[0030] The second module is used to count the total number of views for each disease through several different methods.
[0031] The third module is used to determine the popularity score of each disease according to the information richness of each disease and the total number of views of each disease according to a preset dynamic weight, and to sort and display each disease according to its popularity score.
[0032] In a third aspect, embodiments of the present invention provide a sorting device for popular diseases, comprising:
[0033] At least one processor;
[0034] At least one memory for storing at least one program;
[0035] When the at least one program is executed by the at least one processor, the at least one processor performs the method described above.
[0036] In a fourth aspect, embodiments of the present invention provide a storage medium storing a processor-executable program, which, when executed by a processor, is used to perform the method described above.
[0037] Implementing the embodiments of the present invention has the following beneficial effects: This embodiment first determines the information richness value and total number of views of each disease by using the content information of each disease in the database, and then determines the popularity score and sorts them according to the information richness value, total number of views and preset dynamic weight of each disease, thereby realizing the dynamic sorting of popular diseases and presenting them to users.
Attached Figure Description
[0038] Figure 1 is a flowchart illustrating the steps of a method for ranking popular diseases provided in an embodiment of the present invention;
[0039] Figure 2 is a schematic diagram of a process for determining the information richness value of each disease based on the content information of each disease, provided by an embodiment of the present invention;
[0040] Figure 3 is a schematic diagram of the steps for determining the score of each item in the content information of each disease, provided by an embodiment of the present invention;
[0041] Figure 4 is a flowchart illustrating the steps involved in counting the total pageviews for each disease through several methods, as provided in an embodiment of the present invention;
[0042] Figure 5 is a flowchart illustrating the steps of sorting diseases by their information richness and total views, as provided in an embodiment of the present invention;
[0043] Figure 6 is a structural block diagram of a ranking system for popular diseases provided in an embodiment of the present invention;
[0044] Figure 7 is a structural block diagram of a sorting device for popular diseases provided in an embodiment of the present invention.
Detailed Implementation
[0045] The present invention will now be described in further detail with reference to the accompanying drawings and specific embodiments. The step numbers in the following embodiments are only for ease of explanation and do not limit the order of the steps. The execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
[0046] The disease search webpage connects to several disease-related databases. Each database contains disease-related information recorded in different journals, academic journals, conference articles, etc. Users can enter disease names, disease-related genes, or disease IDs on the search webpage to search for, view, or download disease-related information.
[0047] As shown in Figure 1, this embodiment of the invention provides a method for ranking popular diseases, which includes steps S100 to S300.
[0048] S100. Determine the information richness value of each disease based on the content information of each disease in the database.
[0049] Specifically, the literature database contains a lot of disease-related content, and it is necessary to extract information that is useful for calculating the information richness value of the disease. The specific content to be extracted is determined according to the actual application, and this embodiment does not impose any specific restrictions.
[0050] It should be noted that the information richness value of each disease can be used to characterize the depth and breadth of research on each disease. The higher the information richness value of a disease, the more times the disease has been studied or the wider the scope of research, which indirectly reflects the higher level of attention the disease receives.
[0051] In one specific embodiment, the disease content information includes basic disease information, disease ID, disease alternatives, disease tags, disease phenotype, incidence rate information, disease-related genes, disease target drugs, and disease clinical trials. As shown in Figure 2, the step of determining the information richness value of each disease based on its content information in the database specifically includes steps S110 to S130.
[0052] S110. Determine the score allotment (maximum score) of each item in the content information of each disease according to a preset ratio;
[0053] S120. Determine the score for each item in the content information of each disease based on the score allotment, the preset scoring rules, and the content information of each disease;
[0054] S130. Determine the information richness value of each disease based on the sum of the scores of each item in the content information of each disease.
[0055] It should be noted that the score and preset scoring rules for each item in the content information of each disease are determined according to the actual application, and this embodiment does not impose specific restrictions.
[0056] Specifically, if the maximum value of the information richness of each disease is set to 100 points, the score of each item in the content information of each disease is determined according to the proportions of 5%, 10% or 20%, respectively. If an item meets the set scoring conditions, it can get the full score; if only some of the scoring conditions are met, the score is determined according to the number of conditions met; finally, the information richness value of each disease is determined according to the sum of the scores of each item in the content information of each disease.
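The allotment-then-sum procedure of steps S110 to S130 can be sketched as follows. This is a minimal illustration, not the patented implementation: the three item names, their percentages, and the rule that a partially satisfied item earns a share proportional to the number of conditions met are assumptions made here for the example.

```python
MAX_RICHNESS = 100  # maximum information richness value, as in the example above

# Hypothetical preset ratios (percent of the maximum) for three illustrative items.
ITEM_PERCENT = {
    "disease_id": 5,
    "disease_phenotype": 10,
    "disease_related_genes": 20,
}


def item_allotment(item: str) -> float:
    """S110: score allotment of an item = preset ratio x maximum richness."""
    return ITEM_PERCENT[item] * MAX_RICHNESS / 100


def item_score(item: str, conditions_met: int, conditions_total: int) -> float:
    """S120: full allotment if all conditions are met, otherwise a share.

    The proportional rule is an assumption; the source only says the score
    depends on the number of conditions met.
    """
    if conditions_total <= 0:
        raise ValueError("conditions_total must be positive")
    met = min(max(conditions_met, 0), conditions_total)
    return item_allotment(item) * met / conditions_total


def richness_value(conditions: dict[str, tuple[int, int]]) -> float:
    """S130: sum the item scores; conditions maps item -> (met, total)."""
    return sum(item_score(item, met, total) for item, (met, total) in conditions.items())


example = {
    "disease_id": (1, 1),             # all conditions met: 5 points
    "disease_phenotype": (1, 2),      # half of the conditions met: 5 of 10 points
    "disease_related_genes": (3, 4),  # three of four conditions met: 15 of 20 points
}
print(richness_value(example))
```
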
[0057] In a specific embodiment, as shown in Figure 3, the score for each item in the content information of each disease is determined by the following method:
[0058] S121. Establish several score tiers, and determine the scoring conditions for each tier based on the score levels.
[0059] S122. Compare the content information of each disease with the scoring conditions to determine the score of each item in the content information of each disease.
[0060] It should be noted that by establishing score tiers and scoring conditions for each tier, and by determining the score of each item in the content information of each disease by matching the content information of each disease with the scoring conditions, calculation is facilitated and computational resources are saved.
[0061] Specifically, referring to Table 1, the score allotment of each item in the content information of each disease is set according to requirements, for example: basic disease information (10 points), disease ID (5 points), disease alternatives (5 points), disease tags (5 points), disease phenotype (10 points), incidence rate information (10 points), disease-related genes (30 points), disease target drugs (15 points), and disease clinical trials (10 points). Scoring conditions are set for each item. For example, if the disease search webpage connects to four databases (Malacards / Orphanet / GARD / OMIM), 5 points are awarded if one of these databases contains the basic information for the searched disease; 7 points are awarded if two of the databases contain it; and 10 points are awarded if three or more databases contain it. The scores for the other items in the content information of each disease follow the descriptions in Table 1.
[0062] Table 1
[0063]
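The tier matching of steps S121 and S122 can be illustrated with the one item whose tiers are spelled out in the text, basic disease information. The sketch below is a hedged example: the function and constant names are hypothetical, and the tiers for the other items are not reproduced because Table 1 is not available here.

```python
# Tiers for "basic disease information" (10-point item), taken from the example
# in the text: (minimum number of databases containing the info, score awarded).
# Listed from highest tier to lowest so the first match wins.
BASIC_INFO_TIERS = [(3, 10), (2, 7), (1, 5)]


def tiered_score(database_hits: int, tiers=BASIC_INFO_TIERS) -> int:
    """Return the score of the highest tier whose condition is satisfied."""
    for min_hits, score in tiers:
        if database_hits >= min_hits:
            return score
    return 0  # no database contains the information


for hits in range(5):
    print(hits, tiered_score(hits))
```

Checking tiers from the top down means a simple comparison per tier is enough, which matches the stated aim of keeping the calculation cheap.
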
[0064] S200: Calculate the total number of views for each disease through several methods. This specifically includes steps S201 to S230.
[0065] It should be noted that users can click on the link and browse disease details through different channels, such as directly through the disease search page, or through other social media platforms (WeChat, Douyin, or Weibo), external links, etc. The weight of different click channels varies, and the weight of different actions on the search page also varies.
[0066] In one specific embodiment, the ranking method for popular diseases further includes:
[0067] S201. When the details page corresponding to the disease is marked, the corresponding page views are determined according to a preset first conversion factor; wherein, the first conversion factor is greater than 1.
[0068] It should be noted that being marked includes being saved, shared, or bookmarked. When a webpage detailing a disease is marked, it indicates a higher level of user interest in that disease. This level of interest can be calibrated using a first conversion factor. Different marking methods correspond to different sizes of the first conversion factor; a larger first conversion factor indicates a higher level of interest. The size of the first conversion factor is determined based on the actual application, and this embodiment does not impose specific limitations.
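One possible reading of step S201 is that a visit in which the details page is marked counts as more than one view. The sketch below assumes that reading; the factor values and the action names are hypothetical, since the source leaves them to the actual application and only requires each factor to be greater than 1.

```python
from typing import Optional

# Hypothetical first conversion factors per marking action (each must be > 1).
FIRST_CONVERSION_FACTORS = {
    "saved": 2.0,
    "bookmarked": 2.0,
    "shared": 3.0,
}


def equivalent_views(action: Optional[str]) -> float:
    """An unmarked visit counts as one view; a marked visit counts as its factor."""
    if action is None:
        return 1.0
    return FIRST_CONVERSION_FACTORS[action]


def search_page_views(visits: list[Optional[str]]) -> float:
    """Sum the equivalent views over a list of visits (None = plain visit)."""
    return sum(equivalent_views(action) for action in visits)


# Two plain visits and one visit where the page was shared.
print(search_page_views([None, None, "shared"]))
```
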
[0069] In one specific embodiment, the several methods include the search webpage for popular diseases, and the step of counting the total pageviews for each disease through several methods, as shown in Figure 4, specifically includes steps S210 to S230.
[0070] S210. Count the first pageviews of each disease obtained through the search pages for popular diseases;
[0071] S220. Count the second pageviews for each disease obtained through means other than the search pages for popular diseases, and calculate the second equivalent pageviews based on the second pageviews and the preset second conversion factor, wherein the second conversion factor is greater than 1;
[0072] S230. Calculate the total number of views for each disease based on the first number of views and the second equivalent number of views.
[0073] It should be noted that the pageviews from the search pages for popular diseases and the pageviews from links in other channels are counted separately. Pageviews from other channels are converted into second equivalent pageviews using a second conversion factor. This second conversion factor varies depending on the channel; for example, the second conversion factor for WeChat or Douyin is 3, while the second conversion factor for Baidu or other search pages is 2. The second conversion factor is determined based on the specific application, and this embodiment does not impose any specific limitations. The total pageviews for each disease are the sum of the first pageviews and the second equivalent pageviews. For example, if the first pageviews for a disease through the search pages for popular diseases are 1000, and the second pageviews through WeChat are 200, then the total pageviews for that disease are: 1000 + 200 * 3 = 1600.
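Steps S210 to S230 reduce to a weighted sum over channels. The following sketch uses the example factors given in the text (3 for WeChat or Douyin, 2 for Baidu); the function name and the dictionary layout are assumptions made for illustration.

```python
# Second conversion factors per external channel, using the example values in the text.
SECOND_CONVERSION_FACTORS = {"wechat": 3, "douyin": 3, "baidu": 2}


def total_pageviews(first_views: int, second_views: dict[str, int]) -> int:
    """S230: first pageviews plus the second equivalent pageviews of every channel."""
    total = first_views
    for channel, views in second_views.items():
        # S220: second equivalent pageviews = second pageviews x conversion factor
        total += views * SECOND_CONVERSION_FACTORS[channel]
    return total


# Worked example from the text: 1000 + 200 * 3
print(total_pageviews(1000, {"wechat": 200}))  # 1600
```
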
[0074] S300: Determine the popularity score of each disease according to the information richness and total views of each disease using a preset dynamic weight, and sort and display the diseases based on their popularity scores. This specifically includes steps S301 to S340.
[0075] Optionally, the preset dynamic weights include a dynamically changing first weight and a dynamically changing second weight, the sum of the first weight and the second weight is 1, and the first weight and the second weight are set by the following method:
[0076] S301, the first weight increases with the increase of the total number of views, and the second weight decreases with the increase of the total number of views.
[0077] It should be noted that in the initial stage, disease research may primarily focus on a small group of scientists. As research progresses, the number of users searching will increase, leading to greater attention and popularity. Therefore, in the initial stage, the richness of information about the disease will play a larger role in the ranking of disease popularity, while in the later stages, the proportion of page views will be greater. The specific values of the first and second weights will be determined based on the actual application, and this embodiment does not impose specific limitations.
[0078] Specifically, referring to Table 2, when the total number of views for a disease is 0-1000, the weight of the number of views is 0, and the weight of the information richness of the disease is 100%; when the total number of views for a disease is 1001-5000, the weight of the number of views is 10%, and the weight of the information richness of the disease is 90%; when the total number of views for a disease is 5001-10000, the weight of the number of views is 30%, and the weight of the information richness of the disease is 70%; when the total number of views for a disease is 10001-30000, the weight of the number of views is 50%, and the weight of the information richness of the disease is 50%; when the total number of views for a disease is 30001-100000, the weight of the number of views is 70%, and the weight of the information richness of the disease is 30%; when the total number of views for a disease is 100001-∞, the weight of the number of views is 90%, and the weight of the information richness of the disease is 10%.
[0079] Table 2
[0080]
| Total pageviews of the disease detail page | Weight of pageviews (%) | Weight of disease information richness (%) |
| --- | --- | --- |
| 0~1000 | 0 | 100 |
| 1001~5000 | 10 | 90 |
| 5001~10000 | 30 | 70 |
| 10001~30000 | 50 | 50 |
| 30001~100000 | 70 | 30 |
| 100001~∞ | 90 | 10 |
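The tiers of Table 2 can be expressed as a lookup over the upper bound of each range. This is a sketch under the assumption that the table values are used unchanged; the names `UPPER_BOUNDS`, `VIEW_WEIGHT_PERCENT`, and `dynamic_weights` are introduced here for illustration only.

```python
import bisect

# Inclusive upper bounds of the first five pageview ranges in Table 2;
# anything above the last bound falls into the open-ended sixth range.
UPPER_BOUNDS = [1000, 5000, 10000, 30000, 100000]
# Weight of pageviews (%) for each of the six ranges.
VIEW_WEIGHT_PERCENT = [0, 10, 30, 50, 70, 90]


def dynamic_weights(total_views: float) -> tuple[float, float]:
    """Return (first weight, second weight) for a total pageview count.

    The first weight applies to pageviews, the second to information
    richness, and the two always sum to 1.
    """
    if total_views < 0:
        raise ValueError("total_views must be non-negative")
    # bisect_left keeps a value equal to an upper bound inside that range.
    tier = bisect.bisect_left(UPPER_BOUNDS, total_views)
    percent = VIEW_WEIGHT_PERCENT[tier]
    return percent / 100, (100 - percent) / 100


print(dynamic_weights(40000))  # (0.7, 0.3)
```
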
[0081] Optionally, the preset dynamic weights include a dynamically changing first weight and a dynamically changing second weight, the sum of which is 1. As shown in Figure 5, the step of determining the popularity score of each disease based on its information richness and total pageviews according to the preset dynamic weights, and then sorting and displaying the diseases based on their popularity scores, specifically includes steps S310 to S340.
[0082] S310. Determine the converted pageviews for each disease based on the total pageviews for each disease and a preset conversion factor; the conversion factor is greater than 1.
[0083] S320. Determine the first score based on the converted page views of each disease and the preset first weight;
[0084] S330. Determine the second score based on the information richness value of each disease and the preset second weight;
[0085] S340. Determine the popularity score based on the first score and the second score, and sort and display the popularity of each disease according to the popularity score.
[0086] Specifically, the popularity score = (total pageviews / conversion factor) * first weight + information richness value * second weight. For example, if the total pageviews for a certain disease are 40,000, the conversion factor is 1,000, the first weight corresponding to the total pageviews of 40,000 is 70%, and the information richness value is calculated to be 60, then the popularity score = (40,000 / 1,000) * 70% + 60 * 30% = 40 * 0.7 + 60 * 0.3 = 46.
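The formula above can be checked with a few lines of code. The sketch below is illustrative rather than authoritative: the function name and the default conversion factor of 1,000 (taken from the worked example) are assumptions.

```python
def popularity_score(
    total_views: float,
    richness: float,
    first_weight: float,
    second_weight: float,
    conversion_factor: float = 1000,
) -> float:
    """Popularity = (views / factor) * first weight + richness * second weight."""
    if abs(first_weight + second_weight - 1) > 1e-9:
        raise ValueError("the first and second weights must sum to 1")
    converted_views = total_views / conversion_factor  # S310
    first_score = converted_views * first_weight       # S320
    second_score = richness * second_weight            # S330
    return first_score + second_score                  # S340


# Worked example from the text: 40,000 views, richness 60, weights 70% / 30%.
score = popularity_score(40_000, 60, 0.7, 0.3)
print(round(score, 2))  # 46.0
```
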
[0087] Implementing the embodiments of the present invention has the following beneficial effects: This embodiment first determines the information richness value and total number of views of each disease by using the content information of each disease in the database, and then determines the popularity score and sorts them according to the information richness value, total number of views and preset dynamic weight of each disease, thereby realizing the dynamic sorting of popular diseases and presenting them to users.
[0088] As shown in Figure 6, this embodiment of the invention provides a ranking system for popular diseases, including:
[0089] The first module is used to determine the information richness value of each disease based on the content information of each disease in the database.
[0090] The second module is used to count the total number of views for each disease through several different methods.
[0091] The third module is used to determine the popularity score of each disease according to the information richness of each disease and the total number of views of each disease according to a preset dynamic weight, and to sort and display each disease according to its popularity score.
[0092] It is evident that the content of the above method embodiments is applicable to this system embodiment. The specific functions implemented in this system embodiment are the same as those in the above method embodiments, and the beneficial effects achieved are also the same as those achieved in the above method embodiments.
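How the third module might combine the outputs of the first two and produce a ranking can be sketched end to end. Everything named here (`Disease`, `rank`, the placeholder disease names, and the sample numbers other than the first entry) is hypothetical; the weight tiers follow Table 2 and the conversion factor of 1,000 follows the worked example.

```python
import bisect
from dataclasses import dataclass

UPPER_BOUNDS = [1000, 5000, 10000, 30000, 100000]  # pageview ranges of Table 2
VIEW_WEIGHT_PERCENT = [0, 10, 30, 50, 70, 90]      # weight of pageviews (%)


@dataclass
class Disease:
    name: str
    richness: float     # output of the first module
    total_views: float  # output of the second module


def score(disease: Disease, conversion_factor: float = 1000) -> float:
    """Third module: popularity score under the dynamic weights of Table 2."""
    percent = VIEW_WEIGHT_PERCENT[bisect.bisect_left(UPPER_BOUNDS, disease.total_views)]
    first_weight, second_weight = percent / 100, (100 - percent) / 100
    return (disease.total_views / conversion_factor) * first_weight \
        + disease.richness * second_weight


def rank(diseases: list[Disease]) -> list[Disease]:
    """Sort diseases from most to least popular."""
    return sorted(diseases, key=score, reverse=True)


diseases = [
    Disease("disease_a", richness=60, total_views=40_000),   # score is about 46
    Disease("disease_b", richness=90, total_views=500),      # richness only: 90
    Disease("disease_c", richness=30, total_views=200_000),  # views dominate: about 183
]
for d in rank(diseases):
    print(d.name, round(score(d), 2))
```

With these sample inputs, a well-documented but rarely viewed disease still outranks a moderately viewed one, while a heavily viewed disease leads the list, which is the behavior the dynamic weights are meant to produce.
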
[0093] As shown in Figure 7, an embodiment of the present invention provides a sorting device for popular diseases, comprising:
[0094] At least one processor;
[0095] At least one memory for storing at least one program;
[0096] When the at least one program is executed by the at least one processor, the at least one processor performs the method described above.
[0097] It is evident that the content of the above method embodiments is applicable to this device embodiment. The specific functions implemented in this device embodiment are the same as those in the above method embodiments, and the beneficial effects achieved are also the same as those achieved in the above method embodiments.
[0098] Furthermore, this application also discloses a computer program product or computer program stored in a computer-readable storage medium. A processor of a computer device can read the computer program from the computer-readable storage medium, and the processor executes the computer program, causing the computer device to perform the described method. Similarly, the content of the above method embodiments is applicable to this storage medium embodiment. The specific functions implemented in this storage medium embodiment are the same as those in the above method embodiments, and the beneficial effects achieved are also the same as those achieved in the above method embodiments.
[0099] The above is a detailed description of the preferred embodiments of the present invention. However, the present invention is not limited to the embodiments described. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention. All such equivalent modifications or substitutions are included within the scope defined by the claims of this application.
Claims
1. A method for ranking popular diseases, characterized in that it comprises:
determining the information richness value of each disease based on the content information of each disease in the database, which specifically includes: determining the score allotment of each item in the content information of each disease according to a preset ratio; determining the score of each item in the content information of each disease based on the score allotment, preset scoring rules, and the content information of each disease; and determining the information richness value of each disease based on the sum of the scores of each item in the content information of each disease; wherein the content information of the disease includes basic disease information, disease ID, disease alternatives, disease tags, disease phenotype, incidence rate information, disease-related genes, disease target drugs, and disease clinical trials;
counting the total number of views for each disease through several methods, which specifically includes: counting the first number of views for each disease obtained through search pages for popular diseases; counting the second number of views for each disease obtained through methods other than search pages for popular diseases, and calculating the second equivalent number of views based on the second number of views and a preset second conversion factor, wherein the second conversion factor is greater than 1; and calculating the total number of views for each disease based on the first number of views and the second equivalent number of views;
determining a popularity score for each disease according to a preset dynamic weight, based on the information richness value of each disease and the total number of views for each disease, and ranking and displaying the diseases according to their popularity scores, which specifically includes: determining the converted view count for each disease based on its total view count and a preset conversion coefficient, wherein the conversion coefficient is greater than 1; determining a first score based on the converted view count and a preset first weight; determining a second score based on the information richness value of each disease and a preset second weight; determining a popularity score based on the first and second scores; and ranking and displaying the diseases according to their popularity scores.
2. The method according to claim 1, characterized in that the score for each item in the content information of each disease is determined by the following method: several score tiers are established, and the scoring conditions for each tier are determined based on the score levels; the score for each item in the content information of each disease is determined by comparing the content information of each disease with the scoring conditions.
3. The method according to claim 1, characterized in that the method further includes: when the details page corresponding to a disease is marked, the corresponding number of views is determined according to a preset first conversion factor, wherein the first conversion factor is greater than 1.
4. The method according to claim 1, characterized in that the first weight and the second weight are set by the following method: the first weight increases with the increase of the total number of views, and the second weight decreases with the increase of the total number of views.
5. A ranking system for popular diseases, characterized in that it comprises:
a first module, used to determine the information richness value of each disease based on the content information of each disease in the database, which specifically includes: determining the score allotment of each item in the content information of each disease according to a preset ratio; determining the score of each item in the content information of each disease based on the score allotment, preset scoring rules, and the content information of each disease; and determining the information richness value of each disease based on the sum of the scores of each item in the content information of each disease; wherein the content information of the disease includes basic disease information, disease ID, disease alternatives, disease tags, disease phenotype, incidence rate information, disease-related genes, disease target drugs, and disease clinical trials;
a second module, used to count the total number of views for each disease through several channels, which specifically includes: counting the first number of views for each disease obtained through search pages for popular diseases; counting the second number of views for each disease obtained through channels other than search pages for popular diseases, and calculating the second equivalent number of views based on the second number of views and a preset second conversion factor, wherein the second conversion factor is greater than 1; and calculating the total number of views for each disease based on the first number of views and the second equivalent number of views;
a third module, used to determine the popularity score of each disease according to a preset dynamic weight, based on the information richness value of each disease and the total number of views of each disease, and to sort and display each disease according to its popularity score, which specifically includes: determining the converted view count of each disease based on the total number of views of each disease and a preset conversion coefficient, wherein the conversion coefficient is greater than 1; determining a first score based on the converted view count of each disease and a preset first weight; determining a second score based on the information richness value of each disease and a preset second weight; determining a popularity score based on the first score and the second score; and sorting and displaying each disease according to its popularity score.
6. A sorting device for popular diseases, characterized in that it comprises: at least one processor; and at least one memory for storing at least one program; wherein, when the at least one program is executed by the at least one processor, the at least one processor performs the method as described in any one of claims 1-4.
7. A storage medium storing a processor-executable program, characterized in that the processor-executable program, when executed by a processor, is used to perform the method as described in any one of claims 1-4.