Index updating method and device, computer device, and storage medium
By obtaining data index information, calculating the index bloat ratio, and applying either a simulated annealing algorithm or bloat-level update tasks, the method addresses database index bloat, optimizes query performance and storage efficiency, and reduces operating costs.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: PING AN TECH (SHENZHEN) CO LTD
- Filing Date: 2024-09-06
- Publication Date: 2026-06-30
AI Technical Summary
Database indexes are prone to bloat during frequent insert, delete, and update operations, leading to decreased query performance and inefficient use of storage space, which increases the operating costs of business systems.
The method acquires data index information and calculates the index bloat ratio. Depending on the result, it either rebuilds the index using a simulated annealing algorithm or calls the update task that matches the bloat level. The update operations cover index rebuilding, cleaning, compression, and index strategy optimization.
It enables fast and efficient updates to data indexes, prevents system query performance degradation caused by index bloat, optimizes storage space utilization, and reduces operating costs.
Abstract
Description
Technical Field
[0001] This application relates to the field of data processing technology, specifically to the field of digital healthcare, and particularly to an index update method, apparatus, computer device, and storage medium.
Background Technology
[0002] In modern business systems, databases play a crucial role as the core information storage and retrieval engine. They not only hold massive amounts of critical data but also support managers' real-time access to information and decision support through efficient query mechanisms. However, with increasingly frequent and complex business activities, the data update speed in databases has accelerated dramatically, bringing considerable challenges to database performance management.
[0003] Indexes are a key technology for optimizing database query performance, and their effectiveness directly determines the efficiency of data retrieval. However, in business systems, due to dynamic changes in information (such as the entry of new data, adjustment of data tables, and updates of data content), data tables frequently face insert, delete, and update operations. If these operations are not properly managed, they can easily lead to index fragmentation, which in turn causes index bloat. Index bloat not only increases the maintenance cost of the database, but more importantly, it significantly reduces query performance, prolongs response time, and also results in inefficient use of storage space, increasing the operating costs of the business system.
Summary of the Invention
[0004] The purpose of this application is to provide an index updating method, apparatus, computer device, and storage medium to solve the problem that medical index information cannot be updated effectively and quickly.
[0005] To address the aforementioned technical problems, this application provides an index update method, employing the following technical solution:
[0006] Retrieve data index information;
[0007] Calculate the index expansion ratio based on the data index information;
[0008] Determine whether the index expansion ratio is greater than or equal to a preset expansion threshold;
[0009] If the index expansion ratio is greater than or equal to the preset expansion threshold, the data index information is concurrently reconstructed according to the simulated annealing algorithm, and the index status is updated according to the reconstructed data index information to obtain the first real-time index information.
[0010] If the index expansion ratio is less than the preset expansion threshold, the expansion level corresponding to the index expansion ratio is calculated, and the corresponding update task is called according to the expansion level to update the index status of the data index information to obtain the second real-time index information.
[0011] Furthermore, prior to the step of obtaining the data index information, the following steps are also included:
[0012] Acquire multi-source data;
[0013] The multi-source data is cleaned and normalized to obtain standard multi-source data.
[0014] Key elements are extracted from the standard multi-source data to obtain key data information;
[0015] A data index library is constructed based on the key data information;
[0016] Obtain historical query information, and filter the data index library based on the historical query information to obtain the data index information.
[0017] Furthermore, the step of obtaining data index information specifically includes:
[0018] Retrieve index information identifier;
[0019] Based on the index information identifier, extract the original data index information from the preset database;
[0020] The original data index information is preprocessed to obtain the data index information.
[0021] Furthermore, the step of calculating the index expansion ratio based on the data index information specifically includes:
[0022] Obtain the index data size and index data structure of the data index information;
[0023] Calculate the index data space based on the index data size and the index data structure;
[0024] Obtain the preset index occupied space, and calculate the index expansion ratio based on the index data space and the preset index occupied space.
[0025] Furthermore, the step of concurrently reconstructing the data index information according to the simulated annealing algorithm, and updating the index status according to the reconstructed data index information to obtain the first real-time index information specifically includes:
[0026] The data index information is divided into datasets to obtain a subset of the data index;
[0027] Obtain the data index structure of the data index subset, and select an initial data index structure from the data index structure;
[0028] Obtain the preset simulated annealing parameters;
[0029] The data index subset is distributed to multiple processing threads, and simulated annealing calculations are performed according to the preset simulated annealing parameters and the initial data index structure. The calculated optimal solution is used as the optimal data index structure.
[0030] Based on the optimal data index structure, the index is reconstructed to obtain valid data index information;
[0031] The data index information of the current record is updated based on the valid data index information to obtain the first real-time index information.
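The partitioning and multi-threaded annealing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: each candidate "index structure" is reduced to a single B-tree fill factor, the `cost` function is an invented toy model, and the names `CANDIDATES`, `cost`, `anneal`, and `optimize_subsets` are hypothetical.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical candidate "index structures": here just B-tree fill factors.
CANDIDATES = [50, 60, 70, 80, 90, 100]

def cost(fill_factor, write_ratio):
    """Toy cost (an assumption): low fill wastes space, high fill splits pages under writes."""
    wasted_space = (100 - fill_factor) / 100
    split_penalty = write_ratio * (fill_factor / 100) ** 4
    return wasted_space + 2.0 * split_penalty

def anneal(write_ratio, initial=100, t0=1.0, alpha=0.95, steps=500, seed=0):
    """Simulated annealing over CANDIDATES for one index subset."""
    rng = random.Random(seed)
    current = best = initial
    temp = t0
    for _ in range(steps):
        candidate = rng.choice(CANDIDATES)
        delta = cost(candidate, write_ratio) - cost(current, write_ratio)
        # Always accept improvements; accept worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
        if cost(current, write_ratio) < cost(best, write_ratio):
            best = current
        temp = max(temp * alpha, 1e-6)  # geometric cooling with a small floor
    return best

def optimize_subsets(subsets, workers=4):
    """Distribute index subsets (name -> write ratio) across worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(anneal, ratio, seed=i)
                   for i, (name, ratio) in enumerate(subsets.items())}
        return {name: future.result() for name, future in futures.items()}

result = optimize_subsets({"patient_idx": 0.1, "visit_idx": 0.9})
```

In CPython, threads do not speed up CPU-bound work such as this loop, so a process pool may be a better fit when the annealing itself dominates the run time.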
[0032] Furthermore, the step of reconstructing the index based on the optimal data index structure to obtain effective data index information specifically includes:
[0033] Delete the data index information;
[0034] Obtain the index structure information of the optimal data index structure, wherein the index structure information includes index key, index type, and index partition;
[0035] An index is created based on the index key, the index type, and the index partition to obtain optimized data index information;
[0036] The optimized data index information is validated by indicators, and the validated optimized data index information is taken as the valid data index information.
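The delete, recreate, and validate sequence above might look like the following sketch, which uses Python's built-in `sqlite3` module purely for illustration. The table and index names are hypothetical, and SQLite has no index types or index partitions, so only the index key part of the structure information is exercised here. The validation shown (catalog lookup plus `PRAGMA integrity_check`) is one assumed choice of indicator.

```python
import sqlite3

def rebuild_index(conn, table, index_name, key_columns):
    """Drop an index, recreate it on the given key columns, and validate it."""
    cols = ", ".join(key_columns)
    # Identifiers are interpolated directly, so they must come from trusted configuration.
    conn.execute(f"DROP INDEX IF EXISTS {index_name}")
    conn.execute(f"CREATE INDEX {index_name} ON {table} ({cols})")
    # Validation: the index must be registered and the database must pass an integrity check.
    registered = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?", (index_name,)
    ).fetchone() is not None
    healthy = conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
    return registered and healthy

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient_id INTEGER, visit_date TEXT)")
conn.executemany("INSERT INTO visits VALUES (?, ?)", [(1, "2024-01-05"), (2, "2024-02-11")])
conn.execute("CREATE INDEX idx_visits ON visits (patient_id)")
ok = rebuild_index(conn, "visits", "idx_visits", ["patient_id", "visit_date"])
```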
[0037] Furthermore, the step of calculating the expansion level corresponding to the index expansion ratio, and calling the corresponding update task to update the index status of the data index information according to the expansion level to obtain the second real-time index information specifically includes:
[0038] The expansion difference is obtained by calculating the difference between the index expansion ratio and the preset expansion threshold;
[0039] Obtain a preset expansion level mapping table, and identify the corresponding expansion level in the expansion level mapping table according to the expansion difference;
[0040] Obtain the expansion level information identifier corresponding to the expansion level, and extract the update task from the database based on the expansion level information identifier;
[0041] The data index information is adjusted according to the update task, and the data index information of the current record is updated according to the adjusted data index information to obtain the second real-time index information.
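As a hedged illustration of the mapping-table lookup above: the application does not give the band boundaries, so the values in `BAND_UPPER_BOUNDS` are invented, as are the names `BAND_LEVELS` and `expansion_level`. Because this branch only runs when the ratio is below the threshold, the difference (ratio minus threshold) is negative, and the sketch assumes that differences closer to zero map to higher levels.

```python
import bisect

# Hypothetical mapping table: upper bound of each difference band and the level it maps to.
# The difference is ratio - threshold, so it is negative in this branch.
BAND_UPPER_BOUNDS = [-0.30, -0.10, 0.0]
BAND_LEVELS = [1, 2, 3]

def expansion_level(ratio, threshold):
    """Return the expansion level for a ratio that is below the threshold."""
    diff = ratio - threshold
    if diff >= 0:
        raise ValueError("a ratio at or above the threshold is handled by the rebuild branch")
    band = bisect.bisect_right(BAND_UPPER_BOUNDS, diff)
    return BAND_LEVELS[band]

levels = [expansion_level(r, 0.8) for r in (0.30, 0.60, 0.75)]
```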
[0042] To address the aforementioned technical problems, this application also provides an index update device, which employs the following technical solution:
[0043] The information acquisition module is used to acquire data index information;
[0044] A ratio calculation module is used to calculate the index expansion ratio based on the data index information;
[0045] The data judgment module is used to determine whether the index expansion ratio is greater than or equal to a preset expansion threshold;
[0046] The first processing module is configured to, if the index expansion ratio is greater than or equal to the preset expansion threshold, perform concurrent index reconstruction on the data index information according to the simulated annealing algorithm, and update the index status according to the reconstructed data index information to obtain the first real-time index information;
[0047] The second processing module is used to calculate the expansion level corresponding to the index expansion ratio if the index expansion ratio is less than the preset expansion threshold, and call the corresponding update task according to the expansion level to update the index status of the data index information to obtain the second real-time index information.
[0048] To address the aforementioned technical problems, this application also provides a computer device that employs the following technical solution:
[0049] A computer device includes a memory and a processor, the memory storing computer-readable instructions, the processor executing the computer-readable instructions to implement the steps of the index update method as described in any of the preceding claims.
[0050] To address the aforementioned technical problems, this application also provides a computer-readable storage medium, employing the technical solution described below:
[0051] A computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the steps of the index update method as described in any of the preceding claims.
[0052] Compared with existing technologies, the embodiments of this application have the following main advantages: the embodiment obtains data index information; calculates the index expansion ratio based on the data index information; determines whether the index expansion ratio is greater than or equal to a preset expansion threshold; if the index expansion ratio is greater than or equal to the preset expansion threshold, concurrently rebuilds the data index information according to the simulated annealing algorithm, and updates the index status based on the rebuilt data index information to obtain first real-time index information; if the index expansion ratio is less than the preset expansion threshold, calculates the expansion level corresponding to the index expansion ratio, and calls the corresponding update task according to the expansion level to update the index status of the data index information to obtain second real-time index information. This achieves fast and effective updating of the data index and prevents index expansion from degrading system query performance.
Attached Figure Description
[0053] To more clearly illustrate the solutions in this application, the accompanying drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the accompanying drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0054] Figure 1 is an exemplary system architecture diagram to which this application can be applied;
[0055] Figure 2 is a flowchart of an embodiment of the index update method according to this application;
[0056] Figure 3 is a flowchart of a specific implementation of step S10 in Figure 2;
[0057] Figure 4 is a flowchart of a specific implementation of step S20 in Figure 2;
[0058] Figure 5 is a flowchart of a specific implementation of step S40 in Figure 2;
[0059] Figure 6 is a flowchart of a specific implementation of step S405 in Figure 5;
[0060] Figure 7 is a flowchart of a specific implementation of step S50 in Figure 2;
[0061] Figure 8 is a schematic diagram of the structure of an embodiment of the index update apparatus according to this application;
[0062] Figure 9 is a schematic diagram of the structure of one embodiment of the computer device according to this application.
Detailed Implementation
[0063] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains; the terminology used herein in the specification of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having," and any variations thereof, in the specification, claims, and foregoing drawings of this application, are intended to cover non-exclusive inclusion. The terms "first," "second," etc., in the specification, claims, or foregoing drawings of this application are used to distinguish different objects, not to describe a particular order.
[0064] In this document, the term "embodiment" means that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. It is understood, explicitly and implicitly, by those skilled in the art that the embodiments described herein can be combined with other embodiments.
[0065] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
[0066] As shown in Figure 1, system architecture 100 may include terminal device 101, network 102, and server 103. Terminal device 101 may be a laptop 1011, tablet 1012, or mobile phone 1013. Network 102 serves as the medium providing a communication link between terminal device 101 and server 103, and may include various connection types, such as wired links, wireless communication links, or fiber optic cables.
[0067] Users can use terminal device 101 to interact with server 103 via network 102 to receive or send messages, etc. Various communication client applications can be installed on terminal device 101, such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, social media platform software, etc.
[0068] Terminal device 101 can be any electronic device that has a display screen and supports web browsing. In addition to laptops 1011, tablets 1012, or mobile phones 1013, terminal device 101 can also be an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a desktop computer, etc.
[0069] Server 103 can be a server that provides various services, such as a backend server that provides support for the pages displayed on terminal device 101.
[0070] It should be noted that the index update method provided in this application embodiment is generally executed by the server, and correspondingly, the index update device is generally set in the server.
[0071] It should be understood that the numbers of terminal devices, networks, and servers shown in Figure 1 are merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.
[0072] Continuing to refer to Figure 2, a flowchart of an embodiment of the index update method according to this application is shown. The index update method includes the following steps:
[0073] Step S10: Obtain data index information;
[0074] In this embodiment, the data index information can be medical data index information, which includes: a patient basic information index, covering the patient's name, ID number, contact information, home address, etc., used to quickly locate the patient's medical records; a medical history index, recording the time, location, diagnosis results, treatment plans, etc. of each of the patient's medical visits, used to give doctors comprehensive medical history information; a doctor personal information index, covering the doctor's name, title, professional field, hospital, etc., used to help patients choose a suitable doctor; a doctor treatment history index, recording the patients treated by the doctor, diagnosis results, treatment plans, etc., used as a basis for individual performance evaluation and medical quality assessment; a hospital basic information index, covering the hospital's name, location, address, contact information, size, grade, and departmental structure, giving patients an overview of the hospital; a medical resource index, covering medical equipment, number of medical staff, and number of beds, reflecting the hospital's medical service capabilities; an electronic medical record index, detailing patients' medical history, diagnoses, treatment plans, and medication records; a medical imaging index, covering X-rays, CT scans, MRI scans, and other medical images, allowing doctors to quickly find and access patients' imaging data; a health habit index, recording patients' health habits and lifestyle information, used as a basis for personalized health management and preventative measures; and a family medical history index, recording patients' family medical history, used to assess genetic risks and develop targeted treatment plans.
[0075] Step S20: Calculate the index expansion ratio based on the data index information;
[0076] In this embodiment, the index expansion ratio (also referred to in this application as the index bloat ratio) is the ratio of the index data space occupied by the current data index information to the preset index space allocated by the system. The ratio can be expressed as a percentage or a decimal value. The index data space is the space occupied by the total data of the data index information in system memory, and its size depends on the index data size and structure. The preset index space is the space that the index data can occupy by default in the system, and its size can be adjusted according to actual conditions.
[0077] Step S30: Determine whether the index expansion ratio is greater than or equal to a preset expansion threshold;
[0078] In this embodiment, the index expansion ratio and the preset expansion threshold are in the same numerical unit standard. By comparing the index expansion ratio and the preset expansion threshold, the relationship between the two can be determined, so as to facilitate further processing based on this relationship.
[0079] Step S40: If the index expansion ratio is greater than or equal to the preset expansion threshold, then the data index information is concurrently reconstructed according to the simulated annealing algorithm, and the index status is updated according to the reconstructed data index information to obtain the first real-time index information.
[0080] In this embodiment, Simulated Annealing (SA) is a general probabilistic optimization algorithm used to find an approximate global optimal solution to a problem within a large search space. This embodiment uses SA to optimize the data index structure of the data index information, thereby obtaining the optimal data retrieval structure. Then, the index is reconstructed based on the optimal data retrieval structure, and the index status is updated using the effective data index information after index reconstruction, thereby obtaining the first real-time index information.
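For reference, the standard acceptance rule and a common cooling schedule for simulated annealing can be written as below. The application only refers to "preset simulated annealing parameters", so the cost function E over candidate index structures, the geometric cooling factor α, and the temperature sequence T_k are assumptions used here for illustration.

```latex
P(\text{accept}) =
\begin{cases}
1, & \Delta E \le 0,\\
\exp\!\left(-\dfrac{\Delta E}{T_k}\right), & \Delta E > 0,
\end{cases}
\qquad
\Delta E = E(s_{\text{new}}) - E(s_{\text{current}}),
\qquad
T_{k+1} = \alpha\, T_k,\quad 0 < \alpha < 1.
```

A worse candidate structure is therefore accepted often while the temperature is high and rarely once it has cooled, which is what lets the search escape local optima early on.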
[0081] Step S50: If the index expansion ratio is less than the preset expansion threshold, calculate the expansion level corresponding to the index expansion ratio, and call the corresponding update task according to the expansion level to update the index status of the data index information to obtain the second real-time index information.
[0082] In this embodiment, different index expansion ratios correspond to different expansion levels, and different expansion levels correspond to different update tasks. In this embodiment, the initial expansion levels are set to level one, level two, and level three. Level one expansion level corresponds to the first update task, level two expansion level corresponds to the second update task, and level three expansion level corresponds to the third update task. The first update task performs basic index cleanup and compression operations, including removing invalid and expired index entries, and performing simple compression on the index structure to optimize storage space and query efficiency. The second update task performs index optimization operations to eliminate index fragmentation and improve index query functionality. The third update task adjusts the index strategy (such as the granularity, number, and type of the index) to optimize index performance.
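The level-to-task correspondence described above can be sketched as a small dispatch table. The in-memory dictionary standing in for an index, and the function names `cleanup_and_compress`, `defragment`, `adjust_strategy`, and `run_update_task`, are hypothetical; each task body is a deliberately simplified stand-in for the operation the paragraph names.

```python
def cleanup_and_compress(index):
    """Level one: drop invalid or expired entries (compression is omitted in this sketch)."""
    return {key: value for key, value in index.items() if value is not None}

def defragment(index):
    """Level two: rewrite entries in key order, a stand-in for removing fragmentation."""
    return dict(sorted(index.items()))

def adjust_strategy(index):
    """Level three: placeholder for changing the granularity, number, or type of indexes."""
    return dict(index)

UPDATE_TASKS = {1: cleanup_and_compress, 2: defragment, 3: adjust_strategy}

def run_update_task(level, index):
    """Look up the update task for an expansion level and apply it to the index."""
    try:
        task = UPDATE_TASKS[level]
    except KeyError:
        raise ValueError(f"no update task registered for level {level}") from None
    return task(index)

cleaned = run_update_task(1, {"p001": [3, 7], "p002": None})
```

Keeping the tasks in a table means further levels can be registered without changing the dispatch code.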
[0083] In this embodiment, the above method can be applied to a medical service system. By optimizing and updating the system index information within the medical service system, the real-time validity of accessed medical data is ensured. Specifically, in this embodiment, the medical service system can be one or more of a medical insurance system and a disease insurance system. The data index information is the medical data index information corresponding to medical-related data in the medical service system. The medical data index information is stored in the medical insurance system and the disease insurance system and retrieved from the databases of these systems. The first real-time index information and the second real-time index information are stored in these systems for easy viewing of subsequent update records.
[0084] This embodiment acquires data index information; calculates the index bloat ratio based on the data index information; determines whether the index bloat ratio is greater than or equal to a preset bloat threshold; if the index bloat ratio is greater than or equal to the preset bloat threshold, then concurrently rebuilds the data index information using a simulated annealing algorithm, and updates the index status based on the rebuilt data index information to obtain first real-time index information; if the index bloat ratio is less than the preset bloat threshold, then the bloat level corresponding to the index bloat ratio is calculated, and the corresponding update task is called based on the bloat level to update the index status of the data index information to obtain second real-time index information. This achieves fast and effective updates to the data index, ensuring that the data index does not cause a decrease in system query performance due to bloat.
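The overall branching of steps S30 to S50 can be summarized in a few lines. This is only a sketch: `update_index` and its callable parameters `rebuild` and `level_update` are hypothetical names, and the actual rebuild and update work is left to the caller.

```python
def update_index(ratio, threshold, rebuild, level_update):
    """Choose the branch for one index; `rebuild` and `level_update` are caller-supplied callables."""
    if ratio >= threshold:
        # Steps S30/S40: at or above the threshold, rebuild the index.
        return "first_real_time_index", rebuild()
    # Step S50: below the threshold, run the level-based update task.
    return "second_real_time_index", level_update(ratio)

branch, payload = update_index(
    ratio=0.85,
    threshold=0.8,
    rebuild=lambda: "rebuilt",
    level_update=lambda r: f"level task for ratio {r}",
)
```

Note that a ratio exactly equal to the threshold takes the rebuild branch, matching the "greater than or equal to" wording of step S30.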
[0085] In some optional implementations of this embodiment, the following steps are included before step S10:
[0086] Acquire multi-source data;
[0087] In this embodiment, multi-source data refers to system data from different data sources. The multi-source data can be multi-source medical data drawn from sources including a Hospital Information System (HIS), Electronic Medical Record System (EMR), Laboratory Information System (LIS), Picture Archiving and Communication System (PACS), telemedicine platform, health monitoring equipment, and medical insurance system. Multi-source medical data includes: Hospital Information System data: comprehensive medical data including patient basic information, medical records, diagnostic information, medical orders, medication use, and cost settlement; Electronic Medical Record System data: medical records (such as the medical record cover page, progress notes, and surgical records), examination and test results, nursing records, etc.; Laboratory Information System data: reports and data analysis of laboratory tests such as hematology, biochemistry, immunology, and microbiology; Picture Archiving and Communication System data: medical imaging data, such as X-rays, CT scans, and MRI scans, and related diagnostic reports; telemedicine platform data: remote consultation records and remote monitoring data generated by platforms offering remote consultation, remote monitoring, and remote teaching; health monitoring device data: personal health data collected by wearable devices, home health monitoring instruments, etc., such as heart rate, blood pressure, blood sugar, steps, and sleep quality; and medical insurance system data: patient medical insurance information, reimbursement records, payment status, and other data.
[0088] The multi-source data is cleaned and normalized to obtain standard multi-source data.
[0089] In this embodiment, data cleaning includes: identifying and processing missing values (such as deleting and filling missing values), processing outliers (such as deleting, correcting, or retaining and specially marking them), correcting data errors (such as correcting data types, formats, and logic), and removing duplicate values (identifying and deleting identical data records); format normalization includes: unifying data types (converting data to a unified format), standardizing encoding (unifying multiple encoding formats), data standardization (eliminating differences between numerical and text data), data integration, and data validation.
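A few of the cleaning and normalization operations listed above (filling missing values, unifying data types, removing duplicates) can be illustrated with plain Python. The record layout, the `defaults` argument, and the function name `clean_records` are assumptions for this sketch; outlier handling and encoding normalization are not covered.

```python
def clean_records(records, defaults):
    """Fill missing values, unify types, and drop exact duplicates (order preserved)."""
    seen = set()
    cleaned = []
    for record in records:
        row = {}
        for field, default in defaults.items():
            value = record.get(field)
            if value is None or value == "":
                value = default                      # fill missing value
            if isinstance(value, str):
                value = value.strip()                # normalize whitespace
            if isinstance(default, int) and not isinstance(value, int):
                value = int(value)                   # unify numeric type
            row[field] = value
        key = tuple(row.items())
        if key not in seen:                          # remove duplicate records
            seen.add(key)
            cleaned.append(row)
    return cleaned

rows = clean_records(
    [{"name": " Li ", "age": "42"}, {"name": "Li", "age": 42}, {"name": "Wang"}],
    defaults={"name": "unknown", "age": 0},
)
```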
[0090] Key elements are extracted from the standard multi-source data to obtain key data information;
[0091] In this embodiment, key data information can be medical data information, including: basic patient information such as name, gender, age, ID number, and contact information; diagnostic information such as disease name, ICD code, diagnosis date, and diagnosing doctor; treatment information such as surgery name, surgery date, surgeon, medication name, dosage, and administration time; examination results such as laboratory test results, imaging examination results, examination date, and examining doctor; and cost information such as medical expenses, payment method, and reimbursement status. Key elements can be extracted using natural language processing, regular expression matching, and other methods.
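Regular expression matching, one of the extraction methods mentioned above, might be sketched as follows. The patterns are simplified assumptions: `ICD10_PATTERN` only recognizes the common letter, two digits, optional decimal shape and is not a full ICD validator, and `DATE_PATTERN` assumes ISO-style dates. The sample note is invented.

```python
import re

# Simplified, assumed patterns; real clinical text would need broader rules.
ICD10_PATTERN = re.compile(r"\b[A-Z]\d{2}(?:\.\d{1,2})?\b")
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

def extract_key_elements(note):
    """Pull candidate ICD codes and dates out of a free-text note."""
    return {
        "icd_codes": ICD10_PATTERN.findall(note),
        "dates": DATE_PATTERN.findall(note),
    }

info = extract_key_elements(
    "Diagnosed 2024-03-18 with type 2 diabetes (E11.9); follow-up 2024-06-20."
)
```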
[0092] A data index library is constructed based on the key data information;
[0093] In this embodiment, the construction of the data index library includes the following steps: designing the index structure, i.e., determining the index fields (the key information elements), the index type (such as full-text index, hash index, or B-tree index), and the index hierarchy and relationships; selecting an indexing tool suited to the data scale, query requirements, and system environment, such as the indexing function of a relational database management system (such as MySQL or Oracle) or a dedicated search engine (such as Elasticsearch or Solr); creating the index, i.e., using the selected indexing tool to index the key data information according to the designed index structure; and optimizing the index based on query performance feedback, such as adjusting index parameters, adding caching, and optimizing query statements.
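To make the "design the structure, then create the index" steps concrete, here is a minimal in-memory sketch of a hash-style index library. It is an assumption-laden stand-in for the database or search-engine tooling the paragraph names: `build_index_library` and the sample records are hypothetical, and each index simply maps a field value to the positions of matching records.

```python
from collections import defaultdict

def build_index_library(records, index_fields):
    """Build one hash index per field: field -> value -> list of record positions."""
    library = {field: defaultdict(list) for field in index_fields}
    for position, record in enumerate(records):
        for field in index_fields:
            if field in record:
                library[field][record[field]].append(position)
    return library

records = [
    {"patient_id": "p001", "diagnosis": "diabetes"},
    {"patient_id": "p002", "diagnosis": "asthma"},
    {"patient_id": "p003", "diagnosis": "diabetes"},
]
library = build_index_library(records, ["patient_id", "diagnosis"])
```

A lookup such as `library["diagnosis"]["diabetes"]` then returns the matching record positions without scanning the whole list.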
[0094] Obtain historical query information, and filter the data index library based on the historical query information to obtain the data index information.
[0095] In this embodiment, the steps for filtering information in the data index include: analyzing historical queries: statistically analyzing query frequency, query type, query keywords, etc., to identify high-frequency queries and hot data; adjusting the indexing strategy: adjusting the indexing strategy based on the historical query analysis results, such as increasing the index weight of high-frequency query keywords and optimizing the query path; and filtering information: when executing a new query, using the adjusted indexing strategy to filter information in the index to quickly locate the data most relevant to the query request, thereby obtaining data index information.
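The frequency analysis and filtering described above can be sketched with `collections.Counter`. The query log, the index contents, and the name `hot_index_entries` are invented for illustration, and "hot" is assumed to mean simply the `top_n` most frequently queried keys.

```python
from collections import Counter

def hot_index_entries(index, query_log, top_n=2):
    """Keep only the index entries whose keys are among the most frequently queried."""
    frequency = Counter(query_log)
    hot_keys = [key for key, _ in frequency.most_common(top_n)]
    return {key: index[key] for key in hot_keys if key in index}

index = {"diabetes": [1, 4], "asthma": [2], "flu": [3]}
query_log = ["flu", "diabetes", "flu", "asthma", "diabetes", "flu"]
hot = hot_index_entries(index, query_log)
```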
[0096] This embodiment obtains multi-source data; performs data cleaning and format normalization on the multi-source data to obtain standard multi-source data; extracts key elements from the standard multi-source data to obtain key data information; constructs a data index library based on the key data information; obtains historical query information, and filters the data index library based on the historical query information, thereby effectively obtaining data index information collected and processed from multiple different sources to provide reliable data support for subsequent operations.
[0097] Continuing to refer to Figure 3, in some optional implementations of this embodiment, step S10 includes the following steps:
[0098] Step S101: Obtain the index information identifier;
[0099] In this embodiment, the index information identifier is generated from the search terms or query terms entered through the user interface. It contains the content corresponding to those terms, such as a specific patient ID, disease name, or date range.
[0100] Step S102: Extract the original data index information from the preset database according to the index information identifier;
[0101] In this embodiment, the index information identifier is a unique information identifier pointing to the original data index information. The index information identifier is used to perform matching queries in the preset database to extract the original data index information.
[0102] Step S103: Preprocess the original data index information to obtain the data index information.
[0103] In this embodiment, preprocessing includes: data cleaning, i.e., deleting or correcting erroneous data, handling missing values, and removing duplicate records; data transformation, i.e., converting data into a unified format or unit, such as converting date strings to date objects and unifying different encoding systems; and data standardization, i.e., standardizing system data so that data from different sources use the same terminology when describing the same thing. Preprocessing the original data index information ensures that the acquired data index information conforms to the system's processing standards.
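Two of the preprocessing operations above, converting date strings to date objects and mapping different terms onto one vocabulary, could look like this. The accepted date formats in `DATE_FORMATS` and the synonym table `TERM_SYNONYMS` are assumptions made for the sketch.

```python
from datetime import datetime

DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")                  # assumed input formats
TERM_SYNONYMS = {"t2dm": "type 2 diabetes", "htn": "hypertension"}   # hypothetical vocabulary

def parse_date(text):
    """Convert a date string in any of the assumed formats to a date object."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

def normalize_term(term):
    """Map a term to its standard form so different sources use the same wording."""
    key = term.strip().lower()
    return TERM_SYNONYMS.get(key, key)

visit_date = parse_date("2024/09/06")
diagnosis = normalize_term(" HTN ")
```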
[0104] This embodiment obtains index information identifiers; extracts raw data index information from a preset database based on the index information identifiers; and preprocesses the raw data index information to obtain reliable and effective data index information, which facilitates subsequent calculation of the index expansion ratio.
[0105] Continuing to refer to Figure 4, in some optional implementations of this embodiment, step S20 includes the following steps:
[0106] Step S201: Obtain the index data size and index data structure of the data index information;
[0107] In this embodiment, the index data size refers to the amount of storage space occupied by the index itself, usually measured in bytes, kilobytes (KB), megabytes (MB), or larger units. This size depends on the amount of data contained in the index, the data type (such as integers, floating-point numbers, strings, etc.), and the complexity of the index (such as single-column indexes, composite indexes, etc.). The index data structure determines how the index data is stored and accessed. Common index data structures include B-trees (such as B+ trees), hash tables, bitmap indexes, etc. Different data structures have different impacts on index performance (such as query speed, insertion / deletion performance) and storage space.
[0108] Step S202: Calculate the index data space based on the index data size and the index data structure;
[0109] In this embodiment, calculating the index data space includes the following steps: obtaining the data types of the index columns (such as integers, floating-point numbers, and strings) and their storage requirements; estimating the space required for each index entry based on the data type and the characteristics of the index structure (for example, in a B+ tree index, an index entry may include a key value, a pointer or offset, and possibly additional information such as a transaction ID or deletion flag); determining the index data volume, i.e., the number of rows in the table or tables that the index will cover; and calculating the total storage space required for the index based on the size of the index entries, the overhead of the index structure, and the index data volume, which includes estimating the hierarchy of the index structure, the number of nodes, and the number of index entries per node. The result is the index data space occupied by the index.
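As a rough illustration of this estimate, the following Python sketch sizes a B+ tree index level by level. The page size, fill factor, pointer size, and per-entry overhead are assumed values chosen for the example, and fixed-size entries are assumed throughout; none of these figures come from this application.

```python
import math


def estimate_btree_index_bytes(row_count, key_bytes, pointer_bytes=8,
                               entry_overhead_bytes=8, page_bytes=8192,
                               fill_factor=0.9):
    """Estimate the storage footprint of a B+ tree index, level by level."""
    # Size of one index entry: key value + pointer + additional information.
    entry_bytes = key_bytes + pointer_bytes + entry_overhead_bytes
    # Number of entries that fit in one node (page) at the assumed fill factor.
    entries_per_page = max(2, int(page_bytes * fill_factor) // entry_bytes)
    # Leaf level: one entry per covered row.
    pages_in_level = max(1, math.ceil(row_count / entries_per_page))
    total_pages = pages_in_level
    levels = 1
    # Internal levels: one entry per child page, until a single root remains.
    while pages_in_level > 1:
        pages_in_level = math.ceil(pages_in_level / entries_per_page)
        total_pages += pages_in_level
        levels += 1
    return {"levels": levels, "pages": total_pages, "bytes": total_pages * page_bytes}
```

For one million rows with an 8-byte key, this sketch yields a three-level tree of 3,270 pages, or 26,787,840 bytes under the assumed parameters.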
[0110] Step S203: Obtain the preset index occupied space, and calculate the index expansion ratio based on the index data space and the preset index occupied space.
[0111] In this embodiment, the index expansion ratio is calculated according to the following formula: Index expansion ratio = Index data space / Preset index occupied space, wherein the calculated value of the index expansion ratio can be a percentage value or a decimal value.
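The formula can be written as a short Python sketch. The function names and the sample sizes (150 MB of index data against a 100 MB preset) are illustrative assumptions.

```python
def index_expansion_ratio(index_data_space, preset_index_space):
    """Index expansion ratio = index data space / preset index occupied space."""
    if preset_index_space <= 0:
        raise ValueError("preset index occupied space must be positive")
    return index_data_space / preset_index_space


def as_percentage(ratio):
    """Render a decimal ratio as a percentage string."""
    return f"{ratio * 100:.1f}%"


# Example: 150 MB of index data against a 100 MB preset occupied space.
ratio = index_expansion_ratio(index_data_space=150 * 1024**2,
                              preset_index_space=100 * 1024**2)
print(ratio, as_percentage(ratio))  # prints: 1.5 150.0%
```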
[0112] This embodiment obtains the index data size and index data structure of the data index information; calculates the index data space based on the index data size and index data structure; obtains the preset index occupied space; and calculates the index expansion ratio based on the index data space and the preset index occupied space. This effectively realizes the calculation of the data index expansion ratio based on the data space corresponding to the data index information, facilitating subsequent comparison and judgment operations.
[0113] Continuing to refer to Figure 5, in some optional implementations of this embodiment, step S40 includes the following steps:
[0114] Step S401: Divide the data index information into datasets to obtain a subset of data indexes;
[0115] In this embodiment, the data index information can be medical data index information, and the data index subset can be a medical data index subset. For example, it can be initially divided according to the category of medical data index information, such as information release index information (such as the establishment and scale of health institutions, distribution of health personnel resources, and collection and allocation of health funds), and business construction index information (such as medical services, public health, drug supply, medical security, and health management). Then, it can be further divided according to the attributes of the medical data index information subset (such as time, region, disease, discipline, institution, etc.) to obtain the medical data index subset.
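The two-stage division (first by category, then by attribute) might look like the following Python sketch. The field names `category` and `region` and the sample records are hypothetical; any of the attributes listed above (time, region, disease, discipline, institution) could serve as the second key.

```python
from collections import defaultdict


def partition_index_info(records, category_key="category", attribute_key="region"):
    """Group index records by (category, attribute) to form data index subsets."""
    subsets = defaultdict(list)
    for rec in records:
        subsets[(rec[category_key], rec[attribute_key])].append(rec)
    return dict(subsets)


records = [
    {"index_name": "idx_staff", "category": "information_release", "region": "north"},
    {"index_name": "idx_funds", "category": "information_release", "region": "south"},
    {"index_name": "idx_visits", "category": "business_construction", "region": "north"},
    {"index_name": "idx_drugs", "category": "business_construction", "region": "north"},
]
subsets = partition_index_info(records)
```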
[0116] Step S402: Obtain the data index structure of the data index subset, and select an initial data index structure from the data index structure;
[0117] In this embodiment, the data index structure is extracted from each data index subset, and one or more of them are selected as the initial data index structure. The method for selecting the initial data index structure can be random, and the data index structure will be used as the starting point of the simulated annealing algorithm.
[0118] Step S403: Obtain preset simulated annealing parameters;
[0119] In this embodiment, the preset simulated annealing parameters refer to the parameters set by the simulated annealing algorithm, such as initial temperature, cooling rate, number of iterations, etc. These parameters can be adjusted and set according to the actual situation.
[0120] Step S404: Distribute the data index subset to multiple processing threads, and perform simulated annealing calculations according to the preset simulated annealing parameters and the initial data index structure, and use the calculated optimal solution as the optimal data index structure;
[0121] In this embodiment, the data index subsets are distributed across multiple processing threads to execute the simulated annealing algorithm in parallel. Each thread performs iterative calculations based on the initial data index structure and the preset simulated annealing parameters. In each iteration, the algorithm makes a minor adjustment to the index structure (such as changing the order of the indexes or adjusting the depth or breadth of the indexes) and calculates the cost (or energy) function value of the adjusted structure. If the new index structure has a lower cost (i.e., better index performance), it is accepted as the current solution; otherwise, the worse solution is accepted with a certain probability (usually related to the current temperature and the cost difference) in order to avoid getting trapped in local optima. As the number of iterations increases and the temperature gradually decreases, the algorithm converges toward a global or near-global optimal solution, i.e., the optimal data index structure.
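The iteration described above can be sketched in Python as follows. This is a toy model under stated assumptions: the "index structure" is reduced to a column order for a composite index, the cost function is a hypothetical one that rewards placing frequently queried columns first, and the acceptance rule is the standard Metropolis criterion exp(-delta / T). The application does not specify any of these choices, and the parameter defaults are arbitrary.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnealingParams:
    """Preset simulated annealing parameters (values are illustrative)."""
    initial_temperature: float = 10.0
    cooling_rate: float = 0.95
    iterations: int = 500


def cost(column_order, query_frequency):
    """Hypothetical cost: frequently queried columns are cheaper when placed first."""
    return sum(pos * query_frequency[col] for pos, col in enumerate(column_order))


def anneal(initial_order, query_frequency, params, seed):
    """Anneal one subset; expects at least two columns in initial_order."""
    rng = random.Random(seed)  # a per-task generator keeps threads independent
    current = list(initial_order)
    current_cost = cost(current, query_frequency)
    best, best_cost = list(current), current_cost
    temperature = params.initial_temperature
    for _ in range(params.iterations):
        # Minor adjustment: swap two positions in the column order.
        i, j = rng.sample(range(len(current)), 2)
        candidate = list(current)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        delta = cost(candidate, query_frequency) - current_cost
        # Always accept improvements; accept worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        # Cool down, with a floor that avoids division by zero.
        temperature = max(temperature * params.cooling_rate, 1e-9)
    return best, best_cost


def anneal_subsets(subsets, params, max_workers=4):
    """Run one annealing task per data index subset on a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(anneal, order, freq, params, seed)
            for seed, (name, (order, freq)) in enumerate(sorted(subsets.items()))
        }
        return {name: future.result() for name, future in futures.items()}
```

Here `subsets` maps a subset name to a pair of (initial column order, query-frequency dictionary). In standard CPython builds, threads do not speed up CPU-bound work of this kind because of the global interpreter lock, so a process pool may be a better fit when the cost function is expensive.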
[0122] Step S405: Reconstruct the index according to the optimal data index structure to obtain valid data index information;
[0123] In this embodiment, index reconstruction refers to rebuilding the current data index information based on the optimal data index structure. After index reconstruction based on the optimal data index structure, the resulting valid data index information offers higher retrieval efficiency and lower query cost.
[0124] Step S406: Update the data index information of the current record according to the valid data index information to obtain the first real-time index information.
[0125] In this embodiment, the data index information of the current record is the information data recorded in the index information update record. It can be updated by overwriting the data index information of the current record with the valid data index information. The data index information of the current record is the same as the data index information processed above.
[0126] This embodiment divides the data index information into a dataset to obtain a data index subset; obtains the data index structure of the data index subset, and selects an initial data index structure from the data index structure; obtains preset simulated annealing parameters; distributes the data index subset to multiple processing threads, and performs simulated annealing calculations according to the preset simulated annealing parameters and the initial data index structure, taking the calculated optimal solution as the optimal data index structure; reconstructs the index according to the optimal data index structure to obtain valid data index information; and updates the data index information of the current record according to the valid data index information, thereby effectively and quickly updating the index status according to the data index information to obtain the first real-time index information.
[0127] Continuing to refer to Figure 6, in some optional implementations of this embodiment, step S405 includes the following steps:
[0128] Step S4051: Delete the data index information;
[0129] In this embodiment, the optimal data index structure is used to determine which data index information needs to be deleted. The deletion command can be executed using commands or tools provided by the database system. After the deletion operation is completed, it can be confirmed that the data index information has been correctly deleted, and it can be checked whether the database performance and data access have been affected.
[0130] Step S4052: Obtain the index structure information of the optimal data index structure, wherein the index structure information includes index key, index type, and index partition;
[0131] In this embodiment, the index structure information includes the index key (the column or combination of columns used for indexing), the index type (such as B-tree, hash, bitmap, etc.), and the index partition (if applicable, that is, the index data is distributed to different physical regions to improve performance).
[0132] Step S4053: Create an index based on the index key, the index type, and the index partition to obtain optimized data index information;
[0133] In this embodiment, the index creation steps include: writing SQL statements to create a new index based on the extracted index structure information, which can be achieved using the CREATE INDEX statement, specifying the index name, index key, index type, and any partitioning options; and executing the SQL statements in the database management system to create the new data index. During the index creation process, database performance and resource usage can be monitored to ensure that the operation does not negatively impact business operations.
[0134] Step S4054: Perform indicator verification on the optimized data index information, and take the verified optimized data index information as the valid data index information.
[0135] In this embodiment, the validity of the data index information obtained after the index is created is determined by verifying the optimized data index information with indicators. When the optimized data index information meets all the verification indicators, the optimized data index information is output as valid data index information.
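Steps S4051 to S4054 can be sketched with Python's built-in `sqlite3` module. SQLite is used here only because it ships with Python; its `CREATE INDEX` statement has no index-type or partition clause, so those two options are omitted, and a database that supports them would add them to the statement. The verification shown (checking that the new index covers the expected columns in order) is one assumed example of an indicator, and the table and index names are hypothetical.

```python
import sqlite3


def rebuild_index(conn, table, index_name, index_keys):
    """Drop an index, recreate it from the chosen keys, and verify the result."""
    # Identifiers are interpolated directly, so they must come from trusted metadata.
    # S4051: delete the existing index.
    conn.execute(f'DROP INDEX IF EXISTS "{index_name}"')
    # S4052/S4053: create the new index from the index keys.
    columns = ", ".join(f'"{col}"' for col in index_keys)
    conn.execute(f'CREATE INDEX "{index_name}" ON "{table}" ({columns})')
    # S4054: verify that the index exists and covers the expected columns in order.
    rows = conn.execute(f'PRAGMA index_info("{index_name}")').fetchall()
    created_columns = [row[2] for row in rows]  # row = (seqno, cid, column name)
    return created_columns == list(index_keys)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient_id INTEGER, visit_date TEXT, dept TEXT)")
conn.execute("CREATE INDEX idx_visits ON visits (dept)")
ok = rebuild_index(conn, "visits", "idx_visits", ["patient_id", "visit_date"])
```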
[0136] This embodiment involves deleting the data index information; obtaining the index structure information of the optimal data index structure, wherein the index structure information includes index key, index type, and index partition; creating an index based on the index key, index type, and index partition to obtain optimized data index information; verifying the optimized data index information using metrics, and using the verified optimized data index information as the valid data index information. This effectively rebuilds the index based on the index structure information of the data index information to obtain valid data index information, facilitating subsequent updates to the data index information.
[0137] Continuing to refer to Figure 7, in some optional implementations of this embodiment, step S50 includes the following steps:
[0138] Step S501: Calculate the difference between the index expansion ratio and the preset expansion threshold to obtain the expansion difference;
[0139] In this embodiment, the preset expansion threshold is a value set according to system performance requirements and storage efficiency, used to determine whether the index needs to be adjusted. The preset expansion threshold can be set and adjusted according to the actual situation.
[0140] Step S502: Obtain a preset expansion level mapping table, and identify the corresponding expansion level in the expansion level mapping table according to the expansion difference;
[0141] In this embodiment, the expansion level mapping table defines the expansion levels corresponding to different expansion difference ranges, and each level corresponds to a different update task. The calculated expansion difference is matched against the expansion level mapping table to identify the corresponding expansion level. In this embodiment, the expansion levels are set to Level 1, Level 2, and Level 3, with expansion difference ranges of 0-0.3, 0.3-0.6, and 0.6-0.9, respectively. These ranges can be set and adjusted according to actual conditions.
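A lookup against the mapping table can be sketched in Python as follows. Two details are assumptions rather than statements of this application: the difference is taken as an absolute value (this branch is reached when the ratio is below the threshold, so the signed difference would be negative), and the ranges are treated as half-open intervals so that a boundary value such as 0.3 maps to exactly one level.

```python
# (lower bound inclusive, upper bound exclusive, level); ranges from the embodiment.
EXPANSION_LEVELS = [(0.0, 0.3, 1), (0.3, 0.6, 2), (0.6, 0.9, 3)]


def expansion_level(ratio, threshold, levels=EXPANSION_LEVELS):
    """Return the level whose range contains |ratio - threshold|, or None."""
    # Rounding guards against floating-point noise near the range boundaries.
    difference = round(abs(ratio - threshold), 6)
    for lower, upper, level in levels:
        if lower <= difference < upper:
            return level
    return None  # the difference falls outside every configured range
```

For example, with a threshold of 0.8, a ratio of 0.7 falls in Level 1 and a ratio of 0.4 falls in Level 2.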
[0142] Step S503: Obtain the expansion level information identifier corresponding to the expansion level, and extract the update task from the database according to the expansion level information identifier;
[0143] In this embodiment, the expansion level information identifier is a unique information identifier for the corresponding update task. The update task is effectively extracted by matching and querying the database based on the expansion level information identifier.
[0144] Step S504: Adjust the data index information according to the update task, and update the data index information of the current record according to the adjusted data index information to obtain the second real-time index information.
[0145] In this embodiment, the update tasks include a first update task, a second update task, and a third update task. The first update task performs basic index cleanup and compression operations, including removing invalid and expired index entries and performing simple compression on the index structure to optimize storage space and query efficiency. The second update task performs index optimization operations to eliminate index fragmentation and improve index query functionality. The third update task adjusts the index strategy (such as the granularity, number, and type of the index) to optimize index performance. After the data index information is adjusted according to the corresponding update task, the adjusted data index information is used as the second real-time index information.
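The correspondence between levels and update tasks can be sketched as a dispatch table in Python. The in-memory index model (a dictionary with `entries` and `strategy`) and the body of each task are simplified placeholders: cleanup is modeled as filtering invalid entries, defragmentation as re-sorting entries by key, and strategy adjustment as overwriting one hypothetical setting.

```python
def first_update_task(index):
    """Level 1: basic cleanup, modeled as dropping invalid or expired entries."""
    return {**index, "entries": [e for e in index["entries"] if e["valid"]]}


def second_update_task(index):
    """Level 2: eliminate fragmentation, modeled as re-sorting entries by key."""
    return {**index, "entries": sorted(index["entries"], key=lambda e: e["key"])}


def third_update_task(index):
    """Level 3: adjust the index strategy; the new type is a placeholder choice."""
    return {**index, "strategy": {**index["strategy"], "type": "btree"}}


# Expansion level -> update task, standing in for the lookup by identifier.
UPDATE_TASKS = {1: first_update_task, 2: second_update_task, 3: third_update_task}


def apply_update_task(index, level, tasks=UPDATE_TASKS):
    """Adjust the index with the task registered for the given expansion level."""
    if level not in tasks:
        raise KeyError(f"no update task registered for level {level}")
    return tasks[level](index)
```

Each task returns a new dictionary instead of mutating its input, so the current record is only overwritten once the adjusted result is available.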
[0146] This embodiment calculates the expansion difference based on the difference between the index expansion ratio and the preset expansion threshold; obtains a preset expansion level mapping table; identifies the corresponding expansion level in the expansion level mapping table based on the expansion difference; obtains the expansion level information identifier corresponding to the expansion level; extracts the update task from the database based on the expansion level information identifier; adjusts the data index information according to the update task; and updates the data index information of the current record based on the adjusted data index information, thereby effectively updating the index status according to the expansion level corresponding to the index expansion ratio to obtain the second real-time index information.
[0147] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing related hardware with computer-readable instructions. These computer-readable instructions can be stored in a computer-readable storage medium and, when executed, can carry out the processes of the above method embodiments. The aforementioned storage medium can be a non-volatile storage medium such as a magnetic disk, optical disk, or read-only memory (ROM), or it can be a random access memory (RAM).
[0148] It should be understood that although the steps in the flowcharts of the accompanying figures are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the accompanying figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.
[0149] With further reference to Figure 8, as an implementation of the method shown in Figure 1, this application provides an embodiment of an index update apparatus. This apparatus embodiment corresponds to the method embodiment shown in Figure 1, and the apparatus can be applied to various electronic devices.
[0150] As shown in Figure 8, the index update device 600 described in this embodiment includes: an information acquisition module 601, a ratio calculation module 602, a data judgment module 603, a first processing module 604, and a second processing module 605. Wherein:
[0151] The information acquisition module 601 is used to acquire data index information;
[0152] The ratio calculation module 602 is used to calculate the index expansion ratio based on the data index information;
[0153] The data judgment module 603 is used to determine whether the index expansion ratio is greater than or equal to a preset expansion threshold;
[0154] The first processing module 604 is used to perform concurrent index reconstruction on the data index information according to the simulated annealing algorithm if the index expansion ratio is greater than or equal to the preset expansion threshold, and to update the index status according to the reconstructed data index information to obtain the first real-time index information;
[0155] The second processing module 605 is used to calculate the expansion level corresponding to the index expansion ratio if the index expansion ratio is less than the preset expansion threshold, and to call the corresponding update task according to the expansion level to update the index status of the data index information to obtain the second real-time index information.
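The cooperation of the five modules can be sketched as a small Python class. The class name, the callable-based wiring, and the return labels `"first"` and `"second"` are illustrative assumptions; the sketch shows only the control flow, with each module supplied as a function.

```python
class IndexUpdateDevice:
    """Wires the five modules together; each module is passed in as a callable."""

    def __init__(self, acquire, calculate_ratio, rebuild, level_update, threshold):
        self.acquire = acquire                  # information acquisition module
        self.calculate_ratio = calculate_ratio  # ratio calculation module
        self.rebuild = rebuild                  # first processing module
        self.level_update = level_update        # second processing module
        self.threshold = threshold              # preset expansion threshold

    def update(self, identifier):
        info = self.acquire(identifier)
        ratio = self.calculate_ratio(info)
        # Data judgment module: choose a branch by comparing with the threshold.
        if ratio >= self.threshold:
            return "first", self.rebuild(info)
        return "second", self.level_update(info, ratio)
```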
[0156] This embodiment, by employing the aforementioned device, can acquire data index information; calculate the index bloat ratio based on the data index information; determine whether the index bloat ratio is greater than or equal to a preset bloat threshold; if the index bloat ratio is greater than or equal to the preset bloat threshold, then concurrently rebuild the data index information using a simulated annealing algorithm, and update the index status based on the rebuilt data index information to obtain first real-time index information; if the index bloat ratio is less than the preset bloat threshold, then calculate the bloat level corresponding to the index bloat ratio, and call the corresponding update task based on the bloat level to update the index status of the data index information to obtain second real-time index information. This achieves fast and effective updates to the data index, ensuring that the data index does not cause a decrease in system query performance due to bloat.
[0157] To address the aforementioned technical problems, embodiments of this application also provide a computer device. Please refer to Figure 9, which is a basic structural block diagram of the computer device in this embodiment.
[0158] The computer device 7 includes a memory 71, a processor 72, and a network interface 73 that are interconnected via a system bus. It should be noted that only the computer device 7 with components 71-73 is shown in the figure; however, it should be understood that it is not required to implement all the shown components, and more or fewer components can be implemented alternatively. Those skilled in the art will understand that the computer device described here is a device capable of automatically performing numerical calculations and / or information processing according to pre-set or stored instructions, and its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, etc.
[0159] The computer device can be a desktop computer, laptop, handheld computer, or cloud server, etc. The computer device can interact with the user via a keyboard, mouse, remote control, touchpad, or voice control.
[0160] The memory 71 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 71 may be an internal storage unit of the computer device 7, such as the hard disk or memory of the computer device 7. In other embodiments, the memory 71 may also be an external storage device of the computer device 7, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, etc., equipped on the computer device 7. Of course, the memory 71 may include both the internal storage unit and its external storage device of the computer device 7. In this embodiment, the memory 71 is typically used to store the operating system and various application software installed on the computer device 7, such as computer-readable instructions for index update methods. In addition, the memory 71 can also be used to temporarily store various types of data that have been output or will be output.
[0161] In some embodiments, the processor 72 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or other data processing chip. The processor 72 is typically used to control the overall operation of the computer device 7. In this embodiment, the processor 72 is used to execute computer-readable instructions stored in the memory 71 or to process data, such as executing computer-readable instructions for the index update method.
[0162] The network interface 73 may include a wireless network interface or a wired network interface, which is typically used to establish communication connections between the computer device 7 and other electronic devices.
[0163] This embodiment, by employing the aforementioned computer equipment, can acquire data index information; calculate the index bloat ratio based on the data index information; determine whether the index bloat ratio is greater than or equal to a preset bloat threshold; if the index bloat ratio is greater than or equal to the preset bloat threshold, then concurrently rebuild the data index information using a simulated annealing algorithm, and update the index status based on the rebuilt data index information to obtain first real-time index information; if the index bloat ratio is less than the preset bloat threshold, then calculate the bloat level corresponding to the index bloat ratio, and call the corresponding update task based on the bloat level to update the index status of the data index information to obtain second real-time index information. This achieves fast and effective updates to the data index, ensuring that the data index does not cause a decrease in system query performance due to bloat.
[0164] This application also provides another embodiment, namely, providing a computer-readable storage medium storing computer-readable instructions that can be executed by at least one processor to cause the at least one processor to perform the steps of the index update method as described above.
[0165] This embodiment, by employing the aforementioned computer-readable storage medium, can acquire data index information; calculate the index bloat ratio based on the data index information; determine whether the index bloat ratio is greater than or equal to a preset bloat threshold; if the index bloat ratio is greater than or equal to the preset bloat threshold, then concurrently rebuild the data index information using a simulated annealing algorithm, and update the index status based on the rebuilt data index information to obtain first real-time index information; if the index bloat ratio is less than the preset bloat threshold, then calculate the bloat level corresponding to the index bloat ratio, and call the corresponding update task based on the bloat level to update the index status of the data index information to obtain second real-time index information. This achieves fast and effective updates to the data index, ensuring that the data index does not cause a decrease in system query performance due to bloat.
[0166] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk), and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.
[0167] Obviously, the embodiments described above are only some embodiments of this application, not all embodiments. The accompanying drawings show preferred embodiments of this application, but do not limit the patent scope of this application. This application can be implemented in many different forms and is not limited to the embodiments described herein; rather, the purpose of providing these embodiments is to provide a more thorough and comprehensive understanding of the disclosure of this application. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing specific embodiments, or make equivalent substitutions for some of the technical features. Any equivalent structures made using the content of this application's specification and drawings, directly or indirectly applied to other related technical fields, are similarly within the scope of patent protection of this application.
Claims
1. An index update method, characterized in that it includes the following steps: obtaining data index information; calculating an index expansion ratio based on the data index information; determining whether the index expansion ratio is greater than or equal to a preset expansion threshold; if the index expansion ratio is greater than or equal to the preset expansion threshold, performing concurrent index reconstruction on the data index information according to a simulated annealing algorithm, and updating the index status according to the reconstructed data index information to obtain first real-time index information, which specifically includes: dividing the data index information into datasets to obtain a data index subset; obtaining the data index structure of the data index subset, and selecting an initial data index structure from the data index structure; obtaining preset simulated annealing parameters; distributing the data index subset to multiple processing threads, performing simulated annealing calculation according to the preset simulated annealing parameters and the initial data index structure, and taking the calculated optimal solution as the optimal data index structure; performing index reconstruction according to the optimal data index structure to obtain valid data index information; and updating the data index information of the current record according to the valid data index information to obtain the first real-time index information; and if the index expansion ratio is less than the preset expansion threshold, calculating the expansion level corresponding to the index expansion ratio, and calling the corresponding update task according to the expansion level to update the index status of the data index information to obtain second real-time index information.
2. The index update method according to claim 1, characterized in that, before the step of obtaining data index information, the method further includes: acquiring multi-source data; cleaning and normalizing the multi-source data to obtain standard multi-source data; extracting key elements from the standard multi-source data to obtain key data information; constructing a data index library based on the key data information; and obtaining historical query information, and filtering the data index library based on the historical query information to obtain the data index information.
3. The index update method according to claim 1, characterized in that the step of obtaining data index information specifically includes: obtaining an index information identifier; extracting original data index information from a preset database based on the index information identifier; and preprocessing the original data index information to obtain the data index information.
4. The index update method according to claim 1, characterized in that the step of calculating the index expansion ratio based on the data index information specifically includes: obtaining the index data size and index data structure of the data index information; calculating the index data space based on the index data size and the index data structure; and obtaining the preset index occupied space, and calculating the index expansion ratio based on the index data space and the preset index occupied space.
5. The index update method according to claim 1, characterized in that the step of performing index reconstruction according to the optimal data index structure to obtain valid data index information specifically includes: deleting the data index information; obtaining the index structure information of the optimal data index structure, wherein the index structure information includes an index key, an index type, and an index partition; creating an index based on the index key, the index type, and the index partition to obtain optimized data index information; and performing indicator verification on the optimized data index information, and taking the verified optimized data index information as the valid data index information.
6. The index update method according to claim 1, characterized in that the step of calculating the expansion level corresponding to the index expansion ratio, and calling the corresponding update task according to the expansion level to update the index status of the data index information to obtain the second real-time index information, specifically includes: calculating the difference between the index expansion ratio and the preset expansion threshold to obtain the expansion difference; obtaining a preset expansion level mapping table, and identifying the corresponding expansion level in the expansion level mapping table according to the expansion difference; obtaining the expansion level information identifier corresponding to the expansion level, and extracting the update task from the database based on the expansion level information identifier; and adjusting the data index information according to the update task, and updating the data index information of the current record according to the adjusted data index information to obtain the second real-time index information.
7. An index update device, characterized in that the index update device implements the steps of the index update method as described in any one of claims 1 to 6, and the index update device includes: an information acquisition module, used to acquire data index information; a ratio calculation module, used to calculate the index expansion ratio based on the data index information; a data judgment module, used to determine whether the index expansion ratio is greater than or equal to a preset expansion threshold; a first processing module, used to perform concurrent index reconstruction on the data index information according to the simulated annealing algorithm if the index expansion ratio is greater than or equal to the preset expansion threshold, and to update the index status according to the reconstructed data index information to obtain the first real-time index information; and a second processing module, used to calculate the expansion level corresponding to the index expansion ratio if the index expansion ratio is less than the preset expansion threshold, and to call the corresponding update task according to the expansion level to update the index status of the data index information to obtain the second real-time index information.
8. A computer device, characterized in that the computer device includes a memory and a processor, wherein the memory stores computer-readable instructions, and the processor executes the computer-readable instructions to implement the steps of the index update method as described in any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-readable instructions which, when executed by a processor, implement the steps of the index update method as described in any one of claims 1 to 6.