Optimized storage method and system for dental chair disinfection log
By dynamically updating the dictionary used to compress dental chair disinfection logs, and by evaluating pattern changes with distribution entropy and the Hellinger distance, the method improves the compression efficiency of the Zstd algorithm. This addresses the low compression efficiency of existing technologies and achieves more efficient storage of disinfection logs.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: FOSHAN SAFETY MEDICAL EQUIP CO LTD
- Filing Date: 2026-03-24
- Publication Date: 2026-06-19
AI Technical Summary
The existing Zstd compression algorithm cannot effectively capture dynamically changing, highly localized data patterns when compressing dental chair disinfection logs, resulting in low compression efficiency.
The distribution entropy of each sliding window and the Hellinger distance between the probability vectors of adjacent windows are calculated to evaluate the necessity of updating the dictionary and of compressing the window. The dictionary is dynamically updated to follow pattern changes in the disinfection logs, and the compression process is optimized with the help of a string matching algorithm.
The Zstd algorithm has improved matching efficiency when compressing dental chair disinfection logs, thus enhancing the compression efficiency and real-time processing capabilities of disinfection logs.
Figure CN122240028A_ABST
Description
Technical Field
[0001] This application relates to the field of optimized storage technology. More specifically, this application relates to an optimized storage method and system for dental chair disinfection logs.
Background Technology
[0002] Dental chair disinfection logs are crucial documents used by medical institutions to record the disinfection process of dental chairs. They detail the disinfection time, operator, disinfectant used, disinfection mode, and results for each chair, serving as vital evidence for ensuring compliance with medical procedures, infection control traceability, and maintaining healthy doctor-patient relationships. With the increasing informatization of clinics and increasingly stringent regulatory requirements, dental chair disinfection logs have transitioned from paper-based records to electronic storage. In the digital age, a medium to large-sized dental clinic generates a large number of disinfection logs daily, forming a massive dataset. Therefore, optimizing the storage of dental chair disinfection logs is of significant practical importance.
[0003] In existing technologies, the Zstandard (Zstd) compression algorithm, through long-distance matching and entropy coding, can effectively identify and compress repetitive sequences in data, and it adapts well to dental chair disinfection logs that contain large amounts of text and structured information. However, dental chair disinfection logs are highly domain-specific and evolve dynamically; for example, clinics may introduce new operators, equipment models, or disinfectant brands over time. When the traditional Zstd compression algorithm is used directly, its general-purpose or statically pre-trained dictionary cannot capture these dynamically changing, highly localized data patterns, resulting in low compression efficiency and consequently affecting the real-time processing efficiency of the disinfection logs.
Summary of the Invention
[0004] This application provides an optimized storage method for dental chair disinfection logs, aiming to solve the problem of low compression efficiency when using the Zstd algorithm to compress disinfection logs.
[0005] In a first aspect, this application provides an optimized storage method for dental chair disinfection logs. The optimized storage method includes: calculating the distribution entropy of the disinfection logs within any sliding window; computing the probability vectors of the sliding window and of the previous adjacent window, and calculating the Hellinger distance between the two probability vectors, where each probability vector contains the frequency of occurrence of each disinfection log; using the product of the Hellinger distance and the exponential function value of the difference between the distribution entropy of the sliding window and that of the adjacent window as the dictionary update necessity of the sliding window; calculating the compression necessity of the sliding window based on the string lengths of the disinfection logs; when the compression necessity is greater than a preset value and the dictionary update necessity is greater than an update threshold, updating the dictionary and then compressing and storing the disinfection logs within the sliding window; when the compression necessity is greater than the preset value and the dictionary update necessity is less than or equal to the update threshold, directly compressing and storing the disinfection logs within the sliding window; and when the compression necessity is less than or equal to the preset value, directly storing the disinfection logs. The dictionary update includes: counting the frequency of each key-value pair in the dictionary and deleting key-value pairs whose frequency is lower than a preset number.
[0006] By constructing distribution entropy, the concentration of disinfection logs within the sliding window is evaluated. The necessity of dictionary updates is constructed by multiplying the difference in distribution entropy between the sliding window and adjacent windows with the probability difference, thus assessing the value of updating the dictionary. The necessity of compression is constructed by combining the difference in average matching length between the sliding window and adjacent windows with the necessity of dictionary updates, thus assessing the intrinsic correlation of disinfection logs within the sliding window. Based on the necessity of compression and the necessity of dictionary updates, it is determined whether to update the dictionary. Using the updated dictionary to compress and store the disinfection logs improves the matching efficiency of the Zstd algorithm when compressing disinfection logs, thereby improving the compression efficiency of the disinfection logs and achieving optimized storage of the disinfection logs.
[0007] Furthermore, calculating the distribution entropy of the disinfection logs within any sliding window includes: counting the occurrence probability of the different disinfection logs within the sliding window to obtain a probability vector; calculating the information entropy of the disinfection logs within the sliding window based on the probability vector; and normalizing the information entropy to obtain a normalized information entropy. The distribution entropy of the disinfection logs within the sliding window is negatively correlated with the normalized information entropy.
[0008] By analyzing the occurrence probability of different disinfection logs within a sliding window to calculate the information entropy, and then normalizing the information entropy to construct the distribution entropy, the concentration of disinfection logs within the sliding window can be assessed.
[0009] Furthermore, the sliding windows are obtained by dividing all disinfection logs into multiple windows of a preset time length.
[0010] Furthermore, calculating the compression necessity includes: obtaining, through a string matching algorithm, the occurrence count and length of each string in the sliding window and in the adjacent window respectively, and performing a weighted summation of the string lengths to obtain the average matching length of the strings in the sliding window and in the adjacent window respectively; the compression necessity is positively correlated with the average matching length of the sliding window and negatively correlated with the average matching length of the adjacent window.
[0011] By calculating the necessity of compression, we can identify the dominant repeating structure of strings within the sliding window, which makes it easier to determine whether the dictionary needs to be updated.
[0012] Furthermore, the string matching algorithm is either a rolling hash algorithm or a suffix array algorithm.
[0013] Furthermore, the weighted summation of the string lengths includes: counting the number of occurrences of strings of different lengths, using the ratio of the number of strings of the same length to the total number of strings as the weight, and using the sum of the products of the weights and lengths of all strings of different lengths as the result of the weighted summation.
[0014] Furthermore, obtaining the update threshold includes: dividing the historical disinfection logs into multiple sliding windows, calculating the dictionary update necessity of each sliding window, using the dictionary update necessities as input to the Otsu threshold segmentation method, and taking its output as the update threshold.
[0015] By using the Otsu threshold computed over the dictionary update necessities of the historical disinfection logs as the update threshold, the system can automatically adapt to the baseline fluctuations of disinfection logs in different environments. This maintains high sensitivity to key changes while enhancing the versatility of the optimized storage method.
[0016] Furthermore, the preset quantity is 10% of the number of key-value pairs already existing in the dictionary.
[0017] Furthermore, the optimized storage method also includes: acquiring the disinfection logs of dental chairs and preprocessing them; wherein the preprocessing includes: filling the disinfection logs with nearest neighbor interpolation; uniformly parsing and converting heterogeneous JSON data from different devices into standardized logs with fixed fields; and mapping non-numerical categorical data to unique integer identifiers to obtain standardized disinfection logs.
[0018] By preprocessing the disinfection logs, standardized disinfection logs can be obtained, which facilitates efficient quantitative analysis and pattern recognition in the future.
[0019] In a second aspect, this application also provides an optimized storage system for dental chair disinfection logs, including a processor and a memory, wherein the memory stores computer program instructions that, when executed by the processor, implement the optimized storage method for dental chair disinfection logs according to the first aspect of this application.
[0020] This application has the following technical effects: By constructing distribution entropy, the concentration of disinfection logs within the sliding window is assessed. The degree of difference in the occurrence probability of disinfection logs between the sliding window and the adjacent window is evaluated. The difference in distribution entropy between the sliding window and the adjacent window is used to assess the degree of pattern change in the disinfection logs. The dictionary update necessity, constructed from the degree of difference and the degree of pattern change, assesses the value of updating the dictionary. Combining the difference in average matching length between the sliding window and the adjacent window with the dictionary update necessity yields the compression necessity, which assesses the intrinsic correlation of the disinfection logs within the sliding window. A comprehensive judgment based on the compression necessity and the dictionary update necessity determines whether to update the dictionary. Using the updated dictionary to compress and store the disinfection logs improves the matching efficiency of the Zstd algorithm when compressing disinfection logs, thereby improving the compression efficiency of the disinfection logs and achieving optimized storage of the disinfection logs.
Attached Figure Description
[0021] Figure 1 is a flowchart of an optimized storage method for dental chair disinfection logs according to an embodiment of this application.
[0022] Figure 2 is a structural block diagram of an optimized storage system for dental chair disinfection logs according to an embodiment of this application.
[0023] Figure 3 is a graph showing the distribution entropy versus dictionary update necessity according to an embodiment of this application.
[0024] Figure 4 is a comparison chart of cumulative storage space according to embodiments of this application.
Detailed Implementation
[0025] Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0026] The first aspect of this application provides an optimized storage method for dental chair disinfection logs. Figure 1 is a flowchart of an optimized storage method for dental chair disinfection logs according to an embodiment of this application. The specific implementation process of this method is described in detail below.
[0027] S101: Obtain disinfection logs.
[0028] In this embodiment, raw disinfection logs are acquired through a data acquisition system deployed in the dental treatment environment. Specifically, each dental chair is equipped with an embedded control and recording module. This module automatically collects data such as disinfection mode, start and end times, type and amount of disinfectant used, through built-in sensors. All collected information is integrated by the microcontroller within the module and transmitted in real time to the central log server in a structured data format via a local area network.
[0029] It should be noted that after receiving the raw disinfection logs, relevant preprocessing steps are required to prepare for subsequent compression and optimization analysis. These steps include: filling the disinfection logs with nearest neighbor interpolation; uniformly parsing and converting heterogeneous JSON data from different devices into standardized logs with fixed fields; and mapping non-numerical categorical data to unique integer identifiers to facilitate efficient quantitative analysis and pattern recognition, thereby obtaining standardized disinfection logs.
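The preprocessing described above can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the field names in `FIELDS` are hypothetical, and the nearest neighbor interpolation is simplified to filling a missing field from the most recent earlier record.

```python
import json

# Hypothetical fixed schema; the application does not list the actual fields.
FIELDS = ("chair_id", "operator", "disinfectant", "mode", "result")

def standardize(raw_records):
    """Parse JSON strings into fixed-field rows of integer identifiers."""
    codebooks = {field: {} for field in FIELDS}  # one value -> id map per field
    last_seen = {}                               # most recent value per field
    rows = []
    for raw in raw_records:
        record = json.loads(raw)
        row = []
        for field in FIELDS:
            value = record.get(field)
            if value is None:
                # Simplified fill: reuse the nearest earlier value (assumption).
                value = last_seen.get(field, "unknown")
            last_seen[field] = value
            book = codebooks[field]
            # Assign the next unused integer to a value seen for the first time.
            row.append(book.setdefault(value, len(book)))
        rows.append(tuple(row))
    return rows, codebooks
```

Each standardized log becomes a hashable tuple, which makes it straightforward to count identical logs within a window in the later steps.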
[0030] S102: Calculate the distribution entropy of the disinfection logs within any sliding window.
[0031] In this embodiment, the disinfection logs exhibit strong local pattern clustering, meaning that disinfection operations show a high degree of repetition and regularity within a specific time period. For example, when an operator is on duty, they may have unique operating habits, leading to situations where the same disinfection mode is used to perform consecutive similar disinfection operations on multiple dental chairs. This high degree of local repetition can cause a few similar logs to dominate the disinfection logs within a short period. Therefore, to effectively quantify this dominance phenomenon and facilitate the effectiveness of the dictionary in the Zstd compression algorithm in subsequent steps, this step constructs the distribution entropy of arbitrary disinfection logs to reflect the degree of concentration of the disinfection logs within the sliding window.
[0032] Specifically, all disinfection logs are divided into multiple sliding windows with a preset time length of N. For any sliding window, the occurrence probability of the different disinfection logs within the sliding window is calculated to obtain a probability vector. Based on the probability vector, the information entropy of the disinfection logs within the sliding window is calculated. In this embodiment, the time length N is one hour.
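As a minimal sketch of the window division, assuming each log arrives as a `(timestamp_seconds, record)` pair (a hypothetical representation), the logs can be grouped into consecutive one-hour windows like this:

```python
from collections import defaultdict

def split_into_windows(logs, window_seconds=3600):
    """Group (timestamp_seconds, record) pairs into fixed-length time windows."""
    buckets = defaultdict(list)
    for timestamp, record in logs:
        # Integer window index: 0 for the first hour, 1 for the second, ...
        buckets[int(timestamp // window_seconds)].append(record)
    # Return windows in chronological order; hours without logs are skipped.
    return [buckets[index] for index in sorted(buckets)]
```

Skipping empty hours is a design choice of this sketch; a deployment that needs strict hour-to-hour adjacency would keep empty windows instead.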
[0033] Based on the above characteristics, the distribution entropy of the disinfection logs within the sliding window is constructed, and the calculation method is as follows:

$$D = 1 - \frac{H}{\ln(n)}$$

In the formula, $D$ represents the distribution entropy of the disinfection logs within the sliding window; $H$ represents the information entropy of the disinfection logs within the sliding window; $\ln(\cdot)$ represents the logarithmic function with the natural constant as its base; $n$ represents the number of disinfection logs contained within the sliding window; and $\ln(n)$ represents the theoretical maximum value of the information entropy of the disinfection logs within the sliding window, used for normalizing the information entropy.
[0034] Specifically, when the number of disinfection logs in the sliding window is 0 or 1, the distribution entropy of the disinfection logs within the sliding window is set to 1.
[0035] Within the sliding window, if the distribution pattern of the disinfection logs is relatively concentrated, such as when one operator performs disinfection work, there are many identical disinfection logs, then the information entropy of the disinfection logs calculated within the sliding window is small, and after normalization, it approaches 0, making the calculated distribution entropy close to 1. If the distribution pattern of the disinfection logs is relatively dispersed, such as when different operators perform disinfection work during shift handover, then there are many different disinfection logs within the sliding window, and the information entropy of the corresponding disinfection logs is large, and after normalization, it approaches 1, making the calculated distribution entropy close to 0.
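The behaviour described above can be checked with a short Python sketch. It assumes the distribution entropy takes the form 1 − H / ln(n), where H is the Shannon entropy of the window and n is the number of logs in it; this form matches the limiting cases in the text (close to 1 for a concentrated window, close to 0 for a dispersed one) but is a reading of the description, not a quoted formula.

```python
import math
from collections import Counter

def distribution_entropy(window):
    """Return 1 - H/ln(n) for a list of hashable log records (assumed form)."""
    n = len(window)
    if n <= 1:
        return 1.0  # special case stated in the description for 0 or 1 logs
    counts = Counter(window)
    # Shannon entropy with the natural logarithm.
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return 1.0 - entropy / math.log(n)
```

For example, a window of four identical logs gives 1.0, while a window of four distinct logs gives a value of approximately 0.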
[0036] This step also provides a method for calculating the distribution entropy of the disinfection log within the sliding window, including: The frequency of different disinfection logs within the sliding window is counted, and their mean is calculated. The obtained mean is used as the distribution entropy of the disinfection logs within the sliding window.
[0037] If the distribution of disinfection logs within the sliding window is more concentrated, there will be more identical disinfection logs, fewer types of disinfection logs, and a larger number of disinfection logs with the same content, resulting in a larger calculated mean. If the distribution of disinfection logs within the sliding window is more dispersed, there will be multiple types of disinfection logs, and fewer disinfection logs with the same content, resulting in a smaller calculated mean.
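A minimal sketch of this alternative measure follows, assuming "frequency" means the raw occurrence count of each distinct log (the text does not say whether counts or relative frequencies are intended):

```python
from collections import Counter

def mean_frequency(window):
    """Mean occurrence count over the distinct logs in a window."""
    counts = Counter(window)
    if not counts:
        return 0.0  # assumption: an empty window has no repetition
    return sum(counts.values()) / len(counts)
```

A concentrated window such as three identical logs plus one different log yields a mean of 2.0, whereas four distinct logs yield 1.0.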
[0038] S103: Calculate the probability vectors of the sliding window and of the previous adjacent window respectively, and calculate the Hellinger distance between the two probability vectors. Each probability vector contains the occurrence frequency of each disinfection log. Use the product of the Hellinger distance and the exponential function value of the difference between the distribution entropy of the sliding window and that of the adjacent window as the dictionary update necessity of the sliding window.
[0039] In this embodiment, the log data also exhibits mode switching characteristics, and different mode switching may correspond to different update values. For example, when different operators hand over shifts, the operators' behavioral habits change, causing the distribution pattern of the log data to change. In this case, updating the dictionary has high update value, that is, it can improve the compression efficiency of subsequent log data. However, during training, multiple interns may use different functions of the equipment, resulting in a more chaotic log data distribution pattern. In this case, updating the dictionary has low update value, that is, the compression efficiency before and after the dictionary update will not change significantly.
[0040] Therefore, this step establishes the necessity of updating the dictionary in the sliding window, reflecting the value of updating the dictionary.
[0041] Specifically, the sliding window immediately preceding the sliding window under consideration is recorded as the adjacent window. The occurrence probability of the different disinfection logs within each of the two windows is calculated to obtain the two probability vectors. The Hellinger distance is then calculated from the two probability vectors. The calculation of the Hellinger distance is a well-known technique and is not elaborated here.
[0042] The dictionary update necessity of the sliding window is constructed based on the above features, and the calculation method is as follows:

$$U = \mathrm{Hel}(P, P') \times \exp(D - D')$$

In the formula, $U$ represents the dictionary update necessity of the sliding window; $\mathrm{Hel}(\cdot,\cdot)$ represents the function for calculating the Hellinger distance; $P$ represents the probability vector of the sliding window; $P'$ represents the probability vector of the adjacent window; $\exp(\cdot)$ represents the exponential function with the natural constant as its base; $D$ represents the distribution entropy of the sliding window; and $D'$ represents the distribution entropy of the adjacent window.
[0043] When the disinfection logs settle into a concentrated pattern after a mode switch, such as after a shift change between operators, the dictionary should be updated promptly to adapt to the new operator's habits. Doing so improves the compression efficiency of the algorithm, so the dictionary update is highly valuable and the calculated dictionary update necessity is large. However, when the disinfection logs remain irregular after a mode switch, such as during equipment training or function testing, updating the dictionary does not improve the compression efficiency of the algorithm, so the dictionary update has little value and the calculated dictionary update necessity is small.
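The dictionary update necessity can be sketched in Python as the Hellinger distance between the two windows' probability vectors multiplied by the exponential of the distribution entropy difference (current window minus adjacent window). The function names are illustrative, both windows are assumed to be non-empty, and the distribution entropies are passed in as precomputed numbers.

```python
import math
from collections import Counter

def probability_vectors(current, previous):
    """Align the occurrence frequencies of two non-empty windows on a shared key set."""
    keys = sorted(set(current) | set(previous))
    cur_counts, prev_counts = Counter(current), Counter(previous)
    p = [cur_counts[k] / len(current) for k in keys]
    q = [prev_counts[k] / len(previous) for k in keys]
    return p, q

def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * squared)

def update_necessity(d_current, d_previous, p, q):
    """Hellinger distance scaled by exp(D_current - D_previous)."""
    return hellinger(p, q) * math.exp(d_current - d_previous)
```

With this form, a window whose logs are completely different from the previous window's (distance 1) and more concentrated (larger distribution entropy) scores highest, which corresponds to the shift-change case described above.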
[0044] S104: Calculate the necessity of compressing the sliding window based on the string length of the disinfection log.
[0045] In this embodiment, when compressing disinfection logs using the Zstd algorithm, since the Zstd algorithm compresses based on the occurrence of repeated data within a matching window, it is necessary to further analyze the inherent correlation characteristics of the new distribution pattern within the matching window after considering different distribution patterns of the disinfection logs. For example, two different distribution patterns may both generate stable disinfection logs, but the first distribution pattern may contain more complex and repetitive long content, such as detailed device parameter reports, while the second distribution pattern may contain more simple log content. In the Zstd algorithm, the first distribution pattern will produce a higher compression ratio.
[0046] Therefore, the necessity of compressing the sliding window in this step reflects the inherent correlation of the disinfection logs within the sliding window.
[0047] Specifically, the number and length of strings appearing in the sliding window and adjacent windows are obtained by a string matching algorithm. The ratio of the number of occurrences of a string to the total number of strings is used as a weight, and the lengths of the strings are weighted and summed to obtain the average matching length of the strings in the sliding window and adjacent windows.
[0048] The compression necessity of the sliding window is constructed based on the above characteristics. In the formula, $C$ represents the compression necessity of the sliding window; $\ln(\cdot)$ represents the logarithmic function with the natural constant as its base, used to smooth the value within the parentheses and prevent extremely large values; and $\bar{L}$ and $\bar{L}'$ represent the average matching length of the strings within the sliding window and within the adjacent window, respectively.
[0049] If the dictionary update necessity of the sliding window is large, the dictionary has high update value. At the same time, if the sliding window contains many complex, recurring long strings, the string matching algorithm finds a large number of long strings in the sliding window, so the calculated average matching length of the sliding window is larger. If the adjacent window contains many simple log entries, the string matching algorithm finds a large number of short strings in the adjacent window, so the calculated average matching length of the adjacent window is smaller. The difference in average matching length between the sliding window and the adjacent window is then larger, which indicates that the disinfection logs have entered a new and stable pattern on which the Zstd algorithm compresses well. In other words, the repeated structure of the longer strings in the disinfection logs is more pronounced and the internal correlation is stronger, so the calculated compression necessity is greater.
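The average matching length and the compression necessity can be illustrated with the Python sketch below. Two parts are assumptions rather than the applicant's method: the brute-force substring counter is only a stand-in for a rolling hash or suffix array, and the combining formula ln(1 + U × L_current / L_adjacent) is one form that satisfies the stated correlations, not the formula of the application.

```python
import math
from collections import Counter

def repeated_substrings(text, min_len=3, max_len=8):
    """Brute-force stand-in for a rolling hash or suffix array (assumption)."""
    seen = Counter(
        text[i:i + k]
        for k in range(min_len, max_len + 1)
        for i in range(len(text) - k + 1)
    )
    # Keep only substrings that actually repeat.
    return {s: c for s, c in seen.items() if c >= 2}

def average_matching_length(matches):
    """Weighted sum of lengths; weight = share of occurrences with that length."""
    total = sum(matches.values())
    if total == 0:
        return 0.0
    by_length = Counter()
    for string, count in matches.items():
        by_length[len(string)] += count
    return sum(length * count / total for length, count in by_length.items())

def compression_necessity(update_necessity, avg_current, avg_adjacent, eps=1e-9):
    """Assumed form: grows with U and avg_current, shrinks with avg_adjacent."""
    return math.log(1.0 + update_necessity * avg_current / (avg_adjacent + eps))
```

The logarithm plays the smoothing role described above: a very large ratio of matching lengths raises the score only gradually.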
[0050] S105: When the necessity of compression is greater than the preset value, in response to the necessity of dictionary update being greater than the update threshold, the dictionary is updated and then the disinfection logs in the sliding window are compressed and stored. In response to the necessity of dictionary update being less than or equal to the update threshold, the disinfection logs in the sliding window are directly compressed and stored. When the necessity of compression is less than or equal to the preset value, the disinfection logs are directly stored.
[0051] In this embodiment, the compression necessity of each sliding window obtained through the above steps reflects the compression effect of the Zstd algorithm on the new mode of disinfection logs. Therefore, this application guides the updating of the dictionary in the Zstd algorithm based on the compression necessity, and then realizes the compressed storage of disinfection logs based on the updated dictionary.
[0052] Specifically, when the necessity of compression is greater than a preset value, in response to the necessity of dictionary update being greater than the update threshold, it indicates that the disinfection log has significant update and compression value, and the dictionary should be updated. The updated dictionary is then used to compress and store the disinfection log within the sliding window. In response to the necessity of dictionary update being less than or equal to the update threshold, it indicates that the disinfection log has low update value, and the dictionary does not need to be updated. The disinfection log within the sliding window is then directly compressed and stored.
[0053] When the compression necessity is less than or equal to the preset value, it indicates that the disinfection log has low compression value, and the disinfection log should be stored directly.
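The three-way decision of step S105 can be written as a small Python function. The enum and parameter names are illustrative; the two thresholds are supplied by the caller.

```python
from enum import Enum

class Action(Enum):
    UPDATE_THEN_COMPRESS = "update_then_compress"
    COMPRESS = "compress"
    STORE_RAW = "store_raw"

def decide(compression_necessity, update_necessity, preset_value, update_threshold):
    """Choose how to store one sliding window of disinfection logs."""
    if compression_necessity <= preset_value:
        return Action.STORE_RAW             # low compression value: store directly
    if update_necessity > update_threshold:
        return Action.UPDATE_THEN_COMPRESS  # refresh the dictionary first
    return Action.COMPRESS                  # compress with the current dictionary
```

Checking the compression necessity first mirrors the order in the text: the dictionary update necessity is only consulted for windows that are worth compressing at all.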
[0054] In this embodiment, the method for obtaining the update threshold is as follows: after dividing the historical disinfection log into multiple sliding windows, the dictionary update necessity of each sliding window is calculated, and the dictionary update necessity is used as the input of Otsu's threshold segmentation method to output the update threshold.
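A minimal sketch of deriving the update threshold with Otsu's criterion is shown below. Classic Otsu operates on a histogram; this variant, chosen here for brevity, applies the same between-class variance criterion directly to the sorted historical values and returns the midpoint of the best split.

```python
def otsu_threshold(values):
    """Split 1-D values into two classes by maximizing between-class variance."""
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two values")
    total = sum(xs)
    best_score, best_threshold = -1.0, xs[0]
    left_sum = 0.0
    for i in range(1, n):
        left_sum += xs[i - 1]
        if xs[i] == xs[i - 1]:
            continue  # no valid cut between equal values
        w0, w1 = i / n, (n - i) / n
        mu0, mu1 = left_sum / i, (total - left_sum) / (n - i)
        score = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance
        if score > best_score:
            best_score = score
            best_threshold = (xs[i - 1] + xs[i]) / 2
    return best_threshold
```

Fed with the dictionary update necessities of historical windows, the returned value separates the "routine" windows from the "pattern changed" windows without a hand-tuned constant.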
[0055] The process of updating the dictionary includes: counting the frequency of each key-value pair in the dictionary, deleting key-value pairs with frequencies below a preset number, and adding key-value pairs generated during the compression process of the disinfection log in the sliding window to the dictionary. In this embodiment, the preset number is 10.
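The dictionary update step can be sketched as follows, with a plain Python `dict` standing in for the compression dictionary and a separate `usage_counts` map holding how often each entry was used. Both representations are assumptions made for illustration; connecting this to an actual Zstd dictionary is outside the sketch.

```python
def update_dictionary(dictionary, usage_counts, new_entries, min_count=10):
    """Drop rarely used entries, then add entries from the current window."""
    # Keep only key-value pairs used at least `min_count` times.
    kept = {
        key: value
        for key, value in dictionary.items()
        if usage_counts.get(key, 0) >= min_count
    }
    # Add new pairs without overwriting entries that survived pruning.
    for key, value in new_entries.items():
        kept.setdefault(key, value)
    return kept
```

The function returns a new dictionary and leaves its inputs unchanged, so the previous dictionary remains available for decompressing windows that were stored with it.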
[0056] According to a second aspect of this application, an optimized storage system for dental chair disinfection logs is also provided. Figure 2 is a structural block diagram of an optimized storage system for dental chair disinfection logs according to an embodiment of this application. As shown in Figure 2, the system 50 includes a processor and a memory. The memory stores computer program instructions which, when executed by the processor, implement the optimized storage method for dental chair disinfection logs according to the first aspect of this application. The system also includes other components well known to those skilled in the art, such as a communication bus and a communication interface; their configurations and functions are known in the art and are not described further here.
Claims
1. An optimized storage method for dental chair disinfection logs, characterized in that the optimized storage method includes: calculating the distribution entropy of the disinfection logs within any sliding window; computing the probability vectors of the sliding window and of the previous adjacent window, and calculating the Hellinger distance between the probability vectors, wherein the probability vectors include the occurrence frequency of each disinfection log; using the product of the Hellinger distance and the exponential function value of the difference between the distribution entropy of the sliding window and that of the adjacent window as the dictionary update necessity of the sliding window; calculating the compression necessity of the sliding window based on the string lengths of the disinfection logs; when the compression necessity is greater than a preset value, in response to the dictionary update necessity being greater than an update threshold, updating the dictionary and then compressing and storing the disinfection logs within the sliding window, and in response to the dictionary update necessity being less than or equal to the update threshold, directly compressing and storing the disinfection logs within the sliding window; and when the compression necessity is less than or equal to the preset value, directly storing the disinfection logs; wherein the dictionary update includes: counting the frequency of each key-value pair in the dictionary and deleting key-value pairs whose frequency is lower than a preset number.
2. The optimized storage method for dental chair disinfection logs according to claim 1, characterized in that calculating the distribution entropy of the disinfection logs within any sliding window includes: counting the occurrence probability of the different disinfection logs within the sliding window to obtain a probability vector; calculating the information entropy of the disinfection logs within the sliding window based on the probability vector; and normalizing the information entropy to obtain a normalized information entropy, wherein the distribution entropy of the disinfection logs within the sliding window is negatively correlated with the normalized information entropy.
3. The optimized storage method for dental chair disinfection logs according to claim 1, characterized in that the sliding windows are obtained by dividing all disinfection logs into multiple windows of a preset time length.
4. The optimized storage method for dental chair disinfection logs according to claim 1, characterized in that calculating the compression necessity includes: obtaining, through a string matching algorithm, the occurrence count and length of each string in the sliding window and in the adjacent window respectively, and performing a weighted summation of the string lengths to obtain the average matching length of the strings in the sliding window and in the adjacent window respectively; the compression necessity is positively correlated with the average matching length of the sliding window and negatively correlated with the average matching length of the adjacent window.
5. The optimized storage method for dental chair disinfection logs according to claim 4, characterized in that the string matching algorithm is either a rolling hash algorithm or a suffix array algorithm.
6. The optimized storage method for dental chair disinfection logs according to claim 4, characterized in that the weighted summation of the string lengths includes: counting the number of occurrences of strings of different lengths, using the ratio of the number of strings of the same length to the total number of strings as the weight, and using the sum of the products of the weights and lengths of all strings of different lengths as the weighted summation result.
7. The optimized storage method for dental chair disinfection logs according to claim 1, characterized in that obtaining the update threshold includes: dividing the historical disinfection logs into multiple sliding windows, calculating the dictionary update necessity of each sliding window, using the dictionary update necessities as input to the Otsu threshold segmentation method, and taking its output as the update threshold.
8. The optimized storage method for dental chair disinfection logs according to claim 1, characterized in that the preset number is 10% of the number of key-value pairs already existing in the dictionary.
9. The optimized storage method for dental chair disinfection logs according to claim 1, characterized in that the compressed storage of the disinfection logs within the sliding window includes: acquiring the disinfection logs of the dental chair and preprocessing them; wherein the preprocessing includes: filling the disinfection logs with nearest neighbor interpolation; uniformly parsing and converting heterogeneous JSON data from different devices into standardized logs with fixed fields; and mapping non-numerical categorical data to unique integer identifiers to obtain standardized disinfection logs.
10. An optimized storage system for dental chair disinfection logs, characterized in that the system includes a processor, a memory, and a communication interface, wherein the memory stores a computer program that, when executed by the processor, implements the optimized storage method for dental chair disinfection logs according to any one of claims 1 to 9.