A multi-level data storage method and system

By adopting a two-factor data hierarchy model based on time and criticality in the storage of data for new energy vehicle driving, combined with compression algorithms and partitioning strategies, the problems of wasted storage resources and low data access efficiency in existing storage solutions are solved, achieving efficient data storage and fast access.

CN122018820B · Active · Publication Date: 2026-06-19 · CHONGQING JINKANG NEW ENERGY VEHICLE CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: CHONGQING JINKANG NEW ENERGY VEHICLE CO LTD
Filing Date: 2026-04-13
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing data storage solutions for new energy vehicles tier data by time alone or by criticality alone. Time-only tiering migrates highly critical historical data to low-speed storage media, which slows fault tracing; criticality-only tiering lets low-criticality real-time data occupy high-speed storage resources. These solutions are also poorly suited to a workload that is write-intensive and read through level-based filtering, resulting in high data write latency, slow query response, and serious waste of storage space.

Method used

A time- and criticality-based two-factor data grading model is adopted. The storage level of vehicle driving data is determined by a time grading sub-model, a criticality grading sub-model, and a two-factor mapping sub-model. Each storage level is paired with its own compression algorithm and partitioning strategy, and the data is stored on the corresponding storage medium, enabling dynamic migration and storage optimization.

Benefits of technology

It improves data storage efficiency, reduces storage resource waste, increases storage space utilization by 40%-60%, reduces hardware costs by 25%-30%, reduces query latency and IO overhead, and achieves efficient data access and fault tracing.

✦ Generated by Eureka AI based on patent content.


Abstract

This application provides a multi-level data storage method and system. The method includes: reading vehicle driving data from onboard sensors and a telematics processor; determining the data storage level corresponding to the vehicle driving data using a time- and criticality-based two-factor data grading model; and storing the vehicle driving data according to the hierarchical storage configuration rules corresponding to the data storage level. This application uses a two-factor data grading model to determine the storage level of vehicle driving data, improving data storage efficiency and avoiding waste of storage resources.

Description

Technical Field

[0001] This application relates to the field of data processing technology, and in particular to a method and system for multi-level data storage.

Background Technology

[0002] Currently, the amount of driving data generated by new energy vehicles (including battery SOC, motor speed, vehicle speed, braking frequency, ambient temperature, etc.) is enormous, with a single vehicle generating 10GB to 50GB of data per day. Existing data storage solutions for new energy vehicle driving data mostly adopt single-factor hierarchical storage, which falls into two categories: one is time-based hierarchical storage, which divides data into real-time data, recent data, and historical data according to the time of data generation, and stores them in different media such as memory, SSD, and HDD respectively; the other is criticality-based hierarchical storage, which divides data into different criticality levels according to the importance of the data to vehicle safety and fault diagnosis and stores them in storage media with corresponding performance.

[0003] The aforementioned hierarchical storage methods relying solely on time or criticality have significant drawbacks. Firstly, when data is tiered by time, highly critical historical data is easily migrated to low-speed storage media, reducing the efficiency of fault tracing and data analysis. Secondly, when data is tiered by criticality, low-criticality real-time data occupies high-speed storage resources, resulting in wasted storage space. Thirdly, existing storage solutions mostly employ relational databases or conventional distributed file systems, which are ill-suited to the write-intensive and tiered reading characteristics of new energy vehicle driving data. These solutions generally suffer from high data write latency and slow query response times, failing to meet the demands for efficient storage and rapid access to large-scale vehicle data.

Summary of the Invention

[0004] In view of this, the purpose of this application is to provide at least one method and apparatus for multi-level data storage, which uses a two-factor data hierarchical model to determine the storage level of vehicle driving data, thereby improving data storage efficiency and avoiding waste of storage resources.

[0005] This application mainly includes the following aspects:

[0006] In a first aspect, embodiments of this application provide a multi-level data storage method, the method comprising: reading vehicle driving data from vehicle-mounted sensors and a telematics processor; determining the data storage level corresponding to the vehicle driving data using a two-factor data grading model based on time and criticality; and storing the vehicle driving data according to a grading storage configuration rule corresponding to the data storage level of the vehicle driving data.

[0007] In one possible implementation, the two-factor data grading model includes a time-based grading sub-model, a criticality grading sub-model, and a two-factor mapping sub-model. The step of determining the data storage level corresponding to vehicle driving data using the time- and criticality-based two-factor data grading model includes: inputting the data collection time corresponding to the vehicle driving data into the time-based grading sub-model to determine the query level corresponding to the vehicle driving data; inputting the data types related to the vehicle driving data into the criticality grading sub-model to determine the criticality level corresponding to the vehicle driving data; and inputting the query level and criticality level corresponding to the vehicle driving data into the two-factor mapping sub-model to determine the data storage level corresponding to the vehicle driving data. The two-factor mapping sub-model describes the mapping relationship between the query level, criticality level, and data storage level corresponding to the vehicle driving data.

[0008] In one possible implementation, the query level includes real-time, recent, and historical levels. The time-based grading sub-model determines the query level corresponding to vehicle driving data as follows: calculate the time difference between the data collection time and the current time; if the time difference is less than or equal to a first time period, the query level of the vehicle driving data is determined to be real-time; if the time difference is greater than the first time period and less than or equal to a second time period, the query level of the vehicle driving data is determined to be recent; if the time difference is greater than the second time period, the query level of the vehicle driving data is determined to be historical, wherein the second time period is greater than the first time period.

[0009] In one possible implementation, vehicle driving data includes multiple driving parameter sampling data at the same sampling time. The driving parameter sampling data includes the data type corresponding to the driving parameter and the sampling data value. The criticality grading sub-model determines the criticality level corresponding to the vehicle driving data in the following ways: Based on the data type corresponding to the driving parameter sampling data, it determines the basic criticality score and criticality indicator weight corresponding to the driving parameter sampling data; based on the dynamic threshold library corresponding to the vehicle and the sampling data value corresponding to the driving parameter, it determines the state coefficient corresponding to the driving parameter sampling data; based on the basic criticality score, criticality indicator weight, and state coefficient corresponding to each driving parameter sampling data, it determines the criticality judgment score corresponding to the vehicle driving data; it searches a preset criticality level judgment table, and based on the criticality judgment score corresponding to the vehicle driving data, determines the criticality level corresponding to the vehicle driving data. The preset criticality level judgment table describes the mapping relationship between the criticality level corresponding to the vehicle driving data and the criticality judgment score.

[0010] In one possible implementation, the step of determining the criticality score corresponding to vehicle driving data based on the criticality base score, criticality index weight, and state coefficient corresponding to each driving parameter sampling data includes: calculating a first product between the criticality base score and the state coefficient corresponding to the driving parameter sampling data, and determining the first product as the first criticality index score corresponding to the driving parameter sampling data; calculating a second product between the first criticality index score and the criticality index weight corresponding to each driving parameter sampling data, and determining the second product as the second criticality index score corresponding to the driving parameter sampling data; and determining the sum of the second criticality index scores corresponding to all driving parameter sampling data as the criticality score of the vehicle driving data.

[0011] In one possible implementation, the hierarchical storage configuration rules record the mapping relationship between data storage level, data storage table, compression algorithm, partitioning strategy, storage location, and data retention period. The step of storing vehicle driving data according to the hierarchical storage configuration rules corresponding to the data storage level of the vehicle driving data includes: writing the vehicle driving data into the corresponding data storage table according to the compression algorithm and partitioning strategy corresponding to the data storage level of the vehicle driving data; and refreshing the storage medium where the data storage table is located.

[0012] In one possible implementation, the data storage levels include a first storage level, a second storage level, and a third storage level from high to low. Vehicle driving data is written to the corresponding data storage table in the following manner: if the vehicle driving data is at the first storage level, the ZSTD compression algorithm and a minute-based partitioning strategy are used to write the vehicle driving data to the corresponding data storage table; if the vehicle driving data is at the second storage level, the LZ4 compression algorithm and an hour-based partitioning strategy are used to write the vehicle driving data to the corresponding data storage table; if the vehicle driving data is at the third storage level, the SNAPPY compression algorithm and a day-based partitioning strategy are used to write the vehicle driving data to the corresponding data storage table.
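The pairing of storage level, compression algorithm, and partitioning strategy described in [0012] can be sketched as a small lookup table. The Python below is an illustrative sketch only: the names `LEVEL_CONFIG` and `partition_key` and the partition-key string format are assumptions made for this example, and only the level-to-algorithm and level-to-granularity pairings come from the text.

```python
from datetime import datetime

# Storage level -> compression algorithm and partition granularity, as listed in [0012].
LEVEL_CONFIG = {
    "L1": {"compression": "ZSTD", "partition": "minute"},
    "L2": {"compression": "LZ4", "partition": "hour"},
    "L3": {"compression": "SNAPPY", "partition": "day"},
}

# strftime patterns used to build a partition key (assumed format, not from the source).
_PARTITION_FORMATS = {
    "minute": "%Y%m%d%H%M",
    "hour": "%Y%m%d%H",
    "day": "%Y%m%d",
}


def partition_key(level: str, collected_at: datetime) -> str:
    """Return the partition key for a record of the given storage level."""
    granularity = LEVEL_CONFIG[level]["partition"]
    return collected_at.strftime(_PARTITION_FORMATS[granularity])


ts = datetime(2026, 4, 13, 9, 30, 15)
print(partition_key("L1", ts))  # 202604130930
print(partition_key("L2", ts))  # 2026041309
print(partition_key("L3", ts))  # 20260413
```

Keeping the configuration in one table means a change of algorithm or granularity for a level touches a single entry rather than the write path.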

[0013] In one possible implementation, the method further includes: determining the vehicle driving data to be migrated in the data storage table that meets the migration trigger conditions based on the data retention period corresponding to the data storage table and the data storage level of the vehicle driving data in the data storage table; migrating the vehicle driving data to be migrated from its original data storage table to the target data storage table; after the migration is completed, deleting the vehicle driving data to be migrated from its corresponding original data storage table, updating the storage medium where the original data storage table is located and the storage medium where the target data storage table is located; and generating a data migration log corresponding to the migration.
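The migration sequence in [0013] (select, copy to target, delete from origin, log) can be sketched as follows. This is a minimal sketch under stated assumptions: tables are modelled as in-memory lists, the trigger condition is taken to be "record age exceeds the retention period of its current table", and the retention values, the `NEXT_LEVEL` targets, and all names (`Row`, `migrate`) are hypothetical rather than taken from the application.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class Row:
    data_id: str
    collected_at: datetime


# Hypothetical retention periods and migration targets per storage level.
RETENTION = {"L1": timedelta(hours=1), "L2": timedelta(days=7)}
NEXT_LEVEL = {"L1": "L2", "L2": "L3"}


def migrate(tables: Dict[str, List[Row]], now: datetime) -> List[dict]:
    """Move expired rows one level down and return the migration log."""
    # Select all candidates first, so a row moves at most one level per run.
    plan = [
        (row, level, NEXT_LEVEL[level])
        for level in NEXT_LEVEL
        for row in tables[level]
        if now - row.collected_at > RETENTION[level]
    ]
    log = []
    for row, source, target in plan:
        tables[target].append(row)   # write to the target table first
        tables[source].remove(row)   # then delete from the original table
        log.append({"data_id": row.data_id, "from": source,
                    "to": target, "at": now.isoformat()})
    return log


now = datetime(2026, 4, 13, 12, 0, 0)
tables = {
    "L1": [Row("a", now - timedelta(minutes=10)), Row("b", now - timedelta(hours=2))],
    "L2": [Row("c", now - timedelta(days=8))],
    "L3": [],
}
for entry in migrate(tables, now):
    print(entry["data_id"], entry["from"], "->", entry["to"])
```

Writing to the target before deleting from the origin mirrors the order given in the text and avoids losing a record if the run is interrupted between the two steps.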

[0014] In one possible implementation, before determining the data storage level corresponding to the vehicle driving data using a two-factor data grading model based on time and criticality, the method further includes: converting the driving parameter sampling data into a preset acquisition format; and performing data verification on the driving parameter sampling data after format conversion to obtain multiple driving parameter sampling data that have passed the data verification.

[0015] In a second aspect, embodiments of this application also provide a multi-level data storage system, comprising: a data acquisition module for reading vehicle driving data from onboard sensors and a telematics processor; a two-factor grading module for determining the data storage level corresponding to the vehicle driving data using a time- and criticality-based two-factor data grading model; and a storage module for storing the vehicle driving data according to a grading storage configuration rule corresponding to the data storage level of the vehicle driving data.

[0016] This application provides a multi-level data storage method and system. The method includes: reading vehicle driving data from onboard sensors and a telematics processor; determining the data storage level corresponding to the vehicle driving data using a time- and criticality-based two-factor data grading model; and storing the vehicle driving data according to the grading storage configuration rules corresponding to the data storage level. This application uses a two-factor data grading model to determine the storage level of vehicle driving data, improving data storage efficiency and avoiding waste of storage resources.

[0017] To make the above-mentioned objectives, features and advantages of this application more apparent and understandable, preferred embodiments are described below in detail with reference to the accompanying drawings.

Attached Figure Description

[0018] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0019] Figure 1 shows a flowchart of a multi-level data storage method provided in an embodiment of this application;

[0020] Figure 2 shows a flowchart for determining the data storage level corresponding to vehicle driving data according to an embodiment of this application;

[0021] Figure 3 shows a functional block diagram of a multi-level data storage system provided in an embodiment of this application.

Detailed Implementation

[0022] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. It should be understood that the drawings in this application are for illustrative and descriptive purposes only and are not intended to limit the scope of protection of this application. Furthermore, it should be understood that the schematic drawings are not drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of this application. It should be understood that the operations in the flowcharts may not be implemented in sequence, and steps without logical contextual relationships may be reversed or implemented simultaneously. In addition, those skilled in the art, guided by the content of this application, may add one or more other operations to the flowcharts, or remove one or more operations from the flowcharts.

[0023] Furthermore, the described embodiments are merely some, not all, of the embodiments of this application. The components of the embodiments of this application described and illustrated herein can typically be arranged and designed in various different configurations. Therefore, the following detailed description of the embodiments of this application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely to illustrate selected embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without inventive effort are within the scope of protection of this application.

[0024] The current amount of driving data generated by new energy vehicles (such as battery state of charge, motor speed, vehicle speed, braking frequency, ambient temperature, etc.) is enormous, with the average daily data volume per vehicle reaching 10GB-50GB. Existing storage solutions for new energy vehicle driving data mainly employ single-factor hierarchical classification, specifically including the following two categories:

[0025] 1) Time-based hierarchical storage: Data is divided by generation time into "real-time data" (for example, within the last hour), "recent data" (for example, between 1 hour and 7 days old), and "historical data" (for example, older than 7 days), and stored in memory, SSD (Solid State Drive), and HDD (Hard Disk Drive) respectively.

[0026] 2) Criticality-based hierarchical storage: Data is stored on storage media with different performance levels according to its importance to vehicle safety / fault diagnosis (e.g., battery overvoltage data is "high criticality", ambient temperature data is "low criticality").

[0027] The hierarchical storage of data based on a single factor (time or criticality) has at least the following drawbacks:

[0028] 1. When storing data in tiers based solely on time, high-criticality historical data (such as battery failure precursor data from one month ago) may be migrated to low-speed storage areas, affecting the efficiency of fault tracing. When storing data in tiers based solely on criticality, low-criticality real-time data (such as real-time ambient humidity) may occupy high-speed storage resources, resulting in wasted space.

[0029] 2. Existing storage strategies mostly use relational databases (such as MySQL databases) or ordinary distributed file systems, which cannot adapt to the characteristics of vehicle driving data that are "write-intensive and read-by-level filtering", resulting in high data write latency (>500ms) and slow query response (complex queries >10s).

[0030] 3. In the existing solution, there is no dynamic linkage mechanism when data is migrated between data storage factors (e.g., time and criticality). For example, "data with high criticality but older than 30 days" needs to be manually judged to determine whether to migrate, which is prone to data loss or excessive storage occupation.

[0031] 4. Existing storage solutions do not optimize storage structures for different levels of data. For example, low-criticality data is still configured with a high compression ratio, resulting in wasted storage space (waste rate > 25%).

[0032] Based on this, embodiments of this application provide a multi-level data storage method and system, which uses a two-factor data hierarchical model to determine the storage level of vehicle driving data, thereby improving data storage efficiency and avoiding waste of storage resources, as detailed below:

[0033] Referring to Figure 1, which shows a flowchart of a multi-level data storage method provided in an embodiment of this application, the method includes the following steps:

[0034] S100. Read vehicle driving data from onboard sensors and a telematics processor.

[0035] S200. Using a two-factor data grading model based on time and criticality, determine the data storage level corresponding to vehicle driving data.

[0036] S300. Store vehicle driving data according to the hierarchical storage configuration rules corresponding to the data storage level of vehicle driving data.

[0037] In specific implementation, in step S100, the vehicle-mounted sensors include, but are not limited to, at least one of the following: battery sensor and motor controller. The vehicle driving data involves multiple driving parameters, which are divided into two categories: core data and auxiliary data. The core data is further divided into battery data, motor data, and brake signal. The battery data includes, but is not limited to, at least one of the following: battery SOC (State of Charge), battery voltage, and battery temperature. The motor data includes, but is not limited to, at least one of the following: motor speed and motor torque. The auxiliary data includes, but is not limited to, at least one of the following: ambient temperature, ambient humidity, GPS positioning, and air conditioning status.

[0038] Furthermore, the vehicle-mounted sensors and the telematics processor (T-BOX) sample data at the preset sampling frequency corresponding to each driving parameter to obtain the driving parameter sampling data for that parameter. For example, the brake signal is sampled at a frequency of 1 time / second, and the air conditioning status is sampled at a frequency of 1 time / type.

[0039] In a preferred embodiment, step S100 further includes:

[0040] The driving parameter sampling data is converted into a preset acquisition format (such as JSON format), and the driving parameter sampling data after format conversion is validated to obtain multiple driving parameter sampling data that have passed the data validation.

[0041] Preferably, the preset data collection format includes at least the following fields: data identity identifier, vehicle identification code, data collection timestamp, data type (including battery data, motor data, brake signal and auxiliary data), sampled data value and data source.

[0042] The data verification process includes, but is not limited to, at least one of the following: integrity verification (e.g., discarding sampled data with missing fields) and outlier filtering (e.g., treating a battery voltage above 500 V as an anomaly and discarding that battery voltage sample).
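The format conversion and verification of [0040] to [0042] can be sketched as below. This is an illustrative sketch, not the application's implementation: the JSON field names (`data_id`, `vin`, `timestamp`, `data_type`, `value`, `source`) are assumed spellings of the fields listed in [0041], the use of `"battery_voltage"` as a data type value is an assumption, and only the 500 V battery-voltage limit comes from the text.

```python
import json
from typing import Dict, List

# Assumed names for the six fields listed in [0041].
REQUIRED_FIELDS = ("data_id", "vin", "timestamp", "data_type", "value", "source")

# Upper limits per data type; only the 500 V battery-voltage limit is from the text.
MAX_VALUE: Dict[str, float] = {"battery_voltage": 500.0}


def validate(raw_samples: List[str]) -> List[dict]:
    """Parse JSON samples and keep only those that pass both checks."""
    valid = []
    for raw in raw_samples:
        sample = json.loads(raw)
        # Integrity verification: discard samples with missing fields.
        if any(field not in sample for field in REQUIRED_FIELDS):
            continue
        # Outlier filtering: discard values above the configured limit.
        limit = MAX_VALUE.get(sample["data_type"])
        if limit is not None and sample["value"] > limit:
            continue
        valid.append(sample)
    return valid


raw = [
    '{"data_id": "1", "vin": "VIN0", "timestamp": 1776072600, '
    '"data_type": "battery_voltage", "value": 396.5, "source": "bms"}',
    '{"data_id": "2", "vin": "VIN0", "data_type": "battery_voltage", "value": 400.0}',
    '{"data_id": "3", "vin": "VIN0", "timestamp": 1776072601, '
    '"data_type": "battery_voltage", "value": 620.0, "source": "bms"}',
]
print([s["data_id"] for s in validate(raw)])  # ['1']
```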

[0043] In a preferred embodiment, in step S200, the two-factor data grading model includes a time-based grading sub-model, a criticality grading sub-model, and a two-factor mapping sub-model. Referring to Figure 2, which shows a flowchart of a method for determining the data storage level corresponding to vehicle driving data according to an embodiment of this application, step S200 further includes:

[0044] S2001. Input the data collection time corresponding to the vehicle driving data into the time-based grading sub-model to determine the query level corresponding to the vehicle driving data.

[0045] S2002. Input the data types related to vehicle driving data into the criticality grading sub-model to determine the criticality level corresponding to the vehicle driving data.

[0046] S2003. Input the query level and criticality level corresponding to the vehicle driving data into the two-factor mapping sub-model to determine the data storage level corresponding to the vehicle driving data.

[0047] Preferably, the two-factor mapping sub-model describes the mapping relationship between the query level, criticality level, and data storage level of vehicle driving data.

[0048] In one example, in step S2001, the query levels include real-time level T1, recent level T2, and historical level T3. Real-time level T1 needs to support millisecond-level queries, recent level T2 needs to support second-level queries, and historical level T3 needs to support minute-level queries. This application does not impose specific restrictions on the way query levels are divided, and can be set according to the actual needs of the user. The above embodiment is only one specific implementation method provided by this application.

[0049] Specifically, the time-based grading sub-model determines the query level corresponding to vehicle driving data as follows:

[0050] Calculate the time difference between the data collection time corresponding to the vehicle driving data and the current time. If the time difference is less than or equal to the first time period, the query level of the vehicle driving data is determined to be real-time. If the time difference is greater than the first time period and less than or equal to the second time period, the query level of the vehicle driving data is determined to be recent. If the time difference is greater than the second time period, the query level of the vehicle driving data is determined to be historical. The second time period is greater than the first time period.

[0051] In one specific embodiment, for example, the first time period can be 1 hour, and the second time period can be 7 days. If the time difference between the vehicle driving data and the current time is ≤1 hour, the query level of the vehicle driving data is real-time level T1. If 1 hour < time difference ≤7 days, the query level of the vehicle driving data is recent level T2. If the time difference is greater than 7 days, the query level of the vehicle driving data is historical level T3.
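The threshold comparison in [0050] and [0051] amounts to a two-boundary classification. The sketch below uses the example periods from the text (1 hour and 7 days); the function and constant names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

FIRST_PERIOD = timedelta(hours=1)   # example value from [0051]
SECOND_PERIOD = timedelta(days=7)   # example value from [0051]


def query_level(collected_at: datetime, now: datetime) -> str:
    """Classify a record as T1 (real-time), T2 (recent) or T3 (historical)."""
    age = now - collected_at
    if age <= FIRST_PERIOD:
        return "T1"
    if age <= SECOND_PERIOD:
        return "T2"
    return "T3"


now = datetime(2026, 4, 13, 12, 0, 0)
print(query_level(now - timedelta(minutes=5), now))  # T1
print(query_level(now - timedelta(days=2), now))     # T2
print(query_level(now - timedelta(days=30), now))    # T3
```

Both boundaries are inclusive on the lower level, matching the "less than or equal to" wording: a record exactly 1 hour old is still T1, and one exactly 7 days old is still T2.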

[0052] Step S2002 further includes:

[0053] S2002A. Based on the data type corresponding to the driving parameter sampling data, determine the criticality base score and criticality indicator weight corresponding to the driving parameter sampling data.

[0054] S2002B. Based on the dynamic threshold library corresponding to the vehicle and the sampled data values corresponding to the driving parameters, determine the state coefficients corresponding to the driving parameter sampling data.

[0055] S2002C. Determine the criticality judgment score corresponding to the vehicle driving data based on the criticality base score, criticality index weight, and state coefficient corresponding to the driving parameter sampling data.

[0056] S2002D. Look up the preset criticality level judgment table and determine the criticality level of the vehicle driving data based on the criticality judgment score corresponding to the vehicle driving data.

[0057] Preferably, the preset criticality level judgment table describes the mapping relationship between the criticality level corresponding to the vehicle driving data and the criticality judgment score.

[0058] In step S2002A, a score weight mapping table is first obtained, which describes the mapping relationship between the data type corresponding to the driving parameter sampling data, the criticality base score, and the criticality indicator weight. Table 1 is a score weight mapping table.

[0059] Table 1

[0060]

[0061] As shown in Table 1, taking battery voltage as an example of driving parameter sampling data, it belongs to battery data. Therefore, the criticality base score corresponding to battery voltage is 80 and the criticality index weight is 0.4. Specifically, the criticality base score and criticality index weight corresponding to each data type are set according to actual needs, which will not be elaborated on here.

[0062] In step S2002B, a dynamic threshold library corresponding to different vehicle models is pre-established. The dynamic threshold library records the parameter thresholds corresponding to each driving parameter. The dynamic threshold library corresponding to the current vehicle model is retrieved, and the parameter thresholds corresponding to the driving parameter sampling data are retrieved from the corresponding dynamic threshold library. Using the parameter thresholds corresponding to the driving parameters and the sampling data values, the state coefficients corresponding to the driving parameter sampling data are determined by applying the state coefficient determination rules corresponding to the driving parameters.

[0063] Preferably, the rule for determining the state coefficient can be as follows: calculate the relative difference between the sampled data value and the parameter threshold, read the state mapping table corresponding to the driving parameter, and determine the current parameter state of the driving parameter sampled data based on the state mapping table and the relative difference. The state mapping table records the mapping relationship between the relative difference between the sampled data value of the driving parameter and the parameter threshold and the parameter state. The parameter state includes, but is not limited to, at least one of the following: abnormal, critical, and normal. Based on the current parameter state of the driving parameter sampled data and the mapping relationship between the parameter state and the state coefficient, determine the state coefficient corresponding to the driving parameter sampled data.

[0064] For example, when the parameter state is abnormal, its corresponding state coefficient is 1.2; when the parameter state is critical, its corresponding state coefficient is 0.8; and when the parameter state is normal, its corresponding state coefficient is 0.5.
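The state coefficient rule of [0063] and [0064] can be sketched as a two-stage lookup: relative difference to parameter state, then parameter state to coefficient. In the sketch below the three coefficients (1.2, 0.8, 0.5) are the example values from the text, while the state boundaries (at or above the threshold is abnormal, within 10% below it is critical) and the names `state_coefficient` and `critical_band` are hypothetical, since the state mapping table itself is not reproduced in the source.

```python
# Parameter state -> state coefficient, example values from [0064].
STATE_COEFFICIENT = {"abnormal": 1.2, "critical": 0.8, "normal": 0.5}


def state_coefficient(value: float, threshold: float,
                      critical_band: float = 0.10) -> float:
    """Map a sampled value to its state coefficient.

    The band boundaries are assumptions standing in for the
    per-parameter state mapping table.
    """
    relative_diff = (value - threshold) / threshold
    if relative_diff >= 0:
        state = "abnormal"   # at or beyond the parameter threshold
    elif relative_diff >= -critical_band:
        state = "critical"   # close to the threshold
    else:
        state = "normal"
    return STATE_COEFFICIENT[state]


# Hypothetical battery temperature threshold of 60 degrees Celsius.
print(state_coefficient(65.0, 60.0))  # 1.2
print(state_coefficient(57.0, 60.0))  # 0.8
print(state_coefficient(40.0, 60.0))  # 0.5
```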

[0065] In a preferred embodiment, step S2002C further includes:

[0066] Calculate the first product between the criticality base score and the state coefficient corresponding to each driving parameter sampling data, and determine the first product as the first criticality index score corresponding to the driving parameter sampling data. Calculate the second product between the first criticality index score and the criticality index weight corresponding to the driving parameter sampling data, and determine the second product as the second criticality index score corresponding to the driving parameter sampling data. The sum of the second criticality index scores corresponding to all driving parameter sampling data is determined as the criticality judgment score of the vehicle driving data.

[0067] In one specific embodiment, the first criticality index score corresponding to the driving parameter sampling data is determined by the following formula:

[0068] $$S^{(1)}_i = B_i \times C_i$$

[0069] In this formula, $S^{(1)}_i$ represents the first criticality index score corresponding to the i-th driving parameter sampling data in the vehicle driving data, $B_i$ represents the criticality base score corresponding to the i-th driving parameter sampling data, and $C_i$ represents the state coefficient corresponding to the i-th driving parameter sampling data. For example, assuming that the i-th driving parameter sampling data belongs to battery data and its state coefficient is determined to be 0.5, then according to Table 1, its criticality base score is determined to be 80. Substituting into the above formula, the first criticality index score corresponding to this driving parameter sampling data is: 80 × 0.5 = 40.

[0070] In another specific embodiment, the criticality judgment score corresponding to the vehicle driving data is determined by the following formula:

[0071] $$S = \sum_{i=1}^{n} S^{(2)}_i, \qquad S^{(2)}_i = w_i \times S^{(1)}_i$$

[0072] In this formula, $S$ represents the criticality judgment score corresponding to the vehicle driving data, $n$ represents the number of driving parameter sampling data in the vehicle driving data, $w_i$ represents the criticality index weight corresponding to the i-th driving parameter sampling data, $S^{(2)}_i$ represents the second criticality index score corresponding to the i-th driving parameter sampling data, and $S^{(1)}_i$ represents its first criticality index score, that is, the product of its criticality base score and its state coefficient.
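The scoring steps of S2002C can be worked through numerically. In this sketch only the battery entry (base score 80, weight 0.4) comes from the Table 1 example in the text; the other rows of `SCORE_WEIGHT`, and the function name `criticality_score`, are hypothetical placeholders for values the source does not reproduce.

```python
from typing import Iterable, Tuple

# data type -> (criticality base score, criticality index weight)
SCORE_WEIGHT = {
    "battery": (80.0, 0.4),    # from the Table 1 example in the text
    "motor": (70.0, 0.3),      # hypothetical
    "brake": (60.0, 0.2),      # hypothetical
    "auxiliary": (30.0, 0.1),  # hypothetical
}


def criticality_score(samples: Iterable[Tuple[str, float]]) -> float:
    """Sum the weighted scores of (data_type, state_coefficient) pairs."""
    total = 0.0
    for data_type, coefficient in samples:
        base, weight = SCORE_WEIGHT[data_type]
        first = base * coefficient   # first criticality index score
        second = first * weight      # second criticality index score
        total += second
    return total


samples = [("battery", 0.5), ("motor", 0.8), ("auxiliary", 0.5)]
print(round(criticality_score(samples), 2))  # 34.3
```

With these inputs the three contributions are 80 × 0.5 × 0.4 = 16, 70 × 0.8 × 0.3 = 16.8, and 30 × 0.5 × 0.1 = 1.5.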

[0073] In step S2002D, Table 2 is a criticality level determination table.

[0074] Table 2

[0075]

[0076] As shown in Table 2, three criticality levels are set: K1, K2, and K3. Each level corresponds to a range of the criticality judgment score given in Table 2. When the criticality judgment score corresponding to the vehicle driving data falls within the range for K1, the criticality level of the vehicle driving data is determined to be K1; when it falls within the range for K2, the criticality level is K2; and when it falls within the range for K3, the criticality level is K3. As the criticality judgment score decreases, the criticality of the vehicle driving data decreases.

[0077] In this application, the number of criticality levels may be increased or decreased according to actual needs, which will not be elaborated here.

[0078] Returning to Figure 2, in step S2003, Table 3 shows a two-factor mapping model.

[0079] Table 3

[0080]

[0081] As shown in a specific example in Table 3, the data storage levels of this application include the first storage level L1 (highest level), the second storage level L2 (second highest level), and the third storage level L3 (lowest level). In the two-factor mapping model shown in Table 3, when the query level corresponding to the vehicle driving data is T1 and the criticality level is K1, the corresponding data storage level is the first storage level L1. When the query level is T2 and the criticality level is K2, the corresponding data storage level is the second storage level L2. When the query level is T3 and the criticality level is K3, the corresponding data storage level is the third storage level L3. Other cases are shown in Table 3 and will not be elaborated further here.

[0082] In this application, the mapping relationship between the query level, the criticality level, and the data storage level of the vehicle driving data is set according to actual needs, and will not be elaborated on here.
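A two-factor mapping of this kind can be sketched as a plain lookup table. The following Python sketch is illustrative only: the function and variable names are hypothetical, only the (T1, K1) and (T2, K2) entries are taken from the description, and every other entry is an assumed placeholder that would be replaced by the actual contents of Table 3.

```python
# Lookup from (query level, criticality level) to data storage level.
STORAGE_LEVEL_MAP = {
    ("T1", "K1"): "L1",  # stated in the description
    ("T2", "K2"): "L2",  # stated in the description
    # All entries below are assumptions for illustration, not Table 3 itself.
    ("T1", "K2"): "L1",
    ("T1", "K3"): "L2",  # low-criticality real-time data kept off the top tier
    ("T2", "K1"): "L1",
    ("T2", "K3"): "L3",
    ("T3", "K1"): "L2",  # highly critical historical data kept off the slowest tier
    ("T3", "K2"): "L3",
    ("T3", "K3"): "L3",
}


def storage_level(query_level: str, criticality_level: str) -> str:
    """Return the data storage level for a (query level, criticality level) pair."""
    try:
        return STORAGE_LEVEL_MAP[(query_level, criticality_level)]
    except KeyError:
        raise ValueError(
            f"no mapping for query level {query_level!r} "
            f"and criticality level {criticality_level!r}"
        ) from None


print(storage_level("T1", "K1"))  # L1
print(storage_level("T2", "K2"))  # L2
```

Holding the mapping in a table rather than in branching logic matches the statement that the mapping is set according to actual needs, since changing a rule only requires editing one entry.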

[0083] Returning to Figure 1, in step S300, according to the designed data storage level, the table structure, compression algorithm, partitioning strategy, table storage location, and data retention period of the data storage table corresponding to the data storage level are configured. Specifically, the table structure design of the data storage table includes the table name structure design and the core field design within the table. For example, the data storage table can be a StarRocks storage table.

[0084] Preferably, the table name structure of the data storage table can be a unified prefix (e.g., "ev_driving_data_") followed by the data storage level (L1 / L2 / L3).

[0085] The core fields in the data storage table include: data_id (primary key), vin (vehicle identification number, identifying the input source), collect_time (data collection time), data_type (data type), data_value (data value), key_score (criticality judgment score), and time_level (query level).

[0086] In one specific embodiment, Table 4 shows a hierarchical storage configuration rule.

[0087] Table 4

[0088]

[0089] As shown in Table 4, the hierarchical storage configuration rules record the mapping relationship between data storage level, data storage table, compression algorithm, partitioning strategy, storage location and data retention period. For example, taking data storage level L1 as an example, when vehicle driving data is the first storage level L1, its corresponding data storage table is "ev_driving_data_L1", the data is stored using the ZSTD compression algorithm, the data partitioning strategy in the table is partitioned by minute, the storage medium is a solid-state drive (SSD), and the data retention period is 24 hours.

[0090] In a preferred embodiment, step S300 further includes:

[0091] According to the compression algorithm and partitioning strategy corresponding to the data storage level of vehicle driving data, the vehicle driving data is written to the corresponding data storage table, and the storage medium where the data storage table is located is refreshed.

[0092] As shown in Table 4, in one specific embodiment of this application, the storage medium includes a hard disk drive (HDD) and a solid-state drive (SSD).

[0093] Preferably, as shown in Table 4, this application sets three storage levels L1, L2, and L3, each indicating the storage location where the corresponding vehicle driving data should be stored. Specifically, the first storage level L1 corresponds to high-performance SSD storage media, indicating that highly critical data should be stored on high-performance SSD storage media. The second storage level L2 corresponds to a combination of HDD and SSD storage media, indicating that vehicle driving data belonging to the second storage level L2 is stored as the second most critical data across HDD and SSD, which reduces the occupation of high-performance SSD storage and frees up SSD resources. The third storage level L3 corresponds to HDD storage media, indicating that vehicle driving data belonging to the third storage level L3 is stored as low-criticality data on HDD, which prevents non-critical data from occupying the SSD space reserved for highly critical data.

[0094] In one specific embodiment, vehicle driving data is written to the corresponding data storage table in the following manner:

[0095] If the vehicle driving data is in the first storage level, the ZSTD compression algorithm and the partitioning strategy of partitioning by minute are used to write the vehicle driving data into the corresponding data storage table. If the vehicle driving data is in the second storage level, the LZ4 compression algorithm and the partitioning strategy of partitioning by hour are used to write the vehicle driving data into the corresponding data storage table. If the vehicle driving data is in the third storage level, the SNAPPY compression algorithm and the partitioning strategy of partitioning by day are used to write the vehicle driving data into the corresponding data storage table.
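The per-level configuration described above can be captured in a small rule table. The Python sketch below is an illustration under stated assumptions: the class and function names are hypothetical, the L2 retention of 7 days is taken from the migration condition described elsewhere in this application, and the L3 retention period is left unset because it is not stated in this passage.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

TABLE_PREFIX = "ev_driving_data_"  # unified table name prefix from the description


@dataclass(frozen=True)
class TierConfig:
    """Hierarchical storage configuration rule for one data storage level."""
    table: str
    compression: str
    partition_by: str
    media: Tuple[str, ...]
    retention_hours: Optional[int]  # None where no period is stated here


TIERS = {
    "L1": TierConfig(TABLE_PREFIX + "L1", "ZSTD", "minute", ("SSD",), 24),
    "L2": TierConfig(TABLE_PREFIX + "L2", "LZ4", "hour", ("HDD", "SSD"), 7 * 24),
    "L3": TierConfig(TABLE_PREFIX + "L3", "SNAPPY", "day", ("HDD",), None),
}


def config_for(storage_level: str) -> TierConfig:
    """Look up the configuration rule for a data storage level."""
    if storage_level not in TIERS:
        raise ValueError(f"unknown data storage level: {storage_level!r}")
    return TIERS[storage_level]


cfg = config_for("L1")
print(cfg.table, cfg.compression, cfg.partition_by)  # ev_driving_data_L1 ZSTD minute
```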

[0096] In this way, this application combines different compression algorithms, partitioning strategies, and storage media for the different data storage levels of vehicle driving data to complete the storage of vehicle driving data, thereby reducing application costs, reducing invalid data writing, and reducing data input / output (IO) overhead.

[0097] In another specific embodiment, the data storage table corresponding to the first storage level L1 enables a real-time data synchronization mechanism, that is, the vehicle driving data is refreshed immediately after being written to the data storage table, ensuring that the data query latency is less than 100ms. The data storage tables corresponding to the second storage level L2 and the third storage level L3 enable a batch writing mechanism, that is, each time a preset number (1000 records for example) of vehicle driving data is obtained, a writing batch is formed, and the batch of data is written to the corresponding data storage table, reducing the overhead of data input / output interfaces.
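The two write mechanisms can be sketched as a small buffering writer. This Python sketch is hypothetical throughout: the class name and the `sink` callable are stand-ins for whatever actually writes to the data storage table, and only the immediate write for L1 and the example batch size of 1000 records come from the description.

```python
from typing import Callable, List


class TierWriter:
    """Writes rows immediately for L1 and in fixed-size batches for L2 / L3."""

    def __init__(self, storage_level: str,
                 sink: Callable[[List[dict]], None],
                 batch_size: int = 1000):
        self.storage_level = storage_level
        self.sink = sink  # assumed stand-in for the real table write
        self.batch_size = batch_size
        self._buffer: List[dict] = []

    def write(self, row: dict) -> None:
        if self.storage_level == "L1":
            # Real-time synchronization: each row is written through at once.
            self.sink([row])
            return
        # Batch mechanism: accumulate rows until a full batch is formed.
        self._buffer.append(row)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Write any buffered rows as one batch and reset the buffer.
        if self._buffer:
            self.sink(self._buffer)
            self._buffer = []


batches: List[List[dict]] = []
writer = TierWriter("L2", batches.append, batch_size=3)
for i in range(4):
    writer.write({"data_id": i})
print(len(batches))  # 1 full batch written; one row is still buffered
```

A caller would invoke `flush()` at shutdown or on a timer so that a partially filled batch is not held indefinitely; that policy is not specified in the description.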

[0098] In a preferred embodiment, the method provided in this application further includes:

[0099] Based on the data retention period corresponding to the data storage table and the data storage level of the vehicle driving data in the data storage table, determine the vehicle driving data to be migrated that meets the migration trigger conditions in the data storage table, migrate the vehicle driving data to be migrated from its original data storage table to the target data storage table, delete the vehicle driving data to be migrated from its corresponding original data storage table after the migration is completed, update the storage media where the original data storage table is located and the storage media where the target data storage table is located, and generate the data migration log corresponding to the migration.

[0100] For example, a data migration log includes at least the migration batch, the source table, the target table, the amount of data migrated, and the migration time.

[0101] In one specific embodiment, taking the above-mentioned data storage level division as an example, for the data storage table corresponding to the first storage level, vehicle driving data that has been retained for more than 24 hours or whose criticality level drops from K1 / K2 to K3 is defined as vehicle driving data to be migrated. For the data storage table corresponding to the second storage level, vehicle driving data that has been retained for more than 7 days or whose query level drops from T1 / T2 to T3 is defined as vehicle driving data to be migrated.
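The migration trigger conditions in this embodiment can be expressed as a single predicate. The sketch below is a hedged illustration: the function name and parameters are hypothetical, and it assumes that the retention time is measured from the data collection time, which the description does not state explicitly. No trigger is described for the third storage level, so the function returns False for it.

```python
from datetime import datetime, timedelta

L1_MAX_AGE = timedelta(hours=24)  # retention period of the first storage level
L2_MAX_AGE = timedelta(days=7)    # retention period of the second storage level


def needs_migration(storage_level: str, collect_time: datetime, now: datetime,
                    criticality_level: str, query_level: str) -> bool:
    """Return True if a row meets the migration trigger condition of its level."""
    age = now - collect_time  # assumption: age is counted from collection time
    if storage_level == "L1":
        # Retained longer than 24 hours, or criticality level has dropped to K3.
        return age > L1_MAX_AGE or criticality_level == "K3"
    if storage_level == "L2":
        # Retained longer than 7 days, or query level has dropped to T3.
        return age > L2_MAX_AGE or query_level == "T3"
    return False


now = datetime(2026, 4, 13, 12, 0, 0)
print(needs_migration("L1", now - timedelta(hours=25), now, "K1", "T2"))  # True
print(needs_migration("L1", now - timedelta(hours=1), now, "K1", "T1"))   # False
```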

[0102] For example, if the data storage table is a StarRocks storage table, then the StarRocks INSERT SELECT statement is used to copy the vehicle driving data to be migrated from its original StarRocks storage table to the target StarRocks storage table corresponding to its updated data storage level.

[0103] After the data migration is completed, the vehicle driving data to be migrated is deleted from its original StarRocks storage table, and the storage media where the original data storage table is located and the storage media where the target data storage table is located are updated. Specifically, migrated data in the StarRocks storage table corresponding to the first storage level is deleted immediately after the migration is completed.

[0104] Migrated data in the StarRocks storage table corresponding to the second storage level is retained for 12 hours as a backup after the migration is completed, and is then deleted.

[0105] In this application, reverse data migration (such as L3→L2) is prohibited to avoid wasting resources.
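The forward-only migration rule and the migration log fields can be sketched together. In this Python sketch all names are hypothetical; the log fields follow the items listed in the description (migration batch, source table, target table, amount of data migrated, migration time), and the table names reuse the "ev_driving_data_" prefix.

```python
from dataclasses import dataclass
from datetime import datetime

TIER_ORDER = {"L1": 1, "L2": 2, "L3": 3}  # larger number = lower storage level
TABLE_PREFIX = "ev_driving_data_"


@dataclass(frozen=True)
class MigrationLog:
    batch_id: str
    source_table: str
    target_table: str
    row_count: int
    migrated_at: datetime


def build_migration_log(batch_id: str, source_level: str, target_level: str,
                        row_count: int, migrated_at: datetime) -> MigrationLog:
    """Validate the migration direction and build the log record for one batch."""
    if TIER_ORDER[target_level] <= TIER_ORDER[source_level]:
        # Reverse migration (for example L3 -> L2) is prohibited.
        raise ValueError(
            f"migration {source_level} -> {target_level} is not allowed"
        )
    return MigrationLog(
        batch_id=batch_id,
        source_table=TABLE_PREFIX + source_level,
        target_table=TABLE_PREFIX + target_level,
        row_count=row_count,
        migrated_at=migrated_at,
    )


log = build_migration_log("batch-001", "L1", "L2", 1000,
                          datetime(2026, 4, 13, 12, 0, 0))
print(log.source_table, "->", log.target_table)
```

Checking the direction before any data is copied means a prohibited migration fails without touching either table.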

[0106] The advantages of this application are:

[0107] First, this application determines the data storage level of vehicle driving data based on a two-factor data grading model, avoiding the problems of "low-speed storage of high-critical historical data" and "occupation of high-speed resources by low-critical real-time data", thereby increasing the storage space utilization rate by 40%-60% and reducing HDD resource consumption by 35% compared with the existing single-factor scheme.

[0108] Second, by combining StarRocks' columnar storage features with a hierarchical partitioning strategy to store vehicle driving data, the query latency for vehicle driving data belonging to the first storage level (L1) is less than 100ms (meeting the requirements for real-time fault alarms), and the query response time for vehicle driving data belonging to the third storage level (L3) (such as "battery data trend of a certain vehicle within 1 month") is less than 5s, which is 3-5 times better than the MySQL solution.

[0109] Third, by combining storage media (SSD+HDD) with compression algorithms to store vehicle driving data, hardware costs are reduced by 25%-30%, while invalid data writing is reduced (abnormal data filtering), and IO overhead is reduced by 20%.

[0110] Fourth, it enables dynamic data migration without manual intervention, with a migration success rate of >99.9%, and retains complete migration logs for easy problem tracing.

[0111] Based on the same application concept, this application also provides a StarRocks-based multi-level data storage system corresponding to the StarRocks-based multi-level data storage method provided in the above embodiments. Since the principle of the system in this application is similar to the StarRocks-based multi-level data storage method in the above embodiments of this application, the implementation of the system can refer to the implementation of the method, and the repeated parts will not be described again.

[0112] Please refer to Figure 3, which illustrates a functional block diagram of a multi-level data storage system provided in an embodiment of this application. As shown in Figure 3, the multi-level data storage system includes:

[0113] The data acquisition module 400 is used to read vehicle driving data from onboard sensors and telematics processors.

[0114] The two-factor classification module 410 is used to determine the data storage level corresponding to the vehicle driving data using a two-factor data classification model based on time and criticality.

[0115] Storage module 420 is used to store vehicle driving data according to hierarchical storage configuration rules corresponding to the data storage level of vehicle driving data.

[0116] Preferably, as shown in Figure 3, the multi-level data storage system also includes:

[0117] The dynamic migration module 430 is used to determine the vehicle driving data in the data storage table that meets the migration trigger conditions based on the data retention period corresponding to the data storage table and the data storage level of the vehicle driving data in the data storage table. It then migrates the vehicle driving data to be migrated from its original data storage table to the target data storage table. After the migration is completed, it deletes the vehicle driving data to be migrated from its corresponding original data storage table, updates the storage media where the original data storage table is located and the storage media where the target data storage table is located, and generates the data migration log corresponding to the migration.

[0118] In one specific embodiment, each module of the multi-level data storage system provided in this application interacts with the Stream Load interface using the JSON format.

[0119] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems and devices described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here. In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of units is only a logical functional division; in actual implementation, there may be other division methods. Furthermore, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Another point is that the displayed or discussed mutual coupling or direct coupling or communication connection may be through some communication interfaces; the indirect coupling or communication connection of devices or units may be electrical, mechanical, or other forms.

[0120] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0121] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0122] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a processor-executable, non-volatile, computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0123] The above are merely specific embodiments of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A method for multi-level data storage, characterized in that the method includes: Read vehicle driving data from onboard sensors and telematics processors; The data storage level corresponding to the vehicle driving data is determined by using a two-factor data grading model based on time and criticality. The two-factor data grading model includes a time grading sub-model, a criticality grading sub-model, and a two-factor mapping sub-model. The vehicle driving data is stored according to the hierarchical storage configuration rules corresponding to the data storage level of the vehicle driving data; The step of determining the data storage level corresponding to the vehicle driving data using a two-factor data grading model based on time and criticality includes: Input the data collection time corresponding to the vehicle driving data into the time grading sub-model to determine the query level corresponding to the vehicle driving data; Input the data type related to the vehicle driving data into the criticality grading sub-model to determine the criticality level corresponding to the vehicle driving data; Input the query level and the criticality level corresponding to the vehicle driving data into the two-factor mapping sub-model to determine the data storage level corresponding to the vehicle driving data; The vehicle driving data includes multiple driving parameter sampling data at the same sampling time. The driving parameter sampling data includes the data type and sampling data value corresponding to the driving parameter.
The criticality grading sub-model determines the criticality level corresponding to the vehicle driving data in the following way: Based on the data type corresponding to the driving parameter sampling data, determine the criticality base score and criticality indicator weight corresponding to the driving parameter sampling data; Based on the dynamic threshold library corresponding to the vehicle and the sampling data values corresponding to the driving parameters, determine the state coefficients corresponding to the driving parameter sampling data; Based on the criticality base score, criticality indicator weight, and state coefficient corresponding to each of the driving parameter sampling data, determine the criticality judgment score corresponding to the vehicle driving data; The criticality level of the vehicle driving data is determined based on the mapping relationship between the criticality level and the criticality judgment score.

2. The method according to claim 1, characterized in that the two-factor mapping sub-model describes the mapping relationship between the query level, criticality level, and data storage level of vehicle driving data.

3. The method according to claim 2, characterized in that, The query levels include real-time, recent, and historical levels. The time-level sub-model determines the query level corresponding to vehicle driving data in the following way: Calculate the time difference between the data collection time corresponding to the vehicle driving data and the current time; If the time difference is less than or equal to the first time period, then the query level of the vehicle driving data is determined to be real-time. If the time difference is greater than the first time period and the time difference is less than or equal to the second time period, then the query level of the vehicle driving data is determined to be the recent level. If the time difference is greater than the second time period, then the query level of the vehicle driving data is determined to be historical, wherein the second time period is greater than the first time period.

4. The method according to claim 1, characterized in that the step of determining the criticality judgment score corresponding to the vehicle driving data based on the criticality base score, criticality indicator weight, and state coefficient corresponding to each of the driving parameter sampling data includes: Calculate the first product between the criticality base score corresponding to the driving parameter sampling data and the state coefficient, and determine the first product as the first critical indicator score corresponding to the driving parameter sampling data; Calculate the second product between the first critical indicator score and the criticality indicator weight corresponding to each driving parameter sampling data, and determine the second product as the second critical indicator score corresponding to the driving parameter sampling data; The sum of the second critical indicator scores corresponding to all the driving parameter sampling data is determined as the criticality judgment score of the vehicle driving data.

5. The method according to claim 1, characterized in that, The hierarchical storage configuration rules record the mapping relationship between data storage level, data storage table, compression algorithm, partitioning strategy, storage location, and data retention period. The step of storing the vehicle driving data according to the hierarchical storage configuration rules corresponding to the data storage level of the vehicle driving data includes: The vehicle driving data is written into the corresponding data storage table according to the compression algorithm and partitioning strategy corresponding to the data storage level of the vehicle driving data. Refresh the storage medium where the data storage table is located.

6. The method according to claim 5, characterized in that, The data storage levels include a first storage level, a second storage level, and a third storage level, from high to low. The vehicle driving data is written to the corresponding data storage table in the following way: If the vehicle driving data is at the first storage level, then the ZSTD compression algorithm and the partitioning strategy of partitioning by minute are used to write the vehicle driving data into the corresponding data storage table. If the vehicle driving data is at the second storage level, then the LZ4 compression algorithm and the hourly partitioning strategy are used to write the vehicle driving data into the corresponding data storage table. If the vehicle driving data is at the third storage level, then the SNAPPY compression algorithm and a daily partitioning strategy are used to write the vehicle driving data into the corresponding data storage table.

7. The method according to claim 5, characterized in that, The method further includes: Based on the data retention period corresponding to the data storage table and the data storage level of the vehicle driving data in the data storage table, determine the vehicle driving data in the data storage table that meets the migration trigger conditions and needs to be migrated; Migrate the vehicle driving data to be migrated from its original data storage table to the target data storage table; After the migration is completed, the driving data of the vehicle to be migrated is deleted from its corresponding original data storage table, and the storage medium where the original data storage table is located and the storage medium where the target data storage table is located are updated. Generate the data migration logs corresponding to the migration.

8. The method according to claim 1, characterized in that, before determining the data storage level corresponding to the vehicle driving data using a two-factor data grading model based on time and criticality, the method further includes: Convert the driving parameter sampling data into a preset acquisition format; The driving parameter sampling data after format conversion is validated to obtain multiple driving parameter sampling data that pass the validation.

9. A multi-level data storage system, characterized in that, The system includes: The data acquisition module is used to read vehicle driving data from onboard sensors and telematics processors; The two-factor classification module is used to determine the data storage level corresponding to the vehicle driving data using a two-factor data classification model based on time and criticality. The two-factor data classification model includes a time classification sub-model, a criticality classification sub-model, and a two-factor mapping sub-model. The storage module is used to store the vehicle driving data according to the hierarchical storage configuration rules corresponding to the data storage level of the vehicle driving data; The two-factor classification module is also used for: Input the data collection time corresponding to the vehicle driving data into the time-level sub-model to determine the query level corresponding to the vehicle driving data; Input the relevant data types of vehicle driving data into the keyness classification sub-model to determine the keyness level corresponding to the vehicle driving data; Input the query level and the keyness level corresponding to the vehicle driving data into the two-factor mapping sub-model to determine the data storage level corresponding to the vehicle driving data; The vehicle driving data includes multiple driving parameter sampling data at the same sampling time. The driving parameter sampling data includes the data type and sampling data value corresponding to the driving parameter. 
The criticality classification sub-model determines the criticality level corresponding to the vehicle driving data in the following way: Based on the data type corresponding to the driving parameter sampling data, determine the criticality base score and criticality indicator weight corresponding to the driving parameter sampling data; Based on the dynamic threshold library corresponding to the vehicle and the sampling data values corresponding to the driving parameters, determine the state coefficients corresponding to the driving parameter sampling data; Based on the criticality base score, criticality indicator weight, and state coefficient corresponding to each of the driving parameter sampling data, determine the criticality judgment score corresponding to the vehicle driving data; The criticality level of the vehicle driving data is determined based on the mapping relationship between the criticality level and the criticality judgment score.