Index method and system for data spatio-temporal classification

By constructing a pre-defined hierarchical classification system to generate classification codes, dividing the data into hierarchical grid units and mapping them to one-dimensional spatial codes, generating time codes according to a pre-defined time granularity, and merging them into a unified spatiotemporal classification code, the problem of unified coding for multi-source heterogeneous data is solved, and efficient spatiotemporal indexing and data retrieval are achieved.

CN122240884A · Pending · Publication Date: 2026-06-19 · CHINA UNIV OF MINING & TECH (BEIJING)

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: CHINA UNIV OF MINING & TECH (BEIJING)
Filing Date: 2026-03-16
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing spatiotemporal indexing methods struggle to achieve unified encoding of multi-source heterogeneous data in the complex data environment of coal mines, fail to balance the retrieval efficiency of multi-scale geological information, weaken the time dimension, and incur high costs for updating and maintaining the index structure, making it difficult to adapt to the dynamic writing and real-time updating requirements of high-frequency real-time monitoring data.

Method used

The method constructs a predefined hierarchical classification system to generate classification codes, divides space into hierarchical grid units and maps them to one-dimensional spatial codes, and generates time codes according to a predefined time granularity. The classification, spatial, and time codes are then merged into a unified spatiotemporal classification code, which is used as the primary key or partition key of the database table and written directly into the storage engine to establish a compact physical index structure.

Benefits of technology

It provides standardized semantic identification and association of multi-source heterogeneous data, delivers high-precision spatial positioning across scales, supports efficient access to multi-granularity time data, reduces index maintenance overhead, and enables real-time writing and efficient retrieval of massive monitoring data.

✦ Generated by Eureka AI based on patent content.


Abstract

This application discloses a data spatiotemporal classification indexing method and system, relating to the field of geotechnical engineering safety monitoring and geological disaster early warning technology, which facilitates improved data indexing efficiency. The method includes: encoding the target data to be indexed based on a preset hierarchical classification system to generate classification codes; determining the corresponding divided grid units according to the target data and mapping the index of the grid units to spatial codes; dividing the time information of the target data according to a preset time granularity, mapping each time sub-unit in the time information to binary codes to obtain time codes; fusing the classification codes, spatial codes, and time codes in a predetermined order to generate a unified spatiotemporal classification code; and writing the unified spatiotemporal classification code as the primary key, partition key, or first column of a composite index in a database table into the database storage engine to establish a physical index structure based on the unified spatiotemporal classification code. This application is applicable to coal mine geoscience data management scenarios.

Description

Technical Field

[0001] This application relates to the field of organization, management, and spatiotemporal indexing of multi-source heterogeneous data, and in particular to an indexing method and system for spatiotemporal classification of data.

Background Technology

[0002] In current coal mine geoscientific data management, with the development of multi-source three-dimensional detection systems, data is multi-source and heterogeneous, spatiotemporally coupled, and updated at high frequency. Existing spatiotemporal indexing methods mainly construct tree-like or coded structures through spatial partitioning or space-filling curves to support spatial range queries and proximity retrieval.

[0003] However, existing spatiotemporal indexing methods have significant shortcomings in the complex data environment of coal mines. First, there is no unified coding system for multi-source heterogeneous data, making it difficult to organize and associate cross-modal data such as numerical data, images, and point clouds. Second, the spatial partitioning mechanism adapts poorly to multi-scale geological information, making it difficult to balance retrieval efficiency between large-scale mining areas and local high-precision structures. Third, the time dimension is generally reduced to an auxiliary attribute and lacks a multi-granularity, continuous time coding scheme, so dynamic querying and analysis of time-series data cannot be supported effectively. Finally, the index structure is costly to update and maintain, making it difficult to meet the dynamic writing and real-time update requirements of high-frequency real-time monitoring data. In summary, these limitations lead to low data retrieval efficiency, serious storage redundancy, and a reduced overall level of data management and utilization.

Summary of the Invention

[0004] In view of this, embodiments of this application provide an indexing method and system for spatiotemporal data classification, which facilitates the improvement of data indexing efficiency.

[0005] In a first aspect, embodiments of this application provide an indexing method for spatiotemporal classification of data, comprising: encoding target data to be indexed based on a preset hierarchical classification system to generate a classification code; dividing the spatial region to which the target data belongs into hierarchical grid units, determining the corresponding grid unit according to the spatial coordinates of the target data, and mapping the index of the grid unit to a one-dimensional spatial code; dividing the time information of the target data according to a preset time granularity, mapping each time sub-unit in the time information to a binary code, and combining the binary codes in sequence to obtain a time code; fusing the classification code, spatial code, and time code in a predetermined order to generate a unified spatiotemporal classification code; and writing the unified spatiotemporal classification code into the database storage engine as the primary key, the partition key, or the first column of a composite index of the database table, to establish a physical index structure based on the unified spatiotemporal classification code.

[0006] According to a specific implementation of an embodiment of this application, the step of encoding the target data to be indexed based on a preset hierarchical classification system to generate a classification code includes: obtaining the attribution identifier of the target data in the subject domain, business object, data entity, and attribute sequence based on the hierarchical classification system defined according to the coal mine data fusion and sharing specification; mapping each attribution identifier to a binary code segment of a preset number of bits; and concatenating each binary code segment in hierarchical order from the subject domain to the attribute sequence to obtain the classification code.

[0007] According to a specific implementation of an embodiment of this application, dividing the spatial region to which the target data belongs into hierarchical grid units includes: dividing the spatial region into multiple levels of square grid units, wherein the highest-level grid unit covers the entire space of the target data and the lowest-level grid unit corresponds to millimeter-level spatial accuracy.

[0008] According to a specific implementation of an embodiment of this application, the step of determining the corresponding divided grid cells based on the spatial coordinates of the target data and mapping the index of the grid cells to a one-dimensional spatial code includes: calculating the grid level where the target data is located and the grid coordinates within the level based on the latitude and longitude coordinates of the target data; and calling a space filling curve algorithm to convert the grid coordinates into a one-dimensional binary spatial code.

[0009] According to a specific implementation of an embodiment of this application, the step of dividing the time information of the target data according to a preset time granularity, mapping each time sub-unit in the time information to a binary code, and combining the binary codes in sequence includes: dividing the time into one or more levels of year, month, day, hour, minute, second, and millisecond; generating the corresponding binary code from the time value of each level; combining the binary codes in order from year to millisecond; and padding the combined binary sequence to obtain the time code.

[0010] According to a specific implementation of an embodiment of this application, the step of fusing the classification code, spatial code, and temporal code in a predetermined order to generate a unified spatiotemporal classification code includes: concatenating the Base64 format strings corresponding to the classification code, spatial code, and temporal code in the order of classification, space, and time to generate the unified spatiotemporal classification code.

[0011] According to a specific implementation of an embodiment of this application, after the unified spatiotemporal classification code is generated, the method further includes: in response to a received spatiotemporal query request or classification retrieval request, converting the query conditions in the request into the corresponding unified spatiotemporal classification code or code prefix range; and locating the corresponding data page or data partition through the physical index structure established on the unified spatiotemporal classification code, to perform the data retrieval operation.
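As an illustration of the conversion from query conditions to a code prefix range, the following is a minimal Python sketch rather than the implementation described in this application. It assumes the unified codes are fixed-layout ASCII strings held in sorted order; the function names `prefix_range` and `scan` and the sample keys are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple


def prefix_range(prefix: str) -> Tuple[str, str]:
    """Return half-open bounds [low, high) covering every key that starts with prefix.

    Assumes ASCII keys, so the last character can always be incremented.
    """
    if not prefix:
        raise ValueError("prefix must not be empty")
    high = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, high


def scan(sorted_keys: List[str], prefix: str) -> List[str]:
    """Locate the contiguous run of keys sharing the prefix with two binary searches."""
    low, high = prefix_range(prefix)
    return sorted_keys[bisect_left(sorted_keys, low):bisect_left(sorted_keys, high)]


# Hypothetical unified codes: 2-character class, 2-character space, 2-character time.
KEYS = sorted(["ABx1t1", "ABx1t2", "ABx2t1", "ACx1t1", "AAx9t9"])
SAME_CLASS = scan(KEYS, "AB")          # classification retrieval
SAME_CLASS_AND_CELL = scan(KEYS, "ABx1")  # classification plus grid cell
```

Because the classification code comes first in the key, a classification query becomes a single contiguous range; a purely spatial or purely temporal query would not share a prefix under this ordering and would need one range per classification code.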

[0012] In a second aspect, embodiments of this application also provide a data spatiotemporal classification indexing system, comprising: a classification encoding unit, used to encode target data to be indexed based on a preset hierarchical classification system to generate a classification code; a spatial encoding unit, used to divide the spatial region to which the target data belongs into hierarchical grid units, determine the corresponding grid unit according to the spatial coordinates of the target data, and map the index of the grid unit to a one-dimensional spatial code; a time encoding unit, used to divide the time information of the target data according to a preset time granularity, map each time sub-unit in the time information to a binary code, and combine the binary codes in sequence to obtain a time code; and a fusion unit, used to fuse the classification code, spatial code, and time code in a predetermined order to generate a unified spatiotemporal classification code, and to write the unified spatiotemporal classification code into the database storage engine as the primary key, the partition key, or the first column of a composite index of a database table, to establish a physical index structure based on the unified spatiotemporal classification code.

[0013] According to a specific implementation of an embodiment of this application, the classification coding unit includes: an acquisition module, used to acquire the attribution identifier of the target data in the subject domain, business object, data entity and attribute sequence based on the hierarchical classification system defined according to the coal mine data fusion and sharing specification; a mapping module, used to map each attribution identifier to a binary code segment of a preset number of bits; and a splicing module, used to splice each binary code segment in hierarchical order from subject domain to attribute sequence to obtain the classification code.

[0014] According to a specific implementation of an embodiment of this application, the spatial coding unit is specifically used to: divide the spatial region into multiple levels of square grid units; wherein, the highest level grid unit covers the entire space of the target data, and the lowest level grid unit corresponds to millimeter-level spatial accuracy.

[0015] The data spatiotemporal classification indexing method and system provided in the embodiments of this application offer the following benefits. First, by constructing a preset hierarchical classification system to generate classification codes, they address the lack of a unified coding system and the difficulty of managing different data types together, and provide standardized semantic identifiers and association links for multi-source heterogeneous data such as structured numerical data and unstructured images and point clouds. Second, by dividing space into hierarchical grid units and mapping them to one-dimensional spatial codes, they overcome the limited partitioning accuracy of traditional indexes and their inability to serve both large-scale mining areas and local high-precision positioning, achieving cross-scale high-precision spatial positioning while maintaining the spatial locality of the data. Third, by generating time codes from the combined binary codes of multi-granularity time sub-units, they remedy the single granularity of conventional time indexes and the resulting difficulty of supporting efficient access to multi-scale continuous time data. Finally, by merging the three codes into a unified spatiotemporal classification code and writing it directly into the storage engine as the primary key, the partition key, or the first column of a composite index, they overcome the complexity and high update cost of traditional index structures and their poor fit for managing high-frequency dynamic data streams. Unifying the index structure and the storage structure reduces index maintenance overhead and enables real-time writing and efficient retrieval of massive monitoring data. In summary, this application improves data indexing efficiency by constructing a compact indexing system that integrates classification, space, and time.

Description of the Drawings

[0016] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0017] Figure 1 is a schematic flowchart of an indexing method for spatiotemporal data classification provided in an embodiment of this application;

[0018] Figure 2 is a classification coding diagram provided in an embodiment of this application;

[0019] Figure 3 is a schematic diagram of the classification attribute encoding process provided in an embodiment of this application;

[0020] Figure 4 is a schematic diagram showing the distribution of indexed regions for different spatial entities provided in an embodiment of this application;

[0021] Figure 5 is a time coding diagram provided in an embodiment of this application;

[0022] Figure 6 is a schematic diagram of a key-value storage model for geophysical big data provided in an embodiment of this application;

[0023] Figure 7 is a schematic diagram of MSTC spatiotemporal classification coding provided in an embodiment of this application;

[0024] Figure 8 is a schematic diagram illustrating database usage for different datasets provided in an embodiment of this application;

[0025] Figure 9 is a schematic diagram illustrating the time and space consumption of constructing different database indexes provided in an embodiment of this application;

[0026] Figure 10 is a schematic diagram of a MySQL query test provided in an embodiment of this application;

[0027] Figure 11 is a schematic diagram of a PostgreSQL query test provided in an embodiment of this application;

[0028] Figure 12 is a schematic diagram of a MongoDB query test provided in an embodiment of this application;

[0029] Figure 13 is a schematic diagram of an HBase query test provided in an embodiment of this application;

[0030] Figure 14 is a schematic diagram of a Cassandra query test provided in an embodiment of this application;

[0031] Figure 15 is a schematic diagram of an Apache Iceberg query test provided in an embodiment of this application;

[0032] Figure 16 is a schematic diagram of a data spatiotemporal classification indexing system provided in an embodiment of this application.

Detailed Implementation

[0033] The embodiments of this application will now be described in detail with reference to the accompanying drawings.

[0034] It should be understood that the described embodiments are merely some, not all, of the embodiments in this application. All other embodiments obtained by those skilled in the art based on the embodiments in this application without inventive effort are within the scope of protection of this application.

[0035] To enable those skilled in the art to better understand the technical concept, implementation scheme and beneficial effects of the embodiments of this application, detailed descriptions are provided below through specific embodiments.

[0036] In a first aspect, embodiments of this application provide an indexing method for spatiotemporal data classification, which helps improve data indexing efficiency.

[0037] As shown in Figure 1, embodiments of this application provide an indexing method for spatiotemporal data classification, including:

[0038] S11. Based on the preset hierarchical classification system, the target data to be indexed is encoded to generate classification codes;

[0039] In the construction of intelligent coal mines, geoscientific data comes from diverse sources, has heterogeneous structures, and carries complex semantic relationships. Achieving unified description and efficient organization across business systems and data types has become a key bottleneck for data fusion and sharing on transparent geological platforms. Traditional data encoding methods are mostly designed for a single business system and lack a globally unified semantic identification system. As a result, data cannot easily be classified and attached to a subject automatically when it is ingested into the intelligent mine data warehouse, and data silos persist. To address these issues, this application pre-establishes a standardized hierarchical classification system covering the entire business domain of coal mines. In some examples, the hierarchical classification system strictly adheres to the "Intelligent Mine Data Fusion and Sharing Specification," grouping coal mine geoscience data into four core areas: foundation, production, safety, and management. These are further subdivided into 59 subject domains, 222 business objects, 1047 data entities, and 12547 attribute sequences, forming a four-level logical coding framework of "subject domain, business object, data entity, attribute sequence." Based on this framework, when the target data to be indexed is classified and coded, the system first matches it level by level within the predefined hierarchy according to its business origin and attribute characteristics, determining its subject domain, business object, data entity, and specific attribute. The category information of these four levels is then converted in sequence into fixed-length binary code segments, which are concatenated into a unique 48-bit classification code.

[0040] The classification coding in this embodiment not only fully preserves the business semantics and hierarchical relationship of the data, but also provides a standardized pre-identifier for subsequent integration with spatial coding and temporal coding in a highly compact binary form. This ensures that the entire Multi-Source Spatio-Temporal Classification (MSTC) unified coding system has consistent identity description capabilities and cross-modal collaborative retrieval foundation in a multi-source heterogeneous data environment.

[0041] S12. Divide the spatial region to which the target data belongs into hierarchical grid cells, determine the corresponding grid cells according to the spatial coordinates of the target data, and map the index of the grid cells to a one-dimensional spatial code;

[0042] After the classification dimension of multi-source heterogeneous coal mine data has been uniformly coded, the spatial attributes of the data must also be indexed and expressed in a standardized way so that spatial location information can be organized efficiently and located quickly. To this end, in some examples, this application first subdivides the geographic space of the mining area into square grid units at a preset 14 levels of resolution, following the Geohash principle of multi-scale spatial grid division, with coverage ranging from 5,000 kilometers down to 5 millimeters, to meet cross-scale positioning needs from macro mining-area layout to micro geological structure. Then, for the target data to be indexed, the grid unit to which it belongs at the relevant level is calculated and matched from the latitude and longitude coordinates of its collection location. Finally, to compress the two-dimensional grid information into a one-dimensional comparable code while maintaining spatial proximity, a Hilbert space-filling curve is used to map the index of the hierarchical grid unit to a compact, locality-preserving, unique spatial code.

[0043] This embodiment not only achieves efficient dimensionality reduction representation of massive spatial data from a two-dimensional plane to a one-dimensional key value, but also lays a structured foundation for subsequent integration with classification coding and time coding to form a unified MSTC spatiotemporal index key.
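The mapping from a coordinate to a grid cell and then to a Hilbert-curve position can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the grid defined in this application: it assumes a grid of 2^order by 2^order equal-angle cells over longitude [-180, 180] and latitude [-90, 90] (so cells are rectangular in degrees rather than square), and the names `cell_of`, `hilbert_index`, and `spatial_code` are hypothetical. The curve computation is the standard iterative Hilbert mapping.

```python
from typing import Tuple


def cell_of(lon: float, lat: float, order: int) -> Tuple[int, int]:
    """Return (column, row) of the cell containing the point on a 2**order grid."""
    n = 1 << order
    col = min(int((lon + 180.0) / 360.0 * n), n - 1)
    row = min(int((lat + 90.0) / 180.0 * n), n - 1)
    return col, row


def hilbert_index(order: int, x: int, y: int) -> int:
    """Map cell (x, y) to its distance along a Hilbert curve over a 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so every sub-curve has the same orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def spatial_code(lon: float, lat: float, order: int) -> str:
    """Return the Hilbert position as a fixed-width binary string (2 bits per level)."""
    col, row = cell_of(lon, lat, order)
    return format(hilbert_index(order, col, row), "0{}b".format(2 * order))
```

Consecutive positions along the curve always fall in edge-adjacent cells, which is the locality property the paragraph above relies on; the converse does not hold everywhere, so some neighbouring cells still receive distant codes.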

[0044] S13. Divide the time information of the target data according to a preset time granularity, map each time sub-unit in the time information into binary code, and then combine them in sequence to obtain the time code;

[0045] After spatial coding achieves efficient dimensionality reduction representation of the spatial location of coal mine geoscience data, in order to further enhance the data's sequential organization capability and multi-granularity retrieval flexibility in the time dimension, it is necessary to perform standardized coding processing on the time information corresponding to data collection or updating.

[0046] To this end, based on the diverse timeliness requirements of coal mine exploration and monitoring operations, this application first predefines a fine-grained time system covering seven levels (year, month, day, hour, minute, second, and millisecond), dividing the continuous time axis into time sub-units with clear semantic levels. Then, for the timestamp of the target data to be indexed, the value corresponding to each level is extracted in the above granularity order and converted into a fixed-length binary code. Finally, to keep the coding structure uniform and parsable, the binary codes of all time sub-units are concatenated strictly in order from coarse-grained to fine-grained, forming a complete and compact multi-scale time code.

[0047] This embodiment not only transforms unstructured time information into a key-value format suitable for efficient computer storage and comparison, but also provides direct index support for subsequent fast querying and partitioning based on time range, time point, or time granularity by preserving the hierarchical prefix characteristics of the time series.
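A minimal Python sketch of such a time code follows. The field widths and the padding rule are assumptions made for illustration, since the text above does not state them: each field gets just enough bits for its value range, and the result is padded with trailing zeros to a multiple of 6 bits. The names `FIELDS` and `time_code` are hypothetical.

```python
from datetime import datetime

# Assumed field widths, coarse to fine (not specified in the source text).
FIELDS = (
    ("year", 14), ("month", 4), ("day", 5), ("hour", 5),
    ("minute", 6), ("second", 6), ("millisecond", 10),
)


def time_code(ts: datetime, granularity: str = "millisecond", pad: bool = True) -> str:
    """Concatenate fixed-width binary fields from year down to the chosen granularity."""
    names = [name for name, _ in FIELDS]
    if granularity not in names:
        raise ValueError("unknown granularity: " + granularity)
    values = {
        "year": ts.year, "month": ts.month, "day": ts.day, "hour": ts.hour,
        "minute": ts.minute, "second": ts.second,
        "millisecond": ts.microsecond // 1000,
    }
    bits = ""
    for name, width in FIELDS:
        bits += format(values[name], "0{}b".format(width))
        if name == granularity:
            break
    if pad:
        # Assumed padding rule: trailing zeros up to the next multiple of 6 bits.
        bits += "0" * (-len(bits) % 6)
    return bits
```

Because the fields are fixed-width and ordered from coarse to fine, the unpadded code at a coarser granularity is a prefix of the finer code for the same timestamp, and lexicographic order of the codes matches chronological order.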

[0048] S14. The classification code, spatial code, and time code are merged in a predetermined order to generate a unified spatiotemporal classification code. The unified spatiotemporal classification code is used as the primary key, partition key, or first column of a composite index in a database table and written into the database storage engine to establish a physical index structure based on the unified spatiotemporal classification code.

[0049] After the classification dimension, spatial location, and time information of multi-source heterogeneous coal mine data have been encoded independently, the three codes must be combined into a single globally unique key so that the three kinds of information can be organized and indexed together. This application therefore follows a predetermined concatenation order. In some examples, the order is a fixed format with the classification code first, the spatial code in the middle, and the time code last. The three fixed-length or variable-length codes are concatenated in that order to generate a compact and semantically complete unified spatiotemporal classification code. At the database table design level, the unified spatiotemporal classification code is then explicitly defined as the primary key, the partition key, or the first column of a composite index of the data table, and is written into the storage engine through the underlying database's B+ tree, log-structured merge tree (LSM), or hash index mechanism, thereby constructing an ordered index structure based on MSTC encoding on the physical storage medium.

[0050] This embodiment not only compresses high-dimensional spatiotemporal classification information into one-dimensional comparable key values, enabling data records to be tightly arranged in accordance with the semantics of coal mine business in disk or memory, but also completely avoids the additional write amplification and update overhead caused by independent index maintenance in traditional solutions by designing the index structure and storage structure in one, providing native underlying support for high-concurrency writing and multi-dimensional fast retrieval of massive geoscience monitoring data.
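The idea of using the fused code directly as the table's key can be sketched with Python's standard `sqlite3` module. This is an illustration under assumptions, not the storage layout of this application: SQLite stands in for the databases named in the figures, the table and column names are hypothetical, and the code strings are made-up placeholders rather than real MSTC codes.

```python
import sqlite3


def fuse(class_code: str, space_code: str, time_code: str) -> str:
    """Concatenate the three codes in the order classification, space, time."""
    return class_code + space_code + time_code


def build_table(records):
    """Create an in-memory table keyed by the fused code and load the records."""
    conn = sqlite3.connect(":memory:")
    # WITHOUT ROWID makes SQLite store the rows in the primary-key B-tree itself,
    # so the key order is the storage order and no separate index is maintained.
    conn.execute(
        "CREATE TABLE monitoring (mstc TEXT PRIMARY KEY, payload TEXT) WITHOUT ROWID"
    )
    conn.executemany(
        "INSERT INTO monitoring (mstc, payload) VALUES (?, ?)",
        [(fuse(c, s, t), p) for c, s, t, p in records],
    )
    return conn


# Placeholder records, deliberately inserted out of key order.
RECORDS = [
    ("10203005", "0Kf2", "7e30", "gas reading"),
    ("10203004", "0Kf2", "7e31", "stress reading B"),
    ("10203004", "0Kf2", "7e30", "stress reading A"),
]

conn = build_table(RECORDS)
ordered = [row[0] for row in conn.execute("SELECT mstc FROM monitoring ORDER BY mstc")]
hit = conn.execute(
    "SELECT payload FROM monitoring WHERE mstc = ?",
    (fuse("10203004", "0Kf2", "7e31"),),
).fetchone()
```

In key order, records with the same classification code sit next to each other, then group by cell, then by time, which is the arrangement the paragraph above describes for a key with the classification code first.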

[0051] The spatiotemporal classification indexing method provided in this application offers the following benefits. First, by constructing a preset hierarchical classification system to generate classification codes, it addresses the lack of a unified coding system and the difficulty of managing different data types together, and provides standardized semantic identifiers and association links for multi-source heterogeneous data such as structured numerical data and unstructured images and point clouds. Second, by dividing space into hierarchical grid units and mapping them to one-dimensional spatial codes, it overcomes the limited partitioning accuracy of traditional indexes and their inability to serve both large-scale mining areas and local high-precision positioning, achieving cross-scale high-precision spatial positioning while maintaining the spatial locality of the data. Third, by generating time codes from the combined binary codes of multi-granularity time sub-units, it remedies the single granularity of conventional time indexes and the resulting difficulty of supporting efficient access to multi-scale continuous time data. Finally, by merging the three codes into a unified spatiotemporal classification code and writing it directly into the storage engine as the primary key, the partition key, or the first column of a composite index, it overcomes the complexity and high update cost of traditional index structures and their poor fit for managing high-frequency dynamic data streams. Unifying the index structure and the storage structure reduces index maintenance overhead and enables real-time writing and efficient retrieval of massive monitoring data. In summary, this application improves data indexing efficiency by constructing a compact indexing system that integrates classification, space, and time.

[0052] In some embodiments, encoding the target data to be indexed based on a preset hierarchical classification system to generate a classification code includes: obtaining the attribution identifier of the target data in the subject domain, business object, data entity, and attribute sequence based on the hierarchical classification system defined according to the coal mine data fusion and sharing specification; mapping each attribution identifier to a binary code segment of a preset number of bits; and concatenating each binary code segment in hierarchical order from subject domain to attribute sequence to obtain the classification code.

[0053] In the management of geoscience big data in coal mines, the data sources cover a variety of acquisition methods such as borehole exploration, geophysical exploration, remote sensing monitoring, laser scanning and underground sensor networks. The data formats involve structured numerical data, unstructured images and videos and semi-structured point cloud text. Traditional indexing methods lack a unified classification framework for coal mine business semantics, which leads to the fragmentation of multimodal data at the storage level and makes it difficult to achieve cross-source association retrieval.

[0054] To address this issue, in some examples, a hierarchical classification and coding system is first constructed based on the "Intelligent Mine Data Fusion and Sharing Specification." Figure 2 is a classification coding diagram provided in an embodiment of this application. As shown in Figure 2, the hierarchical classification coding system categorizes and identifies coal mine geoscience data layer by layer according to a four-level logical framework: subject domain, business object, data entity, and attribute sequence. It covers the four core areas of infrastructure, production, safety, and management, and is subdivided into 59 subject domains, 222 business objects, 1047 data entities, and 12547 attributes, forming a systematic and comprehensive data coding architecture. On this basis, the classification code adopts a 48-bit binary structure whose bit allocation follows Figure 2 and the coding logic

$$C_{\text{class}} = C_{\text{subject}} \,\|\, C_{\text{object}} \,\|\, C_{\text{entity}} \,\|\, C_{\text{attr}}$$

where $C_{\text{class}}$ is the classification code; $C_{\text{subject}}$ is the subject domain code, a 6-bit binary number that can represent 64 subject domains; $C_{\text{object}}$ is the business object code, a 12-bit binary number that can represent 4096 business objects; $C_{\text{entity}}$ is the data entity code, a 12-bit binary number that can represent 4096 data entities; $C_{\text{attr}}$ is the attribute code, an 18-bit binary number that can represent 262,144 attributes; and the symbol "||" denotes binary concatenation, which joins the codes into a complete MSTC classification code. For each piece of target data to be indexed, its attribution identifier is matched level by level, according to its business semantics, among the subject domains, business objects, data entities, and attribute sequences. Each attribution identifier is then mapped to a binary code segment of a preset number of bits.

Specifically, Figure 3 is a schematic diagram of the classification attribute encoding process provided in an embodiment of this application. As shown in Figure 3, the subject domain, business object, data entity, and attribute sequence are mapped to 6-bit, 12-bit, 12-bit, and 18-bit binary code segments respectively, forming a complete 48-bit binary classification code whose structure follows Figure 2. Taking the measurement point number of the real-time stress monitoring system as an example, the 48-bit binary code can be represented as 10110100100110101011001100111111110000110001111111; the 48-bit binary code is then divided into groups of 6 bits, yielding 8 binary units. Each unit is converted into a decimal value and mapped to a character of a predefined Base64 character set, for example "0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz", which finally yields a compact string-format classification code.
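The packing and character mapping described above can be sketched in Python as follows. The bit widths (6, 12, 12, 18) and the 64-character set are taken from the text; the function names and the sample identifier values are hypothetical, and the sketch is not presented as the implementation of this application.

```python
# Character set quoted in the text: 64 symbols, one per 6-bit value.
ALPHABET = "0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"

# Bit widths from the text: subject domain, business object, data entity, attribute.
WIDTHS = (6, 12, 12, 18)


def pack_classification(subject: int, obj: int, entity: int, attr: int) -> int:
    """Concatenate the four level identifiers into one 48-bit integer."""
    code = 0
    for value, width in zip((subject, obj, entity, attr), WIDTHS):
        if not 0 <= value < (1 << width):
            raise ValueError("{} does not fit in {} bits".format(value, width))
        code = (code << width) | value
    return code


def to_text(code: int, bits: int = 48) -> str:
    """Split the code into 6-bit groups, most significant first, and map each to a character."""
    if bits % 6 != 0:
        raise ValueError("bit length must be a multiple of 6")
    return "".join(
        ALPHABET[(code >> shift) & 0x3F] for shift in range(bits - 6, -1, -6)
    )


# Hypothetical identifiers at each of the four levels.
EXAMPLE = to_text(pack_classification(1, 2, 3, 4))
```

The quoted character set is in ascending ASCII order, so for equal-length strings a plain byte-wise comparison orders the text codes the same way as the underlying 48-bit integers, which matters when the string is used as a sortable key.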

[0055] In this embodiment, the classification code serves as the database partition key. It not only provides standardized semantic tags for multi-source heterogeneous data in coal mines, enabling data with originally disparate formats, such as drilling footage data, stress monitoring curves, and gas concentration time-series records, to be logically aggregated under a unified classification framework, but also significantly improves storage and transmission efficiency while ensuring information integrity through fixed-length binary encoding and Base64 compression mapping. This lays a compact and efficient structural foundation for subsequent fusion and splicing with spatial and temporal encoding.

[0056] In some embodiments, the hierarchical grid unit that divides the spatial region to which the target data belongs includes: dividing the spatial region into multiple levels of square grid units; wherein the highest level grid unit covers the entire space of the target data, and the lowest level grid unit corresponds to millimeter-level spatial accuracy.

[0057] In coal mine geoscience big data, spatial information not only involves the macro-deployment of the mining area's surface over a range of several kilometers, but also covers the centimeter-level or even millimeter-level fine positioning of underground roadways, borehole structures, and rock strata interfaces. Although traditional spatial indexes have the ability to divide into multi-level grids, their encoding methods have inherent defects in maintaining local proximity, making it difficult to meet the dual needs of rapid overview of a large area and high-precision local retrieval.

[0058] To address this issue, this embodiment employs a multi-scale spatial grid partitioning strategy based on the Geohash principle in the spatial dimension. In some examples, the Earth's surface or the target mining area can be divided into 14 levels of square grid units. Taking two-dimensional spatial data as an example, the spatial resolution can be refined from 5000 kilometers at level 1 to 5 millimeters at level 14, forming a complete hierarchical system that covers the entire spatial domain down to the microscopic scale. Each grid unit is represented by longitude (lon) and latitude (lat) coordinates and is converted into a unique one-dimensional spatial key through spatial indexing to optimize storage and retrieval efficiency. As the level increases, the spatial grid is gradually refined, making the data storage granularity more precise. To keep the encoding uniform and consistent, the spatial coordinate range of each grid unit is normalized according to specific rules so that spatial calculation and matching remain efficient at different scales. In some examples, the grid units can be further subdivided into finer units, and the longitude and latitude coordinate ranges are standardized as follows: the spatial extent of a grid cell is determined by calculating its minimum bounding rectangle (MBR) from the longitude and latitude resolutions of the grid cells at the current level. By normalizing the longitude range to grid coordinates at the current level and mapping the latitude range accordingly, this embodiment ensures the consistency and computability of coordinate transformations between different grid levels.
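As a rough illustration of normalizing coordinates to grid coordinates at one level and deriving a cell's MBR, consider the sketch below. The function names, the use of degrees as the resolution unit, and the grid origin at (-180, -90) are all assumptions made for the example; the text does not specify them.

```python
import math


def grid_cell(lon, lat, res_lon, res_lat, lon_min=-180.0, lat_min=-90.0):
    """Return the (column, row) of the grid cell containing a point.

    res_lon and res_lat are the cell sizes at the current level, assumed
    here to be expressed in degrees.
    """
    col = math.floor((lon - lon_min) / res_lon)
    row = math.floor((lat - lat_min) / res_lat)
    return col, row


def cell_mbr(col, row, res_lon, res_lat, lon_min=-180.0, lat_min=-90.0):
    """Return the cell's MBR as (west, south, east, north)."""
    west = lon_min + col * res_lon
    south = lat_min + row * res_lat
    return west, south, west + res_lon, south + res_lat


# A point in a 1-degree grid (illustrative resolution only).
col, row = grid_cell(116.3, 39.9, 1.0, 1.0)
print(col, row, cell_mbr(col, row, 1.0, 1.0))
```

Halving `res_lon` and `res_lat` at each finer level would keep the cells of one level nested inside those of the previous level, which is one simple way to obtain consistent transformations between levels.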

[0059] For different spatial entities, the MBR can effectively handle their representation and encoding in the spatial grid. Figure 4 is a schematic diagram of the index region distribution of different spatial entities provided in an embodiment of this application, and it further illustrates how different spatial entities are distributed in the grid: Figure 4a shows a point element occupying a single grid cell; Figure 4b shows a line element within a single grid cell; Figure 4c shows a line element that spans multiple grid cells; Figure 4d shows a surface element located within a single MBR; Figure 4e shows a surface element distributed across multiple grid regions; and Figure 4f shows the spatial distribution of a volume element across multiple grid cells. This application uses a multi-level grid partitioning mechanism so that spatial coding can dynamically select the optimal resolution based on data density and query requirements. This supports both large-scale overview retrieval at the mining area level and precise positioning down to millimeter-level geological structural details, laying a fine and scalable grid foundation for subsequent spatial dimensionality reduction and proximity-preserving coding.
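For entities that extend over several cells, as in the line, surface, and volume cases above, one simple approach is to enumerate every grid cell that the entity's MBR touches. The sketch below assumes a square cell size in degrees and a grid origin at (-180, -90); both, like the function name, are illustrative assumptions rather than details from the text.

```python
import math


def covering_cells(mbr, res, origin=(-180.0, -90.0)):
    """List the (column, row) cells intersected by an MBR.

    mbr is (min_lon, min_lat, max_lon, max_lat); res is the cell size at
    the chosen level (assumed square and expressed in degrees).
    """
    min_lon, min_lat, max_lon, max_lat = mbr
    c0 = math.floor((min_lon - origin[0]) / res)
    c1 = math.floor((max_lon - origin[0]) / res)
    r0 = math.floor((min_lat - origin[1]) / res)
    r1 = math.floor((max_lat - origin[1]) / res)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


# A point entity falls in one cell; a line spanning two degrees of longitude covers three.
print(covering_cells((116.3, 39.9, 116.3, 39.9), 1.0))
print(covering_cells((116.5, 39.5, 118.2, 39.7), 1.0))
```

Choosing a coarser `res` for large entities keeps the number of covering cells small, which is one way to read the dynamic choice of resolution described above.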

[0060] In some embodiments, determining the corresponding divided grid cells based on the spatial coordinates of the target data and mapping the index of the grid cells to a one-dimensional spatial code includes: calculating the grid level where the target data is located and the grid coordinates within the level based on the latitude and longitude coordinates of the target data; and calling a space filling curve algorithm to convert the grid coordinates into a one-dimensional binary spatial code.

[0061] After completing the multi-scale spatial grid division, how to efficiently map two-dimensional or even three-dimensional spatial coordinates into one-dimensional indexable coding keys, while ensuring that spatially adjacent data points still maintain their proximity relationship in the one-dimensional coding sequence, is the key bottleneck for improving spatial query performance.

[0062] To address this issue, this embodiment introduces a Hilbert space-filling curve at the spatial coding layer. Table 1 shows the spatial dimension coding algorithm provided in this embodiment. As shown in Table 1, first, based on the latitude and longitude coordinates (lon, lat) of the target data and the spatial resolution required for the current query or storage, the grid level to which the target data belongs and the grid coordinates within that level are determined. Then, the Hilbert curve recursive mapping function is called to convert the two-dimensional grid coordinates into a one-dimensional binary sequence. In some examples, this embodiment follows the continuous mapping rule of the Hilbert curve, so that points with closer Euclidean distances in two-dimensional space also have correspondingly closer Hamming distances in the one-dimensional Hilbert code, thereby reducing the I/O overhead of range queries and nearest neighbor searches. This embodiment also uses 60 binary bits to encode the latitude and longitude position, corresponding to 10 Hilbert curve levels, with each 6-bit unit representing one level; this multi-level recursive mapping achieves continuous coding coverage from the global range down to millimeter-level precision. To further improve storage and retrieval efficiency, the 60-bit binary spatial code is then divided into 6-bit units and mapped to the Base64 character set to generate a compact spatial code key in string format. The spatial index region diagram in Figure 4 illustrates the scenarios to which this mapping mechanism applies: whether the entity is a single-grid point entity as in Figure 4a or a cross-grid entity as in Figure 4f, a unique one-dimensional spatial code can be obtained through the Hilbert curve.
Spatial objects that span multiple grid boundaries, as in Figures 4c, 4e, and 4f, are associated with all intersecting grid cells, and deduplication is applied during query processing to eliminate the effect of duplicate data. Furthermore, spatial coding supports a dynamic resolution adjustment strategy, achieving an adaptive balance between storage cost and query performance based on data density and query load. This ensures that the coding mechanism maintains efficient access performance when processing multi-scale spatial data such as high-frequency points in underground coal mine sensor networks, fine borehole structures, and large-scale mining area distributions.
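The conversion from grid coordinates to a one-dimensional code can be sketched with the widely used iterative form of the Hilbert mapping. This is a sketch under stated assumptions, not the algorithm of Table 1: the linear quantization of longitude and latitude to 30 bits per axis (giving the 60-bit code) and the function names are assumptions made for illustration.

```python
ALPHABET = "0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"


def hilbert_index(x, y, order):
    """Map grid coordinates (x, y) on a 2^order x 2^order grid to a Hilbert index."""
    n = 1 << order
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("coordinates out of range for this order")
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or reflect the quadrant so the next level is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def quantize(value, lo, hi, bits):
    """Assumed linear quantization of a coordinate to an integer cell index."""
    n = 1 << bits
    cell = int((value - lo) / (hi - lo) * n)
    return min(max(cell, 0), n - 1)


def spatial_code(lon, lat, order=30):
    """Return the spatial code string: 2*order bits, 6 bits per character."""
    n_bits = 2 * order
    if n_bits % 6 != 0:
        raise ValueError("2 * order must be a multiple of 6")
    x = quantize(lon, -180.0, 180.0, order)
    y = quantize(lat, -90.0, 90.0, order)
    d = hilbert_index(x, y, order)
    return "".join(ALPHABET[(d >> shift) & 0x3F] for shift in range(n_bits - 6, -1, -6))


print(spatial_code(116.3, 39.9))  # a 10-character key
```

In this form the leading bits of the index identify the coarser cell that contains the point, so a shorter prefix of the string corresponds to a larger grid cell; each character covers three subdivision steps per axis.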

[0063]

[0064] In some embodiments, dividing the time information of the target data according to a preset time granularity, mapping each time sub-unit in the time information to binary code and then combining them in sequence, includes: dividing the time into one or more levels of year, month, day, hour, minute, second, and millisecond; generating corresponding binary codes according to the time values of each level; combining the binary codes in order from year to millisecond, and padding the bits of the combined binary sequence to obtain the time code.

[0065] In coal mine geoscience big data, time information exhibits significant multi-granularity characteristics. Specifically, borehole exploration data is usually recorded to the "day" granularity, mine microseismic monitoring needs to be accurate to the "millisecond" level, while real-time sensing data such as gas concentration and stress changes are sampled with a "second" or "millisecond" period. Traditional time indexing methods mostly use a single-precision timestamp format, which makes it difficult to take into account the timeliness differences of different exploration and production data under the same coding framework. This results in coarse-grained queries needing to scan a large number of fine-grained records, and fine-grained retrieval being inefficient due to the mismatch of index granularity.

[0066] To address this issue, this application constructs a multi-scale time subdivision coding system at the time coding layer. Specifically, the continuous time dimension is uniformly subdivided into seven levels: year (Y), month (M), day (D), hour (H), minute (m), second (s), and millisecond (ms). Each level is assigned a specific binary bit length and value range, as follows: the year corresponds to 12 bits, with a range of 2000-4096; the month corresponds to 4 bits, with a range of 1-12, and 13, 14, and 15 reserved as extension values; the day corresponds to 5 bits, with a range of 1-31; the hour corresponds to 5 bits, with a range of 0-23, and 25-31 reserved as extension values; the minute corresponds to 6 bits, with a range of 0-59, and 61-63 reserved as extension values; the second corresponds to 6 bits, also with 61-63 reserved; and the millisecond corresponds to 10 bits, with a range of 0-999, and 1000-1023 reserved as extension values. This explicit reservation of unused encoded values leaves room for future expansion of time granularity or for special time identifiers, keeping the encoding system extensible. The time components are then connected in order to generate a unique and semantically clear MSTC time code, determined by $T = Y \,\|\, M \,\|\, D \,\|\, H \,\|\, m \,\|\, s \,\|\, ms$, where $T$ is the time code.

[0067] Figure 5 is a time-coding diagram provided in an embodiment of this application. As shown in Figure 5, in the encoding generation stage, taking the timestamp "2024-12-31 23:59:59.999" as an example, each time sub-unit is first converted into a binary code of the corresponding bit length. For example, the year 2024 is converted into "11111101000" and padded with zeros at the high bits to extend it to 12 bits, "011111101000"; the month 12 is converted into "1100", the day 31 into "11111", the hour 23 into "10111", the minute 59 into "111011", and the second 59 into "111011"; and the millisecond 999 is converted into "1111100111" and padded with zeros at the high bits to extend it to 12 bits, "001111100111". The binary sequences are then combined in coarse-to-fine order from year to millisecond. For the year and millisecond units (both exceeding 6 bits), an even-division padding strategy is applied: the 12-bit year code is split into two 6-bit units, "011111" and "101000", and the 12-bit millisecond code is split into two 6-bit units, "001111" and "100111". Finally, the sequences are concatenated in the order of year (high-order bits), year (low-order bits), month, day, hour, minute, second, millisecond (high-order bits), and millisecond (low-order bits) to form a complete binary time code sequence. This sequence is then divided into 6-bit units and mapped to the Base64 character set, generating the compact 9-character time code string "TbATLuuDa" shown at the bottom of Figure 5.
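The padding and splitting steps of this worked example can be reproduced with a short sketch. The function name is hypothetical, and the assumption that month, day, and hour are each left-padded to a full 6-bit unit is inferred from the nine-unit result described above; the final mapping of each unit to a character is deliberately left out.

```python
from datetime import datetime


def time_units(ts):
    """Return the nine 6-bit units of the time code as bit strings.

    Year and millisecond are padded to 12 bits (two units each); every
    other field is padded to one 6-bit unit.
    """
    if not 0 <= ts.year < 4096:
        raise ValueError("year does not fit in 12 bits")
    millis = ts.microsecond // 1000
    fields = [
        (ts.year, 12),
        (ts.month, 6),
        (ts.day, 6),
        (ts.hour, 6),
        (ts.minute, 6),
        (ts.second, 6),
        (millis, 12),
    ]
    # Concatenate coarse to fine, then cut the 54-bit sequence into 6-bit units.
    bits = "".join(format(value, "0{}b".format(width)) for value, width in fields)
    return [bits[i:i + 6] for i in range(0, len(bits), 6)]


units = time_units(datetime(2024, 12, 31, 23, 59, 59, 999000))
print(units)
# ['011111', '101000', '001100', '011111', '010111', '111011', '111011', '001111', '100111']
```

The first two and last two units match the year and millisecond halves given in the example, and the total of 54 bits corresponds to the nine characters of the time code.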

[0068] This embodiment uses this encoding mechanism to enable prefix-matching queries. Because the time encoding strictly follows a hierarchical order from year to millisecond, coarse-grained time ranges can be converted directly into the corresponding encoding prefixes for rapid location. For example, for the query "2024-12-31 23:00 to 2024-12-31 23:59", the start time encoding is "TbATL0" and the end time encoding is "TbATLu". Both share the prefix "TbATL", so the database only needs to perform a wildcard query or a prefix range scan on "TbATL" to hit all the data within the time period, without converting the query boundaries into a complex time range condition. This embodiment not only solves the problem that single-precision time encoding cannot reflect the complete timeliness information of the data, but also avoids the conversion overhead of multi-scale time conditions during querying, so that millisecond-level response performance can be obtained under the same encoding system, from interannual trend analysis to millisecond-level microseismic event retrieval.
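The step from two boundary codes to a shared prefix can be sketched as below. The helper names and the sample keys are hypothetical; only the boundary codes "TbATL0" and "TbATLu" come from the text. The sketch assumes both boundary codes have the same length and relies on the character set being in ascending ASCII order, so that plain string comparison matches the code order.

```python
import os


def shared_prefix(start_code, end_code):
    """Longest common prefix of the two boundary codes."""
    return os.path.commonprefix([start_code, end_code])


def prefix_scan(keys, start_code, end_code):
    """Select keys by shared prefix, then trim to the exact boundaries.

    The prefix match can return a superset when the range does not cover
    the whole prefix, so the boundary comparison is kept as a final filter.
    """
    prefix = shared_prefix(start_code, end_code)
    n = len(start_code)
    return [
        key for key in keys
        if key.startswith(prefix) and start_code <= key[:n] <= end_code
    ]


# Hypothetical 9-character time codes used only to exercise the functions.
keys = ["TbATKu000", "TbATL0000", "TbATLuuDa", "TbAU00000"]
print(shared_prefix("TbATL0", "TbATLu"))      # prints "TbATL"
print(prefix_scan(keys, "TbATL0", "TbATLu"))  # prints ['TbATL0000', 'TbATLuuDa']
```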

[0069] In some embodiments, the step of fusing the classification code, spatial code, and temporal code in a predetermined order to generate a unified spatiotemporal classification code includes: concatenating the Base64 format strings corresponding to the classification code, spatial code, and temporal code in the order of classification, space, and time to generate the unified spatiotemporal classification code.

[0070] After generating classification, spatial, and temporal codes independently, it is necessary to organically integrate the coding information of these three dimensions into a unified identifier, so that it can not only carry complete spatiotemporal classification semantics, but also be used directly as an efficient index key in the database system.

[0071] To address this issue, this application employs a composite key mechanism in the unified coding model for spatiotemporal classification. Figure 6 is a schematic diagram of the key-value storage model for geophysical big data provided in an embodiment of this application. As shown in Figure 6, the classification code, spatial code, and temporal code are concatenated in the fixed order of classification, space, and time to generate a globally unique MSTC unified code. Here, S represents spatial coding, which uses Hilbert space-filling curves to ensure the locality and continuity of data; T represents temporal coding, which uses a multi-scale time segmentation strategy to organize data at different time granularities; and C represents classification coding, which uses a hierarchical classification system to distinguish between thematic data and business data in order to improve query efficiency and data management capabilities. Figure 7 is a schematic diagram of the MSTC spatiotemporal classification coding provided in an embodiment of this application. As shown in Figure 7, the unified spatiotemporal classification code has a total length of 27 Base64 characters. The first 8 characters are the classification code, obtained by Base64 mapping of the 48-bit binary classification code; the middle 10 characters are the spatial code, the Base64 representation of the 60-bit binary Hilbert spatial code; and the last 9 characters are the time code, obtained by Base64 mapping of the multi-granularity binary time sequence. In this embodiment, the concatenation order is determined with full consideration of the query characteristics of coal mine geoscience data.
Specifically, the classification code is placed at the beginning to facilitate rapid domain-level filtering by business theme; the spatial code is in the middle to support efficient execution of spatial range queries and proximity searches; and the time code is placed at the end to support prefix matching scanning of time ranges by utilizing the ordered nature of string suffixes.
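The fixed 8 + 10 + 9 character layout makes fusing and splitting the unified code a matter of string slicing, as the following sketch shows. The names `MSTC`, `fuse`, and `split` are hypothetical, and the classification and spatial strings in the usage lines are placeholders rather than codes from the text; only the time code "TbATLuuDa" is taken from the example.

```python
from typing import NamedTuple


class MSTC(NamedTuple):
    classification: str  # 8 characters (48 bits)
    spatial: str         # 10 characters (60 bits)
    temporal: str        # 9 characters (54 bits)


LENGTHS = (8, 10, 9)


def fuse(classification, spatial, temporal):
    """Concatenate the three codes in classification, space, time order."""
    parts = (classification, spatial, temporal)
    for part, length in zip(parts, LENGTHS):
        if len(part) != length:
            raise ValueError("expected {} characters, got {!r}".format(length, part))
    return "".join(parts)


def split(code):
    """Recover the three components from a 27-character unified code."""
    if len(code) != sum(LENGTHS):
        raise ValueError("a unified code has 27 characters")
    return MSTC(code[:8], code[8:18], code[18:])


key = fuse("10203004", "0000000000", "TbATLuuDa")
print(key)         # prints "102030040000000000TbATLuuDa"
print(split(key))
```

With fixed-width segments, a prefix of 8 characters selects a classification, a prefix of 8 to 18 characters additionally narrows the spatial cell, and a longer prefix narrows the time range within that cell.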

[0072] Through this fusion mechanism, this embodiment compresses the classification, spatial, and temporal attributes, which were originally scattered across different fields, into a single sortable, comparable, and hashable string key. This key can be used directly as the primary key, partition key, or first column of a composite index in a database table. In different types of database engines such as MySQL, PostgreSQL, MongoDB, Cassandra, HBase, and Apache Iceberg, it can be located efficiently by leveraging the key-ordering characteristics of B+ trees, LSM trees, or hash indexes. The unified code is not only a unique identifier for the data but also a computable information carrier that holds the complete business semantics, spatial location, and temporal validity of the data. This provides a compact, unified, and high-performance index infrastructure for subsequent range queries based on encoding prefixes, cross-source data association analysis, and the construction of intelligent data lakehouses.

[0073] In some embodiments, after generating the spatiotemporal classification unified code, the method further includes: in response to a received spatiotemporal query request or classification retrieval request, converting the query conditions in the request into the corresponding spatiotemporal classification unified code or code prefix range; and locating the corresponding data page or data partition based on the physical index structure established by the spatiotemporal classification unified code to perform a data retrieval operation.

[0074] After constructing the MSTC unified encoding and using it as the primary key, partition key, or first column of a composite index in the database, it is also necessary to use this encoding structure to achieve efficient mapping from user query requests to the underlying physical data pages.

[0075] To address this issue, this application establishes a fast location mechanism based on encoding prefixes at the data retrieval layer. Specifically, when a spatiotemporal query request or a classification retrieval request is received, the spatiotemporal range, classification attributes, and other constraints in the query conditions are first converted into the corresponding unified encoding or encoding prefix range according to the MSTC encoding rules. Subsequently, the database engine directly utilizes the physical index structure built based on MSTC unified encoding to quickly locate the corresponding data page or data partition through the key-value ordering, thereby completely avoiding full table scans.
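A prefix lookup against an ordered primary key can be sketched with SQLite from the Python standard library. SQLite is used here only because it is self-contained; it is not one of the databases tested in this application, and the table name, column names, and sample keys are hypothetical. The upper bound is built by appending "{", the ASCII character immediately after "z", which is the largest character in the code alphabet.

```python
import sqlite3


def prefix_bounds(prefix):
    """Half-open key range [lo, hi) holding exactly the keys that start with prefix."""
    return prefix, prefix + "{"


def build_demo_db(keys):
    """Create an in-memory table whose primary key is the unified code."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (mstc TEXT PRIMARY KEY, payload TEXT) WITHOUT ROWID"
    )
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)", [(key, "demo") for key in keys]
    )
    return conn


def find_by_prefix(conn, prefix):
    """Range scan on the primary key instead of a full table scan."""
    lo, hi = prefix_bounds(prefix)
    cursor = conn.execute(
        "SELECT mstc FROM records WHERE mstc >= ? AND mstc < ? ORDER BY mstc",
        (lo, hi),
    )
    return [row[0] for row in cursor]


# Placeholder 27-character keys: classification (8) + spatial (10) + time (9).
demo_keys = [
    "102030040000000000TbATKu000",
    "102030040000000000TbATL0000",
    "102030040000000000TbATLuuDa",
    "102030040000000001TbATL0000",
]
conn = build_demo_db(demo_keys)
print(find_by_prefix(conn, "102030040000000000TbATL"))
```

Because the time code sits at the end of the key, a time prefix narrows the scan only once the classification and spatial segments in front of it are fixed; queries that leave the spatial segment open would need one range per candidate cell.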

[0076] To comprehensively verify the effectiveness and universality of MSTC coding in coal mine geoscience big data management, in a specific embodiment, an Ubuntu 20.04 operating system was deployed under VMware Workstation on a Precision 7920 Tower high-performance computing environment configured with dual Intel Xeon Silver 4214R CPUs, 128 GB RAM, an 8 GB Nvidia Quadro RTX 4000 GPU, a 512 GB SSD, and a 4 TB HDD. A comparative test platform covering six mainstream database systems was also built. Table 2 shows the database test environment and the specific versions used in this embodiment.

[0077]

[0078] The experiment used two-dimensional spatial distribution data of drilling points in a coal mining area, with a latitude and longitude precision of 14 decimal places, a spatial code length of 10 characters, and an MSTC unified code total length of 27 characters. Four datasets of different sizes were generated to comprehensively evaluate index building time, storage space overhead, and query response performance.

[0079] Regarding index building and storage costs, Figure 8 is a schematic diagram of the database usage for different datasets provided in an embodiment of this application. Figure 8a compares write time: Iceberg, a large-table format combined with Spark and distributed storage on the Hadoop Distributed File System (HDFS), exhibits the best write efficiency, followed by MongoDB, whereas the write time of Cassandra increases significantly at larger data volumes, to more than 10 times that of Iceberg. MySQL and PostgreSQL, as relational databases, write quickly when the data volume is small, but their write efficiency decreases significantly as the data scale increases. Figure 8b compares write space usage: for the same amount of data, HBase has the highest storage space usage, about 5 times that of MySQL; MySQL has the smallest usage due to its row storage method; PostgreSQL and MongoDB lie in between; and Iceberg performs well in storage space by combining columnar storage with partitioning optimization strategies. Figure 9 is a schematic diagram of the time and space consumption of index construction in different databases provided in an embodiment of this application. Figure 9a shows the time consumed by index building: MySQL is relatively slow, while Cassandra's index creation time is short but involves a dynamic index building process, and the entire indexing process for the dataset takes approximately one hour. Figure 9b shows the space used by index building: the space usage of the four databases is basically the same, with Cassandra slightly higher, and it grows linearly across the datasets.

[0080] Regarding query performance, the comparative experiments in Figures 10 to 15 and Table 3 verify the effectiveness of the physical indexing and positioning mechanism based on the MSTC encoding prefix. Specifically, Figure 10 is a schematic diagram of a MySQL query test provided in an embodiment of this application. Figure 10a shows that, without an index, the four query methods (MSTC query, fuzzy MSTC query, compound query, and range query) perform similarly, and the query time increases exponentially with the data size. Figure 10b shows that, after indexing, MSTC queries and fuzzy MSTC queries achieve a nearly 100-fold performance improvement on the dataset, with response time reduced to milliseconds, and remain faster than compound queries and range queries as the data size increases. Figure 11 is a schematic diagram of a PostgreSQL query test provided in an embodiment of this application. Figure 11a shows that, even on a large dataset, MSTC queries still outperform the other query methods when no index is created. Figure 11b shows that, after the index is created, both MSTC queries and compound queries improve significantly: query times on the dataset consistently remain below 0.001 seconds, and the fuzzy MSTC query in particular is 100 times faster than the traditional range query. Figure 12 is a schematic diagram of a MongoDB query test provided in an embodiment of this application. Figure 12a shows that MSTC queries are consistently faster than range queries and compound queries even without indexes. Figure 12b shows that, after the index is created, MSTC queries are approximately 7.5 times faster than range queries and MongoDB's built-in geospatial index 2dsphere on one dataset, and approximately 10,000 times faster than 2dsphere on another.
Figure 13 is a schematic diagram of an HBase query test provided in an embodiment of this application. Figure 13a shows that, when no row key is used, the query time increases linearly with the dataset size and a timeout occurs on the dataset; Figure 13b shows that, with the MSTC encoding as the row key, queries stay below 10 ms across all datasets, while traditional range queries show an exponential increase in time. Figure 14 is a schematic diagram of a Cassandra query test provided in an embodiment of this application. Figure 14a shows that, after the dataset is indexed with MSTC, query speed improves significantly, while traditional range queries show no improvement. Figure 14b shows that the MSTC query on the dataset achieves the greatest performance improvement. Table 3 shows the query times after creating the partition key in Cassandra according to an embodiment of this application. As shown in Table 3, when no partition key is specified, a range query takes approximately 0.5269 seconds on one dataset and increases to 4.5239 seconds on a larger one, and queries on datasets above that size time out because of resource bottlenecks triggered by full table scans; after an index is created, however, MSTC queries on the dataset still maintain a response time of less than 0.01 seconds. Figure 15 is a schematic diagram of an Apache Iceberg query test provided in an embodiment of this application. Figure 15a shows that, without partitions, the query time increases exponentially as the dataset size increases. Figure 15b shows that, after partitioning on the first four characters of the MSTC spatial encoding, the query time is significantly reduced and stays between 0.2 and 0.3 seconds regardless of the dataset size.

[0081]

[0082] Through this cooperation between encoding conversion and index positioning, this embodiment allows a wildcard query with the prefix "TbATL" to directly locate all microseismic events from 23:00 to 23:59 on December 31, 2024. It also allows spatial retrieval of goaf areas across multiple grid levels to perform deduplication and aggregation without traversing the entire table. By pre-compressing complex multidimensional spatiotemporal constraints at the encoding layer and directly using the computable prefix properties of ordered string keys at the storage layer, this application achieves a one-step index jump from query intent to physical data pages, providing millisecond-level data access for high-frequency dynamic updates and real-time early warning of geological hazards on the transparent geological platform.

[0083] Secondly, embodiments of this application provide an indexing system for spatiotemporal data classification, which facilitates improved data indexing efficiency.

[0084] As shown in Figure 16, embodiments of this application also provide an indexing system for spatiotemporal data classification, including: a classification coding unit 31, a spatial coding unit 32, a temporal coding unit 33, and a fusion unit 34.

[0085] Among them, the classification coding unit 31 is used to encode the target data to be indexed based on a preset hierarchical classification system and generate classification codes;

[0086] Spatial coding unit 32 is used to divide the spatial region to which the target data belongs into hierarchical grid units, determine the corresponding divided grid units according to the spatial coordinates of the target data, and map the index of the grid unit to a one-dimensional spatial code;

[0087] The time encoding unit 33 is used to divide the time information of the target data according to a preset time granularity, and to map each time sub-unit in the time information into binary codes and combine them in sequence to obtain the time code;

[0088] The fusion unit 34 is used to fuse the classification code, spatial code and temporal code in a predetermined order to generate a unified spatiotemporal classification code. The unified spatiotemporal classification code is used as the primary key, partition key or the first column of the composite index of the database table and written into the database storage engine to establish a physical index structure based on the unified spatiotemporal classification code.

[0089] The spatiotemporal classification indexing system provided in this application solves the problems of lacking a unified coding system and difficulty in achieving cross-type data collaborative management by constructing a preset hierarchical classification system to generate classification codes. It provides standardized semantic identifiers and association links for multi-source heterogeneous data such as structured numerical data and unstructured images and point clouds. Secondly, by dividing the data into hierarchical grid units and mapping them to one-dimensional spatial codes, it overcomes the limitations of traditional index spatial division accuracy and inability to take into account both large-scale mining areas and local high-precision positioning, achieving cross-scale high-precision spatial positioning while maintaining the spatial locality of the data. Thirdly, by generating time codes through the combination of binary codes of multi-granularity time sub-units, it solves the defects of time index granularity being single and difficult to support efficient access to multi-scale continuous time data. Finally, by merging the three types of codes into a unified spatiotemporal classification code and directly writing it into the storage engine as the primary key, partition key, or first column of the composite index, it overcomes the drawbacks of traditional index structure being complex, having high update costs, and being unsuitable for high-frequency dynamic data flow management. By unifying the index structure and storage structure, it significantly reduces index maintenance overhead and achieves real-time writing and efficient retrieval of massive monitoring data. In summary, this application improves data indexing efficiency by constructing a compact indexing system that integrates classification, space, and time.

[0090] In some embodiments, the classification coding unit includes: an acquisition module, configured to acquire the attribution identifier of target data in the subject domain, business object, data entity and attribute sequence based on the hierarchical classification system defined according to the coal mine data fusion and sharing specification; a mapping module, configured to map each attribution identifier to a binary code segment of a preset number of bits; and a splicing module, configured to splice each binary code segment in hierarchical order from subject domain to attribute sequence to obtain the classification code.

[0091] In some embodiments, the spatial coding unit is specifically used to: divide the spatial region into multiple levels of square grid units; wherein the highest level grid unit covers the entire space of the target data, and the lowest level grid unit corresponds to millimeter-level spatial accuracy.

[0092] This application discloses a unified MSTC coding indexing method for multi-source heterogeneous data in coal mines. By integrating classification, spatial, and temporal three-dimensional coding, a compact indexing system integrating standardized management and efficient retrieval is constructed. Specifically, in the classification dimension, based on a multi-level logical framework of "subject area—business object—data entity—attribute sequence," it covers four major areas: basic, production, safety, and management, subdivided into 59 subjects, 222 business objects, 1047 data entities, and 12547 attributes, forming a clearly structured and scalable coding system. In the spatial dimension, a Geohash multi-scale grid partitioning and Hilbert curve mapping strategy is adopted to divide the two-dimensional space into 14 levels of grids, achieving cross-scale high-precision positioning from 5000 kilometers to 5 millimeters, while maintaining local data continuity to improve storage and retrieval efficiency. In the temporal dimension, a multi-level subdivided coding system of year, month, day, hour, minute, second, and millisecond is constructed, achieving unified temporal coding through multi-scale temporal granularity partitioning to adapt to the temporal requirements of different exploration and production data. In summary, the MSTC unified coding model in this embodiment demonstrates significant advantages in terms of coal mine geoscience data query speed, retrieval accuracy, and storage efficiency. It can effectively support the efficient dynamic updates of the transparent geology platform and significantly improve the real-time early warning capability and intelligent decision-making level of coal mine geological disaster hazards.

[0093] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0094] The embodiments in this specification are described in a related manner. For parts that are the same or similar across embodiments, the embodiments may be cross-referenced; each embodiment focuses on its differences from the others.

[0095] In particular, the device embodiment is basically similar to the method embodiment, so the description is relatively simple. For relevant details, please refer to the description of the method embodiment.

[0096] For ease of description, the above apparatus is described in terms of separate functional units/modules. Of course, when implementing this application, the functions of the units/modules may be implemented in one or more pieces of software and/or hardware.

[0097] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. The storage medium can be a magnetic disk, optical disk, read-only memory (ROM), or random access memory (RAM), etc.

[0098] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. An indexing method for spatiotemporal data classification, characterized in that it comprises: encoding target data to be indexed based on a pre-defined hierarchical classification system to generate a classification code; dividing the spatial region to which the target data belongs into hierarchical grid cells, determining the corresponding grid cell according to the spatial coordinates of the target data, and mapping the index of the grid cell to a one-dimensional spatial code; dividing the time information of the target data according to a preset time granularity, mapping each time sub-unit in the time information to a binary code, and combining the binary codes in sequence to obtain a time code; merging the classification code, the spatial code, and the time code in a predetermined order to generate a unified spatiotemporal classification code; and using the unified spatiotemporal classification code as the primary key, the partition key, or the first column of a composite index of a database table and writing it into the database storage engine to establish a physical index structure based on the unified spatiotemporal classification code.

2. The indexing method for spatiotemporal data classification according to claim 1, characterized in that encoding the target data to be indexed based on the pre-defined hierarchical classification system to generate the classification code comprises: obtaining, based on the hierarchical classification system defined according to the coal mine data fusion and sharing specifications, the affiliation identifiers of the target data at the subject area, business object, data entity, and attribute sequence levels; mapping each of the affiliation identifiers to a binary code segment of a preset number of bits; and concatenating the binary code segments in hierarchical order from the subject area to the attribute sequence to obtain the classification code.

3. The indexing method for spatiotemporal data classification according to claim 1, characterized in that dividing the spatial region to which the target data belongs into hierarchical grid cells comprises: dividing the spatial region into multiple levels of square grid cells, wherein the highest-level grid cell covers the entire spatial domain of the target data, and the lowest-level grid cell corresponds to millimeter-level spatial accuracy.

4. The indexing method for spatiotemporal data classification according to claim 1, characterized in that determining the corresponding grid cell according to the spatial coordinates of the target data and mapping the index of the grid cell to a one-dimensional spatial code comprises: calculating, based on the latitude and longitude coordinates of the target data, the grid level at which the target data is located and the grid coordinates within that level; and invoking a space-filling curve algorithm to convert the grid coordinates into a one-dimensional binary spatial code.

5. The indexing method for spatiotemporal data classification according to claim 1, characterized in that dividing the time information of the target data according to a preset time granularity, mapping each time sub-unit in the time information to a binary code, and combining the binary codes in sequence comprises: dividing the time information into one or more levels selected from year, month, day, hour, minute, second, and millisecond; generating a corresponding binary code from the time value of each level; and combining the binary codes in order from year to millisecond and padding the combined binary sequence with bits to obtain the time code.

6. The indexing method for spatiotemporal data classification according to claim 1, characterized in that merging the classification code, the spatial code, and the time code in a predetermined order to generate the unified spatiotemporal classification code comprises: concatenating the Base64 format strings corresponding to the classification code, the spatial code, and the time code in the order of classification, space, and time to generate the unified spatiotemporal classification code.

7. The indexing method for spatiotemporal data classification according to claim 1, characterized in that, after generating the unified spatiotemporal classification code, the method further comprises: in response to a received spatiotemporal query request or classification retrieval request, converting the query conditions in the request into the corresponding unified spatiotemporal classification code or code prefix range; and locating, based on the physical index structure established from the unified spatiotemporal classification code, the corresponding data page or data partition to perform the data retrieval operation.

8. An indexing system for spatiotemporal data classification, characterized in that it comprises: a classification coding unit, configured to encode target data to be indexed based on a pre-defined hierarchical classification system to generate a classification code; a spatial coding unit, configured to divide the spatial region to which the target data belongs into hierarchical grid units, determine the corresponding grid unit according to the spatial coordinates of the target data, and map the index of the grid unit to a one-dimensional spatial code; a time coding unit, configured to divide the time information of the target data according to a preset time granularity, map each time sub-unit in the time information to a binary code, and combine the binary codes in sequence to obtain a time code; and a fusion unit, configured to merge the classification code, the spatial code, and the time code in a predetermined order to generate a unified spatiotemporal classification code, and to use the unified spatiotemporal classification code as the primary key, the partition key, or the first column of a composite index of a database table and write it into the database storage engine to establish a physical index structure based on the unified spatiotemporal classification code.

9. The indexing system for spatiotemporal data classification according to claim 8, characterized in that the classification coding unit comprises: an acquisition module, configured to obtain, based on the hierarchical classification system defined according to the coal mine data fusion and sharing specifications, the affiliation identifiers of the target data at the subject area, business object, data entity, and attribute sequence levels; a mapping module, configured to map each of the affiliation identifiers to a binary code segment of a preset number of bits; and a splicing module, configured to concatenate the binary code segments in hierarchical order from the subject area to the attribute sequence to obtain the classification code.

10. The indexing system for spatiotemporal data classification according to claim 8, characterized in that the spatial coding unit is specifically configured to divide the spatial region into multiple levels of square grid units, wherein the highest-level grid unit covers the entire spatial domain of the target data, and the lowest-level grid unit corresponds to millimeter-level spatial accuracy.