Disk image generation method, apparatus, device, and storage medium
By determining the disk image format from the read/write characteristics of the affected disk unit when data on the target disk changes, and mapping the changed data to the corresponding disk image in the storage device, the invention addresses the low load capacity of storage devices in the prior art and achieves efficient data reads, writes, and snapshot maintenance on the storage device.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- 张倩
- Filing Date
- 2022-01-18
- Publication Date
- 2026-06-26
AI Technical Summary
In existing technologies, disk images generated using virtual disk image formats and RAW formats increase the number of I/O operations on storage devices when reading or writing data, leading to a reduction in the storage device's load capacity.
When a change in target disk data is detected, the disk image format is determined from the read/write characteristics of the affected disk unit, and the changed data is mapped to the corresponding disk image in the storage device. Two formats are provided: a low-write disk image format and a multi-write disk image format. The data is stored in the low-write region main read/write file and the multi-write region latest file, respectively.
By optimizing the disk image format, the number of track relocations on the storage device is reduced, the load capacity of the storage device is improved, and the number of I/O operations needed for disk image snapshots is reduced, thereby improving the performance of the storage device.
Figure CN115374020B_ABST
Description
Technical Field
[0001] This invention relates to the field of disaster recovery technology, and in particular to a disk image generation method, apparatus, device, and storage medium.
Background Technology
[0002] Currently, when backing up the data on a target disk, a specific disk image format, such as the Virtual Hard Disk (VHD) format or the RAW image format, is generally used to map the disk data to the storage device. Disk images generated in these formats increase the number of I/O operations on the storage device when data is read from or written to the disk image, or when the disk image is maintained, which reduces the load capacity of the storage device. How to improve the load capacity of the storage device has therefore become an urgent technical problem.
[0003] The above content is only used to help understand the technical solution of the present invention and does not represent an admission that the above content is prior art.
Summary of the Invention
[0004] The main objective of this invention is to provide a disk image generation method, apparatus, device, and storage medium, which aims to solve the technical problem of low load capacity of existing storage devices.
[0005] To achieve the above objectives, the present invention provides a disk image generation method, the method comprising the following steps:
[0006] When a change is detected in the data on the target disk, the corresponding disk unit in the target disk is determined based on the data address of the changed data.
[0007] Determine the read / write characteristics of the disk unit, and determine the disk image format corresponding to the disk unit based on the read / write characteristics;
[0008] The disk change data is mapped to the corresponding disk image in the storage device according to the disk image format.
[0009] Optionally, the disk image format includes a low-write disk image format and a multi-write disk image format;
[0010] The step of determining the read / write characteristics of the disk unit and determining the disk image format corresponding to the disk unit based on the read / write characteristics includes:
[0011] The number of data reads and the number of data writes of the disk unit within a preset historical period are determined based on the historical read and write records of the disk unit;
[0012] When the number of data reads is greater than the number of data writes, the read/write characteristics of the disk unit are determined to be low-write characteristics, and the disk image format corresponding to the low-write characteristics is the low-write disk image format;
[0013] When the number of data reads is less than the number of data writes, the read/write characteristics of the disk unit are determined to be multi-write characteristics, and the disk image format corresponding to the multi-write characteristics is the multi-write disk image format.
[0014] Optionally, the disk image format includes a low-write disk image format and a multi-write disk image format; the step of determining the read/write characteristics of the disk unit and determining the disk image format corresponding to the disk unit based on the read/write characteristics includes:
[0015] Obtain the system parameters of the business system, and determine the read / write characteristics of the disk unit based on the system parameters;
[0016] When the read/write characteristic is a low-write characteristic, the disk image format corresponding to the disk unit is determined to be the low-write disk image format;
[0017] When the read/write characteristic is a multi-write characteristic, the disk image format corresponding to the disk unit is determined to be the multi-write disk image format.
[0018] Optionally, the disk image includes a low-write region main read/write file and a low-write region snapshot file;
[0019] The step of mapping the disk change data to the corresponding disk image in the storage device according to the disk image format includes:
[0020] When the read/write characteristics of the disk unit are low-write characteristics, obtain the low-write region main read/write file and the low-write region snapshot file in the storage device that correspond to the low-write disk image format;
[0021] Map the disk change data in the disk unit to the main read / write file in the low-write region;
[0022] Upon receiving a data write request, the mapped data in the main read / write file of the low-write region is read and written to the snapshot file of the low-write region.
[0023] Optionally, the low-write region main read/write file includes header information, a mapping table, and a read/write data area;
[0024] The step of mapping disk change data in the disk unit to the main read / write file in the low-write region includes:
[0025] The data information of the disk change data in the disk unit is converted into the header information of the low-write region main read/write file, and the block address corresponding to the disk change data is obtained;
[0026] Determine whether the block address exists in the mapping table of the low-write region main read/write file;
[0027] When the block address does not exist in the mapping table, storage space is allocated in the read/write data area of the low-write region main read/write file for the disk change data corresponding to the block address, and the block address and the address of the allocated storage space are written into the mapping table;
[0028] When the block address exists in the mapping table, the disk change data corresponding to the block address is read and written into the corresponding storage space in the read/write data area of the low-write region main read/write file.
[0029] Optionally, the low-write region main read/write file further includes a copy flag bit;
[0030] Upon receiving a data write request, the step of reading the disk change data from the main read / write file of the low-write region and writing it to the low-write region snapshot file includes:
[0031] When a data write request is received, it is determined whether the copy flag bit corresponding to the disk mapping data in the low-write region main read/write file is a valid flag bit;
[0032] When the copy flag bit is a valid flag bit, the disk mapping data is copied to the corresponding snapshot in the low-write region snapshot file to generate a disk image of the target disk;
[0033] When the copy flag bit is an invalid flag bit, storage space is allocated for the disk mapping data in the low-write region main read/write file to generate a disk image of the target disk.
[0034] Optionally, the disk image includes the latest file in the multi-write region and a snapshot file in the multi-write region;
[0035] The step of mapping the disk change data to the corresponding disk image in the storage device according to the disk image format includes:
[0036] When the read/write characteristics of the disk unit are multi-write characteristics, the latest file of the multi-write region is created in the storage device according to the multi-write disk image format;
[0037] Map the disk change data in the disk unit to the latest file in the multi-write region;
[0038] Upon receiving a snapshot creation request, the latest file of the multi-write region is closed and renamed to obtain a multi-write region snapshot file, and the latest file of the multi-write region is recreated in the storage device.
[0039] Optionally, the latest file in the multi-write region includes the latest file header information, the latest file write header information, the latest file read / write data area, and the latest file mapping information;
[0040] The step of mapping disk change data in the disk unit to the latest file in the multi-write region includes:
[0041] Obtain the data information and block address of the disk change data in the disk unit, generate the latest file header information based on the data information, and determine the latest file write header information based on the block address;
[0042] Generate the latest file mapping information of the latest file in the multi-write region based on the latest file write header information;
[0043] The disk change data is written to the latest file read / write data area of the latest file in the multi-write region according to the latest file mapping information.
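The snapshot handling described for the multi-write region (close the latest file, rename it to a snapshot file, and recreate the latest file) can be illustrated with a minimal Python sketch. The file names, the `append_change` and `create_snapshot` helpers, and the plain append layout are assumptions made for illustration; the header, write header, and mapping information of the latest file are not modeled here.

```python
import os

LATEST_NAME = "latest.img"  # hypothetical name for the multi-write region latest file


def append_change(directory: str, data: bytes) -> None:
    """Append changed data to the latest file (created on first use)."""
    with open(os.path.join(directory, LATEST_NAME), "ab") as latest:
        latest.write(data)  # the file is closed again when the block exits


def create_snapshot(directory: str, snapshot_id: int) -> str:
    """Rename the closed latest file to a snapshot file, then recreate it empty."""
    latest_path = os.path.join(directory, LATEST_NAME)
    # Hypothetical naming scheme for snapshot files.
    snapshot_path = os.path.join(directory, f"snapshot-{snapshot_id:04d}.img")
    os.rename(latest_path, snapshot_path)  # no data is copied
    open(latest_path, "wb").close()        # fresh, empty latest file
    return snapshot_path
```

In this sketch a snapshot costs one rename and one file creation, independent of how much data the latest file holds, which is consistent with the stated goal of keeping snapshot I/O low for write-heavy disk units.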
[0044] Furthermore, to achieve the above objectives, the present invention also proposes a disk image generation apparatus, the apparatus comprising:
[0045] The monitoring module is used to determine the corresponding disk unit in the target disk based on the data address of the changed data when a change in data is detected in the target disk.
[0046] The determination module is used to determine the read and write characteristics of the disk unit and determine the disk image format corresponding to the disk unit based on the read and write characteristics;
[0047] The mapping module is used to map the disk change data to the corresponding disk image in the storage device according to the disk image format.
[0048] Furthermore, to achieve the above objectives, the present invention also proposes a disk image generation device, the device comprising: a memory, a processor, and a disk image generation program stored on the memory and executable on the processor, the disk image generation program being configured to implement the steps of the disk image generation method as described above.
[0049] In addition, to achieve the above objectives, the present invention also proposes a storage medium storing a disk image generation program, wherein when the disk image generation program is executed by a processor, it implements the steps of the disk image generation method described above.
[0050] When a change is detected in the data on a target disk, this invention determines the corresponding disk unit on the target disk based on the data address of the changed data; determines the read/write characteristics of the disk unit, and determines the disk image format corresponding to the disk unit based on the read/write characteristics; and maps the changed data to the corresponding disk image in the storage device according to the disk image format. Because the disk image format is chosen from the read/write characteristics of the disk unit that holds the changed data, and the changed data is mapped to the corresponding disk image according to that format, the data in the disk image is stored in contiguous sectors. This reduces track relocation when the disk image in the storage device is subsequently read and written, and avoids increasing the number of I/O operations when maintaining snapshots of the disk image, thus improving the load capacity of the storage device.
Attached Figure Description
[0051] Figure 1 is a schematic diagram of the structure of the disk image generation device in the hardware operating environment involved in the embodiments of the present invention;
[0052] Figure 2 is a flowchart illustrating the first embodiment of the disk image generation method of the present invention;
[0053] Figure 3 is a flowchart illustrating the second embodiment of the disk image generation method of the present invention;
[0054] Figure 4 is a schematic diagram of the low-write disk image format in the second embodiment of the disk image generation method of the present invention;
[0055] Figure 5 is a flowchart illustrating the third embodiment of the disk image generation method of the present invention;
[0056] Figure 6 is a schematic diagram of the multi-write disk image format in the third embodiment of the disk image generation method of the present invention;
[0057] Figure 7 is a structural block diagram of the first embodiment of the disk image generation apparatus of the present invention.
[0058] The realization of the objectives, functional features, and advantages of the present invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed Implementation
[0059] It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of the invention.
[0060] Referring to Figure 1, Figure 1 is a schematic diagram of the structure of the disk image generation device in the hardware operating environment involved in the embodiments of the present invention.
[0061] As shown in Figure 1, the disk image generation device may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable communication between these components. The user interface 1003 may include a display screen or an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface or a wireless interface (such as a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be high-speed random access memory (RAM) or stable non-volatile memory (NVM), such as a disk drive. The memory 1005 may also optionally be a storage device independent of the aforementioned processor 1001.
[0062] Those skilled in the art will understand that the structure shown in Figure 1 does not constitute a limitation on the disk image generation device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
[0063] As shown in Figure 1, the memory 1005, which serves as a storage medium, may include an operating system, a network communication module, a user interface module, and a disk image generation program.
[0064] In the disk image generation device shown in Figure 1, the network interface 1004 is mainly used for data communication with the network server; the user interface 1003 is mainly used for data interaction with the user; the processor 1001 and the memory 1005 of the present invention can be provided in the disk image generation device, and the disk image generation device calls the disk image generation program stored in the memory 1005 through the processor 1001 and executes the disk image generation method provided in the embodiments of the present invention.
[0065] This invention provides a disk image generation method. Referring to Figure 2, Figure 2 is a flowchart illustrating the first embodiment of the disk image generation method of the present invention.
[0066] In this embodiment, the disk image generation method includes the following steps:
[0067] Step S1: When a change in data is detected in the target disk, the corresponding disk unit in the target disk is determined based on the data address of the changed data.
[0068] It should be noted that the executing entity in this embodiment can be a computing service device with data processing, network communication, and program execution functions, such as a tablet computer, personal computer, or mobile phone, or an electronic device or disk image generation device capable of performing the above functions. The following description uses a disk image generation device as an example to illustrate this embodiment and the subsequent embodiments.
[0069] It is understood that the target disk can be a disk for which a disk image needs to be generated; a disk is a device used to store information, which is usually digitized and stored using media such as electricity, magnetism or optics. In this embodiment, the disk mainly refers to storage devices such as hard disks, redundant arrays of independent disks (RAID), SAN storage (Storage Area Network, SAN), and network attached storage (NAS).
[0070] It should be understood that a sector is the smallest unit of a data storage device that can be read or written, and each sector has an address number (Logical Block Addressing, LBA). Non-contiguous data writes produce disk fragmentation, which increases the hard drive's seek time and degrades system performance. To improve system performance, the target disk can be pre-divided according to its LBA range to obtain several disk units.
[0071] It is understandable that a disk unit can be the range of sectors that a single system read/write call can complete. The sectors in a disk unit are contiguous, and the size of that range is determined by the system's performance. The target disk can therefore be divided into disk units based on both its LBA range and the system's read/write performance.
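As an illustration of the pre-division described above, the following Python sketch splits an LBA range into contiguous, fixed-size disk units and finds the unit that holds a given sector. The fixed `sectors_per_unit` parameter and the names `DiskUnit`, `divide_disk`, and `unit_for_address` are assumptions for illustration; the text only says that the unit size follows from what one system read/write call can complete.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DiskUnit:
    index: int
    first_lba: int  # first sector of the unit (inclusive)
    last_lba: int   # last sector of the unit (inclusive)


def divide_disk(total_sectors: int, sectors_per_unit: int) -> list[DiskUnit]:
    """Pre-divide the LBA range [0, total_sectors) into contiguous disk units."""
    if total_sectors <= 0 or sectors_per_unit <= 0:
        raise ValueError("sector counts must be positive")
    units = []
    for index, first in enumerate(range(0, total_sectors, sectors_per_unit)):
        # The final unit may be shorter than the others.
        last = min(first + sectors_per_unit, total_sectors) - 1
        units.append(DiskUnit(index, first, last))
    return units


def unit_for_address(lba: int, total_sectors: int, sectors_per_unit: int) -> int:
    """Return the index of the disk unit that contains sector `lba`."""
    if not 0 <= lba < total_sectors:
        raise ValueError("LBA outside the target disk")
    return lba // sectors_per_unit
```

With fixed-size units, locating the unit for a changed data address is a single integer division, so step S1 adds negligible overhead per detected change.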
[0072] It is understandable that data changes can mean that data on the target disk is modified; data address can mean the address of the changed data on the target disk; and changed data can mean the data that has been modified on the target disk. For example, if data is written to the target disk, then the data written is the changed data.
[0073] In a specific implementation, the disk image generation method of this embodiment can be applied to the generation of virtual machine disk images. When the virtual machine starts, the target disk in the virtual machine is monitored in real time. When changes are detected in the data of the target disk, the disk unit in the target disk that stores the disk change data is determined according to the disk change data.
[0074] Step S2: Determine the read / write characteristics of each disk unit, and determine the disk image format corresponding to each disk unit based on the read / write characteristics.
[0075] It should be understood that the read/write characteristics can be read characteristics or write characteristics. Since not all areas of a disk or data storage device contain data, a disk image format is needed to store the data of the disk or data storage device when its disk image is generated.
[0076] Step S3: Map the disk change data to the corresponding disk image in the storage device according to the disk image format.
[0077] It is understood that the storage device can be a device that stores a disk image of the target disk; the disk image can be an object in the computer field, commonly a file or a block device, which contains the content and structure of a disk or data storage device, including but not limited to hard disks, floppy disks, magnetic tapes, optical discs, flash drives, etc. It is usually copied at the sector level of the target disk, thereby completely copying the structure and content of the target disk file system. In this embodiment, the disk image refers to a large-capacity hard disk image.
[0078] In practice, the disk image generation device divides the target disk into several disk units based on the LBA range and system performance. When a change in data on the target disk is detected, the device determines the disk unit in the target disk that stores the changed data based on the data address of the changed data, determines the read/write characteristics of the disk unit, determines the corresponding disk image format based on the read/write characteristics, and maps the changed data in the disk unit to the corresponding disk image in the storage device using that disk image format.
[0079] Furthermore, since data read/write behavior differs across disk units, and the performance requirements for reads and writes also vary, different disk image formats are determined for disk units with different read/write characteristics to reduce I/O operations during subsequent data reads or writes. The disk image formats include the low-write disk image format and the multi-write disk image format. Step S2 includes:
[0080] Step S21: Determine the number of data reads and the number of data writes of the disk unit within a preset historical period based on the historical read and write records of the disk unit.
[0081] It is understandable that the historical read/write records can be records of data being read from or written to the disk unit within the preset historical period; the preset historical period is the window of time over which the reads and writes are counted, and it can be set according to the specific scenario; the number of data reads is the number of times data is read from the disk unit; the number of data writes is the number of times data is written to the disk unit.
[0082] It should be understood that the data reading frequency can also be determined based on the number of data reads and the preset historical duration; the data writing frequency can also be determined based on the number of data writes and the preset historical duration.
[0083] Step S22: When the number of data reads is greater than the number of data writes, the read/write characteristics of the disk unit are determined to be low-write characteristics, and the disk image format corresponding to the low-write characteristics is the low-write disk image format.
[0084] It should be understood that the low-write characteristic means that the number of data reads on a disk unit is greater than the number of data writes; the files stored on a disk unit with the low-write characteristic are generally infrequently modified files such as program files, library files, configuration files, or database table areas.
[0085] Step S23: When the number of data reads is less than the number of data writes, the read/write characteristics of the disk unit are determined to be multi-write characteristics, and the disk image format corresponding to the multi-write characteristics is the multi-write disk image format.
[0086] It should be understood that the multi-write characteristic means that the number of data reads on a disk unit is less than the number of data writes; the files stored on a disk unit with the multi-write characteristic are generally the log files of a database system.
[0087] Understandably, the read/write characteristics of a disk unit can also be determined based on the ratio of data reads to data writes within the preset historical period. If the percentage of data reads within the preset historical period is higher than a first preset value, the characteristic is low-write; if the percentage of data reads within the preset historical period is lower than a second preset value, the characteristic is multi-write. For example, if the first preset value is 80% and the percentage of data reads within the preset historical period is 88%, then the read/write characteristics of the disk unit are low-write characteristics.
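The two classification rules of step S2 (comparing read and write counts, or comparing the read percentage against preset values) can be sketched in Python as follows. Several points are assumptions rather than statements from the text: the read percentage is taken as reads divided by reads plus writes, the second preset value of 20% is illustrative (only the 80% example is given), and returning `None` for ties or for percentages between the two preset values is a placeholder because those cases are not specified.

```python
from enum import Enum
from typing import Optional


class Characteristic(Enum):
    LOW_WRITE = "low-write"
    MULTI_WRITE = "multi-write"


def classify_by_count(reads: int, writes: int) -> Optional[Characteristic]:
    """Steps S22/S23: compare read and write counts over the historical period."""
    if reads > writes:
        return Characteristic.LOW_WRITE
    if reads < writes:
        return Characteristic.MULTI_WRITE
    return None  # equal counts: not specified in the text


def classify_by_read_share(
    reads: int,
    writes: int,
    first_preset: float = 0.80,   # example value from the text
    second_preset: float = 0.20,  # assumed value, not given in the text
) -> Optional[Characteristic]:
    """Alternative rule: compare the share of reads against two preset values."""
    total = reads + writes
    if total == 0:
        return None  # no history to classify
    share = reads / total
    if share > first_preset:
        return Characteristic.LOW_WRITE
    if share < second_preset:
        return Characteristic.MULTI_WRITE
    return None  # between the presets: not specified in the text
```

For the worked example in the text, 88 reads and 12 writes give a read share of 88%, which exceeds the first preset value of 80%, so `classify_by_read_share(88, 12)` yields the low-write characteristic.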
[0088] Furthermore, in order to determine different disk image formats for disk units with different read/write characteristics, thereby reducing the number of I/O operations during subsequent data reads or writes, the disk image formats include the low-write disk image format and the multi-write disk image format. Step S2 further includes: obtaining system parameters of the business system and determining the read/write characteristics of the disk unit based on the system parameters; when the read/write characteristic is a low-write characteristic, determining that the disk image format corresponding to the disk unit is the low-write disk image format; when the read/write characteristic is a multi-write characteristic, determining that the disk image format corresponding to the disk unit is the multi-write disk image format.
[0089] It is understandable that the system parameters include the configuration parameters and the characteristic parameters of the business system. The configuration parameters can be set manually: if the configuration parameter of a disk unit in the target disk of the business system is manually set to the multi-write characteristic, that disk unit has the multi-write characteristic; if it is manually set to the low-write characteristic, that disk unit has the low-write characteristic. The characteristic parameters of the business system can be parameters corresponding to the business system type. For example, if the characteristic parameters indicate that the business system is a log business system, the read/write characteristic of the disk unit is determined to be multi-write; if they indicate that the business system is a configuration file system, the read/write characteristic of the disk unit is determined to be low-write.
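A minimal Python sketch of the parameter-based rule follows. The dictionary keys `configured_characteristic` and `system_type`, the system type names, and the choice to let a manual configuration take precedence over the system type are all assumptions for illustration; the text does not define a parameter schema or a precedence order.

```python
from typing import Mapping, Optional

LOW_WRITE = "low-write"
MULTI_WRITE = "multi-write"

# Hypothetical system type names, following the two examples in the text.
SYSTEM_TYPE_TO_CHARACTERISTIC = {
    "log": MULTI_WRITE,
    "configuration_file": LOW_WRITE,
}


def characteristic_from_parameters(params: Mapping[str, str]) -> Optional[str]:
    """Derive a disk unit's read/write characteristic from system parameters."""
    # Assumption: a manually set configuration parameter wins when present.
    configured = params.get("configured_characteristic")
    if configured in (LOW_WRITE, MULTI_WRITE):
        return configured
    # Otherwise fall back to the characteristic implied by the system type.
    return SYSTEM_TYPE_TO_CHARACTERISTIC.get(params.get("system_type"))
```

This variant needs no read/write history, so it can assign a format before any I/O statistics have been collected for the disk unit.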
[0090] In this embodiment, when a change in data is detected on the target disk, the corresponding disk unit on the target disk is determined based on the data address of the changed data; the read/write characteristics of the disk unit are determined, and the disk image format corresponding to the disk unit is determined based on the read/write characteristics; the changed data is mapped to the corresponding disk image on the storage device according to the disk image format. Because the disk image format is chosen from the read/write characteristics of the disk unit that holds the changed data, and the changed data is mapped to the corresponding disk image according to that format, the data in the disk image is stored in contiguous sectors. This reduces track relocation when the disk image on the storage device is read and written, and does not increase the number of I/O operations when maintaining snapshots of the disk image, thus improving the load capacity of the storage device.
[0091] Referring to Figure 3, Figure 3 is a flowchart illustrating the second embodiment of the disk image generation method of the present invention.
[0092] Based on the first embodiment described above, in this embodiment, step S3 includes:
[0093] Step S31: When the read/write characteristics of the disk unit are low-write characteristics, obtain the low-write region main read/write file and the low-write region snapshot file in the storage device that correspond to the low-write disk image format.
[0094] It is understandable that when the read/write characteristics of the disk unit are low-write characteristics, the corresponding disk image in the storage device includes the low-write region main read/write file and the low-write region snapshot file.
[0095] It should be understood that the low-write disk image format is shown in Figure 4. The low-write disk image format is mainly divided into two parts: the low-write region main read/write file and the low-write region snapshot file. The low-write region main read/write file contains header information, copy flag bits, a mapping table, and a read/write data area; the low-write region snapshot file contains snapshot header information, a snapshot valid-data mapping table, and snapshot entity data.
[0096] Understandably, the low-write region main read/write file can be the file that serves the data reads and writes of a disk unit with low-write characteristics. Because this file is organized as contiguous space, writing disk data from a low-write disk unit to it does not generate random I/O, which improves the efficiency of data writing. The low-write region snapshot file can be the file that stores all snapshot data. Snapshot data can refer to the data set of the disk image at a certain point in time.
[0097] Step S32: Map the disk change data in the disk unit to the main read / write file of the low-write area.
[0098] Understandably, when a data write request is received, if the block address is not found in the mapping table, the corresponding data has not yet been mapped, and storage space is allocated for it at the end of the low-write region main read/write file. If the block address has already been mapped, the corresponding block in the low-write region main read/write file is first copied to the corresponding snapshot point in the low-write region snapshot file, and the disk change data is then mapped to the low-write region main read/write file. In this way, the historical data in the low-write region main read/write file is copied to the snapshot point before the data is modified.
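The copy-before-overwrite behavior described above can be sketched with an in-memory model in Python. The `LowWriteRegion` class, its field names, and the use of a single snapshot point are assumptions for illustration; the copy flag bit that gates the copy, and the on-disk file layout, are not modeled.

```python
class LowWriteRegion:
    """In-memory stand-in for the main read/write file and one snapshot point."""

    def __init__(self) -> None:
        self.mapping: dict[int, int] = {}     # block address -> slot in data area
        self.data_area: list[bytes] = []      # slots are appended contiguously
        self.snapshot: dict[int, bytes] = {}  # block address -> preserved old data

    def write(self, block_address: int, data: bytes) -> None:
        slot = self.mapping.get(block_address)
        if slot is None:
            # Not mapped yet: allocate space at the end of the data area.
            self.mapping[block_address] = len(self.data_area)
            self.data_area.append(data)
            return
        # Already mapped: preserve the old block in the snapshot first,
        # then overwrite it in place in the main read/write file.
        self.snapshot[block_address] = self.data_area[slot]
        self.data_area[slot] = data

    def read(self, block_address: int) -> bytes:
        return self.data_area[self.mapping[block_address]]
```

Reads always go to the main file without consulting the snapshot, which suits disk units that are read far more often than they are written; the extra copy is paid only on the less frequent overwrites.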
[0099] Step S33: Upon receiving a data write request, read the mapping data in the main read / write file of the low-write region and write it to the snapshot file of the low-write region.
[0100] In the specific implementation, when a data write request is received, the mapped data in the main read / write file of the write-scarce region is read and written to the snapshot point corresponding to the snapshot file of the write-scarce region.
[0101] Furthermore, to ensure that the data in the write-scarce region main read / write file is stored in contiguous space to reduce the number of I / O operations during subsequent data reads, the write-scarce region main read / write file includes header information, a mapping table, and a read / write data area; step S32 includes:
[0102] Step S321: Convert the data information of the disk change data in the disk unit into the header information of the main read / write file of the write-scarce region, and obtain the block address corresponding to the disk change data.
[0103] Understandably, the data information includes the file identifier of the file corresponding to the data, the block size, the file size, and read / write record statistics. The data information is converted into the header information of the main read / write file of the write-scarce region. When multiple sectors are combined into one block for unified allocation, read / write, or management, the number of that block is the block address.
[0104] Step S322: Determine whether the block address exists in the mapping table of the main read / write file of the write-scarce region.
[0105] Understandably, the mapping table includes three elements: the mapping source address, the offset within the mapped file, and the mapping length. The mapping source address records the starting block number of the disk data, the offset within the mapped file records the file offset within the main read / write file in the write-scarce region, and the mapping length records the number of variable-length blocks included in this mapping. Determining whether a block address exists in the mapping table of the main read / write file in the write-scarce region can be done by checking whether the mapping source address in the mapping table contains that block address. If it does, then the block address exists in the mapping table; otherwise, the block address does not exist in the mapping table.
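As a minimal sketch of this lookup, the following Python fragment models a mapping-table entry with the three elements described above and checks whether a block address is covered by any entry. The names `MapEntry` and `find_entry` are hypothetical, and treating an entry as covering the block range from its mapping source address up to that address plus the mapping length is an assumption made for illustration; the source only states that the mapping source address is checked for the block address.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MapEntry:
    source_block: int  # mapping source address: starting block number of the disk data
    file_offset: int   # offset within the main read/write file
    length: int        # mapping length: number of blocks covered by this mapping


def find_entry(table: List[MapEntry], block_address: int) -> Optional[MapEntry]:
    """Return the entry that covers block_address, or None if it is unmapped."""
    for entry in table:
        # Assumption: an entry covers [source_block, source_block + length).
        if entry.source_block <= block_address < entry.source_block + entry.length:
            return entry
    return None


# Hypothetical table with two mappings.
table = [MapEntry(0, 4096, 8), MapEntry(100, 36864, 4)]
```

A linear scan is enough to show the idea; a real mapping table of any size would more likely be kept sorted or indexed by source address.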
[0106] Step S323: When the block address does not exist in the mapping table, allocate storage space for the disk change data corresponding to the block address in the read / write data area of the main read / write file in the low-write region, and write the block address and the address of the allocated storage space into the mapping table.
[0107] It is understandable that if a block address does not exist in the mapping table, it means that the disk data corresponding to that block address has not been mapped; the read / write data area can be a region for reading, writing and storing data.
[0108] In the specific implementation, if the block address does not exist in the mapping table, it means that the disk data corresponding to the block address has not been mapped in the main read / write file of the write-scarce region. At the end of the read / write data area of the main read / write file of the write-scarce region, storage space is allocated for the disk change data corresponding to the block address, and the block address and the address of the allocated storage space are written into the mapping table.
[0109] Step S324: When the block address exists in the mapping table, read the disk change data corresponding to the block address and write it into the corresponding storage space in the read / write data area of the main read / write file of the low-write region.
[0110] It should be understood that when the address of this block exists in the mapping table, it indicates that the disk data corresponding to this address has been mapped to the primary read / write file in the write-scarce region.
[0111] In the specific implementation, when the block address exists in the mapping table, the disk data corresponding to the block address has already been mapped in the main read / write file of the write-scarce region. In this case, according to the offset and mapping length recorded in the mapping table, the mapped data corresponding to the block address in the main read / write file of the write-scarce region is copied to the corresponding snapshot in the snapshot file of the write-scarce region, and the new disk change data is then mapped to the corresponding storage space in the read / write data area.
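The write path of steps S322 to S324 can be illustrated with a small in-memory sketch. The class `WriteScarceImage`, the fixed `BLOCK_SIZE`, and the use of dictionaries in place of on-disk files are all assumptions made for illustration, not the format defined by the source. The sketch also assumes that only the oldest copy of a block is kept per snapshot.

```python
from typing import Dict

BLOCK_SIZE = 4  # hypothetical, tiny block size so the example is easy to follow


class WriteScarceImage:
    """In-memory stand-in for the main read/write file and one snapshot."""

    def __init__(self) -> None:
        self.mapping: Dict[int, int] = {}     # block address -> offset in data area
        self.data_area = bytearray()          # read/write data area of the main file
        self.snapshot: Dict[int, bytes] = {}  # block address -> preserved old block

    def write_block(self, block_address: int, payload: bytes) -> str:
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        offset = self.mapping.get(block_address)
        if offset is None:
            # Step S323: unmapped block, allocate space at the end of the data area
            # and record the new mapping.
            offset = len(self.data_area)
            self.data_area.extend(payload)
            self.mapping[block_address] = offset
            return "allocated"
        # Step S324: mapped block, copy the historical data to the snapshot
        # before overwriting it in place.
        old = bytes(self.data_area[offset:offset + BLOCK_SIZE])
        self.snapshot.setdefault(block_address, old)  # assumption: keep oldest copy
        self.data_area[offset:offset + BLOCK_SIZE] = payload
        return "overwritten"

    def read_block(self, block_address: int) -> bytes:
        # Reads are served from the main file only, never from the snapshot.
        offset = self.mapping[block_address]
        return bytes(self.data_area[offset:offset + BLOCK_SIZE])
```

Because the latest data always stays in the main file, `read_block` never has to consult the snapshot.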
[0112] Furthermore, to ensure the integrity of snapshot data by copying historical data to snapshot points before data modification, the write-scarce region main read / write file also includes a copy flag bit. The step of reading the mapped data from the write-scarce region main read / write file and writing it to the write-scarce region snapshot file to generate a disk image of the target disk upon receiving a data write request includes: upon receiving a data write request, determining whether the copy flag bit corresponding to the disk mapped data in the write-scarce region main read / write file is a valid flag bit; if the copy flag bit is a valid flag bit, copying the disk mapped data to the corresponding snapshot in the write-scarce region snapshot file to generate a disk image of the target disk; if the copy flag bit is an invalid flag bit, allocating storage space for the disk mapped data in the write-scarce region main read / write file to generate a disk image of the target disk.
[0113] Understandably, the write-scarce region snapshot file contains: snapshot header information (snapshot1 head), snapshot valid data mapping table (snapshot1 map), and snapshot entity data (snapshot1 data). The snapshot header information (snapshot1 head) includes information such as snapshot name, snapshot size, and amount of valid data. The snapshot mapping table (snapshot1 map) mainly contains the mapping source address, the offset within the mapping file, and the mapping length. The snapshot entity data (snapshot1 data) is the actual mapping data written to the write-scarce region snapshot file.
[0114] It should be understood that when the copy flag is a valid flag, it can be represented by "1"; when the copy flag is an invalid flag, it can be represented by "0"; if no snapshot of disk data has been created in the write-scarce area snapshot file or all snapshots have been deleted, the copy flag is initialized to "0"; if there are historical snapshots in the write-scarce area snapshot file, the copy flag is initialized to "1".
[0115] In the specific implementation, when a data write request is received and disk-mapped data in the main read / write file of the write-scarce region is to be written to the snapshot file of the write-scarce region, the copy flag bit corresponding to the disk-mapped data is checked. If the copy flag bit is "1", the block corresponding to the disk-mapped data in the main read / write file of the write-scarce region is copied to the corresponding snapshot in the snapshot file of the write-scarce region, so that the historical data is copied to the snapshot point before the data is modified. If the copy flag bit is "0", storage space is allocated for the data in the main read / write file of the write-scarce region.
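The copy flag bit rules can be summarized in a short sketch. The names `Action`, `initial_copy_flag`, and `action_for_write` are hypothetical. The sketch only encodes the decision described above ("1" means copy to the snapshot, "0" means allocate in the main file) and does not model the files themselves.

```python
from enum import Enum


class Action(Enum):
    COPY_TO_SNAPSHOT = "copy_to_snapshot"
    ALLOCATE_IN_MAIN = "allocate_in_main"


def initial_copy_flag(snapshot_count: int) -> int:
    """Flag is 1 when historical snapshots exist, 0 when none exist or all were deleted."""
    if snapshot_count < 0:
        raise ValueError("snapshot_count cannot be negative")
    return 1 if snapshot_count > 0 else 0


def action_for_write(copy_flag: int) -> Action:
    """Choose what a write request does, based on the copy flag bit."""
    if copy_flag not in (0, 1):
        raise ValueError("copy flag bit must be 0 or 1")
    if copy_flag == 1:
        # Valid flag: preserve the historical block in the snapshot first.
        return Action.COPY_TO_SNAPSHOT
    # Invalid flag: no snapshot needs the old data, allocate in the main file.
    return Action.ALLOCATE_IN_MAIN
```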
[0116] Understandably, when a data read request is received, all the data to be read is in the main read / write file of the low-write region, and this file is usually organized as a contiguous space, so the read usually completes in a single operation without extra I / O and without extra track relocation on the disk. Performance is therefore almost equivalent to using the disk or data storage device directly, with almost no additional loss. Because the main read / write file of the low-write region mainly serves data reads, reducing track relocation on the disk or data storage device is equivalent to increasing the load capacity of the entire disk or data storage device.
[0117] It should be understood that since the main read / write file of the write-scarce region stores the latest data of the corresponding disk unit, there is no need to merge files when deleting snapshots later; snapshots can be deleted directly without increasing the number of I / O operations. When data is later read from the storage device, all the data to be read is in the main read / write file of the write-scarce region, which is organized as a contiguous space, so the read usually completes in a single operation without extra I / O.
[0118] In this embodiment, when the read / write characteristics of a single disk are characterized by low write activity, a low write region master read / write file and a low write region snapshot file are created in the storage device according to the low write disk image format. The disk data in the single disk is mapped to the low write region master read / write file. When a data write request is received, the mapped data in the low write region master read / write file is read and written to the low write region snapshot file. When reading data, since all the data to be read is in the low write region master read / write file, and the low write region master read / write file is a contiguous space, no additional I / O operations are performed when reading data. When deleting snapshots, no file merging is required, and no additional I / O operations are performed, thus improving the disk's carrying capacity.
[0119] Refer to Figure 5, which is a flowchart illustrating the third embodiment of the disk image generation method of the present invention.
[0120] Based on the above embodiments, in this embodiment, step S3 includes:
[0121] Step S34: When the read / write characteristics of a single disk are multi-write characteristics, create the latest file of the multi-write region in the storage device according to the multi-write disk image format.
[0122] It should be understood that when the read / write characteristics of a single disk are multi-write characteristics, the corresponding disk image in the storage device includes multi-write region snapshot files and the latest multi-write region files.
[0123] Understandably, the multi-write disk image format is shown in Figure 6. The multi-write disk image format includes the latest file of the multi-write region and the snapshot file of the multi-write region. The latest file of the multi-write region contains: the latest file header information (Head), the latest file write header information (IO_Head), the latest file read / write data area, and the latest file mapping information (map). The latest file header information mainly includes tag information and the like. Each latest file write header information (IO_Head) describes a single data write, including the number of sectors written, the source LBA address of the write, and the offset within the file to which the written data is mapped. By enumerating all the latest file write header information (IO_Head), a complete mapping table can be constructed. When repeated writes occur, the data is overwritten in the previously allocated area according to the existing mapping relationship. When enough writes have accumulated that write fragmentation occurs, the defragmentation module is started asynchronously and defragments the fragmented area once, while the disk or data storage device is not busy with I / O. Because the fragments are small, defragmentation completes almost instantly.
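To illustrate how enumerating IO_Head records yields a complete mapping table, the following sketch appends each write as one IO_Head immediately followed by its data, then rebuilds the mapping by walking the records. The binary layout (`<IQQ`: sector count, source LBA, file offset), the 512-byte sector size, and the function names are assumptions made for illustration; the source does not specify field widths or ordering. Only the first-write (append) path is modeled.

```python
import struct
from typing import Dict, Tuple

SECTOR = 512                      # assumed sector size in bytes
IO_HEAD = struct.Struct("<IQQ")   # hypothetical layout: sectors, source LBA, file offset


def append_write(buf: bytearray, lba: int, data: bytes) -> None:
    """Append one IO_Head and its data as a single contiguous record."""
    if len(data) == 0 or len(data) % SECTOR != 0:
        raise ValueError("data must be a non-empty multiple of the sector size")
    sectors = len(data) // SECTOR
    data_offset = len(buf) + IO_HEAD.size  # data starts right after its header
    # Header and data are joined first, so they go out in one write.
    buf.extend(IO_HEAD.pack(sectors, lba, data_offset) + data)


def build_map(buf: bytes) -> Dict[int, Tuple[int, int]]:
    """Enumerate all IO_Head records; return {source LBA: (file offset, sectors)}."""
    mapping: Dict[int, Tuple[int, int]] = {}
    pos = 0
    while pos < len(buf):
        sectors, lba, offset = IO_HEAD.unpack_from(buf, pos)
        mapping[lba] = (offset, sectors)
        pos = offset + sectors * SECTOR  # next record follows this record's data
    return mapping
```

Packing the header and data into one buffer before the write is what keeps a first-time write to a single I / O in this sketch.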
[0124] Step S35: Map the disk change data in the disk unit to the latest file in the multi-write area.
[0125] In the specific implementation, when generating the disk image of a disk unit with multi-write characteristics, the disk change data in the disk unit is first mapped to the latest file of the multi-write region.
[0126] Step S36: Upon receiving a snapshot creation request, close and rename the latest file of the multi-write region to obtain a multi-write region snapshot file, and recreate the latest file of the multi-write region in the storage device.
[0127] In the specific implementation, when the disk image generation device receives a snapshot creation request, it closes the current latest file of the multi-write region, renames it to the created snapshot point, and recreates the latest file of the multi-write region to store the latest data.
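The close-and-rename step can be sketched with ordinary file operations. The file name `latest.img`, the snapshot name `snapshot1.img`, and the function `create_snapshot` are hypothetical; the sketch only shows that creating a snapshot renames the existing file rather than copying its data.

```python
import os
import tempfile
from typing import BinaryIO

LATEST_NAME = "latest.img"  # hypothetical file name, not from the source


def create_snapshot(directory: str, latest: BinaryIO, snapshot_name: str) -> BinaryIO:
    """Close the latest file, rename it to the snapshot point, and recreate it."""
    latest_path = os.path.join(directory, LATEST_NAME)
    latest.close()                                                  # close current latest file
    os.rename(latest_path, os.path.join(directory, snapshot_name))  # rename, no data copy
    return open(latest_path, "wb")                                  # new, empty latest file


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    handle = open(os.path.join(demo_dir, LATEST_NAME), "wb")
    handle.write(b"block data")
    handle = create_snapshot(demo_dir, handle, "snapshot1.img")
    handle.close()
    with open(os.path.join(demo_dir, "snapshot1.img"), "rb") as f:
        snapshot_bytes = f.read()
    latest_size = os.path.getsize(os.path.join(demo_dir, LATEST_NAME))
```

Since each snapshot is a complete former latest file, deleting one later is a plain file deletion with no merge step.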
[0128] Furthermore, to reduce disk fragmentation in the storage device when generating a disk image, the latest file in the multi-write region includes latest file header information, latest file write header information, latest file read / write data area, and latest file mapping information. Mapping the disk change data in the disk unit to the latest file in the multi-write region includes: obtaining the data information and block address of the disk change data in the disk unit; generating the latest file header information based on the data information; determining the latest file write header information based on the block address; generating the latest file mapping information of the latest file in the multi-write region based on the latest file write header information; and writing the disk change data into the latest file read / write data area of the latest file in the multi-write region based on the latest file mapping information.
[0129] It is understandable that when mapping the disk data of a disk unit with multi-write characteristics to the latest file of the multi-write region, if the disk data has never been mapped before, new storage space is allocated at the end of the file, and the header information and disk data are written in a single operation, without increasing the number of I / O writes.
[0130] In the specific implementation, the latest file header information is generated based on the data information of the disk change data in the disk unit, the latest file write header information of the latest file in the multi-write area is generated based on the block address of the disk data, a complete latest file mapping table is constructed by enumerating all the latest file write header information, and the disk change data of the disk unit is written to the latest file read and write data area according to the mapping relationship of the mapping table.
[0131] Understandably, in this embodiment, when generating a disk image of a disk unit with multi-write characteristics, if the data has never been mapped before, new space is allocated at the end of the latest file of the multi-write region after the data is received. Then the latest file write header information (IO_Head) and the actual data are written in a single I / O operation. This does not increase the number of I / O writes and reduces track relocation on the disk or data storage device, so it is almost equivalent to using the disk or data storage device directly and greatly improves I / O write performance. Even during repeated writes, because the original data characteristics are similar, a write usually does not cross multiple mapping units, so it is also a single I / O write with almost no track relocation. When reading data, the latest file mapping information (map) of the multi-write region can be used to quickly locate which snapshot holds the data and at which file offset. Furthermore, since each multi-write region snapshot file is obtained by closing and renaming the latest file of the multi-write region, subsequent maintenance of the disk image and deletion of one or more multi-write region snapshot files do not require merging between snapshot files, and no unnecessary I / O occurs.
[0132] In this embodiment, when the read / write characteristics of a single disk are write-intensive, a latest file for the write-intensive region is created in the storage device according to the write-intensive disk image format. Disk change data in the single disk is mapped to this latest file. Upon receiving a snapshot creation request, the latest file for the write-intensive region is closed and renamed to obtain a write-intensive snapshot file, and then the latest file for the write-intensive region is recreated in the storage device. This allows mapping of disk change data in a single disk to a latest file for the write-intensive region when the disk is write-intensive, and closing and renaming the latest file to the created snapshot point to obtain a write-intensive snapshot file upon receiving a snapshot creation request. Since the latest file for the write-intensive region is a contiguous space, no random I / O is generated when mapping disk change data from the single disk to this file. Furthermore, each write-intensive snapshot file is obtained by closing and renaming the latest file for the write-intensive region, eliminating the need for file merging when deleting subsequent write-intensive snapshot files, thus avoiding unnecessary I / O and improving the load capacity of the storage device.
[0133] Furthermore, embodiments of the present invention also propose a storage medium storing a disk image generation program, wherein when the disk image generation program is executed by a processor, it implements the steps of the disk image generation method described above.
[0134] Refer to Figure 7, which is a structural block diagram of the first embodiment of the disk image generation apparatus of the present invention.
[0135] As shown in Figure 7, the disk image generation apparatus proposed in this embodiment of the invention includes: a monitoring module 10, a determining module 20, and a mapping module 30.
[0136] The monitoring module 10 is used to determine the corresponding disk unit in the target disk based on the data address of the changed data when a change in data is detected in the target disk.
[0137] The determining module 20 is used to determine the read and write characteristics of the disk unit and determine the disk image format corresponding to the disk unit based on the read and write characteristics.
[0138] The mapping module 30 is used to map the disk change data to the corresponding disk image in the storage device according to the disk image format.
[0139] When a change is detected in the data on a target disk, this invention determines the corresponding disk segment on the target disk based on the data address of the changed data; determines the read / write characteristics of the disk segment, and determines the disk image format corresponding to the disk segment based on the read / write characteristics; and maps the changed data to the corresponding disk image in the storage device according to the disk image format. Because this invention determines the disk image format based on the read / write characteristics of the disk segment corresponding to the changed data when the disk data on the target disk changes, and maps the changed data to the corresponding disk image in the storage device according to the disk image format, it ensures that the data in the disk image is stored in contiguous sectors. This reduces the need for track relocation when subsequently reading and writing to the disk image in the storage device, and avoids increasing the number of I / O operations when maintaining snapshots of the disk image, thus improving the load capacity of the storage device.
[0140] The determining module 20 is further configured to determine the number of data reads and data writes of the disk unit within a preset historical time period based on the historical read and write records of the disk unit; when the number of data reads is greater than the number of data writes, the read and write characteristics of the disk unit are determined to be low-write characteristics, and the disk image format corresponding to the low-write characteristics is a low-write disk image format; when the number of data reads is less than the number of data writes, the read and write characteristics of the disk unit are determined to be high-write characteristics, and the disk image format corresponding to the high-write characteristics is a high-write disk image format; the disk image format includes low-write disk image format and high-write disk image format.
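The count-based decision made by the determining module can be sketched as a single function. The name `classify_disk_unit` and the string labels are hypothetical. The source does not say what happens when the read and write counts are equal, so the sketch returns `None` in that case rather than guessing.

```python
from typing import Optional


def classify_disk_unit(reads: int, writes: int) -> Optional[str]:
    """Pick a disk image format from read/write counts in the historical period."""
    if reads < 0 or writes < 0:
        raise ValueError("counts cannot be negative")
    if reads > writes:
        return "low-write"   # more reads than writes: low-write disk image format
    if reads < writes:
        return "high-write"  # more writes than reads: high-write disk image format
    return None              # tie: not specified by the source
```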
[0141] The determining module 20 is further configured to acquire system parameters of the business system and determine the read / write characteristics of the disk unit based on the system parameters; when the read / write characteristics are low-write characteristics, determine that the disk image format corresponding to the disk unit is a low-write disk image format; when the read / write characteristics are high-write characteristics, determine that the disk image format corresponding to the disk unit is a high-write disk image format; the disk image format includes low-write disk image format and high-write disk image format.
[0142] The mapping module 30 is further configured to, when the read / write characteristics of a single disk are low-write characteristics, obtain the low-write region main read / write file and the low-write region snapshot file corresponding to the low-write disk image format in the storage device; map the disk change data in the single disk to the low-write region main read / write file; and, upon receiving a data write request, read the mapped data in the low-write region main read / write file and write it to the low-write region snapshot file; the disk image includes the low-write region main read / write file and the low-write region snapshot file.
[0143] The mapping module 30 is further configured to convert the data information of disk change data in the disk unit into header information of the write-scarce region main read / write file, and obtain the block address corresponding to the disk change data; determine whether the block address exists in the mapping table of the write-scarce region main read / write file; when the block address does not exist in the mapping table, allocate storage space for the disk change data corresponding to the block address in the read / write data area of the write-scarce region main read / write file and write the block address and the address of the allocated storage space into the mapping table; when the block address exists in the mapping table, read the disk change data corresponding to the block address and write it into the corresponding storage space in the read / write data area of the write-scarce region main read / write file; the write-scarce region main read / write file includes header information, a mapping table, and a read / write data area.
[0144] The mapping module 30 is further configured to, upon receiving a data write request, determine whether the copy flag bit corresponding to the disk mapping data in the write-scarce region main read / write file is a valid flag bit; if the copy flag bit is a valid flag bit, copy the disk mapping data to the corresponding snapshot in the write-scarce region snapshot file to generate a disk image of the target disk; if the copy flag bit is an invalid flag bit, allocate storage space for the disk mapping data in the write-scarce region main read / write file to generate a disk image of the target disk; the write-scarce region main read / write file also includes a copy flag bit.
[0145] The mapping module 30 is further configured to: create a latest file for a multi-write region in the storage device according to the multi-write disk image format when the read / write characteristics of a single disk are multi-write characteristics; map disk change data in the single disk to the latest file for the multi-write region; and when a snapshot creation request is received, close and rename the latest file for the multi-write region to obtain a snapshot file for the multi-write region, and recreate the latest file for the multi-write region in the storage device; the disk image includes the latest file for the multi-write region and the snapshot file for the multi-write region.
[0146] The mapping module 30 is further configured to acquire data information and block addresses of disk change data in the single disk, generate latest file header information based on the data information, and determine latest file write header information based on the block addresses; generate latest file mapping information for the latest file of the multi-write region based on the latest file write header information; and write the disk change data into the latest file read / write data area of the latest file of the multi-write region based on the latest file mapping information; the latest file of the multi-write region includes latest file header information, latest file write header information, latest file read / write data area, and latest file mapping information.
[0147] Other embodiments or specific implementations of the disk image generation apparatus of the present invention can be found in the above-described method embodiments, and will not be repeated here.
[0148] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or system that includes that element.
[0149] The sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.
[0150] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as read-only memory / random access memory, magnetic disk, optical disk) and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of the present invention.
[0151] The above are merely preferred embodiments of the present invention and do not limit the scope of the patent. Any equivalent structural or procedural transformations made based on the description and drawings of the present invention, or direct or indirect applications in other related technical fields, are similarly included within the scope of patent protection of the present invention.
Claims
1. A method for generating a disk image, characterized in that, The method includes: When a change is detected in the data on the target disk, the corresponding disk unit in the target disk is determined based on the data address of the changed data. The read / write characteristics of the disk unit are determined, and the disk image format corresponding to the disk unit is determined based on the read / write characteristics. The disk image format includes a low-write disk image format and a high-write disk image format. The disk change data is mapped to the corresponding disk image in the storage device according to the disk image format. The disk image includes a write-sparse region primary read / write file and a write-sparse region snapshot file; The step of mapping the disk change data to the corresponding disk image in the storage device according to the disk image format includes: When the read / write characteristics of a single disk are low-write characteristics, obtain the low-write region main read / write file and low-write region snapshot file in the storage device that correspond to the low-write disk image format; Map the disk change data in the disk unit to the main read / write file in the low-write region; Upon receiving a data write request, the mapped data in the main read / write file of the write-scarce region is read and written to the snapshot file of the write-scarce region; The disk image includes the latest file in the multi-write region and a snapshot file in the multi-write region; The step of mapping the disk change data to the corresponding disk image in the storage device according to the disk image format includes: When the read / write characteristics of a single disk are multi-write characteristics, the latest file of the multi-write region is created in the storage device according to the multi-write disk image format; Map the disk change data in the disk unit to the latest file in the multi-write region; Upon receiving a snapshot creation request, the latest file of the multi-write region is closed and renamed to obtain a multi-write region snapshot file, and the latest file of the multi-write region is recreated in the storage device.
2. The method as described in claim 1, characterized in that, The step of determining the read / write characteristics of the disk unit and determining the disk image format corresponding to the disk unit based on the read / write characteristics includes: The number of data reads and data writes of the disk within a preset historical period are determined based on the historical read and write records of the disk. When the number of data reads is greater than the number of data writes, the read / write characteristics of the disk block are determined to be low write characteristics, and the disk image format corresponding to the low write characteristics is a low write disk image format; When the number of data reads is less than the number of data writes, the read / write characteristics of the disk block are determined to be multi-write characteristics, and the disk image format corresponding to the multi-write characteristics is a multi-write disk image format.
3. The method as described in claim 1, characterized in that, The step of determining the read / write characteristics of the disk unit and determining the disk image format corresponding to the disk unit based on the read / write characteristics includes: Obtain the system parameters of the business system, and determine the read / write characteristics of the disk unit based on the system parameters; When the read / write characteristic is a low-write characteristic, it is determined that the disk image format corresponding to the disk block is a low-write disk image format; When the read / write characteristic is a multi-write characteristic, the disk image format corresponding to the disk block is determined to be a multi-write disk image format.
4. The method as described in claim 1, characterized in that, The main read / write file of the low-write region includes header information, a mapping table, and a read / write data area; The step of mapping disk change data in the disk unit to the main read / write file in the low-write region includes: The data information of disk change data in the disk unit is converted into the header information of the main read / write file of the write-scarce region, and the block address corresponding to the disk change data is obtained. Determine whether the block address exists in the mapping table of the main read / write file of the less-written region; When the block address does not exist in the mapping table, storage space is allocated in the read / write data area of the main read / write file in the less-written region for the disk change data corresponding to the block address, and the block address and the address of the allocated storage space are written into the mapping table; When the block address exists in the mapping table, the disk change data corresponding to the block address is read and written into the corresponding storage space in the read / write data area of the main read / write file of the low-write region.
5. The method as described in claim 4, characterized in that, The main read / write file of the low-write region also includes a copy flag bit; Upon receiving a data write request, the step of reading the mapped data from the main read / write file of the low-write region and writing it to the snapshot file of the low-write region includes: When a data write request is received, it is determined whether the copy flag bit corresponding to the disk-mapped data in the main read / write file of the low-write area is a valid flag bit; When the copy flag bit is a valid flag bit, the disk mapping data is copied to the corresponding snapshot in the write-scarce region snapshot file to generate a disk image of the target disk; When the copy flag bit is invalid, storage space is allocated for the disk mapping data in the main read / write file of the less-written region to generate a disk image of the target disk.
6. The method according to claim 1, characterized in that the multi-write region latest file includes latest file header information, latest file write header information, a latest file read/write data area, and latest file mapping information; and the step of mapping the disk change data in the disk unit to the multi-write region latest file includes: obtaining the data information and the block address of the disk change data in the disk unit, generating the latest file header information based on the data information, and determining the latest file write header information based on the block address; generating the latest file mapping information of the multi-write region latest file based on the latest file write header information; and writing the disk change data to the latest file read/write data area of the multi-write region latest file according to the latest file mapping information.
7. A disk image generation apparatus, characterized in that the apparatus includes: a monitoring module, configured to determine, when a data change is detected in the target disk, the corresponding disk unit in the target disk based on the data address of the changed data; a determination module, configured to determine the read/write characteristics of the disk unit and to determine the disk image format corresponding to the disk unit based on the read/write characteristics, the disk image format including a low-write disk image format and a multi-write disk image format; and a mapping module, configured to map the disk change data to the corresponding disk image in the storage device according to the disk image format, the disk image including a low-write region main read/write file and a low-write region snapshot file, and further including a multi-write region latest file and a multi-write region snapshot file. The mapping module is further configured to: when the read/write characteristics of the disk unit are low-write characteristics, obtain the low-write region main read/write file and the low-write region snapshot file in the storage device corresponding to the low-write disk image format; map the disk change data in the disk unit to the low-write region main read/write file; and, upon receiving a data write request, read the mapped data from the low-write region main read/write file and write it to the low-write region snapshot file.
The mapping module is further configured to: when the read/write characteristics of the disk unit are multi-write characteristics, create a multi-write region latest file in the storage device according to the multi-write disk image format; map the disk change data in the disk unit to the multi-write region latest file; and, when a snapshot creation request is received, close and rename the multi-write region latest file to obtain a multi-write region snapshot file, and recreate the multi-write region latest file in the storage device.
8. A disk image generation device, characterized in that the device includes: a memory, a processor, and a disk image generation program stored in the memory and executable on the processor, the disk image generation program being configured to implement the steps of the disk image generation method according to any one of claims 1 to 6.
9. A storage medium, characterized in that the storage medium stores a disk image generation program which, when executed by a processor, implements the steps of the disk image generation method according to any one of claims 1 to 6.
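
The following illustrative sketch is not part of the claims. It models, under simplifying assumptions, two of the mechanisms recited above: the mapping-table lookup with allocation on first write (claim 4) and the close-and-rename snapshot of the multi-write region latest file (claim 7). The class names `LowWriteImage` and `MultiWriteImage`, the fixed block size, and the file names `latest.img` and `snapshot-N.img` are hypothetical. The header information, the copy flag bit, and the latest file mapping information are omitted, and the low-write file is held in memory rather than on a storage device.

```python
import os
import tempfile


class LowWriteImage:
    """In-memory stand-in for the low-write region main read/write file."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.mapping = {}        # mapping table: block address -> offset in data area
        self.data = bytearray()  # read/write data area

    def write_block(self, block_addr, payload):
        if len(payload) != self.block_size:
            raise ValueError("payload must be exactly one block")
        offset = self.mapping.get(block_addr)
        if offset is None:
            # Block address not in the mapping table: allocate space at the
            # end of the data area and record the new entry.
            offset = len(self.data)
            self.data.extend(bytes(self.block_size))
            self.mapping[block_addr] = offset
        # Block address known (or just allocated): write into its storage space.
        self.data[offset:offset + self.block_size] = payload
        return offset

    def read_block(self, block_addr):
        offset = self.mapping[block_addr]
        return bytes(self.data[offset:offset + self.block_size])


class MultiWriteImage:
    """Append-only latest file that becomes a snapshot by close-and-rename."""

    def __init__(self, directory):
        self.directory = directory
        self.snapshots = []
        self._open_latest()

    def _open_latest(self):
        # (Re)create an empty latest file for subsequent writes.
        self.latest_path = os.path.join(self.directory, "latest.img")
        self.latest = open(self.latest_path, "wb")

    def write(self, payload):
        self.latest.write(payload)

    def snapshot(self):
        # Close and rename the latest file; no data is copied.
        self.latest.close()
        name = f"snapshot-{len(self.snapshots)}.img"
        snap_path = os.path.join(self.directory, name)
        os.rename(self.latest_path, snap_path)
        self.snapshots.append(snap_path)
        self._open_latest()
        return snap_path


if __name__ == "__main__":
    low = LowWriteImage(block_size=4)
    low.write_block(7, b"aaaa")   # first write: allocates space
    low.write_block(7, b"cccc")   # second write: reuses the same space
    print(low.read_block(7))

    with tempfile.TemporaryDirectory() as tmp:
        multi = MultiWriteImage(tmp)
        multi.write(b"xyz")
        print(os.path.basename(multi.snapshot()))
        multi.latest.close()
```

In this sketch a snapshot of the multi-write region costs one rename rather than a data copy, which is one plausible reading of how the claimed format reduces I/O operations during snapshot maintenance.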