Blockchain-based data storage and verification method and device, and storage medium

By storing DICOM image files off-chain and storing a multi-way tree built from their key data elements on the blockchain, the method addresses the data security and privacy protection issues in DICOM image file sharing, ensuring file validity and security and supporting secure file sharing and circulation.

CN115394408B · Active · Publication Date: 2026-06-16 · NEUSOFT CORP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NEUSOFT CORP
Filing Date
2022-08-05
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Due to issues such as data security, privacy protection, and unclear responsibilities, the sharing and circulation of DICOM image files in the medical field is still in its early stages, and existing technologies are insufficient to guarantee the integrity and privacy of the data.

Method used

A blockchain-based data storage and verification method is adopted. By screening target data elements in DICOM image files to construct a multi-branch tree, the files are stored off-chain, while the multi-branch tree is stored on-chain to determine the validity of the files. This allows for partial data loss or alteration to protect privacy.

Benefits of technology

The method ensures the validity and immutability of DICOM image files while protecting patient privacy and security, and promotes their secure sharing and circulation.

✦ Generated by Eureka AI based on patent content.


Abstract

The present disclosure relates to a blockchain-based data storage and verification method, device and storage medium. The method comprises: in response to receiving a DICOM image file sent by a data provider, screening target data elements from the DICOM image file, the target data elements being used to interpret an image corresponding to the DICOM image file, and the number of the target data elements being less than the total number of data elements in the DICOM image file; constructing a multi-way tree according to the target data elements; storing the DICOM image file in an off-chain storage space; and storing a file ID of the DICOM image file, the multi-way tree, and a storage address of the DICOM image file in the off-chain storage space in an on-chain storage space, wherein the multi-way tree is used to determine whether the DICOM image file is valid. The present disclosure can guarantee the validity of the DICOM image file while allowing some data in the DICOM image file to be lost.

Description

Technical Field

[0001] This disclosure relates to the field of blockchain technology, and in particular to a blockchain-based data storage and verification method, apparatus, and storage medium.

Background Technology

[0002] DICOM (Digital Imaging and Communications in Medicine) is an international standard for medical images and related information. In the medical field, patient medical images are stored in the DICOM file format. A DICOM file contains the patient's PHI (protected health information), such as name, gender, and age, as well as other image-related information, such as information about the device that captured and generated the image and some medical context. In practice, after medical imaging equipment generates DICOM image files, medical staff use a DICOM reader (computer software capable of displaying DICOM images) to read the images and diagnose any problems found in them. A single medical image file can occupy a large amount of storage: a patient's CT scan can reach several hundred megabytes, and a thin-section CT scan can reach 1 gigabyte. As a result, medical imaging data accounts for approximately 80% of all medical data. This massive body of imaging data has significant research value for medical development.

[0003] Currently, some regions have established regional imaging centers to collect and store imaging data provided by medical institutions for research and study. However, because data providers are concerned about data security, privacy protection, and unclear responsibilities, the sharing and circulation of imaging data remains in its early stages.

Summary of the Invention

[0004] To overcome the problems existing in related technologies, this disclosure provides a data storage and verification method, apparatus and storage medium based on blockchain.

[0005] According to a first aspect of the present disclosure, a blockchain-based data storage and verification method is provided, the method comprising:

[0006] In response to receiving a DICOM image file sent by a data provider, target data elements are filtered from the DICOM image file. The target data elements are used to interpret the image corresponding to the DICOM image file. The number of target data elements is less than the total number of data elements in the DICOM image file.

[0007] Construct a multi-branch tree based on the target data elements;

[0008] The DICOM image file is stored in off-chain storage space;

[0009] The file ID of the DICOM image file, the multi-way tree, and the storage address of the DICOM image file in the off-chain storage space are stored in the on-chain storage space. The multi-way tree stored in the on-chain storage space is used to determine whether the DICOM image file stored in the off-chain storage space is valid.

[0010] Optionally, the number of target data elements is multiple, each data element includes a tag identifier, and the step of constructing a multi-way tree based on the target data elements includes:

[0011] Based on the tag category to which the target tag identifier of the target data element belongs, the multiple target data elements are divided into multiple class sets, and the class sets correspond one-to-one with the tag categories;

[0012] The class hash value corresponding to each of the class sets is used as the node hash value of the second-level class node of the multi-way tree, and the second-level class node of the multi-way tree corresponds one-to-one with the class set;

[0013] Perform a hash calculation on all of the class hash values to obtain the root hash value of the root node of the multi-way tree.

[0014] Optionally, constructing a multi-way tree based on the target data elements further includes:

[0015] For each of the class sets, the target data elements in the class set are grouped according to a preset number N to obtain M target data element groups, wherein M-1 of the target data element groups each contain N target data elements, and the remaining target data element group contains at most N target data elements;

[0016] The group hash value corresponding to each of the target data element groups is used as the node hash value of a third-level group node under the second-level class node corresponding to the class set. The third-level group nodes under the second-level class node corresponding to the class set correspond one-to-one with the target data element groups.

[0017] The class hash value corresponding to the class set is obtained by hashing the node hash values of the M third-level group nodes under the second-level class node corresponding to the class set.
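The grouping rule in [0015] can be sketched as follows. This is a minimal illustration only; the function name `group_elements` and the use of plain strings as stand-ins for target data elements are assumptions, not part of the disclosure.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def group_elements(elements: Sequence[T], n: int) -> List[List[T]]:
    """Split one class set into M groups: M-1 groups of exactly n
    elements and a final group holding at most n elements."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return [list(elements[i:i + n]) for i in range(0, len(elements), n)]


# Five placeholder elements with N = 2 give M = 3 groups.
groups = group_elements(["e1", "e2", "e3", "e4", "e5"], n=2)
print(groups)  # [['e1', 'e2'], ['e3', 'e4'], ['e5']]
```

Each resulting group would then map to one third-level group node under the class node.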

[0018] Optionally, constructing a multi-way tree based on the target data elements further includes:

[0019] For each target data element group, perform a hash calculation on each target data element in the target data element group to obtain the element hash value;

[0020] The hash value of each element is used as the node hash value of a fourth-level element node under the third-level group node corresponding to the target data element group. The fourth-level element nodes under the third-level group node corresponding to the target data element group correspond one-to-one with the target data elements in the target data element group.

[0021] The group hash value corresponding to the target data element group is obtained by hashing the node hash values of all fourth-level element nodes under the third-level group node corresponding to the target data element group.
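The four-level construction described in [0011] to [0021] might look like the sketch below. Several details are assumptions made for illustration: SHA-256 as the hash function, a parent hash computed over the concatenated child hashes, sorting to fix node order, and the simplified `(tag, value)` element representation. The disclosure does not specify any of these.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical simplified element: (tag identifier, raw value bytes).
Element = Tuple[str, bytes]


@dataclass
class Node:
    digest: str
    children: List["Node"] = field(default_factory=list)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def combine(children: List[Node]) -> str:
    # Assumption: a parent's hash covers its children's hashes in order.
    return sha256_hex("".join(c.digest for c in children).encode())


def build_tree(class_sets: Dict[str, List[Element]], n: int) -> Node:
    """Root -> class nodes -> group nodes -> element nodes."""
    class_nodes = []
    for category in sorted(class_sets):          # fixed order keeps the root reproducible
        elements = sorted(class_sets[category])  # order elements by tag
        group_nodes = []
        for i in range(0, len(elements), n):
            leaves = [Node(sha256_hex(tag.encode() + value))
                      for tag, value in elements[i:i + n]]
            group_nodes.append(Node(combine(leaves), leaves))
        class_nodes.append(Node(combine(group_nodes), group_nodes))
    return Node(combine(class_nodes), class_nodes)


sets = {
    "Image": [("0028,0010", b"512"), ("0028,0011", b"512"), ("0028,0030", b"0.5\\0.5")],
    "Patient": [("0010,0040", b"F")],
}
root = build_tree(sets, n=2)
print(root.digest)
```

Because each class has its own subtree, a change confined to one class leaves the hashes of the other class nodes untouched.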

[0022] Optionally, the multi-way tree is a value-based multi-way tree, and correspondingly, the root node is characterized based on the root hash value and the number of leaf nodes under the root node;

[0023] The second-level class nodes of the multi-branch tree are characterized by the corresponding class hash value and the number of leaf nodes under the second-level class node.
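As a rough illustration of [0022] and [0023], a node can carry both its hash value and the number of leaf nodes beneath it. The class name `ValueNode` and the helper functions are hypothetical, and the placeholder digests stand in for real hash values.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ValueNode:
    digest: str       # node hash value
    leaf_count: int   # number of leaf (element) nodes under this node
    children: Tuple["ValueNode", ...] = ()


def leaf(digest: str) -> ValueNode:
    return ValueNode(digest, 1)


def parent(digest: str, children: Iterable[ValueNode]) -> ValueNode:
    kids = tuple(children)
    return ValueNode(digest, sum(c.leaf_count for c in kids), kids)


def same_node(a: ValueNode, b: ValueNode) -> bool:
    # Two nodes match when both characterizing values match.
    return (a.digest, a.leaf_count) == (b.digest, b.leaf_count)


group_1 = parent("g1", [leaf("e1"), leaf("e2")])
group_2 = parent("g2", [leaf("e3")])
class_node = parent("c1", [group_1, group_2])
print(class_node.leaf_count)  # 3
```

Keeping the leaf count on the node makes the total number of leaf nodes available without walking the tree again.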

[0024] Optionally, the method further includes:

[0025] In response to receiving a DICOM image file query request from a data querying party, determine the target file ID carried in the query request;

[0026] Query the target storage address and the first target multi-branch tree corresponding to the target file ID from the chain;

[0027] The target DICOM image file is obtained from the off-chain storage space according to the target storage address;

[0028] Calculate the second target multi-branch tree corresponding to the target DICOM image file;

[0029] The first target multi-way tree and the second target multi-way tree are compared to obtain the similarity score;

[0030] If the target DICOM image file is determined to be valid based on the similarity score, the target DICOM image file is then fed back to the data querying party.
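The query flow in [0025] to [0030] can be sketched as a single function. The dictionaries standing in for on-chain and off-chain storage, the injected `build_tree` and `similarity` callables, and the `threshold` used to decide validity are all assumptions; the passage above does not say how validity is derived from the similarity score.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def handle_query(
    file_id: str,
    chain: Dict[str, Tuple[str, Any]],        # file ID -> (storage address, first tree)
    off_chain: Dict[str, bytes],              # storage address -> file bytes
    build_tree: Callable[[bytes], Any],
    similarity: Callable[[Any, Any], float],
    threshold: float = 0.9,                   # hypothetical validity cut-off
) -> Optional[bytes]:
    address, first_tree = chain[file_id]      # look up address and tree on-chain
    data = off_chain[address]                 # fetch the file off-chain
    second_tree = build_tree(data)            # recompute the tree from the file
    score = similarity(first_tree, second_tree)
    return data if score >= threshold else None


# Toy stand-ins: the "tree" is just the first four bytes of the file.
toy_tree = lambda data: data[:4]
toy_similarity = lambda a, b: 1.0 if a == b else 0.0
chain = {"file-1": ("addr-1", b"IMG0")}
off_chain = {"addr-1": b"IMG0 masked-patient-fields"}
result = handle_query("file-1", chain, off_chain, toy_tree, toy_similarity)
print(result is not None)  # True
```

In a real deployment the two callables would be the tree construction and tree comparison steps of the method.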

[0031] Optionally, the first target multi-way tree and the second target multi-way tree have the same structure, and the tag identifier of the target data element corresponding to the i-th fourth-level element node in the first target multi-way tree is the same as the tag identifier of the target data element corresponding to the i-th fourth-level element node in the second target multi-way tree. The step of comparing the first target multi-way tree and the second target multi-way tree to obtain a similarity score includes:

[0032] If the root node of the first target multi-way tree is the same as the root node of the second target multi-way tree, the similarity is determined to be 100%.

[0033] When the root node of the first target multi-way tree is different from the root node of the second target multi-way tree, determine, for each tag category, the first quantity, that is, the number of fourth-level element nodes under the corresponding second-level class node that are at the same position in the first target multi-way tree and the second target multi-way tree but have different node hash values.

[0034] The similarity is calculated based on the first quantity under each of the aforementioned tag categories, the weight of each of the aforementioned tag categories, and the total number of leaf nodes in the first target multi-way tree.
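The passage names the inputs to the similarity calculation but not the formula. Purely as an assumed example, one plausible form is 1 minus the weighted count of differing leaves divided by the total number of leaves. Both the formula and the weight values below are illustrative guesses, not the patented calculation.

```python
from typing import Dict


def similarity(
    first_quantities: Dict[str, int],   # tag category -> number of differing leaves
    weights: Dict[str, float],          # tag category -> weight (assumed in [0, 1])
    total_leaves: int,                  # leaf count of the first target tree
) -> float:
    """Assumed formula: 1 - sum(weight * differing leaves) / total leaves."""
    if total_leaves <= 0:
        raise ValueError("total_leaves must be positive")
    penalty = sum(weights[c] * n for c, n in first_quantities.items())
    return max(0.0, 1.0 - penalty / total_leaves)


score = similarity(
    first_quantities={"Patient": 2, "Image": 0},
    weights={"Patient": 0.5, "Image": 1.0},
    total_leaves=10,
)
print(round(score, 3))  # 0.9
```

Under this assumed form, a lower weight on the Patient category means masked patient fields reduce the score less than altered image fields would.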

[0035] Optionally, determining, for each tag category, the first quantity of fourth-level element nodes that are at the same position in the first target multi-way tree and the second target multi-way tree but have different node hash values includes:

[0036] For each tag category, compare whether the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree are the same;

[0037] If the second-level class nodes corresponding to the tag category are the same in the first target multi-way tree and the second target multi-way tree, the first quantity under the tag category is determined to be 0;

[0038] If the second-level class nodes corresponding to the tag category are different in the first target multi-way tree and the second target multi-way tree, determine the target third-level group nodes, that is, the third-level group nodes under those second-level class nodes that are at the same position in the two trees but have different node hash values;

[0039] Determine the second quantity, that is, the number of fourth-level element nodes under each of the target third-level group nodes that are at the same position but have different node hash values; and,

[0040] The sum of the second quantities corresponding to the target third-level group nodes is taken as the first quantity.
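The top-down comparison in [0036] to [0040] can be sketched as below: matching class or group hashes prune the whole subtree, so only differing groups are inspected leaf by leaf. The `Node` type and the placeholder digests are assumptions for illustration, and both trees are assumed to share the same structure, as stated in [0031].

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    digest: str
    children: List["Node"] = field(default_factory=list)


def first_quantity(class_a: Node, class_b: Node) -> int:
    """Count same-position element nodes whose hashes differ under one class."""
    if class_a.digest == class_b.digest:
        return 0                                  # identical class: nothing to count
    total = 0
    for group_a, group_b in zip(class_a.children, class_b.children):
        if group_a.digest == group_b.digest:
            continue                              # identical group: skip its leaves
        total += sum(                             # the "second quantity" for this group
            1 for leaf_a, leaf_b in zip(group_a.children, group_b.children)
            if leaf_a.digest != leaf_b.digest
        )
    return total


original = Node("c", [Node("g1", [Node("a"), Node("b")]), Node("g2", [Node("c")])])
masked = Node("c'", [Node("g1'", [Node("a"), Node("x")]), Node("g2", [Node("c")])])
print(first_quantity(original, masked))  # 1
```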

[0041] According to a second aspect of the present disclosure, a blockchain-based data storage and verification apparatus is provided, the apparatus comprising:

[0042] A receiving module is configured to, in response to receiving a DICOM image file sent by a data provider, filter target data elements from the DICOM image file, wherein the target data elements are used to interpret the image corresponding to the DICOM image file, and the number of target data elements is less than the total number of data elements in the DICOM image file.

[0043] The construction module is used to construct a multi-way tree based on the target data elements;

[0044] The first storage module is used to store the DICOM image file in off-chain storage space;

[0045] The second storage module is used to store the file ID of the DICOM image file, the multi-way tree, and the storage address of the DICOM image file in the off-chain storage space in the on-chain storage space. The multi-way tree stored in the on-chain storage space is used to determine whether the DICOM image file stored in the off-chain storage space is valid.

[0046] Optionally, the number of target data elements is multiple, and each data element includes a tag identifier. The construction module includes:

[0047] The classification submodule is used to divide multiple target data elements into multiple class sets based on the tag category to which the target tag identifier of the target data element belongs, and the class set corresponds one-to-one with the tag category;

[0048] The first execution submodule is used to use the class hash value corresponding to each of the class sets as the node hash value of the second-level class node of the multi-way tree, wherein the second-level class node of the multi-way tree corresponds one-to-one with the class set;

[0049] The first calculation submodule is used to perform a hash calculation on all of the class hash values to obtain the root hash value of the root node of the multi-way tree.

[0050] Optionally, the building module further includes:

[0051] The grouping submodule is used to group the target data elements in each class set according to a preset number N to obtain M target data element groups, wherein M-1 of the target data element groups each contain N target data elements, and the remaining target data element group contains at most N target data elements.

[0052] The second execution submodule is used to use the group hash value corresponding to each of the target data element groups as the node hash value of a third-level group node under the second-level class node corresponding to the class set, wherein the third-level group nodes under the second-level class node corresponding to the class set correspond one-to-one with the target data element groups.

[0053] The class hash value corresponding to the class set is obtained by hashing the node hash values of the M third-level group nodes under the second-level class node corresponding to the class set.

[0054] Optionally, the building module further includes:

[0055] The second calculation submodule is used to perform hash calculation on each target data element in each target data element group to obtain the element hash value.

[0056] The third execution submodule is used to use the hash value of each element as the node hash value of a fourth-level element node under the third-level group node corresponding to the target data element group, wherein the fourth-level element nodes under the third-level group node corresponding to the target data element group correspond one-to-one with the target data elements in the target data element group.

[0057] The group hash value corresponding to the target data element group is obtained by hashing the node hash values of all fourth-level element nodes under the third-level group node corresponding to the target data element group.

[0058] Optionally, the multi-way tree is a value-based multi-way tree, and correspondingly, the root node is characterized based on the root hash value and the number of leaf nodes under the root node;

[0059] The second-level class nodes of the multi-branch tree are characterized by the corresponding class hash value and the number of leaf nodes under the second-level class node.

[0060] Optionally, the device further includes:

[0061] The determination module is used to determine the target file ID carried in the query request in response to receiving a DICOM image file query request sent by the data query party;

[0062] The query module is used to query the target storage address and the first target multi-branch tree corresponding to the target file ID from the chain;

[0063] The acquisition module is used to acquire the target DICOM image file from the off-chain storage space according to the target storage address;

[0064] The calculation module is used to calculate the second target multi-branch tree corresponding to the target DICOM image file;

[0065] The comparison module is used to compare the first target multi-way tree and the second target multi-way tree to obtain the similarity.

[0066] The feedback module is used to return the target DICOM image file to the data query party when the target DICOM image file is determined to be valid based on the similarity score.

[0067] Optionally, the first target multi-way tree and the second target multi-way tree have the same structure, and the tag identifier of the target data element corresponding to the i-th fourth-level element node in the first target multi-way tree is the same as the tag identifier of the target data element corresponding to the i-th fourth-level element node in the second target multi-way tree. The comparison module includes:

[0068] The first determining submodule is used to determine the similarity as 100% when the root node of the first target multi-way tree is the same as the root node of the second target multi-way tree.

[0069] The second determining submodule is used to determine, when the root node of the first target multi-way tree is different from the root node of the second target multi-way tree and for each tag category, the first quantity, that is, the number of fourth-level element nodes under the corresponding second-level class node that are at the same position in the two trees but have different node hash values.

[0070] The third calculation submodule is used to calculate the similarity based on the first quantity under each of the tag categories, the weight of each of the tag categories, and the total number of leaf nodes of the first target multi-way tree.

[0071] Optionally, the second determining submodule includes:

[0072] The comparison submodule is used to compare whether the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree are the same for each tag category;

[0073] The third determining submodule is used to determine that the first quantity under the tag category is 0 when the second-level class nodes corresponding to the tag category are the same in the first target multi-way tree and the second target multi-way tree;

[0074] The fourth determining submodule is used to determine, when the second-level class nodes corresponding to the tag category are different in the first target multi-way tree and the second target multi-way tree, the target third-level group nodes, that is, the third-level group nodes under those second-level class nodes that are at the same position in the two trees but have different node hash values.

[0075] The fifth determining submodule is used to determine the second quantity, that is, the number of fourth-level element nodes under each target third-level group node that are at the same position but have different node hash values; and,

[0076] The fourth calculation submodule is used to take the sum of the second quantities corresponding to the target third-level group nodes as the first quantity.

[0077] According to a third aspect of the present disclosure, a computer-readable storage medium is provided, having stored thereon computer program instructions that, when executed by a processor, implement the steps of the blockchain-based data storage and verification method provided in the first aspect of the present disclosure.

[0078] According to a fourth aspect of the present disclosure, an apparatus is provided, comprising:

[0079] A memory on which computer programs are stored;

[0080] A processor is configured to execute the computer program in the memory to implement the steps of the blockchain-based data storage and verification method provided in the first aspect of this disclosure.

[0081] The technical solutions provided by the embodiments of this disclosure may include the following beneficial effects:

[0082] Upon receiving a DICOM image file from a data provider, target data elements are selected from the DICOM image file. These target data elements are data used to interpret the images corresponding to the DICOM image file, and these images are core data with significant research value for medical development. A multi-branch tree is constructed based on the target data elements. The DICOM image file is stored in off-chain storage, while the file ID of the DICOM image file, the multi-branch tree, and the storage address of the DICOM image file in off-chain storage are stored in on-chain storage. The multi-branch tree stored in on-chain storage is used to determine the validity of the DICOM image file stored in off-chain storage. This disclosed method, by storing the multi-branch tree used to determine the validity of the DICOM image file stored in off-chain storage on the blockchain, can prevent the multi-branch tree from being tampered with based on the characteristics of blockchain, thereby ensuring the validity of the DICOM image file in off-chain storage.

[0083] Furthermore, since the number of target data elements is less than the total number of data elements in the DICOM image file, this method of constructing a multi-way tree based on the target data elements to determine the validity of the DICOM image file stored in off-chain storage allows for the loss or alteration of some data in the DICOM image file without changing the corresponding image (i.e., the image corresponding to the DICOM image file can be interpreted). For example, it allows users to mask patient privacy information in the DICOM image file to ensure that the patient's privacy information is not disclosed, thereby protecting the patient's privacy. Therefore, this data processing method can ensure the validity and immutability of the DICOM image file while protecting the patient's privacy, thereby promoting the secure sharing and circulation of DICOM image files. In summary, this method can ensure the validity of the DICOM image file even if some data in the DICOM image file is lost.

[0084] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and are not intended to limit this disclosure.

Description of the Drawings

[0085] The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments consistent with this disclosure and, together with the description, serve to explain the principles of this disclosure.

[0086] Figure 1 is a schematic diagram of the file structure of a DICOM image file according to an exemplary embodiment of the present disclosure.

[0087] Figure 2 is a flowchart illustrating a blockchain-based data storage and verification method according to an exemplary embodiment of this disclosure.

[0088] Figure 3 is a schematic diagram of a multi-branch tree according to an exemplary embodiment of the present disclosure.

[0089] Figure 4 is a flowchart illustrating a data query verification process according to an exemplary embodiment of the present disclosure.

[0090] Figure 5 is a block diagram illustrating a blockchain-based data storage and verification device according to an exemplary embodiment of the present disclosure.

[0091] Figure 6 is a block diagram illustrating another blockchain-based data storage and verification apparatus according to an exemplary embodiment of the present disclosure.

Detailed Implementation

[0092] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this disclosure as detailed in the appended claims.

[0093] It should be noted that all actions involving the acquisition of signals, information, or data in this application are carried out in compliance with the relevant data protection laws and policies of the country where the application is located, and with the authorization granted by the owner of the relevant device.

[0094] To facilitate a better understanding of the technical solutions disclosed herein by those skilled in the art, the blockchain technology and file structure of DICOM image files involved in the embodiments of this disclosure will be described below.

[0095] Blockchain is a chain-like data structure that combines data blocks in chronological order, constructing an immutable and unforgeable distributed ledger. Based on cryptography, peer-to-peer (P2P) technology, distributed consensus algorithms, and other computer science theories and technologies, blockchain provides secure record-keeping and storage services for transactions without the intervention of third-party nodes. Blockchain technology is increasingly being applied in various industries and is an indispensable technology for the future development of the digital economy and the construction of new trust systems.

[0096] DICOM image files have a fixed file structure. As shown in Figure 1, a DICOM image file consists of three parts: a preamble, a prefix, and data elements. The prefix is always "DICM", indicating that the file is in DICOM format. If the prefix is not this fixed value, the file is either corrupted or not a DICOM image file. The data elements are arranged in order of their tags.
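A prefix check along these lines could be written as follows. In the DICOM file format (Part 10 of the standard), the file begins with a 128-byte preamble followed by the four-byte prefix "DICM"; the function name `has_dicom_prefix` is a hypothetical helper.

```python
PREAMBLE_LENGTH = 128
PREFIX = b"DICM"


def has_dicom_prefix(data: bytes) -> bool:
    """Return True when the bytes after the 128-byte preamble are 'DICM'."""
    end = PREAMBLE_LENGTH + len(PREFIX)
    return len(data) >= end and data[PREAMBLE_LENGTH:end] == PREFIX


sample = b"\x00" * 128 + b"DICM" + b"...data elements..."
print(has_dicom_prefix(sample))        # True
print(has_dicom_prefix(b"not dicom"))  # False
```

A file that fails this check would be rejected before any data elements are screened.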

[0097] Each data element consists of four parts: tag identifier, value representation, data length, and data value. The tag identifier consists of a group number and an element number, and is the basis on which data elements are sorted. Tag identifiers are defined in the data dictionary of the DICOM protocol; a tag identifier determines the data type or content category of its data element, or of the file as a whole.
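To make the four-part structure concrete, the sketch below decodes one data element from bytes. It assumes the Explicit VR Little Endian encoding and handles only value representations that use the short form (a two-byte length); value representations such as OB, OW, or SQ use a longer header and are not covered. The names `DataElement` and `parse_short_element` are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataElement:
    group: int     # tag group number
    element: int   # tag element number
    vr: str        # value representation, e.g. "CS"
    value: bytes   # raw data value

    @property
    def tag(self) -> str:
        return f"({self.group:04X},{self.element:04X})"


def parse_short_element(buf: bytes, offset: int = 0) -> Tuple[DataElement, int]:
    """Parse one short-form element; return it and the next offset."""
    group, element = struct.unpack_from("<HH", buf, offset)
    vr = buf[offset + 4:offset + 6].decode("ascii")
    (length,) = struct.unpack_from("<H", buf, offset + 6)
    start = offset + 8
    return DataElement(group, element, vr, buf[start:start + length]), start + length


# Patient's Sex (0010,0040), VR "CS", value padded to an even length.
raw = struct.pack("<HH", 0x0010, 0x0040) + b"CS" + struct.pack("<H", 2) + b"F "
elem, next_offset = parse_short_element(raw)
print(elem.tag, elem.vr, elem.value)  # (0010,0040) CS b'F '
```

Sorting parsed elements by `(group, element)` reproduces the tag order described above.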

[0098] There are approximately two thousand tags defined in the DICOM protocol. In this embodiment of the disclosure, the tags in the DICOM protocol are divided into five tag categories: Patient (representing patient-related information), Study (representing examination-related information), Series (representing examination sequence-related information), Image (representing image-related information), and common (representing other related information).
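Dividing elements into the five tag categories could be done with a lookup table, as in the sketch below. The table contents are a small hypothetical sample keyed by DICOM keyword; the disclosure does not list which tags fall into which category, so the assignments here are assumptions.

```python
from typing import Dict, Iterable, List

CATEGORIES = ("Patient", "Study", "Series", "Image", "Common")

# Hypothetical sample mapping; a real table would cover every tag in use.
CATEGORY_BY_KEYWORD: Dict[str, str] = {
    "PatientSex": "Patient",
    "PatientAge": "Patient",
    "StudyDate": "Study",
    "SeriesNumber": "Series",
    "Rows": "Image",
    "Columns": "Image",
    "PixelSpacing": "Image",
}


def classify(keywords: Iterable[str]) -> Dict[str, List[str]]:
    """Place each keyword in its category; unknown keywords go to Common."""
    class_sets: Dict[str, List[str]] = {c: [] for c in CATEGORIES}
    for keyword in keywords:
        class_sets[CATEGORY_BY_KEYWORD.get(keyword, "Common")].append(keyword)
    return class_sets


class_sets = classify(["PatientSex", "Rows", "Columns", "Manufacturer"])
print(class_sets["Image"])   # ['Rows', 'Columns']
print(class_sets["Common"])  # ['Manufacturer']
```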

[0099] Based on the characteristics of blockchain, and to address the problems existing in related technologies, this disclosure proposes using blockchain in the storage of DICOM image files to ensure their integrity and validity. Related technologies also provide methods for ensuring data integrity and validity and thereby safeguarding file security. One such method hashes the complete file data to obtain a Merkle root, stores the Merkle root on the blockchain, and stores the file data off-chain. To verify the integrity and validity of the off-chain file data, the data is re-hashed to obtain a new Merkle root, which is compared with the on-chain Merkle root. If they match, the off-chain file data is complete and valid (usable). If they do not match, the off-chain file data has been tampered with or corrupted and is therefore untrustworthy. However, this method only determines whether the file data is valid, not its degree of validity. The degree of validity is the proportion of valid data in the current off-chain file data relative to the original file data, where valid data is the portion of the current off-chain file data that is identical to the original file data and can be determined by comparing the two. Therefore, in this embodiment, the degree of validity of the verified file data refers to the similarity between the verified file data and the original file data.
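For contrast, the related-art check described in this paragraph reduces to a yes-or-no comparison of two Merkle roots. The sketch below assumes SHA-256, fixed-size chunks, and duplication of the last hash on odd levels (a common convention); none of these details come from the disclosure.

```python
import hashlib


def merkle_root(data: bytes, chunk_size: int = 1024) -> str:
    """Binary Merkle root over fixed-size chunks of the whole file."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    level = [hashlib.sha256(chunk).digest() for chunk in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last hash on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


def is_valid(off_chain_data: bytes, on_chain_root: str) -> bool:
    return merkle_root(off_chain_data) == on_chain_root


original = b"A" * 3000
on_chain_root = merkle_root(original)
masked = original[:-1] + b"B"  # a single changed byte
print(is_valid(original, on_chain_root))  # True
print(is_valid(masked, on_chain_root))    # False
```

The result for the masked file is simply False: the check gives no indication of how much of the file changed or where.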

[0100] Related-art methods for ensuring data integrity and validity cannot determine the degree of validity of file data because they hash the entire file data to obtain a single Merkle root. This Merkle root represents the file data only as a whole and cannot represent any portion of it. Comparing Merkle roots can therefore only determine whether the file data as a whole is valid or invalid. When the file data is found to be invalid, it is impossible to know which data caused the invalidity, or what percentage of the data is affected.

[0101] Furthermore, once Merkle root data is stored on-chain, it becomes impossible to modify the off-chain file data based on application-layer business needs, such as the inability to perform sensitive data masking. Therefore, this approach is highly limited and inflexible.

[0102] The inventors discovered that even when certain data elements in a DICOM image file are corrupted, the image can still be interpreted by ignoring the corrupted data elements. Image data is core data with significant research value for medical advancements. For example, data elements under the "Patient" category may include PatientName, PatientID, PatientSex, PatientAge, PatientAddress, and PatientTelephoneNumbers. The loss of PatientAddress, PatientTelephoneNumbers, and other data elements in a DICOM image file has negligible impact on image interpretation. Therefore, this disclosure, based on the discovery of the structural characteristics of DICOM image files, proposes a novel blockchain-based data storage and verification method, apparatus, and storage medium suitable for DICOM image files.

[0103] The following provides a detailed description of the embodiments of the technical solution disclosed herein.

[0104] Figure 2 is a flowchart illustrating a blockchain-based data storage and verification method according to an exemplary embodiment. The method can be applied to data sharing systems, data sharing platforms, and any node on a blockchain network. As shown in Figure 2, the method may include the following steps.

[0105] S11. In response to receiving a DICOM image file sent by a data provider, filter target data elements from the DICOM image file. The target data elements are used to interpret the image corresponding to the DICOM image file. The number of target data elements is less than the total number of data elements in the DICOM image file.

[0106] The image interpreted from the DICOM image file in this embodiment may refer to an image scanned by a medical imaging device, which is composed of pixels and the value of each pixel. In some embodiments, the image interpreted from the DICOM image file may refer to an image scanned by a medical imaging device and related image information that can assist medical personnel in identifying the image content, such as patient gender, age, and other related image information.

[0107] It should be noted that the same image content may lead to different diagnoses for patients of different ages and genders due to variations in age, gender, and other relevant indicators. Therefore, in this embodiment, target data elements can be predefined according to the needs of the business layer. For example, upon receiving a DICOM image file from a data provider, target data elements are selected from the DICOM image file based on the needs of the business layer or a predefined set of target tags. These target data elements are used to interpret the image corresponding to the DICOM image file. The number of target data elements is less than the total number of data elements in the DICOM image file. For instance, target data elements may not include data elements such as PatientName, PatientAddress, and PatientTelephoneNumbers. Target data elements may include data elements such as Slice thickness, Columns, Pixel spacing, Window center, and Window width.
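
The screening in step S11 can be sketched as a filter over a predefined set of target tags. This is a minimal illustration, not the disclosed implementation: the dictionary representation of a DICOM file, the function name `filter_target_elements`, and the particular contents of `TARGET_TAGS` are assumptions chosen to mirror the examples in the paragraph above.

```python
# Hypothetical predefined target tag set; which tags belong here is a
# business-layer decision and is assumed for illustration only.
TARGET_TAGS = {"SliceThickness", "Columns", "PixelSpacing", "WindowCenter", "WindowWidth"}


def filter_target_elements(elements: dict, target_tags: set) -> dict:
    """Keep only the data elements whose tag is in the predefined target set."""
    return {tag: value for tag, value in elements.items() if tag in target_tags}


# A toy stand-in for the data elements of one DICOM image file.
dicom_elements = {
    "PatientName": "DOE^JANE",
    "PatientAddress": "1 Example Street",
    "SliceThickness": "1.25",
    "Columns": "512",
    "WindowCenter": "40",
    "WindowWidth": "400",
}

targets = filter_target_elements(dicom_elements, TARGET_TAGS)
```

In this toy input, the privacy-related elements are excluded, so the number of target data elements is smaller than the total number of data elements.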

[0108] S12. Construct a multi-branch tree based on the target data elements.

[0109] It should be explained that each node in a binary tree has one data item and at most two child nodes. If each node in the tree is allowed to have more than two child nodes, then such a tree is called an n-order multi-way tree, or an n-ary tree.

[0110] S13. Store the DICOM image file in off-chain storage space.

[0111] S14. Store the file ID of the DICOM image file, the multi-way tree, and the storage address of the DICOM image file in the off-chain storage space in the on-chain storage space, wherein the multi-way tree stored in the on-chain storage space is used to determine whether the DICOM image file stored in the off-chain storage space is valid.

[0112] The file ID of a DICOM image file is used to uniquely identify the DICOM image file. The file ID of a DICOM image file can be generated after the DICOM image file is received from the data provider, or it can be provided by the data provider.
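
Steps S13 and S14 can be sketched with two in-memory dictionaries standing in for the off-chain and on-chain storage spaces. Everything named here (`off_chain_store`, `on_chain_ledger`, `store_file`, the `offchain://` address scheme, and the use of a UUID as a generated file ID) is a hypothetical placeholder; a real deployment would use an actual storage service and a blockchain client.

```python
import uuid
from typing import Optional

off_chain_store = {}   # stand-in for the off-chain storage space: address -> file bytes
on_chain_ledger = {}   # stand-in for the on-chain storage space: file ID -> record


def store_file(file_bytes: bytes, tree: dict, file_id: Optional[str] = None) -> str:
    """Store the file off-chain and record (file ID, tree, address) on-chain."""
    # The file ID may come from the data provider or be generated on receipt.
    file_id = file_id or uuid.uuid4().hex
    address = f"offchain://{file_id}"          # hypothetical address scheme
    off_chain_store[address] = file_bytes      # S13
    on_chain_ledger[file_id] = {"tree": tree, "address": address}  # S14
    return file_id


fid = store_file(b"...dicom bytes...", {"hash": "placeholder-root-hash"})
record = on_chain_ledger[fid]
```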

[0113] Since the height of the tree sets an unavoidable lower bound on the cost of a search, in this embodiment of the disclosure the height of the multi-branch tree can be set to 4. This shortens the data processing time (i.e., reduces the time complexity) when determining, based on the multi-branch tree, whether the DICOM image file stored off-chain is valid.

[0114] By employing the method disclosed herein, the multi-branch tree used to determine the validity of DICOM image files stored in off-chain storage is stored on the blockchain. This approach prevents the multi-branch tree from being tampered with based on the characteristics of the blockchain, thereby ensuring the validity of DICOM image files stored off-chain.

[0115] Furthermore, since the number of target data elements is less than the total number of data elements in the DICOM image file, the multi-branch tree constructed from the target data elements can be used to determine the validity of the image corresponding to the DICOM image file (i.e., of the target data elements from which that image is interpreted). On this basis, the multi-branch tree stored on-chain allows data in the off-chain DICOM image file that was not involved in constructing the tree to be lost or modified. Specifically, the multi-branch tree can determine the validity of the target data elements from which it was constructed, but not the validity of the other data elements. In other words, whether or not data elements other than the target data elements are modified does not affect the determination of the validity of the DICOM image file based on the multi-branch tree. Storing the multi-branch tree on the blockchain therefore avoids the potential risks caused by users modifying the target data elements, while allowing the other data elements to be modified. For example, users or applications may mask patient privacy information (such as name and address) in DICOM image files so that it is not disclosed. This data processing method can thus protect patient privacy while ensuring the validity and immutability of DICOM image files, even when users have privacy protection needs, thereby promoting the secure sharing and circulation of DICOM image files. In summary, this method can maintain the validity of DICOM image files even if some data is lost.

[0116] Optionally, the number of target data elements is multiple, each data element includes a tag identifier, and the step of constructing a multi-way tree based on the target data elements includes:

[0117] Based on the tag category to which the target tag identifier of each target data element belongs, the multiple target data elements are divided into multiple class sets, and the class sets correspond one-to-one with the tag categories; the class hash value corresponding to each class set is used as the node hash value of a second-level class node of the multi-way tree, and the second-level class nodes of the multi-way tree correspond one-to-one with the class sets; hash calculation is performed on all the class hash values to obtain the root hash value of the root node of the multi-way tree.

[0118] In this embodiment, the tags in the DICOM protocol are divided into five categories: Patient (representing patient-related information), Study (representing examination-related information), Series (representing examination sequence information), Image (representing image-related information), and common (representing other related information). Based on the tag category to which the target tag of a target data element belongs, all target data elements are divided into five class sets. Each class set corresponds one-to-one with a tag category.

[0119] The five class hash values corresponding to the five class sets are used as the node hash values of the second-level class nodes of the multi-way tree. The number of second-level class nodes in the multi-way tree is 5, and the second-level class nodes of the multi-way tree correspond one-to-one with the class sets.

[0120] It should be noted that, in this embodiment of the disclosure, for ease of describing the relationship between the second-level nodes of the multi-way tree and the class sets, the second-level nodes of the multi-way tree are referred to as second-level class nodes. The parent node of each second-level class node in the multi-way tree is the first-level node (i.e., the root node). Therefore, by hashing the five class hash values, the root hash value of the root node of the multi-way tree can be obtained. For example, assume that the node hash values of the second-level class nodes are H_a (the class set corresponding to the tag category Patient), H_b (the class set corresponding to the tag category Study), H_c (the class set corresponding to the tag category Series), H_d (the class set corresponding to the tag category Image), and H_e (the class set corresponding to the tag category common). The root hash value of the root node of the multi-way tree can then be calculated using the formula H_root = hash(H_a || H_b || H_c || H_d || H_e). Hash calculation methods are described in the related art and are not detailed in this disclosure.

[0121] The calculation method for the class hash value corresponding to the class set will be explained in the following embodiments.

[0122] Optionally, constructing a multi-way tree based on the target data elements further includes:

[0123] For each class set, the target data elements in the class set are grouped according to a preset number N to obtain M target data element groups, wherein M-1 of the target data element groups each contain N target data elements, and the remaining target data element group contains at most N target data elements; the group hash value corresponding to each target data element group is used as the node hash value of a third-level group node under the second-level class node corresponding to the class set, and the third-level group nodes under the second-level class node corresponding to the class set correspond one-to-one with the target data element groups; wherein the class hash value corresponding to the class set is obtained by hashing the node hash values of the M third-level group nodes under the second-level class node corresponding to the class set.

[0124] Here, the preset quantity N refers to the maximum number of leaf nodes under each third-level node in the multi-way tree. For example, N is a positive integer such as 7, 8, or 9. M is also an integer, and M is specifically determined based on the number of target data elements in the class set and N.

[0125] Taking N=8 as an example, each class set is divided into M groups of 8 target data elements per group. The first M-1 target data element groups each contain 8 target data elements, and the last target data element group contains at most 8 target data elements. In one implementation, the target data elements are ordered by their target tag identifiers, and the target data elements corresponding to every 8 consecutive target tag identifiers form one group.
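
The grouping rule can be sketched as sorting by tag identifier and slicing into chunks of N. The function name `group_elements` and the placeholder tag strings are assumptions for illustration; only the "N per group, last group at most N" rule comes from the text.

```python
def group_elements(tags: list, n: int = 8) -> list:
    """Order tags by identifier and split them into consecutive groups of n.

    The first M-1 groups hold exactly n tags; the last group holds at most n.
    """
    ordered = sorted(tags)
    return [ordered[i:i + n] for i in range(0, len(ordered), n)]


# 19 placeholder tag identifiers in one class set (hypothetical values).
tags = [f"tag{i:02d}" for i in range(19)]
groups = group_elements(tags, n=8)
group_sizes = [len(g) for g in groups]
```

With 19 elements and N=8 this yields M=3 groups: two full groups and one group of 3.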

[0126] After grouping, the group hash value corresponding to each target data element group is used as the node hash value of the third-level group node under the second-level class node of the class set. The third-level group node under the second-level class node of the class set corresponds one-to-one with the target data element group.

[0127] It should be noted that, in this embodiment of the disclosure, in order to facilitate the description of the correspondence between the third-level nodes of the multi-way tree and the target data element group, the third-level nodes of the multi-way tree are referred to as the third-level group nodes.

[0128] Based on the correspondence between a second-level class node and a class set as described in the foregoing embodiments, and considering that a class set can be divided into M target data element groups and that each target data element group corresponds to a third-level group node, it can be seen that the child nodes of the second-level class node corresponding to the class set are the M third-level group nodes corresponding to the M target data element groups in that class set.

[0129] Accordingly, the class hash value of a class set is obtained by hashing the group hash values corresponding to the M target data element groups under that class set. Alternatively, it can be understood that the class hash value corresponding to a class set (i.e., the node hash value of the second-level class node corresponding to the class set) is obtained by hashing the node hash values of the M third-level group nodes under the second-level class node corresponding to the class set. For example, suppose class set A is divided into four target data element groups F, G, H, and I, with corresponding group hash values H_f, H_g, H_h, and H_i. Then the class hash value H_a of class set A is calculated as H_a = hash(H_f || H_g || H_h || H_i).

[0130] The method for calculating the group hash value corresponding to the target data element group will be described in the following embodiments.

[0131] Optionally, constructing a multi-way tree based on the target data elements further includes:

[0132] For each target data element group, a hash calculation is performed on each target data element in the target data element group to obtain an element hash value; each element hash value is used as the node hash value of a fourth-level element node under the third-level group node corresponding to the target data element group, and the fourth-level element nodes under the third-level group node corresponding to the target data element group correspond one-to-one with the target data elements in the target data element group; wherein the group hash value corresponding to the target data element group is obtained by hash calculation of the node hash values of all fourth-level element nodes under the third-level group node corresponding to the target data element group.

[0133] Specifically, when performing hash calculations on target data elements based on their structure, the hash calculation can be performed on at least one of the tag identifier, value representation, data length, and data value in the target data element.

[0134] For example, suppose the target data element group F includes target data elements d0, d1, d2, d3, d4, d5, d6, and d7. A hash calculation is performed on each target data element in the target data element group F, resulting in the element hash values hash(d0), hash(d1), hash(d2), hash(d3), hash(d4), hash(d5), hash(d6), and hash(d7), respectively. Each element hash value is used as the node hash value of a fourth-level element node under the third-level group node corresponding to the target data element group F. The fourth-level element nodes under the third-level group node corresponding to the target data element group F correspond one-to-one with the target data elements d0 through d7 in the target data element group F. The group hash value H_f corresponding to the target data element group F is obtained by hashing the node hash values hash(d0) through hash(d7) of all fourth-level element nodes under the third-level group node corresponding to the target data element group F. The calculation can be expressed by the formula H_f = hash(hash(d0) || hash(d1) || ... || hash(d7)).
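
The element-level and group-level hashing can be sketched as follows. The text does not fix a hash function or a serialization, so SHA-256, the "|"-separated encoding of (tag identifier, value representation, data length, data value), and the helper names `h`, `element_hash`, and `group_hash` are all assumptions made for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    # SHA-256 is an assumed choice; the disclosure leaves the hash function open.
    return hashlib.sha256(data).digest()


def element_hash(tag: str, vr: str, value: str) -> bytes:
    # Hash the tag identifier, value representation, data length and data value.
    # The "|"-joined string is a hypothetical serialization.
    payload = "|".join([tag, vr, str(len(value)), value])
    return h(payload.encode("utf-8"))


def group_hash(elements: list) -> bytes:
    # H_f = hash(hash(d0) || hash(d1) || ... ), where || is byte concatenation.
    return h(b"".join(element_hash(*e) for e in elements))


# A small group of (tag identifier, VR, value) triples with made-up values.
group_f = [
    ("(0018,0050)", "DS", "1.25"),  # SliceThickness
    ("(0028,0011)", "US", "512"),   # Columns
    ("(0028,1050)", "DS", "40"),    # WindowCenter
]
h_f = group_hash(group_f)
```

Changing any one element in the group changes its element hash and therefore the group hash, which is what allows a modified element to be detected from the node above it.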

[0135] It should be noted that, in this embodiment of the disclosure, in order to facilitate the description of the relationship between the fourth-level node of the multi-way tree (i.e., the leaf node of the multi-way tree in this disclosure) and the target data element, the fourth-level node of the multi-way tree is referred to as the fourth-level element node.

[0136] As described in the foregoing embodiments, a target data element group corresponds to a third-level group node, and a target data element group includes multiple target data elements, with each target data element corresponding to a fourth-level element node. Therefore, the child nodes of the third-level group node corresponding to the target data element group are the fourth-level element nodes corresponding to each target data element in that target data element group.

[0137] Based on the foregoing embodiments, the multi-way tree in this disclosure has a height of 4. The first-level node is the root node, the second-level nodes are second-level class nodes, the third-level nodes are third-level group nodes, and the fourth-level nodes are fourth-level element nodes. Each fourth-level element node corresponds one-to-one with a target data element. Each third-level group node corresponds one-to-one with a target data element group. Each second-level class node corresponds one-to-one with a class set. The membership of target data elements in target data element groups determines the parent-child relationship between the fourth-level element nodes and the third-level group nodes. Similarly, the membership of target data element groups in class sets determines the parent-child relationship between the third-level group nodes and the second-level class nodes. Since all second-level class nodes belong to the root node, the connection relationships between all nodes in the multi-way tree can be determined, enabling the construction of a complete multi-way tree structure, for example the multi-branch tree shown in Figure 3.

[0138] It should be noted that in constructing a multi-way tree based on the above-described principles, the target data elements can be first classified and grouped. Then, hash calculations are performed on the target data elements within each group to construct the fourth-level element nodes of the multi-way tree. Next, based on the hierarchical relationship between the target data elements and their groups, the third-level group nodes of the multi-way tree are constructed according to the fourth-level element nodes. Then, based on the hierarchical relationship between the target data element groups and their class sets, the second-level class nodes of the multi-way tree are constructed according to the third-level group nodes. Finally, the root node of the multi-way tree is constructed based on all the second-level class nodes, thus obtaining the completed multi-way tree.

[0139] Another implementation involves first performing a hash calculation on each target data element to construct the fourth-level element nodes of the multi-way tree. Then, the target data elements are classified and grouped. Next, based on the dependency relationship between the target data elements and their groups, the third-level group nodes of the multi-way tree are constructed according to the fourth-level element nodes. Then, based on the dependency relationship between the target data element groups and their class sets, the second-level class nodes of the multi-way tree are constructed according to the third-level group nodes. Finally, the root node of the multi-way tree is constructed based on all the second-level class nodes, thus obtaining the completed multi-way tree.
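
Putting the steps together, a bottom-up construction of the four-level tree might look like the sketch below. It follows the first ordering described above (classify, group, then hash upward). The nested-dictionary layout, the `build_tree` name, SHA-256, the "tag=value" leaf encoding, and the handling of an empty class set are all assumptions for illustration.

```python
import hashlib
from collections import defaultdict

CATEGORIES = ["Patient", "Study", "Series", "Image", "common"]


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # assumed hash function


def build_tree(elements: list, n: int = 8) -> dict:
    """elements: list of (category, tag, value) tuples. Returns a nested dict."""
    by_category = defaultdict(list)
    for category, tag, value in elements:
        by_category[category].append((tag, value))

    class_nodes = []
    for category in CATEGORIES:
        members = sorted(by_category[category])  # order by tag identifier
        group_nodes = []
        for i in range(0, len(members), n):
            # Level 4: one leaf hash per target data element.
            leaves = [h(f"{tag}={value}".encode()) for tag, value in members[i:i + n]]
            # Level 3: group hash over the concatenated leaf hashes.
            group_nodes.append({"hash": h(b"".join(leaves)), "leaves": leaves})
        # Level 2: class hash over the group hashes (empty class hashes b"").
        class_hash = h(b"".join(g["hash"] for g in group_nodes))
        class_nodes.append({"category": category, "hash": class_hash, "groups": group_nodes})
    # Level 1: root hash over the five class hashes.
    return {"hash": h(b"".join(c["hash"] for c in class_nodes)), "classes": class_nodes}


elements = [
    ("Patient", "PatientSex", "F"),
    ("Patient", "PatientAge", "045Y"),
    ("Series", "SliceThickness", "1.25"),
    ("Image", "Columns", "512"),
    ("Image", "WindowCenter", "40"),
]
tree = build_tree(elements)
changed = build_tree([("Patient", "PatientSex", "M")] + elements[1:])
```

In this sketch, altering one Patient element changes the Patient class hash and the root hash, while the hashes of the untouched classes stay the same.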

[0140] Optionally, the multi-way tree is a value-based multi-way tree, and correspondingly, the root node is characterized based on the root hash value and the number of leaf nodes under the root node. For example, the root node of the multi-way tree in Figure 3 can be represented by (x+y+z+n+m, R), where (x+y+z+n+m) represents the number of leaf nodes under the root node, and R represents H_root.

[0141] Accordingly, the second-level class nodes of the multi-way tree are characterized based on the corresponding class hash value and the number of leaf nodes under that second-level class node. For example, in Figure 3, the second-level class node of the class set corresponding to the tag category Patient can be represented as Patient(x, A), where x represents the number of leaf nodes under this second-level class node, and A represents H_a.

[0142] Optionally, the third-level group nodes of the multi-way tree are represented based on the corresponding group hash value and the number of leaf nodes under that third-level group node. For example, in Figure 3, the third-level group node corresponding to the target data element group F can be represented as (8, F), where 8 indicates that there are 8 leaf nodes (i.e., 8 fourth-level element nodes) under this third-level group node, and F represents H_f.
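
The (leaf count, hash) representation of a node can be sketched with a small recursive data structure. The `Node` class, its field names, and the string placeholders used in place of real hash values are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    node_hash: str
    children: List["Node"] = field(default_factory=list)

    @property
    def leaf_count(self) -> int:
        # A node without children is itself a leaf (fourth-level element node).
        if not self.children:
            return 1
        return sum(child.leaf_count for child in self.children)

    def label(self) -> tuple:
        # (number of leaf nodes under this node, node hash value)
        return (self.leaf_count, self.node_hash)


# Placeholder strings stand in for real hash values.
group_f = Node("H_f", [Node(f"hash(d{i})") for i in range(8)])
group_g = Node("H_g", [Node(f"hash(d{i})") for i in range(8, 11)])
patient = Node("H_a", [group_f, group_g])
```

Here `group_f.label()` corresponds to the (8, F) form described above, and the class node's count is the sum of the counts of its group nodes.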

[0143] Figure 4 is a flowchart illustrating a data query verification process according to an exemplary embodiment of this disclosure. As shown in Figure 4, the process includes:

[0144] S41. In response to receiving a DICOM image file query request sent by the data querying party, determine the target file ID carried in the query request.

[0145] S42. Query the target storage address and the first target multi-branch tree corresponding to the target file ID from the chain.

[0146] S43. Obtain the target DICOM image file from the off-chain storage space according to the target storage address.

[0147] S44. Calculate the second target multi-branch tree corresponding to the target DICOM image file.

[0148] The construction method of the second target multi-way tree is the same as that of the first target multi-way tree stored on the chain.

[0149] S45. Compare the first target multi-branch tree and the second target multi-branch tree to obtain the similarity.

[0150] S46. If the target DICOM image file is determined to be valid based on the similarity score, the target DICOM image file is fed back to the data querying party.

[0151] Similarity represents the validity of a target DICOM image file. In one implementation, a target DICOM image file is determined to be valid if the similarity is greater than a preset similarity threshold (e.g., 90%). Conversely, a target DICOM image file is determined to be invalid if the similarity is less than or equal to the preset similarity threshold.
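
The threshold rule can be sketched as a strict comparison. The function name `is_valid` is hypothetical, and the default of 0.90 simply reuses the 90% example from the paragraph above.

```python
def is_valid(similarity: float, threshold: float = 0.90) -> bool:
    """Valid only if the similarity is strictly greater than the threshold."""
    return similarity > threshold


accepted = is_valid(0.95)      # above the threshold
boundary = is_valid(0.90)      # equal to the threshold counts as invalid
```

Note that a similarity exactly equal to the threshold is treated as invalid, matching the "less than or equal to" wording above.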

[0152] Because the first target multi-branch tree on the chain is constructed not from the entire DICOM image file but from a subset of its data elements (i.e., the target data elements), this approach allows some data in the DICOM image file stored off-chain to be lost or altered without changing the corresponding image (i.e., the image corresponding to the DICOM image file can still be interpreted). For example, it allows users to mask patient privacy information in the DICOM image file to keep it confidential, thus protecting patient privacy. This data processing method therefore tolerates differences between the first and second target multi-branch trees. Even when there are slight differences, the validity of the target DICOM image file can still be determined, and the specific degree of validity (i.e., the similarity) can be calculated. This solves the problem in the related art that Merkle-root-based methods for ensuring data integrity and validity can only determine whether file data is valid, not the degree of validity. After the multi-branch tree of a DICOM image file has been stored on-chain, sensitive information in the off-chain DICOM image file can be masked, protecting patient privacy while the masked DICOM image file remains valid. In summary, with the method disclosed herein, the validity of a DICOM image file can be guaranteed even if some of its data is lost, provided that important image data is not lost.

[0153] Optionally, the first target multi-way tree and the second target multi-way tree have the same structure, and the tag identifier of the target data element corresponding to the i-th fourth-level element node in the first target multi-way tree is the same as the tag identifier of the target data element corresponding to the i-th fourth-level element node in the second target multi-way tree. The step of comparing the first target multi-way tree and the second target multi-way tree to obtain a similarity score includes:

[0154] If the root node of the first target multi-way tree is the same as the root node of the second target multi-way tree, the similarity is determined to be 100%. If the root node of the first target multi-way tree is different from the root node of the second target multi-way tree, then for each tag category in the first and second target multi-way trees, the first number of fourth-level element nodes that are at the same position but have different node hash values under the corresponding second-level class node is determined. The similarity is calculated based on the first number under each tag category, the weight of each tag category, and the total number of leaf nodes in the first target multi-way tree.

[0155] When determining whether the nodes of the first target multi-way tree and the second target multi-way tree are the same, this disclosure may compare the node hash value of the node, or it may compare the node hash value and the number of leaf nodes under the node. The following embodiments of this disclosure will be described using the comparison of node hash value as an example.

[0156] Take the multi-way tree structure shown in Figure 3 as an example. Assume the first target multi-way tree is denoted T1 and the second target multi-way tree is denoted T2. T1 and T2 have identical structures, and the tag identifier of the target data element corresponding to the i-th fourth-level element node in T1 is the same as the tag identifier of the target data element corresponding to the i-th fourth-level element node in T2. Therefore, the k-th node at the j-th level of T1 can be compared one-to-one with the k-th node at the j-th level of T2.

[0157] If the root hash value R1 of the root node of T1 is the same as the root hash value R2 of the root node of T2, then the node hash values of the i-th fourth-level element node in T1 and the i-th fourth-level element node in T2 are also the same. Therefore, the similarity between T1 and T2 is 100%. This follows the same principle as Merkle root comparison in the related art and is not elaborated further here.

[0158] When the root hash value R1 of the root node of T1 is different from the root hash value R2 of the root node of T2, for the second-level class nodes A1 and A2 corresponding to the tag category Patient in T1 and T2, perform a peer comparison (i.e., a comparison of nodes at the same position) between the fourth-level element nodes under the second-level class node A1 in T1 and the fourth-level element nodes under the second-level class node A2 in T2, to obtain the first number P_A of fourth-level element nodes that are at the same position but have different node hash values.

[0159] Similarly, we can obtain the first quantity P_B corresponding to the second-level class nodes B1 and B2 of the tag category Study, the first quantity P_C corresponding to the second-level class nodes C1 and C2 of the tag category Series, the first quantity P_D corresponding to the second-level class nodes D1 and D2 of the tag category Image, and the first quantity P_E corresponding to the second-level class nodes E1 and E2 of the tag category common.

[0160] The similarity is calculated based on the first quantity under each tag category, the weight of each tag category, and the total number of leaf nodes in the first target multi-way tree.

[0161]

[0162] Where V represents similarity, and A′, B′, C′, D′, and E′ are the weights of each tag category.
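
The expression for V is not reproduced in this text. Purely to illustrate how the three stated inputs (the first quantities, the category weights, and the total number of leaf nodes) could combine, the sketch below assumes the form V = 1 - (A′·P_A + B′·P_B + C′·P_C + D′·P_D + E′·P_E) / L, where L is the total number of leaf nodes in the first target multi-way tree. This assumed form, the function name `similarity`, and the sample numbers are hypothetical and should not be read as the disclosed formula.

```python
def similarity(first_counts: dict, weights: dict, total_leaves: int) -> float:
    """Assumed form: V = 1 - sum(weight_c * P_c) / total_leaves."""
    weighted_mismatches = sum(weights[c] * p for c, p in first_counts.items())
    return 1.0 - weighted_mismatches / total_leaves


# Hypothetical inputs: 2 mismatched leaves under Patient, none elsewhere.
first_counts = {"Patient": 2, "Study": 0, "Series": 0, "Image": 0, "common": 0}
weights = {"Patient": 1.0, "Study": 1.0, "Series": 1.0, "Image": 1.0, "common": 1.0}
v = similarity(first_counts, weights, total_leaves=40)
```

Under this assumed form, identical trees (all first quantities equal to 0) give V = 1, which agrees with the 100% similarity stated for identical root nodes.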

[0163] Optionally, determining the first number of fourth-level element nodes with different hash values at the same position under each tag category in the first target multi-way tree and the second target multi-way tree includes:

[0164] For each tag category, compare whether the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree are the same; if the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree are the same, determine that the first quantity under the tag category is 0; if the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree are not the same, determine the target third-level group nodes, under the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree, whose node hash values at the same position are different; determine the second quantity of fourth-level element nodes under each target third-level group node whose node hash values at the same position are different; and take the sum of the second quantities corresponding to the target third-level group nodes as the first quantity.

[0165] Take the second-level class nodes A1 and A2 from the aforementioned embodiment as an example. Compare whether A1 and A2 are the same. If A1 and A2 are the same, the first quantity under the tag category Patient is determined to be 0. This follows the same principle as the similarity being 100% when the root nodes are the same.

[0166] If A1 and A2 are different, identify the target third-level group nodes under A1 and A2 whose node hash values at the same position are different. That is, compare F1 with F2, G1 with G2, H1 with H2, and I1 with I2 (taking only the third-level group nodes up to node I shown in Figure 3 as an example) to determine whether they are the same, and thereby identify the differing target third-level group nodes.

[0167] If we assume the target third-level group nodes are F1 and F2, then we further determine the second number of fourth-level element nodes that are at the same position but have different node hash values under the target third-level group nodes F1 and F2. Then, we take the sum of the second numbers corresponding to all target third-level group nodes as the first number.

[0168] Comparing nodes layer by layer from the root node reduces the complexity compared with directly comparing the leaf nodes of T1 and T2, because when the nodes at a certain level are identical there is no need to compare their child nodes. For example, when comparing A1 and A2, if A1 and A2 are identical, then the leaf nodes hash(d0) through hash(d31) under A1 and A2 are all identical. That is, there is no need to compare the child nodes of A1 and A2, and therefore no need to compare their leaf nodes hash(d0) through hash(d31). Compared with comparing the leaf nodes hash(d0) through hash(d31) one by one, comparing A1 and A2 directly greatly reduces the comparison complexity.
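
The layer-by-layer comparison with pruning can be sketched for a single tag category as follows. The nested-dictionary node layout, the helper `make_class`, the function name `count_mismatched_leaves`, and the use of SHA-256 are assumptions for illustration; both trees are assumed to have identical structure, as stated in the embodiment.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # assumed hash function


def make_class(groups_of_values: list) -> dict:
    """Build one second-level class node from lists of element values."""
    groups = []
    for values in groups_of_values:
        leaves = [h(v.encode()) for v in values]
        groups.append({"hash": h(b"".join(leaves)), "leaves": leaves})
    return {"hash": h(b"".join(g["hash"] for g in groups)), "groups": groups}


def count_mismatched_leaves(class1: dict, class2: dict) -> int:
    """First quantity for one tag category, pruning identical subtrees."""
    if class1["hash"] == class2["hash"]:
        return 0  # identical class nodes: no descendant needs comparing
    first_quantity = 0
    for g1, g2 in zip(class1["groups"], class2["groups"]):
        if g1["hash"] == g2["hash"]:
            continue  # identical group nodes: skip their leaves
        # Second quantity: differing leaves at the same position in this group.
        first_quantity += sum(1 for a, b in zip(g1["leaves"], g2["leaves"]) if a != b)
    return first_quantity


a1 = make_class([["d0", "d1", "d2"], ["d3", "d4"]])
a2 = make_class([["d0", "dX", "d2"], ["d3", "d4"]])  # one element altered
p_a = count_mismatched_leaves(a1, a2)
```

In this example only the first group's leaves are inspected, because the second group's hashes match and that subtree is skipped.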

[0169] Optionally, in the process of determining the first number of fourth-level element nodes with the same position and different node hash values ​​under the second-level class node corresponding to each tag category in the first target multi-branch tree and the second target multi-branch tree, it can be determined which target fourth-level element nodes have the same position and different node hash values.

[0170] In some feasible implementations, after the target fourth-level element nodes are identified, the names of the target data elements corresponding to the target fourth-level element nodes and the similarity can be displayed to the data querying party. If the data querying party considers the change or loss of these target data elements acceptable, the target DICOM image file is considered valid for the data querying party and is then returned to the data querying party.

[0171] Figure 5 is a block diagram illustrating a blockchain-based data storage and verification device according to an exemplary embodiment. As shown in Figure 5, the blockchain-based data storage and verification device 500 includes:

[0172] The receiving module 510 is configured to, in response to receiving a DICOM image file sent by a data provider, filter target data elements from the DICOM image file, wherein the target data elements are used to interpret the image corresponding to the DICOM image file, and the number of target data elements is less than the total number of data elements in the DICOM image file.

[0173] Construction module 520 is used to construct a multi-branch tree based on the target data elements;

[0174] The first storage module 530 is used to store the DICOM image file in the off-chain storage space;

[0175] The second storage module 540 is used to store the file ID of the DICOM image file, the multi-branch tree, and the storage address of the DICOM image file in the off-chain storage space in the on-chain storage space, wherein the multi-branch tree stored in the on-chain storage space is used to determine whether the DICOM image file stored in the off-chain storage space is valid.

[0176] By using the aforementioned blockchain-based data storage and verification device, the multi-branch tree used to determine whether the DICOM image files stored in the off-chain storage space are valid is stored on the blockchain. This allows the multi-branch tree to be prevented from being tampered with based on the characteristics of the blockchain, thereby ensuring the validity of the DICOM image files in the off-chain storage space.

[0177] Furthermore, since the number of target data elements is less than the total number of data elements in the DICOM image file, this method of constructing a multi-way tree based on the target data elements to determine the validity of the DICOM image file stored in off-chain storage allows for the loss or alteration of some data in the DICOM image file without changing the corresponding image (i.e., ensuring the image can be deciphered). For example, it allows users to mask patient privacy information in the DICOM image file to prevent its disclosure, thus protecting patient privacy. Therefore, this method ensures the validity and immutability of the DICOM image file while protecting patient privacy, thereby promoting the secure sharing and circulation of DICOM image files. In summary, this method ensures the validity of the DICOM image file even with the loss of some data.

[0178] Optionally, there are multiple target data elements, and each data element includes a tag identifier. The construction module 520 includes:

[0179] The classification submodule is used to divide the multiple target data elements into multiple class sets according to the tag category to which the target tag identifier of each target data element belongs, wherein the class sets correspond one-to-one with the tag categories;

[0180] The first execution submodule is used to take the class hash value corresponding to each of the class sets as the node hash value of a second-level class node of the multi-way tree, wherein the second-level class nodes of the multi-way tree correspond one-to-one with the class sets;

[0181] The first calculation submodule is used to perform a hash calculation on all of the class hash values to obtain the root hash value of the root node of the multi-way tree.
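The classification and root-hash steps above can be sketched in Python as follows. This is an illustrative sketch only. It assumes that the tag category of an element is the group number of its DICOM tag (the first four hexadecimal digits), that SHA-256 is the hash function, and that the class hash is computed directly over the elements of the class set (the optional group and element levels are omitted here). The function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def tag_category(tag: str) -> str:
    # Assumption: the category is the DICOM group number, e.g. "0028" for "0028,0010".
    return tag.split(",")[0]


def classify(elements: dict[str, str]) -> dict[str, list[tuple[str, str]]]:
    """Divide target data elements into class sets, one set per tag category."""
    class_sets = defaultdict(list)
    for tag in sorted(elements):  # fixed order so the hashes are reproducible
        class_sets[tag_category(tag)].append((tag, elements[tag]))
    return dict(class_sets)


def class_hash(members: list[tuple[str, str]]) -> str:
    # Simplification: hash the members of the class set directly.
    return sha256_hex("|".join(f"{tag}={value}" for tag, value in members))


def build_two_level_tree(elements: dict[str, str]) -> dict:
    """Second-level class nodes carry class hashes; the root hashes all of them."""
    class_sets = classify(elements)
    class_nodes = {cat: class_hash(class_sets[cat]) for cat in sorted(class_sets)}
    root = sha256_hex("".join(class_nodes.values()))
    return {"root": root, "classes": class_nodes}
```

In this sketch, changing one element changes the root hash and the hash of exactly one class node, so a verifier can tell which tag category was affected.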

[0182] Optionally, the construction module 520 further includes:

[0183] The grouping submodule is used to, for each class set, group the target data elements in the class set according to a preset number N to obtain M target data element groups, wherein M-1 of the target data element groups each contain N target data elements, and the remaining target data element group contains at most N target data elements.

[0184] The second execution submodule is used to take the group hash value corresponding to each of the target data element groups as the node hash value of a third-level group node under the second-level class node corresponding to the class set, wherein the third-level group nodes under that second-level class node correspond one-to-one with the target data element groups.

[0185] The class hash value corresponding to the class set is obtained by hashing the node hash values of the M third-level group nodes under the second-level class node corresponding to the class set.
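As a sketch of the grouping step, the following Python fragment splits the elements of one class set into groups of a preset size N and derives the class hash from the group hashes. It is a minimal example under assumptions: elements are represented as strings, SHA-256 is an assumed hash function, the group hash is computed directly over the group members (the element level is omitted), and the function names are hypothetical.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def group_elements(elements: list[str], n: int) -> list[list[str]]:
    """Split into M groups: M-1 groups of exactly n items, the last with at most n."""
    if n < 1:
        raise ValueError("N must be a positive integer")
    return [elements[i:i + n] for i in range(0, len(elements), n)]


def class_hash_from_groups(elements: list[str], n: int) -> tuple[str, list[str]]:
    """Return the class hash and the M group hashes it was computed from."""
    groups = group_elements(elements, n)
    # Simplification: each group hash covers the group members directly.
    group_hashes = [sha256_hex("|".join(group)) for group in groups]
    return sha256_hex("".join(group_hashes)), group_hashes
```

With seven elements and N = 3, this yields M = 3 groups of sizes 3, 3 and 1; a change to the last element alters only the third group hash and the class hash.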

[0186] Optionally, the construction module 520 further includes:

[0187] The second calculation submodule is used to perform hash calculation on each target data element in each target data element group to obtain the element hash value.

[0188] The third execution submodule is used to take each element hash value as the node hash value of a fourth-level element node under the third-level group node corresponding to the target data element group, wherein the fourth-level element nodes under that third-level group node correspond one-to-one with the target data elements in the target data element group.

[0189] The group hash value corresponding to the target data element group is obtained by hashing the node hash values of all fourth-level element nodes under the third-level group node corresponding to the target data element group.
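Putting the four levels together (root, class nodes, group nodes, element nodes), the bottom-up construction might look like the following Python sketch. The assumptions are labeled in the code: the tag category is taken to be the DICOM group number, SHA-256 is an assumed hash function, an element is hashed as the string "tag=value", and the `Node` class and `build_tree` function are hypothetical names.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    node_hash: str
    children: list["Node"] = field(default_factory=list)
    label: str = ""


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_tree(elements: dict[str, str], n: int = 2) -> Node:
    """Build root -> class nodes -> group nodes -> element nodes, hashing bottom-up."""
    # Assumption: tag category = DICOM group number (text before the comma).
    by_category: dict[str, list[str]] = {}
    for tag in sorted(elements):
        by_category.setdefault(tag.split(",")[0], []).append(tag)

    class_nodes = []
    for category in sorted(by_category):
        tags = by_category[category]
        group_nodes = []
        for start in range(0, len(tags), n):
            # Fourth level: one element node per target data element.
            leaves = [
                Node(sha256_hex(f"{tag}={elements[tag]}"), label=tag)
                for tag in tags[start:start + n]
            ]
            # Third level: group hash over the element hashes of the group.
            group_hash = sha256_hex("".join(leaf.node_hash for leaf in leaves))
            group_nodes.append(Node(group_hash, leaves, f"{category}/g{start // n}"))
        # Second level: class hash over the group hashes of the class set.
        class_hash = sha256_hex("".join(g.node_hash for g in group_nodes))
        class_nodes.append(Node(class_hash, group_nodes, category))
    # Root: hash over all class hashes.
    return Node(sha256_hex("".join(c.node_hash for c in class_nodes)), class_nodes, "root")
```

A change to a single element propagates only along its own path (element node, group node, class node, root), which is what allows the later comparison to skip unchanged subtrees.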

[0190] Optionally, the multi-way tree is a value-based multi-way tree, and correspondingly, the root node is characterized based on the root hash value and the number of leaf nodes under the root node;

[0191] The second-level class nodes of the multi-branch tree are characterized by the corresponding class hash value and the number of leaf nodes under the second-level class node.
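A node characterized by both its hash value and the number of leaf nodes beneath it could be represented as in the following Python sketch. The `CountedNode` type and the helper functions are hypothetical, SHA-256 is an assumed hash function, and the class node is computed directly from element hashes for brevity.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class CountedNode:
    node_hash: str   # root hash value or class hash value
    leaf_count: int  # number of leaf nodes under this node


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_class_node(element_hashes: list[str]) -> CountedNode:
    """Class node: hash over its element hashes, plus how many leaves it covers."""
    return CountedNode(sha256_hex("".join(element_hashes)), len(element_hashes))


def make_root(class_nodes: list[CountedNode]) -> CountedNode:
    """Root node: hash over the class hashes; leaf count is the sum over classes."""
    root_hash = sha256_hex("".join(c.node_hash for c in class_nodes))
    return CountedNode(root_hash, sum(c.leaf_count for c in class_nodes))
```

Keeping the leaf count on each node means the total number of leaf nodes is available from the root without traversing the tree.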

[0192] Optionally, the device 500 further includes:

[0193] The determination module is used to determine the target file ID carried in the query request in response to receiving a DICOM image file query request sent by the data query party;

[0194] The query module is used to query the target storage address and the first target multi-branch tree corresponding to the target file ID from the blockchain;

[0195] The acquisition module is used to acquire the target DICOM image file from the off-chain storage space according to the target storage address;

[0196] The calculation module is used to calculate the second target multi-branch tree corresponding to the target DICOM image file;

[0197] The comparison module is used to compare the first target multi-way tree and the second target multi-way tree to obtain the similarity.

[0198] The feedback module is used to return the target DICOM image file to the data query party when the target DICOM image file is determined to be valid based on the magnitude of the similarity.
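The query-side flow of the modules above can be sketched in Python as follows. This is a simplified illustration under assumptions: the stored "tree" is reduced to an ordered list of leaf hashes, the similarity is the plain fraction of matching leaves rather than the weighted measure described for the comparison module, the validity threshold is a hypothetical parameter, SHA-256 is an assumed hash function, and the storage spaces are in-memory dictionaries.

```python
import hashlib
from typing import Optional


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def leaf_hashes(dicom_file: dict, target_tags: list[str]) -> list[str]:
    # A missing target tag is hashed as an empty value so both trees keep
    # the same structure (an assumption of this sketch).
    return [sha256_hex(f"{tag}={dicom_file.get(tag, '')}") for tag in target_tags]


def similarity(first: list[str], second: list[str]) -> float:
    """Unweighted stand-in: fraction of positions whose leaf hashes match."""
    if not first:
        return 1.0
    same = sum(1 for a, b in zip(first, second) if a == b)
    return same / len(first)


def handle_query(file_id: str, on_chain: dict, off_chain: dict,
                 target_tags: list[str], threshold: float = 1.0
                 ) -> tuple[Optional[dict], float]:
    """Look up the record, recompute the tree, compare, and return the file if valid."""
    record = on_chain[file_id]                            # query module
    target_file = off_chain[record["address"]]            # acquisition module
    second_tree = leaf_hashes(target_file, target_tags)   # calculation module
    score = similarity(record["tree"], second_tree)       # comparison module
    return (target_file if score >= threshold else None), score  # feedback module
```

In this sketch a file whose non-target elements were masked still scores 1.0 and is returned, while a file whose target elements were altered scores below the threshold and is withheld.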

[0199] Optionally, the first target multi-way tree and the second target multi-way tree have the same structure, and the tag identifier of the target data element corresponding to the i-th fourth-level element node in the first target multi-way tree is the same as the tag identifier of the target data element corresponding to the i-th fourth-level element node in the second target multi-way tree. The comparison module includes:

[0200] The first determining submodule is used to determine the similarity as 100% when the root node of the first target multi-way tree is the same as the root node of the second target multi-way tree.

[0201] The second determining submodule is used to, when the root node of the first target multi-way tree is different from the root node of the second target multi-way tree, determine, for each tag category, the first quantity of fourth-level element nodes that are at the same position in the first target multi-way tree and the second target multi-way tree but have different node hash values, under the second-level class node corresponding to the tag category.

[0202] The third calculation submodule is used to calculate the similarity based on the first quantity under each of the tag categories, the weight of each of the tag categories, and the total number of leaf nodes of the first target multi-way tree.
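This passage names the inputs of the similarity calculation (the first quantity per tag category, the weight per tag category, and the total number of leaf nodes) but does not itself state a formula. One plausible form, shown here purely as an assumption, subtracts the weighted share of differing leaves from 1. The function name, the default weight of 1.0, and the clamping to zero are all hypothetical choices.

```python
def weighted_similarity(first_quantities: dict[str, int],
                        weights: dict[str, float],
                        total_leaves: int) -> float:
    """Assumed formula: 1 - sum(weight_c * first_quantity_c) / total_leaves."""
    if total_leaves <= 0:
        raise ValueError("total_leaves must be positive")
    penalty = sum(
        weights.get(category, 1.0) * count  # assumed default weight of 1.0
        for category, count in first_quantities.items()
    )
    # Clamp at 0 so that weights above 1 cannot yield a negative similarity.
    return max(0.0, 1.0 - penalty / total_leaves)
```

Under this assumed formula, one differing leaf out of four in a category of weight 1.0 gives a similarity of 0.75, and two differing leaves in a category of weight 0.5 give the same value, so lower-weighted categories tolerate more change.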

[0203] Optionally, the second determining submodule includes:

[0204] The comparison submodule is used to compare whether the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree are the same for each tag category;

[0205] The third determining submodule is used to determine that the first quantity under the tag category is 0 when the second-level class nodes corresponding to the tag category are the same in the first target multi-way tree and the second target multi-way tree;

[0206] The fourth determining submodule is used to, when the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree are different, determine the target third-level group nodes, namely the third-level group nodes that are at the same position under that second-level class node in the two trees but have different node hash values.

[0207] The fifth determining submodule is used to determine, under each target third-level group node, the second quantity of fourth-level element nodes that are at the same position but have different node hash values; and,

[0208] The fourth calculation submodule is used to take the sum of the second quantities corresponding to the target third-level group nodes as the first quantity.
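The top-down determination of the first quantity for one tag category can be sketched in Python as follows. The sketch assumes that both trees have the same structure, represents a class node as a dictionary holding its hash and its group nodes, and uses SHA-256 as an assumed hash function. The helper `make_class` and the function `first_quantity` are hypothetical names.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_class(leaf_values: list[str], n: int) -> dict:
    """Build one class subtree: class node -> group nodes -> element hashes."""
    groups = []
    for start in range(0, len(leaf_values), n):
        leaves = [sha256_hex(value) for value in leaf_values[start:start + n]]
        groups.append({"hash": sha256_hex("".join(leaves)), "leaves": leaves})
    return {"hash": sha256_hex("".join(g["hash"] for g in groups)), "groups": groups}


def first_quantity(class_a: dict, class_b: dict) -> int:
    """Count differing element nodes, descending only into subtrees whose hashes differ."""
    if class_a["hash"] == class_b["hash"]:
        return 0  # identical class nodes: nothing below can differ
    total = 0
    for group_a, group_b in zip(class_a["groups"], class_b["groups"]):
        if group_a["hash"] == group_b["hash"]:
            continue  # identical group node: skip its element nodes
        # Second quantity: differing element nodes under this target group node.
        total += sum(
            1 for leaf_a, leaf_b in zip(group_a["leaves"], group_b["leaves"])
            if leaf_a != leaf_b
        )
    return total
```

Because unchanged class and group nodes are skipped by a single hash comparison, the element-level comparison is limited to the groups that actually differ.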

[0209] Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operation has been described in detail in the embodiments related to the method, and will not be elaborated upon here.

[0210] This disclosure also provides a computer-readable storage medium having computer program instructions stored thereon, which, when executed by a processor, implement the steps of the blockchain-based data storage and verification method provided in this disclosure.

[0211] Figure 6 is a block diagram illustrating another blockchain-based data storage and verification device 800 according to an exemplary embodiment. For example, device 800 may be a mobile phone, computer, digital broadcasting terminal, messaging device, game console, tablet device, medical device, fitness equipment, personal digital assistant, etc.

[0212] Referring to Figure 6, the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input / output interface 812, a sensor component 814, and a communication component 816.

[0213] Processing component 802 typically controls the overall operation of device 800, such as operations associated with display, telephone calls, data communication, camera operation, and recording. Processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Furthermore, processing component 802 may include one or more modules to facilitate interaction between processing component 802 and other components. For example, processing component 802 may include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802.

[0214] Memory 804 is configured to store various types of data to support the operation of device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, etc. Memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.

[0215] Power supply component 806 provides power to various components of device 800. Power supply component 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power to device 800.

[0216] Multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touchscreen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundaries of the touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In some embodiments, multimedia component 808 includes a front-facing camera and / or a rear-facing camera. When the device 800 is in an operating mode, such as a shooting mode or a video mode, the front-facing camera and / or the rear-facing camera may receive external multimedia data. Each front-facing camera and rear-facing camera may be a fixed optical lens system or have focal length and optical zoom capabilities.

[0217] Audio component 810 is configured to output and / or input audio signals. For example, audio component 810 includes a microphone (MIC) configured to receive external audio signals when device 800 is in an operating mode, such as call mode, recording mode, and voice recognition mode. The received audio signals may be further stored in memory 804 or transmitted via communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.

[0218] Input / output interface 812 provides an interface between processing component 802 and peripheral interface modules, such as keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to, home buttons, volume buttons, power buttons, and lock buttons.

[0219] Sensor component 814 includes one or more sensors for providing status assessments of various aspects of device 800. For example, sensor component 814 may detect the on / off state of device 800, the relative positioning of components such as the display and keypad of device 800, changes in the position of device 800 or a component of device 800, the presence or absence of user contact with device 800, the orientation or acceleration / deceleration of device 800, and temperature changes of device 800. Sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. Sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, sensor component 814 may also include an accelerometer, a gyroscope, a magnetometer, a pressure sensor, or a temperature sensor.

[0220] Communication component 816 is configured to facilitate wired or wireless communication between device 800 and other devices. Device 800 can access wireless networks based on communication standards, such as WiFi, 2G, or 3G, or combinations thereof. In one exemplary embodiment, communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, communication component 816 also includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

[0221] In an exemplary embodiment, the device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to perform the above-described blockchain-based data storage and verification method.

[0222] In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as a memory 804 including instructions, which can be executed by the processor 820 of the device 800 to complete the aforementioned blockchain-based data storage and verification method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.

[0223] In another exemplary embodiment, a computer program product is also provided, the computer program product comprising a computer program executable by a programmable device, the computer program having a code portion for performing the above-described blockchain-based data storage and verification method when executed by the programmable device.

[0224] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of this disclosure. This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.

[0225] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.

Claims

1. A blockchain-based data storage and verification method, characterized in that the method comprises: in response to receiving a DICOM image file sent by a data provider, filtering target data elements from the DICOM image file, wherein the target data elements are used to interpret the image corresponding to the DICOM image file, and the number of target data elements is less than the total number of data elements in the DICOM image file; constructing a multi-way tree based on the target data elements; storing the DICOM image file in an off-chain storage space; and storing the file ID of the DICOM image file, the multi-way tree, and the storage address of the DICOM image file in the off-chain storage space in an on-chain storage space, wherein the multi-way tree stored in the on-chain storage space is used to determine whether the DICOM image file stored in the off-chain storage space is valid; the method further comprises: in response to receiving a DICOM image file query request sent by a data querying party, determining the target file ID carried in the query request; querying the target storage address and a first target multi-way tree corresponding to the target file ID from the blockchain; obtaining the target DICOM image file from the off-chain storage space according to the target storage address; calculating a second target multi-way tree corresponding to the target DICOM image file; comparing the first target multi-way tree and the second target multi-way tree to obtain a similarity; and if the target DICOM image file is determined to be valid based on the magnitude of the similarity, feeding back the target DICOM image file to the data querying party.

2. The method according to claim 1, characterized in that there are multiple target data elements, and each data element includes a tag identifier; the step of constructing a multi-way tree based on the target data elements comprises: dividing the multiple target data elements into multiple class sets according to the tag category to which the target tag identifier of each target data element belongs, wherein the class sets correspond one-to-one with the tag categories; taking the class hash value corresponding to each of the class sets as the node hash value of a second-level class node of the multi-way tree, wherein the second-level class nodes of the multi-way tree correspond one-to-one with the class sets; and performing a hash calculation on all of the class hash values to obtain the root hash value of the root node of the multi-way tree.

3. The method according to claim 2, characterized in that the step of constructing a multi-way tree based on the target data elements further comprises: for each of the class sets, grouping the target data elements in the class set according to a preset number N to obtain M target data element groups, wherein M-1 of the target data element groups each contain N target data elements, and the remaining target data element group contains at most N target data elements; and taking the group hash value corresponding to each of the target data element groups as the node hash value of a third-level group node under the second-level class node corresponding to the class set, wherein the third-level group nodes under that second-level class node correspond one-to-one with the target data element groups; wherein the class hash value corresponding to the class set is obtained by hashing the node hash values of the M third-level group nodes under the second-level class node corresponding to the class set.

4. The method according to claim 3, characterized in that the step of constructing a multi-way tree based on the target data elements further comprises: for each target data element group, performing a hash calculation on each target data element in the target data element group to obtain an element hash value; and taking each element hash value as the node hash value of a fourth-level element node under the third-level group node corresponding to the target data element group, wherein the fourth-level element nodes under that third-level group node correspond one-to-one with the target data elements in the target data element group; wherein the group hash value corresponding to the target data element group is obtained by hashing the node hash values of all fourth-level element nodes under the third-level group node corresponding to the target data element group.

5. The method according to claim 2, characterized in that the multi-way tree is a value-based multi-way tree, and correspondingly, the root node is characterized based on the root hash value and the number of leaf nodes under the root node; and the second-level class nodes of the multi-way tree are characterized by the corresponding class hash value and the number of leaf nodes under the second-level class node.

6. The method according to claim 1, characterized in that the first target multi-way tree and the second target multi-way tree have the same structure, and the tag identifier of the target data element corresponding to the i-th fourth-level element node in the first target multi-way tree is the same as the tag identifier of the target data element corresponding to the i-th fourth-level element node in the second target multi-way tree; the step of comparing the first target multi-way tree and the second target multi-way tree to obtain a similarity comprises: if the root node of the first target multi-way tree is the same as the root node of the second target multi-way tree, determining the similarity to be 100%; if the root node of the first target multi-way tree is different from the root node of the second target multi-way tree, determining, for each tag category, the first quantity of fourth-level element nodes that are at the same position in the first target multi-way tree and the second target multi-way tree but have different node hash values, under the second-level class node corresponding to the tag category; and calculating the similarity based on the first quantity under each of the tag categories, the weight of each of the tag categories, and the total number of leaf nodes in the first target multi-way tree.

7. The method according to claim 6, characterized in that determining, for each tag category, the first quantity of fourth-level element nodes that are at the same position in the first target multi-way tree and the second target multi-way tree but have different node hash values under the second-level class node comprises: for each tag category, comparing whether the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree are the same; if the second-level class nodes corresponding to the tag category are the same in the first target multi-way tree and the second target multi-way tree, determining the first quantity under the tag category to be 0; if the second-level class nodes corresponding to the tag category in the first target multi-way tree and the second target multi-way tree are different, determining the target third-level group nodes, namely the third-level group nodes that are at the same position under that second-level class node in the two trees but have different node hash values; determining, under each of the target third-level group nodes, the second quantity of fourth-level element nodes that are at the same position but have different node hash values; and taking the sum of the second quantities corresponding to the target third-level group nodes as the first quantity.

8. A blockchain-based data storage and verification device, characterized in that the device comprises: a receiving module configured to, in response to receiving a DICOM image file sent by a data provider, filter target data elements from the DICOM image file, wherein the target data elements are used to interpret the image corresponding to the DICOM image file, and the number of target data elements is less than the total number of data elements in the DICOM image file; a construction module used to construct a multi-way tree based on the target data elements; a first storage module used to store the DICOM image file in an off-chain storage space; and a second storage module used to store the file ID of the DICOM image file, the multi-way tree, and the storage address of the DICOM image file in the off-chain storage space in an on-chain storage space, wherein the multi-way tree stored in the on-chain storage space is used to determine whether the DICOM image file stored in the off-chain storage space is valid; the device further comprises: a determination module used to determine the target file ID carried in a query request in response to receiving a DICOM image file query request sent by a data querying party; a query module used to query the target storage address and a first target multi-way tree corresponding to the target file ID from the blockchain; an acquisition module used to acquire the target DICOM image file from the off-chain storage space according to the target storage address; a calculation module used to calculate a second target multi-way tree corresponding to the target DICOM image file; a comparison module used to compare the first target multi-way tree and the second target multi-way tree to obtain a similarity; and a feedback module used to return the target DICOM image file to the data querying party when the target DICOM image file is determined to be valid based on the magnitude of the similarity.

9. A computer-readable storage medium having computer program instructions stored thereon, characterized in that, when executed by a processor, the program instructions implement the steps of the method according to any one of claims 1-7.

10. An apparatus, characterized in that it comprises: a memory on which a computer program is stored; and a processor for executing the computer program in the memory to implement the steps of the method according to any one of claims 1-7.