[0021] The technical solution of the present invention is further described below with reference to the embodiments and the drawings.
[0022] As shown in Figure 1, the present invention is a method for managing desensitized medical image data based on content uniqueness. It is applied to the management of desensitized image data in a medical system and includes at least the following steps:
[0023] Obtain the source DICOM data and the desensitized DICOM data of the medical image, calculate the SHA256 hash value of each, and store the two hash values in the database as the unique identifiers of the two data items. That is, in this step the SHA256 hash values of the source DICOM data and the desensitized DICOM data are calculated separately and serve as their respective unique identifiers. Because the DICOM tags of the source data and the desensitized data are not exactly the same, the two SHA256 values will differ;
[0024] Delete the tag data from the source DICOM data and from the desensitized DICOM data, and separately calculate the SHA256 hash value of the image data of each. Compare the hash value calculated from the tag-stripped source data with the hash value calculated from the tag-stripped desensitized data; when the two hash values are the same, the value is stored in the database as the accurate identifier that links the source data to the desensitized data. In this step, as shown in Figure 1, the tag data of the source DICOM data is deleted and the SHA256 hash value of the image data alone is calculated, and the same is done for the desensitized data. The SHA256 values calculated for the source data and the desensitized data should be the same, so the value can serve as an accurate identifier that maps the source data to the desensitized data one to one.
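The two kinds of identifier described above can be illustrated with a minimal sketch. It does not parse real DICOM files: `ToyDicom`, its `to_bytes` serialization, and the `desensitize` helper are hypothetical stand-ins introduced only for illustration. A real implementation would presumably read the file with a DICOM library and hash the content of the Pixel Data element (7FE0,0010).

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class ToyDicom:
    """Hypothetical stand-in for a parsed DICOM file: header tags plus pixel bytes."""
    tags: dict
    pixel_data: bytes

    def to_bytes(self) -> bytes:
        # Deterministic toy serialization; a real file would be hashed as stored.
        header = json.dumps(self.tags, sort_keys=True).encode("utf-8")
        return header + b"\x00" + self.pixel_data


def file_sha256(dcm: ToyDicom) -> str:
    """Unique identifier: SHA256 over the whole object, tags included."""
    return hashlib.sha256(dcm.to_bytes()).hexdigest()


def image_sha256(dcm: ToyDicom) -> str:
    """Accurate identifier: SHA256 over the image data only, tags removed."""
    return hashlib.sha256(dcm.pixel_data).hexdigest()


def desensitize(dcm: ToyDicom) -> ToyDicom:
    """Toy desensitization: overwrite patient tags, leave the pixels untouched."""
    cleaned = {**dcm.tags, "PatientName": "ANONYMOUS", "PatientID": ""}
    return ToyDicom(tags=cleaned, pixel_data=dcm.pixel_data)


source = ToyDicom(
    tags={"PatientName": "DOE^JANE", "PatientID": "12345", "Modality": "CT"},
    pixel_data=bytes(range(256)) * 4,
)
masked = desensitize(source)

# Whole-file hashes differ because the tags differ.
ids_differ = file_sha256(source) != file_sha256(masked)
# Image-only hashes agree, which links the two records.
link_matches = image_sha256(source) == image_sha256(masked)
```

In this sketch the whole-file hashes serve as the per-file unique identifiers, while the shared image-only hash plays the role of the accurate identifier.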
[0025] Query the corresponding source data based on the SHA256 values stored in the database. The database records the SHA256 value of the source DICOM data, the SHA256 value of the desensitized DICOM data, and the SHA256 value of the data after the tags are deleted. Since what remains of both the source DICOM and the desensitized DICOM after the tags are deleted is the image data, the SHA256 values obtained are the same and can serve as the correspondence between the source data and the desensitized data. The SHA256 value of the desensitized data is recorded to speed up queries. Alternatively, the SHA256 value of the desensitized data after its DICOM tags are deleted can be calculated directly and used to look up the corresponding source data in reverse. In particular, if the DICOM tags of the desensitized data have been changed, so that its SHA256 value no longer matches the one recorded in the database, calculating the SHA256 value of the tag-stripped DICOM and querying in reverse still finds the corresponding source data. Existing desensitization systems cannot do this. The actual storage location of the source data recorded in the database corresponds to the SHA256 value of the DICOM data. The method described can be implemented with the IPFS system.
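The query step might look like the following sketch, which uses an in-memory SQLite table. The table name `dicom_link`, its column names, and the helpers `register` and `find_source` are assumptions made for illustration; the source does not specify a schema. Extraction of the image data is assumed to have happened elsewhere, so the pixel bytes are passed in directly.

```python
import hashlib
import sqlite3


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE dicom_link (
           image_sha256  TEXT PRIMARY KEY,  -- hash of the tag-stripped image data
           source_sha256 TEXT NOT NULL,     -- hash of the full source DICOM
           masked_sha256 TEXT NOT NULL      -- hash of the full desensitized DICOM
       )"""
)
conn.execute("CREATE INDEX idx_masked ON dicom_link (masked_sha256)")


def register(source_file: bytes, masked_file: bytes, pixels: bytes) -> None:
    """Record the three hashes for one source/desensitized pair."""
    conn.execute(
        "INSERT OR REPLACE INTO dicom_link VALUES (?, ?, ?)",
        (sha256_hex(pixels), sha256_hex(source_file), sha256_hex(masked_file)),
    )


def find_source(masked_file: bytes, pixels: bytes):
    """Return the SHA256 of the source DICOM for a desensitized file, or None."""
    # Fast path: the desensitized file is unchanged since it was registered.
    row = conn.execute(
        "SELECT source_sha256 FROM dicom_link WHERE masked_sha256 = ?",
        (sha256_hex(masked_file),),
    ).fetchone()
    if row is None:
        # Fallback: the tags were edited, so query by the image-only hash.
        row = conn.execute(
            "SELECT source_sha256 FROM dicom_link WHERE image_sha256 = ?",
            (sha256_hex(pixels),),
        ).fetchone()
    return row[0] if row else None


# Toy byte strings standing in for DICOM files (not real DICOM encoding).
pixels = b"\x01\x02\x03" * 100
source_file = b"PatientName=DOE^JANE;" + pixels
masked_file = b"PatientName=ANONYMOUS;" + pixels
edited_masked = b"PatientName=ANONYMOUS;StudyDescription=edited;" + pixels

register(source_file, masked_file, pixels)
found_fast = find_source(masked_file, pixels)
found_fallback = find_source(edited_masked, pixels)
```

Both lookups return the hash of the source file. The second succeeds even though the edited desensitized file was never registered, because the image-only hash is unaffected by tag changes.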
[0026] In addition, the embodiments of the method of the present invention are all implemented on the basis of existing open-source software components. Languages such as Go, JavaScript, Python, C++, and Java can be used, either alone or in a combination of two. This embodiment is implemented with JavaScript and Java. Because this belongs to the prior art, the principle is not repeated here.
[0027] With the method of the present invention, the SHA256 value of the DICOM data is also used in the PACS system to find the actual storage location. Identical DICOM data yield identical SHA256 values, which avoids storing multiple unnecessary copies.
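The deduplication effect can be shown with a small content-addressed store. This is a sketch under simplifying assumptions: `ContentStore` is a hypothetical in-memory class, not a PACS or IPFS interface, and the stored byte strings are placeholders for DICOM files.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the SHA256 of the bytes is the storage key."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so a repeat upload adds nothing.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
first = store.put(b"same DICOM bytes")
second = store.put(b"same DICOM bytes")       # duplicate upload
other = store.put(b"different DICOM bytes")
```

After the three calls the store holds two entries, since the duplicate upload resolves to the key that already exists.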
[0028] The present invention also provides a system for managing desensitized medical image data based on content uniqueness, which is applied to the management of desensitized image data in a medical system and includes at least the following modules:
[0029] A database, used to store data;
[0030] A unique identification acquisition module, used to obtain the source DICOM data and the desensitized DICOM data of the medical image, calculate the SHA256 hash value of each, and store the two hash values in the database as the corresponding unique identifiers;
[0031] An accurate identification acquisition module, used to delete the tag data from the source DICOM data and from the desensitized DICOM data, separately calculate the SHA256 hash value of the image data of each, compare the hash value calculated from the tag-stripped source data with the hash value calculated from the tag-stripped desensitized data, and, when the two hash values are the same, store the value in the database as the accurate identifier that links the source data to the desensitized data;
[0032] A source data query module, used to query the corresponding source data according to the SHA256 values stored in the database.
[0033] For the storage part of the present invention, a traditional PACS or another storage method can be used together with a database that records the unique hash value of the DICOM content and its correspondence to the storage path. This achieves an effect similar to IPFS, although automatic backup and task scheduling will differ from IPFS. For details, refer to the prior art.
[0034] This system corresponds to the above method, so the specific processing is not repeated here.
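One way to picture the combination of ordinary storage with a hash-to-path database is the sketch below. A temporary directory stands in for the PACS archive, and the table name `location` and the helpers `store_file` and `locate` are hypothetical. Backup and scheduling, which the text notes differ from IPFS, are not modelled.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())  # stands in for a PACS archive or file share
index = sqlite3.connect(":memory:")
index.execute("CREATE TABLE location (sha256 TEXT PRIMARY KEY, path TEXT NOT NULL)")


def store_file(data: bytes, relative_name: str) -> str:
    """Write data under the archive root unless its hash is already indexed."""
    digest = hashlib.sha256(data).hexdigest()
    known = index.execute(
        "SELECT 1 FROM location WHERE sha256 = ?", (digest,)
    ).fetchone()
    if known is None:
        path = root / relative_name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        index.execute("INSERT INTO location VALUES (?, ?)", (digest, str(path)))
    return digest


def locate(digest: str):
    """Return the storage path recorded for a content hash, or None."""
    row = index.execute(
        "SELECT path FROM location WHERE sha256 = ?", (digest,)
    ).fetchone()
    return Path(row[0]) if row else None


digest = store_file(b"toy DICOM content", "study1/img001.dcm")
duplicate = store_file(b"toy DICOM content", "study2/copy.dcm")  # not written again
```

The second call returns the same hash and writes no new file, so the index continues to point at the first stored copy.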
[0035] The present invention also provides a system for managing desensitized medical image data based on content uniqueness, including a network, a memory, a processor, and a computer program stored in the memory and runnable on the processor. When the computer program is executed by the processor, the steps of the method are realized. The processor may be a central processing unit (CPU) or another hardware component. The memory may be a hard disk, internal memory, a plug-in hard disk, a smart memory card, a secure digital card, flash memory, or another storage device. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form.
[0036] The present invention also provides a computer-readable storage medium that stores a computer program. When the computer program is executed by a processor, the steps of the method are realized. The program can be implemented in languages such as Go, JavaScript, Python, C++, and Java, either alone or in a combination of two.
[0037] In summary, the present invention uses hash coding based on the content of medical image data to establish a one-to-one mapping between the original data and the desensitized data, so that the original data can be found from the desensitized data. It also uses the uniqueness of the hash value of the data content, together with the IPFS storage system, to reduce the storage of duplicate data and lower storage costs. In actual operation, image data is often large, and although the tag content is easily changed, the image part is generally not modified. Saving multiple duplicate copies is therefore of little value, yet in clinical and scientific research activities it is difficult to avoid. The management method based on content uniqueness can solve the problem of duplicate data storage by comparing the unique hash values.
[0038] The order of the above embodiments is only for ease of description and does not indicate the relative merits of the embodiments.
[0039] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.