A method of hiding information in a data file
By preprocessing the data files and replacing the parameter sequence, the problem of low quality in carrier data recovery was solved, achieving large-capacity information hiding and high-quality carrier recovery. It is applicable to a variety of data types, and the PSNR reaches over 51dB.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- XIAN INSTITUTE OF SPACE RADIO TECH
- Filing Date
- 2022-09-26
- Publication Date
- 2026-06-23
AI Technical Summary
Existing technologies for hiding information in data files have three shortcomings: the quality of carrier data recovery is low and difficult to improve; large-capacity information hiding algorithms are not suitable for non-natural image carriers; and the quality of the carrier cannot be controlled.
By preprocessing the data file, a parameter sequence is generated using pseudo-random sequences or compression and decompression methods. This sequence replaces the binary sequence in the data file, and the parameters are appended to the filename. The secret information is then hidden using XOR operations. The receiving end recovers the carrier data based on the parameters.
It enables the hiding of secret information in large amounts of data without increasing the information transmission rate, and can restore the original data with high quality. It is suitable for image and non-image data carriers, with controllable carrier quality and a PSNR of over 51dB.
Figure CN115688128B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to a data communication method, and more particularly to a method for hiding information in a data file, belonging to the field of communication.
Background Technology
[0002] Information hiding, also known as data hiding, is an important branch of information security. It utilizes the redundancy of human vision to embed secret information into a carrier, thereby achieving the purpose of securely transmitting secret information.
[0003] Current conventional information hiding methods, such as LSB, are lossy information hiding methods. Even with noise or lossy compression, the hidden information can be recovered normally, but the carrier data cannot be accurately recovered. Loss therefore always exists, and it is difficult to improve the quality of carrier data recovery. For images, if the LSB algorithm hides data amounting to 1/8 of the carrier, the carrier recovery quality is not high, and the image PSNR can only reach about 51dB.
Summary of the Invention
[0004] The technical problem solved by this invention is to overcome the shortcomings of the prior art and provide a method for hiding information in data files. This method can hide large amounts of information and can also hide information through parameter design, which exceeds the capabilities of traditional information hiding methods. It also has a certain degree of security and is suitable for various data security transmission systems.
[0005] The technical solution of this invention is:
[0006] A method for hiding information in a data file includes:
[0007] Step 1: Preprocess and hide information in data file X to obtain a cryptic data stream Z;
[0008] Step 2: Dehide the cryptic data stream Z, extract the secret information, and restore the carrier.
[0009] Preprocessing and information hiding of data file X yields a cryptic data stream Z, including:
[0010] Let the data file be X = X1, X2, ..., Xn, where each data value is between 0 and 255 and is represented by 8 bits. Extract the Kth significant bit of each data value to form a binary sequence B = b1, b2, ..., bn;
[0011] Using a set of parameters S = s1, s2, ..., sq of length q bits, generate a binary sequence of length n, denoted as Y = y1, y2, ..., yn, where parameter q is less than n; replace the binary sequence B in the data stream with Y;
[0012] Convert parameter S to hexadecimal notation HH and append it to the filename of data file X;
[0013] Hide g bits of secret information in the binary sequence Y to form a binary sequence U = a, v1, v2, ..., vg, where g = n-1; a is a one-bit flag: a equal to 0 indicates that the carrier data can be recovered without loss, and a equal to 1 indicates that the recovered carrier data has a certain degree of distortion within an allowable range;
[0014] Perform an XOR operation between the binary sequence U and the K'th valid bit of the data file X (or some pre-defined binary sequence) to obtain the binary sequence D = d1, d2, ..., dn; K' is not equal to K.
[0015] Replace the Kth valid bit in the data file X with a binary sequence D to obtain the hidden data stream Z containing the hidden information.
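The hiding steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the helper names (`bit_plane`, `set_bit_plane`, `embed`) are hypothetical, bit 1 is assumed to be the most significant bit and bit 8 the least significant, and the generation of Y from S is left out because the Kth bit plane is ultimately overwritten by D.

```python
import numpy as np

def bit_plane(data: np.ndarray, k: int) -> np.ndarray:
    """Return the k-th bit of every byte (assumed: k=1 is the MSB, k=8 the LSB)."""
    return (data >> (8 - k)) & 1

def set_bit_plane(data: np.ndarray, k: int, bits: np.ndarray) -> np.ndarray:
    """Return a copy of data whose k-th bit plane is replaced by bits."""
    shift = 8 - k
    cleared = data & np.uint8(~(1 << shift) & 0xFF)
    return cleared | (bits.astype(np.uint8) << shift)

def embed(x: np.ndarray, secret: np.ndarray, a: int,
          k: int = 8, k_prime: int = 3) -> np.ndarray:
    """Build U = a, v1..vg, XOR it with bit plane K' of X, and write it to plane K."""
    n = x.size
    assert k != k_prime and secret.size == n - 1      # g = n - 1
    u = np.concatenate(([a], secret)).astype(np.uint8)
    d = u ^ bit_plane(x, k_prime)                     # D = U xor (plane K' of X)
    return set_bit_plane(x, k, d)                     # cryptic data stream Z
```

For a carrier `x` of n bytes and a secret of n-1 bits, `embed(x, secret, a=0)` returns a stream that differs from `x` only in bit plane K.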
[0016] The method of generating a binary sequence of length n using a set of parameters S=s1, s2, ... sq of length q bits is specifically as follows: a binary sequence Y=y1, y2, ... yn is generated based on a pseudo-random sequence method; if Y is equal to B, the carrier can be recovered without loss, corresponding to a equal to 0; if Y is not equal to B, the carrier has a certain degree of distortion within the allowable range, corresponding to a equal to 1.
[0017] The method of generating a binary sequence of length n using a set of parameters S=s1, s2, ... sq of length q bits is specifically as follows: a binary sequence Y=y1, y2, ... yn is generated based on compression and decompression methods; if Y is equal to B, the carrier can be recovered without loss, corresponding to a equal to 0; if Y is not equal to B, the carrier has a certain degree of distortion within the allowable range, corresponding to a equal to 1.
[0018] The parameter S in the pseudo-random sequence-based method is:
[0019] The q-bit parameters that generate the chaotic sequence are: parameter μ, initial value X0, and a binary sequence S=s1, s2, ... sq corresponding to length n.
[0020] The parameter S in the pseudo-random sequence-based method is:
[0021] The q-bit parameters for generating the longest linear shift register sequence, i.e., the m-sequence, are: the generation parameters of the m-sequence, i.e., the initial state, structure, and the binary sequence S=s1, s2, ..., sq corresponding to the length n.
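As an illustration of the m-sequence option, the following minimal sketch generates bits from a linear feedback shift register. The function name `lfsr_bits`, the Fibonacci-style register layout, and the 4-stage example taps are assumptions chosen for illustration; the patent does not specify a register structure or how the structure, initial state, and length are packed into S.

```python
def lfsr_bits(taps, state, n):
    """Generate n output bits from a Fibonacci LFSR.

    taps  : 1-based stage indices whose XOR forms the feedback bit
    state : initial register contents as a list of 0/1 (must not be all zero)
    """
    reg = list(state)
    out = []
    for _ in range(n):
        out.append(reg[-1])            # output is taken from the last stage
        fb = 0
        for t in taps:
            fb ^= reg[t - 1]
        reg = [fb] + reg[:-1]          # shift right, feed back into stage 1
    return out

# 4-stage register with feedback from stages 3 and 4: period 2**4 - 1 = 15
y = lfsr_bits((3, 4), [1, 0, 0, 0], 30)
```

Because the whole sequence is determined by the taps, the initial state, and the length, only those few values would need to be carried in S.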
[0022] The parameter S in the compression and decompression method is:
[0023] Lossless compression of the binary sequence B yields a compressed data file. The binary sequence corresponding to this file is used as the q-bit parameter sequence S=s1, s2, ..., sq, with a equal to 0.
[0024] The parameter S in the compression and decompression method is:
[0025] Lossy compression is performed on the binary sequence B to obtain the compressed data file. The binary sequence corresponding to this file is used as the q-bit parameter sequence S=s1, s2, ... sq, with a equal to 1.
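The lossless variant (a equal to 0) can be sketched as below. The use of `zlib` is an assumption made purely for illustration, since the patent does not name a compression algorithm, and the function names are hypothetical. The requirement q < n is only met when B is compressible, which is why the usage example uses a highly regular bit plane; the lossy variant (a equal to 1) is not shown.

```python
import zlib
import numpy as np

def params_from_bits(b: np.ndarray) -> bytes:
    """Losslessly compress bit sequence B; the compressed bytes play the role of S."""
    return zlib.compress(np.packbits(b).tobytes(), 9)

def bits_from_params(s: bytes, n: int) -> np.ndarray:
    """Decompress S and unpack it back into n bits, giving Y (equal to B here)."""
    raw = np.frombuffer(zlib.decompress(s), dtype=np.uint8)
    return np.unpackbits(raw)[:n]

# Usage: a sparse, regular bit plane of length n = 2048
b = np.zeros(2048, dtype=np.uint8)
b[::64] = 1
s = params_from_bits(b)
y = bits_from_params(s, b.size)
```

Here `y` equals `b` exactly, and `s` is much shorter than the 2048 bits it stands for.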
[0026] The process of de-hiding the cryptic data stream Z, extracting the secret information, and restoring the carrier includes:
[0027] Extract the Kth significant bit of the cryptic data stream Z, and perform an XOR operation with the pre-agreed K'th significant bit of the data file X, or with a certain pre-agreed binary sequence, to obtain the binary sequence U.
[0028] Extract g bits of secret information from the binary sequence U;
[0029] Based on the extended symbol HH in the filename, recover the binary parameter sequence S of length q;
[0030] Using parameter S, a binary sequence of length n is generated to obtain the Kth effective bit information sequence B=b1, b2, ..., bn of data stream X, and the first bit of the binary sequence U is output as a carrier quality evaluation flag; if the first bit is 0, the data stream X can be recovered without loss; if the first bit is 1, it indicates that the recovered data stream X has a certain degree of distortion within the allowable range.
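The de-hiding steps above can be sketched as follows, again with hypothetical helper names and assuming that bit 1 is the most significant bit. One point worth making explicit: the receiver only holds Z, but because K' is not equal to K and only bit plane K was modified, bit plane K' of Z is identical to bit plane K' of X, so the XOR can be computed from Z alone.

```python
import numpy as np

def bit_plane(data: np.ndarray, k: int) -> np.ndarray:
    """Return the k-th bit of every byte (assumed: k=1 is the MSB, k=8 the LSB)."""
    return (data >> (8 - k)) & 1

def extract(z: np.ndarray, k: int = 8, k_prime: int = 3):
    """Recover U from Z; return the quality flag a and the g secret bits."""
    u = bit_plane(z, k) ^ bit_plane(z, k_prime)
    return int(u[0]), u[1:]

def restore_carrier(z: np.ndarray, y: np.ndarray, k: int = 8) -> np.ndarray:
    """Write the regenerated sequence Y (derived from S) back into bit plane K."""
    shift = 8 - k
    cleared = z & np.uint8(~(1 << shift) & 0xFF)
    return cleared | (y.astype(np.uint8) << shift)
```

When the flag returned by `extract` is 0, the sequence Y regenerated from S equals B, and `restore_carrier` returns the original data exactly.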
[0031] The advantages of this invention compared to the prior art are:
[0032] This invention uses information hiding technology to embed secret information into data files for transmission. The receiving end can extract the generation parameters from the data file or the data file name, recover the carrier data from those parameters, and thereby both extract the secret information and recover the original data with high quality. Without increasing the information transmission rate, it can hide an amount of data equal to one-eighth of the data file (embedding rate 1 bpp) or more, and the hiding capacity is fixed, ensuring the confidentiality of transmission.
[0033] Compared with the prior art, the present invention has the following substantial differences and advancements:
[0034] (1) Current large-capacity hiding algorithms, such as the least significant bit hiding method, can hide one-eighth of the large data volume in the data file, but the data as the carrier can only be recovered in a lossy manner, and the PSNR can only reach 50dB-51dB, which cannot be improved further. However, the original data can always be recovered with high quality under the same one-eighth hiding capacity of the present invention. For the image, the carrier quality can be guaranteed to be above 51dB.
[0035] (2) The current spatial hiding algorithm for bit substitution cannot control the carrier quality. The carrier data recovery quality of the present invention can be known in advance (according to the flag bit). When a=0, the carrier can be recovered without distortion.
[0036] (3) Current large-capacity hiding algorithms are mainly designed for natural image carriers and are not suitable for random image carriers; however, the method of this invention is suitable not only for image data carriers but also for non-image data carriers.
[0037] (4) Through the preprocessing, the parameters can be appended to the file name, and the carrier can be optimized in advance according to whether carrier distortion is acceptable. The hiding capacity can be flexibly adjusted through parameter settings, and the actual capacity can exceed 1/8 or even 4/8.
Attached Figure Description
[0038] Figure 1 is a flowchart of the method of the present invention.
Detailed Implementation
[0039] (I) To verify the performance of the algorithm proposed in this invention, the simulation experiment first used a dat data file. The data stream is X=X1, X2, ..., Xn, where each data value is between 0 and 255 and is represented by 8 bits. The Kth effective bit sequence is B=b1, b2, ..., bn, with n=2048.
[0040] Random sequence data with equal probability distribution of 0 and 1 is embedded as secret information in a data file (carrier), and then the secret information is extracted and the carrier is restored.
[0041] The specific method is as follows:
[0042] Step 1: Preprocess and hide information in data file X to obtain a cryptic data stream Z. The process is as follows:
[0043] (1) Extract the Kth effective bit of data file X to form a binary sequence B=b1, b2, ... bn, where K=8;
[0044] (2) Using a set of parameters S = s1, s2, ..., sq of length q bits, generate a binary sequence of length n, denoted as Y = y1, y2, ..., yn, where the parameter q is less than n. Replace the binary sequence B in the data stream with Y;
[0045] The method for generating a binary (0 or 1) sequence of length n using parameters S=s1, s2, ..., sq is based on compression and decompression to generate a 0-1 binary sequence Y=y1, y2, ..., yn, where q=32.
[0046] In the compression and decompression method, the parameters are obtained by compressing the binary sequence B into a compressed data file. The binary sequence corresponding to this file is used as the q-bit parameter sequence S=s1, s2, ..., sq, with a equal to 0.
[0047] (3) Convert the q bit parameters into hexadecimal symbols 0~F in groups of 4 bits and append them to the data X file name, where the length of the hexadecimal data is 8;
[0048] (4) Hide g bits of secret information in the binary sequence Y to form a binary sequence U=a, v1, v2, ..., vg, where a=0 and g=2047;
[0049] There are various methods for hiding secret information, such as spatial domain hiding and transform domain hiding. Here, we use direct spatial domain substitution to hide the secret information.
[0050] (5) Perform an XOR operation on the binary sequence U and the data stream X with the K' (K' = 1~8, K' ≠ K) valid bits agreed upon in advance or a certain agreed binary sequence to obtain the binary sequence D = d1, d2, ..., dn, where K' = 3. In a special case, when the agreed binary sequence is all 0, it is equivalent to no XOR operation, which is also a case of this algorithm.
[0051] (6) Replace the Kth valid bit in the data file X with the binary sequence D to obtain the hidden data stream Z.
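The hexadecimal conversion in step (3) above, with q=32 giving 8 hexadecimal symbols, can be sketched as follows. Where exactly the symbols are placed in the file name is not specified in the text; placing them before the extension with an underscore separator is an assumption, and all function names are hypothetical.

```python
import os

def bits_to_hex(bits):
    """Group bits four at a time into hexadecimal symbols 0-F."""
    assert len(bits) % 4 == 0
    groups = ("".join(str(int(b)) for b in bits[i:i + 4])
              for i in range(0, len(bits), 4))
    return "".join(format(int(g, 2), "X") for g in groups)

def hex_to_bits(hh):
    """Expand each hexadecimal symbol back into four bits."""
    return [int(c) for h in hh for c in format(int(h, 16), "04b")]

def tag_filename(name, bits, sep="_"):
    """Append HH to the file name (assumed placement: before the extension)."""
    stem, ext = os.path.splitext(name)
    return f"{stem}{sep}{bits_to_hex(bits)}{ext}"

def read_tag(name, q):
    """Recover the q-bit parameter sequence S from the tagged file name."""
    stem, _ = os.path.splitext(name)
    return hex_to_bits(stem[-(q // 4):])

s = [1, 0, 1, 0] * 8                       # q = 32 bits
tagged = tag_filename("carrier.dat", s)    # "carrier_AAAAAAAA.dat"
```

Calling `read_tag(tagged, 32)` returns the original 32-bit list, which is what the receiver needs in step (3) of de-hiding.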
[0052] Step 2: Dehide the cryptic data stream Z, extract the secret information, and restore the carrier. The process is as follows:
[0053] (1) Extract the Kth valid bit of data stream Z and perform an XOR operation with the K' (K'=1~8, K' is not equal to K) valid bit of data stream X or a certain pre-agreed binary sequence to obtain the binary sequence U, where K'=3;
[0054] (2) Extract g bits of secret information from the binary sequence U, where g = 2047;
[0055] (3) Based on the extended symbol HH in the file name, recover the binary parameter sequence S of length q, where q=32;
[0056] (4) Using parameter S, generate a binary sequence of length n to obtain the Kth effective bit information sequence B=b1, b2, ..., bn of data stream X, and output the first bit of the binary sequence U as the carrier quality evaluation flag. If the first bit is 0, then data stream X can be recovered without loss.
[0057] (II) To verify the performance of the algorithm proposed in this invention, an information hiding experiment was conducted using an 8-bit grayscale image with a size of 512×512 from the standard test image library.
[0058] Random sequence data with equal probability distribution of 0 and 1 is embedded as secret information in a data file (carrier), and then the secret information is extracted and the carrier is restored.
[0059] The specific method is as follows:
[0060] Step 1: Preprocess and hide information in data file X to obtain a cryptic data stream Z. The process is as follows:
[0061] (1) Extract the Kth effective bit of data file X to form a binary sequence B=b1, b2, ... bn, where K=8 and n=262144;
[0062] (2) Using a set of parameters S=s1, s2, ..., sq of length q bits, generate a binary sequence of length n, denoted as Y=y1, y2, ..., yn, where the parameter q is less than n. Replace the binary sequence B in the data stream with Y, where q=128;
[0063] The method for generating a binary (0 or 1) sequence of length n using parameters S = s1, s2, ..., sq is a pseudo-random sequence-based method that generates a 0-1 binary sequence Y = y1, y2, ..., yn, where parameter q is less than n. The parameters in the pseudo-random sequence-based method are the chaotic sequence parameter μ, the initial value X0, and the binary sequence S = s1, s2, ..., sq corresponding to length n; where a equals 1.
[0064] Typical chaotic sequences are generated based on the Logistic chaotic mapping, whose model is as follows: $x_{t+1} = \mu x_t (1 - x_t)$, $t \in \{0, 1, 2, \ldots, n-1\}$; where $\mu$ is called the branch parameter, $3.5699456 < \mu \le 4$, and $x_0$ is the initial value (X0 above), $0 < x_0 < 1$. Given the value of $\mu$ and the initial value $x_0$, the sequence $x_1, x_2, \ldots, x_n$ can be calculated. Making a threshold decision on each $x_i$ yields a 0-1 binary sequence $Y_i$ ($i = 1, \ldots, n$): when $0.5 \le x_i < 1$, $Y_i = 1$; when $0 < x_i < 0.5$, $Y_i = 0$;
[0065] (3) Convert the q-bit parameters into hexadecimal symbols 0~F in groups of 4 bits and append them to the data X file name, where the length of the hexadecimal data is 32;
[0066] (4) Hide g bits of secret information in the binary sequence Y to form a binary sequence U=a, v1, v2, ..., vg, where a=1 and g=262143;
[0067] (5) Perform an XOR operation on the binary sequence U and the data stream X with the K' (K' = 1~8, K' ≠ K) valid bits agreed upon in advance or a certain agreed binary sequence to obtain the binary sequence D = d1, d2, ..., dn, where K' = 3. In a special case, when the agreed binary sequence is all 0, it is equivalent to no XOR operation, which is also a case of this algorithm.
[0068] (6) Replace the Kth valid bit in the data file X with the binary sequence D to obtain the hidden data stream Z.
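The Logistic-map generator used in step (2) above can be sketched as follows. The function name and the example values μ = 3.99 and X0 = 0.3 are illustrative assumptions, as is the 0.5 decision threshold for the Y = 1 branch, which is the usual choice for this map.

```python
def logistic_bits(mu: float, x0: float, n: int):
    """Iterate x <- mu * x * (1 - x) and threshold each value at 0.5."""
    assert 3.5699456 < mu <= 4.0 and 0.0 < x0 < 1.0
    bits = []
    x = x0
    for _ in range(n):
        x = mu * x * (1.0 - x)              # x_1, x_2, ..., x_n
        bits.append(1 if x >= 0.5 else 0)   # decision on each x_i
    return bits

y = logistic_bits(3.99, 0.3, 16)
```

Because the map is chaotic, sender and receiver would need to use identical parameter values and the same floating-point arithmetic to regenerate the same sequence from S.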
[0069] Step 2: Dehide the cryptic data stream Z, extract the secret information, and restore the carrier. The process is as follows:
[0070] (1) Extract the Kth valid bit of data stream Z and perform an XOR operation with the K' (K'=1~8, K' is not equal to K) valid bit of data stream X or a certain pre-agreed binary sequence to obtain the binary sequence U, where K'=3;
[0071] (2) Extract g bits of secret information from the binary sequence U, where g = 262143;
[0072] (3) Based on the extended symbol HH in the file name, recover the binary parameter sequence S of length q, where q=128;
[0073] (4) Using parameter S, generate a binary sequence of length n to obtain the Kth effective bit information sequence B=b1, b2, ..., bn of data stream X, and output the first bit of the binary sequence U as the carrier quality evaluation flag. If the first bit is 1, then the recovered data stream X has a certain degree of distortion within the allowable range.
[0074] Data preprocessing has almost no impact on the quality of the carrier image, information hiding does not affect the quality of the recovered carrier, and data hiding does not affect the subjective quality of the carrier image. Secret information can always be recovered. The data file (carrier) can be recovered without loss or with high quality, as required. The algorithm has a large capacity: typically 1/8, or L/8 when L significant bits are processed at once. The q-bit parameter can be appended to the file name of data X when this meets the system's filename length requirements; alternatively, placing the q-bit parameter in bits 2 to q+1 of the Kth significant bit sequence overcomes the system's filename length limitation and prevents filename tampering during transmission. For images, PSNR performance is superior to traditional algorithms such as LSB, reaching 51dB-60dB, or even infinity when the carrier is recovered without loss.
[0075] The stego image (the image carrying the hidden data) has good invisibility during transmission.
[0076] Information recovery.
[0077] The restored image has a PSNR of 51dB at an embedding rate of 1 bpp.
[0078] The restored image has a PSNR of 60dB at an embedding rate of 1 bpp.
[0079] The restored image has PSNR=∞ at an embedding rate of 1 bpp.
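As a rough check on the PSNR figures above, the sketch below computes PSNR for 8-bit data. It is an illustrative calculation, not the experiment reported in the patent: the random test image and the function name `psnr` are assumptions. Overwriting the least significant bit with unrelated random bits changes about half of the values by 1, so the mean squared error is about 0.5 and the PSNR is about 10·log10(255²/0.5) ≈ 51.14 dB, while an exactly restored carrier has zero error and infinite PSNR.

```python
import numpy as np

def psnr(original: np.ndarray, restored: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the two arrays are identical."""
    diff = original.astype(np.float64) - restored.astype(np.float64)
    mse = float(np.mean(diff ** 2))
    return float("inf") if mse == 0.0 else 10.0 * float(np.log10(peak ** 2 / mse))

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(512, 512), dtype=np.uint8)       # stand-in carrier
lsb = rng.integers(0, 2, size=img.shape, dtype=np.uint8)          # unrelated bits
lossy = (img & np.uint8(0xFE)) | lsb                              # LSB overwritten

psnr_lossy = psnr(img, lossy)      # close to 51.14 dB
psnr_exact = psnr(img, img)        # inf
```

By the same formula, a PSNR of 60dB corresponds to a mean squared error of about 0.065, that is, roughly 6.5% of the least significant bits differing if no other bits are affected.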
[0080] This invention proposes a method for hiding information in a data file. The method preprocesses the data file, uses parameters to generate a sequence that meets the requirements, and obtains a data file that meets the requirements. The parameters and secret information are encoded and hidden for transmission using different methods. After storage or channel transmission, the receiving end can recover the secret information and recover the carrier data file with high quality.
[0081] Information hiding technology has been applied to terrestrial information networks and in the technical design and demonstration of spacecraft and various satellite data transmission systems, and will undoubtedly see wider application in the future. This invention provides a fixed-capacity information hiding method, with a hiding capacity of 1 / 8 or more. Simultaneously, this method features high-quality recovery of the data carrier, thus possessing practical value in both terrestrial systems and spacecraft engineering.
[0082] The parts of this invention not described in detail are common knowledge to those skilled in the art.
Claims
1. A method for hiding information in a data file, characterized in that it includes: preprocessing and hiding information in data file X to obtain a cryptic data stream Z; and de-hiding the cryptic data stream Z, extracting the secret information, and restoring the carrier; wherein preprocessing and hiding information in data file X to obtain the cryptic data stream Z includes: letting the data file be X = X1, X2, ..., Xn, where each data value is between 0 and 255 and is represented by 8 bits, and extracting the Kth significant bit of each data value to form a binary sequence B = b1, b2, ..., bn; using a set of parameters S = s1, s2, ..., sq of length q bits to generate a binary sequence of length n, denoted Y = y1, y2, ..., yn, where parameter q is less than n, and replacing the binary sequence B in the data stream with Y; converting parameter S to hexadecimal notation HH and appending it to the filename of data file X; hiding g bits of secret information in the binary sequence Y to form a binary sequence U = a, v1, v2, ..., vg, where g = n-1 and a is a one-bit flag, a equal to 0 indicating that the carrier data can be recovered without loss and a equal to 1 indicating that the recovered carrier data has a certain degree of distortion within an allowable range; performing an XOR operation between the binary sequence U and the K'th valid bit of the data file X (or some pre-defined binary sequence) to obtain the binary sequence D = d1, d2, ..., dn, where K' is not equal to K; and replacing the Kth valid bit in the data file X with the binary sequence D to obtain the hidden data stream Z containing the hidden information.
2. The method for hiding information in a data file according to claim 1, characterized in that, The method of generating a binary sequence of length n using a set of parameters S=s1, s2, ... sq of length q bits is specifically as follows: a binary sequence Y=y1, y2, ... yn is generated based on a pseudo-random sequence method; if Y is equal to B, the carrier can be recovered without loss, corresponding to a equal to 0; if Y is not equal to B, the carrier has a certain degree of distortion within the allowable range, corresponding to a equal to 1.
3. The method for hiding information in a data file according to claim 2, characterized in that, The method of generating a binary sequence of length n using a set of parameters S=s1, s2, ... sq of length q bits is specifically as follows: a binary sequence Y=y1, y2, ... yn is generated based on compression and decompression methods; if Y is equal to B, the carrier can be recovered without loss, corresponding to a equal to 0; if Y is not equal to B, the carrier has a certain degree of distortion within the allowable range, corresponding to a equal to 1.
4. A method for hiding information in a data file according to claim 2, characterized in that, The parameter S in the pseudo-random sequence-based method is: The q-bit parameters that generate the chaotic sequence are: parameter μ, initial value X0, and a binary sequence S=s1, s2, ... sq corresponding to length n.
5. A method for hiding information in a data file according to claim 2, characterized in that, The parameter S in the pseudo-random sequence-based method is: The q-bit parameters for generating the longest linear shift register sequence, i.e., the m-sequence, are: the generation parameters of the m-sequence, i.e., the initial state, structure, and the binary sequence S=s1, s2, ..., sq corresponding to the length n.
6. A method for hiding information in a data file according to claim 3, characterized in that, The parameter S in the compression and decompression method is: Lossless compression of the binary sequence B yields a compressed data file. The binary sequence corresponding to this file is used as the q-bit parameter sequence S=s1, s2, ..., sq, with a equal to 0.
7. A method for hiding information in a data file according to claim 3, characterized in that, The parameter S in the compression and decompression method is: Lossy compression is performed on the binary sequence B to obtain the compressed data file. The binary sequence corresponding to this file is used as the q-bit parameter sequence S=s1, s2, ... sq, with a equal to 1.
8. A method for hiding information in a data file according to claim 1, characterized in that, the process of de-hiding the cryptic data stream Z, extracting the secret information, and restoring the carrier includes: extracting the Kth significant bit of the cryptic data stream Z, and performing an XOR operation with the pre-agreed K'th significant bit of the data file X, or with a certain pre-agreed binary sequence, to obtain the binary sequence U; extracting g bits of secret information from the binary sequence U; based on the extended symbol HH in the filename, recovering the binary parameter sequence S of length q; and using parameter S to generate a binary sequence of length n so as to obtain the Kth effective bit information sequence B=b1, b2, ..., bn of data stream X, and outputting the first bit of the binary sequence U as a carrier quality evaluation flag; if the first bit is 0, the data stream X can be recovered without loss; if the first bit is 1, the recovered data stream X has a certain degree of distortion within the allowable range.