Carbon emission data blockchain storage and tracing method and system
By generating data fingerprints through hash calculations and constructing a source traceability graph, the problems of low data retrieval efficiency and difficulty in identifying tampering in carbon emission data management are solved. This enables tamper-proof evidence storage and rapid source traceability of carbon emission data, thereby improving monitoring and disposal efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING TRUTH WISDOM POWER TECH CO LTD
- Filing Date
- 2026-03-26
- Publication Date
- 2026-06-19
Figure CN122241775A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of blockchain data management technology, and in particular to a method and system for blockchain-based evidence storage and traceability of carbon emission data.
Background Technology
[0002] As the global climate change problem becomes increasingly severe, carbon emission management has become a key area of focus for governments and enterprises around the world. Accurately recording, tracing and managing carbon emission data is of great significance for achieving carbon peaking and carbon neutrality goals. Traditional carbon emission data management methods mainly rely on centralized database systems, where enterprises report emission data themselves, and then regulatory authorities review and compile the data.
[0003] Blockchain technology, owing to its decentralized, immutable, and traceable characteristics, is gradually being applied to carbon emission data management. The development of IoT technology enables carbon emission source devices to collect and transmit emission data in real time, providing a technical foundation for refined carbon emission management. However, existing carbon emission data management solutions still have several problems. First, on-chain carbon emission data schemes lack an efficient data indexing mechanism, so data retrieval is inefficient and cannot meet the needs of real-time supervision and rapid auditing. Second, they achieve only simple chain-style traceability and cannot effectively identify and locate data tampering. Third, they rely on after-the-fact manual review to discover data anomalies, so suspected tampered data nodes cannot be identified in real time.
Summary of the Invention
[0004] This invention provides a method and system for blockchain-based storage and traceability of carbon emission data, which can at least solve some of the problems existing in the prior art.
[0005] A first aspect of this invention provides a method for blockchain-based storage and traceability of carbon emission data, comprising:
[0006] Emission data is acquired from carbon emission source devices, and a data fingerprint corresponding to the emission data is generated through hash calculation. The data fingerprint and the emission data are encapsulated into an evidence storage unit. A time identifier and a device identifier are extracted from the emission data and combined to generate a blockchain address index, from which the storage location of the evidence storage unit is determined. The evidence storage unit is written to the storage location through consensus verification to obtain an evidence storage record.
[0007] The evidence storage record is read and used as the current tracing node, and the preceding evidence storage record associated with it is used as the preceding tracing node. A hash association value between the data fingerprints in the current tracing node and the preceding tracing node is calculated to generate an association verification identifier. A directed edge is created from the preceding tracing node to the current tracing node, and the association verification identifier is embedded into the directed edge to obtain a tracing relationship graph.
[0008] Based on the tracing relationship graph, the data fingerprint difference value and the time interval between the current tracing node and its adjacent tracing nodes are calculated, and the state vector of the current tracing node is updated through a dynamic propagation algorithm. An anomaly score is calculated based on the state vector and the association verification identifier. When the anomaly score is greater than a preset anomaly threshold, the current tracing node is marked as a suspected tampering node, and recursive verification is performed along the directed edges to determine the source of the anomaly propagation.
[0009] In one alternative implementation,
[0010] Acquiring emission data from carbon emission source devices, generating a data fingerprint corresponding to the emission data through hash calculation, and encapsulating the data fingerprint and the emission data into a storage unit includes:
[0011] Emission data is obtained from the carbon emission source device through a preset device access protocol. The emission data is serialized according to a preset field order to obtain an emission data byte stream. A hash calculation is performed on the emission data byte stream to generate an initial hash value. The device identifier and time identifier are extracted from the emission data, concatenated, and then a hash calculation is performed to generate an identifier hash value. The initial hash value and the identifier hash value are XORed to obtain an intermediate hash value.
[0012] Multiple rounds of iterative hash calculation are performed on the intermediate hash value. In each round of iteration, the hash output of the previous round is combined with the preset salt value and the hash calculation is performed again. The iteration is repeated until the preset number of iterations is reached to obtain the final hash value, and the final hash value is used as the data fingerprint.
[0013] The data fingerprint and the emission data are structured and encapsulated. A binding relationship between the data fingerprint and the emission data is established in the encapsulation structure and a version identifier is added to obtain the evidence storage unit.
[0014] In one alternative implementation,
[0015] The time identifier and device identifier are extracted from the emission data to generate a blockchain address index and determine the storage location of the evidence storage unit. The evidence storage unit is then written to the storage location through consensus verification to obtain the evidence storage record, which includes:
[0016] The time identifier and device identifier are extracted from the emission data and combined according to a preset concatenation rule to obtain a combined identifier string. A hash mapping operation is performed on the combined identifier string to generate a hash mapping value, and a sharding index is calculated from the hash mapping value. The blockchain address index is generated by combining the hash mapping value and the sharding index.
[0017] The target shard of the evidence storage unit in the blockchain distributed ledger is determined based on the blockchain address index, and the specific storage node is located within the target shard. The storage space corresponding to the specific storage node is determined as the storage location of the evidence storage unit.
[0018] The evidence storage unit is sent to the consensus node set corresponding to the storage location. The data fingerprint in the evidence storage unit is extracted, and the consistency check value between the data fingerprint and the encapsulation structure of the evidence storage unit is calculated. The consensus node set verifies the integrity of the evidence storage unit based on the consistency check value. The number of consensus nodes that pass the verification is counted, and the verification pass rate is calculated. When the verification pass rate reaches a preset consensus threshold, the consensus verification is determined to be successful. The data fingerprint in the evidence storage unit and the blockchain address index are hashed to generate a block header hash value. The evidence storage unit and the block header hash value are encapsulated to obtain a new block and written to the storage location. The writing time and the blockchain height are recorded to obtain the evidence storage record.
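The pass-rate check and the block header hash in this step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described by the invention: the two-thirds threshold, the use of SHA-256 for the block header hash, the byte layout, and the function names are all hypothetical, and each node's vote is taken as given rather than derived from a real consistency check.

```python
import hashlib
from typing import List, Tuple


def verification_pass_rate(votes: List[bool], threshold: float) -> Tuple[float, bool]:
    """Return (pass rate, consensus reached) for one evidence storage unit.

    votes[i] is True when consensus node i accepted the consistency check value.
    """
    if not votes:
        raise ValueError("at least one consensus node is required")
    rate = sum(votes) / len(votes)
    return rate, rate >= threshold


def block_header_hash(fingerprint_hex: str, address_index: str) -> str:
    """Hash the data fingerprint together with the blockchain address index.

    Assumption: SHA-256 over the fingerprint bytes followed by the UTF-8 index.
    """
    payload = bytes.fromhex(fingerprint_hex) + address_index.encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical example: 4 consensus nodes, 3 accept, threshold of 2/3.
rate, ok = verification_pass_rate([True, True, True, False], threshold=2 / 3)
header = block_header_hash("ab" * 32, "example-address-index") if ok else None
```

Only when `ok` is true would the new block be assembled and written; a failed vote leaves `header` unset.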
[0019] In one alternative implementation,
[0020] Reading the stored evidence record and using it as the current tracing node, and using the preceding stored evidence record associated with the stored evidence record as the preceding tracing node, calculating the hash association value between the data fingerprints in the current tracing node and the preceding tracing node to generate an association verification identifier includes:
[0021] The evidence storage record is read from the blockchain distributed ledger as the current tracing node. The blockchain address index of the current tracing node is parsed to obtain the device identifier, and historical evidence storage records with the same device identifier are retrieved. Historical evidence storage records whose blockchain height is smaller than that of the current tracing node are marked as candidate evidence storage records. The candidate evidence storage record with the largest blockchain height is taken as the preceding evidence storage record and used as the preceding tracing node.
[0022] The current data fingerprint corresponding to the current tracing node and the preceding data fingerprint corresponding to the preceding tracing node are extracted, serialized, and concatenated to obtain a dual fingerprint sequence, on which a hash operation is performed to generate a fingerprint hash association value. The block height difference between the current tracing node and the preceding tracing node is calculated, and from it the continuity of the two nodes on the blockchain is determined. If they are continuous, the fingerprint hash association value and the block height difference are combined to generate a continuity hash association value, which is used as the hash association value.
[0023] If they are not continuous, the intermediate evidence storage records between the preceding tracing node and the current tracing node are retrieved, and their data fingerprints are extracted to obtain an intermediate fingerprint chain. A chained hash calculation over the intermediate fingerprint chain then generates a cross-hash association value, which is used as the hash association value.
[0024] The hash association value is bound to the blockchain address index of the current tracing node to obtain the association verification identifier.
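The two branches for the hash association value can be illustrated with a short sketch. Several details are assumptions made only for illustration: SHA-256 as the hash, "continuous" meaning a block height difference of exactly 1, the concatenation order, the 8-byte encoding of the height difference, and the `index:value` binding format. The function names are hypothetical.

```python
import hashlib
from typing import Sequence


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def hash_association_value(prev_fp: str, curr_fp: str,
                           prev_height: int, curr_height: int,
                           intermediate_fps: Sequence[str] = ()) -> str:
    """Return the hash association value between two tracing nodes (hex)."""
    diff = curr_height - prev_height
    if diff <= 0:
        raise ValueError("the preceding node must have a smaller block height")
    if diff == 1:
        # Continuous case: hash the dual fingerprint sequence,
        # then mix in the block height difference.
        fp_assoc = _h(bytes.fromhex(curr_fp) + bytes.fromhex(prev_fp))
        return _h(fp_assoc + diff.to_bytes(8, "big")).hex()
    # Discontinuous case: chain-hash preceding -> intermediates -> current.
    acc = bytes.fromhex(prev_fp)
    for fp in list(intermediate_fps) + [curr_fp]:
        acc = _h(acc + bytes.fromhex(fp))
    return acc.hex()


def association_verification_id(address_index: str, assoc_value: str) -> str:
    """Bind the hash association value to the current node's address index."""
    return f"{address_index}:{assoc_value}"


# Placeholder fingerprints for illustration only.
fp_a, fp_b, fp_c = (hashlib.sha256(s).hexdigest() for s in (b"a", b"b", b"c"))
continuous = hash_association_value(fp_a, fp_b, prev_height=10, curr_height=11)
crossing = hash_association_value(fp_a, fp_c, 10, 13, intermediate_fps=[fp_b])
avid = association_verification_id("example-index", continuous)
```

Because the chained branch folds every intermediate fingerprint into the result, altering any record between the two nodes changes the cross-hash association value.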
[0025] In one alternative implementation,
[0026] Creating a directed edge from the preceding tracing node to the current tracing node and embedding the association verification identifier into the directed edge to obtain the tracing relationship graph includes:
[0027] The blockchain address index of the preceding tracing node is extracted as the starting node identifier, and the blockchain address index of the current tracing node is extracted as the ending node identifier. The starting node identifier and the ending node identifier are combined to generate an edge identifier, and a directed edge is created based on the edge identifier to establish a directional association relationship.
[0028] Based on the directional association relationship, the embedding position of the association verification identifier in the directed edge is determined, and the identifier is embedded at that position as a verification attribute. The hash association value in the association verification identifier is extracted and combined with the directional association relationship to calculate the integrity verification value of the directed edge. The edge weight value of the directed edge is then calculated jointly from the previously obtained block height difference and the integrity verification value.
[0029] Based on the edge weight value, the continuity between the current tracing node and the preceding tracing node on the blockchain is determined to obtain a continuity judgment result. If the result is continuous, the directed edge is marked as a continuous edge; if the result is discontinuous, the directed edge is marked as a cross edge.
[0030] The directed edges, their edge types, and their edge weight values are stored in a graph data structure. All evidence storage records in the blockchain distributed ledger are traversed, and the directed edge creation is repeated for each. The directed edges are combined with the corresponding tracing nodes to construct the tracing relationship graph.
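One possible in-memory shape for the directed edges and the graph is sketched below. The text does not give formulas for the integrity verification value or the edge weight, so the ones used here are placeholders: the integrity value is a SHA-256 over the verification identifier and edge identifier, and the weight is simply the reciprocal of the block height difference (ignoring the integrity value), so that a weight of 1.0 corresponds to adjacent blocks. All class and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DirectedEdge:
    edge_id: str
    start: str            # address index of the preceding tracing node
    end: str              # address index of the current tracing node
    verification_id: str  # embedded association verification identifier
    integrity: str        # integrity verification value of the edge
    weight: float
    edge_type: str        # "continuous" or "cross"


def make_edge(start: str, end: str, verification_id: str, height_diff: int) -> DirectedEdge:
    if height_diff < 1:
        raise ValueError("height_diff must be at least 1")
    edge_id = hashlib.sha256(f"{start}->{end}".encode("utf-8")).hexdigest()
    integrity = hashlib.sha256(f"{verification_id}|{edge_id}".encode("utf-8")).hexdigest()
    weight = 1.0 / height_diff  # placeholder weight: 1.0 only for adjacent blocks
    edge_type = "continuous" if weight == 1.0 else "cross"
    return DirectedEdge(edge_id, start, end, verification_id, integrity, weight, edge_type)


class TracingGraph:
    """Incoming edges keyed by the address index of the current node."""

    def __init__(self) -> None:
        self.predecessors: Dict[str, List[DirectedEdge]] = {}

    def add_edge(self, edge: DirectedEdge) -> None:
        self.predecessors.setdefault(edge.end, []).append(edge)


graph = TracingGraph()
graph.add_edge(make_edge("idx-1", "idx-2", "idx-2:assoc", height_diff=1))
graph.add_edge(make_edge("idx-2", "idx-3", "idx-3:assoc", height_diff=4))
```

Keying the adjacency lists by the ending node keeps backward traversal, the direction used later for anomaly tracing, a single dictionary lookup.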
[0031] In one alternative implementation,
[0032] Calculating, based on the tracing relationship graph, the data fingerprint difference value and time interval between the current tracing node and its adjacent tracing nodes, and updating the state vector of the current tracing node through a dynamic propagation algorithm includes:
[0033] The current tracing node is selected from the tracing relationship graph. By traversing the directed edges, the tracing nodes directly connected to the current tracing node are taken as adjacent tracing nodes. The current data fingerprint of the current tracing node and the adjacent data fingerprints of the adjacent tracing nodes are extracted, and locality-sensitive hash mapping is performed on them to obtain a current fingerprint vector and adjacent fingerprint vectors. A distance metric between the vectors is calculated to obtain the data fingerprint difference values, and a difference feature matrix is constructed from all data fingerprint difference values.
[0034] The time difference between the current tracing node and each adjacent tracing node is calculated to obtain the corresponding time interval. The integrity verification value of the directed edge connecting the current tracing node and the adjacent tracing node is extracted and coupled with the time interval to obtain a trust propagation coefficient.
[0035] The difference feature matrix and the trust propagation coefficient are combined to generate a propagation influence matrix, and singular value decomposition is performed on it to extract the dominant singular values. The difference feature matrix is reconstructed based on the dominant singular values to obtain a compressed propagation matrix. The adjacent tracing nodes are traversed, and their state vectors are extracted and transformed by the compressed propagation matrix to obtain a propagation state representation. Based on the edge type of the directed edges connecting the current tracing node and the adjacent tracing nodes, differential modulation and tensor contraction are applied to the propagation state representation to obtain an aggregated state vector, which is used as the state vector of the current tracing node.
[0036] In one alternative implementation,
[0037] Calculating an anomaly score based on the state vector and the association verification identifier, marking the current tracing node as a suspected tampering node when the anomaly score is greater than a preset anomaly threshold, and performing recursive verification along the directed edges to determine the source of the anomaly propagation includes:
[0038] The state vector of the current tracing node is projected onto a preset anomaly detection space to obtain an anomaly feature vector, and the association verification identifier of the current tracing node is extracted and parsed to obtain a verification feature vector.
[0039] A tensor product of the anomaly feature vector and the verification feature vector is computed to obtain a joint feature vector, and dimensionality reduction is performed on it to obtain an anomaly response value. The norm of the anomaly response value is taken as the anomaly score. Whether the anomaly score is greater than the preset anomaly threshold is then determined; if so, the current tracing node is marked as a suspected tampering node, and all directed edges connected to the suspected tampering node are extracted.
[0040] Following the direction of the directed edges, the preceding tracing nodes connected to the suspected tampering node are traversed and their anomaly scores are calculated. Preceding tracing nodes whose anomaly scores are greater than the anomaly threshold are marked as suspected tampering nodes, and the corresponding directed edges are recorded. The traversal and marking are repeated until the anomaly score of a preceding tracing node is less than or equal to the anomaly threshold.
[0041] All tracing nodes marked as suspected tampering nodes are extracted, and the tracing node whose anomaly score first exceeds the anomaly threshold is determined from the directed edges and the distribution of the anomaly scores, yielding the source of the anomaly propagation.
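The backward traversal that marks suspected tampering nodes and locates the propagation source can be sketched as below. The sketch assumes the anomaly scores have already been computed (the projection, tensor product, and norm steps are not reproduced), represents the graph as a plain predecessor map, and treats a "source" as a suspected node none of whose predecessors exceeds the threshold. The function name and the sample scores are hypothetical.

```python
from typing import Dict, List, Tuple


def trace_anomaly_source(start: str,
                         predecessors: Dict[str, List[str]],
                         scores: Dict[str, float],
                         threshold: float) -> Tuple[List[str], List[str]]:
    """Walk backward from `start` along the directed edges.

    Returns (suspected nodes in visit order, source nodes).
    """
    if scores[start] <= threshold:
        return [], []
    suspects: List[str] = []
    sources: List[str] = []
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        suspects.append(node)
        flagged = [p for p in predecessors.get(node, []) if scores[p] > threshold]
        if not flagged:
            sources.append(node)  # propagation stops here
        for p in flagged:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return suspects, sources


# Hypothetical chain n1 -> n2 -> n3 -> n4 with precomputed anomaly scores.
predecessors = {"n4": ["n3"], "n3": ["n2"], "n2": ["n1"]}
scores = {"n1": 0.10, "n2": 0.90, "n3": 0.80, "n4": 0.95}
suspects, sources = trace_anomaly_source("n4", predecessors, scores, threshold=0.5)
```

In this example the walk stops at `n2`, since its predecessor `n1` scores below the threshold, so `n2` is reported as the source. An explicit stack is used instead of recursion so that long device histories do not hit Python's recursion limit.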
[0042] A second aspect of this invention provides a blockchain-based system for storing and tracing carbon emission data, comprising:
[0043] An evidence storage module, configured to: acquire emission data from carbon emission source devices; generate a data fingerprint corresponding to the emission data through hash calculation; encapsulate the data fingerprint and the emission data into an evidence storage unit; extract a time identifier and a device identifier from the emission data to generate a blockchain address index and determine the storage location of the evidence storage unit; and write the evidence storage unit to the storage location through consensus verification to obtain an evidence storage record;
[0044] A tracing module, configured to: read the evidence storage record and use it as the current tracing node; use the preceding evidence storage record associated with the evidence storage record as the preceding tracing node; calculate the hash association value between the data fingerprints in the current tracing node and the preceding tracing node to generate an association verification identifier; and create a directed edge from the preceding tracing node to the current tracing node and embed the association verification identifier into the directed edge to obtain a tracing relationship graph;
[0045] A verification module, configured to: calculate, based on the tracing relationship graph, the data fingerprint difference value and time interval between the current tracing node and its adjacent tracing nodes; update the state vector of the current tracing node through a dynamic propagation algorithm; calculate an anomaly score based on the state vector and the association verification identifier; mark the current tracing node as a suspected tampering node when the anomaly score is greater than a preset anomaly threshold; and perform recursive verification along the directed edges to determine the source of the anomaly propagation.
[0046] A third aspect of the present invention provides an electronic device, comprising:
[0047] A processor and a memory for storing processor-executable instructions, wherein the processor is configured to invoke instructions stored in the memory to perform the aforementioned method.
[0048] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0049] In this invention, the hash fingerprint and the original carbon emission data are encapsulated into an evidence storage unit and written to the blockchain, and an address index generated from the time and device identifiers determines the storage location, achieving tamper-proof storage of carbon emission data. By constructing directed edges from preceding tracing nodes to the current tracing node and embedding in them association verification identifiers calculated from the data fingerprints, a complete tracing relationship graph is formed, enabling full-lifecycle traceability of carbon emission data. The graph shows the generation, flow, and change paths of the data, providing technical support for carbon emission regulation. By updating the state vectors of tracing nodes through a dynamic propagation algorithm and calculating anomaly scores from data fingerprint differences, time intervals, and association verification identifiers, suspected tampering nodes can be identified automatically and the source of anomaly propagation can be traced recursively along the directed edges. This enables rapid location and tracing of carbon emission data anomalies, improving data security monitoring and anomaly handling efficiency.
Attached Figure Description
[0050] Figure 1 is a flowchart of the carbon emission data blockchain storage and traceability method according to an embodiment of the present invention.
[0051] Figure 2 is a flowchart of the state propagation process of the carbon emission data blockchain storage and traceability method according to an embodiment of the present invention.
Detailed Implementation
[0052] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0053] The technical solution of the present invention will be described in detail below with reference to specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.
[0054] Figure 1 is a flowchart of the carbon emission data blockchain storage and traceability method according to an embodiment of the present invention. As shown in Figure 1, the method includes:
[0055] Emission data is acquired from carbon emission source devices, and a data fingerprint corresponding to the emission data is generated through hash calculation. The data fingerprint and the emission data are encapsulated into an evidence storage unit. A time identifier and a device identifier are extracted from the emission data and combined to generate a blockchain address index, from which the storage location of the evidence storage unit is determined. The evidence storage unit is written to the storage location through consensus verification to obtain an evidence storage record.
[0056] The evidence storage record is read and used as the current tracing node, and the preceding evidence storage record associated with it is used as the preceding tracing node. A hash association value between the data fingerprints in the current tracing node and the preceding tracing node is calculated to generate an association verification identifier. A directed edge is created from the preceding tracing node to the current tracing node, and the association verification identifier is embedded into the directed edge to obtain a tracing relationship graph.
[0057] Based on the tracing relationship graph, the data fingerprint difference value and the time interval between the current tracing node and its adjacent tracing nodes are calculated, and the state vector of the current tracing node is updated through a dynamic propagation algorithm. An anomaly score is calculated based on the state vector and the association verification identifier. When the anomaly score is greater than a preset anomaly threshold, the current tracing node is marked as a suspected tampering node, and recursive verification is performed along the directed edges to determine the source of the anomaly propagation.
[0058] In one alternative implementation,
[0059] Acquiring emission data from carbon emission source devices, generating a data fingerprint corresponding to the emission data through hash calculation, and encapsulating the data fingerprint and the emission data into a storage unit includes:
[0060] Emission data is obtained from the carbon emission source device through a preset device access protocol. The emission data is serialized according to a preset field order to obtain an emission data byte stream. A hash calculation is performed on the emission data byte stream to generate an initial hash value. The device identifier and time identifier are extracted from the emission data, concatenated, and then a hash calculation is performed to generate an identifier hash value. The initial hash value and the identifier hash value are XORed to obtain an intermediate hash value.
[0061] Multiple rounds of iterative hash calculation are performed on the intermediate hash value. In each round of iteration, the hash output of the previous round is combined with the preset salt value and the hash calculation is performed again. The iteration is repeated until the preset number of iterations is reached to obtain the final hash value, and the final hash value is used as the data fingerprint.
[0062] The data fingerprint and the emission data are structured and encapsulated. A binding relationship between the data fingerprint and the emission data is established in the encapsulation structure and a version identifier is added to obtain the evidence storage unit.
[0063] Emission data is acquired from carbon emission source devices through a preset device access protocol. The device access protocol can be a standard Industrial Internet of Things (IIoT) communication protocol, such as Modbus, MQTT, or OPC UA. Taking a continuous flue gas monitoring system in a coal-fired power plant as an example, it collects hourly data on the emission concentrations and flow rates of carbon dioxide and of pollutants such as sulfur dioxide and nitrogen oxides. The emission data includes, but is not limited to, fields such as: device identifier "CPMS20240601001", timestamp "2024-06-01 10:00:00", carbon dioxide concentration "156.7 g / m³", emission flow rate "21500 m³ / h", and operating status code "0001".
[0064] The acquired emission data is serialized according to a preset field order to obtain an emission data byte stream. The serialization process uses a compact binary format, with each field arranged in a fixed-length or prefix-length order. For the aforementioned emission data, the serialized byte stream structure is: device identifier + timestamp + carbon dioxide concentration + emission flow rate + operating status code. Serialization uses big-endian byte order to ensure data consistency across different platforms.
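A minimal sketch of such a serialization is shown below. The exact field widths are not specified in the text, so the layout here is an assumption: strings carry a 2-byte big-endian length prefix, the concentration is an 8-byte big-endian float, and the flow rate is a 4-byte big-endian unsigned integer. The function name and dictionary keys are illustrative.

```python
import struct


def serialize_emission(record: dict) -> bytes:
    """Serialize the fields in a fixed order using big-endian encoding."""

    def pack_str(text: str) -> bytes:
        raw = text.encode("utf-8")
        return struct.pack(">H", len(raw)) + raw  # 2-byte length prefix

    return b"".join([
        pack_str(record["device_id"]),
        pack_str(record["timestamp"]),
        struct.pack(">d", record["co2_concentration"]),  # 8-byte float
        struct.pack(">I", record["flow_rate"]),          # 4-byte unsigned int
        pack_str(record["status_code"]),
    ])


record = {
    "device_id": "CPMS20240601001",
    "timestamp": "2024-06-01 10:00:00",
    "co2_concentration": 156.7,
    "flow_rate": 21500,
    "status_code": "0001",
}
stream = serialize_emission(record)
```

Fixing both the field order and the byte order means the same record always yields the same byte stream, which is what makes the subsequent hash reproducible on any platform.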
[0065] A hash calculation is performed on the emission data byte stream to generate an initial hash value. The hash algorithm used is SHA-256, which can map input data of arbitrary length to a fixed-length 256-bit / 32-byte output. The SHA-256 calculation is performed on the aforementioned serialized byte stream to obtain the initial hash value (calculated based on the actual data).
[0066] The device identifier and time identifier are extracted from the emission data and concatenated. Then, a hash calculation is performed to generate the identifier hash value. The device identifier "CPMS20240601001" and the timestamp "2024-06-01 10:00:00" are concatenated according to the rule of "device identifier + underscore + time identifier" to form the string "CPMS20240601001_2024-06-01 10:00:00". The SHA-256 hash calculation is performed on this string to obtain the identifier hash value (based on the actual calculation result).
[0067] The initial hash value and the identifier hash value are XORed to obtain the intermediate hash value. The XOR operation is performed bit by bit; if corresponding bits are the same, the result is 0; otherwise, the result is 1. XORing two 256-bit hash values yields a 256-bit intermediate hash value.
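The combination of the two SHA-256 digests can be sketched as follows. The byte stream below is a placeholder rather than real serialized emission data, and the function names are illustrative.

```python
import hashlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def intermediate_hash(byte_stream: bytes, device_id: str, time_id: str) -> bytes:
    # Initial hash over the serialized emission data.
    initial = hashlib.sha256(byte_stream).digest()
    # Identifier hash over "device identifier + underscore + time identifier".
    identifier = hashlib.sha256(f"{device_id}_{time_id}".encode("utf-8")).digest()
    return xor_bytes(initial, identifier)


stream = b"placeholder emission byte stream"
mid = intermediate_hash(stream, "CPMS20240601001", "2024-06-01 10:00:00")
```

Because XOR is its own inverse, XORing `mid` with the identifier hash recovers the initial hash; the intermediate value therefore binds the data to its identifiers but is not by itself a one-way step.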
[0068] Multiple rounds of iterative hash calculations are performed on the intermediate hash value. In each iteration, the hash output of the previous round is combined with a preset salt value, and the hash calculation is performed again. The preset salt value is a 128-bit random byte sequence, such as "f7d59b3584b8bd5c1c2f76f3b9a72a1f". In the first iteration, the intermediate hash value and the salt value are concatenated, and SHA-256 calculation is performed. In the second iteration, the output of the first round is concatenated with the salt value, and SHA-256 calculation is performed again. This concatenation and calculation are repeated until the preset number of iterations is completed. The number of iterations is usually set to 10,000 to increase the computational difficulty and prevent brute-force attacks. After multiple rounds of iterative hash calculations, the final hash value is obtained. The final hash value is determined based on the actual calculation results and is used as the data fingerprint.
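The iteration can be written as a short loop. The salt and round count are taken from the example above; the concatenation order (previous output followed by the salt) and the function name are assumptions, since the text only says the two are concatenated.

```python
import hashlib

SALT = bytes.fromhex("f7d59b3584b8bd5c1c2f76f3b9a72a1f")  # 128-bit salt from the example
ITERATIONS = 10_000


def data_fingerprint(intermediate: bytes, salt: bytes = SALT,
                     rounds: int = ITERATIONS) -> str:
    """Repeatedly hash (previous output + salt) and return the final digest as hex."""
    digest = intermediate
    for _ in range(rounds):
        digest = hashlib.sha256(digest + salt).digest()
    return digest.hex()


# Placeholder intermediate hash value (32 zero bytes) for illustration only.
fingerprint = data_fingerprint(bytes(32))
```

Each round depends on the previous round's output, so the rounds cannot be parallelized; this is the source of the added computational cost per fingerprint.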
[0069] The data fingerprint and the emission data are structured and encapsulated to form the evidence storage unit. A binding relationship between the data fingerprint and the emission data is established within the encapsulation structure, and a version identifier is added. The encapsulation format is a JSON structure containing the following fields: the version identifier "version":"1.0"; the data fingerprint "fingerprint", whose value is the final hash value; the original emission data object "emission_data", which contains fields such as the device identifier, the timestamp, the measured emission values, and the operating status code; and a top-level "timestamp":1717027200000 (a millisecond-level Unix timestamp).
[0070] The structured encapsulation example is as follows: {"version":"1.0","fingerprint":"<final hash value>","emission_data":{"device_id":"CPMS20240601001","timestamp":"2024-06-01 10:00:00","co2_concentration":156.7,"flow_rate":21500,"status_code":"0001"},"timestamp":1717027200000}.
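The encapsulation can be reproduced with the standard `json` module, as in the sketch below. The function name is illustrative, and the fingerprint is left as the same placeholder used in the example rather than a computed value.

```python
import json


def build_evidence_unit(fingerprint: str, emission_data: dict,
                        timestamp_ms: int, version: str = "1.0") -> str:
    """Encapsulate fingerprint and emission data into one compact JSON string."""
    unit = {
        "version": version,
        "fingerprint": fingerprint,
        "emission_data": emission_data,
        "timestamp": timestamp_ms,
    }
    # Compact separators and a fixed key order give a stable byte representation.
    return json.dumps(unit, separators=(",", ":"), ensure_ascii=False)


emission_data = {
    "device_id": "CPMS20240601001",
    "timestamp": "2024-06-01 10:00:00",
    "co2_concentration": 156.7,
    "flow_rate": 21500,
    "status_code": "0001",
}
unit_json = build_evidence_unit("<final hash value>", emission_data, 1717027200000)
```

Keeping the fingerprint and the data in the same object lets a verifier recompute the fingerprint from `emission_data` and compare it with the stored `fingerprint` field without any external lookup.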
[0071] After the evidence storage unit is encapsulated, it is submitted to the blockchain network for on-chain storage. The blockchain network can adopt a consortium blockchain architecture, with participants including carbon emission source companies, regulatory agencies, carbon trading markets, and other stakeholders. The on-chain process uses consensus mechanisms such as PBFT or Raft algorithms to ensure data consistency and immutability. After the data is on-chain, a unique blockchain transaction hash is generated, such as "0x7b4d864d75481bfdad4a2595d2e6c96452baa1a0975f78623d9d19d1c22760fc", which serves as the unique index identifier of the evidence storage unit on the chain.
[0072] In this embodiment, the hash generation method enhanced with multi-source features makes the data fingerprint more sensitive to subtle data changes, so tampering is harder to conceal. The salted iterative processing strengthens the fingerprint's resistance to collision and replay attacks, improving security strength. The structured binding makes the data verifiable and traceable, avoiding the weak verification chain caused by the separation of data and fingerprints in traditional evidence storage methods, and improving the security, reliability, and traceability credibility of the evidence storage process.
[0073] In one alternative implementation,
[0074] The time identifier and device identifier are extracted from the emission data to generate a blockchain address index and determine the storage location of the evidence storage unit. The evidence storage unit is then written to the storage location through consensus verification to obtain the evidence storage record, which includes:
[0075] The time identifier and device identifier are extracted from the emission data and combined according to a preset concatenation rule to obtain a combined identifier string. A hash mapping operation is performed on the combined identifier string to generate a hash mapping value, and a sharding index is calculated from the hash mapping value. The blockchain address index is generated by combining the hash mapping value and the sharding index.
[0076] The target shard of the evidence storage unit in the blockchain distributed ledger is determined based on the blockchain address index, and the specific storage node is located within the target shard. The storage space corresponding to the specific storage node is determined as the storage location of the evidence storage unit.
[0077] The evidence storage unit is sent to the consensus node set corresponding to the storage location. The data fingerprint in the evidence storage unit is extracted, and the consistency check value between the data fingerprint and the encapsulation structure of the evidence storage unit is calculated. The consensus node set verifies the integrity of the evidence storage unit based on the consistency check value. The number of consensus nodes that pass the verification is counted, and the verification pass rate is calculated. When the verification pass rate reaches a preset consensus threshold, the consensus verification is determined to be successful. The data fingerprint in the evidence storage unit and the blockchain address index are hashed to generate a block header hash value. The evidence storage unit and the block header hash value are encapsulated to obtain a new block and written to the storage location. The writing time and the blockchain height are recorded to obtain the evidence storage record.
[0078] The time identifier and device identifier are extracted from the emission data and combined according to a preset concatenation rule to obtain a combined identifier string. The time identifier adopts a compact, digits-only date and time form based on ISO 8601 (YYYYMMDDhhmmss), such as "20240602093000" for the carbon emission record of a coal-fired power plant at 9:30 on June 2, 2024; the device identifier is the carbon emission monitoring equipment number of the coal-fired power plant, "CEMS10025". Following the same concatenation rule as in the previous embodiment, the device identifier is placed first, the time identifier is placed last, and an underscore connector is added in between to obtain the combined identifier string "CEMS10025_20240602093000".
[0079] A hash mapping operation is performed on the combined identifier string to generate a hash mapping value. The hash mapping uses the SHA-256 algorithm, which maps an input of arbitrary length to a fixed-length 256-bit output. The SHA-256 calculation is performed on the aforementioned combined identifier string "CEMS10025_20240602093000" to obtain the hash mapping value (based on the actual calculation result). The sharding index is calculated based on a preset sharding rule. The sharding rule is defined as taking the first 8 hexadecimal digits of the hash mapping value, converting them to decimal, and then taking the modulo of the total number of shards. Assuming the blockchain network has 16 shards, the sharding index is obtained by converting the first 8 hexadecimal digits to decimal and taking the modulo of 16.
[0080] A blockchain address index is generated by combining the hash mapping value and the shard index. The combination operation involves converting the shard index to a two-digit hexadecimal number and concatenating it with the last 20 digits of the hash mapping value to obtain the blockchain address index. This blockchain address index is unique, ensuring that different storage units are allocated to different storage locations, achieving distributed data storage and load balancing.
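The address index derivation described above can be sketched as follows. This is a minimal illustration, assuming the combined identifier string is UTF-8 encoded before hashing; the function name `build_address_index` and the parameter `total_shards` are hypothetical names introduced here, not part of the embodiment.

```python
import hashlib

def build_address_index(device_id: str, time_id: str, total_shards: int = 16) -> tuple[int, str]:
    """Return (shard_index, address_index) for one evidence storage unit."""
    # Device identifier first, underscore connector, time identifier last.
    combined = f"{device_id}_{time_id}"
    # SHA-256 maps the string to a 256-bit value (64 hexadecimal digits).
    mapping = hashlib.sha256(combined.encode("utf-8")).hexdigest()
    # First 8 hex digits -> decimal -> modulo the total number of shards.
    shard_index = int(mapping[:8], 16) % total_shards
    # Two-digit hex shard index followed by the last 20 hex digits.
    address_index = f"{shard_index:02x}{mapping[-20:]}"
    return shard_index, address_index

shard, address = build_address_index("CEMS10025", "20240602093000")
print(shard, address)
```

With 16 shards the shard index always falls in the range 0 to 15, and the resulting address index is 22 hexadecimal digits long.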
[0081] The target shard of the blockchain distributed ledger is determined based on the blockchain address index. Specifically, the corresponding shard is located based on the shard index. Within the target shard, the specific storage node is located. A consistent hash ring algorithm is used to evenly distribute the storage nodes on the hash ring. The last few digits of the blockchain address index are used as the location basis and mapped onto the hash ring. The first node found in a clockwise direction is the target storage node, and the storage space corresponding to this node is determined as the storage location of the evidence storage unit.
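A consistent hash ring lookup of this kind might look like the sketch below. The ring size (32 bits), the way node positions are derived (SHA-256 of the node name), and the number of trailing digits used (`tail_digits`) are all assumptions made for illustration; the embodiment does not fix them.

```python
import bisect
import hashlib

def ring_position(key: str) -> int:
    """Place a node name on an assumed 32-bit ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest()[:8], 16)

def locate_storage_node(address_index: str, node_ids: list[str], tail_digits: int = 8) -> str:
    """Return the first node found clockwise from the key's ring position."""
    ring = sorted((ring_position(n), n) for n in node_ids)
    positions = [p for p, _ in ring]
    # The last few hex digits of the address index are the location basis.
    key_pos = int(address_index[-tail_digits:], 16)
    # First node at or after the key position; wrap to the start of the ring.
    i = bisect.bisect_left(positions, key_pos) % len(ring)
    return ring[i][1]

nodes = ["Node73", "Node75", "Node76"]
print(locate_storage_node("0732b62b39d5c086784145", nodes))
```

Because only the neighbouring arc of the ring is affected when a node joins or leaves, most address indexes keep their storage node under such changes.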
[0082] The evidence storage unit is sent to the consensus node set corresponding to the storage location. In a sharded blockchain architecture, each shard consists of multiple consensus nodes, responsible for the consensus verification and storage of the data in that shard. The evidence storage unit data packet, in JSON format, is sent to the consensus node set of that shard, containing the complete evidence storage unit content and the target storage location information.
[0083] Extract the data fingerprint from the evidence storage unit and calculate the consistency check value between the data fingerprint and the encapsulation structure of the evidence storage unit. Extract the "fingerprint" value of the data fingerprint field from the JSON structure of the evidence storage unit and extract the emission data object "emission_data". Recalculate the data fingerprint of the emission data object according to the same rules mentioned above, and compare whether the calculation result is consistent with the original data fingerprint. The consistency check is represented by a Boolean value.
[0084] The consensus node set verifies the integrity of the evidence storage unit based on the consistency check value. Each consensus node independently executes the verification process and records the verification result. The number of consensus nodes that pass the verification is counted and the verification pass rate is calculated. The verification pass rate is the ratio of the number of passing nodes to the total number of nodes. When the verification pass rate reaches the preset consensus threshold, the consensus verification is deemed successful.
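The per-node consistency check and the pass-rate decision can be sketched as follows. The fingerprint rule of the earlier embodiment is not reproduced here, so `stand_in_fingerprint` is a placeholder (SHA-256 over canonical JSON); the field `co2_t` and the two-thirds threshold are likewise assumptions for illustration.

```python
import hashlib
import json

def stand_in_fingerprint(emission_data: dict) -> str:
    """Placeholder for the embodiment's fingerprint rule (assumed, not the real rule)."""
    canonical = json.dumps(emission_data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def node_check(unit: dict) -> bool:
    """One consensus node's check: recompute the fingerprint and compare (Boolean result)."""
    return stand_in_fingerprint(unit["emission_data"]) == unit["fingerprint"]

def consensus_passed(votes: list[bool], threshold: float = 2 / 3) -> bool:
    """Pass when passing nodes / total nodes reaches the preset consensus threshold."""
    pass_rate = sum(votes) / len(votes)
    return pass_rate >= threshold

data = {"device_id": "CEMS10025", "time": "20240602093000", "co2_t": 12.5}
unit = {"emission_data": data, "fingerprint": stand_in_fingerprint(data)}
tampered = {"emission_data": {**data, "co2_t": 1.0}, "fingerprint": unit["fingerprint"]}

print(node_check(unit), node_check(tampered))
print(consensus_passed([True, True, True, False]))
```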
[0085] The data fingerprint and blockchain address index in the evidence storage unit are hashed to generate the block header hash value. The data fingerprint and blockchain address index are concatenated and then subjected to SHA-256 hash calculation to obtain the block header hash value.
[0086] The new block is obtained by encapsulating the evidence storage unit and the block header hash value, and then written to the storage location. The new block structure consists of two parts: a block header and a block body. The block header contains metadata such as the block header hash value, the hash reference of the previous block, and a timestamp. The block body contains the complete content of the evidence storage unit. The new block is then written to the target storage node in the corresponding shard, completing the on-chain storage of the evidence storage unit.
[0087] The notarized record is generated by recording the write time and the blockchain height. The write time uses UTC timestamp format, and the blockchain height is the current block height of the corresponding shard chain.
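A minimal sketch of the block encapsulation and the resulting evidence storage record is given below. The concatenation order (data fingerprint first, then address index), the dictionary field names, and the helper names `block_header_hash` and `build_block` are assumptions chosen to mirror the sample records in this description.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional

def block_header_hash(fingerprint: str, address_index: str) -> str:
    """SHA-256 over the concatenated data fingerprint and address index."""
    return hashlib.sha256((fingerprint + address_index).encode("utf-8")).hexdigest()

def build_block(unit: dict, address_index: str, prev_hash: str, height: int,
                write_time: Optional[str] = None) -> tuple[dict, dict]:
    """Return (new_block, evidence_record) for one evidence storage unit."""
    header_hash = block_header_hash(unit["fingerprint"], address_index)
    # Write time as a UTC timestamp; generated now unless supplied by the caller.
    if write_time is None:
        write_time = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    block = {
        "header": {"block_hash": header_hash, "prev_hash": prev_hash, "timestamp": write_time},
        "body": unit,  # complete content of the evidence storage unit
    }
    record = {
        "block_hash": header_hash,
        "block_height": height,
        "timestamp": write_time,
        "address_index": address_index,
        "data_fingerprint": unit["fingerprint"],
    }
    return block, record

sample_unit = {"fingerprint": "d3b47d720cdb5d201a58f0e43f8c452b0cde688a85b1b31a2c4f3c30bdc46df5",
               "emission_data": {}}
block, record = build_block(sample_unit, "0732b62b39d5c086784145", "0" * 64, 1056892,
                            write_time="2024-06-02T09:35:12Z")
print(record["block_height"], record["timestamp"])
```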
[0088] In this embodiment, a stable and evenly distributed blockchain address index is generated based on the time stamp and device identifier of the emission data. Combined with sharding rules, precise target shard positioning is achieved, which significantly improves the addressing efficiency and storage load balancing capability of the evidence storage unit in the distributed ledger. This avoids storage latency and performance bottlenecks caused by node congestion or hotspot sharding. By performing fingerprint consistency verification on the evidence storage unit in the consensus node set and using the verification pass rate as the consensus judgment basis, the credibility and anti-forgery capability of the evidence storage result are greatly improved. This significantly enhances the integrity verification strength before data writing and reduces the risk of interference from malicious nodes. By simultaneously encapsulating the data fingerprint and address index in the new block and generating the block header hash value, the final written evidence storage record has higher non-repudiation and traceability, thus strengthening the security and reliability of the overall evidence storage system.
[0089] In one alternative implementation,
[0090] Reading the stored evidence record and using it as the current tracing node, and using the preceding stored evidence record associated with the stored evidence record as the preceding tracing node, calculating the hash association value between the data fingerprints in the current tracing node and the preceding tracing node to generate an association verification identifier includes:
[0091] The evidence storage record is read from the blockchain distributed ledger as the current traceability node. The blockchain address index of the current traceability node is parsed to obtain the device identifier and historical evidence storage records with the same device identifier are retrieved. Historical evidence storage records with a blockchain height smaller than the current traceability node are identified and marked as candidate evidence storage records. The candidate evidence storage record with the largest blockchain height is extracted to obtain the previous evidence storage record and used as the previous traceability node.
[0092] Extract the current data fingerprint corresponding to the current tracing node and the previous data fingerprint corresponding to the previous tracing node, serialize and concatenate them to obtain a dual fingerprint sequence, and perform a hash operation to generate a fingerprint hash association value. Calculate the block height difference between the block height corresponding to the current tracing node and the block height corresponding to the previous tracing node, and determine the continuity of the current tracing node and the previous tracing node on the blockchain. If they are continuous, combine the fingerprint hash association value and the block height difference to generate a continuity hash association value as the hash association value.
[0093] If they are not continuous, the intermediate evidence records between the previous source node and the current source node are retrieved and the corresponding data fingerprints are extracted to obtain the intermediate fingerprint chain. The chain hash calculation is then combined to generate a cross-hash association value as the hash association value.
[0094] The hash association value is bound to the blockchain address index of the current tracing node to obtain the association verification identifier.
[0095] The current traceability node is retrieved from the blockchain distributed ledger. Taking the latest carbon emission data record of a thermal power plant as an example, the current traceability node data is {"block_hash":"f582a4c16c0e9a5d327cecbe4c95df0cc9557c3fb7a8a0e3f4b8cc0c5a727243","block_height":1056892,"timestamp":"2024-06-02T09:35:12Z","shard_index":7,"node_id":"Node75","address_index":"0732b62b39d5c086784145","data_fingerprint":"d3b47d720cdb5d201a58f0e43f8c452b0cde688a85b1b31a2c4f3c30bdc46df5"}.
[0096] The device identifier is obtained by parsing the blockchain address index of the current tracing node. The blockchain address index "0732b62b39d5c086784145" is constructed by adding a shard index to the device hash mapping value. This is then reverse-parsed using the blockchain index parsing service to recover the device identifier "CEMS10025". Historical evidence records with the same device identifier "CEMS10025" are retrieved. A distributed query engine performs a cross-shard query across the blockchain network to locate all evidence records containing this device identifier.
[0097] Identify historical evidence records whose blockchain height is less than that of the current traceability node (1056892), and mark them as candidate evidence records. The query results contain multiple records, such as: "Record 1":{"block_hash":"a247c1f9d42e43b0c7599c57369e8a391e8b52e9d78a3735a1cc8207d1c8a762","block_height":1056889,"timestamp":"2024-06-02T08:30:25Z","shard_index":7,"node_id":"Node73","address_index":"0732b62b39d5c086784145","data_fingerprint":"67e9f2731f78649b5a29b734de67430e158fcec093c76be4f6630e2b665c6b1d"};"Record 2":{"block_hash":"c8254b97e592e48bc1e66900a5a17c2269fcbee76257385e81ef9868be73ff08","block_height":1056882,"timestamp":"2024-06-02T07:30:18Z","shard_index":7,"node_id":"Node76","address_index":"0732b62b39d5c086784145","data_fingerprint":"09a8b2c74f10e3d59c2e76f3591afd82c46e5b13789a48f1b3c7ea3a88fb2e15"}.
[0098] Extract the candidate evidence record with the highest blockchain height to obtain the preceding evidence record, and use it as the preceding traceability node. Sort the candidate records in descending order of blockchain height, and select the record with the greatest height, Record 1 (block height 1056889), as the preceding traceability node. The preceding traceability node data is {"block_hash":"a247c1f9d42e43b0c7599c57369e8a391e8b52e9d78a3735a1cc8207d1c8a762","block_height":1056889,"timestamp":"2024-06-02T08:30:25Z","shard_index":7,"node_id":"Node73","address_index":"0732b62b39d5c086784145","data_fingerprint":"67e9f2731f78649b5a29b734de67430e158fcec093c76be4f6630e2b665c6b1d"}.
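The selection of the preceding traceability node can be sketched as follows, assuming the history has already been filtered to records with the same device identifier. The function name `select_preceding` and the `name` field are hypothetical and used only for illustration.

```python
from typing import Optional

def select_preceding(current: dict, history: list[dict]) -> Optional[dict]:
    """Pick the candidate with the greatest block height below the current node's height."""
    candidates = [r for r in history if r["block_height"] < current["block_height"]]
    return max(candidates, key=lambda r: r["block_height"], default=None)

current = {"block_height": 1056892}
history = [
    {"name": "Record 1", "block_height": 1056889},
    {"name": "Record 2", "block_height": 1056882},
]
preceding = select_preceding(current, history)
print(preceding["name"])  # Record 1
```

Returning `None` when no candidate exists covers the first record ever stored for a device, which has no preceding node.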
[0099] Extract the current data fingerprint "d3b47d720cdb5d201a58f0e43f8c452b0cde688a85b1b31a2c4f3c30bdc46df5" corresponding to the current tracing node and concatenate it with the previous data fingerprint "67e9f2731f78649b5a29b734de67430e158fcec093c76be4f6630e2b665c6b1d" corresponding to the previous tracing node, and obtain a dual fingerprint sequence. The serialization concatenation rule is to place the preceding data fingerprint first, followed by the current data fingerprint, with no separator in between, resulting in "67e9f2731f78649b5a29b734de67430e158fcec093c76be4f6630e2b665c6b1dd3b47d720cdb5d201a58f0e43f8c452b0cde688a85b1b31a2c4f3c30bdc46df5". A hash operation is then performed on this dual fingerprint sequence, using the SHA-256 algorithm to generate the fingerprint hash association value "e2c64f35cb15e6d6b9c3a6fcb633a5aeb308a72e9e1eb2dc57c5dfd5c9fd1a74".
[0100] The block height difference between the current traceability node (block height 1056892) and the previous traceability node (block height 1056889) is calculated, and the difference is found to be 3. The continuity of the current traceability node and the previous traceability node on the blockchain is determined by whether the block height difference is less than or equal to a preset threshold. The preset threshold is 5 blocks. Since the difference of 3 is less than the threshold of 5, they are considered continuous.
[0101] The fingerprint hash association value and the block height difference are combined to generate a continuity hash association value, which is then used as the hash association value. The combination operation is performed by converting the block height difference into the hexadecimal string "03", concatenating it with the fingerprint hash association value, and then performing another SHA-256 hash calculation. The concatenation result is "03e2c64f35cb15e6d6b9c3a6fcb633a5aeb308a72e9e1eb2dc57c5dfd5c9fd1a74", and after performing the hash calculation, the continuity hash association value "a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86" is obtained, which is used as the final hash association value.
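The continuity branch can be sketched as below. The input encoding is an assumption: the hexadecimal strings are hashed as ASCII text rather than as raw bytes, so the digests this sketch prints are not asserted to equal the sample values quoted above. The helper names are hypothetical.

```python
import hashlib
from typing import Optional

def sha256_hex(text: str) -> str:
    """SHA-256 of an ASCII string, returned as 64 hex digits (encoding is an assumption)."""
    return hashlib.sha256(text.encode("ascii")).hexdigest()

def continuity_association(prev_fp: str, cur_fp: str, prev_height: int,
                           cur_height: int, threshold: int = 5) -> Optional[str]:
    """Return the continuity hash association value, or None if the nodes are discontinuous."""
    diff = cur_height - prev_height
    if diff > threshold:
        return None  # handled by the chained-hash branch instead
    # Preceding fingerprint first, current fingerprint second, no separator.
    fingerprint_assoc = sha256_hex(prev_fp + cur_fp)
    # Two-digit hex block height difference prepended, then hashed again.
    return sha256_hex(f"{diff:02x}" + fingerprint_assoc)

prev_fp = "67e9f2731f78649b5a29b734de67430e158fcec093c76be4f6630e2b665c6b1d"
cur_fp = "d3b47d720cdb5d201a58f0e43f8c452b0cde688a85b1b31a2c4f3c30bdc46df5"
value = continuity_association(prev_fp, cur_fp, 1056889, 1056892)
print(value)
```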
[0102] When handling discontinuities, if the block height difference between the current tracing node and the previous tracing node exceeds the preset threshold of 5, the two nodes are considered discontinuous. In this case, it is necessary to retrieve the intermediate evidence records between the previous and current tracing nodes and extract the corresponding data fingerprints to obtain an intermediate fingerprint chain. Assuming a block height difference of 10, the retrieved data fingerprints corresponding to the intermediate evidence records are "24a6f75b8e2d9c314d58f1962ac7bf452c8f92e3c7a1536478fc2b176ac2ef97" and "f1d2e58a3c964b70d2a8c6175e2b0f3129f7c0e835a6d24a85c73e6b9c12d047". The retrieval of intermediate evidence records is performed through a distributed query engine on the blockchain network, and three conditions must be met simultaneously. First, each intermediate evidence record must have the same device identifier as the current tracing node; the device identifier, determined by parsing the blockchain address index, serves as the key retrieval condition. Second, the block height of each intermediate evidence record must lie within the open interval between the previous tracing node and the current tracing node, that is, strictly greater than the block height of the previous tracing node and strictly less than that of the current tracing node. Third, all retrieved intermediate evidence records must be arranged in ascending order of block height to ensure the temporal correctness of the subsequent chain hash calculation.
[0103] In practice, the query engine uses the blockchain address index of the current traceability node as a filtering condition, limits the range of block height values, extracts the data fingerprint field from the evidence storage records that meet the conditions, and obtains them in ascending order of block height to form a complete intermediate fingerprint chain. Combined with the previous data fingerprint and the current data fingerprint, they are arranged in ascending order of block height to form a complete fingerprint chain.
[0104] A cross-hash association value is generated by combining chained hash calculations. The chained hash calculation method starts with the preceding data fingerprint, concatenates it with the next data fingerprint, calculates the hash value, and then concatenates the result with the following data fingerprint to calculate the next hash, until all data fingerprints have been processed. For example, the preceding data fingerprint is concatenated with the first intermediate fingerprint to calculate hash value A, hash value A is concatenated with the second intermediate fingerprint to calculate hash value B, hash value B is concatenated with the current data fingerprint to calculate hash value C, and hash value C is the cross-hash association value "b5a7f2d18c65e4936a9d2f781e3c0b5a479c8f6d2e51a3794c8b6d5a97e2f183".
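The chained hash calculation amounts to a left fold over the ordered fingerprint chain, as the sketch below shows. SHA-256 over ASCII hex text is assumed as the hash step, and `chain_hash` is a hypothetical name; the printed digest is therefore not asserted to equal the sample value above.

```python
import hashlib
from functools import reduce

def chain_hash(fingerprints: list[str]) -> str:
    """Fold the chain: H(...H(H(f0 + f1) + f2)... + fn), fingerprints in ascending height order."""
    if len(fingerprints) < 2:
        raise ValueError("at least a preceding and a current fingerprint are required")

    def step(acc: str, nxt: str) -> str:
        return hashlib.sha256((acc + nxt).encode("ascii")).hexdigest()

    return reduce(step, fingerprints[1:], fingerprints[0])

chain = [
    "67e9f2731f78649b5a29b734de67430e158fcec093c76be4f6630e2b665c6b1d",  # preceding
    "24a6f75b8e2d9c314d58f1962ac7bf452c8f92e3c7a1536478fc2b176ac2ef97",  # intermediate 1
    "f1d2e58a3c964b70d2a8c6175e2b0f3129f7c0e835a6d24a85c73e6b9c12d047",  # intermediate 2
    "d3b47d720cdb5d201a58f0e43f8c452b0cde688a85b1b31a2c4f3c30bdc46df5",  # current
]
print(chain_hash(chain))
```

Because each step feeds the previous digest into the next, removing, reordering, or altering any intermediate fingerprint changes the final value.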
[0105] The association verification identifier is obtained by binding the hash association value with the blockchain address index of the current tracing node. The binding method is to combine the hash association value "a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86" with the blockchain address index "0732b62b39d5c086784145" to generate the association verification identifier "a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86-0732b62b39d5c086784145".
[0106] In this embodiment, by introducing historical record retrieval based on device identifiers, block height continuity judgment, and a dual-fingerprint, chain-based association verification mechanism during the tracing process, the authenticity, integrity, and anomaly identification capabilities of the tracing link are enhanced. By locating historical evidence based on device identifiers and automatically filtering preceding nodes according to block height, it is ensured that the tracing path corresponds to the same data entity, reducing interference from irrelevant records and improving the accuracy of the tracing trajectory. By comparing block heights to determine the continuity between tracing nodes and generating continuity hash association values and cross-hash association values respectively, abnormal chain segments can be automatically identified when there are link breaks, missing records, or potential tampering, improving the anomaly detection capability of the tracing link. By binding the hash association value with the blockchain address index to generate an association verification identifier, each tracing step becomes a unique and verifiable association credential, enhancing the non-repudiation and auditability of the tracing results.
[0107] In one alternative implementation,
[0108] Create directed edges from the preceding source node to the current source node and embed the association verification identifier into the directed edges to obtain the source relationship graph, including:
[0109] Extract the blockchain address index of the preceding tracing node as the starting node identifier, extract the blockchain address index of the current tracing node as the ending node identifier, perform a combination operation on the starting node identifier and the ending node identifier to generate an edge identifier, and create a directed edge based on the edge identifier and establish a directional association relationship.
[0110] Based on the directional association relationship, the embedding position of the association verification identifier in the directed edge is determined and embedded as a verification attribute in the embedding position. The hash association value in the association verification identifier is extracted and combined with the directional association relationship to calculate the integrity verification value of the directed edge. Based on the pre-acquired block height difference and the integrity verification value, the edge weight value of the directed edge is jointly calculated.
[0111] Based on the edge weight value, the continuity between the current tracing node and the previous tracing node on the blockchain is determined to obtain a continuity judgment result. If the continuity judgment result is continuous, the directed edge is marked as a continuous edge type. If the continuity judgment result is discontinuous, the directed edge is marked as a spanning edge type.
[0112] The directed edges, their corresponding label types, and edge weight values are stored in a graph data structure. All evidence records in the blockchain distributed ledger are traversed and the directed edge construction is repeated for each pair of associated traceability nodes. The directed edges are combined with the corresponding traceability nodes to construct a traceability relationship graph.
[0113] The blockchain address index of the preceding traceability node is extracted as the starting node identifier. Taking the aforementioned carbon emission data from a thermal power plant as an example, the blockchain address index of the preceding traceability node is "0732b62b39d5c086784145", which is directly used as the starting node identifier. The blockchain address index of the current traceability node, also "0732b62b39d5c086784145", is extracted as the ending node identifier. In practical applications, carbon emission data generated by the same device at different times will share the same device identifier, but the blockchain address index may differ due to different time identifiers. The starting node identifier and the ending node identifier are combined to generate an edge identifier. The combination operation uses string concatenation, connecting the starting node identifier and the ending node identifier with the arrow symbol "->" to obtain the edge identifier "0732b62b39d5c086784145->0732b62b39d5c086784145".
[0114] Directed edges are created based on edge identifiers, and directional relationships are established. The creation of directed edges uses the edge creation statement of the graph database. The unique identifier of the edge is the edge identifier generated above. The starting vertex is the previous source node, and the ending vertex is the current source node. The direction is from the previous source node to the current source node, representing the temporal sequence and data evolution path. Using an attribute graph model, a directed edge object is created, represented as {id:"0732b62b39d5c086784145->0732b62b39d5c086784145",source:"0732b62b39d5c086784145",target:"0732b62b39d5c086784145",direction:"forward"}. The directional relationship is defined as a unidirectional relationship from the previous source node to the current source node, reflecting the cumulative relationship of carbon emission data over time.
[0115] The embedding position of the association verification identifier in the directed edge is determined based on the directional association relationship and embedded at that position as a verification attribute. The association verification identifier is "a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86-0732b62b39d5c086784145" generated earlier. The embedding position is set to the "checksum" attribute field of the directed edge. The embedding is implemented by adding the attribute {checksum:"a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86-0732b62b39d5c086784145"} to the directed edge object.
[0116] The hash association value is extracted from the association verification identifier and combined with the directional association relationship to calculate the integrity verification value of the directed edge. The hash association value is "a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86". The integrity verification value is calculated by combining the directional association relationship. When the direction is "forward", the first 16 hexadecimal digits of the hash association value are taken; when the direction is "backward", the last 16 hexadecimal digits of the hash association value are taken. In this embodiment, the direction is "forward", so the first 16 hexadecimal digits "a63c5e82f94184b6" are taken as the integrity verification value.
[0117] The edge weight of the directed edge is calculated based on the pre-obtained block height difference and integrity verification value. The block height difference is 3, and the integrity verification value is "a63c5e82f94184b6". In the general form, the edge weight is calculated by converting the integrity verification value to a decimal number, multiplying it by the reciprocal of the block height difference, and then taking the logarithm. In this embodiment, to simplify the calculation, the principle that a smaller block height difference yields a larger weight is adopted, and the weight calculation formula is 10 minus the block height difference. In this example, the edge weight is 10 - 3 = 7.
[0118] The continuity judgment result is obtained by determining the continuity between the current tracing node and its predecessor on the blockchain based on the edge weight value. The judgment criterion is whether the edge weight value is greater than or equal to a preset threshold. The preset threshold is 5. When the edge weight value is greater than or equal to 5, it is judged as continuous; when it is less than 5, it is judged as discontinuous. In this embodiment, the edge weight value is 7, which is greater than the threshold 5, and the judgment result is continuous. The directed edge is marked as a continuous edge type, specifically by adding the attribute {edge_type:"continuous",weight:7} to the directed edge object.
[0119] In cases of discontinuity, if the edge weight value is less than a preset threshold, the result is considered discontinuous, and the directed edge is marked as a spanning edge. Specifically, this is achieved by adding the attribute `{edge_type:"spanning", weight: edge weight value}` to the directed edge object. A spanning edge indicates a data gap or a significant time interval between two source nodes, requiring extra attention.
[0120] Directed edges, along with their corresponding label types and edge weights, are stored in a graph data structure. The graph data structure uses an attribute graph model, consisting of nodes and edges. Nodes represent traceability nodes and include attributes such as blockchain address index, block height, and data fingerprint; edges represent the relationships between traceability nodes and include attributes such as edge identifier, edge type, and edge weight. The complete directed edge object is stored as {id:"0732b62b39d5c086784145->0732b62b39d5c086784145",source:"0732b62b39d5c086784145",target:"0732b62b39d5c086784145",direction:"forward",checksum:"a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86-0732b62b39d5c086784145",edge_type:"continuous",weight:7}.
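The directed edge construction described above can be sketched as follows, using the simplified weight formula (10 minus the block height difference) and the threshold of 5 from this embodiment. The function name `build_edge` and its parameters are hypothetical; the dictionary keys follow the stored edge object shown above.

```python
def build_edge(prev_addr: str, cur_addr: str, assoc_hash: str, height_diff: int,
               direction: str = "forward", threshold: int = 5) -> tuple[dict, str]:
    """Return (directed_edge_object, integrity_verification_value)."""
    # Forward edges take the first 16 hex digits, backward edges the last 16.
    integrity = assoc_hash[:16] if direction == "forward" else assoc_hash[-16:]
    weight = 10 - height_diff  # simplified rule: smaller difference, larger weight
    edge_type = "continuous" if weight >= threshold else "spanning"
    edge = {
        "id": f"{prev_addr}->{cur_addr}",
        "source": prev_addr,
        "target": cur_addr,
        "direction": direction,
        "checksum": f"{assoc_hash}-{cur_addr}",  # association verification identifier
        "edge_type": edge_type,
        "weight": weight,
    }
    return edge, integrity

addr = "0732b62b39d5c086784145"
assoc = "a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86"
edge, integrity = build_edge(addr, addr, assoc, height_diff=3)
print(integrity, edge["edge_type"], edge["weight"])  # a63c5e82f94184b6 continuous 7
```

With a block height difference of 10, the same function yields a weight of 0 and marks the edge as "spanning".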
[0121] For each pair of related traceability nodes, the process of constructing directed edges is repeated to generate corresponding directed edges and store them in a graph data structure. In practical applications, a scheduled task can be set to scan the newly added evidence records on the blockchain at regular intervals and update the traceability graph.
[0122] The source tracing relationship graph is constructed by combining directed edges with their corresponding source tracing nodes. The source tracing relationship graph is constructed using a graph merging operation, with all source tracing nodes as vertices and all directed edges as edges, forming a complete carbon emission data source tracing relationship network.
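One way to hold the resulting attribute graph in memory is sketched below. The class `TraceGraph` is hypothetical, and keying each vertex by the pair (address index, block height) is an assumption made here so that records sharing an address index remain distinct vertices; a graph database would serve the same purpose in practice.

```python
from collections import defaultdict

class TraceGraph:
    """Minimal attribute graph: vertices are traceability nodes, edges are directed edges."""

    def __init__(self) -> None:
        self.nodes: dict = {}
        self.edges: dict = {}
        self._adjacent = defaultdict(set)

    def add_node(self, key: tuple, **attrs) -> None:
        self.nodes[key] = attrs

    def add_edge(self, source: tuple, target: tuple, **attrs) -> None:
        self.edges[(source, target)] = attrs
        # Record adjacency in both directions so neighbours can be found
        # whether the edge originates from or ends at a node.
        self._adjacent[source].add(target)
        self._adjacent[target].add(source)

    def neighbors(self, key: tuple) -> list:
        return sorted(self._adjacent.get(key, set()))

graph = TraceGraph()
prev_key = ("0732b62b39d5c086784145", 1056889)
cur_key = ("0732b62b39d5c086784145", 1056892)
graph.add_node(prev_key, data_fingerprint="67e9f2731f78649b5a29b734de67430e158fcec093c76be4f6630e2b665c6b1d")
graph.add_node(cur_key, data_fingerprint="d3b47d720cdb5d201a58f0e43f8c452b0cde688a85b1b31a2c4f3c30bdc46df5")
graph.add_edge(prev_key, cur_key, edge_type="continuous", weight=7)
print(graph.neighbors(cur_key))
```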
[0123] In this embodiment, by constructing directed edges using blockchain address indexes, embedding verification attributes, and calculating edge weights based on hash association values and block height differences during the tracing process, the structural expression capability, trust verification capability, and anomaly identification capability of the tracing link are significantly improved compared to existing technologies. By creating directed edges using edge identifiers generated from start and end nodes and establishing directional associations, the tracing path is more clearly expressed in terms of structure, effectively avoiding the problems of ambiguous node associations and difficulty in path reconstruction in traditional tracing methods. By embedding association verification identifiers into directed edges and calculating integrity verification values based on directional associations, each link not only describes the path relationship but also embeds verifiable integrity evidence, significantly enhancing the credibility of the tracing link.
[0124] In one alternative implementation,
[0125] Based on the source tracing graph, the data fingerprint difference and time interval between the current source node and its neighboring source nodes are calculated, and the state vector of the current source node is updated using a dynamic propagation algorithm, including:
[0126] The current source node is selected from the source relationship graph. Based on the directed edge traversal, the source nodes directly connected to the current source node are taken as adjacent source nodes. The current data fingerprint corresponding to the current source node and the adjacent data fingerprints corresponding to the adjacent source nodes are extracted, and locality-sensitive hash mapping is performed to obtain the current fingerprint vector and the adjacent fingerprint vectors. The distance metric is calculated to obtain the data fingerprint difference value. A difference feature matrix is constructed based on all data fingerprint difference values.
[0127] The time difference between the current source node and its neighboring source nodes is calculated to obtain the corresponding time interval. The integrity verification value of the directed edge connecting the current source node and its neighboring source nodes is extracted. The trust propagation coefficient is obtained by coupling the value with the time interval.
[0128] The difference feature matrix and the trust propagation coefficient are combined to generate a propagation influence matrix, and singular value decomposition is performed to extract dominant singular values. The difference feature matrix is reconstructed based on the dominant singular values to obtain a compressed propagation matrix. Adjacent source nodes are traversed and their corresponding state vectors are extracted. The state vectors are transformed with the compressed propagation matrix to obtain a propagation state representation. Based on the edge type of the directed edges connecting the current source node and adjacent source nodes, the propagation state representation is subjected to differential modulation and tensor contraction operations to obtain an aggregated state vector, which is used as the state vector of the current source node.
[0129] Select the current source node from the source relationship graph. Taking the latest carbon emission data record of the aforementioned thermal power plant as an example, the current source node data is {"block_hash":"f582a4c16c0e9a5d327cecbe4c95df0cc9557c3fb7a8a0e3f4b8cc0c5a727243","block_height":1056892,"timestamp":"2024-06-02T09:35:12Z","address_index":"0732b62b39d5c086784145","data_fingerprint":"d3b47d720cdb5d201a58f0e43f8c452b0cde688a85b1b31a2c4f3c30bdc46df5"}.
[0130] Directed edges are traversed, and the source nodes directly connected to the current source node are taken as adjacent source nodes. All directed edges in the source relationship graph that originate from or end at the current source node are queried to obtain its adjacent source nodes. The query result contains three adjacent source nodes: adjacent node 1 is the previous source node {"block_height":1056889,"address_index":"0732b62b39d5c086784145","data_fingerprint":"67e9f2731f78649b5a29b734de67430e158fcec093c76be4f6630e2b665c6b1d"}; adjacent node 2 is the parallel device node {"block_height":1056890,"address_index":"0732c73b46e8d297895256","data_fingerprint":"95a7c4d12b38f6e09c5d2a81f47b6e32d5c8a97f1e4b3d26a7c9e58d3b41f2c0"}; adjacent node 3 is a downstream device node {"block_height":1056880,"address_index":"0732e85d67f9e3a89a6367","data_fingerprint":"38d6c9a5f2e7b1d84c59f32a6e7b9d58c4a26f37e95d81c4b3a7e69d28f5c4b1"}.
[0131] Extract the current data fingerprint corresponding to the current source node and the adjacent data fingerprints corresponding to the adjacent source nodes, and perform locality-sensitive hashing (LSH) mapping. The LSH mapping uses the SimHash algorithm: the hash value is divided into 8 bytes, each byte is converted to 8 bits, and each bit is mapped to the corresponding position in the feature vector. When the binary bit is 1, the corresponding position in the feature vector is 1; when it is 0, the corresponding position in the feature vector is -1. Applying this mapping to the current data fingerprint "d3b47d720cdb5d201a58f0e43f8c452b0cde688a85b1b31a2c4f3c30bdc46df5" yields the current fingerprint vector V0, with the first 8 entries being [1,-1,1,1,-1,-1,-1,1]. Applying the same mapping to the data fingerprints of the three adjacent nodes yields the adjacent fingerprint vectors V1, V2, and V3.
[0132] The distance metric is calculated to obtain the data fingerprint difference value. Hamming distance is used as the distance metric, that is, the number of positions at which two vectors differ. The Hamming distance between the current fingerprint vector V0 and each of the three adjacent fingerprint vectors V1, V2, and V3 is calculated to obtain three data fingerprint difference values: D1=12, D2=38, and D3=45. The smaller the difference value, the higher the data similarity. A difference feature matrix is constructed based on all data fingerprint difference values. A matrix M is created, where the matrix element M[i,j] represents the data fingerprint difference value between source node i and source node j. For the previous example, a 4×4 matrix is constructed, with diagonal elements being 0 and off-diagonal elements representing the difference values between the corresponding nodes. The resulting difference feature matrix is [[0,12,38,45],[12,0,29,42],[38,29,0,31],[45,42,31,0]].
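The bit mapping and Hamming distance steps described above can be sketched in Python as follows. This is a minimal illustration only: the function names and the one-byte toy inputs are assumptions introduced here, and the sketch does not reproduce the byte selection used in the embodiment, which is not specified in detail.

```python
def to_sign_vector(hex_str: str) -> list[int]:
    """Map each bit of a hex string to +1 (bit set) or -1 (bit clear)."""
    n_bits = len(hex_str) * 4
    bits = bin(int(hex_str, 16))[2:].zfill(n_bits)
    return [1 if b == "1" else -1 for b in bits]


def hamming_distance(u: list[int], v: list[int]) -> int:
    """Count the positions at which two equal-length vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(1 for a, b in zip(u, v) if a != b)


def difference_matrix(vectors: list[list[int]]) -> list[list[int]]:
    """Build the symmetric matrix M[i][j] of pairwise Hamming distances."""
    return [[hamming_distance(u, v) for v in vectors] for u in vectors]


# Hypothetical one-byte inputs, chosen only to keep the example small.
vectors = [to_sign_vector(h) for h in ("b1", "b0", "4e")]
matrix = difference_matrix(vectors)
```

The matrix is symmetric with a zero diagonal by construction, which matches the structure of the difference feature matrix in the example.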
[0133] The time difference between the current source node and each of its adjacent source nodes is calculated to obtain the corresponding time interval. The node timestamps are extracted, the difference between the two timestamps is calculated, and the result is converted to seconds. The current source node's timestamp is "2024-06-02T09:35:12Z", and the timestamps of the three adjacent nodes are "2024-06-02T08:30:25Z", "2024-06-02T08:45:18Z", and "2024-06-02T06:15:42Z". The calculated time intervals are: T1 = 3887 seconds, T2 = 2994 seconds, and T3 = 11970 seconds.
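The interval calculation can be sketched with the Python standard library as below. The helper name interval_seconds is an assumption introduced for illustration; the timestamps are those given in the example.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def interval_seconds(current: str, neighbor: str) -> int:
    """Return the time difference current - neighbor in whole seconds."""
    delta = (datetime.strptime(current, TIMESTAMP_FORMAT)
             - datetime.strptime(neighbor, TIMESTAMP_FORMAT))
    return int(delta.total_seconds())


current_ts = "2024-06-02T09:35:12Z"
neighbor_ts = [
    "2024-06-02T08:30:25Z",
    "2024-06-02T08:45:18Z",
    "2024-06-02T06:15:42Z",
]
intervals = [interval_seconds(current_ts, ts) for ts in neighbor_ts]
```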
[0134] Extract the integrity verification value of the directed edge connecting the current source node and its adjacent source nodes. Extract the integrity verification value field from the directed edge object generated earlier. The integrity verification values of the three directed edges are "a63c5e82f94184b6", "b74d6f93e85295c7", and "c85e7fa4f96306d8".
[0135] The trust propagation coefficient is obtained by coupling the integrity verification value with the time interval. The first 8 hexadecimal digits of the integrity verification value are converted to decimal, multiplied by a time decay factor derived from the reciprocal of the time interval, scaled, and then normalized. For example, "a63c5e82" is converted to the decimal number 2788974210. The time decay factor is obtained by dividing 1000 by the time interval. The two are multiplied and the product is divided by 10^12 to obtain the initial propagation coefficient; the sigmoid function is then applied for normalization to obtain the final trust propagation coefficient. The trust propagation coefficients of the three adjacent nodes are P1=0.87, P2=0.75, and P3=0.36, respectively.
[0136] The difference feature matrix and the trust propagation coefficient are combined to generate the propagation influence matrix. The elements in the difference feature matrix are modulated using the trust propagation coefficient by multiplying each element by its corresponding propagation coefficient. The resulting propagation influence matrix is [[0,10.44,28.5,16.2],[10.44,0,21.75,15.12],[28.5,21.75,0,11.16],[16.2,15.12,11.16,0]].
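The modulation step can be sketched in Python as follows. The indexing rule used here, scaling element M[i][j] by the coefficient of the higher-numbered node, is an assumption inferred from the worked numbers in the example (for instance 12 × 0.87 = 10.44 and 29 × 0.75 = 21.75); the text itself only states that each element is multiplied by its corresponding coefficient. The function name and the placeholder coefficient for the current node are likewise illustrative.

```python
def modulate(diff: list[list[float]], coeffs: list[float]) -> list[list[float]]:
    """Scale each off-diagonal element by the coefficient of node max(i, j)."""
    n = len(diff)
    return [
        [0.0 if i == j else round(diff[i][j] * coeffs[max(i, j)], 2)
         for j in range(n)]
        for i in range(n)
    ]


difference = [
    [0, 12, 38, 45],
    [12, 0, 29, 42],
    [38, 29, 0, 31],
    [45, 42, 31, 0],
]
# Index 0 is the current node; its coefficient is an unused placeholder.
coefficients = [1.0, 0.87, 0.75, 0.36]
influence = modulate(difference, coefficients)
```

Under this rule the result stays symmetric, which is consistent with the propagation influence matrix given in the example.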
[0137] Singular value decomposition (SVD) is performed to extract dominant singular values. The propagation influence matrix is decomposed into three matrices: U, S, and V. S is a diagonal matrix, and its diagonal elements are the singular values, calculated to be [72.63, 8.42, 4.19, 0.21]. The two largest singular values, 72.63 and 8.42, are selected as the dominant singular values.
[0138] The compressed propagation matrix is obtained by reconstructing the difference feature matrix based on the dominant singular values. The singular vectors corresponding to the first two dominant singular values are retained, and the other singular values are set to zero. The matrix is then recalculated. The reconstructed compressed propagation matrix is [[0,10.2,27.8,15.9],[10.2,0,21.4,14.8],[27.8,21.4,0,11.0],[15.9,14.8,11.0,0]].
[0139] Traverse the adjacent source nodes and extract their corresponding state vectors. Each state vector represents the current state information of a node, including key indicators such as carbon emissions and equipment operating parameters. The initial state vector is a 32-dimensional vector, where the first 8 elements represent the carbon emission intensity level, the middle 16 elements represent the equipment operating status, and the last 8 elements represent the data reliability score. The state vectors of the three adjacent nodes are S1, S2, and S3, each containing 32 elements.
[0140] The propagation state representation is obtained by transforming the state vector with the compressed propagation matrix. The inverse of the compressed propagation matrix is then multiplied by the state vector to obtain the mapped state representation. This operation is performed on the state vectors S1, S2, and S3 of three adjacent nodes to obtain three propagation state representations T1, T2, and T3.
[0141] Differential modulation and tensor contraction operations are performed on the propagation state representation based on the edge type of the directed edges connecting the current source node and its neighboring source nodes. Differential modulation uses different weights according to the edge type: continuous edges have a weight of 0.9, and crossing edges have a weight of 0.6. The edge types connecting the current node to its three neighboring nodes are continuous edges, continuous edges, and crossing edges, with corresponding weights of 0.9, 0.9, and 0.6, respectively. The propagation state representation is modulated using these weights to obtain the modulated state representation. The tensor contraction operation merges the three modulated state representations into a single aggregated state vector using a weighted summation method. For each position, the weighted average of the three state representations is calculated, with the weight being the corresponding edge type weight.
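The edge-type weighting and weighted-average aggregation described above can be sketched as follows. This is a simplified illustration: the function name and the two-element toy vectors are assumptions, and the sketch covers only the weighted average, not the preceding matrix transformation.

```python
# Edge-type weights as stated in the embodiment.
EDGE_WEIGHTS = {"continuous": 0.9, "crossing": 0.6}


def aggregate(states: list[list[float]], edge_types: list[str]) -> list[float]:
    """Weighted average of neighbor state representations, position by position."""
    weights = [EDGE_WEIGHTS[t] for t in edge_types]
    total = sum(weights)
    dim = len(states[0])
    return [
        sum(w * s[k] for w, s in zip(weights, states)) / total
        for k in range(dim)
    ]


# Hypothetical 2-element representations standing in for the 32-element ones.
toy_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_types = ["continuous", "continuous", "crossing"]
aggregated = aggregate(toy_states, toy_types)
```

Dividing by the sum of the weights keeps the aggregated values in the same range as the inputs, so a node with only crossing edges is not penalized merely for having lower edge weights.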
[0142] The aggregated state vector is obtained and used as the state vector of the current source node. The calculated aggregated state vector is a 32-dimensional vector. The first 8 elements are [0.85, 0.72, 0.91, 0.68, 0.79, 0.88, 0.76, 0.82], indicating that the carbon emission intensity is at a medium-to-high level; the middle 16 elements reflect key parameters of the equipment's operating status; and the last 8 elements are [0.92, 0.89, 0.91, 0.94, 0.90, 0.93, 0.95, 0.91], indicating a high data reliability score. This state vector serves as the final state representation of the current source node and is used for subsequent analysis and decision-making.
[0143] In this embodiment, by vectorizing the fingerprint differences between the current node and its neighboring nodes, the data evolution relationship between nodes can be more precisely characterized, enhancing the ability to identify subtle abnormal changes. By coupling the time interval with the integrity verification value to generate a trust propagation coefficient, the propagation path can automatically adjust the influence weight according to the link credibility, thereby improving the ability of the source tracing calculation to suppress discontinuous links and unreliable nodes and improving the identification accuracy of abnormal chain segments. By performing singular value decomposition on the difference feature matrix and extracting the dominant singular value, feature denoising and structural information enhancement are achieved, enabling the propagation calculation to maintain stability and computational efficiency in a high-dimensional feature environment. By combining edge type to perform differential modulation on the propagation state representation, the different contributions of continuous links and cross-links to the source tracing state can be distinguished, effectively enhancing the expressive power and resolution of the source tracing state update.
[0144] Figure 2 is a flowchart illustrating the state propagation process of the carbon emission data blockchain storage and traceability method in this embodiment of the invention.
[0145] In one alternative implementation,
[0146] An anomaly score is calculated based on the state vector and the associated verification identifier. When the anomaly score is greater than a preset anomaly threshold, the current source node is marked as a suspected tampering node. Recursive verification is performed along the directed edges to determine the source of the anomaly propagation, including:
[0147] Project the state vector corresponding to the current source node onto the preset anomaly detection space to obtain the anomaly feature vector, and extract the associated verification identifier of the current source node and parse it to obtain the verification feature vector;
[0148] A tensor product of the anomaly feature vector and the verification feature vector is computed to obtain a joint feature vector, and dimensionality reduction mapping is then performed to obtain an anomaly response value. The norm of the anomaly response value is taken as the anomaly score. It is determined whether the anomaly score is greater than a preset anomaly threshold. If it is, the current source node is marked as a suspected tampering node, and all directed edges connected to the suspected tampering node are extracted.
[0149] Based on the direction of the directed edge, traverse the previous source nodes that are connected to the suspected tampered node and calculate the corresponding anomaly score. Mark the previous source nodes with anomaly scores greater than the anomaly threshold as suspected tampered nodes and record the directed edge. Repeat the traversal and marking until the anomaly score of the previous source node is less than or equal to the anomaly threshold.
[0150] Extract all source nodes marked as suspected tampering nodes, and, based on the directed edges and the distribution of the anomaly scores, determine the source node whose anomaly score first exceeds the anomaly threshold to obtain the source of the anomaly propagation.
[0151] The anomaly feature vector is obtained by projecting the state vector corresponding to the current source node onto a preset anomaly detection space. Taking the carbon emission data of the aforementioned thermal power plant as an example, the state vector of the current source node is the 32-dimensional vector calculated above. The first 8 elements are [0.85, 0.72, 0.91, 0.68, 0.79, 0.88, 0.76, 0.82], representing carbon emission intensity; the middle 16 elements are equipment operating parameters; and the last 8 elements are [0.92, 0.89, 0.91, 0.94, 0.90, 0.93, 0.95, 0.91], representing data credibility. The anomaly detection space is defined by a principal component analysis model trained on historical data, and the projection is performed by taking the inner product of the state vector with the orthogonal basis of that space. The projection matrix is provided by the pre-trained anomaly detection model and has a dimension of 32×16, representing the projection of the 32-dimensional state vector onto the 16-dimensional anomaly detection space. Multiplying the state vector by the projection matrix yields a 16-dimensional anomaly feature vector [0.79,0.83,0.67,0.92,0.75,0.81,0.88,0.72,0.85,0.91,0.76,0.83,0.69,0.78,0.84,0.77].
[0152] Extract the associated verification identifier of the current tracing node and parse it to obtain the verification feature vector. The associated verification identifier is "a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86-0732b62b39d5c086784145" generated earlier. The parsing method involves extracting the hash association value "a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86", grouping it into 16 groups of 4 characters each. Each group is converted to a decimal number and normalized to the [0,1] interval, resulting in the verification feature vector [0.65,0.37,0.92,0.83,0.64,0.46,0.71,0.88,0.77,0.51,0.68,0.93,0.85,0.42,0.69,0.75]. The verification feature vector reflects the integrity of the association between the current source node and its predecessor nodes.
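The parsing step can be sketched in Python as follows. The function name is hypothetical, and dividing each 4-character group by 65535 (the largest 4-digit hexadecimal value) is an assumed normalization, since the text does not state which scaling is used to reach the [0,1] interval.

```python
def verification_features(identifier: str, group_size: int = 4) -> list[float]:
    """Split the hash part of the identifier into hex groups scaled to [0, 1]."""
    # The identifier has the form "<hash association value>-<address index>".
    hash_value, _address_index = identifier.split("-", 1)
    max_value = 16 ** group_size - 1  # 0xFFFF for 4-character groups (assumed)
    return [
        int(hash_value[i:i + group_size], 16) / max_value
        for i in range(0, len(hash_value), group_size)
    ]


identifier = (
    "a63c5e82f94184b6a411c8e7b45b812d9d713e1a8974ab984d4f9123f4562d86"
    "-0732b62b39d5c086784145"
)
features = verification_features(identifier)
```

A 64-character hash association value split into 4-character groups gives the 16 elements needed to match the dimension of the anomaly feature vector.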
[0153] A joint feature vector is obtained by performing a tensor product operation on the anomaly feature vector and the verification feature vector. In this embodiment the tensor product operation is implemented as element-wise multiplication of the two vectors, resulting in a 16-dimensional joint feature vector [0.51, 0.31, 0.62, 0.76, 0.48, 0.37, 0.62, 0.63, 0.65, 0.46, 0.52, 0.77, 0.59, 0.33, 0.58, 0.58]. The joint feature vector jointly reflects the degree of anomaly and the integrity of the association. Dimensionality reduction mapping is performed to obtain the anomaly response value. The dimensionality reduction mapping adopts a weighted summation method. The weight vector is learned by the anomaly detection model from a large amount of historical data and has a dimension of 16×1. The weight values reflect the importance of each feature in anomaly detection. The inner product of the joint feature vector and the weight vector is computed to obtain the scalar anomaly response value of 0.23.
[0154] An anomaly score is obtained by extracting the norm of the abnormal response value. The norm is calculated by taking the absolute value of the abnormal response value, resulting in an anomaly score of 0.23. The anomaly score is then checked against a preset anomaly threshold. For example, the preset anomaly threshold is 0.6, determined through extensive historical data analysis to balance detection sensitivity and false alarm rate. Since the current anomaly score of 0.23 is less than the threshold of 0.6, the result is considered normal.
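The scoring steps in the two preceding paragraphs can be sketched as follows. The function names are assumptions introduced here. The learned weight vector is not disclosed in the text, so the sketch leaves it as a parameter and only evaluates the element-wise product on the example vectors.

```python
def joint_features(anomaly_vec: list[float], verify_vec: list[float]) -> list[float]:
    """Element-wise product of the two feature vectors, rounded to 2 decimals."""
    return [round(a * v, 2) for a, v in zip(anomaly_vec, verify_vec)]


def anomaly_score(joint: list[float], weights: list[float]) -> float:
    """Weighted sum of the joint features, followed by the absolute value."""
    return abs(sum(j * w for j, w in zip(joint, weights)))


def is_suspected(score: float, threshold: float = 0.6) -> bool:
    """A node is flagged only when its score strictly exceeds the threshold."""
    return score > threshold


anomaly_vec = [0.79, 0.83, 0.67, 0.92, 0.75, 0.81, 0.88, 0.72,
               0.85, 0.91, 0.76, 0.83, 0.69, 0.78, 0.84, 0.77]
verify_vec = [0.65, 0.37, 0.92, 0.83, 0.64, 0.46, 0.71, 0.88,
              0.77, 0.51, 0.68, 0.93, 0.85, 0.42, 0.69, 0.75]
joint = joint_features(anomaly_vec, verify_vec)
```

With the model's actual weight vector supplied to anomaly_score, the result would be compared against the threshold through is_suspected.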
[0155] To demonstrate the tampering detection process, assume that an abnormal source node is subsequently detected. Its state vector is the same as that of the aforementioned node, but its data fingerprint has been maliciously modified, causing the verification feature vector to become [0.25, 0.17, 0.32, 0.43, 0.24, 0.26, 0.31, 0.38, 0.37, 0.21, 0.28, 0.33, 0.35, 0.22, 0.29, 0.35]. Repeating the aforementioned calculation process, an anomaly score of 0.72 is obtained, which is greater than the threshold of 0.6. Therefore, this node is marked as a suspected tampering node. The anomaly marker information is recorded as {"node_id":"0732b62b39d5c086784145","block_height":1056892,"anomaly_score":0.72,"timestamp":"2024-06-02T09:35:12Z","status":"suspected_tampered"}.
[0156] Extract all directed edges connecting the suspected tampered node. Query all directed edges in the source graph that originate from or end at the suspected tampered node to obtain the connection relationships. The query result contains three directed edges, connecting three adjacent nodes: predecessor node 1, parallel node 2, and downstream node 3. The data of the three edges is {"id":"edge1","source":"0732b62b39d5c086784145","target":"0732e85d67f9e3a89a6367","edge_type":"continuous","weight":7}, {"id":"edge2","source":"0732c73b46e8d297895256","target":"0732b62b39d5c086784145","edge_type":"continuous","weight":7}, and {"id":"edge3","source":"0732b62b39d5c086784145","target":"0732f96e78g0f4b90a7478","edge_type":"spanning","weight":4}.
[0157] Based on the traversal of directed edges, the preceding source nodes connected to the suspected tampered node are identified. The preceding source node is the node whose directed edge points to the suspected tampered node; based on the edge direction, it is identified as node 2, with the corresponding address index "0732c73b46e8d297895256". The anomaly score corresponding to the preceding source node is calculated, and the aforementioned anomaly detection process is repeated. The anomaly score of preceding node 2 is 0.83, which is greater than the threshold of 0.6. Therefore, it is marked as a suspected tampered node, and the connecting edge edge2 is recorded.
[0158] Continue traversing the predecessor node of node 2, identifying it as node 4, with the corresponding address index "0732d84c57h9e5b01c8589". The anomaly score of node 4 is calculated to be 0.91, which is greater than the threshold, so it is marked as a suspected tampered node and the connecting edge edge4 is recorded. Repeat the traversal of the predecessor node of node 4, identifying it as node 5. The anomaly score is calculated to be 0.95, which is greater than the threshold, so it is marked as a suspected tampered node and the connecting edge edge5 is recorded. Continue traversing the predecessor node of node 5, identifying it as node 6. The anomaly score is calculated to be 0.48, which is less than the threshold of 0.6, so it is determined to be a normal node. At this point, the anomaly scores of the predecessor nodes are all less than the threshold, and the traversal terminates.
[0159] Extract all source nodes marked as suspected tampering nodes, including node 1 (the current node), node 2, node 4, and node 5, forming an anomalous node set. Based on the directed edges and the distribution of anomaly scores, determine the source node whose anomaly score first exceeds the anomaly threshold. The timestamps and anomaly scores of each node in the anomalous node set are analyzed and arranged in chronological order as follows: node 5 ("2024-06-01T14:25:36Z", 0.95), node 4 ("2024-06-01T16:42:19Z", 0.91), node 2 ("2024-06-02T08:45:18Z", 0.83), and node 1 ("2024-06-02T09:35:12Z", 0.72). The anomaly scores decrease over time, and node 5 is the earliest node to exhibit an anomaly; it is therefore identified as the source of the anomaly propagation.
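The backward traversal and source determination described above can be sketched as follows. The node labels, the dictionary layout, and the single-predecessor simplification are assumptions made for illustration; the scores and timestamps are those of the example.

```python
THRESHOLD = 0.6

# Anomaly scores from the example; node6 is the first normal predecessor.
scores = {"node1": 0.72, "node2": 0.83, "node4": 0.91, "node5": 0.95, "node6": 0.48}
# Assumed simplification: each node has at most one predecessor.
predecessor = {"node1": "node2", "node2": "node4", "node4": "node5", "node5": "node6"}
timestamps = {
    "node1": "2024-06-02T09:35:12Z",
    "node2": "2024-06-02T08:45:18Z",
    "node4": "2024-06-01T16:42:19Z",
    "node5": "2024-06-01T14:25:36Z",
}


def trace_back(start: str) -> list[str]:
    """Follow predecessor edges while the anomaly score exceeds the threshold."""
    flagged: list[str] = []
    node = start
    while node is not None and node not in flagged and scores[node] > THRESHOLD:
        flagged.append(node)
        node = predecessor.get(node)
    return flagged


def propagation_source(flagged: list[str]) -> str:
    """Pick the flagged node with the earliest timestamp."""
    # ISO 8601 strings in the same format and time zone sort chronologically.
    return min(flagged, key=lambda n: timestamps[n])


suspected = trace_back("node1")
source = propagation_source(suspected)
```

The membership check inside the loop guards against cycles, which should not occur in a chain ordered by block height but costs little to rule out.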
[0160] To verify the accuracy of the source identification, the data characteristics of node 5 were further analyzed. The original data fingerprint of node 5 was extracted and compared with the fingerprint stored on the blockchain, revealing inconsistencies. Inspection of the data content of node 5 revealed that CO2 emissions data had been artificially inflated by 15%, inconsistent with actual monitoring values. Tracing the blockchain transaction records corresponding to node 5 revealed an abnormal delay in the submission time of this record compared to normal submission times, further corroborating the tampering. After confirming node 5 as the source of the tampering, an alert was issued to regulatory authorities, and the relevant carbon emission data was locked and isolated to prevent the tampered data from being used for carbon emission rights trading.
[0161] In this embodiment, by projecting the state vector onto the anomaly detection space and fusing it with the associated verification features using tensors, anomaly detection is not only based on data change features but also incorporates link verification information, improving the ability to identify complex or covert tampering behaviors. By performing dimensionality reduction mapping on the joint feature vector to calculate the anomaly response value, the detection process becomes more stable and noise is effectively suppressed, allowing the anomaly score to more accurately reflect the degree of anomaly of the node. By tracing the source step by step along directed edges when the node's anomaly score exceeds the threshold and re-checking the anomaly score of the preceding node, the anomaly propagation path can be automatically inferred from the blockchain link, thereby achieving accurate tracking of the anomaly propagation chain. By analyzing the anomaly score distribution of all suspected nodes and locating the node that first exceeds the threshold, the source of the anomaly can be reliably identified, avoiding the problem of not being able to trace the source after the anomaly spreads in traditional technologies.
[0162] A second aspect of this invention provides a blockchain-based system for storing and tracing carbon emission data, comprising:
[0163] An evidence storage unit, configured for: acquiring emission data from carbon emission source equipment; generating a data fingerprint corresponding to the emission data through hash calculation; encapsulating the data fingerprint and the emission data into an evidence storage unit; extracting a time identifier and an equipment identifier from the emission data to generate a blockchain address index and determine the storage location of the evidence storage unit; and writing the evidence storage unit into the storage location through consensus verification to obtain an evidence storage record;
[0164] A tracing unit, configured for: reading the evidence storage record and using it as the current tracing node; using the preceding evidence storage record associated with the evidence storage record as the preceding tracing node; calculating the hash association value between the data fingerprints in the current tracing node and the preceding tracing node to generate an association verification identifier; and creating a directed edge from the preceding tracing node to the current tracing node and embedding the association verification identifier into the directed edge to obtain a tracing relationship graph;
[0165] A verification unit, configured for: calculating the data fingerprint difference value and time interval between the current source node and its adjacent source nodes based on the source relationship graph, and updating the state vector of the current source node through a dynamic propagation algorithm; calculating an anomaly score based on the state vector and the association verification identifier; marking the current source node as a suspected tampering node when the anomaly score is greater than a preset anomaly threshold; and performing recursive verification along the directed edge to determine the source of the anomaly propagation.
[0166] A third aspect of the present invention provides an electronic device, comprising:
[0167] A processor and a memory for storing processor-executable instructions, wherein the processor is configured to invoke instructions stored in the memory to perform the aforementioned method.
[0168] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0169] This invention can be a method, apparatus, system, and / or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for performing various aspects of the invention.
[0170] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. A method for blockchain-based storage and traceability of carbon emission data, characterized in that it comprises: acquiring emission data from carbon emission source devices, generating a data fingerprint corresponding to the emission data through hash calculation, and encapsulating the data fingerprint and the emission data into an evidence storage unit; extracting a time identifier and a device identifier from the emission data and combining them to generate a blockchain address index, determining the storage location of the evidence storage unit, and writing the evidence storage unit to the storage location through consensus verification to obtain an evidence storage record; reading the evidence storage record and using it as the current tracing node, taking the preceding evidence storage record associated with the evidence storage record as the preceding tracing node, calculating the hash association value between the data fingerprints in the current tracing node and the preceding tracing node to generate an association verification identifier, creating a directed edge from the preceding tracing node to the current tracing node, and embedding the association verification identifier into the directed edge to obtain a tracing relationship graph; based on the tracing relationship graph, calculating the data fingerprint difference value and time interval between the current source node and its adjacent source nodes, and updating the state vector of the current source node through a dynamic propagation algorithm; and, based on the state vector and the association verification identifier, calculating an anomaly score, marking the current source node as a suspected tampering node when the anomaly score is greater than a preset anomaly threshold, and performing recursive verification along the directed edge to determine the source of the anomaly propagation.
2. The method according to claim 1, characterized in that, Acquire emission data from carbon emission source devices, generate a data fingerprint corresponding to the emission data through hash calculation, and encapsulate the data fingerprint and the emission data into a storage unit, including: Emission data is obtained from the carbon emission source device through a preset device access protocol. The emission data is serialized according to a preset field order to obtain an emission data byte stream. A hash calculation is performed on the emission data byte stream to generate an initial hash value. The device identifier and time identifier are extracted from the emission data, concatenated, and then a hash calculation is performed to generate an identifier hash value. The initial hash value and the identifier hash value are XORed to obtain an intermediate hash value. Multiple rounds of iterative hash calculation are performed on the intermediate hash value. In each round of iteration, the hash output of the previous round is combined with the preset salt value and the hash calculation is performed again. The iteration is repeated until the preset number of iterations is reached to obtain the final hash value, and the final hash value is used as the data fingerprint. The data fingerprint and the emission data are structured and encapsulated. A binding relationship between the data fingerprint and the emission data is established in the encapsulation structure and a version identifier is added to obtain the evidence storage unit.
3. The method according to claim 1, characterized in that, The time identifier and device identifier are extracted from the emission data to generate a blockchain address index and determine the storage location of the evidence storage unit. The evidence storage unit is then written to the storage location through consensus verification to obtain the evidence storage record, which includes: The time identifier and device identifier are extracted from the emission data and combined according to a preset concatenation rule to obtain a combined identifier string. A hash mapping operation is performed on the combined identifier string to generate a hash mapping value, and a sharding index is calculated by combining the hash mapping value and the sharding index. The blockchain address index is generated by combining the hash mapping value and the sharding index. The target shard of the evidence storage unit in the blockchain distributed ledger is determined based on the blockchain address index, and the specific storage node is located within the target shard. The storage space corresponding to the specific storage node is determined as the storage location of the evidence storage unit. The evidence storage unit is sent to the consensus node set corresponding to the storage location. The data fingerprint in the evidence storage unit is extracted, and the consistency check value between the data fingerprint and the encapsulation structure of the evidence storage unit is calculated. The consensus node set verifies the integrity of the evidence storage unit based on the consistency check value. The number of consensus nodes that pass the verification is counted, and the verification pass rate is calculated. When the verification pass rate reaches a preset consensus threshold, the consensus verification is determined to be successful. The data fingerprint in the evidence storage unit and the blockchain address index are hashed to generate a block header hash value. 
The evidence storage unit and the block header hash value are encapsulated to obtain a new block and written to the storage location. The writing time and the blockchain height are recorded to obtain the evidence storage record.
4. The method according to claim 1, characterized in that reading the evidence storage record and using it as the current tracing node, using the preceding evidence storage record associated with the evidence storage record as the preceding tracing node, and calculating the hash association value between the data fingerprints in the current tracing node and the preceding tracing node to generate an association verification identifier comprises:
reading the evidence storage record from the blockchain distributed ledger as the current tracing node;
parsing the blockchain address index of the current tracing node to obtain the device identifier, and retrieving historical evidence storage records with the same device identifier;
identifying the historical evidence storage records whose blockchain height is smaller than that of the current tracing node and marking them as candidate evidence storage records;
extracting the candidate evidence storage record with the largest blockchain height to obtain the preceding evidence storage record, and using it as the preceding tracing node;
extracting the current data fingerprint corresponding to the current tracing node and the preceding data fingerprint corresponding to the preceding tracing node, serializing and concatenating them to obtain a dual fingerprint sequence, and performing a hash operation on the dual fingerprint sequence to generate a fingerprint hash association value;
calculating the block height difference between the block height of the current tracing node and the block height of the preceding tracing node, and determining from the block height difference whether the current tracing node and the preceding tracing node are continuous on the blockchain;
if they are continuous, combining the fingerprint hash association value and the block height difference to generate a continuity hash association value as the hash association value;
if they are not continuous, retrieving the intermediate evidence storage records between the preceding tracing node and the current tracing node, extracting the corresponding data fingerprints to obtain an intermediate fingerprint chain, and applying a chained hash calculation to the intermediate fingerprint chain to generate a cross-hash association value as the hash association value; and
binding the hash association value to the blockchain address index of the current tracing node to obtain the association verification identifier.
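As a non-limiting illustration, the steps of claim 4 might be sketched in Python as follows. The names `EvidenceRecord`, `find_previous`, `hash_association` and `association_identifier` are hypothetical, SHA-256 is assumed as the hash function, "continuous" is assumed to mean a block height difference of exactly 1, and the chained hash is assumed to start from the fingerprint hash association value and fold in each intermediate fingerprint in height order. None of these choices is fixed by the claims.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    address: str      # blockchain address index (hypothetical string form)
    device_id: str    # device identifier parsed from the address index
    height: int       # blockchain height at which the record was written
    fingerprint: str  # data fingerprint of the emission data


def sha256_hex(*parts: str) -> str:
    # Serialize and concatenate the parts, then hash (SHA-256 is an assumption).
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def find_previous(current, ledger):
    # Candidates: same device, smaller height; keep the one with the largest height.
    candidates = [r for r in ledger
                  if r.device_id == current.device_id and r.height < current.height]
    return max(candidates, key=lambda r: r.height, default=None)


def hash_association(current, previous, ledger):
    # Fingerprint hash association value over the dual fingerprint sequence.
    pair_hash = sha256_hex(previous.fingerprint, current.fingerprint)
    height_diff = current.height - previous.height
    if height_diff == 1:
        # Continuous case: combine with the block height difference.
        return sha256_hex(pair_hash, str(height_diff))
    # Non-continuous case: chain-hash the intermediate fingerprints in height order.
    middle = sorted((r for r in ledger if previous.height < r.height < current.height),
                    key=lambda r: r.height)
    chained = pair_hash
    for record in middle:
        chained = sha256_hex(chained, record.fingerprint)
    return chained


def association_identifier(current, ledger):
    # Bind the hash association value to the address index of the current node.
    previous = find_previous(current, ledger)
    if previous is None:
        return None  # first record of this device: nothing to associate with
    return f"{current.address}:{hash_association(current, previous, ledger)}"
```

Under these assumptions, changing any fingerprint that feeds the calculation changes the resulting identifier, which is what allows a later verification step to detect a mismatch.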
5. The method according to claim 1, characterized in that creating a directed edge from the preceding tracing node to the current tracing node and embedding the association verification identifier into the directed edge to obtain the tracing relationship graph comprises:
extracting the blockchain address index of the preceding tracing node as the starting node identifier, extracting the blockchain address index of the current tracing node as the ending node identifier, performing a combination operation on the starting node identifier and the ending node identifier to generate an edge identifier, and creating a directed edge based on the edge identifier to establish a directional association relationship;
determining, based on the directional association relationship, the embedding position of the association verification identifier in the directed edge, and embedding the association verification identifier at the embedding position as a verification attribute;
extracting the hash association value from the association verification identifier and combining it with the directional association relationship to calculate the integrity verification value of the directed edge;
jointly calculating the edge weight value of the directed edge based on the pre-acquired block height difference and the integrity verification value;
determining, based on the edge weight value, the continuity between the current tracing node and the preceding tracing node on the blockchain to obtain a continuity judgment result;
if the continuity judgment result is continuous, marking the directed edge as a continuous edge type, and if the continuity judgment result is discontinuous, marking the directed edge as a cross-edge type;
storing the directed edge, its edge type, and its edge weight value in a graph data structure;
traversing all evidence storage records in the blockchain distributed ledger and repeating the creation of directed edges; and
combining the directed edges with the corresponding tracing nodes to construct the tracing relationship graph.
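As a non-limiting illustration, the edge construction of claim 5 might be sketched as follows. The names `DirectedEdge`, `make_edge` and `build_graph` are hypothetical, the association verification identifier is assumed to have the form `address:hash`, and SHA-256 is assumed for the integrity verification value. The claims do not give a formula for the edge weight value, so this sketch assumes the reciprocal of the block height difference and leaves out the contribution of the integrity verification value.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class DirectedEdge:
    edge_id: str          # combination of start and end node identifiers
    start: str            # address index of the preceding tracing node
    end: str              # address index of the current tracing node
    verification_id: str  # association verification identifier (verification attribute)
    integrity: str        # integrity verification value of the edge
    weight: float         # edge weight value
    edge_type: str        # "continuous" or "cross"


def make_edge(prev_addr, cur_addr, verification_id, height_diff):
    edge_id = f"{prev_addr}->{cur_addr}"
    # Assumed identifier layout "address:hash"; take the hash association value.
    hash_value = verification_id.rsplit(":", 1)[-1]
    # Integrity value: hash of the directional relationship plus the hash value.
    integrity = hashlib.sha256(f"{edge_id}|{hash_value}".encode("utf-8")).hexdigest()
    # Assumed weight: 1 / block height difference (integrity term omitted here).
    weight = 1.0 / height_diff
    edge_type = "continuous" if weight == 1.0 else "cross"
    return DirectedEdge(edge_id, prev_addr, cur_addr, verification_id,
                        integrity, weight, edge_type)


def build_graph(links):
    # links: iterable of (prev_addr, cur_addr, verification_id, height_diff).
    nodes, edges = set(), {}
    for prev_addr, cur_addr, verification_id, height_diff in links:
        edge = make_edge(prev_addr, cur_addr, verification_id, height_diff)
        nodes.update((prev_addr, cur_addr))
        edges[edge.edge_id] = edge
    return {"nodes": nodes, "edges": edges}
```

A plain dictionary keyed by edge identifier is used here only for brevity; a graph database or adjacency list would serve equally well as the graph data structure.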
6. The method according to claim 1, characterized in that calculating, based on the tracing relationship graph, the data fingerprint difference value and time interval between the current tracing node and its adjacent tracing nodes, and updating the state vector of the current tracing node using a dynamic propagation algorithm comprises:
selecting the current tracing node from the tracing relationship graph, and, by traversing the directed edges, taking the tracing nodes directly connected to the current tracing node as adjacent tracing nodes;
extracting the current data fingerprint corresponding to the current tracing node and the adjacent data fingerprints corresponding to the adjacent tracing nodes, and performing locality-sensitive hash mapping to obtain a current fingerprint vector and adjacent fingerprint vectors;
calculating a distance metric between the current fingerprint vector and each adjacent fingerprint vector to obtain the data fingerprint difference value, and constructing a difference feature matrix from all data fingerprint difference values;
calculating the time difference between the current tracing node and each adjacent tracing node to obtain the corresponding time interval;
extracting the integrity verification value of the directed edge connecting the current tracing node and each adjacent tracing node, and coupling the integrity verification value with the time interval to obtain a trust propagation coefficient;
combining the difference feature matrix and the trust propagation coefficient to generate a propagation influence matrix, and performing singular value decomposition on the propagation influence matrix to extract the dominant singular values;
reconstructing the difference feature matrix based on the dominant singular values to obtain a compressed propagation matrix;
traversing the adjacent tracing nodes, extracting their corresponding state vectors, and transforming the state vectors with the compressed propagation matrix to obtain a propagation state representation; and
performing, based on the edge type of the directed edges connecting the current tracing node and the adjacent tracing nodes, differentiated modulation and tensor contraction operations on the propagation state representation to obtain an aggregated state vector, which is used as the state vector of the current tracing node.
7. The method according to claim 1, characterized in that calculating an anomaly score based on the state vector and the association verification identifier, marking the current tracing node as a suspected tampering node when the anomaly score is greater than a preset anomaly threshold, and performing recursive verification along the directed edges to determine the source of the anomaly propagation comprises:
projecting the state vector corresponding to the current tracing node onto a preset anomaly detection space to obtain an anomaly feature vector, and extracting and parsing the association verification identifier of the current tracing node to obtain a verification feature vector;
computing the tensor product of the anomaly feature vector and the verification feature vector to obtain a joint feature vector, performing dimensionality reduction on the joint feature vector to obtain an anomaly response value, and taking the norm of the anomaly response value to obtain the anomaly score;
determining whether the anomaly score is greater than the preset anomaly threshold, and if so, marking the current tracing node as a suspected tampering node and extracting all directed edges connected to the suspected tampering node;
traversing, against the direction of the directed edges, the preceding tracing nodes connected to the suspected tampering node and calculating their anomaly scores, marking each preceding tracing node whose anomaly score is greater than the anomaly threshold as a suspected tampering node, and recording the corresponding directed edge;
repeating the traversal and marking until the anomaly score of the preceding tracing node is less than or equal to the anomaly threshold; and
extracting all tracing nodes marked as suspected tampering nodes, and determining, based on the directed edges and the distribution of the anomaly scores, the tracing node at which the anomaly score first exceeds the anomaly threshold, to obtain the source of the anomaly propagation.
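As a non-limiting illustration, the recursive verification of claim 7 might be sketched as a backward traversal over the directed edges. The function name `trace_anomaly_source` and the dictionary inputs are hypothetical, the anomaly scores are assumed to have been calculated already, and the source of the anomaly propagation is assumed to be any suspected tampering node that has no suspected predecessor.

```python
def trace_anomaly_source(start, predecessors, scores, threshold):
    """Walk directed edges backwards from `start`, marking suspected nodes.

    predecessors: maps a node to the nodes that have a directed edge into it.
    scores: maps a node to its precomputed anomaly score (an assumption here).
    Returns (suspected nodes in visiting order, source nodes).
    """
    if scores[start] <= threshold:
        return [], []  # the starting node itself is not suspected
    suspects, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        suspects.append(node)
        for prev in predecessors.get(node, []):
            # Continue only through predecessors whose score exceeds the threshold.
            if scores[prev] > threshold:
                stack.append(prev)
    # A source is a suspected node none of whose predecessors is suspected.
    sources = [n for n in suspects
               if not any(p in seen for p in predecessors.get(n, []))]
    return suspects, sources
```

An explicit stack is used instead of recursion so that long chains of evidence storage records do not hit the interpreter's recursion limit; the traversal order is otherwise equivalent.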
8. A blockchain-based carbon emission data storage and traceability system, used to implement the method described in any one of claims 1 to 7, characterized in that it comprises:
an evidence storage unit, configured to acquire emission data from carbon emission source equipment; generate a data fingerprint corresponding to the emission data through hash calculation; encapsulate the data fingerprint and the emission data into an evidence storage unit; extract a time identifier and an equipment identifier from the emission data to generate a blockchain address index and determine the storage location of the evidence storage unit; and write the evidence storage unit into the storage location through consensus verification to obtain an evidence storage record;
a tracing unit, configured to read the evidence storage record and use it as the current tracing node; use the preceding evidence storage record associated with the evidence storage record as the preceding tracing node; calculate the hash association value between the data fingerprints in the current tracing node and the preceding tracing node to generate an association verification identifier; and create a directed edge from the preceding tracing node to the current tracing node and embed the association verification identifier into the directed edge to obtain a tracing relationship graph; and
a verification unit, configured to calculate, based on the tracing relationship graph, the data fingerprint difference value and time interval between the current tracing node and its adjacent tracing nodes, and update the state vector of the current tracing node through a dynamic propagation algorithm; calculate an anomaly score based on the state vector and the association verification identifier; mark the current tracing node as a suspected tampering node when the anomaly score is greater than a preset anomaly threshold; and perform recursive verification along the directed edges to determine the source of the anomaly propagation.
9. An electronic device, characterized in that it comprises: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 7.
10. A computer-readable storage medium having computer program instructions stored thereon, characterized in that, when the computer program instructions are executed by a processor, the method according to any one of claims 1 to 7 is implemented.