An asset fingerprint-based automated mimicry honeypot method and system

By constructing an automated mimicry honeypot method based on asset fingerprints, and utilizing dynamic mimicry attitude calculation and multi-dimensional scoring factors, the problems of insufficient automation and resource optimization of mimicry honeypots are solved. This enables dynamic synchronization and efficient resource allocation of mimicry honeypots, thereby improving protection effectiveness and deployment efficiency.

CN121690785B · Active · Publication Date: 2026-06-30 · BEIJING YUHONG XINAN TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING YUHONG XINAN TECHNOLOGY CO LTD
Filing Date
2025-12-19
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing mimicry honeypot methods fall short in automation, dynamic adaptability, and resource optimization. They cannot keep pace with rapidly changing attack methods, which lowers camouflage realism and weakens threat intelligence gathering. Resources are also allocated poorly: under high load, honeypots can degrade business system performance, while under low load, available resources go unused.

Method used

The method acquires mirrored business communication packets, extracts asset fingerprints to construct behavior feature vectors and static feature vectors, calculates the dynamic mimicry attitude, and filters out high-priority mimicry honeypot targets. Active scanning and mirrored packet parsing together supply comprehensive input data, and scoring factors such as communication behavior complexity, protocol interaction depth, and communication association chain length are introduced to achieve adaptive resource allocation and dynamic mimicry honeypot deployment.

Benefits of technology

It enables dynamic synchronization between mimicry honeypot features and real business operations, improving the overall balance and effectiveness of protection, reducing maintenance costs, increasing deployment efficiency, ensuring priority deployment of mimicry honeypots at critical nodes and in highly complex environments, and automatically completing target selection and deployment.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention proposes an automated mimicry honeypot method and system based on asset fingerprints, relating to the field of network security technology. The method includes: acquiring mirrored business communication packets uploaded by users to a business system; extracting asset fingerprints to construct multiple business communication behavior feature vectors, wherein the asset fingerprints include source IP and destination IP; obtaining an IP address list based on the source and destination IPs; initiating active scanning of the IP addresses in the IP address list to obtain scan response data; constructing a business static feature vector based on the scan response data; calculating the dynamic mimicry attitude of the pre-mimicry honeypot; calculating a simulation priority score; sorting the business static feature vectors in descending order according to the simulation priority score; and selecting a corresponding proportion of the business static feature vectors, based on the dynamic mimicry attitude value, to deploy as mimicry honeypots. This method enables protection strategies to automatically tilt towards high-risk targets with high defense effectiveness.

Description

Technical Field

[0001] This invention proposes an automated mimicry honeypot method and system based on asset fingerprinting, which relates to the field of network security technology.

Background Technology

[0002] With the increasing complexity and sophistication of cyberattacks, traditional security defense systems face severe challenges. Mimicry honeypots, as a proactive defense technology, deceive attackers by simulating real system services and provide valuable information for security analysis. However, existing mimicry honeypot methods have significant shortcomings in automation, dynamic adaptability, and resource optimization, mainly in the following aspects. First, traditional mimicry honeypots often employ static configuration, with fixed service simulation strategies that cannot be adjusted dynamically as the network environment and attack patterns change. This rigid configuration makes it difficult for honeypots to cope with rapidly changing attack methods, weakening both the realism of the disguise and the threat intelligence gathering capability. Second, traditional mimicry honeypot solutions often select targets by experience or simple rules, and cannot accurately identify which service features are most likely to be attacked or most attractive to attackers, resulting in unreasonable resource allocation and a low return on security investment. Third, traditional mimicry honeypot solutions lack a dynamic balance between resource consumption and protection effectiveness: under high load, honeypot services may still be over-deployed, affecting the performance of business systems, while under low load, available resources may not be fully utilized, missing opportunities to collect threat intelligence.

Summary of the Invention

[0003] This invention provides an automated mimicry honeypot method and system based on asset fingerprinting to solve the aforementioned problems.

[0004] This invention proposes an automated mimicry honeypot method based on asset fingerprinting, the method comprising:

[0005] Obtaining mirrored business communication packets uploaded by users to the business system, and extracting asset fingerprints from the mirrored packets to construct multiple business communication behavior feature vectors, wherein the asset fingerprints include a source IP and a destination IP;

[0006] Obtaining an IP address list based on the source IP and destination IP, initiating an active scan of the IP addresses in the IP address list to obtain scan response data, and constructing a business static feature vector based on the scan response data;

[0007] Calculating the dynamic mimicry attitude of the pre-mimicry honeypot based on the business communication behavior feature vector and its corresponding business static feature vector;

[0008] Calculating the simulation priority score from the pre-mimicry honeypot's business communication behavior feature vector and its corresponding business static feature vector, sorting the business static feature vectors in descending order of simulation priority score, and selecting a corresponding proportion of the business static feature vectors, based on the dynamic mimicry attitude value, to deploy as mimicry honeypots.

[0009] Further, obtaining the mirrored business communication packets uploaded by users to the business system, and extracting asset fingerprints from them to construct the business communication behavior feature vectors, includes:

[0010] Deploying a traffic mirroring port on the switch of the business network to which the business system belongs, and listening through that port for the mirrored business communication packets uploaded by users to the business system;

[0011] Extracting the asset fingerprints from the mirrored business communication packets to construct the communication behavior feature vectors, wherein the asset fingerprints also include: source port, destination port, transport layer protocol, transport layer session duration, data packet size, transport layer round-trip time statistics, and application layer protocol type.
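As an illustration of how the fingerprint fields listed above could be organized into behavior feature vectors, the following is a minimal sketch, not the patented implementation. The `PacketRecord` layout, the grouping by 5-tuple, and the per-packet `rtt_ms` sample are all assumptions introduced for this example; the source only names the fields.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class PacketRecord:
    """One parsed mirrored packet (hypothetical record layout)."""
    ts: float        # capture timestamp, seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    transport: str   # e.g. "TCP" or "UDP"
    size: int        # packet size, bytes
    rtt_ms: float    # round-trip time sample (assumed to be available per packet)
    app_proto: str   # application layer protocol label


def build_behavior_vectors(packets):
    """Group packets by 5-tuple and summarize each group as one feature vector."""
    flows = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.transport)
        flows.setdefault(key, []).append(p)

    vectors = []
    for (src_ip, dst_ip, sport, dport, transport), pkts in flows.items():
        times = [p.ts for p in pkts]
        rtts = [p.rtt_ms for p in pkts]
        vectors.append({
            "src_ip": src_ip,
            "dst_ip": dst_ip,
            "src_port": sport,
            "dst_port": dport,
            "transport": transport,
            "session_duration": max(times) - min(times),
            "mean_packet_size": mean(p.size for p in pkts),
            "rtt_mean_ms": mean(rtts),
            "rtt_std_ms": pstdev(rtts),
            "app_proto": pkts[0].app_proto,
        })
    return vectors
```

Grouping by 5-tuple is only one possible choice; a deployment could equally aggregate per destination service.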

[0012] Further, obtaining an IP address list based on the source IP and destination IP, initiating an active scan of the IP addresses in the IP address list to obtain scan response data, and constructing a business static feature vector based on the scan response data, includes:

[0013] Based on the source IP and destination IP, extract the IP addresses of all communication entities, perform deduplication and validity verification on the extracted IP addresses of communication entities, and form a target scan IP address list;

[0014] For the IP addresses in the target scan IP address list, perform an active scan within a preset port range, identify the status of each port including open ports, closed ports, and filtered ports, and obtain the corresponding scan response data;

[0015] Establish a protocol connection with the open port, obtain service banner information, and parse the service type and version number;

[0016] A business static feature vector is constructed based on the scan response data. The business static feature vector includes: service type, version number, list of open ports, and visible service identifier.
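The deduplication and validity verification step that forms the target scan list can be sketched as follows. This is an illustrative example only: the function name `build_scan_targets` and the specific validity rule (dropping malformed, unspecified, multicast, and loopback addresses) are assumptions, since the source does not say what "validity verification" checks.

```python
import ipaddress


def build_scan_targets(flow_endpoints):
    """Return a de-duplicated, validated, sorted list of IP addresses to scan.

    flow_endpoints: iterable of (src_ip, dst_ip) string pairs taken from
    the asset fingerprints.
    """
    seen = set()
    for src, dst in flow_endpoints:
        for raw in (src, dst):
            try:
                addr = ipaddress.ip_address(raw.strip())
            except ValueError:
                continue  # malformed address: fails validity verification
            # ASSUMPTION: addresses that cannot identify a single scannable
            # host are treated as invalid.
            if addr.is_unspecified or addr.is_multicast or addr.is_loopback:
                continue
            seen.add(addr)
    # Sort by IP version first so IPv4 and IPv6 objects are never compared.
    ordered = sorted(seen, key=lambda a: (a.version, int(a)))
    return [str(a) for a in ordered]
```

For example, `build_scan_targets([("10.0.0.2", "10.0.0.1"), ("10.0.0.1", "224.0.0.1")])` keeps the two unicast hosts once each and drops the multicast address.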

[0017] Further, calculating the dynamic mimicry attitude of the pre-mimicry honeypot based on the business communication behavior feature vector and its corresponding business static feature vector includes:

[0018] The number of times each feature in the business static feature vector is attacked within a preset time window is counted, the average number of attacks across all service features is obtained, and a historical attack frequency score is obtained based on that average.

[0019] By querying the Internet asset fingerprint database, the frequency of occurrence of each service static feature in the business static feature vector among Internet public assets within a preset time window is counted, the average frequency of occurrence across all service features is obtained, and the feature uniqueness score is calculated based on that average.

[0020] Based on the historical attack frequency score and the feature uniqueness score, the dynamic mimicry attitude of the pre-mimicry honeypot is calculated using a dynamic mimicry attitude model, which is as follows:

[0021]

[0022] where the terms of the model denote, respectively: the dynamic mimicry attitude of the pre-mimicry honeypot at a given point in time; the system's real-time resource score; the historical attack frequency score at that point in time; the feature uniqueness score at that point in time; the weighting coefficient of the historical attack frequency score; the weighting coefficient of the feature uniqueness score; the time decay coefficient λ; and the time interval.

[0023] Further, calculating the simulation priority score from the pre-mimicry honeypot's business communication behavior feature vector and its corresponding business static feature vector includes:

[0024] The number of port changes is obtained from the source port and destination port in the business communication behavior feature vector. The communication behavior complexity index is obtained from the number of port changes, the fluctuation in session duration, the fluctuation in the transport layer round-trip time statistics, and the distribution of application layer protocol types.

[0025] The protocol interaction depth index is obtained by parsing the number of service interaction steps, handshake process complexity, and data interaction volume based on the application layer protocol type and service banner information.

[0026] Port association relationships are obtained based on business communication behavior feature vectors. The length of the continuous interaction chain formed by the target communication entity and other communication entities is calculated based on the port association relationships. The communication association chain length index is obtained based on the length of the continuous interaction chain.

[0027] The simulation priority score is obtained based on the communication behavior complexity index, the protocol interaction depth index, and the communication association chain length index.

[0028] This invention also proposes an automated mimicry honeypot system based on asset fingerprinting, the system comprising:

[0029] A business communication behavior feature vector construction module, configured to obtain the mirrored business communication packets uploaded by users to the business system and to extract asset fingerprints from them to construct multiple business communication behavior feature vectors, wherein the asset fingerprints include a source IP and a destination IP;

[0030] A business static feature vector construction module, configured to obtain an IP address list based on the source IP and destination IP, initiate an active scan of the IP addresses in the IP address list to obtain scan response data, and construct a business static feature vector based on the scan response data;

[0031] A dynamic mimicry attitude acquisition module, configured to calculate the dynamic mimicry attitude of the pre-mimicry honeypot based on the business communication behavior feature vector and its corresponding business static feature vector;

[0032] A mimicry vector filtering module, configured to calculate the simulation priority score from the pre-mimicry honeypot's business communication behavior feature vector and its corresponding business static feature vector, sort the business static feature vectors in descending order of simulation priority score, and select a corresponding proportion of the business static feature vectors, based on the dynamic mimicry attitude value, to deploy as mimicry honeypots.

[0033] Further, the business communication behavior feature vector construction module includes:

[0034] A mirrored business communication packet monitoring module, configured to deploy a traffic mirroring port on the switch of the business network to which the business system belongs, and to monitor through that port the mirrored business communication packets uploaded by users to the business system.

[0035] An asset fingerprint extraction module, configured to extract asset fingerprints from the mirrored business communication packets to construct a communication behavior feature vector. The asset fingerprints also include: source port, destination port, transport layer protocol, transport layer session duration, data packet size, transport layer round-trip time statistics, and application layer protocol type.

[0036] Further, the business static feature vector construction module includes:

[0037] A target scan IP address list generation module, configured to extract the IP addresses of all communication entities based on the source IP and destination IP, and to deduplicate the extracted IP addresses and verify their validity to form the target scan IP address list.

[0038] The module for acquiring scan response data is used to perform active scanning on the IP addresses in the target scan IP address list within a preset port range, identify the status of each port including open ports, closed ports and filtered ports, and acquire the corresponding scan response data.

[0039] The protocol connection module is used to establish a protocol connection with the open port, obtain service banner information, and parse the service type and version number.

[0040] A business static feature vector generation module is used to construct a business static feature vector based on the scan response data. The business static feature vector includes: service type, version number, open port list, and visible service identifier.

[0041] Further, the dynamic mimicry attitude acquisition module includes:

[0042] The historical attack frequency score acquisition module is used to count the number of times each feature in the business static feature vector is attacked within a preset time window, obtain the average number of times all service features are attacked, and obtain the historical attack frequency score based on the average number of times.

[0043] The feature uniqueness scoring module is used to query the Internet asset fingerprint database, count the frequency of occurrence of each service static feature in the business static feature vector in Internet public assets within a preset time window, obtain the average frequency of occurrence of all service features, and calculate the feature uniqueness score based on the average frequency of occurrence.

[0044] The dynamic mimicry attitude calculation module is used to calculate the dynamic mimicry attitude of the pre-mimicry honeypot based on the historical attack frequency score and the feature uniqueness score, using a dynamic mimicry attitude model. The dynamic mimicry attitude model is as follows:

[0045]

[0046] where the terms of the model denote, respectively: the dynamic mimicry attitude of the pre-mimicry honeypot at a given point in time; the system's real-time resource score; the historical attack frequency score at that point in time; the feature uniqueness score at that point in time; the weighting coefficient of the historical attack frequency score; the weighting coefficient of the feature uniqueness score; the time decay coefficient λ; and the time interval.

[0047] Further, the mimicry vector filtering module includes:

[0048] The module for obtaining the communication behavior complexity index is used to obtain the number of port changes based on the source port and destination port in the business communication behavior feature vector, and to obtain the communication behavior complexity index based on the number of port changes, session duration fluctuation, statistical value fluctuation of transport layer round-trip time and application layer protocol type distribution.

[0049] The protocol interaction depth index acquisition module is used to parse the number of service interaction steps, handshake process complexity and data interaction volume according to the application layer protocol type and service banner information to obtain the protocol interaction depth index.

[0050] The module for obtaining the communication association chain length index is used to obtain port association relationships based on business communication behavior feature vectors, calculate the length of the continuous interaction chain formed by the target communication entity and other communication entities based on the port association relationships, and obtain the communication association chain length index based on the length of the continuous interaction chain.

[0051] The simulation priority score acquisition module is used to obtain the simulation priority score based on the communication behavior complexity index, the protocol interaction depth index, and the communication association chain length index.
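The communication association chain length can be pictured as the longest run of hops that starts at the target entity and follows observed communications from one entity to the next. The sketch below is an assumption-laden illustration: it treats each observed session from entity A to a port on entity B as a directed edge, measures the longest simple path from the target, and normalizes by a hypothetical cap `max_len`. The source does not specify how port association relationships are turned into a chain.

```python
def longest_chain(edges, target):
    """Length (in hops) of the longest simple path starting at `target`.

    edges: iterable of (src_entity, dst_entity) pairs, one per observed
    port association. Exhaustive search, so suitable for small graphs only.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)

    def dfs(node, visited):
        best = 0
        for nxt in graph.get(node, ()):
            if nxt not in visited:  # never revisit, so cycles terminate
                best = max(best, 1 + dfs(nxt, visited | {nxt}))
        return best

    return dfs(target, {target})


def chain_length_index(edges, target, max_len=8):
    """ASSUMPTION: the index is the chain length scaled by a preset cap."""
    return min(longest_chain(edges, target) / max_len, 1.0)
```

For a chain such as web → app → db → backup, the target `web` has a chain length of 3, while a leaf entity with no outgoing communication has a chain length of 0.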

[0052] The beneficial effects of this invention are as follows. It achieves dynamic synchronization between the characteristics of mimicry honeypots and real business operations, making it more difficult to distinguish mimicry honeypots from real assets from an external perspective. It combines the static business characteristics obtained through active scanning with the dynamic characteristics obtained through mirrored packet parsing to provide more comprehensive input data for mimicry honeypot protection strategies, helping to match mimicry targets accurately across service type, version, and communication mode. The dynamic mimicry attitude calculation considers not only historical attack frequency but also the rarity of services on the Internet, so that protection strategies automatically tilt towards high-risk and high-defense-efficiency targets, achieving adaptive resource allocation. Three scoring factors (communication behavior complexity, protocol interaction depth, and communication association chain length) quantify the priority of mimicry deployment in multiple dimensions, ensuring that mimicry honeypots are deployed preferentially at critical nodes and in highly complex communication environments, and improving the overall balance and effectiveness of protection. Through a unified indicator calculation and a sorting and screening mechanism, the selection and deployment of mimicry honeypot targets are completed automatically, eliminating manual configuration, significantly reducing maintenance costs, and improving deployment efficiency.

Description of the Drawings

[0053] To make the content of this invention easier to understand, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings, wherein:

[0054] Figure 1 is a flowchart of an automated mimicry honeypot method based on asset fingerprinting in a preferred embodiment of the present invention;

[0055] Figure 2 is a structural block diagram of an automated mimicry honeypot system based on asset fingerprinting, according to a preferred embodiment of the present invention.

Detailed Description

[0056] The present invention will be further described below with reference to the accompanying drawings and specific embodiments, so that those skilled in the art can better understand and implement the present invention. However, the embodiments described are not intended to limit the present invention.

[0057] Embodiment 1: as shown in Figure 1, one embodiment of the present invention provides an automated mimicry honeypot method based on asset fingerprinting, the method comprising:

[0058] Obtaining mirrored business communication packets uploaded by users to the business system, and extracting asset fingerprints from the mirrored packets to construct multiple business communication behavior feature vectors, wherein the asset fingerprints include a source IP and a destination IP;

[0059] Obtaining an IP address list based on the source IP and destination IP, initiating an active scan of the IP addresses in the IP address list to obtain scan response data, and constructing a business static feature vector based on the scan response data;

[0060] Calculating the dynamic mimicry attitude of the pre-mimicry honeypot based on the business communication behavior feature vector and its corresponding business static feature vector;

[0061] Calculating the simulation priority score from the pre-mimicry honeypot's business communication behavior feature vector and its corresponding business static feature vector, sorting the business static feature vectors in descending order of simulation priority score, and selecting a corresponding proportion of the business static feature vectors, based on the dynamic mimicry attitude value, to deploy as mimicry honeypots.

[0062] The working principle of the above technical solution is as follows: In the business network to which the business system belongs, the system listens for mirrored business communication packets uploaded by users to the business system through the traffic mirroring port of the switch. The collected mirrored packets are parsed to extract asset fingerprint information such as source IP and destination IP. Based on the asset fingerprints, multiple business communication behavior feature vectors are constructed. These feature vectors may include source port, destination port, transport layer protocol, session duration, packet size, transport layer round-trip time statistics, and application layer protocol type. This step provides a dynamic behavioral data foundation for subsequent service feature analysis and mimicry strategy calculation.

[0063] Based on the extracted source and destination IPs, an IP address list is generated, and an active scan is initiated for each IP address in the list. During the scan, the open, closed, or filtered state of each port is identified, and a protocol connection is established with each open port to obtain the corresponding scan response data. From this data a business static feature vector is constructed, which may include service type, version number, list of open ports, and visible service identifiers. This step supplies static service feature information that complements the dynamic behavioral data.

[0064] By statistically analyzing the number of attacks and historical attack frequency of each static service feature within a preset time window, and combining this with the feature's frequency of occurrence in the Internet public asset database, a feature uniqueness score is calculated. Using a dynamic mimicry attitude calculation model, the historical attack frequency score and the feature uniqueness score are weighted and synthesized, while also incorporating the system's real-time resource score and time decay factor, to obtain the dynamic mimicry attitude of each pre-mimicked honeypot target node. This score reflects the priority of mimicking the node's protection at the current point in time.

[0065] A comprehensive analysis is performed on the business communication behavior feature vector and business static feature vector of each pre-mimicry honeypot. The communication behavior complexity index, protocol interaction depth index, and communication association chain length index are calculated and combined into the simulation priority score. All business static feature vectors are sorted in descending order of this score. Then, according to a proportion set by the dynamic mimicry attitude value, the highest-priority business static feature vectors are selected and deployed as mimicry honeypot targets, achieving an efficient defense deployment under limited resources.
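The sort-and-select step can be sketched in a few lines. This is a hedged illustration rather than the patented procedure: it assumes the dynamic mimicry attitude is a value in [0, 1] and that the selected proportion equals that value directly, whereas the source says only that the proportion is set "based on" it. The function name and the `(id, score)` candidate layout are hypothetical.

```python
def select_mimicry_targets(candidates, attitude):
    """Pick the top share of candidates by simulation priority score.

    candidates: list of (static_vector_id, priority_score) pairs.
    attitude:   dynamic mimicry attitude, used here directly as the
                proportion to deploy (ASSUMPTION), clamped to [0, 1].
    """
    ratio = min(max(attitude, 0.0), 1.0)
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    # round() avoids off-by-one results from floating-point products;
    # note that Python rounds exact halves to the nearest even integer.
    count = round(ratio * len(ranked))
    return [vector_id for vector_id, _ in ranked[:count]]
```

A low attitude value (for instance because host resources are scarce) therefore shrinks the deployed set while still keeping the highest-scoring targets.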

[0066] The above technical solution achieves the following effects. By directly collecting mirrored communication packets from the business network, precise asset fingerprints and business communication behavior characteristics are extracted, enabling dynamic synchronization between the mimicry honeypot characteristics and real business, and making it more difficult to distinguish mimicry honeypots from real assets by external observation. By combining the static business characteristics obtained through active scanning with the dynamic characteristics obtained through mirrored packet parsing, more comprehensive input data is provided for mimicry honeypot protection strategies, helping to match mimicry targets accurately across service type, version, and communication mode. The dynamic mimicry attitude calculation considers not only the frequency of historical attacks but also the rarity of services on the Internet, so that protection strategies automatically tilt towards high-risk and high-defense-efficiency targets, achieving adaptive resource allocation. Three scoring factors (communication behavior complexity, protocol interaction depth, and communication association chain length) quantify the priority of mimicry deployment in multiple dimensions, ensuring that mimicry honeypots are deployed first at critical nodes and in highly complex communication environments, and improving the overall balance and effectiveness of protection. Through a unified indicator calculation and a sorting and screening mechanism, the selection and deployment of mimicry honeypot targets are completed automatically, eliminating manual configuration, significantly reducing maintenance costs, and improving deployment efficiency.

[0067] In one embodiment of the present invention, obtaining the mirrored business communication packets uploaded by users to the business system, and extracting asset fingerprints from them to construct the business communication behavior feature vectors, includes:

[0068] Deploying a traffic mirroring port on the switch of the business network to which the business system belongs, and listening through that port for the mirrored business communication packets uploaded by users to the business system;

[0069] Extracting the asset fingerprints from the mirrored business communication packets to construct the communication behavior feature vectors, wherein the asset fingerprints also include: source port, destination port, transport layer protocol, transport layer session duration, data packet size, transport layer round-trip time statistics, and application layer protocol type.

[0070] The working principle of the above technical solution is as follows: Configure traffic mirroring ports on the switches of the business network to achieve lossless replication of business communication traffic; ensure that the performance and stability of the original business communication are not affected by traffic replication at the hardware level of the switch; mirror all business communication packets passing through the switch, including uplink and downlink traffic to achieve full traffic capture; deeply analyze the network protocol stack, analyze it layer by layer from the physical layer to the application layer to extract asset fingerprints, and organize the extracted multi-dimensional asset fingerprints into structured feature vectors.
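To make the layer-by-layer parsing concrete, the following minimal sketch decodes one mirrored Ethernet II frame carrying IPv4 and TCP, and returns the fingerprint fields that are available at those layers. It is an illustration under assumptions: the function name `parse_ipv4_tcp` is hypothetical, and VLAN tags, IPv6, UDP, and application layer identification are deliberately left out.

```python
import socket
import struct


def parse_ipv4_tcp(frame):
    """Parse an Ethernet II frame carrying IPv4/TCP.

    Returns a dict of fingerprint fields, or None if the frame is too
    short or is not IPv4/TCP.
    """
    if len(frame) < 14 + 20:
        return None
    # Link layer: the EtherType sits in bytes 12..13 of the Ethernet header.
    (eth_type,) = struct.unpack("!H", frame[12:14])
    if eth_type != 0x0800:  # not IPv4
        return None

    # Network layer: fixed part of the IPv4 header.
    ip = frame[14:]
    header_len = (ip[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    protocol = ip[9]
    if protocol != 6 or len(ip) < header_len + 20:  # 6 = TCP
        return None
    (total_len,) = struct.unpack("!H", ip[2:4])

    # Transport layer: source and destination ports open the TCP header.
    src_port, dst_port = struct.unpack("!HH", ip[header_len:header_len + 4])

    return {
        "src_ip": socket.inet_ntoa(ip[12:16]),
        "dst_ip": socket.inet_ntoa(ip[16:20]),
        "src_port": src_port,
        "dst_port": dst_port,
        "transport": "TCP",
        "packet_size": total_len,  # IPv4 total length, bytes
    }
```

Session duration and round-trip time statistics cannot be read from a single frame; they would be derived afterwards by correlating frames of the same session.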

[0071] The above technical solution has the following effects: the constructed communication behavior feature vector can be used as input to calculate the historical attack frequency score and feature uniqueness score, thereby evaluating the priority of mimicry honeypot deployment and realizing a precise and dynamic protection strategy; the method can add new fingerprint parameters (such as more application layer behavior features) at any time to adapt to the needs of different business systems without modifying the business system itself, and has strong scalability.

[0072] In one embodiment of the present invention, obtaining an IP address list based on the source IP and destination IP, initiating an active scan of the IP addresses in the IP address list to obtain scan response data, and constructing a business static feature vector based on the scan response data, includes:

[0073] Based on the source IP and destination IP, extract the IP addresses of all communication entities, perform deduplication and validity verification on the extracted IP addresses of communication entities, and form a target scan IP address list;

[0074] For the IP addresses in the target scan IP address list, perform an active scan within a preset port range, identify the status of each port including open ports, closed ports, and filtered ports, and obtain the corresponding scan response data;

[0075] Establish a protocol connection with the open port, obtain service banner information, and parse the service type and version number;

[0076] A business static feature vector is constructed based on the scan response data. The business static feature vector includes: service type, version number, list of open ports, and visible service identifier.
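The scan steps above can be sketched as a port probe, a banner parser, and a small assembler for the static feature vector. This is a simplified illustration, not the patented scanner: a plain TCP connect attempt stands in for the active scan, only two banner patterns (SSH and HTTP `Server` header) are recognized, and using the product name as the "visible service identifier" is an assumption. All function names are hypothetical.

```python
import re
import socket


def probe_port(host, port, timeout=1.0):
    """Classify a TCP port as 'open', 'closed', or 'filtered' via a connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "closed"    # the target actively refused the connection
    except OSError:
        return "filtered"  # timeout or unreachable: no definitive answer


# Illustrative patterns only; a real fingerprint library is far larger.
_BANNER_PATTERNS = [
    ("ssh", re.compile(r"^SSH-[\d.]+-(?P<product>[A-Za-z]+)[_/ ]?(?P<version>[\w.]+)?")),
    ("http", re.compile(r"^Server:\s*(?P<product>[A-Za-z-]+)/(?P<version>[\w.]+)",
                        re.I | re.M)),
]


def parse_banner(banner):
    """Extract service type, visible identifier, and version from a banner string."""
    for service, pattern in _BANNER_PATTERNS:
        m = pattern.search(banner)
        if m:
            return {"service_type": service,
                    "visible_id": m.group("product"),  # ASSUMPTION: product name
                    "version": m.group("version")}
    return {"service_type": "unknown", "visible_id": None, "version": None}


def build_static_vector(ip, port_states, banners):
    """Assemble the static feature vector from port states and banners."""
    open_ports = sorted(p for p, s in port_states.items() if s == "open")
    services = {p: parse_banner(banners[p]) for p in open_ports if p in banners}
    return {"ip": ip, "open_ports": open_ports, "services": services}
```

In use, `probe_port` would be called for every port in the preset range, and the banners read from the open ports would be passed to `build_static_vector` together with the resulting state map.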

[0077] In one embodiment of the present invention, calculating the dynamic mimicry attitude of the pre-mimicry honeypot based on the business communication behavior feature vector and the corresponding business static feature vector includes:

[0078] The number of times each feature in the business static feature vector is attacked within a preset time window is counted, the average number of attacks across all service features is obtained, and a historical attack frequency score is obtained based on that average.

[0079] By querying the Internet asset fingerprint database, the frequency of occurrence of each service static feature in the business static feature vector among Internet public assets within a preset time window is counted, the average frequency of occurrence across all service features is obtained, and the feature uniqueness score is calculated based on that average.

[0080] Based on the historical attack frequency score and the feature uniqueness score, the dynamic mimicry attitude of the pre-mimicry honeypot is calculated using a dynamic mimicry attitude model, which is as follows:

[0081]

[0082] where the terms of the model denote, in order: the dynamic mimicry attitude of the pre-mimicry honeypot at a given time point; the system's real-time resource score; the historical attack frequency score at a given time point; the feature uniqueness score at that time point; the weighting coefficient of the historical attack frequency score; the weighting coefficient of the feature uniqueness score; λ, the time decay coefficient; and the time interval.
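The following is a sketch of one form consistent with the definitions in [0082] and the explanation in [0089] to [0091] (a resource score multiplying a decay-weighted average of the two risk scores). It is not the formula as filed. The symbols D, R, H, U, α, β, K, t_i, and Δt_i are assumed names; only λ is given in the text.

```latex
D(t) = R(t)\cdot
\frac{\sum_{i=1}^{K} e^{-\lambda\,\Delta t_i}\,\bigl[\alpha\,H(t_i) + \beta\,U(t_i)\bigr]}
     {\sum_{i=1}^{K} e^{-\lambda\,\Delta t_i}},
\qquad \Delta t_i = t - t_i
```

Under this assumed form, D(t) stays within [0, 1] whenever R, H, and U lie in [0, 1] and α + β ≤ 1.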

[0083]

[0084] where N denotes the number of features in the business static feature vector, and the remaining terms denote, in order: the number of attacks on the j-th feature recorded within the preset time window at the given time point; and the preset maximum number of attacks.
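A minimal sketch of the historical attack frequency score, assuming it is the mean over the N features of each feature's attack count divided by the preset maximum. The capping at the maximum and the function name are assumptions made here so that the score stays in [0, 1].

```python
def attack_frequency_score(attack_counts, max_attacks):
    """Mean of per-feature attack counts, each capped and scaled to [0, 1]."""
    if not attack_counts or max_attacks <= 0:
        return 0.0
    scaled = [min(count, max_attacks) / max_attacks for count in attack_counts]
    return sum(scaled) / len(scaled)

# Four static features: service type, version, open port list, service identifier.
score = attack_frequency_score([30, 10, 50, 0], max_attacks=100)
print(round(score, 3))
```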

[0085]

[0086] where the terms denote, in order: the frequency of occurrence of the j-th service static feature in Internet public assets at the given time point; and the preset maximum frequency of occurrence.
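A minimal sketch of the feature uniqueness score. The text states only that the score is derived from the average frequency of occurrence and that rarer services should score higher; the "one minus the mean scaled frequency" form below is an assumption, as is the function name.

```python
def feature_uniqueness_score(frequencies, max_frequency):
    """1 minus the mean scaled frequency: rarer features score closer to 1."""
    if not frequencies or max_frequency <= 0:
        return 0.0
    scaled = [min(freq, max_frequency) / max_frequency for freq in frequencies]
    return 1.0 - sum(scaled) / len(scaled)

rare = feature_uniqueness_score([100, 300], max_frequency=1000)
common = feature_uniqueness_score([900, 800], max_frequency=1000)
print(rare > common)
```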

[0087]

[0088] where the terms denote, in order: the CPU usage of the mimicry honeypot host at the given time point; the memory usage of the mimicry honeypot host at the given time point; the network bandwidth usage of the mimicry honeypot host at the given time point; the weighting coefficient of the host's CPU usage; the weighting coefficient of the host's memory usage; and the weighting coefficient of the host's network bandwidth usage.
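A minimal sketch of the real-time resource score. The text says the score is built from the three utilization figures and their weights and that it is low when resources are insufficient; the "one minus weighted utilization" form, the default weights, and the function name are assumptions.

```python
def resource_score(cpu, mem, net, w_cpu=0.4, w_mem=0.3, w_net=0.3):
    """Assumed form: 1 minus the weighted utilization, clamped to [0, 1]."""
    for name, value in (("cpu", cpu), ("mem", mem), ("net", net)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} utilization must be a fraction in [0, 1]")
    load = w_cpu * cpu + w_mem * mem + w_net * net
    return max(0.0, min(1.0, 1.0 - load))

# An idle host scores high; a saturated host scores near zero.
print(resource_score(0.10, 0.20, 0.05), resource_score(0.95, 0.90, 0.99))
```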

[0089] The working principle and effects of the above technical solution are as follows. The real-time resource score is computed from the CPU, memory, and network bandwidth utilization of the honeypot deployment node and their corresponding weights, and reflects whether the current system has the resources to support a mimicry honeypot. The second part of the model is a time-weighted combination of the historical attack frequency score and the feature uniqueness score, divided by a normalization term; the decay factor ensures that the more recent the historical data, the greater its impact. The dynamic mimicry attitude therefore depends not only on attack risk but also on the system's deployment capacity: if the honeypot deployment node is resource-constrained, new mimicry instances should not be rashly deployed even when risk is high. When the resource score is very low (resources are insufficient), the overall dynamic mimicry attitude drops significantly, which prevents deployment decisions from being distorted by high attack counts. This is a weakest-link effect: if any key condition is too weak, the overall index decreases. The risk term combines attack history with feature uniqueness. The historical attack frequency score reflects the intensity of attacks that this type of target has suffered in the past and is a commonly used indicator for risk prediction.

[0090] The feature uniqueness score measures how rare the simulated service is among public network assets. The rarer the service, the more likely it is to attract attackers' interest. Combining attack history with feature uniqueness balances automatically across scenarios: high-frequency attacks + high uniqueness → high priority; high-frequency attacks but noisy features → priority is suppressed; high uniqueness but few historical attacks → some priority is retained, but it is not artificially high.

[0091] Time-decay weighting reflects that the attack landscape is dynamic: past threats may no longer be effective, so recent attack records carry more weight and older data gradually loses its influence. This design corresponds to sliding-window risk assessment in the cybersecurity field and avoids misjudgments caused by long-accumulated stale data. Because the numerator accumulates data from multiple time points, summing them directly would make the result depend on the number of samples. The denominator is therefore built from the same decay coefficients, so that the numerator divided by the denominator is essentially a weighted average whose range stays between 0 and 1. The weighting coefficients of the historical attack frequency score and the feature uniqueness score can be tuned to the actual business threat model. For example, financial systems may weight attack frequency more heavily, while industrial control systems may weight feature uniqueness more heavily (rare device protocols). This adjustability makes the model applicable to different types of networks and honeypot strategies. High resource capacity, attack frequency, and feature uniqueness are all required to reach a high dynamic mimicry attitude, which reduces distortion from any single metric. Time weighting lets trend changes show up in the result promptly, adapting to changes in the attack landscape. Honeypots should not be deployed on nodes with insufficient resources, to prevent performance collapse or a failure to deceive attackers.
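The behavior described in [0089] to [0091] can be sketched as below: a resource score multiplying a decay-weighted average of the combined risk scores. The exact model formula is not reproduced in this text, so the multiplicative form, the exponential decay, the parameter names, and the default values are all assumptions.

```python
import math

def dynamic_mimicry_attitude(resource, history, now, alpha=0.6, beta=0.4, decay=0.1):
    """history: iterable of (timestamp, attack_score, uniqueness_score) samples."""
    numerator = 0.0
    denominator = 0.0
    for timestamp, attack, unique in history:
        weight = math.exp(-decay * (now - timestamp))  # newer samples weigh more
        numerator += weight * (alpha * attack + beta * unique)
        denominator += weight
    if denominator == 0.0:
        return 0.0  # no samples: nothing to base a deployment decision on
    # Same weights in numerator and denominator give a weighted average,
    # so the result does not grow with the number of samples.
    return resource * numerator / denominator

samples = [(0, 0.2, 0.3), (5, 0.6, 0.5), (10, 0.9, 0.8)]
print(dynamic_mimicry_attitude(resource=0.8, history=samples, now=10))
```

With `resource` close to zero the result is close to zero regardless of the risk scores, which is the weakest-link effect described in [0089].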

[0092] In one embodiment of the present invention, calculating the simulated priority score of the pre-mimicry honeypot's business communication behavior feature vector and its corresponding business static feature vector includes:

[0093] The number of port changes is obtained based on the source port and destination port in the business communication behavior feature vector. The communication behavior complexity index is obtained based on the number of port changes, session duration fluctuation, statistical value fluctuation of transport layer round-trip time, and application layer protocol type distribution.

[0094] The protocol interaction depth index is obtained by parsing the number of service interaction steps, handshake process complexity, and data interaction volume based on the application layer protocol type and service banner information.

[0095] Port association relationships are obtained based on business communication behavior feature vectors. The length of the continuous interaction chain formed by the target communication entity and other communication entities is calculated based on the port association relationships. The communication association chain length index is obtained based on the length of the continuous interaction chain.

[0096] The simulated priority score is obtained based on the communication behavior complexity index, protocol interaction depth index, and communication association chain length index.

[0097] The working principle and effect of the above technical solution are as follows: a data collection module is deployed on the honeypot management platform to collect real-time traffic of business services running in the target production network, with the sampling window set to 10 minutes.

[0098] Data collection fields include: source port, destination port, application layer protocol type (HTTP, DNS, FTP, etc.), service banner information (e.g., the HTTP Server header), session duration, and TCP round-trip time (RTT). The collected raw traffic data is parsed and normalized to form a business communication behavior feature vector, whose elements include port pairs, protocol types, banner strings, and latency statistics. The communication behavior complexity index is calculated within the sampling window as follows. The port change count is obtained by deduplicating sessions on the combination of source and destination ports; the port change count equals the total number of changes in that combination.

[0099] The session duration fluctuation statistic is the standard deviation or variance of all session durations within the sampling window and reflects how much session duration fluctuates. The transport layer round-trip time fluctuation statistic is the standard deviation or variance of the round-trip time values collected within the sampling window and reflects transport layer latency stability. The application layer protocol type distribution entropy is the Shannon entropy of the occurrence frequencies of the different application protocol types, normalized to [0,1]. The communication behavior complexity index is obtained by weighting and summing these four sub-indicators according to preset weights.
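The four sub-indicators in [0098] and [0099] can be sketched as below. This is an illustration under assumptions: "port change count" is read as the number of distinct (source port, destination port) pairs in the window, the population standard deviation is used, and the session field names and equal default weights are hypothetical.

```python
import math
from collections import Counter
from statistics import pstdev

def normalized_entropy(labels):
    """Shannon entropy of the label distribution, scaled to [0, 1]."""
    counts = Counter(labels)
    if len(counts) < 2:
        return 0.0  # a single protocol type carries no uncertainty
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))

def complexity_index(sessions, weights=(0.25, 0.25, 0.25, 0.25)):
    """sessions: dicts with src_port, dst_port, duration, rtt, protocol."""
    if not sessions:
        return 0.0
    # Assumption: port changes = distinct (src, dst) port pairs after dedup.
    port_changes = len({(s["src_port"], s["dst_port"]) for s in sessions})
    duration_std = pstdev([s["duration"] for s in sessions])
    rtt_std = pstdev([s["rtt"] for s in sessions])
    entropy = normalized_entropy([s["protocol"] for s in sessions])
    w1, w2, w3, w4 = weights
    return w1 * port_changes + w2 * duration_std + w3 * rtt_std + w4 * entropy

window = [
    {"src_port": 50000, "dst_port": 80, "duration": 2.0, "rtt": 0.01, "protocol": "HTTP"},
    {"src_port": 50001, "dst_port": 53, "duration": 4.0, "rtt": 0.03, "protocol": "DNS"},
]
print(complexity_index(window))
```

The text normalizes only the entropy term; in practice the other three terms would likely need scaling to comparable ranges before weighting, which this sketch leaves to the choice of weights.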

[0100]

[0101] where the terms denote, in order: the number of port changes; the standard deviation of the session duration; the standard deviation of the round-trip time; the normalized application layer protocol type distribution entropy; and the corresponding weighting coefficients. The protocol interaction depth index is based on the application layer protocol type and the service banner: according to the protocol specification and the banner information, the number of handshake and data exchange steps in the business interaction process is parsed; the message types and field diversity involved in the handshake are analyzed statistically to compute a complexity score; and the total data volume (bytes) of a complete interaction is normalized, after which the three values are combined into a depth index.

[0102]

[0103] where the terms denote, in order: the protocol interaction depth index; the normalized number of interaction steps, defined as the number of signaling steps or messages in a complete business interaction process (more steps indicate a more complex protocol flow); the handshake process complexity, defined as a combined score of the number of message types and the diversity of fields involved in the handshake process (high structural complexity indicates greater simulation difficulty); the data interaction volume, defined as the total number of bytes or packets transmitted in both directions in a complete session and used after normalization (a large data exchange indicates high communication load and rich content); and the weighting coefficients that adjust the relative impact of the number of steps, handshake complexity, and data volume. The communication association chain length index is obtained by: constructing port association relationships by extracting call relationships between port combinations from the feature vectors (e.g., access chains triggered by a certain service port); calculating the continuous interaction chain length by constructing a communication graph within the sampling window and computing the longest expected interaction chain length formed by the target entity and other entities; and normalizing the result to obtain the association chain length index.
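The two indices described in [0101] and [0103] can be sketched as below. The weighted sum for the depth index follows the text; the default weights, the use of the longest simple path as the chain length, and the normalization cap `max_len` are assumptions made for illustration.

```python
def depth_index(steps, handshake, volume, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of three inputs assumed to be pre-normalized to [0, 1]."""
    w1, w2, w3 = weights
    return w1 * steps + w2 * handshake + w3 * volume

def longest_chain(edges, start):
    """Length in hops of the longest simple path leaving `start`."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)

    def dfs(node, visited):
        best = 0
        for nxt in graph.get(node, ()):
            if nxt not in visited:  # never revisit an entity on the same chain
                best = max(best, 1 + dfs(nxt, visited | {nxt}))
        return best

    return dfs(start, {start})

def chain_length_index(edges, start, max_len):
    """Chain length divided by an assumed cap, limited to [0, 1]."""
    return min(longest_chain(edges, start), max_len) / max_len

# Directed communication graph for one sampling window (entity -> entity).
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")]
print(longest_chain(edges, "A"), chain_length_index(edges, "A", max_len=4))
```

The exhaustive search is exponential in the worst case, so it suits only the small per-window graphs assumed here.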

[0104] By combining the three indices, a simulated priority score is calculated.

[0105]

[0106] where the result is the simulated priority score, which is used to determine the deployment priority of services in the mimicry honeypot and to drive the resource allocation strategy. A higher simulated priority score means that the service is simulated first.
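A minimal sketch of combining the three indices into a simulated priority score. The text says only that the three indices are combined; the weighted-sum form, the default weights, and the example services and values are assumptions.

```python
def simulated_priority_score(complexity, depth, chain, weights=(0.4, 0.3, 0.3)):
    """Assumed weighted sum of the three indices, each taken to be in [0, 1]."""
    w_complexity, w_depth, w_chain = weights
    return w_complexity * complexity + w_depth * depth + w_chain * chain

# Hypothetical per-service indices: (complexity, depth, chain length).
services = {"web": (0.9, 0.7, 0.8), "dns": (0.2, 0.1, 0.3), "ssh": (0.5, 0.8, 0.4)}
ranking = sorted(services, key=lambda name: simulated_priority_score(*services[name]),
                 reverse=True)
print(ranking)
```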

[0107] Multi-dimensional and precise assessment: Compared to assessment methods relying solely on a single communication indicator, this technical solution simultaneously incorporates three types of information: communication behavior characteristics, protocol interaction characteristics, and network association chain characteristics. This yields a more comprehensive and accurate simulated priority score, effectively differentiating the complexity and potential risks of different services. Automated priority determination: The calculation process for simulated priority scores is automated, repeatable, and can be refreshed in real time. It adapts to dynamically changing business communication environments and can adjust the deployment order of mimicry honeypots based on real-time network conditions, achieving optimal resource allocation. Improved honeypot deployment efficiency and security: Simulating high-priority services first helps quickly attract and capture attack behaviors, enhancing the deterrent power and data collection capabilities of the honeypot system. Delaying or omitting simulations for low-priority services saves system resources and reduces unnecessary simulation overhead. Risk-driven defense strategy optimization: Since the priority score comprehensively considers communication complexity, protocol depth, and association chain length, this solution naturally reflects the potential attack surface size of a service, providing data support for security operations personnel. Combined with historical attack behavior records, the weighting coefficients can be further optimized to achieve adaptive defense.

[0108] Embodiment 2: An automated mimicry honeypot system based on asset fingerprinting, the system comprising:

[0109] A business communication behavior feature vector construction module, used to obtain the business communication mirror messages uploaded by users to the business system and to extract asset fingerprints from those mirror messages to construct multiple business communication behavior feature vectors. The asset fingerprints include: source IP and destination IP.

[0110] A business static feature vector construction module, used to obtain an IP address list based on the source IP and destination IP, initiate an active scan of the IP addresses in the IP address list to obtain scan response data, and construct a business static feature vector based on the scan response data.

[0111] A dynamic mimicry attitude acquisition module, used to calculate the dynamic mimicry attitude of the pre-mimicry honeypot based on the business communication behavior feature vector and its corresponding business static feature vector.

[0112] A mimicry vector filtering module, used to calculate the simulated priority scores of the business communication behavior feature vectors of the pre-mimicry honeypots and their corresponding business static feature vectors, sort the business static feature vectors in descending order of simulated priority score, and select, based on the dynamic mimicry attitude value, the corresponding proportion of business static feature vectors to be mimicked as honeypots.
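The filtering step in [0112] can be sketched as below. The text does not say how the dynamic mimicry attitude value maps to a proportion; this sketch assumes the value lies in [0, 1] and is used directly as the fraction of candidates to select, rounded up. The function name and example data are hypothetical.

```python
import math

def select_for_mimicry(candidates, attitude):
    """candidates: list of (static_vector_id, priority_score); attitude in [0, 1]."""
    if not 0.0 <= attitude <= 1.0:
        raise ValueError("attitude must be in [0, 1]")
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    # Assumption: selected proportion equals the attitude value, rounded up.
    count = math.ceil(attitude * len(ranked))
    return [vector_id for vector_id, _ in ranked[:count]]

candidates = [("web", 0.9), ("dns", 0.2), ("ssh", 0.7), ("ftp", 0.4)]
print(select_for_mimicry(candidates, attitude=0.5))
```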

[0113] In one embodiment of the present invention, the business communication behavior feature vector construction module includes:

[0114] The business communication mirroring message monitoring module is used to deploy a traffic mirroring port on the switch of the business network to which the business system belongs, and to monitor the business communication mirroring messages uploaded by users to the business system through the traffic mirroring port.

[0115] The asset fingerprint extraction module is used to extract asset fingerprints from the business communication mirror message to construct a communication behavior feature vector. The asset fingerprint also includes: source port, destination port, transport layer protocol, transport layer session duration, data packet size, transport layer round-trip time statistics, and application layer protocol type.

[0116] In one embodiment of the present invention, the business static feature vector construction module includes:

[0117] A target scan IP address list generation module, used to extract the IP addresses of all communication entities based on the source IP and destination IP, and to deduplicate the extracted communication entity IP addresses and verify their validity to form the target scan IP address list.

[0118] The module for acquiring scan response data is used to perform active scanning on the IP addresses in the target scan IP address list within a preset port range, identify the status of each port including open ports, closed ports and filtered ports, and acquire the corresponding scan response data.

[0119] The protocol connection module is used to establish a protocol connection with the open port, obtain service banner information, and parse the service type and version number.

[0120] A business static feature vector generation module is used to construct a business static feature vector based on the scan response data. The business static feature vector includes: service type, version number, open port list, and visible service identifier.

[0121] In one embodiment of the present invention, the dynamic mimicry attitude acquisition module includes:

[0122] The historical attack frequency score acquisition module is used to count the number of times each feature in the business static feature vector is attacked within a preset time window, obtain the average number of times all service features are attacked, and obtain the historical attack frequency score based on the average number of times.

[0123] The feature uniqueness scoring module is used to query the Internet asset fingerprint database, count the frequency of occurrence of each service static feature in the business static feature vector in Internet public assets within a preset time window, obtain the average frequency of occurrence of all service features, and calculate the feature uniqueness score based on the average frequency of occurrence.

[0124] The dynamic mimicry attitude module is used to calculate the dynamic mimicry attitude of the pre-mimicry honeypot based on the historical attack frequency score and the feature uniqueness score, using a dynamic mimicry attitude model. The dynamic mimicry attitude model is as follows:

[0125]

[0126] where the terms of the model denote, in order: the dynamic mimicry attitude of the pre-mimicry honeypot at a given time point; the system's real-time resource score; the historical attack frequency score at a given time point; the feature uniqueness score at that time point; the weighting coefficient of the historical attack frequency score; the weighting coefficient of the feature uniqueness score; λ, the time decay coefficient; and the time interval.

[0127] In one embodiment of the present invention, the mimicry vector filtering module includes:

[0128] The module for obtaining the communication behavior complexity index is used to obtain the number of port changes based on the source port and destination port in the business communication behavior feature vector, and to obtain the communication behavior complexity index based on the number of port changes, session duration fluctuation, statistical value fluctuation of transport layer round-trip time and application layer protocol type distribution.

[0129] The protocol interaction depth index acquisition module is used to parse the number of service interaction steps, handshake process complexity and data interaction volume according to the application layer protocol type and service banner information to obtain the protocol interaction depth index.

[0130] The module for obtaining the communication association chain length index is used to obtain port association relationships based on business communication behavior feature vectors, calculate the length of the continuous interaction chain formed by the target communication entity and other communication entities based on the port association relationships, and obtain the communication association chain length index based on the length of the continuous interaction chain.

[0131] The module for obtaining simulated priority scores is used to obtain simulated priority scores based on the communication behavior complexity index, protocol interaction depth index, and communication association chain length index.

[0132] The automated mimicry honeypot system based on asset fingerprinting described in this embodiment of the invention is used to implement the automated mimicry honeypot method based on asset fingerprinting described in the above embodiments. Both are based on the same inventive concept and have the same technical effects, which will not be repeated here.

[0133] Obviously, the above embodiments are merely illustrative examples for clear explanation and are not intended to limit the implementation. Those skilled in the art will recognize that other variations or modifications can be made based on the above description. It is neither necessary nor possible to exhaustively list all possible implementations here. However, obvious variations or modifications derived therefrom are still within the scope of protection of this invention.

Claims

1. An automated mimicry honeypot method based on asset fingerprints, characterized in that the method includes: obtaining the business communication mirror messages uploaded by users to the business system, and extracting asset fingerprints from the business communication mirror messages to construct multiple business communication behavior feature vectors, the asset fingerprints including: source IP and destination IP; obtaining an IP address list based on the source IP and destination IP, initiating an active scan of the IP addresses in the IP address list to obtain scan response data, and constructing a business static feature vector based on the scan response data; calculating the dynamic mimicry attitude of the pre-mimicry honeypot based on the business communication behavior feature vector and its corresponding business static feature vector, including: counting the number of times each feature in the business static feature vector is attacked within a preset time window, obtaining the average number of attacks across all service features, and obtaining a historical attack frequency score based on that average; querying the Internet asset fingerprint database, counting the frequency of occurrence in Internet public assets, within a preset time window, of each service static feature in the business static feature vector, obtaining the average frequency of occurrence across all service features, and calculating a feature uniqueness score based on that average; and, based on the historical attack frequency score and the feature uniqueness score, calculating the dynamic mimicry attitude of the pre-mimicry honeypot using a dynamic mimicry attitude model, the terms of which denote, in order: the dynamic mimicry attitude of the pre-mimicry honeypot at a given time point; the system's real-time resource score; the historical attack frequency score at a given time point; the feature uniqueness score at that time point; the weighting coefficient of the historical attack frequency score; the weighting coefficient of the feature uniqueness score; λ, the time decay coefficient; and the time interval; where N denotes the number of features in the business static feature vector, and the accompanying terms denote the number of attacks on the j-th feature recorded within the preset time window at the given time point and the preset maximum number of attacks; the terms of the feature uniqueness score denote the frequency of occurrence of the j-th service static feature in Internet public assets at the given time point and the preset maximum frequency of occurrence; and the terms of the real-time resource score denote the CPU usage, the memory usage, and the network bandwidth usage of the mimicry honeypot host at the given time point, together with the weighting coefficients of the host's CPU usage, memory usage, and network bandwidth usage; and calculating the simulated priority scores of the business communication behavior feature vectors of the pre-mimicry honeypots and their corresponding business static feature vectors, sorting the business static feature vectors in descending order of simulated priority score, and selecting, based on the dynamic mimicry attitude value, the corresponding proportion of business static feature vectors to be mimicked as honeypots.

2. The automated mimicry honeypot method based on asset fingerprinting according to claim 1, characterized in that obtaining the business communication mirror messages uploaded by users to the business system, and extracting asset fingerprints from the business communication mirror messages to construct a business communication behavior feature vector, includes: deploying a traffic mirroring port on the switch of the business network to which the business system belongs, and listening through the traffic mirroring port to the business communication mirror messages uploaded by users to the business system; and extracting the asset fingerprint from the business communication mirror messages to construct a communication behavior feature vector, the asset fingerprint also including: source port, destination port, transport layer protocol, transport layer session duration, data packet size, transport layer round-trip time statistics, and application layer protocol type.

3. The automated mimicry honeypot method based on asset fingerprinting according to claim 1, characterized in that obtaining an IP address list based on the source IP and destination IP, initiating an active scan of the IP addresses in the IP address list to obtain scan response data, and constructing a business static feature vector based on the scan response data, includes: extracting the IP addresses of all communication entities based on the source IP and destination IP, and deduplicating the extracted communication entity IP addresses and verifying their validity to form a target scan IP address list; performing an active scan, within a preset port range, of the IP addresses in the target scan IP address list, identifying the status of each port, including open ports, closed ports, and filtered ports, and obtaining the corresponding scan response data; establishing a protocol connection with each open port, obtaining service banner information, and parsing the service type and version number; and constructing a business static feature vector based on the scan response data, the business static feature vector including: service type, version number, open port list, and visible service identifier.

4. The automated mimicry honeypot method based on asset fingerprinting according to claim 1, characterized in that calculating the simulated priority scores of the pre-mimicry honeypot's business communication behavior feature vector and its corresponding business static feature vector includes: obtaining the number of port changes based on the source port and destination port in the business communication behavior feature vector, and obtaining the communication behavior complexity index based on the number of port changes, session duration fluctuation, statistical value fluctuation of transport layer round-trip time, and application layer protocol type distribution; parsing the number of service interaction steps, handshake process complexity, and data interaction volume based on the application layer protocol type and service banner information to obtain the protocol interaction depth index; obtaining port association relationships based on the business communication behavior feature vectors, calculating the length of the continuous interaction chain formed by the target communication entity and other communication entities based on the port association relationships, and obtaining the communication association chain length index based on the length of the continuous interaction chain; and obtaining the simulated priority score based on the communication behavior complexity index, protocol interaction depth index, and communication association chain length index.

5. An automated mimicry honeypot system based on asset fingerprinting, characterized in that the system includes: a business communication behavior feature vector construction module, used to obtain the business communication mirror messages uploaded by users to the business system and to extract asset fingerprints from those mirror messages to construct multiple business communication behavior feature vectors, the asset fingerprints including: source IP and destination IP; a business static feature vector construction module, used to obtain an IP address list based on the source IP and destination IP, initiate an active scan of the IP addresses in the IP address list to obtain scan response data, and construct a business static feature vector based on the scan response data; a dynamic mimicry attitude acquisition module, used to calculate the dynamic mimicry attitude of the pre-mimicry honeypot based on the business communication behavior feature vector and its corresponding business static feature vector, the dynamic mimicry attitude acquisition module including: a historical attack frequency score acquisition module, used to count the number of times each feature in the business static feature vector is attacked within a preset time window, obtain the average number of attacks across all service features, and obtain the historical attack frequency score based on that average; a feature uniqueness scoring module, used to query the Internet asset fingerprint database, count the frequency of occurrence in Internet public assets, within a preset time window, of each service static feature in the business static feature vector, obtain the average frequency of occurrence across all service features, and calculate the feature uniqueness score based on that average; and a dynamic mimicry attitude module, used to calculate the dynamic mimicry attitude of the pre-mimicry honeypot based on the historical attack frequency score and the feature uniqueness score using a dynamic mimicry attitude model, the terms of which denote, in order: the dynamic mimicry attitude of the pre-mimicry honeypot at a given time point; the system's real-time resource score; the historical attack frequency score at a given time point; the feature uniqueness score at that time point; the weighting coefficient of the historical attack frequency score; the weighting coefficient of the feature uniqueness score; λ, the time decay coefficient; and the time interval; where N denotes the number of features in the business static feature vector, and the accompanying terms denote the number of attacks on the j-th feature recorded within the preset time window at the given time point and the preset maximum number of attacks; the terms of the feature uniqueness score denote the frequency of occurrence of the j-th service static feature in Internet public assets at the given time point and the preset maximum frequency of occurrence; and the terms of the real-time resource score denote the CPU usage, the memory usage, and the network bandwidth usage of the mimicry honeypot host at the given time point, together with the weighting coefficients of the host's CPU usage, memory usage, and network bandwidth usage; and a mimicry vector filtering module, used to calculate the simulated priority scores of the business communication behavior feature vectors of the pre-mimicry honeypots and their corresponding business static feature vectors, sort the business static feature vectors in descending order of simulated priority score, and select, based on the dynamic mimicry attitude value, the corresponding proportion of business static feature vectors to be mimicked as honeypots.

6. The automated mimicry honeypot system based on asset fingerprinting according to claim 5, wherein the module for constructing business communication behavior feature vectors includes: The business communication mirroring message monitoring module is used to deploy a traffic mirroring port on the switch of the business network to which the business system belongs, and to monitor the business communication mirroring messages uploaded by users to the business system through the traffic mirroring port. The asset fingerprint extraction module is used to extract asset fingerprints from the business communication mirror message to construct a communication behavior feature vector. The asset fingerprint also includes: source port, destination port, transport layer protocol, transport layer session duration, data packet size, transport layer round-trip time statistics, and application layer protocol type.

7. The automated mimicry honeypot system based on asset fingerprinting according to claim 5, wherein the business static feature vector construction module includes: a target scan IP address list generation module, used to extract the IP addresses of all communication entities based on the source IP and destination IP, and to deduplicate the extracted IP addresses and verify their validity to form the target scan IP address list; a scan response data acquisition module, used to actively scan the IP addresses in the target scan IP address list within a preset port range, identify the state of each port (open, closed, or filtered), and acquire the corresponding scan response data; a protocol connection module, used to establish a protocol connection with each open port, obtain the service banner information, and parse the service type and version number; and a business static feature vector generation module, used to construct the business static feature vector based on the scan response data, wherein the business static feature vector includes: service type, version number, open port list, and visible service identifier.
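
The steps of this claim (building the target list, classifying port state, and reading a service banner) can be sketched with the Python standard library as follows. The function names, the timeout value, the rule that a timeout means "filtered", and the banner regular expression are all assumptions for illustration; the claim does not prescribe a scanning technique.

```python
import ipaddress
import re
import socket


def build_target_list(ip_pairs):
    """Deduplicate source/destination IPs and drop strings that are not valid addresses."""
    targets, seen = [], set()
    for src, dst in ip_pairs:
        for raw in (src, dst):
            try:
                ip = str(ipaddress.ip_address(raw))
            except ValueError:
                continue  # validity check failed
            if ip not in seen:
                seen.add(ip)
                targets.append(ip)
    return targets


def probe_port(ip, port, timeout=1.0):
    """Return (state, banner). Assumption: refusal means closed, timeout means filtered."""
    try:
        with socket.create_connection((ip, port), timeout=timeout) as conn:
            try:
                banner = conn.recv(1024).decode("utf-8", errors="replace").strip()
            except OSError:
                banner = ""  # service waits for the client to speak first, or reset
            return "open", banner
    except ConnectionRefusedError:
        return "closed", ""
    except OSError:  # includes timeouts and unreachable hosts
        return "filtered", ""


def parse_banner(banner):
    """Extract (service type, version) from banners shaped like 'Name/1.2.3'."""
    match = re.search(r"([A-Za-z][\w-]*)[/_ ]v?(\d+(?:\.\d+)*)", banner)
    return (match.group(1), match.group(2)) if match else (None, None)
```

For example, `parse_banner("Apache/2.4.57 (Unix)")` yields `("Apache", "2.4.57")`. Probing should only be run against hosts inside the monitored business network.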

8. The automated mimicry honeypot system based on asset fingerprinting according to claim 5, wherein the mimicry vector filtering module includes: a communication behavior complexity index acquisition module, used to obtain the number of port changes based on the source port and destination port in the business communication behavior feature vector, and to obtain the communication behavior complexity index based on the number of port changes, the fluctuation of the session duration, the fluctuation of the transport layer round-trip time statistics, and the distribution of application layer protocol types; a protocol interaction depth index acquisition module, used to parse the number of service interaction steps, the handshake process complexity, and the data interaction volume according to the application layer protocol type and the service banner information, to obtain the protocol interaction depth index; a communication association chain length index acquisition module, used to obtain port association relationships based on the business communication behavior feature vectors, calculate the length of the continuous interaction chain formed by the target communication entity and other communication entities based on the port association relationships, and obtain the communication association chain length index based on that length; and a simulated priority score acquisition module, used to obtain the simulated priority score based on the communication behavior complexity index, the protocol interaction depth index, and the communication association chain length index.
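
A minimal Python sketch of how the three indices in this claim might be combined is given below. The claim does not state how each index is computed, so the use of standard deviation for "fluctuation", Shannon entropy for the protocol type distribution, longest simple path for the interaction chain, and a weighted sum with the default weights shown are all assumptions; the protocol interaction depth index is taken as a precomputed number.

```python
import math
from collections import Counter
from statistics import pstdev


def complexity_index(port_pairs, durations, rtts, protocols):
    """Assumed form: port changes + duration spread + RTT spread + protocol entropy."""
    port_changes = sum(1 for a, b in zip(port_pairs, port_pairs[1:]) if a != b)
    total = len(protocols)
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in Counter(protocols).values())
    return port_changes + pstdev(durations) + pstdev(rtts) + entropy


def chain_length(links, start):
    """Longest simple path, in hops, from `start` over directed entity-to-entity links."""
    graph = {}
    for src, dst in links:
        graph.setdefault(src, []).append(dst)

    def longest_from(node, visited):
        best = 0
        for nxt in graph.get(node, []):
            if nxt not in visited:  # never revisit an entity within one chain
                best = max(best, 1 + longest_from(nxt, visited | {nxt}))
        return best

    return longest_from(start, {start})


def priority_score(complexity, depth, chain, weights=(0.4, 0.3, 0.3)):
    """Assumed form: weighted sum of the three indices."""
    w_complexity, w_depth, w_chain = weights
    return w_complexity * complexity + w_depth * depth + w_chain * chain
```

In practice the three indices would likely be normalised to a common range before weighting, since a raw chain length and a raw entropy value are not directly comparable.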