Precious metal feeding anti-theft method, system, device and medium based on double tag set
By constructing a precious metal feeding anti-theft method with dual-label sets, and utilizing a fusion spatiotemporal graph and a multi-task model, real-time identification and early prediction of abnormal precious metal feeding behavior are achieved. This solves the problems of high false alarm rate and low generalization in existing technologies and adapts to the anti-theft needs of different security levels.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GUANGDONG CHICO ELECTRONIC INC
- Filing Date
- 2026-03-25
- Publication Date
- 2026-06-19
AI Technical Summary
Existing anti-theft technologies for precious metal feeding cannot effectively provide early warnings of feeding states that are normal but prone to turning into abnormal ones, resulting in a high false alarm rate and low generalization, which cannot meet the high precision and high real-time requirements of industrial scenarios.
A dual-label approach is adopted: visual data of feeding behavior is acquired to construct a fused spatiotemporal graph; three-dimensional scores are calculated and clustered to obtain variability labels; these are combined with feeding status labels; and a multi-task fusion model makes predictions from which a hierarchical anti-theft strategy is determined.
It enables real-time identification and early prediction of abnormal precious metal feeding behavior, reduces false alarm rate, adapts to the anti-theft needs of different security levels, and improves the accuracy and generalization of anti-theft strategies.
Figure CN122244795A_ABST
Description
Technical Field
[0001] This application relates to the field of anti-theft technology for precious metal feeding, and in particular to a method, system, device and medium for anti-theft of precious metal feeding based on a dual-tag set.
Background Technology
[0002] Precious metal feeding is a core process in the precious metal deep processing industry. The weight, type, and feeding method of the raw materials directly determine the production efficiency. Moreover, precious metal raw materials are valuable and prone to problems such as smuggling, substitution, and underfeeding. Therefore, the safety monitoring of the feeding process is of extremely high importance.
[0003] Current anti-theft solutions for precious metal feeding mainly fall into three categories: manual inspection, single visual monitoring, and simple sensor threshold alarms. These solutions cannot provide early warnings for feeding states that are normal but prone to turning abnormal, and therefore fail to meet the high-precision, high-real-time, and low-false-alarm-rate anti-theft requirements of industrial scenarios.
Summary of the Invention
[0004] The main objective of this application is to propose a method, system, device, and medium for anti-theft of precious metal feeding based on a dual-tag set, so as to solve one or more technical problems existing in the prior art, or at least to provide a useful alternative or create favorable conditions.
[0005] To achieve the above objectives, one aspect of this application proposes a precious metal feeding anti-theft method based on a dual-tag set, the method comprising:
obtaining visual data of feeding behavior, constructing a fused spatiotemporal graph based on the feeding visual data, and outputting a graph structure feature vector through the fused spatiotemporal graph;
based on the graph structure feature vector, calculating the three-dimensional scores of posterior uncertainty, boundary proximity, and sparsity for each feeding behavior, clustering the three-dimensional scores, and assigning labels to high-score clusters and low-score clusters to obtain variability labels;
based on the graph structure feature vector, identifying the abnormal feeding state and the normal feeding state, and assigning labels to the abnormal feeding state and the normal feeding state to obtain feeding status labels;
combining the graph structure feature vector, the corresponding feeding status label, and the variability label to obtain a dual-label dataset;
inputting the dual-label dataset into the trained multi-task fusion model, which outputs the predicted values of the feeding status label and the variability label; determining the corresponding extension domain based on the predicted values; and determining the corresponding hierarchical anti-theft strategy based on the corresponding extension domain.
[0006] In some embodiments, calculating the three-dimensional scores of posterior uncertainty, boundary proximity, and sparsity for each feeding action includes: Based on the graph structure feature vector, the score of the posterior uncertainty is calculated using a Gaussian mixture model and a Bayesian posterior probability model. Based on the graph structure feature vector, calculate the absolute difference of the log-class conditional density of the two Gaussian mixture models, and calculate the boundary proximity score based on the absolute difference. The sparsity score is calculated based on the graph structure feature vector using the total joint probability density formula.
[0007] In some embodiments, the step of clustering the three-dimensional scores and assigning labels to high-score clusters and low-score clusters to obtain variability labels includes: using the established clustering algorithm to divide the three-dimensional scores into high-score clusters and low-score clusters; based on the three-dimensional scores, calculating the average weighted composite score of the high-score cluster and the low-score cluster for each feeding behavior; and, based on the set score threshold and the average weighted composite score, dividing the results into low variability labels and high variability labels. The variability label includes the low variability label and the high variability label.
[0008] In some embodiments, the graph structure feature vector includes personnel action features, material state features, and equipment parameter features. The labels are assigned to the abnormal feeding state and the normal feeding state to obtain the feeding state labels, including: Using the established extraction model, based on the personnel action characteristics and the set confidence threshold, it is determined whether the feeder is feeding materials normally; Based on the material state characteristics, the equipment parameter characteristics, and the work order verification data, determine whether the input material matches the work order verification data and has the same weight. When it is determined that the feeder is feeding normally, and the material fed matches the data approved in the work order and has the same weight, the feeding behavior corresponding to the graph structure feature vector is marked as the normal feeding state.
[0009] In some embodiments, the abnormal feeding state includes material replacement. Assigning labels to the abnormal feeding state and the normal feeding state to obtain the feeding status labels includes: based on the personnel action features, using the established target detection model to detect whether the hand area simultaneously contains target precious metal material and non-target precious metal material; based on the material state features, using the target detection model to detect whether the feeding port area contains only the non-target precious metal material and not the target precious metal material; and, based on the equipment parameter features and the material state features, checking whether the weight and the material type features at the feeding port are consistent with the work order verification data. When the hand area contains both target and non-target precious metal material, the feeding port area contains only the non-target precious metal material, and the weight at the feeding port is consistent with the work order verification data but the material type features are not, the abnormal feeding state is determined to be material replacement.
[0010] In some embodiments, the abnormal feeding state includes insufficient material feeding. Assigning labels to the abnormal feeding state and the normal feeding state to obtain the feeding status labels includes: based on the personnel action features, using the established target detection model to detect whether the action trajectory of the feeder conforms to the set feeding specifications; based on the material state features, using the established target detection model to detect whether the material quantity is consistent with the set work order verification data; and, based on the equipment parameter features, checking whether the weight at the feeding port is consistent with the work order verification data. When the action trajectory of the feeder conforms to the set feeding specifications, but neither the material quantity nor the weight at the feeding port matches the set work order verification data, the abnormal feeding state is determined to be insufficient material feeding.
[0011] In some embodiments, determining the corresponding extension domain based on the predicted values includes: determining, based on the predicted value of the feeding status label, that the feeding status label is the normal feeding state, and determining, based on the predicted value of the variability label, that the variability label is the low variability label; and, based on the normal feeding state and the low variability label, determining from the four-quadrant extension domain that the corresponding extension domain is the positive quantitative change domain, so as to determine the hierarchical anti-theft strategy set for the positive quantitative change domain. The four-quadrant extension domain includes the negative quantitative change domain, the negative qualitative change domain, the positive qualitative change domain, and the positive quantitative change domain.
[0012] To achieve the above objectives, another aspect of this application proposes a precious metal feeding anti-theft system based on a dual-tag set, the system comprising: The vision processing module is used to acquire visual data of feeding behavior, construct a fused spatiotemporal graph based on the feeding visual data, and output a graph structure feature vector through the fused spatiotemporal graph. The variability determination module is used to calculate the three-dimensional scores of posterior uncertainty, boundary proximity and sparsity of each feeding behavior based on the graph structure feature vector, cluster the three-dimensional scores and assign labels to high-score clusters and low-score clusters to obtain variability labels. The feeding status determination module is used to identify abnormal feeding status and normal feeding status based on the graph structure feature vector, and assign labels to the abnormal feeding status and the normal feeding status to obtain feeding status labels. A dual-label set construction module is used to combine the graph structure feature vector, the corresponding feeding state label, and the variability label to obtain a dual-label dataset. The extension domain decision module is used to input the dual-label dataset into the trained multi-task fusion model, output the predicted values of the feeding status label and the variability label, determine the corresponding extension domain based on the predicted values, and determine the corresponding hierarchical anti-theft strategy based on the corresponding extension domain.
[0013] To achieve the above objectives, another aspect of this application provides an electronic device, which includes a memory and a processor. The memory stores a computer program, and the processor executes the computer program to implement the above-described method.
[0014] To achieve the above objectives, another aspect of the embodiments of this application proposes a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described method.
[0015] The embodiments of this application include at least the following beneficial effects: This application provides a method, system, device, and medium for anti-theft of precious metal feeding based on a dual-label set. The solution deeply integrates the actions of feeding personnel, the state of the precious metal materials, and the parameters of the feeding equipment to construct spatiotemporal correlation features. This solves the false alarm problem caused by looking only at actions without looking at materials, or only at values without looking at behavior, and breaks through the limitations of single-feature detection in existing technologies. Through a dual-label dataset of feeding status labels and variability labels, together with extension domain decision-making, it can issue an early warning 3 seconds in advance for feeding states that are normal but prone to becoming abnormal, enabling early prediction of abnormal behavior. It determines the corresponding extension domain from the dual-label results for feeding status and variability and implements a graded anti-theft strategy that adapts to the precious metal feeding needs of different security levels, thereby improving the accuracy of the anti-theft strategy and achieving differentiated, graded handling. It enables real-time identification, early prediction, and graded handling of abnormal precious metal feeding behavior, and improves the model's generalization through a feedback iterative optimization mechanism, reducing the false alarm rate and industrial deployment costs and solving the core problems of existing technologies: passive alarms, high false alarm rates, low generalization, and insufficient integration of all elements.
Attached Figure Description
[0016] Figure 1 is a flowchart of a precious metal feeding anti-theft method based on a dual-tag set provided in an embodiment of this application; Figure 2 is a schematic diagram of the four-quadrant extension domain provided in the embodiments of this application; Figure 3 is a schematic diagram of the structure of the precious metal feeding anti-theft system based on a dual-tag set provided in the embodiments of this application; Figure 4 is a schematic diagram of the hardware structure of the electronic device provided in the embodiments of this application.
Detailed Implementation
[0017] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to limit it. In the following description, when referring to the accompanying drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with those of this application; they are merely examples of apparatuses and methods consistent with some aspects of the embodiments of this application as detailed in the appended claims.
[0018] It is understood that the terms "first," "second," etc., as used in this application may describe various concepts, but unless otherwise stated, these concepts are not limited by these terms. The terms are only used to distinguish one concept from another. For example, without departing from the scope of the embodiments of this application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "at the time of…", "when…", or "in response to a determination."
[0019] As used in this application, for the terms "at least one", "multiple", "each", and "any": "at least one" includes one, two or more; "multiple" includes two or more; "each" refers to every one of the corresponding multiple; and "any" refers to any one of the multiple.
[0020] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.
[0021] Figure 1 is an optional flowchart of a precious metal feeding anti-theft method based on a dual-tag set provided in an embodiment of this application. The method in Figure 1 may include, but is not limited to, steps S100 to S500.
[0022] Step S100: Obtain visual data of feeding behavior; construct a fused spatiotemporal graph based on the visual data of feeding; and output the graph structure feature vector through the fused spatiotemporal graph.
[0023] Step S200: Based on the graph structure feature vector, calculate the three-dimensional scores of posterior uncertainty, boundary proximity and sparsity for each feeding behavior, cluster the three-dimensional scores and assign labels to high-score clusters and low-score clusters to obtain variability labels.
[0024] Step S300: Based on the graph structure feature vector, identify the abnormal feeding state and the normal feeding state, and assign labels to the abnormal feeding state and the normal feeding state to obtain the feeding state labels.
[0025] Step S400: Combine the graph structure feature vector, the corresponding feeding status label, and the variability label to obtain a dual-label dataset.
[0026] Step S500: Input the dual-label dataset into the trained multi-task fusion model, output the predicted values of the feeding status label and the variability label, determine the corresponding extension domain based on the predicted values, and determine the corresponding hierarchical anti-theft strategy based on the corresponding extension domain.
[0027] Steps S100 to S500 as illustrated in this embodiment define the feeder, material, and equipment as spatiotemporal graph nodes to achieve integrated monitoring of all elements: personnel, materials, and equipment. This deeply integrates the feeder's actions, the state of the precious metal material, and the parameters of the feeding equipment to construct spatiotemporal correlation features, solving the false alarm problem of relying solely on actions without considering the material, or solely on numerical values without considering behavior, and overcoming the limitations of single-feature detection in existing technologies. Through a dual-label dataset of feeding status labels and variability labels, together with extension domain decision-making, abnormal behavior can be predicted in advance. Based on the dual-label results for feeding status and variability, the corresponding extension domain is determined, and a graded anti-theft strategy is implemented to adapt to the precious metal feeding requirements of different security levels, improving the accuracy of the anti-theft strategy and achieving differentiated, graded handling. This enables real-time identification, early prediction, and graded handling of abnormal precious metal feeding behavior. At the same time, a feedback iterative optimization mechanism improves the model's generalization ability, reduces false alarm rates and industrial deployment costs, and solves the core problems of existing technologies such as passive alarms, high false alarm rates, low generalization, and insufficient integration of all elements.
[0028] Through techniques such as spatial normalization, 3σ outlier removal, and MediaPipe key point visibility filtering, the method adapts well to interference factors in industrial scenarios such as changes in lighting, material occlusion, personnel protective equipment, and momentary jumps in sensor readings, ensuring stable operation in complex industrial environments.
[0029] In some embodiments of S100, data is collected by acquisition sensors installed in the feeding area. The three core detection targets in the feeding area, namely the feeder, the precious metal materials and the feeding equipment, are detected using the established target detection model, and a set of candidate regions is output for each target.
[0030] The target detection model is the YOLOv5 target detection model, although this application does not exclude other detection models. The feeding equipment includes an electronic scale, a feeding port and a material box.
[0031] Based on the candidate regions, the established extraction model is used to crop and extract 21 key points of the feeder's hands and 12 key points of the feeder's upper limbs and torso. The normalized coordinates and visibility information of each key point are output, and the confidence of the visibility information is extracted to form the visual recognition data.
[0032] Simultaneously, visual features of precious metal materials are extracted to obtain visual material data.
[0033] The visual material data includes material type, material quantity, and location characteristics. It may also include other data, which are not limited in this application. The extraction model can be a MediaPipe Pose or MediaPipe Hands key point extraction model.
[0034] The system acquires the operating data of the feeding equipment, aligns the visual recognition data, visual material data, and operating data by timestamp, and stores them in the edge database to form the feeding visual data. The acquisition interval is set to one sample every 200 ms to balance real-time performance and computational load.
[0035] The operating data includes the weight of the electronic scale and the status of the feeding port. Other data may also be included, which are not limited in this application.
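The timestamp alignment described above can be pictured with a minimal sketch. It assumes each stream delivers `(timestamp_ms, payload)` pairs and that samples are grouped into 200 ms slots; the stream names, the `align_streams` helper, and the "latest sample in a slot wins" rule are illustrative assumptions, not details given in this application.

```python
from typing import Any, Dict, List, Optional, Tuple

PERIOD_MS = 200  # acquisition interval stated in the text

Sample = Tuple[int, Any]  # (timestamp in ms, payload); hypothetical record shape


def align_streams(
    streams: Dict[str, List[Sample]], period_ms: int = PERIOD_MS
) -> List[Dict[str, Optional[Any]]]:
    """Group samples from several streams into common 200 ms frames."""
    slots: Dict[int, Dict[str, Any]] = {}
    for name, samples in streams.items():
        for ts, payload in sorted(samples, key=lambda s: s[0]):
            # Assumption: if a stream reports twice in one slot, keep the latest.
            slots.setdefault(ts // period_ms, {})[name] = payload
    if not slots:
        return []
    frames = []
    for slot in range(min(slots), max(slots) + 1):
        record = slots.get(slot, {})
        # Missing streams are kept as None so a later cleaning step can fill them.
        frame: Dict[str, Optional[Any]] = {name: record.get(name) for name in streams}
        frame["t_ms"] = slot * period_ms
        frames.append(frame)
    return frames
```

Keeping incomplete frames with `None` entries, rather than dropping them, leaves the gaps visible to the missing-value handling described in the data cleaning step.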
[0036] Data cleaning is achieved by using the 3σ principle to remove outliers and using linear interpolation to fill in missing values.
[0037] Outliers can be momentary jumps in the electronic scale reading and / or key point detection errors. Missing values can be due to the loss of key points caused by temporary occlusion or the loss of parameters caused by temporary sensor disconnection. This application does not impose any restrictions on the types of the above values.
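As a minimal sketch of this cleaning step, the following function applies the 3σ rule to one numeric channel (for example the scale weight) and then fills gaps by linear interpolation. The function name, the use of `None` for missing samples, and the choice to fill gaps at the ends of the series with the nearest valid value are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List, Optional


def clean_series(values: List[Optional[float]], k: float = 3.0) -> List[float]:
    """Remove k-sigma outliers, then fill every gap by linear interpolation."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("series has no valid samples")
    mu, sigma = mean(present), pstdev(present)

    # Step 1: readings farther than k * sigma from the mean become missing.
    kept: List[Optional[float]] = [
        None if v is None or (sigma > 0 and abs(v - mu) > k * sigma) else v
        for v in values
    ]
    known = [i for i, v in enumerate(kept) if v is not None]
    if not known:
        raise ValueError("no samples left after outlier removal")

    # Step 2: interpolate between the nearest valid neighbours.
    filled: List[float] = []
    for i, v in enumerate(kept):
        if v is not None:
            filled.append(v)
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:        # gap at the start: copy the first valid value
            filled.append(kept[right])
        elif right is None:     # gap at the end: copy the last valid value
            filled.append(kept[left])
        else:
            t = (i - left) / (right - left)
            filled.append(kept[left] + t * (kept[right] - kept[left]))
    return filled
```

Note that with very short windows a single spike can never exceed three population standard deviations, so the rule only becomes effective once the window holds more than about ten samples.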
[0038] A three-dimensional feeding coordinate system is constructed with the center of the feeding port as the absolute reference origin.
[0039] Based on visual recognition data, the normalized coordinates of the 2D screen of each core key point of the feeder are converted into relative coordinates under the feeding coordinate system to achieve spatial normalization.
[0040] Based on the visual material data and the operating data, the material type, quantity, location characteristics, electronic scale weight, and feeding port status are numerically normalized. The min-max normalization formula is used to convert them into normalized values in the range [0,1], eliminating interference factors such as camera installation location, feeder height, and material specifications, and ensuring the consistency and comparability of features.
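The two normalizations can be sketched as follows. The spatial part is reduced to a translation that makes the feeding port center the origin; the full mapping into the three-dimensional feeding coordinate system would need calibration details that are not given here, so `to_port_frame` is only an illustrative assumption. `min_max_normalize` is the standard min-max formula, with the degenerate constant case mapped to 0.0 by assumption.

```python
from typing import List, Sequence, Tuple


def to_port_frame(
    point: Sequence[float], port_center: Sequence[float]
) -> Tuple[float, ...]:
    """Express a key point relative to the feeding port center (translation only)."""
    return tuple(p - c for p, c in zip(point, port_center))


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values into [0, 1] with x' = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant channel carries no information, map it to 0.0.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]
```

In a deployment, the minimum and maximum would more plausibly be fixed from calibration or process limits rather than recomputed per batch, so that the same raw weight always maps to the same normalized value.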
[0041] All cleaned and normalized feature data are standardized to form structured feature vectors.
[0042] Based on the time series, the structured feature vectors of the set number of consecutive steps are extracted to form the time series segments of feeding behavior. The feeder, material and equipment are the nodes of the spatiotemporal graph. Each node has corresponding node features, and the node features carry normalized coordinates and visibility information.
[0043] Based on temporal segments, node features within the same frame are extracted. For the spatiotemporal graph nodes of the feeder, physiological connections (shoulder-elbow-wrist) are established for the feeder's key points. For the three spatiotemporal graph nodes of the feeder, material, and equipment, spatial relationships between the feeder, material, and equipment are established. Spatial edges are obtained through physiological and spatial connections.
[0044] Based on the time series segments, temporal edges connect the same node across adjacent frames and represent that node's motion trajectory.
[0045] The motion trajectory can be the movement trajectory of the hand from the electronic scale to the feeding port and / or the change trajectory of the material weight.
[0046] Based on the definition of nodes and edges in the spatiotemporal graph, a fusion spatiotemporal graph of people, materials, and equipment is constructed to achieve a preliminary correlation between the spatial structural features and temporal motion features of material feeding behavior.
[0047] The final output of the fused spatiotemporal graph is a graph structure feature vector with node features and edge weights. Its dimension and numerical range are determined by the structured feature vector, ensuring that it can be directly input into S200 for dual-label labeling without additional format conversion.
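The node and edge definitions above can be illustrated with a small data-structure sketch. Only the shoulder-elbow-wrist chain and the feeder, material, and equipment relationship come from the text; the node names, the choice of the wrist as the feeder's link to material and equipment, and the `build_graph` helper are assumptions, and edge weights are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

NodeId = Tuple[int, str]  # (frame index, node name)

# Physiological chain named in the text.
SKELETON = [("shoulder", "elbow"), ("elbow", "wrist")]
# Assumed links between feeder, material and equipment nodes.
CROSS_LINKS = [("wrist", "material"), ("material", "equipment"), ("wrist", "equipment")]


@dataclass
class SpatioTemporalGraph:
    node_features: Dict[NodeId, List[float]] = field(default_factory=dict)
    spatial_edges: List[Tuple[NodeId, NodeId]] = field(default_factory=list)
    temporal_edges: List[Tuple[NodeId, NodeId]] = field(default_factory=list)


def build_graph(frames: List[Dict[str, List[float]]]) -> SpatioTemporalGraph:
    """Build a fused graph from per-frame {node name: feature vector} dicts."""
    g = SpatioTemporalGraph()
    for t, frame in enumerate(frames):
        for name, feat in frame.items():
            g.node_features[(t, name)] = feat
        # Spatial edges: connections inside the same frame.
        for a, b in SKELETON + CROSS_LINKS:
            if a in frame and b in frame:
                g.spatial_edges.append(((t, a), (t, b)))
        # Temporal edges: the same node in adjacent frames.
        if t > 0:
            for name in frame:
                if name in frames[t - 1]:
                    g.temporal_edges.append(((t - 1, name), (t, name)))
    return g
```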
[0048] In some embodiments of S200, the variability label distinguishes this solution from traditional technical solutions that only determine whether a state is normal or abnormal. The essence of variability label determination is to quantify, through algorithms applied to the graph structure feature vectors, how likely the feeding state is to change, thereby predicting whether the state will transform into an abnormal one and providing the core basis for the hierarchical anti-theft strategy.
[0049] Existing precious metal feeding anti-theft technologies can only passively detect anomalies and cannot predict potential risks in advance. To address this pain point, this application uses variability labels to determine the stability of the feeding status: for stable normal states (low variability label), routine monitoring and handling are performed; for clear abnormal states (also low variability label, since their characteristics are stable and unambiguous), direct handling is performed; for unstable states (high variability label, such as normal feeding that deviates from the standard, or suspected anomalies that have not yet been confirmed), early warning and intervention are provided. This realizes an upgrade from "passive alarm" to "proactive prediction".
[0050] Based on graph structure feature vectors, the label variability algorithm is used to calculate the three-dimensional scores of posterior uncertainty, boundary proximity and sparsity for each sample.
[0051] Posterior uncertainty characterizes the fuzziness of the model's determination of the feeding state; boundary proximity characterizes whether the feeding state is at the normal-abnormal boundary; sparsity characterizes whether the feeding state is a rare industrial scenario.
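To make the three scores concrete, the sketch below computes one possible version of each for a single feature vector. It is heavily simplified and rests on assumptions: each class (normal, abnormal) is modeled by a single diagonal Gaussian standing in for a fitted Gaussian mixture model, posterior uncertainty is taken as the entropy of the Bayesian posterior, boundary proximity as exp(-|Δ|) of the log class-conditional density difference, and sparsity as the negative log of the total density. The application names these ingredients but this section does not fix the exact formulas.

```python
import math
from typing import Sequence, Tuple

Gaussian = Tuple[Sequence[float], Sequence[float]]  # (means, variances), diagonal


def log_density(x: Sequence[float], g: Gaussian) -> float:
    """Log of a diagonal Gaussian density at x."""
    means, variances = g
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        for xi, mu, var in zip(x, means, variances)
    )


def three_scores(
    x: Sequence[float],
    normal: Gaussian,
    abnormal: Gaussian,
    prior_normal: float = 0.5,  # assumed class prior
) -> Tuple[float, float, float]:
    log_pn = log_density(x, normal)
    log_pa = log_density(x, abnormal)

    # Total density p(x) = sum over classes of prior * p(x | class), in log space.
    log_jn = math.log(prior_normal) + log_pn
    log_ja = math.log(1.0 - prior_normal) + log_pa
    top = max(log_jn, log_ja)
    log_total = top + math.log(math.exp(log_jn - top) + math.exp(log_ja - top))

    # Posterior uncertainty: entropy (bits) of the Bayesian posterior.
    post_n = math.exp(log_jn - log_total)
    uncertainty = -sum(
        p * math.log2(p) for p in (post_n, 1.0 - post_n) if p > 0.0
    )
    # Boundary proximity: 1.0 on the decision boundary, near 0 far from it.
    boundary = math.exp(-abs(log_pn - log_pa))
    # Sparsity: large where the total density is low (rare scenario).
    sparsity = -log_total
    return uncertainty, boundary, sparsity
```

A point midway between two symmetric classes scores the maximum uncertainty (1 bit) and boundary proximity (1.0), while a point far from both classes scores high on sparsity only.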
[0052] The established clustering algorithm divides the three-dimensional scores into high-score clusters and low-score clusters. High-score clusters are assigned the high variability label (v=+1) (unstable state, prone to transformation into an anomaly), while low-score clusters are assigned the low variability label (v=-1) (stable state, clear normal / abnormal characteristics).
[0053] The variability labels include high variability labels and low variability labels. The clustering algorithms used include the Gaussian Mixture Model (GMM) and K-means clustering. The GMM is iterated 50 times, and the K-means clustering uses the elbow method to determine the number of clusters, which is 2.
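A minimal sketch of the clustering step, using a plain two-cluster K-means on the three-dimensional scores and labeling the cluster with the higher weighted composite score as v=+1. The equal weights, the deterministic initialization from the lowest and highest composite scores, and the iteration cap are assumptions; the GMM variant and the score threshold mentioned in the summary are not shown.

```python
from typing import List, Sequence, Tuple

Score = Tuple[float, float, float]  # (uncertainty, boundary proximity, sparsity)


def composite(score: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted composite of the three scores."""
    return sum(w * s for w, s in zip(weights, score))


def variability_labels(
    scores: List[Score],
    weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3),  # assumed equal weights
    max_iters: int = 20,                                # assumed iteration cap
) -> List[int]:
    """Two-cluster K-means; returns +1 (high variability) or -1 per sample."""
    # Deterministic start: the samples with the lowest and highest composite.
    centroids = [
        list(min(scores, key=lambda s: composite(s, weights))),
        list(max(scores, key=lambda s: composite(s, weights))),
    ]
    assign = [0] * len(scores)
    for _ in range(max_iters):
        new_assign = [
            min((0, 1), key=lambda c: sum((a - b) ** 2 for a, b in zip(s, centroids[c])))
            for s in scores
        ]
        for c in (0, 1):
            members = [s for s, a in zip(scores, new_assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_assign == assign:
            break
        assign = new_assign
    # The cluster whose centroid has the higher composite score is "high".
    high = max((0, 1), key=lambda c: composite(centroids[c], weights))
    return [1 if a == high else -1 for a in assign]
```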
[0054] In some embodiments of S300, based on the requirements of the precious metal feeding process, the established visual recognition algorithm is used to classify the feeding status into normal feeding status (y=+1) and abnormal feeding status (y=-1). Abnormal feeding status is further subdivided into four categories: stolen materials, substituted materials, insufficient materials, and incorrect materials. Based on graph structure feature vectors, the feeding status is identified and determined according to the fusion features of personnel actions, material status, and equipment parameters. The core of the judgment is to match and verify the three types of integrated features with the process requirements one by one. If all of them match, the feeding is normal. If one or more types of features conflict with the process requirements and form a chain of abnormal evidence that corroborates each other, the feeding is judged to be an abnormal state of the corresponding type.
[0055] The verification priority and correlation of the three types of integrated features are as follows: personnel action features are the behavioral basis, material status features are the core basis, and equipment parameter features are numerical evidence. All three are indispensable to avoid misjudgment based on a single feature.
[0056] The feeding status labels include: normal feeding status (y=+1) and abnormal feeding status (y=-1). The visual recognition algorithms include: YOLOv5 object detection model and MediaPipe key point extraction model.
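The matching logic for the feeding status label can be sketched as a small rule function. It encodes the normal, material replacement, and insufficient feeding conditions described in the summary; the `Evidence` fields, the 0.8 confidence threshold, and the catch-all "other abnormal" result are illustrative assumptions, and the stolen-material and incorrect-material categories are not modeled because their rules are not spelled out in this section.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Evidence:
    """Per-behavior checks derived from the three feature groups (hypothetical)."""
    action_confidence: float   # personnel action: confidence of a compliant motion
    hand_has_target: bool      # personnel action: target material in hand area
    hand_has_non_target: bool  # personnel action: non-target material in hand area
    port_has_target: bool      # material state: target material at feeding port
    port_has_non_target: bool  # material state: non-target material at feeding port
    type_matches: bool         # material type agrees with the work order
    quantity_matches: bool     # material quantity agrees with the work order
    weight_matches: bool       # equipment: scale weight agrees with the work order


def feeding_state(e: Evidence, confidence_threshold: float = 0.8) -> Tuple[int, str]:
    """Return (y, category) with y=+1 for normal and y=-1 for abnormal feeding."""
    action_ok = e.action_confidence >= confidence_threshold
    if action_ok and e.type_matches and e.quantity_matches and e.weight_matches:
        return +1, "normal"
    # Replacement: both materials in hand, only non-target at the port,
    # weight consistent but material type inconsistent.
    if (e.hand_has_target and e.hand_has_non_target
            and e.port_has_non_target and not e.port_has_target
            and e.weight_matches and not e.type_matches):
        return -1, "material replacement"
    # Insufficient feeding: compliant motion, but quantity and weight both off.
    if action_ok and not e.quantity_matches and not e.weight_matches:
        return -1, "insufficient feeding"
    return -1, "other abnormal"
```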
[0057] In some embodiments of S400, the graph structure feature vector of the fused spatiotemporal graph is combined with the corresponding feeding state label y and variability label v to obtain a dual-label dataset specifically for theft prevention of precious metal feeding.
[0058] During cloud-based model training, the dual-label dataset can be used as the training input set.
[0059] In some embodiments of S500, the dual-label dataset is input into the trained multi-task fusion model.
[0060] The multi-task fusion model outputs predicted values of the feeding status label y and the variability label v. Based on the predicted values, the corresponding extension domain is determined, and the corresponding disposal strategy is executed.
[0061] Referring to Figure 2, based on the four-quadrant extension domain classification and combined with the security level requirements for precious metal feeding, the dual-label predicted values output by the model are divided into four extension domains: the negative quantitative change domain, the negative qualitative change domain, the positive qualitative change domain, and the positive quantitative change domain. Differentiated hierarchical anti-theft strategies are implemented for each extension domain.
[0062] In one embodiment, based on the predicted values of the feeding status label and the variability label, if the feeding status label is determined to be the normal feeding state (y=+1) and the variability label is determined to be the low variability label (v=-1), then the corresponding extension domain is determined to be the positive quantitative change domain. The feeding status characteristics are normal feeding, stable state, and no abnormal trend.
[0063] In one embodiment, based on the predicted values of the feeding status label and the variability label, if the feeding status label is determined to be in a normal feeding state (y=+1) and the variability label is determined to be in a high variability state (v=+1), then the corresponding extension domain is determined to be a positive qualitative change domain. The feeding status characteristics are normal feeding, unstable state, and prone to transformation into an abnormal state.
[0064] In one embodiment, based on the predicted values of the feeding status label and the variability label, if the feeding status label is determined to be an abnormal feeding state (y=-1) and the variability label is determined to be a low variability label (v=-1), then the corresponding extension domain is determined to be a negative variable domain. The feeding status characteristics are abnormal feeding, stable state, and clear characteristics (such as clear theft / replacement).
[0065] In one embodiment, based on the predicted values of the feeding status label and the variability label, if the feeding status label is determined to be an abnormal feeding state (y=-1) and the variability label is determined to be a high variability label (v=+1), then the corresponding extension domain is determined to be a negative prime variable domain. The feeding status characteristics are abnormal feeding, unstable state, and suspected abnormality undetermined.
[0066] Based on the determined extension domain, the corresponding hierarchical anti-theft strategy is determined.
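The mapping from the two predicted labels to an extension domain can be sketched as a small lookup table. This is only an illustrative sketch: the names `ExtensionDomain` and `extension_domain` are hypothetical, and the domain names follow paragraph [0061].

```python
from enum import Enum


class ExtensionDomain(Enum):
    # Names follow paragraph [0061]; the enum itself is illustrative.
    POSITIVE_QUANTITATIVE = "positive quantitative variable domain"
    POSITIVE_QUALITATIVE = "positive qualitative variable domain"
    NEGATIVE_QUANTITATIVE = "negative quantitative variable domain"
    NEGATIVE_QUALITATIVE = "negative qualitative variable domain"


# (y, v) -> domain, with y the feeding status label and v the variability label.
_DOMAIN_TABLE = {
    (+1, -1): ExtensionDomain.POSITIVE_QUANTITATIVE,  # normal, stable
    (+1, +1): ExtensionDomain.POSITIVE_QUALITATIVE,   # normal, unstable
    (-1, -1): ExtensionDomain.NEGATIVE_QUANTITATIVE,  # abnormal, clear
    (-1, +1): ExtensionDomain.NEGATIVE_QUALITATIVE,   # abnormal, undetermined
}


def extension_domain(y: int, v: int) -> ExtensionDomain:
    """Return the extension domain for predicted labels y and v (each +1 or -1)."""
    try:
        return _DOMAIN_TABLE[(y, v)]
    except KeyError:
        raise ValueError(f"labels must be +1 or -1, got y={y}, v={v}") from None
```

A disposal strategy could then be selected per domain, for example by keying a second table on `ExtensionDomain`.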
[0067] After the anti-theft strategy is implemented, visual data of material feeding continues to be collected for 5 seconds to analyze the trend of the material feeding status: if the trend changes to normal, the multi-task fusion model is fine-tuned within the variable security threshold; if the trend changes to abnormal, the sample is marked as a high-risk sample and added to the dual-label dataset; if the status does not change, the original parameters are retained for continuous monitoring. Staff intervention operations (such as false alarm cancellation, missed alarm supplementation, manual alarm, and process update) are recorded, the operation details are associated with the corresponding material feeding visual data as samples to be corrected, and the dual-label dataset is corrected according to preset rules (such as correcting v to -1 for false-alarm samples in the positive qualitative variable domain). The corrected high-confidence samples are added to the dual-label dataset. The multi-task fusion model is retrained monthly on the new dataset, the parameters of the old model are overwritten, and the model is redeployed to the edge to continuously improve its generalization. At the same time, the judgment criteria of the hierarchical anti-theft strategy are optimized according to process updates and adapted to the new material feeding process requirements, achieving iterative optimization of the multi-task fusion model.
[0068] In some embodiments of this invention, in step S100, the process of obtaining the graph structure feature vector specifically includes the following steps: S110 uses the established target detection model to detect the feeder, materials and equipment in the feeding area as detection targets, and outputs a data set of candidate areas for each detection target.
[0069] S120, based on the candidate regions of each detection target, uses the established extraction model and dataset to identify the key points of the core actions of the feeding behavior, outputs the visual recognition data of each key point, and extracts the visual features of the material to obtain visual material data.
[0070] S130: Acquire the equipment's working data, align the visual recognition data and visual material data with the working data according to the timestamp, and form the feeding visual data from the aligned visual recognition data, visual material data and working data.
[0071] S140, perform data cleaning on the visual data of material feeding, and construct a material feeding coordinate system with the center of the material feeding port in the material feeding area as the absolute reference origin; wherein, the visual data of material feeding includes visual recognition data, visual material data and working data.
[0072] S150 normalizes and converts the coordinates of the visual recognition data into relative coordinates in the feeding coordinate system, normalizes the visual material data and working data numerically, and eliminates interference factors.
[0073] S160 standardizes all cleaned and normalized data to obtain structured feature vectors.
[0074] S170: Extract the structured feature vectors of a set number of consecutive time steps to form a time-series segment of the feeding behavior. Take the feeder, material and equipment as spatiotemporal graph nodes; each of the three spatiotemporal graph nodes has corresponding node features.
S180: Based on the time-series segment, establish the physiological connection relationship of the feeder's key points within the same frame according to the node features of the feeder's spatiotemporal graph node, and establish the spatial association relationship between the feeder, material and equipment according to the node features of the three spatiotemporal graph nodes, to form the spatial edges.
S190: Based on the time-series segment, obtain the temporal edges between adjacent frames from the motion trajectories of the same node features. Based on the spatiotemporal graph nodes, temporal edges, and spatial edges, a fused spatiotemporal graph is constructed, and the graph structure feature vector is output through the fused spatiotemporal graph.
[0075] In some embodiments of S110, based on the acquisition sensors installed in the feeding area, data is collected through the acquisition sensors, and the established target detection model is used to detect three core detection targets in the feeding area: the feeder, precious metal materials, and feeding equipment, and output a set of candidate areas for each target.
[0076] The target detection model is the YOLOv5 target detection model, but other detection models are not limited in this application. The feeding equipment includes: electronic scale, feeding port and material box.
[0077] Specifically, there are three sensors. The first sensor is located in the material handling area and covers the material bin and the starting section of the path from the material handling area to the feeding area, ensuring that the action of leaving the material bin is captured. The second sensor is located in the feeding area and covers the feeding port and the area within 1 m in front of it, ensuring that the action of the hand approaching the feeding port is captured. The third sensor covers the middle section of the path from the material handling area to the feeding area, with a 1-2 m overlap in field of view, to prevent the hand from completely leaving the cameras' field of view during movement.
[0078] A number of positioning markers (such as black and white blocks) are placed on the ground between the material bin and the feeding port. By identifying the positioning markers, it is possible to confirm whether the person is moving or has reached the feeding area, supplementing the key point detection of the camera and avoiding process loss due to obstruction during movement.
[0079] In some embodiments of S120, based on the candidate region, using the established extraction model, 21 key points of the hands, 12 key points of the upper limbs and torso of the person feeding the material are cropped and extracted, the normalized coordinates and visibility information of each key point are output, and the confidence of the visibility information is extracted to form visual recognition data.
[0080] Simultaneously, visual features of precious metal materials are extracted to obtain visual material data.
[0081] The visual material data includes: material type, material quantity, and location characteristics. Visual material data may also include other data, which are not limited in this application. The extraction model can be the MediaPipe Pose or MediaPipe Hands key point extraction model.
[0082] In some embodiments of S130, the working data of the feeding device is acquired, and the visual recognition data, visual material data and working data are aligned by timestamp and stored in the edge database to form the feeding visual data. The acquisition interval is set to 200 ms per sample to balance real-time performance and computational load.
[0083] The operating data includes the weight of the electronic scale and the status of the feeding port. Other data may also be included, which are not limited in this application.
[0084] Specifically, the steps for determining that the feeding behavior is correct are as follows. During the material handling stage, the material handling area is monitored using the established target detection model. Material handling is considered complete only if all three of the following conditions are met simultaneously: 1) the established extraction model confirms that the key points of both wrists have entered the calibrated area of the material bin, with a confidence level > 0.6; 2) the established target detection model detects the target precious metal in the area surrounding the hands, with a confidence level > 0.7; 3) the exclusion condition is ruled out, where the exclusion condition is that the established extraction model detects the current material as not being the target precious metal, indicating that the wrong material was picked up. If the exclusion condition is ruled out, the process proceeds to the next step; otherwise, the process is prohibited from proceeding to the next step.
[0085] During the movement phase, to verify that the material remains in the feeder's hands, the hand key point coordinates output by the established extraction model are used to dynamically select a 50 px area around the hand as the target detection area. The established target detection model checks this area in every frame. If the target precious metal is not detected for several consecutive frames, a "material loss" warning is triggered and the process is paused. Brief occlusion is handled as follows: based on the material position and hand trajectory of the previous frame, the "material in hand" state is temporarily retained, and the on-site occlusion warning light is lit to remind the user. The maximum waiting time is 3 seconds. Once the material is visible again, the verification is repeated.
[0086] During the feeding stage, the established target detection model and extraction model are used for detection. Feeding is considered complete only if all four of the following conditions are met simultaneously: 1) the extraction model confirms that the fingertip key point has entered the feed inlet calibration area, with a confidence level > 0.5; 2) the target detection model detects the target precious metal in the feed inlet area, with a confidence level > 0.6; 3) the target detection model confirms that the IOU between the target precious metal detection frame and the feed inlet detection frame is > 0.3, which confirms that the material has entered the feed inlet rather than merely approached it; 4) the exclusion condition is ruled out, where the exclusion condition is that only a hand is detected in the feed inlet area without the precious metal, which indicates empty feeding and triggers an early warning.
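As an illustration of the feeding-stage check, the sketch below computes the IOU of two axis-aligned detection frames and combines it with the thresholds stated above (0.5, 0.6, 0.3). The function names, the box layout `(x_min, y_min, x_max, y_max)`, and the convention that a missing metal detection is passed as `None` are assumptions made for this example, not details given in this application.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def feeding_complete(
    fingertip_in_inlet: bool,
    fingertip_conf: float,
    metal_box: Optional[Box],
    metal_conf: float,
    inlet_box: Box,
) -> bool:
    """Apply the four feeding-stage conditions with the thresholds from the text."""
    if metal_box is None:
        # Only a hand at the inlet, no precious metal: empty feeding (exclusion condition).
        return False
    return (
        fingertip_in_inlet
        and fingertip_conf > 0.5
        and metal_conf > 0.6
        and iou(metal_box, inlet_box) > 0.3
    )
```

For example, two 2x2 boxes offset by one unit in each direction share an area of 1 out of a union of 7, so their IOU is about 0.14 and the third condition fails.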
[0087] After all three stages are completed, the current feeding behavior is considered to be correct. The collected and processed visual recognition data, visual material data and working data are aligned with the timestamp and stored in the edge database to form feeding visual data.
[0088] In some embodiments of S140, outliers are removed using the 3σ principle and missing values are filled using linear interpolation to achieve data cleaning.
[0089] Outliers can be instantaneous jumps in the electronic scale reading and / or key point detection errors. Missing values can be due to the loss of key points caused by temporary occlusion or the loss of parameters caused by temporary sensor disconnection. This application does not impose any restrictions on the types of the above values.
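The cleaning step can be sketched as follows for a single time series such as the electronic scale reading. This is a minimal sketch under stated assumptions: missing samples are represented as `None`, the 3σ test uses the population standard deviation of the available samples, and gaps at the start or end of the series are filled with the nearest known value (this application only specifies linear interpolation, so the edge handling is an assumption). The function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Optional

Series = List[Optional[float]]


def remove_outliers_3sigma(values: Series) -> Series:
    """Replace samples farther than 3 sigma from the mean with None."""
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return list(values)
    mu, sigma = mean(present), pstdev(present)
    if sigma == 0:
        return list(values)
    return [
        None if v is not None and abs(v - mu) > 3 * sigma else v
        for v in values
    ]


def fill_linear(values: Series) -> Series:
    """Fill None gaps by linear interpolation between known neighbours."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    if not known:
        return out
    # Assumption: leading and trailing gaps hold the nearest known value.
    for i in range(known[0]):
        out[i] = out[known[0]]
    for i in range(known[-1] + 1, len(out)):
        out[i] = out[known[-1]]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            t = (i - left) / (right - left)
            out[i] = out[left] + t * (out[right] - out[left])
    return out
```

Note that with very short series a single spike cannot exceed 3σ of its own sample, so the test is only meaningful over a reasonably long window.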
[0090] A three-dimensional feeding coordinate system is constructed with the center of the feeding port as the absolute reference origin.
[0091] In some embodiments of S150, based on visual recognition data, the normalized coordinates of the 2D screen of each core key point of the feeder are converted into relative coordinates under the feeding coordinate system to achieve spatial normalization.
[0092] Specifically, the conversion formula is: x_rel=x_screen-x0, y_rel=y_screen-y0, z_rel=distance sensor acquisition value-z0, where x0, y0, and z0 are the coordinate values of the center of the feeding port in the feeding coordinate system.
[0093] Based on visual material data and work data, the material type, quantity, location characteristics, electronic scale weight, and feeding port status are numerically normalized. The min-max normalization formula is used to convert them into normalized values in the range of [0,1], eliminating spatial interference factors such as camera installation location, feeding personnel height, and material specifications, and ensuring the consistency and comparability of features.
[0094] Specifically, the normalization formula is: x_norm=(x-x_min) / (x_max-x_min).
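The two formulas above can be written out directly. The sketch below is illustrative only; the function names are hypothetical, and returning 0.0 when x_max equals x_min is an assumption made to avoid division by zero.

```python
from typing import Tuple


def to_relative(
    x_screen: float,
    y_screen: float,
    z_sensor: float,
    origin: Tuple[float, float, float],
) -> Tuple[float, float, float]:
    """x_rel = x_screen - x0, y_rel = y_screen - y0, z_rel = z_sensor - z0."""
    x0, y0, z0 = origin
    return (x_screen - x0, y_screen - y0, z_sensor - z0)


def min_max(x: float, x_min: float, x_max: float) -> float:
    """x_norm = (x - x_min) / (x_max - x_min), mapped into [0, 1]."""
    if x_max == x_min:
        return 0.0  # assumption: a constant feature carries no information
    return (x - x_min) / (x_max - x_min)
```

For instance, a scale reading of 15 g with an observed range of 10 g to 20 g normalizes to 0.5.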
[0095] In some embodiments of S160, all cleaned and normalized feature data are uniformly standardized to form structured feature vectors.
[0096] In some embodiments of S170, a set number of structured feature vectors are extracted according to the time series to form a time sequence segment of feeding behavior. The feeder, material and equipment are used as nodes in the spatiotemporal graph. Each node has corresponding node features, and the node features carry normalized coordinates and visibility information.
[0097] Specifically, the node features of the spatiotemporal graph nodes of the feeder are as follows: 12 core key points of the upper limbs and torso defined in S120 are reused, and 21 key points of the hands defined in S120 are reused. The node features include: the relative coordinates in the feeding coordinate system in S150 and the visibility information and / or confidence of each key point in S120.
[0098] The node features of the spatiotemporal graph nodes of the materials are: the visual material data of material type, material quantity and location features extracted from S120 are reused. The node features include: the [0,1] interval values in S150 after numerical normalization, that is, the visual material data after numerical normalization.
[0099] The node characteristics of the spatiotemporal diagram nodes of the equipment are: reuse the working data of electronic scale weight and feeding port switch status collected in S130. The node characteristics include: the numerical values in S160 after numerical normalization and standardization, that is, the working data after numerical normalization and standardization, to eliminate the equipment range difference.
[0100] In some embodiments of S180, the edge definition of the fused spatiotemporal graph is based on the spatial associations of S110 to S130 and the spatial references of S140 to S160.
[0101] Based on temporal segments, node features within the same frame are extracted. For the spatiotemporal graph nodes of the feeder, physiological connections (shoulder-elbow-wrist) are established for the feeder's key points. For the three spatiotemporal graph nodes of the feeder, material, and equipment, spatial relationships between the feeder, material, and equipment are established. Spatial edges are obtained through physiological and spatial connections.
[0102] Specifically, the spatial edges are based on the spatial relationships between the feeder, materials, and equipment collected in S110 to S130 (such as whether the hand is in contact with the material or whether the material is directly above the feed port). Because the spatial distance is calculated in the three-dimensional coordinate system whose origin is the feed port center from S140, the spatial association features remain comparable across different camera perspectives and different feeder heights.
[0103] In some embodiments of S190, based on time segments, the time edge is the motion trajectory of the same node in adjacent frames.
[0104] The motion trajectory can be the movement trajectory of the hand from the electronic scale to the feeding port and / or the change trajectory of the material weight.
[0105] Specifically, based on the timestamp alignment of S130, the same node in adjacent frames constructs motion trajectories according to the timestamp order, and the changes in trajectory coordinates are relative coordinates after spatial normalization of S150, eliminating spatial interference of absolute coordinates.
[0106] The final output of the fused spatiotemporal graph is a graph structure feature vector with node features and edge weights. Its dimension and numerical range are determined by the structured feature vector, ensuring that it can be directly input into subsequent steps for dual-labeling without additional format conversion.
[0107] In some embodiments of this invention, in step S200, the process of obtaining the three-dimensional score includes: S210: Based on the graph structure feature vector, the score of posterior uncertainty is calculated using a Gaussian mixture model and a Bayesian posterior probability model.
[0108] S220: Based on the graph structure feature vector, calculate the absolute difference of the logarithmic class-conditional densities of the two Gaussian mixture models, and calculate the boundary proximity score based on the absolute difference.
[0109] S230, based on the graph structure feature vector, calculate the sparsity score using the total joint probability density formula.
[0110] In some embodiments of S210, the variability label v differs from traditional technical solutions that only determine normal or abnormal conditions. The essence of determining the variability label is to quantify the variability of the material feeding state through algorithms, predict whether the state will transform into an abnormal state, and provide a core basis for hierarchical anti-theft strategies. This is achieved through graph structure feature vectors.
[0111] Addressing the pain point that existing precious metal feeding anti-theft technologies can only passively detect anomalies and cannot predict potential risks in advance, this application uses the variability label to determine the stability of the feeding status: for stable normal states (low variability label), routine monitoring and handling are performed; for clearly abnormal states, direct handling is performed; for unstable states (high variability label, such as normal feeding that deviates from the standard, or suspected anomalies that have not yet been confirmed), early warning and intervention are provided, realizing an upgrade from "passive alarm" to "proactive prediction".
[0112] The graph structure feature vectors contain three types of normalized / standardized core features: Personnel action features: the three-dimensional relative coordinates of the key points of the feeder's upper limbs, torso, and hands, together with the detection confidence and / or visibility information from the visual recognition data. Material state features: the visual material data of the precious metal material's type, quantity, and location characteristics. Equipment parameter features: the working data of the electronic scale's weight and the feed port's open / closed status.
[0113] The label variability algorithm is used to calculate the three-dimensional scores of posterior uncertainty, boundary proximity and sparsity for each sample, which characterize the degree of variability of the state from different perspectives. The higher the score, the stronger the variability.
[0114] The posterior uncertainty U characterizes the fuzziness of the model's determination of the feeding state.
[0115] Bayesian posterior probability: P(y=1|x) = P(x|y=1)P(y=1) / P(x)
[0116] Where: P(x|y=1) is the class-conditional density of each graph structure feature vector sample calculated using the Gaussian mixture model; P(y=1) is the prior probability; and P(x) = P(x|y=1)P(y=1) + P(x|y=0)P(y=0). The closer the posterior probability is to 0.5, the more uncertain the model is.
[0117] Score for posterior uncertainty:
[0118] When P(y=1|x)=0.5, then U=1 (most uncertain); when P(y=1|x)=0 or 1, then U=0 (most certain). Here x can be a sample, i.e., the graph structure feature vector of the current feeding behavior.
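The posterior and the uncertainty score can be sketched as below. The posterior follows Bayes' rule with the terms defined above. The exact expression for U is not reproduced in this text, so the sketch assumes the linear form U = 1 - |2P(y=1|x) - 1|, which is only one of several functions that give U=1 at P(y=1|x)=0.5 and U=0 at P(y=1|x)=0 or 1; other choices (for example binary entropy) satisfy the same endpoints. The function names are hypothetical, and the class-conditional densities are taken as inputs rather than computed from a fitted Gaussian mixture model.

```python
def posterior_normal(px_given_1: float, px_given_0: float, prior_1: float) -> float:
    """P(y=1|x) from the class-conditional densities and the prior P(y=1)."""
    prior_0 = 1.0 - prior_1
    evidence = px_given_1 * prior_1 + px_given_0 * prior_0  # P(x)
    if evidence <= 0.0:
        raise ValueError("P(x) must be positive")
    return px_given_1 * prior_1 / evidence


def posterior_uncertainty(p: float) -> float:
    """Assumed form: U = 1 - |2p - 1|, peaking at p = 0.5."""
    return 1.0 - abs(2.0 * p - 1.0)
```

With equal densities and equal priors the posterior is 0.5, so the sample receives the maximum uncertainty score.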
[0119] In some embodiments of S220, the boundary proximity B represents the distance between the feeding state characteristics and the normal-abnormal determination boundary. The higher the value, the more likely the state is in the boundary area and it is easy to transform into another state.
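[0120] Calculate the logarithmic class-conditional densities of the two Gaussian mixture models (GMMs):
[0121] logP(x|y=1) and logP(x|y=0), each evaluated under the Gaussian mixture model fitted to the corresponding class.
[0122] Calculate the absolute difference: ΔlogP = |logP(x|y=1) - logP(x|y=0)|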
[0123] Boundary proximity score:
[0124] When ΔlogP approaches 0, the boundary proximity B approaches 1; when ΔlogP is at its maximum, the boundary proximity B approaches 0.
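A possible realization of the boundary proximity score is sketched below. The exact expression for B is not reproduced in this text; the sketch assumes B = 1 - ΔlogP / max(ΔlogP), with the maximum taken over the sample set, which matches the stated limits (B approaches 1 as ΔlogP approaches 0, and B approaches 0 at the largest ΔlogP). This form and the function names are assumptions for illustration.

```python
import math
from typing import List


def log_density_gap(px_given_1: float, px_given_0: float) -> float:
    """Absolute difference of the two log class-conditional densities."""
    return abs(math.log(px_given_1) - math.log(px_given_0))


def boundary_proximity(gaps: List[float]) -> List[float]:
    """Assumed form: B_i = 1 - gap_i / max(gaps)."""
    if not gaps:
        return []
    g_max = max(gaps)
    if g_max == 0.0:
        return [1.0] * len(gaps)  # every sample lies on the boundary
    return [1.0 - g / g_max for g in gaps]
```

Because the score is scaled by the largest gap in the batch, it is relative to the sample set being scored.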
[0125] In some embodiments of S230, sparsity S represents the frequency of occurrence of the feeding state in actual industrial scenarios. The higher the value, the rarer the scenario, the less regular the state characteristics, and the poorer the stability.
[0126] Using the total joint probability density: P(x) = P(x|y=1)P(y=1) + P(x|y=0)P(y=0)
[0127] Take the logarithm: logP(x) = log[P(x|y=1)P(y=1) + P(x|y=0)P(y=0)]
[0128] The lower the logP(x) value, the more abnormal or sparse the sample is. Sparsity score: Sort logP(x) to convert it into a sparsity ranking:
[0129] N is the total number of samples, and rank() is the ascending ranking (rank=1 has the lowest density).
[0130] S tends towards 1: extremely sparse, highly variable; S tends towards 0: dense region, low variability.
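The rank-based sparsity score can be sketched as follows. The exact expression is not reproduced in this text; the sketch assumes S_i = 1 - (rank_i - 1) / (N - 1), where rank_i is the ascending rank of logP(x_i), so the lowest-density sample gets S=1 and the highest-density sample gets S=0. This form, the tie-breaking by input order, and the function name are assumptions for illustration.

```python
from typing import List


def sparsity_scores(log_px: List[float]) -> List[float]:
    """Assumed form: S_i = 1 - (rank_i - 1) / (N - 1), rank 1 = lowest density."""
    n = len(log_px)
    if n == 0:
        return []
    if n == 1:
        return [0.0]  # assumption: a single sample has no ranking information
    # Ascending order of log density; ties keep their input order.
    order = sorted(range(n), key=lambda i: log_px[i])
    scores = [0.0] * n
    for rank0, i in enumerate(order):  # rank0 = rank - 1
        scores[i] = 1.0 - rank0 / (n - 1)
    return scores
```

For three samples with log densities -5, -1 and -3, the scores are 1.0, 0.0 and 0.5 respectively.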
[0131] In some embodiments of this invention, in S200, the process of obtaining the variability label includes: S240 uses the designed clustering algorithm to classify the fitted three-dimensional scores into high-score clusters and low-score clusters.
[0132] S250 calculates the average weighted composite score of high-scoring and low-scoring clusters for each feeding behavior based on the three-dimensional scores.
[0133] S260, based on the set score threshold and the average weighted comprehensive score, divides the labels into low-variability labels and high-variability labels.
[0134] In this embodiment, the fitted 3D scores are classified using the designed clustering algorithm into high-score clusters and low-score clusters. The average weighted composite score of each sample from the two clusters is compared. All samples in the cluster with the lower average weighted composite score are labeled with low variability (v=-1), and all samples in the cluster with the higher average weighted composite score are labeled with high variability (v=+1). The dual-label dataset is thus constructed. In one embodiment, a score threshold can be set to differentiate between high and low scores.
[0135] Equal weighted comprehensive score
[0136] Average weighted composite score =
[0137] In some embodiments, samples are labeled based on clustering results, with unique and quantifiable rules. n is the total number of samples; i is the current sample and represents the current feeding action.
[0138] High variability label (v=+1): The sample is clustered into the high-score cluster, which indicates that the feeding state is unstable and there is a high risk of transformation into an anomaly; Low variability label (v=-1): The sample is clustered into the low-score cluster, which indicates a stable feeding state with clear normal / abnormal characteristics and no tendency to change.
[0139] The variability labels include high-variability labels and low-variability labels. The clustering algorithms used include the Gaussian Mixture Model (GMM) and K-means clustering. The GMM is iterated 50 times, and K-means clustering uses the elbow rule to determine the number of clusters to be 2.
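The clustering and labeling step can be sketched with a small two-cluster K-means over the (U, B, S) score triples. This is a simplified illustration rather than the method of this application: it uses K-means only (no GMM), takes the equal-weighted composite to be (U + B + S) / 3 (the expression is not reproduced in this text, so this is an assumption), initializes the two centers from the samples with the lowest and highest composite, and caps the iterations at an arbitrary 50. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Score = Tuple[float, float, float]  # (U, B, S) for one feeding behavior


def composite(score: Score) -> float:
    """Equal-weighted composite of the three scores (assumed form)."""
    return sum(score) / 3.0


def _dist2(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points: List[Score], iters: int = 50) -> List[int]:
    """Plain K-means with k=2 and a deterministic initialization."""
    centers = [min(points, key=composite), max(points, key=composite)]
    assign: List[int] = []
    for _ in range(iters):
        new_assign = [
            0 if _dist2(p, centers[0]) <= _dist2(p, centers[1]) else 1
            for p in points
        ]
        if new_assign == assign:
            break  # converged
        assign = new_assign
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:
                centers[k] = tuple(
                    sum(m[d] for m in members) / len(members) for d in range(3)
                )
    return assign


def variability_labels(points: List[Score]) -> List[int]:
    """Label the cluster with the higher mean composite +1, the other -1."""
    assign = two_means(points)
    averages = []
    for k in (0, 1):
        vals = [composite(p) for p, a in zip(points, assign) if a == k]
        if not vals:
            raise ValueError("samples did not split into two clusters")
        averages.append(sum(vals) / len(vals))
    high = 0 if averages[0] > averages[1] else 1
    return [+1 if a == high else -1 for a in assign]
```

Labeling by cluster rather than by a fixed cut-off means the boundary between high and low variability adapts to the score distribution of the dataset.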
[0140] In some embodiments of this invention, in step S300, the process of obtaining the feeding status label includes: S310 uses the established extraction model to determine whether the feeder is feeding materials normally, based on the personnel action features and the set confidence threshold.
[0141] S320 determines, based on the material state features, the equipment parameter features, and the data verified in the work order, whether the fed material matches the data verified in the work order and whether the weight is consistent.
[0142] S330: when it is determined that the feeder is feeding normally, the fed material matches the data verified in the work order, and the weight is consistent, the feeding behavior corresponding to the graph structure feature vector is labeled as the normal feeding status.
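Steps S310 to S330 amount to a conjunction of three checks. The sketch below illustrates this with hypothetical data classes and field names; the weight tolerance is an assumed parameter standing in for the allowable process error, and the single `action_confidence` value is a simplification of the per-stage key point confidences.

```python
from dataclasses import dataclass


@dataclass
class FeedingObservation:
    # Hypothetical summary of one feeding behavior.
    action_confidence: float  # confidence from the personnel action features
    material_type: str
    material_count: int
    scale_weight_g: float


@dataclass
class WorkOrder:
    # Hypothetical work-order record.
    material_type: str
    material_count: int
    weight_g: float
    weight_tolerance_g: float  # assumed allowable process error


def feeding_status_label(
    obs: FeedingObservation, order: WorkOrder, confidence_threshold: float
) -> int:
    """Return y=+1 (normal feeding) only if all three checks pass, else y=-1."""
    action_ok = obs.action_confidence >= confidence_threshold           # S310
    material_ok = (
        obs.material_type == order.material_type
        and obs.material_count == order.material_count
    )                                                                   # S320, type and quantity
    weight_ok = abs(obs.scale_weight_g - order.weight_g) <= order.weight_tolerance_g  # S320, weight
    return +1 if (action_ok and material_ok and weight_ok) else -1      # S330
```

A single failed check is enough to label the behavior abnormal here; identifying which abnormal type applies requires the additional evidence chains described for each type.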
[0143] In this embodiment, based on the requirements of the precious metal feeding process, the feeding status is divided into normal feeding status (y=+1) and abnormal feeding status (y=-1) using the established visual recognition algorithm. The abnormal feeding status is further subdivided into four categories: stealing materials, substituting materials, feeding insufficient materials, and feeding the wrong materials. Based on graph structure feature vectors, the feeding status is identified and determined according to the fusion features of personnel actions, material status, and equipment parameters.
[0144] The core of the judgment is to match and verify the three types of integrated features with the process requirements one by one. If all of them match, the feeding is normal. If one or more types of features conflict with the process requirements and form a chain of abnormal evidence that corroborates each other, the feeding is judged to be an abnormal state of the corresponding type.
[0145] The verification priority and correlation of the three types of integrated features are as follows: personnel action features are the behavioral basis, material status features are the core basis, and equipment parameter features are numerical evidence. All three are indispensable to avoid misjudgment based on a single feature.
[0146] Specifically, the confidence level setting for each target identification should be based on the actual production data, such as the pixel count of the acquisition sensor and ambient lighting, and the confidence threshold should be set according to the test results.
[0147] Operational guidelines require that the entire feeding process must follow the standard procedure of "picking up (material bin) - moving (designated path) - feeding (feeding port)". Hands / materials must always be within the monitoring field of vision. Non-standard actions such as hands deviating from the designated path or materials leaving the hands and not entering the feeding port are prohibited.
[0148] In other words, the MediaPipe extraction model and YOLOv5 detection model are used to detect the entire feeding process of personnel action features to ensure that the entire process is within the monitoring field of vision. The confidence level obtained from the personnel action features is compared with the set confidence threshold. If the set confidence threshold is met, the feeding is considered to be normal; otherwise, it is considered to be abnormal feeding.
[0149] Material matching requirements: The type and quantity of precious metal materials fed must be completely consistent with the production work order. It is forbidden to feed the wrong precious metal materials or replace them with non-target precious metal materials. In other words, the material type and quantity in the material state features are matched against the data approved in the work order. If they match completely, the feeding is considered normal; otherwise, it is considered abnormal feeding.
[0150] Compliance requirements for quantity values: The weight of materials fed must match the value approved in the production work order (the error must be within the allowable range of the process). It is forbidden to feed too little or too much material. All materials must be fed into the feeding port. It is forbidden to intercept or carry away materials during the process. In other words, the weight of the electronic scale in the equipment parameter features is matched against the data verified in the work order. If they match within the allowable error, the feeding is considered normal; otherwise, it is considered abnormal feeding.
[0151] The determination of all material feeding status is based on whether it meets the above process requirements. If it does, it is marked as a normal feeding status. The execution of process requirements is verified by integrating features, thereby defining normal / abnormal and the type of abnormality.
[0152] Combining the quantity compliance requirements and the action standard requirements, the following abnormal evidence chain over the fused personnel-material-equipment features is used for the determination. On the premise that legitimate material collection has been completed, all features must be aligned by timestamp to form a continuous behavioral trajectory for verification: Personnel movements: the MediaPipe extraction model detects that the key points of both wrists have entered the material box calibration area, and the upper limb movements conform to normal material handling specifications; Material status: the YOLOv5 detection model detects the target precious metal A around the hand, and the quantity and / or volume match the work order verification value; Equipment parameters: the electronic scale detects a decrease in the weight of precious metal A in the bin that is consistent with the work order's approved value, with no abnormal fluctuations.
[0153] Core: The abnormal fusion feature chain during the movement / feeding stage, which is the key basis for determining material theft.
[0154] In another embodiment, the process for determining the normal feeding status includes: When all three types of fused features (personnel actions, material status, and equipment parameters) match the process requirements, and there are no abnormal features throughout the process, forming a continuous and compliant behavior trajectory, the state is determined to be a normal material feeding state. The core fused feature conditions are: Based on the personnel action features in the graph structure feature vector, it is determined that the entire process follows the specified path, the key point trajectory is standardized, and there are no anomalies such as occlusion, deviation, or non-standard convergence / extension; the confidence of the key points at each stage meets the set process confidence threshold. Based on the material state features in the graph structure feature vector, the type and quantity of materials picked up and fed are consistent with the work order, and the materials remain around the hand / feeding port area throughout the process, with no abnormalities such as detachment, replacement, or interception. Based on the equipment parameter features in the graph structure feature vector, the decrease value of the material bin and the increase value of the feeding port are consistent with the work order verification value, the error is within the allowable range of the process, the sensor and electronic scale show no abnormal jumps, and the material type matches the result of the equipment identification module.
[0155] The feeding status labels include: normal feeding status (y=+1) and abnormal feeding status (y=-1). The visual recognition algorithms include: YOLOv5 object detection model and MediaPipe key point extraction model.
[0156] In some embodiments of this invention, in step S300, the process for determining the abnormal feeding status of the replacement material includes:
S340: based on the personnel action features, use the established target detection model to detect whether the hand area simultaneously contains target precious metal materials and non-target precious metal materials.
S341: based on the material state features, use the established target detection model to detect whether the feeding port area contains only non-target precious metal materials and no target precious metal materials.
S342: based on the equipment parameter features and material state features, check whether the weight and type of material at the feeding port are consistent with the data verified in the work order.
S343: when there are both target precious metal materials and non-target precious metal materials in the hand area, only non-target precious metal materials and no target precious metal materials in the feeding port area, and the weight at the feeding port is consistent with the work order but the material type features do not match, the abnormal feeding state is considered to be a replacement material.
[0157] In this embodiment, the process for determining the abnormal feeding status of the replacement material includes: Based on the personnel action features in the graph structure feature vector, the YOLOv5 detection model detects in the material handling stage that both target precious metal A and non-target precious metal B are present around the hand. Based on the material state features in the graph structure feature vector, the YOLOv5 detection model detects in the feeding stage that only non-target precious metal B is present in the feeding port area, and no target precious metal A is detected. Based on the equipment parameter features and material state features in the graph structure feature vector, the electronic scale detects that the weight at the feeding port is consistent with the work order, but the material type features conflict with the results of the equipment material identification module.
[0158] If the above core integration characteristic conditions are met, then the abnormal feeding state is considered to be a replacement material.
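As an illustration of steps S340 to S343, the decision can be sketched as a conjunction of boolean evidence. The `ReplacementEvidence` container and its field names are hypothetical; in practice each flag would be produced by the detection model, the electronic scale, and the equipment identification module.

```python
from dataclasses import dataclass


@dataclass
class ReplacementEvidence:
    hand_has_target: bool        # S340: target material detected in the hand area
    hand_has_non_target: bool    # S340: non-target material detected in the hand area
    port_has_target: bool        # S341: target material detected at the feeding port
    port_has_non_target: bool    # S341: non-target material detected at the feeding port
    weight_matches_order: bool   # S342: port weight agrees with the work order
    type_matches_order: bool     # S342: material type agrees with the work order


def is_material_replacement(e: ReplacementEvidence) -> bool:
    """S343: all three conditions must hold at once."""
    hand_mixed = e.hand_has_target and e.hand_has_non_target
    port_only_non_target = e.port_has_non_target and not e.port_has_target
    weight_ok_type_bad = e.weight_matches_order and not e.type_matches_order
    return hand_mixed and port_only_non_target and weight_ok_type_bad
```

The weight condition is deliberately part of the rule: a swap that preserves weight is exactly the case that a scale alone would miss.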
[0159] In some embodiments of this invention, in step S300, the process for determining the abnormal feeding state of insufficient material feeding includes: S350, based on the personnel action features, use the established target detection model to detect whether the action trajectory of the feeder conforms to the set feeding specifications.
[0160] S351, based on the material state features, use the established target detection model to detect whether the material quantity is consistent with the data verified in the work order.
[0161] S352, based on the equipment parameter features, check whether the weight at the feeding port is consistent with the data verified in the work order.
[0162] S353, when it is determined that the action trajectory of the feeder conforms to the set feeding specifications, and that the material quantity and the weight at the feeding port do not match the data verified in the work order, the abnormal feeding state is determined to be insufficient material feeding.
[0163] In this embodiment, the process for determining the abnormal feeding state of insufficient material feeding includes: based on the personnel action features in the graph structure feature vector, the actions of the feeder conform to the picking-moving-feeding specification throughout the entire process, with no abnormalities such as deviation from the path or occlusion. Based on the material state features in the graph structure feature vector, the material quantity matches the work order during the material picking stage, and all materials are fed into the feeding port during the feeding stage; however, the YOLOv5 detection model detects that the quantity of material picked is lower than the quantity verified in the work order. Based on the equipment parameter features in the graph structure feature vector, the electronic scale detects that the decrease in the material bin and the increase at the feeding port are both lower than the data verified in the work order, the error exceeds the allowable range of the process, and there is no approved record of insufficient material feeding.
[0164] If the above core fused feature conditions are met, the abnormal feeding state is determined to be insufficient material feeding.
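A minimal sketch of the S350 to S353 decision follows. The parameter names, the use of grams, and the `shortfall_approved` flag (standing in for the "approved record" mentioned in the embodiment) are assumptions for illustration only.

```python
def is_insufficient_feeding(
    trajectory_ok: bool,
    counted_qty: int,
    order_qty: int,
    port_gain_g: float,
    order_weight_g: float,
    tolerance_g: float,
    shortfall_approved: bool = False,
) -> bool:
    """Compliant trajectory, but both quantity and weight fall short of the work order."""
    qty_short = counted_qty < order_qty
    weight_short = (order_weight_g - port_gain_g) > tolerance_g
    return trajectory_ok and qty_short and weight_short and not shortfall_approved
```

Requiring a compliant trajectory is what separates this state from theft or replacement: the operator's motion looks correct, and only the counted quantity and the measured weight reveal the shortfall.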
[0165] In some embodiments of this invention, in step S300, the process for determining the abnormal feeding state further includes the following. In one embodiment, the process for determining the abnormal feeding state of incorrect material feeding includes: based on the personnel action features in the graph structure feature vector, it is determined that the personnel actions are standardized throughout the entire process, without anomalies such as occlusion, deviation from the path, or material concealment. Based on the material state features in the graph structure feature vector, the YOLOv5 detection model detects that the material picked up and fed is non-target precious metal C, with no other materials present. Based on the equipment parameter features and material state features in the graph structure feature vector, the electronic scale detects that the feeding weight is consistent with the work order, the material type feature is consistent with the result of the equipment material identification module, and the production system contains no work order record for non-target precious metal C, which rules out an unsynchronized process adjustment.
[0166] If the above core fused feature conditions are met, the abnormal feeding state is determined to be incorrect material feeding.
[0167] In one embodiment, an example of an abnormal feeding state involving theft is as follows. If material is concealed during the movement phase (e.g., hidden in the palm or a pocket), then, based on the personnel action features in the graph structure feature vector, the MediaPipe model detects that the trajectory of the hand key points deviates from the designated path from the material picking area to the feeding area, with an offset movement toward a pocket / clothing or toward the outside of the feeding area, and that the fingertip key points converge (the palm closes); the hand is occluded by the body for longer than the set short-term occlusion range and / or time, and when the hand key points reappear after the occlusion, their position has deviated from the original movement path.
[0168] Based on the material state features in the graph structure feature vector, the YOLOv5 detection model does not detect the target precious metal material within a dynamically selected 50 px region around the hand for 3 or more consecutive frames, and no target precious metal material is detected in the feeding area; process-permitted situations such as material falling or equipment occlusion are excluded.
[0169] Based on the equipment parameter features in the graph structure feature vector, if the electronic scale does not detect an increase in material weight at the feeding port and the level sensor does not send a material feeding signal, the material is considered to have been stolen.
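The material-state and equipment conditions of this theft example can be sketched as follows. The sketch assumes that the "50 px region" is a circle of radius 50 px around the hand centre and that each frame has already been reduced to a hand position plus an optional material position; both are illustrative assumptions, as are all function names. The exclusion of process-permitted cases (material falling, equipment occlusion) and the key point trajectory check are not modeled here.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Frame = Tuple[Point, Optional[Point]]  # (hand centre, detected target-material centre or None)


def material_near_hand(hand: Point, material: Optional[Point], radius_px: float = 50.0) -> bool:
    """True when a detection exists and lies within radius_px of the hand centre."""
    if material is None:
        return False
    dx = material[0] - hand[0]
    dy = material[1] - hand[1]
    return dx * dx + dy * dy <= radius_px * radius_px


def longest_missing_run(frames: List[Frame], radius_px: float = 50.0) -> int:
    """Length of the longest run of consecutive frames with no material near the hand."""
    longest = current = 0
    for hand, material in frames:
        if material_near_hand(hand, material, radius_px):
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


def suspected_theft(
    frames: List[Frame],
    port_has_target: bool,
    port_weight_gain_g: float,
    level_sensor_triggered: bool,
    min_missing_frames: int = 3,
    radius_px: float = 50.0,
) -> bool:
    """Material lost from the hand region and no evidence that it reached the feeding port."""
    material_lost = longest_missing_run(frames, radius_px) >= min_missing_frames
    no_feed_evidence = (
        not port_has_target
        and port_weight_gain_g <= 0.0
        and not level_sensor_triggered
    )
    return material_lost and no_feed_evidence
```

Combining the visual condition with the scale and level sensor readings keeps a brief detector dropout from being reported as theft when the material demonstrably arrived at the feeding port.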
[0170] Referring to Figure 2, in some embodiments of this invention, in S500, the process of determining the extension domain includes the following steps: S501, train the multi-task fusion model in the cloud using the dual-label dataset until the multi-task fusion model converges.
[0171] S510, based on the predicted value of the feeding status label, determine that the feeding status label indicates the normal feeding state, and based on the predicted value of the variability label, determine that the variability label is the low-variability label. S520, based on the normal feeding state and the low-variability label, determine from the four-quadrant extension domain that the corresponding extension domain is the positive quantitative change domain, so as to determine the hierarchical anti-theft strategy set for the positive quantitative change domain.
[0172] In some embodiments of S501, the structure of the multi-task fusion model is as follows: using an improved ST-GCN as a shared feature extraction layer, alternating stacking of spatial graph convolution and temporal convolution is performed on the fused spatiotemporal graph to extract global spatiotemporal fusion features of feeding behavior; after the ST-GCN output layer, two parallel prediction heads are set to share global spatiotemporal fusion features.
[0173] The parallel prediction heads are the feeding state prediction head and the variability prediction head.
[0174] Feeding status prediction head: uses a softmax classifier to output the probability distribution over normal feeding state / abnormal feeding state, corresponding to the feeding status label y. Variability prediction head: uses the sigmoid function to output the probability of high / low variability, corresponding to the variability label v.
[0175] A weighted joint loss function is used: $L = \alpha L_{\text{ST-GCN}} + \beta L_{\text{BCE}}^{y} + \gamma L_{\text{BCE}}^{v}$, where $L_{\text{ST-GCN}}$ is the spatiotemporal feature loss of ST-GCN, and $L_{\text{BCE}}^{y}$ and $L_{\text{BCE}}^{v}$ are the binary cross-entropy losses of the feeding status label y and the variability label v, respectively, with α=β=γ=1, balancing the training priority of the two tasks.
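The joint loss can be sketched for a single sample in plain Python. Several points are assumptions rather than statements from the specification: the ±1 labels are mapped to 1/0 targets for the cross-entropy terms, the probabilities are assumed to come from the two prediction heads, and the spatiotemporal feature loss is passed in as a precomputed number because its definition is not given.

```python
import math


def bce(p: float, label: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for a +1/-1 label, where p is the predicted probability of +1."""
    target = 1.0 if label == 1 else 0.0
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def joint_loss(
    l_stgcn: float,      # spatiotemporal feature loss, computed elsewhere (assumed given)
    p_normal: float,     # predicted probability that y = +1 (normal feeding)
    y: int,
    p_high_var: float,   # predicted probability that v = +1 (high variability)
    v: int,
    alpha: float = 1.0,
    beta: float = 1.0,
    gamma: float = 1.0,
) -> float:
    """L = alpha * L_ST-GCN + beta * L_BCE^y + gamma * L_BCE^v."""
    return alpha * l_stgcn + beta * bce(p_normal, y) + gamma * bce(p_high_var, v)
```

With equal weights and both heads predicting 0.5, each cross-entropy term contributes ln 2, so a zero feature loss gives a total of 2 ln 2.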
[0176] The multi-task fusion model is trained in the cloud using a dual-label dataset until the model converges; the trained lightweight model is then deployed to an industrial edge computing box to achieve real-time inference, with inference time controlled within 100ms / frame.
[0177] In some embodiments of S510 and S520, the multi-task fusion model outputs the predicted values of the feeding status label y and the variability label v, and executes the corresponding hierarchical anti-theft strategy based on the extension domain corresponding to the predicted value.
[0178] Feeding status label: determines whether the current feeding behavior is a normal feeding state or an abnormal feeding state; this is a result-based determination. Variability label: determines whether the normal or abnormal feeding state is stable or is likely to change; this is a trend-based determination.
[0179] The predicted values of the two labels are combined pairwise to form a four-quadrant extension domain, which provides an unambiguous basis for the subsequent hierarchical anti-theft strategies.
[0180] Based on the four-quadrant extension domain classification and considering the security level requirements for precious metal feeding, the dual-label predicted values output by the model are divided into four extension domains: the negative quantitative change domain, the negative qualitative change domain, the positive qualitative change domain, and the positive quantitative change domain. Differentiated, tiered anti-theft strategies are implemented for each extension domain. At the same time, the model's automatic differentiation mechanism calculates the dominant variable factors (such as personnel hand movements, material weight, and / or electronic scale readings) to achieve precise intervention.
[0181] In one embodiment, based on the predicted values of the feeding status label and the variability label, if the feeding status label is determined to indicate the normal feeding state (y=+1) and the variability label is determined to be the low-variability label (v=-1), then the corresponding extension domain is determined to be the positive quantitative change domain. The feeding state is characterized by normal feeding, a stable state, and no abnormal trend. The corresponding hierarchical anti-theft strategy is to maintain monitoring: relax the detection threshold, reduce computational overhead, and retain normal feeding data. The strategy for the dominant factors is: no intervention required, continuous data collection.
[0182] In one embodiment, based on the predicted values of the feeding status label and the variability label, if the feeding status label is determined to indicate the normal feeding state (y=+1) and the variability label is determined to be the high-variability label (v=+1), then the corresponding extension domain is determined to be the positive qualitative change domain. The feeding state is characterized by normal feeding in an unstable state that is prone to turning abnormal. The corresponding hierarchical anti-theft strategy is early warning and enhanced monitoring: local indicator lights flash, the detection interval is shortened to one detection every 100 ms, access to the feeding area is locked, and changes in the dominant factors are tracked. The strategy for the dominant factors is: focus monitoring on the dominant factors, and upgrade the strategy if the drift toward an abnormal state continues.
[0183] In one embodiment, based on the predicted values of the feeding status label and the variability label, if the feeding status label is determined to indicate the abnormal feeding state (y=-1) and the variability label is determined to be the low-variability label (v=-1), then the corresponding extension domain is determined to be the negative quantitative change domain. The feeding state is characterized by abnormal feeding in a stable, clearly defined abnormal state. The corresponding hierarchical anti-theft strategy is immediate alarm and forced handling: push a remote notification to the security terminal, lock the feeding port and / or hopper, and capture video of the abnormal behavior to preserve evidence. The strategy for the dominant factors is: freeze the relevant parameters of the dominant factors, preserve evidence, and coordinate with security personnel for on-site intervention.
[0184] In one embodiment, based on the predicted values of the feeding status label and the variability label, if the feeding status label is determined to indicate the abnormal feeding state (y=-1) and the variability label is determined to be the high-variability label (v=+1), then the corresponding extension domain is determined to be the negative qualitative change domain. The feeding state is characterized by abnormal feeding in an unstable state, where the suspected abnormality is not yet confirmed. The corresponding hierarchical anti-theft strategy is trend tracking and minor intervention: a local silent alarm with a prompt on the security terminal, an on-site voice reminder to feed according to the specification, suspension of the feeding process, and continuous monitoring of the dominant factors. The strategy for the dominant factors is: guide personnel to correct the behaviors related to the dominant factors; if the trend does not reverse within 10 seconds, escalate to forced handling.
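The four embodiments above amount to a lookup from the label pair (y, v) to a domain and its strategy. A minimal sketch, with hypothetical enum and dictionary names and the strategies abbreviated to their headline actions:

```python
from enum import Enum


class ExtensionDomain(Enum):
    POSITIVE_QUANTITATIVE = 1   # y=+1, v=-1: normal and stable
    POSITIVE_QUALITATIVE = 2    # y=+1, v=+1: normal but prone to turning abnormal
    NEGATIVE_QUANTITATIVE = 3   # y=-1, v=-1: clearly abnormal
    NEGATIVE_QUALITATIVE = 4    # y=-1, v=+1: suspected abnormal, not yet confirmed


STRATEGY = {
    ExtensionDomain.POSITIVE_QUANTITATIVE: "maintain monitoring",
    ExtensionDomain.POSITIVE_QUALITATIVE: "early warning and enhanced monitoring",
    ExtensionDomain.NEGATIVE_QUANTITATIVE: "immediate alarm and forced handling",
    ExtensionDomain.NEGATIVE_QUALITATIVE: "trend tracking and minor intervention",
}


def extension_domain(y: int, v: int) -> ExtensionDomain:
    """Map the predicted feeding status label y and variability label v to a quadrant."""
    if y not in (1, -1) or v not in (1, -1):
        raise ValueError("labels must be +1 or -1")
    if y == 1:
        return ExtensionDomain.POSITIVE_QUALITATIVE if v == 1 else ExtensionDomain.POSITIVE_QUANTITATIVE
    return ExtensionDomain.NEGATIVE_QUALITATIVE if v == 1 else ExtensionDomain.NEGATIVE_QUANTITATIVE
```

For instance, `STRATEGY[extension_domain(1, 1)]` selects early warning and enhanced monitoring, the case in which the feeding is still normal but is predicted to be drifting.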
[0185] In some embodiments of this invention, in step S500, the process of determining the hierarchical anti-theft strategy includes the following steps: S530, determine the corresponding hierarchical anti-theft strategy according to the corresponding extension domain.
[0186] S540, execute the corresponding hierarchical anti-theft strategy, obtain the material feeding visual data within a set time period, analyze the trend of change based on the material feeding visual data, adjust the dual-label dataset based on the trend, update the multi-task fusion model, and optimize the hierarchical anti-theft strategy.
[0187] In some embodiments of S530, a differentiated hierarchical anti-theft strategy is implemented for each extension domain.
[0188] The extension domains include: the negative quantitative change domain, the negative qualitative change domain, the positive qualitative change domain, and the positive quantitative change domain.
[0189] In one embodiment, when the extension domain is the positive quantitative change domain, the feeding state is characterized by normal feeding, a stable state, and no abnormal trend. The corresponding hierarchical anti-theft strategy is to maintain monitoring: relax the detection threshold, reduce computational overhead, and retain normal feeding data. The strategy for the dominant factors is: no intervention required, continuous data collection.
[0190] In one embodiment, when the extension domain is the positive qualitative change domain, the feeding state is characterized by normal feeding in an unstable state that is prone to turning abnormal. The corresponding hierarchical anti-theft strategy is early warning and enhanced monitoring: local indicator lights flash, the detection interval is shortened to one detection every 100 ms, access to the feeding area is locked, and changes in the dominant factors are tracked. The strategy for the dominant factors is: focus monitoring on the dominant factors, and upgrade the strategy if the drift toward an abnormal state continues.
[0191] In one embodiment, when the extension domain is the negative quantitative change domain, the feeding state is characterized by abnormal feeding in a stable, clearly defined abnormal state. The corresponding hierarchical anti-theft strategy is immediate alarm and forced handling: push a remote notification to the security terminal, lock the feeding port and / or hopper, and capture video of the abnormal behavior to preserve evidence. The strategy for the dominant factors is: freeze the relevant parameters of the dominant factors, preserve evidence, and coordinate with security personnel for on-site intervention.
[0192] In one embodiment, when the extension domain is the negative qualitative change domain, the feeding state is characterized by abnormal feeding in an unstable state, where the suspected abnormality is not yet confirmed. The corresponding hierarchical anti-theft strategy is trend tracking and minor intervention: a local silent alarm with a prompt on the security terminal, an on-site voice reminder to feed according to the specification, suspension of the feeding process, and continuous monitoring of the dominant factors. The strategy for the dominant factors is: guide personnel to correct the behaviors related to the dominant factors; if the trend does not reverse within 10 seconds, escalate to forced handling.
[0193] In some embodiments of S540, the corresponding hierarchical anti-theft strategy is determined by the corresponding extension domain.
[0194] After the anti-theft strategy is executed, material feeding visual data is collected continuously for 5 seconds to analyze the trend of the feeding state: if the trend changes to normal, the multi-task fusion model is fine-tuned within the variable security threshold; if the trend changes to abnormal, the sample is marked as a high-risk sample and added to the dual-label dataset; if the state does not change, the original parameters are retained and monitoring continues. In addition, staff intervention operations (such as false alarm cancellation, missed alarm supplementation, manual alarm, and process update) are recorded, the operation details are associated with the corresponding material feeding visual data as samples to be corrected, and the dual-label dataset is corrected according to preset rules (for example, correcting v to -1 for false-alarm samples in the positive qualitative change domain). The corrected high-confidence samples are added to the dual-label dataset. The multi-task fusion model is retrained monthly on the updated dataset, the parameters of the old model are overwritten by those of the retrained model, and the model is redeployed to the edge to continuously improve its generalization. At the same time, the judgment criteria of the hierarchical anti-theft strategy are optimized according to process updates so as to adapt to new material feeding process requirements, achieving iterative optimization of the multi-task fusion model.
[0195] Through S530 and S540, differentiated hierarchical anti-theft strategies are implemented based on the dual-label four-quadrant extension domain classification. At the same time, the system calculates the dominant variable factors for precise intervention, and links audible and visual alarms, security terminals, and material feeding equipment to complete hierarchical handling. A dual closed loop of automatic algorithm feedback and manual operation feedback is formed, in which the dual labels are corrected, high-confidence samples are added, and the model is periodically iterated and optimized. The model can adapt to updates in material feeding processes, new abnormal feeding methods, and adjustments to camera / sensor installation positions without re-collecting large amounts of training data, which significantly reduces industrial deployment and maintenance costs.
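The two feedback paths in this paragraph can be sketched as small decision functions. The trend names, action names, and operation string are hypothetical labels chosen for illustration; only the one correction rule that the paragraph gives as an example is encoded.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Sample:
    features: Tuple[float, ...]
    y: int  # feeding status label, +1 normal / -1 abnormal
    v: int  # variability label, +1 high / -1 low


def post_strategy_action(trend: str) -> str:
    """Automatic feedback after the 5-second observation window."""
    actions = {
        "to_normal": "fine_tune_model",
        "to_abnormal": "add_high_risk_sample",
        "unchanged": "keep_parameters",
    }
    if trend not in actions:
        raise ValueError(f"unknown trend: {trend}")
    return actions[trend]


def apply_manual_correction(sample: Sample, operation: str) -> Sample:
    """Manual feedback: a cancelled false alarm on a (y=+1, v=+1) sample resets v to -1."""
    if operation == "false_alarm_cancelled" and sample.y == 1 and sample.v == 1:
        return replace(sample, v=-1)
    return sample  # other operations are left to rules not detailed in the text
```

Keeping samples immutable and returning a corrected copy leaves the original record available for audit, which fits the evidence-preservation emphasis of the strategies.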
[0196] Referring to Figure 3, this application also provides a precious metal feeding anti-theft system based on a dual-tag set, which can implement the above-mentioned precious metal feeding anti-theft method based on a dual-tag set. The system includes: a vision processing module, used to acquire visual data of feeding behavior, construct a fused spatiotemporal graph based on the feeding visual data, and output a graph structure feature vector through the fused spatiotemporal graph; a variability determination module, used to calculate the three-dimensional scores of posterior uncertainty, boundary proximity, and sparsity for each feeding behavior based on the graph structure feature vector, cluster the three-dimensional scores, and assign labels to high-score clusters and low-score clusters to obtain variability labels; a feeding status determination module, used to identify the abnormal feeding state and the normal feeding state based on the graph structure feature vector, and assign labels to the abnormal feeding state and the normal feeding state to obtain feeding status labels; a dual-label set construction module, used to combine the graph structure feature vectors, the corresponding feeding status labels, and the variability labels to obtain a dual-label dataset; and an extension domain decision module, used to input the dual-label dataset into the trained multi-task fusion model, output the predicted values of the feeding status label and the variability label, determine the corresponding extension domain based on the predicted values, and determine the corresponding hierarchical anti-theft strategy based on the corresponding extension domain.
[0197] It is understood that the content of the above method embodiments is applicable to this system embodiment. The specific functions implemented in this system embodiment are the same as those in the above method embodiments, and the beneficial effects achieved are also the same as those achieved in the above method embodiments.
[0198] This application also provides an electronic device, which includes a memory and a processor. The memory stores a computer program, and the processor executes the computer program to implement the aforementioned anti-theft method for precious metal feeding based on a dual-tag set. This electronic device can be any smart terminal, including tablet computers, in-vehicle computers, etc.
[0199] It is understood that the content of the above method embodiments is applicable to this device embodiment. The specific functions implemented by this device embodiment are the same as those of the above method embodiments, and the beneficial effects achieved are also the same as those achieved by the above method embodiments.
[0200] Referring to Figure 4, which illustrates the hardware structure of an electronic device according to another embodiment, the electronic device includes the following components. The processor 901 can be implemented using a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute relevant programs to implement the technical solutions provided in the embodiments of this application. The memory 902 can be implemented as a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 902 can store the operating system and other application programs. When the technical solutions provided in the embodiments of this specification are implemented through software or firmware, the relevant program code is stored in the memory 902 and is called and executed by the processor 901 to implement the precious metal feeding anti-theft method based on a dual-tag set according to the embodiments of this application. The input / output interface 903 is used to implement information input and output. The communication interface 904 is used to enable communication and interaction between this device and other devices; communication can be achieved through wired means (such as USB or Ethernet cable) or wireless means (such as mobile network, Wi-Fi, or Bluetooth). The bus 905 transmits information between the components of the device (e.g., the processor 901, the memory 902, the input / output interface 903, and the communication interface 904). The processor 901, the memory 902, the input / output interface 903, and the communication interface 904 are connected to each other within the device via the bus 905.
[0201] This application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described precious metal feeding anti-theft method based on a dual-tag set.
[0202] It is understood that the content of the above method embodiments is applicable to this storage medium embodiment. The specific functions implemented in this storage medium embodiment are the same as those in the above method embodiments, and the beneficial effects achieved are also the same as those achieved in the above method embodiments.
[0203] Memory, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs and non-transitory computer-executable programs. Furthermore, memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory may optionally include memory remotely located relative to the processor, and these remote memories can be connected to the processor via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
[0204] The embodiments described in this application are for the purpose of more clearly illustrating the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided by the embodiments of this application. As those skilled in the art will know, with the evolution of technology and the emergence of new application scenarios, the technical solutions provided by the embodiments of this application are also applicable to similar technical problems.
[0205] Those skilled in the art will understand that the technical solutions shown in the figures do not constitute a limitation on the embodiments of this application, and may include more or fewer steps than shown, or combine certain steps, or different steps.
[0206] Those skilled in the art will understand that all or some of the steps in the methods disclosed above, as well as the functional modules / units in the systems and devices, can be implemented as software, firmware, hardware, or suitable combinations thereof.
[0207] Furthermore, the terms “comprising” and “having”, and any variations thereof, are intended to cover non-exclusive inclusion, such that a process, method, system, product, or apparatus that includes a series of steps or units is not necessarily limited to those steps or units that are explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to such process, method, product, or apparatus.
[0208] It should be understood that in this application, "at least one (item)" means one or more, and "a plurality of" means two or more. "And / or" is used to describe the relationship between related objects, indicating that three relationships can exist. For example, "A and / or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one (item) of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one (item) of a, b, or c can represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.
[0209] The preferred embodiments of the present application have been described above with reference to the accompanying drawings, but this does not limit the scope of the claims of the present application. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and substance of the embodiments of the present application shall be within the scope of the claims of the present application.
Claims
1. A method for preventing theft of precious metals based on a dual-tag set, characterized in that, The method includes: Obtain visual data of feeding behavior, construct a fused spatiotemporal graph based on the feeding visual data, and output a graph structure feature vector through the fused spatiotemporal graph; Based on the graph structure feature vector, calculate the three-dimensional scores of posterior uncertainty, boundary proximity, and sparsity for each feeding behavior, cluster the three-dimensional scores, and assign labels to high-score clusters and low-score clusters to obtain variability labels; Based on the graph structure feature vector, abnormal feeding status and normal feeding status are identified, and labels are assigned to the abnormal feeding status and the normal feeding status to obtain feeding status labels; The graph structure feature vector, the corresponding feeding state label, and the variability label are combined to obtain a dual-label dataset; The dual-label dataset is input into the trained multi-task fusion model, which outputs the predicted values of the feeding status label and the variability label. Based on the predicted values, the corresponding extension domain is determined, and based on the corresponding extension domain, the corresponding hierarchical anti-theft strategy is determined.
2. The method according to claim 1, characterized in that, The calculation of the three-dimensional scores for posterior uncertainty, boundary proximity, and sparsity for each feeding action includes: Based on the graph structure feature vector, the score of the posterior uncertainty is calculated using a Gaussian mixture model and a Bayesian posterior probability model. Based on the graph structure feature vector, calculate the absolute difference of the log-class conditional density of the two Gaussian mixture models, and calculate the boundary proximity score based on the absolute difference. The sparsity score is calculated based on the graph structure feature vector using the total joint probability density formula.
3. The method according to claim 1, characterized in that, The process of clustering the three-dimensional scores and assigning labels to high-score clusters and low-score clusters to obtain variability labels includes: The fitted three-dimensional scores are classified into high-score clusters and low-score clusters using the established clustering algorithm. Based on the three-dimensional scores, calculate the average weighted composite score of the high-score cluster and the low-score cluster for each feeding behavior; Based on the set score threshold and the average weighted composite score, low-variability labels and high-variability labels are assigned. The variability label includes the low-variability label and the high-variability label.
4. The method according to claim 1, characterized in that, The graph structure feature vector includes personnel action features, material status features, and equipment parameter features. The labels are assigned to the abnormal feeding status and the normal feeding status to obtain the feeding status labels, including: Using the established extraction model, based on the personnel action characteristics and the set confidence threshold, it is determined whether the feeder is feeding materials normally; Based on the material state characteristics, the equipment parameter characteristics, and the work order verification data, determine whether the input material matches the work order verification data and has the same weight. When it is determined that the feeder is feeding normally, and the material fed matches the data approved in the work order and has the same weight, the feeding behavior corresponding to the graph structure feature vector is marked as the normal feeding state.
5. The method according to claim 4, characterized in that, The abnormal feeding status includes material replacement. The label is assigned values for the abnormal feeding status and the normal feeding status to obtain the feeding status label, including: Based on the characteristics of the personnel's movements, the established target detection model is used to detect whether the hand area simultaneously contains target precious metal materials and non-target precious metal materials. Based on the material state characteristics, the target detection model is used to detect whether the feeding port area contains only the non-target precious metal material and not the target precious metal material. Based on the equipment parameter characteristics and the material state characteristics, check whether the weight and material type characteristics at the feeding port are consistent with the data approved in the work order. When the hand area contains both target precious metal material and non-target precious metal material, and the feeding port area contains only the non-target precious metal material and not the target precious metal material, and the weights of the feeding ports are consistent but the characteristics of the material types do not match, then the abnormal feeding state is considered to be a replacement material.
6. The method according to claim 4, characterized in that the abnormal feeding status includes insufficient material feeding, and assigning labels to the abnormal feeding status and the normal feeding status to obtain the feeding status labels includes: using a preset target detection model to detect, based on the personnel action features, whether the action trajectory of the feeder conforms to the set feeding specifications; using the target detection model to detect, based on the material state features, whether the material quantity is consistent with the work order verification data; checking, based on the equipment parameter features, whether the weight at the feeding port is consistent with the work order verification data; and, when the action trajectory of the feeder conforms to the set feeding specifications but neither the material quantity nor the weight at the feeding port is consistent with the work order verification data, determining that the abnormal feeding status is insufficient material feeding.
7. The method according to claim 1, characterized in that determining the corresponding extension domain based on the predicted values includes: determining, based on the predicted value of the feeding status label, that the feeding status label is the normal feeding status, and determining, based on the predicted value of the variability label, that the variability label is a low variability label; and determining, based on the normal feeding status and the low variability label, that the corresponding extension domain is the positive variable domain of the four-quadrant extension domain, so as to determine the hierarchical anti-theft strategy set for the positive variable domain; wherein the four-quadrant extension domain includes the negative quantitative domain, the negative qualitative domain, the positive qualitative domain, and the positive quantitative domain.
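For illustration only, the labeling rules of claims 4 to 6 and the domain selection of claim 7 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the `Observation` fields, the function names, the label strings, the default confidence threshold of 0.8, and the 0.5 decision threshold are all hypothetical, and the outputs of the extraction model and target detection model are assumed to have already been reduced to the values shown. Only the normal, low-variability case is mapped to a domain, because that is the only case claim 7 specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """Hypothetical per-behavior summary of model outputs (all field names assumed)."""
    action_confidence: float      # extraction-model confidence that feeding is normal
    trajectory_ok: bool           # action trajectory conforms to the feeding specification
    hand_has_target: bool         # target precious metal material seen in the hand area
    hand_has_non_target: bool     # non-target material seen in the hand area
    port_has_target: bool         # target precious metal material seen at the feeding port
    port_has_non_target: bool     # non-target material seen at the feeding port
    type_matches_order: bool      # material type agrees with the work order data
    quantity_matches_order: bool  # material quantity agrees with the work order data
    weight_matches_order: bool    # feeding-port weight agrees with the work order data


def feeding_status_label(obs: Observation, confidence_threshold: float = 0.8) -> str:
    """Apply the rules of claims 4 to 6 in order; the threshold value is assumed."""
    # Claim 4: normal action, matching material, and matching weight.
    if (obs.action_confidence >= confidence_threshold
            and obs.type_matches_order and obs.weight_matches_order):
        return "normal"
    # Claim 5: both materials in the hand, only non-target material at the port,
    # weight consistent with the work order but material type inconsistent.
    if (obs.hand_has_target and obs.hand_has_non_target
            and obs.port_has_non_target and not obs.port_has_target
            and obs.weight_matches_order and not obs.type_matches_order):
        return "material_replacement"
    # Claim 6: trajectory conforms, but quantity and weight both deviate.
    if (obs.trajectory_ok and not obs.quantity_matches_order
            and not obs.weight_matches_order):
        return "insufficient_feeding"
    # Combinations not covered by claims 4 to 6 are left unclassified here.
    return "unclassified"


def extension_domain(abnormal_prob: float, high_variability_prob: float,
                     threshold: float = 0.5) -> Optional[str]:
    """Claim 7: map the two predicted values to a domain.

    Both inputs are assumed to be probabilities in [0, 1]; the thresholding
    rule is an assumption of this sketch.
    """
    is_normal = abnormal_prob < threshold
    is_low_variability = high_variability_prob < threshold
    if is_normal and is_low_variability:
        return "positive variable domain"
    return None  # the other three quadrants are not specified in claim 7
```

In this sketch the rules are checked in a fixed order so that each behavior receives at most one feeding status label; the claims themselves do not prescribe an evaluation order.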
8. A precious metal feeding anti-theft system based on a dual-tag set, characterized in that the system includes: a vision processing module, configured to acquire feeding behavior visual data, construct a fused spatiotemporal graph based on the feeding behavior visual data, and output a graph structure feature vector through the fused spatiotemporal graph; a variability determination module, configured to calculate, based on the graph structure feature vector, a three-dimensional score of posterior uncertainty, boundary proximity, and sparsity for each feeding behavior, cluster the three-dimensional scores, and assign labels to the high-score clusters and the low-score clusters to obtain variability labels; a feeding status determination module, configured to identify the abnormal feeding status and the normal feeding status based on the graph structure feature vector, and assign labels to the abnormal feeding status and the normal feeding status to obtain feeding status labels; a dual-label set construction module, configured to combine the graph structure feature vector with the corresponding feeding status label and variability label to obtain a dual-label dataset; and an extension domain decision module, configured to input the dual-label dataset into a trained multi-task fusion model, output predicted values of the feeding status label and the variability label, determine the corresponding extension domain based on the predicted values, and determine the corresponding hierarchical anti-theft strategy based on that extension domain.
9. An electronic device, characterized in that the electronic device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to implement the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the method according to any one of claims 1 to 7 is implemented.
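For illustration only, the clustering step of the variability determination module in claim 8 can be sketched in Python as follows. The claims do not name a clustering algorithm, so this sketch substitutes a plain two-cluster k-means as an assumption. The function names, the deterministic initialization, and the rule that the cluster with the larger summed centroid is the high-score cluster are likewise hypothetical. The three-dimensional scores are taken as given inputs; how they are computed from the graph structure feature vector is not shown here.

```python
from typing import List, Sequence, Tuple

# (posterior uncertainty, boundary proximity, sparsity) for one feeding behavior
Score = Tuple[float, float, float]


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points: Sequence[Score]) -> Score:
    """Component-wise mean of a non-empty list of score vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def variability_labels(scores: Sequence[Score], max_iter: int = 100) -> List[int]:
    """Return 1 (high variability) or 0 (low variability) for each behavior."""
    if len(scores) < 2:
        raise ValueError("need at least two behaviors to form two clusters")
    # Assumed deterministic start: the points with the smallest and largest total score.
    centroids = [min(scores, key=sum), max(scores, key=sum)]
    assignment: List[int] = []
    for _ in range(max_iter):
        # Assign each behavior to its nearest centroid.
        new_assignment = [
            0 if _sq_dist(s, centroids[0]) <= _sq_dist(s, centroids[1]) else 1
            for s in scores
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the clustering has converged
        assignment = new_assignment
        # Recompute each centroid; an empty cluster keeps its previous centroid.
        for k in (0, 1):
            members = [s for s, a in zip(scores, assignment) if a == k]
            if members:
                centroids[k] = _mean(members)
    # The cluster whose centroid has the larger total score is treated as high-score.
    high = 0 if sum(centroids[0]) > sum(centroids[1]) else 1
    return [1 if a == high else 0 for a in assignment]
```

The resulting 0/1 values play the role of the variability labels that the dual-label set construction module combines with the feeding status labels.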