Biological detection methods, devices, detection terminals and storage media
By combining multimodal data processing of infrared thermal images and environmental images, the problem of inaccurate outdoor biometric identification under low power consumption is solved, and higher accuracy in biometric category identification is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- DONGGUAN ZKTECO ELECTRONICS TECH
- Filing Date
- 2025-06-10
- Publication Date
- 2026-06-30
AI Technical Summary
In outdoor biological detection under low power consumption requirements, due to the low sampling frequency of sensors and the fast movement speed of animals, existing technologies are unable to accurately identify animals using time-series models. The collected data consists of only one or two frames of valid images, resulting in inaccurate biometric identification.
Multimodal data processing is performed by combining infrared thermal images and environmental images. The candidate categories and location ranges of target organisms are determined by infrared thermal images, and features are extracted and classified by combining biological visual images. Multimodal verification is used to improve recognition accuracy.
By fusing multimodal data, the accuracy of identifying target biological categories was improved, and the biological detection effect was enhanced in low-power scenarios.
Figure CN120599709B_ABST
Description
Technical Field
[0001] This application relates to, but is not limited to, the field of information processing technology, and in particular to a biological category detection method, device, detection terminal, and storage medium.
Background Technology
[0002] Currently, outdoor biological detection solutions are constrained by low power consumption requirements: sensors sample at low frequencies to save power. Because animals also move quickly, often only one or two frames of valid image data are collected, which makes accurate identification with time-series models impossible. Therefore, how to accurately identify outdoor organisms by combining multimodal data under these low-power requirements and scenarios has become a problem to be solved.
Summary of the Invention
[0003] In view of this, embodiments of this application provide at least one biological category detection method, apparatus, detection terminal, and storage medium.
[0004] The technical solution of this application embodiment is implemented as follows:
[0005] In a first aspect, embodiments of this application provide a biological category detection method applied to a detection terminal. The method includes: determining a candidate category of a target organism in the environment based on an acquired infrared thermal image, and determining a first location range and a biological thermal distribution image of the target organism in the infrared thermal image; determining a biological visual image corresponding to the target organism in an acquired environmental image based on the first location range; the environmental image and the infrared thermal image carrying feature information of the target organism at the same time; determining a predicted category and a predicted thermal distribution image of the target organism based on the candidate category and the biological visual image; verifying the predicted thermal distribution image based on the biological thermal distribution image, and outputting the predicted category as the target category if the verification is successful.
[0006] In a second aspect, embodiments of this application provide a biological category detection device applied to the detection terminal. The device includes: a first determining module, configured to determine a candidate category of a target organism in the environment based on a collected infrared thermal image, and determine a first location range and a biological thermal distribution image of the target organism in the infrared thermal image; a second determining module, configured to determine a biological visual image corresponding to the target organism in the collected environmental image based on the first location range; the environmental image and the infrared thermal image carry feature information of the target organism at the same time; a third determining module, configured to determine a predicted category and a predicted thermal distribution image of the target organism based on the candidate category and the biological visual image; and a verification module, configured to verify the predicted thermal distribution image based on the biological thermal distribution image, and output the predicted category as the target category if the verification is successful.
[0007] In a third aspect, embodiments of this application provide a detection terminal, which includes a processor, a memory, a first acquisition unit, and a second acquisition unit. The memory stores a computer program that can run on the processor. The processor implements some or all of the steps in the above-described method when executing the computer program. The first acquisition unit is used to acquire infrared thermal images, and the second acquisition unit is used to acquire environmental images.
[0008] In a fourth aspect, embodiments of this application provide a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements some or all of the steps in the above-described method.
[0009] In a fifth aspect, embodiments of this application provide a computer program product, including a computer program or instructions, which, when executed by a processor, implement some or all of the steps in the above-described method.
[0010] Technical Effects: Based on the acquired infrared thermal image, the candidate category of the target organism in the environment can be determined, together with the first location range and the biological thermal distribution image of the target organism in the infrared thermal image. This allows for preliminary identification of the target organism in the environment based on the infrared thermal image. The biological visual image corresponding to the target organism is determined in the acquired environmental image based on the first location range. The environmental image and the infrared thermal image carry the characteristic information of the target organism at the same time. Based on the candidate category and the biological visual image, the predicted category and predicted thermal distribution image of the target organism are determined. The category of the target organism is thus predicted from the infrared thermal image and the environmental image together, which improves the accuracy of detecting target organism categories in the environment compared to existing technologies that use only a single modality for detection. The predicted thermal distribution image is verified based on the biological thermal distribution image, and if the verification is successful, the predicted category is output as the target category. This verification of the multimodal prediction result further improves the accuracy of the detected target organism category.
[0011] It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and are not intended to limit the technical solutions of this application.
Attached Figure Description
[0012] The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with this application and, together with the specification, serve to explain the technical solutions of this application.
[0013] Figure 1 is a schematic diagram illustrating the implementation process of a biological category detection method provided in an embodiment of this application;
[0014] Figure 2 is a schematic diagram illustrating the implementation process of a biological category detection method provided in an embodiment of this application;
[0015] Figure 3 is a schematic diagram illustrating the implementation process of a biological category detection method provided in an embodiment of this application;
[0016] Figure 4 is a schematic diagram illustrating the implementation process of a biological category detection method provided in an embodiment of this application;
[0017] Figure 5 is a schematic diagram illustrating the implementation process of a biological category detection method provided in an embodiment of this application;
[0018] Figure 6 is a schematic diagram illustrating the implementation process of a biological category detection method provided in an embodiment of this application;
[0019] Figure 7 is an implementation framework diagram of biological category detection provided in this application;
[0020] Figure 8 is a schematic diagram of the composition structure of a biological category detection device provided in an embodiment of this application;
[0021] Figure 9 is a schematic diagram of the hardware entity of a detection terminal provided in an embodiment of this application.
Detailed Implementation
[0022] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application are further described in detail below with reference to the accompanying drawings and embodiments. The described embodiments should not be regarded as limitations on this application. All other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0023] In the following description, references to "some embodiments" are made, which describe a subset of all possible embodiments. However, it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments and may be combined with each other without conflict. The terms "first / second / third" are used merely to distinguish similar objects and do not represent a specific ordering of those objects. It is understood that "first / second / third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of this application described herein can be implemented in an order other than that illustrated or described herein.
[0024] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains. The terminology used herein is for descriptive purposes only and is not intended to be limiting of this application.
[0025] Currently, in outdoor biological detection solutions, due to the requirement for low power consumption, the sensor sampling frequency is low, such as 5 images per second or even lower, in order to reduce power consumption. In addition, animals move quickly, making it impossible to accurately identify them using time-series models. The collected data consists of only one or two frames of valid images. Therefore, under the requirements of low power consumption and the above scenarios, how to combine multimodal data to accurately identify outdoor organisms has become a problem to be solved.
[0026] This application provides a biological category detection method, which can be executed by the processor of a detection terminal. The detection terminal can be a device with data processing capabilities, such as a server, laptop, tablet, desktop computer, smart TV, set-top box, or mobile device.
[0027] Figure 1 is a schematic diagram illustrating the implementation flow of a biological category detection method provided in an embodiment of this application. This method can be executed by the processor of a detection terminal. As shown in Figure 1, the method includes the following steps S101 to S104, which are explained below with reference to the steps shown in Figure 1.
[0028] Step S101: Based on the acquired infrared thermal image, determine the candidate category of the target organism in the environment, and determine the first location range and biological thermal distribution image of the target organism in the infrared thermal image.
[0029] In some embodiments, the detection terminal is provided with at least one acquisition unit, which may include an infrared thermal image acquisition unit (e.g., an infrared thermal imager), an infrared light image acquisition unit (e.g., a near-infrared camera), or a visible light image acquisition unit (e.g., a camera, a visible light camera).
[0030] In some embodiments, an infrared thermal image acquisition unit disposed on the detection terminal acquires infrared thermal images of the environment surrounding the detection terminal. The infrared thermal image is a visualized image generated based on the infrared light emitted by biological radiation acquired by the infrared thermal image acquisition unit, wherein the infrared thermal image can reflect the temperature range distribution of various organisms in the surrounding environment.
[0031] In some embodiments, after acquiring an infrared thermal image of the environment surrounding the detection terminal through the infrared thermal image acquisition unit, the infrared thermal image is preprocessed, including performing non-uniformity correction on the infrared thermal image to eliminate fixed pattern noise of the infrared thermal image acquisition unit; and performing temperature mapping on the infrared thermal image to convert the temperature signal represented by the infrared thermal image into an actual temperature value.
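The two preprocessing operations described above can be sketched as follows. This is a minimal illustration only: it assumes a per-pixel two-point correction (gain and offset tables) and a linear count-to-temperature mapping, neither of which is specified in this application. The names `non_uniformity_correct`, `map_to_celsius`, `scale`, and `bias`, and all numeric values, are hypothetical.

```python
def non_uniformity_correct(raw, gain, offset):
    """Two-point correction per pixel: gain * (raw - offset).

    Removes fixed-pattern noise, assuming per-pixel gain/offset tables
    were measured beforehand (an assumption, not stated in the source).
    """
    return [
        [g * (r - o) for r, g, o in zip(raw_row, gain_row, offset_row)]
        for raw_row, gain_row, offset_row in zip(raw, gain, offset)
    ]


def map_to_celsius(counts, scale, bias):
    """Assumed linear temperature mapping: T = scale * counts + bias."""
    return [[scale * c + bias for c in row] for row in counts]


# Hypothetical 2x2 sensor frame; two pixels carry a fixed-pattern offset.
raw = [[1000, 1010], [990, 1200]]
gain = [[1.0, 1.0], [1.0, 1.0]]
offset = [[0, 10], [-10, 0]]

corrected = non_uniformity_correct(raw, gain, offset)
temps_c = map_to_celsius(corrected, scale=0.05, bias=-25.0)
```

After correction the three background pixels agree with each other, and only the genuinely warmer pixel stands out in `temps_c`.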
[0032] In some embodiments, the target organism may be an animal, a plant, a biomimetic animal, a biomimetic plant, etc.
[0033] In some embodiments, the target organism may include at least one organism, and the candidate category may include the category of at least one organism.
[0034] In some embodiments, candidate categories are matched from a biological database based on the temperature range distribution in a preprocessed infrared thermal image; wherein the biological database includes temperature ranges for multiple categories of organisms.
[0035] In some embodiments, feature extraction can be performed on the preprocessed infrared thermal image to obtain the thermodynamic and morphological features of the target organism. The thermodynamic features characterize the temperature range distribution of the target organism, and the morphological features characterize the contour features of the target organism (e.g., number of limbs, aspect ratio, etc.). The temperature range of each part of the target organism is determined based on the contour features and temperature range distribution of the target organism, and the candidate categories of the target organism are matched from the biological database based on the temperature range of each part of the target organism.
[0036] In some embodiments, at least one candidate category of an organism is matched from a biological database based on the temperature range distribution of the target organism and the ambient temperature. It is understood that the candidate categories may include a first candidate category and a second candidate category; the first candidate category of the target organism is determined based on the difference between the average temperature of the target organism and the ambient temperature, and the second candidate category of the target organism is matched from the biological database based on the temperature range distribution of the target organism.
[0037] If the difference is greater than or equal to the preset temperature, the first candidate category for the target organism is a homeothermic animal; if the difference is less than the preset temperature, the first candidate category for the target organism is a poikilothermic animal.
[0038] For example, if the average temperature of the target organism is 35°C and the ambient temperature is 25°C, the difference is 10°C. The preset temperature is 10°C, so the difference is greater than or equal to the preset temperature, and the first candidate category of the target organism is determined to be a homeothermic animal. The morphological characteristics of the target organism include four limbs and a tail, and the temperature of the limbs and tail is lower than that of the trunk. Therefore, the second candidate categories matched from the biological database for the target organism include dogs, cats, foxes, etc.
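The two-stage candidate matching in paragraphs [0036] to [0038] might look like the sketch below. The database layout (`BIO_DB`), the part names, and every temperature range are invented for illustration and are not taken from this application or from real physiological data; only the threshold comparison follows the text directly.

```python
def first_candidate_category(avg_temp_c, ambient_temp_c, preset_diff_c=10.0):
    """Homeothermic if the organism is at least preset_diff_c warmer than ambient."""
    diff = avg_temp_c - ambient_temp_c
    return "homeothermic" if diff >= preset_diff_c else "poikilothermic"


# Hypothetical biological database: category -> part -> (min, max) in Celsius.
BIO_DB = {
    "dog": {"trunk": (30.0, 39.0), "limbs": (24.0, 34.0), "tail": (22.0, 33.0)},
    "snake": {"trunk": (15.0, 30.0)},
}


def second_candidate_categories(part_temps_c, database):
    """Return categories whose parts and per-part ranges match the measurement."""
    matches = []
    for category, ranges in database.items():
        # The set of detected parts stands in for the morphological features.
        if set(part_temps_c) != set(ranges):
            continue
        if all(ranges[p][0] <= t <= ranges[p][1] for p, t in part_temps_c.items()):
            matches.append(category)
    return matches


first = first_candidate_category(avg_temp_c=35.0, ambient_temp_c=25.0)
second = second_candidate_categories(
    {"trunk": 36.0, "limbs": 30.0, "tail": 28.0}, BIO_DB
)
```

With an average of 35°C against an ambient 25°C, the difference equals the 10°C preset, so the first candidate category is homeothermic.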
[0039] In some embodiments, if the target organism includes at least one organism, then the first location range includes the location range corresponding to each organism, and the organism thermal distribution image includes the thermal distribution image of each organism.
[0040] In some embodiments, the coordinates of the upper left and lower right corners of the target organism are obtained from an infrared thermal image based on its outline. A rectangular bounding box is then generated based on these coordinates and used as the location range of the target organism. This method of identifying the location range of the target organism using a rectangular bounding box, compared to using a mask image in related technologies, requires less computational power and improves the generation rate in low-power outdoor scenarios.
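As a rough sketch of the bounding-box step, the function below derives the upper-left and lower-right corners from a binary outline mask. The mask representation and the `(x_min, y_min, x_max, y_max)` tuple layout are assumptions made for this example.

```python
def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of nonzero pixels, or None if empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return (min(cols), min(rows), max(cols), max(rows))


# Hypothetical outline mask of a target organism in the infrared thermal image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(mask)
```

Only four integers describe the location range, which is where the saving over a per-pixel mask image comes from.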
[0041] In some embodiments, based on the temperature range distribution of an infrared thermal image, the temperature range distribution of a target organism in a first location range is determined, and at least two temperature regions are generated based on the temperature range distribution of the target organism. Each temperature region corresponds to a different temperature range, and each temperature region is marked with a different marking state to obtain a thermal distribution image corresponding to the target organism.
[0042] For example, at least two temperature zones include a low temperature zone, a medium temperature zone, and a high temperature zone, with the low temperature zone marked in blue, the medium temperature zone marked in green, and the high temperature zone marked in red.
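The zone marking described in paragraphs [0041] and [0042] could be sketched as below. The two thresholds (`low_max_c`, `high_min_c`) are hypothetical; this application does not state how the zone boundaries are chosen.

```python
ZONE_COLORS = {"low": "blue", "medium": "green", "high": "red"}


def zone_of(temp_c, low_max_c, high_min_c):
    """Classify one temperature into a low, medium, or high zone."""
    if temp_c < low_max_c:
        return "low"
    if temp_c >= high_min_c:
        return "high"
    return "medium"


def thermal_distribution_image(temps_c, box, low_max_c, high_min_c):
    """Mark every pixel inside box (x0, y0, x1, y1) with its zone color."""
    x0, y0, x1, y1 = box
    return [
        [ZONE_COLORS[zone_of(temps_c[y][x], low_max_c, high_min_c)]
         for x in range(x0, x1 + 1)]
        for y in range(y0, y1 + 1)
    ]


# Hypothetical temperature map (Celsius) with the organism in the center.
temps_c = [
    [25.0, 25.0, 25.0, 25.0],
    [25.0, 29.0, 33.0, 25.0],
    [25.0, 36.5, 37.0, 25.0],
]
marked = thermal_distribution_image(
    temps_c, box=(1, 1, 2, 2), low_max_c=30.0, high_min_c=35.0
)
```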
[0043] Step S102: Determine the biological visual image corresponding to the target organism in the acquired environmental image based on the first location range.
[0044] The environmental image and the infrared thermal image carry the characteristic information of the target organism at the same time.
[0045] In some embodiments, the environmental image may include visible light image and infrared light image, which are acquired based on at least one acquisition unit disposed on the detection terminal.
[0046] The environmental images and infrared thermal images are acquired simultaneously by multiple acquisition units. That is to say, the multiple acquisition units capture the same location in the environment surrounding the detection terminal at the same time; the coordinate system transformation relationship between the multiple acquisition units is pre-calibrated.
[0047] The biological visual image may include both visible light visual images and infrared light visual images of the target organism.
[0048] In some embodiments, based on the pre-calibrated transformation relationship between the coordinate systems of the acquisition units, the area of the environmental image that corresponds to the first location range in the infrared thermal image is located, and the environmental image is cropped to that area to obtain the biological visual image of the target organism.
[0049] Specifically, based on the coordinate transformation relationship between the acquisition unit corresponding to the infrared thermal image and the acquisition unit corresponding to the infrared light image, the same area as the first position range is located from the infrared light image, and the infrared light image is cropped to obtain the infrared light visual image of the target organism.
[0050] Specifically, based on the coordinate transformation relationship between the acquisition unit corresponding to the infrared thermal image and the acquisition unit corresponding to the visible light image, the same area as the first position range is located from the visible light image, and the visible light image is cropped to obtain the visible light visual image of the target organism.
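The locate-and-crop step can be illustrated with the sketch below. It assumes the pre-calibrated coordinate transformation is available as a 3x3 homography matrix; this application only says the relationship is pre-calibrated, so the matrix form, the name `THERMAL_TO_VISIBLE`, and its values are assumptions.

```python
def map_point(h, x, y):
    """Apply a 3x3 homography to one pixel coordinate."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


def map_box(h, box):
    """Map a box (x0, y0, x1, y1) and return its axis-aligned hull."""
    x0, y0, x1, y1 = box
    corners = [map_point(h, x, y) for x in (x0, x1) for y in (y0, y1)]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    return (round(min(xs)), round(min(ys)), round(max(xs)), round(max(ys)))


def crop(image, box):
    """Crop an image (list of rows) to the box, clamped to the image bounds."""
    x0, y0, x1, y1 = box
    height, width = len(image), len(image[0])
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, width - 1), min(y1, height - 1)
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]


# Hypothetical calibration: the visible camera has 2x the resolution and
# is shifted by (10, 5) pixels relative to the thermal imager.
THERMAL_TO_VISIBLE = [[2.0, 0.0, 10.0], [0.0, 2.0, 5.0], [0.0, 0.0, 1.0]]

# Synthetic 20x30 visible-light image whose pixel value encodes its position.
visible = [[100 * y + x for x in range(30)] for y in range(20)]

visible_box = map_box(THERMAL_TO_VISIBLE, (1, 1, 3, 2))
visual_image = crop(visible, visible_box)
```

The same two calls would be repeated with a second matrix for the infrared light image.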
[0051] Step S103: Based on the candidate categories and the biological visual image, determine the predicted category and predicted thermal distribution image of the target organism.
[0052] In some embodiments, features are first extracted from the biological visual image to obtain the texture features, contour features, and thermal distribution features of the target organism. Based on the correspondence between these features and biological categories, the category of the target organism is selected from the candidate categories to obtain the predicted category.
[0053] In some embodiments, the standard temperature range distribution of organisms of the predicted category is obtained from the biological database; the standard temperature range distribution characterizes the temperature distribution of each body part of organisms in the predicted category. The contours of each part of the target organism in the biological visual image are obtained, and the temperature range corresponding to each part is matched based on the standard temperature range distribution. A temperature value is then assigned to each pixel within each part contour of the biological visual image, thereby obtaining a predicted thermal distribution image of the target organism; the category of the target organism in the environment is determined based on the predicted thermal distribution image.
[0054] Specifically, based on the temperature range of each part in the standard temperature range distribution, the average temperature value of each part is determined, and each pixel in the contour of each part of the biological visual image is filled with the corresponding average temperature value, thereby obtaining the predicted thermal distribution image of the target organism.
[0055] For example, the predicted category includes dogs, and the standard temperature range distribution of dogs includes the temperature range of the limbs, the temperature range of the head (which can be further subdivided into the ears, face, etc.), the temperature range of the trunk (which can be further subdivided into the back, chest, etc.), and the temperature range of the tail, and determines the average temperature value of each part; in the biological visual image, the average temperature value corresponding to each part is matched, and the average temperature value is used to assign a temperature value to the pixel of each part, thereby obtaining the predicted thermal distribution image of the dog.
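The filling procedure in paragraphs [0054] and [0055] might be sketched as follows. The part-label grid stands in for the part contours extracted from the biological visual image, and the ranges in `STANDARD_RANGES_C` are invented placeholders rather than real data for dogs.

```python
# Hypothetical standard temperature ranges (Celsius) for the predicted category.
STANDARD_RANGES_C = {
    "trunk": (36.0, 39.0),
    "limbs": (28.0, 34.0),
    "tail": (26.0, 32.0),
}


def predicted_thermal_image(part_labels, standard_ranges_c):
    """Fill each labeled pixel with the mean of its part's standard range.

    part_labels is a grid of part names; None marks background pixels,
    which stay None in the output.
    """
    means = {
        part: (low + high) / 2.0
        for part, (low, high) in standard_ranges_c.items()
    }
    return [[means.get(part) for part in row] for row in part_labels]


# Hypothetical part segmentation of a cropped biological visual image.
part_labels = [
    [None, "trunk", "trunk", "tail"],
    ["limbs", None, "limbs", None],
]
predicted = predicted_thermal_image(part_labels, STANDARD_RANGES_C)
```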
[0056] Step S104: Verify the predicted thermal distribution image based on the biological thermal distribution image, and output the predicted category as the target category if the verification is successful.
[0057] In some embodiments, feature regions are extracted from the biological thermal distribution image and the predicted thermal distribution image to obtain a feature vector for each image. The two feature vectors are then compared to obtain their similarity, and the predicted thermal distribution image is verified based on that similarity.
[0058] Specifically, if the similarity between the feature vector of the biological thermal distribution image and the feature vector of the predicted thermal distribution image is greater than or equal to a preset similarity, the predicted thermal distribution image is considered to have passed verification, and the predicted category is output as the target category.
[0059] Specifically, if the similarity between the feature vector of the biological thermal distribution image and the feature vector of the predicted thermal distribution image is less than the preset similarity, the verification of the predicted thermal distribution image fails, the predicted category is considered inaccurate, and no predicted category is output.
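A minimal sketch of the threshold check in step S104 follows. This application does not name the similarity measure or the feature extractor, so cosine similarity, the 0.9 preset, and the three-element feature vectors (per-part temperatures with the mean removed) are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(measured_vec, predicted_vec, predicted_category, preset_similarity=0.9):
    """Return the predicted category if verification passes, else None."""
    similarity = cosine_similarity(measured_vec, predicted_vec)
    return predicted_category if similarity >= preset_similarity else None


# Hypothetical mean-centered feature vectors (trunk, limbs, tail).
measured = [1.0, -0.5, -0.5]
target = verify(measured, [0.9, -0.4, -0.5], "dog")
rejected = verify(measured, [-0.5, 1.0, -0.5], "dog")
```

Centering the vectors matters for this particular metric: raw temperatures are all positive, so their cosine similarity would sit close to 1 regardless of category.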
[0060] In some embodiments, if the predicted thermal distribution image passes verification, the infrared thermal image, environmental image, biological visual image, and predicted thermal distribution image are packaged and sent to the cloud.
[0061] The detection terminal is also equipped with a voice interaction unit. From the cloud, users can control the voice interaction unit of the detection terminal to send voice information (such as warning information) to target organisms in the surrounding environment, and can obtain the sounds emitted by target organisms through the voice interaction unit.
[0062] In this embodiment, based on the acquired infrared thermal image, the candidate category of a target organism in the environment can be determined, together with the first location range and the biological thermal distribution image of the target organism in the infrared thermal image. This allows for preliminary identification of the target organism in the environment based on the infrared thermal image. The biological visual image corresponding to the target organism is determined in the acquired environmental image according to the first location range. The environmental image and the infrared thermal image carry the characteristic information of the target organism at the same time. Based on the candidate category and the biological visual image, the predicted category and predicted thermal distribution image of the target organism are determined. The category of the target organism is thus predicted from the infrared thermal image and the environmental image together, which improves the accuracy of detecting the target organism's category in the environment compared to existing schemes that use only a single modality for detection. The predicted thermal distribution image is verified based on the biological thermal distribution image, and if the verification is successful, the predicted category is output as the target category. This verification of the multimodal prediction result further improves the accuracy of the detected target organism's category.
[0063] Figure 2 is a schematic diagram illustrating the implementation flow of a biological category detection method provided in an embodiment of this application. This method can be executed by the processor of a detection terminal. Based on Figure 1, step S103 in Figure 1 can be updated to steps S201 to S203, which are explained below with reference to the steps shown in Figure 2.
[0064] Step S201: Extract features from the biological visual image to obtain the biological visual feature map corresponding to the biological visual image.
[0065] In some embodiments, the biological visual image can be feature extracted based on image processing algorithms or deep learning algorithms to obtain the biological visual feature map.
[0066] In some embodiments, the biological visual image includes at least one of the following: a visible light image and an infrared light image, wherein the visible light image is used to extract texture features, color features, shape features, etc. of the target organism, and the infrared light image is used to extract temperature distribution, hot spot location, thermal boundary, etc. of the target organism.
[0067] For example, for a visible light image, a color feature extraction algorithm is used to statistically analyze the distribution frequency of RGB three-channel pixel values in the visible light image to obtain color features; a texture feature extraction algorithm is used to statistically analyze the gray-level joint distribution of pixel pairs in the visible light image under specific directions (e.g., 0°, 45°, 90°, 135°) and distances to generate texture features such as contrast, energy, entropy, and correlation; and the Canny edge detection algorithm is used to extract contour features and shape features in the visible light image.
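The gray-level joint distribution of pixel pairs described above is commonly computed as a gray-level co-occurrence matrix (GLCM). The sketch below is a small pure-Python illustration for one direction and distance; it assumes the image is already quantized to a few integer gray levels and leaves out the correlation feature for brevity. The function names are hypothetical.

```python
import math


def glcm(image, dx, dy, levels):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    total = 0
    for y in range(height):
        for x in range(width):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < width and 0 <= y2 < height:
                counts[image[y][x]][image[y2][x2]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Contrast, energy, and entropy of a normalized co-occurrence matrix."""
    contrast = sum(
        (i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row)
    )
    energy = sum(v * v for row in p for v in row)
    entropy = -sum(v * math.log(v) for row in p for v in row if v > 0)
    return {"contrast": contrast, "energy": energy, "entropy": entropy}


# Hypothetical 3x3 grayscale patch quantized to 3 levels; the offset (1, 0)
# corresponds to the 0 degree direction at distance 1.
patch = [
    [0, 0, 1],
    [0, 0, 1],
    [2, 2, 2],
]
features = texture_features(glcm(patch, dx=1, dy=0, levels=3))
```

Repeating the call with offsets such as (1, -1), (0, -1), and (-1, -1) would cover the 45°, 90°, and 135° directions, assuming image rows grow downward.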
[0068] For example, for infrared light images, the temperature of the target organism in the image is extracted based on a thermal statistical algorithm, and the average temperature, temperature variance, and temperature extreme values are calculated to obtain temperature features. Otsu's thresholding method is used to segment the high-temperature region in the image to obtain the location and area of hot spots. The edges at abrupt temperature changes (such as the thermal boundary between the animal and the environment) are extracted based on the Sobel operator or the Laplacian operator to obtain thermal boundary features.
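To make the hot-spot step concrete, here is a small pure-Python sketch of Otsu's method followed by an area and centroid computation. It assumes the infrared image has been quantized to 8-bit integer values; reporting the hot-spot location as a centroid is a choice made for this example, not something stated in this application.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold that maximizes between-class variance."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(levels):
        weight0 += hist[t]
        if weight0 == 0:
            continue
        weight1 = total - weight0
        if weight1 == 0:
            break
        sum0 += t * hist[t]
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def hot_spot(image):
    """Area and (row, col) centroid of pixels above the Otsu threshold."""
    threshold = otsu_threshold([v for row in image for v in row])
    hot = [(y, x) for y, row in enumerate(image)
           for x, v in enumerate(row) if v > threshold]
    area = len(hot)
    centroid = (sum(y for y, _ in hot) / area, sum(x for _, x in hot) / area)
    return threshold, area, centroid


# Hypothetical 8-bit infrared patch with a warm region in the lower right.
patch = [
    [10, 10, 10],
    [10, 200, 200],
    [10, 10, 200],
]
threshold, area, centroid = hot_spot(patch)
```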
[0069] In some embodiments, where the biological visual image includes a visible light image and an infrared light image, the features extracted from the visible light image and the features extracted from the infrared light image are fused to obtain a biological visual feature map.
[0070] Step S202: Input the candidate category and the biological visual feature map into the classification network to obtain the predicted category of the target organism.
[0071] In some embodiments, the classification network can be a trained neural network model.
[0072] The trained neural network model can be a trained dual-stream convolutional neural network (Dual-Stream CNN, DS-CNN) or a trained Transformer neural network model.
[0073] In some embodiments, if the biological visual feature map includes a visual feature map corresponding to a visible light image and a visual feature map corresponding to an infrared light image, and the trained neural network model is a dual-stream convolutional neural network (Dual-Stream CNN, DS-CNN), then its structure may include an input branch, a fusion layer, and a classifier, wherein the input branch includes a visible light branch, an infrared light branch, and a candidate category branch.
[0074] In some embodiments, the biological visual feature map and candidate categories are input into a trained first convolutional neural network to determine the predicted category from the candidate categories. This includes: preprocessing the biological visual feature map and candidate categories before inputting them into the trained first convolutional neural network, including converting the candidate categories into an encoding format or vector format and standardizing the biological visual feature map. Then, the encoded vectors of the candidate categories and the standardized biological visual feature map are input into the trained first convolutional neural network to obtain the predicted category. Specifically, the visual feature map corresponding to the visible light image is processed by the visible light branch to extract texture features, color features, and shape features; the visual feature map corresponding to the infrared light image is processed by the infrared light branch to extract temperature distribution, hot spot location, and thermal boundary; the encoded vector of the candidate category is processed by the candidate category branch into a vector with the same dimension as the biological visual feature map; then, the above features are concatenated along the dimension by a fusion layer, and the concatenated features are fused; the fused features are input into a trained classifier; the probability distribution of each candidate category is obtained, and the candidate category with the highest probability is determined as the predicted category.
[0075] The visible light branch is equipped with a pre-trained feature extraction model (e.g., a 50-layer residual network such as ResNet50, or EfficientNet) to output visible light feature vectors. The infrared light branch is also equipped with a pre-trained feature extraction model to output infrared light feature vectors. The candidate category branch is used to convert the encoding vectors of the candidate categories into vectors of the same dimension as the visible light feature vectors and the infrared light feature vectors.
[0076] The fusion layer contains a pre-trained fusion model that concatenates and fuses the encoding vectors of the candidate categories, visible light feature vectors, and infrared light feature vectors to obtain a fused vector.
[0077] The classifier contains a pre-trained classification model used to obtain the predicted probability of each candidate category based on the fusion vector.
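As a non-limiting illustration, the fusion and classification stages described above can be sketched as follows. The function names, the single linear layer standing in for the trained classifier, and the `weights` argument are assumptions made for illustration only; the embodiment does not specify the classifier's internal form.

```python
import math


def softmax(scores):
    """Convert raw scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_category(visible_vec, infrared_vec, candidate_vec, weights, candidates):
    """Concatenate the three branch outputs and score each candidate category.

    `weights` is a hypothetical stand-in for the trained classifier:
    one row of linear weights per candidate category.
    """
    fused = visible_vec + infrared_vec + candidate_vec  # concatenation
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    probs = softmax(scores)
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best], probs
```

The candidate category with the highest probability is returned as the predicted category, matching the selection rule described for the classifier.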
[0078] In some embodiments, the training process of the first convolutional neural network includes: acquiring a first training set, which includes multiple visible light visual feature images and infrared light visual feature images corresponding to each of multiple biological categories, wherein the visible light visual feature images and infrared light visual feature images have been standardized and carry the labeled categories of the corresponding biological categories; the first training set further includes a candidate category set, which includes the encoding vector of each candidate category; inputting the multiple visible light visual feature images, the multiple infrared light visual feature images, and the encoding vectors of the candidate categories into the first convolutional neural network to be trained to obtain the predicted category; adjusting the model parameters of the first convolutional neural network to be trained based on the loss value between the predicted category and the labeled category until the convergence condition is met; and outputting the trained first convolutional neural network.
[0079] In some embodiments, the biological visual feature map and the candidate category are input into a trained second convolutional neural network to predict the category of the target organism based on the trained second convolutional neural network, thereby obtaining the predicted category; this includes: preprocessing the biological visual feature map and the candidate category to obtain a standardized biological visual feature map and semantic feature vectors of the candidate category; and inputting the standardized biological visual feature map and semantic feature vectors of the candidate category into the trained second convolutional neural network to obtain the predicted category for the target organism.
[0080] In this process, the visual feature map corresponding to the visible light image is processed by the visible light branch to extract texture features, color features, and shape features; the visual feature map corresponding to the infrared light image is processed by the infrared light branch to extract temperature distribution, hot spot location, and thermal boundary; the semantic feature vector of the candidate category is processed by the candidate category branch into a vector with the same dimension as the biological visual feature map; then it is fused by the fusion layer to obtain the fused feature vector; the classifier outputs the probability of all biological categories based on the fused feature vector, and determines the biological category with the highest probability among all biological categories as the predicted category.
[0081] In some embodiments, the training process of the second convolutional neural network includes: acquiring a second training set, which includes visual feature maps corresponding to visible light images and infrared light images of organisms of multiple biological categories, wherein these visual feature maps have been standardized and each carries its corresponding labeled category; the second training set also includes semantic descriptions of the candidate categories and semantic descriptions of all of a preset number of biological categories; inputting the visual feature maps corresponding to the visible light images, the visual feature maps corresponding to the infrared light images, the semantic descriptions of the candidate categories, and the semantic descriptions of the preset number of biological categories into the second convolutional neural network to be trained to obtain predicted categories; adjusting the model parameters of the second convolutional neural network to be trained based on the loss value between the predicted categories and the labeled categories until the convergence condition is met; and outputting the trained second convolutional neural network.
[0082] Step S203: Input the standard thermal distribution information corresponding to the candidate category and the biological visual feature map into the thermal distribution prediction network to determine the predicted thermal distribution image of the target organism.
[0083] In some embodiments, the standard thermal distribution information may be a standard thermal distribution feature map of the organism corresponding to the candidate category, or it may be a textual description of each part of the organism corresponding to the candidate category; it characterizes the temperature distribution of each part of the organism corresponding to the candidate category. For example, if the organism corresponding to the candidate category is a dog, the temperature distribution of the dog's limbs is 34.0-36.0℃, the temperature distribution of the trunk is 38.0-39.0℃, and the temperature distribution of the tail is 30.0-34.0℃.
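For illustration, the standard thermal distribution information of a candidate category can be represented as a mapping from body part to temperature range. The sketch below uses the dog values given above; the dictionary layout and helper names are assumptions, not a format prescribed by the embodiment.

```python
# Standard thermal distribution per candidate category, in degrees Celsius.
# The values are the dog example from the description; the layout is illustrative.
STANDARD_THERMAL = {
    "dog": {
        "limbs": (34.0, 36.0),
        "trunk": (38.0, 39.0),
        "tail": (30.0, 34.0),
    }
}


def part_mean_temperature(category, part):
    """Return the midpoint of the standard range for one body part."""
    low, high = STANDARD_THERMAL[category][part]
    return (low + high) / 2.0


def in_standard_range(category, part, temperature):
    """Check whether a measured temperature falls inside the standard range."""
    low, high = STANDARD_THERMAL[category][part]
    return low <= temperature <= high
```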
[0084] In some embodiments, the heat distribution prediction network can be a trained second neural network model, such as a Transformer Neural Network model or a Convolutional Neural Network (CNN).
[0085] In some embodiments, before inputting the standard thermal distribution information and the biological visual feature map into the thermal distribution prediction network, it is necessary to preprocess the standard thermal distribution information and the biological visual feature map, including extracting the biological visual feature vector of the biological visual feature map and the standard features of the standard thermal distribution information; inputting the biological visual features and the standard features into the trained second neural network model to fuse the biological visual features and the standard features to obtain fused features, and generating a predicted thermal distribution image of the target organism based on the fused features.
[0086] In some embodiments, the biological visual feature map includes a visual feature map corresponding to a visible light image and a visual feature map corresponding to an infrared light image. The standard thermal distribution information includes a standard thermal distribution map of the organism corresponding to the candidate category. Feature extraction is performed on the visual feature map corresponding to the visible light image and the visual feature map corresponding to the infrared light image to obtain the visible light feature vector and the infrared light feature vector of the target organism. The visible light feature vector includes contour feature vectors and texture feature vectors of each part, and the infrared light feature vector includes temperature distribution vector, hot spot position vector, and thermal boundary feature vector. Feature extraction is performed on the standard thermal distribution map to obtain the standard temperature distribution features of each part of the organism corresponding to the candidate category. The above features are then standardized, and a learning feature vector (e.g., a mapping matrix between part and temperature) is generated based on the standard temperature distribution features of each part. The visible light feature vector, the infrared light feature vector, and the learning feature vector are fused to obtain a fused feature vector. A predicted thermal distribution image is generated based on the fused feature vector.
[0087] Fusing the visible light feature vector, the infrared light feature vector, and the learning feature vector, and generating a predicted heat distribution image based on the fused feature vector, includes: first, concatenating the features to obtain a concatenated feature vector; then, calculating the weight of each feature; and next, fusing the features based on these weights to obtain a fused feature. Finally, based on a deconvolutional network and bilinear interpolation, the feature map is progressively upsampled to the size of the input biological visual feature map to generate the predicted heat distribution image.
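The bilinear interpolation used in the upsampling stage can be illustrated with the following minimal sketch. It covers only the interpolation step; the deconvolutional network is omitted, and the corner-aligned coordinate mapping and the function name are assumptions.

```python
def bilinear_upsample(grid, out_h, out_w):
    """Upsample a 2-D grid of values to (out_h, out_w) by bilinear interpolation."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        # Map the output row to a fractional source row (corner-aligned).
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = grid[y0][x0] * (1 - dx) + grid[y0][x1] * dx
            bottom = grid[y1][x0] * (1 - dx) + grid[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out
```

Applying the function repeatedly with increasing sizes would approximate the progressive upsampling described above.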
[0088] In some embodiments, the training process of the second neural network model includes: acquiring a third dataset, including visual feature maps corresponding to visible light images and infrared light images of multiple biological categories, as well as standard thermal distribution information and labeled thermal distribution images of the target category; wherein, the visual feature map corresponding to the visible light image and the visual feature map corresponding to the infrared light image of each biological category carries the labeled category of the corresponding biological category; the standard thermal distribution information carries the temperature range distribution of each part of the corresponding biological category; inputting the visual feature maps corresponding to the visible light images and the visual feature maps corresponding to the infrared light images of multiple biological categories, as well as the standard thermal distribution information of the target category, into the second neural network model to be trained to obtain a predicted thermal distribution image; adjusting the model parameters of the second neural network model to be trained based on the loss value between the predicted thermal distribution image and the standard thermal distribution image until the convergence condition is met, and then outputting the trained second neural network model.
[0089] In this embodiment, by extracting features from a biological visual image, a biological visual feature map corresponding to the biological visual image can be obtained. Based on the candidate category and the biological visual feature map, a classification network is used to predict the category of the target organism, resulting in a predicted category. Based on the standard heat distribution information of the predicted category and the biological visual feature map, a heat distribution prediction network is used to obtain a predicted heat distribution image of the target organism. Thus, by multimodal fusion of the biological visual feature map of the target organism and the heat distribution information corresponding to the predicted category, a predicted heat distribution image of the target organism can be obtained for category detection, thereby improving the accuracy of detection.
[0090] Figure 3 is a schematic diagram illustrating the implementation flow of a biological category detection method provided in an embodiment of this application. This method can be executed by the processor of a detection terminal. Based on Figure 2, step S203 in Figure 2 can be updated to steps S301 and S302, which are explained below with reference to the steps shown in Figure 3.
[0091] Step S301: Based on the biological visual feature map, determine the segmentation result of the target organism.
[0092] The segmentation result includes the segmented region of the target organism and the sub-regions of each body part of the target organism within the segmented region.
[0093] In some embodiments, the biological visual feature map includes a visual feature map corresponding to a visible light image and a visual feature map corresponding to an infrared light image. Based on the fused feature map of the visual feature map corresponding to the visible light image and the visual feature map corresponding to the infrared light image, segmentation is performed by an image segmentation model, including segmenting the fused feature map based on an overall mask to obtain a segmented region of the target organism; and segmenting the segmented region of the target organism based on a part mask to obtain sub-regions of each body part of the target organism in the segmented region.
[0094] For example, the overall segmentation mask is 1, and the target organism part in the fused feature map is represented by the overall segmentation mask 1 to obtain the segmented region of the target organism; the background part is represented by mask 0 to obtain the segmented region of the background part in the fused feature map; for each part in the segmented region of the target organism, it is represented based on the segmentation mask corresponding to each body part in the target organism; for example, the head region is represented by mask 2, the torso by mask 3, and the limbs by mask 4, so as to obtain the sub-regions of each body part in the segmented region.
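The mask convention in this example can be sketched as follows, assuming (for illustration only) that the segmentation model outputs a single label map in which 0 is background and 2, 3, and 4 mark the head, torso, and limbs. The overall mask of the organism is then every non-background pixel.

```python
BACKGROUND, HEAD, TORSO, LIMBS = 0, 2, 3, 4


def split_masks(label_map):
    """Return the overall organism mask and one binary mask per body part."""
    # Overall mask: 1 for any pixel belonging to the organism, 0 for background.
    overall = [[1 if v != BACKGROUND else 0 for v in row] for row in label_map]
    parts = {}
    for name, label in (("head", HEAD), ("torso", TORSO), ("limbs", LIMBS)):
        parts[name] = [[1 if v == label else 0 for v in row] for row in label_map]
    return overall, parts
```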
[0095] Step S302: Based on the sub-regions of the segmented region of each body part of the target organism and the standard thermal distribution information, fit the predicted thermal distribution image of the target organism.
[0096] In some embodiments, based on the segmented sub-regions of each body part in the target organism (such as the head, torso, and limbs) as spatial constraints, and combined with the standard thermal distribution information of the corresponding parts (such as the mean temperature and range), a predicted thermal distribution image is generated through a regression model.
[0097] In some embodiments, the masks of each body part and standard thermal distribution information are input into a regression model to extract the spatial feature vectors (shape and position) of the masks of each body part, and to encode the standard thermal distribution information to obtain learnable feature vectors (matrix of mapping between body part and temperature); the learnable feature vectors and spatial feature vectors are concatenated and then decoded to obtain a predicted thermal distribution image.
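As a deliberately simplified stand-in for the regression model, the following sketch fills each body-part sub-region with the midpoint of that part's standard temperature range. This is an assumption made to show how the part masks act as spatial constraints; a trained regression model would produce a smoother, learned distribution.

```python
def fit_thermal_image(part_masks, standard_ranges, background=0.0):
    """Fill each body-part sub-region with the midpoint of its standard range.

    part_masks: dict of part name -> binary 2-D mask (all the same size).
    standard_ranges: dict of part name -> (low, high) in degrees Celsius.
    """
    first = next(iter(part_masks.values()))
    height, width = len(first), len(first[0])
    image = [[background] * width for _ in range(height)]
    for part, mask in part_masks.items():
        low, high = standard_ranges[part]
        mean = (low + high) / 2.0
        for r in range(height):
            for c in range(width):
                if mask[r][c]:
                    image[r][c] = mean
    return image
```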
[0098] In this embodiment, the biological visual feature map of the target organism is segmented to obtain the segmented region of the target organism and the sub-regions of each body part of the target organism within the segmented region. Based on the sub-regions of the segmented region and the standard thermal distribution information, a predicted thermal distribution image of the target organism is fitted. Thus, compared to a whole-body fitting approach, fitting based on the parts of the target organism improves the accuracy of subsequent target organism category detection.
[0099] Figure 4 is a schematic diagram illustrating the implementation flow of a biological category detection method provided in an embodiment of this application. This method can be executed by the processor of a detection terminal. Based on Figure 2, the environmental image includes an infrared image and a visible light image; the biological visual feature image includes a first biological image corresponding to the infrared image and a second biological image corresponding to the visible light image; step S201 in Figure 2 can be updated to steps S401 to S404, which are explained below with reference to the steps shown in Figure 4.
[0100] Step S401: Perform size transformation on the first biological image and the second biological image to obtain a first biological image and a second biological image with uniform size.
[0101] In some embodiments, size transformation refers to using methods such as image scaling, cropping, and interpolation to ensure that the first biological image and the second biological image maintain the same spatial resolution. For example, both can be adjusted to the same size pixel matrix.
[0102] In some embodiments, it is necessary to determine the target size of the first biological image and the second biological image, wherein the target size may be the size of the first biological image, the size of the second biological image, or other preset size.
[0103] In some embodiments, a scaling direction and scaling size for the first biological image are determined based on the size of the first biological image and the target size. Specifically, if the size of the first biological image is larger than the target size, the scaling direction is to shrink; if the size of the first biological image is smaller than the target size, the scaling direction is to enlarge.
[0104] In some embodiments, the scaling direction and scaling size for the second biological image are determined based on the size of the second biological image and the target size. Specifically, if the size of the second biological image is larger than the target size, the scaling direction is to reduce; if the size of the second biological image is smaller than the target size, the scaling direction is to enlarge.
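The scaling decision for either biological image can be sketched as below. Comparing sizes by pixel count and returning per-axis scale factors are assumptions; the embodiment only states that a larger image is shrunk and a smaller one is enlarged.

```python
def scaling_plan(size, target):
    """Return (direction, (factor_x, factor_y)) for resizing an image to the target.

    size and target are (width, height) tuples in pixels.
    """
    width, height = size
    target_w, target_h = target
    if width * height > target_w * target_h:
        direction = "shrink"
    elif width * height < target_w * target_h:
        direction = "enlarge"
    else:
        direction = "none"
    return direction, (target_w / width, target_h / height)
```

For example, a 640x480 visible light image and a 160x120 infrared image can both be brought to a 320x240 target, the first by shrinking and the second by enlarging.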
[0105] Step S402: Extract features from the first biological image based on the first feature extraction network to obtain a first visual feature map.
[0106] In some embodiments, the first feature extraction network can be a trained neural network, such as a Customized Convolutional Neural Network (Customized CNN) or Mobile Neural Network Version 3 (MobileNetV3).
[0107] In some embodiments, the first biological image is preprocessed and then input into a trained neural network to obtain the temperature distribution features, hot spot location features, thermal boundary features, etc. of the target biological image, as well as the feature map corresponding to each feature. The feature maps corresponding to each feature are all in the same dimension. After multiple feature maps are stitched together along the same dimension, the stitched feature maps are weighted and fused to obtain the first visual feature map.
[0108] Step S403: Extract features from the second biological image based on the second feature extraction network to obtain a second visual feature map.
[0109] In some embodiments, the second feature extraction network can be a trained neural network, such as a convolutional neural network (CNN), a 50-layer residual neural network (ResNet50), etc.
[0110] In some embodiments, the second biological image is preprocessed and then input into a trained neural network to obtain the texture features, edge features, shape features, color features, etc. of the target biological image, as well as the feature map corresponding to each feature. The feature maps corresponding to each feature are all in the same dimension. After multiple feature maps are stitched together along the same dimension, the stitched feature maps are weighted and fused to obtain the second visual feature map.
[0111] Step S404: Fuse the first visual feature map and the second visual feature map to obtain the biological visual feature map corresponding to the biological visual image.
[0112] In some embodiments, weights are generated for the first visual feature map and the second visual feature map, and the feature maps are weighted and summed based on the weights of each feature map to obtain a fused biological visual feature map. The weights of the first visual feature map and the second visual feature map can be pre-defined or generated based on a neural network model.
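A minimal sketch of the weighted-sum fusion follows, assuming both feature maps are 2-D grids of the same size and that the weights are pre-defined scalars; weights produced by a neural network model would be substituted in the same place.

```python
def fuse_feature_maps(first_map, second_map, w_first=0.5, w_second=0.5):
    """Element-wise weighted sum of two equally sized 2-D feature maps."""
    if len(first_map) != len(second_map) or len(first_map[0]) != len(second_map[0]):
        raise ValueError("feature maps must share the same spatial size")
    return [
        [w_first * a + w_second * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first_map, second_map)
    ]
```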
[0113] In some embodiments, the first visual feature map and the second visual feature map can be stitched together according to dimensions to obtain a biological visual feature map.
[0114] In this embodiment, the efficiency of subsequent feature fusion is improved by unifying the sizes of the first biological image and the second biological image. A first visual feature map is obtained by extracting features from the first biological image, and a second visual feature map is obtained by extracting features from the second biological image. Thus, by fusing the first and second visual feature maps, a biological visual feature map corresponding to the biological visual image can be obtained. This improves fusion efficiency and the accuracy of the generated biological visual feature map.
[0115] Figure 5 is a schematic diagram illustrating the implementation flow of a biological category detection method provided in an embodiment of this application. This method can be executed by the processor of a detection terminal. Based on Figure 1, step S104 in Figure 1 can be updated to steps S501 to S504, which are explained below with reference to the steps shown in Figure 5.
[0116] Step S501: Based on the sub-regions of the segmented region for each body part of the target organism, the thermal distribution image of the organism and the predicted thermal distribution image are segmented respectively.
[0117] In some embodiments, region segmentation refers to dividing the entire thermal distribution image into multiple local regions, each corresponding to a specific body part, such as the head, torso, or limbs. By segmenting the thermal distribution image according to body parts, the difference between the actual thermal distribution and the predicted thermal distribution can be compared more accurately.
[0118] In some embodiments, based on the above, each body part has a segmentation mask for its corresponding sub-region. The segmented sub-region mask (such as head mask, torso mask, and limb mask) is multiplied element-wise with the biological thermal distribution image to extract the biological thermal distribution region of each part. The segmented sub-region mask (such as head mask, torso mask, and limb mask) is multiplied element-wise with the predicted thermal distribution image to extract the predicted biological thermal distribution region of each part.
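The element-wise multiplication of a sub-region mask with a thermal distribution image can be sketched as follows, assuming the mask is binary and has the same size as the image. The same function applies to both the biological thermal distribution image and the predicted thermal distribution image.

```python
def apply_mask(image, mask):
    """Keep the pixels where the mask is 1 and zero out all others."""
    return [
        [pixel * m for pixel, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]
```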
[0119] Step S502: For each body part, determine the first similarity between the body part in the first sub-distribution image of the biological thermal distribution image and the body part in the second sub-distribution image of the predicted thermal distribution image.
[0120] In some embodiments, a first Euclidean distance to the biological heat distribution region corresponding to the target body part and a second Euclidean distance to the predicted heat distribution region corresponding to the target body part are calculated, and a first similarity is determined based on the first Euclidean distance and the second Euclidean distance.
[0121] For example, the first Euclidean distance of the biological heat distribution region corresponding to the head is calculated, the second Euclidean distance of the predicted heat distribution region corresponding to the head is calculated, and the first Euclidean distance and the second Euclidean distance are compared to obtain the first similarity.
[0122] Step S503: Based on the first similarity of each body part, generate the target similarity between the biological heat distribution image and the predicted heat distribution image.
[0123] In some embodiments, the target similarity can be an overall similarity index derived by weighted averaging or other aggregation methods of the first similarities of all body parts. For example, the similarity of each part can be combined using methods such as average, weighted average, or maximum / minimum values to reflect the consistency between the entire predicted heat distribution image and the actual heat distribution image. The weights can be set according to the importance of different body parts; for example, the head and torso may have higher weights.
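The weighted-average aggregation can be sketched as follows. The part names and weight values in the usage note are illustrative assumptions; the embodiment only states that parts such as the head and torso may be weighted more heavily.

```python
def target_similarity(part_similarities, part_weights):
    """Weighted average of per-part similarities.

    part_similarities: dict of part name -> first similarity in [0, 1].
    part_weights: dict of part name -> non-negative importance weight.
    """
    total_weight = sum(part_weights[p] for p in part_similarities)
    if total_weight == 0:
        raise ValueError("at least one part must have a non-zero weight")
    weighted = sum(part_similarities[p] * part_weights[p] for p in part_similarities)
    return weighted / total_weight
```

With similarities of 0.9 for the head, 0.8 for the torso, and 0.5 for the limbs, and weights of 2, 2, and 1, the target similarity is 3.9 / 5 = 0.78.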
[0124] In some embodiments, a first hash value of the whole biological heat distribution image and a second hash value of the whole predicted heat distribution image are calculated and compared to obtain a second similarity; a target similarity is obtained by weighted summation based on the first similarity and the second similarity.
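The embodiment does not specify which image hash is used. The sketch below assumes a simple average hash (one bit per pixel, set when the pixel exceeds the image mean) and derives the second similarity from the fraction of matching bits; the default weights in `combine_similarities` are likewise illustrative.

```python
def average_hash(image):
    """One bit per pixel: 1 if the pixel is above the image mean, else 0."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hash_similarity(image_a, image_b):
    """Second similarity: fraction of hash bits that agree between two images."""
    hash_a, hash_b = average_hash(image_a), average_hash(image_b)
    if len(hash_a) != len(hash_b):
        raise ValueError("images must have the same number of pixels")
    differing = sum(x != y for x, y in zip(hash_a, hash_b))
    return 1.0 - differing / len(hash_a)


def combine_similarities(first_similarity, second_similarity, w_first=0.7, w_second=0.3):
    """Weighted sum of the part-based and whole-image similarities."""
    return w_first * first_similarity + w_second * second_similarity
```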
[0125] Step S504: Determine the verification result based on the target similarity and the similarity threshold.
[0126] In some embodiments, the target similarity is compared with a similarity threshold. If the target similarity is greater than or equal to the similarity threshold, the verification is successful and the predicted category is output as the target category. If the target similarity is less than the similarity threshold, the verification is unsuccessful, that is, the predicted category is inaccurate.
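The threshold comparison can be sketched as follows. The return convention (a success flag plus the target category, or `None` on failure) is an assumption made for illustration.

```python
def verify(predicted_category, target_similarity, similarity_threshold):
    """Return (success, target_category) for the threshold check in step S504."""
    if target_similarity >= similarity_threshold:
        # Verification succeeded: the predicted category becomes the target category.
        return True, predicted_category
    # Verification failed: the predicted category is treated as inaccurate.
    return False, None
```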
[0127] In this embodiment, by segmenting the heat distribution image according to body parts and calculating the similarity of each part, then synthesizing the overall similarity and verifying it, the accuracy and robustness of the prediction model can be effectively improved. This allows for precise evaluation of the matching degree between the predicted heat distribution image and the actual heat distribution image, thereby optimizing the model training effect and ultimately improving the accuracy of biometric identification.
[0128] Figure 6 is a schematic diagram illustrating the implementation flow of a biological category detection method provided in an embodiment of this application. This method can be executed by the processor of a detection terminal. Based on Figure 1, step S102 in Figure 1 can be updated to steps S601 and S602, which are explained below with reference to the steps shown in Figure 6.
[0129] Step S601: Based on a pre-calibrated conversion relationship, convert the first position range of the target organism in the infrared thermal image into the second position range of the target organism in the environmental image.
[0130] In some embodiments, the pre-calibrated transformation relationship refers to the pixel mapping relationship between the infrared thermal image and the visible light ambient image established through system calibration. This relationship is typically obtained by aligning and scaling a calibration board or a target object of known size under different imaging modes. For example, during the installation phase, the device can establish a spatial coordinate transformation model by simultaneously capturing infrared and texture images of the same scene and using algorithms to identify the positions of identical feature points in both.
[0131] In some embodiments, the first location range may be the outline region of the target organism. The region is segmented to obtain a two-dimensional mask of the region. The coordinates of the upper left corner and the lower right corner of the outline region are obtained to generate a bounding box. The coordinates of all pixels are extracted from the mask or the bounding box as the first location range to be transformed. Based on a pre-calibrated transformation relationship, the coordinates of the first location range are transformed to the coordinates of the second location range in the environmental image.
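The bounding-box extraction and coordinate conversion can be sketched as follows. The pre-calibrated relationship is assumed here to be a per-axis scale plus offset; the actual calibration model (for example, a full homography) is not specified in the embodiment, and the parameter names are hypothetical.

```python
def bounding_box(mask):
    """Return (x0, y0, x1, y1) of the non-zero region of a 2-D mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask is empty")
    return (min(cols), min(rows), max(cols), max(rows))


def map_box(box, scale_x, scale_y, offset_x, offset_y):
    """Map a box from infrared pixel coordinates to environmental image coordinates.

    The scale and offset values stand in for the pre-calibrated relationship.
    """
    x0, y0, x1, y1 = box
    return (
        round(x0 * scale_x + offset_x),
        round(y0 * scale_y + offset_y),
        round(x1 * scale_x + offset_x),
        round(y1 * scale_y + offset_y),
    )
```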
[0132] Step S602: Use the second location range to crop the environmental image to obtain the biological visual image; the proportion of the target organism in the biological visual image is the same as the proportion of the target organism in the environmental image.
[0133] In some embodiments, cropping using the second location range means cropping a sub-image region containing the target organism from the environmental image based on the mapped coordinate range described above.
[0134] In some embodiments, based on the rectangular bounding box corresponding to the second location range, all pixel regions corresponding to the rectangular bounding box are cropped from the environmental image to obtain the biological visual image. During the cropping process, no scaling operation is performed to maintain the size ratio of the target in the original image.
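Cropping without scaling can be sketched as below, assuming the environmental image is a 2-D grid of pixels and the bounding box uses inclusive corner coordinates. Clamping the box to the image bounds is an added safeguard, not something the embodiment states.

```python
def crop(image, box):
    """Cut the rectangle (x0, y0, x1, y1), inclusive, out of a 2-D image.

    No resampling is performed, so the target keeps its original pixel scale.
    """
    x0, y0, x1, y1 = box
    height, width = len(image), len(image[0])
    # Clamp the box to the image bounds.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width - 1, x1), min(height - 1, y1)
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]
```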
[0135] In this embodiment, through a pre-calibrated conversion relationship, the target location information in the infrared thermal image can be accurately mapped to the visible light image, thereby improving the accuracy of image cropping. By using a second location range to crop the environmental image while maintaining the consistent proportion of the target organism, the acquired biological visual image can be ensured to have high fidelity, thereby improving the accuracy of animal identification and behavior analysis.
[0136] In some embodiments, step S101 above includes the following implementation process if the verification fails:
[0137] If the verification fails, the candidate category, the biological visual image, and the biological thermal distribution image are uploaded to the cloud; the cloud is used to generate a fused feature map based on the biological texture image and the biological thermal distribution image; and to generate the target category of the target object based on the fused feature map and the candidate category.
[0138] In some embodiments, when the local device's recognition model cannot accurately determine the category of the target object, the system uploads the relevant data to the cloud. Compared to the detection terminal, the cloud has abundant computing resources, can use larger-scale neural network models, and possesses stronger generalization capabilities. The cloud is used to receive and process data from the terminal device, including biological visual images (such as visible light images and infrared images), biological thermal distribution images (such as infrared thermograms), and candidate category information.
[0139] In some embodiments, a fused feature map refers to an image representation with higher discriminative power generated by extracting multi-dimensional features from images of different modalities (such as visible light images and infrared light images). The cloud-based system employs feature-level fusion to extract animal appearance features from biological texture images and body temperature distribution features from biological thermal distribution images. These two sets of features are then integrated to form a unified feature vector for subsequent classification.
[0140] In some embodiments, the target category is the species or class to which the target object is ultimately identified. In cases where local identification fails, the cloud platform regenerates the target category by performing more precise matching and comparison of the uploaded image and features, ensuring the reliability of the identification results. For example, if a fox is mistakenly identified as a wolf locally, the cloud platform can use a more advanced deep learning model to verify and correct the misclassification.
[0141] In this embodiment, a cloud processing mechanism is introduced when verification fails, utilizing the powerful computing capabilities of the cloud to perform deep fusion and re-identification of image data. This compensates for the limitations of local recognition models, thereby improving the overall accuracy and stability of recognition and enabling it to adapt to more complex application scenarios.
[0142] The following describes an exemplary application of a biological category detection method provided in this application in a real-world scenario.
[0143] In fields such as wildlife research, ecological conservation, and animal husbandry, monitoring and collecting information on animal behavior, species, and numbers is of great significance. Existing animal monitoring equipment has many limitations, such as a single power supply method, making it difficult to meet the needs of long-term outdoor use; inconvenient data transmission, making it impossible to transmit collected data to designated locations in real time; and poor image acquisition quality and low animal identification accuracy at night or in low-light environments. Therefore, there is a need for a remote, all-weather, outdoor intelligent animal identification and data collection device that can overcome these problems.
[0144] The data acquisition device includes a power supply and charging module. A power switch turns the device on and off, which saves battery power. A Type-C charging dock with a reversible interface is provided for convenient use; it supports USB 3.1, can transmit 4K video at higher speeds, and supports high-current charging (3 A and 5 A) as well as reverse charging. Solar panels convert solar energy into electrical energy, and the resulting DC power is stored in the battery to provide long-term power and ensure stable outdoor operation.
[0145] The data acquisition device includes a light sensor and a night-time operation module, employing a photoresistor. The photoresistor is highly sensitive to light: its resistance is high in the absence of light and drops rapidly under strong light. When little or no light reaches the photoresistor, the light board circuit switches on the infrared lights. The infrared lights are driven at different power levels according to a mapping between ambient light intensity and infrared light intensity. At the same time, the camera activates night mode to ensure clear image acquisition at night or in low-light environments.
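The mapping between ambient light and infrared light power can be sketched as a lookup table. The thresholds, duty-cycle values, and the names `IR_POWER_TABLE`, `ir_duty_cycle`, and `night_mode_enabled` are assumptions for illustration; the application does not give concrete values.

```python
from typing import List, Tuple

# Hypothetical table: (upper bound of ambient light in lux, IR LED duty cycle in %).
# Rows must be sorted by ascending upper bound.
IR_POWER_TABLE: List[Tuple[float, int]] = [
    (1.0, 100),  # near darkness: full power
    (10.0, 60),  # dim light
    (50.0, 30),  # dusk
]


def ir_duty_cycle(ambient_lux: float) -> int:
    """Return the IR LED duty cycle for an ambient light level; 0 means off."""
    for upper_bound, duty in IR_POWER_TABLE:
        if ambient_lux < upper_bound:
            return duty
    return 0


def night_mode_enabled(ambient_lux: float) -> bool:
    """The camera switches to night mode whenever the IR LEDs are on."""
    return ir_duty_cycle(ambient_lux) > 0


for lux in (0.2, 5.0, 20.0, 200.0):
    print(lux, ir_duty_cycle(lux), night_mode_enabled(lux))
```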
[0146] The acquisition device includes an image acquisition and processing module. The camera's photosensitive and control components acquire and digitize images, and animal AI feature extraction, recognition, classification, and comparison are performed on the acquired images. A thermal imaging component uses an infrared detector to receive infrared radiation from people or animals in front of the camera and obtain infrared thermal images, on which animal AI feature extraction, recognition, classification, and comparison are likewise performed. Both modules exchange data with the Wi-Fi SD card and the IoT SIM card via the Serial Peripheral Interface (SPI) protocol. The system acquires texture images of the environment using the camera and infrared thermal images of the environment using the thermal imaging component; the infrared thermal images and texture images are fused to obtain the target image. A trained image recognition model identifies the target image to determine the biological category in the environment; this model is trained on images of different biological categories and the body temperature data of each animal.
[0147] The data acquisition device includes: a data transmission module and an IoT SIM card. The IoT SIM card transmits data and recognition results collected by the camera and thermal imaging to the cloud or a designated data center in real time, providing stable and efficient network access services for the devices and enabling data exchange and information sharing between devices. The Wi-Fi SD card transmits data and recognition results collected by the camera to the wireless network terminal in real time, significantly improving the data sharing capabilities and convenience of the devices.
[0148] The data acquisition device includes a sound acquisition and interaction module, a microphone for capturing sound, and a speaker for transmitting sound. The two work together to combine sound with real-time images from the camera, enriching the monitoring information.
[0149] The data acquisition device includes a triggering and alarm module, which determines the alarm strategy based on the currently identified biological category (e.g., different alarm strategies correspond to different protection levels for animals); the alarm strategies include audible and visual alarms and reporting to the cloud or to terminal devices carried by staff. The PIR sensor detects infrared radiation emitted by people or animals near the camera and converts it into a trigger signal that wakes the system when it is operating at low power. The alarm device sounds a siren promptly when a security event is detected, warning of or deterring destructive behavior by people or animals.
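Selecting an alarm strategy from the identified category can be sketched as two lookups, category to protection level and level to actions. All categories, level names, and action names below are hypothetical examples and are not taken from the application.

```python
from typing import Dict, List

# Hypothetical mapping from identified category to a protection level.
PROTECTION_LEVEL: Dict[str, str] = {
    "wolf": "protected",
    "fox": "protected",
    "wild_boar": "pest",
    "human": "intruder",
}

# Hypothetical mapping from protection level to alarm actions.
ALARM_STRATEGY: Dict[str, List[str]] = {
    "protected": ["report_to_cloud"],
    "pest": ["audible_visual_alarm", "report_to_cloud"],
    "intruder": ["audible_visual_alarm", "report_to_cloud", "notify_staff_terminal"],
}

# Fallback for categories with no configured level (an assumption of this sketch).
DEFAULT_STRATEGY: List[str] = ["report_to_cloud"]


def alarm_actions(category: str) -> List[str]:
    """Return the list of alarm actions for an identified category."""
    level = PROTECTION_LEVEL.get(category)
    return list(ALARM_STRATEGY.get(level, DEFAULT_STRATEGY))


print(alarm_actions("fox"))
print(alarm_actions("human"))
print(alarm_actions("unknown_species"))
```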
[0150] The data acquisition device includes a status indicator and a mounting module. Indicator lights reflect the device's operating status. The device offers multiple mounting options, allowing it to be tied to a tree with straps or fixed to a wall with screws, facilitating installation and use in various environments.
[0151] Through the coordinated operation of the above modules, this device realizes the function of remote all-weather outdoor animal identification and data collection, solving the problems of insufficient power supply, poor nighttime identification effect, and inconvenient data transmission in the existing technology, and has good practicality and scalability.
[0152] In addition, the device can be expanded with additional functions according to actual application needs, such as adding a GPS positioning module, multilingual voice prompts, and a remote control module, to enhance its ability to adapt to complex environments.
[0153] In summary, the remote all-weather outdoor intelligent animal identification and data collection device provided by this invention not only enables efficient animal identification and data collection, but also features a stable power supply system, reliable nighttime operation capability, powerful data transmission function, and flexible installation method, demonstrating significant technological advancements and broad application prospects.
[0154] Figure 7 is a framework diagram for biological category detection provided in this application, where 701 is an infrared image, 702 is a visible light image, and 703 represents candidate categories generated based on the infrared thermal image. The infrared image 701 is processed using an infrared feature extraction network to obtain an infrared feature map. Similarly, the visible light image 702 is processed using a visible light feature extraction network to obtain a visible light feature map. The infrared and visible light feature maps are then fused to obtain a fused feature map. This fused feature map and candidate category 703 are input into a trained classification network to obtain a predicted category. The predicted category and the fused feature map are then input into a temperature distribution network to obtain a temperature distribution map. Finally, the temperature distribution map and candidate category 703 are input into a verification module to verify the predicted category. If the verification passes, the predicted category is output as the target category.
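The data flow of Figure 7 can be sketched as a function that wires the stages together. The networks themselves are passed in as callables, and the stub lambdas below stand in for trained models. The `Networks` container, `detect`, and every stub are hypothetical and only illustrate the order of operations.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Networks:
    """Container for the processing stages; each field is a placeholder callable."""
    ir_features: Callable[[Any], Any]
    vis_features: Callable[[Any], Any]
    fuse: Callable[[Any, Any], Any]
    classify: Callable[[Any, List[str]], str]
    predict_temperature: Callable[[str, Any], Any]
    verify: Callable[[Any, List[str]], bool]


def detect(ir_image: Any, vis_image: Any, candidates: List[str],
           nets: Networks) -> Optional[str]:
    """Run the stages in order; return the category, or None if verification fails."""
    ir_map = nets.ir_features(ir_image)             # infrared feature map
    vis_map = nets.vis_features(vis_image)          # visible light feature map
    fused = nets.fuse(ir_map, vis_map)              # fused feature map
    predicted = nets.classify(fused, candidates)    # predicted category
    temp_map = nets.predict_temperature(predicted, fused)
    return predicted if nets.verify(temp_map, candidates) else None


# Stub stages, for illustration only.
stub = Networks(
    ir_features=lambda img: [sum(img)],
    vis_features=lambda img: [max(img)],
    fuse=lambda a, b: a + b,
    classify=lambda fused, cands: cands[0],
    predict_temperature=lambda cat, fused: [38.0],
    verify=lambda temp_map, cands: temp_map[0] > 30.0,
)
result = detect([1, 2], [3, 4], ["fox", "wolf"], stub)
print(result)
```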
[0155] Based on the foregoing embodiments, this application provides a biological category detection device, which includes various units and modules included in each unit, and can be implemented by a processor in a detection terminal; of course, it can also be implemented by specific logic circuits; in the implementation process, the processor can be a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA), etc.
[0156] Figure 8 is a schematic diagram of the composition structure of a biological category detection device provided in an embodiment of this application. As shown in Figure 8, the biological category detection device 800 includes: a first determining module 801, a second determining module 802, a third determining module 803, and a verification module 804, wherein: the first determining module 801 is used to determine the candidate category of a target organism in the environment based on the acquired infrared thermal image, and to determine the first location range and biological thermal distribution image of the target organism in the infrared thermal image; the second determining module 802 is used to determine the biological visual image corresponding to the target organism in the acquired environmental image according to the first location range; the environmental image and the infrared thermal image carry the feature information of the target organism at the same time; the third determining module 803 is used to determine the predicted category and predicted thermal distribution image of the target organism based on the candidate category and the biological visual image; the verification module 804 is used to verify the predicted thermal distribution image based on the biological thermal distribution image, and output the predicted category as the target category if the verification is successful.
[0157] In some embodiments, the third determining module 803 is further configured to extract features from the biological visual image to obtain a biological visual feature map corresponding to the biological visual image; input the candidate category and the biological visual feature map into a classification network to obtain the predicted category of the target organism; input the standard heat distribution information corresponding to the predicted category and the biological visual feature map into a heat distribution prediction network to determine the predicted heat distribution image of the target organism.
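One way the candidate category could constrain classification is to mask the classifier scores so that only candidate categories compete. The application does not state how the classification network uses the candidate category, so this masking approach and the names `masked_softmax` and `predict_category` are assumptions.

```python
import math
from typing import Dict, List


def masked_softmax(logits: Dict[str, float], candidates: List[str]) -> Dict[str, float]:
    """Softmax over the candidate categories only; other categories are dropped."""
    kept = {c: logits[c] for c in candidates if c in logits}
    if not kept:
        raise ValueError("no candidate category is known to the classifier")
    peak = max(kept.values())  # subtract the maximum for numerical stability
    exps = {c: math.exp(v - peak) for c, v in kept.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def predict_category(logits: Dict[str, float], candidates: List[str]) -> str:
    """Return the most probable category among the candidates."""
    probs = masked_softmax(logits, candidates)
    return max(probs, key=probs.get)


# Hypothetical raw scores: 'deer' scores highest overall but is not a candidate.
scores = {"wolf": 2.0, "fox": 1.5, "deer": 3.0}
print(predict_category(scores, ["wolf", "fox"]))
```

In this example the infrared stage has already ruled out `deer`, so the prediction is chosen between `wolf` and `fox` only.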
[0158] In some embodiments, the third determining module 803 is further configured to determine the segmentation result of the target organism based on the biological visual feature map; the segmentation result includes the segmentation region of the target organism and the sub-regions of each body part of the target organism in the segmentation region; and to fit a predicted thermal distribution image of the target organism based on the sub-regions of each body part of the target organism in the segmentation region and the standard thermal distribution information.
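Fitting the predicted thermal distribution image can be sketched, in its simplest form, as painting each body-part sub-region with that part's standard temperature. A real fit would likely be smoother. The grid representation, the temperature values, and the name `fit_thermal_map` are assumptions for illustration.

```python
from typing import Dict, List, Optional

# Each cell holds a body-part label, or None for pixels outside the organism.
PartMap = List[List[Optional[str]]]


def fit_thermal_map(part_map: PartMap, standard_temps: Dict[str, float],
                    background: float = 0.0) -> List[List[float]]:
    """Fill every body-part sub-region with its standard temperature.

    Cells outside the segmented region, or with an unknown part label,
    receive the background value.
    """
    return [
        [standard_temps.get(part, background) if part is not None else background
         for part in row]
        for row in part_map
    ]


# Hypothetical segmentation result and standard temperatures in degrees Celsius.
parts: PartMap = [["head", "body"], [None, "leg"]]
standard = {"head": 38.5, "body": 37.0, "leg": 33.0}
predicted_map = fit_thermal_map(parts, standard)
print(predicted_map)
```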
[0159] In some embodiments, the environmental image includes an infrared light image and a visible light image; the biological visual feature image includes a first biological image corresponding to the infrared light image and a second biological image corresponding to the visible light image; the third determining module 803 is further configured to perform size transformation on the first biological image and the second biological image to obtain a first biological image and a second biological image with unified size; perform feature extraction on the first biological image based on a first feature extraction network to obtain a first visual feature map; perform feature extraction on the second biological image based on a second feature extraction network to obtain a second visual feature map; and fuse the first visual feature map and the second visual feature map to obtain a biological visual feature map corresponding to the biological visual image.
[0160] In some embodiments, the verification module 804 is further configured to segment the biological thermal distribution image and the predicted thermal distribution image based on sub-regions of the segmented region for each body part of the target organism; for each body part, determine a first similarity between the body part in a first sub-distribution image of the biological thermal distribution image and the body part in a second sub-distribution image of the predicted thermal distribution image; generate a target similarity between the biological thermal distribution image and the predicted thermal distribution image based on the first similarity of each body part; and determine a verification result based on the target similarity and a similarity threshold.
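The per-part verification can be sketched as follows. The similarity measure (mean absolute temperature difference mapped into the range 0 to 1), the unweighted average across body parts, and the default threshold are assumptions chosen for illustration; the application does not fix these choices.

```python
from typing import Dict, List, Optional

Grid = List[List[float]]
PartMap = List[List[Optional[str]]]


def part_similarities(measured: Grid, predicted: Grid,
                      part_map: PartMap) -> Dict[str, float]:
    """Per body part, convert the mean absolute difference d into 1 / (1 + d)."""
    diffs: Dict[str, List[float]] = {}
    for r, row in enumerate(part_map):
        for c, part in enumerate(row):
            if part is not None:
                diffs.setdefault(part, []).append(abs(measured[r][c] - predicted[r][c]))
    return {part: 1.0 / (1.0 + sum(d) / len(d)) for part, d in diffs.items()}


def verify(measured: Grid, predicted: Grid, part_map: PartMap,
           threshold: float = 0.5) -> bool:
    """Average the per-part similarities and compare against the threshold."""
    sims = part_similarities(measured, predicted, part_map)
    if not sims:
        return False  # nothing was segmented, so nothing can be verified
    target_similarity = sum(sims.values()) / len(sims)
    return target_similarity >= threshold


# Hypothetical data: head matches exactly, body is off by 1 degree on average.
parts: PartMap = [["head", "body"], ["body", None]]
measured = [[38.0, 37.0], [37.0, 20.0]]
predicted = [[38.0, 36.0], [36.0, 0.0]]
print(part_similarities(measured, predicted, parts))
print(verify(measured, predicted, parts))
```

Background pixels (label `None`) are ignored, so a large mismatch outside the organism does not affect the result.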
[0161] In some embodiments, the first determining module 801 is further configured to convert the first position range of the target organism in the infrared thermal image into a second position range of the target organism in the environmental image based on a pre-calibrated conversion relationship; to crop the environmental image using the second position range to obtain the biological visual image; and to make the proportion of the target organism in the biological visual image the same as the proportion of the target organism in the biological visual image.
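The position conversion and cropping can be sketched with a per-axis scale and offset as the pre-calibrated relationship. A real calibration may be a full homography. The `Calibration` fields, the box convention, and the function names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Calibration:
    """Hypothetical calibration: env = thermal * scale + offset, per axis."""
    scale_x: float
    scale_y: float
    offset_x: float = 0.0
    offset_y: float = 0.0


def thermal_to_env_box(box: Box, cal: Calibration, env_w: int, env_h: int) -> Box:
    """Map a thermal-image box into the environmental image and clamp to its bounds."""
    left, top, right, bottom = box

    def clamp(value: float, upper: int) -> int:
        return max(0, min(upper, int(round(value))))

    return (
        clamp(left * cal.scale_x + cal.offset_x, env_w),
        clamp(top * cal.scale_y + cal.offset_y, env_h),
        clamp(right * cal.scale_x + cal.offset_x, env_w),
        clamp(bottom * cal.scale_y + cal.offset_y, env_h),
    )


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Crop a row-major image (list of rows) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# Hypothetical sizes: a 4x3 thermal image and an 8x6 environmental image.
env_image = [[10 * y + x for x in range(8)] for y in range(6)]
env_box = thermal_to_env_box((1, 1, 3, 2), Calibration(2.0, 2.0), env_w=8, env_h=6)
patch = crop(env_image, env_box)
print(env_box, patch)
```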
[0162] In some embodiments, the verification module 804 is further configured to upload the candidate category, the biological visual image, and the biological thermal distribution image to the cloud if the verification fails; the cloud is configured to generate a fused feature map based on the biological texture image and the biological thermal distribution image; and generate a target category for the target object based on the fused feature map and the candidate category.
[0163] The descriptions of the apparatus embodiments above are similar to those of the method embodiments above, and have similar beneficial effects. In some embodiments, the functions or modules included in the apparatus provided in this application can be used to perform the methods described in the method embodiments above. For technical details not disclosed in the apparatus embodiments of this application, please refer to the descriptions of the method embodiments of this application for understanding.
[0164] It should be noted that, in the embodiments of this application, if the above-described methods are implemented as software functional modules and sold or used as independent products, they can also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, or the parts that contribute to related technologies, can be embodied in the form of a software product. This software product is stored in a storage medium and includes several instructions to cause a detection terminal (which may be a personal computer, server, or network device, etc.) to execute all or part of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), magnetic disks, or optical disks. Thus, the embodiments of this application are not limited to any specific hardware, software, or firmware, or any combination of hardware, software, and firmware.
[0165] This application provides a detection terminal, comprising a processor, a memory, a first acquisition unit, and a second acquisition unit. The memory stores a computer program that can run on the processor. When the processor executes the computer program, it implements some or all of the steps in the above method. The first acquisition unit is used to acquire infrared thermal images, and the second acquisition unit is used to acquire environmental images.
[0166] This application provides a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements some or all of the steps in the above-described method. The computer-readable storage medium can be transitory or non-transitory.
[0167] This application provides a computer program including computer-readable code. When the computer-readable code is run in a detection terminal, the processor in the detection terminal executes some or all of the steps in the above method.
[0168] This application provides a computer program product, which includes a non-transitory computer-readable storage medium storing a computer program. When the computer program is read and executed by a computer, it implements some or all of the steps in the above-described method. This computer program product can be implemented specifically through hardware, software, or a combination thereof. In some embodiments, the computer program product is specifically embodied as a computer storage medium; in other embodiments, the computer program product is specifically embodied as a software product, such as a software development kit (SDK), etc.
[0169] It should be noted that the descriptions of the various embodiments above tend to emphasize the differences between them, while their similarities or commonalities can be referred to interchangeably. The descriptions of the above embodiments of the device, storage medium, computer program, and computer program product are similar to the descriptions of the above method embodiments and have similar beneficial effects. For technical details not disclosed in the embodiments of the device, storage medium, computer program, and computer program product of this application, please refer to the descriptions of the method embodiments of this application for understanding.
[0170] Figure 9 is a schematic diagram of the hardware entity of a detection terminal provided in an embodiment of this application. As shown in Figure 9, the hardware entity of the detection terminal 900 includes: a processor 901, a memory 902, a first acquisition unit 903, and a second acquisition unit 904. The memory 902 stores a computer program that can run on the processor 901. When the processor 901 executes the program, it implements the steps in the method of any of the above embodiments.
[0171] The memory 902 stores computer programs that can run on the processor. The memory 902 is configured to store instructions and applications that can be executed by the processor 901. It can also cache data to be processed or already processed by the various modules in the processor 901 and the detection terminal 900 (e.g., image data, audio data, voice communication data and video communication data). It can be implemented by flash memory or random access memory (RAM).
[0172] The processor 901 executes the steps of any of the above methods when executing the program. The processor 901 typically controls the overall operation of the detection terminal 900.
[0173] The first acquisition unit 903 is used to acquire infrared thermal images.
[0174] The second acquisition unit 904 is used to acquire environmental images.
[0175] This application provides a computer storage medium that stores one or more programs, which can be executed by one or more processors to implement the steps of the methods described in any of the above embodiments.
[0176] It should be noted that the descriptions of the storage medium and device embodiments above are similar to the descriptions of the method embodiments above, and have similar beneficial effects. For technical details not disclosed in the storage medium and device embodiments of this application, please refer to the descriptions of the method embodiments of this application for understanding.
[0177] The aforementioned processor can be at least one of the following: Application Specific Integrated Circuit (ASIC), Digital Signal Processor (DSP), Digital Signal Processing Device (DSPD), Programmable Logic Device (PLD), Field Programmable Gate Array (FPGA), Central Processing Unit (CPU), Controller, Microcontroller, and Microprocessor. It is understood that other electronic devices can also implement the functions of the aforementioned processor, and this application does not specifically limit the specific implementation.
[0178] The aforementioned computer storage media / memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), magnetic random access memory (FRAM), flash memory, magnetic surface memory, optical disc, or compact disc read-only memory (CD-ROM), etc.; or it can be various terminals that include one or any combination of the above-mentioned memories, such as mobile phones, computers, tablet devices, personal digital assistants, etc.
[0179] It should be understood that the phrase "one embodiment" or "an embodiment" throughout the specification means that a specific feature, structure, or characteristic related to the embodiment is included in at least one embodiment of this application. Therefore, "in one embodiment" or "in an embodiment" appearing throughout the specification does not necessarily refer to the same embodiment. Furthermore, these specific features, structures, or characteristics can be combined in any suitable manner in one or more embodiments. It should be understood that in the various embodiments of this application, the sequence numbers of the above steps / processes do not imply a sequential order of execution; the execution order of each step / process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application. The sequence numbers of the above embodiments of this application are merely descriptive and do not represent the superiority or inferiority of the embodiments.
[0180] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element.
[0181] In the several embodiments provided in this application, it should be understood that the disclosed devices and methods can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of units is only a logical functional division, and in actual implementation, there may be other division methods, such as: multiple units or components can be combined, or integrated into another system, or some features can be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the various components shown or discussed can be through some interfaces, and the indirect coupling or communication connection between devices or units can be electrical, mechanical, or other forms.
[0182] The units described above as separate components may or may not be physically separate. The components shown as units may or may not be physical units. They may be located in one place or distributed across multiple network units. Some or all of the units may be selected to achieve the purpose of this embodiment according to actual needs.
[0183] Furthermore, in the various embodiments of this application, all functional units can be integrated into one processing unit, or each unit can be a separate unit, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or in a combination of hardware and software functional units. Those skilled in the art will understand that all or part of the steps of the above method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium. When the program is executed, it performs the steps of the above method embodiments. The aforementioned storage medium includes various media capable of storing program code, such as mobile storage devices, read-only memory (ROM), magnetic disks, or optical disks.
[0184] Alternatively, if the integrated units described above are implemented as software functional modules and sold or used as independent products, they can also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence or the part that contributes to related technologies, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a detection terminal (which may be a personal computer, server, or network device, etc.) to execute all or part of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as mobile storage devices, ROM, magnetic disks, or optical disks.
[0185] The above description is merely an embodiment of this application, but the scope of protection of this application is not limited thereto. Any changes or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application.
Claims
1. A method for detecting biological categories, characterized in that the method is applied to a detection terminal and includes: determining, based on an acquired infrared thermal image, a candidate category of a target organism in the environment, and determining a first location range and a biological thermal distribution image of the target organism in the infrared thermal image; determining, based on the first location range, a biological visual image corresponding to the target organism in an acquired environmental image, wherein the environmental image and the infrared thermal image carry feature information of the target organism at the same time; determining a predicted category and a predicted thermal distribution image of the target organism based on the candidate category and the biological visual image; and verifying the predicted thermal distribution image based on the biological thermal distribution image, and outputting the predicted category as the target category if the verification is successful.
2. The method according to claim 1, characterized in that the step of determining the predicted category and the predicted thermal distribution image of the target organism based on the candidate category and the biological visual image includes: performing feature extraction on the biological visual image to obtain a biological visual feature map corresponding to the biological visual image; inputting the candidate category and the biological visual feature map into a classification network to obtain the predicted category of the target organism; and inputting standard thermal distribution information corresponding to the predicted category and the biological visual feature map into a thermal distribution prediction network to determine the predicted thermal distribution image of the target organism.
3. The method according to claim 2, characterized in that the step of inputting the standard thermal distribution information corresponding to the predicted category and the biological visual feature map into the thermal distribution prediction network to determine the predicted thermal distribution image of the target organism includes: determining a segmentation result of the target organism based on the biological visual feature map, wherein the segmentation result includes a segmented region of the target organism and sub-regions of each body part of the target organism in the segmented region; and fitting the predicted thermal distribution image of the target organism based on the sub-regions of each body part of the target organism in the segmented region and the standard thermal distribution information.
4. The method according to claim 2, characterized in that the environmental image includes an infrared light image and a visible light image; the biological visual feature image includes a first biological image corresponding to the infrared light image and a second biological image corresponding to the visible light image; and the step of performing feature extraction on the biological visual image to obtain the biological visual feature map corresponding to the biological visual image includes: resizing the first biological image and the second biological image to obtain a first biological image and a second biological image of uniform size; performing feature extraction on the first biological image based on a first feature extraction network to obtain a first visual feature map; performing feature extraction on the second biological image based on a second feature extraction network to obtain a second visual feature map; and fusing the first visual feature map and the second visual feature map to obtain the biological visual feature map corresponding to the biological visual image.
5. The method according to claim 3, characterized in that the step of verifying the predicted thermal distribution image based on the biological thermal distribution image includes: segmenting the biological thermal distribution image and the predicted thermal distribution image respectively, based on the sub-regions of each body part of the target organism in the segmented region; for each body part, determining a first similarity between the body part in a first sub-distribution image of the biological thermal distribution image and the body part in a second sub-distribution image of the predicted thermal distribution image; generating a target similarity between the biological thermal distribution image and the predicted thermal distribution image based on the first similarity of each body part; and determining a verification result based on the target similarity and a similarity threshold.
6. The method according to any one of claims 1 to 5, characterized in that the step of determining, based on the first location range, the biological visual image corresponding to the target organism in the acquired environmental image includes: converting, based on a pre-defined conversion relationship, the first location range of the target organism in the infrared thermal image into a second location range of the target organism in the environmental image; and cropping the environmental image using the second location range to obtain the biological visual image, wherein the proportion of the target organism in the biological visual image is the same as the proportion of the target organism in the biological visual image.
7. The method according to any one of claims 1 to 5, characterized in that the method further includes: if the verification fails, uploading the candidate category, the biological visual image, and the biological thermal distribution image to the cloud, wherein the cloud is used to generate a fused feature map based on the biological texture image and the biological thermal distribution image, and to generate a target category of the target object based on the fused feature map and the candidate category.
8. A biological category detection device, characterized in that the device is applied to a detection terminal and includes: a first determining module, used to determine a candidate category of a target organism in the environment based on an acquired infrared thermal image, and to determine a first location range and a biological thermal distribution image of the target organism in the infrared thermal image; a second determining module, used to determine a biological visual image corresponding to the target organism in an acquired environmental image based on the first location range, wherein the environmental image and the infrared thermal image carry feature information of the target organism at the same time; a third determining module, used to determine a predicted category and a predicted thermal distribution image of the target organism based on the candidate category and the biological visual image; and a verification module, used to verify the predicted thermal distribution image based on the biological thermal distribution image, and to output the predicted category as the target category if the verification is successful.
9. A detection terminal, characterized in that the detection terminal includes a processor, a memory, a first acquisition unit, and a second acquisition unit, wherein: the memory stores a computer program that can run on the processor; when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 7; the first acquisition unit is used to acquire infrared thermal images; and the second acquisition unit is used to acquire environmental images.
10. A computer-readable storage medium, characterized in that it stores a computer program that, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.