A biological species identification method, device, equipment and storage medium
By combining an AR camera with a region and species recognition model and utilizing the positional relationship between the center point of the object frame and the anchor point, the efficiency and accuracy issues of biological species recognition in augmented reality environments are solved, thus improving the user experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HANGZHOU RUISHENG SOFTWARE CO LTD
- Filing Date
- 2022-08-15
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies struggle to efficiently and comprehensively identify biological objects in multiple frames of images in augmented reality environments, and the user experience is also inadequate.
The AR camera captures multiple frames of images, uses a region recognition model to determine the object bounding box, and combines the positional relationship between the center point of the object bounding box and the anchor point to determine the stability and consistency of the biological object. The species recognition model is then used to confirm the target species, and the results are fed back through the AR camera's visualization interface.
Biological objects in multi-frame images are identified efficiently and accurately in augmented reality environments, which improves the user experience.
Figure CN115346142B_ABST
Abstract
Description
Technical Field
[0001] One or more embodiments of the present invention relate to the field of image recognition technology, and more particularly to a method, apparatus, device and storage medium for identifying biological species.
Background Technology
[0002] Augmented Reality (AR) technology, which is based on Simultaneous Localization and Mapping (SLAM) technology, can merge virtual information with the real world. Commonly, users can observe virtual images superimposed on the real environment through AR cameras.
[0003] Thanks to the development of related technologies, AR technology can now be combined with image recognition, natural language processing and other technologies in some applications to provide a richer user experience.
Summary of the Invention
[0004] In view of the above, one or more embodiments of the present invention provide a method, apparatus, device and storage medium for identifying biological species.
[0005] To achieve the above objectives, one or more embodiments of the present invention provide the following technical solutions:
[0006] According to a first aspect of one or more embodiments of the present invention, a method for identifying biological species is provided, the method comprising:
[0007] Use the AR camera to capture multiple frames of images to be recognized in the current recognition scene;
[0008] For each frame of the image to be identified, based on a trained region recognition model, several object boxes contained in the image to be identified are determined; wherein each object box contains a biological object for species identification.
[0009] For each object frame, based on the positional relationship between the center point of the object frame and each current anchor point, it is determined whether the biological object contained in the object frame is the same as the biological object corresponding to each anchor point, and whether the biological object contained in the object frame is in a stable recognition state; wherein, the biological object corresponding to each anchor point is the biological object contained in the object frame where the anchor point is located;
[0010] If the number of times the biological object corresponding to any anchor point appears in other object frames in a stable recognition state exceeds a quantity threshold, the target species to which the biological object belongs is determined based on a trained species recognition model, and the target species is presented to the user through the visualization interface of the AR camera.
[0011] In one alternative implementation, determining whether the biological objects contained in the object frame are the same as the biological objects corresponding to each anchor point, and whether the biological objects contained in the object frame are in a stable identification state, includes:
[0012] Determine whether the distance between the center point of the object frame and each anchor point is less than a first distance threshold;
[0013] If the distance between the center point of the object frame and any anchor point is less than the first distance threshold, it is determined that the biological object contained in the object frame is the same as the biological object corresponding to the anchor point, and it is determined whether the distance between the center point of the object frame and the anchor point is less than the second distance threshold.
[0014] If the distance between the center point of the object frame and the anchor point is less than the second distance threshold, the biological object contained in the object frame is determined to be in a stable recognition state.
[0015] In an alternative implementation, if the distance between the center point of the object frame and any anchor point is less than the first distance threshold, the method further includes:
[0016] Determine the image similarity between the biological objects contained in the object frame and the biological objects corresponding to the anchor points;
[0017] If the image similarity meets the similarity requirement, it is determined that the biological object contained in the object frame is the same as the biological object corresponding to the anchor point.
[0018] In one alternative implementation, the method further includes:
[0019] If the distance between the center point of the object frame and all anchor points exceeds the first distance threshold, it is determined that the biological object contained in the object frame is different from the biological objects corresponding to all anchor points, and a new anchor point is added based on the center point of the object frame.
[0020] In one alternative implementation, the method further includes:
[0021] When determining several object boxes contained in an image to be identified based on a region recognition model, the biological object contained in the object box corresponding to the maximum confidence score is determined as the main identification object based on the confidence score output by the region recognition model for each object box, and the biological objects contained in the other object boxes besides the object box corresponding to the maximum confidence score are determined as background identification objects.
[0022] First, determine whether the main identification object is the same as the biological object corresponding to each anchor point, and whether the main identification object is in a stable identification state.
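The split into a main identification object and background identification objects can be sketched as follows. This is a minimal illustration, assuming each object box is represented as a hypothetical `(box_id, confidence)` pair; the identifiers and confidence values are invented for the example and are not taken from the invention.

```python
def split_main_and_background(scored_boxes):
    """Return (main, background) for one frame.

    scored_boxes: list of (box_id, confidence) pairs (hypothetical layout).
    The box with the maximum confidence becomes the main identification
    object; all other boxes become background identification objects.
    """
    if not scored_boxes:
        return None, []
    main = max(scored_boxes, key=lambda item: item[1])
    background = [item for item in scored_boxes if item is not main]
    return main, background


# Illustrative confidence scores for three object boxes in one frame.
frame_boxes = [("K1", 0.62), ("K2", 0.91), ("K3", 0.40)]
main_box, background_boxes = split_main_and_background(frame_boxes)
```

In this sketch the main identification object would then be compared against the anchor points before any background identification object.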
[0023] In one alternative implementation, the method further includes:
[0024] If the number of times the current main identification object appears in other object frames in a stable identification state exceeds the quantity threshold, the target species to which the current main identification object belongs is determined and presented to the user in an animated manner through the AR camera's visualization interface;
[0025] If the number of times any background identification object appears in other object frames in a stable identification state exceeds the quantity threshold, determine whether the number of times the current main identification object appears in other object frames in a stable identification state exceeds the quantity threshold, and whether the target species to which the current main identification object belongs has been identified.
[0026] If the number of times the current main identification object appears in other object frames in a stable identification state does not exceed the quantity threshold, or if the target species to which the current main identification object belongs has already been identified, the background identification object is determined as the new main identification object, and the target species to which the new main identification object belongs is determined and presented to the user in an animated manner through the AR camera's visualization interface.
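One possible reading of the rule for switching from the main identification object to a background identification object is sketched below. The function name, the dictionary of stable-appearance counts, and the threshold value of 5 are assumptions made for illustration. Skipping background objects that have already been identified is likewise an assumption, added so that the same object is not reported twice.

```python
def next_object_to_identify(main, background, stable_counts, identified,
                            count_threshold):
    """Return (object_to_identify, current_main) under the main/background rule.

    main:           id of the current main identification object
    background:     ids of the background identification objects
    stable_counts:  id -> number of stable appearances so far
    identified:     set of ids whose species has already been determined
    """
    main_ready = stable_counts.get(main, 0) > count_threshold
    if main_ready and main not in identified:
        # The main identification object is handled first.
        return main, main
    for obj in background:
        if stable_counts.get(obj, 0) > count_threshold and obj not in identified:
            # The main object is either not ready or already identified,
            # so the background object becomes the new main object.
            return obj, obj
    return None, main


# Illustrative state: the main object A1 is not yet stable enough, A2 is.
result = next_object_to_identify(
    main="A1", background=["A2"], stable_counts={"A1": 2, "A2": 6},
    identified=set(), count_threshold=5,
)
```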
[0027] In one alternative implementation, the method further includes:
[0028] In response to the user's selection action, the AR camera's visual interface displays the target species to which all selected biological objects in the current recognition scene belong.
[0029] According to a second aspect of one or more embodiments of the present invention, a biological species identification device is provided, the device comprising an image acquisition unit, an object frame determination unit, a state determination unit, and a species response unit; wherein:
[0030] The image acquisition unit is used to call the AR camera to acquire multiple frames of images to be recognized in the current recognition scene;
[0031] The object box determination unit is used to determine, for each frame of the image to be identified, a number of object boxes contained in the image to be identified based on a trained region recognition model; wherein each object box contains a biological object to be identified.
[0032] The state determination unit is used to determine, for each object frame, whether the biological object contained in the object frame is the same as the biological object corresponding to each anchor point, and whether the biological object contained in the object frame is in a stable recognition state, based on the positional relationship between the center point of the object frame and each current anchor point; wherein, the biological object corresponding to each anchor point is the biological object contained in the object frame where the anchor point is located.
[0033] The species response unit is used to determine the target species to which the biological object belongs based on a trained species recognition model when the number of times the biological object corresponding to any anchor point appears in other object frames in a stable recognition state exceeds a certain threshold, and to respond to the user with the target species through the visualization interface of the AR camera.
[0034] According to a third aspect of one or more embodiments of the present invention, an electronic device is provided, comprising:
[0035] A processor, and a memory for storing processor-executable instructions;
[0036] wherein the processor implements the steps of the method described in the first aspect by running the executable instructions.
[0037] According to a fourth aspect of one or more embodiments of the present invention, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the steps of the method described in the first aspect above.
[0038] As can be seen from the above description, after the present invention calls the AR camera to collect multiple frames of images in the current recognition scene, it first uses the model to determine the object box containing the biological object in each frame image. Then, based on the positional relationship between the center point of the object box and the known anchor points, it determines whether the biological object contained in the object box is a biological object that has appeared in the previously collected images, and whether this biological object is in a stable recognition state. Then, for the same biological object that appears multiple times in a stable recognition state, the model is used to identify the species to which it belongs and to provide feedback to the user.
[0039] This solution combines AR technology with image recognition technology in biological species identification scenarios. By using the positional relationship between the center point of the object frame and the anchor points to determine the relationships between biological objects and their identification states, it ensures that the multiple biological objects contained in the multi-frame images captured by the AR camera can be identified efficiently and without omission while in a clear and stable state. This enriches the user experience while taking into account the efficiency and accuracy required for the identification scenario.
Attached Figure Description
[0040] Figure 1 is a flowchart of a method for identifying biological species provided by an exemplary embodiment.
[0041] Figure 2 is a flowchart of a method for determining the relationships between biological objects and the state of biological objects, as shown in an exemplary embodiment.
[0042] Figure 3 is a flowchart of a method for determining the relationships between biological objects and the state of biological objects, as shown in another exemplary embodiment.
[0043] Figure 4 is a schematic diagram, according to an exemplary embodiment, of how an AR camera presents to a user the target species to which a biological object belongs.
[0044] Figure 5 is a flowchart of a method for identifying biological species provided by another exemplary embodiment.
[0045] Figure 6 is a schematic structural diagram of an electronic device containing a biological species identification device, provided by an exemplary embodiment.
[0046] Figure 7 is a block diagram of a biological species identification device provided by an exemplary embodiment.
Detailed Implementation
[0047] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with one or more embodiments of the present invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of one or more embodiments of the present invention as detailed in the appended claims.
[0048] It should be noted that the steps of the corresponding methods in other embodiments are not necessarily performed in the order shown and described in this invention. In some other embodiments, the methods may include more or fewer steps than those described in this invention. Furthermore, a single step described in this invention may be broken down into multiple steps in other embodiments; and multiple steps described in this invention may be combined into a single step in other embodiments.
[0049] Augmented Reality (AR) technology is a technology that integrates virtual information with the real world. It typically relies on Simultaneous Localization and Mapping (SLAM) technology and computer vision. By comparing the differences between multiple frames captured by cameras on smartphones and other electronic devices during movement, it constructs a 3D model of the real world and then overlays specific images, text, and other virtual information onto the constructed environment. For example, in some game scenarios, users can use an AR camera to place preset virtual characters or animations at designated locations in the real world constructed by the camera.
[0050] Thanks to the rapid development of related technologies, AR technology can now be combined with other technologies such as image recognition and natural language processing in some applications to provide more diverse functional services and enhance the user experience.
[0051] In view of this, the present invention proposes a method for identifying biological species. This method combines AR technology and image recognition technology in the context of biological species identification, which can balance the user experience with the efficiency and accuracy required for the identification scenario.
[0052] The biological species identification method can be applied to various electronic devices such as smartphones, tablets, personal computers, or wearable smart devices through multiple methods, including apps, mini-programs, and web pages. Specifically, when the identification method is run in the form of an app, mini-program, or web page, the electronic device executing the identification method can be the smartphone, tablet, personal computer, or wearable smart device, or it can be a server that interacts with the smartphone, tablet, personal computer, or wearable smart device. This invention does not impose specific limitations on this.
[0053] Please refer to Figure 1, which is a flowchart of a biological species identification method provided by an exemplary embodiment of the present invention.
[0054] The method for identifying biological species may include the following specific steps:
[0055] Step 102: Use the AR camera to capture multiple frames of images to be recognized in the current recognition scene.
[0056] In this embodiment, when a user launches a species identification app, mini-program, or web page on an electronic device, the AR camera can first capture multiple frames of images to be identified in the current identification scene, then construct a three-dimensional model of the real world based on those frames, and identify the species of the biological objects appearing in the current identification scene.
[0057] The specific process by which the AR camera constructs a 3D representation of the real world based on multiple frames of images to be recognized can be found in relevant technologies and will not be elaborated here. In one possible scenario, the AR camera is configured as a pre-packaged component on the electronic device, such as ARCore or ARKit. When executing the biological species recognition method, the AR camera can be invoked and related data processed by interacting with the pre-packaged component API.
[0058] The number of images to be identified captured by the AR camera in the current recognition scene is not limited. Specifically, the number can be determined by the number and spatial distribution of the biological objects to be identified in the current recognition scene.
[0059] Step 104: For each frame of the image to be identified, based on the trained region recognition model, determine a number of object boxes contained in the image to be identified; wherein each object box contains a biological object for species identification.
[0060] In this embodiment, for each frame of the image to be identified captured by the AR camera in the current recognition scene, the frame of the image to be identified can be used as input to the trained region recognition model, and based on the output of the region recognition model, a number of object boxes contained in the frame of the image to be identified are determined; wherein, each object box contains a biological object to be identified, and the specific number of object boxes contained in the image to be identified is determined by the number of biological objects contained in the frame of the image to be identified.
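As an illustration of how the output of the region recognition model might be turned into object boxes and center points, the following minimal sketch assumes the model emits one row `(x_min, y_min, x_max, y_max, confidence)` per detected biological object, in pixel coordinates. That row layout and all names are hypothetical; the invention does not prescribe an output format.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ObjectBox:
    """Axis-aligned object box in pixel coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    confidence: float

    def center(self) -> Tuple[float, float]:
        """Center point of the box, later compared against anchor points."""
        return ((self.x_min + self.x_max) / 2.0,
                (self.y_min + self.y_max) / 2.0)


def boxes_from_output(rows: Sequence[Sequence[float]]) -> List[ObjectBox]:
    """Convert raw model rows into one ObjectBox per biological object."""
    return [ObjectBox(*row) for row in rows]


# Hypothetical model output for a frame containing two biological objects.
raw_output = [(100, 200, 300, 400, 0.91), (350, 80, 450, 220, 0.62)]
boxes = boxes_from_output(raw_output)
centers = [box.center() for box in boxes]
```

The number of boxes equals the number of rows, which matches the statement that the number of object boxes is determined by the number of biological objects in the frame.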
[0061] There are no specific restrictions on the algorithm used to implement the region recognition model. For example, biological objects contained in the sample images can be labeled in advance as object boxes, and then the original region recognition model can be trained through supervised learning using the labeled sample image set.
[0062] It should be noted that the AR camera may acquire the multiple frames of images at several different locations, and that determining the object boxes in each frame does not need to wait until all images have been acquired; the two processes do not conflict. For example, assuming there is a plant at each of points P1 and P2 in the real world, the AR camera can continuously acquire multiple frames of images as the user moves from point P1 to point P2. Determining the object boxes in a previous frame does not affect the AR camera's acquisition of subsequent frames.
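The frame-by-frame behavior described here can be sketched with a generator that handles each frame as soon as it arrives. This is only an illustration of streaming processing: `camera_frames` and `detect` are hypothetical stand-ins for the AR camera and the region recognition model, and a real system might additionally run acquisition and detection concurrently.

```python
events = []  # records the order in which capture and detection happen


def camera_frames(count):
    """Hypothetical stand-in for the AR camera: yields frames one at a time."""
    for index in range(count):
        events.append(f"capture {index}")
        yield f"frame-{index}"


def detect(frame):
    """Hypothetical stand-in for the region recognition model."""
    events.append(f"detect {frame}")
    return []  # no object boxes in this toy example


def detect_as_frames_arrive(frames, detector):
    """Run detection on each frame without waiting for the full sequence."""
    for frame in frames:
        yield frame, detector(frame)


results = list(detect_as_frames_arrive(camera_frames(2), detect))
```

After this runs, `events` shows capture and detection interleaved frame by frame, rather than all captures followed by all detections.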
[0063] Step 106: For each object frame, based on the positional relationship between the center point of the object frame and each current anchor point, determine whether the biological object contained in the object frame is the same as the biological object corresponding to each anchor point, and whether the biological object contained in the object frame is in a stable recognition state; wherein, the biological object corresponding to each anchor point is the biological object contained in the object frame where the anchor point is located.
[0064] In this embodiment, after determining several object boxes contained in a frame of an image to be identified based on a region recognition model, the positional relationship between the center point of the object box and each known anchor point can be determined for each object box. Then, based on the positional relationship between the center point of the object box and each anchor point, it can be determined whether the biological object contained in the object box is the biological object corresponding to each anchor point, and whether the biological object contained in the object box is in a stable recognition state.
[0065] First, let's explain the concept of anchor points. As mentioned earlier, AR technology can be used to construct 3D models on electronic devices to recreate the real world. An anchor point is a fixed point in the constructed real world, typically used to place virtual avatars. Specifically, when a user moves the electronic device, the anchor point may move to different positions in the 2D view of the camera. However, if these different positions in the camera view are mapped to the 3D space of the real world, they all point to the same point, the anchor point. Correspondingly, the virtual character or animation placed at the anchor point can change its display as the user and their electronic device move closer or further away.
[0066] In this embodiment, the position of the anchor point can be determined based on the position of the center point of each object frame contained in the image to be identified. The determination process may involve the calibration and conversion of the world coordinate system, camera coordinate system, and pixel coordinate system. For details, please refer to the relevant technology, which will not be elaborated here.
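The coordinate conversion mentioned above can be illustrated with the standard pinhole camera model. The sketch below assumes that the camera intrinsics (`fx`, `fy`, `cx`, `cy`), the depth of the center point along the optical axis, and the camera-to-world rotation and translation are all available, for example from the AR framework. These parameter names and the numeric values are assumptions for illustration, not details given by the invention.

```python
def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth into camera coordinates."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def camera_to_world(point_cam, rotation, translation):
    """Apply a camera-to-world pose: world = rotation * point + translation."""
    return tuple(
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Illustrative values: a box center at pixel (570, 240), 2 m from the camera.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
center_cam = pixel_to_camera(570, 240, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
anchor_world = camera_to_world(center_cam, identity, (0.5, 0.0, 0.0))
# center_cam is (1.0, 0.0, 2.0) and anchor_world is (1.5, 0.0, 2.0)
```

The resulting world coordinates are what would be stored as the anchor point position in this sketch.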
[0067] It can be understood that, because of the spatial mapping relationship between anchor points and the center points of object frames, there is also a corresponding relationship between anchor points and the biological objects contained within the object frames. The biological object corresponding to any anchor point is the biological object contained within the object frame where that anchor point is located. In other words, the biological object corresponding to any anchor point is the biological object contained within the object frame used to determine the anchor point's position. For example, if the coordinates of the center point C1 of object frame K1 in the constructed real world are set as the coordinates of anchor point A1, then the biological object corresponding to anchor point A1 is the biological object contained within object frame K1.
[0068] When processing a particular frame of the image to be recognized, the anchor points used are determined based on the center points of the object boxes contained in the previously processed frames of the image to be recognized. For example, assuming the region recognition model determines that the first frame of the image to be recognized captured by the AR camera contains object boxes K1 and K2, the coordinates of the center point C1 of object box K1 in the constructed real world can be set as the coordinates of anchor point A1, and the coordinates of the center point C2 of object box K2 can be set as the coordinates of anchor point A2. Then, when processing the second frame of the image to be recognized, the positional relationship between the center points of each object box contained in the second frame of the image to be recognized and anchor points A1 and A2 is determined respectively. In addition, new anchor points can be added based on the center points of each object box contained in the second frame of the image to be recognized. Then, when processing the third frame of the image to be recognized, the positional relationship between the center points of each object box contained in the third frame of the image to be recognized and anchor points A1, A2, and the newly added anchor points can be determined respectively. The processing of subsequent frames of the image to be recognized is similar and will not be elaborated further.
[0069] In order to ensure that each biological object contained in each frame of the image to be identified can be identified efficiently and without omission in a stable and clear state, in step 106, for each object frame in each frame of the image to be identified, the positional relationship between the center point of the object frame and each known anchor point will be used to determine whether the biological object contained in the object frame is the biological object corresponding to each anchor point, that is, whether it is a biological object that has appeared in the processed image to be identified, and whether the biological object contained in the object frame is in a stable identification state.
[0070] Please refer to Figure 2, which is a flowchart illustrating a method for determining the relationships between biological objects and the state of biological objects in an exemplary embodiment.
[0071] In one alternative implementation, step 106, determining whether the biological objects contained in the object frame are the same as the biological objects corresponding to each anchor point, and whether the biological objects contained in the object frame are in a stable identification state, may include the following specific steps:
[0072] Step 1062: Determine whether the distance between the center point of the object frame and each anchor point is less than a first distance threshold.
[0073] Step 1064: If the distance between the center point of the object frame and any anchor point is less than the first distance threshold, determine that the biological object contained in the object frame is the same as the biological object corresponding to the anchor point, and determine whether the distance between the center point of the object frame and the anchor point is less than the second distance threshold.
[0074] Step 1066: If the distance between the center point of the object frame and the anchor point is less than the second distance threshold, it is determined that the biological object contained in the object frame is in a stable recognition state.
[0075]
[0076] Table 1
[0077] As shown in Table 1, in the above implementation, the relationship between biological objects and the state of biological objects are determined based on the distance between the center point of the object frame and each anchor point. The first distance threshold is used to determine whether the biological object contained in the object frame is a known biological object corresponding to any anchor point at the current time, and the second distance threshold is used to determine whether the biological object contained in the object frame is in a stable recognition state.
[0078] Based on the previous example, assuming the first distance threshold is 0.4m and the second distance threshold is 0.2m, after determining anchor point A1 from the center point C1 of object frame K1 and anchor point A2 from the center point C2 of object frame K2, for object frame K3 in the second frame image to be identified, if the distance between the center point C3 of object frame K3 and anchor point A1 in the real world is 0.17m, which is less than both the first and second distance thresholds, then it can be determined that the biological object contained in object frame K3 is the same as the biological object corresponding to anchor point A1. In other words, the biological object contained in object frame K3 is the same as the biological object contained in object frame K1, and this biological object is in a stable recognition state. However, if the distance between the center point C3 of object frame K3 and anchor point A2 in the real world is 1.72m, which is greater than the first distance threshold, then it can be determined that the biological object contained in object frame K3 is not the biological object corresponding to anchor point A2. In other words, the biological object contained in object frame K3 is not the biological object contained in object frame K2.
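The two-threshold decision in this example can be sketched as follows. The coordinates are invented so that the distances match the 0.17m and 1.72m of the example. Treating a distance exactly equal to the first threshold as "different", and labeling the range between the two thresholds as "same object, not yet stable", are assumptions about cases the text does not spell out.

```python
import math
from enum import Enum


class Relation(Enum):
    DIFFERENT = "different object"
    SAME_UNSTABLE = "same object, not yet stable"
    SAME_STABLE = "same object, stable"


def classify(center, anchor, first_threshold=0.4, second_threshold=0.2):
    """Compare a box center with one anchor point (both in metres, world space)."""
    distance = math.dist(center, anchor)
    if distance >= first_threshold:
        return Relation.DIFFERENT
    if distance < second_threshold:
        return Relation.SAME_STABLE
    return Relation.SAME_UNSTABLE


# Illustrative coordinates: C3 is 0.17 m from A1 and 1.72 m from A2.
c3 = (0.17, 0.0, 0.0)
a1 = (0.0, 0.0, 0.0)
a2 = (0.17, 1.72, 0.0)
relation_to_a1 = classify(c3, a1)
relation_to_a2 = classify(c3, a2)
```

Here `relation_to_a1` is `Relation.SAME_STABLE` and `relation_to_a2` is `Relation.DIFFERENT`, matching the conclusions drawn for object frame K3.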
[0079] In another alternative implementation, the relationship between the biological objects contained in the object frame and the biological objects corresponding to each anchor point can be further determined by combining image similarity. Specifically, in step 1064, if the distance between the center point of the object frame and any anchor point is less than the first distance threshold, the method may further include:
[0080] Determine the image similarity between the biological object contained in the object frame and the biological object corresponding to the anchor point; if the image similarity meets the similarity requirement, determine that the biological object contained in the object frame is the same as the biological object corresponding to the anchor point.
[0081] Based on the previous example, assume that the similarity requirement is that the image similarity exceeds a similarity threshold, and that the preset similarity threshold is 85%. After determining that the distance between the center point C3 of the object frame K3 and the anchor point A1 in the real world is 0.17m, which is less than the first distance threshold of 0.4m, it can be further determined that the image similarity between the biological object contained in the object frame K3 and the biological object corresponding to the anchor point A1 is 92%. Because this exceeds the similarity threshold of 85%, it is determined that the biological object contained in the object frame K3 is the same as the biological object corresponding to the anchor point A1. Step 1066 is then executed to determine whether the biological object is in a stable recognition state.
[0082] The image similarity between the biological object contained in the object frame and the biological object corresponding to the anchor point can be determined using a trained similarity determination model. For example, the images of the biological objects contained in the object frame and the biological object corresponding to the anchor point are input into the trained similarity determination model as input parameters, and then the similarity between the two is determined based on the output of the similarity determination model.
[0083] There are no specific restrictions on the algorithm used to implement the similarity determination model. For example, the similarity between any two sample images in the sample image set can be marked, and then the original similarity determination model can be trained in a supervised manner using the marked sample image set.
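The combined distance and similarity check can be sketched as below. Cosine similarity between two feature vectors is used purely as a stand-in for the trained similarity determination model, which the invention leaves unspecified. The feature vectors, function names, and default thresholds (taken from the example values of 0.4m and 85%) are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Stand-in similarity score between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_same_object(distance_m, similarity,
                   first_threshold=0.4, similarity_threshold=0.85):
    """Same object only if it is close to the anchor AND looks similar enough."""
    return distance_m < first_threshold and similarity > similarity_threshold


# Values from the example: 0.17 m from anchor A1, image similarity of 92%.
same_as_a1 = is_same_object(0.17, 0.92)
```

With the example values, `same_as_a1` is `True`; a nearby object with a similarity of 80% would be rejected despite the small distance.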
[0084] Furthermore, since the AR camera may capture images to be identified while moving, the biological objects contained in each frame of the images to be identified may not be the same. A frame processed later may contain biological objects that were not included in the previously processed frames.
[0085] In response to this situation, in order to ensure that each biological object contained in each frame of the image to be identified can be identified efficiently and without omission in a stable and clear state, it may be necessary to add new anchor points.
[0086] Please refer to Figure 3, which is a flowchart of a method for determining the relationships between biological objects and the state of biological objects according to another exemplary embodiment.
[0087] In an alternative implementation, step 106 may also include:
[0088] Step 1068: If the distance between the center point of the object frame and all anchor points exceeds the first distance threshold, determine that the biological objects contained in the object frame are different from the biological objects corresponding to all anchor points, and add a new anchor point based on the center point of the object frame.
[0089] Based on the previous example, for the object frame K4 contained in the second frame of the image to be identified, if the distance between the center point C4 of the object frame K4 and the anchor point A1 in the real world is 2.71m, which is greater than the first distance threshold of 0.4m, and at the same time, the distance between the center point C4 of the object frame K4 and the anchor point A2 in the real world is 2.01m, which is also greater than the first distance threshold of 0.4m, then it can be determined that the biological object contained in the object frame K4 is different from the biological objects corresponding to the anchor points A1 and A2. The coordinates of the center point C4 of the object frame K4 in the constructed real world are set as the coordinates of the anchor point A3. In the processing of the third frame of the image to be identified, the anchor points A1, A2 and the newly added anchor point A3 are used.
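Step 1068 can be sketched as below. The coordinates are hypothetical and chosen only so that the distances reproduce the example (2.71m to A1 and about 2.01m to A2); the function name `match_or_add_anchor` and the choice of the nearest anchor when several lie within the threshold are assumptions of this sketch, since the description does not specify a tie-breaking rule.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # coordinates in the AR world, in metres
FIRST_DISTANCE_THRESHOLD_M = 0.4    # value taken from the worked example


def match_or_add_anchor(center: Point, anchors: List[Point]) -> Tuple[int, bool]:
    """Return (anchor index, was_added) for the centre of an object frame."""
    # Nearest existing anchor, or None when no anchor exists yet.
    nearest = min(
        range(len(anchors)),
        key=lambda i: math.dist(center, anchors[i]),
        default=None,
    )
    if nearest is not None and math.dist(center, anchors[nearest]) < FIRST_DISTANCE_THRESHOLD_M:
        return nearest, False
    # Every anchor is too far away: the centre becomes a new anchor.
    anchors.append(center)
    return len(anchors) - 1, True


anchors = [(0.0, 0.0, 0.0), (0.7, 0.0, 0.0)]  # hypothetical A1 and A2
c4 = (2.71, 0.0, 0.0)                         # hypothetical centre C4
index, added = match_or_add_anchor(c4, anchors)
print(index, added, len(anchors))             # A3 is stored at index 2
```

The updated anchor list is then used unchanged when the next frame is processed.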
[0090] Step 108: If the number of times the biological object corresponding to any anchor point appears in other object frames in a stable recognition state exceeds the quantity threshold, the target species to which the biological object belongs is determined based on the trained species recognition model, and the target species is fed back to the user through the visualization interface of the AR camera.
[0091] In this embodiment, for each biological object corresponding to each known anchor point, it can be determined whether the number of times the biological object appears in a stable recognition state exceeds a preset number threshold. If so, the image of the biological object, such as the object box where the biological object is located, is used as an input parameter to the trained species recognition model. Then, based on the output result of the species recognition model, the target species to which the biological object belongs is determined.
[0092] Specifically, based on the previous example, assuming the preset quantity threshold is 2, and the distance between the center point C5 of the object box K5 and the anchor point A1 in the real world is 0.11m, which is less than the first distance threshold of 0.4m and less than the second distance threshold of 0.2m, it can be determined that the biological object contained in the object box K5 is the same as the biological object corresponding to the anchor point A1 and is in a stable recognition state. In the second frame of the image to be recognized, which was processed earlier, the biological object contained in the object box K3 is also the same as the biological object corresponding to the anchor point A1 and is in a stable recognition state. Therefore, the number of times the biological object corresponding to the anchor point A1 appears in a stable recognition state is greater than or equal to the quantity threshold of 2, and the target species to which the biological object corresponding to the anchor point A1 belongs can be determined based on the trained species recognition model.
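The counting rule in this example can be sketched as follows. The class `StableCounter`, its method names and the stub species model are hypothetical. Note that Step 108 says the count "exceeds" the threshold while the example treats reaching the threshold as sufficient; this sketch follows the example and uses "greater than or equal to".

```python
from typing import Callable, Dict, Optional

QUANTITY_THRESHOLD = 2  # value assumed in the worked example


class StableCounter:
    """Counts stable appearances per anchor and triggers species recognition."""

    def __init__(self, species_model: Callable[[object], str]) -> None:
        self.species_model = species_model  # stand-in for the trained model
        self.counts: Dict[str, int] = {}
        self.species: Dict[str, str] = {}

    def observe(self, anchor_id: str, stable: bool, crop: object) -> Optional[str]:
        """Record one matched object frame; return the species once known."""
        if anchor_id in self.species:
            return self.species[anchor_id]  # already identified, do not rerun
        if stable:
            self.counts[anchor_id] = self.counts.get(anchor_id, 0) + 1
        if self.counts.get(anchor_id, 0) >= QUANTITY_THRESHOLD:
            self.species[anchor_id] = self.species_model(crop)
            return self.species[anchor_id]
        return None


counter = StableCounter(lambda crop: "example species")
print(counter.observe("A1", True, "K3 crop"))  # frame 2: one stable appearance
print(counter.observe("A1", True, "K5 crop"))  # frame 3: threshold reached
```

Caching the result per anchor means the species recognition model runs at most once for each biological object, which is one plausible reading of the efficiency goal stated in the description.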
[0093] There are no specific restrictions on the algorithm used to implement the species identification model. For example, the target species to which the biological objects contained in the sample images belong can be labeled, and then the original species identification model can be trained in a supervised manner using the labeled sample image set.
[0094] After identifying the target species of a biological object, the AR camera's visualization interface can provide feedback to the user using text, images, animations, or other means. For example, based on AR technology, the identified target species can be overlaid near the biological object in the real world constructed by the AR camera using text, images, or animations. The information about the target species can include not only the species name but also related information such as species characteristics.
[0095] Please refer to Figure 4, which illustrates, according to an exemplary embodiment, how an AR camera presents to the user the target species to which a biological object belongs.
[0096] As shown in Figure 4, the current recognition scene contains two plants to be identified: plant PL1 at point P1 and plant PL2 at point P2. Plant PL1 belongs to the genus Crassula and family Crassulaceae. It is poisonous, has no weeds, and does not require watering. Plant PL2 belongs to the genus Sansevieria and family Liliaceae. It is non-toxic, has no weeds, and requires watering. The above information can be displayed in the AR camera interface in the form of cards or small windows.
[0097] Since AR cameras can capture images of the scene to be identified while the user is moving, a single user action can identify the species of multiple biological objects. To improve the user experience, in one alternative implementation, the biological species identification method further includes:
[0098] In response to the user's selection action, the AR camera's visual interface displays the target species to which all selected biological objects in the current recognition scene belong.
[0099] Specifically, users can perform a selection operation on each biological object appearing in the current recognition scene, selecting some or all of the biological objects for recognition and feedback. After the identification of the species to which the biological object belongs is completed, the electronic device can respond to the user's selection operation and provide feedback to the user on the target species to which one or more biological objects belong through the AR camera's visualization interface in the form of text, images, animations, etc.
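As a rough illustration of this selection step, the sketch below filters the identified species down to the objects the user selected. The function name `labels_for_selection` and the plain-text label format are hypothetical, and skipping selected objects that have not been identified yet is an assumption of this sketch; the plant names reuse the Figure 4 example.

```python
from typing import Dict, Iterable, List


def labels_for_selection(identified: Dict[str, str], selected: Iterable[str]) -> List[str]:
    """Build display labels only for selected objects whose species is known."""
    return [f"{obj}: {identified[obj]}" for obj in selected if obj in identified]


identified = {"PL1": "Crassula", "PL2": "Sansevieria"}
print(labels_for_selection(identified, ["PL2"]))
```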
[0100] As can be seen from the above description, after the present invention calls the AR camera to collect multiple frames of images in the current recognition scene, it first uses the model to determine the object box containing the biological object in each frame image. Then, based on the positional relationship between the center point of the object box and the known anchor points, it determines whether the biological object contained in the object box is a biological object that has appeared in the previously collected images, and whether this biological object is in a stable recognition state. Then, for the same biological object that appears multiple times in a stable recognition state, the model is used to identify the species to which it belongs and to provide feedback to the user.
[0101] This solution combines AR technology with image recognition technology in biological species identification scenarios. By determining the relationship between biological objects and their identification status through the positional relationship between the center point of the object frame and the anchor points, it ensures that multiple biological objects contained in the multi-frame images captured by the AR camera can be identified efficiently and without omission in a clear and stable state. This enriches the user experience while taking into account the efficiency and accuracy required for the identification scenario.
[0102] To further improve the execution efficiency of the above technical solutions, based on the above embodiments, it is possible to further distinguish between the main identification object and the background identification object for each biological object, thereby making the identification of the species to which each biological object belongs more orderly and efficient.
[0103] Please refer to Figure 5, which is a flowchart of a method for identifying biological species provided by another exemplary embodiment of the present invention.
[0104] The method for identifying biological species may include the following specific steps:
[0105] Step 202: Call the AR camera to collect multiple frames of images to be recognized in the current recognition scene.
[0106] Step 204: For each frame of the image to be identified, based on the trained region recognition model, determine several object boxes contained in the image to be identified, and based on the confidence score output by the region recognition model for each object box, determine the biological object contained in the object box corresponding to the maximum confidence score as the main identification object, and determine the biological objects contained in the other object boxes besides the object box corresponding to the maximum confidence score as background identification objects; wherein, each object box contains one biological object to be identified as a species.
[0107] Step 206: Based on the positional relationship between the center point of the object frame and each current anchor point, first determine whether the main identification object is the same as the biological object corresponding to each anchor point, and whether the main identification object is in a stable identification state. Then determine whether the background identification object is the same as the biological object corresponding to each anchor point, and whether the background identification object is in a stable identification state.
[0108] Step 208: If the number of times the current main identification object appears in other object frames in a stable identification state exceeds the quantity threshold, the target species to which the current main identification object belongs is determined based on the trained species identification model, and that target species is fed back to the user in an animated manner through the visualization interface of the AR camera.
[0109] Step 210: If the number of times any background identification object appears in other object frames in a stable identification state exceeds the quantity threshold, determine whether the number of times the current main identification object appears in other object frames in a stable identification state exceeds the quantity threshold, and whether the target species to which the current main identification object belongs has been identified.
[0110] Step 212: If the number of times the current main identification object appears in other object frames in a stable identification state does not exceed the quantity threshold, or if the target species to which the current main identification object belongs has been identified, the background identification object is determined as the new main identification object. Based on the trained species identification model, the target species to which the new main identification object belongs is determined. The target species to which the new main identification object belongs is then fed back to the user in an animated manner through the AR camera's visualization interface.
[0111] In this embodiment, when the region recognition model determines the number of object boxes included in a frame of the image to be recognized, it outputs the confidence level of each object box containing a biological object. Based on the confidence level of each object box, the biological object contained in the object box with the highest confidence level can be identified as the main recognition object, while the biological objects contained in other object boxes can be identified as background recognition objects.
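The split into main and background recognition objects amounts to taking the object box with the maximum confidence. A minimal sketch follows, with a hypothetical `ObjectBox` type and made-up confidence values; how ties are broken is not specified in the description, so here the first box with the highest score wins.

```python
from typing import List, NamedTuple, Tuple


class ObjectBox(NamedTuple):
    name: str
    confidence: float  # score output by the region recognition model


def split_main_and_background(boxes: List[ObjectBox]) -> Tuple[ObjectBox, List[ObjectBox]]:
    """Return (main recognition object, background recognition objects)."""
    if not boxes:
        raise ValueError("at least one object box is required")
    main = max(boxes, key=lambda box: box.confidence)
    background = [box for box in boxes if box is not main]
    return main, background


boxes = [ObjectBox("box_1", 0.93), ObjectBox("box_2", 0.81)]  # hypothetical scores
main, background = split_main_and_background(boxes)
print(main.name, [box.name for box in background])
```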
[0112] When determining the relationship between biological objects and the recognition status of biological objects based on the positional relationship between the center point and the anchor point of the object frame, the main recognition object is processed first, and the background recognition objects are processed after the main recognition object is processed.
[0113] If the number of times the current main identification object appears in other object frames in a stable identification state exceeds the number threshold, the target species to which the current main identification object belongs can be determined using the trained species identification model. Then, after determining the target species, the AR camera's visualization interface responds to the user in an animated manner.
[0114] If the number of times any background identification object appears in other object frames in a stable identification state exceeds the quantity threshold, and either the number of times the current main identification object appears in other object frames in a stable identification state does not exceed the quantity threshold or the target species to which the current main identification object belongs has already been identified, the background identification object can be determined as the new main identification object. Then, the trained species identification model is used to determine the target species to which the new main identification object belongs. After determining the target species, the AR camera's visualization interface responds to the user with animation.
[0115] Based on the previous example, assuming that the third frame of the image to be identified contains object boxes K5 and K6, where the confidence level of object box K5 output by the region recognition model is higher than that of object box K6, it can be determined that the biological object contained in object box K5 is the main recognition object, and the biological object contained in object box K6 is the background recognition object.
[0116] First, based on the distance between the center point C5 of object frame K5 and anchor points A1, A2, and A3, it is determined that the main recognition object contained in object frame K5 is the same as the biological object corresponding to anchor point A1 and is in a stable recognition state. Then, based on the distance between the center point C6 of object frame K6 and anchor points A1, A2, and A3, it is determined that the background recognition object contained in object frame K6 is the same as the biological object corresponding to anchor point A2 and is in a stable recognition state.
[0117] If the number of times the main identification object appears in a stable identification state is greater than or equal to the quantity threshold, the target species to which the main identification object in the object frame K5 belongs is determined based on the species identification model, and the target species is displayed by animation near the main identification object in the real world constructed by the AR camera; then, if the number of times the background identification object in the object frame K6 appears in a stable identification state is less than the quantity threshold, species identification and user feedback are not performed.
[0118] Suppose that the fourth frame of the image to be identified contains object boxes K7 and K8. The confidence level of object box K7 output by the region recognition model is higher than that of object box K8. It can be determined that the biological object contained in object box K7 is the main recognition object, and the biological object contained in object box K8 is the background recognition object.
[0119] First, based on the distance between the center point C7 of object frame K7 and anchor points A1, A2, and A3, it is determined that the main recognition object contained in object frame K7 is the same as the biological object corresponding to anchor point A1 and is in a stable recognition state. Then, based on the distance between the center point C8 of object frame K8 and anchor points A1, A2, and A3, it is determined that the background recognition object contained in object frame K8 is the same as the biological object corresponding to anchor point A2 and is in a stable recognition state.
[0120] If the number of times the background identification object contained in the object frame K8 appears in a stable identification state is greater than or equal to a preset quantity threshold, and the target species to which the main identification object contained in the object frame K7 belongs has been identified, the background identification object contained in the object frame K8 is determined as the new main identification object. Based on the species identification model, the target species to which the biological object contained in the object frame K8 belongs is determined, and the target species is displayed by animation near the new main identification object in the real world constructed by the AR camera.
[0121] The processing of subsequent frames of the image to be identified follows the same principle and will not be described in detail here.
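The decision made in steps 208 to 212 for the third and fourth frames above can be sketched as follows. The `Tracked` type and the function `choose_object_to_identify` are hypothetical, the quantity threshold of 2 and the stable counts are read off the worked example, and skipping background objects that were already identified is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

QUANTITY_THRESHOLD = 2  # value assumed in the worked example


@dataclass
class Tracked:
    anchor: str
    stable_count: int         # stable appearances so far
    identified: bool = False  # whether the target species is already known


def choose_object_to_identify(main: Tracked, background: List[Tracked]) -> Optional[Tracked]:
    """Pick the object whose species should be identified for this frame."""
    # Step 208: the main object is ready and has not been identified yet.
    if main.stable_count >= QUANTITY_THRESHOLD and not main.identified:
        return main
    # Steps 210 and 212: reaching this point means the main object is either
    # not ready or already identified, so a ready background object is promoted.
    for obj in background:
        if obj.stable_count >= QUANTITY_THRESHOLD and not obj.identified:
            return obj
    return None


# Third frame: K5 matches A1 (stable in frames 2 and 3), K6 matches A2 (stable once).
frame3 = choose_object_to_identify(Tracked("A1", 2), [Tracked("A2", 1)])
# Fourth frame: A1 is already identified, A2 has now been stable twice.
frame4 = choose_object_to_identify(Tracked("A1", 3, identified=True), [Tracked("A2", 2)])
print(frame3.anchor, frame4.anchor)
```

Processing the main recognition object first means that at most one species recognition call is made per frame in this sketch, with background objects served only when the main object does not need one.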
[0122] It is understandable that the specific implementation methods of other contents in steps 202 to 212 can be found in the contents recorded in steps 102 to 108 above, and will not be repeated here.
[0123] AR recognition can also be used to collect information about the true size of organisms, which can be used to assist species classification engines in identification, similarity judgment, and as a basis for displaying information on the results page.
[0124] Please refer to Figure 6, which illustrates the structure of an electronic device for identifying biological species, as provided in an exemplary embodiment of the present invention. At the hardware level, the electronic device includes a processor 602, an internal bus 604, a network interface 606, a memory 608, and a non-volatile memory 610, and may also include other hardware required for various operations. One or more embodiments of the present invention can be implemented in software, for example, the processor 602 can read the corresponding computer program from the non-volatile memory 610 into the memory 608 and then run it. Of course, besides software implementation, one or more embodiments of the present invention do not exclude other implementation methods, such as logic devices or a combination of hardware and software, etc. That is to say, the execution entity of the following processing flow is not limited to individual logic units, but can also be hardware or logic devices.
[0125] Please refer to Figure 7, which shows a biological species identification device provided by an exemplary embodiment of the present invention. This biological species identification device can be applied to, for example, the electronic device shown in Figure 6 to implement the technical solution of the present invention. The biological species identification device includes an image acquisition unit 710, an object frame determination unit 720, a state determination unit 730, and a species response unit 740; wherein:
[0126] The image acquisition unit 710 is used to call the AR camera to acquire multiple frames of images to be recognized in the current recognition scene;
[0127] The object box determination unit 720 is used to determine, for each frame of the image to be identified, a plurality of object boxes contained in the image to be identified based on a trained region recognition model; wherein each object box contains a biological object to be identified.
[0128] The state determination unit 730 is used to determine, for each object frame, whether the biological object contained in the object frame is the same as the biological object corresponding to each anchor point, and whether the biological object contained in the object frame is in a stable recognition state, based on the positional relationship between the center point of the object frame and each current anchor point; wherein, the biological object corresponding to each anchor point is the biological object contained in the object frame where the anchor point is located.
[0129] The species response unit 740 is used to determine the target species to which the biological object belongs based on a trained species recognition model when the number of times the biological object corresponding to any anchor point appears in other object frames in a stable recognition state exceeds a certain threshold, and to respond to the user with the target species through the visualization interface of the AR camera.
[0130] Optionally, the state determination unit 730, when determining whether the biological object contained in the object frame is the same as the biological object corresponding to each anchor point, and whether the biological object contained in the object frame is in a stable recognition state, is specifically used for:
[0131] Determine whether the distance between the center point of the object frame and each anchor point is less than a first distance threshold;
[0132] If the distance between the center point of the object frame and any anchor point is less than the first distance threshold, it is determined that the biological object contained in the object frame is the same as the biological object corresponding to the anchor point, and it is determined whether the distance between the center point of the object frame and the anchor point is less than the second distance threshold.
[0133] If the distance between the center point of the object frame and the anchor point is less than the second distance threshold, the biological object contained in the object frame is determined to be in a stable recognition state.
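The two-threshold check performed by the state determination unit 730 can be sketched as below. The enum `MatchState` and the function `classify_distance` are hypothetical names, and the default thresholds of 0.4m and 0.2m are the values used in the examples of the description, not fixed parts of the method.

```python
from enum import Enum


class MatchState(Enum):
    DIFFERENT = "different object"
    SAME_UNSTABLE = "same object, not in a stable recognition state"
    SAME_STABLE = "same object, in a stable recognition state"


def classify_distance(distance_m: float, first: float = 0.4, second: float = 0.2) -> MatchState:
    """Classify the distance between a frame centre and one anchor point."""
    if second >= first:
        raise ValueError("the second threshold must be smaller than the first")
    if distance_m >= first:
        return MatchState.DIFFERENT
    if distance_m < second:
        return MatchState.SAME_STABLE
    return MatchState.SAME_UNSTABLE


for d in (0.17, 0.30, 2.71):  # 0.30 is a hypothetical in-between value
    print(d, classify_distance(d).value)
```

Because the second threshold is smaller than the first, a stable object is always also a matched object, so the three states are mutually exclusive.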
[0134] Optionally, the state determination unit 730, when the distance between the center point of the object frame and any anchor point is less than the first distance threshold, is further configured to:
[0135] Determine the image similarity between the biological objects contained in the object frame and the biological objects corresponding to the anchor points;
[0136] If the image similarity meets the similarity requirement, it is determined that the biological object contained in the object frame is the same as the biological object corresponding to the anchor point.
[0137] Optionally, the state determination unit 730 is further configured to:
[0138] If the distance between the center point of the object frame and all anchor points exceeds the first distance threshold, it is determined that the biological object contained in the object frame is different from the biological objects corresponding to all anchor points, and a new anchor point is added based on the center point of the object frame.
[0139] Optionally, when determining a number of object boxes contained in an image to be identified based on a region recognition model, the object box determining unit 720 is further configured to: determine the biological object contained in the object box corresponding to the maximum confidence score as the main identification object based on the confidence score output by the region recognition model for each object box, and determine the biological objects contained in other object boxes besides the object box corresponding to the maximum confidence score as background identification objects.
[0140] The state determination unit 730 is further configured to: preferentially determine whether the main identification object is the same as the biological object corresponding to each anchor point, and whether the main identification object is in a stable identification state.
[0141] Optionally, the species response unit 740 is further configured to:
[0142] If the number of times the current main identification object appears in other object frames in a stable identification state exceeds the quantity threshold, the target species to which the current main identification object belongs is determined, and that target species is fed back to the user in an animated manner through the AR camera's visualization interface;
[0143] If the number of times any background identification object appears in other object frames in a stable identification state exceeds the quantity threshold, determine whether the number of times the current main identification object appears in other object frames in a stable identification state exceeds the quantity threshold, and whether the target species to which the current main identification object belongs has been identified.
[0144] If the number of times the current main identification object appears in other object frames in a stable identification state does not exceed the quantity threshold, or if the target species to which the current main identification object belongs has been identified, the background identification object is determined as the new main identification object, the target species to which the new main identification object belongs is determined, and that target species is fed back to the user in an animated manner through the AR camera's visualization interface.
[0145] Optionally, the species response unit 740 is further configured to:
[0146] In response to the user's selection action, the AR camera's visual interface displays the target species to which all selected biological objects in the current recognition scene belong.
[0147] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer, which can take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email sending and receiving device, game console, tablet computer, wearable device, or any combination of these devices.
[0148] In a typical configuration, a computer includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.
[0149] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.
[0150] Computer-readable media, including both permanent and non-permanent, removable and non-removable media, can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, quantum memory, graphene-based storage media, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.
[0151] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0152] The foregoing has described specific embodiments of the invention. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps described in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired results. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
[0153] The terminology used in one or more embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. The singular forms “a,” “said,” and “the” used in one or more embodiments of the invention and in the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term “and / or” as used herein refers to and includes any or all possible combinations of one or more associated listed items.
[0154] It should be understood that although the terms first, second, third, etc., may be used to describe various information in one or more embodiments of the present invention, such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, first information may also be referred to as second information without departing from the scope of one or more embodiments of the present invention, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "at the time of," "when," or "in response to a determination."
[0155] The above description is merely a preferred embodiment of one or more embodiments of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of one or more embodiments of the present invention should be included within the protection scope of one or more embodiments of the present invention.
Claims
1. A method of identifying a biological species, characterized by, The method includes: Use the AR camera to capture multiple frames of images to be recognized in the current recognition scene; For each frame of the image to be identified, based on a trained region recognition model, several object boxes contained in the image to be identified are determined; wherein each object box contains a biological object for species identification. Each anchor point is determined, wherein the position of each anchor point is predetermined based on the position of the center point of the object frame containing the biological object in the historical frame image; For each object frame, based on the geometric distance between the center point of the object frame and each of the current anchor points, it is determined whether the biological object contained in the object frame is the same object as the biological object corresponding to any anchor point, and whether the biological object is in a stable recognition state; wherein, if the geometric distance between the center point of the object frame and an anchor point is less than a first distance threshold, it is determined to be the same object; if the geometric distance is further less than a second distance threshold, it is determined that the biological object is in a stable recognition state, where the second distance threshold is less than the first distance threshold; If the number of times a biological object corresponding to any anchor point appears in other object frames in a stable recognition state exceeds a certain threshold, the image of the object frame containing the biological object corresponding to that anchor point is input into a trained species recognition model to determine the target species to which the biological object belongs, and the target species is responded to to the user through the visualization interface of the AR camera.
2. The method of claim 1, wherein determining, for each object box, based on the geometric distance between the center point of the object box and each of the current anchor points, whether the biological object contained in the object box is the same object as the biological object corresponding to any anchor point, and whether the biological object is in a stable recognition state, comprises:
calculating the geometric distance between the center point of the object box and each anchor point;
comparing the geometric distance with a preset first distance threshold and a preset second distance threshold, wherein the first distance threshold is greater than the second distance threshold;
if the geometric distance between the center point of the object box and any anchor point is less than the first distance threshold, determining that the biological object contained in the object box and the biological object corresponding to that anchor point are the same object, and further determining whether the geometric distance is less than the second distance threshold; and
if the geometric distance is less than the second distance threshold, determining that the biological object contained in the object box is in a stable recognition state.
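The two-threshold test and the stable-appearance count in claims 1 and 2 can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that the "geometric distance" is the Euclidean distance in pixels, and the threshold values, the `Anchor` class, and the function names are hypothetical. Stopping at the first matching anchor is also an assumption.

```python
import math
from dataclasses import dataclass

FIRST_DISTANCE_THRESHOLD = 80.0   # assumed value, in pixels
SECOND_DISTANCE_THRESHOLD = 25.0  # assumed value; must be smaller than the first
COUNT_THRESHOLD = 3               # assumed number of stable appearances


@dataclass
class Anchor:
    """Hypothetical anchor record: a position plus a stable-appearance counter."""
    x: float
    y: float
    stable_count: int = 0


def classify(center, anchor):
    """Return (same_object, stable) for one box center and one anchor."""
    # Assumption: "geometric distance" is taken to be Euclidean distance.
    distance = math.hypot(center[0] - anchor.x, center[1] - anchor.y)
    same_object = distance < FIRST_DISTANCE_THRESHOLD
    stable = same_object and distance < SECOND_DISTANCE_THRESHOLD
    return same_object, stable


def update_anchors(centers, anchors):
    """Process one frame; return the anchors whose count now exceeds the threshold."""
    ready = []
    for center in centers:
        for anchor in anchors:
            same_object, stable = classify(center, anchor)
            if stable:
                anchor.stable_count += 1
                if anchor.stable_count > COUNT_THRESHOLD and anchor not in ready:
                    ready.append(anchor)
            if same_object:
                break  # assumption: a box is matched to at most one anchor
    return ready
```

Under these assumptions, an anchor returned by `update_anchors` is the point at which the cropped object box would be passed to the species recognition model.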
3. The method according to claim 2, characterized in that, if the distance between the center point of the object box and any anchor point is less than the first distance threshold, the method further comprises:
determining the image similarity between the biological object contained in the object box and the biological object corresponding to that anchor point; and
if the image similarity meets the similarity requirement, determining that the biological object contained in the object box is the same object as the biological object corresponding to that anchor point.
4. The method of claim 3, wherein the method further comprises:
if the distance between the center point of the object box and every anchor point exceeds the first distance threshold, determining that the biological object contained in the object box is different from the biological objects corresponding to all anchor points, and adding a new anchor point based on the center point of the object box;
wherein, when the distance between the center point of the object box and every anchor point exceeds the first distance threshold, the biological object contained in the object box and the biological objects corresponding to the anchor points are regarded as differing in image similarity.
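The anchor matching of claims 3 and 4 can be sketched as below. The claims do not specify how image similarity is computed, so cosine similarity over feature vectors is used here purely as a stand-in; the threshold values, the dictionary layout of an anchor, and the function names are all hypothetical. The sketch also treats a nearby but dissimilar box as a new object, which is one possible reading rather than something the claims state unambiguously.

```python
import math

FIRST_DISTANCE_THRESHOLD = 80.0  # assumed value, in pixels
SIMILARITY_THRESHOLD = 0.8       # assumed "similarity requirement"


def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors (a stand-in for image similarity)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_or_add(center, features, anchors):
    """Return the index of the matching anchor, appending a new anchor if none matches."""
    for index, anchor in enumerate(anchors):
        distance = math.dist(center, anchor["center"])
        if distance < FIRST_DISTANCE_THRESHOLD:
            # Close enough: confirm identity with the similarity check (claim 3).
            if cosine_similarity(features, anchor["features"]) >= SIMILARITY_THRESHOLD:
                return index
    # No anchor matched: register the box center as a new anchor (claim 4).
    anchors.append({"center": center, "features": features})
    return len(anchors) - 1
```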
5. The method of claim 1, wherein the method further comprises:
when determining the several object boxes contained in an image to be identified based on the region recognition model, determining, based on the confidence score output by the region recognition model for each object box, the biological object contained in the object box with the maximum confidence score as the primary identification object, and determining the biological objects contained in the remaining object boxes as background identification objects; and
first determining whether the primary identification object is the same object as the biological object corresponding to each anchor point, and whether the primary identification object is in a stable recognition state.
6. The method of claim 5, wherein the method further comprises:
if the number of times the current primary identification object appears in other object boxes in a stable recognition state exceeds the count threshold, inputting the primary identification object into the trained species recognition model to determine the target species to which the current primary identification object belongs, and presenting that target species to the user in an animated manner through the visualization interface of the AR camera;
if the number of times any background identification object appears in other object boxes in a stable recognition state exceeds the count threshold, determining whether the number of times the current primary identification object appears in other object boxes in a stable recognition state exceeds the count threshold, and whether the target species to which the current primary identification object belongs has been identified; and
if the number of times the current primary identification object appears in other object boxes in a stable recognition state does not exceed the count threshold, or if the target species to which the current primary identification object belongs has already been identified, determining the background identification object as the new primary identification object, inputting the new primary identification object into the trained species recognition model to determine the target species to which it belongs, and presenting that target species to the user in an animated manner through the visualization interface of the AR camera.
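The primary/background selection of claims 5 and 6 can be sketched as a small decision function. The `Tracked` class, its field names, and the threshold value are hypothetical, and skipping background objects that were already identified is an added assumption that the claims do not state.

```python
from dataclasses import dataclass
from typing import List, Optional

COUNT_THRESHOLD = 3  # assumed number of stable appearances


@dataclass
class Tracked:
    """Hypothetical record for one tracked biological object."""
    name: str
    confidence: float
    stable_count: int = 0
    identified: bool = False


def pick_primary(objects: List[Tracked]) -> Tracked:
    """Claim 5: the box with the highest confidence score is the primary object."""
    return max(objects, key=lambda obj: obj.confidence)


def next_to_identify(primary: Tracked, backgrounds: List[Tracked]) -> Optional[Tracked]:
    """Claim 6: choose which object, if any, goes to the species model next."""
    primary_ready = primary.stable_count > COUNT_THRESHOLD
    if primary_ready and not primary.identified:
        return primary
    for candidate in backgrounds:
        if candidate.stable_count > COUNT_THRESHOLD and not candidate.identified:
            # The primary is either not stable enough or already identified,
            # so this background object is promoted to primary.
            return candidate
    return None
```

With this ordering, a stable primary object is always identified before any background object, and a background object takes over only once the primary is either unstable or already done.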
7. The method according to claim 1, characterized in that the method further comprises: in response to a selection action by the user, displaying, through the visualization interface of the AR camera, the target species to which all selected biological objects in the current recognition scene belong.
8. A biological species identification device, characterized in that the device comprises an image acquisition unit, an object box determination unit, a state determination unit, and a species response unit, wherein:
the image acquisition unit is used to call the AR camera to acquire multiple frames of images to be identified in the current recognition scene;
the object box determination unit is used to determine, for each frame of the image to be identified, several object boxes contained in the image to be identified based on a trained region recognition model, wherein each object box contains a biological object to be identified;
the state determination unit is used to determine the current anchor points, wherein the position of each anchor point is predetermined based on the position of the center point of an object box containing a biological object in a historical frame image; and to determine, for each object box, based on the geometric distance between the center point of the object box and each of the current anchor points, whether the biological object contained in the object box is the same object as the biological object corresponding to any anchor point, and whether the biological object is in a stable recognition state; wherein, if the geometric distance between the center point of the object box and an anchor point is less than a first distance threshold, the two are determined to be the same object; and if the geometric distance is further less than a second distance threshold, the biological object is determined to be in a stable recognition state, the second distance threshold being less than the first distance threshold; and
the species response unit is used to, when the number of times the biological object corresponding to any anchor point appears in other object boxes in a stable recognition state exceeds a count threshold, input the image of the object box containing that biological object into a trained species recognition model to determine the target species to which the biological object belongs, and to present the target species to the user through the visualization interface of the AR camera.
9. An electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor implements the steps of the method according to any one of claims 1-7 by running the executable instructions.
10. A computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the method according to any one of claims 1-7.