Image recognition method and apparatus, and electronic device and storage medium
By acquiring user-end images and generating knowledge data based on topic classification trees, this approach solves the problem that existing image recognition technologies cannot provide in-depth knowledge, thus achieving the effect of providing children with a systematic learning experience and enhanced interactivity.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- SHENZHEN DR LOOKAI TECHNOLOGY CO LTD
- Filing Date
- 2025-12-03
- Publication Date
- 2026-06-18
AI Technical Summary
Existing image recognition technology cannot provide children with in-depth and systematic knowledge, and traditional education methods lack interactivity and fun, making it difficult to stimulate children's learning interest and desire to explore.
An image to be recognized is acquired from the user terminal, and the subject category name of the target object is determined. Knowledge data is then generated from the target topic classification tree, which contains the sub-category items of the target object, and returned to the user terminal.
This provides children with a systematic and rich knowledge base, improves their learning experience, and enhances their interest and interactivity in learning.
Figure CN2025139620_18062026_PF_FP_ABST
Description
Image recognition method, apparatus, electronic device and storage medium
Technical Field
[0001] This invention relates to the field of artificial intelligence, and in particular to an image recognition method, apparatus, electronic device, and storage medium.
[0002] This application claims priority to Chinese Patent Application No. 202411818100.5, filed on December 10, 2024, entitled "Image Recognition Method, Apparatus, Electronic Device and Storage Medium", the entire contents of which are incorporated herein by reference.
Background Technology
[0003] With the rapid development of technology, artificial intelligence (AI) has been widely applied in various fields, especially in education, where it plays a significant role in improving learning efficiency and increasing the enjoyment of learning. Children's ability to recognize and understand their surroundings is particularly important during the learning process, especially in the early stages of cognitive development. However, traditional teaching methods often rely on printed books or oral instruction from teachers. These methods are not only slow in updating information but also lack sufficient interactivity and engagement, making it difficult to stimulate children's learning interest and desire for exploration. Furthermore, while existing image recognition technologies can identify objects in images, they often remain at the recognition level, failing to provide children with deeper and more systematic knowledge. Therefore, how to combine AI technology to provide children with a learning platform that is both fun and efficient has become an urgent problem to be solved.
Technical Problem
[0004] This invention provides an image recognition method intended for children's learning that supplies more in-depth and systematic knowledge. An image to be recognized is acquired from the user terminal; the target subject category name corresponding to the target object is determined from the image; and the target topic classification tree corresponding to the target object is determined from the target subject category name. The target topic classification tree includes the target sub-category items of the target object. Using the target topic classification tree, knowledge data can be generated for the different sub-category items, providing users with systematic knowledge related to the target object, helping them build a more complete knowledge system, and improving the user experience.
[0005] In a first aspect, embodiments of the present invention provide an image recognition method, the method comprising the following steps:
[0006] Obtain the image to be recognized input from the user terminal;
[0007] Based on the image to be recognized, determine the target subject category name corresponding to the target object;
[0008] Based on the target subject category name, determine a target topic classification tree corresponding to the target object, the target topic classification tree including the target sub-category items of the target object;
[0009] Based on the target topic classification tree, generate knowledge data for the target object;
[0010] Return the knowledge data to the user terminal.
[0011] Optionally, generating knowledge data for the target object based on the target topic classification tree includes:
[0012] Based on the target topic classification tree, determine the target sub-category items of the target object;
[0013] Generate target prompt words corresponding to the target sub-category item, and generate knowledge data of the target object based on the target prompt words.
[0014] Optionally, generating target prompt words corresponding to the target sub-category item, and generating knowledge data of the target object based on the target prompt words, includes:
[0015] Obtain the prompt word template corresponding to the target sub-category item, with one prompt word template corresponding to each target sub-category item;
[0016] Based on the prompt word template, generate the target prompt words corresponding to the target sub-category item;
[0017] The target prompt words are input into a large model to generate knowledge data corresponding to the target sub-category item;
[0018] The knowledge data corresponding to each target sub-category item in the target topic classification tree are integrated to obtain the knowledge data of the target object.
[0019] Optionally, determining the target subject category name corresponding to the target object based on the image to be identified includes:
[0020] An image recognition engine performs target recognition on the image to be recognized to obtain at least one object to be identified and the subject category name corresponding to each such object;
[0021] A target object is determined from among the at least one object to be identified;
[0022] Based on the subject category name of the object to be identified, the target subject category name corresponding to the target object is determined.
[0023] Optionally, determining the target topic classification tree corresponding to the target object based on the target subject category name includes:
[0024] Based on the target subject category name, the target standard name of the target object is determined;
[0025] Based on the correspondence between standard names and topic classification trees, a target topic classification tree corresponding to the target object is determined, with each target standard name corresponding to one topic classification tree.
[0026] Optionally, before determining the target topic classification tree corresponding to the target object based on the target subject category name, the method further includes:
[0027] Collect name information for different objects;
[0028] The name information is standardized to obtain the standard name of the object;
[0029] In addition, the alias of the object is determined from the name information;
[0030] Based on the standard name and the alias, a name database is constructed. The name database is used to look up the standard name of an object from the subject category name of the object.
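The name database described in the steps above can be sketched as a simple lookup table. The following is a minimal illustration only, not the implementation of the embodiment: the class name `NameDatabase`, the normalization rule (trim, collapse whitespace, lowercase), and the sample names and aliases are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NameDatabase:
    """Maps any known name of an object (standard name or alias) to its standard name."""

    _index: Dict[str, str] = field(default_factory=dict)

    @staticmethod
    def _normalize(name: str) -> str:
        # Hypothetical standardization rule: trim, collapse whitespace, lowercase.
        return " ".join(name.split()).lower()

    def add(self, standard_name: str, aliases: List[str]) -> None:
        # Register the standard name itself, then every alias that points to it.
        self._index[self._normalize(standard_name)] = standard_name
        for alias in aliases:
            self._index[self._normalize(alias)] = standard_name

    def lookup(self, subject_category_name: str) -> Optional[str]:
        # Returns None when the recognition engine output is not in the database.
        return self._index.get(self._normalize(subject_category_name))


# Illustrative entries only.
name_db = NameDatabase()
name_db.add("dog", aliases=["domestic dog", "puppy"])
name_db.add("mobile phone", aliases=["cell phone", "smartphone"])

print(name_db.lookup("  Puppy "))
print(name_db.lookup("Smartphone"))
```

Keeping aliases and standard names in one index means the lookup works the same way whether the recognition engine outputs a standard or a non-standard name.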
[0031] Optionally, before determining the target topic classification tree corresponding to the target object based on the target subject category name, the method further includes:
[0032] Based on the Nice Classification or biological taxonomy, the objects are classified to obtain topic classification trees;
[0033] Establish a mapping relationship between standard names and the topic classification trees so that the corresponding topic classification tree can be found by using the standard name.
[0034] In a second aspect, embodiments of the present invention also provide an image recognition apparatus, the image recognition apparatus comprising:
[0035] The acquisition module is used to acquire the image to be recognized input from the user terminal;
[0036] The recognition module is used to determine the target subject category name corresponding to the target object based on the image to be recognized;
[0037] The first processing module is used to determine a target topic classification tree corresponding to the target object based on the target subject category name, wherein the target topic classification tree includes the target sub-category items of the target object;
[0038] The generation module is used to generate knowledge data of the target object based on the target topic classification tree;
[0039] The return module is used to return the knowledge data to the user terminal.
[0040] In a third aspect, embodiments of the present invention provide an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps in the image recognition method provided in embodiments of the present invention.
[0041] In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps of the image recognition method provided in the embodiments of the invention.
[0042] In this embodiment of the invention, an image to be recognized input from a user terminal is acquired; based on the image to be recognized, the target subject category name corresponding to the target object is determined; based on the target subject category name, a target topic classification tree corresponding to the target object is determined, the target topic classification tree including target sub-category items of the target object; based on the target topic classification tree, knowledge data of the target object is generated; and the knowledge data is returned to the user terminal. Because the target topic classification tree includes the target sub-category items of the target object, knowledge data can be generated for the different sub-category items, providing users with systematic knowledge related to the target object, helping users establish a more complete and systematic knowledge system, and improving the user experience.
Brief Description of the Drawings
[0043] Figure 1 is a flowchart of an image recognition method provided in an embodiment of the present invention;
[0044] Figure 2 is a schematic diagram of the structure of an image recognition device provided in an embodiment of the present invention;
[0045] Figure 3 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present invention.
Embodiments of the Present Invention
[0046] Figure 1 is a flowchart of an image recognition method provided by an embodiment of the present invention. As shown in Figure 1, the image recognition method includes the following steps:
[0047] 101. Obtain the image to be recognized input from the user terminal.
[0048] In this embodiment of the invention, the image recognition method described above can be applied to an image recognition platform. This platform can be built on a server or a distributed server and includes an image interface (for user uploads), a knowledge database, and an image recognition program. The image interface can be used to acquire image data, which can be a single image, consecutive frame images, or video stream data. The image recognition program can be used to implement each step of the image recognition method. The knowledge database is specifically used to provide additional related information for the identified image subject, improving the image recognition system's understanding of the content.
[0049] The aforementioned image to be recognized can include one or more identifiable entity objects. It can be image data uploaded by the user, or image data downloaded after the user provides a download address. The user may supply one or more images; in this embodiment of the invention, only one image to be recognized is processed at a time.
[0050] The aforementioned user terminal can be a device used to capture images, such as a smartphone, tablet, computer, learning camera, VR device, etc. Users take photos of objects of interest through the user terminal and upload the captured images to the image recognition platform as the images to be recognized.
[0051] The aforementioned objects can be animals, plants, commodities, items, landscapes, scenes, text, etc.
[0052] 102. Based on the image to be identified, determine the target subject category name corresponding to the target object.
[0053] In this embodiment of the invention, after the image to be recognized is obtained, the target subject category name corresponding to the target object can be determined by an intelligent object recognition engine. The intelligent object recognition engine can be based on a trained target recognition model or on a large model. Specifically, the target recognition model can perform target recognition on the image to be recognized to obtain the target subject category name corresponding to the target object; alternatively, the image to be recognized can be input into a large model, which recognizes the target object in the image and outputs the corresponding target subject category name.
[0054] The aforementioned target recognition model can be built on deep learning algorithms such as R-CNN, Faster R-CNN, YOLO, and SSD. A trained target recognition model is obtained by training the model on a dataset that includes sample images and label data. The sample images contain entity objects, and the label data gives the category names corresponding to those entity objects. During training, the sample images are input into the target recognition model, which performs target recognition and outputs category names for the entity objects in the sample images. A loss function calculates the error loss between the output category names and the label data, and minimizing this error loss is the optimization objective. The backpropagation algorithm adjusts the model parameters of the target recognition model. The adjustment is iterated a set number of times, or until the error loss converges to a predetermined value, at which point training is complete and the trained target recognition model is obtained.
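The structure of the training loop described above (compute the loss against the label data, adjust the parameters by gradient, stop after a fixed number of iterations or once the loss reaches a preset value) can be illustrated with a deliberately small stand-in. The sketch below is not a detector such as YOLO or Faster R-CNN; it substitutes a single-layer softmax classifier on toy two-dimensional feature vectors, whose gradient is written out by hand rather than obtained through a backpropagation library. The data, learning rate, and thresholds are assumptions chosen for illustration.

```python
import math

# Toy dataset (hypothetical): 2-D feature vectors stand in for sample images,
# and integer labels stand in for category names.
SAMPLES = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
LABELS = [0, 0, 1, 1]
NUM_CLASSES = 2


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def mean_loss(weights):
    # Cross-entropy between the predicted probabilities and the label data.
    total = sum(math.log(predict_proba(weights, x)[y]) for x, y in zip(SAMPLES, LABELS))
    return -total / len(SAMPLES)


def train(max_iters=1000, target_loss=0.05, lr=1.0):
    weights = [[0.0, 0.0] for _ in range(NUM_CLASSES)]
    for _ in range(max_iters):            # stop after a fixed number of iterations...
        if mean_loss(weights) <= target_loss:
            break                         # ...or once the loss reaches the preset value
        # Gradient of the mean cross-entropy with respect to each weight.
        grads = [[0.0, 0.0] for _ in range(NUM_CLASSES)]
        for x, y in zip(SAMPLES, LABELS):
            probs = predict_proba(weights, x)
            for c in range(NUM_CLASSES):
                error = probs[c] - (1.0 if c == y else 0.0)
                for j, xj in enumerate(x):
                    grads[c][j] += error * xj / len(SAMPLES)
        # Gradient descent step on every parameter.
        for c in range(NUM_CLASSES):
            for j in range(len(weights[c])):
                weights[c][j] -= lr * grads[c][j]
    return weights, mean_loss(weights)


trained_weights, final_loss = train()
print(f"final loss: {final_loss:.4f}")
```

In a real system the model, loss, and optimizer would come from a deep learning framework, but the two stopping conditions would play the same role.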
[0055] The aforementioned large models can be CV (computer vision) large models or multimodal large models. Either kind can be a pre-trained large model, such as a model from the GPT-4, GLM-4V, LLaVA, or CLIP series, or an interface to a commercial large model, such as the interfaces for Tongyi Qianwen and Wenxin Yiyan.
[0056] For the pre-trained large model, it can be adjusted using a pre-prepared dataset to better suit the scenario of outputting entity object names in this embodiment. The dataset includes sample images and label data. The sample images contain entity objects, and the label data represents the names of the entity objects. The pre-trained large model is set to output the category names of the entity objects. The sample images are input into the pre-trained large model, which generates the output category names corresponding to the entity objects in the sample images. The error loss between the output category names and the label data is calculated using a loss function. Minimizing the error loss is the optimization objective. The backpropagation algorithm is used to fine-tune the model parameters of the pre-trained large model. This fine-tuning process is iterated a certain number of times, or until the error loss converges to a predetermined value, completing the adjustment of the pre-trained large model.
[0057] For the interface of the commercial large model, preset prompt words can be used as input to enable the commercial large model to output names of entity objects that conform to this embodiment. For example, the prompt words can be designed as follows:
[0058] "You are a classification expert who can accurately classify entities in images and know the category names of entities under different classification systems.
[0059] Please provide an image: XXX.
[0060] Please identify the entities in the image and output the category names of the entities in the image."
[0061] With targeted prompt words, the commercial large model can be guided, through its interface, to output the name of the entity object in the form required by this embodiment, and that name is used as the target subject category name corresponding to the target object.
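Assembling the preset prompt words can be sketched as plain string templating. The example below makes no call to any commercial model interface; it only builds the prompt text quoted above and parses a reply. The `{image_ref}` placeholder (standing in for "XXX"), the function names, and the assumption that the model replies with comma- or newline-separated names are all hypothetical.

```python
import re
from typing import List

# Prompt words from the example above; {image_ref} stands in for "XXX".
RECOGNITION_PROMPT = (
    "You are a classification expert who can accurately classify entities in "
    "images and know the category names of entities under different "
    "classification systems.\n"
    "Please provide an image: {image_ref}.\n"
    "Please identify the entities in the image and output the category names "
    "of the entities in the image."
)


def build_recognition_prompt(image_ref: str) -> str:
    """Fill the image reference into the preset prompt words."""
    return RECOGNITION_PROMPT.format(image_ref=image_ref)


def parse_category_names(model_reply: str) -> List[str]:
    """Split a reply into category names.

    Assumes (hypothetically) that the model lists names separated by commas
    or line breaks; a real interface may need a stricter output format.
    """
    parts = re.split(r"[,\n]+", model_reply)
    return [part.strip() for part in parts if part.strip()]


prompt = build_recognition_prompt("photo_001.jpg")
print(prompt)
print(parse_category_names("dog, ball\ntree\n"))
```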
[0062] The intelligent object recognition engine processes the image to be recognized, obtaining the target object's corresponding category name. Employing advanced image recognition technology, the engine accurately identifies subjects in children's photos and matches them with categories in its database. The engine can use a multimodal large-scale model as its recognition model to identify objects in the image.
[0063] In one possible embodiment, if the intelligent object recognition engine identifies multiple objects, it can mark these objects in the image to be recognized, return the marked image to the user terminal for display, and prompt the user to select one object as the target object. After the user makes a selection, the user terminal sends the selection result to the image recognition platform. The image recognition platform determines the object selected by the user as the target object and obtains the target subject category name of the target object.
[0064] In one possible embodiment, if the intelligent object recognition engine identifies multiple objects, it can mark these objects in the image to be recognized, return the marked image to the user terminal for display, and prompt the user to select multiple objects as target objects. After the user selects multiple objects, the user terminal sends the selection result to the image recognition platform. The image recognition platform determines the objects selected by the user as target objects and obtains the target subject category name of each target object.
[0065] In one possible embodiment, if the intelligent object recognition engine identifies multiple objects, it can select the most important object among the multiple objects as the target object. For example, it can select the object with the highest confidence level as the target object, or the object closest to the center of the image as the target object, or the object with the largest area as the target object.
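The three selection rules mentioned above (highest confidence, closest to the image center, largest area) can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Detection` structure, the bounding-box format `(x_min, y_min, x_max, y_max)`, and the sample values are hypothetical and are not prescribed by the embodiment.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    name: str                                # subject category name
    confidence: float                        # recognition confidence in [0, 1]
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)


def box_area(d: Detection) -> float:
    x0, y0, x1, y1 = d.box
    return (x1 - x0) * (y1 - y0)


def distance_to_center(d: Detection, width: float, height: float) -> float:
    # Distance from the center of the box to the center of the image.
    x0, y0, x1, y1 = d.box
    return math.hypot((x0 + x1) / 2 - width / 2, (y0 + y1) / 2 - height / 2)


def select_target(detections: List[Detection], width: float, height: float,
                  strategy: str = "confidence") -> Detection:
    if not detections:
        raise ValueError("no objects were recognized")
    if strategy == "confidence":
        return max(detections, key=lambda d: d.confidence)
    if strategy == "center":
        return min(detections, key=lambda d: distance_to_center(d, width, height))
    if strategy == "area":
        return max(detections, key=box_area)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical detections in a 640 x 480 image.
detections = [
    Detection("dog", 0.97, (100, 80, 300, 380)),
    Detection("ball", 0.88, (300, 220, 340, 260)),
    Detection("tree", 0.60, (400, 0, 640, 480)),
]
for rule in ("confidence", "center", "area"):
    print(rule, "->", select_target(detections, 640, 480, rule).name)
```

The sample values are chosen so that each rule picks a different object, which shows why the choice of rule matters when a photo contains several recognizable objects.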
[0066] It should be noted that the subject category name is the category name output by the intelligent object recognition engine. This category name is related to the training data. If the training data uses standard names, the category name output by the intelligent object recognition engine will be the standard name; if the training data uses non-standard names, the category name output by the intelligent object recognition engine will be the non-standard name. For a given object, both the standard name and the non-standard name refer to that object.
[0067] 103. Based on the target subject category name, determine the target topic classification tree corresponding to the target object.
[0068] In this embodiment of the invention, each subject category name corresponds to a topic classification tree, and a topic classification tree can correspond to multiple subject category names. After the target subject category name corresponding to the target object is identified, the corresponding topic classification tree can be found based on the target subject category name and used as the target topic classification tree corresponding to the target object.
[0069] The target topic classification tree includes target subcategories of the target object. In the target topic classification tree to which the target object belongs, one or more target subcategories of the target object can be found.
[0070] It should be noted that the above-mentioned topic classification tree can be built on, but is not limited to, the Nice Classification of goods and biological taxonomy, to keep the classification scientific and systematic. Binding subject category names to the topic classification tree ensures that each subject category name has a corresponding position in the tree, under a specific subcategory, thus providing children with a clear learning path.
[0071] For example, in product classification, there might be a broad category called "electronic products," which can be further divided into subcategories such as "mobile phones," "televisions," and "computers." By querying a database or using other algorithms, the subject category name is mapped to a specific node on the topic classification tree, thereby determining the precise classification of the object.
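Mapping a name to a node of a classification tree can be sketched with a small recursive search. The example reuses the "electronic products" categories from the paragraph above; the `TopicNode` structure and the `find_path` function are assumptions made for illustration, and a production system might instead query a database as the text notes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TopicNode:
    name: str
    children: List["TopicNode"] = field(default_factory=list)


def find_path(root: TopicNode, target: str) -> Optional[List[str]]:
    """Return the path of node names from the root to the target, or None."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [root.name] + sub_path
    return None


electronics = TopicNode(
    "electronic products",
    [TopicNode("mobile phones"), TopicNode("televisions"), TopicNode("computers")],
)

print(find_path(electronics, "televisions"))
print(find_path(electronics, "dog"))
```

Returning the full path rather than only the matching node keeps the parent categories available, which is what lets the platform present the object within a larger knowledge structure.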
[0072] 104. Generate knowledge data for the target object based on the target topic classification tree.
[0073] In this embodiment of the invention, after obtaining the target topic classification tree, knowledge data of the corresponding target sub-category can be queried in the knowledge database based on the target topic classification tree, thereby obtaining the knowledge data of the target object. Alternatively, prompt words can be constructed for the target sub-category, and the prompt words for the target sub-category can be input into the large model to obtain the knowledge data of the corresponding target sub-category output by the large model.
[0074] If a target object corresponds to only one target subcategory, then the knowledge data of that target subcategory can be used as the knowledge data of the target object. If a target object corresponds to multiple target subcategory items, then the knowledge data of the multiple target subcategory items can be integrated to obtain the knowledge data of the target object.
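The two sources of knowledge data described above (a knowledge database query, or generation by a large model) and the integration across sub-category items can be sketched together. The text presents the two sources as alternatives; trying the database first and falling back to generation is an assumption of this sketch, as are the function names and the stub `fake_large_model`, which stands in for a real model call.

```python
from typing import Callable, Dict, List


def knowledge_for_item(item: str, knowledge_db: Dict[str, str],
                       generate: Callable[[str], str]) -> str:
    # Assumed order: use stored knowledge when present, otherwise generate it.
    if item in knowledge_db:
        return knowledge_db[item]
    return generate(item)


def integrate_knowledge(items: List[str], knowledge_db: Dict[str, str],
                        generate: Callable[[str], str]) -> Dict[str, str]:
    """Collect the knowledge data of every sub-category item into one result."""
    return {item: knowledge_for_item(item, knowledge_db, generate) for item in items}


def fake_large_model(item: str) -> str:
    # Stub standing in for a large model; returns placeholder text only.
    return f"[generated text about {item}]"


knowledge_db = {"appearance characteristics": "[stored text about appearance characteristics]"}
result = integrate_knowledge(
    ["appearance characteristics", "care instructions"], knowledge_db, fake_large_model
)
for item, text in result.items():
    print(item, "->", text)
```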
[0075] The aforementioned knowledge data can include descriptions, uses, characteristics, and history of objects. This knowledge data can exist in various forms such as text, images, and videos, aiming to provide users with rich and comprehensive knowledge content.
[0076] 105. Return the knowledge data to the user terminal.
[0077] In this embodiment of the invention, after obtaining the knowledge data of the target object, the knowledge data of the target object is returned to the user terminal, and the target object and the corresponding knowledge data are displayed on the user terminal, so that the user can systematically understand the target object through the knowledge data displayed on the user terminal.
[0078] Taking a children's photo-learning device as an example, suppose a child uses the device to take a picture of a dog and uploads it. The image recognition platform first identifies the subject in the photo as a dog and assigns it the category name "dog". Then, by querying a topic classification tree, the system determines that the dog belongs to the family "Canidae" under the broad category "Mammals". Next, it retrieves relevant knowledge data about "Canidae" from the database, such as dog breeds, habits, and care methods, and integrates this knowledge data into an easy-to-understand format. Finally, the integrated knowledge data is returned to the child's device, allowing the child to view or listen to this information, thereby enhancing their understanding of dogs.
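The dog example can be traced end to end in a few lines. The sketch below is only a walk-through of steps 101 to 105 with stubbed parts: `recognize` always returns "dog" in place of a real recognition engine, and the dictionaries `TOPIC_TREES` and `KNOWLEDGE_DB`, together with their placeholder texts, are hypothetical stand-ins for the platform's topic classification trees and knowledge database.

```python
from typing import Dict, List, Tuple


def recognize(image_bytes: bytes) -> str:
    # Stub for the recognition engine (step 102); a real engine would analyze the image.
    return "dog"


# Hypothetical topic classification data for the example.
TOPIC_TREES: Dict[str, Dict[str, List[str]]] = {
    "dog": {
        "path": ["Mammals", "Canidae"],
        "sub_items": ["breeds", "habits", "care methods"],
    }
}

# Hypothetical knowledge database keyed by (tree node, sub-category item).
KNOWLEDGE_DB: Dict[Tuple[str, str], str] = {
    ("Canidae", "breeds"): "[text about dog breeds]",
    ("Canidae", "habits"): "[text about dog habits]",
    ("Canidae", "care methods"): "[text about caring for dogs]",
}


def handle_upload(image_bytes: bytes) -> dict:
    name = recognize(image_bytes)                       # step 102: subject category name
    tree = TOPIC_TREES[name]                            # step 103: topic classification tree
    node = tree["path"][-1]
    knowledge = {                                       # step 104: knowledge data per item
        item: KNOWLEDGE_DB[(node, item)] for item in tree["sub_items"]
    }
    return {                                            # step 105: response for the terminal
        "object": name,
        "classification": tree["path"],
        "knowledge": knowledge,
    }


response = handle_upload(b"raw image bytes")            # step 101: uploaded image
print(response["classification"])
print(list(response["knowledge"]))
```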
[0079] In this embodiment of the invention, an image to be recognized input from a user terminal is acquired; based on the image to be recognized, the target subject category name corresponding to the target object is determined; based on the target subject category name, a target topic classification tree corresponding to the target object is determined, the target topic classification tree including target sub-category items of the target object; based on the target topic classification tree, knowledge data of the target object is generated; and the knowledge data is returned to the user terminal. Because the target topic classification tree includes the target sub-category items of the target object, knowledge data can be generated for the different sub-category items, providing users with systematic knowledge related to the target object, helping users establish a more complete and systematic knowledge system, and improving the user experience.
[0080] It is understood that in the specific implementation of this application, data such as image data, knowledge data, and user data are involved. When the embodiments in this application are applied to specific products or technologies, user permission or consent is required. Furthermore, the collection, use, and processing of related data, as well as the training, deployment, and invocation of algorithm models, must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
[0081] Optionally, in the step of generating knowledge data of the target object based on the target topic classification tree, the target sub-category of the target object can be determined based on the target topic classification tree; target prompt words corresponding to the target sub-category can be generated; and knowledge data of the target object can be generated based on the target prompt words.
[0082] In this embodiment of the invention, the target sub-category item is determined using a pre-constructed topic classification tree. The topic classification tree can draw on multiple classification systems, such as the Nice Classification of goods and biological taxonomy, to keep the classification scientific and systematic.
[0083] After the target object in the image is identified, its subject category name can be used to look up the corresponding category name in a name database. This category name is the object's standard name, which can be defined by experts or taken from textbook definitions. With the standard name obtained, the corresponding topic classification tree can be looked up, since each standard name corresponds to one topic classification tree. The subcategory to which the object belongs is then found within that tree, so that each identified target object is classified into its subcategory. A target object can correspond to at least one subcategory.
[0084] Once the target subcategories are determined, a large language model can be used to generate prompt words for each subcategory, resulting in target prompt words for each target subcategory. Each target subcategory can correspond to one target prompt word. Alternatively, the prompt words can be produced from prompt word templates that are customized by education experts based on the characteristics of the subcategories or generated automatically by the large model.
[0085] Once the target subcategories are determined, target prompts can be generated based on the corresponding prompt templates. These prompts can guide the multimodal large model to generate relevant content, such as content that conforms to the characteristics of the subcategories and meets children's learning needs, to serve as knowledge data. The prompt templates mentioned above can be generated based on large language models or developed by educational experts.
[0086] For example, in the pet subcategory, prompts might include "appearance characteristics" and "care instructions," guiding the multimodal large model to generate knowledge data about pet appearance descriptions and care methods. In the appliance subcategory, prompts could focus on "related functions" and "safety precautions" to ensure the generated content helps children understand how to use appliances and safety regulations.
[0087] When generating knowledge data, target prompts can be input into a multimodal large-scale model. After training and generalization, the multimodal large-scale model can generate corresponding multimodal knowledge data, including text, images, and audio, based on the input prompts. This multimodal knowledge data is not only rich and diverse but can also be personalized according to children's learning needs and interests. When a target object corresponds to multiple subcategories, the knowledge data corresponding to each subcategory in the target topic classification tree can be integrated to form complete knowledge data for the target object.
[0088] Optionally, in the steps of generating target prompt words corresponding to target subcategories and generating knowledge data of the target object based on the target prompt words, the following steps can be taken: obtaining prompt word templates corresponding to target subcategories, with one prompt word template for each target subcategory; generating target prompt words corresponding to target subcategories based on the prompt word templates; inputting the target prompt words into the large model to generate knowledge data corresponding to the target subcategories; and integrating the knowledge data corresponding to each target subcategory in the target topic classification tree to obtain the knowledge data of the target object.
[0089] In this embodiment of the invention, the aforementioned prompt word templates are pre-defined, with each target subcategory having a corresponding template. The prompt word template corresponding to the target subcategory can be obtained based on the correspondence between the subcategory and the prompt word template. The prompt word templates can be generated based on a large language model or formulated by educational experts. The design of the prompt word templates can take into account the characteristics of the subcategories and educational objectives, aiming to guide the generation of subsequent knowledge data.
[0090] After obtaining the prompt word template, target prompt words corresponding to the target subcategory items can be generated based on the prompt word template. Information about the target subcategory items can be filled into the prompt word template to obtain the target prompt words. These target prompt words are used to guide the direction and focus of knowledge data generation. For example, in the pet subcategory, target prompt words might include "pet appearance" and "feeding methods," which will guide the system to generate knowledge data about pet appearance descriptions and feeding techniques.
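Filling a prompt word template can be sketched with Python's standard `string.Template`. The template wording, the `PROMPT_TEMPLATES` and `FOCUS_POINTS` tables, and the function name below are assumptions made for illustration; only the focus phrases ("pet appearance", "feeding methods", "related functions", "safety precautions") come from the examples in this description.

```python
from string import Template

# One hypothetical template per sub-category item.
PROMPT_TEMPLATES = {
    "pets": Template(
        "Explain to a child the following about a $object_name: $focus_points."
    ),
    "appliances": Template(
        "Explain to a child the following about a $object_name: $focus_points."
    ),
}

# Focus phrases taken from the examples in the description.
FOCUS_POINTS = {
    "pets": ["pet appearance", "feeding methods"],
    "appliances": ["related functions", "safety precautions"],
}


def build_target_prompt(sub_category: str, object_name: str) -> str:
    """Fill sub-category information into its template to get the target prompt words."""
    template = PROMPT_TEMPLATES[sub_category]
    return template.substitute(
        object_name=object_name,
        focus_points=", ".join(FOCUS_POINTS[sub_category]),
    )


print(build_target_prompt("pets", "dog"))
```

`Template.substitute` raises `KeyError` when a placeholder is left unfilled, which surfaces an incomplete template early instead of sending a malformed prompt to the model.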
[0091] The target prompts are input into a multimodal large-scale model to generate multimodal knowledge data corresponding to the target subcategories. This multimodal large-scale model can be understood as a deep learning model trained on a large amount of multimodal data, possessing powerful generative capabilities. It can generate related multimodal knowledge data, such as text, images, and videos, based on the input prompts. This multimodal knowledge data is not only rich in content but also diverse in format, further meeting the diverse learning needs of children and improving the learning experience for child users.
[0092] When a target object corresponds to multiple subcategories, the knowledge data corresponding to each subcategory in the target topic classification tree can be integrated to form complete knowledge data for the target object. This ensures that the generated knowledge data is both comprehensive and coherent, forming a complete knowledge system. This, in turn, provides children with a more systematic and comprehensive learning experience.
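The generate-then-integrate flow can be sketched as below, assuming the prompt builder and the large model are supplied as plain callables. The function name `generate_knowledge`, the result layout, and the stand-in callables are hypothetical. No real model API is used.

```python
from typing import Callable, Dict, List


def generate_knowledge(
    object_name: str,
    subcategories: List[str],
    build_prompt: Callable[[str, str], str],
    model: Callable[[str], str],
) -> Dict[str, object]:
    """Generate knowledge data per subcategory and integrate it into one record."""
    sections: Dict[str, str] = {}
    for subcategory in subcategories:
        prompt = build_prompt(subcategory, object_name)  # target prompt words
        sections[subcategory] = model(prompt)            # knowledge data for this subcategory
    # Integration step: collect every subcategory result under the target object.
    return {"object": object_name, "sections": sections}
```

Passing the model as a parameter keeps the sketch independent of any particular model service, so a stub can replace it in tests.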
[0093] Optionally, in the step of determining the target subject category name corresponding to the target object based on the image to be identified, target recognition can be performed on the image to be identified by an image recognition engine to obtain at least one identified object and the subject category name corresponding to each identified object; a target object is determined from the at least one identified object; and the target subject category name corresponding to the target object is determined based on the subject category name of the identified object.
[0094] In this embodiment of the invention, the image recognition engine described above can also be called an intelligent object recognition engine. The intelligent object recognition engine can be built on deep learning algorithms and is capable of identifying the objects present in an image and assigning each of them a subject category name. The intelligent object recognition engine can also be a multimodal large model or a visual large model. By training the large model to output subject category names, the subject category name of the main object in a photograph taken by a child can be identified accurately.
[0095] After recognition, at least one identified object and its corresponding subject category name can be obtained, with each identified object corresponding to one subject category name. These identified objects may be the main object in the image, or they may be background or secondary objects. To ensure the accuracy of subsequent processing, if multiple identified objects are obtained, one of them can be selected as the target object; if only one identified object is obtained, that object is directly selected as the target object. When multiple identified objects are obtained, the selection can be made by analyzing factors such as the confidence level, size, and position of the identified objects. For example, the object with the highest confidence level, the object closest to the image center, or the object with the largest area proportion can be selected as the target object, and the target subject category name of the target object can then be obtained.
[0096] Of course, if multiple objects are obtained from image recognition, these objects can be marked in the image to be recognized and returned to the user terminal for display, and the user is prompted to select one or more of the marked objects as target objects. After the user makes a selection, the user terminal sends the selection result to the image recognition platform. The image recognition platform determines the objects selected by the user as target objects and obtains the target subject category names of the target objects.
[0097] For example, suppose a child user takes a photo of various fruits using a learning camera, with the apple occupying the center of the image and having the largest area. During recognition, the intelligent object recognition engine might output multiple subject category names, such as "apple," "fruit," or "red object." The engine can analyze and compare at least one of the following for each object: its confidence level, its distance from the image center, and its area proportion in the image. If the apple has the highest confidence level, is the object closest to the image center, or has the largest area proportion, it can be identified as the target object, and the subject category name "apple" can be designated as the target subject category name. Alternatively, a weighted score can be calculated for each object from its confidence level, distance from the image center, and area proportion in the image, and the object with the highest weighted score can be designated as the target object.
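The weighted selection in this example can be sketched as follows. The source does not specify the weights or any normalization, so those are assumptions here, as are the `Detection` class and the function names.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    label: str                            # subject category name
    confidence: float                     # assumed to lie in 0..1
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def weighted_score(det: Detection, image_w: float, image_h: float,
                   w_conf: float = 0.5, w_center: float = 0.25,
                   w_area: float = 0.25) -> float:
    """Combine confidence, closeness to the image center, and area proportion."""
    x0, y0, x1, y1 = det.box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    # Normalize the center distance by the half-diagonal so centrality lies in 0..1.
    half_diag = math.hypot(image_w / 2, image_h / 2)
    centrality = 1.0 - math.hypot(cx - image_w / 2, cy - image_h / 2) / half_diag
    area_ratio = ((x1 - x0) * (y1 - y0)) / (image_w * image_h)
    return w_conf * det.confidence + w_center * centrality + w_area * area_ratio


def select_target(detections: List[Detection], image_w: float, image_h: float) -> Detection:
    """Return the detection with the highest weighted score."""
    return max(detections, key=lambda d: weighted_score(d, image_w, image_h))
```

Setting two of the three weights to zero reduces this to the single-criterion choices described above (highest confidence, closest to center, or largest area).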
[0098] Optionally, in the step of determining the target topic classification tree corresponding to the target object based on the target subject category name, the target standard name of the target object can be determined based on the target subject category name; and the target topic classification tree corresponding to the target object can be determined based on the correspondence between the standard name and the topic classification tree, with each target standard name corresponding to one topic classification tree.
[0099] In this embodiment of the invention, it should be noted that the target subject category name can be either a standard name or an alias, depending on the dataset used to train the intelligent object recognition engine. If the dataset uses a standard name as the output label, the intelligent object recognition engine will output the standard name as the subject category name. If the dataset uses an alias as the output label, the intelligent object recognition engine will output the alias as the subject category name. If the dataset uses a combination of standard name and alias as the output label, the intelligent object recognition engine may output either the standard name or the alias.
[0100] After obtaining the target subject category name, it can be converted into a target standard name using a large language model, or the target standard name corresponding to the target subject category name can be found by performing name matching in a name database.
[0101] The name database stores various objects, each with a standard name and at least one alias. During name matching, the target subject category name is first compared with all standard names. If a standard name matching the target subject category name is found, that name is designated as the target standard name. If no matching standard name is found, the target subject category name is compared with all aliases. If an alias matching the target subject category name is found, the corresponding standard name is designated as the target standard name. If no matching alias is found, the similarity between the target subject category name and each alias is calculated, and the standard name corresponding to the alias with the highest similarity is designated as the target standard name. In one possible embodiment, if no alias identical to the target subject category name is found, a first similarity between the target subject category name and each alias is calculated, and a second similarity between the target subject category name and each standard name is calculated. For each standard name, a weighted average of the first similarity and the second similarity is calculated to obtain the weighted average similarity between the target subject category name and that standard name. The standard name with the largest weighted average similarity is determined as the target standard name.
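The three-stage matching order (standard names, then aliases, then the most similar alias) can be sketched as follows. The source does not name a similarity measure, so the use of `difflib.SequenceMatcher` from the Python standard library is an assumption, as are the function names and the dictionary layout of the name database.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def similarity(a: str, b: str) -> float:
    """String similarity in 0..1 (assumed measure; the source does not specify one)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve_standard_name(category_name: str,
                          name_db: Dict[str, List[str]]) -> Optional[str]:
    """Map a subject category name to a standard name.

    name_db maps each standard name to its list of aliases.
    """
    name = category_name.strip().lower()
    # Stage 1: exact match against standard names.
    for standard in name_db:
        if standard.lower() == name:
            return standard
    # Stage 2: exact match against aliases.
    for standard, aliases in name_db.items():
        if any(alias.lower() == name for alias in aliases):
            return standard
    # Stage 3: fall back to the standard name of the most similar alias.
    best_standard, best_score = None, -1.0
    for standard, aliases in name_db.items():
        for alias in aliases:
            score = similarity(name, alias)
            if score > best_score:
                best_standard, best_score = standard, score
    return best_standard
```

The weighted-average variant would add a second similarity against each standard name and combine the two scores per standard name before taking the maximum.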
[0102] The aforementioned topic classification trees are pre-constructed, with each standard name corresponding to one topic classification tree. After obtaining the target standard name, the target topic classification tree corresponding to the target object can be determined based on the correspondence between the standard name and the topic classification tree.
[0103] Optionally, before determining the target topic classification tree corresponding to the target object based on the target subject category name, name information of different objects can be collected; the name information can be standardized to obtain the standard name of each object; the aliases of each object can be determined from the name information; and a name database can be constructed based on the standard names and the aliases, the name database being used to find the standard name of an object according to the subject category name of the object.
[0104] In this embodiment of the invention, the name database not only stores the standard names of various objects, but also includes their aliases, which can improve the breadth and accuracy of the identification process.
[0105] Specifically, name information of different objects can be collected through public information, professional literature, online resources, etc. The aforementioned name information can be various titles of the objects.
[0106] After collecting name information for different objects, the information can be standardized. The purpose of standardization is to eliminate ambiguity, inconsistencies, and duplications in names, ensuring that each object has a unique and accurate standard name. For example, an animal such as a dog may commonly be known by more than one name; during standardization, one of these names can be chosen as its standard name, while the others can be recorded as aliases. The standardization process can be based on naming rules and expert knowledge to ensure the scientific validity and universality of the standard names.
[0107] In addition to determining the standard name, aliases for objects can be selected from the collected name information. An alias is an alternative name, other than the standard name, that describes the same object. Aliases increase the flexibility and inclusiveness of the name database and improve accuracy in the identification process. For example, the object "computer" might have an alias such as "laptop."
[0108] Once the standard names and aliases are determined, a name database can be built from this information. The name database establishes a mapping relationship between the standard name of an object and its aliases, so that in the subsequent identification process, regardless of whether the subject category name is a standard name or an alias, the system can quickly and accurately find the standard name corresponding to the subject category name and then determine the corresponding topic classification tree.
[0109] It should be noted that the construction and maintenance of the name database is an ongoing process. The name database can be continuously updated to maintain its timeliness and accuracy as new objects appear and the names of old objects change.
[0110] For example, suppose an object recognition system is to be built for children, covering a large number of objects such as animals, plants, and everyday items. When collecting name information, it may turn out that the animal "cat" has many alternative names in different regions and cultures, such as "kitty" or "cat person." Through standardization, "cat" is determined as its standard name and the other names are recorded as aliases. When children use a learning device to photograph and identify a cat, regardless of whether the name supplied is "cat" or one of its aliases, the system can accurately find the target object in the name database.
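Building the name database as a lookup index can be sketched as follows. The record layout (standard name mapped to a list of aliases) and the function name `build_name_index` are hypothetical. The source does not state how the database is stored.

```python
from typing import Dict, List


def build_name_index(records: Dict[str, List[str]]) -> Dict[str, str]:
    """Build a lookup from any known name (standard or alias) to its standard name."""
    index: Dict[str, str] = {}
    # First pass: standard names take priority over aliases.
    for standard in records:
        index[standard.lower()] = standard
    # Second pass: add aliases without overriding an existing entry.
    for standard, aliases in records.items():
        for alias in aliases:
            index.setdefault(alias.lower(), standard)
    return index
```

With such an index, a lookup such as `index.get("kitty")` returns the standard name in a single dictionary access, and adding a new object or alias only requires rebuilding or updating the index.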
[0111] Optionally, before determining the target topic classification tree corresponding to the target object based on the target subject category name, the objects can be classified based on the Nice Classification or biological taxonomy to obtain topic classification trees; and a mapping relationship between standard names and topic classification trees can be established so that the corresponding topic classification tree can be found through the standard name.
[0112] In this embodiment of the invention, after collecting information on different objects, these objects are classified according to the object information. This classification can be based on the Nice Classification or biological taxonomy, or other classification systems.
[0113] The Nice Classification, that is, the classification system established by the Nice Agreement Concerning the International Classification of Goods and Services for the Purposes of the Registration of Marks, is mainly used for classifying goods and services to facilitate trademark registration and management. Biological taxonomy, on the other hand, is the branch of biology that studies the classification, identification, and naming of organisms. It divides organisms into different levels, such as species, genus, family, order, class, and phylum, based on their similarities and kinship.
[0114] When constructing a topic classification tree, an appropriate classification method can be selected based on the characteristics of the object being identified. If the object is a non-biological entity such as a commodity, the Nice Classification system can be used to classify the commodity according to its properties, functions, uses, and other attributes. For example, appliances, furniture, and toys will be classified into different Nice Classification categories, ensuring that commodities are classified scientifically and systematically. If the object is a biological entity, such as an animal or plant, classification is based on biological taxonomy. Biological taxonomy can consider the external morphological characteristics of organisms, and can also comprehensively consider their physiological structure, genetic information, and other factors. For example, the cat family can be further subdivided into genera such as *Felis* and *Panthera*, with each genus containing multiple species.
[0115] Once the classification method is determined, the topic classification tree can be constructed. A topic classification tree can be understood as a hierarchical structure, where each node represents a classification level, gradually refining from the root node until the desired classification precision is achieved. Each leaf node represents a specific object or its subclass. By traversing this topic classification tree, the position of any object within the classification system can be found.
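The hierarchical structure and its traversal can be sketched as follows. The `TopicNode` class and the `find_path` function are hypothetical names, and the small tree in the usage note is only an illustration based on the genera mentioned above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TopicNode:
    """One classification level in a topic classification tree."""
    name: str
    children: List["TopicNode"] = field(default_factory=list)


def find_path(root: TopicNode, target: str) -> Optional[List[str]]:
    """Depth-first search returning the path from the root to the named node."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [root.name] + sub_path
    return None  # the target is not in this subtree
```

For a tree such as `TopicNode("Felidae", [TopicNode("Felis", [TopicNode("domestic cat")]), TopicNode("Panthera", [TopicNode("lion"), TopicNode("tiger")])])`, the returned path gives the position of an object within the classification system, from the root down to its leaf.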
[0116] After obtaining the topic classification tree, a mapping relationship can be established between standard names and the topic classification tree. A standard name is a standardized name for an identified object, uniquely identifying it. A standard name has already been assigned to each identified object when constructing the name database. Furthermore, these standard names can be associated with corresponding nodes in the topic classification tree.
[0117] Specifically, a database table can be created containing fields such as standard name, classification level, and node ID. Whenever a new object is added, its standard name and corresponding classification information need to be added to this database table.
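Such a table can be sketched with SQLite from the Python standard library. The table name `topic_mapping`, the column names and types, and the helper functions are assumptions. The source only lists the fields (standard name, classification level, node ID).

```python
import sqlite3
from typing import Optional, Tuple


def create_mapping_table(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table linking standard names to tree nodes."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS topic_mapping (
               standard_name TEXT PRIMARY KEY,
               classification_level TEXT NOT NULL,
               node_id INTEGER NOT NULL
           )"""
    )


def add_object(conn: sqlite3.Connection, standard_name: str,
               classification_level: str, node_id: int) -> None:
    """Insert or update the classification information of one object."""
    conn.execute(
        "INSERT OR REPLACE INTO topic_mapping "
        "(standard_name, classification_level, node_id) VALUES (?, ?, ?)",
        (standard_name, classification_level, node_id),
    )
    conn.commit()


def lookup(conn: sqlite3.Connection, standard_name: str) -> Optional[Tuple[str, int]]:
    """Return (classification_level, node_id) for a standard name, or None."""
    return conn.execute(
        "SELECT classification_level, node_id FROM topic_mapping "
        "WHERE standard_name = ?",
        (standard_name,),
    ).fetchone()
```

Using the standard name as the primary key reflects the statement that a standard name uniquely identifies an object. An in-memory database (`sqlite3.connect(":memory:")`) is enough to try the sketch.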
[0118] In practical applications, when an image to be recognized is received from the user terminal, the intelligent object recognition engine can obtain the subject category name of the object to be recognized. This subject category name can be converted into a standard name using the name database, and the corresponding classification information can then be found using the mapping relationship. In this way, corresponding knowledge data can be generated based on the classification information and returned to the user terminal.
[0119] As shown in Figure 2, an embodiment of the present invention provides an image recognition device, which includes:
[0120] The acquisition module 201 is used to acquire the image to be recognized input by the user terminal;
[0121] The recognition module 202 is used to determine the target subject category name corresponding to the target object based on the image to be recognized;
[0122] The first processing module 203 is used to determine a target topic classification tree corresponding to the target object based on the target subject category name, wherein the target topic classification tree includes the target sub-category items of the target object;
[0123] The generation module 204 is used to generate knowledge data of the target object based on the target topic classification tree;
[0124] The return module 205 is used to return the knowledge data to the user terminal.
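How the modules listed above hand data to one another can be sketched as follows. This is only a structural illustration: the class name, the field names, and the use of injected callables in place of modules 202 to 204 are assumptions, and the acquisition and return modules are reduced to the argument and return value of `handle`.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ImageRecognitionDevice:
    """Chains recognition, tree lookup, and knowledge generation."""
    recognize: Callable[[bytes], str]  # stands in for recognition module 202
    find_tree: Callable[[str], Any]    # stands in for first processing module 203
    generate: Callable[[Any], Any]     # stands in for generation module 204

    def handle(self, image: bytes) -> Any:
        """Process one image received from the user terminal."""
        category_name = self.recognize(image)  # target subject category name
        tree = self.find_tree(category_name)   # target topic classification tree
        knowledge = self.generate(tree)        # knowledge data of the target object
        return knowledge                       # returned to the user terminal
```

Injecting each stage as a callable lets a recognition engine, a name database lookup, or a large model be swapped in or stubbed without changing the flow.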
[0125] Optionally, the generation module 204 is further configured to determine the target sub-category of the target object based on the target topic classification tree; generate target prompt words corresponding to the target sub-category; and generate knowledge data of the target object based on the target prompt words.
[0126] Optionally, the generation module 204 is further configured to obtain the prompt word template corresponding to the target sub-category item, with each target sub-category item corresponding to one prompt word template; generate target prompt words corresponding to the target sub-category item based on the prompt word template; input the target prompt words into the large model to generate knowledge data corresponding to the target sub-category item; and integrate the knowledge data corresponding to each target sub-category item in the target topic classification tree to obtain the knowledge data of the target object.
[0127] Optionally, the recognition module 202 is further configured to perform target recognition on the image to be recognized through an image recognition engine to obtain at least one recognition object and the subject category name corresponding to the recognition object; determine a target object based on at least one recognition object; and determine the target subject category name corresponding to the target object based on the subject category name of the recognition object.
[0128] Optionally, the first processing module 203 is further configured to determine the target standard name of the target object based on the target subject category name; and to determine the target topic classification tree corresponding to the target object based on the correspondence between the standard name and the topic classification tree, wherein each target standard name corresponds to one topic classification tree.
[0129] Optionally, the device further includes:
[0130] The collection module is used to collect name information of different objects;
[0131] The second processing module is used to standardize the name information to obtain the standard name of the object;
[0132] The second processing module is further used to determine the alias of the object from the name information;
[0133] The construction module is used to construct a name database based on the standard name and the alias. The name database is used to find the standard name of the object according to the subject category name of the object.
[0134] Optionally, for use before the target topic classification tree corresponding to the target object is determined based on the target subject category name, the device further includes:
[0135] The classification module is used to classify the objects based on the Nice Classification or biological taxonomy to obtain a topic classification tree;
[0136] The mapping module is used to establish a mapping relationship between standard names and the topic classification trees, so as to find the corresponding topic classification tree by the standard name.
[0137] As shown in Figure 3, this embodiment of the invention also provides an electronic device, including a processor, which can execute any of the above-described image recognition methods.
[0138] Specifically, it includes a processor 301 and a memory 302, as well as a computer program stored in the memory 302 and capable of running on the processor 301 to perform an image recognition method, wherein:
[0139] The processor 301 executes the computer program for the image recognition method stored in the memory 302, performing the following steps:
[0140] Obtain the image to be recognized input by the user;
[0141] Based on the image to be identified, the target subject category name corresponding to the target object is determined;
[0142] Based on the target subject category name, a target topic classification tree corresponding to the target object is determined, and the target topic classification tree includes the target sub-category items of the target object;
[0143] Based on the target topic classification tree, generate knowledge data for the target object;
[0144] The knowledge data is then returned to the user terminal.
[0145] Optionally, the process of generating knowledge data of the target object based on the target topic classification tree, executed by processor 301, includes:
[0146] Based on the target topic classification tree, the target sub-classification items of the target object are determined;
[0147] Generate target prompt words corresponding to the target sub-category item, and generate knowledge data of the target object based on the target prompt words.
[0148] Optionally, the processor 301 executes the step of generating target prompt words corresponding to the target sub-category item, and generates knowledge data of the target object based on the target prompt words, including:
[0149] Obtain the prompt word template corresponding to the target sub-category item, with one prompt word template corresponding to each target sub-category item;
[0150] Based on the prompt word template, generate the target prompt words corresponding to the target sub-category item;
[0151] The target prompt words are input into the large model to generate knowledge data corresponding to the target sub-category item;
[0152] The knowledge data corresponding to each target sub-category item in the target topic classification tree are integrated to obtain the knowledge data of the target object.
[0153] Optionally, the step of determining the target subject category name corresponding to the target object based on the image to be identified, executed by processor 301, includes:
[0154] The image to be identified is subjected to target recognition by an image recognition engine to obtain at least one identified object and the subject category name corresponding to the identified object;
[0155] A target object is determined based on at least one of the identified objects;
[0156] Based on the subject category name of the identified object, the target subject category name corresponding to the target object is determined.
[0157] Optionally, the step of processor 301 determining the target topic classification tree corresponding to the target object based on the target subject category name includes:
[0158] Based on the target subject category name, the target standard name of the target object is determined;
[0159] Based on the correspondence between standard names and topic classification trees, a target topic classification tree corresponding to the target object is determined, with each target standard name corresponding to one topic classification tree.
[0160] Optionally, before determining the target topic classification tree corresponding to the target object based on the target subject category name, the method executed by the processor 301 further includes:
[0161] Collect name information for different objects;
[0162] The name information is standardized to obtain the standard name of the object;
[0163] In addition, the alias of the object is determined from the name information;
[0164] Based on the standard name and the alias, a name database is constructed. The name database is used to find the standard name of the object according to the subject category name of the object.
[0165] Optionally, before determining the target topic classification tree corresponding to the target object based on the target subject category name, the method executed by the processor 301 further includes:
[0166] Based on the Nice Classification or biological taxonomy, the objects are classified to obtain a topic classification tree;
[0167] If the object is an organism, the object is classified based on biological taxonomy to obtain a topic classification tree;
[0168] Establish a mapping relationship between standard names and the topic classification trees so that the corresponding topic classification tree can be found by using the standard name.
[0169] This invention also provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, it implements the various processes of the image recognition method provided in this invention and achieves the same technical effect. To avoid repetition, it will not be described again here.
Claims
1. An image recognition method, characterized in that, The method includes the following steps: Obtain the image to be recognized input by the user; Based on the image to be identified, the target subject category name corresponding to the target object is determined; Based on the target subject category name, a target topic classification tree corresponding to the target object is determined, and the target topic classification tree includes the target sub-category items of the target object; Based on the target topic classification tree, generate knowledge data for the target object; The knowledge data is then returned to the user terminal.
2. The image recognition method as described in claim 1, characterized in that, The step of generating knowledge data for the target object based on the target topic classification tree includes: Based on the target topic classification tree, the target sub-classification items of the target object are determined; Generate target prompt words corresponding to the target sub-category item, and generate knowledge data of the target object based on the target prompt words.
3. The image recognition method as described in claim 2, characterized in that, The process of generating target prompt words corresponding to the target sub-category item and generating knowledge data of the target object based on the target prompt words includes: Obtain the prompt word template corresponding to the target sub-category item, with one prompt word template corresponding to each target sub-category item; Based on the prompt word template, generate the target prompt words corresponding to the target sub-category item; The target prompt words are input into the large model to generate knowledge data corresponding to the target sub-category item; The knowledge data corresponding to each target sub-category item in the target topic classification tree are integrated to obtain the knowledge data of the target object.
4. The image recognition method as described in claim 1, characterized in that, The step of determining the target subject category name corresponding to the target object based on the image to be identified includes: The image to be identified is subjected to target recognition by an image recognition engine to obtain at least one identified object and the subject category name corresponding to the identified object; Among at least one of the identified objects, a target object is determined; Based on the subject category name of the identified object, the target subject category name corresponding to the target object is determined.
5. The image recognition method as described in claim 4, characterized in that, The step of determining the target topic classification tree corresponding to the target object based on the target subject category name includes: Based on the target subject category name, the target standard name of the target object is determined; Based on the correspondence between standard names and topic classification trees, a target topic classification tree corresponding to the target object is determined, with each target standard name corresponding to one topic classification tree.
6. The image recognition method as described in claim 5, characterized in that, Before determining the target topic classification tree corresponding to the target object based on the target subject category name, the method further includes: Collect name information for different objects; The name information is standardized to obtain the standard name of the object; In addition, the alias of the object is determined from the name information; Based on the standard name and the alias, a name database is constructed. The name database is used to find the standard name of the object according to the subject category name of the object.
7. The image recognition method as described in claim 6, characterized in that, Before determining the target topic classification tree corresponding to the target object based on the target subject category name, the method further includes: Based on the Nice Classification or biological taxonomy, the objects are classified to obtain a topic classification tree; Establish a mapping relationship between standard names and the topic classification trees so that the corresponding topic classification tree can be found by using the standard name.
8. An image recognition device, characterized in that, The image recognition device includes: The acquisition module is used to acquire the image to be recognized input by the user. The first processing module is used to determine the target subject category name corresponding to the target object based on the image to be identified; The second processing module is used to determine a target topic classification tree corresponding to the target object based on the target subject category name, wherein the target topic classification tree includes the target sub-category items of the target object; The generation module is used to generate knowledge data of the target object based on the target topic classification tree; The return module is used to return the knowledge data to the user terminal.
9. An electronic device, characterized in that, The electronic device includes: A memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the image recognition method as described in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the image recognition method as described in any one of claims 1 to 7.