Image processing methods, apparatus, electronic devices and storage media
By segmenting and matching the images of the objects to be clustered, the problem of inaccurate image grouping and clustering in existing technologies is solved, achieving efficient image clustering and improved location service accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TENCENT TECHNOLOGY (SHENZHEN) CO LTD
- Filing Date
- 2021-12-10
- Publication Date
- 2026-06-30
AI Technical Summary
Existing POI grouping algorithms suffer from inaccurate image grouping and clustering due to the inability to guarantee the image scale of each point of interest's identification information, which increases the workload of geographic location labeling.
A first object image to be segmented and a second object image not to be segmented are determined from the object image to be clustered and the target object image. The first object image is segmented into at least two sub-images, and an image matching model matches the second object image against each sub-image to obtain an accurate matching result.
This improves the accuracy of image clustering, reduces the burden of subsequent image annotation of geographic locations, and enhances the accuracy and efficiency of location services.
Description
Technical Field
[0001] This application relates to the field of computer application technology, and in particular to an image processing method, apparatus, electronic device and storage medium.
Background Technology
[0002] Location services are widely used, for example in map products, where they typically mark points of interest (POIs) for various geographic locations within a region, such as shops, bars, or gas stations. This requires the location service platform to group the identification information (e.g., shop signs) of each POI within that region, i.e., grouping based on POIs. This allows the identification information of the same POI to be grouped together, enabling precise display to the user.
[0003] Existing POI grouping algorithms largely rely on traditional feature extraction algorithms. However, because the image scale of each POI's identification information cannot be guaranteed, small images often yield few or no feature points, and the threshold on the number of successfully matched feature pairs is difficult to set. This results in inaccurate POI-based image grouping and clustering, and creates a significant workload for subsequent geolocation annotation.
Summary of the Invention
[0004] In view of the aforementioned technical problems, this application proposes an image processing method, apparatus, electronic device, and storage medium.
[0005] According to one aspect of this application, an image processing method is provided, the method comprising:
[0006] Obtaining an object image to be clustered and a target object image;
[0007] Determining, from the object image to be clustered and the target object image, a first object image to be segmented and a second object image not to be segmented, wherein the size of the first object image is larger than the size of the second object image;
[0008] Segmenting the first object image to obtain at least two sub-images;
[0009] Inputting the second object image and each sub-image into an image matching model for image matching processing to obtain a first matching result;
[0010] Determining, based on the first matching result, a target image set for the object image to be clustered.
[0011] According to another aspect of this application, an image processing apparatus is provided, comprising:
[0012] The acquisition module is used to acquire the object image to be clustered and the target object image.
[0013] The segmentation image determination module is used to determine a first object image to be segmented and a second object image not to be segmented from the object image to be clustered and the target object image; the size of the first object image is larger than the size of the second object image.
[0014] The segmentation module is used to segment the first object image to obtain at least two sub-images;
[0015] The first matching module is used to input the second object image and each sub-image into the image matching model, perform image matching processing, and obtain the first matching result;
[0016] The first clustering module is used to determine the target image set of the object images to be clustered based on the first matching result.
[0017] According to another aspect of this application, an electronic device is provided, comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the above-described method.
[0018] According to another aspect of this application, a non-volatile computer-readable storage medium is provided, on which computer program instructions are stored, wherein the computer program instructions, when executed by a processor, implement the above-described method.
[0019] By determining the first object image to be segmented and the second object image not to be segmented from the object image to be clustered and the target object image, segmenting the first object image to obtain at least two sub-images, and inputting the second object image and each sub-image into an image matching model for image matching processing, the first matching result is obtained. Matching the second object image against each sub-image makes the first matching result more accurate, and performing the matching with the image matching model makes the matching process more efficient. Determining the target image set based on the accurate first matching result improves image clustering accuracy; this high-precision clustering reduces the workload of subsequent image annotation of geographic points, further improving the accuracy and efficiency of location services.
[0020] Other features and aspects of this application will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Description of the Drawings
[0021] The accompanying drawings, which are included in and form part of this specification, illustrate exemplary embodiments, features, and aspects of this application together with the specification and serve to explain the principles of this application.
[0022] Figure 1 is a schematic diagram of an application system provided according to an embodiment of this application.
[0023] Figure 2 is a flowchart of an image processing method provided according to an embodiment of this application.
[0024] Figure 3 is a schematic diagram of an image matching model provided according to an embodiment of this application.
[0025] Figure 4 is a flowchart of an image processing method provided according to an embodiment of this application.
[0026] Figure 5 is a flowchart of a method for segmenting a first object image to obtain at least two sub-images according to an embodiment of this application.
[0027] Figure 6 is a schematic diagram of an image segmentation method according to an embodiment of this application.
[0028] Figure 7 is a flowchart of another image processing method provided according to an embodiment of this application.
[0029] Figure 8 is a flowchart of a text-based clustering method according to an embodiment of this application.
[0030] Figure 9 is a block diagram of an image processing apparatus according to an embodiment of this application.
[0031] Figure 10 is a schematic diagram of an electronic device for image processing according to an embodiment of this application.
Detailed Implementation
[0032] Various exemplary embodiments, features, and aspects of this application will now be described in detail with reference to the accompanying drawings. The same reference numerals in the drawings denote elements that have the same or similar functions. Although various aspects of the embodiments are shown in the drawings, they are not necessarily drawn to scale unless specifically indicated otherwise.
[0033] The term “exemplary” as used herein means “serving as an example, embodiment, or illustration.” Any embodiment illustrated herein as “exemplary” is not necessarily to be construed as superior to or better than other embodiments.
[0034] Furthermore, to better illustrate this application, numerous specific details are provided in the following detailed embodiments. Those skilled in the art should understand that this application can be implemented without certain specific details. In some instances, methods, means, components, and circuits well-known to those skilled in the art have not been described in detail in order to highlight the main points of this application.
[0035] Refer to Figure 1, which illustrates an application system according to an embodiment of this application. The application system can be used for the image processing method of this application. As shown in Figure 1, the application system may include at least a server 01 and a terminal 02.
[0036] In this embodiment of the application, the server 01 can be used for image processing. The server 01 may include an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
[0037] In this embodiment, the terminal 02 can be used to provide images to be processed, such as continuously captured images of shop signs in an area. The terminal 02 may include physical devices such as smartphones, desktop computers, tablets, laptops, smart speakers, digital assistants, augmented reality (AR) / virtual reality (VR) devices, and smart wearable devices. The physical device may also include software running on it, such as applications. In this embodiment, the operating system running on the terminal 02 may include, but is not limited to, Android, iOS, Linux, and Windows.
[0038] In the embodiments described in this specification, the terminal 02 and the server 01 can be directly or indirectly connected through wired or wireless communication, and this application does not limit this connection.
[0039] In a specific embodiment, when the server 01 is a distributed system, this distributed system can be a blockchain system. When the distributed system is a blockchain system, it can be formed by multiple nodes (any form of computing device connected to the network, such as servers or user terminals). These nodes form a peer-to-peer (P2P) network. The P2P protocol is an application layer protocol running on top of the Transmission Control Protocol (TCP). In a distributed system, any machine, such as a server or terminal, can join and become a node. A node includes a hardware layer, a middleware layer, an operating system layer, and an application layer. Specifically, the functions of each node in the blockchain system may include:
[0040] 1) Routing: A basic function of nodes used to support communication between nodes.
[0041] In addition to routing capabilities, nodes can also have the following functions:
[0042] 2) Applications are deployed in the blockchain to implement specific business needs. They record data related to the implementation of functions to form record data, carry digital signatures in the record data to indicate the source of the task data, and send the record data to other nodes in the blockchain system. When other nodes successfully verify the source and integrity of the record data, they add the record data to a temporary block.
[0043] It should be noted that in the specific embodiments of this application, data related to user information (including but not limited to user device information, user personal information, etc.) are involved. When the following embodiments of this application are applied to specific products or technologies, user permission or consent is required, and the collection, use and processing of related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
[0044] Figure 2 shows a flowchart of an image processing method according to an embodiment of this application. As shown in Figure 2, the method may include:
[0045] S201, Obtain the image of the object to be clustered and the image of the target object.
[0046] In the embodiments of this specification, the image of the object to be clustered can be an image including a preset object. In one example, the preset object can refer to identification information used to distinguish different points of interest within a geographical area. Points of interest can include shops, attractions, and buildings, and the identification information can be information used to identify points of interest, such as shop signs, attraction signs, and building signs. Taking shop signs as an example, in practical applications, to accurately display each shop sign to the user in a map application, images can be taken in advance in various regions. For example, multiple images can be taken continuously along a street. The map application platform (location service platform) can extract object images including shop signs from the multiple captured images, and each object image can include one shop sign. Furthermore, these object images can be processed according to the embodiments of this specification to achieve clustering of the object images, that is, to divide each object image into different image sets, so that shop signs in the same image set are the same, and shop signs in different image sets are different.
[0047] In one possible implementation, the process of obtaining the image of the object to be clustered described above may include:
[0048] Obtain the image to be processed, which is an image including at least one preset object;
[0049] Extract the image of the object to be clustered from the image to be processed, wherein the image of the object to be clustered includes a preset object.
[0050] In practical applications, each shop has a signboard, and the image to be processed may include the signboard of one shop or multiple shops. After acquiring the image to be processed, at least one object image can be extracted from it. For example, a preset object can be identified in the image to be processed, thereby extracting the region image containing the preset object as at least one object image. The at least one object image can then be traversed, and the traversed object images can be used as object images to be clustered. The image processing method of the embodiments of this specification can be executed for each object image to achieve the purpose of clustering each object image. After matching at least one object image corresponding to one image to be processed, the extraction processing of the next image to be processed can proceed, thereby enabling clustering processing of the object images in the next image to be processed.
[0051] To acquire the images to be processed, they can be obtained from the images corresponding to the target geographic region to be labeled. In one example, the image acquisition terminal can acquire images of a preset object in the target geographic region (e.g., a street) and upload them in real time. Correspondingly, the platform can receive the images uploaded by the image acquisition terminal in real time, and thus use the currently received images as the images to be processed. Alternatively, in another example, all images of the target geographic region uploaded by the image acquisition terminal can be acquired as the images corresponding to the target geographic region. Here, the images corresponding to the target geographic region can be images arranged in chronological order, so these images can be traversed in chronological order, and the traversed images can be used as the images to be processed.
[0052] In the embodiments of this specification, the target object image can refer to an image used for matching with the object image to be clustered. Here, the image to be matched with the object image to be clustered can be at least one image to be matched, and the acquisition of the at least one image to be matched and the target object image can be achieved in the following ways.
[0053] In one possible implementation, the target geographic region corresponding to the object image to be clustered can be obtained, and the images of that region, arranged in chronological order, can then be retrieved. To ensure matching accuracy and improve matching efficiency, at least one image whose acquisition time is within a duration threshold of the target time can be obtained, and the regions containing the preset object can be extracted from these images as the at least one image to be matched. The object image to be clustered can be matched against each image to be matched by traversing them, with the currently traversed image serving as the target object image. Here, the target time is the acquisition time of the image to be processed from which the object image to be clustered was extracted. In one example, the chronologically ordered images are image 1, ..., image N-1, image N; image N is the image to be processed, and the object image to be clustered extracted from image N is image H. With a duration threshold of one frame, the image within the threshold of image N is image N-1. Images of the regions containing the preset object can therefore be extracted from image N-1 as the at least one image to be matched: image D and image F. Images D and F can then be traversed; when image D is reached, it is used as the target object image.
[0054] In another possible implementation, the target geographic region corresponding to the image of the object to be clustered can be obtained, thereby obtaining the image set associated with the target geographic region. This image set associated with the target geographic region can refer to the image set to which the clustered object images within the target geographic region belong. Further, the object images in the image set associated with the target geographic region can be used as at least one image to be matched; thus, the at least one image to be matched can be traversed, and the traversed image to be matched can be used as the target object image. As an example, image sets associated with the target geographic region can be obtained, such as image set A and image set B. Therefore, the object images in image set A and image set B can be directly used as at least one image to be matched. Further, the at least one image to be matched can be traversed, and the traversed image to be matched can be used as the target object image.
[0055] It should be noted that if no associated image set exists for the target geographic region, a new image set can be directly constructed as the target image set. The images of the objects to be clustered are placed in this newly created target image set; that is, this newly created target image set can be considered as the first image set associated with the target geographic region. Here, the absence of an associated image set for the target geographic region indicates that the images of the objects to be clustered are the first images in the corresponding images of the target geographic region to be clustered. Based on this, a new image set can be directly created as the target image set for the images of the objects to be clustered.
[0056] S203, determine the first object image to be segmented and the second object image not to be segmented from the object image to be clustered and the target object image; the size of the first object image is larger than the size of the second object image.
[0057] In the embodiments of this specification, a first object image to be segmented and a second object image not to be segmented can be determined from the object image to be clustered and the target object image. For example, a first size of the object image to be clustered and a second size of the target object image can be determined; when the first size is larger than the second size, the object image to be clustered can be determined as the first object image to be segmented (needs segmentation), and the target object image can be determined as the second object image not to be segmented (does not need segmentation). That is, the object image to be segmented and the object image not to be segmented are determined from two object images (the object image to be clustered and the target object image). The first size of the object image to be clustered can refer to the length of its longer side, and the second size of the target object image can refer to the length of its longer side.
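The size comparison in S203 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `ObjectImage` type and the function names are hypothetical, and the handling of equal sizes (the target object image is segmented) is an assumption, since the text only describes the case where one size is strictly larger.

```python
from typing import NamedTuple, Tuple


class ObjectImage(NamedTuple):
    """Hypothetical container: only the name and pixel dimensions are needed here."""
    name: str
    width: int
    height: int


def longer_side(img: ObjectImage) -> int:
    # The "size" of an object image is taken as the length of its longer side.
    return max(img.width, img.height)


def pick_segmentation_roles(to_cluster: ObjectImage,
                            target: ObjectImage) -> Tuple[ObjectImage, ObjectImage]:
    """Return (first, second): `first` is segmented, `second` is left unsegmented."""
    if longer_side(to_cluster) > longer_side(target):
        return to_cluster, target
    # Assumption: on a tie, or when the target is larger, the target is segmented.
    return target, to_cluster


first, second = pick_segmentation_roles(ObjectImage("H", 300, 100),
                                        ObjectImage("D", 50, 20))
print(first.name, second.name)
```

Here image H has the longer side (300 versus 50), so it takes the role of the first object image and image D is left unsegmented.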
[0058] S205, perform segmentation processing on the first object image to obtain at least two sub-images.
[0059] In the embodiments of this specification, the border of the second object image can be slid across the first object image to capture at least two sub-images. The border of the second object image can refer to the border formed by the edges of the second object image.
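One way to picture the sliding capture in S205 is to enumerate the crop boxes that a window of the second object image's size would cover on the first object image. The sketch below is an assumption-laden illustration: the `stride` parameter, the clamping of the window to the first image, and the extra final window that reaches the far edge are not specified in the text.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def sliding_boxes(first_size: Tuple[int, int],
                  window_size: Tuple[int, int],
                  stride: int) -> List[Box]:
    """Enumerate crop boxes obtained by sliding a window over the first image."""
    fw, fh = first_size
    # Assumption: the window never exceeds the first image in either dimension.
    ww, wh = min(window_size[0], fw), min(window_size[1], fh)

    def starts(total: int, win: int) -> List[int]:
        last = total - win
        positions = list(range(0, last + 1, stride))
        if positions[-1] != last:
            positions.append(last)  # make sure the far edge is covered
        return positions

    return [(x, y, x + ww, y + wh)
            for y in starts(fh, wh)
            for x in starts(fw, ww)]


boxes = sliding_boxes(first_size=(336, 112), window_size=(280, 112), stride=28)
print(boxes)
```

Each box would then be cropped from the first object image to give one sub-image. Note that if the window is as large as the first image, only a single box results, so the requirement of at least two sub-images depends on the size difference and the chosen stride.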
[0060] S207, Input the second object image and each sub-image into the image matching model, perform image matching processing, and obtain the first matching result.
[0061] In the embodiments of this specification, each sub-image can be paired with the second object image to form an image pair. These image pairs can then be input into the image matching model for image matching processing, yielding a first matching sub-result for each image pair. Each first matching sub-result is either a match or a non-match. There are at least two first matching sub-results (corresponding to the at least two sub-images), and the first matching result is determined from them. For example, if any of the first matching sub-results is a match, the first matching result is determined to be a match; if none of the first matching sub-results is a match, the first matching result is determined to be a non-match.
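The aggregation rule described above (a match if any sub-result is a match) can be sketched as follows. The `match_pair` callable stands in for the image matching model and is hypothetical; the toy matcher in the usage lines is only there to make the example runnable.

```python
from typing import Callable, Iterable, TypeVar

Image = TypeVar("Image")


def first_matching_result(second_image: Image,
                          sub_images: Iterable[Image],
                          match_pair: Callable[[Image, Image], bool]) -> bool:
    """Pair the second object image with every sub-image and combine the sub-results.

    `match_pair` is a placeholder for the image matching model: it returns True
    when the pair is judged to show the same object.
    """
    return any(match_pair(second_image, sub) for sub in sub_images)


# Toy stand-in for the model: images are strings, and a pair "matches" when the
# second image's text occurs inside the sub-image's text.
def toy_match(second: str, sub: str) -> bool:
    return second in sub


print(first_matching_result("CAFE", ["BOB'S ", "S CAFE", " CAFE 24H"], toy_match))
```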
[0062] The image matching model can be a deep learning network. In one example, as shown in Figure 3, the image matching model can include a first convolutional network, a second convolutional network, a feature fusion and convolution module, and a fully connected layer. The first and second convolutional networks can share weight parameters. Taking a second object image H and a sub-image K as an example, H can be input into the first convolutional network and K into the second convolutional network for downsampling, each outputting a corresponding feature image. The first and second convolutional networks can be residual networks, such as the ResNet101 network. Such a residual network can consist of identity blocks and convolutional blocks. A convolutional block consists of multiple convolutional layers, normalization layers, and activation layers to extract features. An identity block adds an identity mapping through a shortcut connection, which introduces no additional parameters and no additional computational complexity while ensuring effective gradient backpropagation and preventing vanishing gradients during deep network training. Lower-level convolutional layers extract basic features such as image edges and textures, while higher-level convolutional layers combine and abstract the lower-level texture features. The normalization layer normalizes the features to a normal distribution. Activation layers perform non-linear mapping on the extracted features, thereby enhancing the model's generalization ability.
[0063] Furthermore, the feature images output by the first and second convolutional networks can be input into the feature fusion and convolution module for feature fusion and extraction. This involves fusing the weighted dual-path features and further extracting them using a series of convolutional operations. Finally, a fully connected operation transforms the features into a one-dimensional vector, yielding the first matching result, such as a match or no match. This image matching model integrates feature extraction and feature similarity discrimination, improving the accuracy and efficiency of deep feature matching.
[0064] Training the aforementioned image matching model can involve acquiring multiple sample image pairs and their label information (e.g., matching or non-matching); each sample image can be an image containing a preset object. The sample image pairs can then be input into a preset deep learning model for matching processing to obtain matching prediction results. Loss information can be determined from the matching prediction results and the label information, and this loss information can be used to train the preset deep learning model to obtain the image matching model. The loss information can be determined with the cross-entropy loss function for binary classification; this disclosure is not limited in this respect.
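As an illustration of the binary cross-entropy loss mentioned above, the following sketch computes the mean loss over a batch of predicted match probabilities. It assumes the model outputs a probability in (0, 1) for each pair and that the label 1 means "matching"; the clipping constant `eps` is an implementation detail added here for numerical safety.

```python
import math
from typing import Sequence


def binary_cross_entropy(probs: Sequence[float],
                         labels: Sequence[int],
                         eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between predicted match probabilities and labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Two matching pairs (label 1) and one non-matching pair (label 0).
print(binary_cross_entropy([0.9, 0.8, 0.2], [1, 1, 0]))
```

Confident, correct predictions give a loss near zero, while confident, wrong predictions are penalized heavily, which is what drives the preset model toward the labeled match decisions during training.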
[0065] S209, Based on the first matching result, determine the target image set of the images of the objects to be clustered.
[0066] In the embodiments of this specification, if the first matching result is a match, and the target object image belongs to an image set, the image set to which the target object image belongs can be used as the target image set; if the target object image does not belong to an image set, a new image set can be constructed as the target image set. Further, the object images to be clustered can be assigned to the newly created target image set; alternatively, the target object images can also be assigned to the newly created target image set.
[0067] Optionally, if the first matching result is a mismatch, a new image set can be constructed as the target image set. When there are multiple images to be matched and the first matching result is a mismatch, it can be determined whether the traversal of the images to be matched has ended. If the traversal has reached the last image to be matched, none of the images to be matched matches the object image to be clustered, and a new image set can be constructed as the target image set. If the traversal has not ended, it proceeds to the next image to be matched, which becomes the new target object image, and the matching against the object image to be clustered is repeated until the traversal ends.
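The set-assignment logic of S209, including the traversal over several images to be matched, can be sketched as below. All names are hypothetical: images are represented by string identifiers, `set_of` maps an image to the identifier of its image set, `sets` maps a set identifier to its members, and `is_match` stands in for the whole matching procedure. When the matched target has no set yet, this sketch places both images in the new set, which is one of the options the text allows.

```python
from typing import Callable, Dict, List


def assign_to_set(image: str,
                  candidates: List[str],
                  set_of: Dict[str, int],
                  sets: Dict[int, List[str]],
                  is_match: Callable[[str, str], bool]) -> int:
    """Place `image` in the set of the first matching candidate, else in a new set."""
    for target in candidates:                  # traverse the images to be matched
        if is_match(image, target):
            set_id = set_of.get(target)
            if set_id is None:                 # target not yet in any set
                set_id = len(sets)
                sets[set_id] = [target]
                set_of[target] = set_id
            sets[set_id].append(image)
            set_of[image] = set_id
            return set_id
    set_id = len(sets)                         # traversal ended without a match
    sets[set_id] = [image]
    set_of[image] = set_id
    return set_id


# Toy matcher: two images show the same sign when their first letter agrees.
same_sign = lambda a, b: a[0] == b[0]
set_of, sets = {}, {}
assign_to_set("a1", [], set_of, sets, same_sign)
assign_to_set("b1", ["a1"], set_of, sets, same_sign)
assign_to_set("a2", ["a1", "b1"], set_of, sets, same_sign)
print(sets)
```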
[0068] In practical applications, the above processing can be applied to all images of objects to be clustered corresponding to a target geographic region, thereby achieving clustering of the identification information corresponding to points of interest within a target geographic region. Here, the images of objects to be clustered corresponding to the target geographic region can refer to the images of objects to be clustered extracted from the images to be processed corresponding to the target geographic region.
[0069] By determining the first object image to be segmented and the second object image not to be segmented from the object image to be clustered and the target object image, segmenting the first object image to obtain at least two sub-images, and inputting the second object image and each sub-image into an image matching model for image matching processing, the first matching result is obtained. Matching the second object image against each sub-image makes the first matching result more accurate, and performing the matching with the image matching model makes the matching process more efficient. Determining the target image set based on the accurate first matching result improves image clustering accuracy; this high-precision clustering reduces the workload of subsequent image annotation of geographic points, further improving the accuracy and efficiency of location services.
[0070] Figure 4 shows a flowchart of an image processing method according to an embodiment of this application. In one possible implementation, the target object image can be any one of at least one image to be matched. As shown in Figure 4, after step S201 above, the method may further include:
[0071] S401, based on the Scale Invariant Feature Transform (SIFT) algorithm, the image of the object to be clustered is matched with at least one image to be matched to obtain a second matching result;
[0072] Accordingly, the above S203 may include:
[0073] S403, if the second matching result is a mismatch, determine the first object image to be segmented and the second object image not to be segmented from the object image to be clustered and the target object image.
[0074] In the embodiments of this specification, because segmenting the first object image reduces clustering efficiency, the SIFT (Scale Invariant Feature Transform) algorithm is used for matching before the image matching model is applied. Only when the object image to be clustered matches none of the at least one image to be matched is the image matching model used. This splits the image-processing workload between the two methods and ensures both the accuracy and the efficiency of image clustering.
[0075] Specifically, based on the SIFT algorithm, the feature points and corresponding descriptors of the object image to be clustered, and those of each image to be matched, can be determined. The K-nearest neighbor algorithm can then be used to count the successfully matched feature pairs. If the number of successfully matched pairs is higher than the feature pair count threshold, the object image to be clustered and the image to be matched are considered a match (the second matching sub-result); otherwise they are considered a mismatch. On this basis, if the second matching sub-results between the object image to be clustered and every image to be matched are mismatches, the second matching result is determined to be a mismatch; if at least one second matching sub-result is a match, the second matching result is determined to be a match. Optionally, the matched feature points can be verified, for example by using an algorithm such as RANSAC to filter out incorrectly matched feature points and improve matching accuracy. The feature pair count threshold can be set higher than the threshold used in existing approaches. This improves the accuracy of the second matching result and diverts the harder-to-match object images to the image matching model, balancing matching accuracy and efficiency.
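The counting and thresholding step can be illustrated without any image library by treating descriptors as plain numeric vectors. The sketch below is a toy under stated assumptions: the two-nearest-neighbor ratio test (with a hypothetical `ratio` of 0.75) is one common way to decide that a pair is "successfully matched" and is not spelled out in the text, and real SIFT descriptors would come from a feature extraction library rather than being typed in by hand.

```python
import math
from typing import List, Sequence

Descriptor = Sequence[float]


def count_good_pairs(desc_a: List[Descriptor],
                     desc_b: List[Descriptor],
                     ratio: float = 0.75) -> int:
    """Count descriptors in `desc_a` whose nearest neighbor in `desc_b` is
    clearly closer than the second nearest (assumed ratio test, k = 2)."""
    good = 0
    for d in desc_a:
        dists = sorted(math.dist(d, e) for e in desc_b)
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            good += 1
    return good


def second_matching_result(desc_to_cluster: List[Descriptor],
                           candidates: List[List[Descriptor]],
                           pair_threshold: int) -> bool:
    """Match if any image to be matched exceeds the feature pair count threshold."""
    return any(count_good_pairs(desc_to_cluster, c) > pair_threshold
               for c in candidates)


a = [(0.0, 0.0), (10.0, 10.0)]
b = [(0.1, 0.0), (10.0, 10.1), (50.0, 50.0)]
print(count_good_pairs(a, b), second_matching_result(a, [b], pair_threshold=1))
```

Raising `pair_threshold` makes this first stage stricter, so fewer pairs are accepted here and more are passed on to the image matching model.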
[0076] Optionally, if the second matching result is a match, the target matching image corresponding to the match can be determined. For example, the image to be matched whose second matching sub-result is a match can be determined as the target matching image; the target matching image is one of at least one image to be matched. Further, the image set to which the target matching image belongs can be used as the target image set for the images to be clustered. For example, the image to be clustered can be matched with each image to be matched. If the second matching sub-result when matching image A is a match, the image set to which image A belongs, such as image set n, can be used as the target image set for the images to be clustered. That is, the image to be clustered can be added to image set n to complete the clustering process of the images to be clustered.
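The two-stage flow of Figure 4 (SIFT first, image matching model only on a mismatch) can be sketched as a small cascade. Both matcher callables are hypothetical placeholders: `sift_match` stands for the second matching sub-result and `model_match` for the segmentation-plus-model procedure of S203 to S207.

```python
from typing import Callable, List, Optional, Tuple


def two_stage_match(image: str,
                    candidates: List[str],
                    sift_match: Callable[[str, str], bool],
                    model_match: Callable[[str, str], bool]
                    ) -> Tuple[Optional[str], str]:
    """Return (matched candidate or None, name of the stage that decided)."""
    # Stage 1: inexpensive SIFT-based matching against every image to be matched.
    for cand in candidates:
        if sift_match(image, cand):
            return cand, "sift"
    # Stage 2: only reached when the second matching result is a mismatch.
    for cand in candidates:
        if model_match(image, cand):
            return cand, "model"
    return None, "none"


# Toy matchers: the "SIFT" stage needs exact equality, the "model" stage is
# more tolerant and ignores letter case.
calls = []

def sift(a: str, b: str) -> bool:
    calls.append("sift")
    return a == b

def model(a: str, b: str) -> bool:
    calls.append("model")
    return a.lower() == b.lower()

print(two_stage_match("x", ["Y", "X"], sift, model))
```

The matched candidate's image set would then serve as the target image set; a result of `None` corresponds to creating a new image set.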
[0077] Figure 5 illustrates a flowchart of a method for segmenting a first object image to obtain at least two sub-images according to an embodiment of this application. In one possible implementation, the image format of the first object image and the image format of the second object image can be the same. For example, from at least one image to be matched, an image with the same image format as the object image to be clustered can be selected as the target object image, so that the first object image and the second object image share the same image format. This image format indicates whether the image is in landscape or portrait format. Accordingly, as shown in Figure 5, step S205 above may include:
[0078] S501, obtain the target size information corresponding to the image format.
[0079] In the embodiments of this specification, the image format may include landscape and portrait formats. As an example, the size information for the landscape format may be 336*112, and the size information for the portrait format may be 112*336. This disclosure does not limit this. Taking the landscape format in Figure 6a as an example, both the first and second object images are in landscape format, so the target size information can be obtained as the landscape format size: 336*112. In Figure 6a, the left image has a shorter horizontal side than the right image, which means the left image is the second object image and the right image is the first object image.
[0080] S503, based on the target size information, perform size transformation processing on the second object image to obtain a reference image.
[0081] In the embodiments of this specification, the size transformation process can be a magnification process. For example, the second object image is magnified proportionally until the short side of the second object image reaches 112 or the long side of the second object image reaches 336, whichever happens first, at which point the magnification stops (i.e., the size transformation process ends). The second object image at the end of the size transformation process can then be used as the reference image, as shown in Figure 6b. For instance, the initial size of the left image is 50*20. Each side of the left image is enlarged proportionally, with the shorter side reaching 112 first, resulting in a reference image of 280*112. The unit of measurement for size can be pixels.
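The proportional magnification can be checked with a short calculation. The sketch below assumes the stopping rule is equivalent to applying the smaller of the two per-axis scale factors; the function name is hypothetical, and the rounding to whole pixels is an assumption.

```python
def scale_to_target(width, height, target_w, target_h):
    """Scale (width, height) proportionally until one side reaches its target.
    Using the smaller factor guarantees neither side exceeds the target size."""
    scale = min(target_w / width, target_h / height)
    return round(width * scale), round(height * scale)

# Example from the text: a 50*20 image with landscape target 336*112.
# Factors are 336/50 = 6.72 and 112/20 = 5.6, so the short side limits first.
reference_size = scale_to_target(50, 20, 336, 112)  # (280, 112)
```

The same function covers the portrait case by passing 112*336 as the target size.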
[0082] S505, the first object image is segmented based on the reference image to obtain at least two sub-images.
[0083] In the embodiments of this specification, the border formed by the edges of the reference image (as shown in Figure 6c) can be slid across the first object image to segment the first object image, obtaining at least two sub-images, as shown in Figures 6c and 6d. In the embodiments described in this specification, the step size of the sliding motion is not limited.
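One way to picture the sliding segmentation is as a list of crop boxes. The sketch below is illustrative only: it assumes the first object image and the reference image are expressed in the same pixel scale, uses a caller-chosen stride (the text leaves the step size open), and adds a final window flush with the far edge so that no part of the first object image is skipped. The function name and box layout `(left, top, right, bottom)` are hypothetical.

```python
def sliding_crops(first_w, first_h, ref_w, ref_h, stride):
    """Return crop boxes obtained by sliding a ref_w x ref_h border
    across a first_w x first_h image with the given stride."""
    def starts(length, window):
        if window >= length:
            return [0]  # window already spans this axis
        positions = list(range(0, length - window + 1, stride))
        if positions[-1] != length - window:
            positions.append(length - window)  # last window touches the far edge
        return positions

    return [
        (x, y, min(x + ref_w, first_w), min(y + ref_h, first_h))
        for y in starts(first_h, ref_h)
        for x in starts(first_w, ref_w)
    ]
```

With a 336*112 first object image, a 280*112 reference border and a stride of 28, this yields three sub-images starting at x = 0, 28 and 56. If the border is as large as the first object image on both axes, only one box is produced, so a caller would need to handle that case separately.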
[0084] Because shop signs are generally horizontal or vertical, setting the image matching model to support both landscape and portrait formats allows it to adapt to shop sign images of various sizes without distorting the aspect ratio, retaining the complete original information and making the first matching result more accurate.
[0085] Figure 7 shows a flowchart of another image processing method according to an embodiment of this application. As shown in Figure 7, in one possible implementation, after step S505, the following may also be included:
[0086] S701, pixel filling processing is performed on at least two sub-images and a reference image respectively to obtain at least two first target images and a second target image; wherein the size information of each first target image and the size information of each second target image are target size information.
[0087] In the embodiments of this specification, the two images input to the image matching model are expected to have the same image format and the corresponding size information. When the at least two sub-images and the reference image do not conform to this size information, pixel filling processing is performed on the at least two sub-images and the reference image respectively to obtain at least two first target images and a second target image. Taking the second object image as an example, the pixel-filled second target image can be as shown in Figure 6e; the size of the second target image can be the target size information, for example, 336*112. Each sub-image in Figure 6d can also undergo the same pixel filling to obtain the corresponding first target image. Black pixels can be used for this pixel filling process, but this disclosure does not limit this approach.
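The pixel filling step can be sketched on a toy grayscale image stored as a list of rows. Where the padding is placed is not stated in the text; padding on the right and bottom is an assumption here, as are the function name and the use of 0 for a black pixel.

```python
def pad_to_target(pixels, target_w, target_h, fill=0):
    """Pad a grayscale image (list of rows) with fill pixels on the right
    and bottom until it is exactly target_w x target_h."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    if width > target_w or height > target_h:
        raise ValueError("image is larger than the target size")
    padded = [row + [fill] * (target_w - width) for row in pixels]
    padded += [[fill] * target_w for _ in range(target_h - height)]
    return padded
```

Applied to a 280*112 reference image with a 336*112 target, this adds 56 black columns and no extra rows, so the original content keeps its aspect ratio.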
[0088] Accordingly, the above S207 may include:
[0089] S703, input the second target image and each first target image into the image matching model, perform image matching processing, and obtain the first matching result.
[0090] In this embodiment of the specification, at least two first target images can be traversed. When a matching object image is encountered among the at least two first target images, the second target image and the matching object image are input into the image matching model for image matching processing to obtain a first matching result. The specific implementation of this step S703 can be found in S207 above, and will not be repeated here.
[0091] Figure 8 illustrates a text-based clustering method according to an embodiment of this application. As shown in Figure 8, one possible implementation may include:
[0092] S801, take the image that meets the preset condition in each image set to be merged as the target image corresponding to each image set to be merged.
[0093] The image set to be merged can be obtained based on the target image set and the image sets to which at least one image to be matched belongs. In other words, an existing image set can be used as the image set to be merged, and this existing image set is associated with the target geographic region. The preset condition can refer to the text clarity being higher than the text clarity of the images in the image set to be merged excluding the target image, i.e., text clarity is ranked first. This disclosure does not limit this, as long as it is beneficial to subsequent text recognition.
[0094] In the embodiments described in this specification, one image can be selected from each set of images to be merged as the target image corresponding to each set of images to be merged.
[0095] S803, perform text recognition processing on the target image to obtain the text information of the target image;
[0096] S805, perform matching processing on every two pieces of text information to obtain text matching information;
[0097] S807, merge two image sets to be merged if the text matching information is greater than or equal to the matching threshold, and obtain the merged image set.
[0098] In the embodiments of this specification, text recognition processing can be performed on each target image. For example, text recognition processing can be performed on each target image using OCR (Optical Character Recognition) technology to obtain the text information of each target image. Then, matching processing can be performed on every two pieces of text information to obtain text matching information. Furthermore, two image sets to be merged can be merged if their text matching information is greater than or equal to a matching threshold, resulting in a merged image set. This achieves the merging of existing image sets, i.e., secondary clustering. Through this text-based secondary clustering, the accuracy of image clustering can be further improved.
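Steps S805 and S807 can be sketched as follows, assuming the OCR step has already produced one text string per image set to be merged. The text matching information is not defined in the text; the similarity ratio from Python's `difflib.SequenceMatcher` is swapped in here purely for illustration, and merging transitively through a union-find structure is likewise an assumption. The function name and the default threshold of 0.8 are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

def merge_sets_by_text(set_texts, image_sets, threshold=0.8):
    """set_texts: set id -> recognised text of that set's target image.
    image_sets: set id -> list of image ids.
    Merges every pair of sets whose text similarity is >= threshold."""
    parent = {set_id: set_id for set_id in image_sets}

    def find(set_id):
        while parent[set_id] != set_id:
            parent[set_id] = parent[parent[set_id]]  # path compression
            set_id = parent[set_id]
        return set_id

    for a, b in combinations(set_texts, 2):
        score = SequenceMatcher(None, set_texts[a], set_texts[b]).ratio()
        if score >= threshold:
            parent[find(b)] = find(a)

    merged = {}
    for set_id, images in image_sets.items():
        merged.setdefault(find(set_id), []).extend(images)
    return merged
```

Two sets whose target images both read as the same shop name end up in one merged set, while sets with unrelated text are left untouched.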
[0099] Figure 9 illustrates a block diagram of an image processing apparatus according to an embodiment of this application. As shown in Figure 9, the device may include:
[0100] The acquisition module 901 is used to acquire images of the objects to be clustered and the target object images;
[0101] The segmentation image determination module 903 is used to determine a first object image to be segmented and a second object image not to be segmented from the object image to be clustered and the target object image; the size of the first object image is larger than the size of the second object image.
[0102] The segmentation module 905 is used to segment the first object image to obtain at least two sub-images;
[0103] The first matching module 907 is used to input the second object image and each sub-image into the image matching model, perform image matching processing, and obtain the first matching result;
[0104] The first clustering module 909 is used to determine the target image set of the object images to be clustered based on the first matching result.
[0105] In one possible implementation, the target object image is any one of at least one image to be matched, and the apparatus may further include:
[0106] The second matching module is used to match the image of the object to be clustered with the at least one image to be matched based on the scale-invariant feature transform algorithm to obtain a second matching result.
[0107] Accordingly, the above-mentioned segmentation image determination module can also be used to determine the first object image to be segmented and the second object image not to be segmented from the object image to be clustered and the target object image when the second matching result is a mismatch.
[0108] In one possible implementation, the image format of the first object image and the image format of the second object image are the same, wherein the image format indicates whether the image is in landscape or portrait format; the above segmentation module may include:
[0109] A target size information acquisition unit is used to acquire target size information corresponding to the image format;
[0110] The reference image acquisition unit is used to perform size transformation processing on the second object image based on the target size information to obtain a reference image;
[0111] The sub-image acquisition unit is used to segment the first object image based on the reference image to obtain the at least two sub-images.
[0112] In one possible implementation, the above-mentioned apparatus may further include:
[0113] A pixel filling module is used to perform pixel filling processing on the at least two sub-images and the reference image respectively to obtain at least two first target images and a second target image; wherein the size information of each first target image and the size information of each second target image are the target size information;
[0114] Accordingly, the first matching module mentioned above may include:
[0115] The matching unit is used to traverse the at least two first target images, input the second target image and each first target image into the image matching model, perform image matching processing, and obtain the first matching result.
[0116] In one possible implementation, the above-mentioned acquisition module may include:
[0117] An image acquisition unit is used to acquire an image to be processed, wherein the image to be processed is an image including at least one preset object;
[0118] The image acquisition unit for objects to be clustered is used to extract the image of the object to be clustered from the image to be processed, wherein the image of the object to be clustered includes one of the at least one preset object.
[0119] In one possible implementation, the first clustering module described above may include:
[0120] The first clustering unit is used to take the image set to which the target object image belongs as the target image set when the first matching result is a match and the target object image belongs to an image set.
[0121] In one possible implementation, the first clustering module described above may further include:
[0122] The second clustering unit is used to construct a new image set as the target image set if the first matching result is a mismatch.
[0123] In one possible implementation, the device may further include:
[0124] The target image determination module is used to identify images in each image set to be merged that meet preset conditions as the target images for each image set to be merged; the image set to be merged is obtained based on the target image set and the image set to which each of the at least one image to be matched belongs.
[0125] A text recognition module is used to recognize and process the text in the target image to obtain the text information of the target image;
[0126] The text matching module is used to match two pieces of text information to obtain text matching information;
[0127] The second clustering module is used to merge two image sets to be merged whose text matching information is greater than or equal to the matching threshold, so as to obtain a merged image set.
[0128] In one possible implementation, the device may further include:
[0129] The target image set determination module is used to determine the target matching image corresponding to the match when the second matching result is a match; the target matching image is one of the at least one image to be matched.
[0130] The first clustering module is also used to take the image set to which the target matching image belongs as the target image set of the object image to be clustered.
[0131] Regarding the apparatus in the above embodiments, the specific manner in which each module and unit performs its operations has been described in detail in the embodiments related to the method, and will not be elaborated upon here.
[0132] In another aspect, this application provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the image processing method provided in the various optional implementations described above.
[0133] Figure 10 illustrates a block diagram of an electronic device for image processing according to an embodiment of this application. The electronic device may be a server, and its internal structure may be as shown in Figure 10. The electronic device includes a processor, memory, and a network interface connected via a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and internal memory. The non-volatile storage medium stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The network interface is used to communicate with external terminals via a network connection. When the computer program is executed by the processor, it implements an image processing method.
[0134] Those skilled in the art will understand that the structure shown in Figure 10 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the electronic device to which the present application is applied. The specific electronic device may include more or fewer components than shown in the figure, or combine certain components, or have different component arrangements.
[0135] In an exemplary embodiment, an electronic device is also provided, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to execute the instructions to implement the image processing method as described in the embodiments of this application.
[0136] In an exemplary embodiment, a storage medium is also provided, which, when the instructions in the storage medium are executed by the processor of an electronic device, enables the electronic device to perform the image processing method of the present application embodiments.
[0137] In an exemplary embodiment, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to perform the image processing method of the embodiments of this application.
[0138] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. This computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments of the above methods. Any references to memory, storage, databases, or other media used in the embodiments provided in this application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
[0139] Other embodiments of this application will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of this application that follow the general principles of this application and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this application are indicated by the following claims.
[0140] It should be understood that this application is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this application is limited only by the appended claims.
Claims
1. An image processing method, characterized in that, The method includes: Obtain images of objects to be clustered and images of target objects; the image of the object to be clustered is an image that includes a preset object; the preset object refers to the identification information used to distinguish different points of interest within a geographical region; From the object image to be clustered and the target object image, a first object image to be segmented and a second object image not to be segmented are determined; the size of the first object image is larger than the size of the second object image; "to be segmented" means that segmentation is required, and "not to be segmented" means that segmentation is not required. The first object image is segmented to obtain at least two sub-images; The second object image and each sub-image are input into the image matching model for image matching processing to obtain the first matching result. Based on the first matching result, the target image set of the images of the objects to be clustered is determined; The step of determining the first object image to be segmented and the second object image to be unsegmented from the object image to be clustered and the target object image includes: determining a first size of the object image to be clustered and a second size of the target object image; when the first size is greater than the second size, determining the object image to be clustered as the first object image to be segmented and the target object image as the second object image to be unsegmented; The step of segmenting the first object image to obtain at least two sub-images includes: sliding the border of the second object image within the first object image to capture the at least two sub-images, wherein the border of the second object image refers to the border formed by the edges of the second object image.
2. The method according to claim 1, characterized in that, The target object image is any one of at least one images to be matched, and the method further includes: Based on the scale-invariant feature transform algorithm, the image of the object to be clustered is matched with the at least one image to be matched to obtain a second matching result; The step of determining the first object image to be segmented and the second object image to be unsegmented from the object image to be clustered and the target object image includes: If the second matching result is a mismatch, a first object image to be segmented and a second object image not to be segmented are determined from the object image to be clustered and the target object image.
3. The method according to claim 1 or 2, characterized in that, The image format of the first object image and the image format of the second object image are the same, wherein the image format indicates whether the image is in landscape or portrait format; the segmentation process of the first object image to obtain at least two sub-images includes: Obtain the target size information corresponding to the image format; Based on the target size information, the second object image is subjected to size transformation processing to obtain a reference image; The first object image is segmented based on the reference image to obtain at least two sub-images.
4. The method according to claim 3, characterized in that, The method further includes: Pixel filling processing is performed on the at least two sub-images and the reference image respectively to obtain at least two first target images and a second target image; wherein the size information of each first target image and the size information of each second target image are the target size information; The step of inputting the second object image and each sub-image into the image matching model for image matching processing to obtain the first matching result includes: The second target image and each of the first target images are input into the image matching model for image matching processing to obtain the first matching result.
5. The method according to claim 1 or 2, characterized in that, The step of obtaining the image of the object to be clustered includes: Acquire an image to be processed, wherein the image to be processed includes an image of at least one preset object; Extract the clustering object image from the image to be processed, wherein the clustering object image includes a preset object.
6. The method according to claim 1, characterized in that, The step of determining the target image set of the images to be clustered based on the first matching result includes: If the first matching result is a match, and the target object image has a corresponding image set, then the image set to which the target object image belongs is taken as the target image set.
7. The method according to claim 1, characterized in that, The step of determining the target image set of the images to be clustered based on the first matching result includes: If the first matching result is a mismatch, a new image set is constructed as the target image set.
8. The method according to claim 2, characterized in that, Also includes: The image in each image set to be merged that meets the preset condition is used as the target image corresponding to each image set to be merged; The image set to be merged is obtained based on the target image set and the image sets to which each of the at least one image to be matched belongs; The text in the target image is recognized to obtain the text information of the target image; Every two pieces of text information are matched to obtain text matching information; Two image sets to be merged whose text matching information is greater than or equal to the matching threshold are merged to obtain the merged image set.
9. The method according to claim 2, characterized in that, Also includes: If the second matching result is a match, the target matching image corresponding to the match is determined; The target matching image is one of the at least one images to be matched; The image set to which the target matching image belongs is taken as the target image set of the object to be clustered.
10. An image processing apparatus, characterized in that, include: The acquisition module is used to acquire images of objects to be clustered and images of target objects; the image of the object to be clustered is an image that includes a preset object; The preset object refers to the identification information used to distinguish different points of interest within a geographical area; The segmentation image determination module is used to determine a first object image to be segmented and a second object image not to be segmented from the object image to be clustered and the target object image; the size of the first object image is larger than the size of the second object image; "to be segmented" means that segmentation is required, and "not to be segmented" means that segmentation is not required. The segmentation module is used to segment the first object image to obtain at least two sub-images; The first matching module is used to input the second object image and each sub-image into the image matching model, perform image matching processing, and obtain the first matching result; The first clustering module is used to determine the target image set of the object images to be clustered based on the first matching result; The segmentation image determination module is further configured to determine a first size of the object image to be clustered and a second size of the target object image; when the first size is greater than the second size, the object image to be clustered is determined as the first object image to be segmented, and the target object image is determined as the unsegmented second object image; The segmentation module is further configured to slide the border of the second object image in the first object image to crop the at least two sub-images, wherein the border of the second object image refers to the border formed by the edges of the second object image.
11. The apparatus according to claim 10, characterized in that, The target object image is any one of at least one image to be matched, and the device further includes: The second matching module is used to match the image of the object to be clustered with the at least one image to be matched based on the scale-invariant feature transform algorithm to obtain a second matching result. The segmentation image determination module is further configured to determine, from the object image to be clustered and the target object image, a first object image to be segmented and a second object image not to be segmented, if the second matching result is a mismatch.
12. The apparatus according to claim 10 or 11, characterized in that, The image format of the first object image and the image format of the second object image are the same, and the image format indicates whether the image is in landscape or portrait format; The segmentation module includes: A target size information acquisition unit is used to acquire target size information corresponding to the image format; The reference image acquisition unit is used to perform size transformation processing on the second object image based on the target size information to obtain a reference image; The sub-image acquisition unit is used to segment the first object image based on the reference image to obtain the at least two sub-images.
13. The apparatus according to claim 12, characterized in that, The device further includes: A pixel filling module is used to perform pixel filling processing on the at least two sub-images and the reference image respectively to obtain at least two first target images and a second target image; wherein the size information of each first target image and the size information of each second target image are the target size information; The first matching module includes: The matching unit is used to input the second target image and each first target image into the image matching model to perform image matching processing and obtain the first matching result.
14. The apparatus according to claim 10 or 11, characterized in that, The acquisition module includes: An image acquisition unit is used to acquire an image to be processed, wherein the image to be processed is an image including at least one preset object; The image acquisition unit for objects to be clustered is used to extract the image of objects to be clustered from the image to be processed, wherein the image of objects to be clustered includes a preset object.
15. The apparatus according to claim 10, characterized in that, The first clustering module includes: The first clustering unit is configured to, if the first matching result is a match, take the image set to which the target object image belongs as the target image set if the target object image has an image set to which it belongs.
16. The apparatus according to claim 10, characterized in that, The first clustering module also includes: The second clustering unit is used to construct a new image set as the target image set if the first matching result is a mismatch.
17. The apparatus according to claim 11, characterized in that, The device further includes: The target image determination module is used to identify images in each image set to be merged that meet preset conditions as target images for each image set to be merged; the image set to be merged is obtained based on the target image set and the image set to which each of the at least one image to be matched belongs. A text recognition module is used to recognize and process the text in the target image to obtain the text information of the target image; The text matching module is used to match two pieces of text information to obtain text matching information; The second clustering module is used to merge two image sets to be merged whose text matching information is greater than or equal to the matching threshold, so as to obtain a merged image set.
18. The apparatus according to claim 11, characterized in that, The device further includes: The target image set determination module is used to determine the target matching image corresponding to the match when the second matching result is a match; the target matching image is one of the at least one images to be matched. The first clustering module is also used to take the image set to which the target matching image belongs as the target image set of the object to be clustered.
19. An electronic device, characterized in that, include: processor; Memory used to store processor-executable instructions; The processor is configured to execute the executable instructions to implement the method according to any one of claims 1 to 9.
20. A non-volatile computer-readable storage medium storing computer program instructions thereon, characterized in that, When the computer program instructions are executed by the processor, they implement the method described in any one of claims 1 to 9.
21. A computer program product, characterized in that, Includes computer instructions, which, when executed by a processor, cause the computer to perform the method as described in any one of claims 1 to 9.