Classification method, transport method, device and electronic equipment

By acquiring and recognizing images of express parcels and combining the preliminary category with the mask object to determine the final category, the method addresses inaccurate parcel classification, enabling accurate classification and efficient transfer.

CN116563834B · Active · Publication Date: 2026-06-16 · MECH MIND ROBOTICS TECH LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
MECH MIND ROBOTICS TECH LTD
Filing Date
2023-05-11
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

In existing technologies, the similarity of the outer packaging of express parcels leads to inaccurate classification and makes it impossible to effectively distinguish different types of express parcels.

Method used

By acquiring target images, the preliminary category of the object to be classified and the mask object are identified. If the preliminary category does not belong to the preset category, the final category is determined based on the mask object, and accurate classification is performed by combining shape, color and texture features.

Benefits of technology

It improves the accuracy of parcel classification, ensures the detection rate of irregularly shaped parcels, and avoids the transshipment of irregularly shaped parcels during transit, thereby improving transit efficiency.

Generated by Eureka AI based on patent content.


Abstract

The disclosure provides a classification method, a transfer method, a device and an electronic equipment, the classification method comprising: collecting a target image, the target image comprising a to-be-classified object; identifying the target image to obtain a preliminary category and a mask object of the to-be-classified object; and if the preliminary category does not belong to a preset category, determining a final category of the to-be-classified object according to the mask object, so that accurate classification of the to-be-classified object can be realized.

Description

Technical Field

[0001] This disclosure relates to the field of computer technology, and more particularly to a classification method, a transfer method, an apparatus, and an electronic device.

Background Technology

[0002] In many scenarios, it is necessary to classify objects so that they can be processed according to their categories. For example, in logistics scenarios, express parcels need to be classified and then transferred to achieve parcel sorting.

[0003] In related technologies, because the outer packaging of different express parcels is similar, express parcels cannot be classified accurately.

Summary of the Invention

[0004] This disclosure provides a classification method, a transfer method, an apparatus, and an electronic device to improve the accuracy of object classification. A first aspect of this disclosure provides a classification method comprising: acquiring a target image, the target image including an object to be classified; identifying the target image to obtain a preliminary category and a mask object for the object to be classified; and if the preliminary category does not belong to a preset category, determining the final category of the object to be classified based on the mask object.

[0005] A second aspect of this disclosure provides a transfer method applied to the aforementioned objects to be classified. The transfer method includes:

[0006] Objects belonging to the preset categories will not be transferred.

[0007] A third aspect of this disclosure provides a classification apparatus, comprising:

[0008] The acquisition module is used to acquire target images, which include objects to be classified.

[0009] The recognition module is used to recognize the target image and obtain the preliminary category and mask object of the object to be classified.

[0010] The determination module is used to determine the final category of the object to be classified based on the mask object if the preliminary category does not belong to the preset category.

[0011] A fourth aspect of this disclosure provides an electronic device, including: a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the classification method of the first aspect and / or the transfer method of the second aspect.

[0012] A fifth aspect of this disclosure provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, are used to implement the classification method of the first aspect and / or the transfer method of the second aspect.

[0013] A sixth aspect of this disclosure provides a computer program product comprising: a computer program stored in a readable storage medium, wherein at least one processor of an electronic device can read the computer program from the readable storage medium, and the at least one processor executes the computer program to cause the electronic device to perform the classification method of the first aspect and / or the transfer method of the second aspect.

[0014] This embodiment of the disclosure is applied to the classification of express parcels. A target image that includes the object to be classified is acquired, and the target image is identified to obtain a preliminary category and a mask object for the object to be classified. If the preliminary category does not belong to a preset category, the final category of the object to be classified is determined based on the mask object, thereby achieving accurate classification of the object to be classified.

Brief Description of the Drawings

[0015] The accompanying drawings, which are included to provide a further understanding of this disclosure and form part of this disclosure, illustrate exemplary embodiments of the present disclosure and are used to explain the disclosure, but do not constitute an undue limitation of the disclosure. In the drawings:

[0016] Figure 1 is an application scenario diagram of a classification method provided by an exemplary embodiment of this disclosure;

[0017] Figure 2 is a flowchart of the steps of a classification method provided by an exemplary embodiment of this disclosure;

[0018] Figure 3 is a schematic diagram of a target image provided by an exemplary embodiment of this disclosure;

[0019] Figure 4 is a flowchart of the steps of another classification method provided by an exemplary embodiment of this disclosure;

[0020] Figure 5 is a structural block diagram of a recognition model provided by an exemplary embodiment of this disclosure;

[0021] Figure 6 is a structural block diagram of Focus provided by an exemplary embodiment of this disclosure;

[0022] Figure 7 is a structural block diagram of an SPP provided by an exemplary embodiment of this disclosure;

[0023] Figure 8 is a structural block diagram of a CSP1 provided by an exemplary embodiment of this disclosure;

[0024] Figure 9 is a structural block diagram of a CSP2 provided by an exemplary embodiment of this disclosure;

[0025] Figure 10 is a schematic diagram of a mask image provided by an exemplary embodiment of this disclosure;

[0026] Figure 11 is a structural block diagram of a classification device provided by an exemplary embodiment of this disclosure;

[0027] Figure 12 is a schematic structural diagram of an electronic device provided by an exemplary embodiment of this disclosure.

Detailed Description

[0028] To make the objectives, technical solutions, and advantages of this disclosure clearer, the technical solutions of this disclosure will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this disclosure, and not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of this disclosure without creative effort are within the scope of protection of this disclosure.

[0029] When there are multiple objects to be classified on a platform, these objects need to be categorized to ensure accurate processing later. Common techniques for classification use the shape of the objects, but this method can be inaccurate. For example, bag-shaped packages and box-shaped packages are very similar in shape, and shape alone cannot distinguish them.

[0030] Based on the above problems, the embodiments of this disclosure acquire a target image, which includes an object to be classified; identify the target image to obtain a preliminary category and a mask object for the object to be classified; if the preliminary category does not belong to a preset category, the final category of the object to be classified is determined according to the mask object, thereby accurately classifying the object to be classified.

[0031] Furthermore, one application scenario of the embodiments of this disclosure is shown in Figure 1. In Figure 1, the system includes: a carrying platform P1, a carrying platform P2, and an object D to be classified. The classification device can transfer the object D from carrying platform P1 to carrying platform P2. The transfer device can transfer the object D along transfer route a (including segments X, Y, and Z) or transfer route b. When transferring the object D, the transfer device needs to select transfer parameters based on the category of the object, such as whether it can be transferred, the transfer speed, and the transfer sequence.

[0032] Figure 1 is merely one exemplary application scenario, and the embodiments disclosed herein can be applied to any object classification scenario. The embodiments disclosed herein do not limit the specific application scenario.

[0033] Figure 2 is a flowchart of the steps of a classification method provided by an exemplary embodiment of this disclosure. The classification method is applied to a classification apparatus and specifically includes the following steps:

[0034] S201, Acquire target image.

[0035] The target image includes objects to be classified. The target image may include one or more objects to be classified, and this disclosure classifies each of them.

[0036] In this disclosure, the classification device carries a camera, which captures images of the objects to be classified to obtain the target image.

[0037] For example, referring to Figure 3, the target image Pa is obtained by capturing an image of the objects to be classified placed on the carrying platform. The target image Pa contains multiple objects to be classified (h, z, q, b, d1, d2, d3, d4, y, u and d5).

[0038] S202, Identify the target image to obtain the preliminary category and mask object of the object to be classified.

[0039] Furthermore, the multiple objects to be classified included in the target image can be of the same or different categories. The objects to be classified include: express parcels; the categories of the objects to be classified include: box-shaped parcels, cylindrical parcels, spherical parcels, bag-shaped parcels, sheet-shaped parcels, or irregularly shaped parcels.

[0040] In this disclosure, the preliminary category can be determined by combining multiple factors such as the shape, color features, and texture features of the object to be classified. The mask object can represent the shape of the object to be classified.

[0041] For example, if the object to be classified is a box-shaped package, then the box-shaped package is roughly rectangular in shape, brown in color, and has a paper or foam texture. A bag-shaped package has an irregular shape, and its color characteristics can be gray, black, etc. Its texture characteristic is plastic.

[0042] In this disclosure, multiple categories can be preset, and the category of the object to be classified can then be determined by recognizing the target image. This category is the preliminary category.

[0043] In this disclosure, referring to Figure 3, the categories are as follows. Box-shaped parcels are, for example, express parcels packed in cardboard boxes; in Figure 3, the preliminary category of objects h and u is box-shaped parcel. Cylindrical parcels are express parcels that are cylindrical or nearly cylindrical in shape; in Figure 3, the preliminary category of object b is cylindrical parcel. Spherical parcels are express parcels that are spherical or nearly spherical in shape; in Figure 3, the preliminary category of object q is spherical parcel. Bag-shaped parcels are express parcels packed in plastic bags; in Figure 3, the preliminary category of objects d1, d2, d3, d4, and d5 is bag-shaped parcel. Sheet-shaped parcels are, for example, envelopes; in Figure 3, the preliminary category of object z is sheet-shaped parcel. Irregularly shaped parcels are express parcels other than box-shaped, cylindrical, spherical, bag-shaped, and sheet-shaped parcels; in Figure 3, the preliminary category of object y is irregularly shaped parcel. Furthermore, users can define irregularly shaped parcels according to their actual needs; for example, a user can classify cylindrical parcels and spherical parcels as irregularly shaped parcels.

[0044] S203, if the preliminary category does not belong to the preset category, then determine the final category of the object to be classified based on the mask object.

[0045] In this disclosure, the preset category can be set by the user in advance. For example, when transferring express parcels, the user can set in advance not to transfer express parcels belonging to a preset category, such as irregularly shaped parcels.
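The transfer rule described here amounts to a membership test on the final category. The following is a minimal sketch under stated assumptions, not the patented implementation: the function name `should_transfer`, the constant `PRESET_CATEGORIES`, and the category strings are hypothetical names introduced for illustration.

```python
from typing import FrozenSet

# Hypothetical user setting: categories that must not be transferred.
PRESET_CATEGORIES: FrozenSet[str] = frozenset({"irregularly shaped parcel"})


def should_transfer(final_category: str,
                    preset_categories: FrozenSet[str] = PRESET_CATEGORIES) -> bool:
    """Return False when the final category is one of the preset categories."""
    return final_category not in preset_categories


# Usage: a box-shaped parcel is transferred, an irregularly shaped one is not.
for category in ("box-shaped parcel", "irregularly shaped parcel"):
    action = "transfer" if should_transfer(category) else "do not transfer"
    print(f"{category}: {action}")
```

Keeping the preset categories in a set makes it straightforward for a user to add further categories (for example, cylindrical or spherical parcels) without changing the rule itself.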

[0046] In one optional embodiment, if the preset category is irregularly shaped package, the final category can be either the preliminary category or the preset category. For example, if the preliminary category is box-shaped package, the final category is either box-shaped package or irregularly shaped package. If the preliminary category is bag-shaped package, the final category is either bag-shaped package or irregularly shaped package. If the preliminary category is sheet-shaped package, the final category is either sheet-shaped package or irregularly shaped package.

[0047] For example, referring to Figure 3, suppose the preliminary category of object h is box-shaped package, which is not the preset category (irregularly shaped package). If it is determined from the mask object that the three-dimensional structure of object h is not a preset three-dimensional structure corresponding to the preset category, the final category of object h is determined to be box-shaped package. Similarly, suppose the preliminary category of object u is box-shaped package, which is not the preset category (irregularly shaped package). If it is determined from the mask object that the three-dimensional structure of object u is a preset three-dimensional structure corresponding to the preset category, the final category of object u is determined to be irregularly shaped package. In summary, this disclosure classifies every object to be classified that does not belong to the standard box-shaped, bag-shaped, or sheet-shaped categories as an irregularly shaped package, which improves the detection rate of irregularly shaped packages.

[0048] In another alternative embodiment, the final category is either the standard or the non-standard variant of the preliminary category. For example, if the preliminary category is box-shaped package, the final category is either standard box-shaped package or non-standard box-shaped package. If the preliminary category is bag-shaped package, the final category is either standard bag-shaped package or non-standard bag-shaped package. If the preliminary category is sheet-shaped package, the final category is either standard sheet-shaped package or non-standard sheet-shaped package.

[0049] In this embodiment, the final category of the object to be classified can be one of the following: standard box-shaped package, non-standard box-shaped package, standard cylindrical package, non-standard cylindrical package, standard spherical package, non-standard spherical package, standard bag-shaped package, non-standard bag-shaped package, standard sheet-shaped package, and non-standard sheet-shaped package. Different processing can be performed on objects with different final categories, thereby improving the classification accuracy of the objects.

[0050] For example, for box-shaped express packages, if the length-width-height ratio of the package is within the corresponding preset range, it is considered a standard box-shaped package; object h in Figure 3 is a standard box-shaped package. If the ratio is outside the corresponding preset range, the package is considered a non-standard box-shaped package. For example, the preliminary category of object u in Figure 3 is box-shaped package, but its length is significantly greater than its width and height, so it is determined to be a non-standard box-shaped package. As a further example, the preliminary category of object d3 in Figure 3 is bag-shaped package and its final category is standard bag-shaped package, while the preliminary category of object d5 in Figure 3 is bag-shaped package and its final category is non-standard bag-shaped package.
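The standard versus non-standard test for box-shaped packages can be sketched as a ratio check. The text does not say how the length-width-height ratio or its preset range is defined, so this sketch assumes one plausible reading: the ratio of the longest edge to the shortest edge must fall inside a range. The function name `is_standard_box` and the default range `(1.0, 3.0)` are hypothetical.

```python
from typing import Tuple


def is_standard_box(length: float, width: float, height: float,
                    ratio_range: Tuple[float, float] = (1.0, 3.0)) -> bool:
    """Assumed rule: standard if longest/shortest edge ratio lies in ratio_range."""
    shortest, _, longest = sorted((length, width, height))
    if shortest <= 0:
        raise ValueError("dimensions must be positive")
    low, high = ratio_range
    return low <= longest / shortest <= high


# A compact box passes; a box much longer than it is wide or high does not.
print(is_standard_box(30.0, 20.0, 15.0))   # ratio 2.0
print(is_standard_box(120.0, 15.0, 15.0))  # ratio 8.0
```

A real system might instead bound each pairwise ratio separately; the single longest-to-shortest ratio is used here only to keep the sketch short.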

[0051] This embodiment of the disclosure is applied to the classification of express parcels. By acquiring a target image, which includes the object to be classified, the target image is identified to obtain a preliminary category and a mask object for the object to be classified. If the preliminary category does not belong to a preset category, the final category of the object to be classified is determined based on the mask object, thereby achieving accurate classification of the object to be classified.

[0052] Figure 4 is a flowchart of another classification method provided by an exemplary embodiment of this disclosure. Specifically, it includes the following steps:

[0053] S401, acquire target image.

[0054] The specific implementation process of this step is described in S201 and will not be repeated here.

[0055] S402, the recognition model identifies the object to be classified, and obtains the probability value of the object to be classified in different categories and the mask object of the object to be classified.

[0056] The recognition model includes a backbone feature extraction module, an enhanced feature extraction module, and an attribute determination module. Identifying the object to be classified and obtaining its probability values in different categories includes: performing feature extraction processing on the target image through the backbone feature extraction module to obtain a first feature image; performing feature enhancement processing on the first feature image through the enhanced feature extraction module to obtain a second feature image; and performing attribute analysis processing on the second feature image through the attribute determination module to obtain the probability values of the object in different categories.

[0057] In this disclosure, referring to Figure 5, identifying the target image includes identifying the target image through a recognition model 50. The recognition model 50 includes: a backbone feature extraction module 51, an enhanced feature extraction module 52, and an attribute determination module 53. The recognition model can be a YOLOX model (a neural network).

[0058] Specifically, the backbone feature extraction module 51 includes Focus (concentration layer), (CBA / DWConv)1 (first depthwise separable convolutional layer), CSP11 (first residual network layer 1), (CBA / DWConv)2 (second depthwise separable convolutional layer), CSP12 (second residual network layer 1), (CBA / DWConv)3 (third depthwise separable convolutional layer), CSP13 (third residual network layer 1), SPP (pooling layer) and CSP14 (fourth residual network layer 1).

[0059] Referring to Figure 6, Focus comprises four Slice units (slicing units), a Concat (stitching unit), and a CBA (attention mechanism feedback layer). Each Slice samples a value at every other pixel of the target image (3×W×H, representing 3 channels, length W, width H), yielding four independent feature images. Concat and CBA then stack these four feature images, expanding the 3 channels of the target image fourfold. After Focus processes the target image, a feature image of size {12×(W / 2)×(H / 2)} is obtained.
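The slicing and stacking performed by Focus can be illustrated with plain nested lists. This is a minimal sketch of the slice-and-concatenate step only: it omits the CBA convolution that follows, and the function name `focus_slice` and the slice order are assumptions made for illustration.

```python
from typing import List

Channel = List[List[float]]  # one 2-D channel, indexed [row][column]


def focus_slice(image: List[Channel]) -> List[Channel]:
    """Rearrange C channels of size R x K into 4C channels of size (R/2) x (K/2)."""
    rows, cols = len(image[0]), len(image[0][0])
    if rows % 2 or cols % 2:
        raise ValueError("both spatial dimensions must be even")
    stacked: List[Channel] = []
    # Four slices: each keeps every other pixel, starting from a different offset.
    for row_offset, col_offset in ((0, 0), (1, 0), (0, 1), (1, 1)):
        for channel in image:
            stacked.append([row[col_offset::2] for row in channel[row_offset::2]])
    return stacked


# Usage: a 3-channel 4x4 image becomes 12 channels of 2x2.
image = [[[c * 100 + r * 10 + k for k in range(4)] for r in range(4)] for c in range(3)]
stacked = focus_slice(image)
print(len(stacked), len(stacked[0]), len(stacked[0][0]))  # prints "12 2 2"
```

No pixel is discarded: the spatial resolution is halved along each axis while the channel count is multiplied by four, which matches the {12×(W / 2)×(H / 2)} size quoted in the text.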

[0060] In addition, referring to Figure 6, CBA includes Conv (convolutional layer), BN (batch normalization), and Act (activation layer). Referring to Figure 7, SPP includes three Maxpool layers, Concat, and CBA. Referring to Figure 8, CSP1 includes a CBA and n Bottleneck layers in one branch and a CBA in another branch; the outputs of the two branches are concatenated and then processed by a CBA. The Bottleneck consists of a CBA and a (CBA / DWConv) layer: the input features pass through the CBA and the (CBA / DWConv) layer, and the result is added to the input features to produce the output.

[0061] Furthermore, the target image can be processed by the backbone feature extraction module 51 to obtain one first feature image or multiple first feature images. For example, referring to Figure 5, the backbone feature extraction module 51 can output first feature image a1, first feature image a2, and first feature image a3, or it can output only one of these three first feature images, for example only first feature image a3. Specifically, first feature image a1 is a feature image of size {192×(W / 8)×(H / 8)} output by CSP12, first feature image a2 is a feature image of size {768×(W / 16)×(H / 16)} output by CSP13, and first feature image a3 is a feature image of size {3072×(W / 32)×(H / 32)} output by CSP14.
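The three output sizes quoted for a1, a2, and a3 follow directly from the channel counts and downsampling factors (8, 16, 32). The short sketch below evaluates them for a concrete input; the 640×640 input size and the function name `first_feature_shapes` are assumptions chosen only for illustration.

```python
from typing import Dict, Tuple

# (channels, downsampling factor) for each first feature image, as quoted in the text.
FIRST_FEATURE_SPEC: Dict[str, Tuple[int, int]] = {
    "a1": (192, 8),
    "a2": (768, 16),
    "a3": (3072, 32),
}


def first_feature_shapes(width: int, height: int) -> Dict[str, Tuple[int, int, int]]:
    """Return (channels, W/factor, H/factor) for a1, a2 and a3."""
    if width % 32 or height % 32:
        raise ValueError("width and height should be multiples of 32")
    return {name: (channels, width // factor, height // factor)
            for name, (channels, factor) in FIRST_FEATURE_SPEC.items()}


for name, shape in first_feature_shapes(640, 640).items():
    print(name, shape)
```

For a 640×640 input this gives 80×80, 40×40, and 20×20 spatial grids, so each successive feature image covers the scene more coarsely with more channels.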

[0062] In this disclosure, referring to Figure 5, the enhanced feature extraction module 52 includes CBA 1 (first attention mechanism feedback layer), FPN (Feature Pyramid Network), and PAN (Path Aggregation Network).

[0063] Furthermore, the FPN includes: Upsample 1 (first upsampling layer), Concat 1 (first stitching unit), CSP21 (first residual network layer 2), CBA 2 (second attention mechanism feedback layer), Upsample 2 (second upsampling layer), and Concat 2 (second stitching unit). The PAN includes: CSP22 (second residual network layer 2), (CBA / DWConv)4 (fourth depthwise separable convolutional layer), Concat 3 (third stitching unit), CSP23 (third residual network layer 2), (CBA / DWConv)5 (fifth depthwise separable convolutional layer), Concat 4 (fourth stitching unit), and CSP24 (fourth residual network layer 2).

[0064] Referring to Figure 9, CSP2 includes a CBA and n CBAs in one branch and a CBA in another branch. The outputs of the two branches are concatenated by Concat and then processed by a CBA.

[0065] In this disclosure, there can be one or more second feature images. Referring to Figure 5, there can be three second feature images: second feature image b1, second feature image b2, and second feature image b3. Alternatively, there can be only one second feature image; for example, the enhanced feature extraction module 52 outputs only second feature image b3. Further, second feature image b1 is output by CSP22, and its size is {192×(W / 8)×(H / 8)}. Second feature image b2 is output by CSP23, and its size is {768×(W / 16)×(H / 16)}. Second feature image b3 is output by CSP24, and its size is {3072×(W / 32)×(H / 32)}.

[0066] In this disclosure, referring to Figure 5, the attribute determination module 53 includes: CBA 3 (third attention mechanism feedback layer), (CBA / DWConv) 6 (sixth depthwise separable convolutional layer), and Conv 1 (first convolutional layer).

[0067] In this disclosure, if there are multiple second feature images, each of them can be input into the attribute determination module for processing. For each second feature image, the attribute determination module can output the probabilities of different integrity levels, pose states, categories, color features, and texture features. The probabilities obtained for the multiple second feature images are then averaged to obtain the final integrity level, pose state, category, color feature, and texture feature.

[0068] For example, referring to Table 1, if there is only one second feature image b3, the probability values of the object to be classified in different categories are determined from the probabilities corresponding to the second feature image b3, namely: box-shaped package: 60%; columnar package: 5%; spherical package: 1%; bag-shaped package: 14%; sheet-shaped package: 15%; irregularly shaped package: 5%.

[0069] Furthermore, if there are multiple second feature images, such as second feature image b1, second feature image b2, and second feature image b3, then after processing each second feature image separately, the attribute determination module calculates the average probability value of the object to be classified under different categories (namely: box-shaped package: 60.67%; columnar package: 2.67%; spherical package: 1.33%; bag-shaped package: 18%; sheet-shaped package: 13.67%; irregularly shaped package: 3.66%).
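The averaging step can be sketched as a per-category mean over the second feature images. In this sketch the b3 row uses the percentages quoted in the text, whereas the b1 and b2 rows are hypothetical values, chosen only so that the means agree with the quoted averages; the per-image values of Table 1 are not available here. The function name `average_probabilities` is likewise an assumption.

```python
from typing import Dict, List

CATEGORIES = ["box-shaped package", "columnar package", "spherical package",
              "bag-shaped package", "sheet-shaped package", "irregularly shaped package"]

# Percentages per second feature image, in the order of CATEGORIES.
per_image: Dict[str, List[float]] = {
    "b1": [62, 1, 1, 20, 13, 3],   # hypothetical
    "b2": [60, 2, 2, 20, 13, 3],   # hypothetical
    "b3": [60, 5, 1, 14, 15, 5],   # quoted in the text
}


def average_probabilities(rows: Dict[str, List[float]]) -> Dict[str, float]:
    """Average the probability of each category over all feature images."""
    count = len(rows)
    return {category: sum(column) / count
            for category, column in zip(CATEGORIES, zip(*rows.values()))}


for category, value in average_probabilities(per_image).items():
    print(f"{category}: {value:.2f}%")
```

With these inputs the mean for irregularly shaped packages is 11/3, which rounds to 3.67%; the 3.66% quoted in the text appears to be truncated so that the six figures sum to exactly 100%.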

[0070] Table 1

[0071]

[0072] Furthermore, the recognition model also includes a mask determination module, wherein the mask object is determined by performing mask analysis processing on the second feature image through the mask determination module to obtain a mask image; and the mask object of the object to be classified is determined based on the mask image.

[0073] In one optional embodiment, a second feature image is obtained during the feature enhancement processing of the first feature image by the enhanced feature extraction module; the mask determination module includes a first convolution unit and a first decoder, and performs mask analysis processing on the second feature image through the mask determination module to obtain a mask image, including: performing convolution processing on the second feature image through the first convolution unit to obtain a third feature image; and performing decoding processing on the third feature image through the first decoder to obtain the mask image.

[0074] In another optional embodiment, during the feature enhancement processing of the first feature image by the enhanced feature extraction module, multiple second feature images of different sizes are obtained; the mask determination module further includes: multiple second convolutional units, a stitching unit, and a second decoder. The convolutional kernels of the multiple second convolutional units are of different sizes. The mask determination module performs mask analysis processing on the second feature images to obtain a mask image, including: performing convolution processing on the second feature images one-to-one by the second convolutional units to obtain multiple fourth feature images of the same size; stitching the multiple fourth feature images of the same size by the stitching unit to obtain a fifth feature image; and decoding the fifth feature image by the second decoder to obtain the mask image.

[0075] The recognition model also outputs the mask object of the object to be classified in the mask image corresponding to the target image. The recognition model also includes a mask determination module 54.

[0076] Specifically, if the enhanced feature extraction module outputs only one second feature image, then the mask determination module includes only one first convolutional unit (Conv) and a first decoder. The size of the second feature image corresponds to the kernel size of the first convolutional unit. For example, if the size of the second feature image is {192×(W / 8)×(H / 8)}, then the kernel size of the first convolutional unit is 1×1. If the size of the second feature image is {768×(W / 16)×(H / 16)}, then the kernel size of the first convolutional unit is 2×2. If the size of the second feature image is {3072×(W / 32)×(H / 32)}, then the kernel size of the first convolutional unit is 4×4. Then, after decoding by the first decoder, a mask image of size 3×W×H is obtained.

[0077] In another optional embodiment, during the feature enhancement processing of the first feature image by the enhanced feature extraction module, multiple second feature images of different sizes are obtained; the mask determination module further includes: multiple second convolutional units, a stitching unit, and a second decoder. The convolutional kernels of the multiple second convolutional units are of different sizes. The mask determination module performs mask analysis processing on the second feature images to obtain a mask image, including: performing convolution processing on the second feature images one-to-one by the second convolutional units to obtain multiple fourth feature images of the same size; stitching the multiple fourth feature images of the same size by the stitching unit to obtain a fifth feature image; and decoding the fifth feature image by the second decoder to obtain the mask image.

[0078] For example, referring to Figure 5, the multiple second feature images of different sizes are second feature image b1 {192×(W / 8)×(H / 8)}, second feature image b2 {768×(W / 16)×(H / 16)}, and second feature image b3 {3072×(W / 32)×(H / 32)}. They correspond to three second convolutional units, shown in Figure 5 as Conv 8, Conv 7, and Conv 6, with the following kernel sizes: Conv 8 has a kernel size of 1×1; Conv 7 has a kernel size of 2×2; and Conv 6 has a kernel size of 4×4.

[0079] Furthermore, after the second feature images are convolved by the second convolutional units, the resulting fourth feature images are stitched together by a stitching unit (Concat5). The stitched features are then decoded by the second decoder to obtain a mask image, which has a size of 3×W×H.

[0080] Referring to Figure 10, the mask image Y corresponds to the target image Pa of Figure 3.

[0081] Referring to Figure 3 and Figure 10, the mask objects corresponding to the objects to be classified (h, d4, z, q, d1, b, d2, y, d3, u and d5) are (y1, y2, y3, y4, y5, y6, y7, y8, y9, y10 and y11).

[0082] S403, determine the category corresponding to the highest probability value among the probability values of different categories as the preliminary category.

[0083] For example, referring to Table 1, if the category corresponding to the highest probability value is box-shaped package, then the preliminary category of the object to be classified is box-shaped package.
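Step S403 is an argmax over the per-category probabilities. A minimal sketch follows, using the single-image (b3) values quoted earlier in the text; the function name `preliminary_category` is an assumption.

```python
from typing import Dict


def preliminary_category(probabilities: Dict[str, float]) -> str:
    """Return the category with the highest probability value."""
    if not probabilities:
        raise ValueError("no probabilities supplied")
    return max(probabilities, key=probabilities.__getitem__)


# Probabilities quoted in the text for second feature image b3.
b3 = {
    "box-shaped package": 0.60,
    "columnar package": 0.05,
    "spherical package": 0.01,
    "bag-shaped package": 0.14,
    "sheet-shaped package": 0.15,
    "irregularly shaped package": 0.05,
}
print(preliminary_category(b3))  # prints "box-shaped package"
```

If two categories tie, `max` returns the first one encountered; the text does not say how ties should be resolved, so that behaviour is incidental to this sketch.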

[0084] S404, if the preliminary category does not belong to the preset category and the probability value under the preliminary category is less than the first threshold, determine the final category of the object to be classified based on the mask object.

[0085] In this embodiment, a first threshold can be preset, such as setting the first threshold to 80%. Referring to Table 1, the preliminary category of the object to be classified is box-shaped package, and the probability value of the object to be classified under the preliminary category (box-shaped package) is 60.67%, which is less than 80%. When these two conditions are met, the final category of the object to be classified is determined based on the mask object.

[0086] Further, determining the final category of the object to be classified based on the mask object includes: determining the three-dimensional structure of the object to be classified based on the mask object, wherein the mask object includes the depth information of the object to be classified; if the three-dimensional structure belongs to a preset three-dimensional structure of a preset category, then the final category of the object to be classified is determined to be the preset category; otherwise, the final category of the object to be classified is determined to be the preliminary category.

[0087] The mask object can include pixel position information and depth information. A three-dimensional structure of the object to be classified can be constructed based on this pixel position information and depth information. If the three-dimensional structure represents a preset three-dimensional structure of an irregularly shaped object, then the final category of the object to be classified is an irregularly shaped object. Otherwise, the final category is determined to be the preliminary category (i.e., a box-shaped object).
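
One way to build a three-dimensional structure from pixel position information and depth information is to back-project each mask pixel into a 3D point. The sketch below assumes a pinhole camera model with intrinsics `fx`, `fy`, `cx`, `cy`; the text does not name a camera model, so this model, the function names, and the sample values are all assumptions. How the resulting structure is compared with a preset three-dimensional structure is not specified in the text and is therefore not shown.

```python
from typing import Iterable, List, Tuple

Point3D = Tuple[float, float, float]


def back_project(
    mask_pixels: Iterable[Tuple[int, int, float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Turn (u, v, depth) mask pixels into 3D points.

    ASSUMPTION: pinhole camera with focal lengths fx, fy and principal
    point (cx, cy); depth is measured along the optical axis.
    """
    points = []
    for u, v, depth in mask_pixels:
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        points.append((x, y, depth))
    return points


def extent(points: List[Point3D]) -> Point3D:
    """Axis-aligned size of the point set along x, y and z."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


# Hypothetical mask object: three pixels with depth values in metres.
mask_object = [(320, 240, 1.0), (420, 240, 1.0), (320, 340, 1.2)]
points = back_project(mask_object, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
size = extent(points)
print(points)
print(size)
```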

[0088] In this disclosure, multiple preset three-dimensional structures can be defined in advance for the preset category, or the shape of the preset three-dimensional structure can be defined in advance.

[0089] S405, if the preliminary category does not belong to the preset category, and the probability value under the preliminary category is greater than or equal to the first threshold, determine the final category of the object to be classified as the preliminary category.

[0090] If the probability value under the preliminary category is greater than or equal to the first threshold, it can be understood that the recognition model is relatively accurate in recognizing the object to be classified, and the final category of the object to be classified can be directly determined as the preliminary category.

[0091] For example, referring to Figure 3, the object to be classified h is preliminarily identified as a box-shaped package. When the probability value of object h under the box-shaped package category is greater than 80%, the final category of object h is determined to be box-shaped package. Similarly, the object to be classified u is preliminarily identified as a box-shaped package. When the probability value of object u under the box-shaped package category is less than 80%, the final category of object u can be determined to be irregularly shaped package based on its three-dimensional structure.

[0092] This disclosure adds the condition of whether the probability value under the preliminary category is less than the first threshold. As a result, the three-dimensional structure does not need to be determined for objects to be classified whose probability value is greater than or equal to the first threshold, which improves classification efficiency.
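
Steps S404 and S405 can be summarized as a small decision function. This is a sketch under stated assumptions: the names `final_category` and `matches_preset_structure` are hypothetical, the three-dimensional check is passed in as a callable so that it runs only when needed, and the handling of a preliminary category that already equals the preset category is inferred from the transfer rule rather than stated in these steps.

```python
from typing import Callable

PRESET_CATEGORY = "irregularly shaped package"
FIRST_THRESHOLD = 0.80  # example value of the first threshold from the text


def final_category(
    preliminary: str,
    probability: float,
    matches_preset_structure: Callable[[], bool],
) -> str:
    """Decide the final category from the preliminary category.

    matches_preset_structure is a hypothetical callable that builds the
    three-dimensional structure from the mask object and reports whether it
    belongs to a preset three-dimensional structure of the preset category.
    """
    # ASSUMPTION: a preliminary category that is already the preset category
    # is kept as the final category.
    if preliminary == PRESET_CATEGORY:
        return preliminary
    # S405: confident prediction, the three-dimensional check is skipped.
    if probability >= FIRST_THRESHOLD:
        return preliminary
    # S404: low-confidence prediction, decide from the mask object.
    return PRESET_CATEGORY if matches_preset_structure() else preliminary


# Object u: box-shaped package at 60.67%, structure matches the preset one.
print(final_category("box-shaped package", 0.6067, lambda: True))
# Object h: box-shaped package above the threshold (0.85 is a made-up value).
print(final_category("box-shaped package", 0.85, lambda: True))
```

Passing the structure check as a callable mirrors the efficiency point above: for probability values at or above the first threshold, the three-dimensional structure is never computed.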

[0093] In this embodiment of the disclosure, a preliminary category and a mask object of the object to be classified can be obtained first by the recognition model. Then, based on the preliminary category and the probability value under the preliminary category, it is determined whether to use the mask object to determine the final category. In addition, by using the mask object to determine the final category, the classification accuracy can be improved.

[0094] In addition, this disclosure also provides a transfer method applied to the above-mentioned objects to be classified. The transfer method includes: not transferring objects to be classified that belong to a preset category.

[0095] In this disclosure, objects to be classified whose preliminary or final category is a preset category are not transferred.

[0096] During the transfer of objects to be classified, it can be determined whether to transfer them based on the category of the objects to be classified. For example, irregularly shaped packages will not be transferred, while other packages will be transferred.

[0097] The above-mentioned classification method can accurately identify irregularly shaped packages, so that irregularly shaped packages are not transferred during the transfer process, thereby improving the transfer efficiency.
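
The transfer rule can be expressed as a filter over classified objects. The following is a minimal sketch; the function name `select_for_transfer` is hypothetical, and the categories assigned to objects h and u follow the example given for Figure 3.

```python
from typing import Dict, List

PRESET_CATEGORY = "irregularly shaped package"


def select_for_transfer(category_by_object: Dict[str, str]) -> List[str]:
    """Return the objects to transfer, skipping those in the preset category."""
    return [
        name
        for name, category in category_by_object.items()
        if category != PRESET_CATEGORY
    ]


# Object h ends up as a box-shaped package, object u as an irregularly
# shaped package, so only h is transferred.
classified = {
    "h": "box-shaped package",
    "u": "irregularly shaped package",
}
to_transfer = select_for_transfer(classified)
print(to_transfer)
```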

[0098] Referring to Figure 11, in addition to the classification method, this embodiment of the disclosure also provides a classification device 110 for applying the above-mentioned classification method, including:

[0099] The acquisition module 111 is used to acquire a target image, which includes an object to be classified;

[0100] The recognition module 112 is used to recognize the target image and obtain the preliminary category and mask object of the object to be classified;

[0101] The determination module 113 is used to determine the final category of the object to be classified based on the mask object if the preliminary category does not belong to the preset category.
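
The three modules can be pictured as stages that are called in sequence. The sketch below is only an illustration of that composition: the class name, the callable signatures, and the stub implementations are hypothetical and stand in for the real acquisition, recognition, and determination logic.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

PRESET_CATEGORY = "irregularly shaped package"


@dataclass
class ClassificationDevice:
    acquire: Callable[[], Any]                   # acquisition module 111
    recognize: Callable[[Any], Tuple[str, Any]]  # recognition module 112
    determine: Callable[[str, Any], str]         # determination module 113

    def classify(self) -> str:
        """Run the three modules in order and return the final category."""
        target_image = self.acquire()
        preliminary, mask_object = self.recognize(target_image)
        if preliminary == PRESET_CATEGORY:
            return preliminary
        return self.determine(preliminary, mask_object)


# Stub modules with made-up behaviour, for illustration only.
device = ClassificationDevice(
    acquire=lambda: "target image Pa",
    recognize=lambda image: ("box-shaped package", {"matches_preset": True}),
    determine=lambda preliminary, mask: (
        PRESET_CATEGORY if mask["matches_preset"] else preliminary
    ),
)
result = device.classify()
print(result)
```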

[0102] In one optional embodiment, the recognition module 112 is specifically used to: recognize the object to be classified through the recognition model to obtain the probability values of the object to be classified under different categories and the mask object of the object to be classified; and determine the category corresponding to the highest probability value among the probability values of the different categories as the preliminary category.

[0103] In one optional embodiment, the determination module 113 is specifically used to: if the preliminary category does not belong to the preset category and the probability value under the preliminary category is less than the first threshold, determine the final category of the object to be classified based on the mask object; and if the preliminary category does not belong to the preset category and the probability value under the preliminary category is greater than or equal to the first threshold, determine the final category of the object to be classified to be the preliminary category.

[0104] In one optional embodiment, the determination module 113 is specifically used to: determine the three-dimensional structure of the object to be classified based on the mask object, the mask object including the depth information of the object to be classified; if the three-dimensional structure belongs to a preset three-dimensional structure of the preset category, determine the final category of the object to be classified to be the preset category; otherwise, determine the final category of the object to be classified to be the preliminary category.

[0105] In one optional embodiment, the object to be classified is an express parcel; the category of the object to be classified includes one of the following: box-shaped parcel, columnar parcel, spherical parcel, bagged parcel, sheet-shaped parcel, or irregularly shaped parcel.

[0106] In one optional embodiment, the preset category is irregularly shaped package.

[0107] In one optional embodiment, the recognition model includes: a backbone feature extraction module, an enhanced feature extraction module, and an attribute determination module. When the recognition module 112 recognizes the object to be classified through the recognition model to obtain the probability values of the object to be classified under different categories, it is specifically used to: perform feature extraction processing on the target image through the backbone feature extraction module to obtain a first feature image; perform feature enhancement processing on the first feature image through the enhanced feature extraction module to obtain a second feature image; and perform attribute analysis processing on the second feature image through the attribute determination module to obtain the probability values of the object to be classified under different categories.

[0108] In one optional embodiment, the recognition model further includes a mask determination module, and the recognition module 112 is specifically used to determine the mask object by: performing mask analysis processing on the second feature image through the mask determination module to obtain a mask image; and determining the mask object of the object to be classified based on the mask image.

[0109] In one optional embodiment, a second feature image is obtained during the feature enhancement processing of the first feature image by the enhanced feature extraction module; the mask determination module includes a first convolution unit and a first decoder. When the recognition module 112 performs mask analysis processing on the second feature image through the mask determination module to obtain a mask image, it is specifically used to: perform convolution processing on the second feature image through the first convolution unit to obtain a third feature image; and perform decoding processing on the third feature image through the first decoder to obtain a mask image.

[0110] In one optional embodiment, during the feature enhancement processing of the first feature image by the enhanced feature extraction module, multiple second feature images of different sizes are obtained; the mask determination module further includes: multiple second convolutional units, a stitching unit, and a second decoder. The convolutional kernels of the multiple second convolutional units are of different sizes. When the recognition module 112 performs mask analysis processing on the second feature images through the mask determination module to obtain a mask image, it is specifically used to: perform convolution processing on the second feature images in one-to-one correspondence through the second convolutional units to obtain multiple fourth feature images of the same size; perform stitching processing on the multiple fourth feature images of the same size through the stitching unit to obtain a fifth feature image; and perform decoding processing on the fifth feature image through the second decoder to obtain the mask image.

[0111] The classification device provided in this disclosure can implement the above-described method embodiments; the details are not repeated here.

[0112] Furthermore, in some of the processes described in the above embodiments and accompanying drawings, multiple operations appear in a specific order. However, it should be clearly understood that these operations may not be executed in the order they appear herein, or may be executed in parallel. The sequence numbers are merely used to distinguish different operations, and the sequence numbers themselves do not represent any execution order. Additionally, these processes may include more or fewer operations, and these operations may be executed sequentially or in parallel. It should be noted that the descriptions such as "first," "second," etc., in this document are used to distinguish different messages, devices, modules, etc., and do not represent a sequential order, nor do they limit "first" and "second" to different types.

[0114] Figure 12 is a schematic diagram of the structure of an electronic device provided in an example embodiment of this disclosure. As shown in Figure 12, the electronic device 120 includes a processor 121 and a memory 122 communicatively connected to the processor 121, the memory 122 storing computer-executable instructions.

[0115] The processor executes the computer-executable instructions stored in the memory to implement the classification method and/or transfer method provided in any of the above method embodiments. The specific functions and the technical effects that can be achieved are not repeated here.

[0116] This disclosure also provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, are used to implement the classification method and/or transfer method provided in any of the above method embodiments.

[0117] This disclosure also provides a computer program product, comprising a computer program stored in a readable storage medium. At least one processor of an electronic device can read the computer program from the readable storage medium, and execution of the computer program by the at least one processor causes the electronic device to perform the classification method and/or transfer method provided in any of the above method embodiments.

[0118] In the embodiments provided in this disclosure, it should be understood that the disclosed systems and methods can be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division of units is only a logical functional division, and other divisions are possible in actual implementation. Multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, systems, or units, and may be electrical, mechanical, or in other forms.

[0119] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0120] Furthermore, the functional units in the various embodiments of this disclosure can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or in a combination of hardware and software functional units.

[0121] The integrated units implemented as software functional units described above can be stored in a computer-readable storage medium. These software functional units, stored in a storage medium, include several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute some steps of the methods of the various embodiments of this disclosure. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0122] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional modules is merely an example. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the system can be divided into different functional modules to complete all or part of the functions described above. The specific working process of the system described above can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.

[0123] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.

[0124] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.

Claims

1. A classification method, characterized in that it comprises: acquiring a target image, the target image including an object to be classified, the object to be classified being an express parcel; recognizing the target image to obtain a preliminary category and a mask object of the object to be classified, wherein the mask object includes pixel position information and depth information of the object to be classified, the mask object is determined based on a mask image, the mask image is obtained by performing mask analysis processing on a second feature image through a mask determination module in a recognition model, the second feature image is obtained by performing feature enhancement processing on a first feature image through an enhanced feature extraction module in the recognition model, and the first feature image is obtained by performing feature extraction processing on the target image through a backbone feature extraction module in the recognition model; if the preliminary category does not belong to a preset category, constructing a three-dimensional structure of the object to be classified based on the pixel position information and depth information of the mask object; if the three-dimensional structure belongs to a preset three-dimensional structure of the preset category, determining the final category of the object to be classified to be the preset category; otherwise, determining the final category of the object to be classified to be the preliminary category; wherein the preset category is irregularly shaped package.

2. The classification method according to claim 1, characterized in that recognizing the target image to obtain the preliminary category and the mask object of the object to be classified comprises: recognizing the object to be classified through the recognition model to obtain the probability values of the object to be classified under different categories and the mask object of the object to be classified; and determining the category corresponding to the highest probability value among the probability values of the different categories as the preliminary category.

3. The classification method according to claim 2, characterized in that, if the preliminary category does not belong to the preset category, determining the final category of the object to be classified based on the mask object comprises: if the preliminary category does not belong to the preset category and the probability value under the preliminary category is less than a first threshold, determining the final category of the object to be classified according to whether the three-dimensional structure belongs to the preset three-dimensional structure of the preset category; and if the preliminary category does not belong to the preset category and the probability value under the preliminary category is greater than or equal to the first threshold, determining the final category of the object to be classified to be the preliminary category.

4. The classification method according to any one of claims 1 to 3, characterized in that the object to be classified is an express parcel, and the category of the object to be classified includes: box-shaped parcel, columnar parcel, spherical parcel, bagged parcel, sheet-shaped parcel, or irregularly shaped parcel.

5. The classification method according to claim 4, characterized in that the preset category is irregularly shaped package.

6. The classification method according to any one of claims 2 to 3, characterized in that the recognition model further includes an attribute determination module, and recognizing the object to be classified through the recognition model to obtain the probability values of the object to be classified under different categories comprises: performing attribute analysis processing on the second feature image through the attribute determination module to obtain the probability values of the object to be classified under different categories.

7. The classification method according to claim 1, characterized in that a second feature image is obtained during the feature enhancement processing of the first feature image by the enhanced feature extraction module; the mask determination module includes a first convolutional unit and a first decoder; and performing mask analysis processing on the second feature image through the mask determination module to obtain the mask image comprises: performing convolution processing on the second feature image through the first convolutional unit to obtain a third feature image; and performing decoding processing on the third feature image through the first decoder to obtain the mask image.

8. The classification method according to claim 1, characterized in that multiple second feature images of different sizes are obtained during the feature enhancement processing of the first feature image by the enhanced feature extraction module; the mask determination module further includes multiple second convolutional units, a stitching unit, and a second decoder, the convolutional kernels of the multiple second convolutional units having different sizes; and performing mask analysis processing on the second feature images through the mask determination module to obtain the mask image comprises: performing convolution processing on the second feature images in one-to-one correspondence through the second convolutional units to obtain multiple fourth feature images of the same size; performing stitching processing on the multiple fourth feature images of the same size through the stitching unit to obtain a fifth feature image; and performing decoding processing on the fifth feature image through the second decoder to obtain the mask image.

9. A transfer method, characterized in that it is applied to objects to be classified that are classified by the classification method according to any one of claims 1 to 8, wherein the transfer method comprises: not transferring objects to be classified that belong to the preset category.

10. A classification device, characterized in that it comprises: an acquisition module, used to acquire a target image, the target image including an object to be classified, the object to be classified being an express parcel; a recognition module, used to recognize the target image to obtain a preliminary category and a mask object of the object to be classified, wherein the mask object includes pixel position information and depth information of the object to be classified, the mask object is determined based on a mask image, the mask image is obtained by performing mask analysis processing on a second feature image through a mask determination module in a recognition model, the second feature image is obtained by performing feature enhancement processing on a first feature image through an enhanced feature extraction module in the recognition model, and the first feature image is obtained by performing feature extraction processing on the target image through a backbone feature extraction module in the recognition model; and a determination module, configured to, if the preliminary category does not belong to a preset category, construct a three-dimensional structure of the object to be classified based on the pixel position information and depth information of the mask object, and determine the final category of the object to be classified based on whether the three-dimensional structure belongs to a preset three-dimensional structure of the preset category, wherein the preset category is irregularly shaped package; wherein, if the three-dimensional structure belongs to the preset three-dimensional structure of the preset category, the final category of the object to be classified is determined to be the preset category; otherwise, the final category of the object to be classified is determined to be the preliminary category.

11. An electronic device, characterized in that it comprises: a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the classification method according to any one of claims 1 to 8, or the transfer method according to claim 9.

12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions, which, when executed by a processor, are used to implement the classification method according to any one of claims 1 to 8, or the transfer method according to claim 9.