Methods and systems for categorising an article

The image processing system addresses the challenge of inconsistent manual item categorization by using machine learning to automate the process, enhancing accuracy and reducing errors in baggage and cargo handling.

WO2026129032A1 · PCT designated stage · Publication Date: 2026-06-25 · SITA INFORMATION NETWORKING COMPUTING CANADA


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: SITA INFORMATION NETWORKING COMPUTING CANADA
Filing Date: 2025-12-15
Publication Date: 2026-06-25

Application Information

Patent Timeline

15 Dec 2025: Application
25 Jun 2026: Publication

WO2026129032A1

IPC: G06V10/764; G06V10/25; G06V10/56; G06V10/82; G06V10/762; G06V10/774

AI Tagging

Application Domain

Character and pattern recognition

Technology Topics

Pattern recognition; Image extraction



Abstract

According to the present disclosure there are provided systems and methods for categorising an item. The system comprises processing means configured to: obtain a first image comprising an item; extract a portion of the first image, the portion comprising the item; determine, using the extracted portion of the first image, one or more characteristic features of the item using a first machine learning model; determine a colour value associated with the extracted portion of the first image; and categorise, using a second machine learning model, the item according to one of a plurality of predetermined item categorisations based on at least the determined one or more characteristic features and the colour value, wherein the second machine learning model is different to the first machine learning model.


Description

[0001] METHODS AND SYSTEMS FOR CATEGORISING AN ARTICLE

[0002] FIELD OF THE INVENTION

[0003] This invention relates to systems and methods for categorising an item. Further, this invention relates to image processing and leveraging machine learning methods to categorise an item according to a predetermined item categorisation. It is particularly, but not exclusively, concerned with categorising items of baggage or cargo, for example operating at airports, seaports, train stations, and other transportation hubs or travel terminals. The techniques herein may also be applied to a factory environment where an item needs to be categorised rapidly.

[0004] BACKGROUND OF THE INVENTION

[0005] At present, around 2.25 billion bags are transported by the Air Transport Industry (ATI) annually. However, approximately 45,000,000 bags (2%) are mishandled. This problem is only expected to get worse, as 8.2 billion air passenger journeys are predicted to be made in 2037. At this rate of growth, current airport processes will not be able to handle the demand, and airport infrastructure and systems must be strategically planned for a sustainable future.

[0006] It is therefore very important for airlines and airports to improve baggage handling performance using scalable processes to ensure that the ATI is able to continue to provide high quality services and reduce passenger delays into the future.

[0007] Current methods for categorising items, such as baggage, rely on manual input from passengers or operatives to describe the bags and select an appropriate item categorisation, such as an IATA™ bag code. This relies on subjective descriptions of items which can vary in accuracy and detail, leading to subjective interpretation of bag characteristics. Human error in selection of an item classification can result in misclassification and mishandling of items of baggage or cargo.

[0008] Although there exists a standardized list of IATA™ bag categories, the current baggage mishandling process suffers from a number of particular problems:

[0009] It is a labour-intensive labelling process by examining each bag.
If a bag is not clearly one colour or another, the labelling may well not be consistent.
If a bag is not clearly one bag type or another (e.g., soft-shell or hard-shell), labelling may also not be consistent.

[0010] Both human error and disagreement may impact how a bag is labelled and recorded.
Staff must be trained in understanding the baggage categories.

[0011] The same is true for sea and air cargo which can be categorised according to standardised lists. It is therefore desirable to overcome or ameliorate the above problems and limitations of existing systems and processes.

[0012] SUMMARY OF THE INVENTION

[0013] The invention is defined by the independent claims, to which reference should now be made. Preferred features are laid out in the dependent claims.

[0014] In a first aspect of the invention, there is provided an image processing system for categorising an item, the system comprising: processing means configured to: obtain a first image comprising an item; extract a portion of the first image, the portion comprising the item; determine, using the extracted portion of the first image, one or more characteristic features of the item using a first machine learning model; determine a colour value associated with the extracted portion of the first image; categorise, using a second machine learning model, the item according to one of a plurality of predetermined item categorisations based on at least the determined one or more characteristic features and the colour value, wherein the second machine learning model is different to the first machine learning model.

[0015] In an embodiment of the invention, the processing means is further configured to: obtain one or more second images, each of the one or more second images comprising the item; and extract the portion of the first image based on a difference in a position of the item between the first image and the one or more second images.

[0016] In an embodiment of the invention, the processing means is further configured to obtain the one or more second images within a threshold period of time of obtaining the first image.

[0017] In an embodiment of the invention, the processing means is configured to determine the colour value using a third machine learning model, the third machine learning model being different to the first and second machine learning models.

In an embodiment of the invention, determining a colour value associated with the extracted portion of the first image comprises determining a first average colour value of a plurality of colour values associated with the extracted portion of the first image.

[0018] In an embodiment of the invention, the processing means is further configured to: map the first average colour value to one of a plurality of predetermined colour definitions based on a plurality of colour ranges associated with each colour definition; and categorise the item based on at least the determined one or more characteristic features and the mapping.

[0019] In an embodiment of the invention, the first average colour value is according to a first colour space and wherein the processing means is further configured to determine a second average colour value according to a second colour space, the first colour space being different from the second colour space.

[0020] In an embodiment of the invention, the processing means is further configured to: generate training data using the first image, the training data comprising the mapping and the second average colour; and train the second machine learning model using the training data.

[0021] In an embodiment of the invention, the first machine learning model is an unsupervised machine learning model.

[0022] In an embodiment of the invention, the unsupervised machine learning model is an unsupervised clustering algorithm.

[0023] In an embodiment of the invention, the first machine learning model is a semi-supervised machine learning model.

[0024] In an embodiment of the invention, the second machine learning model is a supervised machine learning model.

[0025] In an embodiment of the invention, the supervised machine learning model is a neural network.

In an embodiment of the invention, the processing means is further configured to: obtain depth information associated with the item; and extract the portion of the first image based on the depth information.

[0026] In an embodiment of the invention, the processing means is further configured to: estimate, using the first machine learning model, a dimension of the item based on the depth information; and determine whether the estimated dimension is less than a threshold value.

[0027] In an embodiment of the invention, the processing means is further configured to output the item categorisation.

[0028] In an embodiment of the invention, the processing means is further configured to: generate, based on the categorisation, a unique identifier associated with the item; and output the unique identifier.

[0029] In an embodiment of the invention, the unique identifier is a barcode.

[0030] In an embodiment of the invention, determining one or more characteristic features includes outputting to the second machine learning model a confidence score associated with at least one of the one or more determined characteristic features.

[0031] In an embodiment of the invention, the item is an item of baggage or an item of cargo.

[0032] In an embodiment of the invention, extracting the portion of the first image comprises: determining a bounding box enclosing the item within the first image; and cropping the first image to extract the portion of the first image.

[0033] In an embodiment of the invention, the plurality of predetermined item categorisations correspond to International Air Transport Association baggage identification codes.

[0034] In an embodiment of the invention, the plurality of predetermined item categorisations comprises one or more priority item categorisations.

[0035] In an embodiment of the invention, the processing means is further configured to, in response to categorising the item as one of the one or more priority item categorisations, output an alert.

In an embodiment of the invention, the processing means is further configured to, in response to categorising the item as one of the one or more priority item categorisations, output the first image.

[0036] In an embodiment of the invention, the image processing system is part of an airport baggage check-in system.

[0037] In an embodiment of the invention, the processing means is further configured to obtain the first image comprising the item in response to a passenger or agent placing the item on a bag drop belt.

[0038] In an embodiment of the invention, the processing means is further configured to output a confidence score associated with the item categorisation.

[0039] In an embodiment of the invention, the processing means is further configured to output an alert if the confidence score is below a threshold value.

[0040] In an embodiment of the invention, the processing means is further configured to: obtain a second image comprising the item, said second image having been captured from a different physical location to the first image; and extract the portion of the first image based on a comparison of the first image and the second image.

[0041] In another aspect of the invention there is provided a method for categorising an item, the method comprising: obtaining a first image comprising an item; extracting a portion of the first image, the portion comprising the item; determining, using the extracted portion of the first image, one or more characteristic features of the item using a first machine learning model; determining a colour value associated with the extracted portion of the first image; categorising, using a second machine learning model, the item according to one of a plurality of predetermined item categorisations based on at least the determined one or more characteristic features and the colour value, wherein the second machine learning model is different to the first machine learning model.

[0042] BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:

[0043] Figure 1 is a schematic diagram showing an example process flow of an embodiment of the invention;

[0044] Figure 2 is an exemplary image of a passenger’s bag captured by a camera;

[0045] Figure 3 is a schematic diagram showing an example process flow of an embodiment of the invention;

[0046] Figure 4 is an exemplary colour tree diagram according to an embodiment of the invention;

[0047] Figure 5 is an exemplary user interface showing an image of a passenger bag together with determined item categorisations and characteristics;

[0048] Figure 6 is a schematic diagram showing an example process of an aspect of the invention;

[0049] Figure 7 is a schematic flow diagram showing method steps of an embodiment of the invention;

[0050] Figure 8 is a schematic diagram showing an example image processing system of an embodiment of the invention.

[0051] DETAILED DESCRIPTION

[0052] The following exemplary description is based on a system, apparatus, and method for use in the aviation industry. However, it will be appreciated that the invention may find application outside the aviation industry, particularly in other transportation industries such as sea transport, or delivery industries where items are transported between locations.

[0053] The embodiments described below may be implemented in the Python programming language using, for example, the OpenCV library. However, this is exemplary, and other programming languages known to the skilled person, such as Java, may be used.

[0054] Embodiments of the invention solve the problems discussed above by providing a system that can complement or fully replace manual examination of an item or article in order to categorise the item according to one of a plurality of predetermined categorisations. Instead, object detection techniques are leveraged to identify and extract characteristic features of an item using a camera and machine learning models. This allows the system to automatically identify an item and its characterising features without human intervention, resulting in a significant reduction in the operational cost of identifying and categorising items compared to existing manual categorisation methods.

[0055] Using object detection methods, an item can be identified in an image and a portion of the image extracted for processing to determine an item categorisation. The system can use machine learning models to identify one or more characteristic features of the item (e.g., dent, sticker, added marker, or an unusual shape) that can be used to appropriately categorise the item for subsequent reidentification. The item could subsequently be reidentified by repeating feature detection and comparing the detected features with previously detected features to assess whether there is a match.

[0056] Accordingly, embodiments of the invention can reduce the risk of mishandling items of baggage or cargo by providing a universal objective item categorisation system. The item categorisation by the system is therefore repeatable and as such allows items to be easily reidentified at a later stage using similar object detection and feature extraction techniques. This also allows item categorisation and reidentification to be performed at a much larger scale to address growing baggage and cargo demand. For example, a passenger could capture one or more images of their item of baggage using a mobile app, eliminating the need for manual input and simplifying the baggage check-in process. This improves the overall user experience of baggage check-in and reduces the risk of errors. The software may also gather data on baggage characteristics and classification patterns, enabling operators such as airlines to gain insights into passenger preferences and optimise baggage handling processes.

[0057] Figure 1 shows a high level overview of a system 100 according to embodiments of the invention.

[0058] The system 100 comprises an image capturing device such as a camera 101 that is configured to capture a first image, said image comprising an item to be categorised. The camera 101 may be integrated into a baggage check-in apparatus, on or within a bag drop kiosk or desk, a self-service bag drop machine at an airport, a mobile device, or may be one of a plurality of cameras in a CCTV system.

[0059] It will be appreciated that camera 101 may be coupled to a central computer or server which receives the output images from the camera for performing the item categorisation methods described herein. In all cases, wired or wireless communications protocols may be used to exchange information between each of the functional components.

[0060] The first image captured by the camera 101 comprises at least part of an item to be categorised. It will be appreciated that the term “item” used herein is used broadly to describe any article to be categorised such as baggage, cargo, or goods. An image comprising an item refers to an image that contains some or all of an item from a given camera perspective, and may also include scenery surrounding the item that falls within the field of view of the camera 101.

[0061] In some embodiments, capturing the first image may be triggered by the item being placed in a specified location. For example, the first image may be captured in response to a passenger or agent placing the item on a bag drop belt at an airport.

[0062] Having obtained a first image with camera 101, object detection techniques are performed by the image processing system to identify and extract a portion of the first image for processing, said portion comprising the item to be categorised. It will be appreciated that various object detection algorithms may be implemented to extract a portion of the first image comprising the item. However, it is particularly advantageous to perform motion detection 103 to identify and locate an item within an image when there is a diverse range of items being categorised with correspondingly diverse features.

[0063] To perform motion detection 103, in addition to an initial first image frame being captured and passed to the image processing system, one or more additional second image frames are also captured that comprise at least part of the same item captured in the first image. Camera 101 obtains the first and one or more second images from the same physical location such that the background of the image is static. A reference image where no items are present in the field of view of camera 101 may be obtained prior to performing item categorisation. The item to be categorised is passed through the field of view of camera 101 while the first and one or more second images are captured such that the item is in a different physical location at the point of capturing each image. Movement of the item may be provided by a belt such as a baggage belt in an airport.

[0064] By comparing the first and one or more second images it is possible to identify the item in the first image (or any of the other images) by identifying areas of the images where motion occurs. This may be achieved using various comparison techniques such as frame differencing, optical flow analysis, or using pre-trained machine learning models as would be appreciated by the skilled person. By identifying the static and moving regions of the image, the background of one or more of the images may then be removed to leave the moving item, such as by using adaptive background subtraction or by converting the frame difference into a binary mask. In some embodiments there is provided a threshold period of time between capturing the first and one or more second images. Providing such an image capture time window means that motion detection can be performed at scale by ensuring that only a short window of image capture time is allocated to imaging a given item. This period may be on the order of milliseconds.
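By way of illustration only, the conversion of a frame difference into a binary mask may be sketched as follows. The sketch uses plain Python lists in place of OpenCV arrays, compares a frame against the item-free reference image, and assumes greyscale intensities; the function name `motion_mask` and the threshold of 30 are hypothetical and are not taken from the disclosure.

```python
def motion_mask(frame, reference, threshold=30):
    """Return a binary mask: 1 where a pixel differs from the reference
    image by more than `threshold` (an assumed value), else 0."""
    return [
        [1 if abs(pixel - background) > threshold else 0
         for pixel, background in zip(frame_row, reference_row)]
        for frame_row, reference_row in zip(frame, reference)
    ]


# Hypothetical 4x5 greyscale images: a static background of intensity 10
# and a bright item occupying a 2x2 region of the frame.
reference = [[10] * 5 for _ in range(4)]
frame = [
    [10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10],
    [10, 200, 200, 10, 10],
    [10, 10, 10, 10, 10],
]

mask = motion_mask(frame, reference)
for row in mask:
    print(row)
```

Pixels set to 1 in the mask mark the region attributed to the item; the static background is set to 0 and can be discarded.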

[0065] Object detection 104 may be performed by motion detection step 103, or as a separate additional step that follows motion detection 103. In the former case, motion detection 103 may sufficiently isolate an item in an image frame from the static background. Based on this isolation, it is possible to extract the portion of the image corresponding to the item without identifying what the item is. This may be achieved by cropping the image according to pixel locations that conform to the outline of the region of motion within the image, resulting in an irregularly shaped cropped image. Alternatively, a bounding box 105 may be defined that tightly encloses the region of motion. The bounding box 105 is calculated to enclose the extremities of the region.
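A bounding box enclosing the extremities of a region of motion, and the corresponding crop, may be sketched as follows. This is a minimal example under the assumption that the region of motion is available as a binary mask held in nested Python lists; the helper names `bounding_box` and `crop` are hypothetical.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, enclosing every
    non-zero mask pixel, or None if the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)


def crop(image, box):
    """Extract the rectangular portion of `image` described by `box`."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical irregular region of motion and a 4x5 image whose pixel
# value encodes its position (row * 10 + column).
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
image = [[r * 10 + c for c in range(5)] for r in range(4)]

box = bounding_box(mask)
portion = crop(image, box)
print(box)
print(portion)
```

The box is rectangular even though the region of motion is irregular, so the extracted portion may include some background pixels.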

[0066] However, it is advantageous to perform further object detection 104 using a different technique and models to improve the reliability and robustness of identifying the item for categorisation. Dedicated object detection may be performed, for instance, using a machine learning model (e.g., YOLO, R-CNN, SSD). These machine learning models will be previously trained using labelled and categorised items. Alternatively, bounding box regression or region-based object detection may be used. After using motion detection 103, a region of motion can be identified in the first image (or any of the images of the item captured by camera 101) which is input to the object detection model. The model can then identify, classify, and label the object (e.g., as a bicycle, or a suitcase). This significantly improves upon simply using motion detection by removing noise and factoring in environmental changes (such as shadows) that could affect how the region of motion containing the item is identified.

[0067] Using the output object detection and item classification, a portion of the first image can be extracted, for example by cropping. In some embodiments, object detection step 104 may include determining a boundary of the item within the image that corresponds to the outline or perimeter of the item in the image. Based on the determined boundary, it is possible to extract the individual cropped image of the item in an image. Cropping of the image may correspond directly to the determined boundary, for instance the dimensions and location of a bounding box can be used exactly for cropping. The cropped image may be transmitted to a server to perform subsequent item categorisation processes.

[0068] In addition to camera 101, a depth sensor 102 may be provided to provide supplementary depth information to the motion detection step 103. The depth sensor 102 can include a Time-of-Flight (ToF) sensor, LiDAR, a structured light sensor, an ultrasonic sensor, an infrared sensor, or a hybrid depth sensor. Said depth sensor 102 may be integrated into the same hardware as camera 101, such as a check-in desk at an airport. The depth sensor 102 is physically positioned close to (i.e., within a threshold distance of) the camera 101 to ensure that the field of view of the camera 101 substantially overlaps with the field of view of the depth sensor 102 and the information is captured at substantially the same time as capturing the first and one or more second images.

[0069] In this way, the depth information from the depth sensor can be used in combination with images from camera 101 for motion detection 103 and object detection 104. The 3D depth information is indicative of the item (and one or more dimensions thereof) and can be utilised alongside motion detection to extract the portion of the first image in which an item is indicated in the depth information. For example, a 3D scene may be generated using the depth information and compared with the 2D scene (or first image) that is obtained for motion detection. The regions of motion in the 2D scene can then be compared with the 3D scene information to assess whether an item is present at a given location in the first image.

[0070] This may be achieved by performing data fusion in which the 3D depth information is projected into the 2D scene. The 3D depth data represents distances between the depth sensor and points within the scene. 3D depth data can be projected onto the 2D image scene using intrinsic parameters of the camera such as focal length, and the extrinsic parameters that describe the relative position and orientation of the depth sensor with respect to the camera that obtains the 2D image scene.
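One possible form of this projection is the standard pinhole camera model, sketched below. The disclosure names only focal length as an intrinsic parameter; the principal point (`cx`, `cy`), the function name `project_point`, and all numerical values are assumptions made for illustration.

```python
def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D point from the depth sensor frame onto the 2D image.

    `rotation` (3x3) and `translation` (3-vector) are the extrinsic
    parameters taking depth-sensor coordinates to camera coordinates;
    fx, fy (focal lengths in pixels) and cx, cy (assumed principal
    point) are the intrinsic parameters.
    """
    # Extrinsics: express the point in the camera coordinate frame.
    xc, yc, zc = (
        sum(r * p for r, p in zip(row, point)) + t
        for row, t in zip(rotation, translation)
    )
    if zc <= 0:
        return None  # the point lies behind the camera
    # Intrinsics: perspective division followed by scaling and offset.
    return fx * xc / zc + cx, fy * yc / zc + cy


# Hypothetical set-up: the sensor is offset 0.25 m along the camera
# x-axis and shares the camera's orientation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point(
    point=(0.25, 0.5, 2.0),
    rotation=identity,
    translation=(0.25, 0.0, 0.0),
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
)
print(pixel)
```

Each projected depth point can then be compared with the regions of motion found at the same pixel location in the first image.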

[0071] The depth information can also be used by the object detection model to identify and classify the item by also considering the estimated dimensions of the item indicated by the depth information. Using the image alone, the object detection model may struggle to differentiate between regions in the image that display similar visual characteristics. The estimated dimensions can assist the model to differentiate between these features in the image.

[0072] In alternative embodiments one or more second images may be obtained from a different physical location to the first image, either using different image capturing devices (e.g., a CCTV system) within a threshold distance of each other or by moving the same image capturing device from a first image capturing position to a second image capturing position. In this instance, the item is kept stationary and the multiple camera views are used to perform stereo vision object detection which will be known to the skilled person. By comparing the first image with the one or more second images, it is possible to match corresponding features between each image using an object detection algorithm 104. By matching corresponding features, it is possible to locate the item within the first image and subsequently extract the first image portion comprising the item as described herein.

[0073] Extracted portion 105 is then passed to a first machine learning model 106 for further processing. In some alternative embodiments, the first machine learning model 106 may receive the first image directly. Using the extracted portion, the first machine learning model can determine one or more characteristic features of the item such as colour, shape, and size.

[0074] In some embodiments, the first machine learning model 106 may be an unsupervised machine learning model. In further embodiments, the first machine learning model 106 may be an unsupervised clustering algorithm.

[0075] The unsupervised clustering algorithm receives the extracted portion 105 and groups the cropped image by comparing the extracted characteristic features to historical data derived from extracting characteristic features of images containing other items. The algorithm receives unlabelled data points corresponding to the extracted portion (e.g., pixel intensities) and determines a similarity measure that determines how close data points are. As would be appreciated by the skilled person, this can be achieved using distance-based metrics (such as Euclidean distance, Manhattan distance, cosine similarity), density-based measures, or graph-based measures.
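The distance-based metrics named above may be sketched as follows. The example assumes that each item has already been reduced to a numeric feature vector (here, a hypothetical three-component colour vector); the function names are illustrative only.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute per-component differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical feature vectors for two similarly coloured items.
item_a = (200, 30, 40)
item_b = (197, 34, 40)

print(euclidean_distance(item_a, item_b))
print(manhattan_distance(item_a, item_b))
print(cosine_similarity(item_a, item_b))
```

Smaller distances, or a cosine similarity closer to 1, indicate data points that are more likely to fall in the same cluster.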

[0076] The unsupervised clustering algorithm therefore clusters the data point corresponding to the item in feature space close to other similar items (e.g., red suitcases with other red suitcases). The data point corresponding to the item is therefore assigned a label based on the historical data obtained by the algorithm. Exemplary clustering algorithms include k-means clustering, DBSCAN, hierarchical clustering, and Gaussian Mixture Models. After labelling the item, the label can be output for further processing.
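As a minimal sketch of the labelling step, the example below assigns a new data point to the nearest of a set of cluster centres, which corresponds to the assignment step of k-means clustering. It assumes the centres have already been learned from historical data; the centre coordinates, the cluster names, and the function `assign_cluster` are hypothetical.

```python
import math


def assign_cluster(point, centroids):
    """Return (label, distance) for the centroid nearest to `point`,
    using Euclidean distance in feature space."""
    label = min(centroids, key=lambda name: math.dist(point, centroids[name]))
    return label, math.dist(point, centroids[label])


# Hypothetical cluster centres derived from historical items, expressed
# as mean RGB feature vectors.
centroids = {
    "red suitcase": (200.0, 30.0, 40.0),
    "black holdall": (20.0, 20.0, 20.0),
    "blue backpack": (30.0, 60.0, 180.0),
}

label, distance = assign_cluster((190.0, 40.0, 50.0), centroids)
print(label, round(distance, 2))
```

The returned distance could, for instance, serve as a basis for the confidence score discussed below, although the disclosure does not specify how that score is derived.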

[0077] In some embodiments the first machine learning model 106 outputs a confidence score associated with at least one of the one or more determined characteristic features, or the determined label. This is an indication of how certain the determination of the first machine learning model is and can be passed to second machine learning model 109, which can assign a weight to the one or more determined characteristic features / labels output from the first machine learning model. This ensures that low-confidence determinations are correspondingly de-weighted for processing in the second machine learning model 109. It will be appreciated that the first machine learning model 106 and second machine learning model 109 may be two different models, or two sub-models / containers of the same machine learning model.
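One simple way in which such de-weighting could be realised is to scale each feature value by its confidence score before it is passed to the second model, as sketched below. The disclosure does not specify the weighting scheme, so the multiplicative rule, the feature names, and the function `weight_features` are assumptions.

```python
def weight_features(features):
    """Scale each feature value by its confidence score.

    `features` maps a feature name to a (value, confidence) pair, with
    confidence in the range [0, 1]. Multiplicative weighting is an
    assumed scheme, not one stated in the disclosure.
    """
    return {name: value * confidence
            for name, (value, confidence) in features.items()}


# Hypothetical outputs of the first model: binary feature indicators
# with associated confidence scores.
features = {
    "hard_shell": (1.0, 0.9),
    "has_wheels": (1.0, 0.4),
    "has_sticker": (0.0, 0.8),
}

weighted = weight_features(features)
print(weighted)
```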

[0078] In some embodiments the first machine learning model 106 may also receive depth information from depth sensor 102. Using this as an additional input alongside the extracted portion 105, the first machine learning model 106 can determine one or more characteristic features of the item with greater accuracy.

[0079] Additionally, the first machine learning model may be used to estimate one or more dimensions of the item based on the depth information to determine whether each dimension estimate is less than a threshold value. Items such as passenger baggage can require compliance checks before being transported, including checking the dimensions of the item. By integrating this step into the first machine learning model it is possible to automate this compliance check. In some embodiments an alert may be output in response to the first machine learning model 106 determining that the item is not less than a threshold size.
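The threshold comparison itself may be sketched as follows, assuming the dimension estimates are already available. The limits shown are hypothetical placeholders and do not represent any airline or regulatory requirement; the function name `oversize_dimensions` is likewise illustrative.

```python
def oversize_dimensions(estimated_cm, limits_cm):
    """Return the names of dimensions whose estimate is not less than
    the corresponding threshold value."""
    return [name for name, value in estimated_cm.items()
            if value >= limits_cm[name]]


# Hypothetical estimated dimensions and threshold values, in centimetres.
estimated_cm = {"length": 78.0, "width": 52.0, "height": 30.0}
limits_cm = {"length": 90.0, "width": 50.0, "height": 75.0}

failed = oversize_dimensions(estimated_cm, limits_cm)
if failed:
    print("ALERT: item exceeds limit for", ", ".join(failed))
```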

[0080] Alternatively, in some embodiments the first machine learning model 106 is a semi-supervised machine learning model trained on pre-labelled images comprising items. Using a semi-supervised machine learning model advantageously allows some prior label selection which can be used to label the data point corresponding to the item in the first image. This allows many input data points to the first machine learning model 106 to be labelled at scale based on some desired predetermined labels. This may be particularly useful when particular labels / features are of interest to an operator (e.g., colour, shape, texture).

The processing means of the image processing system is also configured to determine a colour value 107 associated with the extracted portion of the image. This may be performed at the same device as the preceding object detection step 104, or at a separate server. Colour determination may be performed based on the first image or on the extracted portion 105. In some embodiments, the colour value may be determined by taking an average of a plurality of colour values associated with the extracted portion of the first image. For instance, the pixel colour values in the image portion may be averaged using an average colour calculator 107.

[0081] The average colour may be determined using a well-known function such as a median function, a mode function or a mean function. Embodiments of the invention preferably use a mean function to calculate the mean value of an array of elements. This may be calculated by summing the values of the array of elements and dividing the sum by the number of elements in the array. For example, an array of pixel values may be averaged.
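The mean calculation described above may be sketched as follows for an array of pixel values. The example assumes the extracted portion is available as a flat list of (R, G, B) tuples and averages each channel separately; the function name `mean_colour` and the pixel values are hypothetical.

```python
def mean_colour(pixels):
    """Return the per-channel mean of a sequence of (R, G, B) pixels:
    the sum of each channel divided by the number of pixels."""
    pixels = list(pixels)
    count = len(pixels)
    return tuple(sum(pixel[channel] for pixel in pixels) / count
                 for channel in range(3))


# Hypothetical pixels from the extracted portion of a red item.
pixels = [(200, 20, 30), (220, 40, 50), (210, 30, 40), (210, 30, 40)]

average = mean_colour(pixels)
print(average)
```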

[0082] Colour classification may be performed by a colour mapping process according to a plurality of different colour definitions. These may be classified according to the hue, saturation and value (H, S and V) definitions of a plurality of different colour categorisations in which the different colours are defined according to the values defined in Table 1. Alternatively or additionally, colour determination and classification may be according to red-green-blue (RGB) colour space.

[0083] Table 1 : The H, S, and V definitions of a number of predetermined different colour classifications.

[0084] In some embodiments the first average colour may be determined according to one colour space (e.g., RGB) and a second average colour is determined according to a second different colour space (e.g., HSV). It will be appreciated that other colour spaces may be used such as CMYK, LAB, YUV, XYZ et cetera. The first and second average colours may be determined by taking averages over the whole extracted portion 105 or sub-regions of the extracted portion. By taking averages according to two different colour spaces it is possible to more accurately determine the colour of an item, particularly if the item is multicoloured.
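As a hedged sketch of taking averages in two different colour spaces, the following uses the Python standard library colorsys module. The function names, the 0-255 input range and the two sample pixels are assumptions of this sketch. The hue average here is a plain arithmetic mean, which is a simplification because hue is an angular quantity.

```python
import colorsys


def mean_rgb(pixels):
    """Arithmetic mean of each RGB channel (inputs and outputs on 0-255)."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def mean_hsv(pixels):
    """Average H, S and V separately; inputs are 0-255 RGB, outputs 0-1 floats."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    n = len(hsv)
    return tuple(sum(c[i] for c in hsv) / n for i in range(3))


# Hypothetical two-colour item: one pure red pixel and one pure blue pixel.
pixels = [(255, 0, 0), (0, 0, 255)]
first_average = mean_rgb(pixels)   # average in the first colour space (RGB)
second_average = mean_hsv(pixels)  # average in the second colour space (HSV)
```

For a multicoloured item such as this one the two averages describe the colour differently, which is one way of seeing why both may be informative to a downstream model.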

[0085] After determining the first average colour, the processing means is configured to map the first average colour value to one or a plurality of predetermined colour definitions based on a plurality of colour ranges associated with each colour definition (e.g., as in Table 1 or the rules-based approach in figure 4). By way of example, the average colour value may be compared to RGB values (e.g., using a Hex code) and mapped to the closest colour definition. By using such a rule-based approach to define the target variable, embodiments of the invention provide a systematic approach to defining colours.
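One possible way of mapping an average RGB value to the closest colour definition is a nearest-neighbour comparison against reference Hex codes, sketched below. The reference palette, its Hex codes and the use of squared Euclidean distance are illustrative assumptions and are not taken from Table 1 or figure 4.

```python
# Hypothetical reference palette: the Hex codes are illustrative assumptions.
PALETTE = {
    "BK": "#000000",
    "WT": "#FFFFFF",
    "RD": "#FF0000",
    "GN": "#008000",
    "BU": "#0000FF",
    "YW": "#FFFF00",
}


def hex_to_rgb(code):
    """Convert a '#RRGGBB' Hex code to an (r, g, b) tuple of 0-255 integers."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def closest_colour(rgb, palette=PALETTE):
    """Return the palette label whose reference colour is nearest to rgb."""
    def squared_distance(label):
        reference = hex_to_rgb(palette[label])
        return sum((a - b) ** 2 for a, b in zip(rgb, reference))
    return min(palette, key=squared_distance)
```

For example, closest_colour((20, 30, 200)) selects the "BU" entry of this assumed palette.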

[0086] Usually, the average colour is determined using an RGB colour space. Irrespective of the type of colour space used, a single average over a plurality of channels, such as three channels, of the colour space is determined. Alternatively, individual average (H), average (S), and average (V) may also be determined from the extracted portion 105 which are input into a rules-based algorithm approach set out in figure 4 and discussed below.

[0087] Second machine learning model 109 receives inputs from first machine learning model 106 and the colour value determination. To be specific, the second machine learning model can receive at least some of the first image, the extracted portion of the first image, and the output clusters, characteristic features, or labels from the first machine learning model. Using these inputs, the second machine learning model is configured to categorise the item according to one of a plurality of predetermined item categorisations based on at least the determined one or more characteristic features identified by the first machine learning model 106 and the determined colour value. The processing means may be further configured to output the item categorisation, for example to a server or user interface.

[0088] In some embodiments the second machine learning model 109 may output a confidence score associated with the item categorisation. This corresponds to the certainty with which the item has been categorised. This can allow the item categorisation to be weighted according to the confidence score for further processing or consideration by an operator. For instance, if the confidence score is very low the categorisation is very uncertain and may be erroneous. In some embodiments, if the confidence score is below a threshold value, the image processing system may output an alert. This can bring an operator’s attention to an item to be able to verify the item categorisation.
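The threshold-based alert described above may be sketched as follows. The Categorisation record, the threshold value of 0.5 and the wording of the alert message are hypothetical choices for this sketch. The disclosure does not specify a particular threshold.

```python
from dataclasses import dataclass


@dataclass
class Categorisation:
    label: str
    confidence: float  # assumed to lie in the range 0 to 1


ALERT_THRESHOLD = 0.5  # hypothetical value; no threshold is given in the text


def review_alert(result, threshold=ALERT_THRESHOLD):
    """Return an alert message when the confidence is below the threshold."""
    if result.confidence < threshold:
        return (f"ALERT: low-confidence categorisation {result.label!r} "
                f"({result.confidence:.1%})")
    return None  # confident enough: no operator attention requested
```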

[0089] The plurality of predetermined item categorisations may advantageously be a list of International Air Transport Association (IATA) baggage identification codes. In this way, it is possible to categorise a passenger’s item of baggage according to the standardised IATA categorisation system. It will be appreciated that other predetermined item categorisations may be used for differing use cases, for instance sea cargo or factory production lines.

[0090] In some embodiments, the plurality of predetermined item categorisations may comprise one or more priority item categorisations that have special handling requirements. By way of example, within the plurality of predetermined item categorisations there may be a “perishable” item categorisation which corresponds to perishable items where item handling is subject to time constraints and / or delicate handling requirements. In this example, the “perishable” item categorisation may be defined as a priority item categorisation. In response to categorising an item as one of the one or more priority item categorisations the image processing system may be configured to output an alert. This ensures that an item that requires special handling is flagged to an operator without requiring the operator to manually review all items that are being categorised.

[0091] Advantageously, the image processing system may output the first image comprising the item in response to categorising the item as one or more of the plurality of priority item categorisations. This allows an operator to receive an image of the item as a visual reference to be able to easily identify the item. For instance, the operator may receive the image and note that it contains a small blue suitcase. Using this image, the operator can then review a plurality of items that could contain many colours and types of baggage and visually identify the small blue suitcase by comparison with the image. This allows the operator to identify the item at scale among many other items without having to manually check each item while also receiving the benefits posed by autonomous item categorisation.
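A minimal sketch of the priority alert, including output of the first image as a visual reference, is given below. The function name, the dictionary payload and the image file name are assumptions of this sketch. Only the “perishable” categorisation is taken from the example above.

```python
# "perishable" follows the example in the text; the set may hold other entries.
PRIORITY_CATEGORIES = frozenset({"perishable"})


def priority_alert(category, first_image, priority=PRIORITY_CATEGORIES):
    """Return an alert payload for priority items, otherwise None."""
    if category not in priority:
        return None
    # The first image is attached so an operator has a visual reference.
    return {"alert": f"priority item: {category}", "image": first_image}


# Hypothetical usage with a placeholder standing in for the image data.
payload = priority_alert("perishable", "first_image.png")
```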

[0092] First machine learning model 106 and second machine learning model 109 are different machine learning models. In some embodiments, the second machine learning model 109 is a supervised machine learning model, allowing an operator to specify the labels (or predetermined item categorisations) desired.

[0093] In some embodiments, the second machine learning model 109 is a neural network. Such neural networks are well known to the skilled person and comprise a plurality of interconnected nodes. This may be provided by a web-service cloud server. Usually, the nodes are arranged in a plurality of layers L1, L2, ... LN which form a backbone neural network. The neural network usually has one or more nodes forming an input layer and one or more nodes forming an output layer. Accordingly, the model may be defined by the neural network architecture with parameters defined by the weights. The neural network may be a convolutional neural network (CNN). It will be appreciated that the second machine learning model 109 may employ a decision tree-like approach where a feature embedding vector of the bounding box derived by the object detection model is matched to a corresponding branch of historical data.

[0094] It will be appreciated that the second machine learning model 109 is usually trained. In some embodiments the image processing system may generate training data 108 for the second machine learning model 109 using the first image, the training data 108 comprising the colour mapping and the second average colour. The second machine learning model 109 may then be trained using said training data. In this example, the training data is generated using both the second average colour determined according to a second colour space (as described above) and the result of mapping the first average colour value to one of a plurality of predetermined colour definitions based on a plurality of colour ranges associated with each colour definition. Using at least both of these colour averages to form training data for the second machine learning model 109 allows the machine learning model to more accurately categorise the item than would be achieved by solely implementing a first average colour determination or a colour mapping. Combining these techniques provides an optimum item categorisation algorithm or in other words provides a good balance for good indicators of colour and regularisation.

[0095] In alternative embodiments, training data 108 may not be derived from the colour value determination and instead comprises a number of historical pre-labelled item images. For example, the machine learning model is trained on a diverse dataset of annotated images representing various types of baggage, sizes, colours, materials, contents, etc. Regardless, the second machine learning model 109 categorises the item based on at least the determined one or more characteristic features (determined by first machine learning model 106) and the colour mapping in step 107.

[0096] Training of neural networks is well known to the skilled person, and therefore will not be described in further detail.

[0097] Using the described inputs, second machine learning model 109 categorises the item according to one of the plurality of predetermined item categorisations based on at least the determined one or more characteristic features and the colour value.

[0098] Based on the item categorisation, the image processing system may be further configured to generate a unique identifier associated with the item and output the unique identifier. This may be any one of a baggage license plate number (LPN); a Bag Tag number; a barcode; or any other suitable means for uniquely identifying the item. Appropriate unique identifiers can be generated based on the item categorisation use case (e.g., sea cargo, air cargo, passenger baggage etc.).

[0099] Figure 2 is an exemplary image 200 obtained by an image capture device, such as camera 101. Image 200 may correspond to the first or one or more second images discussed herein, and comprises an item 201 to be categorised. In other examples, the image 200 may only partially include an item such that part of the item is not included in the image. In this example, item 201 is a suitcase with various characteristic features such as wheels, combination lock, and a zipper.

[0100] The image is taken by a camera 101 that is placed on or in a bag check-in desk which receives the suitcase on a belt and may be taken in response to the bag being placed on said belt by a passenger or agent. To detect this, the system may receive inputs from a weight sensor positioned below the belt which detects an increase in weight. Alternatively, the system may use object detection techniques to recognise when an object has been placed in the field of view of the camera. Other methods for detecting the item may be utilised such as laser sensors. The belt may be used to move item 201 to perform motion detection of the item as described herein.

[0101] Using the techniques described herein, it is possible to determine and extract a portion 202 of the first image, the portion 202 comprising the item 201. In this example, portion 202 is a bounding box which tightly encloses item 201 where each side of the bounding box is aligned with the extrema of the item. The extracted portion may therefore correspond exactly to the bounding box by cropping the first image 200 to match the dimensions and position of the bounding box.
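Cropping the first image to the bounding box may be sketched as below. The representation of the image as a list of pixel rows and of the bounding box as a (left, top, right, bottom) tuple with exclusive right and bottom edges are assumptions of this sketch.

```python
def crop(image, box):
    """Crop an image (list of pixel rows) to a bounding box.

    box is (left, top, right, bottom); right and bottom are exclusive.
    """
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# Hypothetical 3 x 4 image whose pixel values encode row * 10 + column.
image = [[row * 10 + col for col in range(4)] for row in range(3)]
portion = crop(image, (1, 0, 3, 2))
```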

[0102] Figure 3 shows a high level overview of a system 300 according to embodiments of the invention. Elements 301 , 302, 303, 304, 305, 306, 309, 310, and 311 of system 300 may perform the same functions as described in relation to elements 101 , 202, 103, 104, 105, 106, 109, 110, and 111 of system 100. In some embodiments, the processing means of the image processing system is configured to determine the colour value using a third machine learning model 307, the third machine learning model being different from the first machine learning model 306 (or 106) and the second machine learning model 309 (or 109).

[0103] The third machine learning model 307 may receive either the first image or the extracted portion 305 of the first image as an input and determine a colour value associated with the first image or extracted portion 305. The determined colour value may then be output to the second machine learning model 309.

[0104] Advantageously, the third machine learning model 307 may be an unsupervised algorithm, or a semi-supervised algorithm. In some embodiments, the third machine learning model 307 is an unsupervised clustering algorithm that is different from first machine learning model 306 which may also be an unsupervised clustering algorithm. As a consequence, model 307 may be configured with different tuning parameters to model 306 to obtain different results even if operating on the same input image or extracted portion 305. This arrangement allows the image processing system to uncover any hidden feature patterns and reduce bias risks that could be introduced by supervised models or the rules-based approach described herein to determine a colour value. If the third machine learning model 307 is a semi-supervised algorithm it is trained on historical images comprising items with pre-labelled features such as colour. In some embodiments, the second machine learning model 309 may assign weights to the inputs from the first and third machine learning models to improve the accuracy of the final item categorisation. The third machine learning model 307 may also output a confidence score indicative of the certainty with which a colour value has been determined. Model 309 may then appropriately increase or decrease the weight of the input from model 307 based on the confidence score, allowing model 309 to adapt to the reliability of the colour value determination.

[0105] Figure 4 shows an exemplary colour tree diagram for determining a colour value that may be performed by any systems described herein. The approach set out in figure 4 may be referred to as a “rules-based approach” or the like and is presented in the H, S, V colour space, though it will be appreciated that other colour spaces may be used in the rules-based approach. However, it has been found that using H, S, V values for this approach is more robust compared to known colour determination algorithms. This processing step maps the determined average colour value to one of a plurality of predetermined colour definitions based on a plurality of colour ranges associated with each colour definition. The item can then be categorised based at least on said mapping in addition to or instead of the determined colour value previously obtained.

[0106] After determining an average colour or set of average colours (e.g., average H, average S, average V) these averages are input into the rules-based approach in order to assign a colour or classify the colour of the image according to a predetermined colour. The following code, as well as the colour tree diagram of figure 4, describes how the rule-based colour mapping is performed:

[0107] In this code:

[0108] • score, label, bounding_box = material_model.image_predict(image)

[0109] • grab_cut_image = grab_cut(image, bounding_box)

[0110] • mean_r, mean_g, mean_b = grab_cut_image.reshape(-1, 3).mean(axis = 0)

[0111] • hsv = rgb_to_hsv(mean_r, mean_g, mean_b)

[0112] • color_clusters = kmeans(k = 5, image)[clusters]

[0113] • features = [color_clusters, [mean_r, mean_g, mean_b], rule_color]

Type features:

[0114] • material_pred = material_retinanet.predict(img)

• type_pred = type_retinanet.predict(img)

[0115] • external_pred = external_retinanet.predict(img)

[0116] • features = [material_pred[labels], type_pred[labels], external_pred[labels]]

Brightness reduction:

• h, s, v = rgb_to_hsv(r, g, b)

[0117] • v = v - 40

[0118] • r, g, b = hsv_to_rgb(h, s, v)

[0119] Table 2: Code defining the HSV to colour function. In this, the labels bk, gy, wt, bn, rd, yw, be, gn, bu, pu, rd are the colours black, grey, white, brown, red, yellow, beige, green, blue, purple and red respectively. As will be appreciated from the colour tree diagram of figure 4, the average colour is mapped to one of a plurality of predetermined colour categorisations defined by colour space values.
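The brightness reduction steps of the listing may be made runnable with the Python standard library colorsys module, as sketched below. The function name, the 0-255 RGB input range, the reading of the value 40 as lying on a 0-100 V scale, and the clamping at zero are all assumptions of this sketch.

```python
import colorsys


def reduce_brightness(r, g, b, delta=40):
    """Lower the V channel of an RGB colour and convert back to RGB.

    r, g, b are 0-255 integers; delta is on an assumed 0-100 V scale.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    v = max(0.0, v - delta / 100)  # clamp so that V never becomes negative
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(c * 255) for c in (r2, g2, b2))
```

Under these assumptions a white pixel (255, 255, 255) becomes a mid grey, while hue and saturation are left unchanged.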

[0120] Referring to figure 4 of the drawings it will be appreciated that embodiments of the invention may first examine saturation, S, values, then examine value, V, values, and then examine hue, H, values if needed.

[0121] Examining the values in this order is particularly beneficial compared to examining first the hue values, then the saturation values and then the value values.

[0122] By way of explanation, the saturation, S, values first indicate how grey an image is, or in other words the lack of colour. This is therefore used first to filter to the colours black, grey and white if the S value is under a certain value.

[0123] The value, V, values then indicate brightness, for example that the bag is black if the V value is under a certain value.

[0124] Finally, the hue, H, values are used to determine within which colour range or ranges a bag may be, and may again be checked with V if two colours are close in H values.

[0125] This is beneficial because it is assumed that light sources have a colour temperature that is not always constant. This means that any black, grey or white bag may have a hue value which is not accurate. Accordingly, hue values are not checked until the algorithm has determined with enough certainty that the bag is not a black, grey or white bag.

[0126] In the specific example of the HSV colour mapping shown in figure 4, a determination is first made as to whether the average S value is less than a first threshold = 15. If the average S value is less than 15, then the next step is to determine whether the average V value is greater than a second threshold = 25. If the average V value is not greater than 25, then the bag is categorised as black. Otherwise, if the average V value is greater than 25, then a determination is made as to whether the average V value is greater than a third threshold = 75. If the average V value is not greater than the third threshold, then the bag is categorised as grey, but if the average V value is greater than 75, then the bag is classified as white.
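The black, grey and white branch of the colour tree may be sketched as follows, using the thresholds 15, 25 and 75 given above. This sketch reads the low-saturation branch (average S below 15) as the one leading to black, grey and white, consistent with the explanation that low saturation indicates a lack of colour. It further assumes that S and V are expressed on a 0-100 scale. The function name is hypothetical, and the hue branch of figure 4 is not reproduced because its ranges are not given in the text.

```python
S_THRESHOLD = 15  # first threshold: below this the colour is treated as achromatic
V_BLACK = 25      # second threshold: at or below this the bag is black
V_WHITE = 75      # third threshold: above this the bag is white


def classify_achromatic(avg_s, avg_v):
    """Return "bk", "gy" or "wt" for low-saturation averages, else None.

    avg_s and avg_v are assumed to be on a 0-100 scale.
    """
    if avg_s >= S_THRESHOLD:
        return None  # chromatic: the hue branch of figure 4 would take over
    if avg_v <= V_BLACK:
        return "bk"
    if avg_v <= V_WHITE:
        return "gy"
    return "wt"
```

Because hue is never consulted in this branch, an inaccurate hue reading caused by the light source cannot affect the black, grey or white decision.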

[0127] By using such a rule-based approach to define the target variable, embodiments of the invention provide a systematic approach to defining colours. This solves the problem of inconsistent labelling of bag colours. For example, one person may label a bag as purple while another may consider that the same bag is in fact red.

[0128] This procedure may correctly categorise a bag colour according to a predetermined colour categorisation.

[0129] Figure 5 shows an exemplary user interface 500 showing an image of a passenger bag together with determined item categorisations and characteristics. This user interface may be provided, for example, as part of a baggage check-in desk or on a mobile device that receives item categorisation information from a server. As will be seen from figure 5, the user interface may output the first image comprising the item, or may alternatively output the extracted portion of the first image. This allows a user to be able to visually identify a bag and use the interface as a point of reference instead of using merely a verbal or written description of the item which may be unreliable or subjective.

[0130] Additionally the user interface may output one or more determined characteristic features of the item and / or the determined colour value of the item. In the example of figure 5, the colour of the item is output “BLUE” along with a categorisation term “BU”. Additional item characteristics are shown such as “STRAPS TO CLOSE”, “COMBINATION LOCK”, and “WHEELS” which are all features that can be used to identify and categorise an item. Categorisation terms “BU”, “T02”, “T22R” and so on as presented in figure 5 may be combined or concatenated to provide a comprehensive item categorisation.

[0131] In some embodiments, the user interface advantageously outputs the confidence score associated with each determined characteristic feature and colour value. For example, the colour “BLUE” is determined with a 99.8% confidence score and is therefore considered to be very accurate. Conversely, the feature “COMBINATION LOCK” is determined with a 33.9% confidence score and is therefore likely erroneous. Said confidence scores can be output by the first, second, and / or third machine learning models as described herein. By providing the confidence scores, it is possible to prioritise searching and identifying the item using the characteristics that have high confidence scores and avoid using low confidence scores to potentially identify items erroneously. In the absence of these confidence scores, a user may look to search for an item using an erroneously identified characteristic, resulting in wasted time and potentially mishandling of items.

Figure 6 shows an example process flow 600 for categorising an item and may be performed in accordance with systems 100 or 300 described herein. By way of example, process flow 600 may be used to categorise an item according to standardised IATA baggage identification codes. It will therefore be appreciated that the “image” 601 shown in flow 600 may be the first image or extracted portion of the first image (e.g., 105), and that “object detection models” 602 refer to processes 103, 104, 303, and / or 304. Similarly “vision models” 603 may refer to process 106, 107, 108, 109, 306, 307, and / or 309.

[0132] Using the techniques described, it is possible to categorise an item according to a plurality of predetermined item categorisations. In one processing path, an average colour value 604 is determined such as those listed in box 611 : WT: White; BK: Black; GY: Gray; BU: Blue; PU: Purple; RD: Red; YW: Yellow; BE: Beige; BN: Brown; GN: Green; MC: Mix colour; PR: Pink.

[0133] Additionally one or more characteristic features of the item are determined according to box 605 which separates bags that close with zippers “Yes” and “No”. Box 612 lists bags that do not close with zippers: 01: Horizontal Design, Generally Hard Shell; 02: Upright Design, Generally Hard Shell; 03: Horizontal Design, Rigid Frame, Hard or Soft Side / May Have Reinforced Corners; 05: Horizontal Design, Rigid Frame, Soft Sided, Expandable; 06: Briefcase; 07: Flight / Pilots Bag / Documents Case; 08: Military Style Bag; 09: Plastic / Laundry Bag; 10: Cardboard / Wooden Box; 12: Storage Container (May Have Handle / Wheels). Box 613 lists bag categorisations that do close with zippers: 20: Hanging or Folding Garment Bag; 22: Upright Design, Soft Material Retractable Handle; 22D: upright design combined hard and soft material; 22R: upright design, durable, plastic, molded; 23: horizontal design suitcase that secures with zipper; 25: gym / sport type bag - all sizes, may have shoulder strap and retractable handle; 26: small overnight or laptop bag; 27: stands upright, may be expandable, often has wheels; 28: matted woven bag; 29: backpack / rucksack with or without frame, may have retractable handle and wheels.

[0134] The item may also be categorised as a Miscellaneous Article - Special Container 606, for example as set out in box 614: 50: Hat Box; 51: Courier Bag / Box / Package; 52: Trunk / Sample / Display Case (often custom made); 53: Art or Display Portfolio; 54: Tube without Sporting Equipment; 55: Duty Free Articles (Not Listed Elsewhere); 56: Cosmetic / Beauty Case, Any Style; 57: Kennel / Pet Containers. The item may also be categorised as a Sporting Good 607, for example as set out in box 615: 60: fishing rods / poles / sticks; 61: firearms; 62: golf bag and / or clubs - specify color and brand; 63: bicycle and / or accessories; 64: sleeping bag / bed roll / tent; 65: surf equipment - e.g. wind, kite, boogie; 66: skis and / or poles, Nordic walking poles; 67: snow boards and other sledding devices; 68: ski boots and / or boot bags; 69: sporting equipment not shown elsewhere; 75: wheeled sporting items - e.g. skateboards, scooters; 89: camping, folding collapsible chair.

[0135] The item may also be categorised as Baby / Infant / Child Equipment 608, for example as set out in box 616: 71: Child / Infant Car Seat; 72: Child / Infant Equipment Not Listed Elsewhere e.g., Infant Carrier, Playpen; 73: Full Size Pram / Baby Carriage / Jogger; 74: Umbrella Stroller.

[0136] The item may also be categorised as Photographic / Electronic / Musical / Communication Equipment 609, for example as set out in box 617: 81 : Audio / Visual / Photo Equipment e.g., Television, DVD, Video Players and Games, Radio, Tape / Cassette / CD Player; 82: Computer / Communication Equipment e.g., PC Computer, Monitor, Keyboard, Laptop, Printer, Computer Games, Telephone, Fax Machine; 83: Electrical Appliances; 85: All Musical Instruments.

[0137] The item may also be categorised as a Miscellaneous Article 610, for example as set out in box 618: 90: Baggage Trolley; 92: Security Removed Items; 93: Shopping Bag e.g., Straw, Plastic, Nylon; 94: Wheelchair - Powered or Manual and Accessories - e.g., Seat, Wheel, etc.; 95: Orthopaedic Devices other than a Wheelchair; 96: Bedding - May contain Pillows and Blankets; 97: Dive Bag / Equipment; 98: Beach, Patio, Golf, and Rain Umbrella; 99: Article other than Bag not appearing on this list - describe item.

[0138] By categorising an item according to one of this plurality of predetermined item categorisations, it is possible to identify the article easily and prevent item mishandling.
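As an illustration of how a colour code and a type code from the lists above might be combined into one categorisation, a lookup-based sketch follows. Only a small subset of the codes is included, the descriptions are copied from the lists, and the concatenated output format and the function name are assumptions of this sketch.

```python
# Subset of the codes listed above; descriptions are copied from the lists.
COLOUR_CODES = {"WT": "White", "BK": "Black", "GY": "Gray", "BU": "Blue", "RD": "Red"}
TYPE_CODES = {
    "06": "Briefcase",
    "20": "Hanging or Folding Garment Bag",
    "22R": "upright design, durable, plastic, molded",
}


def describe(colour_code, type_code):
    """Concatenate a colour code and a type code into one categorisation string."""
    colour = COLOUR_CODES[colour_code]  # raises KeyError for an unknown code
    bag_type = TYPE_CODES[type_code]
    return f"{colour_code}{type_code}: {colour}, {bag_type}"
```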

[0139] Figure 7 is a flowchart of an example operation 700 of categorising an item performed by an image processing system. At step 701 the system obtains a first image comprising an item.

[0140] At step 702 the system extracts a portion of the first image, the portion comprising the item.

[0141] At step 703 the system determines, using the extracted portion of the first image, one or more characteristic features of the item using a first machine learning model. At step 704 the system determines a colour value associated with the extracted portion of the first image. At step 705 the system categorises, using a second machine learning model, the item according to one of a plurality of predetermined item categorisations based on at least the determined one or more characteristic features and the colour value, wherein the second machine learning model is different to the first machine learning model. It will be appreciated that the order of at least some of these steps may be changed and that the steps are performed in accordance with the techniques described above.
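The sequence of steps 701 to 705 may be sketched as a single function in which each stage is supplied as a callable. The function name, the parameter names and the stand-in lambdas used in the usage example are hypothetical and merely take the place of the real extraction step, machine learning models and colour determination.

```python
def categorise_item(first_image, extract, first_model, colour_of, second_model):
    """Run steps 702 to 705 on an already obtained first image (step 701)."""
    portion = extract(first_image)         # step 702: extract the item portion
    features = first_model(portion)        # step 703: characteristic features
    colour = colour_of(portion)            # step 704: colour value
    return second_model(features, colour)  # step 705: item categorisation


# Hypothetical stand-ins so that the flow can be exercised end to end.
result = categorise_item(
    first_image=[[1, 2], [3, 4]],
    extract=lambda image: image[:1],
    first_model=lambda portion: ["wheels"],
    colour_of=lambda portion: "BU",
    second_model=lambda features, colour: f"{colour}:{'+'.join(features)}",
)
```

Steps 703 and 704 both depend only on the extracted portion, which reflects the statement that the order of at least some steps may be changed.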

[0142] Figure 8 shows a schematic image processing system 800 that may perform the functions described herein. The system may comprise a camera 801 as an image capturing means configured to capture the first image. Advantageously, the system 800 may also include one or more depth sensors 802 configured to obtain depth information of a scene that may comprise an item to be categorised. The system may also include one or more transceivers 803 that may be used to transmit any data obtained or generated by system 800, including the first image, the extracted portion of the first image, the determined one or more characteristic features of an item, the determined colour value associated with the extracted portion of the first image, and / or the categorisation of the item according to one of a plurality of predetermined item categorisations. Such transmissions may be sent to a server for further processing in accordance with the steps described herein. Data bus 804 may be included in system 800 to transport data between the various functional components in the system (e.g., camera, depth sensor, transceiver). System 800 may also include memory 805 comprising instructions which, when executed by the system, cause the system to perform the techniques described herein. Some or all of the functionality in system 800 may be implemented by processor 806 by execution of code stored in the memory 805. System 800 may also include a user interface 807 configured to output information to a user such as alerts, and the first image as described above.

[0143] It will be noted that the various components in system 800 may be implemented in one or more circuits (e.g., one or more processors and / or one or more ASICs).

[0144] Embodiments of the invention may be advantageously used to locate missing or lost items.

[0145] This may be performed by searching a database or storage means for an item having characteristics corresponding to the determined categorisation of the item. Location data and data defining a time when an item was categorised may also be stored in the database and associated with the item. Thus, a processor may be configured to search a database for items having associated location data and data defining a time when the item was detected which is associated with the determined item classification.

[0146] Thus, it will be appreciated that when a bag or item is missing or lost, the processor may advantageously search a database for matching bags with the same categorisation during a predetermined time period at a predetermined location. This has the benefit that missing items may be more quickly located.
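A search for matching items by categorisation, location and time period may be sketched as below. The ItemRecord fields, the in-memory list standing in for the database, and the sample records (identifiers, location code and timestamps) are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ItemRecord:
    item_id: str
    categorisation: str
    location: str
    seen_at: datetime


def find_candidates(records, categorisation, location, start, end):
    """Return records with the same categorisation seen at the location in [start, end]."""
    return [
        record for record in records
        if record.categorisation == categorisation
        and record.location == location
        and start <= record.seen_at <= end
    ]


# Hypothetical stored records standing in for a database table.
records = [
    ItemRecord("a", "BU22R", "YUL", datetime(2026, 1, 5, 9, 0)),
    ItemRecord("b", "BU22R", "YUL", datetime(2026, 1, 6, 9, 0)),
    ItemRecord("c", "BK06", "YUL", datetime(2026, 1, 5, 9, 30)),
]
matches = find_candidates(
    records, "BU22R", "YUL",
    datetime(2026, 1, 5, 8, 0), datetime(2026, 1, 5, 12, 0),
)
```

Filtering on the categorisation first narrows the candidates before any manual comparison against an image of the item.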

[0147] By way of example, a passenger may arrive at a busy airport that handles many bags. The passenger may arrive at a baggage check-in desk and look to check in their item of baggage. The passenger, or an airport agent, may take the item of baggage and place it on a belt. Attached to or near the desk, a camera is directed towards the belt and obtains an image of the item. Either locally at the desk, or separately at a server, the techniques described herein are used to categorise the item of baggage. Said categorisation can then be output to the passenger, agent, both, or indeed any baggage handler involved. After item categorisation, it is then possible to query a database of categorised items and search according to the item categorisation to limit the results of the search. In this way, the item can be easily located, for instance if it has been mishandled. By using the standardised item categorisation system, the search is rendered much more effective as the subjectivity of item categorisations by humans is removed.

[0148] Exemplary embodiments of the invention may be implemented as a circuit board which may include a CPU, a bus, RAM, flash memory, one or more ports for operation of connected I / O apparatus such as printers, display, keypads, sensors and cameras, ROM, and the like.

[0149] The wired or wireless communication networks described above may be a public, private, wired or wireless network. The communications network may include one or more of a local area network (LAN), a wide area network (WAN), the Internet, a mobile telephony communication system, or a satellite communication system. The communications network may comprise any suitable infrastructure, including copper cables, optical cables or fibres, routers, firewalls, switches, gateway computers and edge servers.

[0150] The system described above may comprise a Graphical User Interface. Embodiments of the invention may include an on-screen graphical user interface. The user interface may be provided, for example, in the form of a widget embedded in a web site, as an application for a device, or on a dedicated landing web page. Computer readable program instructions for implementing the graphical user interface may be downloaded to the client device from a computer readable storage medium via a network, for example, the Internet, a local area network (LAN), a wide area network (WAN) and / or a wireless network. The instructions may be stored in a computer readable storage medium within the client device.

[0151] As will be appreciated by one of skill in the art, the invention described herein may be embodied in whole or in part as a method, a data processing system, or a computer program product including computer readable instructions. Accordingly, the invention may take the form of an entirely hardware embodiment or an embodiment combining software, hardware and any other suitable approach or apparatus.

[0152] The computer readable program instructions may be stored on a non-transitory, tangible computer readable medium. The computer readable storage medium may include one or more of an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk.

[0153] The above detailed description of embodiments of the invention is not intended to be exhaustive or to limit the invention to the precise form disclosed. For example, while processes or blocks are presented in a given order, alternative embodiments may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed in parallel or may be performed at different times.

[0154] The teachings of the invention provided herein can be applied to other systems, not necessarily the system described above. The elements and acts of the various embodiments described above can be combined to provide further embodiments.

[0155] While some embodiments of the inventions have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the disclosure. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the disclosure.

Claims

1. An image processing system for categorising an item, the system comprising: processing means configured to: obtain a first image comprising an item; extract a portion of the first image, the portion comprising the item; determine, using the extracted portion of the first image, one or more characteristic features of the item using a first machine learning model; determine a colour value associated with the extracted portion of the first image; categorise, using a second machine learning model, the item according to one of a plurality of predetermined item categorisations based on at least the determined one or more characteristic features and the colour value, wherein the second machine learning model is different to the first machine learning model.

2. The image processing system of claim 1, wherein the processing means is further configured to: obtain one or more second images, each of the one or more second images comprising the item; and extract the portion of the first image based on a difference in a position of the item between the first image and the one or more second images.

3. The image processing system of claim 2, wherein the processing means is further configured to obtain the one or more second images within a threshold period of time of obtaining the first image.

4. The image processing system of any preceding claim, wherein the processing means is configured to determine the colour value using a third machine learning model, the third machine learning model being different to the first and second machine learning models.

5. The image processing system of any of claims 1 to 3, wherein determining a colour value associated with the extracted portion of the first image comprises determining a first average colour value of a plurality of colour values associated with the extracted portion of the first image.

6. The image processing system of claim 5, wherein the processing means is further configured to: map the first average colour value to one of a plurality of predetermined colour definitions based on a plurality of colour ranges associated with each colour definition; and categorise the item based on at least the determined one or more characteristic features and the mapping.

7. The image processing system of claim 5 or 6, wherein the first average colour value is according to a first colour space and wherein the processing means is further configured to determine a second average colour value according to a second colour space, the first colour space being different from the second colour space.

8. The image processing system of claims 6 or 7, wherein the processing means is further configured to: generate training data using the first image, the training data comprising the mapping and the second average colour; and train the second machine learning model using the training data.

9. The image processing system of any preceding claim, wherein the first machine learning model is an unsupervised machine learning model.

10. The image processing system of claim 9, wherein the unsupervised machine learning model is an unsupervised clustering algorithm.

11. The image processing system of any of claims 1 to 8, wherein the first machine learning model is a semi-supervised machine learning model.

12. The image processing system of any preceding claim, wherein the second machine learning model is a supervised machine learning model.

13. The image processing system of claim 12, wherein the supervised machine learning model is a neural network.

14. The image processing system of any preceding claim, wherein the processing means is further configured to: obtain depth information associated with the item; and extract the portion of the first image based on the depth information.

15. The image processing system of claim 14, wherein the processing means is further configured to: estimate, using the first machine learning model, a dimension of the item based on the depth information; and determine whether the estimated dimension is less than a threshold value.

16. The image processing system of any preceding claim, wherein the processing means is further configured to output the item categorisation.

17. The image processing system of any preceding claim, wherein the processing means is further configured to: generate, based on the categorisation, a unique identifier associated with the item; output the unique identifier.

18. The image processing system of claim 17, wherein the unique identifier is a barcode.

19. The image processing system of any preceding claim, wherein determining one or more characteristic features includes outputting to the second machine learning model a confidence score associated with at least one of the one or more determined characteristic features.

20. The image processing system of any preceding claim, wherein the item is an item of baggage or an item of cargo.

21. The image processing system of any preceding claim, wherein extracting the portion of the first image comprises: determining a bounding box enclosing the item within the first image; and cropping the first image to extract the portion of the first image.

22. The image processing system of any preceding claim, wherein the plurality of predetermined item categorisations correspond to International Air Transport Association baggage identification codes.

23. The image processing system of any preceding claim, wherein the plurality of predetermined item categorisations comprises one or more priority item categorisations.

24. The image processing system of claim 23, wherein the processing means is further configured to, in response to categorising the item as one of the one or more priority item categorisations, output an alert.

25. The image processing system of claim 23 or 24, wherein the processing means is further configured to, in response to categorising the item as one of the one or more priority item categorisations, output the first image.

26. The image processing system of any preceding claim, wherein the image processing system is part of an airport baggage check-in system.

27. The image processing system of any preceding claim, wherein the processing means is further configured to obtain the first image comprising the item in response to a passenger or agent placing the item on a bag drop belt.

28. The image processing system of any preceding claim, wherein the processing means is further configured to output a confidence score associated with the item categorisation.

29. The image processing system of claim 28, wherein the processing means is further configured to output an alert if the confidence score is below a threshold value.

30. The image processing system of any preceding claim, wherein the processing means is further configured to: obtain a second image comprising the item, said second image having been captured from a different physical location to the first image; and extract the portion of the first image based on a comparison of the first image and the second image.

31. A method for categorising an item, the method comprising: obtaining a first image comprising an item; extracting a portion of the first image, the portion comprising the item; determining, using the extracted portion of the first image, one or more characteristic features of the item using a first machine learning model; determining a colour value associated with the extracted portion of the first image; categorising, using a second machine learning model, the item according to one of a plurality of predetermined item categorisations based on at least the determined one or more characteristic features and the colour value, wherein the second machine learning model is different to the first machine learning model.

32. A computer program comprising instructions which, when executed by a computer, cause the computer to perform the method of claim 31.
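
By way of illustration only, the colour-handling steps recited in claims 5 to 7 and the cropping step of claim 21 might be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application: the choice of RGB as the first colour space and HSV as the second, the hue bands and thresholds in `HUE_BANDS` and `map_to_colour`, and all function names are hypothetical. The second average colour value is obtained here by converting the RGB average to HSV, which is a simplification; averaging in the second colour space directly is another possible reading.

```python
import colorsys

# Hypothetical hue bands in degrees: (exclusive upper bound, colour definition).
# These ranges are illustrative assumptions, not values from the claims.
HUE_BANDS = [(20, "red"), (70, "yellow"), (170, "green"),
             (260, "blue"), (340, "purple"), (360, "red")]


def crop(image, box):
    """Return the pixels inside box = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def average_rgb(portion):
    """First average colour value: per-channel mean in RGB (assumed first colour space)."""
    pixels = [pixel for row in portion for pixel in row]
    if not pixels:
        raise ValueError("extracted portion is empty")
    return tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))


def rgb_to_hsv_degrees(rgb):
    """Second average colour value: the RGB average expressed in HSV (a simplification)."""
    h, s, v = colorsys.rgb_to_hsv(*(channel / 255 for channel in rgb))
    return h * 360, s, v


def map_to_colour(hsv):
    """Map an HSV value to one of the hypothetical colour definitions by range."""
    hue, saturation, value = hsv
    if value < 0.2:            # very dark, regardless of hue
        return "black"
    if saturation < 0.15:      # too little colour to assign a hue
        return "white" if value > 0.8 else "grey"
    for upper, name in HUE_BANDS:
        if hue < upper:
            return name
    return "red"               # hue of exactly 360 wraps round to red


if __name__ == "__main__":
    white, red = (255, 255, 255), (200, 30, 30)
    # A 4x4 image: white background with a 2x2 red "item" in the centre.
    image = [[white] * 4,
             [white, red, red, white],
             [white, red, red, white],
             [white] * 4]
    portion = crop(image, (1, 1, 3, 3))   # bounding box enclosing the item
    mean_rgb = average_rgb(portion)
    print(mean_rgb, map_to_colour(rgb_to_hsv_degrees(mean_rgb)))
```

In a system of the kind claimed, the resulting colour definition and the second average colour value would be passed, together with the characteristic features from the first machine learning model, to the second machine learning model; that step is not shown here.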