Method for detecting occlusion of a camera and for monitoring the functioning of a camera

By extracting features from camera image data and using a statistical model to detect occlusion, the problem of low efficiency in camera occlusion detection in existing technologies is solved, achieving fast and resource-saving occlusion detection and ensuring the normal functioning of the camera system.

CN122265136A · Pending · Publication Date: 2026-06-23 · VOLVO CAR CORP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
VOLVO CAR CORP
Filing Date
2025-12-18
Publication Date
2026-06-23

Abstract

The present disclosure relates to a method for detecting an occlusion of a camera and a method for monitoring a functionality of a camera. The method for detecting an occlusion of a camera comprises obtaining image data indicative of an image captured by the camera, extracting, from the image data, first information indicative of features in the image data, and evaluating, by a statistical model, whether at least a portion of the camera is occluded based on second information indicative of an amount of features in the first information. Furthermore, the present disclosure relates to a data processing device, a computer program and a computer readable storage medium, each comprising corresponding means for performing the steps of at least one of the methods described. Additionally, the present disclosure relates to a vehicle comprising a camera and a data processing apparatus.

Description

Technical Field

[0001] This disclosure relates to a method for detecting camera occlusion and a method for monitoring camera functionality. Furthermore, this disclosure relates to data processing apparatus, computer programs, and computer-readable storage media, each including corresponding components for performing steps of at least one of the methods. Additionally, this disclosure relates to a vehicle including a camera and a data processing device.

Background Technology

[0002] Increasingly, cameras are being deployed inside or outside various objects. For example, objects may include vehicles. Regardless of whether the camera is located inside or outside an object (e.g., a vehicle), camera occlusion can occur for various reasons. For example, the camera may be blocked by other objects. Dust, snow, ice, condensation, moisture, fog, and / or mist on the camera or camera lens can also cause camera occlusion. The camera may also be damaged or misaligned, which likewise results in camera occlusion. As cameras are increasingly used in different systems to autonomously perform various tasks (e.g., analysis or sensing tasks), there is a need for methods that can detect camera occlusion in a fast, efficient, and resource-saving manner.

Summary of the Invention

[0003] This disclosure addresses the problem of improving camera occlusion detection, thereby also improving camera monitoring capabilities. Improving camera occlusion detection can include detecting camera occlusion in a fast, efficient, and / or resource-saving manner.

[0004] The problem is at least partially solved or mitigated by the subject matter of the independent claims of this disclosure, wherein further examples are incorporated in the dependent claims.

[0005] According to a first aspect, a method for detecting camera occlusion is provided. The method includes the steps of: obtaining image data indicating an image captured by the camera; extracting first information indicating features in the image data from the image data; and evaluating, using a statistical model, whether at least a portion of the camera is occluded based on second information indicating the amount of features in the first information.
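The three steps of the first aspect can be sketched as a small data flow. This is a minimal illustration only, not the claimed implementation: the function names, the toy "edge" extractor, the gray-level difference of 50, and the rule that too few features suggests occlusion are all assumptions made for the example.

```python
from collections import Counter

def extract_features(image):
    """Step 2: return 'first information' as (type, row, col) tuples.

    Hypothetical stand-in extractor: marks an 'edge' wherever two
    horizontally adjacent pixels differ by more than 50 gray levels.
    """
    features = []
    for r, row in enumerate(image):
        for c in range(len(row) - 1):
            if abs(row[c + 1] - row[c]) > 50:
                features.append(("edge", r, c))
    return features

def count_features(features):
    """Derive 'second information': the amount of features per type."""
    return Counter(ftype for ftype, _, _ in features)

def detect_occlusion(image, min_edges=3):
    """Steps 1 to 3: image data in, occlusion verdict out."""
    counts = count_features(extract_features(image))
    # Assumed decision rule: too few features suggests an occluded view.
    return counts["edge"] < min_edges

# Step 1: image data as rows of gray values (toy examples).
clear = [[0, 200, 0, 200], [200, 0, 200, 0]]
blocked = [[10, 12, 11, 10], [11, 10, 12, 11]]
print(detect_occlusion(clear), detect_occlusion(blocked))
```

Only the amount of features reaches the decision step; the features themselves are not analyzed further, which is the property the disclosure relies on for efficiency.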

[0006] Camera occlusion is a topic widely discussed by those skilled in the art. Generally, camera occlusion occurs when an object or thing partially or completely obstructs the camera's field of view; that is, when a part of an object, thing, or scene is not visible from the camera's viewpoint due to obstruction or occlusion by other objects or things. Dirt, dust, or debris on the camera lens can also be a cause of camera occlusion. Furthermore, and most importantly, for externally mounted cameras, snow or ice forming around the camera in winter conditions can cause camera occlusion. Similarly, condensation or moisture on the camera lens, fog or mist, physical damage to the camera, or misalignment can all cause camera occlusion. These occlusions or obstructions can significantly affect the performance of systems that rely on a properly functioning camera. This disclosure is applicable to the detection of any kind of camera occlusion.

[0007] This disclosure enables the detection of camera occlusion from the camera's field of view, as image data from the image captured by the camera is used for detection. Generally, those skilled in the art understand the meaning of the term "image data." The term "image data," referring to an image captured by a camera, means image data as a digital representation of the captured image in a structured format. It is well known that image data typically consists of pixel data. Additionally, image data may also include metadata. Pixel data / information can represent the color and intensity of light at each point in the image. Each pixel can be defined by multiple channels (e.g., RGB for color images) and can vary in depth (e.g., 8-bit, 16-bit), thus affecting the range of colors represented. It is well known that image data can be organized in different dimensions, such as 2D (standard images), 3D (volume data), or even higher dimensions including time and other variables. Metadata for image data is also well known and can provide context for pixel data, including details about the image type (e.g., TIFF, JPEG), size, acquisition settings (such as exposure time and the hardware used), and other relevant parameters that help in the correct interpretation of the image. Metadata can be used to determine how the image was captured and what conditions might have affected the quality and content of the captured image. Image data can be stored in various file formats known to those skilled in the art. Common formats may include, for example, TIFF, JPEG, and / or PNG, wherein any other suitable image file format may be used in accordance with this disclosure.

[0008] Detecting camera occlusion based on image data captured by the camera improves the quality of camera occlusion detection results, i.e., improves the reliability of camera occlusion detection, because the detection is performed from the camera's field of view.

[0009] Furthermore, due to the use of statistical models, camera occlusion detection can be performed in a fast, efficient, and / or resource-saving manner. Statistical models are fast, efficient, and resource-saving analytical tools. Fast, efficient, and / or resource-saving camera occlusion detection is also achieved by performing a quantitative analysis of the features of the image data rather than a deep (i.e., qualitative) analysis. Focusing on the number of features increases the efficiency of camera occlusion detection and reduces the amount of resources (e.g., storage and / or computational resources) required to perform camera occlusion detection.

[0010] This method can be implemented at least partially by a computer, and can be implemented in software or hardware, or both. Furthermore, the method can be executed by computer program instructions running on a component providing data processing functionality. The data processing component can be any suitable computing component (such as an electronic control module), or it can be a distributed computer system. The data processing component or the computer can each include one or more of a processor, memory, data interface, etc.

[0011] As illustrated, cameras can be mounted on or inside a vehicle. Vehicles use cameras for various purposes. For example, cameras in and / or on a vehicle provide a range of essential functions to enhance safety, assist the driver, and / or improve the overall driving experience. For instance, cameras are used in Advanced Driver Assistance Systems (ADAS) for monitoring and recording, parking assistance, improving visibility in low-light conditions, adjusting vehicle lighting based on environmental conditions (e.g., during nighttime driving or inclement weather conditions), security monitoring, and / or fleet management. Cameras in or on a vehicle support many additional functions, and cameras have become an integral part of modern vehicles, making significant contributions to safety, convenience, and efficiency on the road. Their ability to function effectively under various conditions is also crucial for the advancement of autonomous driving technologies. Correspondingly, detecting camera occlusion of cameras mounted on and / or inside a vehicle is essential for the correct execution of the various and critical functions supported by the cameras within the vehicle.

[0012] According to an example, the features in the image data can be image features; and / or the features in the image data can be features of different types of image features; and / or each amount among the amounts of features in the first information can indicate the number of features belonging to at least one particular type of image feature. Therefore, camera occlusion detection can be performed at different levels of granularity and with high variability of the features under consideration. Furthermore, the camera occlusion detection described herein is generally applicable to any camera, whether it is installed indoors or outdoors, and regardless of the purpose of its use.

[0013] According to the example, different types of image features may include at least one of the following types: edges, corners, textures, shapes, colors, transformation-based features, local feature descriptors, points of interest, and / or objects.

[0014] According to an example, the statistical model evaluates whether at least a portion of the camera is occluded by comparing the amounts of features with corresponding thresholds for those amounts. Simply comparing amounts of features with their corresponding thresholds requires far less time and fewer computational resources to determine whether a camera is occluded than methods that rely on more complex analysis of the features themselves and / or on artificial intelligence (AI) or machine learning (ML) models. Such methods require significantly more storage and computational resources, as well as more computation, to reach an appropriate decision on whether the camera is occluded.
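The threshold comparison can be sketched as follows. The feature type names, the threshold values, and the convention that an amount below its threshold indicates occlusion are assumptions for illustration; the disclosure does not fix them.

```python
def is_occluded(feature_counts, thresholds):
    """Return True if any feature amount falls below its threshold.

    Assumption: a missing feature type counts as zero features.
    """
    return any(
        feature_counts.get(ftype, 0) < minimum
        for ftype, minimum in thresholds.items()
    )

# Hypothetical per-type thresholds.
thresholds = {"edges": 40, "corners": 10}

print(is_occluded({"edges": 120, "corners": 35}, thresholds))
print(is_occluded({"edges": 5, "corners": 35}, thresholds))
```

The check is a handful of integer comparisons per image, which is where the claimed saving over feature-level or learned analysis would come from.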

[0015] According to the examples, the statistical model is a Bayesian statistical model, a random forest model, or a decision tree model. Any other suitable statistical model may also be used according to this disclosure. The use of statistical models has efficiency advantages because they require fewer storage and processing resources than methods that rely on more complex analyses of the features themselves and / or are based on artificial intelligence (AI) or machine learning (ML) models, which require, for example, more storage and computational resources and more computation to reach an appropriate decision on whether the camera is occluded.
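One way a Bayesian statistical model could operate on a feature amount is sketched below. The Poisson likelihoods, the two rate parameters, and the prior are assumptions chosen for illustration; the disclosure only names the model family and does not specify its form.

```python
import math

def poisson_log_pmf(k, rate):
    """Log-probability of observing k events under a Poisson(rate) model."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def posterior_occluded(count, rate_clear=80.0, rate_occluded=5.0, prior=0.05):
    """Bayes' rule: P(occluded | observed feature count).

    rate_clear / rate_occluded are hypothetical expected feature amounts
    for an unobstructed and an occluded camera; prior is P(occluded).
    """
    log_occ = math.log(prior) + poisson_log_pmf(count, rate_occluded)
    log_clear = math.log(1.0 - prior) + poisson_log_pmf(count, rate_clear)
    # Normalise in log space to avoid underflow for large counts.
    top = max(log_occ, log_clear)
    p_occ = math.exp(log_occ - top)
    p_clear = math.exp(log_clear - top)
    return p_occ / (p_occ + p_clear)

print(posterior_occluded(3))
print(posterior_occluded(70))
```

In practice the rates would have to be estimated from images with known occlusion state; the values here are placeholders.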

[0016] According to the examples, feature extraction can be performed using at least one of a set of feature extraction processes. For example, this set of feature extraction processes may include at least one of the following: MinEigen, BRISK, FAST, Harris, LBP, GLCM, HOG, Gabor, SIFT, SURF, BRIEF, ORB, HSV, YCbCr, LAB, Canny, Sobel, Laplacian, Shi-Tomasi, FFT, DCT, DWT, RANSAC, KLT, PCA, t-SNE, UMAP, CNN, ResNet, VGG, YOLO, RCNN, UNet, OF, STIP, FD, ZFD, HOG, FAST, BRIEF, ORB, HSV, and / or MSER. The exemplary feature extraction processes listed above are well known to those skilled in the art. However, this disclosure is not limited to the exemplary feature extraction processes / tools listed above, and any other suitable feature extraction process or tool, or a combination thereof, may be utilized. This disclosure therefore allows a modular implementation in which one or more additional feature extraction processes or tools can be added to perform feature extraction, and / or any of the feature extraction processes or tools already in use can be removed and thus no longer used to perform feature extraction. For example, if a more efficient process or tool exists, it can be easily integrated. If a process or tool is outdated and / or inefficient, it can be easily removed. Similarly, processes or tools can be added or removed based on the required feature extraction. Thus, one or more specific processes or tools can be selected based on the features to be extracted. Some of the exemplary feature extraction processes / tools listed above are explained in more detail below.
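The modular arrangement described above, in which extraction processes can be added or removed, can be sketched with a simple registry. The class name, its methods, and the two toy extractors are hypothetical and stand in for real processes such as those listed.

```python
class ExtractorRegistry:
    """Holds named feature extraction processes that can be swapped freely."""

    def __init__(self):
        self._extractors = {}

    def add(self, name, func):
        self._extractors[name] = func

    def remove(self, name):
        self._extractors.pop(name, None)

    def run(self, image):
        """Apply every registered process; return feature amounts by name."""
        return {name: len(func(image)) for name, func in self._extractors.items()}

def bright_pixels(image):
    # Toy extractor: positions of pixels brighter than 200.
    return [(r, c) for r, row in enumerate(image)
            for c, v in enumerate(row) if v > 200]

def dark_pixels(image):
    # Toy extractor: positions of pixels darker than 50.
    return [(r, c) for r, row in enumerate(image)
            for c, v in enumerate(row) if v < 50]

registry = ExtractorRegistry()
registry.add("bright", bright_pixels)
registry.add("dark", dark_pixels)

image = [[0, 255], [128, 255]]
counts_before = registry.run(image)
registry.remove("dark")          # drop a process that is no longer wanted
counts_after = registry.run(image)
print(counts_before, counts_after)
```

Because each process only has to return a collection of features, replacing one with a more efficient alternative does not affect the counting or evaluation steps.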

[0017] The MinEigen process is configured to extract, for example, corner features from 2-D grayscale or binary images using a minimum eigenvalue algorithm developed by Shi and Tomasi. The algorithm identifies points in the image data where the minimum eigenvalue of the gradient covariance matrix is above a certain threshold, indicating potential corner points. The extracted features can be returned as corner objects containing information about these detected corner features in the image data.
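The minimum-eigenvalue test can be written in closed form for the 2x2 gradient covariance matrix. The sketch below assumes the matrix entries (sums of squared and mixed gradients over a window) have already been computed; the function names and the threshold value are illustrative.

```python
import math

def min_eigenvalue(ixx, ixy, iyy):
    """Smaller eigenvalue of the symmetric matrix [[ixx, ixy], [ixy, iyy]]."""
    half_trace = (ixx + iyy) / 2.0
    root = math.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    return half_trace - root

def is_corner(ixx, ixy, iyy, threshold=1.0):
    """A point is a corner candidate if the smaller eigenvalue is large."""
    return min_eigenvalue(ixx, ixy, iyy) > threshold

print(min_eigenvalue(4.0, 0.0, 9.0))   # strong gradients in both directions
print(min_eigenvalue(10.0, 0.0, 0.0))  # gradient in one direction only
```

A window with gradient energy in only one direction (an edge) yields a smaller eigenvalue of zero and is rejected, while a window with energy in both directions passes.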

[0018] BRISK (Binary Robust Invariant Scalable Keypoint) processing is used to extract, for example, several different image features that are crucial for tasks such as image matching and object recognition. Features that BRISK can extract may include at least one of the following: sampling pattern, keypoints, descriptor composition, rotation invariance, intensity of sampling points, etc.

[0019] FAST (Features from Accelerated Segment Test) processing is used, for example, for corner detection. Therefore, features extracted by the FAST algorithm can include, for example, points of interest or corners in an image, which can be used to track and map objects in various computer vision tasks.

[0020] Harris processing is used, for example, to extract corner features from images. For instance, Harris processing can identify corners as keypoints in an image. A corner can be defined as a point where there are significant changes in intensity gradients in multiple directions, which makes corners distinct and stable features. Furthermore, Harris processing can detect gradient changes; that is, it can detect regions in an image where there are large changes in intensity in all possible directions, indicating corners.
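The commonly used Harris corner response can be computed from the same 2x2 gradient matrix without an eigenvalue decomposition. The sketch assumes the matrix entries are given; k = 0.04 is a conventional choice, not a value taken from this disclosure.

```python
def harris_response(ixx, ixy, iyy, k=0.04):
    """Harris score R = det(M) - k * trace(M)^2 for M = [[ixx, ixy], [ixy, iyy]]."""
    det = ixx * iyy - ixy * ixy
    trace = ixx + iyy
    return det - k * trace * trace

print(harris_response(4.0, 0.0, 9.0))   # positive: corner-like
print(harris_response(10.0, 0.0, 0.0))  # negative: edge-like
```

A positive score indicates intensity changes in more than one direction, a negative score indicates an edge, and a score near zero indicates a flat region.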

[0021] Local Binary Pattern (LBP) processing is employed, for example, to extract texture features from an image. LBP encodes local texture information in an image by comparing the intensity of a central pixel with the intensities of its surrounding neighboring pixels. Features extracted by LBP can include at least one of the following: edge patterns, corner patterns, flat regions (featureless areas), uniform patterns (representing continuous regions), and / or non-uniform patterns (representing more complex textures), etc.
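The comparison of a central pixel with its neighbors can be sketched for the basic 3x3 case. The neighbor ordering and the "greater than or equal" convention are assumptions; other LBP variants use different radii and orderings.

```python
def lbp_code(image, r, c):
    """8-bit LBP code for the pixel at (r, c), which must not be on the border."""
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
peak = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]
top_edge = [[9, 9, 9], [0, 5, 0], [0, 0, 0]]
print(lbp_code(flat, 1, 1), lbp_code(peak, 1, 1), lbp_code(top_edge, 1, 1))
```

With this convention a featureless (flat) neighborhood always produces the same code, so a histogram of codes dominated by that value would be one possible sign of a textureless image.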

[0022] GLCM (Gray-Level Co-occurrence Matrix) processing is used, for example, to extract texture features from images. GLCM can analyze the spatial relationships between pixel pairs in an image, and therefore can be used for texture analysis. Features extracted from a GLCM may include at least one of the following, for example: contrast features (e.g., by measuring local variations in the gray-level co-occurrence matrix, contrast features can reflect the intensity contrast between a pixel and its neighboring pixels across the entire image), correlation features (e.g., by assessing the correlation between a pixel and its one or more neighboring pixels across the entire image, correlation features measure the joint probability of occurrence of a given pair of pixels), energy (e.g., angular second moment) features (energy represents, for example, the sum of squared elements in the GLCM, also known as uniformity, and can indicate the texture uniformity of the image), homogeneity (e.g., inverse difference moment) features (e.g., by assessing the proximity of the element distribution in the GLCM to its diagonal, homogeneity features can reflect the uniformity of the image), entropy features (quantifying, for example, randomness or disorder within the image, where higher entropy values indicate greater complexity and less predictability of the image texture), and / or dissimilarity features (which may be similar to contrast features and can measure / indicate the degree of dissimilarity of each element with each other in the GLCM), etc.
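Several of the GLCM features named above can be computed directly from a normalized co-occurrence matrix. The sketch assumes integer gray levels in the range 0 to levels - 1, a single pixel offset, and an image that contains at least one valid pixel pair; the homogeneity formula with denominator 1 + (i - j)^2 is one common definition.

```python
import math

def glcm(image, levels, dr=0, dc=1):
    """Normalised co-occurrence matrix for pixel pairs offset by (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]

def texture_features(p):
    """Contrast, energy, homogeneity and entropy of a normalised GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }

flat = texture_features(glcm([[1, 1], [1, 1]], levels=2))
checker = texture_features(glcm([[0, 1], [1, 0]], levels=2))
print(flat)
print(checker)
```

A uniform image gives zero contrast and entropy with maximal energy, whereas a textured image gives higher contrast and entropy.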

[0023] HOG (Histogram of Oriented Gradients) processing is employed, for example, to extract features of the shape and / or structure of objects focused on in an image. Features extracted by HOG may include at least one of the following: gradient orientation (appearance), local shape information, histogram of gradient directions, block normalization, spatial and orientation binning, etc.

[0024] Gabor processing is employed, for example, to extract features related to texture and / or orientation in an image.

[0025] SIFT (Scale Invariant Feature Transform) processing is used, for example, to detect key points in an image.

[0026] SURF (Speeded-Up Robust Features) processing is arranged, for example, to detect blob-like structures in an image, which may include identifying regions in the image where there are significant changes in intensity, which may indicate corners and blobs in the image.

[0027] BRIEF (Binary Robust Independent Elementary Features) processing is arranged, for example, for extracting binary descriptors from image patches.

[0028] ORB (Oriented FAST and Rotated BRIEF) processing is used, for example, to extract key points and binary descriptors from images.

[0029] HSV (Hue, Saturation, Value) processing is arranged, for example, to extract color-based features from an image, wherein the color-based features may include at least one of the following: hue, saturation, and / or value (also known as brightness).
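Color-based features of this kind can be obtained with Python's standard `colorsys` module. The helper name and the choice of averaging saturation and value are assumptions for illustration; pixels are taken to be RGB triples in the range 0 to 255.

```python
import colorsys

def mean_hsv(pixels):
    """Average saturation and value over RGB pixels given as 0-255 triples."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    n = len(hsv)
    mean_s = sum(s for _, s, _ in hsv) / n
    mean_v = sum(v for _, _, v in hsv) / n
    return mean_s, mean_v

print(mean_hsv([(255, 0, 0), (0, 0, 0)]))
```

Hue is deliberately not averaged here, because it is an angle and a plain arithmetic mean of hues can be misleading.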

[0030] The YCbCr processing is arranged, for example, to extract at least one of the following features from an image: luminance (Y), blue color difference (Cb), and red color difference (Cr).

[0031] LAB processing is arranged, for example, to extract at least one of the following features from an image: color information, where L represents brightness, A represents the green-to-red dimension, B represents the blue-to-yellow dimension, and / or perceived uniformity. LAB processing can enhance edge detection and / or image fragment detection in an image.

[0032] Canny processing is employed, for example, to extract edge features from images. Canny can identify and / or highlight the boundaries or edges of objects within an image by detecting regions of rapid intensity changes. Features extracted by Canny may include at least one of the following: edge boundaries, gradient information, and / or multi-scale edges, etc.

[0033] Sobel processing is a widely used technique in image processing and is used, for example, for edge detection in images.
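Sobel edge detection can be sketched with the standard 3x3 kernels, followed by a count of pixels whose gradient magnitude exceeds a threshold. The threshold of 100 and the function names are assumptions; border pixels are skipped for simplicity.

```python
def sobel_magnitude(image, r, c):
    """Gradient magnitude at interior pixel (r, c) using 3x3 Sobel kernels."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    gx = gy = 0
    for i in range(3):
        for j in range(3):
            pixel = image[r + i - 1][c + j - 1]
            gx += kx[i][j] * pixel
            gy += ky[i][j] * pixel
    return (gx * gx + gy * gy) ** 0.5

def count_edge_pixels(image, threshold=100):
    """Amount of interior pixels whose Sobel magnitude exceeds the threshold."""
    rows, cols = len(image), len(image[0])
    return sum(
        1
        for r in range(1, rows - 1)
        for c in range(1, cols - 1)
        if sobel_magnitude(image, r, c) > threshold
    )

step = [[0, 0, 255, 255] for _ in range(4)]   # vertical step edge
flat = [[40, 40, 40, 40] for _ in range(4)]   # no structure at all
print(count_edge_pixels(step), count_edge_pixels(flat))
```

The resulting count is an example of an amount of features that could be passed on to the statistical model.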

[0034] Laplacian processing is a widely used technique in image processing and is employed, for example, to detect at least one of the following features: edges, blurriness, and / or corners.

[0035] Shi-Tomasi processing is employed, for example, to extract corner features from an image.

[0036] FFT (Fast Fourier Transform) processing is arranged, for example, to extract at least one of the following features: frequency components, amplitude and phase information, low and high frequencies, and / or power spectrum.

[0037] DCT (Discrete Cosine Transform) processing is used, for example, to extract spectral information from an image, converting 2D spatial information into frequency domain features. Features extracted by DCT processing may include at least one of the following: texture patterns, edges, low-frequency components, image energy, and / or spatial frequencies.

[0038] DWT (Discrete Wavelet Transform) processing is arranged to extract at least one of the following features: color features (e.g., color distribution, color space, and / or dominant color pattern), texture features, and / or shape features.

[0039] The RANSAC (Random Sample Consensus) process utilizes different known feature extraction techniques to extract image features and is also arranged to identify objects in the image as complex image features.

[0040] KLT (Kanade-Lucas-Tomasi) processing is arranged to extract features from an image, including at least one of the following, such as: corners, feature points, eigenvalues of the gradient matrix, spatial intensity information, etc.

[0041] PCA (Principal Component Analysis) processing is arranged, for example, to identify the directions (principal components) that maximize the variance of the image data, to determine uncorrelated principal components in the image, and / or to create "eigenfaces", which are principal components derived from a facial image dataset that represent the most important features in facial recognition (e.g., eyes, nose, mouth) and can be used to identify individuals based on their facial characteristics.

[0042] t-SNE (t-distributed Stochastic Neighbor Embedding) processing and / or UMAP (Uniform Manifold Approximation and Projection) processing can be used to process high-dimensional image data by reducing it to a low-dimensional space (typically two or three dimensions), which can improve the efficiency of further image processing for extracting features from image data.

[0043] ResNet (Residual Network) processing is employed, for example, for image classification and feature extraction. For instance, ResNet processing can extract features such as low-level, mid-level, and / or high-level features, where low-level features may include, for example, edges, textures, and / or colors; mid-level features may include, for example, the shape and / or portions of an object; and high-level features may include, for example, the entire object or scene.

[0044] Using CNNs (Convolutional Neural Networks) for image processing and / or feature extraction is generally well-known. For example, CNNs can also extract features such as low-level, mid-level, and / or high-level features.

[0045] VGG (Visual Geometry Group) processing or models are deep convolutional neural networks used for image feature extraction. For example, VGG can be used to extract objects and / or image fragments.

[0046] For example, YOLO (You Only Look Once) processes features using deep convolutional neural networks to extract features at multiple scales, and can extract features such as low-level, mid-level and / or high-level features.

[0047] RCNN (Region-based Convolutional Neural Network) processing or tools are deployed, for example, to extract objects from images.

[0048] For example, UNet processing or tools are deep learning architectures and can be used for image segmentation. U-Net can extract, for example, hierarchical features representing different levels of abstraction, from edges and textures to more complex patterns. For instance, U-Net can be used to capture the context of an image.

[0049] Optical flow (OF) processing is arranged, for example, to extract motion-related features. OF can extract at least one of the following: motion vectors, pixel-level displacements, temporal gradients, and / or spatial gradients.

[0050] STIP (Space-Time Interest Points) processing is employed, for example, to extract specific features from images. STIP can identify, for example, actions (such as specific movements or activities), objects, and / or anomalies in an image.

[0051] FD (Fractal Dimension) processing is employed, for example, to extract texture-related features.

[0052] ZFD (Zettabyte File System) processing is arranged, for example, to extract keypoints and / or local feature descriptors around each keypoint.

[0053] HOG (Histogram of Oriented Gradients) processing is employed, for example, to extract local shape information from regions within an image. Specifically, HOG can capture the distribution of gradient orientations in localized portions of an image. Features extracted by HOG can be at least one of the following, for example: edge orientation and magnitude, local intensity gradient, object contour and shape.

[0054] FAST (Features from Accelerated Segment Test) processing is arranged, for example, to extract corner features from an image.

[0055] BRIEF (Binary Robust Independent Elementary Features) processing is employed, for example, to describe image features. BRIEF is a feature descriptor and can be used to encode local image patches into compact binary strings.

[0056] ORB (Oriented Fast and Rotated BRIEF) processing is a feature detection and descriptor algorithm that is deployed to extract multiple image features. ORB can be used for, for example, keypoint detection, orientation assignment to each keypoint, and / or descriptor extraction.

[0057] The HSV (Hue, Saturation, Value) color space can be used to extract certain image features due to its ability to separate color information (hue) from intensity (value). HSV can extract at least one of the following features: hue, saturation, and / or value (also known as brightness).

[0058] MSER (Maximally Stable Extremal Regions) processing is used, for example, to extract blob-like features from an image.

[0059] As stated above, the processes / tools listed above are well known and enable those skilled in the art to understand how to use them for feature extraction. Additionally, this disclosure is not limited to the processes / tools listed above, and any other suitable further processing may also be used for feature extraction according to this disclosure.

[0060] According to a second aspect, a method is provided for monitoring the function of a camera by performing the steps of the method of the first aspect, wherein the method further includes: if an assessment indicates that at least a portion of the camera is occluded, causing an output of an indication of the presence of the occluded camera.
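The second aspect adds an output step to the detection method. A minimal sketch of one monitoring cycle follows; the callback-based structure, the function names, and the message text are assumptions, and the detector passed in stands for any implementation of the first aspect.

```python
def monitor_camera(get_image, detect_occlusion, notify):
    """Run one monitoring cycle; report only when occlusion is detected."""
    occluded = detect_occlusion(get_image())
    if occluded:
        # Cause output of an indication that the camera is occluded.
        notify("camera occluded")
    return occluded

# Toy usage: the detector treats an all-dark frame as occluded.
messages = []
dark_frame = [[0, 0], [0, 0]]
result = monitor_camera(
    get_image=lambda: dark_frame,
    detect_occlusion=lambda img: all(v == 0 for row in img for v in row),
    notify=messages.append,
)
print(result, messages)
```

Keeping the notification behind a callback leaves open how the indication is output, for example as a dashboard warning or a diagnostic log entry.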

[0061] According to a third aspect, a data processing apparatus is provided, comprising components for performing steps of the method of the first aspect and / or the method of the second aspect.

[0062] According to the fourth aspect, a computer program including instructions is provided, which, when executed by a computer, cause the computer to perform the steps of the method of the first aspect and / or the method of the second aspect.

[0063] According to a fifth aspect, a computer-readable storage medium including instructions is provided, which, when executed by a computer, cause the computer to perform the steps of the method of the first aspect and / or the method of the second aspect.

[0064] According to a sixth aspect, a vehicle is provided, the vehicle including a camera and a data processing apparatus according to the third aspect.

[0065] It should be noted that the above examples can be combined with each other, regardless of the aspects involved. Therefore, the method can be combined with structural features, and similarly, the apparatus and system can be combined with the features described above regarding the method.

[0066] These and other aspects of this disclosure will become apparent from the examples described below and will be illustrated with reference to the examples described below.

Description of the Drawings

[0067] Examples of this disclosure will now be described with reference to the following figures.

[0068] Figure 1 shows exemplary components according to examples of this disclosure, by means of which this disclosure can be implemented;

[0069] Figure 2 shows the steps of a method for detecting camera occlusion according to an example of this disclosure;

[0070] Figure 3 shows the steps of a method for monitoring the functioning of a camera according to an example of this disclosure;

[0071] Figure 4 shows an arrangement of a data processing apparatus according to an example of this disclosure;

[0072] Figure 5 shows an arrangement of a vehicle according to an example of this disclosure;

[0073] Figure 6A shows an exemplary image used, according to an example of this disclosure, for determining whether the camera capturing the image is occluded;

[0074] Figure 6B shows exemplary images used, according to examples of this disclosure, for determining whether the camera capturing the image is occluded; and

[0075] Figure 6C shows an exemplary image used, according to an example of this disclosure, for determining whether the camera capturing the image is occluded.

Detailed Description

[0076] The accompanying drawings are merely illustrative and are intended to illustrate examples of this disclosure only. Identical or equivalent elements are generally referred to by the same reference numerals.

[0077] Figure 1 shows exemplary components according to this disclosure, by means of which this disclosure can be implemented. In the example of Figure 1, monitoring of the functioning of camera 10 is performed. Specifically, occlusion of camera 10 can be detected according to the example of Figure 1.

[0078] Camera 10 can be positioned inside or outside an object. For example, the object could be a car. Camera 10 is configured to capture image 12, which can be used for further analysis purposes. In a vehicle, for example, one or more cameras 10 and the images 12 captured by one or more cameras 10 can be used for a number of basic purposes, such as enhancing safety, driver assistance, improving the driving experience, and / or providing valuable data.

[0079] In the example of Figure 1, image data 12, indicating the image captured by camera 10, is obtained, as indicated by arrow 11. As mentioned above, image data 12 is a digital representation of the image captured by the camera. Therefore, the terms image data and image are used interchangeably herein, since image data 12 is an image in digital form.

[0080] When the image or image data 12 is acquired, at least a portion of the image data 12 is used to extract first information 16 indicating features in the image data 12. Reference numeral 16 is used hereinafter to indicate both the first information and the features (the first information indicates the features).

[0081] In Figure 1, different features in the image data 12 are schematically indicated as different geometric shapes within a rectangle representing the first information 16. Feature 16 in the image data 12 is an image feature. Furthermore, the features 16 in the image data 12 can be features of different types of image features. For example, different types of image features include at least one of the following types: edges, corners, textures, shapes, colors, transformation-based features, local feature descriptors, points of interest, and / or objects. This disclosure is not limited to these image feature types. Features of any other suitable image feature types not explicitly listed above can be extracted according to this disclosure and be present in the first information 16 indicating features in the image data 12. The extraction or obtaining of the first information 16 indicating features in the image data 12 from the image data 12 is indicated generally by dotted line 17 in Figure 1.

[0082] At least one of a set of feature extraction processes / tools 14 can be used to perform the extraction of first information 16 indicative of features in image data 12 from image data 12. Feature extraction from image data 12 is generally well-known. Therefore, this disclosure can utilize any one (well-known) of the feature extraction processes or tools 14 arranged for image feature extraction, or combinations thereof. Combining more than one feature extraction process or tool can, for example, be used to improve accuracy, confidence, or redundancy.

[0083] As described above, the feature extraction process or tool 14 may include at least one of the following: MinEigen, BRISK, FAST, Harris, LBP, GLCM, HOG, Gabor, SIFT, SURF, BRIEF, ORB, HSV, YCbCr, LAB, Canny, Sobel, Laplacian, Shi-Tomasi, FFT, DCT, DWT, RANSAC, KLT, PCA, t-SNE, UMAP, CNN, ResNet, VGG, YOLO, RCNN, UNet, OF, STIP, FD, ZFD, HOG, FAST, BRIEF, ORB, HSV, and / or MSER. As explained above, any other suitable feature extraction process or tool 14 may also be utilized according to this disclosure.

[0084] According to the example of Figure 1, at least a portion of image data 12 can be transferred 13 to at least one corresponding feature extraction process or tool 14. The at least one feature extraction process or tool 14 then extracts, from the image data 12, the first information 16 indicating features in the image data 12. The extraction of features 16 from the image data 12 by the at least one feature extraction process or tool 14, and thus the provision of the first information 16 by the at least one feature extraction process or tool 14, is indicated by arrow 15 in Figure 1. The first information 16 can be understood as a set of features of image data 12.

[0085] When the first information 16 or, respectively, the set 16 of features of the image data 12 has been obtained, at least a portion of the first information 16 indicating the features in the image data 12 or, respectively, at least a portion of the set 16 of features of the image data 12 can be provided to the statistical model 19 (see arrow 18 in Figure 1).

[0086] Statistical model 19 can evaluate whether at least a portion of camera 10 is occluded based on second information, which indicates the amount of features in the first information 16 or, respectively, in the set 16 of features of the image data 12.

[0087] According to this disclosure, the amount of features in the first information 16 can indicate the number of features 16 of at least one particular type of image feature.

[0088] For example, when edges are considered as a specific type of image feature 16, there may be an edge threshold defined for the image, which can be used to determine whether at least a portion of camera 10 is occluded. The edge threshold can be compared to the number of edges extracted from image data 12 and indicated in the first information 16. For example, if the number of extracted edges indicated in the first information 16 and extracted from image data 12 is higher than the edge threshold, it can be determined that camera 10 is not occluded because it is assumed that a sufficient number of objects with corresponding edges exist in image data 12. Furthermore, there may be another edge threshold higher than the edge threshold, and if the number of extracted edges indicated in the first information 16 and extracted from image data 12 is higher than this other edge threshold, it can be determined that camera 10 is occluded because, based on the assumption of a certain (high) number of edges in image 12, camera 10 must be dirty or otherwise occluded.
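The two-threshold edge check described above can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: the function name, the parameter names, and the threshold values used in the usage note are hypothetical. Treating an edge count at or below the lower threshold as occluded is an assumption, consistent with the zero-feature example of Figure 6B.

```python
def assess_edge_count(num_edges: int, low_threshold: int, high_threshold: int) -> bool:
    """Return True if the camera is assessed as occluded, False otherwise.

    All names and the handling of the boundary cases are illustrative assumptions.
    """
    if low_threshold >= high_threshold:
        raise ValueError("high_threshold must be greater than low_threshold")
    if num_edges > high_threshold:
        # Implausibly many edges: the lens is assumed to be dirty or otherwise occluded.
        return True
    if num_edges > low_threshold:
        # Enough edges for a scene with objects, but not implausibly many.
        return False
    # Assumption: too few edges means the scene is blocked.
    return True
```

With hypothetical thresholds of 100 and 5000, `assess_edge_count(1404, 100, 5000)` yields `False` (not occluded), while counts of 0 or 9000 both yield `True`.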

[0089] Similar approaches are possible for any of the features or feature types in image 12. It is possible to extract from image data 12 the number of corners, specific texture values, the number of shapes, the number of different colors, the number of points of interest, the number of objects, the number or values of transformation-based features, and / or the number of local feature descriptors, etc. Based on these quantities, a decision regarding the absence or presence of camera occlusion can be made by comparing them with corresponding thresholds.

[0090] The analysis of the quantities of features 16 detected in image data 12 is performed using statistical model 19. Statistical model 19 can assess whether at least a portion of camera 10 is occluded by comparing the quantities of the set of features with corresponding thresholds for those quantities.

[0091] According to this disclosure, statistical model 19 may be, for example, a Bayesian statistical model, a random forest model, or a decision tree model, wherein any other suitable statistical model may also be used according to this disclosure.

[0092] In a Bayesian statistical model, for example, there may exist knowledge about the quantity (distribution) of image features indicating that the camera 10 is occluded and / or about the quantity (distribution) of image features indicating that the camera 10 is not occluded. The Bayesian statistical model uses this knowledge to infer whether the camera 10 is occluded based on the second information, which indicates the amount of features in the first information 16 or, respectively, in the set 16 of features of the image data 12.

[0093] Inferences based on Bayesian statistical models are generally well-known.
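One possible form of such a Bayesian inference is sketched below. The disclosure does not specify the distributions; the sketch assumes, purely for illustration, that the feature count is normally distributed in the occluded and in the unoccluded case. The prior, the means, and the standard deviations are hypothetical placeholders, as are all function and parameter names.

```python
import math


def gaussian_log_pdf(x: float, mean: float, std: float) -> float:
    """Log density of a normal distribution (illustrative likelihood model)."""
    return -0.5 * math.log(2.0 * math.pi * std ** 2) - (x - mean) ** 2 / (2.0 * std ** 2)


def posterior_occluded(
    feature_count: float,
    prior_occluded: float = 0.1,   # hypothetical prior probability of occlusion
    occluded=(20.0, 30.0),         # hypothetical (mean, std) of counts when occluded
    clear=(1500.0, 600.0),         # hypothetical (mean, std) of counts when unoccluded
) -> float:
    """Return P(occluded | feature_count) by Bayes' rule."""
    log_occ = math.log(prior_occluded) + gaussian_log_pdf(feature_count, *occluded)
    log_clr = math.log(1.0 - prior_occluded) + gaussian_log_pdf(feature_count, *clear)
    # Normalise in log space so that very unlikely counts do not underflow both terms.
    top = max(log_occ, log_clr)
    p_occ = math.exp(log_occ - top)
    p_clr = math.exp(log_clr - top)
    return p_occ / (p_occ + p_clr)
```

Under these assumed parameters, a count of zero features gives a posterior above 0.5 (occluded), while a count of 1404 features gives a posterior close to zero.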

[0094] Random forest models are a well-known supervised machine learning (ML) process for classification tasks. According to this disclosure, a random forest model can be a model trained to classify whether camera 10 is occluded based on image feature data. Specifically, the random forest model is trained based on the amount of image features present in the image to classify whether camera 10 is occluded. During training, multiple decision trees are constructed with respect to the amount of image features. The training of the random forest model and the generation of multiple decision trees are well-known in themselves and will not be described in detail here.

[0095] Therefore, when the feature quantities in the first information 16 or, respectively, in the set 16 of features of image data 12 have been obtained, they can be passed as input data to the random forest model, so that the random forest model can classify whether the camera 10 is occluded based on its decision trees. Each decision tree of the random forest model makes its own prediction based on the rules it has learned from the image feature quantities in the corresponding training subset. The final prediction about whether the camera 10 is occluded is determined by majority vote among the decision trees. Each tree votes for the corresponding label (i.e., occluded camera 10 or unoccluded camera 10), and the label with the most votes becomes the final prediction.
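The majority vote can be illustrated with the following sketch. The three "trees" are hand-written stand-ins, not trained models; their feature names (`corners`, `edges`) and split values are hypothetical. In practice the trees would be learned from labelled feature quantities.

```python
from collections import Counter
from typing import Callable, Dict, List

FeatureCounts = Dict[str, int]
Tree = Callable[[FeatureCounts], str]


def tree_a(counts: FeatureCounts) -> str:
    return "occluded" if counts["corners"] < 50 else "unoccluded"


def tree_b(counts: FeatureCounts) -> str:
    return "occluded" if counts["edges"] < 200 else "unoccluded"


def tree_c(counts: FeatureCounts) -> str:
    if counts["corners"] < 30:
        return "occluded"
    return "occluded" if counts["edges"] < 100 else "unoccluded"


def forest_predict(trees: List[Tree], counts: FeatureCounts) -> str:
    """Each tree votes for a label; the label with the most votes wins."""
    votes = Counter(tree(counts) for tree in trees)
    return votes.most_common(1)[0][0]
```

For the input `{"corners": 400, "edges": 150}`, the second stand-in tree votes "occluded" and the other two vote "unoccluded", so the forest returns "unoccluded". Using an odd number of trees with two labels avoids ties.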

[0096] Compared to random forest models, decision tree models are single-tree models where decisions are made based on feature values, resulting in different outcomes at the leaf nodes of the tree. The decision tree model is trained based on image features present in the image to classify whether camera 10 is occluded. The training of decision tree models is well-known.

[0097] Then, when the feature quantities in the first information 16 or, respectively, in the set 16 of features of image data 12 have been obtained, they can be passed as input data to the decision tree model, so that the decision tree model can classify whether camera 10 is occluded based on its decision tree. The decision tree model considers the corresponding feature quantity at each internal node and decides which branch of the tree to follow based on that feature quantity. This process continues until a leaf node of the decision tree model is reached, which provides the corresponding classification answer, i.e., occluded camera 10 or unoccluded camera 10. Prediction using a decision tree model is known in itself.
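The traversal from the root to a leaf node can be sketched as below. The tree structure, the feature names, and the split thresholds are hypothetical and hand-written for illustration; a real decision tree model would obtain them through training.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # "occluded" or "unoccluded"


@dataclass
class Node:
    feature: str                    # feature quantity inspected at this node
    threshold: float                # split value
    below: Union["Node", Leaf]      # branch taken if quantity <= threshold
    above: Union["Node", Leaf]      # branch taken if quantity > threshold


def classify(node: Union[Node, Leaf], counts: Dict[str, float]) -> str:
    """Follow branches according to the feature quantities until a leaf is reached."""
    while isinstance(node, Node):
        node = node.below if counts[node.feature] <= node.threshold else node.above
    return node.label


# Hypothetical tree: few corners and few edges are taken to indicate occlusion.
EXAMPLE_TREE = Node(
    feature="corners",
    threshold=50,
    below=Node(feature="edges", threshold=100,
               below=Leaf("occluded"), above=Leaf("unoccluded")),
    above=Leaf("unoccluded"),
)
```

Calling `classify(EXAMPLE_TREE, {"corners": 0, "edges": 0})` follows the `below` branch twice and ends at the "occluded" leaf.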

[0098] Therefore, statistical model 19 evaluates whether at least a portion of camera 10 is occluded based on second information indicating feature quantity in first information 16 indicating features in image data 12.

[0099] An assessment of whether at least a portion of camera 10 is occluded can be performed quickly in real time with reduced processing and storage resources. Camera 10 captures a current image. Image features 16 can be rapidly extracted using at least one image feature extraction process, wherein, if more than one feature extraction process is used, the respective feature extraction processes used can be used at least partially in parallel, which accelerates feature extraction. Statistical model 19 can quickly determine whether camera 10 is occluded based on the amount of image features, because detailed analysis of the extracted image features 16 is not required.
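Running several feature extraction processes at least partially in parallel might look like the following sketch. The two extractors are deliberately simple stand-ins (they are not any of the named processes such as FAST or ORB), the image is assumed to be a small grayscale grid of integers, and all names are hypothetical. Note that in CPython, threads do not speed up pure-Python computation; a process pool or native extraction libraries would be needed for a real gain.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Image = List[List[int]]  # assumed grayscale image, values 0..255


def count_horizontal_edges(image: Image, min_step: int = 50) -> int:
    """Count neighbouring pixel pairs in a row whose intensities differ strongly."""
    return sum(
        1
        for row in image
        for left, right in zip(row, row[1:])
        if abs(left - right) >= min_step
    )


def count_distinct_intensities(image: Image) -> int:
    """Count how many different intensity values occur in the image."""
    return len({pixel for row in image for pixel in row})


EXTRACTORS: Dict[str, Callable[[Image], int]] = {
    "edges": count_horizontal_edges,
    "intensities": count_distinct_intensities,
}


def extract_feature_counts(image: Image) -> Dict[str, int]:
    """Submit every extractor to a thread pool and collect one quantity per extractor."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, image) for name, fn in EXTRACTORS.items()}
        return {name: future.result() for name, future in futures.items()}
```

Only the resulting quantities are handed to the statistical model, which is why no detailed analysis of the individual features is needed.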

[0100] Figure 2 shows the steps of a method for detecting occlusion of camera 10 according to an example of this disclosure.

[0101] In step S11 of Figure 2, image data 12 indicating the image captured by camera 10 is obtained. In step S12, first information 16 indicating features in image data 12 is extracted from image data 12. In step S13, statistical model 19 evaluates whether at least a portion of camera 10 is occluded based on second information indicating the amount of features in first information 16.
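Steps S11 to S13 can be sketched as a small pipeline. The three callables passed in, the helper `find_bright_points`, and the helper `simple_model` are hypothetical placeholders for a camera interface, a feature extraction process, and a statistical model; they are assumptions made only so that the sketch runs.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]
Features = Dict[str, List[Tuple[int, int]]]  # feature type -> list of feature locations


def detect_occlusion(
    obtain_image: Callable[[], Image],
    extract_features: Callable[[Image], Features],
    model: Callable[[Dict[str, int]], bool],
) -> bool:
    image = obtain_image()                     # S11: obtain image data
    first_info = extract_features(image)       # S12: first information (the features)
    # Second information: only the amount of features per feature type.
    second_info = {kind: len(items) for kind, items in first_info.items()}
    return model(second_info)                  # S13: evaluate occlusion


def find_bright_points(image: Image, level: int = 200) -> Features:
    """Toy extractor: every pixel at or above `level` counts as one feature."""
    points = [(r, c) for r, row in enumerate(image)
              for c, value in enumerate(row) if value >= level]
    return {"points": points}


def simple_model(amounts: Dict[str, int]) -> bool:
    """Toy model: occluded if no feature at all was found."""
    return amounts["points"] < 1
```

The model receives only the counts in `second_info`, never the features themselves, mirroring the distinction between first and second information.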

[0102] Figure 3 shows the steps of a method for monitoring the functionality of camera 10 according to an example of this disclosure. The functionality of camera 10 can be monitored at different times. For example, it can be monitored periodically or at predetermined times to ensure that camera 10 is working correctly.

[0103] Steps S11, S12, and S13 of the method for monitoring the functionality of camera 10 correspond to steps S11, S12, and S13 of the method of Figure 2 and are indicated by the same reference numerals in Figure 3. However, the method of Figure 3 for monitoring the camera 10 includes an additional step S14, wherein, if the evaluation in step S13 indicates that at least a portion of the camera 10 is occluded, an output indicating the presence of the occluded camera 10 is generated. This indication can be arranged in any suitable manner and can be an audio indication and / or a visual indication.
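Periodic monitoring with the additional step S14 might be sketched as follows. The callables `is_occluded` (standing in for steps S11 to S13) and `notify` (standing in for an audio or visual indication), the number of checks, and the interval are all hypothetical.

```python
import time
from typing import Callable


def monitor_camera(
    is_occluded: Callable[[], bool],
    notify: Callable[[str], None],
    checks: int = 3,
    interval_s: float = 0.0,
) -> int:
    """Run the occlusion check `checks` times and return the number of warnings issued."""
    warnings = 0
    for _ in range(checks):
        if is_occluded():                  # S11 to S13
            notify("Camera occluded")      # S14: output an indication
            warnings += 1
        time.sleep(interval_s)             # wait until the next scheduled check
    return warnings
```

A caller could pass `print` as `notify` for a textual indication, or a function that triggers a dashboard warning.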

[0104] Figure 4 shows an arrangement of a data processing apparatus 40 according to an example of this disclosure. The data processing apparatus 40 can be arranged to perform the steps of the method of Figure 2 and / or the steps of the method of Figure 3.

[0105] The data processing device 40 may include a data storage unit 401 and a data processing unit 404.

[0106] Data storage unit 401 may include computer-readable storage medium 402.

[0107] A computer program 403 may be provided on a computer-readable storage medium 402.

[0108] Computer program 403, and therefore computer-readable storage medium 402, may include instructions that, when executed by data processing unit 404 or, more generally, by a computer, cause the computer or data processing unit 404 to perform the steps of the method of Figure 2 and / or the steps of the method of Figure 3.

[0109] Figure 5 shows an example of the arrangement of vehicle 50 according to this disclosure.

[0110] According to Figure 5, the vehicle 50 includes at least one camera 10. Each of the at least one camera 10 may be mounted on or within the vehicle 50. Furthermore, the vehicle 50 includes a data processing apparatus 40. The data processing apparatus 40 of the vehicle 50 can perform the steps of the method of Figure 2 to detect whether at least one of the cameras 10 of the vehicle 50 is occluded. As described above, the detection can be performed in real time. Furthermore, the data processing apparatus 40 of the vehicle 50 can perform the steps of the method of Figure 3, which allows the functionality of the at least one camera 10 of vehicle 50 to be monitored. Upon detecting that a camera 10 of vehicle 50 is occluded, the data processing apparatus 40 can output an indication of the presence of the occluded camera 10 to the driver of vehicle 50. This indication can specify which camera 10 of vehicle 50 is occluded. Thus, the driver of vehicle 50 can receive, quickly and in real time, information about the malfunction of the corresponding camera 10 of vehicle 50.

[0111] Figures 6A, 6B, and 6C show exemplary images for determining whether the camera 10 that captured the corresponding image 12 is occluded.

[0112] Image 12 of Figure 6A shows the interior of a vehicle (such as vehicle 50). The camera 10 that captured image 12 was mounted inside vehicle 50. From image 12 of Figure 6A, 1404 features 16 can be extracted. The extracted features 16 are exemplarily indicated by the points in image 12 of Figure 6A. Statistical model 19 correctly determines, based on features 16, that the camera 10 that captured image 12 is not occluded, because the amount / number of extracted features 16 is greater than a threshold amount / number of features. The threshold amount / number of features can be a predetermined amount / number of features that can be determined when training statistical model 19. For example, different images of the interior of a vehicle can be used to train statistical model 19. During training, a minimum amount / number of features that should exist in images captured by the unoccluded camera 10 and showing the interior of the vehicle can be identified / determined. The determined minimum amount / number of features can then be set as the threshold amount / number of features.
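Deriving the threshold from training images can be sketched as below. The disclosure does not state how the minimum is determined; the sketch assumes it is simply the smallest feature count observed in training images captured by an unoccluded camera. Apart from the count of 1404 taken from Figure 6A, the training counts are made-up values, and treating a count equal to the threshold as unoccluded is likewise an assumption.

```python
from typing import Sequence


def learn_threshold(unoccluded_counts: Sequence[int]) -> int:
    """Smallest feature count seen in training images from an unoccluded camera."""
    if not unoccluded_counts:
        raise ValueError("at least one training count is required")
    return min(unoccluded_counts)


def is_occluded(feature_count: int, threshold: int) -> bool:
    """Assess the camera as occluded if fewer features than the threshold were found."""
    return feature_count < threshold


# Hypothetical training counts for images of a vehicle interior.
VEHICLE_THRESHOLD = learn_threshold([1404, 1210, 1650])
```

A separate threshold would be learned per environment, for example one for the vehicle interior and another for an office scene.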

[0113] In the examples of Figures 6A, 6B, and 6C, the second information indicating the amount of features in the first information 16 represents the amount / number of features extracted from the image / image data 12 obtained (11) from the corresponding camera 10.

[0114] Image 12 of Figure 6B is an exemplary image captured by an occluded camera 10. Zero features 16 can be extracted from image 12, and the statistical model 19 correctly determines that the camera 10 that captured image 12 of Figure 6B is occluded.

[0115] Image 12 of Figure 6C is an exemplary image captured by a camera 10 installed in an office. From this image, 3751 features 16 can be extracted, and statistical model 19 correctly determines, based on features 16, that the camera 10 that captured image 12 is not occluded. The extracted features 16 are exemplarily indicated by the points in the image of Figure 6C. Here, too, camera 10 is determined not to be occluded because the amount / number of extracted features 16 is greater than the threshold amount / number of features. As before, in the example of Figure 6C the threshold amount / number of features can be a predetermined amount / number of features that is determined when training statistical model 19. Different images of the office can be used to train statistical model 19. During training, a minimum amount / number of features that should exist in the image 12 captured by the unoccluded camera 10 and showing the office can be identified / determined. The determined minimum amount / number of features can then be set as the threshold amount / number of features.

[0116] As used herein, the phrase “at least one” when referring to a list of one or more entities should be understood to mean at least one entity selected from any one or more of the entities in the list, but not necessarily including at least one of each and every entity specifically listed, and not excluding any combination of entities in the list. This definition also allows for the optional presence of entities other than those specifically identified within the list of entities referred to by the phrase “at least one,” whether related to or unrelated to those specifically identified entities. Thus, as a non-limiting example, “at least one of A and B” (or equivalently, “at least one of A or B”, or equivalently, “at least one of A and / or B”) could in one example refer to at least one (optionally including more than one) A, without B (and optionally including entities other than B); in another example, it could refer to at least one (optionally including more than one) B, without A (and optionally including entities other than A); and in yet another example, it could refer to at least one (optionally including more than one) A and at least one (optionally including more than one) B (and optionally including other entities). In other words, the phrases “at least one,” “one or more,” and “and / or” are open-ended expressions that are both conjunctive and disjunctive in operation. For example, each of the expressions “at least one of A, B, and C,” “at least one of A, B, or C,” “one or more of A, B, and C,” “one or more of A, B, or C,” and “A, B, and / or C” can mean a single A, a single B, a single C, A and B together, A and C together, B and C together, A, B, and C together, and any of the above, optionally combined with at least one other entity.

[0117] By studying the accompanying drawings, the disclosure, and the appended claims, those skilled in the art can understand and implement other variations of the disclosed examples when practicing the claimed disclosure. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. A single processor or other unit can perform the functions of several items or steps recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. A computer program can be stored / distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but can also be distributed in other forms (such as via the Internet or other wired or wireless telecommunications systems). Any reference numerals in the claims should not be construed as limiting the scope of the claims.

[0118] List of reference numerals

[0119] 10 Camera

[0120] 11 Obtaining image data

[0121] 12 Image data

[0122] 13 Transferring image data to at least one feature extraction process or tool

[0123] 14 Feature extraction process or tool

[0124] 15 Extracting first information indicating features in the image data by at least one feature extraction process or tool

[0125] 16 First information indicating features in the image data

[0126] 17 Extracting / obtaining first information from image data

[0127] 18 Providing the first information indicating features in the image data to the statistical model

[0128] 19 Statistical model

[0129] S11 Obtaining image data indicating the image captured by the camera

[0130] S12 Extracting first information indicating features in the image data

[0131] S13 Evaluating whether at least a portion of the camera is occluded

[0132] S14 Outputting an indication of the presence of the occluded camera

[0133] 40 Data processing apparatus

[0134] 401 Data storage unit

[0135] 402 Computer-readable storage medium

[0136] 403 Computer program

[0137] 404 Data processing unit

[0138] 50 Vehicle

Claims

1. A method for detecting occlusion of a camera (10), wherein the method comprises: - obtaining (S11) image data (12) indicating the image captured by the camera (10); - extracting (S12), from the image data (12), first information (16) indicating features in the image data (12); and - evaluating (S13), by a statistical model (19), whether at least a portion of the camera (10) is occluded based on second information indicating the amount of features in the first information (16).

2. The method according to claim 1, wherein the camera (10) is mounted on or inside the vehicle (50).

3. The method according to claim 1 or 2, wherein: - the features in the image data (12) are image features; and / or - the features in the image data (12) are features of different types of image features; and / or - the amount of features in the first information (16) indicates the number of features of at least one specific type of image feature.

4. The method according to claim 3, wherein the different types of image features include at least one of the following types: edges, corners, textures, shapes, colors, transformation-based features, local feature descriptors, points of interest, and / or objects.

5. The method according to any one of the preceding claims, wherein the statistical model (19) assesses whether at least a portion of the camera (10) is occluded by comparing the quantities of the set of features with corresponding thresholds for the respective quantities.

6. The method according to any one of the preceding claims, wherein the statistical model (19) is a Bayesian statistical model, a random forest model, or a decision tree model.

7. The method according to any one of the preceding claims, wherein feature extraction is performed using at least one of a set of feature extraction processes.

8. The method according to claim 7, wherein the set of feature extraction processes includes at least one of the following: MinEigen, BRISK, FAST, Harris, LBP, GLCM, HOG, Gabor, SIFT, SURF, BRIEF, ORB, HSV, YCbCr, LAB, Canny, Sobel, Laplacian, Shi-Tomasi, FFT, DCT, DWT, RANSAC, KLT, PCA, t-SNE, UMAP, CNN, ResNet, VGG, YOLO, RCNN, UNet, OF, STIP, FD, ZFD, HOG, FAST, BRIEF, ORB, HSV, and / or MSER.

9. A method for monitoring the function of a camera (10) by performing the steps of the method according to any one of the preceding claims, wherein the method further includes: if the evaluation indicates that at least a portion of the camera (10) is occluded, causing (S14) an output of an indication of the presence of the occluded camera (10).

10. A data processing apparatus (40) comprising components for performing steps (S11, S12, S13, S14) of the method according to any one of the preceding claims.

11. A computer program (403) comprising instructions which, when executed by a computer, cause the computer to perform the steps (S11, S12, S13, S14) of the method according to any one of claims 1 to 9.

12. A computer-readable storage medium (402) including instructions that, when executed by a computer, cause the computer to perform the steps (S11, S12, S13, S14) of the method according to any one of claims 1 to 9.

13. A vehicle comprising a camera (10) and a data processing apparatus (40) according to claim 10.