Image processing apparatus and image processing method
By employing multi-class and binary classification recognition processes, the processing load and accuracy issues caused by the increase in the number of object categories in advanced driver assistance systems and autonomous driving systems are resolved, achieving efficient and accurate image recognition.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ASTEMO LTD
- Filing Date
- 2021-02-05
- Publication Date
- 2026-06-30
AI Technical Summary
In existing technologies for advanced driver assistance systems and autonomous driving systems, as the number of object categories increases, the image recognition processing load increases and processing cannot be completed within the required time, resulting in a decrease in recognition accuracy.
A multi-class recognition unit is used for multi-class recognition processing, a tracking processing unit performs image tracking and calculates the predicted position, and a binary recognition unit is combined to determine the category, thereby reducing the processing load and improving the recognition accuracy.
By using multi-class and binary classification recognition processes, the image recognition load is reduced, the recognition accuracy is improved, and multiple recognition objects can be processed efficiently.
Figure CN115769253B_ABST
Description
Technical Field
[0001] This disclosure relates to an image processing apparatus and an image processing method.
Background Technology
[0002] Previously, inventions related to object detection devices, object detection methods, and programs were known (see Patent Document 1 below). Patent Document 1 discloses an object detection device having a detection unit and a nonlinear processing unit (Abstract, Technical Solution 1, Paragraph 0006 of this document). The detection unit detects one or more object candidate regions from a captured image. The nonlinear processing unit inputs a portion or all of the captured image, which contains at least the aforementioned object candidate regions, into a neural network, which simultaneously estimates the pose of the object within the aforementioned object candidate regions and the distance to the object. Furthermore, the nonlinear processing unit uses the output of the neural network to output object information, which includes at least information about the distance to the object.
[0003] The conventional object detection device described in Patent Document 1 detects objects within the shooting range based on images captured by a vehicle-mounted camera, and outputs object information that includes at least the distance to the detected object. Examples of objects detected by the object detection device include other vehicles, pedestrians, two-wheeled vehicles such as bicycles or motorcycles, traffic lights, signs, utility poles, billboards, and other roadside installations that may impede the movement of the vehicle (paragraph 0008 of this document).
[0004] The detection of candidate regions using the object detection function of the aforementioned object detection device is based on using a scanning rectangle of a size comparable to the object to be detected in the image captured by the vehicle-mounted camera to determine the presence or absence of the object (paragraph 0021 of this document). Then, image feature quantities are calculated for the image region within the scanning rectangle, and a pre-learned recognizer is used to determine whether there are other vehicles within the scanning rectangle, or to output a likelihood indicating the probability of other vehicles (paragraph 0022 of this document).
[0005] Existing technical documents
[0006] Patent documents
[0007] Patent Document 1: Japanese Patent Application Publication No. 2019-008460
Summary of the Invention
[0008] The problem that the invention aims to solve
[0009] Similar to the object detection device described above, there are various types of recognition devices that determine whether an object to be detected exists within the scanning rectangle. These include binary classification devices that distinguish between vehicles and other objects, and multi-class recognition devices that simultaneously distinguish between vehicles, pedestrians, and other objects. However, with the development of Advanced Driver Assistance Systems (ADAS) and Automated Driving Systems (ADS), there is a tendency for the number of categories of objects to be recognized to further increase.
[0010] In image processing that identifies objects from images captured by an imaging device, to cope with an increase in the number of object categories, it is necessary to use multiple recognizers or increase the hierarchy of each recognizer to improve recognition accuracy. However, increasing the number of recognizers or the hierarchy of recognizers increases the processing load for object recognition, and the processing may not finish within the required time.
[0011] This disclosure provides an image processing apparatus and an image processing method that can reduce the processing load of image processing for recognizing multiple objects from an image and improve recognition accuracy.
[0012] Methods for solving problems
[0013] One aspect of this disclosure is an image processing apparatus, characterized by comprising: a multi-class recognition unit that performs multi-class recognition processing on an image captured by an imaging device to recognize multiple categories of objects; a tracking processing unit that performs image tracking using the objects recognized by the multi-class recognition processing as tracking objects, and calculates a predicted position of the tracking object in the image at a later time based on the image at a previous time; and a recognition unit that performs binary classification recognition processing on the predicted position in the image at the later time, corresponding to the category of the tracking object, to recognize the category of the tracking object.
[0014] Invention Effects
[0015] According to the above aspect of this disclosure, an image processing apparatus and image processing method can be provided that can reduce the processing load of image processing for recognizing multiple objects from an image and improve recognition accuracy.
Attached Figure Description
[0016] Figure 1 is a block diagram illustrating one embodiment of the image processing apparatus of this disclosure.
[0017] Figure 2A is a flowchart illustrating one embodiment of the image processing method of this disclosure.
[0018] Figure 2B is a flowchart illustrating one embodiment of the image processing method of this disclosure.
Detailed Implementation
[0019] Hereinafter, embodiments of the image processing apparatus and image processing method of this disclosure will be described with reference to the accompanying drawings.
[0020] Figure 1 is a block diagram illustrating one embodiment of the image processing apparatus of this disclosure. The image processing apparatus IPA of this embodiment is, for example, a device for identifying multiple categories of objects from images captured by an imaging device ID. More specifically, the image processing apparatus IPA is, for example, mounted on a vehicle and identifies multiple different objects around the vehicle from images captured by an imaging device ID such as a monocular camera or a stereo camera. Furthermore, the images captured by the imaging device ID are not particularly limited; for example, color images or grayscale images can be appropriately selected.
[0021] In the example shown in Figure 1, the imaging device ID is a stereo camera mounted on a vehicle. The image processing apparatus IPA includes, for example, a processing unit 100 containing a processing device such as a CPU, a storage unit 200 containing a storage device such as ROM or RAM, and a computer program stored in the storage unit 200 and executed by the processing unit 100. Furthermore, although not shown in the figures, the image processing apparatus IPA has, for example, an input/output unit for inputting and outputting signals.
[0022] The image processing apparatus IPA includes, for example, a signal processing unit 110 and a recognition processing unit 150. The signal processing unit 110 includes, for example, an image acquisition unit 111 and a disparity calculation unit 112. The recognition processing unit 150 includes, for example, a first recognition processing unit 120, a second recognition processing unit 130, and an output processing unit 140. The first recognition processing unit 120 includes, for example, an image region selection unit 121 and a multi-class recognition unit 122. The second recognition processing unit 130 includes, for example, a tracking processing unit 131 and a recognition unit 132. The recognition unit 132 includes, for example, multiple binary classification recognition units 132a and 132b.
[0023] Each part of the processing unit 100 is, for example, a functional block of the processing unit 100 implemented by the processing unit 100 executing a computer program stored in the storage unit 200. Each part of the processing unit 100 can be implemented by its own dedicated processing device, or multiple functional blocks can be implemented by a single processing device. In addition, the storage unit 200 can be composed of multiple storage devices, or it can be composed of a single storage device.
[0024] In addition, in the example shown in Figure 1, the image processing apparatus IPA includes a storage unit 200, but the image processing apparatus IPA can also be connected to an external storage unit 200. Additionally, in the example shown in Figure 1, the image processing device IPA is connected to the external imaging device ID, but the image processing device IPA may also include the imaging device ID. Additionally, in the example shown in Figure 1, the recognition unit 132 has two binary classification recognition units 132a and 132b, but it may also have three or more binary classification recognition units.
[0025] The image processing device IPA identifies, from images captured by the imaging device ID, the recognition objects 202 pre-stored in the storage unit 200. The recognition objects 202 include various categories such as other vehicles, pedestrians, moving objects, obstacles, roads, road signs, road markings, and traffic signals around the vehicle equipped with the image processing device IPA. Furthermore, other vehicles identified by the image processing device IPA can include, for example, light vehicles such as bicycles, as well as mopeds, motorized two-wheelers, light cars, ordinary cars, large cars, buses, and trucks. Moreover, other vehicles can also be categorized based on factors such as position, posture, direction of travel, speed, acceleration, and angular velocity relative to the vehicle, for example as preceding vehicles, following vehicles, oncoming vehicles, crossing vehicles, right-turning vehicles, and left-turning vehicles.
[0026] Next, with reference to Figure 2A and Figure 2B, one embodiment of the image processing method of this disclosure will be described together with the operation of the image processing apparatus IPA shown in Figure 1. Figure 2A and Figure 2B are flowcharts of the image processing method IPM of this embodiment, which uses the image processing apparatus IPA of Figure 1.
[0027] The imaging device ID, for example, captures images at a predetermined period and for a predetermined shooting time. The image processing device IPA uses the image processing method IPM shown in Figure 2A to process the images captured by the imaging device ID at the predetermined period. When the image processing method IPM shown in Figure 2A starts, the image processing device IPA first performs image acquisition processing P1.
[0028] In the image acquisition process P1, the image acquisition unit 111 acquires an image from the imaging device ID, for example, and stores it in the storage unit 200 as part of the image information 201. Furthermore, if the imaging device ID is a stereo camera, the image information 201 includes image information of the right image captured by the right camera and the left image captured by the left camera, for example.
[0029] Furthermore, in the image acquisition process P1, the disparity calculation unit 112 takes, for example, the right and left images as input, searches for regions in the left image similar to a specific region in the right image, and performs disparity calculation. The disparity calculation unit 112 outputs a disparity image by performing this processing on the entire region of the right image. The disparity calculation unit 112 stores the disparity image in the storage unit 200 as part of the image information 201.
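For illustration only, the similar-region search of paragraph [0029] could be sketched for a single image row as follows. The sum-of-absolute-differences (SAD) cost, the function name `disparity_row`, and the parameters `block` and `max_disp` are assumptions of this sketch and are not specified in this disclosure.

```python
def disparity_row(right_row, left_row, block=3, max_disp=4):
    """Per-pixel disparity for one row, using the right image as the reference.

    Assumption: similarity is measured by the sum of absolute differences (SAD).
    """
    half = block // 2
    width = len(right_row)
    disparities = [0] * width
    for x in range(half, width - half):
        ref = right_row[x - half:x + half + 1]       # specific region in the right image
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):                # candidate shifts in the left image
            xl = x + d
            if xl + half >= width:                   # candidate block leaves the row
                break
            cand = left_row[xl - half:xl + half + 1]
            cost = sum(abs(a - b) for a, b in zip(ref, cand))
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities[x] = best_d
    return disparities


# Toy rows: the pattern 9, 5, 1 appears two pixels further right in the left image.
right_row = [0, 0, 9, 5, 1, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 9, 5, 1, 0, 0, 0]
example_disparities = disparity_row(right_row, left_row)
```

Applying the same search to every row of the right image would yield a disparity image of the kind that the disparity calculation unit 112 stores as part of the image information 201.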
[0030] Next, the image processing device IPA performs, for example, image region selection processing P2. In image region selection processing P2, the image processing device IPA selects an image region from the image captured by the imaging device ID that may contain one of multiple categories of recognition objects. More specifically, the first recognition processing unit 120 obtains, for example, the disparity image output by the disparity calculation unit 112, either from the disparity calculation unit 112 or from the image information 201 stored in the storage unit 200.
[0031] In the image region selection process P2, the image region selection unit 121, for example, takes a disparity image as input, groups adjacent and similar disparities in the disparity image, and generates rectangular boxes surrounding the grouped disparities. Furthermore, the image region selection unit 121 selects rectangular boxes whose dimensions (horizontal and vertical) are greater than a predetermined size as image regions that may contain one of multiple categories of recognition objects.
[0032] The image region selection unit 121 outputs the position information of each selected image region, namely its coordinates on the disparity image and its width and height, as an image region 203 that may contain a recognition object, and stores it in the storage unit 200. Here, for example, when multiple image regions are selected from the disparity image, the image region selection unit 121 assigns a recognition number N from 1 to n (natural numbers) to each image region and stores it in the storage unit 200 as image region 203.
[0033] Furthermore, the image region selection unit 121 can, for example, estimate the categories of recognition objects that may be contained in the image region of the disparity image enclosed by the rectangle based on the aspect ratio of the rectangle, and select only the image regions that are likely to contain a specific category of recognition object. Additionally, if the imaging device ID is a monocular camera, the image region selection unit 121 can also select, from the image of the monocular camera, an image region that may contain one of multiple categories of recognition objects.
[0034] In this case, the image region selection unit 121 can, for example, use the object detection results of a millimeter-wave radar installed on the vehicle to select the image region. Alternatively, the image region selection unit 121 can, for example, pre-specify a specific region of the image of the imaging device ID, and select an image region that may contain one of multiple categories of recognition objects by performing a raster scan of that region using a window of arbitrary size.
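As a minimal sketch of paragraphs [0031] and [0032], the grouping of adjacent, similar disparities and the numbering of the resulting rectangles might look like the following. The 4-neighbour connectivity, the tolerance `tol`, the treatment of a zero disparity as "no measurement", and all names are assumptions made for this example.

```python
def select_regions(disp, tol=1, size_w=1, size_h=1):
    """Group adjacent pixels with similar disparity and return numbered boxes.

    Assumptions: 4-neighbour adjacency, similarity means |difference| <= tol,
    and a disparity of 0 means no valid measurement.
    """
    rows, cols = len(disp), len(disp[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or disp[r0][c0] == 0:
                continue
            seen[r0][c0] = True
            stack = [(r0, c0)]
            top, bottom, left, right = r0, r0, c0, c0
            while stack:                              # flood fill of one group
                r, c = stack.pop()
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and disp[nr][nc] != 0
                            and abs(disp[nr][nc] - disp[r][c]) <= tol):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            w, h = right - left + 1, bottom - top + 1
            if w > size_w and h > size_h:             # keep boxes larger than the size
                boxes.append({"x": left, "y": top, "width": w, "height": h})
    # Assign recognition numbers N = 1..n to the selected regions.
    return {n: box for n, box in enumerate(boxes, start=1)}


toy_disparity = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 5, 5, 0, 0],
    [0, 5, 6, 5, 0, 2],
    [0, 5, 5, 5, 0, 0],
]
example_regions = select_regions(toy_disparity)
```

In this toy map the isolated pixel with disparity 2 forms a 1 by 1 box and is discarded, while the 3 by 3 group is kept as region N = 1.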
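The raster scan mentioned in paragraph [0034] can be illustrated with a simple window generator. The `(x, y, width, height)` tuple layout and the `step` parameter are assumptions of this sketch; the disclosure only states that a window of arbitrary size is scanned over a pre-specified region.

```python
def raster_scan(region, win_w, win_h, step):
    """Yield (x, y, w, h) windows left to right, top to bottom, inside region."""
    rx, ry, rw, rh = region
    for y in range(ry, ry + rh - win_h + 1, step):
        for x in range(rx, rx + rw - win_w + 1, step):
            yield (x, y, win_w, win_h)


# Hypothetical pre-specified region at (10, 20), 8 pixels wide and 6 pixels high.
example_windows = list(raster_scan((10, 20, 8, 6), win_w=4, win_h=4, step=2))
```

Each yielded window would then be a candidate image region that may contain one of the categories of recognition objects.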
[0035] Next, the multi-class recognition unit 122 performs, for example, a process P3 that sets the recognition number N of the image region to be subjected to the multi-class recognition process to N=1. Then, the multi-class recognition unit 122 performs the multi-class recognition process P4, in which it applies multi-class recognition processing to the image captured by the imaging device ID to identify multiple categories of objects.
[0036] More specifically, in the multi-class recognition process P4, the multi-class recognition unit 122 identifies multiple categories of objects from the image regions selected by the image region selection unit 121. The multi-class recognition unit 122 sequentially performs the multi-class recognition process P4 on each image region selected by the selection process P2, whose recognition numbers N range from 1 to n.
[0037] The multi-class recognition process P4 includes, for example, registration number determination process P4a, multi-class recognition process P4b, category determination process P4c, category candidate registration processes P4d and P4e, and incrementing process P4f. The multi-class recognition unit 122 first determines, in registration number determination process P4a, whether the number of tracking objects registered for the tracking process P6a (described later) is less than the upper limit.
[0038] For example, if the registration number determination process P4a determines that the number of registered tracking objects is not less than the upper limit (No), that is, if the number is determined to have reached the upper limit, the multi-class recognition unit 122 does not execute the multi-class recognition process P4b and subsequent processes, and proceeds to the next process P5. On the other hand, for example, if the registration number determination process P4a determines that the number of registered tracking objects is less than the upper limit (Yes), the multi-class recognition unit 122 executes the multi-class recognition process P4b.
[0039] For example, in the multi-class recognition process P4b, the multi-class recognition unit 122 identifies multiple categories of recognition objects stored in the storage unit 200 as recognition objects 202 from the image region. The multi-class recognition unit 122 also evaluates, for example, the similarity between the image region 203 selected by the image region selection unit 121 and stored in the storage unit 200 and the multi-class recognition learning data 204 stored in the storage unit 200.
[0040] The multi-class recognition learning data 204 is, for example, learning data obtained through machine learning by inputting multiple images of cars, motorized two-wheeled vehicles, and other objects as recognition objects. That is, the multi-class recognition unit 122 uses the multi-class recognition learning data 204 obtained by inputting multiple categories of recognition objects and performing machine learning to perform multi-class recognition processing.
[0041] More specifically, in this embodiment, the multi-class recognition unit 122 performs multi-class recognition processing, for example, using multi-class recognition learning data 204. This multi-class recognition learning data 204 is obtained by machine learning on input images of at least two categories: automobiles as the first category of recognition objects and motorized two-wheelers as the second category of recognition objects. In this embodiment, the categories of recognition objects in the multi-class recognition processing P4b are described as, for example, automobiles and motorized two-wheelers, but the categories and number of recognition objects are not particularly limited.
[0042] The multi-class recognition unit 122 calculates, for example, a similarity evaluation value between the image region 203 and the multi-class recognition learning data 204 in the multi-class recognition processing P4b. Specifically, the multi-class recognition unit 122 calculates, for example, a similarity evaluation value between the image region 203 and the multi-class recognition learning data 204 for each of the car, which is the first-category recognition object (i), and the motorized two-wheeled vehicle, which is the second-category recognition object (ii). Then, based on the similarity evaluation values, the multi-class recognition unit 122 performs a category determination processing P4c for the recognition objects existing in the image region 203.
[0043] In this determination process P4c, for example, if the similarity evaluation value is at or above a predetermined threshold, the multi-class recognition unit 122 identifies the category of the recognition object in the image region 203. More specifically, for example, if the similarity evaluation value between the image region 203 and the car, which is the first-category recognition object (i), is at or above the predetermined threshold, the multi-class recognition unit 122 identifies the car and its position information from the image region 203. Furthermore, the multi-class recognition unit 122 assigns a registration number to the car identified from the image region 203 and performs the process P4d of registering it in the storage unit 200 as a tracking object and category candidate 205.
[0044] Furthermore, in the aforementioned determination process P4c, for example, if the similarity evaluation value between the image region 203 and the motorized two-wheeled vehicle, which is the second-category recognition object (ii), is at or above the predetermined threshold, the multi-class recognition unit 122 identifies the motorized two-wheeled vehicle and its position information from the image region 203. Then, the multi-class recognition unit 122 assigns a registration number to the motorized two-wheeled vehicle identified from the image region 203, and performs process P4e, registering it in the storage unit 200 as a tracking object and category candidate 205. For example, after process P4d or process P4e, the multi-class recognition unit 122 performs an incrementing process P4f.
[0045] Furthermore, in the aforementioned determination process P4c, for example, if the similarity evaluation values between the image region 203 and both the first-category recognition object (i) and the second-category recognition object (ii) are less than the predetermined threshold, the multi-class recognition unit 122 determines that the image region 203 does not contain a recognition object. In this case, the multi-class recognition unit 122 performs, for example, the incrementing process P4f.
[0046] In the incrementing process P4f, the multi-class recognition unit 122 increments the recognition number N of the image region 203 that will be the target of the next multi-class recognition process P4 to N+1. The multi-class recognition unit 122 repeatedly executes the multi-class recognition process P4, including the processes P4a to P4f, until the incremented recognition number N of the image region 203 exceeds the number n of image regions 203 selected in the selection process P2.
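The flow of processes P3 and P4a to P4f described above can be sketched as a loop. This is a simplified sketch, not the implementation of this disclosure: `score_fn` stands in for the similarity evaluation against the multi-class recognition learning data 204, the dictionary layout is hypothetical, and choosing the highest-scoring category when several exceed the threshold is an assumption.

```python
def multi_class_stage(regions, score_fn, tracked, upper_limit, threshold):
    """Run P3 and P4a-P4f over image regions numbered N = 1..n.

    regions:  {N: region}
    score_fn: region -> {"car": score, "motorcycle": score}  (hypothetical)
    tracked:  {registration number: {"category", "position"}}
    """
    n = len(regions)
    N = 1                                            # P3: start at N = 1
    while N <= n:                                    # repeat until N exceeds n
        if len(tracked) >= upper_limit:              # P4a: No -> proceed to P5
            break
        scores = score_fn(regions[N])                # P4b: similarity evaluation
        category = max(scores, key=scores.get)       # P4c (assumed tie-breaking rule)
        if scores[category] >= threshold:
            reg_no = max(tracked, default=0) + 1     # P4d / P4e: register candidate
            tracked[reg_no] = {"category": category, "position": regions[N]}
        N += 1                                       # P4f: increment N
    return tracked


toy_regions = {1: (0, 0, 4, 4), 2: (5, 0, 4, 4), 3: (10, 0, 4, 8), 4: (15, 0, 4, 4)}
toy_scores = {
    (0, 0, 4, 4): {"car": 0.9, "motorcycle": 0.1},
    (5, 0, 4, 4): {"car": 0.2, "motorcycle": 0.3},
    (10, 0, 4, 8): {"car": 0.1, "motorcycle": 0.8},
    (15, 0, 4, 4): {"car": 0.7, "motorcycle": 0.1},
}
example_tracked = multi_class_stage(
    toy_regions, toy_scores.get, {}, upper_limit=2, threshold=0.5
)
```

With an upper limit of two registrations, the fourth region is never evaluated, which is how the check in P4a bounds the cost of the multi-class stage.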
[0047] After the multi-class recognition process P4 is completed for all image regions 203 selected in the selection process P2, the tracking processing unit 131 performs, for example, a process P5 in which the registration number R of the tracking object, which will be the processing object in the recognition process P6 described later, is set to 1. Furthermore, the tracking processing unit 131 calculates the predicted position of the tracking object registered in the storage unit 200 as a tracking object and category candidate 205, and performs the recognition process P6 to determine the category of the tracking object.
[0048] The tracking processing unit 131, for example, sequentially performs recognition processing P6 on each tracking object registered in the storage unit 200, with registration numbers R from 1 to m (natural numbers), as tracking objects and category candidates 205 in the aforementioned category candidate registration processes P4d and P4e. Recognition processing P6 includes, for example, tracking processing P6a, category candidate determination processing P6b, binary classification recognition processing P6c and P6h, category determination processing P6d and P6i, registration processing P6e and P6j, predicted position calculation processing P6f and P6k, registration deletion processing P6g, and incrementing processing P6l.
[0049] The tracking processing unit 131 first performs image tracking in tracking processing P6a, setting the identified objects identified by multi-class recognition processing P4 as the tracking objects, and calculates the predicted position of the tracking objects in the image of the next time step based on the image of the previous time step. For example, the tracking processing unit 131 calculates the predicted position of the tracking objects in the current time step based on the position information of the tracking objects in the previous time step.
[0050] In tracking processing P6a, the tracking processing unit 131 uses, for example, a method that uses an image of the tracked object from the previous time step as a template and searches for the tracked object in the current time step by template matching, or a method that estimates the movement of each pixel within the region of the tracked object by optical flow, etc. Then, the tracking processing unit 131 performs motion prediction of the tracked object in the current time step based on the position of the tracked object from the previous time step and the movement of the tracked object in the past.
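One of the two options named in paragraph [0050], template matching, might be sketched as below. The exhaustive search and the sum-of-absolute-differences cost are assumptions of this example; the disclosure does not specify the matching criterion.

```python
def match_template(image, template):
    """Return the (x, y) top-left position where template best matches image.

    Assumption: the best match minimises the sum of absolute differences.
    Both arguments are 2-D lists of pixel intensities.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_cost = None, float("inf")
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            cost = sum(abs(image[y + j][x + i] - template[j][i])
                       for j in range(th) for i in range(tw))
            if cost < best_cost:
                best_pos, best_cost = (x, y), cost
    return best_pos


# Template cut from the previous time step; the object has moved in the current frame.
toy_template = [[7, 8], [9, 6]]
current_frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 7, 8, 0],
    [0, 0, 9, 6, 0],
    [0, 0, 0, 0, 0],
]
example_match = match_template(current_frame, toy_template)
```

In practice the search would likely be restricted to a neighbourhood of the predicted position rather than the whole image, which is one way the tracking step keeps the processing load low.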
[0051] In addition, in the category candidate determination process P6b, for example, the tracking processing unit 131 refers to the tracking object and category candidate 205 registered in the storage unit 200 to determine whether the category of the tracking object identified in the multi-class recognition process P4 is a car as the first category (i) or a motorized two-wheeled vehicle as the second category (ii).
[0052] In the category candidate determination process P6b, when the tracking processing unit 131 determines that the category of the tracked object is a car of the first category (i), the recognition unit 132 performs binary classification recognition processing corresponding to the category of the tracked object on the predicted position in the image at the later time, and identifies the category of the tracked object. More specifically, the recognition unit 132 performs binary classification recognition processing P6c using the binary classification recognition unit 132a corresponding to the recognition object of the first category (i).
[0053] The binary classification recognition unit 132a uses the binary classification recognition learning data 206 stored in the storage unit 200 to perform binary classification recognition processing on the predicted position of the tracked object and its surroundings, and identifies the category of the tracked object. Here, the binary classification recognition learning data 206 is binary classification recognition learning data for automobiles, obtained by machine learning on input images of automobiles, which are the first-category recognition object (i) among the multiple categories of recognition objects of the image processing device IPA, and images of other recognition objects.
[0054] In the binary classification recognition process P6c, the binary classification recognition unit 132a calculates, for example, an evaluation value of the similarity between the predicted location of the tracked object and its surrounding image region and the binary classification recognition learning data 206. Next, the binary classification recognition unit 132a performs a category determination process P6d. In the category determination process P6d, for example, if the aforementioned similarity evaluation value is above a predetermined threshold, the binary classification recognition unit 132a determines that the category of the tracked object is the first category of identified object (i), i.e., a car, and performs a registration process P6e.
[0055] In the registration process P6e, the binary classification recognition unit 132a, for example, registers the car, which is the first-category recognition object (i), as the category of the tracking object in the output information 208 of the storage unit 200. Additionally, in the registration process P6e, the binary classification recognition unit 132a sets the predicted position of the tracking object as the position of the car, which is the first-category recognition object (i), and registers it in the output information 208 of the storage unit 200. Next, the binary classification recognition unit 132a performs the predicted position calculation process P6f, for example.
[0056] In the predicted position calculation process P6f, the binary classification recognition unit 132a, for example, calculates the difference between the position information of the tracked object at the previous time step and the position information of the tracked object at the current time step, and divides this difference by the frame capture interval to calculate the moving speed of the tracked object. Furthermore, the binary classification recognition unit 132a, for example, calculates the predicted position of the tracked object at the next time step based on the position information of the tracked object at the current time step and the moving speed of the tracked object. The predicted position of the tracked object calculated here is used, for example, in the tracking process P6a at the next time step.
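The calculation in paragraph [0056] amounts to a constant-velocity extrapolation. The sketch below assumes two-dimensional image coordinates and assumes the prediction is made one frame interval ahead; the function and parameter names are hypothetical.

```python
def predict_next_position(prev_pos, curr_pos, frame_interval):
    """Return (predicted position, velocity) for the next time step.

    velocity  = (current position - previous position) / frame interval
    predicted = current position + velocity * frame interval
    Assumption: the object keeps the same velocity for one more frame.
    """
    vx = (curr_pos[0] - prev_pos[0]) / frame_interval
    vy = (curr_pos[1] - prev_pos[1]) / frame_interval
    predicted = (curr_pos[0] + vx * frame_interval,
                 curr_pos[1] + vy * frame_interval)
    return predicted, (vx, vy)


# Hypothetical positions in pixels and a 0.05 s frame capture interval.
example_predicted, example_velocity = predict_next_position(
    prev_pos=(100, 50), curr_pos=(104, 48), frame_interval=0.05
)
```

For this input the object moved 4 pixels right and 2 pixels up in one frame, so the predicted position is approximately (108, 46), which would seed the tracking process P6a at the next time step.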
[0057] Furthermore, in the category determination process P6d described above, for example, if the similarity evaluation value is less than the predetermined threshold, the binary classification recognition unit 132a determines that the tracked object is a recognition object of a category other than the car, which is the first-category recognition object (i), or is background, and performs the registration deletion process P6g. In the registration deletion process P6g, the binary classification recognition unit 132a deletes, for example, the tracked object and the category candidate 205 registered in the storage unit 200.
[0058] Furthermore, in the aforementioned category candidate determination process P6b, when the tracking processing unit 131 determines that the category of the tracked object is a motorized two-wheeled vehicle of the second category (ii), the recognition unit 132 performs binary classification recognition processing corresponding to the category of the tracked object on the predicted position in the image at the later time, and identifies the category of the tracked object. More specifically, the recognition unit 132 performs binary classification recognition processing P6h using the binary classification recognition unit 132b corresponding to the recognition object of the second category (ii).
[0059] The binary classification recognition unit 132b uses the binary classification recognition learning data 207 stored in the storage unit 200 to perform binary classification recognition processing on the predicted position of the tracked object and its surroundings, thereby identifying the category of the tracked object. Here, the binary classification recognition learning data 207 is binary classification recognition learning data for motorized two-wheeled vehicles, obtained by machine learning on input images of motorized two-wheeled vehicles, which are the second-category recognition object (ii) among the multiple categories of recognition objects of the image processing device IPA, and images of other recognition objects. Furthermore, the binary classification recognition learning data 206 and the binary classification recognition learning data 207 can be learned by different machine learning methods.
[0060] In the binary classification recognition process P6h, the binary classification recognition unit 132b calculates, for example, the similarity evaluation value between the predicted location of the tracked object and its surrounding image region and the binary classification recognition learning data 207. Next, the binary classification recognition unit 132b performs the category determination process P6i. In the category determination process P6i, for example, if the aforementioned similarity evaluation value is above a predetermined threshold, the binary classification recognition unit 132b determines that the category of the tracked object is the second category of identified object (ii), namely a motorized two-wheeled vehicle, and performs the registration process P6j.
[0061] In the registration process P6j, the binary classification recognition unit 132b, for example, registers the motorized two-wheeled vehicle, which is the second-category recognition object (ii), as the category of the tracked object in the output information 208 of the storage unit 200. Additionally, in the registration process P6j, the binary classification recognition unit 132b takes the predicted position of the tracked object as the position of the motorized two-wheeled vehicle, which is the second-category recognition object (ii), and registers it in the output information 208 of the storage unit 200. Next, the binary classification recognition unit 132b performs, for example, the predicted position calculation process P6k.
[0062] In the predicted position calculation process P6k, the binary classification recognition unit 132b, for example, calculates the difference between the position information of the tracked object at the previous time step and the position information of the tracked object at the current time step, and divides this difference by the frame capture interval to calculate the moving speed of the tracked object. Furthermore, the binary classification recognition unit 132b, for example, calculates the predicted position of the tracked object at the next time step based on the position information of the tracked object at the current time step and the moving speed of the tracked object. The predicted position of the tracked object calculated here is used, for example, in the tracking process P6a at the next time step.
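The calculation in the predicted position calculation process P6k can be illustrated with a minimal Python sketch. It assumes two-dimensional image coordinates in pixels, a constant-velocity model, and a next frame captured one frame interval later. None of these details, nor the names `Position` and `predict_next_position`, are specified in this disclosure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Position:
    """Position of a tracked object in image coordinates (pixels, assumed)."""
    x: float
    y: float


def predict_next_position(
    previous: Position, current: Position, frame_interval_s: float
) -> Tuple[Tuple[float, float], Position]:
    """Return (moving speed, predicted position at the next time step).

    The moving speed is the position difference divided by the frame capture
    interval. The prediction assumes the object keeps that speed for one more
    frame interval (a constant-velocity assumption made for this sketch).
    """
    if frame_interval_s <= 0:
        raise ValueError("frame_interval_s must be positive")
    vx = (current.x - previous.x) / frame_interval_s
    vy = (current.y - previous.y) / frame_interval_s
    predicted = Position(
        current.x + vx * frame_interval_s,
        current.y + vy * frame_interval_s,
    )
    return (vx, vy), predicted


if __name__ == "__main__":
    speed, nxt = predict_next_position(
        Position(100.0, 50.0), Position(110.0, 52.0), frame_interval_s=0.05
    )
    print(speed, nxt)
```

With a fixed frame interval, the prediction reduces to adding the last displacement to the current position. The explicit speed is kept because the text describes it as an intermediate result.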
[0063] Furthermore, in the category determination process P6i described above, for example, if the similarity evaluation value is less than the predetermined threshold, the binary classification recognition unit 132b determines that the tracked object is either background or a recognition object of a category other than the second-category recognition object (ii), namely a motorized two-wheeled vehicle, and performs the registration deletion process P6g. In the registration deletion process P6g, the binary classification recognition unit 132b deletes, for example, the corresponding entry from the tracking object and category candidate 205 registered in the storage unit 200.
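The two outcomes of the category determination process P6i (registration P6j or registration deletion P6g) can be sketched as follows. This is only an illustration: the dictionaries standing in for the tracking object and category candidate 205 and the output information 208, the function name `apply_category_decision`, and the treatment of a value exactly equal to the threshold as a match are all assumptions of this sketch.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def apply_category_decision(
    track_id: int,
    category: str,
    predicted_position: Point,
    similarity: float,
    threshold: float,
    candidates: Dict[int, str],
    output_info: Dict[int, dict],
) -> bool:
    """Register or delete one tracked object based on its similarity value.

    candidates stands in for the tracking object and category candidate 205,
    output_info for the output information 208 (both hypothetical layouts).
    Returns True when the category is confirmed and registered.
    """
    if similarity >= threshold:
        # Registration (P6j): store the category and take the predicted
        # position as the position of the recognized object.
        output_info[track_id] = {
            "category": category,
            "position": predicted_position,
        }
        return True
    # Registration deletion (P6g): the object is treated as another
    # category or as background, so its candidate entry is removed.
    candidates.pop(track_id, None)
    return False


if __name__ == "__main__":
    candidates = {1: "motorcycle", 2: "motorcycle"}
    output_info: Dict[int, dict] = {}
    apply_category_decision(1, "motorcycle", (120.0, 54.0), 0.9, 0.7,
                            candidates, output_info)
    apply_category_decision(2, "motorcycle", (300.0, 80.0), 0.3, 0.7,
                            candidates, output_info)
    print(candidates, output_info)
```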
[0064] After the aforementioned predicted position calculation process P6f or P6k, or the registration deletion process P6g, is completed, the tracking processing unit 131, for example, performs an increment process P6l. In this increment process P6l, the tracking processing unit 131 increments the registration number R of the tracking object and category candidate 205 that will be processed in the next identification process P6 to R+1. The tracking processing unit 131, for example, repeatedly executes the identification process P6, from the aforementioned tracking process P6a to the increment process P6l, until the incremented registration number R exceeds the number m of tracking objects and category candidates 205 registered in the multi-class recognition processing P4.
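The repetition controlled by the registration number R can be sketched as a simple counted loop. The sketch assumes that R starts at 1 and that the per-entry processing (P6a through P6k or P6g) is supplied as a callable returning whether the entry is kept. Both points, and the names `run_identification_loop` and `identify_one`, are illustrative assumptions rather than details taken from this disclosure.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def run_identification_loop(
    entries: Sequence[T], identify_one: Callable[[T], bool]
) -> List[T]:
    """Process every registered entry once and return the entries that are kept.

    entries holds the m tracking objects and category candidates registered by
    the multi-class recognition processing. identify_one stands in for one pass
    of the identification process on a single entry and returns False when the
    entry's registration is deleted.
    """
    m = len(entries)
    r = 1  # registration number R, assumed to start at 1
    kept: List[T] = []
    while r <= m:  # stop once R exceeds m
        entry = entries[r - 1]
        if identify_one(entry):
            kept.append(entry)
        r += 1  # increment process P6l: R becomes R+1
    return kept


if __name__ == "__main__":
    registered = ["car-1", "motorcycle-1", "background-1"]
    print(run_identification_loop(
        registered, lambda e: not e.startswith("background")))
```

Collecting the kept entries in a new list, instead of deleting from `entries` while iterating, keeps the index R valid for the whole pass.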
[0065] After the identification process P6 is completed, the predicted position and category of the tracked object are output as output information 208, as shown in Figure 1, from the second identification processing unit 130 or the storage unit 200 to the output processing unit 140. The output processing unit 140 outputs the output information 208 to, for example, vehicle control devices constituting ADS, ADAS, etc., so that it can be used for signal generation and processing for autonomous driving and advanced driver assistance.
[0066] The following describes the operation of the image processing apparatus IPA and the image processing method IPM using it in this embodiment.
[0067] In recent years, ADAS and ADS, which utilize image capture devices such as vehicle-mounted cameras as well as external environment recognition sensors such as radar, have attracted attention. In image processing that recognizes objects from images captured by image capture devices, coping with the increase in the number of object categories requires using multiple recognizers or increasing the number of layers of each recognizer to improve recognition accuracy. However, increasing the number of recognizers or the number of recognizer layers increases the processing load for object recognition, and the processing may not be completed within the required time.
[0068] As described above, the image processing apparatus IPA of this embodiment includes a multi-class recognition unit 122, a tracking processing unit 131, and a recognition unit 132. The multi-class recognition unit 122 performs multi-class recognition processing P4 on images captured by the imaging device ID, recognizing multiple categories of objects. The tracking processing unit 131 performs image tracking, using the objects recognized by the multi-class recognition processing P4 as tracking objects, and calculates the predicted position of the tracking objects in the image at a later time based on the image at a previous time. The recognition unit 132 performs binary classification recognition processing P6c and P6h corresponding to the category of the tracking object based on the predicted position in the image at the later time, recognizing the category of the tracking object.
[0069] Furthermore, the image processing method IPM of this embodiment performs multi-class recognition processing P4 on the image captured by the imaging device ID to identify multiple categories of objects. Then, the image processing method IPM performs image tracking using the objects identified by the multi-class recognition processing P4 as tracking objects, and calculates the predicted position of the tracking objects in the image of a subsequent time step based on the image of the previous time step.
[0070] Then, the image processing method IPM performs binary classification recognition P6c and P6h on the predicted position of the image at the later time step, corresponding to the category of the tracked object, to identify the category of the tracked object.
[0071] The image processing apparatus IPA and image processing method IPM of this embodiment described above can reduce the processing load of image processing for recognizing multiple objects from an image and improve recognition accuracy. More specifically, the image processing apparatus IPA and image processing method IPM of this embodiment can reduce the processing load of image processing compared to the case of recognizing multiple objects from an image using only a multi-classifier or only a binary classifier.
[0072] The reason is that, by using both the multi-class recognition processing by the multi-class recognition unit 122 and the binary classification recognition processing by the recognition unit 132, the layers of the multi-class recognition processing can be made shallower than when only multi-class recognition processing is used, which reduces the processing load. Even if making the layers shallower lowers the recognition accuracy of the multi-class recognition processing, the recognition accuracy can be improved by using the object recognized in the multi-class recognition processing as the tracking object and performing binary classification recognition processing on that tracking object.
[0073] Furthermore, by using the objects identified by the multi-class recognition process P4 as the tracking objects, and performing binary classification recognition on the predicted positions of the tracking objects in later time-series images according to the category of the tracking objects, binary classification recognition can be performed only within a very limited image region. This reduces the processing load of the binary classification recognition process, eliminates misidentifications in the multi-class recognition process, and improves the recognition accuracy of multiple categories of objects.
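How restricting binary classification recognition to the area around a predicted position limits the processed image region can be illustrated with a small sketch. The window half-sizes, the 1280 by 720 image size, and the function names `crop_window` and `area_fraction` are hypothetical values and names chosen only for this example.

```python
from typing import Tuple

Window = Tuple[int, int, int, int]  # (x0, y0, x1, y1), x1 and y1 exclusive


def crop_window(
    cx: int, cy: int, half_width: int, half_height: int,
    image_width: int, image_height: int,
) -> Window:
    """Return the window around a predicted position, clamped to the image."""
    x0 = max(0, cx - half_width)
    y0 = max(0, cy - half_height)
    x1 = min(image_width, cx + half_width)
    y1 = min(image_height, cy + half_height)
    if x0 >= x1 or y0 >= y1:
        raise ValueError("predicted position lies outside the image")
    return x0, y0, x1, y1


def area_fraction(window: Window, image_width: int, image_height: int) -> float:
    """Fraction of the full image area covered by the window."""
    x0, y0, x1, y1 = window
    return ((x1 - x0) * (y1 - y0)) / (image_width * image_height)


if __name__ == "__main__":
    window = crop_window(120, 54, 32, 32, 1280, 720)
    print(window, area_fraction(window, 1280, 720))
```

In this example the window covers less than half a percent of the image, which is the kind of reduction in processed area the paragraph above refers to.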
[0074] Furthermore, in the image processing apparatus IPA of this embodiment, the multi-class recognition unit 122 performs multi-class recognition processing P4 using multi-class recognition learning data 204, which is obtained by machine learning with multiple categories of recognition objects as input. Based on this structure, multi-class recognition processing for recognizing multiple categories of objects from an image can be performed with high accuracy based on the results of machine learning.
[0075] Furthermore, in the image processing apparatus IPA of this embodiment, the multi-class recognition unit 122 performs multi-class recognition processing P4 using multi-class recognition learning data 204, which is obtained by machine learning with at least the first-category recognition objects (i) and the second-category recognition objects (ii) as input. Based on this structure, it is possible to accurately distinguish the first-category recognition objects (i) and the second-category recognition objects (ii) from among the multiple categories of recognition objects contained in an image.
[0076] Furthermore, in the image processing apparatus IPA of this embodiment, the first category of identification object (i) is a car, and the second category of identification object (ii) is a motorized two-wheeled vehicle. Based on this structure, it is possible to accurately distinguish between a car (i) as the first category of identification object and a motorized two-wheeled vehicle (ii) as the second category of identification object from among multiple categories of identification objects contained in an image.
[0077] Furthermore, in the image processing apparatus IPA of this embodiment, the recognition unit 132 includes a plurality of binary classification recognition units 132a and 132b. The binary classification recognition units 132a and 132b perform binary classification recognition processing using binary classification recognition learning data 206 and 207, respectively, each of which is obtained by machine learning with the recognition object of one of the multiple categories as input. According to this structure, the binary classification recognition units 132a and 132b can determine with high accuracy whether an object belongs to the corresponding one of the multiple categories.
[0078] Furthermore, in the image processing apparatus IPA of this embodiment, the number of binary classification recognition units 132a and 132b is equal to the number of categories recognized by the multi-class recognition unit 122. More specifically, in the image processing apparatus IPA of this embodiment, the multi-class recognition unit 122 recognizes two categories: a first category of recognition objects (i) and a second category of recognition objects (ii), and the recognition unit 132 has two binary classification recognition units 132a and 132b. With this structure, binary classification recognition processing is performed on all categories of recognition objects identified by the multi-class recognition unit 122, thereby improving the accuracy of object category recognition.
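The one-to-one correspondence between recognized categories and binary classification recognition units can be sketched as a lookup table with a consistency check. In this sketch each binary recognizer is reduced to a threshold on a similarity value. That reduction, the category labels, the threshold values, and the function names are assumptions made for illustration and do not come from this disclosure.

```python
from typing import Callable, Dict, Mapping, Sequence

BinaryRecognizer = Callable[[float], bool]


def make_threshold_recognizer(threshold: float) -> BinaryRecognizer:
    """Build a stand-in binary recognizer that accepts similarity >= threshold."""
    def recognize(similarity: float) -> bool:
        return similarity >= threshold
    return recognize


def build_recognizer_table(
    thresholds_by_category: Mapping[str, float], categories: Sequence[str]
) -> Dict[str, BinaryRecognizer]:
    """Create exactly one binary recognizer per multi-class category.

    Raises ValueError when the two sets of categories differ, so the number of
    binary recognizers always equals the number of recognized categories.
    """
    missing = [c for c in categories if c not in thresholds_by_category]
    extra = [c for c in thresholds_by_category if c not in categories]
    if missing or extra:
        raise ValueError(f"category mismatch: missing={missing}, extra={extra}")
    return {
        c: make_threshold_recognizer(thresholds_by_category[c])
        for c in categories
    }


if __name__ == "__main__":
    table = build_recognizer_table(
        {"car": 0.8, "motorcycle": 0.7}, ["car", "motorcycle"]
    )
    # The same similarity value can be rejected by one unit and accepted by
    # another, because each unit has its own learned decision boundary.
    print(table["car"](0.75), table["motorcycle"](0.75))
```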
[0079] Furthermore, the image processing apparatus IPA of this embodiment includes an image region selection unit 121, which selects, from an image, an image region that may contain one of the multiple categories of recognition objects. The multi-class recognition unit 122 then recognizes the multiple categories of recognition objects from the image region selected by the image region selection unit 121. According to this structure, the multi-class recognition processing by the multi-class recognition unit 122 can be performed only on a limited image region, thereby reducing the amount of multi-class recognition processing and lowering the processing load.
[0080] As described above, according to this embodiment, an image processing apparatus (IPA) and an image processing method (IPM) can be provided that can reduce the processing load of image processing for recognizing multiple objects from an image and improve recognition accuracy.
[0081] The embodiments of the image processing apparatus and image processing method of this disclosure have been described in detail above with reference to the accompanying drawings. However, the specific structure is not limited to these embodiments. Even design changes that do not depart from the spirit of this disclosure are also included in this disclosure.
[0082] Symbol Explanation
[0083] 121 image region selection unit
[0084] 122 multi-class recognition unit
[0085] 131 tracking processing unit
[0086] 132 recognition unit
[0087] 132a binary classification recognition unit
[0088] 132b binary classification recognition unit
[0089] 204 multi-class recognition learning data
[0090] 206 binary classification recognition learning data
[0091] 207 binary classification recognition learning data
[0092] ID imaging device
[0093] IPA image processing apparatus
[0094] IPM image processing method
[0095] P4 multi-class recognition processing
[0096] P6c binary classification recognition processing
[0097] P6h binary classification recognition processing
Claims
1. An image processing apparatus, characterized by comprising: an image region selection unit that selects, from an image captured by an imaging device, an image region that may contain one of multiple categories of recognition objects; a multi-class recognition unit that recognizes the multiple categories of recognition objects from the image region selected by the image region selection unit; a tracking processing unit that performs image tracking using, as a tracking object, the recognition object recognized by the multi-class recognition processing, and calculates a predicted position of the tracking object in an image at a later time based on an image at a previous time; and a recognition unit that performs, at the predicted position in the image at the later time, binary classification recognition processing corresponding to the category of the tracking object to identify the category of the tracking object, wherein the recognition unit has multiple binary classification recognition units, and each of the multiple binary classification recognition units performs the binary classification recognition processing using binary classification recognition learning data obtained by machine learning with the recognition object of one of the multiple categories as input.
2. The image processing apparatus according to claim 1, characterized in that the multi-class recognition unit performs the multi-class recognition processing using multi-class recognition learning data, and the multi-class recognition learning data is data obtained by machine learning with the multiple categories of recognition objects as input.
3. The image processing apparatus according to claim 2, characterized in that the multi-class recognition unit performs the multi-class recognition processing using the multi-class recognition learning data obtained by machine learning with at least the recognition objects of a first category and the recognition objects of a second category as input.
4. The image processing apparatus according to claim 3, characterized in that the recognition objects of the first category are automobiles, and the recognition objects of the second category are motorized two-wheeled vehicles.
5. The image processing apparatus according to claim 1, characterized in that the number of the binary classification recognition units is equal to the number of categories recognized by the multi-class recognition unit.
6. An image processing method, characterized by: selecting, from an image captured by an imaging device, an image region that may contain one of multiple categories of recognition objects; recognizing the multiple categories of recognition objects from the selected image region; performing image tracking using, as a tracking object, the recognition object recognized by the multi-class recognition processing, and calculating a predicted position of the tracking object in an image at a later time based on an image at a previous time; and performing, at the predicted position in the image at the later time, binary classification recognition processing corresponding to the category of the tracking object to identify the category of the tracking object, wherein the binary classification recognition processing is performed using multiple sets of binary classification recognition learning data, each of which is obtained by machine learning with the recognition object of one of the multiple categories as input.