Error correction method and device for motor vehicle, non-motor vehicle, and pedestrian detection results
In an intelligent security system, the location and feature information of detection targets in the image to be detected are determined, and similarities computed at differential feature locations are used for error correction. This addresses the false detection of people riding two-wheeled vehicles and improves detection accuracy and error correction efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- JINAN BOGUAN INTELLIGENT TECH CO LTD
- Filing Date
- 2024-12-17
- Publication Date
- 2026-06-19
AI Technical Summary
The intelligent security system has a problem of misdetecting people riding two-wheeled vehicles in low-light conditions, which leads to frequent false alarms and affects security protection and user experience.
Detection targets whose initial detection category is pedestrian or two-wheeled vehicle are located in the image to be detected, and their features are taken from the complete feature extraction result. Based on the differential feature location information, target feature information is determined, and its similarity to the pre-established pedestrian and two-wheeled vehicle feature databases is calculated for error correction.
It effectively solves the problem of misidentification between two-wheeled vehicles and pedestrians, improves error correction efficiency, and increases detection accuracy.
Figure CN122244897A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of image recognition and detection technology, and in particular to a method and apparatus for correcting the detection results of motor vehicles, non-motor vehicles, and pedestrians.
Background Technology
[0002] In the field of intelligent security, intelligent security systems can be used for area intrusion alarms, people counting, and boundary crossing detection. Although the development of deep learning has gradually improved security effectiveness and significantly reduced false alarms and missed alarms, many false alarms still occur for targets with indistinct features, with partial occlusion, or in low-light conditions. In particular, when a person is riding a bicycle in the forward or opposite direction, the deep learning model of the intelligent security system may misdetect the rider as a pedestrian. This affects subsequent alarms and degrades both security protection and the user experience.
Summary of the Invention
[0003] This invention provides a method and apparatus for correcting the detection results of motor vehicles, non-motor vehicles, and pedestrians. It effectively addresses the misidentification between two-wheeled vehicles and pedestrians and also improves error correction efficiency.
[0004] According to one aspect of the present invention, a method for correcting motor vehicle, non-motor vehicle, and pedestrian detection results is provided, the method comprising:
[0005] Determine the detection target in the image to be detected whose initial detection category information is pedestrian or two-wheeled vehicle, and determine the location information of the detection target;
[0006] Based on the location information, the feature extraction information corresponding to the detection target is determined from the complete feature extraction result of the image to be detected;
[0007] Based on the predetermined differential feature location information, the target feature information is determined according to the feature extraction information;
[0008] Determine the pedestrian category similarity between the target feature information and the pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and the pre-established two-wheeled vehicle feature database;
[0009] Determine the category correction result of the detection target based on the pedestrian category similarity and the two-wheeled vehicle category similarity.
[0010] According to another aspect of the present invention, an error correction device for motor vehicle, non-motor vehicle, and pedestrian detection results is provided, the device comprising:
[0011] The location information determination module is used to determine the detection target in the image to be detected whose initial detection category information is pedestrian or two-wheeled vehicle, and to determine the location information of the detection target;
[0012] The feature extraction information determination module is used to determine the feature extraction information corresponding to the detection target from the complete feature extraction results of the image to be detected based on the location information;
[0013] The target feature information determination module is used to determine target feature information based on the feature extraction information according to the pre-determined differential feature location information.
[0014] The similarity determination module is used to determine the pedestrian category similarity between the target feature information and the pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and the pre-established two-wheeled vehicle feature database;
[0015] The category correction result determination module is used to determine the category correction result of the detected target based on the pedestrian category similarity and the two-wheeled vehicle category similarity.
[0016] According to another aspect of the present invention, an electronic device is provided, the electronic device comprising:
[0017] At least one processor; and
[0018] A memory communicatively connected to the at least one processor; wherein,
[0019] The memory stores a computer program that can be executed by the at least one processor, which enables the at least one processor to perform the error correction method for motor vehicle, non-motor vehicle, and pedestrian detection results according to any embodiment of the present invention.
[0020] According to another aspect of the present invention, a computer-readable storage medium is provided, the computer-readable storage medium storing computer instructions which, when executed by a processor, implement the error correction method for motor vehicle, non-motor vehicle, and pedestrian detection results according to any embodiment of the present invention.
[0021] The technical solution of this invention determines the detection target in the image to be detected whose initial detection category information is pedestrian or two-wheeled vehicle, and determines the location information of the detection target. Based on the location information, it determines the feature extraction information corresponding to the detection target from the complete feature extraction result of the image to be detected. Based on the predetermined differential feature location information, it determines the target feature information from the feature extraction information. It then determines the pedestrian category similarity between the target feature information and a pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and a pre-established two-wheeled vehicle feature database. Finally, it determines the category correction result of the detection target based on the pedestrian category similarity and the two-wheeled vehicle category similarity. For detection targets whose initial detection category information is pedestrian or two-wheeled vehicle, this solution uses the predetermined differential feature location information to extract target feature information at the locations that best reflect the differences between the two categories. The target feature information therefore contains more distinctive features and focuses on the more critical local features, so that motor vehicle, non-motor vehicle, and pedestrian detection results can be corrected based on it. This effectively addresses the misidentification between two-wheeled vehicles and pedestrians and also improves error correction efficiency.
[0022] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of the present invention, nor is it intended to limit the scope of the invention. Other features of the invention will become readily apparent from the following description.
Description of the Drawings
[0023] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0024] Figure 1 is a flowchart of a method for correcting motor vehicle, non-motor vehicle, and pedestrian detection results provided in an embodiment of the present invention;
[0025] Figure 2a is a schematic diagram of an image to be detected showing a person riding a two-wheeled vehicle in the forward direction, provided in an embodiment of the present invention;
[0026] Figure 2b is a schematic diagram of an image to be detected showing a person riding a two-wheeled vehicle in the opposite direction, provided in an embodiment of the present invention;
[0027] Figure 3 is a flowchart of a process for determining differential feature location information provided in an embodiment of the present invention;
[0028] Figure 4 is a schematic diagram of filling a reference pedestrian feature matrix provided in an embodiment of the present invention;
[0029] Figure 5 is a flowchart of another method for correcting motor vehicle, non-motor vehicle, and pedestrian detection results provided in an embodiment of the present invention;
[0030] Figure 6 is a flowchart of a further method for correcting motor vehicle, non-motor vehicle, and pedestrian detection results provided in an embodiment of the present invention;
[0031] Figure 7a is a schematic diagram of road distortion captured by an image acquisition device according to an embodiment of the present invention;
[0032] Figure 7b is a schematic diagram of a method for determining regional distortion factors according to an embodiment of the present invention;
[0033] Figure 8 is a schematic structural diagram of a device for correcting motor vehicle, non-motor vehicle, and pedestrian detection results provided in an embodiment of the present invention;
[0034] Figure 9 is a schematic diagram of an electronic device implementing the method for correcting motor vehicle, non-motor vehicle, and pedestrian detection results provided in an embodiment of the present invention.
Detailed Implementation
[0035] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.
[0036] It should be noted that the terms "candidate," "target," etc., used in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.
[0037] It should be noted that the concepts of "first" and "second" mentioned in this invention are only used to distinguish different devices, modules or units, and are not used to limit the order of functions performed by these devices, modules or units or their interdependencies.
[0038] Figure 1 is a flowchart of a method for correcting motor vehicle, non-motor vehicle, and pedestrian detection results according to an embodiment of the present invention. This embodiment is applicable to situations where, after preliminary detection of the image to be detected, the obtained detection results are corrected. The method can be executed by a device for correcting motor vehicle, non-motor vehicle, and pedestrian detection results. The device can be implemented in hardware and/or software and is generally integrated into any electronic device with network communication capabilities, such as a mobile terminal, PC, or server. As shown in Figure 1, the error correction method in this embodiment of the invention may include the following process:
[0039] S101. Determine the detection target in the image to be detected whose initial detection category information is pedestrian or two-wheeled vehicle, and determine the location information of the detection target.
[0040] The image to be detected is the image used for detection, and it can be acquired using an actual image acquisition device. During detection, each frame of the image acquisition device can be used as the image to be detected, or image frames can be selected at fixed intervals for detection. For example, an image frame can be acquired every 8 frames for detection.
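The fixed-interval frame selection described above can be sketched as follows. This is a minimal illustration only; the function name `sample_frames` and its `interval` parameter are hypothetical and do not come from the specification.

```python
from typing import Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def sample_frames(frames: Iterable[Frame], interval: int = 8) -> Iterator[Frame]:
    """Yield every `interval`-th frame, starting with the first one.

    interval=1 corresponds to using every frame as an image to be detected.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    for index, frame in enumerate(frames):
        if index % interval == 0:
            yield frame
```

With `interval=8`, frames 0, 8, 16, and so on are passed to detection.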
[0041] The initial detection category information reflects the category information identified after preliminary detection of targets in the image to be detected. The initial detection category information is determined through a pre-constructed basic detection network. The initial detection categories can include pedestrians, motor vehicles, and two-wheeled vehicles. The initial detection categories can also be set according to actual needs or the actual situation of the model algorithm; for example, the initial detection categories can also include tricycles and head / shoulder categories, etc.
[0042] Figure 2a is a schematic diagram of an image to be detected showing a person riding a two-wheeled vehicle in the forward direction, and Figure 2b is a schematic diagram of an image to be detected showing a person riding a two-wheeled vehicle in the opposite direction, both provided in an embodiment of the present invention. As shown in Figures 2a and 2b, regardless of whether the rider is traveling in the forward or the opposite direction, when the initial deep learning model detects such an image, the proportion of pedestrian features in the detection box is much greater than the proportion of two-wheeled vehicle features. The model therefore tends to identify a person riding a two-wheeled vehicle as a pedestrian, which produces an incorrect result. The solution of this invention further corrects the motor vehicle, non-motor vehicle, and pedestrian detection results after the initial detection of the image, thereby improving the accuracy with which pedestrians and two-wheeled vehicles are identified.
[0043] To correct errors for pedestrians and two-wheeled vehicles, after identifying the initial detection category of either a pedestrian or a two-wheeled vehicle in the image to be detected, the location information of the detected targets can be determined. This location information refers to the target's position within the image to be detected. The location information of the detected targets can be directly obtained from the basic detection network.
[0044] S102. Based on the location information, determine the feature extraction information corresponding to the detection target from the complete feature extraction results of the image to be detected.
[0045] Here, the complete feature extraction result refers to the complete image feature information extracted by the basic detection network during the recognition process of the image to be detected. For example, the complete feature extraction result is the feature extraction result of multiple feature extraction layers in the basic detection network, including at least one shallow feature extraction layer. Considering that some features are lost during the convolution process, feature extraction based on shallow feature extraction layers in a deep network can result in more complete feature information.
[0046] Since the location information of the detection target in the image to be detected has been determined, the feature extraction information corresponding to the detection target can be determined from the complete feature extraction result of the image to be detected based on that location information. Further detection and judgment can then be performed on this feature extraction information to correct the motor vehicle, non-motor vehicle, and pedestrian detection results.
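One way to picture S102 is as a crop of the feature map at the target's location. The sketch below is an assumption-laden illustration: it supposes the location is a pixel-space box `(x1, y1, x2, y2)`, that the feature map is a nested list of shape H x W x C, and that a single `stride` relates image pixels to feature cells. None of these names or conventions is specified in the source.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical layout: H rows x W columns, each cell holding C channel values.
FeatureMap = List[List[Sequence[float]]]


def crop_features(
    feature_map: FeatureMap,
    box: Tuple[float, float, float, float],
    stride: int,
) -> FeatureMap:
    """Return the feature cells covered by a pixel-space box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    height = len(feature_map)
    width = len(feature_map[0]) if height else 0
    # Map pixel coordinates to feature-cell indices and clamp to the map.
    col_start = max(0, math.floor(x1 / stride))
    row_start = max(0, math.floor(y1 / stride))
    col_end = min(width, math.ceil(x2 / stride))
    row_end = min(height, math.ceil(y2 / stride))
    return [row[col_start:col_end] for row in feature_map[row_start:row_end]]
```

Because the crop reuses features already computed by the basic detection network, no second forward pass over the image is needed in this sketch.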
[0047] S103. Based on the predetermined differential feature location information, determine the target feature information according to the feature extraction information.
[0048] Here, the differential feature location information reflects the locations of the features that can distinguish pedestrians from two-wheeled vehicles. For example, in Figure 2a, features in the lower middle part of the image are more helpful for accurately identifying the target as a pedestrian or a two-wheeled vehicle, whereas features in the upper middle part are insufficient to distinguish the two. Therefore, positions such as the lower middle part of Figure 2a can be used as the differential feature location information.
[0049] Specifically, the differential feature location information can be predetermined. With it, attention can be focused on the feature information at the key difference locations. Based on the differential feature location information, the target feature information at the matching positions is determined from the feature extraction information, so that the target feature information contains more of the critical local features. This helps identify the detection target accurately and thereby corrects the motor vehicle, non-motor vehicle, and pedestrian detection results.
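As a rough sketch of S103, the differential feature location information can be pictured as a set of `(row, col)` positions in the target's feature grid, and the target feature information as the channel values gathered at those positions. The representation and the name `select_target_features` are assumptions made for illustration; the specification does not fix a data layout.

```python
from typing import Iterable, List, Sequence, Tuple


def select_target_features(
    features: Sequence[Sequence[Sequence[float]]],
    diff_locations: Iterable[Tuple[int, int]],
) -> List[float]:
    """Concatenate the channel values found at the differential locations.

    Locations are visited in sorted order so the resulting vector has a
    stable layout that can be compared against database entries.
    """
    target: List[float] = []
    for row, col in sorted(diff_locations):
        target.extend(features[row][col])
    return target
```

For instance, choosing only positions in the lower rows of the grid would mirror the "lower middle part" example given for Figure 2a.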
[0050] S104. Determine the pedestrian category similarity between the target feature information and the pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and the pre-established two-wheeled vehicle feature database.
[0051] The pedestrian feature database contains more comprehensive pedestrian feature information, while the two-wheeled vehicle feature database contains more comprehensive two-wheeled vehicle feature information. These databases can be pre-established by extracting features from images of predetermined categories, or they can be dynamically updated during actual error correction operations.
[0052] After determining the target feature information, the similarity between the target feature information and each pedestrian feature information in the pedestrian feature database, as well as the similarity between the target feature information and each two-wheeled vehicle feature information in the two-wheeled vehicle feature database, can be determined. For example, cosine similarity, Jaccard distance, and other methods can be used to calculate the similarity.
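The similarity computation can be sketched with cosine similarity, one of the measures named above. How the per-entry similarities are combined into a single category similarity is not stated in the text, so taking the maximum over the database is an assumption of this sketch, as are the function names.

```python
import math
from typing import Iterable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def category_similarity(
    target: Sequence[float], database: Iterable[Sequence[float]]
) -> float:
    """Assumed aggregation: best match between the target and any database entry."""
    return max((cosine_similarity(target, entry) for entry in database), default=0.0)
```

Calling `category_similarity` once with the pedestrian feature database and once with the two-wheeled vehicle feature database yields the two values compared in S105.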
[0053] S105. Determine the category correction result of the detection target based on the pedestrian category similarity and the two-wheeled vehicle category similarity.
[0054] The category correction result reflects the final category information of the target to be detected. Specifically, after obtaining the pedestrian category similarity and the two-wheeled vehicle category similarity, the two similarities can be compared. The category with the higher similarity is the final category to which the target belongs, thus obtaining the category correction result of the target and achieving the purpose of correcting the initial detection category information.
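The comparison in S105 reduces to picking the category with the higher similarity. The text does not say what happens when the two similarities are equal; keeping the initial detection category in that case is an assumption of this sketch, and the category labels are hypothetical.

```python
def correct_category(
    pedestrian_similarity: float,
    two_wheeler_similarity: float,
    initial_category: str,
) -> str:
    """Return the corrected category for a detection target."""
    if pedestrian_similarity > two_wheeler_similarity:
        return "pedestrian"
    if two_wheeler_similarity > pedestrian_similarity:
        return "two_wheeled_vehicle"
    # Assumed tie-break: leave the initial detection result unchanged.
    return initial_category
```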
[0055] The technical solution of this invention determines the detection target in the image to be detected whose initial detection category information is pedestrian or two-wheeled vehicle, and determines the location information of the detection target. Based on the location information, it determines the feature extraction information corresponding to the detection target from the complete feature extraction result of the image to be detected. Based on the predetermined differential feature location information, it determines the target feature information from the feature extraction information. It determines the pedestrian category similarity between the target feature information and a pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and a pre-established two-wheeled vehicle feature database. Finally, it determines the category correction result of the detection target based on the pedestrian category similarity and the two-wheeled vehicle category similarity. For detection targets whose initial detection category information is pedestrian or two-wheeled vehicle, this solution uses the predetermined differential feature location information to extract target feature information at the locations that best reflect the differences between the two categories. The target feature information therefore contains more distinctive features and focuses on the more critical local features, so that motor vehicle, non-motor vehicle, and pedestrian detection results can be corrected based on it. This effectively addresses the misidentification between two-wheeled vehicles and pedestrians and also improves error correction efficiency.
[0056] Figure 3 is a flowchart of a process for determining differential feature location information according to an embodiment of the present invention. This embodiment can be combined with the optional solutions in one or more of the above embodiments. As shown in Figure 3, the process for determining the differential feature location information in this embodiment of the invention specifically includes:
[0057] S301. Initialize the difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix; wherein, the difference parameters include different types of difference parameters; the type of difference parameter corresponds to a sliding window of different size.
[0058] The reference pedestrian feature matrix is obtained by extracting features from pedestrians in the reference image, and the reference two-wheeled vehicle feature matrix is obtained by extracting features from two-wheeled vehicles in the reference image. The pedestrian and two-wheeled vehicle categories in the reference image are predetermined. The reference pedestrian and reference two-wheeled vehicle feature matrices can be extracted from the same reference image or from different reference images. For reference pedestrian and reference two-wheeled vehicle feature matrices extracted from different reference images, their sizes are standardized according to the size of the feature extraction information corresponding to the detection target. The reference pedestrian and reference two-wheeled vehicle feature matrices are used to predetermine the differential feature location information in subsequent schemes. The differential feature location information is determined by the differences in feature blocks extracted from the reference pedestrian and reference two-wheeled vehicle feature matrices; feature blocks are extracted by sliding windows of adaptive size from the reference pedestrian and reference two-wheeled vehicle feature matrices.
[0059] In this context, a feature unit is the smallest unit in the feature matrix, and a feature block consists of at least two feature units. For the obtained reference pedestrian feature matrix and reference two-wheeled vehicle feature matrix, in order to extract feature information using a sliding window in subsequent steps, optionally, the reference pedestrian feature matrix and reference two-wheeled vehicle feature matrix can be filled first, and then the difference parameters of each feature unit can be initialized. Specifically, filling the feature matrix involves filling each direction of the feature matrix with a preset number of feature units. The number of filled feature units can be preset based on the actual sliding window size. For example, if the maximum sliding window used is 3*3, then one feature unit needs to be filled in each direction of the feature matrix; if the maximum sliding window used is 5*5, then two feature units need to be filled in each direction of the feature matrix; and so on, thus achieving the filling process for the reference pedestrian feature matrix and reference two-wheeled vehicle feature matrix.
[0060] Figure 4 is a schematic diagram of filling a reference pedestrian feature matrix provided in an embodiment of the present invention. As shown in Figure 4, the yellow part is the original reference pedestrian feature matrix and the gray part is the filled part. Figure 4 shows a reference pedestrian feature matrix filled with two feature units in each direction, and each small cell in the figure represents one feature unit. The principle and effect of filling the reference two-wheeled vehicle feature matrix are the same as for the reference pedestrian feature matrix in Figure 4 and are not described again here.
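The filling rule (one unit per side for a 3*3 maximum window, two for 5*5) amounts to padding by `(max_window - 1) // 2` on every side. The sketch below illustrates this on a two-dimensional matrix. The fill value is not specified in the text, so `0.0` is an assumption, and `pad_feature_matrix` is a hypothetical name.

```python
from typing import List, Sequence


def pad_feature_matrix(
    matrix: Sequence[Sequence[float]], max_window: int, fill: float = 0.0
) -> List[List[float]]:
    """Pad every side so a max_window-sized window can be centered on any unit."""
    if max_window < 1 or max_window % 2 == 0:
        raise ValueError("max_window must be a positive odd number")
    pad = (max_window - 1) // 2  # 3*3 -> 1 unit, 5*5 -> 2 units
    width = len(matrix[0]) if matrix else 0
    padded_width = width + 2 * pad
    top = [[fill] * padded_width for _ in range(pad)]
    middle = [[fill] * pad + list(row) + [fill] * pad for row in matrix]
    bottom = [[fill] * padded_width for _ in range(pad)]
    return top + middle + bottom
```

A 2 x 2 matrix padded for a 5*5 window becomes 6 x 6, matching the two-unit border shown in gray in Figure 4.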
[0061] Each feature unit has its own difference parameters. The difference parameters are of different types, namely first-type difference parameters and other-type difference parameters. Each type corresponds to a sliding window of a different size, and the first-type difference parameters correspond to the smallest sliding window. For example, for the first feature unit in the top left corner of the feature matrix, the difference parameters may include wR1, hR1, cR1, wR3, hR3, cR3, wR5, hR5, and cR5. Here, w, h, and c denote the three dimensions of the feature unit: W (width), H (height), and C (channels), respectively. wR1, hR1, and cR1 are first-type difference parameters and represent a 1x1 window centered on the current feature unit along the W, H, and C dimensions, respectively. wR3, hR3, cR3 and wR5, hR5, cR5 are other-type difference parameters: wR3, hR3, and cR3 represent 3x3 windows centered on the current feature unit along the W, H, and C dimensions, respectively, while wR5, hR5, and cR5 represent 5x5 windows centered on the current feature unit along the W, H, and C dimensions, respectively. For example, Figure 4 shows the difference parameters of the reference pedestrian feature matrix along the W dimension: for the first feature unit in the upper left corner of the yellow area, the blue box is wR1, the red box is wR3, and the purple box is wR5.
[0062] It should be noted that the principle for determining the difference parameters of each feature unit in the W, H, and C dimensions of the feature matrix is the same. In this embodiment of the invention, only the principle is explained from the W dimension; the details of the principle in the H and C dimensions will not be repeated. Furthermore, the number and size of the sliding window can be determined according to the detection accuracy of the actual scene. The specific parameters included in other types of difference parameters can be set based on actual needs. For example, other types of difference parameters can also be wR7, hR7, cR7, etc., formed by a 7*7 window, or larger windows can be used to determine different other types of difference parameters.
[0063] Specifically, the difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix are initialized. This can be done by setting the difference parameter of each feature unit to 0, that is, setting the specific difference parameter values of each feature unit in the W, H, and C dimensions to 0.
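The initialization can be sketched as one small record of zeroed parameters per feature unit. Storing the parameters as a dictionary keyed by names such as `wR1` is an assumption made for readability; only the parameter names themselves come from the text.

```python
from typing import Dict, List, Sequence


def init_difference_parameters(
    height: int, width: int, window_sizes: Sequence[int] = (1, 3, 5)
) -> List[List[Dict[str, int]]]:
    """Create a height x width grid of difference parameters, all set to 0."""
    keys = [f"{dim}R{size}" for size in window_sizes for dim in ("w", "h", "c")]
    # Each feature unit gets its own dict so later updates stay independent.
    return [[dict.fromkeys(keys, 0) for _ in range(width)] for _ in range(height)]
```

The same function would be called once for the reference pedestrian feature matrix and once for the reference two-wheeled vehicle feature matrix.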
[0064] S302. Use an initial-size sliding window to extract features from the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix to obtain initial pedestrian feature blocks and initial two-wheeled vehicle feature blocks, and determine the differences between the initial pedestrian feature blocks and the initial two-wheeled vehicle feature blocks; wherein, the positions of the initial pedestrian feature blocks and the initial two-wheeled vehicle feature blocks correspond.
[0065] The initial size of the sliding window can be set based on actual needs. For example, the initial size of the sliding window can be a 1x1 window.
[0066] Specifically, when using an initially sized sliding window to extract features from the feature matrix, the extraction can begin from the first feature unit at the top left corner of the feature matrix. Based on the sliding window, feature extraction can be performed to obtain feature blocks, which are a portion of the feature matrix. In this invention, the principle of using a sliding window to extract features from the reference pedestrian feature matrix is the same as the principle of using a sliding window to extract features from the reference two-wheeled vehicle feature matrix. The principle of using an enlarged sliding window for feature extraction in subsequent solutions is also the same. Therefore, this invention only describes one aspect in detail; the same principles will not be elaborated further.
[0067] It should be noted that when using an initial-size sliding window to extract features from the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix, the feature extraction is performed using a window of the same size, and the window slides at corresponding positions within the two feature matrices. For example, if a 1x1 initial-size sliding window extracts features at the first row and first column of the reference pedestrian feature matrix, resulting in the initial pedestrian feature block, then the initial two-wheeled vehicle feature block should also be obtained by using a 1x1 initial-size sliding window to extract features at the first row and first column of the reference two-wheeled vehicle feature matrix. This ensures that the initial pedestrian and initial two-wheeled vehicle feature blocks obtained from the two feature matrices correspond in position.
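The position correspondence described above can be sketched by extracting a block from each matrix with the same center and the same window size. The function names are hypothetical, and the sketch assumes the matrices have already been filled so that the window never leaves the matrix.

```python
from typing import List, Sequence, Tuple


def extract_block(matrix: Sequence[Sequence], row: int, col: int, size: int) -> List[list]:
    """Return the size x size block centered on (row, col)."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number")
    half = size // 2
    height = len(matrix)
    width = len(matrix[0]) if height else 0
    if row - half < 0 or col - half < 0 or row + half >= height or col + half >= width:
        raise IndexError("window exceeds the (filled) matrix")
    return [list(r[col - half:col + half + 1]) for r in matrix[row - half:row + half + 1]]


def extract_paired_blocks(
    pedestrian_matrix: Sequence[Sequence],
    two_wheeler_matrix: Sequence[Sequence],
    row: int,
    col: int,
    size: int,
) -> Tuple[List[list], List[list]]:
    """Extract blocks at the same position and size from both reference matrices."""
    return (
        extract_block(pedestrian_matrix, row, col, size),
        extract_block(two_wheeler_matrix, row, col, size),
    )
```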
[0068] After obtaining the initial pedestrian feature block and the initial two-wheeled vehicle feature block, the difference between them can be determined. Specifically, the difference can be obtained as the complement of the similarity between the two feature blocks. The difference reflects how much the pedestrian and two-wheeled vehicle feature information differ at the corresponding location. A greater difference indicates that features at that location distinguish pedestrians from two-wheeled vehicles better, which is more helpful for the subsequent correction of motor vehicle, non-motor vehicle, and pedestrian detection results.
[0069] As an optional but non-limiting implementation, the present invention can determine the similarity between pedestrian feature blocks and two-wheeled vehicle feature blocks based on the following formula:
[0070]
[0071] Here, $s_{person,bike}$ denotes the similarity between the pedestrian feature block and the two-wheeled vehicle feature block; $A_m$ denotes the pedestrian feature block extracted from the pedestrian feature matrix with the sliding window; $B_m$ denotes the two-wheeled vehicle feature block extracted from the two-wheeled vehicle feature matrix with the sliding window.
[0072] As an optional but non-limiting implementation, after obtaining the similarity between the pedestrian feature block and the two-wheeled vehicle feature block, the present invention can determine the difference between the pedestrian feature block and the two-wheeled vehicle feature block based on the following formula:
[0073] $Diff = 1 - s_{person,bike}$;
[0074] Here, Diff represents the difference between the pedestrian feature block and the two-wheeled vehicle feature block. The larger the value of Diff, the greater the difference.
[0075] It should be noted that the differences between the initial pedestrian feature block and the initial two-wheeled vehicle feature block can be determined based on the above formula. In subsequent schemes, after obtaining the expanded feature blocks, the determination can still be based on the above formula.
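As a non-limiting illustration, the difference calculation can be sketched in Python as follows. The similarity formula itself is not reproduced in the text, so cosine similarity over flattened feature blocks is assumed here purely for illustration; the function names `cosine_similarity` and `block_difference` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two flattened feature blocks (ASSUMED metric)."""
    if len(a) != len(b):
        raise ValueError("feature blocks must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a block is all zeros
    return dot / (norm_a * norm_b)


def block_difference(pedestrian_block: Sequence[float],
                     bike_block: Sequence[float]) -> float:
    """Diff = 1 - s_{person,bike}, as stated in paragraph [0073]."""
    return 1.0 - cosine_similarity(pedestrian_block, bike_block)
```

The same function applies unchanged to initial and expanded feature blocks, provided both blocks are taken at corresponding positions with the same window size.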
[0076] S303. If the difference is greater than the difference threshold, modify the first type of difference parameter of each feature unit in the initial pedestrian feature block and two-wheeled vehicle feature block, and expand the sliding window; otherwise, continue to use the initial size sliding window for feature extraction.
[0077] For example, taking the extraction of initial feature blocks from the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix using a 1*1 sliding window starting from the top left corner, if the difference between the initial pedestrian feature block and the initial two-wheeled vehicle feature block is determined to be greater than the difference threshold, then the first type of difference parameter of each feature unit in the initial pedestrian feature block and the two-wheeled vehicle feature block is modified, that is, wR1, hR1, and cR1 of each feature unit in these feature blocks are modified and assigned a value of 1. The original 1*1 sliding window is then expanded to a 3*3 sliding window.
[0078] If the difference between the initial pedestrian feature block and the initial two-wheeled vehicle feature block is determined to be less than or equal to the difference threshold, then a 1*1 sliding window (initial size sliding window) is used for feature extraction.
[0079] The difference threshold can be pre-calculated based on the actual needs of the scenario. The difference threshold can reflect the required degree of difference. The larger the difference threshold, the higher the required degree of difference and the more stringent the requirement; the smaller the difference threshold, the lower the required degree of difference and the more lenient the requirement.
[0080] As an optional but non-limiting implementation, the method for determining the difference threshold is as follows: During scene initialization, extract equal amounts of pedestrian features and non-motorized vehicle features (100-1000 features each). Use 1*1, 3*3, and 5*5 windows sequentially to extract feature matrices along the W, H, and C dimensions. Calculate the difference between pedestrian and two-wheeled vehicle categories based on the relevant formulas for calculating difference mentioned above. Then, sort the obtained difference values from largest to smallest, and finally obtain the threshold according to scene requirements. For example, if the current scene yields 1000 difference values, the difference values at the top 10% can be selected as the threshold; therefore, the difference value at the 100th position is the difference threshold.
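The threshold selection described above can be sketched as follows. This is only an illustration under the stated assumptions: `difference_threshold` and `top_fraction` are hypothetical names, and the difference values are assumed to have been computed already.

```python
from typing import Sequence


def difference_threshold(diffs: Sequence[float], top_fraction: float = 0.10) -> float:
    """Sort the difference values from largest to smallest and return the value
    at the position given by top_fraction (e.g. the 100th of 1000 for 10%)."""
    if not diffs:
        raise ValueError("at least one difference value is required")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    ranked = sorted(diffs, reverse=True)
    # 1-based position in the ranked list; never smaller than the first position.
    position = max(1, round(len(ranked) * top_fraction))
    return ranked[position - 1]
```

With 1000 difference values and `top_fraction=0.10`, the function returns the value at the 100th position, matching the example in the text.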
[0081] S304. Continue to extract features from the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix using an expanded sliding window to obtain expanded pedestrian feature blocks and expanded two-wheeled vehicle feature blocks, and determine the differences between the expanded pedestrian feature blocks and the expanded two-wheeled vehicle feature blocks; wherein the expanded pedestrian feature blocks and the expanded two-wheeled vehicle feature blocks correspond in position.
[0082] For example, after expanding the sliding window to a 3*3 window, the expanded 3*3 sliding window can be used to continue to extract features from the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix, resulting in expanded pedestrian feature blocks and expanded two-wheeled vehicle feature blocks, and the positions of the expanded pedestrian feature blocks and expanded two-wheeled vehicle feature blocks correspond.
[0083] It should be noted that, due to the expansion of the sliding window, the expanded pedestrian feature block contains more pedestrian feature information than the initial pedestrian feature block, and correspondingly, the expanded two-wheeled vehicle feature block contains more two-wheeled vehicle feature information than the initial two-wheeled vehicle feature block.
[0084] After obtaining the enlarged pedestrian feature block and the enlarged two-wheeled vehicle feature block, we can continue to use the formulas for calculating the similarity between the pedestrian feature block and the two-wheeled vehicle feature block, as well as the formulas for calculating the difference between the pedestrian feature block and the two-wheeled vehicle feature block, to determine the difference between the enlarged pedestrian feature block and the enlarged two-wheeled vehicle feature block.
[0085] S305. If the difference is greater than the difference threshold, modify the first type of difference parameter of each feature unit in the expanded pedestrian feature block and the expanded two-wheeled vehicle feature block, as well as the other type of difference parameters of the center point feature unit. Determine whether to continue expanding the sliding window according to the feature extraction requirements, and continue to perform feature extraction using the expanded sliding window, or continue to perform feature extraction using the current sliding window; otherwise, shrink the current sliding window and continue to perform feature extraction using the shrunken sliding window.
[0086] The feature extraction requirements can be set based on the user's actual scenario needs. These requirements reflect the size of the sliding window the user requires; for example, the requirement could be a maximum sliding window size of 3x3 or a maximum of 5x5. Essentially, the feature extraction requirements reflect the maximum extent to which the sliding window can be expanded.
[0087] For example, if the calculated difference when using a 3*3 sliding window is greater than the difference threshold, then the first type of difference parameter of each feature unit in the expanded pedestrian feature block and the expanded two-wheeled vehicle feature block, as well as the other type of difference parameters of the center point feature unit, are modified. The center point feature unit is the feature unit at the center of the feature block. Specifically, the first type of difference parameters wR1, hR1, and cR1 in the feature block can be assigned a value of 1, and the other type of difference parameters wR3, hR3, and cR3 of the center point feature unit corresponding to the current 3*3 sliding window can be assigned a value of 1. Whether to further expand the sliding window is then determined based on the feature extraction requirements. If the feature extraction requirement is a 5*5 sliding window, then the 3*3 sliding window can be further expanded to a 5*5 sliding window, and feature extraction can continue using the 5*5 sliding window. If the feature extraction requirement is a 3*3 sliding window, then the sliding window is not expanded further, and the current 3*3 sliding window is used for feature extraction.
[0088] If the calculated difference is less than or equal to the difference threshold, the current sliding window is reduced in size, and feature extraction continues using the reduced sliding window. For example, a 3x3 sliding window can be reduced to a 1x1 sliding window, and feature extraction can continue using the reduced 1x1 sliding window.
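The window-size decision in steps S303 and S305 can be sketched as below. This is a simplified illustration only: it covers the choice of the next window size and omits the updates to the difference parameters; `next_window_size` and `max_size` (standing in for the feature extraction requirement) are hypothetical names.

```python
def next_window_size(current: int, diff: float, threshold: float,
                     max_size: int = 5) -> int:
    """Return the side length of the sliding window to use next.

    current   -- current odd window side length (1, 3, 5, ...)
    diff      -- difference computed for the current pair of feature blocks
    threshold -- difference threshold
    max_size  -- largest window allowed by the feature extraction requirement
    """
    if current < 1 or current % 2 == 0:
        raise ValueError("window side length must be a positive odd number")
    if diff > threshold:
        # Expand by one unit in each direction while the requirement allows it;
        # otherwise keep using the current window.
        return current + 2 if current + 2 <= max_size else current
    # Difference not above the threshold: shrink, but never below the 1*1 window.
    return max(1, current - 2)
```

For example, a 3*3 window whose difference exceeds the threshold grows to 5*5 when the requirement is 5*5, stays at 3*3 when the requirement is 3*3, and shrinks to 1*1 when the difference is less than or equal to the threshold.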
[0089] S306. Continue sliding the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix until the sliding is complete. Determine the corresponding differential feature position information based on the differential parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix.
[0090] Specifically, each time the sliding window is expanded, feature extraction continues with the expanded window, and so on, until the window reaches a suitable size. This is repeated until sliding over the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix is complete, after which the corresponding differential feature position information is determined based on the difference parameters of each feature unit in the two matrices.
[0091] As an optional but non-limiting implementation, determining the corresponding differentiated feature location information based on the difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix may include the following steps A1-A3:
[0092] Step A1: Determine the target feature units in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix whose first type of difference parameters meet the conditions.
[0093] Specifically, we can search the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix for feature units whose first type of difference parameters are all equal to 1, that is, feature units satisfying wR1=1&hR1=1&cR1=1.
[0094] The feature units that satisfy this condition are the target feature units; those that do not are not target feature units.
[0095] Step A2: Determine other class difference parameters of the target feature unit. If other class difference parameters meet the conditions, determine the location information of the differentiated feature based on the sliding window size corresponding to the other class difference parameter.
[0096] After identifying the target feature unit, it is necessary to further determine whether other class-specific difference parameters of the target feature unit meet the conditions. Specifically, this involves checking if any of the other class-specific difference parameter values of the target feature unit have a value of 1, i.e., checking if any of the other class-specific difference parameters wR3, hR3, cR3, wR5, hR5, cR5, etc., have a value of 1. If a difference parameter value of 1 is found, the differential feature location information is determined based on the sliding window size corresponding to that other class-specific difference parameter. For example, if wR3, hR3, and cR3 have a value of 1, a feature matrix of 3*3 size is extracted along the corresponding dimension, expanding outward by 1 unit from the current target feature unit, and the location information corresponding to the extracted feature matrix is used as the differential feature location information. Similarly, if wR5, hR5, and cR5 have a value of 1, a feature matrix of 5*5 size is extracted along the corresponding dimension, expanding outward by 2 units from the current target feature unit, and the location information corresponding to the extracted feature matrix is used as the differential feature location information. By analogy, following this method, all other differential parameter values of the target feature unit are judged to determine the differential feature location information.
[0097] Step A3: Otherwise, determine the location information of the differentiated features based on the target feature unit.
[0098] If it is determined that there are no parameters with a value of 1 among the other types of differential parameters of the target feature unit, then only the feature matrix of the corresponding dimension with the current target feature unit as the center is extracted. That is, the feature matrices corresponding to the first type of differential parameters wR1, hR1, and cR1 respectively, and the position information corresponding to the extracted feature matrix is used as the differential feature position information.
[0099] By analogy, the difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix can be determined to identify the corresponding differentiated feature location information.
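Steps A1-A3 can be sketched as follows. This is a minimal illustration under assumptions: each feature unit is represented by a dictionary of difference parameters keyed by the names used above (`wR1`, `hR3`, and so on), missing keys are treated as 0, and the function name `differential_positions` and the tuple layout of the result are hypothetical.

```python
from typing import Dict, List, Tuple

Flags = Dict[str, int]  # e.g. {"wR1": 1, "hR1": 1, "cR1": 1, "wR3": 1}


def differential_positions(grid: List[List[Flags]],
                           sizes: Tuple[int, ...] = (3, 5)
                           ) -> List[Tuple[int, int, str, int]]:
    """Return (row, col, dimension, window_size) entries describing where
    differential features are taken, centered on each target feature unit."""
    positions: List[Tuple[int, int, str, int]] = []
    for r, row in enumerate(grid):
        for c, flags in enumerate(row):
            # Step A1: target units have wR1 = hR1 = cR1 = 1.
            if not all(flags.get(d + "R1", 0) == 1 for d in "whc"):
                continue
            # Step A2: any other-class parameter equal to 1 selects the window
            # size it corresponds to (3 expands by 1 unit, 5 by 2 units).
            hits = [(r, c, d, s) for s in sizes for d in "whc"
                    if flags.get(f"{d}R{s}", 0) == 1]
            if hits:
                positions.extend(hits)
            else:
                # Step A3: fall back to the 1*1 block at the target unit itself.
                positions.extend((r, c, d, 1) for d in "whc")
    return positions
```

A window of size s centered on a target unit extends (s - 1) / 2 units outward in each direction, which matches the 1-unit and 2-unit expansions described for the 3*3 and 5*5 cases.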
[0100] For example, this embodiment provides a complete example to illustrate the process of determining the location information of differential features. Feature matrices in the W, H, and C dimensions are extracted using 1*1, 3*3, and 5*5 windows, as detailed below:
[0101] Step 1: First, pad the feature matrix with two feature units in each direction. The padding value is not limited.
[0102] Step 2: Taking the feature matrix in the W dimension as an example, perform feature extraction: Starting from the top left corner of the feature matrix as it was before padding, use a 1*1 window to extract feature information (the extracted features are 1*1*c). For the obtained initial pedestrian feature block and initial two-wheeled vehicle feature block, use the relevant formula above to calculate the difference. If it is greater than the threshold, assign 1 to the w value of each feature unit in the extracted feature block (the values are initialized as w=0, h=0, c=0). Expand the current feature block by one feature unit in four directions to become a 3*3 window and execute step 4. Otherwise, execute step 3.
[0103] Step 3: Slide one unit to the right with a step size of 1, continue to extract feature information, calculate the difference, and if it is greater than the threshold, assign 1 to the w value of each feature unit in the feature matrix and execute step 4; otherwise, execute step 3.
[0104] Step 4: Extract feature information (the extracted features are 3*3*c). Continue to calculate the difference for the obtained feature blocks. If it is greater than the threshold, assign 1 to the w value of each feature unit in the feature matrix, assign 1 to the wR3 value of the center point feature unit, expand the current feature matrix by one feature unit in four directions to become a 5*5 window and execute step 6. Otherwise, execute step 5.
[0105] Step 5: Reduce the size to a 1*1*c feature matrix centered on the central feature unit. If the 1*1 window has already been calculated, proceed directly to step 3. Otherwise, continue extracting feature information and calculating the difference. If the difference is greater than the threshold, assign 1 to the w value of each feature unit in the feature matrix and proceed to step 3. If the difference is less than the threshold, proceed to step 3.
[0106] Step 6: Extract feature information (the extracted features are 5*5*c). Continue to calculate the difference for the obtained feature blocks. If it is greater than the threshold, assign 1 to the w value of each feature unit in the feature matrix, assign 1 to the wR5 value of the center point feature unit and execute step 8. Otherwise, execute step 7.
[0107] Step 7: Shrink the feature matrix to a 3*3*c matrix centered on the central feature unit. If the 3*3 window has been calculated, slide one unit to the right with a step size of 1 and execute step 4. Otherwise, continue to extract feature information and calculate the difference. If it is greater than the threshold, assign 1 to the w value of each feature unit in the feature matrix and assign 1 to the wR3 value of the central feature unit. Slide one unit to the right with a step size of 1 and execute step 4. Otherwise, execute step 5.
[0108] Step 8: Slide one unit to the right with a step size of 1, continue to extract feature information, calculate the difference, and if it is greater than the threshold, assign 1 to the w value of each feature unit in the feature matrix, assign 1 to the wR5 value of the center point feature unit and execute step 8; otherwise, execute step 7.
[0109] Step 9: Extract the H and C dimensions of the feature matrix using the same logic. The difference is that when the value is greater than the threshold, h, c, or hR3, cR3, or hR5, cR5 are assigned a value of 1.
[0110] The embodiments of this invention utilize adaptive sliding windows to extract differential feature location information, enabling the feature content corresponding to this information to more clearly differentiate between pedestrians and two-wheeled vehicles. This results in the target feature information containing more obvious differential features. By focusing on more critical local features, the accuracy and efficiency of subsequent error correction for non-human and machine detection results are greatly improved.
[0111] Figure 5 is a flowchart of another method for correcting machine / non-human detection results provided by an embodiment of the present invention. Building on the technical solutions of the above embodiments, this embodiment further optimizes the process of determining the target feature information from the feature extraction information based on the pre-determined differential feature location information. This embodiment can be combined with the optional solutions in one or more of the above embodiments. As shown in Figure 5, the error correction method for machine / non-human detection results according to an embodiment of the present invention may include the following process:
[0112] S501. Determine the target in the image to be detected as a pedestrian or a two-wheeled vehicle based on the initial detection category information, and determine the location information of the target.
[0113] S502. Based on the location information, determine the feature extraction information corresponding to the detection target from the complete feature extraction results of the image to be detected.
[0114] S503. Based on the pre-determined differential feature location information, extract the multi-scale feature information of the corresponding location from the feature extraction information.
[0115] Since the determined differential feature location information corresponds to feature information of different window sizes, multi-scale feature information of the corresponding location can be extracted from the feature extraction information of the target to be detected based on the pre-determined differential feature location information. For example, the feature information obtained from the feature extraction information of the target to be detected can have multiple scales such as 1*1, 3*3, and 5*5.
[0116] S504. Perform sliding splitting on the feature information larger than the reference scale in the multi-scale feature information to obtain multiple reference scale feature information.
[0117] The reference scale can be determined according to the feature extraction requirements. For example, the reference scale can be 3*3 or 5*5.
[0118] For example, if the reference scale is 3*3 and the multi-scale feature information includes extracted 5*5 scale feature information, which is larger than the 3*3 reference scale, the 5*5 scale feature information can be split by sliding to obtain multiple pieces of 3*3 reference scale feature information.
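The sliding split in S504 can be sketched as follows. This is an illustration only: the text does not state the sliding step, so a stride of 1 is assumed, and `sliding_split` is a hypothetical name operating on a single two-dimensional slice of the feature information.

```python
from typing import List

Matrix = List[List[float]]


def sliding_split(block: Matrix, ref: int = 3, stride: int = 1) -> List[Matrix]:
    """Split a block larger than the reference scale into ref*ref blocks by
    sliding a ref*ref window over it (stride of 1 is an ASSUMPTION)."""
    rows, cols = len(block), len(block[0])
    if rows < ref or cols < ref:
        raise ValueError("block must be at least as large as the reference scale")
    return [
        [row[c:c + ref] for row in block[r:r + ref]]
        for r in range(0, rows - ref + 1, stride)
        for c in range(0, cols - ref + 1, stride)
    ]
```

Under the stride-1 assumption, one 5*5 block yields nine 3*3 reference scale blocks, ordered from left to right and from top to bottom.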
[0119] S505. After connecting the feature information smaller than the reference scale in the multi-scale feature information, split it to obtain multiple reference scale feature information.
[0120] For example, if the reference scale is 3*3, for feature information smaller than the reference scale in the multi-scale feature information, such as the extracted 1*1 scale feature information, we can first perform a concat operation on the feature matrices of the same size, then connect them in order from left to right and from top to bottom, and finally split the connected matrix to obtain multiple reference scale feature information.
[0121] S506. Connect the reference scale feature information obtained from the multi-scale feature information to obtain the target feature information.
[0122] After decomposing the feature information at each scale, the target feature information can be obtained by connecting the feature information at each reference scale.
[0123] S507. Determine the pedestrian category similarity between the target feature information and the pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and the pre-established two-wheeled vehicle feature database.
[0124] As an optional but non-limiting implementation, determining the pedestrian category similarity between the target feature information and a pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and a pre-established two-wheeled vehicle feature database, may include the following steps B1-B2:
[0125] Step B1: Determine the pedestrian similarity between the target feature information and each pedestrian feature information in the pedestrian feature database, and take the maximum value of the pedestrian similarity as the pedestrian category similarity.
[0126] Step B2: Determine the two-wheeled vehicle similarity between the target feature information and each two-wheeled vehicle feature information in the two-wheeled vehicle feature database, and take the maximum value of the two-wheeled vehicle similarity as the two-wheeled vehicle category similarity.
[0127] The pedestrian similarity and the two-wheeled vehicle similarity are each determined using the following formula:
[0128]
[0129] Where s represents the pedestrian similarity or the two-wheeled vehicle similarity; $A_i$ represents the target feature information of the i-th target in the image to be detected; $B_j$ represents the j-th pedestrian feature in the pedestrian feature database or the j-th two-wheeled vehicle feature in the two-wheeled vehicle feature database.
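Steps B1-B2 can be sketched as follows. The similarity formula is not reproduced in the text, so cosine similarity is assumed purely for illustration; `cosine_similarity` and `category_similarity` are hypothetical names, and features are assumed to be flattened to equal-length vectors.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (ASSUMED metric)."""
    if len(a) != len(b):
        raise ValueError("features must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_a * norm_b)


def category_similarity(target: Sequence[float],
                        database: Sequence[Sequence[float]]) -> float:
    """Maximum similarity between the target feature A_i and every feature B_j
    in one feature database (steps B1 and B2 use the same rule)."""
    if not database:
        raise ValueError("feature database must not be empty")
    return max(cosine_similarity(target, feature) for feature in database)
```

Calling `category_similarity` once with the pedestrian feature database and once with the two-wheeled vehicle feature database gives the two category similarities used in S508.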
[0130] S508. Determine the category correction result of the detected target based on the similarity between pedestrian category and two-wheeled vehicle category.
[0131] The solution of this invention, for a target in an image to be detected whose initial detection category information is pedestrian or two-wheeled vehicle, determines the target feature information based on pre-determined differential feature location information. The differential feature location information is obtained by sliding extraction based on an adaptive sliding window. The feature content corresponding to the obtained differential feature location information can better distinguish the differences between pedestrians and two-wheeled vehicles. This makes the target feature information contain more obvious differential features, focusing on more critical local features, and thus realizes the error correction of non-human and machine detection results based on the target feature information. This effectively solves the problem of misidentification between two-wheeled vehicles and pedestrians, and also greatly improves the error correction efficiency.
[0132] Figure 6 is a flowchart illustrating another method for correcting machine / non-human detection results provided by an embodiment of the present invention. Building on the technical solutions of the previous embodiments, this embodiment further optimizes the process performed before determining the category correction result of the detection target based on the pedestrian category similarity and the two-wheeled vehicle category similarity. This embodiment can be combined with the optional solutions in one or more of the above embodiments. As shown in Figure 6, the error correction method for machine / non-human detection results according to an embodiment of the present invention may include the following process:
[0133] S601. Determine the target in the image to be detected as a pedestrian or a two-wheeled vehicle based on the initial detection category information, and determine the location information of the target.
[0134] S602. Based on the location information, determine the feature extraction information corresponding to the detection target from the complete feature extraction results of the image to be detected.
[0135] S603. Based on the predetermined differential feature location information, determine the target feature information according to the feature extraction information.
[0136] S604. Determine the pedestrian category similarity between the target feature information and the pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and the pre-established two-wheeled vehicle feature database.
[0137] S605. Correct the velocity of the target based on the predetermined regional distortion factor and the position information of the target to obtain the relative velocity of the target.
[0138] Specifically, Figure 7a is a schematic diagram of road distortion captured by an image acquisition device according to an embodiment of the present invention. As shown in Figure 7a, due to the optical characteristics of the image acquisition device's lens, the path indicated by the orange arrow in the diagram appears as a straight line, while the path indicated by the green arrow appears as a diagonal line. Furthermore, the road along the green arrow appears narrower the farther away it is. For the image acquisition device, only targets directly below it appear to travel in a straight line; targets on either side of the device appear to travel in diagonal lines. Additionally, the closer a target is to the image acquisition device, the smaller the distortion; the farther away, the greater the distortion.
[0139] Due to distortion issues caused by image acquisition devices, calculating the speed of a target directly based on its motion trajectory in the captured image can result in significant errors. For example, even if the same target is moving at a constant speed on an actual physical road, its calculated speed will differ considerably at different locations in the video frame. This leads to inaccurate identification of pedestrians or bicycles in the image, especially when pedestrians are riding bicycles in the opposite direction or forwards, making identification errors more likely.
[0140] To address the distortion problem caused by image acquisition devices, this invention introduces a regional distortion factor to correct the speed of the detected target, thereby reducing the speed difference of the same target at different positions in the image. The regional distortion factor is determined based on the driving data of a reference vehicle.
[0141] As an optional but non-limiting implementation, the process of determining the regional distortion factor may include the following steps C1-C3:
[0142] Step C1: Determine the detection location range and detection time range of the reference vehicle based on the driving data of the reference vehicle.
[0143] The reference vehicle is a motor vehicle traveling at a constant speed, and its driving data can be obtained based on real-world scenarios. The detection position range of the reference vehicle reflects the range within which the reference vehicle can be continuously detected in the camera's field of view, and the detection time range of the reference vehicle reflects the period of time during which the reference vehicle can be continuously detected in the camera's field of view.
[0144] Specifically, a coordinate system can be established with the lower left corner of the image actually captured by the image acquisition device as the origin, with the horizontal axis as the X-axis and the vertical axis as the Y-axis. Figure 7b is a schematic diagram of a method for determining regional distortion factors provided in an embodiment of the present invention. As shown in Figure 7b, the blue square represents the reference motor vehicle, which travels along the positive Y-axis. The coordinates of the reference motor vehicle are taken at its center point. In Figure 7b, $(x_{min}, y_{min})$ represents the starting coordinates of the reference motor vehicle's journey and $(x_{max}, y_{max})$ represents the end coordinates; $(x_{min}, y_{min})$~$(x_{max}, y_{max})$ indicates the detection location range of the reference motor vehicle, and $t_{min}$~$t_{max}$ is its detection time range.
[0145] Step C2: Divide the detection location range into regions based on the detection time range to obtain multiple detection areas.
[0146] Specifically, the detection time range can be divided into equal parts to obtain different time periods; and the detection area corresponding to each time period can be determined based on each time period, thereby dividing the detection location range into multiple detection areas.
[0147] It should be noted that the number of time periods obtained by dividing the detection time range into equal parts is the same as the number of detection areas obtained by dividing the detection location range. For example, if the detection time range is divided into 4 equal parts, the detection location range will be divided into 4 segments, resulting in 4 detection areas. The number of detection areas can be set according to the actual needs of the scenario, for example according to the field of view of the image acquisition device; the larger the field of view, the more detection areas there are.
[0148] Step C3: Determine the regional distortion factor of the detection area based on the average speed of the reference motor vehicle in each detection area.
[0149] After obtaining the various detection zones, the regional distortion factor of each zone can be determined based on the average speed of the reference vehicle in each zone. Specifically, the average speed of the reference vehicle in each detection zone can be determined based on its displacement and time information within that zone.
[0150] For example, taking the division of the detection time range into four equal parts as an example, the various time periods after the division... After the detection location range is divided accordingly, four detection areas are formed, each corresponding to a different regional distortion factor. As shown in Figure 7b, the first detection area is closest to the image acquisition device, and the last detection area is farthest from the image acquisition device. The ordinate ranges of these four detection areas are as follows: ($y_{min}$, y1), (y1, y2), (y2, y3) and (y3, $y_{max}$).
[0151] In this embodiment of the invention, the vertical coordinate is used to calculate speed information for two reasons. First, false alarms mostly occur when the two-wheeled vehicle is traveling in the opposite or forward direction; when a two-wheeled vehicle is traveling laterally, its features are more prominent, and false alarms are rare. Second, because of camera distortion, as shown in Figure 7b, only a target directly below the camera appears to travel in a straight line (the orange arrow in Figure 7b), while targets on either side of the camera appear to travel diagonally (the green arrow in Figure 7b). Using the relative distance between two positions to calculate velocity would therefore overstate the velocity of targets on both sides of the camera, whereas using only the vertical coordinate yields a more accurate result. Therefore, in this embodiment of the invention, vertical coordinate information is chosen for the relevant calculations.
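Steps C1-C3 up to the average speed per detection area can be sketched as follows. This is only an illustration under assumptions: the reference vehicle's track is assumed to be a time-ordered list of (t, y) samples of its center point, ordinates at the time boundaries are obtained by linear interpolation (an assumption of this sketch), and the names `interpolate_y` and `region_average_speeds` are hypothetical. The regional distortion factor itself is not computed here.

```python
from typing import List, Tuple

Track = List[Tuple[float, float]]  # (time, ordinate y of the vehicle center)


def interpolate_y(track: Track, t: float) -> float:
    """Ordinate at time t by linear interpolation between neighboring samples."""
    for (t0, y0), (t1, y1) in zip(track, track[1:]):
        if t0 < t1 and t0 <= t <= t1:
            return y0 + (y1 - y0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies outside the detection time range")


def region_average_speeds(track: Track, parts: int = 4
                          ) -> List[Tuple[Tuple[float, float], float]]:
    """Divide t_min~t_max into equal parts and return, for each detection area,
    its ordinate range and the average vertical speed of the reference vehicle."""
    t_min, t_max = track[0][0], track[-1][0]
    step = (t_max - t_min) / parts
    bounds = [t_min + k * step for k in range(parts)] + [t_max]
    ys = [interpolate_y(track, t) for t in bounds]
    return [((ys[k], ys[k + 1]), abs(ys[k + 1] - ys[k]) / step)
            for k in range(parts)]
```

Because the reference vehicle travels at a constant physical speed, differences between the per-area average speeds in image coordinates reflect the distortion of each detection area.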
[0152] As an optional but non-limiting implementation, the regional distortion factor of each detection region can be calculated based on the following formula:
[0153]
[0154] Where k1, k2, k3 and k4 are the regional distortion factors of the respective detection areas.
[0155] Since the pre-determined regional distortion factor essentially reflects the degree of distortion at corresponding positions in the image for different detection regions, the velocity of the detected target can be corrected based on the regional distortion factor and the target's position information to obtain its relative velocity. This allows the determined relative velocity value to be closer to the real-world relative velocity value for the same type of target, narrowing the velocity threshold range and eliminating its chaotic dispersion. Simultaneously, the increased velocity differences between different types of detected targets reduce the likelihood of overlapping velocity threshold ranges, thus facilitating accurate identification and judgment, and avoiding misidentification between two-wheeled vehicles and pedestrians.
[0156] As an optional but non-limiting implementation, the relative velocity of the target in multiple frames of images to be detected can be calculated based on the following formula:
[0157]
[0158] Where v′ represents the relative velocity of the target in multiple frames of the image to be detected; y_i represents the ordinate position information of the target to be detected in the i-th frame of the image to be detected; y_{i-1} represents the ordinate position information of the target to be detected in the (i-1)-th frame of the image to be detected; i is greater than or equal to 1; T is the time interval between two adjacent detection frames.
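As a hedged illustration of how these symbols could be combined, the sketch below computes a corrected speed for each pair of adjacent frames. The formula of the embodiment is not reproduced here. The sketch assumes that the correction multiplies the raw vertical speed |y_i - y_{i-1}| / T by the distortion factor of the region containing y_i. That multiplicative form, the use of the absolute value, and the names `zone_factor` and `relative_velocities` are assumptions for illustration only.

```python
def zone_factor(y, zone_bounds, factors):
    """Return the distortion factor of the first region whose ordinate range contains y."""
    for (lo, hi), k in zip(zone_bounds, factors):
        if min(lo, hi) <= y <= max(lo, hi):
            return k
    raise ValueError(f"ordinate {y} lies outside every detection region")


def relative_velocities(ys, T, zone_bounds, factors):
    """Corrected vertical speed for each pair of adjacent frames.

    ys: ordinates of the target in consecutive frames
    T: time interval between two adjacent detection frames
    """
    if T <= 0:
        raise ValueError("T must be positive")
    result = []
    for i in range(1, len(ys)):
        raw = abs(ys[i] - ys[i - 1]) / T  # uncorrected vertical speed
        # Assumed correction: scale by the factor of the region the target is in.
        result.append(zone_factor(ys[i], zone_bounds, factors) * raw)
    return result
```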
[0159] S606. Based on the voting mechanism, determine the probability of the first pedestrian category and the probability of the first two-wheeled vehicle category of the detected target according to the relative speed of the detected target in multiple frames of images to be detected.
[0160] After determining the relative speed of the target, a voting mechanism can be used to determine the probability of the target being classified as a pedestrian and a two-wheeled vehicle. The probability of the pedestrian category reflects the probability that the target, determined by the voting mechanism, is a pedestrian. The probability of the two-wheeled vehicle category reflects the probability that the target, determined by the voting mechanism, is a two-wheeled vehicle.
[0161] As an optional but non-limiting implementation, based on a voting mechanism, determining the probability of the first pedestrian category and the probability of the first two-wheeled vehicle category of the detected target according to the relative speed of the target in multiple frames of images to be detected may include the following steps D1-D3:
[0162] Step D1: Based on the comparison results between the relative speed of the detected targets in multiple frames of images to be detected and the predetermined relative speed threshold ranges for pedestrians and two-wheeled vehicles, determine the pedestrian score and two-wheeled vehicle score of the detected targets in each frame of images to be detected.
[0163] The relative speed threshold range for pedestrians can be predetermined based on the relative speed between pedestrians and reference motor vehicles in the actual scenario. Similarly, the relative speed threshold range for two-wheeled vehicles can also be predetermined based on the actual situation. The present invention aims to reflect a more realistic speed situation through each speed threshold range.
[0164] Specifically, for a target in a given frame of the image to be detected, the relative speed of the target is compared with the relative speed threshold ranges for pedestrians and two-wheeled vehicles. If it falls within the pedestrian relative speed threshold range, the pedestrian score of the target is incremented by one; if it falls within the two-wheeled vehicle relative speed threshold range, the two-wheeled vehicle score is incremented by one. This process is repeated to obtain the specific pedestrian and two-wheeled vehicle scores for each frame of the image to be detected.
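The per-frame comparison described above can be sketched as follows. The threshold ranges used in the example are hypothetical placeholders; in practice they would be predetermined from the actual scenario as described. The function name `score_frames` is likewise an assumption.

```python
def score_frames(relative_speeds, pedestrian_range, bike_range):
    """Return one (pedestrian_score, two_wheeled_vehicle_score) pair per frame.

    A score is 1 when the relative speed falls inside the corresponding
    threshold range (inclusive bounds) and 0 otherwise.
    """
    scores = []
    for v in relative_speeds:
        ped = 1 if pedestrian_range[0] <= v <= pedestrian_range[1] else 0
        bike = 1 if bike_range[0] <= v <= bike_range[1] else 0
        scores.append((ped, bike))
    return scores


# Hypothetical threshold ranges, for illustration only.
example = score_frames([1.0, 4.0, 9.0], pedestrian_range=(0.5, 2.0), bike_range=(3.0, 8.0))
```

A speed that falls in neither range contributes to neither score, so such a frame does not influence the vote.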
[0165] Step D2: Based on the position change information of the detected target in multiple frames of images to be detected, determine the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score of the detected target in each frame of images to be detected.
[0166] The voting weights reflect the credibility of the scores in each frame of the image to be detected. For example, as shown in Figure 6, because of the distortion of the image acquisition device, the degree of distortion is relatively small near the image acquisition device, so the information obtained from the image near the device is more reliable; the degree of distortion is greater far from the device, resulting in greater error and lower reliability.
[0167] To address this, the voting weights for pedestrian and two-wheeled vehicle scores in each frame of the image to be detected can be determined based on the positional changes of the target across multiple frames. A higher voting weight is assigned when the target moves toward the image acquisition device, and a lower weight when it moves away from it.
[0168] Step D3: Determine the probability of the first pedestrian category and the probability of the first two-wheeled vehicle category based on the pedestrian score and two-wheeled vehicle score of the detected targets in multiple frames of images to be detected, as well as the corresponding voting weights.
[0169] As an optional but non-limiting implementation, the total score for the pedestrian category and the total score for the two-wheeled vehicle category can be calculated based on the following formulas:
[0170]
[0171] Where g_person is the total score for the pedestrian category and g_bike is the total score for the two-wheeled vehicle category; j denotes the j-th round of voting, and n is the total number of voting rounds; the two remaining symbols in the formula denote, respectively, the pedestrian score and the two-wheeled vehicle score in the j-th round of voting; a is the initial weight value, and b is the magnitude of each weight adjustment. For example, a can be set to 0.35, b to 0.05, and n to 5 to calculate the scores after 5 rounds of voting.
[0172] It should be noted that the parameters a, b, and n in the above formula can be set differently according to actual needs.
[0173] After obtaining the total scores for pedestrian and two-wheeled vehicle categories, the probabilities of the first pedestrian category and the first two-wheeled vehicle category can be determined.
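The weighted totals can be illustrated with the sketch below, using the example values a = 0.35, b = 0.05 and n = 5. It is only a sketch under stated assumptions: the weight of round j is taken to be a - b * (j - 1), so that it starts at a and decreases by b each round, and the totals are turned into probabilities by normalizing them to sum to 1. Neither choice is confirmed by the formula of the embodiment, and the names `weighted_totals` and `to_probabilities` are hypothetical.

```python
def weighted_totals(scores, a=0.35, b=0.05):
    """Return (g_person, g_bike) from per-round (pedestrian, two-wheeled) scores.

    Assumption: the weight of the j-th round (1-based) is a - b * (j - 1).
    """
    g_person = 0.0
    g_bike = 0.0
    for j, (ped, bike) in enumerate(scores):
        weight = a - b * j  # j is 0-based here, so this equals a - b * (round - 1)
        g_person += weight * ped
        g_bike += weight * bike
    return g_person, g_bike


def to_probabilities(g_person, g_bike):
    """Assumed normalization of the two totals into category probabilities."""
    total = g_person + g_bike
    if total == 0:
        return 0.5, 0.5  # no round voted for either category
    return g_person / total, g_bike / total
```

With five rounds scored as pedestrian, two-wheeled, two-wheeled, two-wheeled, pedestrian, the weights under this assumption are 0.35, 0.30, 0.25, 0.20 and 0.15, giving totals of about 0.50 for pedestrians and 0.75 for two-wheeled vehicles.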
[0174] As an optional but non-limiting implementation, the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score of the detected target in each frame of the image to be detected are determined based on the positional change information of the detected target in multiple frames of the image to be detected. This may include the following steps E1-E4:
[0175] Step E1: Sequentially determine the current position information of the target in each frame of the image to be detected and the corresponding previous frame position information in the previous frame of the image to be detected.
[0176] Here, the previous frame position information refers to the position information of the target in the previous frame of the image to be detected.
[0177] Step E2: Determine the movement direction of the detected target based on the current position information and the change information of the previous frame position information; wherein, the movement direction includes facing the image acquisition device and facing away from the image acquisition device.
[0178] Specifically, as the detected target moves from the previous frame position to the current position, if the change brings it closer to the image acquisition device, the direction of movement of the detected target is towards the image acquisition device; if the change takes it farther from the image acquisition device, the direction of movement of the detected target is away from the image acquisition device.
[0179] Step E3: If the direction of movement of the target in the frame to be detected is towards the image acquisition device, then increase the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score determined in the frame to be detected.
[0180] If the direction of movement is towards the image acquisition device, the credibility of the target to be detected in this process can be considered higher. Therefore, the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score determined in the frame to be detected are increased, and the increase in voting weights can be preset.
[0181] Step E4: If the direction of movement of the target in the frame to be detected is away from the image acquisition device, then the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score determined in the frame to be detected are reduced.
[0182] If the direction of movement is away from the image acquisition device, the credibility of the target to be detected in this process can be considered to be lower. Therefore, the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score determined in the image to be detected in that frame are reduced. The reduction in voting weights can be preset.
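Steps E1-E4 can be sketched as follows. The sketch assumes that a larger image ordinate means the target is closer to the image acquisition device, that the weight starts at an initial value a, and that each frame adjusts the running weight by a preset amount b. The cumulative adjustment, the clamp at zero, and the name `voting_weights` are assumptions for illustration, not details given in the embodiment.

```python
def voting_weights(ys, a=0.35, b=0.05, toward_is_increasing_y=True):
    """Return one voting weight per frame, starting from the second frame.

    ys: ordinates of the target in consecutive frames
    a: initial weight value
    b: preset increase or decrease applied per frame
    """
    weights = []
    weight = a
    for i in range(1, len(ys)):
        dy = ys[i] - ys[i - 1]  # E1: current position versus previous frame position
        if not toward_is_increasing_y:
            dy = -dy
        if dy > 0:
            # E2/E3: moving towards the image acquisition device, raise the weight.
            weight += b
        elif dy < 0:
            # E2/E4: moving away from the device, lower the weight (never below 0).
            weight = max(0.0, weight - b)
        weights.append(weight)
    return weights
```

A frame in which the ordinate does not change leaves the weight as it was, since the description defines only the two directions of movement.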
[0183] S607. Determine the probability of the second pedestrian category based on the similarity of pedestrian categories, and determine the probability of the second two-wheeled vehicle category based on the similarity of two-wheeled vehicle categories.
[0184] Specifically, the probability of the second pedestrian category can be determined based on the pedestrian category similarity obtained in the previous scheme, and the probability of the second two-wheeled vehicle category can be determined based on the two-wheeled vehicle category similarity.
[0185] S608. Determine the target pedestrian category probability based on the first pedestrian category probability and the second pedestrian category probability.
[0186] It should be noted that, in order to make the error correction of the non-human detection results more accurate, the probability of the final judgment can be determined from two aspects: the location information of the differentiated features that can distinguish the categories, and the lens distortion of the image acquisition device. This greatly improves the accuracy and reliability of the error correction of the non-human detection results.
[0187] Specifically, the probabilities of the first pedestrian category and the second pedestrian category are added together, and the result is used as the target pedestrian category probability. Optionally, a confidence weight is determined for the first pedestrian category probability and the second pedestrian category probability, and the target pedestrian category probability is determined based on the weighted sum of the first pedestrian category probability and the second pedestrian category probability. The confidence weight can be determined based on the actual detection results.
[0188] S609. Determine the target two-wheeled vehicle category probability based on the probability of the first two-wheeled vehicle category and the probability of the second two-wheeled vehicle category.
[0189] Specifically, the probabilities of the first and second two-wheeled vehicle categories are added together, and the result is used as the target two-wheeled vehicle category probability. Optionally, a confidence weight is determined for the first and second two-wheeled vehicle category probabilities, and the target two-wheeled vehicle category probability is determined based on the weighted sum of the first and second two-wheeled vehicle category probabilities. The confidence weight can be determined based on the actual detection results.
[0190] S610. Determine the error correction result of the detected target based on the probability of the target pedestrian category and the probability of the target two-wheeled vehicle category.
[0191] Specifically, the probability of the target pedestrian category and the probability of the target two-wheeled vehicle category are compared. If the probability of the target pedestrian category is greater than the probability of the target two-wheeled vehicle category, the target is detected as a pedestrian; if the probability of the target pedestrian category is less than the probability of the target two-wheeled vehicle category, the target is detected as a two-wheeled vehicle. This achieves error correction for the detected target.
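Steps S608 to S610 can be sketched as follows. The plain sum corresponds to confidence weights of 1.0; other weights give the optional weighted sum. The function names, the category labels, and the handling of a tie (which the description does not specify) are assumptions made for this sketch.

```python
def fuse(p_first, p_second, w_first=1.0, w_second=1.0):
    """Combine the first and second category probabilities (sum or weighted sum)."""
    return w_first * p_first + w_second * p_second


def correct_category(p1_person, p2_person, p1_bike, p2_bike,
                     w_first=1.0, w_second=1.0):
    """Return the error-corrected category of the detected target."""
    target_person = fuse(p1_person, p2_person, w_first, w_second)  # S608
    target_bike = fuse(p1_bike, p2_bike, w_first, w_second)        # S609
    # S610: the larger target probability decides the category.
    if target_person > target_bike:
        return "pedestrian"
    if target_person < target_bike:
        return "two_wheeled_vehicle"
    return None  # tie: not covered by the description, left undecided here
```

For example, with first probabilities of 0.4 and 0.6 and second probabilities of 0.3 and 0.7 for pedestrian and two-wheeled vehicle respectively, the target probabilities are 0.7 and 1.3, so the target is corrected to a two-wheeled vehicle.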
[0192] The solution in this invention, for targets whose initial detection category information is pedestrian or two-wheeled vehicle, extracts target feature information at locations that better reflect the differences between the two categories by utilizing pre-determined differential feature location information. This results in the target feature information containing more obvious differential features. Simultaneously, this embodiment introduces a region distortion factor to correct the speed of the detected target and a voting mechanism with decreasing weights to calculate the error correction scores for pedestrian and two-wheeled vehicle categories. This addresses the distortion problem of the image acquisition device to some extent. Based on these two levels, the error correction of the non-human and machine detection results effectively solves the problem of misidentification between two-wheeled vehicles and pedestrians, while also greatly improving the error correction efficiency.
[0193] Figure 8 is a schematic diagram of a device for correcting the detection results of machines and non-humans provided in an embodiment of the present invention. This embodiment of the invention is applicable to situations where, after preliminary detection of the image to be detected, errors in the obtained machine and non-human detection results are corrected. This error correction device can be implemented in software and/or hardware, and is generally integrated into any electronic device with network communication capabilities, such as a mobile terminal, PC, or server. As shown in Figure 8, the device includes:
[0194] The location information determination module 801 is used to determine the detection target in the image to be detected whose initial detection category information is pedestrian or two-wheeled vehicle, and to determine the location information of the detection target;
[0195] The feature extraction information determination module 802 is used to determine the feature extraction information corresponding to the detection target from the complete feature extraction result of the image to be detected based on the location information;
[0196] The target feature information determination module 803 is used to determine target feature information based on the feature extraction information according to the pre-determined differential feature location information.
[0197] The similarity determination module 804 is used to determine the pedestrian category similarity between the target feature information and the pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and the pre-established two-wheeled vehicle feature database;
[0198] The category correction result determination module 805 is used to determine the category correction result of the detected target based on the pedestrian category similarity and the two-wheeled vehicle category similarity.
[0199] As an optional but non-limiting implementation, the differentiated feature location information is determined by the difference results of feature blocks extracted from the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix; the feature blocks are extracted from the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix through a sliding window with adaptive size; correspondingly, the specific process of determining the differentiated feature location information includes:
[0200] The difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix are initialized; wherein, the difference parameters include different types of difference parameters; the type of difference parameter corresponds to a sliding window of different size;
[0201] Feature extraction is performed on the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix using an initial-size sliding window to obtain initial pedestrian feature blocks and initial two-wheeled vehicle feature blocks, and the differences between the initial pedestrian feature blocks and the initial two-wheeled vehicle feature blocks are determined; wherein, the positions of the initial pedestrian feature blocks and the initial two-wheeled vehicle feature blocks correspond;
[0202] If the difference is greater than the difference threshold, then modify the first type of difference parameter of each feature unit in the initial pedestrian feature block and the initial two-wheeled vehicle feature block, and expand the sliding window; otherwise, continue to use the initial size sliding window for feature extraction.
[0203] The reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix are further extracted using an expanded sliding window to obtain expanded pedestrian feature blocks and expanded two-wheeled vehicle feature blocks, and the differences between the expanded pedestrian feature blocks and the expanded two-wheeled vehicle feature blocks are determined; wherein the expanded pedestrian feature blocks and the expanded two-wheeled vehicle feature blocks are located in corresponding positions;
[0204] If the difference is greater than the difference threshold, then modify the first type of difference parameter of each feature unit in the expanded pedestrian feature block and the expanded two-wheeled vehicle feature block, as well as the other type of difference parameters of the center point feature unit, and determine whether to continue to expand the sliding window according to the feature extraction requirements, and continue to perform feature extraction using the expanded sliding window, or continue to perform feature extraction using the current sliding window; otherwise, shrink the current sliding window, and continue to perform feature extraction using the shrunken sliding window.
[0205] The above process is repeated until the sliding over the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix is completed, after which the corresponding differential feature position information is determined according to the difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix.
[0206] As an optional but non-limiting implementation, the corresponding differentiated feature location information is determined based on the difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix, including:
[0207] Determine the target feature units in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix that satisfy the conditions for the first type of difference parameters;
[0208] Determine the other types of difference parameters of the target feature unit. If the other types of difference parameters meet the conditions, determine the differentiated feature location information according to the sliding window size corresponding to those difference parameters.
[0209] Otherwise, the location information of the differentiated features is determined based on the target feature unit.
[0210] As an optional but non-limiting implementation, the target feature information determination module 803 includes: a multi-scale feature information extraction unit, a first reference scale feature information determination unit, a second reference scale feature information determination unit, and a target feature information determination unit. Wherein:
[0211] A multi-scale feature information extraction unit is used to extract multi-scale feature information of the corresponding position from the feature extraction information based on the pre-determined differential feature position information;
[0212] The first reference scale feature information determination unit is used to perform sliding splitting on the feature information larger than the reference scale in the multi-scale feature information to obtain multiple reference scale feature information.
[0213] The second reference scale feature information determination unit is used to connect and then split the feature information smaller than the reference scale in the multi-scale feature information to obtain multiple reference scale feature information.
[0214] The target feature information determination unit is used to connect the reference scale feature information obtained from the multi-scale feature information to obtain the target feature information.
[0215] As an optional but non-limiting implementation, the similarity determination module 804 includes a pedestrian category similarity determination unit and a two-wheeled vehicle category similarity determination unit, wherein:
[0216] The pedestrian category similarity determination unit is used to determine the pedestrian similarity between the target feature information and each pedestrian feature information in the pedestrian feature database, and to take the maximum value of the pedestrian similarity as the pedestrian category similarity.
[0217] The two-wheeled vehicle category similarity determination unit is used to determine the two-wheeled vehicle similarity between the target feature information and each two-wheeled vehicle feature information in the two-wheeled vehicle feature database, and to take the maximum value of the two-wheeled vehicle similarity as the two-wheeled vehicle category similarity.
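The maximum-value rule used by both similarity determination units can be sketched as follows. The description does not name a similarity measure; cosine similarity is used here purely as an assumption. Feature information is represented as flat lists of numbers, and the names `cosine_similarity` and `category_similarity` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def category_similarity(target, database):
    """Maximum similarity between the target feature and every database entry."""
    return max((cosine_similarity(target, entry) for entry in database), default=0.0)


# The same function serves both categories; only the database differs.
pedestrian_db = [[0.0, 1.0], [1.0, 1.0]]
bike_db = [[2.0, 0.0]]
target = [1.0, 0.0]
pedestrian_similarity = category_similarity(target, pedestrian_db)
bike_similarity = category_similarity(target, bike_db)
```

Taking the maximum rather than the mean means a single close match in a database is enough to give that category a high similarity.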
[0218] As an optional but non-limiting implementation, the initial detection category information is determined by a pre-constructed basic detection network; the complete feature extraction result is the feature extraction result of multiple feature extraction layers in the basic detection network, and the multiple feature extraction layers include at least one shallow feature extraction layer.
[0219] As an optional but non-limiting implementation, the apparatus further includes:
[0220] A velocity correction module is used to correct the velocity of the detection target based on a predetermined regional distortion factor and the position information of the detection target, so as to obtain the relative velocity of the detection target.
[0221] The target category probability determination module is used to determine the first pedestrian category probability and the first two-wheeled vehicle category probability of the target based on a voting mechanism and the relative speed of the target in multiple frames of images to be detected.
[0222] The second category probability determination module is used to determine the second pedestrian category probability based on the pedestrian category similarity and the second two-wheeled vehicle category probability based on the two-wheeled vehicle category similarity.
[0223] The target pedestrian category probability determination module is used to determine the target pedestrian category probability based on the first pedestrian category probability and the second pedestrian category probability;
[0224] The target two-wheeled vehicle category probability determination module is used to determine the target two-wheeled vehicle category probability based on the first two-wheeled vehicle category probability and the second two-wheeled vehicle category probability.
[0225] The error correction result determination module is used to determine the error correction result of the detected target based on the target pedestrian category probability and the target two-wheeled vehicle category probability.
[0226] As an optional but non-limiting implementation, the regional distortion factor is determined based on the driving data of a reference vehicle;
[0227] The process of determining the regional distortion factor includes:
[0228] The detection location range and detection time range of the reference vehicle are determined based on the driving data of the reference vehicle.
[0229] The detection location range is divided into multiple detection areas based on the detection time range.
[0230] The regional distortion factor of the detection area is determined based on the average speed of the reference motor vehicle in each detection area.
[0231] As an optional but non-limiting implementation, the target category probability determination module includes: a score determination unit, a voting weight determination unit, and a target category probability determination unit, wherein:
[0232] The score determination unit is used to determine the pedestrian score and the two-wheeled vehicle score of the detected target in each frame of the image to be detected based on the comparison results between the relative speed of the detected target in multiple frames of the image to be detected and the pre-determined relative speed threshold range for pedestrians and relative speed threshold range for two-wheeled vehicles.
[0233] The voting weight determination unit is used to determine the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score of the detected target in each frame of the detected image based on the position change information of the detected target in the multiple frames of the image to be detected.
[0234] The target category probability determination unit is used to determine the first pedestrian category probability and the first two-wheeled vehicle category probability based on the pedestrian score and two-wheeled vehicle score of the target in the multi-frame image to be detected and the corresponding voting weight.
[0235] As an optional but non-limiting implementation, the voting weight determination unit includes: a previous frame position information determination subunit, a movement direction determination subunit, a voting weight increase subunit, and a voting weight decrease subunit. Wherein:
[0236] The previous frame position information determination subunit is used to sequentially determine the current position information of the detection target in each frame of the image to be detected and the previous frame position information in the corresponding previous frame of the image to be detected.
[0237] The movement direction determination subunit is used to determine the movement direction of the detected target based on the current position information and the change information of the previous frame position information; wherein, the movement direction includes facing the image acquisition device and facing away from the image acquisition device;
[0238] The voting weight increase subunit is used to increase the voting weight corresponding to the pedestrian score and the two-wheeled vehicle score determined in the frame of the image to be detected if the movement direction of the detected target in the frame is facing the image acquisition device.
[0239] The voting weight decrease subunit is used to reduce the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score determined in the frame of the image to be detected if the direction of movement of the detected target in the frame is away from the image acquisition device.
[0240] The error correction device for machine and non-human detection results provided in this embodiment of the invention can execute the error correction method for machine and non-human detection results provided in any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the error correction method for machine and non-human detection results.
[0241] The acquisition, storage, use, and processing of data in this invention comply with relevant national laws and regulations and do not violate public order and good morals.
[0242] According to embodiments of the present invention, the present invention also provides an electronic device, a readable storage medium, and a computer program product.
[0243] Figure 9 shows a schematic diagram of an electronic device 10 that can be used to implement embodiments of the present invention. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device can also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the invention described and/or claimed herein.
[0244] As shown in Figure 9, the electronic device 10 includes at least one processor 11 and a memory, such as a read-only memory (ROM) 12 or a random access memory (RAM) 13, communicatively connected to the at least one processor 11. The memory stores computer programs executable by the at least one processor. The processor 11 can perform various appropriate actions and processes based on the computer program stored in the ROM 12 or loaded from storage unit 18 into the RAM 13. The RAM 13 may also store various programs and data required for the operation of the electronic device 10. The processor 11, ROM 12, and RAM 13 are interconnected via a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.
[0245] Multiple components in electronic device 10 are connected to I/O interface 15, including: input unit 16, such as a keyboard, mouse, etc.; output unit 17, such as various types of displays, speakers, etc.; storage unit 18, such as a disk, optical disk, etc.; and communication unit 19, such as a network card, modem, wireless transceiver, etc. Communication unit 19 allows electronic device 10 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
[0246] Processor 11 can be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various processors running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. Processor 11 performs the various methods and processes described above, such as the error correction method for non-human detection results.
[0247] In some embodiments, the error correction method for non-human detection results can be implemented as a computer program tangibly contained in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program can be loaded and/or installed on electronic device 10 via ROM 12 and/or communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the error correction method for non-human detection results described above can be performed. Alternatively, in other embodiments, processor 11 can be configured to perform the error correction method for non-human detection results by any other suitable means (e.g., by means of firmware).
[0248] Various embodiments of the systems and techniques described above herein can be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and / or combinations thereof. These various embodiments may include implementations in one or more computer programs that can be executed and / or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transferring data and instructions to the storage system, the at least one input device, and the at least one output device.
[0249] Computer programs used to implement the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, such that when executed by the processor, the computer programs cause the functions / operations specified in the flowcharts and / or block diagrams to be performed. The computer programs may be executed entirely on a machine, partially on a machine, as a standalone software package partially on a machine and partially on a remote machine, or entirely on a remote machine or server.
[0250] In the context of this invention, a computer-readable storage medium can be a tangible medium that may contain or store a computer program for use by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable storage medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination thereof. Alternatively, a computer-readable storage medium may be a machine-readable signal medium. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.
[0251] To provide interaction with a user, the systems and techniques described herein can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the electronic device. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).
[0252] The systems and technologies described herein can be implemented in computing systems that include back-end components (e.g., as data servers), or computing systems that include middleware components (e.g., application servers), or computing systems that include front-end components (e.g., user computers with graphical user interfaces or web browsers through which users can interact with implementations of the systems and technologies described herein), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected via digital data communication of any form or medium (e.g., communication networks). Examples of communication networks include local area networks (LANs), wide area networks (WANs), blockchain networks, and the Internet.
[0253] A computing system can include clients and servers. Clients and servers are generally remote from each other and typically interact through communication networks. The client-server relationship is created by computer programs that run on the respective computers and have a client-server relationship with each other. The server can be a cloud server, also known as a cloud computing server or cloud host, which is a hosting product within the cloud computing service system that addresses the shortcomings of traditional physical hosts and VPS services, such as high management difficulty and weak business scalability.
[0254] It should be understood that the various forms of processes shown above can be used, with steps reordered, added, or deleted. For example, the steps described in this invention can be executed in parallel, sequentially, or in different orders, as long as the desired result of the technical solution of this invention can be achieved, and this is not limited herein.
[0255] The specific embodiments described above do not constitute a limitation on the scope of protection of this invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this invention should be included within the scope of protection of this invention.
Claims
1. A method for error correction of non-human detection results, characterized in that the method includes: determining a detection target in the image to be detected whose initial detection category information is pedestrian or two-wheeled vehicle, and determining the location information of the detection target; based on the location information, determining the feature extraction information corresponding to the detection target from the complete feature extraction result of the image to be detected; based on the predetermined differential feature location information, determining the target feature information according to the feature extraction information; determining the pedestrian category similarity between the target feature information and the pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and the pre-established two-wheeled vehicle feature database; and determining the category correction result of the detection target based on the pedestrian category similarity and the two-wheeled vehicle category similarity.
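The second step of claim 1 (taking the target's features from the full-image feature extraction result) can be illustrated with a minimal sketch. It assumes the feature extraction result is a 2-D grid whose cells each cover `stride` x `stride` pixels; the function name `crop_target_features`, the box format, and the stride value are hypothetical and are not taken from the claims.

```python
def crop_target_features(feature_map, box, stride):
    """Return the sub-grid of feature_map covered by a pixel-space box.

    feature_map: 2-D list (rows x cols) computed once for the whole image.
    box: (x1, y1, x2, y2) integer pixel coordinates of the detection target.
    stride: assumed number of image pixels per feature cell along each axis.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    x1, y1, x2, y2 = box
    # Floor the top-left corner and ceil the bottom-right corner so that
    # every cell touched by the box is included, then clamp to the grid.
    c1 = max(0, x1 // stride)
    r1 = max(0, y1 // stride)
    c2 = min(cols, -(-x2 // stride))  # ceiling division
    r2 = min(rows, -(-y2 // stride))
    return [row[c1:c2] for row in feature_map[r1:r2]]


# Usage: a 4 x 4 feature grid for a 32 x 32 image (stride 8).
full_features = [[r * 4 + c for c in range(4)] for r in range(4)]
target_features = crop_target_features(full_features, (8, 0, 24, 16), stride=8)
```

Reusing the full-image features in this way avoids running a second feature extractor on each cropped target.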
2. The method of claim 1, characterized in that the differential feature location information is determined from the difference results of feature blocks extracted from the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix; the feature blocks are extracted from the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix by sliding windows of adaptive size. The specific determination process is as follows: the difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix are initialized, wherein the difference parameters include different types of difference parameters, and each type of difference parameter corresponds to a sliding window of a different size; feature extraction is performed on the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix using an initial-size sliding window to obtain initial pedestrian feature blocks and initial two-wheeled vehicle feature blocks, and the differences between the initial pedestrian feature blocks and the initial two-wheeled vehicle feature blocks are determined, wherein the positions of the initial pedestrian feature blocks and the initial two-wheeled vehicle feature blocks correspond; if the difference is greater than the difference threshold, the first type of difference parameter of each feature unit in the initial pedestrian feature block and the initial two-wheeled vehicle feature block is modified and the sliding window is expanded; otherwise, feature extraction continues with the initial-size sliding window;
feature extraction is further performed on the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix using the expanded sliding window to obtain expanded pedestrian feature blocks and expanded two-wheeled vehicle feature blocks, and the differences between the expanded pedestrian feature blocks and the expanded two-wheeled vehicle feature blocks are determined, wherein the positions of the expanded pedestrian feature blocks and the expanded two-wheeled vehicle feature blocks correspond; if the difference is greater than the difference threshold, the first type of difference parameter of each feature unit in the expanded pedestrian feature block and the expanded two-wheeled vehicle feature block, as well as the other types of difference parameters of the center-point feature unit, are modified, and whether to continue expanding the sliding window is determined according to the feature extraction requirements, so that feature extraction continues with either a further expanded sliding window or the current sliding window; otherwise, the current sliding window is shrunk and feature extraction continues with the shrunken sliding window; and when the sliding over the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix is completed, the corresponding differential feature location information is determined according to the difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix.
3. The method of claim 2, characterized in that determining the corresponding differential feature location information based on the difference parameters of each feature unit in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix includes: determining the target feature units in the reference pedestrian feature matrix and the reference two-wheeled vehicle feature matrix whose first type of difference parameter satisfies the condition; determining the other types of difference parameters of the target feature unit; if the other types of difference parameters satisfy the condition, determining the differential feature location information according to the sliding window size corresponding to those difference parameters; otherwise, determining the differential feature location information based on the target feature unit.
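The window-growing idea of claims 2 and 3 can be illustrated with a deliberately simplified sketch. It is not the claimed procedure: instead of one window that expands and shrinks while it slides, it restarts from a 1 x 1 window at every feature unit and grows an odd-sized window centered on that unit while the block difference stays above the threshold. The mean absolute difference, the names `flagged` (standing in for the first type of difference parameter) and `window` (standing in for the other types, stored at the center unit), and all numeric values are assumptions made for illustration.

```python
def block_difference(a, b, top, left, size):
    """Mean absolute difference of the size x size blocks at the same position."""
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            total += abs(a[r][c] - b[r][c])
    return total / (size * size)


def scan_differences(ped, veh, threshold, max_size=3):
    """Grow a centered window at each unit while the block difference exceeds threshold."""
    rows, cols = len(ped), len(ped[0])
    flagged = [[False] * cols for _ in range(rows)]  # first-type parameter
    window = [[0] * cols for _ in range(rows)]       # largest window kept at the center unit
    for r in range(rows):
        for c in range(cols):
            size = 1
            while size <= max_size:
                top, left = r - size // 2, c - size // 2
                if top < 0 or left < 0 or top + size > rows or left + size > cols:
                    break  # window would leave the matrix
                if block_difference(ped, veh, top, left, size) <= threshold:
                    break  # difference too small: stop growing here
                for rr in range(top, top + size):
                    for cc in range(left, left + size):
                        flagged[rr][cc] = True
                if size > 1:
                    window[r][c] = size
                size += 2  # keep the window centered on (r, c)
    return flagged, window


def differential_locations(flagged, window):
    """Return (top, left, size) boxes: a window where one was recorded, else the unit itself."""
    boxes = []
    for r, row in enumerate(flagged):
        for c, is_flagged in enumerate(row):
            if not is_flagged:
                continue
            size = window[r][c]
            if size > 1:
                boxes.append((r - size // 2, c - size // 2, size))
            else:
                boxes.append((r, c, 1))
    return boxes
```

In this simplification a unit that lies inside a recorded window still yields its own 1 x 1 box, so a practical implementation would probably merge overlapping boxes.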
4. The method according to claim 2 or 3, characterized in that determining the target feature information according to the feature extraction information based on the predetermined differential feature location information includes: based on the predetermined differential feature location information, extracting multi-scale feature information at the corresponding locations from the feature extraction information; performing sliding splitting on the feature information larger than the reference scale in the multi-scale feature information to obtain multiple pieces of reference-scale feature information; connecting and then splitting the feature information smaller than the reference scale in the multi-scale feature information to obtain multiple pieces of reference-scale feature information; and concatenating the reference-scale feature information obtained from the multi-scale feature information to obtain the target feature information.
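The scale normalization in claim 4 can be sketched on flat feature vectors. This is a simplification under stated assumptions: each piece of feature information is treated as a 1-D list, the reference scale is a length `ref_len`, the sliding step `stride` is a free parameter, and any leftover values after joining the small pieces are dropped. None of these choices is specified in the claims.

```python
def split_large(feature, ref_len, stride):
    """Slide a ref_len window over a longer feature and collect each window."""
    return [feature[i:i + ref_len]
            for i in range(0, len(feature) - ref_len + 1, stride)]


def merge_small(features, ref_len):
    """Join shorter features end to end, then cut the result into ref_len pieces."""
    joined = [v for f in features for v in f]
    usable = len(joined) - len(joined) % ref_len  # assumption: drop the remainder
    return [joined[i:i + ref_len] for i in range(0, usable, ref_len)]


def build_target_feature(multi_scale, ref_len, stride):
    """Bring every feature to the reference length and concatenate the pieces."""
    pieces, small = [], []
    for feature in multi_scale:
        if len(feature) > ref_len:
            pieces.extend(split_large(feature, ref_len, stride))
        elif len(feature) == ref_len:
            pieces.append(list(feature))
        else:
            small.append(feature)
    pieces.extend(merge_small(small, ref_len))
    return [v for piece in pieces for v in piece]


# Usage: one large, two small, and one reference-scale feature.
target = build_target_feature(
    [[1, 2, 3, 4, 5, 6], [7, 8], [9, 10], [11, 12, 13, 14]], ref_len=4, stride=2
)
```

Because every piece ends up at the same length, the concatenated vector has a predictable layout, which is what makes a later comparison against fixed-format database entries possible.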
5. The method according to claim 1, characterized in that determining the pedestrian category similarity between the target feature information and a pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and a pre-established two-wheeled vehicle feature database, includes: determining the pedestrian similarity between the target feature information and each piece of pedestrian feature information in the pedestrian feature database, and taking the maximum pedestrian similarity as the pedestrian category similarity; and determining the two-wheeled vehicle similarity between the target feature information and each piece of two-wheeled vehicle feature information in the two-wheeled vehicle feature database, and taking the maximum two-wheeled vehicle similarity as the two-wheeled vehicle category similarity.
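The maximum-similarity lookup of claim 5 can be sketched as follows. The claims do not name a similarity measure, so cosine similarity is used here purely as an assumption; the function names, the label strings, and the tie-breaking rule in `correct_category` are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def category_similarity(target, database):
    """Largest similarity between the target and any entry of a feature database."""
    if not database:
        raise ValueError("feature database is empty")
    return max(cosine_similarity(target, entry) for entry in database)


def correct_category(target, pedestrian_db, vehicle_db):
    """Pick the category whose database contains the closest match (ties go to pedestrian)."""
    ped = category_similarity(target, pedestrian_db)
    veh = category_similarity(target, vehicle_db)
    label = "pedestrian" if ped >= veh else "two_wheeled_vehicle"
    return label, ped, veh


# Usage with toy 2-D features.
label, ped_sim, veh_sim = correct_category(
    [1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]], [[1.0, 0.1]]
)
```

Taking the maximum rather than the mean means a single close match in a database is enough to support that category, so the result depends on how representative the database entries are.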
6. The method according to claim 1, characterized in that, before determining the category correction result of the detected target based on the pedestrian category similarity and the two-wheeled vehicle category similarity, the method further includes: correcting the speed of the detected target based on a predetermined regional distortion factor and the location information of the detected target to obtain the relative speed of the detected target; based on a voting mechanism, determining the first pedestrian category probability and the first two-wheeled vehicle category probability of the detected target according to the relative speed of the detected target in multiple frames of images to be detected; determining a second pedestrian category probability based on the pedestrian category similarity, and a second two-wheeled vehicle category probability based on the two-wheeled vehicle category similarity; determining the target pedestrian category probability based on the first pedestrian category probability and the second pedestrian category probability; determining the target two-wheeled vehicle category probability based on the first two-wheeled vehicle category probability and the second two-wheeled vehicle category probability; and determining the error correction result of the detected target based on the target pedestrian category probability and the target two-wheeled vehicle category probability.
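The probability fusion in claim 6 can be sketched as follows. The claims do not say how similarities become probabilities or how the two probabilities are combined, so this sketch assumes a simple normalization of the two similarities and a weighted average controlled by a hypothetical `motion_weight` parameter.

```python
def similarity_to_probabilities(ped_sim, veh_sim):
    """Normalize two similarities into probabilities that sum to 1."""
    ped = max(ped_sim, 0.0)  # negative similarities are treated as no evidence
    veh = max(veh_sim, 0.0)
    total = ped + veh
    if total == 0.0:
        return 0.5, 0.5
    return ped / total, veh / total


def fuse_and_decide(first_ped, first_veh, ped_sim, veh_sim, motion_weight=0.5):
    """Blend motion-based and appearance-based probabilities, then pick a label.

    first_ped, first_veh: probabilities from the speed-based vote.
    ped_sim, veh_sim: category similarities from the feature databases.
    motion_weight: assumed share given to the speed-based probabilities.
    """
    second_ped, second_veh = similarity_to_probabilities(ped_sim, veh_sim)
    target_ped = motion_weight * first_ped + (1.0 - motion_weight) * second_ped
    target_veh = motion_weight * first_veh + (1.0 - motion_weight) * second_veh
    label = "pedestrian" if target_ped >= target_veh else "two_wheeled_vehicle"
    return label, target_ped, target_veh


# Usage: appearance slightly favors pedestrian, motion strongly favors vehicle.
result = fuse_and_decide(0.2, 0.8, ped_sim=0.6, veh_sim=0.4)
```

In this example the motion evidence outweighs the appearance evidence, which matches the low-light case from the background section where appearance alone tends to mislabel a rider as a pedestrian.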
7. The method according to claim 6, characterized in that the regional distortion factor is determined based on the driving data of a reference motor vehicle, and the process of determining the regional distortion factor includes: determining the detection location range and the detection time range of the reference motor vehicle based on the driving data of the reference motor vehicle; dividing the detection location range into multiple detection areas based on the detection time range; and determining the regional distortion factor of each detection area based on the average speed of the reference motor vehicle in that detection area.
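One possible reading of claim 7, together with the speed correction of claim 6, is sketched below. It assumes the reference motor vehicle drives at a constant real-world speed, that its track is a time-sorted list of `(time, y_pixel)` samples with strictly increasing `y_pixel`, that the time range is cut into equal slices, and that the factor of an area is its average pixel speed divided by the overall average pixel speed. All of these choices, and every name in the code, are illustrative assumptions.

```python
from bisect import bisect_right


def position_at(track, t):
    """Linearly interpolate the pixel position of the reference vehicle at time t."""
    for (ta, ya), (tb, yb) in zip(track, track[1:]):
        if ta <= t <= tb:
            if tb == ta:
                return ya
            return ya + (yb - ya) * (t - ta) / (tb - ta)
    raise ValueError("t is outside the track's time range")


def build_distortion_factors(track, n_areas):
    """Split the time range into equal slices and derive one factor per area."""
    t_start, t_end = track[0][0], track[-1][0]
    dt = (t_end - t_start) / n_areas
    # Area boundaries are the positions reached at each time-slice boundary.
    bounds = [position_at(track, t_start + i * dt) for i in range(n_areas)]
    bounds.append(track[-1][1])
    overall_speed = (bounds[-1] - bounds[0]) / (t_end - t_start)
    factors = [((bounds[i + 1] - bounds[i]) / dt) / overall_speed
               for i in range(n_areas)]
    return bounds, factors


def relative_speed(pixel_speed, y, bounds, factors):
    """Divide a measured pixel speed by the factor of the area containing y."""
    idx = bisect_right(bounds, y) - 1
    idx = min(max(idx, 0), len(factors) - 1)  # clamp positions outside the range
    return pixel_speed / factors[idx]


# Usage: the vehicle appears three times faster (in pixels) in the second area.
bounds, factors = build_distortion_factors([(0.0, 0.0), (1.0, 10.0), (2.0, 40.0)], 2)
```

With these factors, a target measured at 5 pixels per second in the first area and one measured at 15 pixels per second in the second area both map to the same relative speed, which is the effect the correction is meant to achieve.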
8. The method according to claim 6 or 7, characterized in that determining, based on a voting mechanism, the first pedestrian category probability and the first two-wheeled vehicle category probability of the detected target according to the relative speed of the detected target in multiple frames of images to be detected includes: determining the pedestrian score and the two-wheeled vehicle score of the detected target in each frame of the images to be detected, based on the results of comparing the relative speed of the detected target in the multiple frames of images to be detected with the predetermined relative speed threshold ranges for pedestrians and two-wheeled vehicles; determining, based on the position change information of the detected target in the multiple frames of images to be detected, the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score of the detected target in each frame of the images to be detected; and determining the first pedestrian category probability and the first two-wheeled vehicle category probability based on the pedestrian scores and two-wheeled vehicle scores of the detected target in the multiple frames of images to be detected and the corresponding voting weights.
9. The method according to claim 8, characterized in that determining, based on the position change information of the detected target in the multiple frames of images to be detected, the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score of the detected target in each frame of the images to be detected includes: sequentially determining the current position information of the detected target in each frame of the images to be detected and the previous-frame position information in the corresponding previous frame; determining the movement direction of the detected target based on the change between the current position information and the previous-frame position information, wherein the movement direction is either toward the image acquisition device or away from the image acquisition device; if the movement direction of the detected target in a frame is toward the image acquisition device, increasing the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score determined in that frame; and if the movement direction of the detected target in a frame is away from the image acquisition device, reducing the voting weights corresponding to the pedestrian score and the two-wheeled vehicle score determined in that frame.
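The weighted vote of claims 8 and 9 can be sketched as follows. The speed ranges, the base weight, the weight adjustment `delta`, and the rule that a growing vertical pixel coordinate means motion toward the image acquisition device are all hypothetical values chosen for illustration; the claims only state that the weight is increased for motion toward the device and reduced for motion away from it.

```python
def vote_by_speed(frames, ped_range=(0.0, 2.5), veh_range=(2.5, 15.0),
                  base_weight=1.0, delta=0.5):
    """Return (pedestrian_probability, vehicle_probability) from per-frame votes.

    frames: list of (relative_speed, y_pixel) tuples in time order.
    ped_range, veh_range: assumed half-open speed ranges [low, high).
    """
    ped_total = 0.0
    veh_total = 0.0
    prev_y = None
    for speed, y in frames:
        # Per-frame scores from the speed range comparison.
        ped_score = 1.0 if ped_range[0] <= speed < ped_range[1] else 0.0
        veh_score = 1.0 if veh_range[0] <= speed < veh_range[1] else 0.0
        # Per-frame weight from the direction of movement.
        weight = base_weight
        if prev_y is not None:
            if y > prev_y:      # assumed: moving toward the camera
                weight += delta
            elif y < prev_y:    # assumed: moving away from the camera
                weight -= delta
        prev_y = y
        ped_total += weight * ped_score
        veh_total += weight * veh_score
    total = ped_total + veh_total
    if total == 0.0:
        return 0.5, 0.5  # no frame fell in either range
    return ped_total / total, veh_total / total


# Usage: two fast frames approaching the camera outweigh two slow frames.
first_ped, first_veh = vote_by_speed([(1.0, 100), (4.0, 110), (5.0, 120), (1.5, 115)])
```

The first frame has no predecessor, so it keeps the base weight; how the first frame is handled is another assumption of this sketch.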
10. A device for error correction of non-human detection results, characterized in that the device includes: a location information determination module, used to determine the detection target in the image to be detected whose initial detection category information is pedestrian or two-wheeled vehicle, and to determine the location information of the detection target; a feature extraction information determination module, used to determine the feature extraction information corresponding to the detection target from the complete feature extraction result of the image to be detected based on the location information; a target feature information determination module, used to determine the target feature information according to the feature extraction information based on the predetermined differential feature location information; a similarity determination module, used to determine the pedestrian category similarity between the target feature information and the pre-established pedestrian feature database, and the two-wheeled vehicle category similarity between the target feature information and the pre-established two-wheeled vehicle feature database; and a category correction result determination module, used to determine the category correction result of the detection target based on the pedestrian category similarity and the two-wheeled vehicle category similarity.