A method and apparatus for identifying a type of vehicle stop

By constructing the temporal features of the target vehicle and using an attention mechanism to fuse information from multiple time points, the problems of poor generalization and insufficient joint prediction in vehicle parking type identification in existing technologies are solved, thereby improving identification accuracy and system efficiency.

CN122223950A · Pending · Publication Date: 2026-06-16 · BEIJING VOYAGER TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING VOYAGER TECH CO LTD
Filing Date
2024-12-13
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing technologies, especially for identifying vehicle parking types, suffer from poor generalization, inability to jointly predict multiple target vehicles, and lack of temporal attention interaction, which affects the intelligence and efficiency of the perception and control systems of autonomous vehicles.

Method used

By determining the historical pose of the target vehicle at multiple historical moments and the pose at the target moment, the first temporal feature of the target vehicle is constructed. The position and attribute information at different moments are fused using an attention mechanism, and combined with map features and traffic participant information, the parking type of the vehicle is identified.

Benefits of technology

It improves the accuracy of vehicle parking type recognition, reduces computational load and cache pressure, and enhances recognition capabilities in complex environments.

✦ Generated by Eureka AI based on patent content.

Abstract

According to an embodiment of the present disclosure, a method and device for identifying a parking type of a vehicle are provided. The method comprises: determining a plurality of historical poses of a target vehicle in a traffic scene at a plurality of historical time instants; determining reference information based on the plurality of historical poses and a target pose of the target vehicle at a target time instant, the reference information indicating at least a positional relationship of the target pose relative to the plurality of historical poses; constructing a first time sequence feature of the target vehicle based on the reference information and a set of object attributes of the target vehicle at the target time instant and the plurality of historical time instants; and determining a parking type of the target vehicle based on at least the first time sequence feature of the target vehicle, the parking type being associated with an expected parking time of the target vehicle. In this way, embodiments of the present disclosure can improve the accuracy of identifying the parking type of the vehicle.

Description

Technical Field

[0001] The exemplary embodiments disclosed herein generally relate to the field of computers, and particularly to a method, apparatus, device, computer-readable storage medium, and computer program product for identifying vehicle parking types.

Background Technology

[0002] With the rapid development of transportation systems, vehicle recognition and motion detection are widely used in areas such as road traffic safety and autonomous driving. Long-term parking recognition is a binary classification task that determines, based on information from traffic participants and the road, whether a target vehicle intends to park for an extended period or to move away. It provides a crucial reference for downstream decision-making and path planning, for example when deciding whether to keep the current lane or change lanes. The ability of an autonomous vehicle to identify long-term parking accurately and promptly directly determines the intelligence and efficiency of its perception and control system.

Summary of the Invention

[0003] In a first aspect of this disclosure, a method for identifying vehicle parking types is provided. The method includes: determining multiple historical poses of a target vehicle in a traffic scene at multiple historical moments; determining reference information based on the multiple historical poses and a target pose of the target vehicle at a target moment, the reference information indicating at least the positional relationship of the target pose relative to the multiple historical poses; constructing a first temporal feature of the target vehicle based on the reference information and a set of object attributes of the target vehicle at the target moment and the multiple historical moments; and determining the parking type of the target vehicle based at least on the first temporal feature of the target vehicle, the parking type being associated with the expected parking time of the target vehicle.

[0004] In a second aspect of this disclosure, an apparatus for identifying vehicle parking types is provided. The apparatus includes: a first determining module configured to determine multiple historical poses of a target vehicle in a traffic scene at multiple historical moments; a second determining module configured to determine reference information based on the multiple historical poses and a target pose of the target vehicle at a target moment, the reference information indicating at least the positional relationship of the target pose relative to the multiple historical poses; a constructing module configured to construct a first temporal feature of the target vehicle based on the reference information and a set of object attributes of the target vehicle at the target moment and the multiple historical moments; and a third determining module configured to determine the parking type of the target vehicle based at least on the first temporal feature of the target vehicle, the parking type being associated with the expected parking time of the target vehicle.

[0005] In a third aspect of this disclosure, an electronic device is provided. The device includes at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit. When executed by the at least one processing unit, the instructions cause the device to perform the method of the first aspect.

[0006] In a fourth aspect of this disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program that can be executed by a processor to implement the method of the first aspect.

[0007] In a fifth aspect of this disclosure, a computer program product is provided. The computer program product includes computer-executable instructions that, when executed by a processor, implement the method of the first aspect.

[0008] It should be understood that the content described in this summary section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.

Attached Figure Description

[0009] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. In the drawings, the same or similar reference numerals denote the same or similar elements, wherein:

[0010] Figure 1 shows a schematic diagram of an identification system in which embodiments of the present disclosure can be implemented;

[0011] Figure 2 shows a flowchart of a process for identifying vehicle parking types according to some embodiments of the present disclosure;

[0012] Figures 3A to 3B show schematic diagrams of the process for determining a second temporal feature according to some embodiments of the present disclosure;

[0013] Figure 4 shows a schematic structural block diagram of an example apparatus for identifying vehicle parking types according to some embodiments of the present disclosure; and

[0014] Figure 5 shows a block diagram of a device capable of implementing several embodiments of the present disclosure.

Detailed Implementation

[0015] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0016] It should be noted that the headings of any section / subsection provided herein are not limiting. Various embodiments are described throughout this document, and embodiments of any type may be included under any section / subsection. Furthermore, embodiments described in any section / subsection may be combined in any way with any other embodiments described in the same section / subsection and / or different sections / subsections.

[0017] In the description of embodiments of this disclosure, the term "comprising" and similar terms should be understood as open-ended inclusion, i.e., "including but not limited to". The term "based on" should be understood as "at least partially based on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The term "some embodiments" should be understood as "at least some embodiments". The terms "first", "second", etc., may refer to different or the same objects. Other explicit and implicit definitions may also be included below.

[0018] The embodiments of this disclosure may involve user data, data acquisition, and / or use. All of these aspects comply with applicable laws, regulations, and relevant provisions. In the embodiments of this disclosure, all data collection, acquisition, processing, manipulation, forwarding, and use are conducted with the user's knowledge and confirmation. Accordingly, in implementing the embodiments of this disclosure, the type, scope of use, and usage scenarios of any data or information that may be involved should be communicated to the user and their authorization obtained in accordance with relevant laws and regulations through appropriate means. The specific methods of notification and / or authorization may vary depending on the actual situation and application scenario, and the scope of this disclosure is not limited in this respect.

[0019] In this specification and the embodiments, any processing of personal information will be carried out only under the premise of legality (such as obtaining the consent of the personal information subject, or being necessary for the performance of a contract), and will only be carried out within the scope stipulated or agreed upon. A user's refusal to process personal information other than that necessary for basic functions will not affect the user's use of basic functions.

[0020] As briefly mentioned earlier, with the rapid development of transportation systems, vehicle recognition and motion detection are widely used in areas such as road traffic safety and autonomous driving. Long-term parking recognition is a binary classification task that determines, based on information from traffic participants and the road, whether a target vehicle intends to park for an extended period or to move away. It provides a crucial reference for downstream decision-making and path planning when determining whether to keep the current lane or change lanes. The ability of an autonomous vehicle to identify long-term parking accurately and promptly directly determines the intelligence and efficiency of its perception and control system.

[0021] Traditional long-term parking recognition solutions include the XGBoost (eXtreme Gradient Boosting) ensemble learning algorithm and the VectorNet algorithm. On the one hand, the XGBoost ensemble learning algorithm has a weak understanding of complex environments and generalizes poorly. On the other hand, the VectorNet algorithm requires normalized encoding of positions, cannot jointly predict multiple target vehicles, lacks temporal attention interaction, and has a relatively simple spatial attention mechanism.

[0022] Embodiments of this disclosure propose a scheme for identifying vehicle parking types. According to various embodiments of this disclosure, multiple historical poses of a target vehicle in a traffic scene at multiple historical moments are determined; based on the multiple historical poses and the target vehicle's target pose at a target moment, reference information is determined, the reference information at least indicating the positional relationship of the target pose relative to the multiple historical poses; based on the reference information, a set of object attributes of the target vehicle at the target moment and at multiple historical moments, a first temporal feature of the target vehicle is constructed; and based at least on the first temporal feature of the target vehicle, the parking type of the target vehicle is determined, the parking type being associated with the target vehicle's expected parking time.

[0023] In this way, embodiments of the present disclosure can improve the accuracy of identifying vehicle parking types (e.g., long-term parking or short-term parking).

[0024] Example Environment

[0025] Figure 1 shows a schematic diagram of an identification system 100 in which embodiments of the present disclosure can be implemented. The identification system 100 can be used to identify whether a vehicle is parked for an extended period of time (i.e., whether the vehicle remains parked for a predetermined period of time).

[0026] As shown in Figure 1, the identification system 100 may include a first encoder 120, a second encoder 130, a third encoder 140, and a fourth encoder 150. The first encoder 120 may be any suitable temporal encoder, and the second encoder 130, the third encoder 140, and the fourth encoder 150 may each be any suitable spatial encoder.

[0027] In Figure 1, the identification system 100 can obtain traffic participant information 110 and map information 115 within a preset area from the vectorized scene 105. The vectorized scene 105 can be a vectorized representation constructed based on perception information and / or map data (e.g., high-precision map data).

[0028] The identification system 100 can process the traffic participant information 110 using the first encoder 120 to obtain a first temporal feature. After obtaining the first temporal feature, the identification system 100 can determine the parking type of the target vehicle based at least on the first temporal feature.

[0029] The specific process for determining the parking type is described in detail below with reference to Figure 2.

[0030] In some implementations, the identification system 100 can be deployed in servers or autonomous vehicles. Such servers can be, for example, standalone physical servers, server clusters or distributed systems composed of multiple physical servers, or cloud servers providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks, and big data and artificial intelligence platforms. Electronic devices can include, for example, computing systems / servers, such as mainframes, edge computing nodes, computing devices in cloud environments, and so on.

[0031] In some implementations, the vehicle can be any type of vehicle capable of carrying people and / or goods and moving via a power system such as an engine, including but not limited to cars, trucks, buses, electric vehicles, RVs, etc. The vehicle can also be an autonomous vehicle (also known as a self-driving vehicle) that integrates functions such as environmental perception, planning and decision-making, and multi-level assisted driving.

[0032] It should be understood that the structure and function of the identification system 100 are described for illustrative purposes only and do not imply any limitation on the scope of this disclosure.

[0033] Example process

[0034] Figure 2 shows a flowchart of an example process 200 for identifying vehicle parking types according to some embodiments of the present disclosure. Process 200 can be implemented at the identification system 100. Process 200 is described below with reference to Figure 1.

[0035] As shown in Figure 2, at box 210, the recognition system 100 determines multiple historical poses of the target vehicle in the traffic scene at multiple historical moments.

[0036] As an example, as shown in Figure 1, the recognition system 100 can acquire multiple historical poses of a target vehicle at multiple historical moments from a traffic scene. The traffic scene can be a vectorized scene 105, which can be determined based on perception information collected by a preset acquisition device. The acquisition device can be, for example, an acquisition vehicle used for traffic scene acquisition. The pose indicates the position and orientation of the target vehicle's own coordinate system.

[0037] To reduce cache pressure on the recognition system 100 when it is deployed online and to mitigate overfitting during vehicle parking type recognition, in some embodiments, the recognition system 100 can determine multiple historical poses of the target vehicle at multiple historical moments based on a second upper limit. The number of the multiple historical poses is less than the second upper limit.

[0038] As an example, the recognition system 100 can keep the number of historical poses acquired during vehicle parking type recognition below the second upper limit to alleviate cache pressure and overfitting.
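The bounded history described in paragraphs [0037] and [0038] can be illustrated with a fixed-size buffer. The following is a minimal sketch, not the implementation in the disclosure: the `Pose` and `PoseHistory` names, the field layout, and the choice to evict the oldest pose first are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, NamedTuple


class Pose(NamedTuple):
    """Hypothetical pose record: timestamp, position, heading."""
    t: float      # timestamp in seconds
    x: float
    y: float
    theta: float  # heading in radians


class PoseHistory:
    """Keeps only the most recent poses so the cache stays below the limit."""

    def __init__(self, second_upper_limit: int) -> None:
        if second_upper_limit < 2:
            raise ValueError("second_upper_limit must be at least 2")
        # The text requires the count to be *less than* the upper limit,
        # so the buffer holds at most second_upper_limit - 1 poses.
        self._poses: Deque[Pose] = deque(maxlen=second_upper_limit - 1)

    def push(self, pose: Pose) -> None:
        # deque(maxlen=...) evicts the oldest pose automatically (assumed policy).
        self._poses.append(pose)

    def poses(self) -> List[Pose]:
        return list(self._poses)


history = PoseHistory(second_upper_limit=4)
for i in range(10):
    history.push(Pose(float(i), 0.0, 0.0, 0.0))
```

With an upper limit of 4, the buffer above never holds more than three poses, regardless of how many frames have been observed.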

[0039] Continuing with Figure 2, at box 220, the recognition system 100 determines reference information based on the multiple historical poses and the target pose of the target vehicle at the target time. The reference information indicates at least the positional relationship of the target pose relative to the multiple historical poses. In this way, the information interaction of the target vehicle across different times can be enhanced.

[0040] As an example, as shown in Figure 1, the recognition system 100 can determine the positional relationship between the multiple historical poses and the target pose based on the poses of the target vehicle at the multiple historical moments and the pose at the target moment. Such positional relationships can be, for example, the distance, direction, orientation, etc., between the target pose and the multiple historical poses. The first encoder 120 can be, for example, any suitable temporal encoder.

[0041] In some embodiments, the identification system 100 may determine relative distance information corresponding to a set of historical positions based on a set of historical positions indicated by multiple historical poses and a target position indicated by a target pose, as at least part of the reference information.

[0042] For example, the target vehicle is at a first position at a first moment and at a second position at a second moment. The recognition system 100 can determine the relative distance information between the first and second positions based on the first and second positions, and use this relative distance information as part of the reference information. The relative distance information can be calculated based on the following formula:

[0043] $\lVert p_s - p_t \rVert_2$ (1)

[0044] Where p represents the location feature, and s and t correspond to different timestamps.

[0045] In some embodiments, the identification system 100 may determine relative direction information corresponding to a set of historical locations and a target location as at least part of the reference information.

[0046] As an example, the recognition system 100 can determine a set of angular offsets (i.e., relative direction information) of a set of historical positions relative to the target position, based on the target position and a set of historical positions. The relative direction information can be calculated based on the following formula:

[0047]

[0048] Where $x_t$ and $y_t$ represent the coordinates of the target vehicle at time t, $x_s$ and $y_s$ represent the coordinates of the target vehicle at time s, and $\theta$ represents the orientation of the target vehicle.

[0049] In some embodiments, the identification system 100 may determine relative orientation information corresponding to a set of historical poses, based on the orientation information indicated by the set of historical poses and the target orientation indicated by the target pose, as at least part of the reference information.

[0050] As an example, the recognition system 100 can calculate the difference between the target orientation and a set of reference orientations corresponding to a set of historical poses to determine the relative orientation information corresponding to a set of historical poses. The relative orientation information can be calculated based on the following formula:

[0051] $\theta_s - \theta_t$ (3)

[0052] In some embodiments, the reference information further includes time interval information between the target time and multiple historical times. As an example, the identification system 100 can calculate the differences between the target time and the multiple historical times separately to determine the time interval information between the target time and the multiple historical times.
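The four kinds of reference information described above (relative distance, relative direction, relative orientation, and time interval) can be computed together for each historical pose. The sketch below is illustrative only. Formula (2) is not reproduced in the text, so the `atan2`-based angular offset is an assumed, commonly used form rather than the formula in the disclosure; wrapping angles to [-pi, pi) is likewise an assumption, and the `Pose` type and function names are hypothetical.

```python
import math
from typing import Dict, List, NamedTuple


class Pose(NamedTuple):
    """Hypothetical pose record: timestamp, position, heading."""
    t: float
    x: float
    y: float
    theta: float


def wrap_angle(a: float) -> float:
    """Wrap an angle to the interval [-pi, pi) (an added convention)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def reference_info(history: List[Pose], target: Pose) -> List[Dict[str, float]]:
    """One record per historical pose s, expressed relative to the target pose t."""
    rows = []
    for s in history:
        dx, dy = s.x - target.x, s.y - target.y
        rows.append({
            # Formula (1): Euclidean distance between p_s and p_t.
            "distance": math.hypot(dx, dy),
            # ASSUMED form of the relative direction: bearing of p_s seen
            # from p_t, measured against the target heading.
            "direction": wrap_angle(math.atan2(dy, dx) - target.theta),
            # Formula (3): theta_s - theta_t (wrapping is an assumption).
            "orientation": wrap_angle(s.theta - target.theta),
            # Time interval between the target moment and the historical moment.
            "interval": target.t - s.t,
        })
    return rows


rows = reference_info(
    history=[Pose(t=0.0, x=3.0, y=4.0, theta=math.pi / 2)],
    target=Pose(t=1.0, x=0.0, y=0.0, theta=0.0),
)
```

Because every quantity is relative to the target pose, the resulting features do not depend on where the scene sits in a global coordinate frame.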

[0053] Continuing with Figure 2, at box 230, the recognition system 100 constructs the first temporal feature of the target vehicle based on the reference information and a set of object attributes of the target vehicle at the target time and the multiple historical times. Such a set of object attributes may be, for example, the category, size, taillight status, etc. of the target vehicle at the target time and the multiple historical times.

[0054] As an example, as shown in Figure 1, the recognition system 100 can construct the input sequence of the first encoder 120 based on the reference information and a set of object attributes of the target vehicle at the target time and the multiple historical times. Further, the recognition system 100 can input the input sequence into the first encoder 120 to construct the first temporal feature 125 of the target vehicle.

[0055] In some embodiments, the recognition system 100 may construct a first temporal feature using an attention mechanism in the first encoder 120. Specifically, the recognition system 100 may generate initial temporal features based on a set of object attributes. Further, the recognition system 100 may utilize the attention mechanism to update the initial temporal features based on reference features corresponding to reference information, in order to construct the first temporal feature of the target vehicle.

[0056] As an example, after determining the reference information, the recognition system 100 can convert the reference information into Fourier features and generate relative position features (i.e., reference features) through a multilayer perceptron. For a set of object attributes at the target time and multiple historical time points, the recognition system 100 can also generate a set of object attribute features (i.e., initial temporal features) through Fourier transform and multilayer perceptron.
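The Fourier-feature and multilayer-perceptron step in paragraph [0056] can be sketched with NumPy as follows. The disclosure does not specify the frequency bands, layer sizes, or activation, so the power-of-two frequencies, the ReLU, the dimensions, and the random weights below are all assumptions used purely to show the data flow.

```python
import numpy as np


def fourier_features(x: np.ndarray, num_bands: int = 4) -> np.ndarray:
    """Map each scalar to sin/cos responses at frequencies 1, 2, 4, ... (assumed)."""
    freqs = 2.0 ** np.arange(num_bands)               # (num_bands,)
    angles = 2.0 * np.pi * x[..., None] * freqs       # (..., dims, num_bands)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(*x.shape[:-1], -1)           # (..., dims * 2 * num_bands)


def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with a ReLU in between (assumed architecture)."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


rng = np.random.default_rng(0)
# 5 historical moments x (distance, direction, orientation, interval).
reference_info = rng.normal(size=(5, 4))
embedded = fourier_features(reference_info)           # shape (5, 32)
w1, b1 = rng.normal(size=(32, 16)), np.zeros(16)      # untrained placeholder weights
w2, b2 = rng.normal(size=(16, 8)), np.zeros(8)
reference_features = mlp(embedded, w1, b1, w2, b2)    # shape (5, 8)
```

The same two functions could be applied to the object attributes to obtain the initial temporal features, so that both inputs to the attention step share one embedding dimension.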

[0057] Furthermore, the recognition system 100 can construct a first temporal feature by updating a set of object attribute features (i.e., initial temporal features) based on the attention mechanism and relative position features (i.e., reference features).

[0058] The specific process of constructing the first temporal feature using the attention mechanism is as follows: The recognition system 100 can perform a linear mapping between the relative position feature (i.e., reference feature) and a set of object attribute features (i.e., initial temporal feature) to determine a set of reference vectors. A set of reference vectors includes a query vector corresponding to the target time, multiple key vectors corresponding to the target time and multiple historical time points, and multiple value vectors. A set of reference vectors can be obtained based on the following formula:

[0059]

[0060] Where $r_{t \to s}$ represents the relative position feature (i.e., the reference feature); $z_t$ and $z_s$ represent the object attribute features (i.e., the initial temporal features) at timestamps t and s; the weight matrices $W$ represent the learnable parameters of the linear mapping; $q_t$ represents the query vector, $k_{ts}$ represents the key vector, and $v_{ts}$ represents the value vector.

[0061] After determining a set of reference vectors, the recognition system 100 can use the query vector and key vector to calculate the dot product and normalize it to obtain the attention weights. The attention weights can be obtained based on the following formula:

[0062]

[0063] Where T represents the multiple historical moments, $d_k$ represents the dimension of the vector, and $\alpha_t$ represents the attention weight.

[0064] Furthermore, the recognition system 100 can use attention weights to perform a weighted summation of the value vectors to obtain a context representation. The context representation can be obtained based on the following formula:

[0065] $m_t = \sum_{s \in T} \alpha_t v_{ts}$ (6)

[0066] Furthermore, the recognition system 100 can transform the concatenation of object attribute features and context representation at the target time using a linear mapping. Further, the recognition system 100 can obtain a gating signal using the sigmoid function. The gating signal can be obtained based on the following formula:

[0067] $g_t = \mathrm{sigmoid}(W^{\text{gate}} [z_t, m_t])$ (7)

[0068] Finally, the recognition system 100 can construct a first temporal feature based on the gating signal, context representation, and object attribute features at the target time. The first temporal feature can be obtained based on the following formula:

[0069] $z'_t = g_t \odot W^{\text{self}} z_t + (1 - g_t) \odot m_t$ (8)

[0070] Where $m_t$ represents the context representation, $g_t$ represents the gating signal, $z'_t$ represents the first temporal feature, and $W^{\text{gate}}$ and $W^{\text{self}}$ are the learnable parameters of the linear mappings. Through the attention and gating mechanisms, relative position and attribute information from different time points in the input data is effectively fused. This helps the recognition system 100 capture more relevant information when handling complex scenes and improves recognition accuracy.
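The attention and gating steps described in paragraphs [0058] to [0070] can be traced end to end in a short NumPy sketch. It follows formulas (6) to (8) as written. The formulas for the query, key, and value vectors and for the attention weights are not reproduced in the text, so the sketch assumes that keys and values are linear maps of the concatenation of $z_s$ and $r_{t \to s}$ and that the weights are a scaled dot-product softmax. The parameter names, dimensions, and random weights are hypothetical.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def gated_temporal_attention(z, r, params):
    """z: (T, d) attribute features, last row is the target moment t.
    r: (T, d) relative position features r_{t->s}, one row per moment s."""
    z_t = z[-1]
    d_k = params["W_q"].shape[0]
    q_t = params["W_q"] @ z_t                       # query for the target moment
    zr = np.concatenate([z, r], axis=-1)            # ASSUMED: concat of z_s and r_{t->s}
    k_ts = zr @ params["W_k"].T                     # keys, shape (T, d_k)
    v_ts = zr @ params["W_v"].T                     # values, shape (T, d)
    # ASSUMED: scaled dot product followed by softmax over the moments s.
    alpha = softmax(k_ts @ q_t / np.sqrt(d_k))
    m_t = alpha @ v_ts                              # formula (6): weighted sum of values
    g_t = sigmoid(params["W_gate"] @ np.concatenate([z_t, m_t]))    # formula (7)
    z_new = g_t * (params["W_self"] @ z_t) + (1.0 - g_t) * m_t      # formula (8)
    return z_new, alpha


rng = np.random.default_rng(0)
T, d = 6, 8
params = {
    "W_q": rng.normal(size=(d, d)),
    "W_k": rng.normal(size=(d, 2 * d)),
    "W_v": rng.normal(size=(d, 2 * d)),
    "W_gate": rng.normal(size=(d, 2 * d)),
    "W_self": rng.normal(size=(d, d)),
}
z = rng.normal(size=(T, d))
r = rng.normal(size=(T, d))
first_temporal_feature, weights = gated_temporal_attention(z, r, params)
```

The gate in formula (8) lets each feature dimension decide how much to keep from the target moment's own attributes and how much to take from the attended history.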

[0071] Continuing with Figure 2, at box 240, the identification system 100 determines the parking type of the target vehicle based at least on the first temporal feature of the target vehicle, the parking type being associated with the expected parking time of the target vehicle. Such parking types include long-term parking, short-term parking, and moving. The long-term parking type indicates that the target vehicle is parked for a duration exceeding a preset time; the short-term parking type indicates that the target vehicle is parked for a duration not exceeding the preset time; and the moving type indicates that the target vehicle is in a moving state.

[0072] As an example, as shown in Figure 1, the recognition system 100 can call the fourth encoder 150 to process the first temporal feature 125 to obtain the recognition result 160 (i.e., the parking type of the target vehicle).
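The three parking types defined above can be written down as a small rule, for instance when labeling observed vehicle tracks. This is only a sketch of the category definitions, not of the learned classifier in the disclosure; the enum, function name, and the 60-second threshold in the usage lines are hypothetical.

```python
from enum import Enum


class ParkingType(Enum):
    LONG = "long"
    SHORT = "short"
    MOVING = "moving"


def label_parking_type(is_moving: bool, parked_seconds: float,
                       preset_seconds: float) -> ParkingType:
    """Apply the category definitions from the text to one observation."""
    if is_moving:
        return ParkingType.MOVING
    # "Exceeding a preset time" maps to long; "not exceeding" maps to short.
    if parked_seconds > preset_seconds:
        return ParkingType.LONG
    return ParkingType.SHORT


# Hypothetical threshold of 60 seconds, chosen only for the example.
example_long = label_parking_type(False, parked_seconds=300.0, preset_seconds=60.0)
example_short = label_parking_type(False, parked_seconds=60.0, preset_seconds=60.0)
example_moving = label_parking_type(True, parked_seconds=0.0, preset_seconds=60.0)
```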

[0073] In some embodiments, the identification system 100 may update a first temporal feature based on the distance between the target vehicle and a group of traffic participants. Specifically, the identification system 100 may identify a group of traffic participants associated with the target vehicle. The distance from the group of traffic participants to the target vehicle is less than a first distance. Such a group of traffic participants may be, for example, other vehicles in the traffic scene that are distinct from the target vehicle, pedestrians, bicycles, motorcycles, etc., in the traffic scene.

[0074] As an example, Figure 3A shows a target vehicle 310, a first traffic participant 320, and a second traffic participant 330. The identification system 100 can identify the first traffic participant 320, whose distance from the target vehicle 310 is less than a first distance. The first distance can be set as needed by those skilled in the art, and this disclosure does not limit it.

[0075] After identifying a group of traffic participants associated with the target vehicle, the identification system 100 can determine a first set of distances from the group of traffic participants to the target vehicle. For example, as shown in Figure 3A, the identification system 100 can determine the distance between the target vehicle 310 and the first traffic participant 320 (i.e., the first set of distances).

[0076] Furthermore, the identification system 100 can update the first temporal features based on a first set of distances and a set of reference temporal features associated with a set of traffic participants to determine the second temporal features. The set of reference temporal features is determined based on a local attention mechanism associated with a set of traffic participants.

[0077] As an example, as shown in Figure 1, the recognition system 100 can input the first temporal feature and a set of reference temporal features into the second encoder 130 to obtain the second temporal feature 170.

[0078] Additionally or alternatively, as shown in Figure 3A, the identification system 100 can also identify a second traffic participant 330 whose distance from the first traffic participant 320 is less than the first distance. The first temporal feature of the target vehicle 310 is, for example, a; the first reference temporal feature of the first traffic participant 320 is, for example, b; and the second reference temporal feature of the second traffic participant 330 is, for example, c.

[0079] The recognition system 100 can use a first-round local attention mechanism to enable the first temporal feature to interact with the first reference temporal feature, the first reference temporal feature to interact with the first temporal feature and the second reference temporal feature, and the second reference temporal feature to interact with the first reference temporal feature. Through the processing of the first round of local attention mechanism, the first temporal feature is updated to a+b, the first reference temporal feature is updated to a+b+c, and the second reference temporal feature is updated to b+c.

[0080] Furthermore, as shown in Figure 3B, the recognition system 100 can use a second round of the local attention mechanism to update the first temporal feature again, based on the updated first temporal feature and the updated first reference temporal feature, to obtain the second temporal feature. The second temporal feature is a+b+c.
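The two rounds of local interaction in Figures 3A and 3B can be traced with a toy example. In the sketch below, Python sets stand in for feature vectors so that "a+b" becomes the set {a, b}; real local attention would produce learned weighted combinations instead. The positions, the first distance of 5.0, and all function names are assumptions chosen to reproduce the a, b, c example from the text.

```python
import math
from typing import Dict, Set, Tuple


def local_neighbors(positions: Dict[str, Tuple[float, float]],
                    first_distance: float) -> Dict[str, Set[str]]:
    """Each agent's neighbors are the agents closer than first_distance."""
    nbrs: Dict[str, Set[str]] = {name: set() for name in positions}
    for i, (xi, yi) in positions.items():
        for j, (xj, yj) in positions.items():
            if i != j and math.hypot(xi - xj, yi - yj) < first_distance:
                nbrs[i].add(j)
    return nbrs


def propagate(info: Dict[str, Set[str]],
              nbrs: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """One round: every agent merges the information held by its neighbors."""
    return {i: info[i].union(*(info[j] for j in nbrs[i])) for i in info}


# Hypothetical layout: participant 330 is near 320 but not near target 310.
positions = {"target_310": (0.0, 0.0), "p_320": (4.0, 0.0), "p_330": (8.0, 0.0)}
nbrs = local_neighbors(positions, first_distance=5.0)
info = {"target_310": {"a"}, "p_320": {"b"}, "p_330": {"c"}}

round1 = propagate(info, nbrs)    # target: a+b, 320: a+b+c, 330: b+c
round2 = propagate(round1, nbrs)  # target: a+b+c
```

After the second round the target vehicle holds information from participant 330 even though the two never interact directly, which is how stacking local rounds can avoid attending over every pair of agents.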

[0081] This method can effectively reduce the computational load of vehicle parking type identification.

[0082] Finally, the identification system 100 can determine the parking type of the target vehicle based on the second temporal features.

[0083] In some embodiments, the identification system 100 can also determine the parking type of the target vehicle based on a first temporal feature and map features. Specifically, the identification system 100 can acquire map information associated with the traffic scene. As an example, the identification system 100 can acquire map information corresponding to the target area from a high-precision map based on the target area corresponding to the traffic scene.

[0084] Furthermore, the recognition system 100 can generate map features associated with the target vehicle based on the map information. For example, as shown in Figure 1, the recognition system 100 can input map information 115 into the third encoder 140 to obtain map features 175.

[0085] Finally, the identification system 100 can determine the parking type of the target vehicle based on the first temporal feature and the map features. As an example, as shown in Figure 1, the recognition system 100 can input the first temporal feature 125 and the map features 175 into the fourth encoder 150 to obtain the recognition result 160 (i.e., the parking type of the target vehicle).

[0086] In some scenarios, the recognition system 100 can also use the fourth encoder 150 to process the second temporal feature 170 and the map features 175 to obtain the recognition result 160.

[0087] In some embodiments, the identification system 100 may determine multiple map elements associated with a target vehicle based on map information. The distances between the multiple map elements and the target vehicle are less than a second distance. Such map elements include traffic lights, traffic signs, lane lines, and road surface features.

[0088] Furthermore, the recognition system 100 can construct the map features based on a second set of distances from the multiple map elements to the target vehicle and on the element features of the multiple map elements.

[0089] As an example, as shown in Figure 1, the recognition system 100 can call the third encoder 140 to process the second set of distances from the multiple map elements to the target vehicle together with the element features of the multiple map elements, constructing the map features through the local attention mechanism in the third encoder 140.

[0090] To reduce cache pressure on the recognition system 100 when it is deployed online, and to mitigate overfitting during vehicle parking type recognition, in some embodiments the recognition system 100 can determine the multiple map elements associated with the target vehicle based on the map information and a first upper limit. The number of the multiple map elements is less than the first upper limit.

[0091] As an example, the recognition system 100 can keep the number of map elements processed during vehicle parking type recognition below the first upper limit to alleviate cache pressure and overfitting.
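The selection of map elements by the second distance and the first upper limit can be sketched as follows. The `MapElement` type, the function name, and the choice to keep the nearest elements when the limit is exceeded are assumptions made for illustration; the disclosure does not state which elements are retained.

```python
import math
from dataclasses import dataclass


@dataclass
class MapElement:
    kind: str  # e.g. "traffic_light", "traffic_sign", "lane_line"
    x: float
    y: float


def select_map_elements(elements, vehicle_xy, second_distance, first_upper_limit):
    """Keep elements closer than second_distance, nearest first, fewer than the limit."""
    vx, vy = vehicle_xy
    scored = [(math.hypot(e.x - vx, e.y - vy), e) for e in elements]
    nearby = [(d, e) for d, e in scored if d < second_distance]
    nearby.sort(key=lambda pair: pair[0])
    # "Less than the first upper limit" means at most first_upper_limit - 1 elements.
    # Assumption: when too many elements qualify, the nearest ones are kept.
    return nearby[: max(first_upper_limit - 1, 0)]


elements = [
    MapElement("traffic_light", 3.0, 4.0),
    MapElement("lane_line", 1.0, 0.0),
    MapElement("traffic_sign", 30.0, 40.0),
    MapElement("road_surface", 6.0, 8.0),
]
selected = select_map_elements(
    elements, vehicle_xy=(0.0, 0.0), second_distance=20.0, first_upper_limit=3
)
```

The returned distances correspond to the second set of distances that is passed, together with the element features, to the map encoder.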

[0092] In this way, embodiments of the present disclosure can improve the accuracy of identifying vehicle parking types (e.g., long-term parking or short-term parking).

[0093] Example devices and equipment

[0094] Figure 4 shows a schematic structural block diagram of a device 400 for identifying vehicle parking type according to certain embodiments of the present disclosure. Device 400 may be implemented as, or included in, example environment 100. The various modules/components in device 400 may be implemented in hardware, software, firmware, or any combination thereof.

[0095] As shown in the figure, the device 400 includes: a first determining module 410 configured to determine multiple historical poses of a target vehicle in a traffic scene at multiple historical moments; a second determining module 420 configured to determine reference information based on the multiple historical poses and the target pose of the target vehicle at a target moment, wherein the reference information at least indicates the positional relationship of the target pose relative to the multiple historical poses; a constructing module 430 configured to construct a first temporal feature of the target vehicle based on the reference information, a set of object attributes of the target vehicle at the target moment and the multiple historical moments; and a third determining module 440 configured to determine the parking type of the target vehicle based at least on the first temporal feature of the target vehicle, wherein the parking type is associated with the expected parking time of the target vehicle.

[0096] In some embodiments, the third determining module 440 is further configured to: determine a group of traffic participants associated with the target vehicle; determine a first set of distances from the group of traffic participants to the target vehicle; update the first temporal feature based on the first set of distances and a set of reference temporal features associated with the group of traffic participants to determine a second temporal feature; and determine the parking type of the target vehicle based at least on the second temporal feature.

[0097] In some embodiments, the distance from a group of traffic participants to the target vehicle is less than a first distance.

[0098] In some embodiments, a set of reference temporal features associated with a group of traffic participants is determined based on a local attention mechanism associated with the group of traffic participants.

[0099] In some embodiments, the third determining module 440 is further configured to: acquire map information associated with a traffic scenario; generate map features associated with a target vehicle based on the map information; and determine the parking type of the target vehicle based on the first temporal features and the map features.

[0100] In some embodiments, the third determining module 440 is further configured to: determine multiple map elements associated with the target vehicle based on map information, wherein the distance between the multiple map elements and the target vehicle is less than a second distance; and construct map features based on a second set of distances from the multiple map elements to the target vehicle and element features of the multiple map elements.

[0101] In some embodiments, the third determining module 440 is further configured to: determine multiple map elements associated with the target vehicle based on map information and a first upper limit, wherein the number of the multiple map elements is less than the first upper limit.

[0102] In some embodiments, the first determining module 410 is further configured to: determine multiple historical poses of the target vehicle at multiple historical moments based on a second upper limit, wherein the number of multiple historical moments is less than the second upper limit.
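A bounded history buffer is one simple way to keep the number of historical moments below the second upper limit. This is a sketch under assumptions: the disclosure does not say which moments are kept, so the example retains the most recent ones, and the pose tuple layout and function name are hypothetical.

```python
from collections import deque


def make_pose_history(second_upper_limit):
    """Buffer that holds fewer than second_upper_limit historical poses."""
    if second_upper_limit < 2:
        raise ValueError("second_upper_limit must be at least 2 to keep any history")
    # deque(maxlen=n) silently drops the oldest entry once n entries are stored.
    return deque(maxlen=second_upper_limit - 1)


history = make_pose_history(second_upper_limit=4)
for t in range(6):
    # Hypothetical pose tuple: (timestamp, x, y, heading)
    history.append((float(t), 1.0 * t, 0.0, 0.0))
```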

[0103] In some embodiments, the second determining module 420 is further configured to: determine, based on a set of historical positions indicated by the multiple historical poses and a target position indicated by the target pose, relative distance information corresponding to the set of historical positions as at least part of the reference information; determine, based on the set of historical positions and the target position, relative direction information corresponding to the set of historical positions as at least part of the reference information; and determine, based on the relative direction information and the target orientation indicated by the target pose, relative orientation information corresponding to the set of historical poses as at least part of the reference information.

[0104] In some embodiments, the reference information may also include time interval information between the target time and multiple historical times.
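The reference information described above (relative distance, relative direction, relative orientation, and time interval) can be sketched as follows. The `Pose` type, the field names, and the exact formulas are assumptions: in particular, the relative orientation is taken here to be the relative direction expressed in the heading frame of the target pose, which is only one reading of the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    t: float        # timestamp in seconds
    x: float
    y: float
    heading: float  # radians


def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def reference_info(history, target):
    """Reference information for each historical pose, relative to the target pose."""
    rows = []
    for past in history:
        dx, dy = past.x - target.x, past.y - target.y
        direction = math.atan2(dy, dx)  # bearing of the past position from the target position
        rows.append({
            "relative_distance": math.hypot(dx, dy),
            "relative_direction": direction,
            # Assumed definition: the bearing expressed in the target's heading frame.
            "relative_orientation": wrap_angle(direction - target.heading),
            "time_interval": target.t - past.t,
        })
    return rows


target = Pose(t=2.0, x=0.0, y=0.0, heading=math.pi / 2)
history = [
    Pose(t=0.0, x=3.0, y=0.0, heading=math.pi / 2),
    Pose(t=1.0, x=3.0, y=-4.0, heading=math.pi / 2),
]
info = reference_info(history, target)
```

Expressing each historical pose relative to the target pose, rather than in map coordinates, makes the resulting features independent of where in the map the vehicle happens to be.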

[0105] In some embodiments, the construction module 430 is further configured to: generate initial temporal features based on the set of object attributes; and update the initial temporal features based on the reference features corresponding to the reference information using an attention mechanism to construct the first temporal features of the target vehicle.
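As a rough illustration of updating the initial temporal features with an attention mechanism, the sketch below applies single-head scaled dot-product attention across time steps. It is not the disclosed network: there are no learned projections, and the way reference features enter the keys, the residual update, and all function names are assumptions made so that the example stays self-contained.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights


def first_temporal_feature(initial_features, reference_features):
    """Update the target-time feature by attending over all time steps.

    Assumptions: keys are the initial features plus the reference features
    (element-wise), values are the initial features, and the last row
    corresponds to the target time.
    """
    keys = [[a + r for a, r in zip(f, ref)]
            for f, ref in zip(initial_features, reference_features)]
    query = initial_features[-1]
    context, weights = attend(query, keys, initial_features)
    # Residual update of the target-time feature.
    updated = [q + c for q, c in zip(query, context)]
    return updated, weights


initial = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reference = [[0.5, 0.0], [0.0, 0.5], [0.0, 0.0]]
feature, weights = first_temporal_feature(initial, reference)
```

In a trained model the query, key, and value vectors would come from learned projections; the sketch only shows how information from several time points is fused into a single feature for the target time.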

[0106] Figure 5 shows a block diagram of a computing device 500 in which one or more embodiments of the present disclosure may be implemented. It should be understood that the computing device 500 shown in Figure 5 is merely exemplary and should not be construed as limiting the functionality and scope of the embodiments described herein. The computing device 500 shown in Figure 5 can be used to implement the example recognition system 100 of Figure 1.

[0107] As shown in Figure 5, computing device 500 is in the form of a general-purpose computing device. Components of computing device 500 may include, but are not limited to, one or more processors or processing units 510, memory 520, storage devices 530, one or more communication units 540, one or more input devices 550, and one or more output devices 560. Processing unit 510 may be a physical or virtual processor and is capable of performing various processes according to programs stored in memory 520. In a multiprocessor system, multiple processing units execute computer-executable instructions in parallel to improve the parallel processing capability of computing device 500.

[0108] Computing device 500 typically includes multiple computer storage media. Such media can be any available media accessible to computing device 500, including but not limited to volatile and non-volatile media and removable and non-removable media. Memory 520 can be volatile memory (e.g., registers, cache, random access memory (RAM)), non-volatile memory (e.g., read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof. Storage device 530 can be removable or non-removable media and can include machine-readable media, such as flash drives, disks, or any other media that can be used to store information and/or data (e.g., training data for training) and that can be accessed within computing device 500.

[0109] The computing device 500 may further include additional removable/non-removable, volatile/non-volatile storage media. Although not shown in Figure 5, a disk drive for reading from or writing to a removable, non-volatile disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, non-volatile optical disk can be provided. In these cases, each drive can be connected to a bus (not shown) via one or more data media interfaces. Memory 520 may include computer program product 525 having one or more program modules configured to perform various methods or actions of various embodiments of this disclosure.

[0110] The communication unit 540 enables communication with other computing devices via a communication medium. Additionally, the functionality of the components of the computing device 500 can be implemented as a single computing cluster or multiple computing machines capable of communicating via communication connections. Therefore, the computing device 500 can operate in a networked environment using logical connections to one or more other servers, networked personal computers (PCs), or another network node.

[0111] Input device 550 can be one or more input devices, such as a mouse, keyboard, trackball, etc. Output device 560 can be one or more output devices, such as a monitor, speaker, printer, etc. Computing device 500 can also communicate, as needed via communication unit 540, with one or more external devices (not shown) such as storage devices and display devices, with one or more devices that enable a user to interact with computing device 500, or with any device (e.g., a network card or modem) that enables computing device 500 to communicate with one or more other computing devices. Such communication can be performed via input/output (I/O) interfaces (not shown).

[0112] According to an exemplary implementation of this disclosure, a computer-readable storage medium is provided that stores computer-executable instructions thereon, wherein the computer-executable instructions are executed by a processor to implement the methods described above. According to an exemplary implementation of this disclosure, a computer program product is also provided, which is tangibly stored on a non-transitory computer-readable medium and includes computer-executable instructions, which are executed by a processor to implement the methods described above.

[0113] Various aspects of this disclosure are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatuses, devices, and computer program products implemented according to this disclosure. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.

[0114] These computer-readable program instructions can be provided to a general-purpose computer, a special-purpose computer, or other programmable vehicle parking type identification device processing unit to produce a machine such that, when executed by the computer or other programmable vehicle parking type identification device processing unit, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable vehicle parking type identification device, and / or other equipment to operate in a particular manner. Thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0115] Computer-readable program instructions can be loaded onto a computer, other programmable vehicle parking type identification device, or other equipment to cause a series of operational steps to be performed on the computer, other programmable vehicle parking type identification device, or other equipment to produce a computer-implemented process, so that the instructions executed on the computer, other programmable vehicle parking type identification device, or other equipment implement the functions/actions specified in one or more blocks of a flowchart and/or block diagram.

[0116] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of an instruction, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0117] Various implementations of this disclosure have been described above. These descriptions are exemplary, not exhaustive, and are not limited to the disclosed implementations. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described implementations. The terminology used herein is chosen to best explain the principles of the implementations, their practical applications, or their improvements over technology in the market, or to enable others skilled in the art to understand the various implementations disclosed herein.

Claims

1. A method for identifying a vehicle parking type, comprising: determining multiple historical poses of a target vehicle in a traffic scene at multiple historical moments; determining reference information based on the multiple historical poses and a target pose of the target vehicle at a target moment, wherein the reference information at least indicates a positional relationship of the target pose relative to the multiple historical poses; constructing a first temporal feature of the target vehicle based on the reference information and a set of object attributes of the target vehicle at the target moment and the multiple historical moments; and determining a parking type of the target vehicle based at least on the first temporal feature of the target vehicle, wherein the parking type is associated with an expected parking time of the target vehicle.

2. The method of claim 1, wherein determining the parking type of the target vehicle based at least on the first temporal feature of the target vehicle comprises: determining a group of traffic participants associated with the target vehicle; determining a first set of distances from the group of traffic participants to the target vehicle; updating the first temporal feature based on the first set of distances and a set of reference temporal features associated with the group of traffic participants to determine a second temporal feature; and determining the parking type of the target vehicle based at least on the second temporal feature.

3. The method according to claim 2, wherein the distance from the group of traffic participants to the target vehicle is less than a first distance.

4. The method of claim 2, wherein a set of reference temporal features associated with the group of traffic participants is determined based on a local attention mechanism associated with the group of traffic participants.

5. The method of claim 1, wherein determining the parking type of the target vehicle based at least on the first temporal feature of the target vehicle comprises: obtaining map information associated with the traffic scene; generating map features associated with the target vehicle based on the map information; and determining the parking type of the target vehicle based on the first temporal feature and the map features.

6. The method of claim 5, wherein generating map features associated with the target vehicle based on the map information comprises: determining, based on the map information, multiple map elements associated with the target vehicle, wherein the distance between the multiple map elements and the target vehicle is less than a second distance; and constructing the map features based on a second set of distances from the multiple map elements to the target vehicle and element features of the multiple map elements.

7. The method of claim 6, wherein determining the multiple map elements associated with the target vehicle based on the map information comprises: determining the multiple map elements associated with the target vehicle based on the map information and a first upper limit, wherein the number of the multiple map elements is less than the first upper limit.

8. The method according to claim 1, wherein determining the multiple historical poses of the target vehicle in the traffic scene at the multiple historical moments comprises: determining the multiple historical poses of the target vehicle at the multiple historical moments based on a second upper limit, wherein the number of the multiple historical moments is less than the second upper limit.

9. The method according to claim 1, wherein determining the reference information based on the multiple historical poses and the target pose of the target vehicle at the target moment includes at least one of the following: determining, based on a set of historical positions indicated by the multiple historical poses and a target position indicated by the target pose, relative distance information corresponding to the set of historical positions as at least part of the reference information; determining, based on the set of historical positions and the target position, relative direction information corresponding to the set of historical positions as at least part of the reference information; and determining, based on the relative direction information and a target orientation indicated by the target pose, relative orientation information corresponding to the multiple historical poses as at least part of the reference information.

10. The method according to claim 1, wherein the reference information further includes time interval information between the target moment and the multiple historical moments.

11. The method according to claim 1, wherein constructing the first temporal feature of the target vehicle based on the reference information and the set of object attributes of the target vehicle at the target moment and the multiple historical moments comprises: generating initial temporal features based on the set of object attributes; and updating, using an attention mechanism, the initial temporal features based on reference features corresponding to the reference information to construct the first temporal feature of the target vehicle.

12. A device for identifying a vehicle parking type, comprising: a first determining module configured to determine multiple historical poses of a target vehicle in a traffic scene at multiple historical moments; a second determining module configured to determine reference information based on the multiple historical poses and a target pose of the target vehicle at a target moment, wherein the reference information at least indicates a positional relationship of the target pose relative to the multiple historical poses; a constructing module configured to construct a first temporal feature of the target vehicle based on the reference information and a set of object attributes of the target vehicle at the target moment and the multiple historical moments; and a third determining module configured to determine a parking type of the target vehicle based at least on the first temporal feature of the target vehicle, wherein the parking type is associated with an expected parking time of the target vehicle.

13. An electronic device, comprising: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, wherein the instructions, when executed by the at least one processing unit, cause the electronic device to perform the method according to any one of claims 1 to 11.

14. A computer-readable storage medium having a computer program stored thereon, the computer program being executable by a processor to implement the method according to any one of claims 1 to 11.

15. A computer program product comprising computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, implement the method according to any one of claims 1 to 11.