Image sensor with on-sensor and dedicated object recognition
By integrating an image sensor, an ADC, and an ASIC into a single sensor device and applying a trained machine learning model for object recognition, the latency and uneven resource allocation of existing computer vision pipelines are addressed, enabling more efficient object detection and real-time processing for autonomous vehicles.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- WAYMO LLC
- Filing Date
- 2025-12-12
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, computer vision tasks are typically executed by centralized general-purpose processors, which leads to increased latency in object detection data streams and uneven resource allocation, making it difficult to meet the real-time processing needs of autonomous or semi-autonomous vehicles.
The computer vision task is offloaded to a dedicated processor directly coupled to the sensor, and image data is processed in real time using image sensors, analog-to-digital converters (ADCs), and application-specific integrated circuits (ASICs). Trained machine learning models are then applied for object recognition.
By performing computation at the sensor level, data transmission latency and communication overhead are reduced, improving the efficiency and accuracy of object detection and freeing up central processing system resources for other tasks.
Figure CN122244408A_ABST
Abstract
Description
Technical Field
[0001] The example embodiments relate to image sensors with on-sensor and dedicated object recognition.
Background Technology
[0002] Unless otherwise stated herein, the descriptions in this section are not prior art to the claims of this application and are not considered prior art by virtue of their inclusion in this section.
[0003] Computer vision tasks, particularly those related to the operation of autonomous or semi-autonomous vehicles, are typically computationally intensive. In this specific context, computer systems often process high frame rate data from cameras and other sensors to identify objects in the vehicle's surrounding environment, such as other vehicles, traffic signals and signs, pedestrians, cyclists, and debris on the road.
Summary of the Invention
[0004] This disclosure relates to image sensors with on-sensor and dedicated object recognition capabilities. Compared to alternative methods, the example embodiments described herein allow computer vision tasks to be offloaded from a centralized general-purpose processor to a dedicated processor directly coupled to the sensor. This allows for greater customization and configuration of the processor for the task associated with each sensor. For example, the dedicated processor can apply trained machine learning models to image data received from the sensor to identify objects for autonomous or semi-autonomous vehicle navigation.
[0005] In one aspect, a system is provided. The system includes a vehicle, a first device attached to the vehicle and along a first orientation relative to the vehicle, and a second device attached to the vehicle and along a second orientation relative to the vehicle. The first and second orientations are different. The first device includes a first image sensor configured to capture first image data about a surrounding environment, a first analog-to-digital converter (ADC) configured to receive the captured first image data from the first image sensor and provide converted first image data, and a first application-specific integrated circuit (ASIC). The first ASIC is configured to receive the converted first image data from the first ADC and apply a first trained machine learning model to the converted first image data to identify one or more first objects in the surrounding environment within the converted first image data. The first trained machine learning model is selected from a plurality of machine learning models based on the first orientation. The second device includes a second image sensor configured to capture second image data about a surrounding environment, a second ADC configured to receive the captured second image data from the second image sensor and provide converted second image data, and a second ASIC. The second ASIC is configured to receive the converted second image data from the second ADC and apply a second trained machine learning model to the converted second image data to identify one or more second objects in the surrounding environment within the converted second image data. The second trained machine learning model is selected from the plurality of machine learning models based on the second orientation.
[0006] In another aspect, an apparatus is provided. The apparatus includes an image sensor configured to capture image data about a surrounding environment, an analog-to-digital converter (ADC) configured to receive the captured image data from the image sensor and provide converted image data, and an application-specific integrated circuit (ASIC). The ASIC is configured to receive the converted image data from the ADC, apply a trained machine learning model to the converted image data to identify one or more objects in the surrounding environment within the converted image data, and output an image frame. At least one line of the image frame includes metadata, which includes object classification data and object location data of one or more identified objects in the surrounding environment.
[0007] In another aspect, a method is provided. The method includes capturing image data about a surrounding environment by an image sensor. The method also includes receiving the captured image data from the image sensor by an analog-to-digital converter (ADC). The method further includes providing the converted image data to an application-specific integrated circuit (ASIC) by the ADC. The method also includes applying a trained machine learning model to the converted image data by the ASIC to identify one or more objects in the surrounding environment within the converted image data. The method further includes outputting an image frame by the ASIC. At least one line of the image frame includes metadata including object classification data and object location data of one or more identified objects in the surrounding environment.
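As an illustration of an image frame in which at least one line carries metadata, the following is a minimal sketch, not the encoding used by the described device. The record layout (a one-byte count followed by five little-endian 16-bit fields per object), the `Detection` type, and both function names are assumptions introduced only for this example.

```python
import struct
from dataclasses import dataclass

# Hypothetical record layout (not from the source): class id followed by a
# bounding box (x, y, width, height), each an unsigned 16-bit little-endian value.
_RECORD = struct.Struct("<5H")


@dataclass(frozen=True)
class Detection:
    class_id: int  # object classification data
    x: int         # object location data: bounding box in pixels
    y: int
    w: int
    h: int


def append_metadata_line(rows, detections):
    """Return a new frame whose final line carries the detections."""
    if len(detections) > 255:
        raise ValueError("count must fit in one byte")
    width = len(rows[0])
    payload = bytes([len(detections)])  # one-byte detection count
    for d in detections:
        payload += _RECORD.pack(d.class_id, d.x, d.y, d.w, d.h)
    if len(payload) > width:
        raise ValueError("metadata does not fit in one image line")
    # Zero-pad so the metadata line is as wide as the pixel lines.
    return list(rows) + [payload.ljust(width, b"\x00")]


def read_metadata_line(rows):
    """Decode the detections stored in the final line of the frame."""
    line = rows[-1]
    count = line[0]
    return [
        Detection(*_RECORD.unpack_from(line, 1 + i * _RECORD.size))
        for i in range(count)
    ]


# Usage: a 4-line frame of 8-bit pixels, 32 bytes per line.
frame = [bytes(32) for _ in range(4)]
detections = [Detection(class_id=1, x=10, y=20, w=30, h=40)]
tagged = append_metadata_line(frame, detections)
```

A receiver that knows this layout can recover the classification and location data from the last line without running any detection itself.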
[0008] These and other aspects, advantages, and alternatives will become clear to those skilled in the art upon reading the following detailed description and, where appropriate, referring to the accompanying drawings.
Attached Figure Description
[0009] Figure 1 is a functional block diagram illustrating a vehicle according to an example embodiment.
[0010] Figure 2A is a diagram illustrating the physical configuration of a vehicle according to an example embodiment.
[0011] Figure 2B is a diagram illustrating the physical configuration of a vehicle according to an example embodiment.
[0012] Figure 2C is a diagram illustrating the physical configuration of a vehicle according to an example embodiment.
[0013] Figure 2D is a diagram illustrating the physical configuration of a vehicle according to an example embodiment.
[0014] Figure 2E is a diagram illustrating the physical configuration of a vehicle according to an example embodiment.
[0015] Figure 2F is a diagram illustrating the physical configuration of a vehicle according to an example embodiment.
[0016] Figure 2G is a diagram illustrating the physical configuration of a vehicle according to an example embodiment.
[0017] Figure 2H is a diagram illustrating the physical configuration of a vehicle according to an example embodiment.
[0018] Figure 2I is a diagram illustrating the physical configuration of a vehicle according to an example embodiment.
[0019] Figure 2J is a diagram illustrating the field of view of various sensors according to an example embodiment.
[0020] Figure 2K is an illustration of beam steering for a sensor according to an example embodiment.
[0021] Figure 3 is a conceptual illustration of wireless communication between various computing systems associated with autonomous or semi-autonomous vehicles, according to an example embodiment.
[0022] Figure 4A is a block diagram of a system including a lidar device according to an example embodiment.
[0023] Figure 4B is a block diagram of a lidar device according to an example embodiment.
[0024] Figure 5A is a block diagram of a device according to an example embodiment.
[0025] Figure 5B is an illustration of a multi-layer die stack according to an example embodiment.
[0026] Figure 5C is an illustration of a vehicle with equipment according to an example embodiment.
[0027] Figure 6A is a block diagram of a process according to an example embodiment.
[0028] Figure 6B is a block diagram of an image frame according to an example embodiment.
[0029] Figure 7A is an illustration of an image frame according to an example embodiment.
[0030] Figure 7B is an illustration of an image frame including metadata according to an example embodiment.
[0031] Figure 7C is an illustration of an image frame according to an example embodiment.
[0032] Figure 7D is an illustration of an image frame including metadata according to an example embodiment.
[0033] Figure 8 is a flowchart illustration of a method according to an example embodiment.
[0034] Figure 9 is a flowchart illustration of a method according to an example embodiment.
Detailed Implementation
[0035] Exemplary methods and systems are contemplated herein. Any exemplary embodiments or features described herein are not necessarily to be construed as preferred or advantageous over other embodiments or features. Furthermore, the exemplary embodiments described herein are not intended to be limiting. It will be readily understood that certain aspects of the disclosed systems and methods can be arranged and combined in a wide variety of different configurations, all of which are contemplated herein. Additionally, the specific arrangements shown in the figures should not be considered limiting. It should be understood that other embodiments may include more or fewer of each element shown in the given figures. Furthermore, some of the elements shown may be combined or omitted. Moreover, exemplary embodiments may include elements not shown in the figures.
[0036] Referring to two compared elements, features, etc., as “identical” can mean that they are “substantially identical.” Therefore, the phrase “substantially identical” can include cases with a deviation that is considered low in the art, such as 5% or less.
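The 5% example can be made concrete with a small comparison helper. This is only a sketch: the text does not say which value the deviation is measured against, so measuring it relative to the larger magnitude is an assumption, as is the function name.

```python
def substantially_identical(a: float, b: float, tolerance: float = 0.05) -> bool:
    """Return True when a and b deviate by at most `tolerance`.

    Assumption: the deviation is measured relative to the larger of the two
    magnitudes. The 0.05 default mirrors the "5% or less" example.
    """
    if a == b:
        return True
    return abs(a - b) <= tolerance * max(abs(a), abs(b))


# Usage: 96 is within 5% of 100, while 90 is not.
close_enough = substantially_identical(100.0, 96.0)
too_far = substantially_identical(100.0, 90.0)
```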
[0037] The lidar device described herein may include one or more light emitters and one or more detectors for detecting light emitted by the one or more light emitters and reflected by one or more objects in the environment surrounding the lidar device. As an example, the surrounding environment may include an internal or external environment, such as the interior or exterior of a building. Additionally or alternatively, the surrounding environment may include the interior of a vehicle. Further, the surrounding environment may include the area on and / or around a road. Examples of objects in the surrounding environment include, but are not limited to, other vehicles, traffic signs, pedestrians, cyclists, road surfaces, buildings, and terrain. Additionally, the one or more light emitters may emit light into the local environment of the lidar itself. For example, light emitted from the one or more light emitters may interact with the lidar housing and / or surfaces or structures coupled to the lidar. In some cases, the lidar may be mounted to a vehicle, in which case the one or more light emitters may be configured to emit light that interacts with objects near the vehicle. Furthermore, the light emitters may include fiber optic amplifiers, laser diodes, light-emitting diodes (LEDs), etc.
[0038] As described above, in some embodiments, the specific task can vary based on the sensor device's location and / or orientation on the autonomous or semi-autonomous vehicle, and such determination can occur automatically. For example, the sensor device can determine that it is in a forward position on the autonomous vehicle and automatically load an appropriate machine learning model. This provides the flexibility to adapt the sensor to different tasks and allows for streamlined updates of the machine learning model based on the preferences of the developer and / or deployer.
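Selecting a model from the device's orientation could look like the following minimal sketch. The four 90-degree sectors, the yaw convention, and the model identifiers are all hypothetical. The text only states that a model is chosen from a plurality of models based on location and / or orientation.

```python
from enum import Enum


class Orientation(Enum):
    FRONT = "front"
    LEFT = "left"
    REAR = "rear"
    RIGHT = "right"


# Hypothetical registry: the identifiers are placeholders, not real model files.
MODEL_BY_ORIENTATION = {
    Orientation.FRONT: "model_front",
    Orientation.LEFT: "model_left",
    Orientation.REAR: "model_rear",
    Orientation.RIGHT: "model_right",
}


def orientation_from_yaw(yaw_deg: float) -> Orientation:
    """Map a mounting yaw to a sector.

    Assumed convention: 0 degrees points along the vehicle's forward axis and
    angles increase counter-clockwise when viewed from above.
    """
    yaw = yaw_deg % 360.0
    if yaw >= 315.0 or yaw < 45.0:
        return Orientation.FRONT
    if yaw < 135.0:
        return Orientation.LEFT
    if yaw < 225.0:
        return Orientation.REAR
    return Orientation.RIGHT


def select_model(yaw_deg: float) -> str:
    """Pick the model identifier registered for the device's orientation."""
    return MODEL_BY_ORIENTATION[orientation_from_yaw(yaw_deg)]


# Usage: a forward-facing device and a rear-facing device load different models.
front_model = select_model(0.0)
rear_model = select_model(180.0)
```

Keeping the mapping in a table means a model can be updated or swapped for one orientation without touching the selection logic.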
[0039] The embodiments described herein also provide a technical improvement over current methods, in which detection on sensor data is typically performed by a central processing unit or system. That arrangement increases the latency of the object detection data stream because image data is sent to the central processing system before any analysis is performed.
[0040] In view of the above-mentioned problems, the embodiments herein allow computer vision tasks to be performed on the sensor device, thereby offloading processing tasks from the central processing unit. Accordingly, the sensor device as described herein may include an image sensor, an analog-to-digital converter (ADC), and an application-specific integrated circuit (ASIC). Image data can be collected by the image sensor, converted into a digital format by the ADC, and sent to the ASIC, which can then use a trained machine learning model to detect objects in the autonomous vehicle's surrounding environment. This provides a technical improvement over current methods, in which such detection is performed by a central processing unit or system that is typically located far from the image capture device. That arrangement increases the latency of the object detection data stream, as the image data is sent to the central processing system before any analysis is performed. Furthermore, the central processing unit or system may serve multiple image capture devices, further limiting its ability to dedicate resources to each image capture device. In contrast, the device described above performs analysis and other computer vision tasks itself, which can significantly improve performance. This also allows the central processing system to allocate more resources to autonomous driving logic. Additionally, co-locating the processing components in the same device as the image sensor allows for lower latency and communication overhead in transmissions between the components.
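The data flow from image sensor to ADC to ASIC can be sketched in software as follows. This is a toy model under stated assumptions: the 10-bit resolution, the reference voltage, and the threshold "model" are placeholders chosen for illustration, and the stand-in model is not a trained machine learning model.

```python
from typing import Callable, List, Sequence


def adc_convert(voltages: Sequence[float], v_ref: float = 1.0, bits: int = 10) -> List[int]:
    """Quantize analog pixel voltages in [0, v_ref] into integer codes."""
    max_code = (1 << bits) - 1
    codes = []
    for v in voltages:
        clamped = min(max(v, 0.0), v_ref)  # saturate out-of-range inputs
        codes.append(round(clamped / v_ref * max_code))
    return codes


def asic_detect(codes: Sequence[int], model: Callable[[Sequence[int]], List[str]]) -> List[str]:
    """Apply the supplied model to the converted image data."""
    return model(codes)


def on_sensor_pipeline(voltages: Sequence[float],
                       model: Callable[[Sequence[int]], List[str]]) -> List[str]:
    """Sensor output -> ADC -> ASIC, all within the same device."""
    return asic_detect(adc_convert(voltages), model)


def bright_pixel_model(codes: Sequence[int]) -> List[str]:
    """Placeholder for a trained model: flags any code above a fixed threshold."""
    return ["bright_object"] if any(c > 900 for c in codes) else []


# Usage: only the detection result needs to leave the device.
labels = on_sensor_pipeline([0.1, 0.95], bright_pixel_model)
```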
[0041] In some embodiments, the device may include three layers in a multi-layer die stack: (i) an image sensing layer (image sensor), (ii) an ADC layer (including an ADC, an HDR processor, and a cache), and (iii) a processing layer, including an ASIC.
[0042] The following description and accompanying drawings will illustrate the features of various exemplary embodiments. The embodiments provided are by way of example and not limitation. Accordingly, the drawings are not necessarily drawn to scale.
[0043] The example systems within the scope of this disclosure will now be described in more detail. The example systems can be implemented in or take the form of automobiles. Additionally, the example systems can be implemented in or take the form of various vehicles, such as cars, trucks (e.g., pickup trucks, vans, tractors, and tractor-trailers), motorcycles, buses, airplanes, helicopters, drones, lawnmowers, bulldozers, boats, submarines, all-terrain vehicles, snowmobiles, aircraft, recreational vehicles, amusement park vehicles, agricultural equipment or vehicles, construction equipment or vehicles, warehouse equipment or vehicles, factory equipment or vehicles, trams, golf carts, trains, handcarts, sidewalk transport vehicles, and robotic equipment. Other vehicles are also possible. Furthermore, in some embodiments, the example systems may not include a vehicle.
[0044] Referring now to the drawings, Figure 1 is a functional block diagram illustrating an example vehicle 100, which can be configured to operate fully or partially in an autonomous mode. More specifically, vehicle 100 can operate in an autonomous mode without human interaction by receiving control commands from a computing system. As part of operating in autonomous mode, vehicle 100 can use sensors to detect and possibly identify objects in the surrounding environment for safe navigation. Furthermore, example vehicle 100 can operate in a partially autonomous (i.e., semi-autonomous) mode, where some functions of vehicle 100 are controlled by a human driver, while others are controlled by the computing system. For example, vehicle 100 may also include subsystems enabling the driver to control the operation of vehicle 100 (such as steering, acceleration, and braking), while the computing system performs assistance functions, such as lane departure warning / lane keeping assist or adaptive cruise control, based on other objects in the surrounding environment (e.g., vehicles).
[0045] As described herein, in a partially autonomous driving mode, even when the vehicle assists with one or more driving operations (e.g., steering, braking, and / or acceleration to perform lane centering, adaptive cruise control, advanced driver assistance systems (ADAS), and emergency braking), the human driver should be aware of the vehicle's surroundings and supervise the assisted driving operations. Here, even if the vehicle may perform all driving tasks in certain situations, the human driver is expected to be responsible for taking control as needed.
[0046] Although various systems and methods are described below in conjunction with autonomous vehicles for the sake of simplicity, these or similar systems and methods can be used in various driver assistance systems that do not rise to the level of fully autonomous driving systems (i.e., partially autonomous driving systems). In the United States, the Society of Automotive Engineers (SAE) has defined different levels of automated driving operation to indicate the degree or extent to which a vehicle controls driving, although different organizations in the United States or other countries may classify levels differently. More specifically, the disclosed systems and methods can be used in SAE Level 2 driver assistance systems, which implement steering, braking, acceleration, lane centering, adaptive cruise control, and other driver support. The disclosed systems and methods can be used in SAE Level 3 driver assistance systems, which are capable of autonomous driving under limited (e.g., highway) conditions. Similarly, the disclosed systems and methods can be used in vehicles using SAE Level 4 automated driving systems, which operate autonomously in most normal driving situations and require only occasional human operator attention. In all such systems, accurate lane estimation can be performed automatically without driver input or control (e.g., when the vehicle is in motion), leading to improved reliability of vehicle positioning and navigation, as well as overall safety improvements for autonomous, semi-autonomous, and other driver assistance systems. As previously mentioned, other organizations in the U.S. or other countries may classify levels of automated driving operations differently than the SAE does. Without limitation, the systems and methods disclosed herein can be used with driver assistance systems defined by the automated driving operation levels of these other organizations.
[0047] As shown in Figure 1, vehicle 100 may include various subsystems, such as a propulsion system 102, a sensor system 104, a control system 106, one or more peripheral devices 108, a power supply 110, a computer system 112 (which may also be referred to as a computing system) with a data storage device 114, and a user interface 116. In other examples, vehicle 100 may include more or fewer subsystems, each subsystem including multiple elements. The subsystems and components of vehicle 100 may be interconnected in various ways. Additionally, in embodiments, the functionality of vehicle 100 described herein may be divided into additional functions or physical components, or combined into fewer functions or physical components. For example, control system 106 and computer system 112 may be combined into a single system to operate vehicle 100 according to various operations.
[0048] The propulsion system 102 may include one or more components operable to provide powered motion to the vehicle 100, and may include an engine / motor 118, an energy source 119, a transmission 120, and wheels / tires 121, among other possible components. For example, the engine / motor 118 may be configured to convert the energy source 119 into mechanical energy, and may correspond to one or a combination of an internal combustion engine, an electric motor, a steam engine, or a Stirling engine, among other possible options. For example, in some embodiments, the propulsion system 102 may include multiple types of engines and / or motors, such as gasoline engines and electric motors.
[0049] Energy source 119 refers to an energy source that can provide power, in whole or in part, to one or more systems of vehicle 100 (e.g., engine / motor 118). For example, energy source 119 may correspond to gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and / or other electrical sources. In some embodiments, energy source 119 may include a combination of a fuel tank, battery, capacitor, and / or flywheel.
[0050] The transmission 120 can send mechanical power from the engine / motor 118 to the wheels / tires 121 and / or other possible systems of the vehicle 100. Thus, the transmission 120 may include a gearbox, clutch, differential, and drive shaft, as well as other possible components. The drive shaft may include an axle connected to one or more wheels / tires 121.
[0051] In the example embodiment, the wheels / tires 121 of the vehicle 100 can have various configurations. For example, the vehicle 100 can exist as a unicycle, bicycle / motorcycle, tricycle, or four-wheeled car / truck, among other possible configurations. Thus, the wheels / tires 121 can be attached to the vehicle 100 in various ways and can be made of different materials, such as metal and rubber.
[0052] Sensor system 104 may include various types of sensors, such as a Global Positioning System (GPS) 122, an Inertial Measurement Unit (IMU) 124, radar 126, lidar 128, a camera 130, a steering sensor 123, and a throttle / brake sensor 125, as well as other possible sensors. In some embodiments, sensor system 104 may also include sensors configured to monitor the internal systems of vehicle 100 (e.g., O2 monitor, fuel gauge, engine oil temperature, and brake wear).
[0053] GPS 122 may include a transceiver operable to provide information about the positioning of vehicle 100 relative to the Earth. IMU 124 may be configured to use one or more accelerometers and / or gyroscopes and can sense changes in the positioning and orientation of vehicle 100 based on inertial acceleration. For example, IMU 124 can detect the pitch and yaw of vehicle 100 when vehicle 100 is stationary or in motion.
[0054] Radar 126 may represent one or more systems configured to use radio signals to sense objects (including their speed and direction of travel) within the surrounding environment of vehicle 100. Thus, radar 126 may include an antenna configured to transmit and receive radio signals. In some embodiments, radar 126 may correspond to a mountable radar configured to obtain measurements of the surrounding environment of vehicle 100.
[0055] The lidar 128 may include one or more laser sources, a laser scanner, and one or more detectors, as well as other system components, and may operate in a coherent mode (e.g., using heterodyne detection) or an incoherent detection mode (i.e., time-of-flight mode). In some embodiments, one or more detectors of the lidar 128 may include one or more photodetectors, which may be particularly sensitive detectors (e.g., avalanche photodiodes). In some embodiments, such photodetectors may be able to detect single photons (e.g., single-photon avalanche diodes). Furthermore, such photodetectors may be arranged (e.g., via series electrical connections) in an array (e.g., as in a silicon photomultiplier (SiPM)). In some examples, one or more photodetectors are devices operating in Geiger mode, and the lidar includes sub-components designed for such Geiger-mode operation.
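In time-of-flight mode, range follows from the round-trip travel time of a light pulse: the distance is the speed of light multiplied by the elapsed time, divided by two. The sketch below shows only that general relationship. The function name is illustrative, and nothing here describes how lidar 128 actually measures timing.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def range_from_time_of_flight(round_trip_s: float) -> float:
    """Return the one-way distance in meters for a round-trip time in seconds."""
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the object and back, so halve the total path.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


# Usage: a return detected one microsecond after emission.
distance_m = range_from_time_of_flight(1e-6)
```

A one-microsecond round trip therefore corresponds to a target roughly 150 meters away.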
[0056] Camera 130 may include one or more devices (e.g., still camera, video camera, thermal imaging camera, stereo camera, and night vision camera) configured to capture images of the surrounding environment of vehicle 100.
[0057] The steering sensor 123 can sense the steering angle of the vehicle 100, which may involve measuring the angle of the steering wheel or measuring an electrical signal representing the angle of the steering wheel. In some embodiments, the steering sensor 123 can measure the angle of the wheels of the vehicle 100, such as detecting the angle of the wheels relative to the forward axle of the vehicle 100. The steering sensor 123 can also be configured to measure a combination (or subset) of the steering wheel angle, an electrical signal representing the steering wheel angle, and the angles of the wheels of the vehicle 100.
[0058] Throttle / brake sensor 125 can detect the position of the throttle or brake of vehicle 100. For example, throttle / brake sensor 125 can measure the angle of both the gas pedal (throttle) and the brake pedal, or it can measure an electrical signal that can represent, for example, the angle of the gas pedal (throttle) and / or the angle of the brake pedal. Throttle / brake sensor 125 can also measure the angle of the throttle body of vehicle 100, which may include part of the physical mechanism (e.g., a butterfly valve or a carburetor) that modulates the supply of energy source 119 to engine / motor 118. In addition, throttle / brake sensor 125 can measure the pressure of one or more brake pads on a rotor of vehicle 100, or a combination (or subset) of the angle of the gas pedal (throttle) and brake pedal, an electrical signal representing the angle of the gas pedal (throttle) and brake pedal, the angle of the throttle body, and the pressure exerted by at least one brake pad on a rotor of vehicle 100. In other embodiments, the throttle / brake sensor 125 may be configured to measure the pressure applied to a vehicle pedal (such as a throttle or brake pedal).
[0059] The control system 106 may include components configured to assist navigation of the vehicle 100, such as a steering unit 132, a throttle 134, a braking unit 136, a sensor fusion algorithm 138, a computer vision system 140, a navigation / path system 142, and an obstacle avoidance system 144. More specifically, the steering unit 132 may be operable to adjust the heading of the vehicle 100, and the throttle 134 may control the operating speed of the engine / motor 118 to control the acceleration of the vehicle 100. The braking unit 136 may decelerate the vehicle 100, which may involve using friction to slow down the wheels / tires 121. In some embodiments, the braking unit 136 may convert the kinetic energy of the wheels / tires 121 into electrical current for subsequent use by one or more systems of the vehicle 100.
[0060] Sensor fusion algorithm 138 may include Kalman filters, Bayesian networks, or other algorithms capable of processing data from sensor system 104. In some embodiments, sensor fusion algorithm 138 may provide evaluations based on incoming sensor data, such as evaluations of individual objects and / or features, evaluations of specific situations, and / or evaluations of potential impacts within a given situation.
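To make the Kalman filter reference concrete, here is a minimal sketch of the measurement-update step of a scalar Kalman filter, which blends a prior estimate with a new measurement according to their variances. It illustrates the general technique only and is not presented as the behavior of sensor fusion algorithm 138. The example values are made up.

```python
from typing import Tuple


def kalman_update(estimate: float, variance: float,
                  measurement: float, measurement_variance: float) -> Tuple[float, float]:
    """Fuse one scalar measurement into the current estimate."""
    # The gain weights the measurement by how uncertain the prior estimate is.
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


# Usage: a prior range of 10 m fused with a 12 m measurement of equal variance.
fused_range, fused_variance = kalman_update(10.0, 4.0, 12.0, 4.0)
```

With equal variances the gain is 0.5, so the fused estimate lands midway between the two values and its variance is halved.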
[0061] Computer vision system 140 may include hardware and software (e.g., a general-purpose processor such as a central processing unit (CPU), a dedicated processor such as a graphics processing unit (GPU) or tensor processing unit (TPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), volatile memory, non-volatile memory, or one or more machine learning models) operable to process and analyze images in an effort to determine moving objects (e.g., other vehicles, pedestrians, cyclists, or animals) and stationary objects (e.g., traffic lights, road boundaries, speed bumps, or potholes). Therefore, computer vision system 140 may use object recognition, structure-from-motion (SFM), video tracking, and other algorithms used in computer vision, such as to identify objects, map the environment, track objects, estimate object velocities, etc.
[0062] The navigation / path system 142 can determine the driving path of the vehicle 100, which may involve dynamically adjusting the navigation during operation. Thus, the navigation / path system 142 can navigate the vehicle 100 using data from sensor fusion algorithm 138, GPS 122, maps, and other sources. The obstacle avoidance system 144 can assess potential obstacles based on sensor data and enable the vehicle 100's systems to avoid or otherwise traverse potential obstacles.
[0063] As shown in Figure 1, vehicle 100 may also include peripheral devices 108, such as a wireless communication system 146, a touchscreen 148, an internal microphone 150, and / or a speaker 152. Peripheral devices 108 may provide controls or other elements for a user to interact with user interface 116. For example, touchscreen 148 may provide information to the user of vehicle 100. User interface 116 may also accept input from the user via touchscreen 148. Peripheral devices 108 may also enable vehicle 100 to communicate with devices such as those in other vehicles.
[0064] Wireless communication system 146 can communicate wirelessly with one or more devices, either directly or via a communication network. For example, wireless communication system 146 can use 3G cellular communication, such as Code Division Multiple Access (CDMA), Evolution-Data Optimized (EVDO), or Global System for Mobile Communications (GSM) / General Packet Radio Service (GPRS); 4G cellular communication, such as Worldwide Interoperability for Microwave Access (WiMAX) or Long Term Evolution (LTE); or 5G. Alternatively, wireless communication system 146 can communicate with a wireless local area network (WLAN) using Wi-Fi® or other possible connections. Wireless communication system 146 can also communicate directly with devices using, for example, an infrared link, Bluetooth, or ZigBee. In the context of this disclosure, other wireless protocols, such as various vehicle communication systems, are possible. For example, wireless communication system 146 may include one or more Dedicated Short Range Communication (DSRC) devices, which may include public and / or private data communications between vehicles and / or roadside stations.
[0065] The vehicle 100 may include a power supply 110 for powering components. In some embodiments, the power supply 110 may include a rechargeable lithium-ion or lead-acid battery. For example, the power supply 110 may include one or more batteries configured to provide power. The vehicle 100 may also use other types of power supplies. In an example embodiment, the power supply 110 and the energy source 119 may be integrated into a single energy source.
[0066] The vehicle 100 may also include a computer system 112 to perform operations such as those described herein. Thus, the computer system 112 may include at least one processor 113 (which may include at least one microprocessor), operable to execute instructions 115 stored in a non-transitory computer-readable medium, such as a data storage device 114. In some embodiments, the computer system 112 may represent multiple computing devices that can be used to control various components or subsystems of the vehicle 100 in a distributed manner.
[0067] In some embodiments, the data storage device 114 may include instructions 115 (e.g., program logic) executable by the processor 113 to perform various functions of the vehicle 100, including those described above in connection with Figure 1. The data storage device 114 may also contain additional instructions, including instructions for sending data to, receiving data from, interacting with, and / or controlling one or more of the propulsion system 102, sensor system 104, control system 106, and peripheral devices 108.
[0068] In addition to instructions 115, data storage device 114 can store data such as road maps and route information. Such information can be used by vehicle 100 and computer system 112 during autonomous, semi-autonomous and / or manual operation of vehicle 100.
[0069] Vehicle 100 may include a user interface 116 for providing information to or receiving input from a user of vehicle 100. User interface 116 may control or implement control over the content and / or layout of interactive images that may be displayed on touchscreen 148. Furthermore, user interface 116 may include one or more input / output devices within a set of peripheral devices 108, such as wireless communication system 146, touchscreen 148, microphone 150, and speaker 152.
[0070] Computer system 112 can control the functions of vehicle 100 based on input received from various subsystems (e.g., propulsion system 102, sensor system 104, or control system 106) and from user interface 116. For example, computer system 112 can utilize input from sensor system 104 to estimate the outputs generated by propulsion system 102 and control system 106. Depending on the embodiment, computer system 112 can be operable to monitor many aspects of vehicle 100 and its subsystems. In some embodiments, computer system 112 can disable some or all functions of vehicle 100 based on signals received from sensor system 104.
[0071] The components of vehicle 100 can be configured to operate in a manner interconnected with other components, either internally or externally to their respective systems. For example, in an example embodiment, camera 130 can capture multiple images that may represent information about the state of the environment surrounding vehicle 100 as it operates in autonomous or semi-autonomous mode. The state of the environment may include parameters of the road on which the vehicle is operating. For example, computer vision system 140 may be able to identify slope (gradient) or other features based on multiple images of the road. Additionally, a combination of GPS 122 and features identified by computer vision system 140 can be used with map data stored in data storage device 114 to determine specific road parameters. Furthermore, radar 126 and / or lidar 128 and / or some other environmental mapping, ranging, and / or positioning sensor systems may also provide information about the vehicle's surrounding environment.
[0072] In other words, the combination of various sensors (which may be referred to as input indication and output indication sensors) and computer system 112 can interact to provide indications of inputs for controlling the vehicle or indications of the environment surrounding the vehicle.
[0073] In some embodiments, computer system 112 can make determinations about various objects based on data provided by systems other than radio systems. For example, vehicle 100 may have lasers or other optical sensors configured to sense objects in the vehicle's field of view. Computer system 112 can use the outputs from various sensors to determine information about objects in the vehicle's field of view, and can determine distance and orientation information to various objects. Computer system 112 can also determine whether an object is desirable or undesirable based on the outputs from various sensors.
[0074] Although Figure 1 shows various components of the vehicle 100 (i.e., the wireless communication system 146, the computer system 112, the data storage device 114, and the user interface 116) as integrated into the vehicle 100, one or more of these components may be installed or associated separately from the vehicle 100. For example, the data storage device 114 may exist partially or wholly separate from the vehicle 100. Therefore, the vehicle 100 can be provided in the form of device elements that can be positioned separately or together. The device elements constituting the vehicle 100 can be communicatively coupled together in a wired and / or wireless manner.
[0075] Figures 2A-2E show an example vehicle 200 (e.g., a fully autonomous vehicle or a semi-autonomous vehicle). The example vehicle 200 may include some or all of the functions described with reference to the vehicle 100 of Figure 1. Although the vehicle 200 is shown in Figures 2A-2E as a van with side mirrors for illustrative purposes, this disclosure is not limited thereto. For example, vehicle 200 may represent a truck, car, semi-trailer truck, motorcycle, golf cart, off-road vehicle, agricultural vehicle, or any other vehicle described elsewhere herein (e.g., bus, boat, airplane, helicopter, drone, lawnmower, bulldozer, submarine, all-terrain vehicle, snowmobile, aircraft, recreational vehicle, amusement park vehicle, agricultural equipment, construction equipment or vehicle, warehouse equipment or vehicle, factory equipment or vehicle, tram, train, handcart, sidewalk transport vehicle, and robotic equipment).
[0076] Example vehicle 200 may include one or more sensor systems 202, 204, 206, 208, 210, 212, 214, and 218. In some embodiments, sensor systems 202, 204, 206, 208, 210, 212, 214, and / or 218 may represent one or more optical systems (e.g., cameras), one or more lidar systems, one or more radar systems, one or more inertial sensors, one or more humidity sensors, one or more acoustic sensors (e.g., microphones and sonar devices), or one or more other sensors configured to sense information about the environment surrounding vehicle 200. In other words, any sensor system now known or created hereafter may be coupled to vehicle 200 and / or may be used in conjunction with various operations of vehicle 200. As an example, lidar may be used for autonomous driving or other types of navigation, planning, perception, and / or mapping operations of vehicle 200. Additionally, sensor systems 202, 204, 206, 208, 210, 212, 214 and / or 218 may represent combinations of sensors described herein (e.g., one or more lidar and radar; one or more lidar and camera; one or more camera and radar; or one or more lidar, camera and radar).
[0077] Note that the number, location, and type of sensor systems (e.g., 202 and 204) depicted in Figures 2A-2E are intended as non-limiting examples of the location, number, and type of such sensor systems for autonomous or semi-autonomous vehicles. Alternative numbers, locations, types, and configurations of such sensors are possible (e.g., consistent with vehicle size, shape, aerodynamics, fuel economy, aesthetics, or other conditions to reduce costs or adapt to specific environments or applications). For example, sensor systems (e.g., 202 and 204) may be positioned at various other locations on the vehicle (e.g., at location 216) and may have a field of view corresponding to the interior and / or surrounding environment of the vehicle 200.
[0078] Sensor system 202 may be mounted on top of vehicle 200 and may include one or more sensors configured to detect information about the environment surrounding vehicle 200 and output indications of that information. For example, sensor system 202 may include any combination of cameras, radar, lidar, inertial sensors, humidity sensors, and acoustic sensors (e.g., microphones and sonar devices). Sensor system 202 may include one or more movable mounts operable to adjust the orientation of one or more sensors in sensor system 202. In one embodiment, the movable mount may include a rotating platform that can scan the sensors to obtain information from every direction around vehicle 200. In another embodiment, the movable mount of sensor system 202 may be movable in a scanning manner within a specific angular and / or azimuth and / or elevation range. Sensor system 202 may be mounted on the roof of a vehicle, but other mounting locations are also possible.
[0079] Furthermore, the sensors of sensor system 202 can be distributed at different locations and do not need to be collocated at a single location. Additionally, each sensor of sensor system 202 can be configured to move or scan independently of other sensors in sensor system 202. Alternatively or additionally, multiple sensors can be installed at one or more of sensor locations 202, 204, 206, 208, 210, 212, 214, and / or 218. For example, two lidar devices may be installed at a single sensor location, and / or one lidar device and one radar device may be installed at a single sensor location.
[0080] One or more sensor systems 202, 204, 206, 208, 210, 212, 214, and / or 218 may include one or more lidar devices. For example, a lidar device may include multiple light emitter devices arranged within an angular range relative to a given plane (e.g., the xy-plane). For example, one or more of sensor systems 202, 204, 206, 208, 210, 212, 214, and / or 218 may be configured to rotate or pivot about an axis perpendicular to the given plane (e.g., the z-axis) to illuminate the environment surrounding the vehicle 200 with light pulses. Information about the surrounding environment can be determined based on various aspects of the detected reflected light pulses (e.g., elapsed time of flight, polarization, and intensity).
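As an illustration of how elapsed time of flight relates to distance, the following is a minimal sketch and not the method of this disclosure. It assumes a single reflected pulse whose round-trip time and beam angles are known; the function names and the sensor-frame convention (x forward, z along the rotation axis) are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact by SI definition


def range_from_time_of_flight(round_trip_seconds: float) -> float:
    """Return the one-way distance in meters for a reflected light pulse."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the object and back, so halve the path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


def point_from_return(round_trip_seconds: float,
                      azimuth_rad: float,
                      elevation_rad: float) -> tuple:
    """Convert one return into an (x, y, z) point in an assumed sensor frame.

    Azimuth is measured in the xy-plane and elevation from that plane
    toward the z-axis (hypothetical convention for this sketch).
    """
    r = range_from_time_of_flight(round_trip_seconds)
    horizontal = r * math.cos(elevation_rad)
    return (horizontal * math.cos(azimuth_rad),
            horizontal * math.sin(azimuth_rad),
            r * math.sin(elevation_rad))
```

Collecting such points over a full rotation about the z-axis would yield the kind of point cloud information referred to in the following paragraph.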
[0081] In the example embodiment, sensor systems 202, 204, 206, 208, 210, 212, 214, and / or 218 may be configured to provide corresponding point cloud information that can be related to physical objects within the surrounding environment of vehicle 200. While vehicle 200 and sensor systems 202, 204, 206, 208, 210, 212, 214, and 218 are shown to include certain features, it should be understood that other types of sensor systems are contemplated within the scope of this disclosure. Furthermore, the example vehicle 200 may include any of the components described in connection with the vehicle 100 of Figure 1.
[0082] In the example configuration, one or more radars may be located on vehicle 200. Similar to radar 126 described above, one or more radars may include antennas configured to transmit and receive radio waves (e.g., electromagnetic waves with frequencies between 30 Hz and 300 GHz). Such radio waves can be used to determine the distance and / or velocity to one or more objects in the surrounding environment of vehicle 200. For example, one or more sensor systems 202, 204, 206, 208, 210, 212, 214, and / or 218 may include one or more radars. In some examples, one or more radars may be located near the rear of vehicle 200 (e.g., sensor systems 208 and 210) to actively scan the environment near the rear of vehicle 200 for radio-reflecting objects. Similarly, one or more radars may be located near the front of vehicle 200 (e.g., sensor systems 212 or 214) to actively scan the environment near the front of vehicle 200. For example, the radars may be located in positions suitable for illuminating an area including the forward path of vehicle 200 without being obstructed by other features of vehicle 200. For example, radar can be embedded in and / or mounted in or near the front bumper, headlights, cowl, and / or hood. Additionally, one or more additional radars can be positioned to actively scan the sides and / or rear of vehicle 200 for radio-reflecting objects, such as by including such devices in or near the rear bumper, side panels, rocker panels, and / or chassis.
[0083] Vehicle 200 may include one or more cameras. For example, one or more sensor systems 202, 204, 206, 208, 210, 212, 214 and / or 218 may include one or more cameras. The cameras may be photosensitive instruments, such as still cameras, video cameras, thermal imaging cameras, stereo cameras, night vision cameras, etc., configured to capture multiple images of the surrounding environment of vehicle 200. For this purpose, the cameras may be configured to detect visible light and may additionally or alternatively be configured to detect light from other parts of the spectrum, such as infrared or ultraviolet light. The cameras may be two-dimensional detectors and may optionally have a three-dimensional spatial sensitivity range. In some embodiments, the camera may include, for example, a range detector configured to generate two-dimensional images indicating distances from the camera to multiple points in the surrounding environment. For this purpose, the camera may use one or more range detection techniques. For example, the camera may provide range information by using structured light technology, in which vehicle 200 illuminates objects in the surrounding environment with a predetermined light pattern (such as a grid or checkerboard pattern), and the camera is used to detect reflections of the predetermined light pattern from the surrounding environment. Based on the distortion in the reflected light pattern, vehicle 200 can determine the distance to a point on an object. The predetermined light pattern may include infrared light or radiation of other suitable wavelengths for such measurements. In some examples, a camera may be mounted inside the windshield of vehicle 200. Specifically, the camera may be positioned to capture images from a forward-facing view relative to vehicle 200. Other mounting positions and viewing angles of the camera may also be used, whether inside or outside vehicle 200. Furthermore, the camera may have associated optics operable to provide an adjustable field of view. Further still, the camera may be mounted to vehicle 200 with a movable mount to change the camera's pointing angle, such as via a pan / tilt mechanism.
[0084] Vehicle 200 may also include one or more acoustic sensors for sensing the surrounding environment of vehicle 200 (e.g., one or more of sensor systems 202, 204, 206, 208, 210, 212, 214, 216, 218 may include one or more acoustic sensors). The acoustic sensors may include microphones (e.g., piezoelectric microphones, condenser microphones, ribbon microphones, or microelectromechanical systems (MEMS) microphones) for sensing sound waves (i.e., pressure differences) in a fluid (e.g., air) surrounding vehicle 200. Such acoustic sensors can be used to identify sounds in the surrounding environment (e.g., sirens, human voices, animal sounds, or alarms), and the control strategy of vehicle 200 can be based on these sounds. For example, if the acoustic sensors detect a siren (e.g., an ambulance siren or fire truck siren), vehicle 200 may decelerate and / or navigate to the edge of a road.
[0085] Although not shown in Figures 2A-2E, vehicle 200 may include a wireless communication system (e.g., similar to the wireless communication system 146 of Figure 1 and / or in addition to the wireless communication system 146 of Figure 1). The wireless communication system may include a wireless transmitter and receiver, which may be configured to communicate with devices external to or internal to the vehicle 200. Specifically, the wireless communication system may include transceivers configured to communicate with other vehicles and / or computing devices, for example, in a vehicle communication system or roadside station. Examples of such vehicle communication systems include DSRC, radio frequency identification (RFID), and other communication standards proposed for intelligent transportation systems.
[0086] In addition to or in lieu of those shown, vehicle 200 may include one or more other components. These additional components may include electrical or mechanical functions.
[0087] The control system of vehicle 200 can be configured to control vehicle 200 according to a control strategy among a plurality of possible control strategies. The control system can be configured to receive information from sensors coupled to vehicle 200 (on or outside vehicle 200), modify the control strategy (and related driving behavior) based on that information, and control vehicle 200 according to the modified control strategy. The control system can also be configured to monitor information received from sensors and continuously evaluate driving conditions; and can also be configured to modify the control strategy and driving behavior based on changes in driving conditions. For example, the route taken by the vehicle from one destination to another can be modified based on driving conditions. Additionally or alternatively, speed, acceleration, turning angle, following distance (i.e., distance to the vehicle in front of the current vehicle), lane selection, etc., can all be modified in response to changes in driving conditions.
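The loop described above (receive sensor information, modify the control strategy, then control the vehicle under the modified strategy) can be sketched roughly as follows. This is only an illustrative sketch: the `ControlStrategy` and `DrivingConditions` types, their fields, and the scaling factors are all hypothetical and do not come from this disclosure.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ControlStrategy:
    """Hypothetical subset of the parameters a control strategy might hold."""
    target_speed_mps: float
    following_distance_m: float


@dataclass(frozen=True)
class DrivingConditions:
    """Hypothetical driving conditions derived from sensor information."""
    road_is_wet: bool = False
    siren_detected: bool = False


def modify_strategy(strategy: ControlStrategy,
                    conditions: DrivingConditions) -> ControlStrategy:
    """Return a strategy adjusted for the current driving conditions."""
    updated = strategy
    if conditions.road_is_wet:
        # Assumed policy: slow down and leave more room on a wet road.
        updated = replace(
            updated,
            target_speed_mps=updated.target_speed_mps * 0.8,
            following_distance_m=updated.following_distance_m * 1.5,
        )
    if conditions.siren_detected:
        # Assumed policy: cap speed while an emergency vehicle is nearby.
        updated = replace(
            updated,
            target_speed_mps=min(updated.target_speed_mps, 5.0),
        )
    return updated
```

Returning a new immutable strategy, rather than mutating the current one, keeps each evaluation of driving conditions independent of earlier ones; that is a design choice of this sketch only.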
[0088] As described above, in some embodiments, the vehicle 200 may take the form of a van, but alternative forms are also possible and are contemplated herein. Thus, Figures 2F-2I show an embodiment in which vehicle 250 takes the form of a semi-truck. For example, Figure 2F shows a front view of vehicle 250, and Figure 2G shows an isometric view of vehicle 250. In an embodiment where vehicle 250 is a semi-trailer truck, vehicle 250 may include a tractor unit 260 and a trailer unit 270 (shown in Figure 2G). Figures 2H and 2I provide side and top views of the tractor unit 260. Similar to the vehicle 200 illustrated above, the vehicle 250 shown in Figures 2F-2I may also include various sensor systems (e.g., similar to sensor systems 202, 206, 208, 210, 212, 214 shown and described with reference to Figures 2A-2E). In some embodiments, although the vehicle 200 of Figures 2A-2E may include only a single copy of some sensor systems (e.g., sensor system 204), the vehicle 250 shown in Figures 2F-2I may include multiple copies of that sensor system (e.g., sensor systems 204A and 204B, as shown).
[0089] While the drawings and description throughout may refer to a vehicle of a given form (e.g., semi-trailer truck 250 or van 200), it should be understood that the embodiments described herein can be equally applied in a variety of vehicle contexts (e.g., with modifications to take into account the form factor of the vehicle). For example, sensors and / or other components described or shown as part of van 200 may also be used in semi-trailer truck 250 (e.g., for navigation and / or obstacle detection and avoidance).
[0090] Figure 2J shows various sensor fields of view (e.g., associated with the vehicle 250 described above). As described above, vehicle 250 may contain multiple sensors / sensor units. For example, the positions of the various sensors may correspond to the sensor locations disclosed in Figures 2F-2I. However, in some cases, the sensors may have other locations. To simplify the figure, sensor location labels are omitted from Figure 2J. For each sensor unit of vehicle 250, Figure 2J shows a representative field of view (e.g., fields of view labeled 252A, 252B, 252C, 252D, 254A, 254B, 256, 258A, 258B, and 258C). The field of view of a sensor may include an angular region over which the sensor can detect objects (e.g., an azimuth and / or elevation region).
[0091] Figure 2K shows beam steering of a sensor of a vehicle (e.g., the vehicle 250 shown and described with reference to Figures 2F-2J), according to an example embodiment. In various embodiments, the sensor unit of the vehicle 250 may be radar, lidar, sonar, etc. Furthermore, in some embodiments, the sensor may be scanned within its field of view during operation. Various different scanning angles of the example sensor are shown as regions 272, each indicating the angular region over which the sensor is operating. The sensor may periodically or iteratively change the region over which it is operating. In some embodiments, the vehicle 250 may use multiple sensors to measure region 272. Additionally, other regions may be included in other examples. For example, one or more sensors may measure aspects of the trailer 270 of the vehicle 250 and / or the area directly in front of the vehicle 250.
[0092] At some angles, the sensor's operating area 275 may include the rear wheels 276A and 276B of the trailer 270. Therefore, the sensor can measure the rear wheels 276A and / or 276B during operation. For example, the rear wheels 276A and 276B may reflect lidar or radar signals transmitted by the sensor. The sensor can receive the reflected signals from the rear wheels 276A and 276B. Therefore, the data collected by the sensor may include data from the reflections from the wheels.
[0093] In some cases, such as when the sensor is radar, reflections from the rear wheels 276A and 276B may appear as noise in the received radar signal. Therefore, the radar can operate with an enhanced signal-to-noise ratio when the rear wheels 276A and 276B direct radar signals away from the sensor.
[0094] Figure 3 is a conceptual illustration of wireless communication between various computing systems associated with an autonomous or semi-autonomous vehicle, according to an example embodiment. Specifically, wireless communication can occur between the remote computing system 302 and the vehicle 200 via network 304. Wireless communication can also occur between the server computing system 306 and the remote computing system 302, and between the server computing system 306 and the vehicle 200.
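To make the notion of a field of view as an azimuth and / or elevation region concrete, the following minimal sketch tests whether a direction falls inside such a region. The `FieldOfView` type and its parameters are hypothetical; the sketch also assumes azimuth is expressed in degrees and wraps around at 360.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldOfView:
    """Hypothetical angular region: an azimuth span plus an elevation band."""
    center_azimuth_deg: float
    azimuth_width_deg: float
    min_elevation_deg: float
    max_elevation_deg: float

    def contains(self, azimuth_deg: float, elevation_deg: float) -> bool:
        """Return True if the given direction lies inside this region."""
        # Signed offset from the center, wrapped into [-180, 180) so that
        # a region straddling 0/360 degrees is handled correctly.
        offset = (azimuth_deg - self.center_azimuth_deg + 180.0) % 360.0 - 180.0
        in_azimuth = abs(offset) <= self.azimuth_width_deg / 2.0
        in_elevation = (self.min_elevation_deg
                        <= elevation_deg
                        <= self.max_elevation_deg)
        return in_azimuth and in_elevation
```

For instance, a region centered at 350 degrees with a 60 degree width would contain a direction at 10 degrees azimuth, because the wrapped offset between them is 20 degrees.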
[0095] Vehicle 200 can correspond to various types of vehicles capable of transporting passengers or objects between locations, and can take any one or more forms of vehicles discussed above. In some embodiments, vehicle 200 can operate in an autonomous or semi-autonomous mode, which enables the control system to use sensor measurements to safely navigate vehicle 200 between destinations. When operating in autonomous or semi-autonomous mode, vehicle 200 can navigate with or without passengers. As a result, vehicle 200 can transport passengers between desired destinations.
[0096] Remote computing system 302 can represent any type of device associated with remote assistance technology, including but not limited to those described herein. Within examples, remote computing system 302 can represent any type of device configured to (i) receive information related to vehicle 200, (ii) provide an interface through which a human operator can in turn perceive the information and input a response related to the information, and (iii) send the response to vehicle 200 or other devices. Remote computing system 302 can take various forms, such as workstations, desktop computers, laptop computers, tablets, mobile phones (e.g., smartphones), and / or servers. In some examples, remote computing system 302 may include multiple computing devices operating together in a network configuration.
[0097] The remote computing system 302 may include one or more subsystems and components similar to or identical to those of the vehicle 200. At a minimum, the remote computing system 302 may include a processor configured to perform the various operations described herein. In some embodiments, the remote computing system 302 may also include a user interface including input / output devices such as a touchscreen and a speaker. Other examples are also possible.
[0098] Network 304 represents the infrastructure that enables wireless communication between remote computing system 302 and vehicle 200. Network 304 also enables wireless communication between server computing system 306 and remote computing system 302, as well as between server computing system 306 and vehicle 200.
[0099] The location of the remote computing system 302 can vary within the examples. For instance, the remote computing system 302 may be located remotely from vehicle 200 and wirelessly communicate with vehicle 200 via network 304. In another example, the remote computing system 302 may correspond to a computing device within vehicle 200 that is separate from vehicle 200, but which a human operator can interact with while acting as a passenger or driver of vehicle 200. In some examples, the remote computing system 302 may be a computing device with a touchscreen operable by passengers of vehicle 200.
[0100] In some embodiments, the operations performed by the remote computing system 302 described herein may additionally or alternatively be performed by the vehicle 200 (i.e., by any system or subsystem of the vehicle 200). In other words, the vehicle 200 may be configured to provide a remote assistance mechanism that the driver or passengers of the vehicle can interact with.
[0101] Server computing system 306 can be configured to wirelessly communicate with remote computing system 302 and vehicle 200 (or possibly directly with remote computing system 302 and / or vehicle 200) via network 304. Server computing system 306 can represent any computing device configured to receive, store, determine, and / or transmit information relating to vehicle 200 and its remote assistance. Thus, server computing system 306 can be configured to perform any operation or part of such operation described herein as being performed by remote computing system 302 and / or vehicle 200. Some embodiments of wireless communication related to remote assistance may utilize server computing system 306, while others may not.
[0102] Server computing system 306 may include one or more subsystems and components similar to or the same as those of remote computing system 302 and / or vehicle 200, such as processors configured to perform the various operations described herein, and wireless communication interfaces for receiving information from and providing information to remote computing system 302 and vehicle 200.
[0103] The various systems described above can perform a variety of operations. These operations and related characteristics will now be described.
[0104] Based on the above discussion, a computing system (e.g., a remote computing system 302, a server computing system 306, or a computing system local to the vehicle 200) can operate to use a camera to capture images of the surrounding environment of the autonomous or semi-autonomous vehicle. Typically, at least one computing system will be able to analyze the images and may control the autonomous or semi-autonomous vehicle.
[0105] In some embodiments, to facilitate autonomous or semi-autonomous operation, a vehicle (e.g., vehicle 200) may receive data representing objects in its surrounding environment (also referred to herein as “environmental data”) in various ways. Sensor systems on the vehicle can provide environmental data representing objects in the surrounding environment. For example, the vehicle may have various sensors, including cameras, radar, lidar, microphones, radio units, and other sensors. Each of these sensors can communicate environmental data about the information received by each respective sensor to a processor within the vehicle.
[0106] In one example, the camera may be configured to capture still images and / or video. In some embodiments, the vehicle may have more than one camera positioned in different orientations. Moreover, in some embodiments, the camera may be able to move to capture images and / or video in different directions. The camera may be configured to store the captured images and video in memory for later processing by the vehicle's processing system. The captured images and / or video may be environmental data. Furthermore, the camera may include an image sensor as described herein.
[0107] In another example, the radar can be configured to transmit electromagnetic signals reflected by various objects near the vehicle, and then capture the electromagnetic signals reflected from the objects. The captured reflected electromagnetic signals allow the radar (or processing system) to make various determinations about the objects reflecting the electromagnetic signals. For example, the distance to the various reflecting objects and the location of the various reflecting objects can be determined. In some embodiments, the vehicle may have more than one radar in different orientations. The radar can be configured to store the captured information in a memory for later processing by the vehicle's processing system. The information captured by the radar may be environmental data.
[0108] In another example, a lidar can be configured to transmit electromagnetic signals (e.g., infrared light, such as infrared light from a gas or diode laser or other possible light source) that are reflected by a target object near the vehicle. The lidar may be able to capture the reflected electromagnetic (e.g., infrared light) signals. The captured reflected electromagnetic signals can enable a ranging system (or processing system) to determine the distance to various objects. The lidar may also be able to determine the speed or velocity of the target object and store it as environmental data.
[0109] Additionally, in one example, the microphone can be configured to capture audio of the environment surrounding the vehicle. The sounds captured by the microphone can include emergency vehicle sirens and other vehicle sounds. For example, the microphone could capture the sound of an ambulance, fire truck, or police car siren. The processing system can then identify that the captured audio signals indicate an emergency vehicle. In another example, the microphone could capture the exhaust sound of another vehicle, such as a motorcycle. The processing system can then identify that the captured audio signals indicate a motorcycle. The data captured by the microphone can form part of the environmental data.
[0110] In another example, the radio unit can be configured to transmit electromagnetic signals, which may take the form of Bluetooth signals, 802.11 signals, and / or other radio technology signals. The first electromagnetic radiation signal can be transmitted via one or more antennas located in the radio unit. Furthermore, the first electromagnetic radiation signal can be transmitted using one of many different radio signaling modes. However, in some embodiments, it is desirable to transmit the first electromagnetic radiation signal in a signaling mode that requests a response from a device located near the autonomous or semi-autonomous vehicle. The processing system may be able to detect nearby devices based on the responses communicated back to the radio unit and use the information from those communications as part of the environmental data.
[0111] In some embodiments, the processing system may be able to combine information from various sensors to further determine the vehicle's surroundings. For example, the processing system may combine data from both radar information and captured images to determine whether another vehicle or pedestrian is in front of the autonomous or semi-autonomous vehicle. In other embodiments, the processing system may use other combinations of sensor data to make determinations about the surrounding environment.
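One way such a combination of radar information and captured images could look is sketched below. This is an assumption-laden illustration rather than the fusion approach of this disclosure: it assumes each camera detection and radar return carries a bearing in degrees (0 meaning straight ahead, small angles only), and all type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class CameraDetection:
    label: str          # e.g. "vehicle" or "pedestrian" (assumed label set)
    bearing_deg: float  # 0 means straight ahead of the vehicle


@dataclass(frozen=True)
class RadarReturn:
    bearing_deg: float
    range_m: float


@dataclass(frozen=True)
class FusedObject:
    label: str
    bearing_deg: float
    range_m: Optional[float]  # None when no radar return matched


def fuse(camera: Sequence[CameraDetection],
         radar: Sequence[RadarReturn],
         max_bearing_gap_deg: float = 2.0) -> List[FusedObject]:
    """Attach the closest-in-bearing radar range to each camera detection."""
    fused = []
    for det in camera:
        candidates = [r for r in radar
                      if abs(r.bearing_deg - det.bearing_deg) <= max_bearing_gap_deg]
        nearest = min(candidates,
                      key=lambda r: abs(r.bearing_deg - det.bearing_deg),
                      default=None)
        fused.append(FusedObject(det.label, det.bearing_deg,
                                 nearest.range_m if nearest else None))
    return fused


def object_ahead(fused: Sequence[FusedObject],
                 labels: Sequence[str] = ("vehicle", "pedestrian"),
                 half_width_deg: float = 5.0) -> bool:
    """True if a labeled object with a radar-confirmed range lies ahead."""
    return any(o.label in labels
               and abs(o.bearing_deg) <= half_width_deg
               and o.range_m is not None
               for o in fused)
```

Here the camera supplies the object class and the radar supplies the range, so a determination that another vehicle or pedestrian is in front requires agreement between both sensors.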
[0112] When operating in autonomous (or semi-autonomous) mode, a vehicle can control its operation with minimal human input. For example, a human operator can input an address into the vehicle, which can then drive to the designated destination without further human input (e.g., the human does not need to steer or touch the brake / accelerator pedal). Furthermore, when the vehicle operates autonomously or semi-autonomously, sensor systems can receive environmental data. The vehicle's processing system can modify the vehicle's control based on the environmental data received from various sensors. In some examples, the vehicle can change its speed in response to environmental data from various sensors. The vehicle can change its speed to avoid obstacles, comply with traffic regulations, etc. When the processing system in the vehicle identifies an object near the vehicle, the vehicle may be able to change its speed or otherwise alter its movement.
[0113] When a vehicle detects an object but lacks confidence in its detection, it may request a human operator (or a more powerful computer) to perform one or more remotely assisted tasks, such as (i) confirming whether the object actually exists in the surrounding environment (e.g., whether there is actually a stop sign or not), (ii) confirming whether the vehicle's identification of the object is correct, (iii) correcting the identification if incorrect, and / or (iv) providing supplementary instructions (or modifying current instructions) to an autonomous or semi-autonomous vehicle. Remotely assisted tasks may also include instructions from the human operator to control the vehicle's operation (e.g., instructing the vehicle to stop at the stop sign if the human operator determines the object is one), although in some cases the vehicle itself may control its own operation based on feedback from the human operator regarding object identification.
[0114] To facilitate this, the vehicle can analyze environmental data representing objects in the surrounding environment to identify at least one object with a detection confidence level below a threshold. A processor within the vehicle can be configured to detect various objects in the surrounding environment based on environmental data from various sensors. For example, in one embodiment, the processor can be configured to detect objects that may be important for the vehicle's identification. Such objects may include pedestrians, cyclists, street signs, other vehicles, indicator signals on other vehicles, and various other objects detected in the captured environmental data.
[0115] Detection confidence indicates the likelihood that an identified object is correctly identified or exists in the surrounding environment. For example, a processor can perform object detection on image data within received environmental data and determine that at least one object has a detection confidence below a threshold, based on being unable to identify the object with a detection confidence above the threshold. If the result of object detection or object recognition is uncertain, the detection confidence may be low or below a set threshold.
[0116] The vehicle can detect objects in the surrounding environment in various ways, depending on the source of the environmental data. In some embodiments, the environmental data may come from a camera and is image or video data. In other embodiments, the environmental data may come from LiDAR. The vehicle can analyze the captured image or video data to identify objects in the image or video data. The method and apparatus can be configured to monitor image and / or video data for the presence of objects in the surrounding environment. In other embodiments, the environmental data may be radar, audio, or other data. The vehicle can be configured to identify objects in the surrounding environment based on radar, audio, or other data.
[0117] In some embodiments, the technique used by the vehicle to detect objects can be based on a set of known data. For example, data related to environmental objects can be stored in a memory located within the vehicle. The vehicle can compare the received data with the stored data to determine objects. In other embodiments, the vehicle can be configured to determine objects based on the context of the data. For example, street signs associated with construction may typically be orange. Therefore, the vehicle can be configured to detect orange objects located near one side of a road as street signs associated with construction. Additionally, when the vehicle's processing system detects objects in the captured data, it can also calculate a confidence level for each object.
[0118] Furthermore, the vehicle may also have a confidence threshold. The confidence threshold can vary depending on the type of object being detected. For example, a lower confidence threshold might be used for objects that may require a rapid response from the vehicle (such as brake lights on another vehicle). However, in other embodiments, the confidence threshold may be the same for all detected objects. When the confidence associated with a detected object is greater than the confidence threshold, the vehicle can assume that the object has been correctly identified and adjust the vehicle's control accordingly based on that assumption.
[0119] The vehicle's actions can vary when the confidence level associated with a detected object is less than a confidence threshold. In some embodiments, the vehicle may react as if the detected object were present, despite the low confidence level. In other embodiments, the vehicle may react as if the detected object were not present.
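By way of a non-limiting illustration, the per-object-type threshold comparison described above can be sketched in Python as follows. The class names, threshold values, and outcome labels are hypothetical and chosen only for the example; the description does not specify them.

```python
from dataclasses import dataclass

# Hypothetical per-class thresholds; the numeric values are illustrative only.
CONFIDENCE_THRESHOLDS = {"brake_light": 0.5, "stop_sign": 0.8}
DEFAULT_THRESHOLD = 0.8  # assumed shared threshold for all other classes


@dataclass
class Detection:
    object_type: str
    confidence: float  # assumed to lie in [0, 1]


def threshold_for(object_type: str) -> float:
    """Return the threshold for a class, falling back to the shared default."""
    return CONFIDENCE_THRESHOLDS.get(object_type, DEFAULT_THRESHOLD)


def decide(detection: Detection, assume_present_when_unsure: bool = True) -> str:
    """Map a detection to one of three illustrative outcomes."""
    if detection.confidence > threshold_for(detection.object_type):
        # Above threshold: the object is assumed to be correctly identified.
        return "act_on_detection"
    # Below threshold: embodiments differ on whether to treat the object as present.
    if assume_present_when_unsure:
        return "act_as_if_present"
    return "act_as_if_absent"
```

In this sketch a brake light needs only 0.5 confidence to be acted on, reflecting the lower threshold for objects that may require a rapid response, while the flag selects between the two below-threshold behaviors.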
[0120] When the vehicle detects an object in its surroundings, it can also calculate a confidence level associated with that specific detected object. Depending on the embodiment, the confidence level can be calculated in various ways. In one example, when detecting an object in the surroundings, the vehicle can compare environmental data with predetermined data about known objects. The closer the match between the environmental data and the predetermined data, the higher the confidence level. In other embodiments, the vehicle can use mathematical analysis of the environmental data to determine the confidence level associated with the object.
[0121] In response to determining that an object has a detection confidence level below a threshold, the vehicle may send a request for remote assistance in object identification to a remote computing system. As mentioned above, the remote computing system can take various forms. For example, the remote computing system may be a computing device located within the vehicle but separate from the vehicle's own systems, with which a human operator can interact while riding as a passenger or driver, such as through a touchscreen interface for displaying remote assistance information. Additionally or alternatively, as another example, the remote computing system may be a remote computer terminal or other device located away from the vehicle.
[0122] Remote assistance requests may include environmental data containing the object, such as image data, audio data, etc. The vehicle may transmit the environmental data to a remote computing system via a network (e.g., network 304) and, in some embodiments, via a server (e.g., server computing system 306). A human operator of the remote computing system can then use the environmental data as the basis for responding to the request.
[0123] In some embodiments, when an object is detected as having a confidence level below a confidence threshold, the object may be given a preliminary identification, and the vehicle may be configured to adjust its operation in response to the preliminary identification. Such operational adjustment may take the form of stopping the vehicle, switching the vehicle to a manual control mode, changing the vehicle's velocity (e.g., speed and / or direction), and other possible adjustments.
[0124] In other embodiments, even if the vehicle detects an object with a confidence level that reaches or exceeds a threshold, the vehicle may operate based on the detected object (e.g., stop if the object is identified as a stop sign with a higher confidence level), but may be configured to request remote assistance while (or later) the vehicle is operating based on the detected object.
[0125] Figure 4A is a block diagram of a system according to an example embodiment. Specifically, Figure 4A shows system 400, which includes a system controller 402, a lidar device 410, multiple sensors 412, and multiple controllable components 414. The system controller 402 includes a processor 404, a memory 406, and instructions 408 stored in the memory 406 and executable by the processor 404 to perform functions.
[0126] Processor 404 may include one or more processors, such as one or more general-purpose microprocessors (e.g., having a single core or multiple cores) and / or one or more special-purpose microprocessors. One or more processors may include, for example, one or more central processing units (CPUs), one or more microcontrollers, one or more graphics processing units (GPUs), one or more tensor processing units (TPUs), one or more ASICs, and / or one or more field-programmable gate arrays (FPGAs). Other types of processors, computers, or devices configured to execute software instructions are also considered herein.
[0127] The memory 406 may include computer-readable media, such as non-transitory computer-readable media, which may include, but are not limited to, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), non-volatile random access memory (e.g., flash memory), solid-state drive (SSD), hard disk drive (HDD), optical disc (CD), digital video optical disc (DVD), digital magnetic tape, read / write (R / W) CD, R / W DVD, etc.
[0128] The lidar device 410, further described below, includes a plurality of light emitters configured to emit light (e.g., light pulses) and one or more photodetectors configured to detect light (e.g., reflected portions of light pulses). The lidar device 410 can generate three-dimensional (3D) point cloud data from the output of the photodetectors and provide the 3D point cloud data to a system controller 402. The system controller 402 can then perform operations on the 3D point cloud data to determine characteristics of the surrounding environment (e.g., relative positioning of objects within the surrounding environment, edge detection, object detection, and proximity sensing).
[0129] Similarly, system controller 402 may use outputs from multiple sensors 412 to determine characteristics of system 400 and / or the surrounding environment. For example, sensors 412 may include one or more of GPS, IMU, image capture devices (e.g., cameras), light sensors, thermal sensors, and other sensors indicating parameters related to system 400 and / or the surrounding environment. For illustrative purposes, lidar device 410 is depicted as separate from sensor 412, and in some examples may be considered part of or considered as sensor 412.
[0130] Based on characteristics of system 400 and / or the surrounding environment determined by system controller 402 from outputs from lidar device 410 and sensor 412, system controller 402 can control controllable component 414 to perform one or more actions. For example, system 400 may correspond to a vehicle, in which case controllable component 414 may include the vehicle's braking system, steering system, and / or acceleration system, and system controller 402 may modify aspects of these controllable components based on characteristics determined from lidar device 410 and / or sensor 412 (e.g., when system controller 402 controls the vehicle in autonomous or semi-autonomous mode). In this example, lidar device 410 and sensor 412 are also controllable by system controller 402.
[0131] Figure 4B is a block diagram of a lidar device according to an example embodiment. Specifically, Figure 4B shows a lidar device 410 having a controller 416 configured to control a plurality of light emitters 424 and one or more photodetectors (e.g., a plurality of photodetectors 426, etc.). The lidar device 410 also includes a firing circuit 428 configured to select and supply power to each of the plurality of light emitters 424, and may include a selector circuit 430 configured to select each of the plurality of photodetectors 426. The controller 416 includes a processor 418, a memory 420, and instructions 422 stored in the memory 420.
[0132] Similar to processor 404, processor 418 may include one or more processors, such as one or more general-purpose microprocessors and / or one or more special-purpose microprocessors. One or more processors may include, for example, one or more CPUs, one or more microcontrollers, one or more GPUs, one or more TPUs, one or more ASICs, and / or one or more FPGAs. Other types of processors, computers, or devices configured to execute software instructions are also considered herein.
[0133] Similar to memory 406, memory 420 may include computer-readable media, such as non-transitory computer-readable media, such as, but not limited to, ROM, PROM, EPROM, EEPROM, non-volatile random access memory (e.g., flash memory), SSD, HDD, CD, DVD, digital magnetic tape, R / W CD, R / W DVD, etc.
[0134] Instructions 422 are stored in memory 420 and can be executed by processor 418 to perform functions associated with controlling firing circuit 428 and selector circuit 430, generating 3D point cloud data, and processing 3D point cloud data (or possibly facilitating processing of 3D point cloud data by another computing device such as system controller 402).
[0135] The controller 416 can determine 3D point cloud data by emitting light pulses using light emitters 424. An emission time is established for each light emitter, and the relative position at the time of emission is also tracked. Various aspects of the environment surrounding the lidar device 410, such as various objects, reflect the light pulses. For example, when the lidar device 410 is in an environment including roads, such objects may include vehicles, signs, pedestrians, road surfaces, or construction cones. Some objects may be more reflective than others, such that the intensity of the reflected light can indicate the type of object reflecting the light pulse. Furthermore, the surface of an object may be positioned differently relative to the lidar device 410, and therefore take more or less time to reflect a portion of the light pulse back to the lidar device 410. Therefore, the controller 416 can track the detection time when the photodetector detects the reflected light pulse and the relative position of the photodetector at the detection time. By measuring the time difference between the emission time and the detection time, the controller 416 can determine how far the light pulse travels before being received, and thus determine the relative distance to the corresponding object. By tracking the relative positioning at the emission and detection times, the controller 416 can determine the orientation of the light pulses and reflected light pulses relative to the lidar device 410, thereby determining the relative orientation of the object. By tracking the intensity of the received light pulses, the controller 416 can determine the reflectivity of the object. Therefore, the 3D point cloud data determined based on this information can indicate the relative positioning of the detected reflected light pulses (e.g., in a coordinate system such as a Cartesian coordinate system) and the intensity of each reflected light pulse.
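By way of a non-limiting illustration, the time-of-flight calculation described above can be sketched in Python as follows. The function names and the azimuth/elevation parameterization of the pulse direction are assumptions made for the example; the description only states that distance follows from the emission-to-detection time difference and that points may be expressed in Cartesian coordinates.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_time_of_flight(emit_time_s: float, detect_time_s: float) -> float:
    """Distance to the reflecting surface, in meters.

    The pulse travels to the object and back, so the one-way
    distance is half of the round-trip path length.
    """
    round_trip_s = detect_time_s - emit_time_s
    if round_trip_s < 0:
        raise ValueError("detection time precedes emission time")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def to_cartesian(range_m: float, azimuth_rad: float, elevation_rad: float):
    """Convert a range and an assumed pulse direction to an (x, y, z) point."""
    horizontal = range_m * math.cos(elevation_rad)
    x = horizontal * math.cos(azimuth_rad)
    y = horizontal * math.sin(azimuth_rad)
    z = range_m * math.sin(elevation_rad)
    return (x, y, z)
```

For instance, a round-trip time of one microsecond corresponds to a range of roughly 150 meters.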
[0136] Firing circuit 428 is used to select the light emitter for emitting light pulses. Selector circuit 430 is similarly used to sample the output from the photodetector.
[0137] Figure 5A shows device 500. In some embodiments, device 500 may correspond to part of the sensor system 102 described above with respect to Figure 1, the sensor systems 202, 204, 206, 208, 210, 212, and 214 described with respect to Figures 2A to 2E, and / or the system 400 described above with respect to Figure 4A. Such devices may also be located on the vehicle 200 in the same or similar positioning and / or orientation as sensor systems 202, 204, 206, 208, 210, 212, and 214. Furthermore, in some embodiments, multiple instances of device 500 may be attached to the same vehicle.
[0138] In some embodiments, device 500 may include an image sensor 502, an analog-to-digital converter (ADC) 504, and / or an application-specific integrated circuit (ASIC) 506. The image sensor 502 may be configured to capture image data about the surrounding environment of device 500. The ADC 504 may be configured to receive the captured image data from the image sensor 502 and / or provide converted image data. The ASIC 506 may be configured to receive the converted image data and / or apply a trained machine learning model to the converted image data. The machine learning model may be configured to identify one or more objects in the surrounding environment within the converted image data.
[0139] In some embodiments, a trained machine learning model can be selected from multiple machine learning models based on the orientation and / or positioning of the device 500 relative to the vehicle 200. For example, a first trained machine learning model may be trained to recognize vehicles, pedestrians, traffic signals, or signs, while a second trained machine learning model may be trained to recognize passing vehicles, passing cyclists, or pedestrians. Therefore, an appropriate machine learning model can be selected for the current task based on the orientation and / or positioning of the device 500 relative to the vehicle 200.
[0140] Figure 5B depicts an embodiment of the device 500 depicted in Figure 5A. As shown, in some embodiments, the device (depicted in Figure 5B as device 550) may include a multilayer die stack, with certain components disposed on each layer. Each layer is physically coupled to the layer immediately above and / or below it. For example, each layer may be constructed on a substrate made of a semiconductor material such as silicon. These layers can then be interconnected using wire bonding, controlled collapse chip connection (C4) methods (also known as "flip chip"), and / or through-silicon vias (TSVs). These layers may be adhered using a die attach film (DAF).
[0141] In some embodiments, device 550 may include a multilayer die stack having at least three layers: a first layer 552, a second layer 554, and a third layer 556. The second layer 554 may be positioned above the first layer 552, and the third layer 556 may be positioned above the second layer 554. Each layer may have one or more components of the sensor device 500 disposed thereon. For example, an application-specific integrated circuit (ASIC) 506 may be located on the first layer 552, an analog-to-digital converter (ADC) 504 may be located on the second layer 554, and an image sensor 502 may be located on the third layer 556.
[0142] As described above with respect to Figures 2A-2E, sensor systems can be located at different positions and orientations on the vehicle 200. This also applies to the device 500. For example, the vehicle can have several devices 500 located at different positions and / or orientations around its exterior.
[0143] An example is depicted in Figure 5C, which illustrates a vehicle 570 with two devices 500 positioned around its exterior: a forward device 572 and a lateral device 574. Dashed lines indicate the direction of travel of the vehicle 570. As described above, a vehicle may have several devices 500 positioned at different locations and / or orientations around its exterior. In some cases, the orientation of the devices 500 around the vehicle 570 may be defined relative to the direction of travel of the vehicle 570.
[0144] For example, the first orientation of one device may be within 15° of parallel to the vehicle's direction of travel (e.g., forward device 572), while the second orientation of another device may be within 15° of perpendicular to the vehicle's direction of travel (e.g., lateral device 574). Based on this difference in position and / or orientation, different sensor devices may produce different image frames, as discussed below with respect to Figures 7A-7D.
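By way of a non-limiting illustration, the 15° orientation bands and the orientation-based model selection can be sketched in Python as follows. The angle convention (0° meaning the device faces the direction of travel), the model names, and the fallback model are assumptions made for the example.

```python
def classify_orientation(angle_deg: float, tolerance_deg: float = 15.0) -> str:
    """Classify a device by the angle between its viewing axis and the
    vehicle's direction of travel (assumed convention: 0 = facing forward)."""
    a = abs(angle_deg) % 360.0
    if a > 180.0:
        a = 360.0 - a  # fold the angle into [0, 180]
    if a <= tolerance_deg:
        return "forward"  # within tolerance of parallel to travel
    if abs(a - 90.0) <= tolerance_deg:
        return "lateral"  # within tolerance of perpendicular to travel
    return "other"


# Hypothetical model identifiers, one per orientation class.
MODEL_BY_ORIENTATION = {
    "forward": "forward_model",  # e.g., vehicles, pedestrians, signals, signs
    "lateral": "lateral_model",  # e.g., passing vehicles, cyclists, pedestrians
}


def select_model(angle_deg: float) -> str:
    """Pick a model for the device's orientation, with an assumed general fallback."""
    return MODEL_BY_ORIENTATION.get(classify_orientation(angle_deg), "general_model")
```

Under these assumptions, a device mounted at 10° would receive the forward model, while a device mounted at 95° to either side would receive the lateral model.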
[0145] Figure 6A depicts a process 600 that can be performed by device 602, which may correspond to device 500 or device 550 described above. Similar to devices 500 and 550, device 602 may include an image sensor 604, which may be configured to capture image data 606 about the surrounding environment of device 602. This image data 606 may then be provided to an analog-to-digital converter (ADC) 608. The ADC 608 may then convert the image data 606 from an analog signal to digital information, thereby producing converted image data 610. The converted image data 610 may then be provided to an application-specific integrated circuit (ASIC) 612. The ASIC 612 may be configured to perform processing and / or computer vision tasks on the converted image data 610. For example, the ASIC 612 may be configured to apply a trained machine learning model 614 to the converted image data 610 to identify one or more objects in the surrounding environment within the converted image data 610. In some embodiments, the trained machine learning model may include a convolutional neural network (CNN).
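By way of a non-limiting illustration, the analog-to-digital conversion step can be sketched in Python as a uniform quantizer. The reference voltage, the bit depth, and the representation of the analog image as rows of voltages are assumptions made for the example; the description does not specify how ADC 608 is implemented.

```python
def quantize(voltage: float, v_ref: float = 1.0, bits: int = 10) -> int:
    """Map an analog pixel voltage in [0, v_ref] to an integer code.

    Out-of-range inputs are clamped, as a saturating converter would do.
    """
    max_code = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * max_code)


def convert_frame(analog_rows, v_ref: float = 1.0, bits: int = 10):
    """Quantize every pixel of a frame given as a list of rows of voltages."""
    return [[quantize(v, v_ref, bits) for v in row] for row in analog_rows]
```

The resulting integer codes stand in for the converted image data 610 that is passed to the ASIC.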
[0146] In the context of device 602 operating in conjunction with vehicle 200, trained machine learning model 614 can be configured to identify objects relevant to the navigation of vehicle 200. In some embodiments, ASIC 612 can receive, from a lidar device (e.g., the lidar device 410 described with respect to Figure 4A and Figure 4B), lidar data indicating the distance between vehicle 200 and one or more objects in the surrounding environment. In some embodiments, ASIC 612 may receive lidar data from a source other than the lidar device. For example, ASIC 612 may receive lidar data from a central processing unit or system, or the lidar data may be stored in memory within device 602. In this way, lidar data collected by other lidar devices (e.g., lidar devices on vehicles other than vehicle 200) may also be processed by device 602. ASIC 612 may be configured to apply the trained machine learning model 614 to a subset of the converted image data 610 based on the lidar data. For example, the lidar data may indicate that an object may be located in a certain area of the converted image data 610, and the trained machine learning model 614 may be applied to that particular area based on such indication.
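By way of a non-limiting illustration, applying the model only to areas indicated by lidar data can be sketched in Python as follows. The region format (top, left, height, width in pixels) and the treatment of the model as an arbitrary callable are assumptions made for the example; how lidar returns are mapped to image areas is not specified here.

```python
def crop_region(image, region):
    """Return the sub-image for region = (top, left, height, width) in pixels."""
    top, left, height, width = region
    return [row[left:left + width] for row in image[top:top + height]]


def detect_in_regions(image, lidar_regions, model):
    """Run the model only on the image areas flagged by lidar data.

    `model` is a hypothetical callable that takes an image patch and
    returns whatever the trained model would output for it.
    """
    results = []
    for region in lidar_regions:
        patch = crop_region(image, region)
        results.append((region, model(patch)))
    return results
```

Restricting inference to the flagged patches reduces the number of pixels the model has to process per frame.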
[0147] After processing, the ASIC 612 can output an image frame 616. The image frame 616 may include the converted image data 610 and additional metadata, as discussed below with respect to Figure 6B.
[0148] Figure 6B is a block diagram depicting an image frame 650, such as the image frame 616 generated by process 600 shown in Figure 6A. The output image frame 650 may include converted image data 652, which may correspond to the converted image data 610 generated by the ADC 608. In some embodiments, the image frame 650 may be provided to a central computing device (e.g., the system controller 402 described above with respect to Figure 4A). Although the image frame 650 shown in Figure 6B includes both converted image data 652 and metadata 654, in some embodiments the image frame 650 may include only one or the other (e.g., only converted image data 652).
[0149] Image frame 650 may also include metadata 654, which may include additional information about image frame 650, converted image data 652, and / or the device 602 that generated it. For example, metadata 654 may include object classification data 656 and / or object location data 658 relating to objects identified within the converted image data 652. Such objects can be identified and / or located by applying a trained machine learning model 614. For example, object classification data 656 may include determining whether an object present in the converted image data 652 is a vehicle, a person, a traffic signal, or a traffic sign. In some embodiments, object classification data 656 may also include information about the orientation of such an object relative to a vehicle (e.g., passing, in front, behind, etc.).
[0150] Object location data 658 may include an indication of the location of the identified object within the converted image data 652. For example, object location data 658 may include a region of the converted image data 652. As another example, object location data 658 may include coordinates specifying the location of the object within the converted image data 652. In the example above, where the trained machine learning model 614 is applied based on lidar data, object location data 658 may include spatial coordinates (e.g., x, y, and / or z coordinates) within the lidar data and / or the converted image data.
[0151] Metadata 654 can be included within image frame 650 as one or more rows and / or columns, for example, as binary or text data. Examples are discussed below with respect to Figures 7A-7D.
[0152] Figure 7A depicts image data 700 captured by an image sensor of a forward-mounted device (e.g., a device at or near the location of the sensor system 206 shown in Figures 2A to 2E and / or the device 572 shown in Figure 5C). During the operation of the vehicle 200, such a forward-mounted device can capture objects such as vehicles, pedestrians, traffic signals, or signs. In Figure 7A, image data 700 captures vehicle 702 and stop sign 704. This image data can then be processed by other components of the device as described herein, for example, by the ADC and / or ASIC via process 600 depicted in Figure 6A.
[0153] Figure 7B depicts the same image data 700 after it has been processed by components of the device as described herein, for example, via process 600 depicted in Figure 6A. Specifically, Figure 7B depicts image frame 720, which contains converted image data 722 (converted from image data 700) and metadata 724. This metadata 724 can be appended as an additional line (or multiple lines) to image frame 720. In the specific example of Figure 7B, metadata 724 includes object classification data (object_type) and object location data (object_location), the latter being in the form of three-dimensional coordinates. In some embodiments, object location data may refer to lidar data, as described above. Although metadata 724 is shown in Figure 7B appended to the top row of image frame 720, this is merely an example. In some embodiments, metadata 724 may be appended to the bottom row of image frame 720 and / or to one or more columns of image frame 720.
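By way of a non-limiting illustration, carrying metadata as an extra row of an image frame can be sketched in Python as follows. The JSON text encoding, the zero padding, and the 8-bit pixel representation are assumptions made for the example; the description states only that the metadata may be binary or text data occupying one or more rows and / or columns. The field names object_type and object_location follow the example of Figure 7B.

```python
import json


def append_metadata_row(pixel_rows, metadata, at_top=True):
    """Return a new frame with one extra row holding the encoded metadata.

    Assumes 8-bit pixels, so each metadata byte occupies one pixel slot.
    """
    width = len(pixel_rows[0])
    payload = json.dumps(metadata, separators=(",", ":")).encode("ascii")
    if len(payload) > width:
        raise ValueError("metadata does not fit in a single row")
    row = list(payload) + [0] * (width - len(payload))  # zero-pad to frame width
    return [row] + pixel_rows if at_top else pixel_rows + [row]


def read_metadata_row(frame, at_top=True):
    """Recover the metadata from the top or bottom row of a frame."""
    row = frame[0] if at_top else frame[-1]
    payload = bytes(row).rstrip(b"\x00")  # drop the zero padding
    return json.loads(payload.decode("ascii"))
```

A receiver such as a central computing device could then strip the metadata row before treating the remaining rows as pixels.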
[0154] Figure 7C depicts another example of image data, captured by a device positioned and / or oriented differently from the device of Figure 7A. Specifically, it depicts image data 740 captured by an image sensor of a side-mounted device (e.g., a device at or near the location of the sensor system 216 shown in Figures 2A-2E and / or the device 574 shown in Figure 5C). During the operation of vehicle 200, such a side-mounted device can capture objects such as passing vehicles, passing cyclists, or pedestrians. In Figure 7C, image data 740 captures passing cyclist 742 and passing vehicle 744. This image data can then be processed by other components of the device as described herein, for example, by the ADC and / or ASIC via process 600 depicted in Figure 6A.
[0155] Figure 7D depicts the same image data 740 after it has been processed by components of the device as described herein, for example, via process 600 depicted in Figure 6A. Specifically, Figure 7D depicts image frame 760, which contains converted image data 762 (converted from image data 740) and metadata 764. This metadata 764 can be appended as an additional line (or multiple lines) to image frame 760. In the specific example of Figure 7D, metadata 764 includes object classification data (object_type) and object location data (object_location) for passing cyclist 742, the latter being in the form of two-dimensional coordinates (e.g., x and y coordinates within converted image data 762).
[0156] As described above, image frames such as image frame 720 and image frame 760 can be provided to a central computing device (e.g., the system controller 402 described above with respect to Figure 4A).
[0157] As shown in Figures 7A-7D, different devices can generate different image frames. For specialization and higher efficiency, different devices can be equipped with machine learning models specifically designed and / or trained for the types of image frames they are likely to encounter. For example, the device that generated image frame 720 shown in Figure 7B may have used a machine learning model trained to recognize stop signs, etc. However, in some embodiments, the machine learning model used by each of the different devices may be the same model. For example, a model may be trained to recognize multiple different types of objects (e.g., based on the set of training data used to train the model).
[0158] As another example, a trained machine learning model can be trained via a supervised learning process using image data collected from other vehicles. This image data may be collected using a camera or other sensor with the same or similar orientation as the device on which the trained machine learning model is to be deployed. For example, the model for a forward-facing device (e.g., the forward device 572 in Figure 5C) can be trained on image data collected from other vehicles using cameras also mounted at the front of those vehicles. Training the model in this way can improve the accuracy of object recognition performed using the model.
[0159] In other words, the trained machine learning model can be trained through a supervised learning process using image data collected by a camera coupled to a first auxiliary vehicle, which operates in an autonomous or semi-autonomous mode along an auxiliary orientation relative to the vehicle's direction of travel, wherein the orientation relative to the vehicle's direction of travel and the auxiliary orientation relative to the auxiliary vehicle's direction of travel are substantially the same.
[0160] As described above, the trained machine learning model can be selected from multiple machine learning models. In some embodiments, the device (e.g., device 500) may include memory, and the trained machine learning model may be stored in a first memory during the manufacture of the vehicle. In some embodiments, the trained machine learning model may be selected from multiple machine learning models based on the location and / or orientation of the device.
[0161] In other words, the device may include memory, and the ASIC may be configured to receive the trained machine learning model from a remote computing device (e.g., the remote computing system 302 and / or server computing system 306 described with respect to Figure 3) and store the trained machine learning model in the memory.
[0162] Figure 8 is a flowchart of method 800 according to an example embodiment. Method 800 can be performed by a system (e.g., the system 400 shown and described with reference to Figure 4A), a system controller (e.g., the system controller 402 shown and described with reference to Figure 4A), and / or a device (e.g., the device 500 shown in Figure 5A).
[0163] At block 802, method 800 may include capturing image data about the surrounding environment by an image sensor.
[0164] At block 804, method 800 may include receiving captured image data from an image sensor by an analog-to-digital converter (ADC).
[0165] At block 806, method 800 may include providing the converted image data by the ADC to an application-specific integrated circuit (ASIC).
[0166] At block 808, method 800 may include applying, by the ASIC, a trained machine learning model to the converted image data to identify one or more objects in the surrounding environment within the converted image data.
[0167] At block 810, method 800 may include outputting an image frame by the ASIC, wherein at least one line of the image frame includes metadata including object classification data and object location data of one or more identified objects in the surrounding environment.
[0168] In some embodiments, method 800 may further include: receiving an updated trained machine learning model from a remote computing device by an ASIC; storing the updated trained machine learning model in memory by the ASIC; capturing additional image data about the surrounding environment by an image sensor; receiving the additional captured image data from the image sensor by an ADC; providing the additional transformed image data to the ASIC by the ADC; applying the updated trained machine learning model to the additional transformed image data by the ASIC to identify one or more additional objects in the surrounding environment within the additional transformed image data; and outputting additional image frames by the ASIC, wherein at least one line of the additional image frames includes metadata including object classification data and object location data of one or more additional identified objects in the surrounding environment.
[0169] In some embodiments, method 800 may further include providing image frames to a central computing device by an ASIC.
[0170] Figure 9 is a flowchart of method 900 according to an example embodiment. Method 900 may be performed by a system (e.g., the system 400 shown and described with reference to Figure 4A), a system controller (e.g., the system controller 402 shown and described with reference to Figure 4A), and/or a device (e.g., the device 500 shown in Figure 5A).
[0171] At block 902, method 900 may include capturing first image data about the surrounding environment by a first image sensor of a first device, wherein the first device is attached to the vehicle along a first orientation relative to the vehicle.
[0172] At block 904, method 900 may include receiving captured first image data from a first image sensor by a first ADC of the first device.
[0173] At block 906, method 900 may include providing converted first image data by the first ADC.
[0174] At block 908, method 900 may include receiving the converted first image data from the first ADC by a first ASIC of the first device.
[0175] At block 910, method 900 may include applying, by the first ASIC, a first trained machine learning model to the converted first image data to identify one or more first objects in the surrounding environment within the converted first image data, wherein the first trained machine learning model is selected from a plurality of machine learning models based on the first orientation.
[0176] At block 912, method 900 may include capturing second image data about the surrounding environment by a second image sensor of a second device, wherein the second device is attached to the vehicle along a second orientation relative to the vehicle, wherein the first orientation and the second orientation are different.
[0177] At block 914, method 900 may include receiving captured second image data from a second image sensor by a second ADC of the second device.
[0178] At block 916, method 900 may include providing converted second image data by the second ADC.
[0179] At block 918, method 900 may include receiving the converted second image data from the second ADC by a second ASIC of the second device.
[0180] At block 920, method 900 may include applying, by the second ASIC, a second trained machine learning model to the converted second image data to identify one or more second objects in the surrounding environment within the converted second image data, wherein the second trained machine learning model is selected from the plurality of machine learning models based on the second orientation.
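The orientation-based model selection of blocks 910 and 920 can be sketched in Python as below. The sketch assumes the orientation is expressed as a yaw angle in degrees relative to the vehicle's direction of travel and borrows the 15° tolerance and the example object categories from the claims. The names `select_model_key` and `MODEL_LABELS`, the treatment of rearward-facing devices as parallel, and the error raised for other angles are choices made for this sketch and are not taken from the disclosure.

```python
# Hypothetical registry: each key stands in for one trained model and lists
# the object categories that model is trained to recognize.
MODEL_LABELS = {
    "parallel": ("vehicle", "pedestrian", "traffic_signal", "traffic_sign"),
    "perpendicular": ("passing_vehicle", "passing_cyclist", "pedestrian"),
}

TOLERANCE_DEG = 15.0


def select_model_key(yaw_deg: float) -> str:
    """Pick a model key from a device's yaw relative to the direction of travel.

    0 degrees means the device faces along the direction of travel.
    """
    # Fold the angle into [0, 180) so forward and rearward both count as parallel.
    folded = yaw_deg % 180.0
    deviation_from_parallel = min(folded, 180.0 - folded)
    deviation_from_perpendicular = abs(folded - 90.0)
    if deviation_from_parallel <= TOLERANCE_DEG:
        return "parallel"
    if deviation_from_perpendicular <= TOLERANCE_DEG:
        return "perpendicular"
    # The disclosure does not say what happens for other orientations.
    raise ValueError(f"no model registered for yaw {yaw_deg} degrees")
```

Under these assumptions a front-facing device at 10° selects the `"parallel"` entry and a side-facing device at -100° selects the `"perpendicular"` entry, so each ASIC only needs to hold the model matched to its mounting position.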
[0181] This disclosure is not limited to the specific embodiments described herein, which are intended to illustrate various aspects. It will be apparent to those skilled in the art that many modifications and variations can be made without departing from its spirit and scope. In addition to those listed herein, functionally equivalent methods and apparatuses within the scope of this disclosure will become apparent from the foregoing description. These modifications and variations are intended to fall within the scope of the appended claims.
[0182] The above detailed description, with reference to the accompanying drawings, illustrates various features and functions of the disclosed systems, devices, and methods. In the drawings, similar symbols generally identify similar components unless the context otherwise requires. The exemplary embodiments described herein and in the drawings are not intended to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the scope of the subject matter set forth herein. It will be readily understood that aspects of this disclosure, as generally described herein and illustrated in the drawings, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations, all of which are explicitly contemplated herein.
[0183] With respect to any or all of the message flow diagrams, scenarios, and flowcharts in the accompanying drawings and as discussed herein, each step, block, operation, and/or communication may represent the processing and/or transmission of information according to exemplary embodiments. Alternative embodiments are included within the scope of these exemplary embodiments. In these alternative embodiments, for example, operations described as steps, blocks, transmissions, communications, requests, responses, and/or messages may be performed in an order other than that shown or discussed, including substantially simultaneously or in reverse order, depending on the functionality involved. Furthermore, more or fewer blocks and/or operations may be used with any of the message flow diagrams, scenarios, and flowcharts discussed herein, and these message flow diagrams, scenarios, and flowcharts may be combined with each other in part or in whole.
[0184] The steps, blocks, or operations representing information processing may correspond to circuitry that can be configured to perform specific logical functions of the methods or techniques described herein. Alternatively or additionally, the steps or blocks representing information processing may correspond to modules, segments, or portions of program code (including associated data). The program code may include one or more instructions executable by a processor for implementing specific logical operations or actions in the method or technique. The program code and/or associated data may be stored on any type of computer-readable medium, such as storage devices including RAM, disk drives, solid-state drives, or other storage media.
[0185] Furthermore, a step, block, or operation representing one or more information transfers may correspond to information transfers between software and/or hardware modules within the same physical device. However, other information transfers may occur between software and/or hardware modules in different physical devices.
[0186] The specific arrangements shown in the figures should not be considered limiting. It should be understood that other embodiments may include more or fewer of each element shown in the given figures. Furthermore, some of the elements shown may be combined or omitted. Additionally, exemplary embodiments may include elements not shown in the figures.
[0187] While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for illustrative purposes and not restrictive, and the true scope is indicated by the appended claims.
Claims
1. A system for object recognition, comprising:
a vehicle;
a first device attached to the vehicle along a first orientation relative to the vehicle, wherein the first device comprises:
  a first image sensor configured to capture first image data about a surrounding environment;
  a first analog-to-digital converter (ADC) configured to receive the captured first image data from the first image sensor and provide converted first image data; and
  a first application-specific integrated circuit (ASIC) configured to:
    receive the converted first image data from the first ADC; and
    apply a first trained machine learning model to the converted first image data to identify one or more first objects in the surrounding environment within the converted first image data, wherein the first trained machine learning model is selected from a plurality of machine learning models based on the first orientation; and
a second device attached to the vehicle along a second orientation relative to the vehicle, wherein the first orientation and the second orientation are different, and wherein the second device comprises:
  a second image sensor configured to capture second image data about the surrounding environment;
  a second ADC configured to receive the captured second image data from the second image sensor and provide converted second image data; and
  a second ASIC configured to:
    receive the converted second image data from the second ADC; and
    apply a second trained machine learning model to the converted second image data to identify one or more second objects in the surrounding environment within the converted second image data, wherein the second trained machine learning model is selected from the plurality of machine learning models based on the second orientation.
2. The system according to claim 1, wherein the first orientation is within 15° of parallel to the vehicle's direction of travel, and the second orientation is within 15° of perpendicular to the vehicle's direction of travel.
3. The system according to claim 2, wherein the first trained machine learning model is trained through a supervised learning process using image data collected by a camera coupled to a first auxiliary vehicle operating in an autonomous or semi-autonomous mode, the camera being oriented along a first auxiliary orientation relative to the direction of travel of the first auxiliary vehicle, and wherein the first orientation relative to the direction of travel of the vehicle and the first auxiliary orientation relative to the direction of travel of the first auxiliary vehicle are substantially the same.
4. The system according to claim 2, wherein the first trained machine learning model is trained to recognize vehicles, pedestrians, traffic signals, or signs.
5. The system according to claim 2, wherein the second trained machine learning model is trained to identify passing vehicles, passing cyclists, or pedestrians.
6. The system according to claim 1, wherein the first device further comprises a first memory, and wherein the first trained machine learning model is stored in the first memory during manufacture of the vehicle.
7. The system according to claim 1, wherein the first device further comprises a first memory, wherein the second device further comprises a second memory, wherein the first ASIC is further configured to receive the first trained machine learning model from a remote computing device and store the first trained machine learning model in the first memory, and wherein the second ASIC is further configured to receive the second trained machine learning model from the remote computing device and store the second trained machine learning model in the second memory.
8. The system according to claim 1, wherein the first trained machine learning model and the second trained machine learning model are different from each other.
9. The system according to claim 1, further comprising a lidar device configured to generate lidar data indicating the distance between the vehicle and one or more objects in the surrounding environment, wherein the first ASIC is configured to apply the first trained machine learning model to a subset of the converted first image data based on the lidar data.
10. The system according to claim 1, wherein the first device further comprises:
a first layer of a multi-layer die stack, wherein the first ASIC is disposed on the first layer;
a second layer of the multi-layer die stack, wherein the second layer is positioned above and coupled to the first layer, and wherein the first ADC is disposed on the second layer; and
a third layer of the multi-layer die stack, wherein the third layer is positioned above and coupled to the second layer, and wherein the first image sensor is disposed on the third layer.
11. A device, comprising:
an image sensor configured to capture image data about a surrounding environment;
an analog-to-digital converter (ADC) configured to receive the captured image data from the image sensor and provide converted image data; and
an application-specific integrated circuit (ASIC) configured to:
  receive the converted image data from the ADC;
  apply a trained machine learning model to the converted image data to identify one or more objects in the surrounding environment within the converted image data; and
  output an image frame, wherein at least one line of the image frame includes metadata including object classification data and object location data of the one or more identified objects in the surrounding environment.
12. The device according to claim 11, wherein the trained machine learning model includes a convolutional neural network (CNN).
13. The device according to claim 11, wherein the ASIC is configured to apply the trained machine learning model to a subset of the converted image data based on lidar data indicating distances to one or more objects in the surrounding environment.
14. The device according to claim 13, wherein the object location data includes spatial coordinates within the lidar data.
15. The device according to claim 11, wherein the object location data includes spatial coordinates within the converted image data.
16. The device according to claim 11, wherein the object classification data includes a determination of whether an object is a vehicle, a person, a traffic signal, or a traffic sign.
17. The device according to claim 11, further comprising:
a first layer of a multi-layer die stack, wherein the ASIC is disposed on the first layer;
a second layer of the multi-layer die stack, wherein the second layer is positioned above and coupled to the first layer, and wherein the ADC is disposed on the second layer; and
a third layer of the multi-layer die stack, wherein the third layer is positioned above and coupled to the second layer, and wherein the image sensor is disposed on the third layer.
18. A method for object recognition, comprising:
capturing, by an image sensor, image data about a surrounding environment;
receiving, by an analog-to-digital converter (ADC), the captured image data from the image sensor;
providing, by the ADC, converted image data to an application-specific integrated circuit (ASIC);
applying, by the ASIC, a trained machine learning model to the converted image data to identify one or more objects in the surrounding environment within the converted image data; and
outputting, by the ASIC, an image frame, wherein at least one line of the image frame includes metadata including object classification data and object location data of the one or more identified objects in the surrounding environment.
19. The method according to claim 18, further comprising:
receiving, by the ASIC, an updated trained machine learning model from a remote computing device;
storing, by the ASIC, the updated trained machine learning model in memory;
capturing, by the image sensor, additional image data about the surrounding environment;
receiving, by the ADC, the additional captured image data from the image sensor;
providing, by the ADC, additional converted image data to the ASIC;
applying, by the ASIC, the updated trained machine learning model to the additional converted image data to identify one or more additional objects in the surrounding environment within the additional converted image data; and
outputting, by the ASIC, an additional image frame, wherein at least one line of the additional image frame includes metadata including object classification data and object location data of the one or more additional identified objects in the surrounding environment.
20. The method according to claim 18, further comprising providing, by the ASIC, the image frame to a central computing device.