Video capture processing and effects
By determining and adjusting the camera's position and trajectory, and combining auxiliary visual odometry and IMU, the problem of image shakiness during handheld shooting was solved, achieving a smooth and consistent virtual track camera effect.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- QUALCOMM INC
- Filing Date
- 2024-11-19
- Publication Date
- 2026-06-16
AI Technical Summary
Existing cameras struggle to move smoothly and consistently along a predefined path when handheld shooting, resulting in image jitter and instability, making it difficult to achieve the effect of a virtual track camera.
By determining the location and planned trajectory of the image capture device, and combining auxiliary visual odometry and IMU, the actual positioning of the camera is adjusted to match the planned trajectory, thereby achieving image transformation and distortion to simulate the shooting effect of a track camera.
It enables smooth camera movement along a predefined path, improving image stability and consistency, and enhancing the shooting effect of the virtual track camera.
Figure CN122228522A_ABST
Abstract
Description
Technical Field
[0001] This application relates generally to capturing and processing images. For example, aspects of this application relate to an image capture and processing apparatus configured to perform video capture processing and to provide various effects based on the video capture processing (e.g., acting as a virtual track camera).
Background Technology
[0002] A camera is a device that uses an image sensor to receive light and capture image frames (such as still images or video frames). A camera may include one or more processors, such as an image signal processor (ISP), that process one or more image frames captured by the image sensor. For example, raw image frames captured by the image sensor may be processed by the ISP to generate a final image. Cameras can be configured with various image capture and image processing settings to alter the appearance of an image. However, changing the appearance of an image can only alter what has already been captured. In some cases, techniques that help improve the content captured by the camera may be useful.
Summary of the Invention
[0003] In some examples, systems and techniques for improved image capture are described. For example, aspects of this disclosure relate to systems and techniques for performing video capture processing and providing various effects based on that video capture processing (e.g., acting as a virtual track camera that can be used to capture images).
[0004] In one illustrative example, an imaging apparatus is provided. The imaging apparatus includes at least one memory and at least one processor coupled to the at least one memory. The at least one processor is configured to: receive an indication of a planned trajectory traversing an environment from a first location toward a second location; determine the location of an image capturing device in the environment; acquire an image from the image capturing device at a third location; and transform the image based on the difference between the third location and a corresponding location in the planned trajectory.
[0005] In another example, a method for capturing an image is provided. The method includes: receiving an indication of a planned trajectory traversing an environment from a first location toward a second location; determining the location of an image capturing device in the environment; acquiring an image from the image capturing device at a third location; and transforming the image based on the difference between the third location and a corresponding location in the planned trajectory.
[0006] In another example, a non-transitory computer-readable medium is provided. The non-transitory computer-readable medium stores instructions thereon that, when executed by at least one processor, cause the at least one processor to: receive an indication of a planned trajectory traversing an environment from a first location toward a second location; determine the location of an image capturing device in the environment; acquire an image from the image capturing device at a third location; and transform the image based on the difference between the third location and a corresponding location in the planned trajectory.
[0007] In another example, an apparatus for capturing an image is provided. The apparatus includes: means for receiving an indication of a planned trajectory traversing an environment from a first location toward a second location; means for determining the location of an image capturing device in the environment; means for acquiring an image from the image capturing device at a third location; and means for transforming the image based on the difference between the third location and a corresponding location in the planned trajectory.
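The step shared by the four aspects above, finding the location in the planned trajectory that corresponds to the capture location and taking the difference, can be illustrated with a minimal sketch. It assumes the planned trajectory is a list of 3D waypoints and that "corresponding location" means the nearest waypoint; the function names are hypothetical and the disclosure does not prescribe this particular matching rule.

```python
import math

def closest_planned_location(planned, actual):
    """Return the planned waypoint nearest to the actual capture location."""
    return min(planned, key=lambda p: math.dist(p, actual))

def location_difference(planned, actual):
    """Offset (dx, dy, dz) that moves the actual location onto the planned one."""
    target = closest_planned_location(planned, actual)
    return tuple(t - a for t, a in zip(target, actual))

# Planned trajectory: straight line from a first location toward a second one,
# sampled at five evenly spaced waypoints (an assumption for illustration).
first, second = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
planned = [tuple(f + (s - f) * i / 4 for f, s in zip(first, second)) for i in range(5)]

# Image acquired at a "third location" that has drifted off the line.
third = (1.1, 0.2, -0.1)
offset = location_difference(planned, third)
print(offset)
```

The resulting offset is the quantity an image transform would then have to compensate for; how the transform uses it is left open here.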
[0008] In some aspects, one or more of the devices described herein are part of or include the following: mobile devices (e.g., mobile phones or so-called "smartphones" or other mobile devices), wearable devices, extended reality devices (e.g., virtual reality (VR) devices, augmented reality (AR) devices, or mixed reality (MR) devices), personal computers, laptop computers, server computers, vehicles (e.g., computing devices of vehicles), or other devices. In some aspects, a device includes one or more cameras for capturing one or more images. In some aspects, the device includes a display for displaying one or more images, notifications, and / or other displayable data. In some aspects, the device may include one or more sensors. In some cases, the one or more sensors may be used to determine the position and / or pose of the device, the state of the device, and / or for other purposes.
[0009] This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used in isolation to define the scope of the claimed subject matter. This subject matter should be understood with reference to the appropriate portions of the entire specification, any or all drawings, and each claim.
[0010] The foregoing, as well as other features and embodiments, will become more apparent from the following description, claims, and drawings.
Attached Figure Description
[0011] The exemplary embodiments of this application are described in detail below with reference to the following figures.
[0012] Figure 1 is a block diagram illustrating the architecture of an image capture and processing device according to some examples.
[0013] Figure 2 is a diagram illustrating the architecture of an example enhanced image capture and processing system configured with a virtual track camera according to some aspects of this disclosure.
[0014] Figure 3 is a flowchart illustrating a technique for capturing images using a virtual track camera according to various aspects of this disclosure.
[0015] Figure 4 is a block diagram illustrating the architecture of an image capture device configured to capture images using a virtual track camera according to various aspects of this disclosure.
[0016] Figure 5 is a block diagram illustrating a positioning estimation engine according to various aspects of this disclosure.
[0017] Figure 6 is a block diagram illustrating a path generation engine according to various aspects of this disclosure.
[0018] Figure 7 is a block diagram illustrating a railcar stabilization engine according to various aspects of this disclosure.
[0019] Figure 8 is a block diagram illustrating a trajectory feedback engine according to various aspects of this disclosure.
[0020] Figure 9 is a block diagram illustrating an example of an image distortion engine according to various aspects of this disclosure.
[0021] Figure 10 is a block diagram illustrating another example of an image distortion engine according to various aspects of this disclosure.
[0022] Figure 11 shows an example of a graphical representation used to provide feedback on deviations from the planned trajectory.
[0023] Figure 12 is a flowchart illustrating a process for capturing an image according to various aspects of this disclosure.
[0024] Figure 13 is a diagram illustrating an example of a system used to implement some of the aspects described herein.
Detailed Implementation
[0025] Certain aspects and embodiments of this disclosure are provided below. Some of these aspects and embodiments may be applied independently, and some may be combined, as will be apparent to those skilled in the art. Specific details are set forth in the following description for purposes of explanation in order to provide a thorough understanding of the various embodiments of this application. It will be apparent, however, that the various embodiments may be practiced without these specific details. The accompanying drawings and descriptions are not intended to be limiting.
[0026] The following description provides only exemplary embodiments and is not intended to limit the scope, applicability, or configuration of this disclosure. Rather, the subsequent description of exemplary embodiments will provide those skilled in the art with enabling descriptions for implementing the exemplary embodiments. It should be understood that various changes may be made to the function and arrangement of the elements without departing from the scope of this application as set forth in the appended claims.
[0027] Image capture devices, such as cameras, can be configured with a variety of image capture and image processing settings that help improve the appearance of the captured images. Some settings can be configured for post-processing of the image, such as changing contrast, brightness, saturation, sharpness, color levels, curves, and color, etc. Additionally, some imaging device settings, such as ISO, exposure time (also known as exposure duration), aperture size, aperture value, shutter speed, focal length, and gain, can be determined and applied before or during image capture.
[0028] However, image capture and processing settings can be limited by how the imaging device is handled to obtain the image to be processed. In some cases, camera movement (such as panning, tilting, tracking, and telescopic shots) can be used to enhance the captured image. Camera movement in such shots can be used to enhance the storytelling, increase tension, engage the viewer, evoke and / or enhance emotions, etc. However, capturing smooth and consistent moving camera shots along a predefined path using a handheld device can be challenging due to the difficulty of moving the hand smoothly and consistently. For example, a handheld camera may experience speed, position, and angular jitter when attempting to capture a moving camera shot.
[0029] In some cases, mechanical gimbals or other digital and / or analog stabilization mechanisms may be used, but such mechanisms typically attempt to eliminate user hand tremors (e.g., tremors in the hands of the person holding the imaging device). These mechanisms often attempt to follow (e.g., adjust) the user's direction of movement to remove motion, even if the user's movement may not be smooth or consistent. However, such mechanisms may only smooth such motion, but do not attempt to align the image with a defined path, such as if the imaging device is mounted on a track, vehicle (e.g., a railcar), movable / rotatable axis, etc., and moves along a defined path.
[0030] Described herein are systems, apparatuses, processes (also referred to as methods), and computer-readable media (collectively, “systems and techniques”) for performing video capture processing and providing various effects based on that video capture processing (e.g., acting as a virtual track camera for acquiring images). Moving a camera along a predefined path in an environment allows for the capture of smoother and more consistent camera images. For a virtual track camera, a planned trajectory defining how the camera should move across the environment can be determined. For example, a device including the camera (e.g., a mobile device) may present a set of pattern trajectories, for example, on the device's display. The device may receive an indication of the expected trajectory to be used, such as user input from a user of the device selecting a pattern trajectory from that set of pattern trajectories. The camera may also determine its localization in the environment. In some cases, the camera's localization in the environment may be determined using assisted visual odometry and an inertial measurement unit (IMU), or using six-degrees-of-freedom (6DOF) localization methods.
[0031] For clarity, the expected trajectory refers to a trajectory indicated by, for example, a user (e.g., information about the trajectory or path). In some cases, the expected trajectory may be based on a selection of a pattern trajectory from a set of pattern trajectories. The pattern trajectories may be a set of trajectories commonly used for capturing images (e.g., shots). In other cases, the expected trajectory may be indicated by the movement of a camera (e.g., an image capturing device) through the environment. Based on the expected trajectory, the planned trajectory may be determined, for example, by adapting the expected trajectory to the environment / object, smoothing the expected trajectory, etc.
[0032] Upon receiving an indication of the expected trajectory, the camera can determine a planned trajectory across the environment corresponding to the expected trajectory. For example, the planned trajectory can be the path (or route) along which the camera should move to capture an image. In some cases, the planned trajectory may include a threshold distance by which the camera can move / rotate / turn / tilt / etc. before adjustments to the captured image are applied.
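One way to read the threshold described above is as a tolerance around the planned pose: small deviations are left alone, and image adjustments apply only once the tolerance is exceeded. The sketch below follows that reading. The function name, the default tolerances, and the restriction to position plus a single yaw angle are all assumptions made for illustration, not values or structure taken from this disclosure.

```python
import math

def needs_adjustment(planned_pos, actual_pos, planned_yaw, actual_yaw,
                     max_offset=0.05, max_angle=math.radians(2.0)):
    """True when the pose error exceeds the (hypothetical) tolerances.

    Positions are (x, y, z) in metres and yaw angles are in radians.
    """
    offset = math.dist(planned_pos, actual_pos)
    # Wrap the yaw error into [-pi, pi] so that 359 degrees counts as -1 degree.
    error = actual_yaw - planned_yaw
    angle = math.atan2(math.sin(error), math.cos(error))
    return offset > max_offset or abs(angle) > max_angle

print(needs_adjustment((0, 0, 0), (0.01, 0, 0), 0.0, 0.0))
print(needs_adjustment((0, 0, 0), (0.10, 0, 0), 0.0, 0.0))
```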
[0033] The planned trajectory can be relative to the camera's position. For example, the planned trajectory can begin at the camera's current position. In some cases, the planned trajectory can be adjusted based on the environment. For example, the expected trajectory can be selected based on an object, such as an arc or circle around the object. In some cases, the planned trajectory can be adjusted for distance from the object or the environment. In some examples, the distance to the object can be determined, for example, to determine a path around the object, and this distance and / or the size of the object can be used to adjust the size / scale of the selected expected trajectory to form the planned trajectory. In some cases, an indication of the speed of the expected trajectory can also be received (e.g., implicitly or explicitly). This indicated speed can be incorporated into the determined planned trajectory. The determined planned trajectory can be output for presentation. For example, the camera can display a preview image with the planned trajectory overlaid on the environment, indicating what the camera can capture if image capture begins. In some cases, the preview image may continue to be displayed while images are being captured by the camera.
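As an illustration of scaling an arc pattern to the scene, the following sketch builds a planned trajectory that starts at the camera's current position and circles an object at the current camera-to-object distance. It works in a 2D ground plane and keeps the camera pointed at the object; the function name, the 2D simplification, and the waypoint format are assumptions rather than details from this disclosure.

```python
import math

def arc_around_object(camera_xy, object_xy, sweep_deg, steps):
    """Waypoints on a circular arc centred on the object, starting at the camera.

    The radius is the current camera-to-object distance, so the arc pattern is
    scaled to the scene instead of using a fixed size. Each waypoint is a
    ((x, y), yaw) pair, where yaw points the camera at the object.
    """
    cx, cy = camera_xy
    ox, oy = object_xy
    radius = math.hypot(cx - ox, cy - oy)
    start = math.atan2(cy - oy, cx - ox)
    sweep = math.radians(sweep_deg)
    waypoints = []
    for i in range(steps + 1):
        angle = start + sweep * i / steps
        position = (ox + radius * math.cos(angle), oy + radius * math.sin(angle))
        yaw = math.atan2(oy - position[1], ox - position[0])
        waypoints.append((position, yaw))
    return waypoints

# Quarter circle around an object 2 m away, sampled at four waypoints.
plan = arc_around_object(camera_xy=(2.0, 0.0), object_xy=(0.0, 0.0), sweep_deg=90, steps=3)
for position, yaw in plan:
    print(position, yaw)
```

A received speed could be attached by assigning each waypoint a timestamp proportional to the arc length travelled.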
[0034] In some cases, an instruction to begin capturing images may subsequently be received, and the camera may begin capturing images. As indicated above, the planned trajectory of the camera may include the desired positioning (e.g., translation and rotation) of the camera along the camera path at certain times (e.g., based on the speed received for the camera path). While image capture occurs, the camera may continue to monitor its movement through the environment, for example, based on information from assisted visual odometry and an IMU or from a 6DOF positioning system, to determine the camera's actual position / pose in the environment. As the camera moves through the environment while capturing images, the camera's actual position / pose defines the actual trajectory. The camera may compare the desired positioning as indicated in the planned trajectory with the actual position / pose.
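The comparison between desired and actual positioning can be sketched as follows, assuming the planned trajectory is a polyline of 3D waypoints traversed at a constant speed. Only translation is handled, rotation would need a separate comparison, and the function names are hypothetical.

```python
import math

def desired_position(waypoints, speed, t):
    """Position reached after travelling speed * t along a polyline path."""
    remaining = speed * t
    for a, b in zip(waypoints, waypoints[1:]):
        segment = math.dist(a, b)
        if segment > 0 and remaining <= segment:
            f = remaining / segment
            return tuple(p + (q - p) * f for p, q in zip(a, b))
        remaining -= segment
    return waypoints[-1]  # past the end of the path: hold the final position

def deviation(waypoints, speed, t, actual):
    """Distance between the desired and the actual position at time t."""
    return math.dist(desired_position(waypoints, speed, t), actual)

path = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(desired_position(path, speed=0.5, t=3.0))
print(deviation(path, speed=0.5, t=3.0, actual=(1.0, 0.5, 0.1)))
```

The deviation value is the kind of quantity that could drive both on-screen feedback and the later image adjustment.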
[0035] In some cases, the camera can update the display on the preview image to indicate the difference between the desired and actual positioning. For example, user interface (UI) elements indicating the difference between the desired and actual positioning can be displayed (e.g., overlaid) on the preview image. This indication of the difference between the desired and actual positioning can help the user move the camera more accurately along the planned trajectory. Additionally, when images are taken along the actual trajectory, the camera's position at the time of image capture can be associated with the image. In some cases, the difference between the desired and actual positioning can also be associated with the image. The camera can then adjust (e.g., distort, reproject, etc.) the image to align the captured image with the planned trajectory. For example, the camera can distort the image so that it appears to have been captured using a camera moving along the planned trajectory.
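For the special case where the actual and planned poses differ only by a rotation, a standard way to reproject an image is the homography H = K R K^-1, where K is the pinhole intrinsic matrix and R takes ray directions from the actual camera frame to the planned camera frame. The sketch below illustrates that textbook technique under those assumptions. It is not presented as the method of this disclosure: translation errors generally require depth information, and the focal length, principal point, and function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def yaw_rotation(angle):
    """Rotation about the camera's vertical (y) axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rotation_homography(focal, cx, cy, rotation):
    """H = K * R * K^-1 for a pinhole camera with square pixels and no skew."""
    k = [[focal, 0.0, cx], [0.0, focal, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1 / focal, 0.0, -cx / focal], [0.0, 1 / focal, -cy / focal], [0.0, 0.0, 1.0]]
    return matmul(matmul(k, rotation), k_inv)

def warp_point(h, x, y):
    """Map a pixel from the captured image into the planned view."""
    u, v, w = (h[r][0] * x + h[r][1] * y + h[r][2] for r in range(3))
    return u / w, v / w

# Hypothetical 1920x1080 camera whose actual yaw is off by atan(0.05) radians.
h = rotation_homography(1000.0, 960.0, 540.0, yaw_rotation(math.atan(0.05)))
print(warp_point(h, 960.0, 540.0))
```

Applying the same mapping to every pixel, with interpolation, would produce the warped frame; a full implementation would typically use an image library for that step.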
[0036] Various aspects of the techniques described herein are discussed below with respect to the accompanying figures.
[0037] Figure 1 is a block diagram illustrating the architecture of an image capture and processing system 100. The image capture and processing system 100 includes various components for capturing and processing images of a scene (e.g., an image of scene 110). The image capture and processing system 100 can capture individual images (or photographs) and / or capture video comprising multiple images (or video frames) in a specific sequence. A lens 115 of the image capture and processing system 100 faces scene 110 and receives light from scene 110. In some cases, the lens 115 and the image sensor 130 may be associated with an optical axis. In one illustrative example, both the photosensitive area of the image sensor 130 (e.g., the photodiodes) and the lens 115 may be centered on the optical axis. The lens 115 bends incident light from scene 110 toward the image sensor 130. The light received by the lens 115 passes through an aperture and is received by the image sensor 130. In some cases, the aperture (e.g., the aperture size) is controlled by one or more control mechanisms 120. In some cases, the aperture may have a fixed size.
[0038] One or more control mechanisms 120 may control exposure, focus, and / or zoom based on information from image sensor 130 and / or information from image processor 150. One or more control mechanisms 120 may include multiple mechanisms and components; for example, control mechanism 120 may include one or more exposure control mechanisms 125A, one or more focus control mechanisms 125B, and / or one or more zoom control mechanisms 125C. One or more control mechanisms 120 may also include additional control mechanisms besides those illustrated, such as controls for analog gain, flash, HDR, depth of field, and / or other image capture properties.
[0039] The exposure control mechanism 125A of the control mechanism 120 can obtain the exposure settings. In some cases, the exposure control mechanism 125A stores the exposure settings in a memory register. Based on the exposure settings, the exposure control mechanism 125A can control the size of the aperture (e.g., aperture size or f-number), the duration for which the aperture is open (e.g., exposure time or shutter speed), the duration for which the sensor collects light (e.g., exposure time or electronic shutter speed), the sensitivity of the image sensor 130 (e.g., ISO speed or film speed), the analog gain applied by the image sensor 130, or any combination thereof. The exposure settings may be referred to as image capture settings and / or image processing settings.
[0040] Image sensor 130 includes one or more arrays of photodiodes or other photosensitive elements. Each photodiode measures an amount of light that ultimately corresponds to a specific pixel in an image generated by image sensor 130. In some cases, different photodiodes may be covered by different filters. In some cases, different photodiodes may be covered in color filters and may thus measure light that matches the color of the filter covering the photodiode. Various color filter arrays can be used, including Bayer color filter arrays, quad color filter arrays (also known as quad Bayer color filter arrays or QCFA), and / or any other color filter array. For example, a Bayer color filter includes a red color filter, a blue color filter, and a green color filter, wherein each pixel of the image is generated based on red light data from at least one photodiode covered in the red color filter, blue light data from at least one photodiode covered in the blue color filter, and green light data from at least one photodiode covered in the green color filter. Other types of color filters can use yellow, magenta, and / or cyan (also known as "emerald green") color filters as alternatives to or supplements to red, blue, and / or green color filters. In some cases, some photodiodes can be configured to measure infrared (IR) light. In some specific implementations, the photodiode measuring IR light may not be covered by any filter, thus allowing the IR photodiode to measure both visible light (e.g., RGB or other colors) and IR light. In some examples, the IR photodiode may be covered by an IR filter, thus allowing IR light to pass through while blocking light from other parts of the spectrum (e.g., visible light, color). Some image sensors (e.g., image sensor 130) may lack filters entirely (e.g., color, IR, or any other part of the spectrum) and may alternatively use different photodiodes (in some cases, vertically stacked) throughout the pixel array.
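To make concrete how a pixel can be generated from red, green, and blue photodiode data, the sketch below collapses each 2x2 cell of a Bayer mosaic into a single RGB pixel. It assumes the common RGGB arrangement and halves the resolution; practical demosaicing instead interpolates to full resolution, and the function name is hypothetical.

```python
def demosaic_rggb(raw):
    """Collapse each 2x2 RGGB cell of `raw` into one (R, G, B) pixel.

    `raw` is a list of rows with even width and height, laid out as
        R G
        G B
    The two green samples in each cell are averaged.
    """
    rgb = []
    for y in range(0, len(raw), 2):
        row = []
        for x in range(0, len(raw[y]), 2):
            r = raw[y][x]
            g = (raw[y][x + 1] + raw[y + 1][x]) / 2
            b = raw[y + 1][x + 1]
            row.append((r, g, b))
        rgb.append(row)
    return rgb

# A 2x4 mosaic holding two RGGB cells side by side.
raw = [
    [200, 100, 50, 80],
    [120, 30, 60, 10],
]
print(demosaic_rggb(raw))
```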
[0041] In some cases, image sensor 130 may alternatively or additionally include opaque and / or reflective masks that block light from reaching certain photodiodes or portions of certain photodiodes at certain times and / or from certain angles. In some cases, opaque and / or reflective masks may be used for phase detection autofocus (PDAF). In some cases, opaque and / or reflective masks may be used to block portions of the electromagnetic spectrum from reaching the photodiodes of the image sensor (e.g., IR cutoff filters, UV cutoff filters, bandpass filters, low-pass filters, high-pass filters, etc.). Image sensor 130 may also include an analog gain amplifier for amplifying the analog signal output from the photodiodes and / or an analog-to-digital converter (ADC) for converting the analog signal output from the photodiodes (and / or the analog signal amplified by the analog gain amplifier) into a digital signal. In some cases, certain components or functions discussed with respect to one or more control mechanisms in control mechanism 120 may be alternatively or additionally included in image sensor 130. Image sensor 130 may be a charge-coupled device (CCD) sensor, an electron-multiplying CCD (EMCCD) sensor, an active pixel sensor (APS), a complementary metal-oxide semiconductor (CMOS) sensor, an N-type metal-oxide semiconductor (NMOS) sensor, a hybrid CCD / CMOS sensor (e.g., sCMOS), or some combination thereof.
[0042] Image processor 150 may include one or more processors, such as one or more image signal processors (ISPs) (including ISP 154), one or more host processors (including host processor 152), and / or one or more of any other type of processor 1310 discussed with respect to the computing system 1300 of Figure 13. The host processor 152 may be a digital signal processor (DSP) and / or another type of processor. In some specific implementations, the image processor 150 is a single integrated circuit or chip (e.g., referred to as a system-on-a-chip or SoC) that includes the host processor 152 and the ISP 154. In some cases, the chip may also include one or more input / output ports (e.g., input / output (I / O) port 156), a central processing unit (CPU), a graphics processing unit (GPU), a broadband modem (e.g., 3G, 4G or LTE, 5G, etc.), memory, connectivity components (e.g., Bluetooth™, Global Positioning System (GPS), etc.), any combination thereof, and / or other components. I / O port 156 may include any suitable input / output port or interface according to one or more protocols or specifications, such as an Inter-Integrated Circuit (I2C) interface, an Improved Inter-Integrated Circuit (I3C) interface, a Serial Peripheral Interface (SPI) interface, a serial General Purpose Input / Output (GPIO) interface, a Mobile Industry Processor Interface (MIPI) (such as a MIPI CSI-2 physical (PHY) layer port or interface), an Advanced High-performance Bus (AHB) bus, any combination thereof, and / or other input / output ports. In one illustrative example, host processor 152 may use the I2C port to communicate with image sensor 130, and ISP 154 may use the MIPI port to communicate with image sensor 130.
[0043] Image processor 150 can perform multiple tasks, such as demosaicing, color space conversion, image frame downsampling, pixel interpolation, automatic exposure (AE) control, automatic gain control (AGC), contrast detection autofocus (CDAF), phase detection autofocus (PDAF), automatic white balance, merging image frames to form an HDR image, image recognition, object recognition, feature recognition, receiving input, managing output, managing memory, or some combination thereof. Image processor 150 can store image frames and / or processed images in random access memory (RAM) 140 / 825, read-only memory (ROM) 145 / 820, cache, memory unit, another storage device, or some combination thereof.
[0044] Various input / output (I / O) devices 160 may be connected to the image processor 150. I / O devices 160 may include a display screen, keyboard, keypad, touchscreen, touchpad, touch-sensitive surface, printer, any other output device 1335, any other input device 1345, or some combination thereof. In some cases, text may be input into the image processing device 105B via the physical keyboard or keypad of the I / O device 160, or via a virtual keyboard or keypad on the touchscreen of the I / O device 160. I / O devices 160 may include one or more ports, jacks, or other connectors that enable a wired connection between the image capture and processing system 100 and one or more peripheral devices, through which the image capture and processing system 100 can receive data from and / or send data to one or more peripheral devices. I / O devices 160 may include one or more wireless transceivers that enable a wireless connection between the image capture and processing system 100 and one or more peripheral devices, through which the image capture and processing system 100 can receive data from and / or send data to one or more peripheral devices. Peripheral devices may include any type of I / O device 160 discussed earlier, and they can be considered I / O devices 160 in themselves once they are coupled to ports, jacks, wireless transceivers or other wired and / or wireless connectors.
[0045] In some cases, the image capture and processing system 100 may be a single device. In other cases, the image capture and processing system 100 may be two or more separate devices, including an image capture device 105A (e.g., a camera) and an image processing device 105B (e.g., a computing device coupled to the camera). In some embodiments, the image capture device 105A and the image processing device 105B may be coupled together, for example, via one or more wires, cables, or other electrical connectors, and / or wirelessly coupled together via one or more wireless transceivers. In some embodiments, the image capture device 105A and the image processing device 105B may be disconnected from each other.
[0046] As shown in Figure 1, a vertical dashed line divides the image capture and processing system 100 of Figure 1 into two parts, namely image capture device 105A and image processing device 105B. Image capture device 105A includes a lens 115, a control mechanism 120, and an image sensor 130. Image processing device 105B includes an image processor 150 (including an ISP 154 and a host processor 152), RAM 140, ROM 145, and I / O devices 160. In some cases, certain components illustrated in image processing device 105B (such as ISP 154 and / or host processor 152) may be included in image capture device 105A.
[0047] Figure 2 is a diagram illustrating the architecture of an example enhanced image capture and processing system 200 configured with a virtual track camera according to some aspects of this disclosure. In some examples, the system 200 of Figure 2 may include image capture and processing system 100, image capture device 105A, image processing device 105B, or combinations thereof. In some examples, system 200 may perform tracking, localization, and / or mapping of an environment (e.g., a scene) in the physical world as part of image capture and processing. For example, system 200 may track the pose (e.g., position and orientation) of system 200 relative to an environment (e.g., a 3D map of the environment), locate and / or anchor virtual content at a specific location on a map of the environment, and render the virtual content on display 209 such that the virtual content appears to be located in the environment at the location corresponding to the specific location on the map at which the virtual content is located and / or anchored. Display 209 may include glass, screen, lens, projector, and / or other display mechanisms that allow users to see a real-world environment and also allow virtual content to be overlaid, overlapped, blended, or otherwise displayed thereon.
[0048] In this illustrative example, system 200 includes one or more image sensors 202, accelerometers 204, gyroscopes 206, storage devices 207, computing components 210, image processing engines 224, and rendering engines 226. It should be noted that the components 202-226 shown in Figure 2 are non-limiting examples provided for illustrative and explanatory purposes, and other examples may include more, fewer, or different components than those shown in Figure 2. For example, in some cases, system 200 may include one or more other sensors (e.g., one or more inertial measurement units (IMUs), radar, light detection and ranging (LIDAR) sensors, radio detection and ranging (RADAR) sensors, sound detection and ranging (SODAR) sensors, sound navigation and ranging (SONAR) sensors, audio sensors, etc.), one or more display devices, one or more other processing engines, one or more other hardware components, and / or one or more other software and / or hardware components not shown in Figure 2. While various components of system 200 (such as image sensor 202) may be referred to in the singular form herein, it should be understood that system 200 may include multiple components discussed herein (e.g., multiple image sensors 202).
[0049] System 200 includes an input device 208 or communicates (wired or wirelessly) with that input device. Input device 208 may include any suitable input device, such as a touchscreen, pen or other pointing device, keyboard, mouse, buttons or keys, microphone for receiving voice commands, gesture input device for receiving gesture commands, video game controller, steering wheel, joystick, a set of buttons, trackball, remote control, any other input device discussed herein, or any combination thereof. In some cases, image sensor 202 may capture images that can be processed to interpret gesture commands.
[0050] In some embodiments, one or more image sensors 202, accelerometers 204, gyroscopes 206, storage devices 207, computing components 210, image processing engines 224, and rendering engines 226 may be part of the same computing device. For example, in some cases, one or more image sensors 202, accelerometers 204, gyroscopes 206, storage devices 207, computing components 210, image processing engines 224, and rendering engines 226 may be integrated into head-mounted displays (HMDs), extended reality glasses, smartphones, laptops, tablets, gaming systems, and / or any other computing device. However, in some embodiments, one or more image sensors 202, accelerometers 204, gyroscopes 206, storage devices 207, computing components 210, image processing engines 224, and rendering engines 226 may be part of two or more separate computing devices. For example, in some cases, some of the components 202-226 may be part of or implemented by one computing device, and the remaining components may be part of or implemented by one or more other computing devices.
[0051] Storage device 207 can be any storage device used for storing data. Furthermore, storage device 207 can store data from any component of system 200. For example, storage device 207 can store data from image sensor 202 (e.g., image or video data), data from accelerometer 204 (e.g., measurements), data from gyroscope 206 (e.g., measurements), data from computing component 210 (e.g., processing parameters, preferences, virtual content, rendered content, scene maps, tracking and positioning data, object detection data, privacy data, application data, face recognition data, occlusion data, etc.), data from image processing engine 224, and / or data from rendering engine 226 (e.g., output frames). In some examples, storage device 207 may include a buffer for storing frames processed by computing component 210.
[0052] One or more computing components 210 may include a central processing unit (CPU) 212, a graphics processing unit (GPU) 214, a digital signal processor (DSP) 216, an image signal processor (ISP) 218, and / or other processors (e.g., a neural processing unit (NPU) implementing one or more trained neural networks). Computing component 210 may perform various operations such as image enhancement, computer vision, graphics rendering, extended reality operations (e.g., tracking, localization, pose estimation, map building, content anchoring, content rendering, etc.), image and / or video processing, sensor processing, recognition (e.g., text recognition, face recognition, object recognition, feature recognition, tracking or pattern recognition, scene recognition, occlusion detection, etc.), trained machine learning operations, filtering, and / or any of the various operations described herein. In some examples, computing component 210 may implement (e.g., control, operate, etc.) an image processing engine 224 and a rendering engine 226. In other examples, computing component 210 may also implement one or more other processing engines.
[0053] Image sensor 202 may include any image and / or video sensor or capture device. In some examples, image sensor 202 may be part of a multi-camera assembly, such as a dual-camera assembly. Image sensor 202 may capture image and / or video content (e.g., raw image and / or video data), which may then be processed by computing component 210, image processing engine 224, and / or rendering engine 226 as described herein. In some examples, image sensor 202 may include image capture and processing system 100, image capture device 105A, image processing device 105B, or combinations thereof.
[0054] In some examples, image sensor 202 may capture image data and generate an image (also referred to as a frame) based on that image data and / or provide the image data or frame to image processing engine 224 and / or rendering engine 226 for processing. The image or frame may include a video frame in a video sequence or a still image. The image or frame may include an array of pixels representing a scene. For example, the image may be: a red-green-blue (RGB) image with red, green, and blue color components per pixel; a luma, chroma-red, and chroma-blue (YCbCr) image with a luma component and two chroma (chroma-red and chroma-blue) components per pixel; or any other suitable type of color or monochrome image.
[0055] In some cases, image sensor 202 (and / or other cameras of XR system 200) may also be configured to capture depth information. For example, multiple image sensors 202 may be used to capture images from multiple locations and depth information determined based on differences between images captured from multiple locations (e.g., based on parallax information). As another example, in some implementations, image sensor 202 (and / or other cameras) may include an RGB-D camera. In some cases, system 200 may include one or more depth sensors (not shown) that are separate from image sensor 202 (and / or other cameras) and capable of capturing depth information. For example, such depth sensors may acquire depth information independently of image sensor 202. In some examples, depth sensors may be physically mounted in the same general location as image sensor 202, but may operate at a different frequency or frame rate than image sensor 202. In some examples, depth sensors may take the form of a light source that can project structured or textured light patterns (which may include one or more narrowband lights) onto one or more objects in a scene. Depth information can then be obtained by utilizing the geometric deformation of the projected pattern caused by the surface shape of the object. In one example, depth information can be obtained from a stereo sensor, such as a combination of an infrared structured light projector and an infrared camera registered to a camera (e.g., an RGB camera).
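As a non-limiting illustration of determining depth from parallax between two image sensors, the following minimal sketch uses the standard rectified-stereo relation Z = f * B / d. The function name and the numeric values (focal length, baseline, disparity) are hypothetical and are not taken from this disclosure.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Return depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# Hypothetical values: 800-pixel focal length, 10 cm baseline, 16-pixel disparity.
depth_m = depth_from_disparity(800.0, 0.10, 16.0)
```

A larger disparity between the two views corresponds to a closer object, which is why nearby objects can be ranged more precisely than distant ones with a fixed baseline.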
[0056] System 200 may also include other sensors among its one or more sensors. The one or more sensors may include one or more accelerometers (e.g., accelerometer 204), one or more gyroscopes (e.g., gyroscope 206), and / or other sensors. The one or more sensors may provide velocity, orientation, and / or other positioning-related information to computing component 210. For example, accelerometer 204 may detect the acceleration of system 200 and may generate an acceleration measurement based on the detected acceleration. In some cases, accelerometer 204 may provide one or more translation vectors (e.g., up / down, left / right, forward / backward) that may be used to determine the positioning or pose of system 200. Gyroscope 206 may detect and measure the orientation and angular velocity of system 200. For example, gyroscope 206 may be used to measure the pitch, roll, and yaw of system 200. In some cases, gyroscope 206 may provide one or more rotation vectors (e.g., pitch, yaw, roll). In some examples, image sensor 202, image processing engine 224, and / or rendering engine 226 may use measurements obtained by accelerometer 204 (e.g., one or more translation vectors) and / or measurements obtained by gyroscope 206 (e.g., one or more rotation vectors) to calculate the pose of system 200. As previously noted, in other examples, system 200 may also include other sensors such as inertial measurement unit (IMU), magnetometer, gaze and / or eye-tracking sensor, machine vision sensor, smart scene sensor, speech recognition sensor, collision sensor, vibration sensor, positioning sensor, tilt sensor, etc.
[0057] As noted above, in some cases, one or more sensors may include at least one IMU. An IMU is an electronic device that uses a combination of one or more accelerometers, one or more gyroscopes, and / or one or more magnetometers to measure the specific force, angular velocity, and / or orientation of system 200. In some examples, one or more sensors may output measured information associated with the capture of images by image sensor 202 (and / or other cameras of system 200) and / or depth information obtained using one or more depth sensors of system 200.
[0058] The outputs of one or more sensors (e.g., accelerometer 204, gyroscope 206, one or more IMUs and / or other sensors) may be used by image processing engine 224 and / or rendering engine 226 to determine the pose of system 200 and / or image sensor 202 (or other cameras of system 200). In some cases, the pose of system 200 and the pose of image sensor 202 (or other cameras) may be the same. The pose of image sensor 202 refers to the positioning and orientation of image sensor 202 relative to a reference frame (e.g., about scene 110). In some specific implementations, the camera pose may be determined for 6 degrees of freedom (6DoF), which refers to three translational components (e.g., which may be given by X (horizontal), Y (vertical), and Z (depth) coordinates relative to a reference frame (such as the image plane)) and three angular components (e.g., roll, pitch, and yaw relative to the same reference frame). In some implementations, camera pose can be determined for 3 degrees of freedom (3DoF), which refers to three angular components (e.g., roll, pitch, and yaw).
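As a non-limiting illustration, a 6DoF pose can be represented as three translational components plus three angles. The sketch below is one possible representation, not the representation used by system 200. The class name, the assignment of roll, pitch, and yaw to the X, Y, and Z axes, and the Z-Y-X rotation order are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose6DoF:
    """Hypothetical 6DoF pose: translation in meters, angles in radians."""
    x: float
    y: float
    z: float
    roll: float   # assumed rotation about the X axis
    pitch: float  # assumed rotation about the Y axis
    yaw: float    # assumed rotation about the Z axis

    def rotation(self):
        """3x3 rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
        cr, sr = math.cos(self.roll), math.sin(self.roll)
        cp, sp = math.cos(self.pitch), math.sin(self.pitch)
        cy, sy = math.cos(self.yaw), math.sin(self.yaw)
        return [
            [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
            [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
            [-sp, cp * sr, cp * cr],
        ]

    def transform(self, point):
        """Map a point from the device frame into the reference frame."""
        r = self.rotation()
        t = (self.x, self.y, self.z)
        return tuple(sum(r[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))


# A device at (1, 2, 3) turned 90 degrees in yaw.
pose = Pose6DoF(1.0, 2.0, 3.0, 0.0, 0.0, math.pi / 2)
moved = pose.transform((1.0, 0.0, 0.0))
```

A 3DoF pose in this sketch would be the same structure with the translation fixed at zero, so that only the three angular components vary.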
[0059] In some cases, a device tracker (not shown) may use measurements from one or more sensors and image data from image sensor 202 to track the pose (e.g., 6DoF pose) of system 200. For example, the device tracker may fuse visual data from image data (e.g., using a visual tracking solution) with inertial data from measurements to determine the position and motion of system 200 relative to the physical world (e.g., a scene) and a map of the physical world. As described below, in some examples, when tracking the pose of system 200, the device tracker may generate a three-dimensional (3D) map of the scene (e.g., the real world) and / or generate updates to the 3D map of the scene. 3D map updates may include, for example, but not limited to, new or updated features and / or feature or landmark points associated with the scene and / or the 3D map of the scene, positioning updates that identify or update the position of system 200 within the scene and the 3D map of the scene, etc. The 3D map provides a digital representation of the scene in the real / physical world. In some examples, the 3D map may anchor location-based objects and / or content to real-world coordinates and / or objects. System 200 can use map-rendered scenes (e.g., scenes in the physical world represented by a 3D map and / or scenes associated with that 3D map) to merge the physical and virtual worlds and / or merge virtual content or objects with the physical environment. For example, system 200 can compare features detected in a current image of the environment with features detected in a previous image of the environment. In some cases, features detected in a previous image of the environment may be stored in a 3D map.
[0060] In some aspects, the pose of image sensor 202 and / or system 200 as a whole can be determined and / or tracked by computing component 210 using a visual tracking solution based on images captured by image sensor 202 (and / or other cameras of XR system 200). For example, in some examples, computing component 210 can perform tracking using computer vision-based tracking, model-based tracking, and / or simultaneous localization and mapping (SLAM) techniques. For example, computing component 210 may perform SLAM or may communicate (wired or wirelessly) with a SLAM system (not shown). SLAM refers to a class of techniques in which an environment map (e.g., an environment map that can be modeled by system 200) is created while tracking the pose of the camera (e.g., image sensor 202) and / or system 200 relative to that map. This map may be called a SLAM map and may be three-dimensional (3D). SLAM technology can be performed using color or grayscale image data captured by image sensor 202 (and / or other cameras of system 200) and can be used to generate an estimate of the 6DoF pose measurement of image sensor 202 and / or system 200. Such SLAM technology configured to perform 6DoF tracking can be referred to as 6DoF SLAM. In some cases, the output of one or more sensors (e.g., accelerometer 204, gyroscope 206, one or more IMUs and / or other sensors) can be used to estimate, correct, and / or otherwise adjust the estimated pose.
[0061] In some cases, 6DoF SLAM (e.g., 6DoF tracking) can associate features observed from certain input images from image sensor 202 (and / or other cameras) with a SLAM map. For example, 6DoF SLAM can use feature point associations from the input images to determine the pose (position and orientation) of image sensor 202 and / or system 200 for the input images. 6DoF map construction can also be performed to update the SLAM map. In some cases, the SLAM map maintained using 6DoF SLAM can contain 3D feature points triangulated from two or more images. For example, keyframes can be selected from the input images or video stream to represent the observed scene. For each keyframe, the corresponding 6DoF camera pose associated with the image can be determined. The pose of image sensor 202 and / or system 200 can be determined by projecting features from the 3D SLAM map onto the image or video frame and updating the camera pose based on verified 2D-3D correspondences.
[0062] In some situations, when capturing images, it can be difficult to move a handheld imaging device (e.g., an image capture device) along a predefined path (e.g., a planned trajectory) because people may find it difficult to move their hands smoothly and accurately along such a path. Typically, to capture images along such predefined paths, image capture devices are mounted to devices that can move along rails or tracks. For example, an image capture device can be mounted to a railcar, which can be a physical vehicle that can move along rails, tracks, roads, or another physical path through the environment. However, physical railcars can be expensive, bulky, and / or difficult to use.
[0063] Figure 3 is a flowchart illustrating a technique 300 for capturing images using a virtual orbit camera according to various aspects of this disclosure. In some cases, technique 300 may be implemented by an enhanced image capture and processing system (such as enhanced image capture and processing system 200). As shown in Figure 3, upon receiving an instruction to capture images using a virtual orbit camera, a planned trajectory 302 can be determined. This planned trajectory can be a virtual path along which the image capture device can move while capturing images. This virtual path is analogous to a physical path or track that a physical track vehicle can travel on while capturing images. As part of determining the planned trajectory, a set of pattern trajectories 304 can be presented for selection, along with options for configuring the pattern trajectories, such as trajectory speed, whether to zoom along the trajectory, whether to base the trajectory on an object, etc. This set of pattern trajectories can be a set of commonly used trajectories (e.g., shots) for capturing images, such as a track vehicle in / out shot, a track vehicle zoom, a panning shot, a tracking shot, a 360-degree surround shot, etc. In response to the selection of a pattern trajectory (e.g., a desired trajectory) received from this set of pattern trajectories 304, the planned trajectory can be determined for the environment surrounding the image capture device based on the selected pattern trajectory (and the selected options for configuring the pattern trajectory).
[0064] In some cases, to execute a virtual orbit camera shot, the image capture device may move along a planned trajectory across the environment. Since not all environments are the same, the intended trajectory (e.g., a pattern-based trajectory) can be adapted to the environment. In some cases, aspects of the environment can be detected and taken into account when determining the planned trajectory. These aspects may include the size of the environment, objects in the environment (with which the image capture device is associated), the selected pattern trajectory, etc. For example, when a panning shot is selected as the intended trajectory, the determination of the planned trajectory may take into account the size of the environment, as a 20-foot panning shot may not fit in a 10-foot room. In some cases, information about the environment and the localization of the image capture device within the environment can be obtained using visual odometry, 6DOF, 6DOF mapping / SLAM techniques, etc. For example, a device used to capture images using a virtual orbit camera may also acquire an image of the environment and detect 3D feature points from the acquired image. These 3D feature points can be used to localize the device, locate objects in the environment, and detect aspects of the environment. The planned trajectory for capturing images can be determined, for example, by plotting a path from a first position to a second position, together with the poses of the image capture device on that path, along which the image capture device can move to capture images, just as if the image capture device were on a railcar.
[0065] In some cases, the planned trajectory can be determined based on one or more objects. For example, the device's user interface may allow the user to select object 306 as the basis for the expected trajectory. As a more specific example, a pattern trajectory may require capturing images along a path that curves around the selected object. In such examples, the planned trajectory can be determined based on the selected object by adjusting the scale and / or curvature of the planned trajectory based on the distance, size, shape, etc. of the selected object. In some cases, the distance, size, shape, etc. of the selected object may be determined based on information about the environment and the positioning of the image capture device in the environment, which can be obtained using localization and mapping techniques such as visual odometry, 6DOF, 6DOF mapping / SLAM, etc.
[0066] In other cases, when a device moves across its environment, the planned trajectory can be determined based on the device's position / pose. For example, a user may move along a predetermined trajectory with the device, and the device can capture the user's position and / or pose as the user moves along that predetermined trajectory. The device can determine the planned trajectory based on the predetermined trajectory. For example, the predetermined trajectory can be smoothed to generate the planned trajectory. This smoothing action can be performed along multiple axes. In some cases, the planned trajectory may have a similar size / direction / velocity to the predetermined trajectory. In some cases, when the device moves along the predetermined trajectory, information such as positioning, orientation, and motion can be determined based on the device's pose. The device's pose can be determined using localization and mapping techniques such as visual odometry, 6DOF, 6DOF mapping / SLAM, etc.
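As a non-limiting illustration of smoothing a predetermined trajectory along multiple axes, the following sketch applies a centered moving average independently to each coordinate. The moving average is an assumed choice for the example; this disclosure does not specify a particular smoothing filter, and the function name and window size are hypothetical.

```python
def smooth_trajectory(points, window=5):
    """Centered moving average applied independently to each axis.

    points: list of equal-length coordinate tuples, e.g. (x, y, z).
    window: odd number of samples averaged around each point; the window
            shrinks near the ends so the output has the same length.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        count = hi - lo
        smoothed.append(tuple(
            sum(p[axis] for p in points[lo:hi]) / count
            for axis in range(len(points[i]))
        ))
    return smoothed


# A walked path that drifts up and down in y while advancing in x.
recorded = [(0, 0, 0), (1, 1, 0), (2, 0, 0), (3, 1, 0), (4, 0, 0)]
planned = smooth_trajectory(recorded, window=3)
```

Because the window shrinks at the ends, the first and last smoothed points are pulled slightly inward, so an implementation that must preserve exact start and end positions would need to pin those points separately.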
[0067] Based on a planned trajectory, images can be captured. For example, an image capture device can begin capturing images while moving along an actual trajectory from a first position toward a second position. The image capture device can also provide feedback 308 on how the movement along the actual trajectory compares to the planned trajectory. For example, the image capture device can track its position / pose in the environment while capturing images and compare its position / pose with a desired position / pose in the planned trajectory. If the image capture device deviates from the desired position / pose in the planned trajectory, it can determine what kind of adjustment to the pose will allow it to return to the planned trajectory. If the adjustment to the pose is within a certain threshold, the image capture device can utilize existing image stabilization techniques to capture images along the planned trajectory. For example, conventional optical and digital image stabilization techniques can be adjusted to attempt to align the captured images with the desired position / pose in the planned trajectory (while still removing shakiness), rather than simply attempting to align the captured images with each other to eliminate shakiness.
[0068] In some cases, if the image capture device determines that the deviation from the desired position / pose in the planned trajectory has exceeded a threshold amount (which can be set based on the amount of motion available in conventional image stabilization techniques and can vary between yaw, pitch, roll, etc.), feedback indicating that a deviation from the planned trajectory is occurring and / or instructions on how to adjust the pose / movement of the imaging device to return it to the planned trajectory can be output for display in the user interface. For example, one or more visual indicators 310 may be displayed in the user interface, such as in a virtual viewfinder, where the visual indicators 310 may display an indication that the image capture device has deviated, how the image capture device has deviated (e.g., translation, backward / forward, rotation, movement speed, etc.), and the degree to which the image capture device has deviated from the planned trajectory.
[0069] In some examples, if the image capture device has captured off-target images, that is, images whose associated actual position / pose deviates by more than a threshold distance from the desired position / pose in the planned trajectory, the image capture device may apply compensation techniques to adjust the captured off-target images so that they appear to have been taken from the planned trajectory. For example, the image capture device may warp or reproject the off-target images. In some cases, applying compensation techniques to the captured images may occur during image capture or after image capture using a virtual orbit camera has been completed (e.g., after the image capture device has reached the end of the planned trajectory or otherwise stopped capturing images).
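As a non-limiting illustration of the threshold logic described above, the following sketch compares a per-axis deviation against per-axis limits and selects either image stabilization or user feedback combined with warping. The class name, field names, mode labels, and limit values are hypothetical; in practice the limits could be set from the range of motion that the available stabilization technique can absorb.

```python
from dataclasses import dataclass


@dataclass
class Deviation:
    """Hypothetical deviation of the actual pose from the desired pose."""
    translation_m: float
    yaw_deg: float
    pitch_deg: float
    roll_deg: float


# Assumed limits; each axis may have its own threshold.
STABILIZER_LIMITS = Deviation(translation_m=0.02, yaw_deg=2.0, pitch_deg=2.0, roll_deg=1.0)
AXES = ("translation_m", "yaw_deg", "pitch_deg", "roll_deg")


def choose_correction(deviation, limits=STABILIZER_LIMITS):
    """Return (mode, exceeded_axes) for one captured frame."""
    exceeded = [
        name for name in AXES
        if abs(getattr(deviation, name)) > getattr(limits, name)
    ]
    if not exceeded:
        # Small deviation: steer the stabilizer toward the desired pose.
        return "stabilize", exceeded
    # Large deviation: tell the user and mark the frame for warping.
    return "feedback_and_warp", exceeded


small = choose_correction(Deviation(0.01, 1.0, -1.0, 0.5))
large = choose_correction(Deviation(0.01, 3.5, 0.0, 0.0))
```

Returning the list of exceeded axes lets a user interface report how the device has deviated (e.g., in yaw only) rather than only that it has deviated.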
[0070] In some cases, the planned trajectory can be determined based on the device's movement through the environment along a predetermined path. The device can then move along the planned trajectory, and images can be captured as the device moves along it. Image capture (and compensation, if appropriate) along the planned trajectory can be performed in a manner substantially similar to that described above.
[0071] Images captured and / or adjusted using a virtual track vehicle can be output 312, for example, for storage, playback, etc.
[0072] Figure 4 is a block diagram illustrating the architecture 400 of an image capture device configured to capture images using a virtual orbit camera, according to various aspects of this disclosure. As shown in Figure 4, the image capture device includes an imaging device 402 and an IMU 404. In some cases, the imaging device 402 can correspond to the image sensor 202 of Figure 2, and the IMU 404 can correspond to one or more sensors of the system 200 of Figure 2 (e.g., accelerometer 204, gyroscope 206, one or more IMUs and / or other sensors). In some cases, as part of preparing to capture images, or while capturing images, using a virtual orbit camera, IMU 404 may output motion information about the image capture device (such as information about specific force, angular rate, and / or orientation of the image capture device) to positioning estimation engine 406, which may estimate the positioning of the image capture device in the environment based on the data measuring the motion of the image capture device and images received from imaging device 402. In some cases, multiple imaging devices 402 may be used (e.g., for depth estimation via stereo depth). In some cases, imaging device 402 may include sensors for optical depth estimation, such as time-of-flight sensors, depth sensors, etc. This positioning information may be output to a track vehicle stabilization engine 412, a path generation engine 414, and / or a trajectory feedback engine 418. In some cases, the positioning information may also be output to an image warping engine 410 (not shown).
[0073] In some cases, imaging device 402 may generate an image of the environment. For example, after the image capture device receives an instruction to use a virtual orbit camera, imaging device 402 may generate an image and output that image to positioning estimation engine 406, depth estimation engine 408, and image warping engine 410. In some cases, imaging device 402 may also output the image to trajectory feedback engine 418 (not shown). Some of the images output by imaging device 402 may be processed by positioning estimation engine 406, depth estimation engine 408, and / or other components of the image capture device to generate positioning information, depth information, preview images, etc., without storing the images in a long-term storage device (e.g., flash memory, hard disk drive, etc.). In some cases, images output by imaging device 402, such as those captured in response to an instruction to capture an image (e.g., shutter release), may be stored in a long-term storage device and, in this case, passed to image warping engine 410.
[0074] Depth estimation engine 408 can estimate the depth information of objects in the environment, for example, based on images from imaging device 402 using any optical depth estimation technique (such as stereo depth estimation, monocular depth estimation, time-of-flight, depth sensor, etc.). In some cases, depth estimation engine 408 may be integrated with positioning estimation engine 406. Depth information may be output to path generation engine 414 and / or image warping engine 410.
[0075] The path generation engine 414 can receive a selected pattern trajectory and configuration options 416 for the selected pattern trajectory. Based on the selected pattern trajectory (e.g., the expected trajectory), the configuration options for the selected pattern trajectory, the pose information of the image capture device and / or map information about the environment surrounding the image capture device (e.g., objects in the environment, aspects of the environment, etc.), and depth information, the path generation engine 414 can determine a planned trajectory for the virtual orbit camera to traverse the environment. The planned trajectory may include a set of positions and poses that the imaging device should have at certain times when capturing images using the virtual orbit camera. The path generation engine 414 can output the planned trajectory to the track vehicle stabilization engine 412 and the trajectory feedback engine 418.
[0076] The track vehicle stabilization engine 412 can receive the planned trajectory and positioning and / or pose information, and can determine the difference between the desired positioning (e.g., from the planned trajectory) and the actual positioning and / or what adjustment to the pose will allow the image capture device to return to the planned trajectory. In some cases, the track vehicle stabilization engine 412 can interface with existing image stabilization techniques to capture images along the planned trajectory. The track vehicle stabilization engine 412 can output the difference between the desired and actual positioning and / or adjustment information to the trajectory feedback engine 418 to generate feedback for presentation in the user interface indicating a deviation from the planned trajectory, and / or instructions on how to adjust the pose / movement of the imaging device to bring the imaging device back onto the planned trajectory. For example, the trajectory feedback engine 418 can generate and output graphic elements that can be overlaid on (e.g., placed on) the image from the imaging device 402 on the device's display and / or viewfinder. In some cases, the trajectory feedback engine 418 may also receive images (not shown) from the imaging device 402, and the trajectory feedback engine 418 may overlay graphic elements on the image from the imaging device 402 and output the image and graphic elements for display on the device's display and / or viewfinder.
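As a non-limiting illustration of determining the difference between desired and actual positioning, the following sketch looks up the desired position on a timed planned trajectory by linear interpolation and turns the largest error component into a short textual hint. The waypoint layout (t, x, y, z), the axis directions (+x right, +y up, +z forward), the function names, and the hint wording are assumptions made for the example.

```python
from bisect import bisect_right


def desired_position(trajectory, t):
    """Linearly interpolate (x, y, z) at time t from sorted (t, x, y, z) waypoints."""
    times = [wp[0] for wp in trajectory]
    if t <= times[0]:
        return tuple(trajectory[0][1:])
    if t >= times[-1]:
        return tuple(trajectory[-1][1:])
    i = bisect_right(times, t)
    t0, *p0 = trajectory[i - 1]
    t1, *p1 = trajectory[i]
    alpha = (t - t0) / (t1 - t0)
    return tuple(u + alpha * (v - u) for u, v in zip(p0, p1))


def correction_hint(trajectory, t, actual):
    """Return (error_vector, hint) where error = actual - desired."""
    desired = desired_position(trajectory, t)
    error = tuple(a - d for a, d in zip(actual, desired))
    # Direction to move when the error on that axis is (positive, negative).
    labels = (("left", "right"), ("down", "up"), ("backward", "forward"))
    axis = max(range(3), key=lambda i: abs(error[i]))
    direction = labels[axis][0] if error[axis] > 0 else labels[axis][1]
    return error, f"move {direction} {abs(error[axis]):.2f} m"


# Planned: travel 2 m along x in 2 s. At t = 1 s the device is 5 cm too high.
plan = [(0.0, 0, 0, 0), (2.0, 2, 0, 0)]
error, hint = correction_hint(plan, 1.0, (1.0, 0.05, 0.0))
```

A full implementation would also compare orientation and movement speed; the sketch handles position only to keep the lookup and comparison steps visible.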
[0077] The image warping engine 410 receives images from the imaging device 402 and warps those images based on information received from the depth estimation engine 408 and the track vehicle stabilization engine 412. The image warping engine 410 can warp or reproject off-target images obtained when the image capture device is at a position / pose deviating from the planned trajectory by more than one or more threshold distances. The warped images, as well as images obtained on or near the planned trajectory, can be output 420, for example, for storage, display, etc.
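As a non-limiting illustration of reprojecting an off-target image using depth information, the following sketch moves a single pixel from the actual camera view to the planned camera view under a pinhole camera model: the pixel is back-projected to 3D using its depth, transformed into the planned camera frame, and projected again. The pinhole model, the function name, and the intrinsic values are assumptions for the example; a practical warp would apply this per pixel (or per mesh vertex) and fill any holes that result.

```python
def reproject_pixel(u, v, depth, focal_px, cx, cy, rotation, translation):
    """Map pixel (u, v) with known depth from the actual view to the planned view.

    rotation (3x3 nested list) and translation (3-tuple) take points from the
    actual camera frame to the planned camera frame.
    """
    # Back-project the pixel to a 3D point in the actual camera frame.
    p = ((u - cx) * depth / focal_px, (v - cy) * depth / focal_px, depth)
    # Express the same point in the planned camera frame.
    q = [sum(rotation[i][j] * p[j] for j in range(3)) + translation[i] for i in range(3)]
    if q[2] <= 0:
        raise ValueError("point is behind the planned camera")
    # Project into the planned camera image.
    return (focal_px * q[0] / q[2] + cx, focal_px * q[1] / q[2] + cy)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical 640x480 camera; the planned camera sits 10 cm to the left,
# so scene points shift 10 cm to the right in its frame.
new_uv = reproject_pixel(320.0, 240.0, 2.0, 800.0, 320.0, 240.0, IDENTITY, (0.1, 0.0, 0.0))
```

The shift in pixels depends on depth: the same 10 cm offset moves a point at 2 m by 40 pixels but a point at 4 m by only 20 pixels, which is why depth information is an input to the warp.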
[0078] Figure 5 is a block diagram illustrating a positioning estimation engine 500 according to various aspects of this disclosure. In some cases, the positioning estimation engine 500 may be substantially similar to the positioning estimation engine 406 of Figure 4. As shown, the positioning estimation engine 500 can receive an image 502 from one or more imaging devices 504 and motion information from an IMU 506. The image 502 can be received by a feature extraction engine 508, which can extract feature points from the received image 502. For example, the feature extraction engine 508 can be machine learning (ML) based (or non-ML based) and can include one or more feature extractors for detecting and extracting current feature points (e.g., current features) in the received image. Examples of feature extractors may include Scale Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), Features from Accelerated Segment Test (FAST), Binary Robust Independent Elementary Features (BRIEF), Oriented FAST and Rotated BRIEF (ORB), etc.
[0079] Motion information from IMU 506 can be received by IMU pose estimation engine 510. IMU pose estimation engine 510 can attempt to estimate the pose based on, for example, the previous pose and motion information indicating how the image capture device is moving. In some cases, IMU pose estimation engine 510 can output an orientation angle indicating how the image capture device is oriented in the environment to pose determination engine 512.
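As a non-limiting illustration of estimating a pose from the previous pose and motion information, the following sketch performs one dead-reckoning step. It assumes, for simplicity, that the acceleration has already been gravity-compensated and rotated into the world frame, and that only a single orientation angle (yaw) is tracked; the function name and sample values are hypothetical.

```python
import math


def propagate_pose(position, velocity, yaw, accel, yaw_rate, dt):
    """Advance position, velocity, and yaw by one IMU sample of duration dt.

    position, velocity, accel: 3-tuples in the world frame (m, m/s, m/s^2).
    yaw, yaw_rate: radians and radians per second.
    """
    new_position = tuple(
        p + v * dt + 0.5 * a * dt * dt
        for p, v, a in zip(position, velocity, accel)
    )
    new_velocity = tuple(v + a * dt for v, a in zip(velocity, accel))
    # Wrap the angle into [-pi, pi).
    new_yaw = (yaw + yaw_rate * dt + math.pi) % (2 * math.pi) - math.pi
    return new_position, new_velocity, new_yaw


# Moving at 1 m/s along x while turning at 0.5 rad/s, sampled every 100 ms.
pos, vel, yaw = propagate_pose((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.0,
                               (0.0, 0.0, 0.0), 0.5, 0.1)
```

Errors in such integration accumulate over time, which is one reason inertial estimates are commonly combined with image features rather than used alone.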
[0080] The pose determination engine 512 may receive the orientation angle information, as well as the extracted features from the feature extraction engine 508. In some cases, the pose determination engine 512 may obtain features from one or more previous images from a database of features 514 from previous images. Based on the features from the previous images, the current features from the feature extraction engine 508, and motion information, the pose determination engine 512 may determine the current pose by comparing the relative offset between the features of the previous images and the corresponding current features (e.g., using triangulation and motion information). The current features may also be stored in the database, alongside the features 514 from the previous images, for later use (e.g., for images captured at a later time point). The determined current pose from the pose determination engine 512 (e.g., indicating the camera's location and orientation in space, e.g., in the environment) may be output 518.
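As a non-limiting illustration of finding corresponding features between a previous image and the current image, the following sketch matches binary descriptors (the kind produced by extractors such as BRIEF or ORB) by Hamming distance using brute-force search. The function names, the 8-bit toy descriptors, and the distance limit are hypothetical; real binary descriptors are typically much longer, and the matching strategy used by the pose determination engine 512 is not specified in this disclosure.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_features(previous, current, max_distance=2):
    """Return (previous_index, current_index) pairs for accepted matches.

    Each current descriptor is paired with its nearest previous descriptor,
    and the pair is kept only if the distance is within max_distance.
    """
    matches = []
    for ci, descriptor in enumerate(current):
        best = min(
            range(len(previous)),
            key=lambda pi: hamming(previous[pi], descriptor),
            default=None,
        )
        if best is not None and hamming(previous[best], descriptor) <= max_distance:
            matches.append((best, ci))
    return matches


previous_features = [0b10110010, 0b01001101]
current_features = [0b01001111, 0b10110010, 0b11111111]
pairs = match_features(previous_features, current_features)
```

The pixel offsets between each matched pair are the kind of relative offset that can then be combined with motion information to estimate the change in pose.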
[0081] Figure 6 is a block diagram illustrating a path generation engine 600 according to various aspects of this disclosure. In some cases, the path generation engine 600 may be substantially similar to the path generation engine 414 of Figure 4. As shown, the path generation engine 600 can receive (e.g., from the positioning estimation engine 500 of Figure 5) the determined positioning 602 of the image capture device and (e.g., from the depth estimation engine 408 of Figure 4) depth information 604. In some cases, depth information 604 may be a depth map associated with one or more images received from an imaging device. In some cases, the path generation engine 600 may also receive an indication of a selected object 606 (e.g., the selected object 306 of Figure 3), expected trajectory information and / or indications 608 for start and end positioning, a selected pattern trajectory, and / or velocity information 610. In some cases, the expected trajectory may be based on the selected pattern trajectory or on the movement of the device through the environment (e.g., positioning / pose information captured as the device moves). The expected trajectory may refer to a set of trajectory information that can be refined into a planned trajectory (e.g., indications for the selected pattern trajectory, positioning / pose information for moving the image capture device through the environment). In some cases, the pattern trajectory may include start and end positioning, as well as an indication of a pattern for intermediate positioning between the start and end positioning. For example, an indication of a pattern may indicate that the intermediate positioning lies on a straight line between the start and end positioning.
Alternatively, an indication of a pattern may indicate that the intermediate positioning curves around the object between the start and end positioning. In some cases, the start and end positioning may be defined, for example, by the user. For example, the user may walk to (or point the image capture device at) a location in the environment and set that location as the start positioning, and then walk to (or point at) another location and set that other location as the end positioning. The start positioning, the end positioning, and the intermediate positioning between them define the expected trajectory. Velocity information 610 can be an indication of how fast the virtual track vehicle camera is to move (e.g., how fast it moves along the planned trajectory).
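The two pattern types described above (intermediate positioning on a straight line, or curving around an object) can be sketched as follows. This is a minimal illustration only: the function names, the choice of a horizontal circular arc for the curved pattern, and the coordinate convention (x, y, z with y up) are assumptions and are not taken from this disclosure.

```python
import math

def straight_pattern(start, end, count):
    """Return `count` evenly spaced 3D points from start to end, inclusive."""
    if count < 2:
        raise ValueError("count must be at least 2")
    return [
        tuple(s + (e - s) * i / (count - 1) for s, e in zip(start, end))
        for i in range(count)
    ]

def arc_pattern(center, radius, start_deg, end_deg, count):
    """Return `count` points on a horizontal arc around `center`.

    Assumption: the curve around the object is a circular arc in the
    x-z plane at the height of `center`.
    """
    if count < 2:
        raise ValueError("count must be at least 2")
    cx, cy, cz = center
    points = []
    for i in range(count):
        angle = math.radians(start_deg + (end_deg - start_deg) * i / (count - 1))
        points.append((cx + radius * math.cos(angle), cy, cz + radius * math.sin(angle)))
    return points

line = straight_pattern((0.0, 0.0, 0.0), (4.0, 0.0, 2.0), 3)
arc = arc_pattern((0.0, 1.0, 5.0), 2.0, 180.0, 270.0, 5)
```

Here `line` holds the start, midpoint, and end positioning, while every point of `arc` stays at a fixed distance from the object center.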
[0082] In some cases, depth information 604 and an indication of a selected object 606 (if selected) can be input to the object distance calculation engine 612. The object distance calculation engine can determine the distance and / or size of the selected object 606 to help determine the size of the planned trajectory to match the distance and / or size of the selected object 606. In some cases, an indication can be presented to, for example, a user, prompting the user to move the image capture device around the object to help define the expected trajectory. The determined distance and / or size information can be output from the object distance calculation engine 612 and input to the path calculation engine 614.
[0083] The path calculation engine 614 can receive determined distance and / or size information, as well as the determined positioning 602 of the image capture device, expected trajectory information and / or indications 608 for start and end positioning, and velocity information 610, and determine a planned trajectory. When determining a planned trajectory, the path calculation engine 614 can use the determined (e.g., current) positioning of the image capture device as the starting point of the planned trajectory. In some cases, when determining a planned trajectory, the pattern trajectory can be scaled based on the velocity information 610 and the length between the start and end positioning. In the case of selecting an object, when determining a planned trajectory, the pattern trajectory can be scaled based on the distance to the object and / or the size of the object, as well as the velocity information 610. In some cases, the path calculation engine 614 can also determine the planned trajectory based on the size of the environment or the position of other objects in the environment.
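One way to picture the scaling step is shown below. The sketch assumes a pattern trajectory defined at unit size, a single scale factor (for instance derived from the distance to the selected object), and a constant travel speed; these choices, and all function names, are hypothetical and only illustrate the idea of anchoring the pattern at the current positioning and scaling it.

```python
import math

def scale_pattern(unit_points, origin, scale):
    """Scale a unit-sized pattern and shift it so that it starts at `origin`.

    `origin` stands in for the determined (current) positioning of the
    image capture device, used as the starting point of the planned trajectory.
    """
    first = unit_points[0]
    return [
        tuple(o + scale * (p - f) for o, p, f in zip(origin, point, first))
        for point in unit_points
    ]

def timestamp_waypoints(points, speed_m_per_s):
    """Attach a time in seconds to each waypoint, assuming constant speed."""
    if speed_m_per_s <= 0:
        raise ValueError("speed must be positive")
    times = [0.0]
    for a, b in zip(points, points[1:]):
        times.append(times[-1] + math.dist(a, b) / speed_m_per_s)
    return list(zip(times, points))

unit = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
scaled = scale_pattern(unit, (1.0, 2.0, 3.0), 2.0)
timed = timestamp_waypoints(scaled, 0.5)
```

With a scale of 2 and a speed of 0.5 m/s, the two-meter segment takes four seconds, so the second waypoint is stamped at t = 4.0.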
[0084] In some cases, the path calculation engine 614 can output a sparse set of 3D positions and poses 616 of the image capture device (e.g., planned trajectory information, sparse trajectory) as a planned trajectory. In some cases, the sparse set of 3D positions and poses 616 can be interpolated to generate a more detailed set of 3D positions and poses. This more detailed set of 3D positions and poses may include 3D positions and poses based on specific time frames (e.g., 3D positions and poses per second), 3D positions and poses based on multiple frames (e.g., 3D positions and poses every 5 frames), or a set of frame-by-frame 3D positions and poses. In some cases, multiple sets of interpolated 3D positions and poses can be generated. For example, a set of frame-by-frame 3D positions and poses can be used to warp an image so that the image appears to be obtained from a specific 3D position and pose, and 3D positions and poses per second can be used to generate feedback to the user on how to move the image capture device to return it to the planned trajectory.
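The idea of deriving several denser sets from one sparse trajectory can be sketched as follows. Linear interpolation, the (time, position) waypoint layout, and the 30 frames-per-second rate are assumptions chosen for brevity; the disclosure does not prescribe them.

```python
def interpolate_position(waypoints, t):
    """Linearly interpolate a 3D position at time t.

    `waypoints` is a list of (time_s, (x, y, z)) with strictly increasing times.
    Times outside the range clamp to the first or last waypoint.
    """
    if t <= waypoints[0][0]:
        return waypoints[0][1]
    for (t0, p0), (t1, p1) in zip(waypoints, waypoints[1:]):
        if t <= t1:
            w = (t - t0) / (t1 - t0)
            return tuple(a + (b - a) * w for a, b in zip(p0, p1))
    return waypoints[-1][1]

def resample(waypoints, step_s):
    """Sample the trajectory every `step_s` seconds from first to last waypoint."""
    start, end = waypoints[0][0], waypoints[-1][0]
    count = int((end - start) / step_s + 1e-9) + 1
    return [interpolate_position(waypoints, start + i * step_s) for i in range(count)]

sparse = [(0.0, (0.0, 0.0, 0.0)), (2.0, (4.0, 0.0, 0.0))]
per_second = resample(sparse, 1.0)        # e.g., for user feedback
per_frame = resample(sparse, 1.0 / 30.0)  # e.g., for per-image warping (30 fps assumed)
```

The same sparse input yields both a coarse per-second set and a frame-by-frame set, matching the two uses described above.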
[0085] Figure 7 is a block diagram illustrating a track vehicle stabilization engine 700 according to various aspects of this disclosure. In some cases, the track vehicle stabilization engine 700 may be substantially similar to the track vehicle stabilization engine 412 of Figure 4. As shown, the track vehicle stabilization engine 700 can receive a planned trajectory 702 as a sparse set of 3D positions and poses of the image capture device, velocity information 704, and the determined (e.g., current) positioning 706 of the image capture device. The planned trajectory 702 and velocity information 704 can be input to a trajectory interpolation engine 708. In some cases, the 3D positions and poses can be calculated on a per-image basis to accurately determine whether the image needs correction (e.g., warping). In some cases, the trajectory interpolation engine 708 can interpolate the sparse planned trajectory 702 to determine the 3D position and pose for each image. The interpolation of the 3D position can be performed, for example, using trilinear, tricubic, or another 3D interpolation technique. The pose (e.g., orientation angle) can be interpolated, for example, using linear interpolation, quaternion interpolation, spherical linear interpolation, or another interpolation technique. The positioning within the sparse planned trajectory 702 to be interpolated can be determined based on velocity information (e.g., points per second of the sparse planned trajectory) and the current time. The interpolated position and pose 712 along the planned trajectory can be output from the track vehicle stabilization engine 700 and can also be passed to the deviation determination engine 710. The deviation determination engine 710 can calculate a transformation matrix 714 that describes what adjustments (e.g., rotation and translation) will return the image capture device to the planned trajectory.
In some cases, the transformation matrix calculation can be performed in a manner that avoids pixels outside the field of view of the image capture device's camera. In some cases, zoom-in / zoom-out transformations can also be performed. In some cases, the deviation determination engine 710 can calculate a sample grid (e.g., an N×M mesh) that maps one image to another. The sample grid can also be transformed based on sample points within the image to be transformed. This sample grid can be output by the deviation determination engine 710.
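The pose interpolation and deviation steps can be illustrated with a short sketch. It assumes orientations are stored as unit quaternions in (w, x, y, z) order and represents the correction as a quaternion plus a translation vector rather than a single matrix; both are illustrative assumptions, and the function names are hypothetical.

```python
import math

def slerp(q0, q1, u):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:  # flip one quaternion to follow the shorter arc
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:  # nearly identical orientations: use normalized lerp
        q = tuple(a + (b - a) * u for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - u) * theta) / math.sin(theta)
        s1 = math.sin(u * theta) / math.sin(theta)
        q = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)

def quat_mul(a, b):
    """Hamilton product of two quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def deviation(planned_pos, planned_q, actual_pos, actual_q):
    """Rotation and translation that take the actual pose to the planned pose."""
    conj = (actual_q[0], -actual_q[1], -actual_q[2], -actual_q[3])
    rotation = quat_mul(planned_q, conj)  # rotation * actual_q == planned_q
    translation = tuple(p - a for p, a in zip(planned_pos, actual_pos))
    return rotation, translation

identity = (1.0, 0.0, 0.0, 0.0)
quarter_turn = (math.cos(math.pi / 4), 0.0, math.sin(math.pi / 4), 0.0)  # 90 degrees about y
halfway = slerp(identity, quarter_turn, 0.5)
rot, trans = deviation((1.0, 0.0, 0.0), quarter_turn, (0.8, 0.0, 0.1), quarter_turn)
```

In the example, `halfway` is a 45 degree rotation about the y axis, and because the actual and planned orientations agree, `rot` is the identity and only a translation remains to be corrected.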
[0086] Figure 8 is a block diagram illustrating a trajectory feedback engine 800 according to various aspects of this disclosure. In some cases, the trajectory feedback engine 800 may be substantially similar to the trajectory feedback engine 418 of Figure 4. As shown, the planned trajectory indication engine 802 can generate a UI representation of one or more portions of the planned trajectory (e.g., an image overlaid on the environment before capturing images using a virtual track vehicle) and / or a graphical representation 804 of how image capture using a virtual track vehicle progresses relative to the planned trajectory in terms of speed (e.g., how far the image capture device should have traveled along the planned trajectory compared to how far it has actually traveled). The planned trajectory indication engine 802 can generate the UI representation based on the received planned trajectory (sparse trajectory 806 (e.g., planned trajectory information) or interpolated position and pose 808) and the determined (e.g., current) positioning 810 of the image capture device.
[0087] The motion compensation UI engine 812 of the trajectory feedback engine 800 can receive (e.g., from the deviation determination engine 710 of Figure 7) a transformation matrix 814 (or sample grid), and can generate a graphical representation 816 of the rotation and translation that can be applied to the image capture device to align the image capture device with the planned trajectory. In some cases, the rotation and translation needed to align the image capture device with the planned trajectory can be indicated using a colored shape that is offset from a differently colored shape, such that aligning the differently colored shapes aligns the image capture device with the planned trajectory. Graphical representations 804 and 816 can be blended (e.g., overlaid) by the blending engine 818 with an image 820 received from the camera 822 of the image capture device to generate an overlaid preview image 824. In some cases, image 820 can be a preview image for display in a virtual viewfinder (e.g., a display) of the image capture device.
[0088] Figure 9 is a block diagram illustrating an example of an image warping engine 900 according to various aspects of this disclosure. In some cases, the image warping engine 900 may be substantially similar to the image warping engine 410 of Figure 4. As shown, the image warping engine 900 may include an image warping engine 902 and a transformation calculation engine 904. The transformation calculation engine 904 may receive a transformation matrix 903 (e.g., the transformation matrix 714 of Figure 7). Based on the received transformation matrix 903, the transformation calculation engine 904 can determine a perspective transformation applicable to the image 906 acquired from the camera 908 of the image capture device. The image warping engine 902 can receive the image 906 and the perspective transformation matrix from the transformation calculation engine 904. The image warping engine 902 can apply the received perspective transformation matrix to the image 906 to correct rotational deviations of the image 906 from the planned trajectory. In some cases, the transformation calculation engine 904 can receive the transformation matrix 903 and determine a 2D mesh mapping of the image that can be applied (e.g., by the image warping engine 902) to correct the image 906. The corrected image can then be output 910, for example, for storage, display, etc.
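A common way to correct a purely rotational deviation is a homography of the form K·R·K⁻¹, where K holds the camera intrinsics and R is the correcting rotation. The sketch below uses that standard construction as an illustration; the pinhole model with square pixels, the intrinsic values, and the function names are assumptions rather than details of this disclosure.

```python
import math

def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_homography(focal_px, cx, cy, rotation):
    """Homography K * R * K^-1 for a pinhole camera (square pixels, no skew)."""
    k = [[focal_px, 0.0, cx], [0.0, focal_px, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / focal_px, 0.0, -cx / focal_px],
             [0.0, 1.0 / focal_px, -cy / focal_px],
             [0.0, 0.0, 1.0]]
    return mat_mul(mat_mul(k, rotation), k_inv)

def warp_pixel(h, u, v):
    """Apply a homography to one pixel and normalize the result."""
    x = h[0][0] * u + h[0][1] * v + h[0][2]
    y = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return (x / w, y / w)

def yaw_matrix(angle_rad):
    """Rotation about the camera's vertical (y) axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

h = rotation_homography(500.0, 320.0, 240.0, yaw_matrix(math.radians(45.0)))
center_after = warp_pixel(h, 320.0, 240.0)
```

With a 500 pixel focal length, a 45 degree yaw moves the image center horizontally by 500·tan(45°) = 500 pixels, so `center_after` is approximately (820, 240). No depth is needed because rotation alone does not change parallax.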
[0089] Figure 10 is a block diagram illustrating another example of an image warping engine 1000 according to various aspects of this disclosure. In some cases, the image warping engine 1000 may be substantially similar to the image warping engine 410 of Figure 4. As shown, the image warping engine 1000 may include an image reprojection engine 1002 and an optional occlusion repair engine 1004. The image reprojection engine 1002 may receive a transformation matrix 1006 (e.g., the transformation matrix 714 of Figure 7) and an image 1008 obtained from a camera 1010 of an image capture device, along with corresponding depth information 1012, such as a depth map based on image 1008. The image reprojection engine 1002 can reproject (e.g., warp) image 1008 based on depth and pose to correct translational and rotational deviations of image 1008 relative to the planned trajectory. In some cases, warping the image to correct translational deviations may expose gaps where previously occluded areas (e.g., areas of background occluded by foreground objects) become visible. Depth information 1012 can be used to identify previously occluded areas. In some cases, warping can also be applied to background pixels in the previously occluded areas to fill the gaps. The warped image can then be output 1024.
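Depth-based reprojection can be sketched for a single pixel as back-projecting it to a 3D point, moving that point into the corrected camera frame, and projecting it again. The pinhole model, the convention X' = R·X + t, and the numeric values below are assumptions for illustration only.

```python
def reproject_pixel(u, v, depth, focal_px, cx, cy, rotation, translation):
    """Map a pixel with known depth into the view of a corrected camera.

    Returns (u', v', depth') or None if the point ends up behind the camera.
    Convention (assumed): X_target = rotation * X_source + translation.
    """
    # Back-project the pixel to a 3D point in the source camera frame.
    point = ((u - cx) * depth / focal_px, (v - cy) * depth / focal_px, depth)
    # Express the point in the target camera frame.
    moved = tuple(
        sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
        for i in range(3)
    )
    if moved[2] <= 0.0:
        return None
    # Project back onto the image plane.
    return (focal_px * moved[0] / moved[2] + cx,
            focal_px * moved[1] / moved[2] + cy,
            moved[2])

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift = (0.1, 0.0, 0.0)  # 10 cm sideways correction
near = reproject_pixel(320.0, 240.0, 2.0, 500.0, 320.0, 240.0, IDENTITY, shift)
far = reproject_pixel(320.0, 240.0, 20.0, 500.0, 320.0, 240.0, IDENTITY, shift)
```

For the same translation, the near point (2 m) shifts by 25 pixels while the far point (20 m) shifts by only 2.5 pixels. This difference in parallax is why correcting translational deviations can uncover background that a foreground object previously hid.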
[0090] In other cases, non-occluded pixels can be warped, and repair can be performed by the occlusion repair engine 1004 based on an occlusion map (indicating which pixels were previously occluded and are now visible) and the reprojected image. The occlusion repair engine 1004 may attempt to fill the gaps, for example, based on pixels adjacent to the occluded pixels at similar depths. The warped and repaired image can then be output 1024.
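A very simple form of gap filling from adjacent pixels at similar depths is sketched below. It assumes a single-channel image stored as nested lists, a boolean occlusion map, and that revealed gaps belong to the background, so only the deepest neighbours are averaged. These are illustrative assumptions; an actual occlusion repair engine may use a quite different method.

```python
def fill_gaps(image, depth, hole_mask, depth_tolerance):
    """Fill hole pixels from valid 4-neighbours at a similar (background) depth.

    Assumption: a revealed gap belongs to the background, so each hole takes
    the mean of the neighbours whose depth is within `depth_tolerance` of the
    deepest valid neighbour. Holes with no valid neighbour are left unchanged.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if not hole_mask[r][c]:
                continue
            neighbours = [
                (rr, cc)
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols and not hole_mask[rr][cc]
            ]
            if not neighbours:
                continue
            far = max(depth[rr][cc] for rr, cc in neighbours)
            chosen = [image[rr][cc] for rr, cc in neighbours
                      if far - depth[rr][cc] <= depth_tolerance]
            out[r][c] = sum(chosen) / len(chosen)
    return out

image = [[10, 0, 200]]          # background, gap, foreground
depth = [[5.0, 0.0, 1.0]]       # metres; the gap has no valid depth
holes = [[False, True, False]]
repaired = fill_gaps(image, depth, holes, 0.5)
```

In the example the gap sits between a background pixel (5 m) and a foreground pixel (1 m), and it is filled from the background value only.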
[0091] Figure 11 illustrates an example of a graphical representation used to provide feedback on deviation from a planned trajectory. In some cases, feedback can be provided as the image capturing device moves along a planned trajectory, such as visual cues in a UI, audio prompts, haptic feedback, etc. As indicated above, a visual cue can be a graphical representation of how the current position / pose of the image capturing device deviates from the planned trajectory and how to correct the deviation. For example, the graphical representation may include two differently colored shapes, one representing the planned trajectory 1102 and the other representing the current position / pose of the image capturing device 1104. In the case where the two differently colored shapes completely overlap, the image capturing device matches the desired position / pose indicated in the planned trajectory (or is within a threshold distance of it). Depending on the degree to which the current position / pose deviates from the planned trajectory, different portions of the colored shape corresponding to the current position / pose may become visible. For example, a circular portion of the colored shape may represent a rotation (e.g., translation) of the image capturing device. The circular portion 1106 of the colored shape corresponding to the planned trajectory 1102, which is now visible, provides feedback indicating that the image capturing device should translate to the left in this example. In some cases, the rectangular portion of the colored shape 1108 may represent translational motion (e.g., moving the image capturing device left / right, up / down), and may indicate when certain portions of the rectangular portion of the colored shape 1108 undergo translational motion. In some cases, the rectangular portion of the colored shape may also be used to indicate rotation of the image capturing device about its optical axis, as shown in graphical representation 1110.
[0092] Figure 12 is a flowchart illustrating a process 1200 for capturing an image according to various aspects of the present disclosure. Process 1200 may be performed by an image capture device (e.g., the image capture and processing system 100 of Figure 1, the enhanced image capture and processing system 200 of Figure 2), a computing device (or apparatus) (e.g., computing system 1300), or a component of a computing device (e.g., a chipset, a codec, the host processor 152 of Figure 1, the image processor 150 of Figure 1, the ISP 154 of Figure 1, the computing component 210 of Figure 2, the image processing engine 224 of Figure 2, the processor 1310 of Figure 13, etc.). The computing device can be a mobile device (e.g., a mobile phone), a network-connected wearable device (such as a watch), an extended reality (XR) device (such as a virtual reality (VR) device or an augmented reality (AR) device), a vehicle or a component or system of a vehicle, or another type of computing device. The operations of process 1200 can be implemented as software components that execute and run on one or more processors.
[0093] At box 1202, the computing device (or a component thereof) may receive an indication of a planned trajectory (e.g., how the device should move across the environment) traversing from a first location toward a second location. In some cases, the computing device (or a component thereof) may: receive an indication of an expected trajectory; receive an indication of the first location; and determine the planned trajectory traversing the environment based on the expected trajectory and the indication of the first location. In some examples, the expected trajectory is based on a set of pattern trajectories (e.g., the set of pattern trajectories 304 of Figure 3). In some cases, the computing device (or a component thereof) may receive a selected pattern trajectory from this set of pattern trajectories, wherein the expected trajectory is based on the selected pattern trajectory. In some examples, the expected trajectory is based on the movement of the image capturing device through the environment. In some cases, the computing device (or a component thereof) may receive a selection of an object in the environment (e.g., an indication of the selected object 306 of Figure 3) and determine the distance to the selected object, wherein the planned trajectory is further determined based on the determined distance to the selected object. In some examples, the computing device (or a component thereof) may determine the planned trajectory by determining the second location based on the distance to the selected object. In some cases, the computing device (or a component thereof) may determine the size of the selected object, wherein the planned trajectory is further determined based on the determined size of the selected object.
In some examples, the computing device (or a component thereof) may receive an indication of a velocity toward the second location, wherein the planned trajectory is further determined based on the indicated velocity. In some examples, the planned trajectory includes a set of positions and poses of the image capturing device between the first location and the second location.
[0094] At box 1204, the computing device (or a component thereof) can determine the position of the image capturing device in the environment (e.g., via the IMU 506 of Figure 5). In some cases, the computing device (or a component thereof) may determine the position of the image capturing device in the environment by: acquiring a first image of the environment from a first position; acquiring a first feature from the first image; acquiring a second image of the environment from a second position; acquiring a corresponding second feature from the second image; acquiring motion information indicating the movement of the image capturing device from the first position to the second position; and determining the position of the image capturing device based on the motion information and a comparison of the first feature with the corresponding second feature. In some examples, the image capturing device includes a head-mounted display.
[0095] At box 1206, the computing device (or a component thereof) may acquire an image from the image capturing device at a third location (e.g., an intermediate location) (e.g., via the image capture and processing system 100 of Figure 1). In some cases, the computing device (or a component thereof) can provide feedback on the position of the image capturing device relative to the planned trajectory (e.g., via the visual indicator 310 of Figure 3, the graphical representation 816 of Figure 8, the trajectory feedback engine 800 of Figure 8, the colored shape 1108 of Figure 11, etc.). In some examples, the feedback is provided when the position of the image capturing device relative to the planned trajectory exceeds a threshold distance. In some cases, the feedback includes at least one of visual cues, audio cues, and haptic feedback. In some examples, the feedback includes a visual indicator (e.g., the visual indicator 310 of Figure 3, the graphical representation 816 of Figure 8, the colored shape 1108 of Figure 11, etc.) that shows how the current position of the image capturing device deviates from the planned trajectory.
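The threshold-based feedback can be sketched as a distance check against the planned trajectory. The sketch assumes the planned trajectory is available as a list of 3D positions and measures deviation to the nearest of them; the function name, the 10 cm threshold, and the returned offset format are hypothetical.

```python
import math

def trajectory_feedback(current_pos, planned_positions, threshold_m):
    """Return None when on track, else the offset back to the planned trajectory.

    The offset (dx, dy, dz) points from the current position to the nearest
    planned position and could drive a visual, audio, or haptic cue.
    """
    nearest = min(planned_positions, key=lambda p: math.dist(p, current_pos))
    if math.dist(nearest, current_pos) <= threshold_m:
        return None
    return tuple(n - c for n, c in zip(nearest, current_pos))

planned = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
on_track = trajectory_feedback((1.0, 0.05, 0.0), planned, 0.1)
off_track = trajectory_feedback((1.0, 0.3, 0.0), planned, 0.1)
```

A 5 cm deviation stays within the threshold and produces no cue, while a 30 cm deviation returns an offset telling the user to move 0.3 m back down toward the planned trajectory.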
[0096] At box 1208, the computing device (or a component thereof) may transform (e.g., warp, reproject, etc.) the image based on the difference between the third location and the corresponding location in the planned trajectory. In some cases, the computing device (or a component thereof) may transform the image by: determining a transformation based on the difference between the position of the image capturing device when the image was acquired and the planned trajectory; and applying the transformation to the image. In some examples, the computing device (or a component thereof) may transform the image by: determining a reprojection transformation based on the difference between the position of the image capturing device when the image was acquired and the planned trajectory; and applying the reprojection transformation to the image.
[0097] Figure 13 is a diagram illustrating an example of a system used to implement certain aspects of this technology. Specifically, Figure 13 illustrates an example of computing system 1300, which can be any computing device constituting an internal computing system, a remote computing system, a camera, or any component thereof, wherein the components of the system communicate with each other using connection 1305. Connection 1305 can be a physical connection using a bus, or a direct connection to processor 1310, such as in a chipset architecture. Connection 1305 can also be a virtual connection, a networking connection, or a logical connection.
[0098] In some embodiments, computing system 1300 is a distributed system, wherein the functions described herein may be distributed across a data center, multiple data centers, a peer-to-peer network, etc. In some embodiments, one or more of the described system components represent a plurality of such components, each performing some or all of the functions of the described components. In some embodiments, components may be physical devices or virtual devices.
[0099] Example system 1300 includes at least one processing unit (CPU or processor) 1310 and a connection 1305 that couples various system components, including system memories 1315 such as read-only memory (ROM) 1320 and random access memory (RAM) 1325, to processor 1310. Computing system 1300 may include a cache 1312 of high-speed memory that is directly connected to, closely proximate to, or integrated into processor 1310.
[0100] Processor 1310 may include any general-purpose processor and hardware or software services (such as services 1332, 1334, and 1336 stored in storage device 1330 and configured to control processor 1310), as well as dedicated processors in which software instructions are incorporated into the actual processor design. Processor 1310 may be a substantially completely independent computing system containing multiple cores or processors, buses, memory controllers, caches, etc. Multi-core processors may be symmetric or asymmetric.
[0101] To enable user interaction, the computing system 1300 includes an input device 1345 that can represent any number of input mechanisms, such as a microphone for voice, a touch-sensitive screen for gesture or graphical input, a keyboard, a mouse, motion input, voice input, etc. The computing system 1300 may also include an output device 1335 that can be one or more of a plurality of output mechanisms. In some instances, a multi-mode system allows a user to provide multiple types of input / output to communicate with the computing system 1300. The computing system 1300 may include a communication interface 1340, which typically governs and manages user input and system output. The communication interface can perform or facilitate the receipt and / or transmission of wired or wireless communications using wired and / or wireless transceivers, including those utilizing audio jacks / plugs, microphone jacks / plugs, Universal Serial Bus (USB) ports / plugs, Apple® Lightning® ports / plugs, Ethernet ports / plugs, fiber optic ports / plugs, dedicated wired ports / plugs, Bluetooth® wireless signal transmission, Bluetooth® Low Energy (BLE) wireless signal transmission, IBEACON® wireless signal transmission, radio frequency identification (RFID) wireless signal transmission, near field communication (NFC) wireless signal transmission, dedicated short range communication (DSRC) wireless signal transmission, 802.11 Wi-Fi wireless signal transmission, wireless local area network (WLAN) signal transmission, visible light communication (VLC) wireless signal transmission, Worldwide Interoperability for Microwave Access (WiMAX) wireless signal transmission, infrared (IR) wireless signal transmission, public switched telephone network (PSTN) signal transmission, integrated services digital network (ISDN) signal transmission, 3G / 4G / 5G / LTE cellular data network wireless signal transmission, ad hoc network signal transmission, radio wave signal transmission, microwave signal transmission, infrared signal transmission, visible light signal transmission, ultraviolet light signal transmission, wireless signal transmission along the electromagnetic spectrum, or some combination thereof. The communication interface 1340 may also include one or more Global Navigation Satellite System (GNSS) receivers or transceivers for determining the location of the computing system 1300 based on one or more signals received from one or more satellites associated with one or more GNSS systems. GNSS systems include, but are not limited to, the U.S. Global Positioning System (GPS), Russia's Global Navigation Satellite System (GLONASS), China's BeiDou Navigation Satellite System (BDS), and Europe's Galileo GNSS. There is no restriction to operating on any particular hardware configuration, and therefore the basic features described here can easily be substituted with improved hardware or firmware configurations as they are developed.
[0102] Storage device 1330 may be a non-volatile and / or non-transitory and / or computer-readable storage device, and may be a hard disk or other type of computer-readable medium capable of storing data accessible by a computer, such as magnetic tape, flash memory cards, solid-state storage devices, digital versatile disks, cartridges, floppy disks, hard disks, magnetic stripes, any other magnetic storage media, flash memory, memristor memory, any other solid-state storage, compact disc read-only memory (CD-ROM), rewritable compact disc (CD), digital video disc (DVD), Blu-ray disc (BDD), holographic disc, another optical medium, secure digital (SD) card, micro secure digital (microSD) card, Memory Stick® card, smart card chip, EMV chip, Subscriber Identity Module (SIM) card, mini / micro / nano / pico SIM card, another integrated circuit (IC) chip / card, random access memory (RAM), static RAM (SRAM), dynamic RAM (DRAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash EPROM, cache memory (L1 / L2 / L3 / L4 / L5 / L#), resistive random access memory (RRAM / ReRAM), phase change memory (PCM), spin-transfer torque RAM (STT-RAM), another memory chip or cartridge, and / or combinations thereof.
[0103] Storage device 1330 may include software services, servers, services, etc., which enable the system to perform functions when the code defining such software is executed by processor 1310. In some embodiments, hardware services that perform specific functions may include software components for performing functions stored in computer-readable media connected to necessary hardware components such as processor 1310, connection 1305, output device 1335, etc.
[0104] As used herein, the term "computer-readable medium" includes, but is not limited to, portable or non-portable storage devices, optical storage devices, and various other media capable of storing, containing, or carrying instructions and / or data. Computer-readable media may include non-transitory media in which data can be stored and which do not include carrier waves and / or transient electronic signals propagating wirelessly or over a wired connection. Examples of non-transitory media include, but are not limited to, magnetic disks or magnetic tapes, optical storage media (such as compact discs (CDs) or digital versatile discs (DVDs)), flash memory, memory, or memory devices. Computer-readable media may store code and / or machine-executable instructions thereon, which may represent procedures, functions, subprograms, programs, routines, subroutines, modules, software packages, classes, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or hardware circuitry by passing and / or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc., may be passed, forwarded, or transmitted using any suitable means, including memory sharing, messaging, token passing, network transmission, etc.
[0105] In some implementations, computer-readable storage devices, media, and memories may include wired or wireless signals containing bit streams, etc. However, when referred to, non-transitory computer-readable storage media explicitly exclude media such as energy, carrier signals, electromagnetic waves, and the signals themselves.
[0106] Specific details have been provided in the foregoing description to offer a thorough understanding of the embodiments and examples presented herein. However, those skilled in the art will understand that embodiments can be practiced without these specific details. For clarity, in some instances, the technology may be presented as comprising individual functional blocks, including functional blocks containing devices, device components, or steps or routines in methods embodied in software or in a combination of hardware and software. Additional components may be used in addition to those shown in the figures and / or described herein. For example, circuits, systems, networks, processes, and other components may be shown as components in block diagram form to avoid obscuring these embodiments with unnecessary detail. In other cases, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail to avoid obscuring the embodiments.
[0107] Individual implementations may be described above as processes or methods depicted as flowcharts, flow diagrams, data flow diagrams, structure diagrams, or block diagrams. Although flowcharts may describe operations as sequential processes, many of the operations may be executed in parallel or concurrently. Furthermore, the order of operations may be rearranged. A process terminates when its operations are completed, but a process may have additional steps not included in the accompanying drawings. A process may correspond to a method, function, procedure, subroutine, subprogram, etc. When a process corresponds to a function, the termination of that process may correspond to the function returning to the calling function or the main function.
[0108] The processes and methods described in the examples above can be implemented using computer-executable instructions that are stored in or otherwise obtainable from a computer-readable medium. Such instructions may include, for example, instructions and data that cause or otherwise configure a general-purpose computer, special-purpose computer, or processing device to perform a function or group of functions. Portions of the computer resources used may be accessible via a network. The computer-executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, firmware, source code, etc. Examples of computer-readable media that can be used to store instructions, information used, and / or information created during the methods according to the described examples include magnetic or optical disks, flash memory, USB devices with non-volatile memory, networked storage devices, etc.
[0109] Devices implementing the processes and methods according to these disclosures may include hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof, and may take any of a variety of form factors. When implemented as software, firmware, middleware, or microcode, program code or code segments (e.g., computer program products) for performing necessary tasks may be stored in a computer-readable or machine-readable medium. A processor may perform the necessary tasks. Typical examples of form factors include laptop computers, smartphones, mobile phones, tablet devices or other small form factor personal computers, personal digital assistants, rack-mounted devices, standalone devices, etc. The functionality described herein may also be embodied in peripheral devices or add-in cards. By way of further example, such functionality may also be implemented on a circuit board among different chips or among different processes executing in a single device.
[0110] Instructions, media for transmitting such instructions, computing resources for executing them, and other structures for supporting such computing resources are example components for providing the functionality described in this disclosure.
[0111] In the foregoing description, various aspects of this application have been described with reference to specific embodiments thereof; however, those skilled in the art will recognize that this application is not limited thereto. Therefore, although exemplary embodiments of this application have been described in detail herein, it is to be understood that the inventive concept can be embodied and adopted in a variety of other ways, and the appended claims are intended to be construed as including such variations, except as limited by the prior art. The various features and aspects of the application described above may be used individually or in combination. Furthermore, without departing from the scope of this specification, the embodiments can be used in any number of environments and applications beyond those described herein. Therefore, the specification and drawings should be considered illustrative rather than restrictive. For illustrative purposes, the methods are described in a particular order. It should be understood that in alternative embodiments, the methods may be performed in a different order than described.
[0112] Those skilled in the art will understand that the less than ("<") and greater than (">") symbols or terms used herein can be replaced with less than or equal to ("≤") and greater than or equal to ("≥") symbols, respectively, without departing from the scope of this description.
[0113] When a component is described as being “configured” to perform certain operations, such configuration can be achieved, for example, by designing electronic circuits or other hardware to perform the operations, by programming programmable electronic circuits (e.g., microprocessors or other suitable electronic circuits) to perform the operations, or any combination thereof.
[0114] The phrase “coupled to” refers to any component that is physically connected, directly or indirectly, to another component, and / or any component that communicates, directly or indirectly, with another component (e.g., connected to the other component via a wired or wireless connection and / or other suitable communication interface).
[0115] The claim language or other language that states "at least one of" and / or "one or more of" in a set indicates that one member of the set or multiple members of the set (in any combination) satisfies the claim. For example, the claim language that states "at least one of A and B" or "at least one of A or B" means A, B, or A and B. In another example, the claim language that states "at least one of A, B, and C" or "at least one of A, B, or C" means A, B, C, or A and B, or A and C, or B and C, or A and B and C. The language that states "at least one of" and / or "one or more of" in a set does not limit the set to the items listed in the set. For example, the claim language that states "at least one of A and B" or "at least one of A or B" may mean A, B, or A and B, and may additionally include items not listed in the set of A and B.
[0116] The various exemplary logic blocks, modules, circuits, and algorithm steps described in conjunction with the embodiments disclosed herein can be implemented as electronic hardware, computer software, firmware, or combinations thereof. To clearly illustrate this interchangeability between hardware and software, various exemplary components, blocks, modules, circuits, and steps have been broadly described above in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the specific application and the design constraints imposed on the overall system. Those skilled in the art may implement the described functionality in different ways for each specific application, but such implementation decisions should not be construed as departing from the scope of this application.
[0117] The techniques described herein can also be implemented in electronic hardware, computer software, firmware, or any combination thereof. Such techniques can be implemented in any of a variety of devices, such as general-purpose computers, wireless communication devices (mobile phones), or integrated circuit devices with multiple uses, including applications in wireless communication devices (mobile phones) and other devices. Any feature described as a module or component can be implemented together in an integrated logic device or separately as discrete but interoperable logic devices. If implemented in software, these techniques can be implemented at least in part by a computer-readable data storage medium comprising program code including instructions that, when executed, perform one or more of the methods described above. The computer-readable data storage medium can form part of a computer program product, which may include packaging material. The computer-readable medium may include memory or data storage media, such as random access memory (RAM) (such as synchronous dynamic random access memory (SDRAM)), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), flash memory, magnetic or optical data storage media, etc. Additionally or alternatively, the technology may be implemented at least in part by a computer-readable communication medium that carries or conveys program code in the form of instructions or data structures that can be accessed, read and / or executed by a computer, such as propagated signals or waves.
[0118] The program code can be executed by a processor, which may include one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuits. Such processors can be configured to perform any of the techniques described in this disclosure. A general-purpose processor may be a microprocessor; however, in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. The processor may also be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors combined with a DSP core, or any other such configuration. Therefore, as used herein, the term "processor" may refer to any of the foregoing structures, any combination of the foregoing structures, or any other structure or means suitable for implementing the techniques described herein.
[0119] Exemplary aspects of this disclosure include:
[0120] Aspect 1. An apparatus for capturing an image, the apparatus comprising: at least one memory; and at least one processor coupled to the at least one memory, the at least one processor being configured to: receive an indication of a planned trajectory traversing an environment from a first location toward a second location; determine a location of an image capturing device in the environment; acquire an image from the image capturing device at a third location; and transform the image based on a difference between the third location and a corresponding location in the planned trajectory.
[0121] Aspect 2. The apparatus according to aspect 1, wherein the at least one processor is further configured to: receive an indication of an expected trajectory; receive an indication of the first location; and determine the planned trajectory traversing the environment based on the expected trajectory and the indication of the first location.
[0122] Aspect 3. The apparatus according to aspect 2, wherein the expected trajectory is based on a set of pattern trajectories.
[0123] Aspect 4. The apparatus according to aspect 3, wherein the at least one processor is further configured to receive a selected pattern trajectory from the set of pattern trajectories, wherein the expected trajectory is based on the selected pattern trajectory.
[0124] Aspect 5. The apparatus according to any one of Aspects 2 to 4, wherein the at least one processor is further configured to: receive an indication of a selected object in the environment; and determine a distance to the selected object, wherein the planned trajectory is further determined based on the determined distance to the selected object.
[0125] Aspect 6. The apparatus according to aspect 5, wherein, in order to determine the planned trajectory, the at least one processor is further configured to determine the second location based on the distance to the selected object.
[0126] Aspect 7. The apparatus according to any one of Aspects 5 to 6, wherein the at least one processor is further configured to determine the size of the selected object, and wherein the planned trajectory is further determined based on the determined size of the selected object.
[0127] Aspect 8. The apparatus according to any one of Aspects 2 to 7, wherein the at least one processor is further configured to receive an indication of a speed of movement toward the second location, and wherein the planned trajectory is further determined based on the indicated speed.
[0128] Aspect 9. The apparatus according to aspect 2, wherein the expected trajectory is based on the movement of the image capturing device through the environment.
[0129] Aspect 10. The apparatus according to any one of Aspects 1 to 9, wherein, in order to transform the image, the at least one processor is configured to: determine a transformation based on the difference between the position of the image capturing device and the planned trajectory when the image is acquired; and apply the transformation to the image.
[0130] Aspect 11. The apparatus according to any one of Aspects 1 to 10, wherein, in order to transform the image, the at least one processor is further configured to: determine a reprojection transformation based on the difference between the position of the image capturing device and the planned trajectory when the image is acquired; and apply the reprojection transformation to the image.
[0131] Aspect 12. The apparatus according to any one of Aspects 1 to 11, wherein, in order to determine the position of the image capturing device in the environment, the at least one processor is configured to: obtain a first image of the environment from a first position; obtain a first feature from the first image; obtain a second image of the environment from a second position; obtain a corresponding second feature from the second image; obtain motion information indicating the movement of the image capturing device from the first position to the second position; and determine the position of the image capturing device based on the motion information and a comparison between the first feature and the corresponding second feature.
[0132] Aspect 13. The apparatus according to any one of Aspects 1 to 12, wherein the planned trajectory includes a set of positions and poses of the image capturing device between the first location and the second location.
[0133] Aspect 14. The apparatus according to any one of Aspects 1 to 13, wherein the at least one processor is further configured to provide feedback on the position of the image capturing device relative to the planned trajectory.
[0134] Aspect 15. The apparatus according to aspect 14, wherein when the position of the image capturing device relative to the planned trajectory exceeds a threshold distance, feedback is provided regarding the position of the image capturing device relative to the planned trajectory.
[0135] Aspect 16. The apparatus according to any one of Aspects 14 to 15, wherein the feedback includes at least one of a visual indication, an audio cue, and haptic feedback.
[0136] Aspect 17. The apparatus according to any one of Aspects 14 to 16, wherein the feedback includes a visual indication of how the current location of the image capturing device deviates from the planned trajectory.
[0137] Aspect 18. The apparatus according to any one of Aspects 1 to 17, wherein the image capturing device includes a head-mounted display.
[0138] Aspect 19. A method for capturing an image, the method comprising: receiving an indication of a planned trajectory traversing an environment from a first location toward a second location; determining a location of an image capturing device in the environment; acquiring an image from the image capturing device at a third location; and transforming the image based on a difference between the third location and a corresponding location in the planned trajectory.
[0139] Aspect 20. The method according to aspect 19, the method further comprising: receiving an indication of an expected trajectory; receiving an indication of the first location; and determining the planned trajectory traversing the environment based on the expected trajectory and the indication of the first location.
[0140] Aspect 21. The method according to aspect 20, wherein the expected trajectory is based on a set of pattern trajectories.
[0141] Aspect 22. The method according to aspect 21, the method further comprising receiving a selected pattern trajectory from the set of pattern trajectories, wherein the expected trajectory is based on the selected pattern trajectory.
[0142] Aspect 23. The method according to any one of Aspects 20 to 22, the method further comprising: receiving an indication of a selected object in the environment; and determining a distance to the selected object, wherein the planned trajectory is further determined based on the determined distance to the selected object.
[0143] Aspect 24. The method according to aspect 23, wherein determining the planned trajectory includes determining the second location based on the distance to the selected object.
[0144] Aspect 25. The method according to any one of Aspects 23 to 24, the method further comprising determining the size of the selected object, wherein the planned trajectory is further determined based on the determined size of the selected object.
[0145] Aspect 26. The method according to any one of Aspects 20 to 25, the method further comprising receiving an indication of a speed for moving toward the second location, wherein the planned trajectory is further determined based on the indicated speed.
[0146] Aspect 27. The method according to aspect 20, wherein the expected trajectory is based on the movement of the image capturing device through the environment.
[0147] Aspect 28. The method according to any one of Aspects 19 to 27, wherein transforming the image comprises: determining a transformation based on the difference between the position of the image capturing device and the planned trajectory when the image is acquired; and applying the transformation to the image.
[0148] Aspect 29. The method according to any one of Aspects 19 to 28, wherein transforming the image comprises: determining a reprojection transformation based on the difference between the position of the image capturing device and the planned trajectory when the image is acquired; and applying the reprojection transformation to the image.
[0149] Aspect 30. The method according to any one of Aspects 19 to 29, wherein the position of the image capturing device in the environment is determined by: obtaining a first image of the environment from a first position; obtaining a first feature from the first image; obtaining a second image of the environment from a second position; obtaining a corresponding second feature from the second image; obtaining motion information indicating the movement of the image capturing device from the first position to the second position; and determining the position of the image capturing device based on the motion information and a comparison between the first feature and the corresponding second feature.
[0150] Aspect 31. The method according to any one of Aspects 19 to 30, wherein the planned trajectory includes a set of positions and poses of the image capturing device between the first location and the second location.
[0151] Aspect 32. The method according to any one of aspects 19 to 31, the method further comprising providing feedback on the position of the image capturing device relative to the planned trajectory.
[0152] Aspect 33. The method according to aspect 32, wherein when the position of the image capturing device relative to the planned trajectory exceeds a threshold distance, feedback is provided regarding the position of the image capturing device relative to the planned trajectory.
[0153] Aspect 34. The method according to any one of Aspects 32 to 33, wherein the feedback includes at least one of visual indication, audio cues, and haptic feedback.
[0154] Aspect 35. The method according to any one of Aspects 32 to 34, wherein the feedback includes a visual indication of how the current location of the image capturing device deviates from the planned trajectory.
[0155] Aspect 36. The method according to any one of aspects 19 to 35, wherein the image capturing device includes a head-mounted display.
[0156] Aspect 37. A non-transitory computer-readable medium having instructions stored thereon, the instructions, when executed by at least one processor, causing the at least one processor to: receive an indication of a planned trajectory traversing an environment from a first location toward a second location; determine a location of an image capturing device in the environment; acquire an image from the image capturing device at a third location; and transform the image based on a difference between the third location and a corresponding location in the planned trajectory.
[0157] Aspect 38. The non-transitory computer-readable medium according to aspect 37, further including instructions for causing the at least one processor to perform any of the operations of aspects 19 to 36.
[0158] Aspect 39. An apparatus for capturing images, the apparatus comprising one or more components for performing any of the operations of aspects 19 to 36.
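As a non-limiting illustration of the aspects above, the following minimal Python sketch shows one way a planned trajectory, the deviation of the image capturing device from it, threshold-based feedback, and a reprojection transformation could fit together. It is an assumption-laden sketch rather than the implementation of this application: the function names, the straight-line trajectory, the pinhole camera model with focal length `f` and principal point `(cx, cy)`, and the restriction of the reprojection to a yaw-only orientation error (using the standard rotation homography H = K R K^-1) are all hypothetical choices made for illustration. Correcting a translational difference would in general also require scene depth, which is not modeled here.

```python
import math


def plan_linear_trajectory(first, second, steps):
    """Evenly spaced (x, y, z) positions from `first` to `second`, inclusive.

    Hypothetical stand-in for a planned trajectory; a straight line is assumed.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [
        tuple(a + (b - a) * i / (steps - 1) for a, b in zip(first, second))
        for i in range(steps)
    ]


def deviation(actual, planned):
    """Per-axis difference and Euclidean distance between two positions."""
    diff = tuple(a - p for a, p in zip(actual, planned))
    return diff, math.sqrt(sum(d * d for d in diff))


def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def yaw_reprojection(f, cx, cy, yaw):
    """Homography H = K R K^-1 for a rotation of `yaw` radians about the
    camera's vertical axis (assumed pinhole intrinsics, rotation-only error)."""
    k = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]
    c, s = math.cos(yaw), math.sin(yaw)
    r = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    return matmul3(matmul3(k, r), k_inv)


def warp_point(h, x, y):
    """Apply homography `h` to pixel (x, y) and dehomogenize."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


# Planned path of five positions; the device is found off the path at the midpoint.
path = plan_linear_trajectory((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), steps=5)
diff, dist = deviation((1.0, 0.3, 0.4), path[2])

# Feedback is raised only when the deviation exceeds an assumed 0.25 m threshold.
show_feedback = dist > 0.25

# Reproject the image center for an assumed 0.01 rad yaw error.
h = yaw_reprojection(1000.0, 640.0, 360.0, 0.01)
center_u, center_v = warp_point(h, 640.0, 360.0)
```

In this sketch the principal point shifts horizontally by `f * tan(yaw)` pixels and stays at the same height, which is the expected behavior of a yaw-only correction. A full image warp would apply the same homography to every pixel, typically through an image processing library.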
Claims
1. An apparatus for capturing an image, the apparatus comprising: at least one memory; and at least one processor coupled to the at least one memory, the at least one processor being configured to: receive an indication of a planned trajectory traversing an environment from a first location toward a second location; determine a location of an image capturing device in the environment; acquire an image from the image capturing device at a third location; and transform the image based on a difference between the third location and a corresponding location in the planned trajectory.
2. The apparatus of claim 1, wherein the at least one processor is further configured to: receive an indication of an expected trajectory; receive an indication of the first location; and determine the planned trajectory traversing the environment based on the expected trajectory and the indication of the first location.
3. The apparatus of claim 2, wherein the expected trajectory is based on a set of pattern trajectories.
4. The apparatus of claim 3, wherein the at least one processor is further configured to receive a selected pattern trajectory from the set of pattern trajectories, wherein the expected trajectory is based on the selected pattern trajectory.
5. The apparatus of claim 2, wherein the at least one processor is further configured to: receive an indication of a selected object in the environment; and determine a distance to the selected object, wherein the planned trajectory is further determined based on the determined distance to the selected object.
6. The apparatus of claim 5, wherein, in order to determine the planned trajectory, the at least one processor is further configured to determine the second location based on the distance to the selected object.
7. The apparatus of claim 5, wherein the at least one processor is further configured to determine the size of the selected object, and wherein the planned trajectory is further determined based on the determined size of the selected object.
8. The apparatus of claim 2, wherein the at least one processor is further configured to receive an indication of a speed of movement toward the second location, and wherein the planned trajectory is further determined based on the indicated speed.
9. The apparatus of claim 2, wherein the expected trajectory is based on the movement of the image capturing device through the environment.
10. The apparatus of claim 1, wherein, in order to transform the image, the at least one processor is configured to: determine a transformation based on the difference between the position of the image capturing device and the planned trajectory when the image is acquired; and apply the transformation to the image.
11. The apparatus of claim 1, wherein, in order to transform the image, the at least one processor is further configured to: determine a reprojection transformation based on the difference between the position of the image capturing device and the planned trajectory when the image is acquired; and apply the reprojection transformation to the image.
12. The apparatus of claim 1, wherein, in order to determine the location of the image capturing device in the environment, the at least one processor is configured to: obtain a first image of the environment from a first position; obtain a first feature from the first image; obtain a second image of the environment from a second position; obtain a corresponding second feature from the second image; obtain motion information indicating the movement of the image capturing device from the first position to the second position; and determine the position of the image capturing device based on the motion information and a comparison between the first feature and the corresponding second feature.
13. The apparatus of claim 1, wherein the planned trajectory includes a set of positions and poses of the image capturing device between the first location and the second location.
14. The apparatus of claim 1, wherein the at least one processor is further configured to provide feedback on the position of the image capturing device relative to the planned trajectory.
15. The apparatus of claim 14, wherein when the position of the image capturing device relative to the planned trajectory exceeds a threshold distance, feedback is provided regarding the position of the image capturing device relative to the planned trajectory.
16. The apparatus of claim 14, wherein the feedback comprises at least one of visual indication, audio cues, and haptic feedback.
17. The apparatus of claim 14, wherein the feedback includes a visual indication of how the current location of the image capturing device deviates from the planned trajectory.
18. The apparatus of claim 1, wherein the image capturing device comprises a head-mounted display.
19. A method for capturing an image, the method comprising: receiving an indication of a planned trajectory traversing an environment from a first location toward a second location; determining a location of an image capturing device in the environment; acquiring an image from the image capturing device at a third location; and transforming the image based on a difference between the third location and a corresponding location in the planned trajectory.
20. The method of claim 19, further comprising: receiving an indication of an expected trajectory; receiving an indication of the first location; and determining the planned trajectory traversing the environment based on the expected trajectory and the indication of the first location.
21. The method of claim 20, wherein the expected trajectory is based on a set of pattern trajectories.
22. The method of claim 21, further comprising receiving a selected pattern trajectory from the set of pattern trajectories, wherein the expected trajectory is based on the selected pattern trajectory.
23. The method of claim 20, further comprising: receiving an indication of a selected object in the environment; and determining a distance to the selected object, wherein the planned trajectory is further determined based on the determined distance to the selected object.
24. The method of claim 23, wherein determining the planned trajectory includes determining the second location based on the distance to the selected object.
25. The method of claim 23, further comprising determining the size of the selected object, wherein the planned trajectory is further determined based on the determined size of the selected object.
26. The method of claim 20, further comprising receiving an indication of a speed for moving toward the second location, wherein the planned trajectory is further determined based on the indicated speed.
27. The method of claim 20, wherein the expected trajectory is based on the movement of the image capturing device through the environment.
28. The method of claim 19, wherein transforming the image comprises: determining a transformation based on the difference between the position of the image capturing device and the planned trajectory when the image is acquired; and applying the transformation to the image.
29. The method of claim 19, wherein transforming the image comprises: determining a reprojection transformation based on the difference between the position of the image capturing device and the planned trajectory when the image is acquired; and applying the reprojection transformation to the image.
30. A non-transitory computer-readable medium having instructions stored thereon, the instructions, when executed by at least one processor, causing the at least one processor to: receive an indication of a planned trajectory traversing an environment from a first location toward a second location; determine a location of an image capturing device in the environment; acquire an image from the image capturing device at a third location; and transform the image based on a difference between the third location and a corresponding location in the planned trajectory.