Method, device, system and readable medium for generating and displaying a depth image

By generating and displaying stereoscopic panoramic images and their depth images, the problem of poor user experience in VR panoramic videos is solved, and realistic motion parallax and immersion are improved.

CN115222793B · Active · Publication Date: 2026-06-16 · SPREADTRUM COMMUNICATION (SHANGHAI) CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SPREADTRUM COMMUNICATION (SHANGHAI) CO LTD
Filing Date: 2017-12-22
Publication Date: 2026-06-16


Abstract

A depth image generation and display method, device, system and readable medium, the depth image generation method comprises: acquiring images captured by a left eye camera and a right eye camera; generating a stereoscopic panorama image based on the acquired captured images; and generating a depth image corresponding to the stereoscopic panorama image based on an optical flow value of an overlapping area between adjacent images captured by the left eye camera and the right eye camera. By applying the above method, depth information is added on the basis of a stereoscopic panorama video, so that a real motion parallax can be formed during subsequent display, and the immersion of a user is increased.


Description

[0001] This application is a divisional application of the patent filed on December 22, 2017, with application number 201711414436.5, entitled "Method, Apparatus, System and Readable Medium for Generating and Displaying Depth Images".

Technical Field

[0002] The embodiments of the present invention relate to the field of image processing, and in particular to a method, apparatus, system, and readable medium for generating and displaying depth images.

Background Technology

[0003] Virtual Reality (VR) technology uses head-mounted displays to create an immersive experience, making users feel as if they are actually there. Current VR systems offer high resolution, low latency, and can track the user's head position and angle in real time.

[0004] For VR scenes captured by cameras, multiple cameras can be pointed at different parts of the scene. Images captured by each camera at the same moment are then stitched together to create a panoramic image for that moment. Finally, these panoramic images from different moments are combined to form a panoramic video. This panoramic video is called a "pseudo-3D" panoramic video, meaning that at any given moment, there is only one equirectangular projection panoramic image. When the user views it, the image is processed using a certain algorithm (such as a simple left-right translation) to produce the "pseudo-3D" panoramic video effect. Because "pseudo-3D" panoramic videos offer a poor user experience and can easily cause dizziness and other discomfort, Facebook proposed Surround360 technology to shoot "true 3D" panoramic videos. Surround360 technology simulates the left and right eyes, using left-eye and right-eye cameras to generate a virtual camera's image at each point in a circle, thus generating a 360-degree panoramic image. During viewing, the images captured by the left-eye and right-eye cameras are simply displayed in their respective display areas.

[0005] "Pseudo-3D" panoramic videos captured by cameras lack any depth effects such as parallax or occlusion, resulting in a poor user experience and a tendency to feel dizzy or uncomfortable. In contrast, "true 3D" panoramic videos captured by Surround360, which simulate human vision during shooting, can produce some occlusion and left-right eye parallax effects, providing a better user experience. However, since the user's eyes are always centered in the shot, viewing is limited to head rotation to select the viewing area; the user cannot move to adjust the viewpoint.

Summary of the Invention

[0006] The technical problem solved by the embodiments of the present invention is how to form realistic motion parallax based on different viewpoints for captured images, thereby increasing the user's sense of immersion.

[0007] To address the aforementioned technical problems, embodiments of the present invention provide a method for generating a depth image, the method comprising: acquiring images captured by a left-eye camera and a right-eye camera; generating a stereoscopic panoramic image based on the acquired images; and generating a depth image corresponding to the stereoscopic panoramic image based on the optical flow values of the overlapping regions between adjacent images captured by the left-eye camera and the right-eye camera.

[0008] Optionally, generating the depth image corresponding to the stereoscopic panoramic image based on the optical flow values of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera includes: converting the world coordinates corresponding to the pixels in the stereoscopic panoramic image to the camera coordinates corresponding to the left-eye camera and the right-eye camera, respectively; calculating the optical flow values of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera; calculating the depth values corresponding to the pixels in the stereoscopic panoramic image based on the converted camera coordinates and the optical flow values; and generating the depth image corresponding to the stereoscopic panoramic image based on the calculated depth values.

[0009] Optionally, calculating the depth value corresponding to a pixel in the stereoscopic panoramic image based on the transformed camera coordinates and the optical flow value includes: obtaining the radius R of the ring formed by the left-eye camera and the right-eye camera relative to the center of the circle, and the angle δ between the z-axis of the camera coordinate system and the x-axis of the world coordinate system, and calculating t1 = ...; obtaining the width W of the panoramic image and calculating t2 = ..., where ψ is the optical flow value; and taking the sum of t1 and t2 as the depth value d corresponding to the pixel.

[0010] Optionally, generating a stereoscopic panoramic image based on the acquired images includes: performing spherical or cylindrical projection on the acquired images, that is, transforming the pixel coordinates corresponding to the pixels in the acquired images to spherical or cylindrical coordinates; calculating the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera; for each column of pixels in the overlapping area, generating a left-eye camera panoramic image based on the left-eye camera and its corresponding optical flow value, and generating a right-eye camera panoramic image based on the right-eye camera and its corresponding optical flow value; and synthesizing the left-eye camera panoramic image and the right-eye camera panoramic image based on a fusion algorithm to generate a stereoscopic panoramic image.

[0011] Optionally, the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera is calculated based on any of the following algorithms: phase correlation algorithm, position correlation algorithm, and Lucas-Kanade algorithm.

[0012] This invention provides a method for displaying a depth image, comprising: generating a stereoscopic panoramic image and its corresponding depth image using any of the aforementioned depth image generation methods; reconstructing a point cloud image based on the stereoscopic panoramic image and its corresponding depth image to obtain a point cloud image corresponding to the panoramic image from the left eye camera and a point cloud image corresponding to the panoramic image from the right eye camera; projecting the point cloud image onto a viewing plane and distorting its corresponding depth image; distorting the pixels in the panoramic image corresponding to the distorted depth image onto the viewing plane to generate a viewing plane image from the left eye camera and a viewing plane image from the right eye camera; and outputting and displaying a viewing plane image with a smaller corresponding depth value.

[0013] Optionally, the viewing plane is a plane perpendicular to the direction of eye gaze.

[0014] Optionally, the method for displaying the depth image further includes: compressing the stereoscopic panoramic image and its corresponding depth image to generate compressed data of the stereoscopic panoramic image and its corresponding depth image; and decompressing the compressed data of the stereoscopic panoramic image and its corresponding depth image to obtain the stereoscopic panoramic image and its corresponding depth image.

[0015] This invention provides a depth image generation apparatus, comprising: an acquisition unit adapted to acquire images captured by a left-eye camera and a right-eye camera; a first generation unit adapted to generate a stereoscopic panoramic image based on the acquired images; and a second generation unit adapted to generate a depth image corresponding to the stereoscopic panoramic image based on the optical flow values of the overlapping regions between adjacent images captured by the left-eye camera and the right-eye camera.

[0016] Optionally, the second generation unit includes: a first transformation subunit, adapted to transform the world coordinates corresponding to pixels in the stereoscopic panoramic image to camera coordinates corresponding to the left-eye camera and the right-eye camera, respectively; a first calculation subunit, adapted to calculate the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera; a second calculation subunit, adapted to calculate the depth value corresponding to pixels in the stereoscopic panoramic image based on the transformed camera coordinates and the optical flow value; and a first generation subunit, adapted to generate a depth image corresponding to the stereoscopic panoramic image based on the calculated depth value.

[0017] Optionally, the second calculation subunit includes: a first calculation module, a second calculation module, and a third calculation module, wherein: the first calculation module is adapted to obtain the radius R of the ring formed by the left-eye camera and the right-eye camera relative to the center of the circle, and the angle δ between the z-axis of the camera coordinate system and the x-axis of the world coordinate system, and calculate t1 = ...; the second calculation module is adapted to obtain the width W of the panoramic image and calculate t2 = ..., where ψ is the optical flow value; the third calculation module is adapted to calculate the sum of t1 and t2 as the depth value d corresponding to the pixel.

[0018] Optionally, the first generation unit includes: a second conversion subunit, adapted to perform spherical or cylindrical projection on the acquired captured image, that is, to convert the pixel coordinates corresponding to the pixels in the acquired captured image to spherical coordinates or cylindrical coordinates; a first calculation subunit, adapted to calculate the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera; a second generation subunit, adapted to generate a left-eye camera panoramic image based on the left-eye camera and its corresponding optical flow value for each column of pixels in the overlapping area, and generate a right-eye camera panoramic image based on the right-eye camera and its corresponding optical flow value; and a third generation subunit, adapted to synthesize the left-eye camera panoramic image and the right-eye camera panoramic image based on a fusion algorithm to generate a stereoscopic panoramic image.

[0019] Optionally, the first computing subunit is adapted to calculate the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera based on any of the following algorithms: phase correlation algorithm, position correlation algorithm, and Lucas-Kanade algorithm.

[0020] This invention provides a depth image display device, comprising: a third generation unit, adapted to generate a stereoscopic panoramic image and its corresponding depth image using the method described in any one of claims 1 to 5; a reconstruction unit, adapted to reconstruct a point cloud map based on the stereoscopic panoramic image and its corresponding depth image, and obtain a point cloud map corresponding to a panoramic image from a left-eye camera and a point cloud map corresponding to a panoramic image from a right-eye camera; a distortion unit, adapted to project the point cloud map onto a viewing plane and distort its corresponding depth image; a fourth generation unit, adapted to distort the pixels in the panoramic image corresponding to the distorted depth image onto the viewing plane, and generate a viewing plane image from a left-eye camera and a viewing plane image from a right-eye camera; and an output unit, adapted to output and display a viewing plane image with a smaller corresponding depth value.

[0021] Optionally, the viewing plane is a plane perpendicular to the direction of eye gaze.

[0022] Optionally, the depth image display device further includes: a compression unit, adapted to compress the stereoscopic panoramic image and its corresponding depth image to generate compressed data of the stereoscopic panoramic image and its corresponding depth image; and a decompression unit, adapted to decompress the compressed data of the stereoscopic panoramic image and its corresponding depth image to obtain the stereoscopic panoramic image and its corresponding depth image.

[0023] This invention provides a computer-readable storage medium storing computer instructions, which, when executed, perform the steps of any of the above-described depth image generation methods.

[0024] This invention provides a computer-readable storage medium storing computer instructions, which, when executed, perform the steps of any of the above-described depth image display methods.

[0025] This invention provides a depth image generation system, including a memory and a processor. The memory stores computer instructions that can be executed on the processor. When the processor executes the computer instructions, it performs the steps of any of the depth image generation methods described above.

[0026] This invention provides a depth image display system, including a memory and a processor. The memory stores computer instructions that can be executed on the processor. When the processor executes the computer instructions, it performs the steps of any of the depth image display methods described above.

[0027] Compared with the prior art, the technical solution of the embodiments of the present invention has the following beneficial effects:

[0028] This invention generates a depth image corresponding to a stereoscopic panoramic image based on the optical flow values of the overlapping areas between adjacent images captured by the left-eye and right-eye cameras. By adding depth information to the stereoscopic panoramic video, realistic motion parallax can be formed based on different viewpoints during subsequent display, thereby increasing the user's sense of immersion.

[0029] Furthermore, based on the generated stereoscopic panoramic video and its corresponding depth image, a point cloud map is reconstructed, and then the point cloud map is projected onto the view plane. By distorting the depth image, a view plane image is generated for output. Different viewpoints can be generated based on the position of head movement, forming a realistic motion parallax and increasing the user's sense of immersion.

Attached Figure Description

[0030] Figure 1 is a detailed flowchart of a method for generating a depth image provided in an embodiment of the present invention;

[0031] Figure 2 is a top view of a camera coordinate system and a world coordinate system provided in an embodiment of the present invention;

[0032] Figure 3 is a detailed flowchart of a method for displaying a depth image provided in an embodiment of the present invention;

[0033] Figure 4 is a detailed flowchart of an image capturing and display method provided in an embodiment of the present invention;

[0034] Figure 5 is a schematic diagram of the structure of a depth image generation device provided in an embodiment of the present invention;

[0035] Figure 6 is a schematic diagram of the structure of a depth image display device provided in an embodiment of the present invention.

Detailed Implementation

[0036] In existing technologies, "pseudo-3D" panoramic videos captured by cameras lack any depth effects such as parallax or occlusion, resulting in a poor user experience and a tendency to experience dizziness and other discomfort. "True 3D" panoramic videos captured by Surround360, however, simulate human eye movement during shooting, producing some occlusion and left-right eye parallax effects, leading to a better user experience. However, since the user's eyes are always positioned at the center of the shot by default, viewing is limited to head rotation to select the viewing area; the user cannot move to adjust the viewpoint.

[0037] This invention generates a depth image corresponding to a stereoscopic panoramic image based on the optical flow values of the overlapping areas between adjacent images captured by the left-eye and right-eye cameras. By adding depth information to the stereoscopic panoramic video, realistic motion parallax can be formed based on different viewpoints during subsequent display, thereby increasing the user's sense of immersion.

[0038] To make the above-mentioned objectives, features and beneficial effects of the present invention more apparent and understandable, specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0039] Referring to Figure 1, this invention provides a method for generating a depth image, which may include the following steps:

[0040] Step S101: Acquire images captured by the left-eye camera and the right-eye camera.

[0041] In practical implementation, due to the poor user experience and tendency to cause dizziness associated with "pseudo-3D" panoramic videos, Facebook proposed Surround360 technology to shoot "true 3D" panoramic videos. Surround360 technology simulates the left and right eyes, using left-eye and right-eye cameras to generate a virtual camera at each point in the circle, thus generating a 360-degree panoramic image. This invention can generate depth images based on images captured by the left-eye and right-eye cameras using Surround360 technology.

[0042] Step S102: Generate a stereoscopic panoramic image based on the acquired captured images.

[0043] In practice, stereoscopic panoramic images can be generated based on the images captured by the left-eye and right-eye cameras.

[0044] In practical implementation, since the images captured by the left and right eye cameras correspond to pixel coordinates, the captured images first need to undergo spherical or cylindrical projection, mapping the pixel coordinates to spherical or cylindrical coordinates. Then, based on the spherical or cylindrical coordinates, the optical flow value of the overlapping region between adjacent images captured by the left and right eye cameras is calculated. For each column of pixels in the overlapping region, a new virtual camera output is generated based on the left eye camera and its corresponding optical flow value, producing a panoramic image of the left eye camera. Similarly, a new virtual camera output is generated based on the right eye camera and its corresponding optical flow value, producing a panoramic image of the right eye camera. Finally, a fusion algorithm is used to synthesize the panoramic images of the left and right eyes to generate a stereoscopic panoramic image.

[0045] In one embodiment of the present invention, the conversion formula from spherical coordinate system to Cartesian coordinate system is as follows:

[0046] x=r sinθcosφ

[0047] y = r sinθsinφ

[0048] z = r cosθ (1)

[0049] Where (x, y, z) are the Cartesian coordinates, r is the radius of the sphere, θ is the polar angle (measured from the z-axis), and φ is the azimuthal angle (measured in the x-y plane).

[0050] Based on the camera extrinsic parameters, the conversion formula from world coordinates to camera coordinates is as follows:

[0051] $X_c = R_1 X + T$ (2)

[0052] Where $X_c$ is the camera coordinates, $X$ is the world coordinates, $R_1$ is the rotation matrix, and $T$ is the translation vector.

[0053] Based on the camera intrinsics, the pixel coordinates are calculated as follows:

[0054] $x' = x / z$

[0055] $y' = y / z$

[0056] $u = f_x x' + c_x$

[0057] $v = f_y y' + c_y$ (3)

[0058] Where (u, v) are the pixel coordinates, $f_x$ and $f_y$ are the focal lengths in the horizontal and vertical directions, respectively, and $c_x$ and $c_y$ are the horizontal and vertical offsets of the image origin relative to the imaging point of the optical center, respectively.

[0059] Based on formulas (1), (2) and (3), the pixel coordinates of the pixels in the captured image can be converted to spherical coordinates.
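The chain of formulas (1), (2) and (3) can be sketched as follows. This is a minimal illustration, not the patented implementation: it runs in the forward direction (a point on the sphere is mapped to a pixel), which is how such a lookup is commonly built. The function names and all numeric values (identity rotation, zero translation, focal lengths, principal point) are assumptions chosen only for the example.

```python
import numpy as np

def spherical_to_cartesian(r, theta, phi):
    """Formula (1): x = r sin(theta) cos(phi), y = r sin(theta) sin(phi), z = r cos(theta)."""
    return np.array([
        r * np.sin(theta) * np.cos(phi),
        r * np.sin(theta) * np.sin(phi),
        r * np.cos(theta),
    ])

def world_to_camera(X, R1, T):
    """Formula (2): X_c = R1 X + T, using the camera extrinsics."""
    return R1 @ X + T

def camera_to_pixel(Xc, fx, fy, cx, cy):
    """Formula (3): perspective division, then the camera intrinsics."""
    x, y, z = Xc
    return fx * (x / z) + cx, fy * (y / z) + cy

# Hypothetical extrinsics and intrinsics, for illustration only.
R1 = np.eye(3)
T = np.zeros(3)
X = spherical_to_cartesian(1.0, 0.0, 0.0)  # theta = 0 gives the point (0, 0, 1)
u, v = camera_to_pixel(world_to_camera(X, R1, T), 500.0, 500.0, 320.0, 240.0)
```

With these assumed values the point lies on the optical axis, so it lands on the principal point (cx, cy).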

[0060] Spherical projection can provide an immersive 360x180 degree experience, while cylindrical projection cannot provide top and bottom images, resulting in a worse experience than spherical projection.

[0061] In practice, the optical flow value of the overlapping region can be calculated based on phase correlation algorithm, position correlation algorithm or Lucas-Kanade algorithm.
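Of the algorithms named above, phase correlation is the simplest to sketch. The example below is a minimal illustration under stated assumptions: it estimates a single integer translation between two equally sized grayscale patches with circular boundary conditions, whereas a dense flow field would need this applied per block or a different method. The function name is hypothetical.

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate integer (dy, dx) such that b is approximately np.roll(a, (dy, dx), axis=(0, 1))."""
    Fa = np.fft.fft2(a)
    Fb = np.fft.fft2(b)
    cross = Fb * np.conj(Fa)
    # Keep only the phase; the small floor avoids division by zero.
    cross /= np.maximum(np.abs(cross), 1e-12)
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = a.shape
    # Map wrapped peak positions to signed shifts.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

# Synthetic check: shift a random patch by a known amount.
rng = np.random.default_rng(0)
patch = rng.random((64, 64))
shifted = np.roll(patch, (3, -5), axis=(0, 1))
estimated = phase_correlation_shift(patch, shifted)
```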

[0062] In practice, due to the limited number of real cameras, to generate panoramic images, it is necessary to create a series of virtual cameras between the real left-eye and right-eye cameras. Furthermore, due to the limited image resolution, each column of the panoramic image can be used as the output of a virtual camera.

[0063] In one embodiment of the present invention, the phase angle of a certain column in the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera is ξ, and the phase angles corresponding to the optical centers of the left-eye camera and the right-eye camera are α1 and α2, respectively. Then, the optical flow values ψ1 corresponding to the left-eye camera and ψ2 corresponding to the right-eye camera are respectively:

[0064]

[0065] In practice, the panoramic images from the left-eye camera and the right-eye camera can be alpha-fused based on distance to generate a stereoscopic panoramic image.
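The text does not spell out the weighting used for distance-based alpha fusion. A minimal sketch, assuming two equally sized overlapping strips and a linear ramp whose weight depends on the column's distance from each strip's own side, might look like this. The function name and the linear ramp are assumptions.

```python
import numpy as np

def alpha_blend_overlap(strip_a, strip_b):
    """Blend two equally sized strips with a linear, distance-based alpha ramp.

    strip_a has full weight at the first column and zero weight at the last;
    strip_b has the complementary weight.
    """
    w = strip_a.shape[1]
    alpha = np.linspace(1.0, 0.0, w)
    # Broadcast over rows and, if present, color channels.
    alpha = alpha.reshape(1, w, *([1] * (strip_a.ndim - 2)))
    return alpha * strip_a + (1.0 - alpha) * strip_b

blended = alpha_blend_overlap(np.ones((2, 5)), np.zeros((2, 5)))
```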

[0066] Step S103: Based on the optical flow values of the overlapping areas between adjacent images captured by the left-eye camera and the right-eye camera, generate a depth image corresponding to the stereoscopic panoramic image.

[0067] In practical implementation, obtaining depth values based on methods such as structured light, time of flight (TOF), and laser scanning requires additional equipment, which not only brings difficulties to the structural design but also increases costs and reduces portability. Therefore, the embodiments of the present invention use optical flow values to calculate depth values, which does not require additional equipment and can effectively reduce costs and design complexity.

[0068] In one embodiment of the present invention, the world coordinates corresponding to the pixels in the stereoscopic panoramic image are first converted to the camera coordinates corresponding to the left-eye camera and the right-eye camera, respectively. Then, the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera is calculated. Based on the converted camera coordinates and the optical flow value, the depth value corresponding to the pixels in the stereoscopic panoramic image is calculated. Based on the calculated depth value, a depth image corresponding to the stereoscopic panoramic image is generated.

[0069] In specific implementation, the calculation of the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera can be referred to the description in step S102, which will not be repeated here.

[0070] To enable those skilled in the art to better understand and implement the present invention, embodiments of the present invention provide a top view of a camera coordinate system and a world coordinate system, as shown in Figure 2.

[0071] Referring to Figure 2, the camera coordinate system of the left-eye camera C1 is x1y1z1, the camera coordinate system of the right-eye camera C2 is x2y2z2, the world coordinate system with the center of the ring as the origin is xyz, the angle between the z-axis of the left-eye camera coordinate system or the right-eye camera coordinate system and the x-axis of the world coordinate system is δ, and the radius of the ring formed by the left-eye camera and the right-eye camera relative to the center is R.

[0072] For a pixel in a panoramic image, the coordinates of its corresponding point P(x,y,z) in the world coordinate system are given by formulas (5) and (6) in the left-eye camera coordinate system and the right-eye camera coordinate system, respectively:

[0073] x1=xsinδ-ycosδ

[0074] y1=z

[0075] z1=-xcosδ-ysinδ+R (5)

[0076] x2=-xsinδ-ycosδ

[0077] y2=z

[0078] z2=-xcosδ+ysinδ+R (6)
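Formulas (5) and (6) translate directly into code. The sketch below transcribes them term by term; the function names and the sample values of δ and R are assumptions used only for illustration.

```python
import numpy as np

def world_to_left_camera(p, delta, R):
    """Formula (5): world point (x, y, z) in the left-eye camera frame."""
    x, y, z = p
    return np.array([
        x * np.sin(delta) - y * np.cos(delta),
        z,
        -x * np.cos(delta) - y * np.sin(delta) + R,
    ])

def world_to_right_camera(p, delta, R):
    """Formula (6): world point (x, y, z) in the right-eye camera frame."""
    x, y, z = p
    return np.array([
        -x * np.sin(delta) - y * np.cos(delta),
        z,
        -x * np.cos(delta) + y * np.sin(delta) + R,
    ])

# Hypothetical point and rig parameters.
p = (1.0, 0.0, 2.0)
left = world_to_left_camera(p, np.pi / 2, 0.5)
right = world_to_right_camera(p, np.pi / 2, 0.5)
```

For a point with y = 0 the two formulas are mirror images: x2 = -x1 while y and z agree, which matches the symmetric placement of the two cameras about the world x-axis.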

[0079] Based on the camera model, the following relationship exists:

[0080]

[0081] Based on formulas (5), (6) and (7), the following relationship can be derived:

[0082]

[0083] Furthermore, due to the following relationship:

[0084]

[0085] In real-world scenarios, when x >> y and x >> z, the depth value d can be calculated using the following formula:

[0086]

[0087] Where W is the width of the panoramic image, and ψ is the optical flow value ψ1 corresponding to the left-eye camera or the optical flow value ψ2 corresponding to the right-eye camera.

[0088] In one embodiment of the present invention, the radius R of the ring formed by the left-eye camera and the right-eye camera relative to the center of the circle and the angle δ between the z-axis of the camera coordinate system and the x-axis of the world coordinate system are first obtained, and t1 = ... is calculated; then the width W of the panoramic image is obtained and t2 = ... is calculated, where ψ is the optical flow value; finally, the sum of t1 and t2 is calculated as the depth value d corresponding to the pixel.

[0089] By applying the above method, a depth image corresponding to the stereoscopic panoramic image is generated based on the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera. By adding depth information to the stereoscopic panoramic video, a real motion parallax can be formed based on different viewpoints during subsequent display, thereby increasing the user's sense of immersion.

[0090] To enable those skilled in the art to better understand and implement the present invention, embodiments of the present invention provide a method for displaying depth images, as shown in Figure 3.

[0091] Referring to Figure 3, the method for displaying the depth image may include the following steps:

[0092] Step S301: Generate a stereoscopic panoramic image and its corresponding depth image using any of the depth image generation methods described above.

[0093] In practice, any of the above-mentioned methods for generating depth images can be used to generate stereoscopic panoramic images and their corresponding depth images, which will not be elaborated here.

[0094] Step S302: Based on the stereo panoramic image and its corresponding depth image, reconstruct the point cloud map to obtain the point cloud map corresponding to the panoramic image of the left eye camera and the point cloud map corresponding to the panoramic image of the right eye camera.

[0095] In practice, a point cloud map can be reconstructed based on the stereoscopic panoramic image and its corresponding depth image.

[0096] In one embodiment of the present invention, for each pixel p(h,w) in the panoramic image, where h and w are pixel indices, based on its coordinates S on the imaging sphere, the corresponding world coordinates are calculated as follows:

[0097]

[0098] Where O is the origin of the spherical coordinate system, $\overrightarrow{OS}$ is the vector pointing from the origin to S, d is the depth value corresponding to p(h,w), and f is the radius of the sphere.
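Since the formula itself is not reproduced here, the following is only a sketch of one reading consistent with the symbols defined: each pixel is pushed out along the ray from O through S until its distance from O equals the depth d, that is, O + (d / f)·OS. The equirectangular mapping from (h, w) to angles and the function name are assumptions, not taken from the source.

```python
import numpy as np

def panorama_to_point_cloud(depth, f=1.0, origin=(0.0, 0.0, 0.0)):
    """Back-project an H x W depth panorama to an H x W x 3 array of world points.

    Assumes an equirectangular layout: rows span the angle from the z-axis,
    columns span the full turn around it.
    """
    H, W = depth.shape
    h, w = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    theta = np.pi * (h + 0.5) / H
    phi = 2.0 * np.pi * (w + 0.5) / W
    # Vector from O to S on the imaging sphere of radius f, as in formula (1).
    OS = f * np.stack([
        np.sin(theta) * np.cos(phi),
        np.sin(theta) * np.sin(phi),
        np.cos(theta),
    ], axis=-1)
    O = np.asarray(origin, dtype=float)
    # Scale the ray so the point lies at distance d from O.
    return O + (depth[..., None] / f) * OS

cloud = panorama_to_point_cloud(np.full((4, 8), 3.0), f=2.0, origin=(1.0, 0.0, 0.0))
```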

[0099] Step S303: Project the point cloud image onto the view plane and distort its corresponding depth image.

[0100] In practice, for a given eye position, the plane perpendicular to the direction of the eye's gaze is the visual plane of that viewpoint.

[0101] Since the embodiments of the present invention support head rotation, the position of the eyes can be changed, as long as the position of the eyes is within the supported range of head movement, and the center of the line connecting the two eyes does not need to be at the center of the circle.
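A view plane perpendicular to the gaze direction can be sketched with an ordinary perspective projection. This is an illustrative assumption about how the projection might be set up, not the method fixed by the source; the function name, the `up` vector and the `focal` distance are hypothetical, and points behind the eye or a gaze parallel to `up` are not handled.

```python
import numpy as np

def project_to_view_plane(points, eye, gaze, up=(0.0, 0.0, 1.0), focal=1.0):
    """Project world points onto the plane perpendicular to `gaze`.

    Returns (uv, depth): coordinates on the view plane and the distance of
    each point along the gaze direction.
    """
    eye = np.asarray(eye, dtype=float)
    forward = np.asarray(gaze, dtype=float)
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, np.asarray(up, dtype=float))
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    rel = np.asarray(points, dtype=float) - eye
    depth = rel @ forward
    u = focal * (rel @ right) / depth
    v = focal * (rel @ true_up) / depth
    return np.stack([u, v], axis=-1), depth

pts = [[2.0, 0.0, 0.0], [2.0, -1.0, 1.0]]
uv_center, depth_center = project_to_view_plane(pts, eye=(0, 0, 0), gaze=(1, 0, 0))
# Moving the eye sideways shifts the projection of the same point (motion parallax).
uv_moved, _ = project_to_view_plane(pts[:1], eye=(0, -0.5, 0), gaze=(1, 0, 0))
```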

[0102] Step S304: Distort the pixels in the panoramic image corresponding to the distorted depth image onto the view plane, generating a view plane image for the left-eye camera and a view plane image for the right-eye camera.

[0103] Step S305: Output and display the view plane image with the smaller corresponding depth value.
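One possible reading of step S305 is a per-pixel nearest-surface test: where two candidate view plane images cover the same pixel, the one with the smaller depth value is kept. The sketch below illustrates that reading only; the per-pixel interpretation and the function name are assumptions.

```python
import numpy as np

def select_nearest(image_a, depth_a, image_b, depth_b):
    """Per pixel, keep the color whose depth value is smaller."""
    nearer_a = depth_a <= depth_b
    # Broadcast the mask over color channels when present.
    mask = nearer_a[..., None] if image_a.ndim == 3 else nearer_a
    return np.where(mask, image_a, image_b), np.minimum(depth_a, depth_b)

img_a = np.array([[1, 1]])
dep_a = np.array([[1.0, 5.0]])
img_b = np.array([[2, 2]])
dep_b = np.array([[3.0, 2.0]])
merged, merged_depth = select_nearest(img_a, dep_a, img_b, dep_b)
```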

[0104] In specific implementations, steps S301 and S302-S305 may be implemented in different modules. For example, step S301 may be implemented in the shooting module, and steps S302-S305 may be implemented in the display module. The shooting module and the display module can communicate with each other through an interface or a transmission line. When steps S301 and S302-S305 are implemented in different modules, the following may also be included between step S301 and step S302:

[0105] Within the module corresponding to step S301, the stereoscopic panoramic image and its corresponding depth image are compressed to generate compressed data of the stereoscopic panoramic image and its corresponding depth image.

[0106] The compressed data of the stereoscopic panoramic image and its corresponding depth image are transmitted to the module corresponding to step S302 through the interface or transmission line between the module corresponding to step S301 and the module corresponding to step S302.

[0107] Within the module corresponding to step S302, the compressed data of the stereoscopic panoramic image and its corresponding depth image are decompressed, and the stereoscopic panoramic image and its corresponding depth image are obtained again.

[0108] In practice, to reduce additional overhead, a compression algorithm with a high compression ratio can be used to compress depth images.
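The source does not name a codec. As a stand-in, the sketch below quantizes the depth image to 16 bits and compresses it losslessly with zlib, then reverses both steps on the display side. The choice of zlib, the 16-bit quantization, the `max_depth` parameter and the function names are all assumptions made for illustration.

```python
import zlib
import numpy as np

def compress_depth(depth, max_depth):
    """Quantize depth to uint16 over [0, max_depth] and compress the bytes."""
    normalized = np.clip(depth / max_depth, 0.0, 1.0)
    q16 = np.round(normalized * 65535).astype(np.uint16)
    return zlib.compress(q16.tobytes(), 9), depth.shape

def decompress_depth(blob, shape, max_depth):
    """Invert compress_depth; the result differs only by quantization error."""
    q16 = np.frombuffer(zlib.decompress(blob), dtype=np.uint16).reshape(shape)
    return q16.astype(np.float64) / 65535 * max_depth

depth = np.linspace(0.0, 10.0, 12).reshape(3, 4)
blob, shape = compress_depth(depth, 10.0)
restored = decompress_depth(blob, shape, 10.0)
```

The round trip assumes both sides use the same byte order; a real transport format would also need to carry `shape` and `max_depth` alongside the compressed bytes.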

[0109] By applying the above method, a point cloud map is reconstructed based on the generated stereoscopic panoramic video and its corresponding depth image. The point cloud map is then projected onto the view plane, and a view plane image is generated and output by distorting the depth image. Different viewpoints can be generated based on the position of head movement, forming a real motion parallax and increasing the user's sense of immersion.

[0110] To enable those skilled in the art to better understand and implement the present invention, embodiments of the present invention also provide an image capturing and display method, as shown in Figure 4.

[0111] Referring to Figure 4, the image capturing and display method includes the following steps:

[0112] Step S401: Generate a stereoscopic panoramic image.

[0113] Step S402: Generate a depth image corresponding to the stereoscopic panoramic image.

[0114] Step S403: Compress the stereoscopic panoramic image and its corresponding depth image to generate compressed data of the stereoscopic panoramic image and its corresponding depth image, and transmit it to the display module.

[0115] In practice, steps S401, S402 and S403 can be executed in the shooting module.

[0116] Step S404: Decompress the compressed data of the stereoscopic panoramic image and its corresponding depth image to obtain the stereoscopic panoramic image and its corresponding depth image.

[0117] Step S405: Based on the stereoscopic panoramic image and its corresponding depth image, perform point cloud reconstruction.

[0118] Step S406: Based on the point cloud map, generate a new viewpoint image and output it for display.

[0119] In a specific implementation, steps S404, S405, and S406 can be executed in the display module.
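The point cloud reconstruction of step S405 can be illustrated with a short sketch. It assumes that the panorama uses an equirectangular (spherical) layout and that the depth value is the distance along the viewing ray; neither assumption is stated in the text, and the function name `panorama_to_point_cloud` is hypothetical. The function would be called once for the left-eye panorama and once for the right-eye panorama.

```python
import math


def panorama_to_point_cloud(colors, depth):
    """Back-project every panorama pixel to a 3-D point using its depth.

    colors and depth are row-major lists of equal size. Assumed layout:
    columns span longitude [-pi, pi), rows span latitude [pi/2, -pi/2].
    """
    height, width = len(depth), len(depth[0])
    cloud = []
    for v in range(height):
        lat = math.pi / 2 - (v + 0.5) / height * math.pi
        for u in range(width):
            lon = (u + 0.5) / width * 2 * math.pi - math.pi
            d = depth[v][u]
            point = (
                d * math.cos(lat) * math.sin(lon),  # x: to the right
                d * math.sin(lat),                  # y: up
                d * math.cos(lat) * math.cos(lon),  # z: forward
            )
            cloud.append((point, colors[v][u]))
    return cloud
```

Each entry pairs a 3-D position with the colour of the source pixel, which is the information step S406 needs to render a new viewpoint.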

[0120] To enable those skilled in the art to better understand and implement the present invention, embodiments of the present invention also provide an apparatus capable of implementing the above-described method for generating depth images, as shown in Figure 5.

[0121] Referring to Figure 5, the depth image generation apparatus 50 includes: an acquisition unit 51, a first generation unit 52, and a second generation unit 53, wherein:

[0122] The acquisition unit 51 is adapted to acquire images captured by the left-eye camera and the right-eye camera.

[0123] The first generation unit 52 is adapted to generate a stereoscopic panoramic image based on the acquired captured image.

[0124] The second generation unit 53 is adapted to generate a depth image corresponding to the stereo panoramic image based on the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera.

[0125] In a specific implementation, the second generation unit 53 includes: a first conversion subunit (not shown), a first calculation subunit (not shown), a second calculation subunit (not shown), and a first generation subunit (not shown), wherein:

[0126] The first conversion subunit is adapted to convert the world coordinates corresponding to pixels in the stereoscopic panoramic image to the camera coordinates corresponding to the left-eye camera and the right-eye camera, respectively.

[0127] The first calculation subunit is adapted to calculate the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera.

[0128] The second calculation subunit is adapted to calculate the depth value corresponding to the pixel in the stereo panoramic image based on the transformed camera coordinates and the optical flow value.

[0129] The first generation subunit is adapted to generate a depth image corresponding to the stereoscopic panoramic image based on the calculated depth value.

[0130] In one embodiment of the present invention, the second calculation subunit includes: a first calculation module (not shown), a second calculation module (not shown), and a third calculation module (not shown), wherein:

[0131] The first calculation module is adapted to obtain the radius R of the circle formed by the left-eye camera and the right-eye camera relative to the center of the circle, and the angle δ between the z-axis of the camera coordinate system and the x-axis of the world coordinate system, and to calculate ... as t1.

[0132] The second calculation module is adapted to obtain the width W of the panoramic image and to calculate ... as t2, where ψ is the optical flow value.

[0133] The third calculation module is adapted to calculate the sum of t1 and t2 as the depth value d corresponding to the pixel.

[0134] In a specific implementation, the first generation unit 52 includes: a second conversion subunit (not shown), a first calculation subunit (not shown), a second generation subunit (not shown), and a third generation subunit (not shown), wherein:

[0135] The second conversion subunit is adapted to perform spherical or cylindrical projection on the acquired image, that is, to convert the pixel coordinates corresponding to the pixels in the acquired image to spherical or cylindrical coordinates.

[0136] The first calculation subunit is adapted to calculate the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera.

[0137] The second generation subunit is adapted to generate a panoramic image of the left eye camera based on the left eye camera and its corresponding optical flow value for each column of pixels in the overlapping region, and to generate a panoramic image of the right eye camera based on the right eye camera and its corresponding optical flow value.

[0138] The third generation subunit is adapted to synthesize panoramic images from the left-eye camera and the right-eye camera based on a fusion algorithm to generate a stereoscopic panoramic image.
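The coordinate conversion performed by the second conversion subunit in [0135] can be sketched with the standard pinhole-to-sphere and pinhole-to-cylinder mappings. The text does not give the exact formulas used, so the mappings below, the function names, and the parameters `focal`, `cx`, and `cy` (focal length in pixels and principal point) are assumptions made for illustration.

```python
import math


def pixel_to_spherical(u, v, focal, cx, cy):
    """Map a pinhole pixel (u, v) to (longitude, latitude) in radians.

    Latitude grows with v, i.e. downward in image coordinates.
    """
    x, y = u - cx, v - cy
    lon = math.atan2(x, focal)
    lat = math.atan2(y, math.hypot(x, focal))
    return lon, lat


def pixel_to_cylindrical(u, v, focal, cx, cy):
    """Map a pinhole pixel (u, v) to (angle in radians, normalised height)."""
    x, y = u - cx, v - cy
    angle = math.atan2(x, focal)
    height = y / math.hypot(x, focal)
    return angle, height
```

After either mapping, horizontally adjacent cameras differ mainly by a shift along the angle axis, which is what makes column-wise optical flow in the overlapping region practical.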

[0139] In one embodiment of the present invention, the first calculation subunit is adapted to calculate the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera based on any of the following algorithms: a phase correlation algorithm, a bit correlation algorithm, or the Lucas-Kanade algorithm.
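Of the algorithms listed in [0139], Lucas-Kanade is the simplest to show in a few lines. The sketch below estimates a single flow vector for one small grayscale patch by solving the 2x2 least-squares system built from image gradients. It is a single-window, single-scale illustration with a hypothetical function name; a practical system would normally use a pyramidal implementation over many windows in the overlapping region.

```python
def lucas_kanade_patch(prev, curr):
    """Estimate one (u, v) flow vector between two equal-size patches.

    prev and curr are row-major lists of grayscale values. Gradients use
    central differences, so only interior pixels contribute.
    """
    h, w = len(prev), len(prev[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0
            it = curr[y][x] - prev[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("patch has too little texture to estimate flow")
    # Solve [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt] by Cramer's rule.
    u = (-sxt * syy + sxy * syt) / det
    v = (-syt * sxx + sxy * sxt) / det
    return u, v
```

The returned pair is the displacement, in pixels, that best maps the first patch onto the second under the brightness-constancy assumption. It is only reliable for small shifts and textured patches.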

[0140] In specific implementation, the working process and principle of the generating device 50 can be referred to the description in the method provided in the above embodiments, and will not be repeated here.

[0141] To enable those skilled in the art to better understand and implement the present invention, embodiments of the present invention also provide an apparatus capable of implementing the above-described depth image display method, as shown in Figure 6.

[0142] Referring to Figure 6, the depth image display device 60 includes: a third generation unit 61, a reconstruction unit 62, a distortion unit 63, a fourth generation unit 64, and an output unit 65, wherein:

[0143] The third generation unit 61 is adapted to generate a stereoscopic panoramic image and its corresponding depth image using any of the depth image generation methods described above.

[0144] The reconstruction unit 62 is adapted to reconstruct point cloud maps based on the stereo panoramic image and its corresponding depth image, and obtain point cloud maps corresponding to the panoramic image of the left eye camera and the panoramic image of the right eye camera.

[0145] The distortion unit 63 is adapted to project the point cloud map onto the view plane and distort its corresponding depth image.

[0146] The fourth generation unit 64 is adapted to distort the pixels in the panoramic image corresponding to the distorted depth image to the view plane, generating a left-eye camera view plane image and a right-eye camera view plane image.

[0147] The output unit 65 is adapted to output and display a view plane image with a smaller corresponding depth value.

[0148] In a specific implementation, the view plane is defined as a plane perpendicular to the direction of eye gaze.
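The projection onto the view plane and the preference for the smaller depth value can be sketched as a perspective projection with a depth buffer. This is one possible reading, not the method as claimed: it assumes that "smaller depth value" is resolved per pixel, that the view plane sits at distance `focal` along the gaze direction, and that the cloud is a list of `(point, colour)` pairs. All names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return tuple(x / n for x in a)


def render_view_plane(cloud, eye, gaze, up, focal, width, height):
    """Project coloured points onto a plane perpendicular to `gaze`.

    Where several points land on the same pixel, the nearest one wins.
    """
    forward = _unit(gaze)
    right = _unit(_cross(up, forward))
    true_up = _cross(forward, right)
    image = [[None] * width for _ in range(height)]
    zbuf = [[math.inf] * width for _ in range(height)]
    for point, color in cloud:
        rel = _sub(point, eye)
        z = _dot(rel, forward)  # depth along the gaze direction
        if z <= 0:
            continue  # behind the viewer
        col = math.floor(width / 2 + focal * _dot(rel, right) / z)
        row = math.floor(height / 2 - focal * _dot(rel, true_up) / z)
        if 0 <= row < height and 0 <= col < width and z < zbuf[row][col]:
            zbuf[row][col] = z
            image[row][col] = color
    return image, zbuf
```

Rendering the same cloud from a slightly shifted `eye` moves near points further across the image than far points, which is the motion parallax effect the display method aims for.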

[0149] In a specific implementation, the display device 60 may further include: a compression unit (not shown) and a decompression unit (not shown), wherein:

[0150] The compression unit is adapted to compress the stereoscopic panoramic image and its corresponding depth image to generate compressed data of the stereoscopic panoramic image and its corresponding depth image.

[0151] The decompression unit is adapted to decompress the compressed data of the stereoscopic panoramic image and its corresponding depth image to obtain the stereoscopic panoramic image and its corresponding depth image.

[0152] In specific implementation, the working process and principle of the display device 60 can be referred to the description in the method provided in the above embodiments, and will not be repeated here.

[0153] This invention provides a computer-readable storage medium, which is a non-volatile or non-transitory storage medium, storing computer instructions thereon. When the computer instructions are executed, they perform the steps corresponding to any of the depth image generation methods described above, which will not be elaborated here.

[0154] This invention provides a computer-readable storage medium, which is a non-volatile or non-transitory storage medium, storing computer instructions thereon. When the computer instructions are executed, they perform the steps corresponding to any of the above-described depth image display methods, which will not be elaborated here.

[0155] This invention provides a depth image generation system, including a memory and a processor. The memory stores computer instructions that can be executed on the processor. When the processor executes the computer instructions, it performs the steps corresponding to any of the depth image generation methods described above, which will not be elaborated here.

[0156] This invention provides a depth image display system, including a memory and a processor. The memory stores computer instructions that can be executed on the processor. When the processor executes the computer instructions, it performs the steps corresponding to any of the depth image display methods described above, which will not be elaborated here.

[0157] Those skilled in the art will understand that all or part of the steps in the various methods of the above embodiments can be implemented by a program instructing related hardware. The program can be stored in a computer-readable storage medium, which may include ROM, RAM, disk, or optical disk, etc.

[0158] While the present invention has been disclosed above, it is not limited thereto. Any person skilled in the art can make various modifications and alterations without departing from the spirit and scope of the invention; therefore, the scope of protection of the present invention should be determined by the scope defined in the claims.

Claims

1. A method for displaying a depth image, characterized in that it comprises: generating a stereoscopic panoramic image and its corresponding depth image using a depth image generation method; reconstructing a point cloud map based on the stereoscopic panoramic image and its corresponding depth image, to obtain the point cloud map corresponding to the panoramic image of the left-eye camera and the point cloud map corresponding to the panoramic image of the right-eye camera; projecting the point cloud map onto the view plane and distorting its corresponding depth image; distorting the pixels in the panoramic image corresponding to the distorted depth image to the view plane, to generate the view plane image of the left-eye camera and the view plane image of the right-eye camera; and outputting and displaying the view plane image corresponding to the smaller depth value; wherein the depth image generation method comprises: acquiring images captured by a left-eye camera and a right-eye camera; generating a stereoscopic panoramic image based on the acquired images; calculating the depth value corresponding to a pixel in the stereoscopic panoramic image based on the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera and the converted camera coordinates; and generating a depth image corresponding to the stereoscopic panoramic image based on the calculated depth value; and wherein the step of calculating the depth value corresponding to the pixel in the stereoscopic panoramic image based on the converted camera coordinates and the optical flow value comprises: obtaining the radius R of the circle formed by the left-eye camera and the right-eye camera relative to the center of the circle, and the angle θ between the z-axis of the camera coordinate system and the x-axis of the world coordinate system, and calculating ... as t1; obtaining the width W of the panoramic image and calculating ... as t2, where ... is the optical flow value; and calculating the sum of t1 and t2 as the depth value d corresponding to the pixel.

2. The method for displaying a depth image according to claim 1, characterized in that the view plane is defined as a plane perpendicular to the direction of eye gaze.

3. The method for displaying a depth image according to claim 1, characterized in that it further comprises: compressing the stereoscopic panoramic image and its corresponding depth image to generate compressed data of the stereoscopic panoramic image and its corresponding depth image; and decompressing the compressed data of the stereoscopic panoramic image and its corresponding depth image to obtain the stereoscopic panoramic image and its corresponding depth image.

4. The method for displaying a depth image according to claim 1, characterized in that calculating the depth value corresponding to a pixel in the stereoscopic panoramic image based on the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera and the converted camera coordinates, and generating a depth image corresponding to the stereoscopic panoramic image based on the calculated depth value, comprises: converting the world coordinates corresponding to the pixels in the stereoscopic panoramic image to the camera coordinates corresponding to the left-eye camera and the right-eye camera, respectively; calculating the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera; calculating, based on the converted camera coordinates and the optical flow value, the depth value corresponding to the pixel in the stereoscopic panoramic image; and generating, based on the calculated depth value, a depth image corresponding to the stereoscopic panoramic image.

5. The method for displaying a depth image according to claim 4, characterized in that generating a stereoscopic panoramic image based on the acquired images comprises: performing spherical or cylindrical projection on the acquired images, that is, converting the pixel coordinates corresponding to the pixels in the acquired images to spherical or cylindrical coordinates; calculating the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera; for each column of pixels in the overlapping region, generating a panoramic image of the left-eye camera based on the left-eye camera and its corresponding optical flow value, and generating a panoramic image of the right-eye camera based on the right-eye camera and its corresponding optical flow value; and synthesizing the panoramic images of the left-eye camera and the right-eye camera using a fusion algorithm to generate the stereoscopic panoramic image.

6. The method for displaying a depth image according to claim 4 or 5, characterized in that the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera is calculated using any of the following algorithms: a phase correlation algorithm, a position correlation algorithm, or the Lucas-Kanade algorithm.

7. A depth image display device, characterized in that it comprises: a third generation unit adapted to generate a stereoscopic panoramic image and its corresponding depth image using a depth image generation method; a reconstruction unit adapted to reconstruct a point cloud map based on the stereoscopic panoramic image and its corresponding depth image, and to obtain the point cloud map corresponding to the panoramic image of the left-eye camera and the point cloud map corresponding to the panoramic image of the right-eye camera; a distortion unit adapted to project the point cloud map onto the view plane and distort its corresponding depth image; a fourth generation unit adapted to distort the pixels in the panoramic image corresponding to the distorted depth image to the view plane, generating the view plane image of the left-eye camera and the view plane image of the right-eye camera; and an output unit adapted to output and display the view plane image corresponding to the smaller depth value; wherein the third generation unit includes: an acquisition unit adapted to acquire images captured by a left-eye camera and a right-eye camera; a first generation unit adapted to generate a stereoscopic panoramic image based on the acquired images; and a second generation unit adapted to calculate the depth value corresponding to a pixel in the stereoscopic panoramic image based on the optical flow value of the overlapping area between adjacent images captured by the left-eye camera and the right-eye camera and the converted camera coordinates, and to generate a depth image corresponding to the stereoscopic panoramic image based on the calculated depth value; and wherein the second generation unit includes a second calculation subunit, which includes a first calculation module, a second calculation module, and a third calculation module, wherein: the first calculation module is adapted to obtain the radius R of the circle formed by the left-eye camera and the right-eye camera relative to the center of the circle, and the angle between the z-axis of the camera coordinate system and the x-axis of the world coordinate system, and to calculate ... as t1; the second calculation module is adapted to obtain the width W of the panoramic image and to calculate ... as t2, where ... is the optical flow value; and the third calculation module is adapted to calculate the sum of t1 and t2 as the depth value d corresponding to the pixel.

8. The display device for a depth image according to claim 7, characterized in that the view plane is defined as a plane perpendicular to the direction of eye gaze.

9. The display device for a depth image according to claim 7, characterized in that it further comprises: a compression unit adapted to compress the stereoscopic panoramic image and its corresponding depth image to generate compressed data of the stereoscopic panoramic image and its corresponding depth image; and a decompression unit adapted to decompress the compressed data of the stereoscopic panoramic image and its corresponding depth image to obtain the stereoscopic panoramic image and its corresponding depth image.

10. The display device for a depth image according to claim 7, characterized in that the second generation unit further includes: a first conversion subunit adapted to convert the world coordinates corresponding to the pixels in the stereoscopic panoramic image to the camera coordinates corresponding to the left-eye camera and the right-eye camera, respectively; a first calculation subunit adapted to calculate the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera; and a first generation subunit adapted to generate a depth image corresponding to the stereoscopic panoramic image based on the calculated depth value.

11. The display device for a depth image according to claim 7, characterized in that the first generation unit includes: a second conversion subunit adapted to perform spherical or cylindrical projection on the acquired images, that is, to convert the pixel coordinates corresponding to the pixels in the acquired images to spherical or cylindrical coordinates; a first calculation subunit adapted to calculate the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera; a second generation subunit adapted to generate, for each column of pixels in the overlapping region, a panoramic image of the left-eye camera based on the left-eye camera and its corresponding optical flow value, and a panoramic image of the right-eye camera based on the right-eye camera and its corresponding optical flow value; and a third generation subunit adapted to synthesize the panoramic images of the left-eye camera and the right-eye camera based on a fusion algorithm to generate the stereoscopic panoramic image.

12. The display device for a depth image according to claim 10 or 11, characterized in that the first calculation subunit is adapted to calculate the optical flow value of the overlapping region between adjacent images captured by the left-eye camera and the right-eye camera based on any of the following algorithms: a phase correlation algorithm, a position correlation algorithm, or the Lucas-Kanade algorithm.

13. A computer-readable storage medium storing computer instructions thereon, characterized in that the computer instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 6.

14. A depth image display system, comprising a memory and a processor, wherein the memory stores computer instructions executable on the processor, characterized in that, when the processor executes the computer instructions, it performs the steps of the method according to any one of claims 1 to 6.