Three-dimensional reconstruction method and computer device
By determining the initial model and observation viewpoint during 3D reconstruction and performing completion processing, the problem of missing 3D information is solved, a wider rendering boundary and a more realistic 3D reconstruction model are achieved, and the generation process is simplified.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- ARASHI VISION INC
- Filing Date
- 2024-12-31
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies in 3D reconstruction suffer from limited rendering boundaries due to the lack of 3D information, and the training cost is difficult to balance with the image rendering effect, affecting the realism of the 3D reconstruction model.
By determining the initial 3D reconstruction model and the observation viewpoint, 3D reconstruction completion processing is performed to generate the target 3D reconstruction model, ensuring that the model has 3D information completion effect for each occluded viewpoint and expanding the rendering boundary range.
It simplifies the generation process of 3D reconstruction models, improves the realism of the models and rendering effects, expands the rendering boundary range, and meets the needs of multi-degree-of-freedom image rendering.
Description
Technical Field
[0001] This disclosure relates to the field of three-dimensional reconstruction, specifically to a three-dimensional reconstruction method and computer equipment.
Background Technology
[0002] In recent years, with the rapid development of image processing technology, 3D reconstruction methods for restoring 3D information from images and reconstructing 3D scenes have been widely applied in fields such as image perspective switching, surveillance, and autonomous driving. By performing 3D reconstruction based on a single image, a corresponding 3D reconstruction model can be obtained, which can then be used for multi-degree-of-freedom image rendering.
[0003] However, when related technologies are used for 3D reconstruction, the lack of 3D information limits the rendering boundary range under multiple degrees of freedom. The 3D information of the 3D reconstruction model then has to be completed through complex training methods, which makes it difficult to balance the training cost of the 3D reconstruction model against the image rendering effect and can easily impair the realism of the 3D reconstruction model.
Summary of the Invention
[0004] To overcome the problems existing in related technologies, this disclosure provides a three-dimensional reconstruction method and computer equipment.
[0005] According to a first aspect of the present disclosure, a three-dimensional reconstruction method is provided, the three-dimensional reconstruction method comprising:
[0006] determining an initial 3D reconstruction model based on an initial image;
[0007] determining an observation viewpoint based on the initial 3D reconstruction model, the observation viewpoint having an occluded viewing angle in the initial 3D reconstruction model; and
[0008] performing 3D reconstruction completion processing based on each observation viewpoint and the initial 3D reconstruction model to obtain a target 3D reconstruction model, such that the target 3D reconstruction model includes 3D information for at least part of the occluded viewing angle of each observation viewpoint.
[0009] According to a second aspect of the present disclosure, a computer device is provided, including a memory and a processor, the memory storing a computer program, wherein when the computer program is executed by the processor, the processor is configured to: determine an initial three-dimensional reconstruction model based on an initial image;
[0010] determine an observation viewpoint based on the initial 3D reconstruction model, the observation viewpoint having an occluded viewing angle in the initial 3D reconstruction model; and
[0011] perform 3D reconstruction completion processing based on each observation viewpoint and the initial 3D reconstruction model to obtain a target 3D reconstruction model, such that the target 3D reconstruction model includes 3D information for at least part of the occluded viewing angle of each observation viewpoint.
[0012] The technical solutions provided by the embodiments of this disclosure may include the following beneficial effects: by determining the initial three-dimensional reconstruction model and the observation viewpoint, and performing three-dimensional reconstruction completion processing based on each observation viewpoint and the initial three-dimensional reconstruction model, the generated target three-dimensional reconstruction model can simultaneously achieve three-dimensional information completion for each occluded viewpoint, thereby expanding the rendering boundary range of the target three-dimensional reconstruction model, simplifying the generation process of the target three-dimensional reconstruction model while ensuring the image rendering effect, and improving the realism of the target three-dimensional reconstruction model.
[0013] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and are not intended to limit this disclosure.
Brief Description of the Drawings
[0014] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.
[0015] Figure 1 is a flowchart illustrating a three-dimensional reconstruction method according to an exemplary embodiment.
[0016] Figure 2 is a flowchart illustrating, according to an exemplary embodiment, the determination of multiple observation viewpoints based on an initial image.
[0017] Figure 3 is a flowchart illustrating, according to an exemplary embodiment, the determination of an initial 3D reconstruction model based on an initial image.
[0018] Figure 4 is a flowchart illustrating, according to an exemplary embodiment, the process of sequentially completing and updating an initial 3D reconstruction model based on each observation viewpoint until all observation viewpoints have completed the completion and update process, thereby obtaining a target 3D reconstruction model.
[0019] Figure 5 is a flowchart illustrating, according to an exemplary embodiment, comparing, in the coordinate system of each known viewpoint, the depth information corresponding to the current observation viewpoint with the depth information corresponding to that known viewpoint, and generating an occlusion mask based on the comparison result.
[0020] Figure 6 is a flowchart illustrating, according to an exemplary embodiment, a process of completing an observation image based on an occlusion mask to obtain a completed image.
[0021] Figure 7 is a schematic diagram of an initial image according to an exemplary embodiment.
[0022] Figure 8 is a schematic diagram of a reconstructed image according to an exemplary embodiment.
[0023] Figure 9 is a flowchart illustrating a three-dimensional reconstruction method according to another exemplary embodiment.
[0024] Figure 10 is a block diagram of a computer device according to an exemplary embodiment.
[0025] In the figures:
[0026] 100 - Computer device; 101 - Computing unit; 102 - ROM; 103 - RAM; 104 - Bus; 105 - Input / output interface; 106 - Input unit; 107 - Output unit; 108 - Storage unit; 109 - Communication unit.
Detailed Description
[0027] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the invention as detailed in the appended claims.
[0028] In recent years, with the rapid development of image processing technology, 3D reconstruction methods for restoring 3D information from images and reconstructing 3D scenes have been widely applied in many fields such as image perspective switching, surveillance, and autonomous driving. By reconstructing 3D based on a single image, a corresponding 3D reconstruction model can be generated. This model represents the structure, shape, color, and other attributes and features of image elements in 3D space. Therefore, the 3D reconstruction model can be used for multi-degree-of-freedom (DOF) image rendering, such as 6-DOF rendering, to obtain images with different perspectives from the original image.
[0029] However, when using related technologies for 3D reconstruction, 3D information needs to be obtained from images. Due to occlusion between image elements and the different distances from the shooting device at the original viewpoint, some 3D information is missing. The further away from the original viewpoint, the more 3D information is missing, making it difficult to render images at the edge of the 3D reconstruction model, thus limiting the rendering boundary range under multiple degrees of freedom.
[0030] If the 3D information of the 3D reconstruction model is supplemented by related technologies, the training time and cost of the 3D reconstruction model will be greatly increased, and the realism of the 3D reconstruction model will be interfered with. There is a problem that it is difficult to balance the training cost of the 3D reconstruction model and the image rendering effect.
[0031] Based on this, an exemplary embodiment of this disclosure provides a 3D reconstruction method. An initial 3D reconstruction model is determined based on an initial image, and observation viewpoints are determined based on the initial 3D reconstruction model. 3D reconstruction completion processing is performed based on each observation viewpoint and the initial 3D reconstruction model to obtain a target 3D reconstruction model. This results in the target 3D reconstruction model having 3D information completion for occlusion perspectives from each observation viewpoint, providing a basis for multi-degree-of-freedom image rendering. By determining the initial 3D reconstruction model and observation viewpoints, and performing 3D reconstruction completion processing based on each observation viewpoint and the initial 3D reconstruction model, the generated target 3D reconstruction model can simultaneously achieve 3D information completion for each occlusion perspective, expanding the rendering boundary range of the target 3D reconstruction model. While ensuring image rendering effects, this simplifies the generation process of the target 3D reconstruction model and improves its realism.
[0032] In one exemplary embodiment, a three-dimensional reconstruction method is provided, applied to a computer device. The computer device may include, for example, a camera, a mobile phone, or another device with imaging capabilities, and may also include a terminal device that is connected to the imaging device and has data processing capabilities. As shown in Figure 1, the three-dimensional reconstruction method includes:
[0033] S100. Based on the initial image, determine the initial 3D reconstruction model.
[0034] In step S100, the initial image can be an image captured by a computer device, or an image captured by another imaging device and transmitted to the computer device. The initial image can be an image obtained in image capture mode, or an image frame extracted from a video obtained in video capture mode. The initial image can be determined based on the user's selection, reflecting the user's intention and need to reconstruct 3D information and rebuild a 3D scene using 3D reconstruction.
[0035] An initial 3D reconstruction model is generated based on the initial image. This initial 3D reconstruction model is directly generated from the initial image and can restore the original 3D information and scene of the initial image. However, due to occlusion between image elements in the initial image, and the fact that image elements in the edge regions of the image are far from the original viewpoint, the initial 3D reconstruction model suffers from missing 3D information.
[0036] S200. Based on the initial 3D reconstruction model, determine the observation viewpoint. The observation viewpoint has an occluded viewing angle in the initial 3D reconstruction model.
[0037] In step S200, an observation viewpoint that reflects the user's observation intention can be determined based on the initial three-dimensional reconstruction model. The observation viewpoint can have a total of 6 degrees of freedom in three-dimensional space: it can translate along and rotate about each of three mutually perpendicular axes.
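As a minimal illustration of the six degrees of freedom described above, an observation viewpoint could be represented by three translations and three rotations. The class name `ObservationViewpoint`, its field names, and the axis convention (camera looking along +z at zero rotation, yaw about the y axis, pitch about the x axis) are assumptions made for this sketch and are not specified in the disclosure.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ObservationViewpoint:
    """Hypothetical 6-DOF pose: 3 translations plus 3 rotations (radians)."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    roll: float = 0.0   # rotation about the z axis (assumed convention)
    pitch: float = 0.0  # rotation about the x axis (assumed convention)
    yaw: float = 0.0    # rotation about the y axis (assumed convention)

    def moved(self, dx=0.0, dy=0.0, dz=0.0):
        """Return a copy translated along the three axes."""
        return replace(self, x=self.x + dx, y=self.y + dy, z=self.z + dz)

    def rotated(self, droll=0.0, dpitch=0.0, dyaw=0.0):
        """Return a copy with the three rotation angles incremented."""
        return replace(self, roll=self.roll + droll,
                       pitch=self.pitch + dpitch, yaw=self.yaw + dyaw)

    def forward(self):
        """Unit viewing direction; roll does not change it."""
        cp = math.cos(self.pitch)
        return (math.sin(self.yaw) * cp, -math.sin(self.pitch),
                math.cos(self.yaw) * cp)


start = ObservationViewpoint()
# Move one unit along x, then turn a quarter turn about the y axis.
side = start.moved(dx=1.0).rotated(dyaw=math.pi / 2)
```

The pose is immutable here so that each candidate viewpoint can be derived from another without side effects; a real system might instead store a rotation matrix or quaternion.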
[0038] Understandably, the initial image has an initial viewing angle corresponding to its capture, which determines the framing range of the initial image. Similarly, each observation viewpoint determined from the initial 3D reconstruction model has a corresponding observation viewing angle within the initial 3D reconstruction model. The observation viewing angle represents the range of 3D information that the corresponding observation viewpoint can acquire within the initial 3D reconstruction model. It can be understood as the imaging viewing angle when capturing an image in the real 3D scene corresponding to the initial image, using the spatial position of the observation viewpoint as the shooting position. The observation viewing angle determines the perspective transformation relationship between different observation viewpoints. It is needed for the subsequent 3D reconstruction completion processing, and it determines the size and orientation of the rendering range when an image is rendered based on the observation viewpoint and the 3D reconstruction model.
[0039] The observation viewing angle of an observation viewpoint can be determined based on the user's image rendering requirements and the position of the observation viewpoint in the initial image. Understandably, if a panoramic image is to be rendered subsequently, which requires all the 3D information corresponding to the pixels at the observation viewpoint or within a preset range, the observation viewing angle can be set to a 360° panoramic view. If an image of a specific direction and range is to be rendered subsequently, which requires only part of the 3D information corresponding to the pixels at the observation viewpoint or within a preset range, the observation viewing angle can be set according to the image rendering requirements. If the boundary of the preset region is close to the edge of the initial image, the observation viewing angle can be set to point from the outside of the 3D scene inward, toward the objects in the 3D scene corresponding to the image elements in the initial image. If the boundary of the preset region is close to the center of the initial image, the observation viewing angle can be set to point from the inside of the 3D scene outward, toward the objects in the 3D scene corresponding to the image elements in the initial image.
[0040] Each observation viewpoint also has, in the initial 3D reconstruction model, an occluded viewing angle that is smaller than or equal to its observation viewing angle. The occluded viewing angle is the range within which the corresponding observation viewpoint cannot obtain complete 3D information because objects in the initial 3D reconstruction model occlude one another. It can be understood as the angular range over which, with the spatial position of the observation viewpoint as the shooting position in the real 3D scene corresponding to the initial image, light cannot pass the occluding object to reach the farthest object.
S300. Perform 3D reconstruction completion processing based on each observation viewpoint and the initial 3D reconstruction model to obtain the target 3D reconstruction model, so that the target 3D reconstruction model includes 3D information for at least part of the occluded viewing angle of each observation viewpoint.
[0041] In step S300, the initial 3D reconstruction model is subjected to 3D reconstruction completion processing based on each observation viewpoint. After the 3D reconstruction completion processing has been completed for every observation viewpoint, a target 3D reconstruction model with complete 3D information is obtained. This target 3D reconstruction model can include the actual 3D information corresponding to the initial image in 3D space, such as the position (including depth), color, and brightness of each pixel. Furthermore, the target 3D reconstruction model includes the 3D information of all or part of the occluded viewing angle of each observation viewpoint, so that the 3D information for at least part of the occluded viewing angle of each observation viewpoint is completed. For example, in the 3D reconstruction completion processing for each observation viewpoint, the area requiring information completion can be determined based on the current 3D reconstruction model and the boundary viewing angle corresponding to that observation viewpoint, and the missing 3D information can be completed by performing image completion on the 2D image.
[0042] The 3D reconstruction completion processing is performed based on each observation viewpoint. When it is performed for a given observation viewpoint, the object of the processing is the 3D reconstruction model that has already been completed for all observation viewpoints preceding the current one. Otherwise, the final target 3D reconstruction model could not provide completed 3D information for every occluded viewing angle at the same time.
[0043] Compared with existing technologies, the above method does not require information completion of the entire initial image, nor does it require complex calculation processes. It only needs to perform simple 3D reconstruction completion processing based on the determined observation viewpoint to ensure the integrity of the 3D information of the target 3D reconstruction model, thereby expanding the rendering boundary range of the target 3D reconstruction model.
[0044] In this embodiment, an initial 3D reconstruction model is determined based on the initial image, and the observation viewpoints are determined based on the initial 3D reconstruction model. 3D reconstruction completion processing can be performed based on each observation viewpoint and the initial 3D reconstruction model to obtain the target 3D reconstruction model. This ensures that the target 3D reconstruction model has 3D information completion for occlusion perspectives from each observation viewpoint, providing a basis for multi-degree-of-freedom image rendering. By determining the initial 3D reconstruction model and observation viewpoints, and performing 3D reconstruction completion processing based on each observation viewpoint and the initial 3D reconstruction model, the generated target 3D reconstruction model can simultaneously achieve 3D information completion for each occlusion perspective, expanding the rendering boundary range of the target 3D reconstruction model. This simplifies the generation process of the target 3D reconstruction model while ensuring image rendering quality, and improves the realism of the target 3D reconstruction model.
[0045] In some embodiments, the observation viewpoint is determined based on an initial 3D reconstruction model, including at least one of the following methods: determining the observation viewpoint based on a user's point-marking operation on a point cloud interactive interface, wherein the point cloud interactive interface is generated based on the initial 3D reconstruction model; determining the observation viewpoint based on a user's viewpoint selection operation on a 3D model interactive interface, wherein the 3D model interactive interface is generated based on the initial 3D reconstruction model; or acquiring inertial measurement unit (IMU) data from an imaging device and determining the observation viewpoint based on the IMU data and the initial 3D reconstruction model.
[0046] When determining the observation viewpoint based on the initial 3D reconstruction model, it is necessary to ensure that the observation viewpoint meets the user's observation intent. After obtaining the initial 3D reconstruction model, a point cloud interactive interface can be generated based on the initial 3D reconstruction model and displayed on the user interface, allowing interaction with the user. The point cloud interactive interface contains the initial 3D point cloud corresponding to the initial 3D reconstruction model, and the user can perform interactive operations such as selection and movement on various points in the 3D point cloud. The user's point-marking operations on the point cloud interactive interface represent the user's intention to select the observation viewpoint, i.e., the observation intent. Therefore, the observation viewpoint can be determined based on the user's point-marking operations on the point cloud interactive interface.
[0047] After obtaining the initial 3D reconstruction model, a 3D model interactive interface can be generated based on the initial 3D reconstruction model and displayed on the user interface, allowing for user interaction. The 3D model interactive interface displays a 3D model image corresponding to the initial 3D reconstruction model, and users can perform interactive operations such as sliding and dragging on the 3D model image. The user's viewpoint selection operation on the 3D model interactive interface represents the user's intention in selecting an observation viewpoint, i.e., the observation intention. Therefore, the observation viewpoint can be determined based on the user's viewpoint selection operation on the 3D model interactive interface.
[0048] After obtaining the initial 3D reconstruction model, inertial measurement unit (IMU) data from the imaging device can be acquired. The IMU can be composed, for example, of a three-axis accelerometer and a three-axis gyroscope, so that the IMU data characterizes the device's pose. Understandably, when the initial image is a single frame, IMU data can be acquired over a period before and after the initial image is captured, to reflect how the user physically moved the device around the time of capture. Because the IMU data reflects the user's motion control of the device, it represents the user's intention in selecting the observation viewpoint, i.e., the observation intention. Therefore, the observation viewpoint can be determined based on the IMU data and the initial 3D reconstruction model.
[0049] In this embodiment, the observation viewpoint is determined by any of the above methods, which realizes the determination of the observation viewpoint based on the initial 3D reconstruction model. This provides a basis for 3D reconstruction completion processing and ensures that each observation viewpoint can meet the user's observation intention of the initial 3D reconstruction model. This allows the user to autonomously select the observation viewpoint through different interactive operations or by controlling the movement of the shooting device, thereby improving the user experience.
[0050] In other embodiments, multiple observation viewpoints can be determined directly based on the initial image. As shown in Figure 2, determining multiple observation viewpoints based on the initial image includes:
[0051] S410. Based on the initial image, determine the depth information of the initial image.
[0052] In step S410, the depth information of the initial image is determined based on the initial image. This depth information characterizes the depth value of each pixel in the initial image, representing the distance between image elements in the initial image and the lens at the time of capture. It reflects the occlusion relationship of image elements in three-dimensional space and their distance from the initial viewpoint. For example, the initial image can be input into a pre-trained depth estimation network model to obtain the depth information of the initial image.
[0053] S420. Determine the preset region based on the depth information of the initial image.
[0054] In step S420, a preset region can be determined based on the depth information of the initial image, so that the preset region corresponds to a specific image depth range, thereby meeting the accuracy requirements of 3D reconstruction and subsequent image rendering. For example, the foreground depth of the initial image can be determined based on the depth information of the initial image, and a circular preset region can be obtained by taking the center point of the initial image as the center and 0.8 times the foreground depth as the radius.
[0055] S430. Determine multiple observation viewpoints at preset intervals along the boundary of the preset region.
[0056] In step S430, multiple points are uniformly selected at preset intervals along the boundary of the preset region, serving as multiple observation viewpoints. This determines the multiple observation viewpoints, providing a target and basis for subsequent 3D reconstruction and completion processing. The number of observation viewpoints can be, for example, N, where N is a positive integer greater than 1. The smaller the preset interval, the higher the accuracy of 3D reconstruction and subsequent image rendering. By controlling the preset interval to control the number of observation viewpoints while ensuring the accuracy of 3D reconstruction and image rendering, the time and computational load of subsequent 3D reconstruction and completion processing can be reduced.
[0057] In this embodiment, the depth information of the initial image is determined, and a preset region is determined based on this depth information. Then, multiple observation viewpoints are determined at preset intervals along the boundaries of the preset region, thus providing a target and basis for subsequent 3D reconstruction and completion processing. Using the depth information of the initial image as the basis for determining the preset region ensures that the preset region corresponds to a specific image depth range, thereby guaranteeing that the size of the preset region meets the accuracy requirements of 3D reconstruction and subsequent image rendering.
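Steps S420 and S430 can be sketched as follows. This is only an illustration under stated assumptions: the disclosure does not say how the foreground depth is computed, so the hypothetical helper `foreground_depth` simply takes the smallest positive depth value; the preset interval is assumed to be an angle in degrees; and the circle is treated as lying in a 2D plane with the given center. The depth map itself would come from step S410 (for example, a depth estimation network), which is replaced here by a toy array.

```python
import math


def foreground_depth(depth_map):
    """Assumption: the foreground depth is the smallest positive depth value."""
    return min(d for row in depth_map for d in row if d > 0)


def sample_viewpoints(center, radius, interval_deg):
    """Place viewpoints on the circle boundary, one every interval_deg degrees."""
    if not 0 < interval_deg <= 360:
        raise ValueError("interval_deg must be in (0, 360]")
    count = int(360 // interval_deg)
    points = []
    for i in range(count):
        angle = math.radians(i * interval_deg)
        points.append((center[0] + radius * math.cos(angle),
                       center[1] + radius * math.sin(angle)))
    return points


depth_map = [[2.0, 5.0], [3.0, 0.0]]  # toy depth values; 0.0 marks "no depth"
radius = 0.8 * foreground_depth(depth_map)  # 0.8 x foreground depth, as in the text
viewpoints = sample_viewpoints((0.0, 0.0), radius, interval_deg=90)
```

A smaller `interval_deg` yields more observation viewpoints, which matches the trade-off described above between reconstruction accuracy and the computational load of the later completion processing.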
[0058] In some embodiments, as shown in Figure 3, determining the initial 3D reconstruction model based on the initial image includes:
[0059] S110. Determine point cloud information based on the initial image and its depth information.
[0060] In step S110, point cloud information corresponding to the initial image can be generated based on the initial image and the depth information of the initial image. The point cloud information is a set of discrete points in three-dimensional space. The point cloud information includes the distribution of each discrete point in space and the attributes and features of each discrete point, such as color and brightness.
[0061] S120. Generate an initial 3D reconstruction model based on point cloud information.
[0062] In step S120, preliminary 3D reconstruction can be performed based on the determined point cloud information to generate an initial 3D reconstruction model. Since the point cloud information includes the location distribution, attributes, and features of multiple discrete points in 3D space, the initial 3D reconstruction model can be obtained after 3D reconstruction based on the point cloud information, and this initial 3D reconstruction model can correspond to the initial image in 3D space. The method for generating the initial 3D reconstruction model based on the point cloud information can be selected according to the characteristics of the point cloud information and the required accuracy of the initial 3D reconstruction model. For example, Gaussian splatting, neural radiance field (NeRF), or Poisson reconstruction methods can be used to generate the initial 3D reconstruction model.
[0063] In this embodiment, point cloud information is determined based on the initial image and its depth information, and an initial 3D reconstruction model is generated based on the point cloud information. This achieves the determination of the initial 3D reconstruction model and provides a basis for the completion and update process of each observation viewpoint. Using point cloud information as the basis for generating the initial 3D reconstruction model can provide high-precision 3D coordinate data, making it easier to capture object details. It can accurately represent complex geometric shapes and subtle surface changes, and has the characteristics of high efficiency, non-contact operation, and rich attribute features.
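One common way to realize step S110 is to back-project each pixel into 3D space using its depth value. The sketch below assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`; the disclosure does not specify a camera model, so these parameters and the function name `backproject` are illustrative assumptions.

```python
def backproject(image, depth, fx, fy, cx, cy):
    """Turn an image and its per-pixel depth into a coloured point cloud.

    image and depth are row-major nested lists of equal size. Pixels whose
    depth is not positive are skipped because they carry no 3D position.
    """
    points = []
    for v, (color_row, depth_row) in enumerate(zip(image, depth)):
        for u, (color, z) in enumerate(zip(color_row, depth_row)):
            if z <= 0:
                continue
            # Pinhole model (assumed): pixel (u, v) at depth z maps to (x, y, z).
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append(((x, y, z), color))
    return points


image = [["red", "green"], ["blue", "white"]]  # toy 2 x 2 image
depth = [[1.0, 2.0], [0.0, 4.0]]               # toy depth; 0.0 means unknown
cloud = backproject(image, depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Each entry pairs a 3D position with the pixel's attribute (here only a colour label), mirroring the description of point cloud information as discrete points with positions and attributes. The resulting cloud would then be handed to the reconstruction method chosen in step S120.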
[0064] In some embodiments, a 3D reconstruction completion process is performed based on each observation viewpoint and the initial 3D reconstruction model to obtain the target 3D reconstruction model, including:
[0065] The initial 3D reconstruction model is completed and updated sequentially based on each observation viewpoint until the completion and update process is completed for each observation viewpoint, thus obtaining the target 3D reconstruction model.
[0066] The initial 3D reconstruction model is sequentially updated and completed based on each observation viewpoint to continuously train it. This ensures that the initial 3D reconstruction model, after being updated and completed from each observation viewpoint, can fill in the missing 3D information under each occluded viewpoint, guaranteeing the accuracy and realism of the target 3D reconstruction model and achieving the image rendering effect that meets the user's needs.
[0067] Understandably, the initial 3D reconstruction model is sequentially completed and updated based on each observation viewpoint. After it has been completed and updated based on the first observation viewpoint, the model has changed. When completing and updating based on the second observation viewpoint, the object of the completion and update is therefore the 3D reconstruction model obtained after completing and updating the initial 3D reconstruction model based on the first observation viewpoint. When there are N observation viewpoints, the object of the completion and update based on the Mth observation viewpoint is the 3D reconstruction model obtained after completing and updating the initial 3D reconstruction model based on the 1st to (M-1)th observation viewpoints, where M is a positive integer greater than 1 and less than or equal to N. This ensures that after completing and updating based on each observation viewpoint, the resulting 3D reconstruction model simultaneously provides completed 3D information for the boundary views of that observation viewpoint and of all observation viewpoints that precede it in the completion and update order. After all observation viewpoints have completed the completion and update process, the resulting target 3D reconstruction model simultaneously provides completed 3D information for the boundary views corresponding to all observation viewpoints.
[0068] In this embodiment, the initial 3D reconstruction model is sequentially completed and updated based on each observation viewpoint until the completion and update process is completed for all observation viewpoints, resulting in the target 3D reconstruction model. This achieves the generation of the target 3D reconstruction model and provides a basis for multi-degree-of-freedom image rendering. Sequentially completing and updating the initial 3D reconstruction model based on each observation viewpoint allows the target 3D reconstruction model to achieve 3D information completion effects under occluded perspectives corresponding to all observation viewpoints, expanding the rendering boundary range of the target 3D reconstruction model, improving its realism, and thus ensuring the quality of subsequent image rendering.
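The sequential behaviour described in this embodiment amounts to threading one model through the list of observation viewpoints, so that each step starts from the result of the previous one. The sketch below shows only that control flow; `complete_and_update` is a hypothetical callable standing in for the per-viewpoint completion, and the toy "model" is just a tuple recording which viewpoints have been processed.

```python
def complete_sequentially(initial_model, viewpoints, complete_and_update):
    """Apply the per-viewpoint completion in order and return the final model."""
    model = initial_model
    for viewpoint in viewpoints:
        # The M-th call receives the model already updated for viewpoints 1..M-1.
        model = complete_and_update(model, viewpoint)
    return model


# Toy stand-in: record the model each step receives, then mark the viewpoint done.
seen_inputs = []


def toy_update(model, viewpoint):
    seen_inputs.append(model)
    return model + (viewpoint,)


target_model = complete_sequentially((), ["v1", "v2", "v3"], toy_update)
```

After the run, `seen_inputs` shows that the second step received the model updated for the first viewpoint and the third step received the model updated for the first two, which is the ordering property the text relies on.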
[0069] In some embodiments, as shown in Figure 4, sequentially completing and updating the initial 3D reconstruction model based on each observation viewpoint until the completion and update process has been completed for every observation viewpoint, thereby obtaining the target 3D reconstruction model, includes:
[0070] S310. Obtain the current observation viewpoint to be updated. Based on the current observation viewpoint and the 3D reconstruction model before this completion and update process, render the observation image of the current observation viewpoint and the depth information corresponding to the observation image.
[0071] In step S310, the current observation viewpoint to be updated is obtained. The current observation viewpoint is the one that has not yet been completed and updated, and all previous observation viewpoints have completed their corresponding completion and update processes. This is the observation viewpoint that currently needs to undergo the completion and update process. After determining the current observation viewpoint, based on the current observation viewpoint and the 3D reconstruction model before this completion and update process, the observation image corresponding to the current observation viewpoint is rendered, and the depth information corresponding to the observation image is determined. This enables the rendering of the current 3D reconstruction model under the current observation viewpoint, ensuring that the depth information of the observation image represents the 3D information restored by the 3D reconstruction model before this completion and update process and the image depth of the 3D scene under the current observation viewpoint. Furthermore, it reflects the distance between image elements in the 3D scene and the current observation viewpoint from the current observation viewpoint's perspective.
[0072] The 3D reconstruction model before this completion and update process is the initial 3D reconstruction model as already completed and updated based on the observation viewpoints preceding the current observation viewpoint. This model has a 3D information completion effect for the occluded perspectives of those preceding observation viewpoints, so it can render images of high quality from the perspectives of those preceding observation viewpoints.
[0073] For example, there are N observation viewpoints, P1, P2, ..., PN. During the first completion update, the current observation viewpoint is P1. The 3D reconstruction model before this completion update is the initial 3D reconstruction model. When performing the completion update based on observation viewpoint P1, the observation image corresponding to observation viewpoint P1 and its corresponding depth information are rendered based on the initial 3D reconstruction model and observation viewpoint P1. During the third completion update, the current observation viewpoint is P3. The 3D reconstruction model before this completion update is the 3D reconstruction model obtained after sequentially performing the completion update processes of P1 and P2 on the initial 3D reconstruction model. This model provides 3D information completion for the occlusion perspectives of observation viewpoints P1 and P2. When performing the completion update based on observation viewpoint P3, the observation image corresponding to observation viewpoint P3 and its corresponding depth information are rendered based on this 3D reconstruction model and observation viewpoint P3.
[0074] S320. Project the depth information corresponding to the observed image to obtain the projected depth information.
[0075] In step S320, as described above, the observed image and its depth information correspond to the current observation viewpoint. The depth information represents the 3D information restored by the 3D reconstruction model before this completion and update process, together with the image depth of the 3D scene at the current observation viewpoint, and it reflects the distance between each image element in the 3D scene and the current observation viewpoint. The 3D reconstruction model before this completion and update process can render images of high quality from the perspectives of the observation viewpoints preceding the current observation viewpoint.
[0076] However, due to the different positions and occlusion relationships of image elements at different viewpoints, the image depth varies from viewpoint to viewpoint. The 3D reconstruction model before the current completion and update process still lacks 3D information for the current observation viewpoint, making it difficult to guarantee the image rendering quality at the current viewpoint's perspective. To facilitate comparison of the image depth differences between the current observation viewpoint and other observation viewpoints, and to determine the missing 3D information for the current observation viewpoint in the 3D reconstruction model before the update process, the depth information of the observed image corresponding to the current observation viewpoint can be projected. This transforms the depth information of the observed image to other viewpoints or coordinate systems, obtaining projected depth information. This projected depth information has the same reference point for comparison as the projection target, i.e., each observation viewpoint that has already completed the completion and update process. This allows for the determination of the missing 3D information for the occluded perspective of the current observation viewpoint in the current 3D reconstruction model.
[0077] S330. Determine whether there is occlusion based on the depth information after projection. If there is occlusion, generate an occlusion mask.
[0078] In step S330, based on the projected depth information, it can be determined whether the image elements in the observation image corresponding to the current observation viewpoint are missing image information due to occlusion. If occlusion exists, it means that the 3D reconstruction model before this completion and update process is inconsistent with the 3D information of the current observation viewpoint and other observation viewpoints that have completed the completion and update process.
[0079] Since the 3D reconstruction model prior to this update process has completed 3D information for other viewpoints that have already undergone the completion and update process, the inconsistent 3D information between the current viewpoint and other viewpoints that have completed the completion and update process represents the missing 3D information in the 3D reconstruction model corresponding to the current viewpoint. Therefore, in the presence of occlusion, a corresponding occlusion mask can be generated. The occlusion mask represents the region in the observed image that lacks image information due to the missing 3D information in the 3D reconstruction model prior to this update process corresponding to the current viewpoint. This provides a target and basis for completing the image information in the observed image and for completing the 3D information in the 3D reconstruction model prior to this update process.
[0080] S340. The observed image is completed based on the occlusion mask to obtain the completed image.
[0081] In step S340, the completion process analyzes the edge regions of the image to infer the image content outside the edge regions, thereby repairing or completing damaged or missing parts of the image. By performing completion processing on the observed image based on the occlusion mask, the missing image content, i.e., two-dimensional information, can be completed. This results in a completed image that has relatively complete image information for the current viewpoint, representing the desired image rendering effect of the 3D reconstruction model before the completion update process and providing guidance for 3D information completion.
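The text does not name a specific completion algorithm, and a practical system would likely use a dedicated image inpainting method. Purely to illustrate the idea of inferring missing content from the edge of the masked region, the following toy sketch fills masked pixels of a grayscale image from their already-known neighbours, working from the edge inward. The function name `fill_masked` and the neighbour-averaging rule are assumptions made for this sketch.

```python
def fill_masked(image, mask, max_passes=100):
    """Fill masked pixels with the mean of known 4-neighbours, edge inward.

    image: 2D list of floats (grayscale); mask: 2D list of bools, True = missing.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    unknown = [row[:] for row in mask]
    for _ in range(max_passes):
        updates = {}
        for y in range(height):
            for x in range(width):
                if not unknown[y][x]:
                    continue
                neighbours = [
                    result[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < height and 0 <= nx < width and not unknown[ny][nx]
                ]
                if neighbours:
                    updates[(y, x)] = sum(neighbours) / len(neighbours)
        if not updates:
            break  # nothing left that borders a known pixel
        # Apply the whole pass at once so each pass only uses previously known pixels.
        for (y, x), value in updates.items():
            result[y][x] = value
            unknown[y][x] = False
    return result


observed = [[1.0, 0.0, 3.0]]
occlusion = [[False, True, False]]
completed = fill_masked(observed, occlusion)
```

Here the single masked pixel is replaced by the average of its two known neighbours, while unmasked pixels are left unchanged.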
[0082] S350. Based on the completed image, the 3D reconstruction model before this completion and update process is completed and updated to obtain the 3D reconstruction model after this completion and update process.
[0083] In step S350, the completed image can be used as the update target of the completion update process. The 3D reconstruction model before the current completion update process is completed and updated. The 3D information and 3D scene corresponding to the 3D reconstruction model before the current completion update process are changed by the completion update method. In this way, the missing 3D information corresponding to the current observation viewpoint under the occluded view in the 3D reconstruction model before the current completion update process is completed, so that the image rendering effect of the 3D reconstruction model before the current completion update process can gradually approach the completed image, and the 3D reconstruction model after the current completion update process is obtained.
[0084] Understandably, this completion and update process completes the missing 3D information from the current viewpoint under occluded perspectives. The 3D reconstruction model prior to this process already possesses 3D information completion capabilities for all viewpoints preceding the current viewpoint. This ensures that the 3D reconstruction model after each completion and update process provides 3D information completion for both the current viewpoint and all viewpoints preceding it, guaranteeing image rendering quality for both the current viewpoint and all viewpoints preceding it. By sequentially using each viewpoint as the current viewpoint and performing the above completion and update process, the final 3D reconstruction model is the target 3D reconstruction model, simultaneously providing 3D information completion for all occluded viewpoints, thus ensuring image rendering quality across all viewpoints within the preset area.
[0085] In this embodiment, by sequentially using each observation viewpoint as the current observation viewpoint and performing the aforementioned completion and update process on the current observation viewpoint, it is possible to achieve 3D reconstruction completion processing for each observation viewpoint and generate the target 3D reconstruction model. Furthermore, the generated target 3D reconstruction model can simultaneously achieve 3D information completion for each occluded viewpoint, expanding the rendering boundary range of the target 3D reconstruction model. While ensuring image rendering effects, it simplifies the generation process of the target 3D reconstruction model and improves its realism. By projecting the depth information corresponding to the observation image and generating an occlusion mask based on the projected depth information, it is possible to accurately identify missing image information in the observation image and missing 3D information under the occluded viewpoint of the current observation viewpoint. This allows the completion processing of the panoramic image to provide guidance for the 3D information completion of the 3D reconstruction model. This ensures that the 3D reconstruction model after each completion and update process has 3D information completion effects for the current observation viewpoint and all observation viewpoints preceding it, guaranteeing the integrity and consistency of 3D information and further improving the effect and accuracy of 3D information completion.
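The data flow of one completion and update pass (steps S310 to S350) can be sketched as follows. Every stage is supplied as a callable, because the text does not fix how rendering, projection, mask generation, completion, or model fitting are implemented; all parameter names are hypothetical. Returning the model unchanged when no occlusion is found is also an assumption, since the text only describes the case in which occlusion exists.

```python
from typing import Any, Callable, Sequence


def completion_update_step(
    model: Any,
    viewpoint: Any,
    known_viewpoints: Sequence[Any],
    render: Callable,     # S310: (model, viewpoint) -> (image, depth)
    project: Callable,    # S320: (depth, viewpoint, known_viewpoint) -> projected depth
    make_mask: Callable,  # S330: (projected depths, known_viewpoints) -> mask or None
    inpaint: Callable,    # S340: (image, mask) -> completed image
    fit: Callable,        # S350: (model, viewpoint, completed image) -> updated model
) -> Any:
    image, depth = render(model, viewpoint)
    projected = [project(depth, viewpoint, known) for known in known_viewpoints]
    mask = make_mask(projected, known_viewpoints)
    if mask is None:
        # Assumption: with no occlusion there is nothing to complete.
        return model
    completed = inpaint(image, mask)
    return fit(model, viewpoint, completed)


# Toy stubs that only trace the data flow; they perform no real rendering.
updated = completion_update_step(
    model={"covered": ["P1"]},
    viewpoint="P2",
    known_viewpoints=["initial", "P1"],
    render=lambda m, v: ("image@" + v, "depth@" + v),
    project=lambda d, v, k: d + "->" + k,
    make_mask=lambda projected, known: "mask" if projected else None,
    inpaint=lambda image, mask: image + "+filled",
    fit=lambda m, v, completed: {"covered": m["covered"] + [v], "target": completed},
)
```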
[0086] In some embodiments, projecting the depth information corresponding to the observed image to obtain the projected depth information includes: projecting the depth information corresponding to the observed image onto the coordinate system of each known viewpoint to obtain the depth information corresponding to each coordinate system. The known viewpoints include the initial viewpoint and the perspective corresponding to each observation viewpoint that has completed the completion and update process.
[0087] When projecting the depth information of the observed image, the depth information corresponding to the observed image can be projected onto the coordinate system of each known viewpoint to obtain the depth information of the observed image in each coordinate system. This allows the depth information of the observed image to have the same reference point for comparison as each known viewpoint by transforming it to the corresponding depth information in each coordinate system. This facilitates the identification of the difference between the image depth corresponding to the current observation viewpoint and the known viewpoints, thereby providing a basis for determining the missing three-dimensional information corresponding to the current observation viewpoint.
[0088] The known viewpoints include the initial viewpoint corresponding to the initial image and the perspective corresponding to each observation viewpoint that has completed the completion and update process. For example, if the current observation viewpoint is P3, the known viewpoints include the initial viewpoint, the perspective corresponding to observation viewpoint P1, and the perspective corresponding to observation viewpoint P2. The depth information D3 of the observed image corresponding to observation viewpoint P3 is projected onto the coordinate system Z1 of the initial viewpoint, the coordinate system Z2 of the perspective corresponding to observation viewpoint P1, and the coordinate system Z3 of the perspective corresponding to observation viewpoint P2, yielding the depth information D3Z1, D3Z2, and D3Z3 of that observed image in coordinate systems Z1, Z2, and Z3, respectively.
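As an illustration of projecting a depth value from the current observation viewpoint into the coordinate system of a known viewpoint, the sketch below back-projects one pixel to a 3D point, applies a rigid transform, and projects it again. It assumes a pinhole camera with intrinsics shared by both views and a known rotation and translation from the current camera frame to the known camera frame. None of these are specified in the text, and a panoramic image would need a different projection model. All function names and numeric values are hypothetical.

```python
def unproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth along the optical axis to a 3D point."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def transform(point, rotation, translation):
    """Apply a rigid transform; rotation is a 3x3 nested list, translation a 3-vector."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project(point, fx, fy, cx, cy):
    """Project a 3D point to (u, v, depth); return None if it is behind the camera."""
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy, z)


def reproject_pixel(u, v, depth, intrinsics, rotation, translation):
    """Express one depth sample of the current view in a known view's coordinate system."""
    point_current = unproject(u, v, depth, *intrinsics)
    point_known = transform(point_current, rotation, translation)
    return project(point_known, *intrinsics)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intrinsics = (100.0, 100.0, 50.0, 50.0)  # fx, fy, cx, cy (hypothetical values)
# The known camera sits one unit behind the current camera along the optical axis.
result = reproject_pixel(50.0, 50.0, 2.0, intrinsics, identity, [0.0, 0.0, 1.0])
```

In this example the central pixel stays at the image centre of the known view, and its depth grows from 2 to 3 because the known camera is one unit farther from the scene. Repeating this for every pixel and every known viewpoint would give the per-coordinate-system depth information that the text denotes D3Z1, D3Z2, and D3Z3.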
[0089] Determining whether there is occlusion based on the projected depth information and, if there is occlusion, generating an occlusion mask includes: comparing, in the coordinate system of each known viewpoint, the depth information corresponding to the current observation viewpoint with the depth information corresponding to that known viewpoint, and generating an occlusion mask based on the comparison result.
[0090] It will be understood that each observation viewpoint that has completed the completion and update process has itself undergone depth projection and is therefore a known observation viewpoint. For each known viewpoint, the depth information of the observed image of its observation viewpoint, expressed in that known viewpoint's own coordinate system, serves as the depth information corresponding to that known viewpoint.
[0091] When determining whether occlusion exists based on the projected depth information, and generating an occlusion mask if it does, the following comparison can be made for the coordinate system of each known viewpoint: the depth information of the observed image of the current observation viewpoint, expressed in that coordinate system, is compared with the depth information corresponding to that known viewpoint. This determines, in each coordinate system, the difference between the depth information of the observed image of the current observation viewpoint and that of the known viewpoint. An occlusion mask is then generated based on the comparison result: the difference in depth information is used to infer the difference in 3D information between the current observation viewpoint and the known viewpoint, thereby identifying the image region in the observed image of the current observation viewpoint where image information is missing.
[0092] For example, if the current observation viewpoint is P3, in the coordinate system Z1 of the initial viewpoint, the depth information D3Z1 corresponding to observation viewpoint P3 is compared with the depth information D0 corresponding to the initial viewpoint. In the coordinate system Z2 of the boundary viewpoint of observation viewpoint P1, the depth information D3Z2 corresponding to observation viewpoint P3 is compared with the depth information D1 corresponding to the boundary viewpoint of observation viewpoint P1. In the coordinate system Z3 of the boundary viewpoint of observation viewpoint P2, the depth information D3Z3 corresponding to observation viewpoint P3 is compared with the depth information D2 corresponding to the boundary viewpoint of observation viewpoint P2. Three comparison results can be obtained, and an occlusion mask can be generated based on the three comparison results.
[0093] In this embodiment, the depth information corresponding to the observed image is projected onto the coordinate system of each known viewpoint, obtaining the depth information corresponding to each coordinate system. This transforms the depth information of the observed image into the same coordinate system as each known viewpoint, so that depth information corresponding to different observation viewpoints can be compared within a single coordinate system. When determining whether occlusion exists and generating an occlusion mask if it does, the depth information corresponding to the current observation viewpoint is compared with the depth information corresponding to the known viewpoint in the coordinate system of each known viewpoint, and the occlusion mask is generated based on the comparison result. This realizes the generation of the occlusion mask and provides a basis for completion processing and for generating the completed image. The difference in depth information is used to infer the difference in 3D information between the current observation viewpoint and the known viewpoints, thereby determining the image region in the observed image of the current observation viewpoint where image information is missing. This improves the accuracy of determining the occlusion mask and of identifying the 3D information missing from the 3D reconstruction model before the current completion and update process.
[0094] In some embodiments, in the coordinate system of each known viewpoint, the depth information corresponding to the current observation viewpoint and the depth information corresponding to the known viewpoint include the depth value corresponding to each pixel in the coordinate system of the known viewpoint.
[0095] In the coordinate system of each known viewpoint, the depth information corresponding to the current observation viewpoint includes the depth value, in that coordinate system, of each pixel in the observed image corresponding to the current observation viewpoint. The depth information corresponding to that known viewpoint includes the depth value, in that coordinate system, of each pixel in the observed image of the known viewpoint. Comparing the two sets of depth information therefore amounts to comparing, pixel by pixel, the depth values of the observed images of the current observation viewpoint and the known viewpoint in the coordinate system of that known viewpoint. This ensures that the pixels being compared are in the same coordinate system, that is, they share the same comparison reference. The difference in 3D information between the current observation viewpoint and the known viewpoint can then be determined from the comparison of the per-pixel depth values.
[0096] As shown in Figure 5, comparing, in the coordinate system of each known viewpoint, the depth information corresponding to the current observation viewpoint with the depth information corresponding to that known viewpoint, and generating an occlusion mask based on the comparison result, includes:
[0097] S331. For each known viewpoint, calculate the difference between the depth values of each pixel corresponding to the current observation viewpoint and the known viewpoint in the coordinate system of that known viewpoint.
[0098] In step S331, when comparing the depth information corresponding to the current observation viewpoint with the depth information corresponding to the known viewpoint in the coordinate system of each known viewpoint, for each known viewpoint, the difference between the depth values corresponding to each pixel of the current observation viewpoint and the known viewpoint in the coordinate system of that known viewpoint can be calculated to determine the image depth difference between the observed images of the current observation viewpoint and the observation viewpoint corresponding to that known viewpoint. Since the depth value difference calculation is all in the coordinate system of the known viewpoint, each pixel of the observed image of the current observation viewpoint can correspond one-to-one with each pixel of the known viewpoint, and the obtained difference is the difference in depth values of pixels at the same pixel coordinate, i.e., the same position.
[0099] S332. The set of pixels whose difference is greater than a preset difference is determined as the occlusion mask.
[0100] In step S332, if the depth values at the same pixel position in the observed images of the current observation viewpoint and the known viewpoint, both expressed in the coordinate system of the known viewpoint, differ by too much, the observed image of the current observation viewpoint lacks image information at that pixel compared with the observed image of the known viewpoint. In the 3D scene, this means that 3D information is missing at that pixel for the current observation viewpoint. The set of pixels whose difference is greater than a preset difference can therefore be determined as the occlusion mask. The per-pixel differences in depth values are used to infer the difference in 3D information between the current observation viewpoint and the known viewpoint, thereby identifying the image region in the observed image of the current observation viewpoint where image information is missing. The preset difference can be set according to the accuracy requirements of 3D reconstruction and subsequent image rendering.
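Steps S331 and S332 can be illustrated with a minimal sketch that thresholds the per-pixel depth difference. It assumes that both depth maps are dense, already aligned pixel for pixel in the known viewpoint's coordinate system, and that the "difference" is the absolute difference. The text only says the difference must exceed a preset value, so the use of the absolute value is an assumption. The function name and sample values are hypothetical.

```python
def occlusion_mask(projected_depth, known_depth, preset_difference):
    """Return a 2D list of bools: True where the depth values differ too much."""
    return [
        [abs(p - k) > preset_difference for p, k in zip(p_row, k_row)]
        for p_row, k_row in zip(projected_depth, known_depth)
    ]


# Depth of the current viewpoint projected into a known coordinate system,
# and the depth of that known viewpoint (toy values).
d3_in_z1 = [[2.0, 2.0], [2.0, 5.0]]
d0 = [[2.0, 2.1], [2.0, 2.0]]

mask = occlusion_mask(d3_in_z1, d0, preset_difference=0.5)
has_occlusion = any(any(row) for row in mask)
```

Only the bottom-right pixel differs by more than the preset difference, so it is the only pixel placed in the occlusion mask. A smaller preset difference flags more pixels, which matches the statement that the value is chosen according to accuracy requirements.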
[0101] In this embodiment, for each known viewpoint, the difference between the depth values of each pixel corresponding to the current observation viewpoint and the known viewpoint in the coordinate system of that known viewpoint is calculated, and the set of pixels with a difference greater than a preset difference is determined as the occlusion mask. This achieves the generation of the occlusion mask, providing a basis for the completion processing of the observed image and the generation of the completed image. Using the per-pixel depth value differences in the coordinate system of the known viewpoint as the basis for determining the occlusion mask makes it possible to accurately identify the differences in 3D information between the current observation viewpoint and the known viewpoints, improving the accuracy of determining the occlusion mask and of identifying the 3D information missing from the 3D reconstruction model before this completion and update process.
[0102] In some embodiments, as shown in Figure 6, completing the observed image based on the occlusion mask to obtain the completed image includes:
[0103] S341. Determine the fusion mask of the observed image based on the occlusion mask in the coordinate system of each known viewpoint.
[0104] In step S341, as mentioned above, the observed image has an occlusion mask corresponding to each known viewpoint in the coordinate system of each known viewpoint, such that different occlusion masks correspond to different image regions in the observed image. A fusion mask for the observed image can be determined based on the occlusion masks determined in the coordinate system of each known viewpoint, so that the fusion mask can be used as the region that ultimately needs to be completed at the current observation viewpoint, i.e., the region with missing image information. For example, image region fusion can be performed on the occlusion masks in the coordinate system of each known viewpoint to fuse multiple occlusion masks into one fusion mask. For instance, image regions where the overlap rate of multiple occlusion masks is greater than a preset overlap rate can be used as the fusion mask.
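One way to read the fusion described above is as per-pixel voting across the occlusion masks. The sketch below marks a pixel in the fusion mask when the fraction of masks flagging it exceeds a threshold. Interpreting the "overlap rate" as this per-pixel fraction is an assumption, as is the requirement that all masks are expressed on the pixel grid of the observed image. The name `fuse_masks` and its `min_fraction` parameter are hypothetical.

```python
def fuse_masks(masks, min_fraction=0.0):
    """Fuse several boolean masks into one.

    A pixel is kept when the fraction of masks flagging it is strictly greater
    than min_fraction. With min_fraction=0.0 this is the union of all masks.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    count = len(masks)
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [
            sum(mask[y][x] for mask in masks) / count > min_fraction
            for x in range(width)
        ]
        for y in range(height)
    ]


mask_z1 = [[True, False, True]]
mask_z2 = [[True, False, False]]
mask_z3 = [[True, True, False]]

union_mask = fuse_masks([mask_z1, mask_z2, mask_z3])
majority_mask = fuse_masks([mask_z1, mask_z2, mask_z3], min_fraction=0.5)
```

The union keeps every pixel flagged by any mask, which completes more of the image, while the stricter threshold keeps only pixels on which most known viewpoints agree.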
[0105] S342. Perform completion processing on the fusion mask of the observed image to obtain the completed image.
[0106] In step S342, the fusion mask of the observed image is completed, which can complete the missing image content, i.e., two-dimensional information, in the observed image, so that the completed image has relatively complete image information, representing the image rendering effect that the three-dimensional reconstruction model before this completion and update process wants to achieve after this completion and update process, ensuring that the completed image has the guiding role of three-dimensional information completion.
[0107] In this embodiment, the fusion mask of the observed image is determined based on the occlusion mask in the coordinate system of each known viewpoint, and the fusion mask of the observed image is completed to obtain the completed image. This realizes the generation of the completed image, which provides guidance for the completion of the three-dimensional information of the three-dimensional reconstruction model. This ensures that the three-dimensional reconstruction model after each completion update process has the effect of three-dimensional information completion for the current observation viewpoint and all observation viewpoints in the order before the current observation viewpoint, so as to ensure the integrity and consistency of the three-dimensional information and further improve the effect and accuracy of the three-dimensional information completion.
[0108] In some embodiments, the 3D reconstruction method further includes: obtaining a reconstructed image based on the target 3D reconstruction model and the target observation viewpoint, wherein the reconstructed image corresponds to a different viewpoint than the initial image.
[0109] A target viewpoint can be selected from multiple defined viewpoints, each with a corresponding viewing angle. After obtaining the target 3D reconstruction model, the reconstructed 3D information and scene, along with the viewing angle corresponding to the target viewpoint, can be used to generate a reconstructed image from the target viewpoint's perspective through the target 3D reconstruction model's image rendering function. This allows the reconstructed image to have a different perspective from the initial image, thus enabling the switching of image perspectives using the target 3D reconstruction model.
[0110] For example, based on the initial image shown in Figure 7, a target 3D reconstruction model is generated through the above-described 3D reconstruction method. Based on the target 3D reconstruction model and the target observation viewpoint, the reconstructed image shown in Figure 8 is then generated, so that the reconstructed image corresponds to a different viewpoint from the initial image.
[0111] In this embodiment, a reconstructed image is generated based on the target 3D reconstruction model and the target observation viewpoint, enabling the reconstructed image to have a different perspective from the initial image. The target 3D reconstruction model is used to achieve image perspective switching. By using the target 3D reconstruction model and the target observation viewpoint as the basis for generating the reconstructed image, the 3D information and scene reconstructed by the target 3D reconstruction model can be applied to the image generation process. Furthermore, the information completion effect of the target 3D reconstruction model ensures the accuracy of image information and the consistency of 3D information at the boundary perspectives of different target observation viewpoints, thereby guaranteeing the image quality of the reconstructed image.
[0112] It is understandable that the target observation viewpoint can not only be selected from multiple observation viewpoints, but also any pixel in the preset area of the initial image can be used as the target observation viewpoint, so as to realize the image rendering of the viewpoint corresponding to the pixel through the target 3D reconstruction model and ensure the image rendering quality of the reconstructed image.
[0113] In some embodiments, the three-dimensional reconstruction method further includes: in response to the target observation viewpoint being located outside the preset area, expanding the preset area until the target observation viewpoint is located within the preset area, and redetermining multiple observation viewpoints.
[0114] It will be understood that when the target observation viewpoint is outside the preset area, the viewpoint the user wants rendered lies outside the rendering boundary range within which the target 3D reconstruction model can guarantee good image rendering quality. Compared with the existing observation viewpoints, the target observation viewpoint is farther from the initial viewpoint of the initial image, so more 3D information is missing, and this additional missing information cannot be completed by the information completion effect that the target 3D reconstruction model currently has.
[0115] Therefore, the preset area can be expanded so that it includes the selected target observation viewpoint, and multiple observation viewpoints can be redetermined. The above-described 3D reconstruction completion processing can then be performed sequentially based on the redetermined observation viewpoints to ensure the integrity of the 3D information corresponding to all pixels in the expanded preset area. This ensures that the effective rendering boundary range of the target 3D reconstruction model includes the target observation viewpoint, thereby ensuring image quality at the perspective corresponding to the target observation viewpoint.
[0116] In this embodiment, when the target observation viewpoint is outside the preset area, the preset area is expanded until the target observation viewpoint is within the preset area, and multiple observation viewpoints are redefined. The above-mentioned three-dimensional reconstruction and completion process can be performed sequentially based on the redefined multiple observation viewpoints to update the target three-dimensional reconstruction model. This ensures that the effective boundary range of the updated target three-dimensional reconstruction model for image rendering can include the target observation viewpoint, thereby guaranteeing the image rendering quality of the reconstructed image corresponding to the target observation viewpoint.
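The expansion logic can be sketched under strong simplifying assumptions that are not taken from the text: the preset area is modelled as a circle centred on the initial viewpoint, the target observation viewpoint is given as a 2D offset from that centre, the area grows geometrically, and the redetermined observation viewpoints are spaced evenly on the new boundary. The function names and the growth factor are hypothetical.

```python
import math


def expand_preset_area(radius, target_offset, growth=1.5):
    """Grow the radius until the target offset lies inside the circular area."""
    if radius <= 0 or growth <= 1:
        raise ValueError("radius must be positive and growth greater than 1")
    distance = math.hypot(*target_offset)
    while distance > radius:
        radius *= growth
    return radius


def sample_viewpoints(radius, count):
    """Place `count` observation viewpoints evenly on the boundary of the area."""
    return [
        (
            radius * math.cos(2 * math.pi * k / count),
            radius * math.sin(2 * math.pi * k / count),
        )
        for k in range(count)
    ]


new_radius = expand_preset_area(1.0, (2.0, 0.0))
new_viewpoints = sample_viewpoints(new_radius, 4)
```

After the expansion, the sequential completion and update process would be run again over `new_viewpoints` so that the updated model covers the target observation viewpoint.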
[0117] In some embodiments, the 3D reconstruction method further includes: determining the target observation viewpoint based on the user's perspective selection operation.
[0118] Users can perform viewpoint selection operations on the computer device. These operations can include clicking, dragging, or selecting viewpoint controls in the user interface, as well as entering viewpoint vectors in the user interface. Once the computer device recognizes the user's viewpoint selection operation, it determines the corresponding target observation viewpoint, providing a basis for image reconstruction and allowing users to select the viewpoint of the reconstructed image as needed.
[0119] In this embodiment, when a user's viewpoint selection operation is detected, the target observation viewpoint is determined based on the viewpoint selection operation, thus providing a basis for the generation of the reconstructed image. Users can select the target observation viewpoint through the viewpoint selection operation, realizing interaction between the user and the computer device. This allows users to select the viewpoint of the reconstructed image according to their needs, achieving viewpoint switching relative to the initial image and improving the user experience.
[0120] In some embodiments, the target 3D reconstruction model includes a Gaussian splatting model or a neural radiance field model.
[0121] The target 3D reconstruction model, obtained by sequentially performing 3D reconstruction completion processing based on each observation viewpoint, may include a Gaussian splatting model. Configuring the initial 3D reconstruction model as a Gaussian splatting model makes the target 3D reconstruction model a Gaussian splatting model as well. Gaussian splatting (3D Gaussian Splatting) represents a scene in 3D space with Gaussian functions and reconstructs 3D scene information by "splatting" those Gaussian functions in space. Each Gaussian function is defined by parameters such as its position, covariance matrix, opacity, and color, which together determine its shape and visual appearance in space. Through rasterization, the Gaussian splatting model can be used to render images or videos efficiently.
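The parameters listed above can be made concrete with a minimal sketch of a single Gaussian primitive. The class name `Gaussian3D`, the helper `_inverse_3x3`, and the unnormalized weight formula `opacity * exp(-0.5 * d^T Σ^{-1} d)` are illustrative assumptions; the disclosure does not specify a data layout or an evaluation formula.

```python
import math
from dataclasses import dataclass


def _inverse_3x3(m):
    """Invert a 3x3 matrix given as nested tuples, using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return (
        ((e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det),
        ((f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det),
        ((d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det),
    )


@dataclass
class Gaussian3D:
    """One primitive: position, covariance matrix, opacity, and color."""
    position: tuple    # (x, y, z) center of the Gaussian
    covariance: tuple  # 3x3 symmetric positive-definite matrix, row-major
    opacity: float     # in [0, 1]
    color: tuple       # (r, g, b)

    def weight_at(self, point):
        """Opacity-scaled, unnormalized Gaussian falloff at a 3D point."""
        inv = _inverse_3x3(self.covariance)
        d = [p - q for p, q in zip(point, self.position)]
        mahalanobis_sq = sum(d[r] * inv[r][c] * d[c] for r in range(3) for c in range(3))
        return self.opacity * math.exp(-0.5 * mahalanobis_sq)


if __name__ == "__main__":
    g = Gaussian3D(
        position=(0.0, 0.0, 0.0),
        covariance=((1.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 9.0)),
        opacity=0.8,
        color=(1.0, 0.5, 0.0),
    )
    print(g.weight_at((0.0, 0.0, 0.0)), g.weight_at((0.0, 2.0, 0.0)))
```

The covariance controls how far the primitive extends along each axis: in the usage above, the Gaussian is wider along y than along x, so a point two units away along y receives the same weight as a point one unit away along x.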
[0122] The target 3D reconstruction model can also include a Neural Radiance Fields (NeRF) model. This can be achieved by configuring the initial 3D reconstruction model as a NeRF model. A NeRF model is a deep learning-based 3D reconstruction model that extracts the geometry and texture information of objects from images and uses this information to generate a continuous 3D radiance field. The 3D radiance field can be represented as a function that takes a spatial vector (spatial position and viewing direction) as input and outputs the color and density values of the corresponding point.
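The function signature described above (position and viewing direction in, color and density out) can be sketched with a toy field and the standard volume rendering quadrature commonly used with NeRF models. The field `toy_radiance_field` is a hypothetical stand-in for a trained network, and the renderer `render_ray` is an assumption for illustration; the disclosure does not describe how the field is queried or integrated.

```python
import math


def toy_radiance_field(position, direction):
    """Hypothetical stand-in for a trained NeRF: a dense unit sphere at the origin.

    Takes a spatial position and a viewing direction, returns (color, density).
    The direction is accepted but unused, so this toy field is view-independent.
    """
    inside = math.sqrt(sum(p * p for p in position)) <= 1.0
    density = 5.0 if inside else 0.0
    color = (1.0, 0.2, 0.2)
    return color, density


def render_ray(field, origin, direction, near, far, num_samples=64):
    """Accumulate color along a ray using midpoint samples.

    Each sample contributes transmittance * alpha * color, where
    alpha = 1 - exp(-density * step_length).
    """
    step = (far - near) / num_samples
    transmittance = 1.0
    pixel = [0.0, 0.0, 0.0]
    for i in range(num_samples):
        t = near + (i + 0.5) * step
        position = tuple(o + t * d for o, d in zip(origin, direction))
        color, density = field(position, direction)
        alpha = 1.0 - math.exp(-density * step)
        for channel in range(3):
            pixel[channel] += transmittance * alpha * color[channel]
        transmittance *= 1.0 - alpha
    return tuple(pixel), transmittance


if __name__ == "__main__":
    hit = render_ray(toy_radiance_field, (0.0, 0.0, -3.0), (0.0, 0.0, 1.0), 0.0, 6.0)
    miss = render_ray(toy_radiance_field, (5.0, 0.0, -3.0), (0.0, 0.0, 1.0), 0.0, 6.0)
    print(hit, miss)
```

A ray that passes through the sphere returns a color close to the sphere's color with almost no remaining transmittance, while a ray that misses it returns black with full transmittance.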
[0123] In this embodiment, a Gaussian splatting model or a neural radiance field model is used as the target 3D reconstruction model. Either model can efficiently render images or videos from other viewpoints and offers short reconstruction time, strong rendering capability, high visual quality, and broad applicability, which further improves the restoration of 3D scenes.
[0124] In one exemplary embodiment, a three-dimensional reconstruction method is provided. As shown in Figure 9, the three-dimensional reconstruction method includes:
[0125] S1. Based on the initial image, determine the depth information of the initial image;
[0126] S2. Determine the point cloud information based on the initial image and its depth information;
[0127] S3. Generate an initial 3D reconstruction model based on point cloud information;
[0128] S4. Determine the observation viewpoint based on the initial 3D reconstruction model;
[0129] S5. Obtain the current observation viewpoint to be updated. Based on the current observation viewpoint and the 3D reconstruction model before this completion and update process, render the observation image of the current observation viewpoint and the depth information corresponding to the observation image.
[0130] S6. Project the depth information corresponding to the observed image onto the coordinate system of each known viewpoint to obtain the depth information corresponding to each coordinate system. The known viewpoints include the initial viewpoint and each observation viewpoint for which the completion and update process has already been completed.
[0131] S7. For each known viewpoint, calculate, for each pixel in the coordinate system of that known viewpoint, the difference between the depth value corresponding to the current observation viewpoint and the depth value corresponding to the known viewpoint; determine the set of pixels whose difference is greater than the preset difference as the occlusion mask;
[0132] S8. Determine the fusion mask of the observed image based on the occlusion mask in the coordinate system of each known viewpoint;
[0133] S9. Perform completion processing on the fusion mask of the observed image to obtain the completed image;
[0134] S10. Based on the completed image, the 3D reconstruction model before this completion and update process is completed and updated to obtain the 3D reconstruction model after this completion and update process.
[0135] S11. Determine whether the current observation viewpoint is the last observation viewpoint. If yes, obtain the target 3D reconstruction model and execute S12. Otherwise, return to S5 and use the next observation viewpoint as the current observation viewpoint.
[0136] S12. Determine whether the target observation viewpoint is located within the preset area or on the boundary of the preset area. If yes, execute S13; otherwise, execute S14.
[0137] S13. Based on the target 3D reconstruction model and the target observation viewpoint, a reconstructed image is obtained, and the reconstructed image corresponds to a different viewpoint than the initial image;
[0138] S14. Expand the preset area until the target observation viewpoint is within the preset area, redetermine multiple observation viewpoints, and then return to S5.
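Steps S7 and S8 above can be illustrated with a minimal sketch. It assumes that the projection of S6 has already been done, that depth maps are plain nested lists on a shared pixel grid, that the "difference" is an absolute difference, and that the fusion rule is selectable (`any` for a union of masks, `all` for an intersection). The function names and these choices are hypothetical; this section of the disclosure does not fix them.

```python
def occlusion_mask(projected_depth, known_depth, preset_difference):
    """S7 sketch: flag pixels whose depth difference exceeds the preset difference.

    projected_depth: depth of the current observation viewpoint, already
        projected into the known viewpoint's coordinate system (S6).
    known_depth: depth corresponding to the known viewpoint, same grid.
    Assumption: the absolute difference is compared against the threshold.
    """
    return [
        [abs(p - k) > preset_difference for p, k in zip(p_row, k_row)]
        for p_row, k_row in zip(projected_depth, known_depth)
    ]


def fuse_masks(masks, combine=any):
    """S8 sketch: combine per-viewpoint occlusion masks into one fusion mask.

    Assumption: all masks have been mapped to the observed image's pixel grid.
    combine=any marks a pixel if any mask marks it; combine=all requires every mask.
    """
    if not masks:
        raise ValueError("at least one occlusion mask is required")
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [combine(mask[v][u] for mask in masks) for u in range(width)]
        for v in range(height)
    ]


if __name__ == "__main__":
    projected = [[1.0, 2.0], [3.0, 4.0]]
    known = [[1.0, 2.5], [3.0, 1.0]]
    mask_a = occlusion_mask(projected, known, preset_difference=0.2)
    mask_b = [[True, False], [False, True]]
    print(fuse_masks([mask_a, mask_b]))
    print(fuse_masks([mask_a, mask_b], combine=all))
```

The resulting fusion mask marks the pixels of the observed image that the completion processing of S9 would fill in.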
[0139] In this embodiment, an initial 3D reconstruction model is determined based on the initial image, and the observation viewpoints are determined based on the initial 3D reconstruction model. 3D reconstruction completion processing is then performed based on each observation viewpoint and the initial 3D reconstruction model to obtain the target 3D reconstruction model. As a result, the target 3D reconstruction model contains completed 3D information for the occluded viewpoints of every observation viewpoint, which provides a basis for multi-degree-of-freedom image rendering and expands the rendering boundary range of the target 3D reconstruction model. This simplifies the generation process of the target 3D reconstruction model while ensuring image rendering quality, and improves the realism of the target 3D reconstruction model.
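Step S2 of the method above, determining point cloud information from the initial image and its depth information, can be sketched as a pinhole back-projection. The pinhole camera model, the intrinsics `fx`, `fy`, `cx`, `cy`, and the function name `depth_to_point_cloud` are assumptions made for illustration; the disclosure does not state which camera model is used.

```python
def depth_to_point_cloud(image, depth, fx, fy, cx, cy):
    """Back-project each pixel with valid depth into a colored 3D point.

    image: rows of (r, g, b) tuples; depth: rows of depth values, same shape.
    Assumption: pinhole model, X = (u - cx) * z / fx and Y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, (color_row, depth_row) in enumerate(zip(image, depth)):
        for u, (color, z) in enumerate(zip(color_row, depth_row)):
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append(((x, y, z), color))
    return points


if __name__ == "__main__":
    image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
    depth = [[2.0, 2.0], [0.0, 4.0]]
    for point, color in depth_to_point_cloud(image, depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5):
        print(point, color)
```

Each resulting point carries the color of its source pixel, so the point cloud could serve as the starting data for the initial 3D reconstruction model generated in S3.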
[0140] In one exemplary embodiment, a computer device is provided. The computer device may include, for example, a camera, a mobile phone, or another device with imaging capabilities, and may also include a terminal device that is connected to the imaging device and has data processing capabilities. The computer device includes a processor and a memory. The memory stores a computer program, and when the processor executes the computer program, it implements the steps of any of the above-described three-dimensional reconstruction methods.
[0141] Figure 10 is a structural block diagram of the computer device 100 disclosed herein. The computer device 100 includes a computing unit 101, which can perform various appropriate actions and processes based on a computer program stored in a read-only memory (ROM) 102 or a computer program loaded from a storage unit 108 into a random access memory (RAM) 103. The RAM 103 may also store various programs and data required for the operation of the computer device 100. The computing unit 101, ROM 102, and RAM 103 are interconnected via a bus 104. An input/output (I/O) interface 105 is also connected to the bus 104.
[0142] Multiple components in computer device 100 are connected to I/O interface 105, including: input unit 106, output unit 107, storage unit 108, and communication unit 109. Input unit 106 can be any type of device capable of inputting information into computer device 100. Input unit 106 can receive input numerical or character information and generate key signal inputs related to user settings and/or function control of computer device 100, and may include, but is not limited to, a mouse, keyboard, touchscreen, trackpad, trackball, joystick, microphone, and/or remote control. Output unit 107 can be any type of device capable of presenting information, and may include, but is not limited to, a monitor, speaker, video/audio output terminal, vibrator, and/or printer. Storage unit 108 may include, but is not limited to, a hard disk and an optical disk. Communication unit 109 allows computer device 100 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers, and/or chipsets, such as Bluetooth™ devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.
[0143] The computing unit 101 can be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 101 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 101 performs the various methods and processes described above, such as the three-dimensional reconstruction method. For example, in some embodiments, the three-dimensional reconstruction method may be implemented as a computer software program tangibly contained in a machine-readable medium, such as storage unit 108. In some embodiments, part or all of the computer program may be loaded and/or installed on the computer device 100 via ROM 102 and/or communication unit 109. When the computer program is loaded into RAM 103 and executed by the computing unit 101, one or more steps of the three-dimensional reconstruction method described above may be performed. Alternatively, in other embodiments, the computing unit 101 may be configured to perform the three-dimensional reconstruction method by any other suitable means (e.g., by means of firmware).
[0144] The computer device 100 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to perform the three-dimensional reconstruction method described above.
[0145] Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of the invention are indicated by the following claims.
[0146] It should be understood that the present invention is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of the invention is limited only by the appended claims.
Claims
1. A three-dimensional reconstruction method, characterized in that the three-dimensional reconstruction method includes: determining an initial 3D reconstruction model based on an initial image; determining an observation viewpoint based on the initial 3D reconstruction model, the observation viewpoint having an occluded viewpoint in the initial 3D reconstruction model; and performing 3D reconstruction completion processing based on each of the observation viewpoints and the initial 3D reconstruction model to obtain a target 3D reconstruction model, such that the target 3D reconstruction model includes 3D information for at least some of the occluded viewpoints.
2. The method according to claim 1, characterized in that, Determining the observation viewpoint based on the initial 3D reconstruction model includes at least one of the following methods: The observation viewpoint is determined based on the user's point-marking operation on the point cloud interactive interface, and the point cloud interactive interface is generated based on the initial 3D reconstruction model. The observation viewpoint is determined based on the user's viewpoint selection operation on the 3D model interaction interface, and the 3D model interaction interface is generated based on the initial 3D reconstruction model. The inertial measurement unit (IMU) data of the shooting device is acquired, and the observation viewpoint is determined based on the IMU data and the initial 3D reconstruction model.
3. The three-dimensional reconstruction method according to claim 1, characterized in that, The step of determining the initial 3D reconstruction model based on the initial image includes: Based on the initial image and its depth information, point cloud information is determined; The initial 3D reconstruction model is generated based on the point cloud information.
4. The three-dimensional reconstruction method according to claim 1, characterized in that, The step of performing 3D reconstruction completion processing based on each of the observation viewpoints and the initial 3D reconstruction model to obtain the target 3D reconstruction model includes: The initial 3D reconstruction model is sequentially completed and updated based on each of the aforementioned observation viewpoints until all of the aforementioned observation viewpoints have completed the completion and update process, thereby obtaining the target 3D reconstruction model.
5. The three-dimensional reconstruction method according to claim 4, characterized in that, The process of sequentially completing and updating the initial 3D reconstruction model based on each of the aforementioned observation viewpoints until all of the aforementioned observation viewpoints have completed the completion and update process, thereby obtaining the target 3D reconstruction model, includes: Obtain the current observation viewpoint to be updated, and based on the current observation viewpoint and the 3D reconstruction model before this completion and update process, render the observation image of the current observation viewpoint and the depth information corresponding to the observation image. The depth information corresponding to the observed image is projected to obtain the projected depth information. Based on the depth information after projection, it is determined whether there is occlusion. If there is occlusion, an occlusion mask is generated. The observed image is completed based on the occlusion mask to obtain a completed image; Based on the completed image, the 3D reconstruction model before this completion and update process is completed and updated to obtain the 3D reconstruction model after this completion and update process.
6. The three-dimensional reconstruction method according to claim 5, characterized in that, The step of projecting the depth information corresponding to the observed image to obtain the projected depth information includes: The depth information corresponding to the observed image is projected onto the coordinate system of each known viewpoint to obtain the depth information corresponding to each coordinate system, wherein the known viewpoints include the initial viewpoint and each observation viewpoint for which the completion and update process has been completed. The step of determining whether occlusion exists based on the projected depth information, and generating an occlusion mask if occlusion exists, includes: In the coordinate system of each known viewpoint, the depth information corresponding to the current observation viewpoint is compared with the depth information corresponding to the known viewpoint, so as to generate the occlusion mask based on the comparison result.
7. The three-dimensional reconstruction method according to claim 6, characterized in that, In the coordinate system of each known viewpoint, the depth information corresponding to the current observation viewpoint and the depth information corresponding to the known viewpoint include the depth value corresponding to each pixel in the coordinate system of the known viewpoint. The step of comparing the depth information corresponding to the current observation viewpoint with the depth information corresponding to the known viewpoint in the coordinate system of each known viewpoint, and generating the occlusion mask based on the comparison result, includes: For each known viewpoint, calculate, for each pixel in the coordinate system of that known viewpoint, the difference between the depth value corresponding to the current observation viewpoint and the depth value corresponding to the known viewpoint; determine the set of pixels whose difference is greater than a preset difference as the occlusion mask.
8. The three-dimensional reconstruction method according to claim 6, characterized in that, The step of performing completion processing on the observed image based on the occlusion mask to obtain a completed image includes: Based on the occlusion mask in the coordinate system of each known viewpoint, the fusion mask of the observed image is determined; The completed image is obtained by performing the completion process on the fusion mask of the observed image.
9. The three-dimensional reconstruction method according to any one of claims 1 to 8, characterized in that, The three-dimensional reconstruction method also includes: Based on the target 3D reconstruction model and the target observation viewpoint, a reconstructed image is obtained, and the reconstructed image corresponds to a different viewpoint than the initial image.
10. The three-dimensional reconstruction method according to claim 9, characterized in that, The three-dimensional reconstruction method also includes: The target observation viewpoint is determined based on the user's viewpoint selection operation.
11. The three-dimensional reconstruction method according to any one of claims 1 to 8, characterized in that, The target 3D reconstruction model includes a Gaussian splatting model or a neural radiance field model.
12. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the computer program is executed by the processor, the processor is configured to: determine an initial 3D reconstruction model based on an initial image; determine an observation viewpoint based on the initial 3D reconstruction model, the observation viewpoint having an occluded viewpoint in the initial 3D reconstruction model; and perform 3D reconstruction completion processing based on each of the observation viewpoints and the initial 3D reconstruction model to obtain a target 3D reconstruction model, such that the target 3D reconstruction model includes 3D information for at least some of the occluded viewpoints.
13. The computer device according to claim 12, characterized in that, The processor is configured to determine the observation viewpoint based on the user's point-marking operation on the point cloud interactive interface, which is generated based on the initial 3D reconstruction model. The observation viewpoint is determined based on the user's viewpoint selection operation on the 3D model interaction interface, and the 3D model interaction interface is generated based on the initial 3D reconstruction model. The inertial measurement unit (IMU) data of the shooting device is acquired, and the observation viewpoint is determined based on the IMU data and the initial 3D reconstruction model.
14. The computer device according to claim 12, characterized in that, The processor is configured to determine point cloud information based on the initial image and the depth information of the initial image; The initial 3D reconstruction model is generated based on the point cloud information.
15. The computer device according to claim 12, characterized in that, The processor is configured to sequentially complete and update the initial 3D reconstruction model based on each of the observation viewpoints until each of the observation viewpoints has completed the completion and update process, thereby obtaining the target 3D reconstruction model.
16. The computer device according to claim 15, characterized in that, The processor is configured to acquire the current observation viewpoint to be updated, and based on the current observation viewpoint and the 3D reconstruction model before the current completion and update process, render the observation image of the current observation viewpoint and the depth information corresponding to the observation image. The depth information corresponding to the observed image is projected to obtain the projected depth information. Based on the depth information after projection, it is determined whether there is occlusion. If there is occlusion, an occlusion mask is generated. The observed image is completed based on the occlusion mask to obtain a completed image; Based on the completed image, the 3D reconstruction model before this completion and update process is completed and updated to obtain the 3D reconstruction model after this completion and update process.
17. The computer device according to claim 16, characterized in that, The processor is configured to project the depth information corresponding to the observed image onto the coordinate system of each known viewpoint, respectively, to obtain the depth information corresponding to each coordinate system, wherein the known viewpoints include the initial viewpoint and each observation viewpoint for which the completion and update process has been completed. In the coordinate system of each known viewpoint, the depth information corresponding to the current observation viewpoint is compared with the depth information corresponding to the known viewpoint, so as to generate the occlusion mask based on the comparison result.
18. The computer device according to claim 17, characterized in that, In the coordinate system of each known viewpoint, the depth information corresponding to the current observation viewpoint and the depth information corresponding to the known viewpoint include the depth value corresponding to each pixel in the coordinate system of the known viewpoint. The processor is configured to calculate, for each of the known viewpoints and for each pixel in the coordinate system of that known viewpoint, the difference between the depth value corresponding to the current observation viewpoint and the depth value corresponding to the known viewpoint. The set of pixels whose difference is greater than a preset difference is determined as the occlusion mask.
19. The computer device according to claim 18, characterized in that, The processor is configured to determine the fusion mask of the observed image based on the occlusion mask in the coordinate system of each known viewpoint; The completed image is obtained by performing the completion process on the fusion mask of the observed image.
20. The computer device according to any one of claims 12 to 19, characterized in that, The processor is configured to obtain a reconstructed image based on the target 3D reconstruction model and the target observation viewpoint, wherein the reconstructed image corresponds to a different viewpoint than the initial image.
21. The computer device according to claim 20, characterized in that, The processor is configured to determine the target observation viewpoint based on the user's viewpoint selection operation.
22. The computer device according to any one of claims 13 to 20, characterized in that, The target 3D reconstruction model includes a Gaussian splatting model or a neural radiance field model.