Information processing system, information processing method, and program
The shooting navigation system addresses the challenge of determining optimal imaging positions by calculating and displaying shooting directions, enhancing image quality and reducing missed shots in 3D capture.
Patent Information
- Authority / Receiving Office: WO · WO
- Patent Type: Applications
- Current Assignee / Owner: SONY GROUP CORP
- Filing Date: 2025-12-04
- Publication Date: 2026-07-02
Abstract
Description
Information Processing System, Information Processing Method, and Program
[0001] The present disclosure relates to an information processing system, an information processing method, and a program, and more particularly, to an information processing system, an information processing method, and a program that can suitably realize imaging for 3D capture.
[0002] Conventionally, as a method of 3D modeling of a subject (3D object) having a three-dimensional shape, a method called photogrammetry is known in which the subject is photographed from multiple directions and 3D data is generated based on a plurality of obtained captured images.
[0003] Patent Document 1 discloses a technique for setting a recommended imaging position for a 3D object so that each captured image can be obtained in a state where they overlap with an appropriate overlap rate according to the shape, unevenness, etc. of each part of the 3D object.
[0004] Patent Document 1: International Publication No. 2024/181004
[0005] In imaging for 3D capture, it has been difficult to determine which procedures and methods are appropriate for improving the imaging quality.
[0006] The present disclosure has been made in view of such a situation, and is intended to suitably realize imaging for 3D capture.
[0007] The information processing system of the present disclosure includes a processing circuit that acquires three-dimensional feature information regarding an environment or an object that is a subject of a monitor image captured by a camera unit, calculates a plurality of shooting viewpoint positions for generating a three-dimensional model of the subject based on the three-dimensional feature information, and superimposes and displays shooting viewpoint position information indicating the shooting viewpoint positions on the monitor image.
[0008] The information processing method disclosed herein is an information processing method in which a processing circuit acquires three-dimensional feature information relating to the environment or object that is the subject of a monitor image captured by the camera unit, calculates a shooting viewpoint position for generating a three-dimensional model of the subject based on the three-dimensional feature information, and superimposes and displays the shooting viewpoint position information indicating the shooting viewpoint position onto the monitor image.
[0009] The program disclosed herein is a program that causes a computer to perform the following processes: acquire three-dimensional feature information relating to the environment or object that is the subject of the monitor image captured by the camera unit; calculate the shooting viewpoint position for generating a three-dimensional model of the subject based on the three-dimensional feature information; and superimpose and display the shooting viewpoint position information indicating the shooting viewpoint position onto the monitor image.
[0010] In this disclosure, three-dimensional feature information relating to the environment or object that is the subject of the monitor image captured by the camera unit is acquired, a shooting viewpoint position for generating a three-dimensional model of the subject is calculated based on the three-dimensional feature information, and shooting viewpoint position information indicating the shooting viewpoint position is superimposed on the monitor image.
[0011] Figure 1 is a diagram showing an example of the external configuration of the shooting navigation system.
Figure 2 is a diagram showing an example of the external configuration of the shooting navigation system.
Figure 3 is a diagram showing an example of the hardware configuration of the shooting navigation system.
Figure 4 is a block diagram showing an example of the functional configuration of the processing unit.
Figure 5 is a diagram illustrating shooting navigation in scene capture.
Figure 6 is a diagram showing an example of the scene capture workflow.
Figure 7 is a diagram showing an example of the display of the shooting viewpoint position guide and virtual screen.
Figure 8 is a diagram explaining the details of the shooting viewpoint position guide.
Figure 9 is a diagram showing an example of the display mode of the shooting viewpoint position guide.
Figure 10 is a flowchart explaining the flow of the shooting guide display process.
Figure 11 is a diagram explaining screen parameters.
Figure 12 is a diagram explaining shooting viewpoint parameters.
Figure 13 is a diagram showing an example of irregular screens and shooting.
Figure 14 is a diagram showing an example of the object capture workflow.
Figure 15 is a diagram explaining the details of ROI setting and shooting guide generation.
Figure 16 is a diagram explaining the calculation of the shooting viewpoint position according to the three-dimensional shape.
Figure 17 is a diagram showing another example of the object capture workflow.
Figure 18 is a diagram showing another example of the display of the shooting viewpoint position guide.
Figure 19 is a flowchart explaining the flow of the shooting guide display process.
Figure 20 is a diagram showing another example of the configuration of the shooting navigation system.
Figure 21 is a diagram showing yet another example of the configuration of the shooting navigation system.
Figure 22 is a block diagram showing an example of the computer hardware configuration.
[0012] The following describes the forms for implementing this disclosure (hereinafter referred to as embodiments). The explanation will be given in the following order.
[0013]
1. Conventional challenges
2. Configuration of the shooting navigation system
3. First embodiment (shooting navigation in scene capture)
4. Second embodiment (shooting navigation in object capture)
5. Modifications
6. Description of a computer to which the technology of this disclosure is applied
[0014] <1. Conventional Challenges> In the past, when shooting for 3D capture, such as scene capture or object capture, inexperienced users did not know where to shoot or how many shots to take. In particular, in scene capture for virtual production, it was difficult to visualize the range of the space to be reproduced in an actual shooting studio, and it was unclear what area should be shot.
[0015] The optimal shooting method for scene capture and object capture is not uniquely determined and varies depending on the surrounding environment, the shape of the subject, and the 3D data playback environment. Therefore, these shooting methods largely depend on the experience of the photographer, and it has been difficult to automatically calculate and present the optimal shooting position to the user.
[0016] In contrast, a shooting navigation system applying the technology described herein calculates the necessary shooting positions and number of shots based on the size of the space and the size of the objects reproduced in 3D data, and realizes the display of these shooting positions and virtual screens using AR (Augmented Reality).
[0017] <2. Configuration of the Shooting Navigation System> (Example of External Configuration) Figures 1 and 2 show an example of the external configuration of a shooting navigation system as an information processing system to which the technology relating to this disclosure is applied.
[0018] The shooting navigation system 1 shown in Figures 1 and 2 performs shooting to acquire materials to be used for 3D modeling.
[0019] The shooting navigation system 1 is configured by mechanically and electrically connecting a smartphone 10 and an ILC (Interchangeable Lens Camera) 20. In the shooting navigation system 1, the position and orientation are estimated using the camera and sensors of the smartphone 10, and navigation information that assists shooting by the ILC 20 is projected into the real space and displayed on the screen (display unit 11) of the smartphone 10. In addition, the smartphone 10 controls automatic shooting by the ILC 20 with appropriate shooting conditions and timing. The ILC 20, in turn, takes pictures under the control of the smartphone 10.
[0020] In other words, in the shooting navigation system 1, the screen of the smartphone 10 (display unit 11) and the screen of the ILC 20 (display unit 21) display the same subject that is the target of 3D modeling. With the shooting navigation system 1, the user can take high-quality photos with the ILC 20 while keeping an eye on the navigation information displayed on the display unit 11 of the smartphone 10.
[0021] (Hardware Configuration) Figure 3 is a block diagram showing an example of the hardware configuration of the shooting navigation system 1.
[0022] In the shooting navigation system 1 shown in Figure 3, various sensors are provided on the smartphone 10, and functions for spatial recognition, posture calculation, camera control, and shooting assist display are integrated into the application. The ILC 20 also transmits camera information and shooting completion signals to the smartphone 10.
[0023] The smartphone 10 includes a display unit 11, a sensor group including an IMU (Inertial Measurement Unit) 110, a depth sensor 120, and a camera unit 130, and an operation unit 140. Furthermore, the smartphone 10 includes a processing unit 150 that can implement various functions by executing a program stored in a memory (not shown).
[0024] The ILC 20 comprises a camera unit 210 with interchangeable lenses and an image sensor, which has a shooting function for capturing subjects to be 3D modeled, and an operation unit 220 for performing operations and settings related to shooting by the camera unit 210. The captured image (also called a pass-through image, etc.) captured by the camera unit 210 is displayed in real time on the display unit 21. The ILC 20 also includes a processing unit 230 that can realize various functions by executing a program stored in a memory (not shown).
[0025] The processing unit 150 of the smartphone 10 and the processing unit 230 of the ILC 20 can exchange various types of information via a communication interface (not shown).
[0026] The IMU 110 detects the three-dimensional inertial motion (acceleration, angular velocity) of the smartphone 10 and outputs the detection results to the processing unit 150.
[0027] The depth sensor 120 has a LiDAR (Light Detection And Ranging) sensor (dToF (Direct Time of Flight) module), detects the depth to the subject, and outputs the detection result to the processing unit 150.
[0028] The camera unit 130 consists of a fixed-focus lens and an image sensor, and, like the camera unit 210, has a shooting function for capturing subjects that are the target of 3D modeling. The captured images taken by the camera unit 130 are output to the processing unit 150.
[0029] The operation unit 140 is composed of, for example, a touch panel superimposed on the display unit 11, and accepts instructions and information input in response to user operations. The received instructions and information are output to the processing unit 150.
[0030] The processing unit 150 controls the operation of the smartphone 10 and the ILC 20 based on various sensor data and information from the IMU 110, depth sensor 120, camera unit 130, and operation unit 140.
[0031] (Functional Configuration of the Processing Unit) Figure 4 is a block diagram showing an example of the functional configuration of the processing unit 150.
[0032] The processing unit 150 executes a program stored in memory (not shown) to realize the following functional blocks: parameter acquisition unit 301, position and orientation calculation unit 302, object recognition unit 303, shooting viewpoint position calculation unit 304, display control unit 305, and shooting control unit 306. At least one of the functional blocks realized in the processing unit 150 may be realized in the processing unit 230 of the ILC 20.
[0033] The parameter acquisition unit 301 acquires various parameters for displaying navigation information on the display unit 11. For example, the parameter acquisition unit 301 acquires three-dimensional feature information about the environment that is the subject of the captured image (hereinafter also referred to as the monitor image) captured by the camera unit 130. This three-dimensional feature information includes screen parameters that indicate the three-dimensional shape of the screen on which a three-dimensional model of the environment is displayed, such as in a virtual production.
[0034] The position and orientation calculation unit 302 calculates the position and orientation of the smartphone 10 and ILC 20 in three-dimensional space based on various sensor data from the IMU 110, depth sensor 120, and camera unit 130.
[0035] The object recognition unit 303 acquires (generates) three-dimensional feature information about the object that is the subject of the monitor image captured by the camera unit 130. Specifically, the object recognition unit 303 generates three-dimensional shape information that shows the three-dimensional shape of the object as three-dimensional feature information. Then, based on the generated three-dimensional shape information, the object recognition unit 303 recognizes the shape and size of the object and sets the area of the subject to be photographed by the ILC 20 as the region of interest (ROI).
[0036] The shooting viewpoint position calculation unit 304 calculates multiple shooting viewpoint positions for generating a three-dimensional model of the subject based on the three-dimensional feature information acquired by the parameter acquisition unit 301 and the object recognition unit 303. Specifically, the shooting viewpoint position calculation unit 304 calculates the shooting viewpoint positions using the position and orientation information indicating the position and orientation of the smartphone 10 (camera unit 130) calculated by the position and orientation calculation unit 302, and the shooting viewpoint parameters of the pre-set shooting viewpoint positions. Here, the shooting viewpoint parameters include at least the number of shooting viewpoint positions and the interval between the shooting viewpoint positions.
[0037] The display control unit 305 displays the monitor image captured by the camera unit 130 on the display unit 11. The display control unit 305 also overlays the shooting viewpoint position information, which is calculated by the shooting viewpoint position calculation unit 304, onto the monitor image. The shooting viewpoint position information is overlaid on the monitor image as an AR image that indicates not only the shooting viewpoint position in the real space shown in the monitor image but also the shooting direction at that shooting viewpoint position.
[0038] The shooting control unit 306 triggers the execution of image acquisition processing to generate a three-dimensional model based on user operations on the operation unit 140, or when the position and orientation information of the smartphone 10 matches the shooting viewpoint position information. In other words, the shooting control unit 306 outputs shooting control information to the ILC 20 for executing image acquisition processing to generate a three-dimensional model.
[0039] The following describes an embodiment of the shooting navigation system configured in this manner.
[0040] <3. First Embodiment> The technology relating to this disclosure can be applied to shooting navigation in scene capture.
[0041] (Shooting Navigation in Scene Capture) Figure 5 is a diagram illustrating shooting navigation in scene capture.
[0042] In the shooting navigation system of this embodiment, in material shooting for virtual production, by inputting the size of the screen SC installed in the shooting studio SD and the size of the shooting space SP, a plurality of shooting viewpoint positions for the material shooting are calculated. Then, in the monitor image MT displayed on the smartphone 10, a plurality of shooting viewpoint position guides GD are superimposed and displayed as shooting viewpoint position information indicating the calculated shooting viewpoint positions. Further, based on the position of the shooter, a virtual screen VS having the same three-dimensional shape as the screen SC is AR-displayed on the real space reflected in the monitor image MT so that it can be seen which part of the environment is cut out and displayed on the screen SC of the shooting studio SD. Hereinafter, navigation information such as the shooting viewpoint position guide GD and the virtual screen VS is also referred to as a shooting guide.
[0043] (Example of scene capture workflow) An example of the scene capture workflow in this embodiment will be described with reference to Figure 6. As shown in Figure 6, the scene capture in this embodiment consists of three stages.
[0044] In stage ST111, the reference position for shooting is determined and a shooting guide is generated.
[0045] First, for example, the current standing position of the shooter is determined as the reference position. The reference position is, for example, a position corresponding to the position serving as the reference for shooting in the shooting space for virtual production. Next, based on the reference position, the shooting viewpoint position guide and the virtual screen as shooting guides are superimposed and displayed on the monitor image. Such an operation is repeated until the shooter is satisfied with the arrangement of the shooting guides superimposed on the monitor image.
[0046] In stage ST112, shooting is performed according to the generated shooting guide.
[0047] Here, for each of the generated plurality of shooting viewpoint position guides, shooting is performed with the area included in the virtual screen as the shooting range. Shooting is completed when shooting for all the shooting viewpoint position guides has been performed.
[0048] In stage ST113, the captured data obtained by shooting is reviewed.
[0049] Here, an overhead image showing a three-dimensional space including all the shooting viewpoint positions based on the reference position is displayed on the display unit 11 of the smartphone 10. Icons indicating each shooting viewpoint position are displayed three-dimensionally in the overhead image. At this time, the icons of the shooting viewpoint positions where shooting has failed are displayed in a different color from other icons. As a result, the shooter can confirm whether there is any missed shooting or failed shooting, and can perform shooting again if necessary.
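The review step above can be illustrated with a minimal sketch. The status names, the icon colors, and the function `positions_to_reshoot` are hypothetical; the description only states that icons for failed shooting viewpoint positions are shown in a different color from the other icons.

```python
from enum import Enum


class ShotStatus(Enum):
    PENDING = "pending"  # not yet shot (a missed shot if still pending at review)
    OK = "ok"            # shot successfully
    FAILED = "failed"    # shooting was attempted but failed


# Illustrative colors only; the description does not specify them.
ICON_COLORS = {
    ShotStatus.PENDING: "gray",
    ShotStatus.OK: "green",
    ShotStatus.FAILED: "red",
}


def positions_to_reshoot(statuses):
    """Return the indices of shooting viewpoint positions that were missed or failed."""
    return [i for i, status in enumerate(statuses) if status is not ShotStatus.OK]
```

A review screen could color each icon with `ICON_COLORS[status]` and offer the returned indices as candidates for reshooting.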
[0050] (Example of display of shooting guide) Figure 7 is a diagram showing an example of display of a shooting viewpoint position guide and a virtual screen in a monitor image.
[0051] In the example of Figure 7, a virtual screen VS having a shape corresponding to the curved screen and a plurality of shooting viewpoint position guides GD arranged so that the region included in the virtual screen VS is included in the shooting range are superimposed and displayed on the monitor image MT of the environment captured by the camera unit 130 of the smartphone 10.
[0052] Figure 8 is a diagram for explaining the details of the shooting viewpoint position guide GD.
[0053] For example, the shooting viewpoint position guide GD is composed of a navigation frame GD1 and a direction presentation part GD2. Here, the navigation frame GD1 is hexagonal, but this is just an example. The direction presentation part GD2 indicates the shooting direction at the shooting viewpoint position of the camera unit 130 (camera unit 210) by a needle-like shape at the center of the navigation frame GD1. The tip direction of the needle indicates the shooting direction.
[0054] In the shooting viewpoint position guide GD in the left diagram of Figure 8, the tip of the needle-like shape points to the upper right, indicating that the shooting direction is toward the upper right as seen from the shooting viewpoint position indicated by the guide. In the shooting viewpoint position guide GD in the right diagram of Figure 8, the tip of the needle-like shape appears as a dot, indicating that the shooting direction is straight ahead as seen from the shooting viewpoint position indicated by the guide.
[0055] Returning to Figure 7, the virtual screen VS and the shooting viewpoint position guide GD are fixedly positioned in the virtual space corresponding to the real space displayed on the monitor image MT. Therefore, when the photographer changes their standing position, the appearance of the virtual screen VS and the shooting viewpoint position guide GD changes along with the real space displayed on the monitor image MT.
[0056] Furthermore, a camera position guide CP is displayed in the center of the monitor image MT, indicating the position (camera position) of the camera unit 130 (camera unit 210) in the virtual space. The camera position guide CP, like the shooting viewpoint position guide GD, is composed of a hexagonal frame. The camera position guide CP is displayed at a fixed position on the monitor image MT. As the camera unit 130 moves and the camera position guide CP moves relatively closer to the shooting viewpoint position guide GD, the display mode of the shooting viewpoint position guide GD changes.
[0057] For example, as shown on the left side of Figure 9, when the camera position guide CP is separated from the shooting viewpoint position guide GD, that is, when the position of the camera unit 130 is separated from the shooting viewpoint position, the shooting viewpoint position guide GD is displayed in white, for example.
[0058] On the other hand, as shown on the right side of Figure 9, when the camera position guide CP approaches the shooting viewpoint position guide GD, that is, when the position of the camera unit 130 is close to the shooting viewpoint position, the shooting viewpoint position guide GD will be displayed in orange, for example. When shooting is performed by the ILC 20 in this state, it means that shooting (image acquisition processing) has been performed at the shooting viewpoint position. The distance between the camera position guide CP and the shooting viewpoint position guide GD, which is the condition for the color of the shooting viewpoint position guide GD to change, can be adjusted arbitrarily.
[0059] In this case, the photographer may manually operate the ILC 20 to take a picture after confirming that the color of the shooting viewpoint position guide GD has changed, or the ILC 20 may automatically take a picture triggered by the change in the color of the shooting viewpoint position guide GD. Furthermore, if the ILC 20 has an interval shooting mode that takes pictures continuously at regular intervals, the photographer may simply adjust the position of the camera unit 130 (smartphone 10) so that the color of the shooting viewpoint position guide GD changes.
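The proximity-dependent display mode and the automatic trigger described above can be sketched as follows. This is an illustration under assumptions, not the disclosed implementation: the function names and the 0.1 m default threshold are hypothetical, and positions are assumed to be 3D coordinates in meters in the virtual space.

```python
import math


def guide_color(camera_pos, guide_pos, threshold_m=0.1):
    """Return 'orange' when the camera is within threshold_m of the guide, else 'white'.

    threshold_m is a hypothetical default; the description only says that the
    distance condition can be adjusted arbitrarily.
    """
    return "orange" if math.dist(camera_pos, guide_pos) <= threshold_m else "white"


def should_auto_shoot(previous_color, current_color):
    """Trigger automatic shooting on the white-to-orange color change."""
    return previous_color == "white" and current_color == "orange"
```

Triggering on the change of color rather than on the color itself means a camera that lingers near a guide fires once, not on every frame.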
[0060] (Flow of the shooting guide display process) The flow of the shooting guide display process in this embodiment will be explained with reference to the flowchart in Figure 10. The process in Figure 10 is started when, for example, the operation unit 140 is operated while the monitor image is being captured by the camera unit 130 of the smartphone 10.
[0061] In step S101, the parameter acquisition unit 301 acquires screen parameters as three-dimensional feature information and pre-set shooting viewpoint parameters.
[0062] For example, as shown in Figure 11, the screen parameters include at least the width and height of the screen SC installed in the shooting space (shooting studio). If the screen SC is a curved screen, the screen parameters also include the depth. Furthermore, if the screen SC is installed at a distance from the mounting surface, the screen parameters also include the distance from the mounting surface to the bottom edge of the screen SC.
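As an illustration, the screen parameters listed above could be held in a small record such as the one below. The class name, field names, and the `arc_radius` helper are hypothetical. The helper additionally assumes that a curved screen is a circular arc whose chord is the screen width and whose sagitta is the screen depth; the description does not state the curve type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreenParams:
    width: float                # screen width (chord length for a curved screen)
    height: float               # screen height
    depth: float = 0.0          # depth of a curved screen; 0 for a flat screen
    bottom_offset: float = 0.0  # distance from the mounting surface to the bottom edge

    def arc_radius(self) -> Optional[float]:
        """Radius of the arc, assuming a circular arc; None for a flat screen.

        From r^2 = (width / 2)^2 + (r - depth)^2 it follows that
        r = width^2 / (8 * depth) + depth / 2.
        """
        if self.depth <= 0:
            return None
        return self.width ** 2 / (8 * self.depth) + self.depth / 2
```

For example, a screen 8 units wide and 2 units deep gives a radius of 5 units under this assumption (a 3-4-5 right triangle between the half chord, the radius, and the radius minus the depth).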
[0063] The shooting viewpoint parameters are set based on the size of the space where the screen is installed, i.e., the shooting space for virtual production. As mentioned above, the shooting viewpoint parameters include the number of shooting viewpoint positions and the spacing between them. For example, the number of shooting viewpoint positions and the spacing between them are set to correspond to the range of camera work defined with the imaging plane of the studio camera in the shooting space as the XY plane and the depth direction as the Z axis. In this embodiment, for example, as shown in Figure 12, the shooting viewpoint parameters are set to the number of shooting viewpoint positions VP per side of a rectangular parallelepiped in the XYZ coordinate space (X_Num, Y_Num, Z_Num) and the spacing between shooting viewpoint positions VP in each direction (X_Span, Y_Span, Z_Span).
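A minimal sketch of how the counts (X_Num, Y_Num, Z_Num) and spacings (X_Span, Y_Span, Z_Span) could be expanded into a grid of shooting viewpoint positions VP is given below. The function name is hypothetical, and centering the grid on the reference position is an assumption made for illustration.

```python
from itertools import product


def viewpoint_grid(reference, num, span):
    """Expand shooting viewpoint parameters into a list of (x, y, z) positions.

    reference: (x, y, z) reference position
    num:       (X_Num, Y_Num, Z_Num), number of positions along each axis
    span:      (X_Span, Y_Span, Z_Span), spacing between positions along each axis
    The grid is centered on the reference position (an assumption of this sketch).
    """
    axes = []
    for ref, n, s in zip(reference, num, span):
        start = ref - s * (n - 1) / 2
        axes.append([start + i * s for i in range(n)])
    return list(product(*axes))
```

The total number of positions is X_Num × Y_Num × Z_Num, so the counts directly determine how many shots the photographer is asked to take.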
[0064] Furthermore, the shooting viewpoint position is limited to the range of movement possible in the actual shooting environment. Therefore, if the shooting environment is not flat, the distance between shooting viewpoint positions may be set to be small, or the range of shooting viewpoint positions may be automatically set and adjusted by sensing the environmental conditions using a depth sensor 120 or the like.
[0065] Returning to the flowchart in Figure 10, in step S102, the position and orientation calculation unit 302 calculates position and orientation information indicating the position and orientation of the camera unit 130 (smartphone 10) by using SLAM (Simultaneous Localization and Mapping) with sensor data from the IMU 110 and the camera unit 130. The current position indicated by the position and orientation information becomes the reference position described above.
[0066] In step S103, the shooting viewpoint position calculation unit 304 calculates the position of the virtual screen based on the reference position indicated by the position and orientation information.
[0067] In step S104, the shooting viewpoint position calculation unit 304 calculates the shooting viewpoint position and shooting direction based on the screen parameters, shooting viewpoint parameters, and position and orientation information. For example, using the reference position indicated by the position and orientation information as a reference, shooting viewpoint positions are calculated with the number and interval indicated by the shooting viewpoint parameters. Furthermore, at each shooting viewpoint position, a shooting direction is calculated that includes the area contained in the virtual screen within the shooting range.
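One simple way to obtain such a shooting direction is sketched below. It aims the camera at the center of the virtual screen and then checks whether all screen corners fall within the field of view. Both choices are assumptions for illustration: the description does not say how the direction is chosen, and the field of view is approximated here by a cone rather than a rectangular frustum. All function names are hypothetical.

```python
import math


def unit(v):
    """Normalize a 3D vector."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def shooting_direction(viewpoint, screen_center):
    """Unit vector from the shooting viewpoint position toward the virtual screen center."""
    return unit(tuple(c - p for p, c in zip(viewpoint, screen_center)))


def covers_screen(viewpoint, direction, corners, half_fov_deg):
    """True if every screen corner lies within a cone of half-angle half_fov_deg
    around the shooting direction (a simplification of the real camera frustum)."""
    cos_limit = math.cos(math.radians(half_fov_deg))
    for corner in corners:
        to_corner = unit(tuple(c - p for p, c in zip(viewpoint, corner)))
        if sum(a * b for a, b in zip(direction, to_corner)) < cos_limit:
            return False
    return True
```

If the check fails for a viewpoint, an implementation might narrow the grid, split the screen into several shots, or use a wider lens; the description leaves this open.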
[0068] Then, in step S105, the display control unit 305 superimposes the shooting viewpoint position guide and virtual screen onto the monitor image displayed on the display unit 11, based on the shooting viewpoint position and shooting direction calculated by the shooting viewpoint position calculation unit 304 and the position of the virtual screen. At this time, the shooting viewpoint position guide and virtual screen are displayed in AR on the real space shown in the monitor image, in accordance with the position and orientation information calculated by SLAM.
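The AR overlay in this step amounts to projecting 3D positions in the virtual space into the monitor image using the current camera pose. A minimal sketch with a standard pinhole camera model follows. The function name, the world-to-camera rotation convention, and the intrinsic parameters (fx, fy, cx, cy) are assumptions; the description does not specify the projection model.

```python
def project_point(point_world, cam_pos, cam_rot, fx, fy, cx, cy):
    """Project a 3D world point into pixel coordinates with a pinhole model.

    cam_pos: camera position in world coordinates
    cam_rot: 3x3 world-to-camera rotation matrix, given as three rows
    fx, fy:  focal lengths in pixels; cx, cy: principal point in pixels
    Returns (u, v), or None if the point is behind the camera.
    """
    d = [p - c for p, c in zip(point_world, cam_pos)]
    xc, yc, zc = (sum(row[i] * d[i] for i in range(3)) for row in cam_rot)
    if zc <= 0:
        return None
    return (fx * xc / zc + cx, fy * yc / zc + cy)
```

Because the guides are fixed in the virtual space, re-running this projection with each new pose estimate makes them appear anchored to the real space as the photographer moves.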
[0069] The above processing makes it possible to suitably achieve shooting for scene capture. Specifically, in scene capture shooting, the optimal shooting viewpoint position becomes clear, so even inexperienced users can easily take pictures, and furthermore, it reduces missed shots and improves the quality of the shots. In addition, in scene capture, it becomes easier to visualize the range of space to be reproduced in an actual shooting studio, and the area to be shot becomes clearer.
[0070] (Handling of irregular screens and shooting conditions) The shooting navigation system of this embodiment can accommodate various screen shapes and various shooting methods in a shooting studio.
[0071] For example, for a J-shaped screen SC1 as shown on the left side of Figure 13, or an L-shaped screen (not shown), the shooting viewpoint positions and the virtual screen position may be calculated so that the ILC 20 performs image acquisition processing in two separate passes, one for each surface. Similarly, for a full-circumference screen, the shooting viewpoint positions and the virtual screen position may be calculated so that the ILC 20 performs image acquisition processing in four separate passes, one for each surface obtained by dividing the screen at, for example, 90-degree intervals.
[0072] Furthermore, expressing a shift in viewpoint by zooming in or out on the background projected onto the studio screen is essentially equivalent to the studio camera moving virtually. In this case, as shown on the right side of Figure 13, the shooting viewpoint parameters may be set so that the shooting viewpoint positions VP lie within the range over which the studio camera virtually moves relative to the screen SC.
[0073] <4. Second Embodiment> The technology relating to this disclosure can also be applied to shooting navigation in object capture.
[0074] (Example of Object Capture Workflow) An example of the object capture workflow in this embodiment will be described with reference to Figure 14. As shown in Figure 14, the object capture in this embodiment consists of three stages.
[0075] In stage ST211, a region of interest (ROI), which is the area of the object, is set, and a shooting guide is generated.
[0076] For example, as shown in Figure 15, the photographer first points the smartphone 10 (depth sensor 120, camera unit 130) at the subject SJ and scans around the subject SJ. This sets up a scene box BX that shows the ROI corresponding to the shape and size of the subject SJ. The shape and size of the scene box BX can be manually adjusted by the photographer. Next, when generation of a shooting guide is instructed, for example by operating the operation unit 140, shooting viewpoint position guides GD are superimposed on the monitor image as a shooting guide. The shooting viewpoint position guides GD are arranged in a predetermined number and at predetermined intervals according to the shape and size of the set and adjusted scene box BX. In the example in Figure 15, 36 shooting viewpoint position guides GD per revolution are arranged in five layers in the height direction. This operation is repeated until the photographer is satisfied with the arrangement of the shooting guide superimposed on the monitor image.
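The layered arrangement in the Figure 15 example can be sketched as below, with 36 positions per revolution in five layers for 180 positions in total. The function name, the y-up coordinate convention, the margin between the scene box and the ring, and the choice to spread the layers evenly over the box height are all assumptions for illustration.

```python
import math


def ring_layers(box_center, box_size, per_rev=36, layers=5, margin=0.5):
    """Place shooting viewpoint positions on horizontal rings around a scene box.

    box_center: (x, y, z) center of the scene box, with y pointing up
    box_size:   (width, height, depth) of the scene box
    margin:     hypothetical clearance between the box corners and the ring
    """
    cx, cy, cz = box_center
    w, h, d = box_size
    radius = math.hypot(w, d) / 2 + margin  # ring clears the box corners
    positions = []
    for k in range(layers):
        # Spread layers evenly from the bottom to the top of the box.
        y = cy - h / 2 + h * k / (layers - 1) if layers > 1 else cy
        for i in range(per_rev):
            angle = 2 * math.pi * i / per_rev
            positions.append((cx + radius * math.cos(angle), y, cz + radius * math.sin(angle)))
    return positions
```

With the defaults, adjusting the scene box changes the ring radius and layer heights while the number of guides stays fixed.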
[0077] In stage ST212, shooting is performed according to the generated shooting guide.
[0078] Here, for each of the generated shooting viewpoint position guides, shooting is performed with the ROI (scene box BX) as the shooting range. Shooting is complete when shooting has been performed for all shooting viewpoint position guides.
[0079] In stage ST213, the photographic data obtained through the shooting process is reviewed.
[0080] Here, the display unit 11 of the smartphone 10 shows an overhead image representing a three-dimensional space including all shooting viewpoint positions. Icons representing each shooting viewpoint position are displayed three-dimensionally on the overhead image. At this time, icons for shooting viewpoint positions where shooting failed are displayed in a different color from the other icons. This allows the photographer to check whether there are any missed shots or failed shots, and to retake the shots if necessary.
[0081] (Calculation of shooting viewpoint position according to three-dimensional shape) The shooting viewpoint positions need not be arranged uniformly with respect to the target object (ROI); the density of the shooting viewpoint positions and the shooting direction at each shooting viewpoint position may be varied according to the three-dimensional shape of the object.
[0082] As shown in Figure 16, based on sensor data from the depth sensor 120 and camera unit 130 acquired by scanning the subject SJ with the smartphone 10, a mesh is generated on the surface of the subject SJ as three-dimensional shape information (three-dimensional feature information) indicating the three-dimensional shape of the subject SJ. The shape and size of the object (subject SJ) are recognized from the generated mesh, and the normal direction and curvature of the object surface are acquired.
[0083] The shooting viewpoint positions are then adjusted according to the acquired normal direction and curvature of the object surface. For example, the density of shooting viewpoint positions is increased for areas of the object surface with high curvature. In addition, the shooting direction at each shooting viewpoint position is adjusted to align with the normal direction of the object surface.
[0084] In this way, by calculating the shooting viewpoint position according to the three-dimensional shape of the object, higher quality object capture can be achieved.
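One way to picture the curvature-dependent density is the sketch below. It is not the disclosed algorithm: it assumes the surface has already been reduced to an ordered list of samples along a contour, each with a unit normal and a curvature value, and the names `adaptive_viewpoints`, `step`, `distance`, and `gain` are hypothetical. A viewpoint is emitted whenever the curvature-weighted path length since the previous viewpoint reaches `step`, so strongly curved stretches receive more viewpoints. The sketch interprets "aligned with the normal" as the camera standing off along the normal and looking back at the surface.

```python
import math

def adaptive_viewpoints(samples, step=1.0, distance=0.5, gain=1.0):
    """samples: ordered list of (point, unit_normal, curvature) tuples.

    Returns a list of (position, direction) pairs.
    """
    views = []
    acc = step   # forces a viewpoint at the first sample
    prev = None
    for point, normal, curvature in samples:
        if prev is not None:
            # Path length is inflated where the surface curves strongly.
            acc += math.dist(prev, point) * (1.0 + gain * abs(curvature))
        if acc >= step:
            position = tuple(p + distance * n for p, n in zip(point, normal))
            direction = tuple(-n for n in normal)  # look back at the surface
            views.append((position, direction))
            acc = 0.0
        prev = point
    return views
```

For a flat stretch sampled every unit length with `step=2.0`, every second sample yields a viewpoint; with curvature 1.0 and `gain=1.0`, every sample does.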
[0085] (Other examples of object capture workflows) In the object capture workflow described with reference to Figure 14, it is assumed that an ROI is set for the object. However, generating a mesh for ROI setting imposes a high processing load. It is therefore also possible to generate a simpler shooting guide that does not incur the processing load of mesh generation (ROI setting).
[0086] Figure 17 illustrates another example of the object capture workflow in this embodiment. The object capture shown in Figure 17 also consists of three stages.
[0087] In stage ST221, a simplified shooting guide is generated.
[0088] First, while the photographer is pointing the smartphone 10 at the subject, generation of a shooting guide is instructed, for example by operating the operation unit 140, and the shooting viewpoint position guides GD are superimposed on the monitor image. In this case, for example, as shown in Figure 18, the shooting viewpoint position guides GD are arranged horizontally in real space in a circle with a predetermined radius, centered on the center of the monitor image. By pointing the camera unit 130 (smartphone 10) at the subject SJ, the shooting viewpoint position guides GD are positioned around the subject SJ. This operation is repeated until the photographer is satisfied with the arrangement of the shooting guide superimposed on the monitor image.
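The simplified circular arrangement can be sketched without any mesh or ROI. The following is a sketch under explicit assumptions that are not stated in the text: the circle center is taken to lie `radius` ahead of the camera along its horizontal heading `yaw` (so that the photographer's current position falls on the circle), and the function name `simple_ring` is hypothetical.

```python
import math

def simple_ring(camera_pos, yaw, radius=1.0, count=36):
    """Place `count` guides on a horizontal circle in front of the camera.

    camera_pos: (x, y, z) with z up; yaw: heading of the optical axis (rad).
    Returns (centre, guides); guides[0] coincides with the camera position.
    """
    x, y, z = camera_pos
    # Assumed centre: the point `radius` ahead of the camera, at camera height.
    cx = x + radius * math.cos(yaw)
    cy = y + radius * math.sin(yaw)
    start = yaw + math.pi  # angle from the centre back towards the camera
    guides = [(cx + radius * math.cos(start + 2.0 * math.pi * k / count),
               cy + radius * math.sin(start + 2.0 * math.pi * k / count),
               z)
              for k in range(count)]
    return (cx, cy, z), guides
```

Calling the function again after the photographer changes the height of the smartphone produces a new ring at the new height, which corresponds to the repetition over height positions in this workflow.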
[0089] In stage ST222, shooting is performed according to the generated shooting guide.
[0090] Here, a full rotation of images is captured according to the generated shooting viewpoint position guides. After images have been captured at every shooting viewpoint position guide in a full rotation, the photographer changes the height position of the smartphone 10 (camera unit 130), whereupon new circularly arranged shooting viewpoint position guides GD are displayed. This allows another full rotation of images to be captured. By repeating this process five times at different height positions, images equivalent to those captured in stage ST212 in Figure 15 can be obtained.
[0091] In stage ST223, the image data obtained through imaging is checked, similar to stage ST213 in Figure 15.
[0092] Furthermore, the following shooting modes are possible when using the circularly arranged shooting viewpoint position guide GD as explained with reference to Figure 18.
[0093] In the first shooting mode, the photographer moves and takes pictures by referring to the distance between the circularly displayed shooting viewpoint position guides GD as the amount of horizontal movement. In this case, the color of the shooting viewpoint position guide GD closest to the position of the smartphone 10 (the photographer's current position) changes to, for example, orange. Also, if the photographer moves to a position corresponding to an adjacent shooting viewpoint position guide GD, the smartphone 10 vibrates to notify the photographer of this.
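The first shooting mode can be illustrated with a small tracker that reports which guide to highlight and when to vibrate. This is a simplified sketch: it vibrates whenever the nearest guide changes, which approximates "moving to a position corresponding to an adjacent guide". The class name `NearestGuideTracker` is hypothetical.

```python
import math

class NearestGuideTracker:
    """Tracks the guide closest to the smartphone position."""

    def __init__(self, guides):
        self.guides = list(guides)
        self.current = None  # index of the guide highlighted last time

    def update(self, phone_pos):
        """Return (index of the guide to highlight, whether to vibrate)."""
        nearest = min(range(len(self.guides)),
                      key=lambda i: math.dist(self.guides[i], phone_pos))
        # Vibrate only when the photographer has reached a different guide.
        vibrate = self.current is not None and nearest != self.current
        self.current = nearest
        return nearest, vibrate
```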
[0094] In the second shooting mode, the photographer aligns the camera position guide CP (Figure 9), displayed in the center of the monitor image, with a shooting viewpoint position guide GD to initiate automatic shooting. For example, the color of the shooting viewpoint position guide GD closest to the camera position guide CP is changed to orange, and automatic shooting is triggered when the position and orientation of the smartphone 10 stop changing. A shooting viewpoint position guide GD for which shooting is complete is, for example, grayed out. When the camera position guide CP approaches a shooting viewpoint position guide GD for which shooting is not yet complete and reaches a position from which shooting is possible, the smartphone 10 vibrates to notify the photographer. Once shooting has been completed for all shooting viewpoint position guides GD in a full rotation, all of the shooting viewpoint position guides GD are grayed out.
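The trigger logic of the second shooting mode can be sketched as a small per-frame state machine. The sketch is deliberately simplified and rests on assumptions: it checks position only (not orientation), treats "stopped" as a few consecutive frames with negligible motion, and uses hypothetical thresholds `align_tol`, `still_tol`, and `still_frames`.

```python
import math

class AutoShooter:
    """Fires a shot when the camera rests on a guide that is still pending."""

    def __init__(self, guides, align_tol=0.05, still_tol=0.005, still_frames=3):
        self.guides = list(guides)
        self.align_tol = align_tol        # hypothetical alignment radius
        self.still_tol = still_tol        # hypothetical per-frame motion limit
        self.still_frames = still_frames  # frames required to count as stopped
        self.done = set()                 # guides already shot (grayed out)
        self._prev = None
        self._still = 0

    def update(self, cam_pos):
        """Feed one camera position per frame; return the guide shot, or None."""
        moved = math.inf if self._prev is None else math.dist(self._prev, cam_pos)
        self._prev = cam_pos
        self._still = self._still + 1 if moved <= self.still_tol else 0
        pending = [i for i in range(len(self.guides)) if i not in self.done]
        if not pending:
            return None
        nearest = min(pending, key=lambda i: math.dist(self.guides[i], cam_pos))
        aligned = math.dist(self.guides[nearest], cam_pos) <= self.align_tol
        if aligned and self._still >= self.still_frames:
            self.done.add(nearest)
            self._still = 0
            return nearest
        return None

    def all_done(self):
        """True once every guide in the ring has been shot."""
        return len(self.done) == len(self.guides)
```

A display layer could gray out every index in `done` and highlight the nearest pending guide; `all_done()` corresponds to the state in which all guides of a full rotation are grayed out.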
[0095] (Flow of the shooting guide display process) The flow of the shooting guide display process in this embodiment will be explained with reference to the flowchart in Figure 19. The process in Figure 19 is started, for example, when the operation unit 140 is operated while the monitor image is being captured by the camera unit 130 of the smartphone 10.
[0096] In step S201, the parameter acquisition unit 301 acquires pre-set shooting viewpoint parameters. These shooting viewpoint parameters may include the number of shooting viewpoint positions arranged in a ring shape, the spacing between shooting viewpoint positions, and the number of layers in the height direction.
[0097] In step S202, the position and orientation information calculation unit 302 calculates position and orientation information indicating the position and orientation of the camera unit 130 by SLAM, using sensor data from the IMU 110 and the camera unit 130.
[0098] In step S203, the object recognition unit 303 generates a mesh (three-dimensional shape information) representing the three-dimensional shape of the target object, as three-dimensional feature information of the object, based on the depth data (sensor data from the depth sensor 120).
[0099] In step S204, the object recognition unit 303 sets the ROI (scene box BX) based on the generated mesh.
[0100] In step S205, the shooting viewpoint position calculation unit 304 calculates the shooting viewpoint position and shooting direction based on the shooting viewpoint parameters, mesh, ROI (scene box BX), and position and orientation information. For example, based on the three-dimensional shape shown by the mesh, shooting viewpoint positions are calculated at the number and intervals indicated by the shooting viewpoint parameters. Furthermore, at each shooting viewpoint position, a shooting direction is calculated that includes the ROI (scene box BX) within the shooting range. Alternatively, shooting viewpoint positions are calculated at the number and intervals indicated by the shooting viewpoint parameters based on the camera position shown by the position and orientation information.
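The requirement that the shooting direction keep the ROI within the shooting range can be illustrated as follows. This is a sketch under assumptions, not the disclosed calculation: the ROI is an axis-aligned box, the viewpoint lies outside it, the direction simply points at the box center, and the field of view is approximated by a cone with a hypothetical full angle `fov_deg` rather than a rectangular frustum.

```python
import math
from itertools import product

def look_at(viewpoint, target):
    """Unit vector from viewpoint to target (the two must differ)."""
    d = [t - v for v, t in zip(viewpoint, target)]
    n = math.sqrt(sum(c * c for c in d))
    return tuple(c / n for c in d)

def roi_in_view(viewpoint, box_min, box_max, fov_deg=60.0):
    """Return (shooting direction, True if all 8 box corners are in the cone)."""
    center = tuple((a + b) / 2.0 for a, b in zip(box_min, box_max))
    direction = look_at(viewpoint, center)
    half = math.radians(fov_deg) / 2.0
    # zip pairs (min, max) per axis; product enumerates the 8 corners.
    for corner in product(*zip(box_min, box_max)):
        to_corner = look_at(viewpoint, corner)
        cos_angle = sum(a * b for a, b in zip(direction, to_corner))
        if math.acos(max(-1.0, min(1.0, cos_angle))) > half:
            return direction, False
    return direction, True
```

A viewpoint that fails the check could, for instance, be moved farther from the ROI before its guide is displayed.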
[0101] Then, in step S206, the display control unit 305 superimposes the shooting viewpoint position guide onto the monitor image displayed on the display unit 11, based on the shooting viewpoint position and shooting direction calculated by the shooting viewpoint position calculation unit 304. At this time, the shooting viewpoint position guide is displayed in AR on the real space shown in the monitor image, in accordance with the position and orientation information calculated by SLAM.
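The AR superimposition amounts to projecting each three-dimensional guide position into the monitor image using the camera pose. The sketch below uses a standard pinhole camera model; the function name `project_guide`, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the camera-frame convention (x right, y down, z forward) are assumptions for illustration, and lens distortion is ignored.

```python
def project_guide(point_w, R_wc, cam_pos, fx, fy, cx, cy):
    """Project a world-space guide position to pixel coordinates.

    R_wc: 3x3 rotation (list of rows) mapping world vectors into the camera
    frame; cam_pos: camera position in world coordinates (e.g. from SLAM).
    Returns (u, v), or None if the point is behind the camera.
    """
    rel = [p - c for p, c in zip(point_w, cam_pos)]
    # Rotate the relative vector into the camera frame.
    xc, yc, zc = (sum(r * v for r, v in zip(row, rel)) for row in R_wc)
    if zc <= 0.0:
        return None  # not visible: do not draw the guide
    return (fx * xc / zc + cx, fy * yc / zc + cy)
```

Re-running the projection every frame with the latest pose keeps the guides fixed in real space as the camera moves.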
[0102] The above process makes it possible to suitably perform shooting for object capture. Specifically, since the optimal shooting viewpoint position becomes clear during shooting for object capture, even inexperienced users can easily perform shooting, and furthermore, it is possible to reduce missed shots and improve the quality of the captured images.
[0103] <5. Modifications> In the above, the shooting navigation system to which the technology of this disclosure is applied is assumed to consist of a smartphone 10 and an ILC 20. That is, as explained with reference to Figures 1 and 2, a processing unit, a sensor group including a camera unit, and a display unit for displaying monitor images are provided in the smartphone 10, which is an information processing device, and the smartphone 10 is connected to the ILC 20, which is a shooting device that performs image acquisition processing.
[0104] The configuration is not limited to this example; a shooting navigation system to which the technology of this disclosure is applied may adopt other configurations.
[0105] (ILC only) Figure 20 shows another example configuration of a shooting navigation system to which the technology of this disclosure is applied.
[0106] The shooting navigation system 401 shown in Figure 20 consists only of the ILC 20. In the shooting navigation system 401, the ILC 20, which is the shooting device, includes a processing unit, a sensor group including a camera unit, and a display unit that displays a monitor image, and performs image acquisition processing. That is, in the shooting navigation system 401, the monitor image and shooting guide are displayed on the display unit 21 of the ILC 20.
[0107] (Display device and ILC) Figure 21 shows yet another example configuration of a shooting navigation system to which the technology of this disclosure is applied.
[0108] The shooting navigation system 501 shown in Figure 21 consists of an ILC 20 and a display device 510. In the shooting navigation system 501, the ILC 20, which is a shooting device, is equipped with a processing unit and a sensor group including a camera unit, and performs image acquisition processing. The display device 510 can be, for example, a smartwatch, tablet terminal, or smartphone, and only needs to be able to communicate with the ILC 20 and to display a monitor image and a shooting guide. Since the display device 510 is not physically attached to the ILC 20, the monitor image and shooting guide can be checked on the display device 510 regardless of the direction in which the ILC 20 is shooting, for example, when the photographer stretches their arm upward as far as it will go to take a picture.
[0109] Even with a shooting navigation system configured as described above, it is possible to suitably achieve shooting for 3D capture.
[0110] <6. Description of a computer to which the technology relating to this disclosure is applied> The series of processes described above can be executed by hardware or by software. When the series of processes are executed by software, the programs that make up the software are installed on a computer. Here, the computer includes computers that are built into dedicated hardware, and general-purpose personal computers, for example, that can perform various functions by installing various programs.
[0111] Figure 22 is a block diagram showing an example of the hardware configuration of a computer that executes the series of processes described above using a program.
[0112] In a computer, the processing circuit 901, ROM (Read Only Memory) 902, and RAM (Random Access Memory) 903 are interconnected by a bus 904.
[0113] An input / output interface 905 is further connected to the bus 904. An input unit 906, an output unit 907, a storage unit 908, a communication unit 909, and a drive 910 are connected to the input / output interface 905.
[0114] The input unit 906 may include physical or virtual operating means that the user operates to input information, such as a keyboard, mouse, or touch panel, as well as means by which the user inputs information through voice, eye gaze, etc. Furthermore, the input unit 906 may include sensors for inputting various physical quantities to the computer. For example, the input unit 906 may include sensors that acquire physical quantities such as light (including non-visible light such as infrared light) or sound, such as a camera or microphone. Also, for example, the input unit 906 may include sensors that acquire other physical quantities such as temperature, moisture content, acceleration, distance, etc. The output unit 907 may include means that present information to the user by stimulating the user's perception, such as a display, speaker, or haptic device. The storage unit 908 is composed of a hard disk, non-volatile or volatile memory, etc., and stores various types of information (including programs). The communication unit 909 is a network interface, etc., and performs wired or wireless communication with the outside. The drive 910 drives removable media 911 such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory.
[0115] The processing circuit 901 includes a processor that executes programs such as a CPU (Central Processing Unit) and a DSP (Digital Signal Processor). The processing circuit 901 (its processor) performs the above-described series of processes by loading the program stored in the storage unit 908 into the RAM 903 via the input / output interface 905 and the bus 904 and executing it. The processing circuit 901 can output the processing results of the series of processes from the output unit 907 via the bus 904 and the input / output interface 905 as needed. The processing circuit 901 can also store the processing results in the storage unit 908 or transmit them from the communication unit 909.
[0116] The program executed by the computer (processing circuit 901) can be provided by recording it on a removable medium 911, such as a package medium. The program can also be provided via wired or wireless transmission media, such as a local area network, the internet, or digital satellite broadcasting.
[0117] In a computer, a program can be installed in the storage unit 908 via the input / output interface 905 by inserting a removable media 911 into the drive 910. Alternatively, a program can be received by the communication unit 909 from another device, such as a server, via a wired or wireless transmission medium, and installed in the storage unit 908. Furthermore, programs can be pre-installed in the ROM 902 or the storage unit 908.
[0118] The programs executed by the computer may be programs that are processed chronologically in the order described herein, or they may be programs that are processed in parallel or at necessary times, such as when a call is made.
[0119] The processes that a computer performs according to a program do not necessarily have to follow the order described in the flowchart. In other words, the processes that a computer performs according to a program include processes that are executed in parallel or individually (e.g., parallel processing and object-based processing).
[0120] The program may be processed by a single computer (processor), or it may be processed in a distributed manner by multiple computers. Furthermore, the program may be transferred to a remote computer and executed there.
[0121] When the computer executes a program to perform the above-described series of processes, the input unit 906 functions as a sensor group including the camera unit 130, the processing circuit 901 functions as a processing unit 150 that realizes each functional block by executing the program, and the output unit 907 functions as a display unit 11.
[0122] In this specification, a system means one component or a collection of multiple components (devices, modules (parts), etc.). Therefore, one or more components of a computer, for example, only the processor, or a combination of the processor and memory (for example, only the processing circuit 901, or a combination of the processing circuit 901 through the bus 904, etc.), constitute a system. For a collection of multiple components, it does not matter whether all components reside in the same enclosure. Therefore, multiple devices housed in separate enclosures and connected via a network, and a single device containing multiple modules within a single enclosure, are both systems. Furthermore, for example, the entire computer, or a combination of a computer and other devices such as a server (not shown), also constitutes a system.
[0123] The components (blocks) of the apparatus illustrated in this specification are functional conceptual blocks, and the actual apparatus does not need to have the illustrated configuration. That is, the apparatus can have any configuration in which the functions of the illustrated components are divided and / or integrated into any unit, for example, a configuration having one block in which the functions of all components are integrated.
[0124] The embodiments of the technology relating to this disclosure are not limited to those described above, and various modifications are possible without departing from the gist of the technology relating to this disclosure.
[0125] The effects described herein are merely illustrative and not limited to those described herein; other effects may also occur.
[0126] Furthermore, the technology relating to this disclosure can take the following configurations:
(1) An information processing system including a processing circuit that acquires three-dimensional feature information relating to the environment or object that is the subject of the monitor image captured by the camera unit, calculates a plurality of shooting viewpoint positions for generating a three-dimensional model of the subject based on the three-dimensional feature information, and superimposes and displays shooting viewpoint position information indicating the shooting viewpoint positions onto the monitor image.
(2) The information processing system according to (1), wherein the shooting viewpoint position information further indicates the shooting direction at the shooting viewpoint position.
(3) The information processing system according to (2), wherein the processing circuit further calculates the shooting viewpoint position using the position and orientation information of the camera unit and the shooting viewpoint parameters of the shooting viewpoint position set in advance.
(4) The information processing system according to (3), wherein the shooting viewpoint position information is superimposed and displayed on the monitor image in accordance with the position and orientation information calculated by SLAM.
(5) The information processing system according to (4), wherein the shooting viewpoint parameters include the number of shooting viewpoint positions and the interval between the shooting viewpoint positions.
(6) The information processing system according to (5), wherein the processing circuit obtains screen parameters indicating the three-dimensional shape of the screen on which the three-dimensional model of the environment is displayed as three-dimensional feature information, and further superimposes and displays a virtual screen having the same three-dimensional shape as the screen on the monitor image based on the position and orientation information and the screen parameters.
(7) The information processing system according to (6), wherein the processing circuit calculates the position of the virtual screen based on the position and orientation information, and calculates the shooting viewpoint position such that the area included in the virtual screen in the monitor image is included in the shooting range.
(8) The information processing system according to (7), wherein the shooting viewpoint parameter is set based on the size of the space in which the screen is installed.
(9) The information processing system according to (8), wherein the space is a shooting space for virtual production.
(10) The information processing system according to (5), wherein the processing circuit generates three-dimensional shape information indicating the three-dimensional shape of the object as three-dimensional feature information, and calculates the shooting viewpoint position based on the three-dimensional shape information.
(11) The information processing system according to (10), wherein the processing circuit sets a region of interest based on the three-dimensional shape information and calculates the shooting viewpoint position such that the region of interest is included in the shooting range.
(12) The information processing system according to (11), wherein the three-dimensional shape information includes a mesh generated by scanning the object.
(13) The information processing system according to (12), wherein the mesh is generated based on the depth data of the object.
(14) The information processing system according to (13), wherein the processing circuit calculates the shooting viewpoint position such that at least one of the density of the shooting viewpoint positions and the shooting direction at the shooting viewpoint position differs according to the three-dimensional shape information.
(15) The information processing system according to any one of (5) to (14), wherein the processing circuit executes an image acquisition process to generate the three-dimensional model when the position and orientation information and the shooting viewpoint position information match.
(16) The information processing system according to (15), comprising an information processing device comprising the processing circuit, a group of sensors including the camera unit, and a display unit for displaying the monitor image, and a shooting device connected to the information processing device and performing the image acquisition process.
(17) The information processing system according to (15), comprising a shooting device that comprises the processing circuit, a group of sensors including the camera unit, and a display unit for displaying the monitor image, and that performs the image acquisition process.
(18) The information processing system according to (15), comprising a shooting device comprising the processing circuit and a group of sensors including the camera unit and performing the image acquisition process, and a display device that can communicate with the shooting device and displays the monitor image.
(19) An information processing method comprising: a processing circuit acquiring three-dimensional feature information relating to the environment or object that is the subject of the monitor image captured by the camera unit; calculating a shooting viewpoint position for generating a three-dimensional model of the subject based on the three-dimensional feature information; and superimposing and displaying the shooting viewpoint position information indicating the shooting viewpoint position on the monitor image.
(20) A program that causes the processing circuit to perform the following processes: acquire three-dimensional feature information relating to the environment or object that is the subject of the monitor image captured by the camera unit; calculate the shooting viewpoint position for generating a three-dimensional model of the subject based on the three-dimensional feature information; and superimpose and display the shooting viewpoint position information indicating the shooting viewpoint position onto the monitor image.
[0127] 1 Shooting navigation system, 10 Smartphone, 11 Display unit, 20 ILC, 21 Display unit, 110 IMU, 120 Depth sensor, 130 Camera unit, 140 Operation unit, 210 Camera unit, 220 Operation unit, 301 Parameter acquisition unit, 302 Position and orientation information calculation unit, 303 Object recognition unit, 304 Shooting viewpoint position calculation unit, 305 Display control unit, 306 Shooting control unit, 401 Shooting navigation system, 501 Shooting navigation system, 510 Display device
Claims
1. An information processing system including a processing circuit that acquires three-dimensional feature information about the environment or object that is the subject of the monitor image captured by the camera unit, calculates a plurality of shooting viewpoint positions for generating a three-dimensional model of the subject based on the three-dimensional feature information, and superimposes and displays the shooting viewpoint position information indicating the shooting viewpoint positions onto the monitor image.
2. The information processing system according to claim 1, wherein the shooting viewpoint position information further indicates the shooting direction at the shooting viewpoint position.
3. The information processing system according to claim 2, wherein the processing circuit further uses the position and orientation information of the camera unit and the shooting viewpoint parameters of the shooting viewpoint position which have been set in advance to calculate the shooting viewpoint position.
4. The information processing system according to claim 3, wherein the shooting viewpoint position information is superimposed on the monitor image in accordance with the position and orientation information calculated by SLAM.
5. The information processing system according to claim 4, wherein the shooting viewpoint parameter includes the number of shooting viewpoint positions and the interval between the shooting viewpoint positions.
6. The information processing system according to claim 5, wherein the processing circuit acquires screen parameters indicating the three-dimensional shape of the screen on which the three-dimensional model of the environment is displayed as three-dimensional feature information, and further superimposes and displays a virtual screen having the same three-dimensional shape as the screen on the monitor image based on the position and orientation information and the screen parameters.
7. The information processing system according to claim 6, wherein the processing circuit calculates the position of the virtual screen based on the position and orientation information, and calculates the shooting viewpoint position such that the area included in the virtual screen in the monitor image is included in the shooting range.
8. The information processing system according to claim 7, wherein the shooting viewpoint parameter is set based on the size of the space in which the screen is installed.
9. The information processing system according to claim 8, wherein the space is a shooting space for virtual production.
10. The information processing system according to claim 5, wherein the processing circuit generates three-dimensional shape information indicating the three-dimensional shape of the object as the three-dimensional feature information, and calculates the shooting viewpoint position based on the three-dimensional shape information.
11. The information processing system according to claim 10, wherein the processing circuit sets a region of interest based on the three-dimensional shape information and calculates the shooting viewpoint position such that the region of interest is included in the shooting range.
12. The information processing system according to claim 11, wherein the three-dimensional shape information includes a mesh generated by scanning the object.
13. The information processing system according to claim 12, wherein the mesh is generated based on the depth data of the object.
14. The information processing system according to claim 13, wherein the processing circuit calculates the shooting viewpoint position such that at least one of the density of the shooting viewpoint positions and the shooting direction at the shooting viewpoint position differs according to the three-dimensional shape information.
15. The information processing system according to claim 5, wherein the processing circuit executes an image acquisition process for generating the three-dimensional model when the position and orientation information and the shooting viewpoint position information match.
16. The information processing system according to claim 15, comprising an information processing device having the processing circuit, a group of sensors including the camera unit, and a display unit for displaying the monitor image, and a shooting device connected to the information processing device for performing the image acquisition process.
17. The information processing system according to claim 15, comprising a shooting device that includes the processing circuit, a group of sensors including the camera unit, and a display unit for displaying the monitor image, and that performs the image acquisition process.
18. The information processing system according to claim 15, comprising: a shooting device that includes the processing circuit and a group of sensors including the camera unit and performs the image acquisition process; and a display device that can communicate with the shooting device and displays the monitor image.
19. An information processing method comprising: a processing circuit acquiring three-dimensional feature information relating to the environment or object that is the subject of the monitor image captured by the camera unit; calculating a shooting viewpoint position for generating a three-dimensional model of the subject based on the three-dimensional feature information; and superimposing and displaying the shooting viewpoint position information indicating the shooting viewpoint position onto the monitor image.
20. A program that causes a processing circuit to perform the following processes: acquire three-dimensional feature information relating to the environment or object that is the subject of the monitor image captured by the camera unit; calculate the shooting viewpoint position for generating a three-dimensional model of the subject based on the three-dimensional feature information; and superimpose and display the shooting viewpoint position information indicating the shooting viewpoint position onto the monitor image.