Image processing apparatus and image processing method
The image processing apparatus enhances virtual reality experience by superimposing eyelid-dependent images on PC and webcam systems, addressing inferior immersion in non-VR setups.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- 株式会社NTTコノキュー
- Filing Date
- 2024-12-20
- Publication Date
- 2026-07-02
Abstract
Description
Technical Field
[0001] The present invention relates to an image processing apparatus and an image processing method.
Background Art
[0002] A technology is known that enables a user to experience virtual reality by wearing a VR (Virtual Reality) device such as an HMD (Head-Mounted Display). For example, Patent Document 1 discloses an information processing apparatus that allows a user wearing an HMD to move the range displayed on the HMD by moving their head when viewing the content of a spherical image.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] VR devices can hardly be said to be in widespread use. By contrast, now that remote meetings have become common, personal computers (PCs) and web cameras are widely used. A system that allows a user to experience virtual reality in a simulated manner using a PC and a web camera is therefore desired, but such a system has the problem that immersion in the content is inferior to that obtained with a VR device.
[0005]
Means for Solving the Problems
[0006] A preferred embodiment of the present invention provides an image processing apparatus comprising: an acquisition unit that acquires first information indicating the movement of a user's eyelids; a display control unit that causes a display device to display a display image obtained by superimposing a first image having a fixed display area and a second image that obscures at least a portion of the first image; and an area determination unit that determines, based on the first information, the area of the first image obscured by the second image, wherein the area determination unit maximizes the area when the user's eyelids are closed and decreases the area after a first point in time when the user's eyelids begin to open.
[0007] A preferred embodiment of the present invention provides an image processing method comprising: acquiring first information indicating the movement of a user's eyelids; causing a display device to display a display image obtained by superimposing a first image having a fixed display area and a second image that obscures at least a portion of the first image; determining, based on the first information, the area of the first image obscured by the second image; maximizing the area when the user's eyelids are closed; and decreasing the area after a first point in time when the user's eyelids begin to open.
Effects of the Invention
[0008] The image processing apparatus and image processing method according to the present invention enable users to easily experience virtual reality without using VR equipment.
Brief Description of the Drawings
[0009]
[Figure 1] A diagram showing the overall configuration of an image processing system including an image processing device according to the first embodiment.
[Figure 2] A block diagram showing an example configuration of the image processing device shown in Figure 1.
[Figure 3] A diagram showing an example of a schematic arrangement of the image processing device according to the first embodiment.
[Figure 4] A diagram schematically showing the setup of a virtual camera in a virtual live venue.
[Figure 5] A schematic diagram showing the operation axes of the virtual camera.
[Figure 6] A schematic diagram showing the operation axes of the user's head.
[Figure 7] A diagram showing an example of an image displayed on the display device when the user's head is facing forward.
[Figure 8] A diagram showing an example of an image displayed on the display device when the user's face is turned to the left.
[Figure 9] A diagram showing an example of an image displayed on the display device when the user's face is turned to the right.
[Figure 10] A diagram showing an example of the eyelids and the second image when the eyelids are completely closed.
[Figure 11] A diagram showing an example of the eyelids and the second image when the eyelids are fully open.
[Figure 12] A diagram showing an example of the eyelids and the second image when the eyelids are only half open.
[Figure 13] A diagram showing an example of a display image created by superimposing the first and second images.
[Figure 14] A schematic diagram illustrating the degree of eyelid opening.
[Figure 15] A diagram showing an example of the time-dependent change in the degree of virtual eyelid opening.
[Figure 16] A block diagram showing an example configuration of the server shown in Figure 1.
[Figure 17] A flowchart showing an example of the image processing operation of the processing unit shown in Figure 2.
[Figure 18] A block diagram showing an example configuration of an image processing device according to the second embodiment.
[Figure 19] A diagram illustrating the change in brightness of the first image processed by the processing unit according to the second embodiment.
[Figure 20] A flowchart showing an example of the image processing operation of the processing unit shown in Figure 18.
[Figure 21] A block diagram showing an example configuration of an image processing device according to the third embodiment.
[Figure 22] A diagram illustrating a predetermined range that includes the position of the virtual camera VC.
[Figure 23] A flowchart showing an example of the image processing operation of the processing unit shown in Figure 21.
[Figure 24] A diagram showing an example of a display image shown on the display device by an image processing device according to Modification 2.
Modes for Carrying Out the Invention
1. First Embodiment
[0010] The configuration of the image processing device according to the first embodiment of the present invention will be described below with reference to Figures 1 to 17.
1.1. Configuration of the First Embodiment
1.1.1. Overall Structure
[0011] Figure 1 shows the overall configuration of an image processing system 1 including an image processing device according to the first embodiment. The image processing system 1 comprises an image processing device 10, a server 20, and a communication network NET. In the image processing system 1, the image processing device 10 and the server 20 are connected via the communication network NET so that they can communicate with each other.
[0012] The image processing device 10 is a device that provides a new video experience to user U by displaying digital content on a display device in accordance with user U's movements. The display device includes an external display connected to a personal computer, a laptop computer, a tablet device, a built-in display in a smartphone, a television, and a projector. In this embodiment, an external display will be used as an example.
[0013] Server 20 is a device that provides digital content to the image processing device 10. Server 20 receives requests from the image processing device 10 via the communication network NET and delivers various digital content to the image processing device 10 in response to the requests from the image processing device 10.
[0014] In this embodiment, one image processing device 10 is connected to the communication network NET, but multiple image processing devices 10 may be connected to the communication network NET. When multiple image processing devices 10 are connected to the communication network NET, the server 20 may distribute the same digital content to the multiple image processing devices 10. Therefore, it is possible for multiple users U to view the same digital content.
[0015] In this embodiment, we will describe the digital content using an idol's live performance at a virtual venue as an example. The live performance may be delivered individually to each user U at their desired time, or it may be delivered simultaneously to each user U to create a sense of presence.
[0016] The communication network NET is a telecommunications line, such as a mobile communication network, managed by a telecommunications carrier that provides communication services. The communication network NET includes a wired communication network, a wireless communication network, or both. The communication network NET may, for example, be connected via the Internet to other networks (not shown) managed by other telecommunications carriers.
1.1.2. Configuration of the Image Processing Device
[0017] Figure 2 is a block diagram showing an example configuration of the image processing device 10 shown in Figure 1. The image processing device 10 comprises a processing unit 11, a storage device 12, a communication device 13, a display device 14, an input device 15, an imaging device 16, and a sound output device 17. The elements of the image processing device 10 are interconnected by one or more buses for communicating information. In this specification, the term "apparatus" may be replaced with other terms such as circuit, device, unit, etc.
[0018] The processing unit 11 is a processor that controls the entire image processing device 10, and is configured, for example, using one or more chips. The processing unit 11 is configured, for example, using a central processing unit (CPU) that includes interfaces with peripheral devices, an arithmetic unit, and registers. Some or all of the functions of the processing unit 11 may be implemented by hardware such as a DSP (Digital Signal Processor), ASIC (Application Specific Integrated Circuit), PLD (Programmable Logic Device), or FPGA (Field Programmable Gate Array). The processing unit 11 executes various processes in parallel or sequentially.
[0019] The storage device 12 is a recording medium that can be read from and written to by the processing unit 11. The storage device 12 includes, for example, non-volatile memory and volatile memory. Examples of non-volatile memory are ROM (Read Only Memory), EPROM (Erasable Programmable Read Only Memory), and EEPROM (Electrically Erasable Programmable Read Only Memory). An example of volatile memory is RAM (Random Access Memory).
[0020] The storage device 12 stores multiple programs, including the control program PR1, which is executed by the processing unit 11. The storage device 12 also functions as a work area for the processing unit 11.
[0021] The communication device 13 is hardware that acts as a transmitting and receiving device for communicating with other devices. The communication device 13 is also called, for example, a network device, network controller, network card, or communication module. The communication device 13 may be equipped with a connector for wired connection and an interface circuit corresponding to the connector. The communication device 13 may also be equipped with a wireless communication interface. Examples of connectors and interface circuits for wired connection include products compliant with wired LAN, IEEE1394, USB, etc. Examples of wireless communication interfaces include products compliant with wireless LAN, Bluetooth®, etc.
[0022] The display device 14 is a device that displays images and text information. The display device 14 displays various images under the control of the processing unit 11. For example, various display panels such as liquid crystal panels and organic EL (Electro-Luminescence) panels are preferably used as the display device 14.
[0023] The input device 15 accepts input from user U. For example, the input device 15 includes a keyboard or a pointing device such as a touchpad, touch panel, or mouse. If the input device 15 includes a touch panel, it may also function as the display device 14.
[0024] The imaging device 16 outputs a captured image Gx obtained by imaging the external environment. The imaging device 16 includes, for example, a lens, an image sensor, an amplifier, and an AD converter. Light focused through the lens is converted into an analog imaging signal by the image sensor. The amplifier amplifies the imaging signal and outputs it to the AD converter. The AD converter converts the amplified analog imaging signal into digital imaging information. The converted imaging information is output to the processing unit 11 as the captured image Gx.
[0025] The imaging device 16 may be a standalone camera such as a digital still camera, video camera, webcam, or action camera, or it may be a camera built into a mobile terminal device or notebook computer. In this embodiment, the case in which a webcam is used will be described below as an example.
[0026] The sound output device 17 is a device that outputs sound. The sound output device 17 outputs various sounds based on control by the processing unit 11. The sound output device 17 includes one or more speakers.
[0027] Figure 3 shows an example of a schematic arrangement of the image processing device 10 according to the first embodiment. As shown in Figure 3, the imaging device 16, i.e., a webcam, is positioned directly above the display device 14. The sound output device 17 is positioned near the display device 14. The display device 14 is an external display. The size of the display device 14 is not particularly limited, but the larger the screen size of the display device 14, the greater the user U's sense of immersion in the image tends to be.
[0028] When user U is positioned in front of the display device 14 to view digital content, the imaging device 16 is pre-adjusted to capture at least the upper body of user U. Since the imaging device 16 is positioned directly above the display device 14, it can capture user U's face from the front when user U is looking at the display device 14. More preferably, the imaging device 16 is positioned directly above the display device 14 and in the center of the horizontal direction of the display device 14. The imaging device 16 may also be positioned directly below the display device 14. The imaging device 16 only needs to be positioned so that at least user U's eyes can be captured when user U is looking at the display device 14.
[0029] In this embodiment, the sound output device 17 is a stereo speaker. In this embodiment, a pair of left and right stereo speakers are exemplified, but the sound output device 17 may be a mono speaker, or a surround sound system speaker composed of multiple speakers, a subwoofer, etc. It may also be headphones or earphones worn on the user U's ears. To allow user U to feel immersed, it is preferable to use a sound output device 17 having multiple speakers.
[0030] Referring again to Figure 2, the processing unit 11 functions as an analysis unit 111, an acquisition unit 112, an image determination unit 113, an area determination unit 114, and a display control unit 115, for example, by reading and executing the control program PR1 from the storage device 12.
[0031] The analysis unit 111 analyzes the captured image Gx captured by the imaging device 16. For example, the analysis unit 111 analyzes the movement of the user U's upper body based on the captured image Gx using well-known motion capture technology. More specifically, the analysis unit 111 analyzes the movement of the user U's head, arms, eyelids, and eyeballs. For example, Webcam Motion Capture (https://webcammotioncapture.info/ja/) uses AI (Artificial Intelligence) technology to enable tracking of hands, fingers, head, facial expressions, gaze, blinking, mouth movements, and the upper body based on images captured by a webcam.
[0032] The acquisition unit 112 acquires head motion information indicating the movement of user U's head, arm motion information indicating the movement of user U's arms, eyelid information indicating the movement of user U's eyelids, and user gaze information indicating the movement of user U's gaze from the analysis unit 111. Eyelid information is an example of the first information.
[0033] The image determination unit 113 determines the first image IM1 based on head movement information. The first image IM1 is an image showing the field of view FV of a virtual camera VC placed in the virtual space VS. The field of view of the virtual camera VC corresponds to the field of view of the user U. That is, the image determination unit 113 determines the orientation of the virtual camera VC according to the movement of the user U's head. In other words, the position and orientation of the virtual camera VC are linked to the movement of the user U. The area S1 of the first image IM1 is fixed.
[0034] The area determination unit 114 determines the area of the second image IM2 based on eyelid information. The area determination unit 114 maximizes the area ratio of the second image IM2 when the user U's eyelids are closed, and decreases the area ratio of the second image IM2 after the first time point when the user U's eyelids begin to open. The area determination unit 114 may also decrease the area ratio of the second image IM2 from a second time point, which is a predetermined time later than the first time point.
[0035] The display control unit 115 displays a display image IMD on the display device 14, which is a superimposed image of a first image IM1 and a second image IM2 that obscures at least a portion of the first image. The first image IM1 is an image of the virtual space VS captured by a virtual camera VC installed in the virtual space VS. The second image IM2 is a mask image that changes in conjunction with the movement of the user U's eyelids.
[0036] The virtual camera and mask image will be explained below, with reference to the diagrams.
[0037] Figure 4 schematically shows the setup of the virtual camera VC in the live venue LS in the virtual space VS. The following explanation assumes that the digital content viewed by user U is a live performance by performer PF1 in the live venue LS.
[0038] As shown in Figure 4, a stage ST is installed at the front of the live venue LS in the virtual space VS. A performer PF1 is standing on the stage ST. A virtual camera VC is placed at the rear of the venue and faces the stage ST. A virtual audience VA is located between the stage ST and the virtual camera VC. Note that, for the sake of explanation, a small-scale venue is shown as the live venue LS, but the size of the live venue LS is not particularly limited.
[0039] The position of the virtual camera VC corresponds to the position of the viewpoint of the user U. In the present embodiment, a designated seat in the live venue LS is assigned to the user U in advance, and the position of the virtual camera VC is arranged at the position of the designated seat. Note that the position of the virtual camera VC may be automatically assigned to an available position at the start of viewing the digital content, or the user U may be able to select an available position.
[0040] Figure 5 is a schematic diagram showing the operation axes of the virtual camera VC. The center O_V of the imaging surface IS of the virtual camera VC is taken as the origin. The axis passing through the center O_V and perpendicular to the imaging surface IS is taken as the x_V axis; the axis passing through the center O_V, parallel to the imaging surface IS, and parallel to the horizontal plane is taken as the y_V axis; and the axis passing through the center O_V, parallel to the imaging surface IS, and perpendicular to the horizontal plane is taken as the z_V axis. The rotation angle of the virtual camera VC around the x_V axis is taken as the first roll angle φ_V, the rotation angle around the y_V axis as the first pitch angle θ_V, and the rotation angle around the z_V axis as the first yaw angle ψ_V.
[0041] Figure 6 is a schematic diagram showing the operation axes of the head of the user U. The axis passing through the origin O_R inside the head of the user U and pointing in the direction in which the face of the user U is facing is taken as the x_R axis; the axis passing through the origin O_R, perpendicular to the x_R axis, and parallel to a horizontal plane (not shown) is taken as the y_R axis; and the axis passing through the origin O_R and perpendicular to both the x_R axis and the y_R axis is taken as the z_R axis. The rotation angle of user U's head around the x_R axis is taken as the second roll angle φ_R, the rotation angle around the y_R axis as the second pitch angle θ_R, and the rotation angle around the z_R axis as the second yaw angle ψ_R.
[0042] Figure 7 shows an example of an image displayed on the display device 14 when the user U's head is facing forward. The display device 14 displays the field of view of the virtual camera VC. As shown in Figure 7, the stage ST and performer PF1 are displayed in the center of the field of view.
[0043] For example, if user U turns their head and faces to the left, the second yaw angle ψ_R changes in the positive direction. Suppose the analysis unit 111 determines, based on the captured image Gx output from the imaging device 16, that the second yaw angle ψ_R has changed by ψ_R1° in the positive direction. In this case, the acquisition unit 112 acquires head movement information indicating that the second yaw angle ψ_R has changed by ψ_R1° in the positive direction.
[0044] The display control unit 115 determines the first image based on head movement information. Starting from the image before the user U turned their face to the left, the display control unit 115 determines as the first image the image obtained by changing the first yaw angle ψ_V in the positive direction by an amount corresponding to the change in the second yaw angle ψ_R. In other words, the virtual camera VC changes its orientation to follow the movement of the user U's head. In this embodiment, the change in the first yaw angle ψ_V is proportional to the change in the second yaw angle ψ_R.
[0045] Figure 8 shows an example of an image displayed on the display device 14 when user U is facing left. As shown in Figure 8, the stage ST and performer PF1 are displayed on the right side of the field of view.
[0046] When user U turns their face to the right, the second yaw angle ψ_R changes in the negative direction. Suppose the analysis unit 111 determines, based on the captured image Gx output from the imaging device 16, that the second yaw angle ψ_R has changed by ψ_R1° in the negative direction. In this case, the acquisition unit 112 acquires head movement information indicating that the second yaw angle ψ_R has changed by ψ_R1° in the negative direction.
[0047] Starting from the image before user U turned their face to the right, the display control unit 115 determines as the first image the image obtained by changing the first yaw angle ψ_V in the negative direction by an amount corresponding to the change in the second yaw angle ψ_R.
[0048] Figure 9 shows an example of an image displayed on the display device 14 when user U is facing to the right. As shown in Figure 9, the stage ST and performer PF1 are displayed on the left side of the field of view.
[0049] Note that the change in the first yaw angle ψ_V and the change in the second yaw angle ψ_R do not necessarily have to be proportional. It is sufficient that the first yaw angle ψ_V increases at least monotonically as the second yaw angle ψ_R increases.
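As a minimal sketch of the two kinds of mapping described above, the following Python maps a head yaw change to a camera yaw change. The function names, the default gain of 1.0, and the saturating tanh curve are illustrative assumptions and do not appear in the specification, which only requires that the camera yaw increase monotonically with the head yaw.

```python
import math


def proportional_camera_yaw(head_yaw_deg: float, gain: float = 1.0) -> float:
    """Camera yaw change proportional to the head yaw change.

    The gain value is an assumption; the text only states proportionality.
    """
    return gain * head_yaw_deg


def monotonic_camera_yaw(head_yaw_deg: float, max_yaw_deg: float = 45.0) -> float:
    """A hypothetical non-proportional mapping that is still monotonic.

    tanh is strictly increasing, so a larger head yaw never produces a
    smaller camera yaw, while the output stays within +/- max_yaw_deg.
    """
    return max_yaw_deg * math.tanh(head_yaw_deg / max_yaw_deg)
```

A saturating curve of this kind is only one possible choice; any strictly increasing function would satisfy the monotonicity condition.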
[0050] Furthermore, the virtual camera VC may change its orientation to follow the movement of user U's head only when the change in the orientation of user U's head exceeds a first threshold. For example, if the first threshold for the second yaw angle ψ_R is set to 2°, the display control unit 115 does not change the first image when the magnitude of the second yaw angle ψ_R is 1°, and changes the first image when the magnitude of the second yaw angle ψ_R exceeds 2°. This prevents the user's viewpoint in the virtual space VS from frequently changing due to slight movements of the user U.
[0051] Furthermore, the virtual camera VC may stop tracking the user U's head movement if the change in the user U's head orientation exceeds a second threshold greater than the first threshold. For example, if the second threshold for the second yaw angle ψ_R is set to 30°, then even when the magnitude of the second yaw angle ψ_R becomes 31°, the first image remains unchanged from the first image displayed when the magnitude of the second yaw angle ψ_R is 30°. This prevents the field of view of the virtual camera VC from exceeding the display range of the provided digital content.
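The two thresholds can be pictured as a dead zone followed by a clamp. The sketch below is one possible reading, using the 2° and 30° example values from the text; the function name is hypothetical, and the behavior just above the first threshold (the yaw is passed through unchanged rather than offset by the threshold) is an assumption.

```python
def effective_head_yaw(head_yaw_deg: float,
                       first_threshold_deg: float = 2.0,
                       second_threshold_deg: float = 30.0) -> float:
    """Return the head yaw that the virtual camera actually follows."""
    magnitude = abs(head_yaw_deg)
    if magnitude <= first_threshold_deg:
        # Slight movement: the first image is not changed.
        return 0.0
    # Beyond the second threshold the camera stops following the head.
    clamped = min(magnitude, second_threshold_deg)
    return clamped if head_yaw_deg > 0 else -clamped
```

With these defaults, a yaw of 1° yields 0°, and a yaw of 31° yields the same result as a yaw of 30°.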
[0052] When user U turns their face upward or downward, the second pitch angle θ_R changes in the positive or negative direction. When user U tilts their head to the right or left, the second roll angle φ_R changes in the positive or negative direction. The change in the first pitch angle θ_V with respect to the change in the second pitch angle θ_R, and the change in the first roll angle φ_V with respect to the change in the second roll angle φ_R, are similar to the change in the first yaw angle ψ_V with respect to the change in the second yaw angle ψ_R described above, so their explanation is omitted.
[0053] When user U moves their head back and forth, that is, along the x_R axis, the analysis unit 111 analyzes the amount of movement along the x_R axis based on the captured image Gx output from the imaging device 16. In this case, the acquisition unit 112 acquires head motion information including the amount of head movement along the x_R axis.
[0054] The display control unit 115 determines the first image based on head movement information. For example, if user U moves their head forward, the display control unit 115 changes the first image to one that is closer to the stage ST than the image before user U moved their head forward. Conversely, if user U moves their head backward, the display control unit 115 changes the first image to one that is further away from the stage ST than the image before user U moved their head backward.
[0055] Methods for changing the first image to one that is closer to the stage ST include zooming in, which narrows the field of view of the virtual camera VC, and dollying in, which moves the virtual camera VC closer to the stage ST. Methods for changing the first image to one that is further away from the stage ST include zooming out, which widens the field of view of the virtual camera VC, and dollying out, which moves the virtual camera VC further away from the stage ST. In addition, a dolly zoom, which combines dollying in and zooming out, may be used.
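The difference between zooming and dollying can be sketched as two updates to a camera state: zooming changes only the field of view, while dollying changes only the camera position. The class, field names, scale factors, and limits below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VirtualCamera:
    distance_to_stage: float  # hypothetical units, e.g. metres
    fov_deg: float            # field of view in degrees


def zoom(camera: VirtualCamera, head_move: float,
         fov_per_unit: float = 10.0,
         min_fov: float = 20.0, max_fov: float = 90.0) -> VirtualCamera:
    """Forward head movement (positive) narrows the field of view."""
    new_fov = camera.fov_deg - fov_per_unit * head_move
    return replace(camera, fov_deg=max(min_fov, min(max_fov, new_fov)))


def dolly(camera: VirtualCamera, head_move: float,
          distance_per_unit: float = 5.0,
          min_distance: float = 1.0) -> VirtualCamera:
    """Forward head movement (positive) moves the camera toward the stage."""
    new_distance = camera.distance_to_stage - distance_per_unit * head_move
    return replace(camera, distance_to_stage=max(min_distance, new_distance))
```

Applying `dolly` with a positive movement and `zoom` with a negative movement in the same frame would correspond to the dolly zoom mentioned above.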
[0056] Next, the change in the area of the second image IM2 will be explained with reference to Figures 10 to 12. Figure 10 is a diagram showing an example of the eyelids and the second image IM2 when the eyelids are completely closed. As shown in Figure 10, the upper eyelid UE and the lower eyelid LE are in contact with each other. In this case, the second image IM2 occupies the entire display area DA of the display device 14. In this embodiment, the second image IM2 is a solid black image. The second image IM2 represents that the user U's field of vision is obstructed by the user U's own eyelids. In this example, the area of the second image IM2 is at its maximum.
[0057] Figure 11 shows an example of the eyelids and the second image IM2 when the eyelids are fully open. As shown in Figure 11, the upper eyelid UE and the lower eyelid LE are separated, and the upper eyelid UE does not cover the pupil. In this case, the second image IM2 occupies the four corners of the display area DA. The upper left second image IM2UL is displayed in the upper left of the display area DA, and the upper right second image IM2UR is displayed in the upper right of the display area DA. The upper left second image IM2UL and the upper right second image IM2UR correspond to the upper eyelid UE. The lower left second image IM2LL is displayed in the lower left of the display area DA, and the lower right second image IM2LR is displayed in the lower right of the display area DA. The lower left second image IM2LL and the lower right second image IM2LR correspond to the lower eyelid LE. In this example, the area of the second image IM2 is minimal.
[0058] Figure 12 shows an example of the eyelid and second image IM2 when the eyelid is only half open. As shown in Figure 12, the eyelid is in a state where the upper eyelid UE and lower eyelid LE are separated, but the upper eyelid UE covers the upper half of the pupil IR. In this case, the second image IM2 occupies the upper and lower parts of the display area DA. The upper second image IM2U is displayed in the upper part of the display area DA, and the lower second image IM2L is displayed in the lower part of the display area DA. The upper second image IM2U and the lower second image IM2L correspond to the upper eyelid UE and lower eyelid LE, respectively. In this example, the area of the second image IM2 is the area intermediate between the maximum and minimum areas.
[0059] As explained with reference to Figures 10 to 12, there is a correlation between the degree of eyelid opening and the area of the second image IM2. More specifically, the area of the second image IM2 decreases as the degree of eyelid opening increases. Since the area of the first image IM1 is equal to the area of the display region DA, the ratio of the area of the second image IM2 to the area of the first image IM1 decreases as the degree of eyelid opening increases.
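The relationship between eyelid opening and mask area might be sketched as follows. The sketch assumes the degree of opening is expressed on a scale from 0 (completely closed) to 100 (fully open), that the relationship is linear, and that the corner masks shown for fully open eyelids cover 10% of the first image. The linear form, the 10% figure, and the function name are illustrative assumptions; the text only states that the area decreases as the opening increases.

```python
def second_image_area_ratio(opening: float, min_ratio: float = 0.1) -> float:
    """Ratio of the area of IM2 to the area of IM1 for a given opening.

    opening: 0 (completely closed) to 100 (fully open); values outside
    this range are clipped.
    min_ratio: assumed ratio left by the corner masks when fully open.
    """
    op = max(0.0, min(100.0, opening))
    return 1.0 - (1.0 - min_ratio) * (op / 100.0)
```

The ratio is 1.0 when the eyelids are closed, so the second image covers the whole display area, and it falls to `min_ratio` when the eyelids are fully open.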
[0060] As shown in Figure 10, when user U's eyelids are completely closed, the entire display area DA becomes the second image. However, the second image IM2 may have a gap between the upper second image IM2U and the lower second image IM2L. When the areas of the upper second image IM2U and the lower second image IM2L are changed, if the upper second image IM2U and the lower second image IM2L overlap, the second image IM2 becomes a completely black solid image. Furthermore, even when user U's eyelids are completely closed, depending on the accuracy of eyelid movement detection, this may not be reflected in the second image IM2, and the gap between the upper second image IM2U and the lower second image IM2L may not be completely closed.
[0061] Figure 13 shows an example of a display image IMD, which is created by superimposing the first image IM1 and the second image IM2. In Figure 13, the second image IM2 is an image of user U with their eyelids fully open. The display image IMD is the image that is actually displayed in the display area DA and seen by user U.
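A simplified sketch of the superimposition is shown below. It represents the first image as a list of pixel rows and overlays black bands at the top and bottom, in the manner of the upper and lower second images. The corner-shaped masks used for fully open eyelids are omitted for brevity, and the function name, the 0 to 100 opening scale, and the band layout are assumptions.

```python
def compose_display_image(first_image, opening, black=0):
    """Overlay top and bottom mask bands on a copy of first_image.

    first_image: list of rows, each a list of pixel values; its size is
    never changed, only partly covered.
    opening: eyelid opening from 0 (closed) to 100 (fully open).
    """
    height = len(first_image)
    op = max(0.0, min(100.0, opening))
    visible_rows = round(height * op / 100.0)
    top_band = (height - visible_rows) // 2   # rows covered from the top
    bottom_start = top_band + visible_rows    # first row covered at the bottom
    display = []
    for y, row in enumerate(first_image):
        if y < top_band or y >= bottom_start:
            display.append([black] * len(row))  # pixel hidden by the mask
        else:
            display.append(list(row))           # pixel of the first image
    return display
```

For an 8-row image and an opening of 50, the top two and bottom two rows are masked and the middle four rows show the first image.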
[0062] Furthermore, the second image IM2 is not limited to a solid black image. The second image IM2 may be any image that obstructs user U's view of the first image IM1, such as a solid gray or other dark-colored image. The second image IM2 may also be an image obtained by blurring the portion of the first image IM1 that it overlaps.
[0063] Generally, a human blink is said to last 100 to 150 milliseconds. Assuming that the eyelids close and open at the same speed within a single blink, closing and opening are each estimated to take approximately 50 to 75 milliseconds. Therefore, the area of the second image IM2 also changes over approximately 50 to 75 milliseconds. The explanation referring to Figures 10 to 12 assumed that the eyelids were stationary, but the area of the second image IM2 changes in sync with the movement of user U's eyelids.
[0064] In reality, the change in the area of the second image IM2 is delayed by approximately 30 to 60 milliseconds relative to the movement of the user U's eyelids due to system factors such as the image information transfer speed and image analysis time. In this embodiment, in addition to the delay caused by system factors, lag processing is performed to intentionally delay the change in the area of the second image IM2. In this specification, the intentional delay given to the change in the area of the second image IM2 will be referred to as lag. As will be explained in detail below, the purpose of lag processing is to smooth the change in the area of the second image IM2 by slowing down the response to disturbances and to give the user U a greater sense of immersion in the digital content.
[0065] Figure 14 is a schematic diagram illustrating the degree of eyelid opening. As shown in Figure 14, the degree of eyelid opening is indicated by the eyelid opening degree Op. The degree of opening Op is set to "0" when the eyelid is completely closed, "50" when the eyelid is half open, and "100" when the eyelid is completely open.
[0066] Lag processing is performed as follows: If Op is the actual eyelid opening captured for each frame of the image, and Op_ave is the virtual eyelid opening calculated for each frame by this control, then Op_ave is expressed by equation (1) below.
[0067] Op_ave = K × (Previous value of Op_ave) + (1 - K) × (Current value of Op) ... (1)
[0068] Here, the coefficient K in equation (1) is a value less than "1". The larger the value of coefficient K, the slower the response to disturbances. In this embodiment, the value of coefficient K is set to "0.8", but the value of coefficient K is not particularly limited.
[0069] The following explanation uses the example of user U opening their eyelids from a completely closed state. Assuming one frame time is 15 milliseconds, and the time from when user U's eyelids are completely closed until they are fully open is 60 milliseconds, it takes 4 frames for user U's eyelids to go from completely closed to completely open. If the speed of eyelid opening is constant, the opening degree Op will be "0" in the first frame, "25" in the second frame, "50" in the third frame, "75" in the fourth frame, and "100" in the fifth frame. For simplicity, the opening degree Op will remain "100" from the sixth frame onward.
[0070] In the first frame after system startup, the "previous value of Op_ave" in equation (1) is set to "0". In the first frame, Op_ave = 0.8 × 0 + 0.2 × 0 = 0. In the second frame, the eyelid opening Op of user U is 25, so Op_ave = 0.8 × 0 + 0.2 × 25 = 5. In the third frame, the eyelid opening Op of user U is 50, so Op_ave = 0.8 × 5 + 0.2 × 50 = 14. In the fourth frame, the eyelid opening Op of user U is 75, so Op_ave = 0.8 × 14 + 0.2 × 75 = 26.2.
[0071] In the 5th frame, the eyelid opening Op of user U is 100, so Op_ave = 0.8 × 26.2 + 0.2 × 100 = 40.96. In the 6th frame, the eyelid opening Op of user U is 100, so Op_ave = 0.8 × 40.96 + 0.2 × 100 = 52.768. Calculating similarly, Op_ave is 62.2144 in the 7th frame, 69.7715 in the 8th frame, and 75.8172 in the 9th frame.
[0072] Figure 15 shows an example of the time evolution of the virtual eyelid opening degree Op_ave. As shown in Figure 15, Op_ave increases monotonically over time and asymptotically approaches 100. Note that system delays are not considered in Figure 15. Op_ave exceeds 90 at the 13th frame, so at the 12th frame Op_ave is approximately 90. Since the actual eyelid opening degree Op of user U becomes 100 at the 5th frame, Op_ave becomes approximately 90 seven frames after Op becomes 100. Therefore, 105 milliseconds after user U completely opens their eyelids, the "eyelids" in the displayed image IMD are approximately 90% open.
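The worked example in paragraphs [0069] to [0072] can be reproduced with a short script. The following is a minimal sketch, assuming K = 0.8, an initial Op_ave of 0, and the linear ramp of Op described above; the names smooth_opening and op_samples are illustrative and do not appear in the specification.

```python
def smooth_opening(op_samples, k=0.8, initial=0.0):
    """Apply equation (1): Op_ave = K * previous Op_ave + (1 - K) * current Op."""
    op_ave = initial
    out = []
    for op in op_samples:
        op_ave = k * op_ave + (1.0 - k) * op
        out.append(op_ave)
    return out


# Op per frame: 0, 25, 50, 75 in frames 1 to 4, then 100 from frame 5 onward.
op_samples = [0, 25, 50, 75] + [100] * 16
op_ave = smooth_opening(op_samples)

for frame, value in enumerate(op_ave[:9], start=1):
    print(f"frame {frame}: Op_ave = {value:.4f}")

# First frame (1-indexed) at which Op_ave exceeds 90.
first_over_90 = next(i for i, v in enumerate(op_ave, start=1) if v > 90)
print("Op_ave first exceeds 90 at frame", first_over_90)
```

With these inputs the script gives Op_ave = 26.2 at frame 4 and reports frame 13 as the first frame above 90, in line with paragraphs [0070] and [0072].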
[0073] Furthermore, the case where user U's eyelids change from a fully open state to a fully closed state can be considered in the same way as the case where user U's eyelids change from a fully closed state to a fully open state. That is, 105 milliseconds after user U completely closes their eyelids, the "eyelids" in the displayed image IMD will be about 90% closed. Therefore, the delay time due to lag processing can be said to be about 100 milliseconds. When the delay time caused by system factors and the delay time due to lag processing are combined, the total delay time is approximately 130 to 160 milliseconds.
[0074] The area of the second image IM2 is calculated from the virtual eyelid opening degree Op_ave. The larger the virtual eyelid opening degree Op_ave, the smaller the area of the second image IM2. That is, when the virtual eyelid opening degree Op_ave is "100", the area of the second image IM2 is the minimum, and when the virtual eyelid opening degree Op_ave is "0", the area of the second image IM2 is the maximum. The area S2 of the second image IM2 is calculated using a predetermined function F(Op_ave). Function F(Op_ave) is a monotonically decreasing function.
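The specification only requires that F(Op_ave) be monotonically decreasing. As one possible illustration, the sketch below assumes a linear F; the function name second_image_area, the min_area parameter, and the 1920 x 1080 display size are hypothetical choices, not values from the specification.

```python
def second_image_area(op_ave, display_area, min_area=0.0):
    """Hypothetical F(Op_ave): linear and monotonically decreasing.

    Returns display_area when op_ave is 0 and min_area when op_ave is 100.
    """
    op = max(0.0, min(100.0, op_ave))  # clamp to the valid range of Op_ave
    return display_area - (display_area - min_area) * op / 100.0


display_area = 1920 * 1080  # assumed size of display area DA in pixels
for value in (0, 50, 90, 100):
    print(value, second_image_area(value, display_area))
```

Any other monotonically decreasing mapping, for example an eased curve, could be substituted without changing the lag processing itself.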
[0075] The above explanation assumes that the time change of user U's eyelid opening is linear, but in reality, the time change of user U's eyelid opening is not necessarily linear. Furthermore, even if the time change of user U's eyelid opening is linear, it can become nonlinear depending on the accuracy of eyelid opening detection. Even if the detected eyelid opening Op changes irregularly, the "eyelid" in the displayed image IMD changes smoothly due to lag processing, so the displayed image IMD is an image that does not feel unnatural to user U.
[0076] Furthermore, according to the above lag processing, the virtual eyelid opening degree Op_ave asymptotically approaches 100, but does not reach 100. Therefore, even if user U fully opens their eyelids, strictly speaking, the "eyelids" in the displayed image IMD are not fully open. However, since user U cannot confirm whether the second image IM2 in the displayed image IMD is about 90% open or fully open, it is unlikely that this will cause user U any discomfort. The above lag processing is performed by the area determination unit 114.
[0077] Furthermore, in the above embodiment, an example of performing lag processing using the degree of eyelid opening Op as an indicator was described, but as another method of lag processing, lag processing may be performed using the degree to which the eyelids are closed as an indicator. That is, the degree to which the eyelids are closed is "0" when the eyelids are fully open, "50" when the eyelids are half open, and "100" when the eyelids are completely closed.
[0078] With this alternative lag processing method, the virtual degree of eyelid closure asymptotically approaches 100 but does not reach it, so even if user U completely closes their eyelids, strictly speaking, the "eyelids" in the displayed image IMD will not be completely closed. However, since user U cannot see the displayed image IMD when their eyelids are closed, it is considered that the fact that the "eyelids" in the displayed image IMD are not completely closed does not affect user U's experience.
[0079] Furthermore, according to the inventor's findings, a delay time of 100 to 160 milliseconds is preferable from the user U's perspective, but the delay time is not limited to 100 to 160 milliseconds. The delay time may be shorter than 100 milliseconds or longer than 160 milliseconds. However, if the delay time is shorter than 100 milliseconds, user U may not easily notice that the second image IM2 in the displayed image IMD changes when they open their eyelids. Also, if the delay time is longer than 200 milliseconds, user U tends to lose the sense that the change in their own eyelids and the change in the second image IM2 visible when they open their eyelids are synchronized.
[0080] 1.1.3. Server Configuration
Figure 16 is a block diagram showing an example configuration of the server 20 in Figure 1. As shown in Figure 16, the server 20 comprises a processing unit 21, a storage device 22, and a communication device 23. Each element of the server 20 is interconnected by one or more buses for communicating information.
[0081] The processing unit 21 is a processor that controls the entire server 20 and is configured, for example, using one or more chips. The processing unit 21 is configured using a central processing unit (CPU) that includes, for example, interfaces with peripheral devices, arithmetic units, registers, etc. Some or all of the functions of the processing unit 21 may be implemented by hardware such as a DSP, ASIC, PLD, FPGA, etc. The processing unit 21 executes various processes in parallel or sequentially.
[0082] The storage device 22 is a recording medium that can be read from and written to by the processing unit 21. The storage device 22 includes, for example, non-volatile memory and volatile memory. Non-volatile memory is, for example, ROM, EPROM, and EEPROM. Volatile memory is, for example, RAM.
[0083] The storage device 22 stores multiple programs, including the control program PR2 for execution by the processing unit 21, and multiple digital content DCSs. The storage device 22 also functions as the work area for the processing unit 21. The control program PR2 is a program that controls the entire processing unit 21.
[0084] The communication device 23 is hardware that acts as a transmitting and receiving device for communicating with other devices. The communication device 23 is also called, for example, a network device, network controller, network card, or communication module. The communication device 23 may be equipped with a connector for wired connection and an interface circuit corresponding to the connector. The communication device 23 may also be equipped with a wireless communication interface. Examples of connectors and interface circuits for wired connection include products compliant with wired LAN, IEEE1394, and USB. Examples of wireless communication interfaces include products compliant with wireless LAN and Bluetooth®.
[0085] The processing unit 21 functions as a device that performs the following processes, for example, by reading and executing the control program PR2 from the storage device 22.
[0086] The processing unit 21 manages a website that presents user U with information about multiple digital content DCSs stored in the storage device 22. Through this website, user U can select or search for digital content they wish to view from among the multiple digital content DCSs. When user U requests to view the selected or searched digital content, the processing unit 21 distributes the requested digital content to the image processing apparatus 10.
[0087] 1.2. Operation of the image processing apparatus according to the first embodiment
1.2.1. Operation of the Processing Unit
Figure 17 is a flowchart illustrating an example of the image processing operation of the processing unit 11 shown in Figure 2. The image processing operation of the processing unit 11 will be explained below with reference to Figure 17.
[0088] In step S11, the processing unit 11 analyzes the captured image Gx by functioning as an analysis unit 111. Specifically, the processing unit 11, by functioning as an analysis unit 111, analyzes the movement of the user U's upper body based on the captured image Gx using well-known motion capture technology, and outputs the analysis results as head movement information, eyelid information, and user gaze information.
[0089] In step S12, the processing unit 11 functions as an acquisition unit 112 to acquire head movement information and eyelid information from the analysis unit 111.
[0090] In step S13, the processing unit 11, acting as an image determination unit 113, determines the first image IM1 based on head movement information. The first image IM1 is an image within the field of view of the virtual camera VC. The field of view of the virtual camera VC changes according to the movement of the user U's head.
[0091] In step S14, the processing unit 11, functioning as an area determination unit 114, determines the area of the second image IM2 based on eyelid information. In this step, the processing unit 11 performs lag processing to smooth the change in the area S2 of the second image IM2 and to give the user U a greater sense of immersion in the digital content.
[0092] In step S15, the processing unit 11, functioning as a display control unit 115, displays a display image IMD, which is a superimposed image of the first image IM1 and the second image IM2, on the display device 14, and then terminates this routine. In this step, the processing unit 11 renders the first image IM1 determined in step S13, and displays the rendered first image IM1 and second image IM2 superimposed on the display device 14.
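The flow of steps S11 to S15 can be pictured as one routine executed per captured frame. The following is only a structural sketch: the class FramePipeline, the Analysis record, and the dictionary standing in for the captured image Gx are hypothetical, and the analysis and rendering steps are reduced to placeholders.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    head_yaw_deg: float    # stand-in for head movement information
    eyelid_opening: float  # Op, 0 (closed) to 100 (fully open)


class FramePipeline:
    """Sketch of steps S11 to S15; all helper logic is placeholder."""

    def __init__(self, k=0.8):
        self.k = k
        self.op_ave = 0.0  # previous value of Op_ave, 0 at start-up

    def analyze(self, captured):              # S11: analysis unit 111
        return Analysis(captured["yaw"], captured["op"])

    def acquire(self, analysis):              # S12: acquisition unit 112
        return analysis.head_yaw_deg, analysis.eyelid_opening

    def determine_first_image(self, yaw):     # S13: image determination unit 113
        return {"camera_yaw_deg": yaw}

    def determine_area(self, op):             # S14: area determination unit 114
        # Lag processing per equation (1), then an assumed linear mapping.
        self.op_ave = self.k * self.op_ave + (1 - self.k) * op
        return 1.0 - self.op_ave / 100.0      # covered fraction of display area DA

    def display(self, first_image, covered):  # S15: display control unit 115
        return {"first_image": first_image, "covered_fraction": covered}

    def step(self, captured):
        head, op = self.acquire(self.analyze(captured))
        first_image = self.determine_first_image(head)
        covered = self.determine_area(op)
        return self.display(first_image, covered)


pipeline = FramePipeline()
print(pipeline.step({"yaw": 10.0, "op": 25}))
```

Keeping Op_ave as state on the pipeline object reflects the fact that equation (1) needs the previous frame's value.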
[0093] 1.3. Effects of the First Embodiment
According to the above description, the image processing apparatus 10 according to the first embodiment comprises an acquisition unit 112, a display control unit 115, and an area determination unit 114. The acquisition unit 112 acquires eyelid information indicating the movement of the user U's eyelids. The display control unit 115 displays a display image IMD on the display device 14, which is obtained by superimposing a first image IM1 with a fixed display area and a second image IM2 that obscures at least a part of the first image IM1.
[0094] The area determination unit 114 determines, based on eyelid information, the area S2 of the first image IM1 that is obscured by the second image IM2. The area determination unit 114 maximizes the area S2 when the user U's eyelids are closed, and decreases the area S2 after the first time point when the user U's eyelids begin to open.
[0095] In this embodiment, a second image IM2, whose area changes in accordance with the movement of the user U's eyelids, is displayed on the display device 14 as a virtual eyelid. The area S2 of the second image IM2 is perceived by the user U as opening with a delay from the moment the user U opens their eyelids, so the user U perceives the movement of the virtual eyelids in the displayed image IMD as being synchronized with the movement of the user's own eyelids. Therefore, the user U feels as if the virtual world of the displayed image IMD is unfolding before their eyes. Thus, in this embodiment, even when the user U experiences virtual reality without using VR equipment, the sense of immersion in virtual reality can be enhanced.
[0096] Furthermore, the first image, IM1, shows the field of view of a virtual camera VC placed within the virtual space VS. The position and orientation of the virtual camera VC are linked to the movement of the user U.
[0097] According to this embodiment, the angle of the first image IM1 can be changed in conjunction with the movement of user U, thereby further enhancing user U's sense of immersion in the displayed image IMD.
[0098] Furthermore, according to the image processing method of the first embodiment, first information indicating the movement of the user U's eyelids is acquired, a display image IMD is displayed on the display device 14 by superimposing a first image IM1 with a fixed display area and a second image IM2 that obscures at least a part of the first image IM1, and based on the first information, the area S2 obscured by the second image IM2 from the first image IM1 is determined, the area S2 is maximized when the user U's eyelids are closed, and the area S2 is decreased after the first time point when the user U's eyelids begin to open.
[0099] In this embodiment, a second image IM2, whose area changes in accordance with the movement of the user U's eyelids, is displayed on the display device 14 as a virtual eyelid. The area S2 of the second image IM2 is perceived by the user U as opening with a delay from the moment the user U opens their eyelids, so the user U perceives the movement of the virtual eyelids in the displayed image IMD as being synchronized with the movement of the user's own eyelids. Therefore, the user U feels as if the virtual world of the displayed image IMD is unfolding before their eyes. Thus, in this embodiment, even when the user U experiences virtual reality without using VR equipment, the sense of immersion in virtual reality can be enhanced.
[0100] 2. Second Embodiment
The configuration of the image processing apparatus according to the second embodiment of the present invention will be described below with reference to Figures 18 to 20. The image processing apparatus according to the second embodiment differs from the image processing apparatus 10 according to the first embodiment in that it temporarily increases the brightness of the displayed image IMD when the user U opens their eyelids.
[0101] 2.1. Configuration of the Second Embodiment
2.1.1. Overall Structure
Figure 18 is a block diagram showing an example configuration of the image processing apparatus 10A according to the second embodiment. The image processing apparatus 10A comprises a processing unit 11A, a storage device 12A, a communication device 13, a display device 14, an input device 15, an imaging device 16, and a sound output device 17.
[0102] The processing unit 11A is a processor that controls the entire image processing apparatus 10A, and is configured, for example, using one or more chips. The processing unit 11A is configured, for example, using a central processing unit (CPU) that includes interfaces with peripheral devices, an arithmetic unit, and registers. Some or all of the functions of the processing unit 11A may be implemented by hardware such as a DSP, ASIC, PLD, FPGA, etc. The processing unit 11A executes various processes in parallel or sequentially.
[0103] The storage device 12A is a recording medium that can be read from and written to by the processing unit 11A. The storage device 12A includes, for example, non-volatile memory and volatile memory. Non-volatile memory is, for example, ROM, EPROM, and EEPROM. Volatile memory is, for example, RAM.
[0104] The storage device 12A stores multiple programs, including the control program PR1A, which is executed by the processing unit 11A. The storage device 12A also functions as a work area for the processing unit 11A.
[0105] The communication device 13, display device 14, input device 15, imaging device 16, and sound output device 17 have the same configuration as the communication device 13, display device 14, input device 15, imaging device 16, and sound output device 17 according to the first embodiment, so their description is omitted.
[0106] The processing unit 11A functions as an analysis unit 111, an acquisition unit 112, an image determination unit 113, an area determination unit 114, a display control unit 115, and an adjustment unit 116, for example, by reading and executing the control program PR1A from the storage device 12A.
[0107] The analysis unit 111, acquisition unit 112, image determination unit 113, area determination unit 114, and display control unit 115 according to the second embodiment have the same configuration as the analysis unit 111, acquisition unit 112, image determination unit 113, area determination unit 114, and display control unit 115 according to the first embodiment, so their explanation is omitted.
[0108] The adjustment unit 116 adjusts the luminance of the first image IM1, after the first time point, from a second luminance, which is lower than the reference first luminance, to a third luminance, which is higher than the first luminance, and then adjusts it so that it transitions from the third luminance to the first luminance by the third time point. Luminance is an example of brightness. The adjustment unit 116 adjusts the brightness of the first image IM1 according to the degree of eyelid opening from the first time point onward.
[0109] Note that lightness may be adjusted instead of luminance. That is, the adjustment unit 116 may adjust the lightness of the first image IM1 from a second lightness lower than the reference first lightness to a third lightness higher than the first lightness after the first time point, and then adjust it from the third lightness to the first lightness by the third time point. Lightness is another example of brightness.
[0110] Figure 19 is a diagram illustrating the change in brightness of the first image IM1 processed by the processing unit 11A according to the second embodiment. In Figure 19, the upper panel shows the change in the eyelid opening Op over time, and the lower panel shows the change in the brightness of the first image IM1 over time. Here, the first image IM1 is a still image, and the brightness of the first image IM1 is the average brightness of the entire image. Note that, for the sake of explanation, delays due to system factors and lag processing are not considered in Figure 19.
[0111] Within the range shown in Figure 19, user U opens and closes their eyelids twice. Specifically, user U has their eyelids completely closed before time T1, and begins to open their eyelids at time T1. The degree of eyelid opening Op before time T1 is "0". User U maintains their eyelids completely open from time T2 to time T4, and begins to close their eyelids at time T4. That is, the degree of eyelid opening Op between time T2 and time T4 is "100". User U completely closes their eyelids at time T5. The change in the degree of eyelid opening Op from time T6 to T10 is the same as the change in the degree of eyelid opening Op from time T1 to T5.
[0112] The brightness of the first image IM1 is set to a second brightness B2 at time T1, when the eyelid opening Op begins to increase from "0", and the brightness of the first image IM1 increases as the opening Op increases. Here, the standard first brightness B1 refers to the original brightness of the first image IM1 when no brightness adjustment is performed. At time T2, when the opening Op becomes "100", the brightness of the first image IM1 is set to a third brightness B3, which is higher than the first brightness B1. After time T2, the brightness of the first image IM1 decreases until time T3, at which point it reaches the standard first brightness B1. Thereafter, the brightness is maintained at the first brightness B1 until time T4, when the eyelids begin to close.
[0113] As shown in Figure 19, in this embodiment, for the sake of simplicity, an example is shown in which the brightness of the first image IM1 reaches the maximum third brightness B3 at time T2 when the eyelid opening Op is "100". However, the time at which the brightness of the first image IM1 reaches the third brightness B3 may be later or earlier than time T2. Note that the actual time at which the brightness of the first image IM1 reaches the third brightness will be delayed due to system delays and lag processing.
[0114] Temporarily raising the brightness above the standard first brightness B1 in this way can enhance user U's sense of immersion. The brightness of the first image IM1 may also be held at the third brightness B3 for a predetermined period of time and then reduced to the standard first brightness B1.
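The brightness profile of Figure 19 can be sketched as two small functions, one for the opening phase (T1 to T2) and one for the settling phase (T2 to T3). Linear interpolation and the numeric values of B1, B2, B3, T2, and T3 used below are assumptions for illustration only; the specification does not fix the shape of the curve.

```python
def brightness_while_opening(op, b2, b3):
    """Brightness between T1 and T2: rises from B2 (Op = 0) to B3 (Op = 100)."""
    op = max(0.0, min(100.0, op))
    return b2 + (b3 - b2) * op / 100.0


def brightness_after_open(t, t2, t3, b1, b3):
    """Brightness from T2 onward: falls from B3 to B1 by T3, then stays at B1."""
    if t <= t2:
        return b3
    if t >= t3:
        return b1
    frac = (t - t2) / (t3 - t2)
    return b3 + (b1 - b3) * frac


# Assumed example values: B1 is the unadjusted brightness, B2 < B1 < B3.
B1, B2, B3 = 1.0, 0.4, 1.3
T2, T3 = 0.06, 0.36  # seconds, hypothetical

for op in (0, 50, 100):
    print("opening", op, brightness_while_opening(op, B2, B3))
for t in (0.06, 0.21, 0.36, 0.50):
    print("settling", t, brightness_after_open(t, T2, T3, B1, B3))
```

Holding B3 for a predetermined period before the decay, as the paragraph above allows, would amount to shifting t2 later in brightness_after_open.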
[0115] 2.2. Operation of the image processing apparatus according to the second embodiment
2.2.1. Operation of the Processing Unit
Figure 20 is a flowchart illustrating an example of the image processing operation of the processing unit 11A shown in Figure 18. The image processing operation of the processing unit 11A will be described below with reference to Figure 20. In Figure 20, the same reference numerals are used for steps that are the same as those in Figure 17.
[0116] In step S11, the processing unit 11A analyzes the captured image Gx by functioning as an analysis unit 111. Specifically, the processing unit 11A, by functioning as an analysis unit 111, analyzes the movement of the user U's upper body based on the captured image Gx using well-known motion capture technology, and outputs the analysis results as head movement information, eyelid information, and user gaze information.
[0117] In step S12, the processing unit 11A functions as an acquisition unit 112 to acquire head movement information and eyelid information from the analysis unit 111.
[0118] In step S13, the processing unit 11A, functioning as an image determination unit 113, determines the first image IM1 based on head movement information. The first image IM1 is an image within the field of view of the virtual camera VC. The field of view of the virtual camera VC changes according to the movement of the user U's head.
[0119] In step S14, the processing unit 11A, functioning as an area determination unit 114, determines the area of the second image IM2 based on eyelid information. In this step, the processing unit 11A performs lag processing to smooth the changes in the area S2 of the second image IM2 and to give the user U a greater sense of immersion in the digital content.
[0120] In step S21, the processing unit 11A functions as an adjustment unit 116 to adjust the brightness of the first image IM1 based on eyelid information. Specifically, as shown in Figure 19, the processing unit 11A uniformly changes the brightness of the entire first image IM1 according to the degree of eyelid opening Op.
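One simple way to change the brightness of the entire first image uniformly is to multiply every pixel by a single gain. The sketch below assumes a grayscale image stored as nested lists with values from 0 to 255 and takes the average pixel value as the image brightness; the function name adjust_brightness is hypothetical.

```python
def adjust_brightness(pixels, target_mean):
    """Scale every pixel by one common gain so the mean approaches target_mean.

    Values are clipped at 255, so the resulting mean can fall slightly short
    of target_mean when bright pixels saturate.
    """
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    if mean == 0:
        return [row[:] for row in pixels]  # an all-black image cannot be scaled
    gain = target_mean / mean
    return [[min(255.0, v * gain) for v in row] for row in pixels]


image = [[100, 200], [50, 50]]  # mean brightness 100
brighter = adjust_brightness(image, 120)
print(brighter)
```

Because the same gain is applied everywhere, relative contrast within the image is preserved except where clipping occurs.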
[0121] In step S15, the processing unit 11A, functioning as a display control unit 115, displays a display image IMD, which is a superimposed image of the first image IM1 and the second image IM2, on the display device 14, and then terminates this routine. In this step, the processing unit 11A renders the first image IM1 determined in step S13, and displays the rendered first image IM1 and second image IM2 superimposed on the display device 14.
[0122] 2.3. Effects of the Second Embodiment
According to the above description, the image processing apparatus 10A according to the second embodiment comprises an analysis unit 111, an acquisition unit 112, an image determination unit 113, an area determination unit 114, a display control unit 115, and an adjustment unit 116.
[0123] The adjustment unit 116 adjusts the brightness of the first image IM1 from a second brightness lower than the standard first brightness to a third brightness higher than the first brightness, and then adjusts it so that it transitions from the third brightness to the first brightness by the third time point. The adjustment unit 116 adjusts the brightness of the first image IM1 according to the degree of eyelid opening Op of the user U from the first time point onward.
[0124] In this configuration, as user U opens their eyelids, user U's field of vision gradually brightens from darkness, passes through a state that user U perceives as dazzling, and then fades in so that the stage ST becomes visible, thereby further enhancing user U's sense of immersion.
[0125] Furthermore, the adjustment unit 116 adjusts the overall brightness of the first image IM1.
[0126] According to this embodiment, the fade-in effect can be further enhanced.
[0127] 3. Third Embodiment
The configuration of the image processing apparatus according to the third embodiment of the present invention will be described below with reference to Figures 21 to 23. The image processing apparatus according to the third embodiment differs from the image processing apparatus 10 according to the first embodiment and the image processing apparatus 10A according to the second embodiment in that, when user U perceives that they have made eye contact with a performer, the performer takes an action suggesting that the performer has noticed the presence of user U.
[0128] 3.1. Configuration of the Third Embodiment
3.1.1. Configuration of the Image Processing Apparatus
Figure 21 is a block diagram showing an example configuration of the image processing apparatus 10B according to the third embodiment. The image processing apparatus 10B includes a processing unit 11B, a storage device 12B, a communication device 13, a display device 14, an input device 15, an imaging device 16, and a sound output device 17.
[0129] The processing unit 11B is a processor that controls the entire image processing apparatus 10B, and is configured, for example, using one or more chips. The processing unit 11B is configured, for example, using a central processing unit (CPU) that includes interfaces with peripheral devices, an arithmetic unit, and registers. Some or all of the functions of the processing unit 11B may be implemented by hardware such as a DSP, ASIC, PLD, FPGA, etc. The processing unit 11B executes various processes in parallel or sequentially.
[0130] The storage device 12B is a recording medium that can be read from and written to by the processing unit 11B. The storage device 12B includes, for example, non-volatile memory and volatile memory. Non-volatile memory is, for example, ROM, EPROM, and EEPROM. Volatile memory is, for example, RAM.
[0131] The storage device 12B stores multiple programs, including the control program PR1B, which is executed by the processing unit 11B. The storage device 12B also functions as a work area for the processing unit 11B.
[0132] The communication device 13, display device 14, input device 15, imaging device 16, and sound output device 17 have the same configuration as the communication device 13, display device 14, input device 15, imaging device 16, and sound output device 17 according to the first embodiment, so their description is omitted.
[0133] The processing unit 11B functions as an analysis unit 111, an acquisition unit 112B, an image determination unit 113, an area determination unit 114, a display control unit 115B, and a gaze determination unit 117, for example, by reading and executing the control program PR1B from the storage device 12B.
[0134] The analysis unit 111, image determination unit 113, and area determination unit 114 have the same configuration as the analysis unit 111, image determination unit 113, and area determination unit 114 according to the first embodiment, so their description is omitted.
[0135] The acquisition unit 112B acquires head movement information, arm movement information, eyelid information, and user gaze information, as well as performer gaze information, performer position information, and virtual camera position information. Performer gaze information is information about the gaze of performer PF1. Performer gaze information is pre-included in the digital content. Performer position information is information about the position of performer PF1 in the virtual space VS. Performer position information is pre-included in the digital content. Virtual camera position information is information about the position of virtual camera VC in the virtual space VS. The position of virtual camera VC is defined, for example, as the position of the center O_V of the imaging plane IS of virtual camera VC.
[0136] The gaze determination unit 117 determines, based on the performer gaze information, the performer position information, and the virtual camera position information, whether the gaze LV of performer PF1 intersects with a predetermined range PR including the position of the virtual camera VC for a first time or longer. Figure 22 is a diagram illustrating the predetermined range PR including the position of the virtual camera VC. As shown in Figure 22, the predetermined range PR is defined as a sphere with radius R centered at the center O_V of the imaging plane IS of the virtual camera VC.
[0137] The radius R may be a predetermined value, or it may be a value corresponding to the distance between the performer PF1 and the virtual camera VC. More specifically, it is preferable that the radius R be larger as the distance between the performer PF1 and the virtual camera VC increases. The first time is set to, for example, 1 second. If the time during which the gaze LV of performer PF1 intersects with the predetermined range PR is less than the first time, it is possible that the gaze LV has simply passed through the predetermined range PR, so it is preferable to require that the intersection last for the first time or longer.
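Geometrically, checking whether the gaze LV intersects the predetermined range PR is a ray-sphere intersection test. The sketch below assumes that the performer gaze information can be expressed as a direction vector from the performer's position; the function names, the linear rule for R, and its base and per_unit parameters are hypothetical.

```python
import math


def gaze_hits_range(performer_pos, gaze_dir, camera_pos, radius):
    """Return True if the gaze ray from performer_pos along gaze_dir passes
    through the sphere of the given radius centered at camera_pos."""
    norm = math.sqrt(sum(c * c for c in gaze_dir))
    if norm == 0.0:
        return False  # no usable gaze direction
    d = [c / norm for c in gaze_dir]
    to_cam = [c - p for c, p in zip(camera_pos, performer_pos)]
    dist_sq = sum(c * c for c in to_cam)
    t = sum(a * b for a, b in zip(to_cam, d))  # projection onto the gaze direction
    if t < 0.0:
        # Sphere center lies behind the performer: hit only if already inside it.
        return dist_sq <= radius * radius
    closest_sq = dist_sq - t * t  # squared distance from center to the ray
    return closest_sq <= radius * radius


def range_radius(performer_pos, camera_pos, base=0.1, per_unit=0.05):
    """Hypothetical rule: R grows linearly with the performer-to-camera distance."""
    return base + per_unit * math.dist(performer_pos, camera_pos)


performer = (0.0, 0.0, 0.0)
camera = (0.0, 0.0, 5.0)
r = range_radius(performer, camera)
print(gaze_hits_range(performer, (0.0, 0.0, 1.0), camera, r))
print(gaze_hits_range(performer, (1.0, 0.0, 0.0), camera, r))
```

Scaling R with distance keeps the angular size of the target roughly constant, which is one way to realize the preference stated above for a larger R at greater distances.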
[0138] Referring again to Figure 21, the display control unit 115B causes performer PF1 to perform a wink action if the gaze determination unit 117 determines that the gaze LV of performer PF1 intersects with the predetermined range PR for the first time or longer. After the wink action is performed, the display control unit 115B displays a display image IMD, which is a superimposed image of the first image IM1 and the second image IM2, on the display device 14.
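The duration condition can be tracked with a small per-frame timer that resets whenever the gaze leaves the range. This is a minimal sketch assuming a first time of 1 second and a fixed frame interval; the class name GazeDwellTimer and its interface are hypothetical.

```python
class GazeDwellTimer:
    """Tracks how long the gaze has continuously intersected the range PR."""

    def __init__(self, first_time_s=1.0):
        self.first_time_s = first_time_s
        self.elapsed_s = 0.0

    def update(self, intersects, dt_s):
        """Advance by one frame; return True once the dwell reaches first_time_s."""
        if intersects:
            self.elapsed_s += dt_s
        else:
            self.elapsed_s = 0.0  # a passing glance resets the count
        return self.elapsed_s >= self.first_time_s


timer = GazeDwellTimer(first_time_s=1.0)
for frame in range(5):
    if timer.update(intersects=True, dt_s=0.25):
        print("trigger wink action at frame", frame)
        break
```

A caller would typically also suppress repeated triggers while the gaze stays inside the range; that policy is left out of the sketch.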
[0139] 3.2. Operation of the image processing apparatus according to the third embodiment
3.2.1. Operation of the Processing Unit
Figure 23 is a flowchart illustrating an example of the image processing operation of the processing unit 11B shown in Figure 21. The image processing operation of the processing unit 11B will be described below with reference to Figure 23. In Figure 23, the same reference numerals are used for steps that are the same as those in Figure 17.
[0140] In step S11, the processing unit 11B analyzes the captured image Gx by functioning as an analysis unit 111. Specifically, the processing unit 11B, by functioning as an analysis unit 111, analyzes the movement of the user U's upper body based on the captured image Gx using well-known motion capture technology, and outputs the analysis results as head movement information, eyelid information, and user gaze information.
[0141] In step S12, the processing unit 11B functions as an acquisition unit 112B to acquire head movement information and eyelid information from the analysis unit 111.
[0142] In step S13, the processing unit 11B, functioning as an image determination unit 113, determines the first image IM1 based on head movement information. The first image IM1 is an image within the field of view of the virtual camera VC. The field of view of the virtual camera VC changes according to the movement of the user U's head.
[0143] In step S14, the processing unit 11B, functioning as an area determination unit 114, determines the area of the second image IM2 based on eyelid information. In this step, the processing unit 11B performs lag processing to smooth the change in the area S2 of the second image IM2 and to give the user U a greater sense of immersion in the digital content.
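As one way to picture the lag processing in step S14, the sketch below smooths the area S2 with a first-order lag (exponential smoothing). This filter is an assumption chosen for illustration, not the method defined in this specification; the function name `next_area`, the normalized `eyelid_openness` input (0.0 closed, 1.0 fully open), and the `time_constant` value are likewise hypothetical.

```python
import math

def next_area(current_area, eyelid_openness, max_area, dt, time_constant=0.15):
    # Eyelids closed: the second image IM2 covers its maximum area at once.
    if eyelid_openness <= 0.0:
        return max_area
    # Target area shrinks as the eyelids open (assumed linear mapping).
    target = max_area * (1.0 - min(eyelid_openness, 1.0))
    # First-order lag: move only part of the way to the target each frame.
    alpha = 1.0 - math.exp(-dt / time_constant)
    return current_area + alpha * (target - current_area)
```

Called once per frame with the frame interval `dt`, the area stays at its maximum while the eyelids are closed and then decreases gradually, rather than jumping, after they begin to open.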
[0144] In step S31, the processing unit 11B functions as an acquisition unit 112B to acquire performer gaze information, performer position information, and virtual camera position information.
[0145] In step S32, the processing unit 11B functions as a gaze determination unit 117 to determine whether the gaze LV of the performer PF1 intersects the predetermined range PR for the first time or longer.
[0146] If it is determined that the gaze LV of the performer PF1 intersects the predetermined range PR for the first time or longer, that is, if the determination result in step S32 is positive, the processing unit 11B functions as a display control unit 115B and performs the wink action processing for the performer PF1 in step S33.
[0147] In step S15, the processing unit 11B functions as a display control unit 115B and displays a display image IMD, which is a superimposed image of the first image IM1 and the second image IM2, on the display device 14, and then terminates this routine. In this step, the processing unit 11B renders the first image IM1 in which the performer PF1 performs the wink action processed in step S33, and displays the rendered first image IM1 and the second image IM2 superimposed on the display device 14. As a result, when the user U perceives that they have made eye contact with the performer PF1, the performer PF1 winks, allowing the user U to experience a high level of satisfaction.
[0148] On the other hand, if it is not determined that the gaze LV of the performer PF1 intersects the predetermined range PR for the first time or longer, that is, if the determination result in step S32 is negative, the processing unit 11B skips step S33, functions as a display control unit 115B to display a display image IMD, which is a superimposed image of the first image IM1 and the second image IM2, on the display device 14, and terminates this routine. In other words, in this case, the performer PF1 does not wink.
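The branch at step S32 can be sketched as a dwell timer that is reset whenever the gaze leaves the predetermined range PR. The Python below is a minimal sketch under assumptions: the class `GazeDwellTimer`, the function `process_frame`, and the use of step labels as return values are hypothetical, and the default of 1 second follows the example value given for the first time.

```python
from dataclasses import dataclass

@dataclass
class GazeDwellTimer:
    first_time: float = 1.0  # seconds; example value for the first time
    elapsed: float = 0.0     # continuous time the gaze has stayed inside PR

    def update(self, intersects: bool, dt: float) -> bool:
        # Accumulate time while the gaze stays in range; reset when it leaves.
        self.elapsed = self.elapsed + dt if intersects else 0.0
        return self.elapsed >= self.first_time

def process_frame(timer: GazeDwellTimer, intersects: bool, dt: float) -> list:
    # Steps executed for one pass of the routine in Figure 23.
    steps = ["S11", "S12", "S13", "S14", "S31", "S32"]
    if timer.update(intersects, dt):
        steps.append("S33")  # wink action processing
    steps.append("S15")      # display the superimposed image IMD
    return steps
```

As written, the wink processing would be selected on every pass while the gaze remains in range; an implementation may prefer to reset the timer after one wink.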
[0149] 3.3. Effects of the Third Embodiment According to the above description, the image processing apparatus 10B according to the third embodiment comprises an analysis unit 111, an acquisition unit 112B, an image determination unit 113, an area determination unit 114, a display control unit 115B, and a gaze determination unit 117. The acquisition unit 112B acquires head movement information, arm movement information, eyelid information, user gaze information, performer gaze information, performer position information, and virtual camera position information.
[0150] The gaze determination unit 117 determines whether the gaze LV of the performer PF1 intersects the predetermined range PR for the first time or longer, based on the performer gaze information, the performer position information, and the virtual camera position information. The predetermined range PR is a range that includes the position of the virtual camera VC.
[0151] If the gaze determination unit 117 determines that the gaze LV of the performer PF1 intersects the predetermined range PR for the first time or longer, the display control unit 115B causes the performer PF1 to perform an action, such as a wink, indicating that the performer PF1 has noticed the presence of the user U.
[0152] In this configuration, when user U perceives that they have made eye contact with performer PF1, performer PF1 winks, allowing user U to experience a high level of satisfaction.
[0153] 4. Variations This disclosure is not limited to the embodiments illustrated above. Specific variations are illustrated below. Two or more embodiments may be arbitrarily selected from the following examples and combined. Furthermore, the embodiments described above and the variations described below can be combined arbitrarily as long as they do not contradict each other.
[0154] 4.1. Variation 1 In each of the embodiments described above, there was one performer PF1, but the number of performers is not limited to one. Multiple performers may be on the stage ST. If there are multiple performers, any performer whose gaze intersects the predetermined range including the position of the virtual camera VC for the first time or longer may wink. Furthermore, the user U may be allowed to designate that, among the multiple performers, only their preferred performer, their so-called "favorite," winks.
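The selection in Variation 1 can be illustrated as a filter over per-performer dwell times. In this hedged sketch, the function name `performers_to_wink`, the dictionary of dwell times keyed by performer identifier, and the optional `favorites` set are all assumptions introduced for illustration.

```python
def performers_to_wink(dwell_seconds, first_time=1.0, favorites=None):
    # dwell_seconds maps a performer id to the continuous time (in seconds)
    # that performer's gaze has stayed inside the predetermined range.
    eligible = [p for p, t in dwell_seconds.items() if t >= first_time]
    if favorites is not None:
        # Restrict winks to the performers the user designated as favorites.
        eligible = [p for p in eligible if p in favorites]
    return eligible
```

For example, with dwell times of 1.2 s for PF1, 0.4 s for PF2, and 2.0 s for PF3, PF1 and PF3 qualify by default, and only PF3 qualifies when `favorites={"PF3"}`.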
[0155] 4.2. Variation 2 In each of the above embodiments, a virtual arm may be displayed in the first image IM1, based on arm movement information, in response to the user U raising their arm. Figure 24 is a diagram showing an example of a display image IMD displayed on the display device 14 by the image processing device 10 according to Variation 2. Figure 24 shows the display image IMD when the user U raises their right arm. In this example, a virtual right arm ARM is displayed in the first image IM1 from the user's perspective. The virtual right arm ARM may be holding cheering goods such as a glow stick or a fan. The cheering goods may be included as standard with the digital content, or the user U may purchase the cheering goods. In the example shown in Figure 24, the virtual right arm ARM is holding a glow stick GS.
[0156] 4.3. Variation 3 In the embodiments described above, an example was given in which one imaging device 16 is used to track user U. However, multiple imaging devices may be used to track user U. Using multiple imaging devices enables more accurate tracking. In addition, in the embodiments described above, only an imaging device 16 such as a webcam was used as a motion detection device to detect the movement of user U's head, gaze, and eyelids. However, eye trackers, acceleration sensors, gyroscopes, etc. may also be used as motion detection devices.
[0157] The eye tracker tracks the movement of the user U's left and right eyeballs and detects the movement of the user U's left and right eyelids. The eye tracker is installed above or below the display device 14, similar to a webcam. The eye tracker may also be worn on the user U's head as a glasses-type wearable device. The acceleration sensor and the gyroscope are worn on the user U's head as glasses-type wearable devices.
[0158] 4.4. Variation 4
[0159] In the second embodiment, the adjustment unit 116 is shown to uniformly change the brightness of the entire first image IM1. However, the present invention is not limited to a configuration in which the brightness of the entire first image IM1 is uniformly changed. For example, the adjustment unit 116 may change the brightness of the center of the first image IM1 so that it is higher than the brightness of the surrounding area, or it may change the brightness of a predetermined range including the position of the performer PF1 so that it is higher than the brightness outside the predetermined range.
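A non-uniform brightness adjustment of the kind described in Variation 4 can be sketched as a per-pixel gain map that is highest at the image center. The linear radial falloff, the function name `brightness_gain_map`, and the gain values are assumptions for illustration; the specification does not prescribe a particular profile.

```python
import math

def brightness_gain_map(width, height, center_gain=1.2, edge_gain=0.8):
    # Center of the image in pixel coordinates.
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    # Distance from the center to a corner; guard against a 1x1 image.
    max_dist = math.hypot(cx, cy) or 1.0
    # Gain falls off linearly from center_gain at the center to edge_gain
    # at the corners.
    return [
        [
            center_gain
            + (edge_gain - center_gain) * (math.hypot(x - cx, y - cy) / max_dist)
            for x in range(width)
        ]
        for y in range(height)
    ]
```

Multiplying each pixel of the first image IM1 by its gain brightens the center relative to the surrounding area; centering the map on the position of the performer PF1 instead would give the second alternative mentioned above.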
[0160] 4.5. Variation 5 In the third embodiment, wink action processing was performed for the performer PF1 when the gaze of the performer PF1 intersected the predetermined range for the first time or longer. However, the present invention is not limited to wink action processing. For example, the display control unit 115B may cause the performer PF1 to wave instead of winking. Alternatively, the performer PF1 may turn their face towards the user U, speak to the user U, or perform another action indicating that the performer PF1 has noticed the presence of the user U. The performer PF1 may also strike a pose as if shooting the virtual camera VC with their fingers.
[0161] 4.6. Variation 6 Although the above embodiments were described using digital content of live performances in a virtual space as an example, the digital content is not limited to live performances. The digital content may be computer games such as esports or shooting games, or experiential content such as online shopping or virtual travel.
[0162] 5. Others (1) In the embodiments described above, the storage devices 12, 12A, 12B, and 22 are exemplified by ROM and RAM, but can also be flexible disks, magneto-optical disks (e.g., compact discs, digital versatile discs, Blu-ray® discs), smart cards, flash memory devices (e.g., cards, sticks, key drives), CD-ROMs (Compact Disc-ROMs), registers, removable disks, hard disks, floppy® disks, magnetic strips, databases, servers, and other suitable storage media. The program may also be transmitted from a network, such as the communication network NET, via a telecommunications line.
[0163] (2) In the embodiments described above, the information, signals, etc. may be represented using any of the various different techniques. For example, the data, instructions, commands, information, signals, bits, symbols, chips, etc. that may be mentioned throughout the above description may be represented by voltage, current, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination thereof.
[0164] (3) In the embodiments described above, the input and output information may be stored in a specific location (e.g., memory) or managed using a management table. The input and output information may be overwritten, updated, or appended to. The output information may be deleted. The input information may be transmitted to other devices.
[0165] (4) In the embodiments described above, the determination may be made by a value represented using 1 bit (0 or 1), by a boolean value (true or false), or by a numerical comparison (for example, a comparison with a predetermined value).
[0166] (5) The processing procedures, sequences, flowcharts, etc., exemplified in the embodiments described above may be rearranged in order, as long as they do not contradict each other. For example, the methods described in this disclosure present various step elements using an exemplary order and are not limited to the specific order presented.
[0167] (6) Each function illustrated in Figures 1 to 24 is implemented by any combination of at least one of hardware and software. Furthermore, the method of implementing each functional block is not particularly limited. That is, each functional block may be implemented using one device that is physically or logically coupled, or it may be implemented using two or more physically or logically separated devices that are directly or indirectly connected (for example, using wired or wireless connections). A functional block may also be implemented by combining the above one device or the above multiple devices with software.
[0168] (7) The programs illustrated in the embodiments described above should be broadly interpreted to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, execution threads, procedures, functions, etc., whether they are called software, firmware, middleware, microcode, hardware description languages or by other names.
[0169] Furthermore, software, instructions, information, etc., may be transmitted and received via a transmission medium. For example, if software is transmitted from a website, server, or other remote source using at least one of wired technology (such as coaxial cable, fiber optic cable, twisted pair, or digital subscriber line (DSL)) and wireless technology (such as infrared or microwave), then at least one of these wired and wireless technologies is included in the definition of a transmission medium.
[0170] (8) In each of the embodiments described above, the terms “system” and “network” are used interchangeably.
[0171] (9) The information, parameters, etc. described in this disclosure may be expressed using absolute values, relative values from a given value, or other corresponding information.
[0172] (10) In the embodiments described above, the image processing devices 10, 10A, and 10B may be mobile stations (MS). A mobile station may also be referred to by those skilled in the art as a subscriber station, mobile unit, subscriber unit, wireless unit, remote unit, mobile device, wireless device, wireless communication device, remote device, mobile subscriber station, access terminal, mobile terminal, wireless terminal, remote terminal, handset, user agent, mobile client, client, or several other appropriate terms. In this disclosure, terms such as “mobile station,” “user terminal,” “user equipment (UE),” and “terminal” may be used interchangeably.
[0173] (11) In the embodiments described above, the terms “connected,” “coupled,” or any variation thereof, mean any direct or indirect connection or coupling between two or more elements, and may include the presence of one or more intermediate elements between two elements that are “connected” or “coupled” with each other. The coupling or connection between elements may be a physical coupling or connection, a logical coupling or connection, or a combination thereof. For example, “connection” may be reinterpreted as “access.” As used in this disclosure, two elements may be considered to be “connected” or “coupled” with each other using at least one of one or more wires, cables and printed electrical connections, and, in some non-limiting and non-exclusive examples, electromagnetic energy having wavelengths in the radio frequency domain, microwave domain and optical (both visible and invisible) domain.
[0174] (12) In the embodiments described above, the phrase “based on” does not mean “based solely on” unless otherwise specified. In other words, the phrase “based on” means both “based solely on” and “based at least on.”
[0175] (13) The terms “judging” and “determining” as used in this disclosure may encompass a wide variety of actions. “Judging” and “determining” may include, for example, judging, calculating, computing, processing, deriving, investigating, looking up, searching, inquiring (e.g., searching in a table, database or other data structure), and ascertaining. “Judging” and “determining” may also include, for example, receiving (e.g., receiving information), transmitting (e.g., sending information), input, output, and accessing (e.g., accessing data in memory). Furthermore, “judging” and “determining” may include regarding resolving, selecting, choosing, establishing, comparing, and the like as having “judged” or “determined.” In other words, “judging” and “determining” may include regarding some action as having “judged” or “determined.” Also, “judging (determining)” may be reinterpreted as “assuming,” “expecting,” or “considering.”
[0176] (14) Where the terms “include,” “including,” and variations thereof are used in the embodiments described above, these terms are intended to be inclusive, as is the term “comprising.” Furthermore, the term “or” as used in this disclosure is not intended to be exclusive OR.
[0177] (15) In the present disclosure, if articles are added by translation, such as a, an, and the in English, the present disclosure may include the fact that the noun following these articles is plural.
[0178] (16) In this disclosure, the term “A and B are different” may mean “A and B are different from each other.” The term may also mean “A and B are each different from C.” Terms such as “separate” and “combine” may be interpreted in the same way as “different.”
[0179] (17) Each aspect / embodiment described herein may be used individually, in combination, or switched between as needed in practice. Furthermore, notification of certain information (e.g., notification that "X is") is not limited to explicit notification, but may also be implicit (e.g., by not providing such notification).
[0180] Although the present disclosure has been described in detail above, it will be clear to those skilled in the art that the present disclosure is not limited to the embodiments described herein. The present disclosure can be implemented in modified and altered forms without departing from the intent and scope of the present disclosure as defined by the claims. Accordingly, the descriptions in the present disclosure are illustrative and not restrictive in any way.
Explanation of Symbols
[0181] 1...Image processing system, 10...Image processing device, 20...Server, 112...Acquisition unit, 113...Image determination unit, 114...Area determination unit, 115...Display control unit, 116...Adjustment unit, IM1...First image, IM2...Second image, IMD...Display image, S2...Area, U...User, VC...Virtual camera, VS...Virtual space.
Claims
1. An image processing apparatus comprising: an acquisition unit that acquires first information indicating the movement of a user's eyelids; a display control unit that displays, on a display device, a display image obtained by superimposing a first image having a fixed display area and a second image that obscures at least a portion of the first image; and an area determination unit that determines, based on the first information, the area of the first image obscured by the second image, wherein the area determination unit maximizes the area when the user's eyelids are closed and decreases the area after a first point in time when the user's eyelids begin to open.
2. The image processing apparatus according to claim 1, further comprising an adjustment unit that adjusts the brightness of the first image, wherein the adjustment unit adjusts the brightness of the first image from a second brightness that is darker than a standard first brightness to a third brightness that is brighter than the first brightness, and then adjusts the brightness so that it transitions from the third brightness to the first brightness by a third point in time.
3. The image processing apparatus according to claim 2, wherein the adjustment unit adjusts the brightness of the entire first image.
4. The image processing apparatus according to claim 1, wherein the first image is an image showing the field of view of a virtual camera placed in a virtual space, and the position and orientation of the virtual camera are linked to the movement of the user.
5. An image processing method comprising: acquiring first information indicating the movement of a user's eyelids; displaying, on a display device, a display image obtained by superimposing a first image having a fixed display area and a second image that obscures at least a portion of the first image; and determining, based on the first information, the area of the first image obscured by the second image, wherein the area is maximized when the user's eyelids are closed and is decreased after a first point in time when the user's eyelids begin to open.