Electronic device for editing visual object in image, and non-transitory computer-readable storage medium
The electronic device and storage medium use a trained model to upscale and edit visual objects in motion photos, addressing alignment issues by enhancing resolution and pose alignment to match user intent, thereby improving visual object representation.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- SAMSUNG ELECTRONICS CO LTD
- Filing Date
- 2025-10-29
- Publication Date
- 2026-06-18
Abstract
Description
Electronic device for editing visual objects within an image and non-transitory computer-readable storage medium
[0001] The following descriptions relate to an electronic device for editing visual objects within an image and a non-transitory computer-readable storage medium.
[0002] An electronic device can identify a visual object within an image to edit the visual object included in the image. For example, the electronic device can edit the visual object by modifying, deleting, or replacing the visual object identified within the image.
[0003] The information described above may be provided as related art for the purpose of aiding understanding of the present disclosure. No claim or determination is made as to whether any of the foregoing may be applied as prior art related to the present disclosure.
[0004] An electronic device is provided. The electronic device may include at least one processor comprising a processing circuit, and a memory comprising one or more storage media for storing instructions. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to identify an image containing a visual object. The image may have a first resolution. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to receive, through the electronic device, an input for editing the visual object of the image. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to identify, based on the input, a video acquired together with the image. The video frames of the video may have a second resolution lower than the first resolution. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to identify, among the video frames, a video frame containing a visual object to be used for editing the visual object of the image. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to upscale the visual object within the video frame based on identifying the video frame. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to identify an upscaled visual object of the video frame based on performing the upscaling. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to edit the visual object of the image using the upscaled visual object of the video frame.
[0005] A non-transitory computer-readable storage medium is provided. The non-transitory computer-readable storage medium may store one or more programs. The one or more programs may include instructions that, when executed by an electronic device, cause the electronic device to identify an image containing a visual object. The image may have a first resolution. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to receive, through the electronic device, an input for editing the visual object of the image. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to identify, based on the input, a video acquired together with the image. The video frames of the video may have a second resolution lower than the first resolution. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to identify, among the video frames, a video frame containing a visual object to be used for editing the visual object of the image. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to upscale the visual object within the video frame based on identifying the video frame. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to identify an upscaled visual object of the video frame based on performing the upscaling. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to edit the visual object of the image using the upscaled visual object of the video frame.
[0006] An electronic device is provided. The electronic device may include a display, at least one processor comprising a processing circuit, and a memory comprising one or more storage media for storing instructions. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to display, through the display, an image containing a visual object. The image may have a first resolution. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to receive, through the electronic device, an input for editing the visual object of the image. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to identify, based on the input, a video acquired together with the image. The video frames of the video may have a second resolution lower than the first resolution. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to identify, among the video frames, a video frame containing a visual object to be used for editing the visual object of the image. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to display, through the display, a user interface (UI) object representing the visual object of the video frame. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to upscale the visual object within the video frame. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to edit the visual object of the image using the upscaled visual object of the video frame. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to display, through the display, the image containing the edited visual object.
[0007] A non-transitory computer-readable storage medium is provided. The non-transitory computer-readable storage medium may store one or more programs. The one or more programs may include instructions that, when executed by an electronic device having a display, cause the electronic device to display, through the display, an image containing a visual object. The image may have a first resolution. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to receive, through the electronic device, an input for editing the visual object of the image. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to identify, based on the input, a video acquired together with the image. The video frames of the video may have a second resolution lower than the first resolution. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to identify, among the video frames, a video frame containing a visual object to be used for editing the visual object of the image. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to display, through the display, a user interface (UI) object representing the visual object of the video frame. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to upscale the visual object within the video frame based on input to the UI object. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to edit the visual object of the image using the upscaled visual object of the video frame. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to display, through the display, the image containing the edited visual object.
[0008] FIG. 1 is a schematic view of an exemplary electronic device.
[0009] FIGS. 2a and 2b illustrate examples of motion photos displayed through the display of an electronic device.
[0010] FIG. 3 is a flowchart illustrating a method for editing visual objects within an image of a motion photo.
[0011] FIGS. 4a and 4b illustrate examples of operations for upscaling visual objects within video frames of motion photos.
[0012] FIG. 5 illustrates an example of an operation for editing a visual object within an image of a motion photo using an upscaled visual object.
[0013] FIG. 6 is a flowchart illustrating a method for editing a visual object according to the pose of the visual object within an image of a motion photo.
[0014] FIGS. 7a and 7b illustrate examples of operations for editing a visual object according to the pose of the visual object within an image of a motion photo.
[0015] FIGS. 8a and 8b illustrate examples of a user interface for editing one or more visual objects within an image of a motion photo.
[0016] FIG. 9 is a block diagram of an electronic device in a network environment according to various embodiments.
[0017] FIG. 1 is a schematic view of an exemplary electronic device.
[0018] Referring to FIG. 1, the electronic device (101) may include at least one processor (110), a memory (120), at least one camera (130), and a display (140). The electronic device (101) may include at least a part of the electronic device (901) of FIG. 9 or correspond to at least a part of the electronic device (901) of FIG. 9.
[0019] At least one processor (110) may include a processing circuit. At least one processor (110) may include a single processor or multiple processors. At least one processor (110) may control the memory (120) and / or one or more components of the electronic device (101) (e.g., at least one camera (130) and the display (140)). For example, at least one processor (110) may include at least a part of the processor (920) of FIG. 9 or correspond to at least a part of the processor (920) of FIG. 9. For example, at least one processor (110) may include an image signal processor included in the camera module (980) of FIG. 9.
[0020] Memory (120) may store one or more programs configured to be executed individually and / or collectively by at least one processor (110). The one or more programs may include instructions. The instructions may cause an electronic device (101) to perform operations described with reference to FIGS. 2a through 8b. Memory (120) may include one or more storage media. At least some of the one or more programs may be available to manage, control, and / or execute gallery application software described below. For example, memory (120) may include at least some of the memory (930) of FIG. 9 or correspond to at least some of the memory (930) of FIG. 9.
[0021] At least one camera (130) can capture (or take) images (e.g., still images) and video. For example, at least one camera (130) may include one or more lenses, image sensors, and / or flashes. For example, at least one camera (130) may include at least a part of the camera module (980) of FIG. 9 or correspond to at least a part of the camera module (980) of FIG. 9.
[0022] The display (140) can visually provide information to the outside (e.g., a user) of the electronic device (101). For example, the display (140) may include a display panel and / or a touch sensor. For example, the display panel may be used to display visual information (e.g., images, screens, objects, a user interface (UI), a graphical user interface (GUI), and / or visual objects). For example, the display panel may have a display area capable of receiving touch input. For example, the touch sensor may be used to obtain data about an external object located on the display panel. For example, the touch sensor may be located within or on the display panel to provide an area of the display panel capable of receiving the touch input. For example, the touch sensor may be configured to obtain data about contact points on at least a portion of the area. For example, the display (140) may include at least a part of the display module (960) of FIG. 9 or correspond to at least a part of the display module (960) of FIG. 9.
[0023] The electronic device (101) can capture (or take) images (e.g., still images) and video through at least one camera (130). For example, the electronic device (101) can acquire an image and a video concurrently based on user input for capturing a visual object through at least one camera (130). According to an embodiment, the visual object may be described in various ways. For example, the visual object may be described as a person, a part of a person's body (e.g., head or face), an animal, a part of an animal's body (e.g., head or face), or an object. The image may be described as a still image containing the visual object. The video may include video frames showing the movement of the visual object. As a non-limiting example, the user input may be described as a touch input to a UI object (e.g., a shooting button) displayed through the display (140) while at least one camera (130) is active.
[0024] For example, the operation of concurrently acquiring the image and the video may be described as an operation of acquiring, based on the user input for capturing the visual object, the image and the video having a video frame corresponding to the image. For example, the image and the video frame corresponding to the image may include the visual object captured substantially simultaneously through at least one camera (130). For example, the image may be a still image acquired based on a reference point in time when the user input for capturing the visual object is received through the electronic device (101). For example, the video may include video frames acquired during a time interval including the reference point in time. According to an embodiment, the time interval may be set in various ways in relation to the reference point in time. For example, the time interval may be set as a time interval from a point in time earlier than the reference point in time by a defined time (e.g., 3 seconds) to the reference point in time. As another example, the time interval may be set as a time interval from the reference point in time to a point in time later than the reference point in time by a defined time (e.g., 3 seconds). However, the present disclosure is not limited thereto. For example, the time interval may be set as any time interval including the reference point in time.
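As a non-limiting illustration, the time interval described above may be sketched as follows. The names Frame, capture_interval, and frames_in_interval, the one-frame-per-second timestamps, and the string stand-in for pixel data are assumptions introduced only for this sketch and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    timestamp: float  # seconds on the camera clock (assumed unit)
    label: str        # stand-in for pixel data in this sketch


def capture_interval(reference_t, before_s=3.0, after_s=0.0):
    """Return (start, end) of a time interval that includes the reference point."""
    if before_s < 0 or after_s < 0:
        raise ValueError("offsets must be non-negative")
    return reference_t - before_s, reference_t + after_s


def frames_in_interval(frames, start, end):
    """Keep only the frames captured inside the interval (inclusive)."""
    return [f for f in frames if start <= f.timestamp <= end]


# Hypothetical buffer of frames; the user input arrives at t = 10.0 seconds.
buffered = [Frame(t, f"frame@{t}") for t in (6.0, 7.0, 8.0, 9.0, 10.0, 11.0)]
start, end = capture_interval(reference_t=10.0, before_s=3.0, after_s=0.0)
video = frames_in_interval(buffered, start, end)
print([f.timestamp for f in video])  # [7.0, 8.0, 9.0, 10.0]
```

Setting before_s to 0 and after_s to 3.0 would instead give the interval that starts at the reference point in time, and any other non-negative pair gives another interval that includes it.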
[0025] For example, the electronic device (101) may store the image and the video in memory (120) in association with gallery application software. For example, the image and the video stored in memory (120) may be referred to as motion photos. A method of operation in which the electronic device (101) displays the motion photos through a display (140) is described with reference to FIGS. 2a and 2b.
[0026] FIGS. 2a and 2b illustrate examples of motion photos displayed through the display of an electronic device.
[0027] Referring to FIGS. 2a and 2b, examples of an image and video of a motion photo displayed through a display (140) of an electronic device (101) are illustrated. FIG. 2a corresponds to an example of the image of the motion photo displayed through a display (140) of an electronic device (101). For example, the image of the motion photo may be described as an image (210). FIG. 2b corresponds to an example of the video of the motion photo displayed through a display (140) of an electronic device (101). For example, the video of the motion photo may include a video frame (220), a video frame (230), and a video frame (240).
[0028] The electronic device (101) can capture (or take) the image and the video of the motion photo through at least one camera (130). The image and the video of the motion photo may include a visual object. For example, the visual object may correspond to a human head. For example, an image (210) may include a visual object (211) corresponding to the head. For example, a video frame (220) may include a visual object (221) corresponding to the head. For example, a video frame (230) may include a visual object (231) corresponding to the head. For example, a video frame (240) may include a visual object (241) corresponding to the head.
[0029] The image of the motion photo may be a still image obtained based on a reference point in time when user input for capturing the visual object through at least one camera (130) is received through the electronic device (101). For example, the image (210) may be obtained based on a reference point in time when user input for capturing the visual object (211) is received through the electronic device (101). For example, the image (210) may include the visual object (211) captured at the reference point in time ('t' seconds).
[0030] The video of the motion photo may include video frames acquired during a time interval including a reference point in time when the user input for capturing the visual object through at least one camera (130) is received through the electronic device (101). For example, video frame (220), video frame (230), and video frame (240) may be acquired during a time interval including the reference point in time. For example, video frame (220) may include a visual object (221) captured at a time point ('t-2' seconds) that is 2 seconds ahead of the reference point in time ('t' seconds). For example, video frame (230) may include a visual object (231) captured at a time point ('t-1' seconds) that is 1 second ahead of the reference point in time ('t' seconds). For example, video frame (240) may include a visual object (241) captured at the reference point in time ('t' seconds).
[0031] The image of the motion photo may correspond to one of the video frames included in the video of the motion photo. For example, the image (210) may correspond to the video frame (240) among the video frame (220), video frame (230), and video frame (240). For example, the visual object (211) of the image (210) and the visual object (241) of the video frame (240) may be described as visual objects captured substantially simultaneously (e.g., at a reference point) through at least one camera (130). For example, the image (210) may be referred to as the representative image of the motion photo in that it corresponds to the video frame (240) among the video frame (220), video frame (230), and video frame (240).
[0032] The electronic device (101) can store the image and video of the motion photo in memory (120) in association with gallery application software. The image stored in memory (120) may have a higher resolution than the video frames stored in memory (120). For example, the image (210) may have a first resolution (e.g., high resolution). For example, each of the video frame (220), video frame (230), and video frame (240) may have a second resolution (e.g., low resolution) lower than the first resolution. For example, by storing the video in memory (120) at a resolution lower than the resolution of the image of the motion photo, the electronic device (101) may reduce the storage space that the motion photo occupies in memory (120) and increase the playback speed of the video of the motion photo.
[0033] The electronic device (101) can identify the image of the motion photo based on user input for searching (or accessing) (or loading) the image of the motion photo. The electronic device (101) can display the image through the display (140) based on identifying the image of the motion photo. For example, the electronic device (101) can identify the image (210) based on user input for searching the image (210). For example, the electronic device (101) can display the image (210) through the display (140) based on identifying the image (210).
[0034] The electronic device (101) may receive input for playing the video of the motion photo while the image of the motion photo is displayed through the display (140). At least one processor (110) may receive the input for playing the video through the electronic device (101). As a non-limiting example, the input may be described as a touch input to a user interface (UI) object for playing the video of the motion photo. The electronic device (101) may sequentially display the video frames of the video through the display (140) based on the input for playing the video. For example, the electronic device (101) may receive a touch input to a UI object (212) while the image (210) is displayed through the display (140). For example, the electronic device (101) can play the video of the motion photo by sequentially displaying the video frame (220), video frame (230), and video frame (240) through the display (140) based on the touch input.
[0035] For example, since the image of the motion photo is based on the reference point in time when the user input for capturing the visual object is received, the image may show the visual object with an appearance that differs from the user's intention, depending on the reference point in time. For example, an image that differs from the user's intention may be an image in which the eyes of the visual object (e.g., a person or an animal) are closed, an image containing blur due to the movement of the visual object, or an image in which the visual object has an abnormal facial expression. For example, the electronic device (101) may edit the visual object within the image of the motion photo using the video frames included in the video of the motion photo, thereby modifying the image so that the appearance of the visual object corresponds to the user's intention. A method of editing the visual object within the image of the motion photo is described with reference to FIG. 3.
[0036] FIG. 3 is a flowchart illustrating a method for editing visual objects within an image of a motion photo.
[0037] Referring to FIG. 3, in operation 301, at least one processor (110) can identify an image containing a visual object based on user input for searching (or accessing) (or loading) an image of a motion photo. For example, the motion photo may include the image and a video associated with the image. As a non-limiting example, the image may be described as a representative image of the motion photo. For example, the image and the video of the motion photo may be stored in memory (120) in association with gallery application software. As a non-limiting example, the image may be stored in memory (120) with an extension (.jpg), an extension (.gif), or an extension (.png). For example, the image stored in memory (120) may have a first resolution (e.g., high resolution). For example, the user input may be received through the electronic device (101) while the gallery application software is executed by at least one processor (110). For example, the visual object may be described as a person, a part of a person's body (e.g., head or face), an animal, a part of an animal's body (e.g., head or face), or an object.
[0038] In operation 302, at least one processor (110) may receive input for editing the visual object of the image through an electronic device (101). For example, the input for editing the visual object may be received through the electronic device (101) while the gallery application software is executed by at least one processor (110). For example, the input for editing the visual object may be described as a touch input to a UI object (e.g., a best face activation button) included in the user interface (UI) of the gallery application software.
[0039] In operation 303, at least one processor (110) can identify the video of the motion photo acquired together with the image based on the input for editing the visual object. For example, the video may include video frames. The video frames may be stored in memory (120) in association with the gallery application software. As an example without limitation, the video frames may be stored in memory (120) using a container format. For example, the video frames may be stored in the container format with an extension (.jpg), an extension (.gif), or an extension (.png), or as a compressed file using a video codec. For example, the video frames stored in memory (120) may have a second resolution (e.g., low resolution) lower than the first resolution of the image of the motion photo.
[0040] For example, at least one processor (110) may acquire the image and the video concurrently based on user input for capturing the visual object through at least one camera (130). For example, the operation of acquiring the image and the video concurrently may be described as an operation of acquiring, based on the user input for capturing the visual object, the image and the video having video frames corresponding to the image. For example, the image and the video frames corresponding to the image may include the visual object captured substantially simultaneously through at least one camera (130). For example, the image may be a still image acquired based on a reference point in time when the user input for capturing the visual object is received through the electronic device (101). For example, the video may include video frames acquired during a time interval including the reference point in time. According to an embodiment, the time interval may be set in various ways with respect to the reference point in time. For example, the time interval may be set as a time interval from a point in time earlier than the reference point in time by a defined time (e.g., 3 seconds) to the reference point in time. As another example, the time interval may be set as a time interval from the reference point in time to a point in time later than the reference point in time by a defined time (e.g., 3 seconds). However, the present disclosure is not limited thereto. For example, the time interval may be set as any time interval including the reference point in time.
[0041] In operation 304, at least one processor (110) can identify, among the video frames, a video frame containing the visual object to be used for editing the visual object of the image. For example, at least one processor (110) can identify that the video frame contains the visual object to be used for editing the visual object of the image based on identifying, among the video frames, the video frame satisfying a reference condition. The reference condition may include a first condition associated with the blinking of the eyes of the visual object, a second condition associated with the blur of the visual object, and / or a third condition associated with the orientation of the visual object. For example, the first condition may be that the eyes of the visual object are not closed within the video frame. For example, the second condition may be that a value indicating the blur of the visual object within the video frame is below a threshold value. Depending on the embodiment, the threshold value may be set in various ways. For example, the third condition may be that a value representing the pose of the visual object within the video frame is within a reference range. For example, the value may represent the pose of the visual object by representing the rotation of the visual object (e.g., yaw, pitch, and / or roll). Depending on the embodiment, the reference range may be set in various ways. In one embodiment, at least one processor (110) may assign a score to each of the video frames according to the first condition, the second condition, and the third condition included in the reference condition, and may identify a video frame having a score higher than the score of the image of the motion photo as the video frame for editing the visual object of the image.
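As a non-limiting illustration, the score-based selection of operation 304 may be sketched as follows. The one-point-per-condition weighting, the blur threshold of 0.3, the pose reference range of 20 degrees, and the names FrameAnalysis, score, and pick_frame are assumptions made for this sketch; the disclosure leaves the threshold value, the reference range, and the scoring rule open.

```python
from dataclasses import dataclass


@dataclass
class FrameAnalysis:
    name: str
    eyes_open: bool  # first condition: eyes are not closed
    blur: float      # hypothetical blur value, higher means blurrier
    yaw: float       # rotation in degrees (assumed unit)
    pitch: float
    roll: float


BLUR_THRESHOLD = 0.3   # assumed threshold value
POSE_RANGE_DEG = 20.0  # assumed reference range: within +/- 20 degrees


def score(a):
    """Assign one point per satisfied condition (assumed weighting)."""
    points = 0
    if a.eyes_open:
        points += 1
    if a.blur < BLUR_THRESHOLD:
        points += 1
    if all(abs(v) <= POSE_RANGE_DEG for v in (a.yaw, a.pitch, a.roll)):
        points += 1
    return points


def pick_frame(image, frames):
    """Return the best frame that scores higher than the image, or None."""
    base = score(image)
    better = [f for f in frames if score(f) > base]
    return max(better, key=score, default=None)


# The still image has closed eyes; one earlier frame satisfies all conditions.
image_210 = FrameAnalysis("image_210", False, 0.1, 0.0, 0.0, 0.0)
frames = [
    FrameAnalysis("frame_220", True, 0.5, 5.0, 0.0, 0.0),   # too blurry
    FrameAnalysis("frame_230", True, 0.1, 5.0, 2.0, 0.0),   # all conditions met
    FrameAnalysis("frame_240", False, 0.1, 0.0, 0.0, 0.0),  # eyes closed
]
chosen = pick_frame(image_210, frames)
print(chosen.name if chosen else None)
```

When no video frame scores higher than the image, pick_frame returns None, which corresponds to leaving the image unedited in this sketch.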
[0042] In operation 305, at least one processor (110) may upscale the visual object within the video frame based on identifying the video frame. For example, the upscaling may be used to increase the resolution of the visual object in the video frame by increasing the number of pixels representing the visual object within the video frame. For example, the upscaling may be performed by adjusting the resolution of the visual object in the video frame from the second resolution to the first resolution. For example, the upscaling may be performed to edit the visual object in the image by adjusting the resolution of the visual object in the video frame to the same resolution as the first resolution, which is the resolution of the visual object in the image. According to an embodiment, the upscaling may be performed on the entire video frame or on the visual object in the video frame.
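As a non-limiting illustration, the resolution adjustment of operation 305 may be sketched as follows for the case where only the visual object is upscaled. Nearest-neighbor resampling is used here purely as a stand-in for whatever upscaling method is employed, and the function names, the (left, top, width, height) box layout, and the toy 4x4 frame are assumptions of this sketch.

```python
def scale_factors(first_res, second_res):
    """Per-axis factors that map the second resolution onto the first."""
    (w1, h1), (w2, h2) = first_res, second_res
    return w1 / w2, h1 / h2


def crop(pixels, left, top, width, height):
    """Cut the region holding the visual object out of a frame."""
    return [row[left:left + width] for row in pixels[top:top + height]]


def upscale_nearest(pixels, out_w, out_h):
    """Nearest-neighbor resampling; a placeholder for the real upscaler."""
    in_h, in_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def upscale_object(frame_pixels, box, first_res, second_res):
    """Upscale only the visual object so that it matches the first resolution."""
    left, top, width, height = box
    sx, sy = scale_factors(first_res, second_res)
    region = crop(frame_pixels, left, top, width, height)
    return upscale_nearest(region, round(width * sx), round(height * sy))


# Toy 4x4 frame (second resolution); the image is 8x8 (first resolution).
frame = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
upscaled = upscale_object(frame, (1, 1, 2, 2), first_res=(8, 8), second_res=(4, 4))
for row in upscaled:
    print(row)
```

Passing a box that covers the entire frame corresponds to the alternative in which the upscaling is performed on the whole video frame.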
[0043] For example, at least one processor (110) can upscale the visual object within the video frame using a trained model available for image processing. For example, the trained model may represent one or more calculations to be computed by at least one processor (110). For example, the trained model may include a computational model designed to simulate the neural activity of an organism and / or a program for performing the calculations of said computational model. For example, the trained model may be trained for the upscaling of the visual object based on the application of images of different resolutions representing the same scene containing the visual object. For example, the trained model may be described as a deep learning model available for super-resolution that converts a low-resolution image into a high-resolution image (e.g., a convolutional neural network (CNN) based model or a generative adversarial network (GAN) based model).
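As a non-limiting illustration, a pair of images of different resolutions representing the same scene may be sketched as follows. Synthesizing the low-resolution counterpart by average pooling is a common practice in super-resolution work and is assumed here; the disclosure does not state how such pairs are obtained, and box_downsample and the toy pixel values are hypothetical.

```python
def box_downsample(pixels, factor):
    """Average each factor x factor block into one low-resolution pixel."""
    height, width = len(pixels), len(pixels[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")
    out = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [
                pixels[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Toy high-resolution image and its synthesized low-resolution counterpart.
high = [[0, 2, 4, 6], [2, 4, 6, 8], [8, 10, 12, 14], [10, 12, 14, 16]]
low = box_downsample(high, 2)
training_pair = (low, high)  # (model input, target output)
print(low)  # [[2.0, 6.0], [10.0, 14.0]]
```

A super-resolution model would be fitted so that its output for the first element of each pair approaches the second element.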
[0044] In one embodiment, at least one processor (110) may upscale the visual object within the video frame based on applying the video frame to the trained model. For example, the trained model may be trained for upscaling based on the application of training images having different resolutions (e.g., a first image and a second image) that include other visual objects different from the visual object. For example, the trained model may be trained for upscaling based on the application of an image having the first resolution (e.g., the first image) and an image having the second resolution (e.g., the second image). For example, each of the image having the first resolution and the image having the second resolution may include other visual objects different from the visual object included in the image and the video of the motion photo. A method of upscaling the visual object by applying the video frame to the trained model is described later with reference to FIG. 4a.
[0045] In one embodiment, the video frame may be a first video frame. For example, the video frames may include a second video frame corresponding to the image (e.g., a representative image) of the motion photo. For example, the image and the second video frame may be described as images having different resolutions that represent the same scene including the visual object. At least one processor (110) may upscale the visual object within the first video frame based on applying the first video frame, the second video frame, and the image to the trained model. For example, at least one processor (110) may further train the trained model in relation to the upscaling while the operation of the electronic device (101) is performed based on applying the second video frame and the image to the trained model. For example, the electronic device (101) may further train the trained model in relation to the upscaling of images associated with a user of the electronic device (101) by applying the second video frame and the image to the trained model. A method of upscaling the visual object by applying the first video frame, the second video frame, and the image to the trained model is described below with reference to FIG. 4b.
[0046] In operation 306, at least one processor (110) can identify an upscaled visual object of the video frame based on the upscaling being performed. For example, at least one processor (110) can identify an upscaled visual object of the video frame obtained from the trained model based on the upscaling being performed.
[0047] In operation 307, at least one processor (110) can edit the visual object of the image using the upscaled visual object of the video frame. For example, at least one processor (110) can edit the visual object of the image by applying at least a portion of the upscaled visual object of the video frame to the visual object of the image. For example, the operation of editing the visual object of the image may include the operation of modifying the visual object of the image according to the at least portion of the upscaled visual object. For example, at least one processor (110) can perform inpainting and / or outpainting on the modified visual object of the image while editing the visual object of the image. For example, at least one processor (110) can identify the difference between the orientation of the visual object within the image and the orientation of the visual object within the video frame. For example, at least one processor (110) can identify a part of the visual object (e.g., head or face) of the video frame to be used to edit the visual object of the image according to the difference. For example, at least one processor (110) can edit the visual object of the image using the identified part of the visual object included in the video frame. A method of editing the visual object according to the pose of the visual object is described later with reference to FIG. 6.
[0048] As described above, the electronic device (101) can obtain the upscaled visual object having the same resolution as the visual object included in the image of the motion photo by upscaling the visual object included in the video frame of the motion photo. For example, the electronic device (101) can modify the image so that the appearance of the visual object represented by the image corresponds to the user's intention by editing the visual object within the image of the motion photo using the upscaled visual object.
[0049] FIGS. 4a and 4b illustrate examples of operations for upscaling visual objects within video frames of motion photos.
[0050] Referring to FIG. 4a, an upscaling environment (400a) is illustrated for upscaling a visual object (411) by applying a video frame (410) included in a video of a motion photo to a trained model (401a) available for image processing. The video frame (410) may include a visual object (411) to be used to edit a visual object of an image of the motion photo among the video frames included in the video of the motion photo. The visual object (411) may correspond to a head. The trained model (401a) may be in a state trained for upscaling based on the application of images having different resolutions (e.g., a first image and a second image for training the model to be trained for upscaling) that include a visual object different from the visual object of the motion photo represented by the visual object (411).
[0051] The size of the figure shown in FIG. 4a may represent the resolution of the figure. For example, a visual object (411) of a video frame (410) may have a second resolution (e.g., low resolution). An upscaled visual object (420) may have a first resolution (e.g., high resolution).
[0052] The electronic device (101) can upscale a visual object (411) within a video frame (410) based on applying a video frame (410) to a trained model (401a). The electronic device (101) can identify an upscaled visual object (420) of a video frame (410) obtained from a trained model (401a) based on the upscaling being performed. The electronic device (101) can identify an upscaled visual object (420) having the first resolution by performing upscaling that adjusts the resolution of the visual object (411) from the second resolution to the first resolution.
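The resolution adjustment described for FIG. 4a can be illustrated with a minimal sketch. It assumes a grayscale crop stored as nested lists and uses nearest-neighbor resampling purely as a stand-in for the trained model (401a). The function name `upscale_nearest` and the `Pixels` type are hypothetical and do not appear in the disclosure. The sketch only shows the pixel count increasing from the second resolution to the first resolution; it does not reproduce the detail a super-resolution model would be expected to restore.

```python
from typing import List

Pixels = List[List[int]]  # hypothetical grayscale crop: rows of pixel values


def upscale_nearest(src: Pixels, target_h: int, target_w: int) -> Pixels:
    """Resample src to target_h x target_w by nearest-neighbor lookup.

    Stand-in for the trained model: it increases the number of pixels
    representing the object but adds no new detail.
    """
    src_h, src_w = len(src), len(src[0])
    return [
        [src[y * src_h // target_h][x * src_w // target_w] for x in range(target_w)]
        for y in range(target_h)
    ]


# Visual object crop at the second (low) resolution.
low_res = [[10, 20], [30, 40]]
# The same crop brought to the first (high) resolution.
high_res = upscale_nearest(low_res, 4, 4)
```

In a real pipeline the call to `upscale_nearest` would be replaced by inference with whichever super-resolution model is used; the surrounding data flow (low-resolution crop in, crop at the first resolution out) stays the same.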
[0053] Referring to FIG. 4b, an upscaling environment (400b) is illustrated for upscaling a visual object (411) by applying a video frame (410), a video frame (430), and an image (440) of a motion photo to a trained model (401b) available for image processing. The video frame (410) may include a visual object (411) to be used to edit the visual object (441) of the image (440) of the motion photo among the video frames included in the video of the motion photo. The video frame (430) may be a video frame corresponding to the image (440) of the motion photo among the video frames included in the video of the motion photo. The video frame (430) may include a visual object (431) that is substantially simultaneously captured with the visual object (441) of the image (440) through at least one camera (130). Each of the visual object (411), visual object (431), and visual object (441) can correspond to a head.
[0054] The size of the figure illustrated in FIG. 4b may represent the resolution of the figure. For example, the visual object (411) of the video frame (410) and the visual object (431) of the video frame (430) may each have a second resolution (e.g., low resolution). The visual object (441) of the image (440) and the upscaled visual object (420) may each have a first resolution (e.g., high resolution).
[0055] The electronic device (101) can upscale a visual object (411) within a video frame (410) based on applying a video frame (410), a video frame (430), and an image (440) to a trained model (401b). The electronic device (101) can identify an upscaled visual object (420) of a video frame (410) obtained from the trained model (401b) based on the upscaling being performed. The electronic device (101) can identify an upscaled visual object (420) having the first resolution by performing upscaling that adjusts the resolution of the visual object (411) from the second resolution to the first resolution. The electronic device (101) can further train the trained model (401b) in relation to the upscaling while the operation of the electronic device (101) is performed, based on applying the video frame (430) and the image (440), which contain the same scene and have different resolutions, to the trained model (401b).
[0056] FIG. 5 illustrates an example of an operation to edit a visual object within an image of a motion photo using an upscaled visual object.
[0057] Referring to FIG. 5, an editing environment (500) is shown for editing a visual object (511) of an image (510) of a motion photo using an upscaled visual object (520) of a video frame of the motion photo. Each of the visual object (511) and the upscaled visual object (520) may correspond to a head. Each of the visual object (511) and the upscaled visual object (520) may have a first resolution (e.g., high resolution).
[0058] The electronic device (101) can edit a visual object (511) of an image (510) using an upscaled visual object (520). For example, the electronic device (101) can edit a visual object (511) of an image (510) by applying an upscaled visual object (520) of a video frame to a visual object (511) of an image (510). Based on editing a visual object (511) of an image (510), the electronic device (101) can obtain an image (530) containing an edited visual object (531). The edited visual object (531) may have the first resolution.
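The editing step of FIG. 5 can be sketched as writing the upscaled visual object into the region of the image occupied by the original visual object. This is a minimal sketch under stated assumptions: the images are nested lists of grayscale values, the target region is given by hypothetical `top` and `left` coordinates, and the helper name `apply_object` is not from the disclosure. Blending, inpainting, and outpainting are not shown.

```python
from typing import List

Pixels = List[List[int]]  # hypothetical grayscale image: rows of pixel values


def apply_object(image: Pixels, patch: Pixels, top: int, left: int) -> Pixels:
    """Return a copy of image with patch written at (top, left)."""
    edited = [row[:] for row in image]  # leave the original image untouched
    for dy, patch_row in enumerate(patch):
        for dx, value in enumerate(patch_row):
            edited[top + dy][left + dx] = value
    return edited


image_510 = [[0] * 4 for _ in range(4)]  # image at the first resolution
upscaled_520 = [[7, 7], [7, 7]]          # upscaled object, same resolution
image_530 = apply_object(image_510, upscaled_520, top=1, left=1)
```

Returning a copy rather than editing in place keeps the original image (510) available, which matches the description of obtaining a separate image (530) containing the edited visual object (531).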
[0059] FIG. 6 is a flowchart illustrating a method for editing a visual object according to the pose of the visual object within an image of a motion photo.
[0060] Referring to FIG. 6, in operation 601, at least one processor (110) can identify a first value representing the orientation of the visual object of the image based on an input for editing the visual object of the image of the motion photo. For example, the visual object may correspond to a head. For example, the first value may represent the orientation of the visual object of the image by representing the rotation direction (e.g., yaw, pitch, and / or roll) of the visual object of the image. For example, the first value may indicate that the yaw of the visual object of the image is 'y1' degrees, the pitch of the visual object of the image is 'p1' degrees, and the roll of the visual object of the image is 'r1' degrees.
[0061] In operation 602, at least one processor (110) can identify a second value representing the orientation of the visual object included in the video frame of the motion photo based on the input for editing the visual object of the image of the motion photo. For example, the video frame of the motion photo may be a video frame among the video frames included in the video of the motion photo that is to be used to edit the visual object of the image. For example, the second value may represent the orientation of the visual object of the video frame by representing the rotation direction (e.g., yaw, pitch, and / or roll) of the visual object of the video frame. For example, the second value may indicate that the yaw of the visual object in the video frame is 'y2' degrees, the pitch of the visual object in the video frame is 'p2' degrees, and the roll of the visual object in the video frame is 'r2' degrees.
[0062] In operation 603, at least one processor (110) can identify a difference value between the first value and the second value. For example, the difference value may indicate a difference in pose between the visual object of the image and the visual object of the video frame. For example, the difference value may indicate that the difference in pose for yaw is 'y1-y2' degrees, the difference in pose for pitch is 'p1-p2' degrees, and the difference in pose for roll is 'r1-r2' degrees.
[0063] In operation 604, at least one processor (110) can identify whether the difference value exceeds the threshold value. For example, the threshold value may represent a value for each rotational direction (e.g., yaw, pitch, and / or roll) for the pose difference between the visual object of the image and the visual object of the video frame. For example, the threshold value may indicate that the pose difference for yaw is 20 degrees, the pose difference for pitch is 15 degrees, and the pose difference for roll is 30 degrees. For example, at least one processor (110) can identify that the difference value exceeds the threshold value based on comparing the difference value with the threshold value.
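Operations 603 and 604 can be sketched as follows. The per-axis thresholds (20, 15, and 30 degrees) are the example values given above. Two points are assumptions made only for illustration, because the description does not specify them: the comparison uses the magnitude of each per-axis difference, and the difference value is treated as exceeding the threshold value when any single axis exceeds its threshold. The names `Pose`, `pose_difference`, and `exceeds_threshold` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """Orientation in degrees for each rotational direction."""
    yaw: float
    pitch: float
    roll: float


# Example per-axis thresholds from paragraph [0063], in degrees.
THRESHOLD = Pose(yaw=20.0, pitch=15.0, roll=30.0)


def pose_difference(first: Pose, second: Pose) -> Pose:
    """Difference value of operation 603: (y1 - y2, p1 - p2, r1 - r2)."""
    return Pose(
        first.yaw - second.yaw,
        first.pitch - second.pitch,
        first.roll - second.roll,
    )


def exceeds_threshold(diff: Pose, threshold: Pose = THRESHOLD) -> bool:
    """Assumption: exceeded when any axis magnitude is above its threshold."""
    return (
        abs(diff.yaw) > threshold.yaw
        or abs(diff.pitch) > threshold.pitch
        or abs(diff.roll) > threshold.roll
    )


image_pose = Pose(yaw=5.0, pitch=0.0, roll=2.0)    # first value (image)
frame_pose = Pose(yaw=30.0, pitch=4.0, roll=-1.0)  # second value (video frame)
diff = pose_difference(image_pose, frame_pose)
```

Here the yaw difference has a magnitude of 25 degrees, which is above the 20 degree yaw threshold, so `exceeds_threshold(diff)` evaluates to `True` under the stated assumptions.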
[0064] In operation 605, at least one processor (110) may edit the visual object of the image corresponding to the head using the upscaled visual object of the video frame corresponding to the head, based on identifying that the difference value is greater than the threshold value. By way of example, without limitation, the visual object corresponding to the head may include the head, neck, and part of the shoulder of the visual object. For example, at least one processor (110) may perform inpainting and / or outpainting on the visual object of the image while editing the visual object of the image corresponding to the head.
[0065] In operation 606, at least one processor (110) may edit the visual object of the image corresponding to the face using the upscaled visual object of the video frame corresponding to the face, based on identifying that the difference value is smaller than the threshold value. By way of example, without limitation, the visual object corresponding to the face may include the eyes, nose, and mouth of the visual object. For example, while editing the visual object of the image corresponding to the face, at least one processor (110) may apply the eyes, nose, and mouth of the upscaled visual object to the visual object of the image according to the direction of the eyes, nose, and mouth of the visual object of the image.
[0066] In one embodiment, even if the difference value is identified as being smaller than the threshold value, at least one processor (110) may edit the visual object of the image corresponding to the head when it is determined that editing the visual object of the image corresponding to the face would generate noise in the edited visual object.
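The branch between operations 605 and 606, together with the noise-based fallback described above, can be summarized in a short sketch. It assumes the threshold comparison and the noise determination have already been made and are passed in as booleans; how noise is predicted is not described in the text, so `face_edit_would_add_noise` is a hypothetical input, as is the function name `select_edit_region`.

```python
def select_edit_region(
    difference_exceeds_threshold: bool,
    face_edit_would_add_noise: bool = False,
) -> str:
    """Choose which part of the visual object to replace.

    "head": head, neck, and part of the shoulder (operation 605).
    "face": eyes, nose, and mouth (operation 606).
    """
    if difference_exceeds_threshold:
        return "head"
    if face_edit_would_add_noise:
        # Fallback: edit the head even though the pose difference is small.
        return "head"
    return "face"


large_pose_gap = select_edit_region(True)
small_pose_gap = select_edit_region(False)
small_gap_but_noisy = select_edit_region(False, face_edit_would_add_noise=True)
```

Keeping the decision separate from the editing itself makes it straightforward to exercise the three cases (large difference, small difference, small difference with predicted noise) independently of any image data.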
[0067] FIGS. 7a and 7b illustrate examples of operations for editing a visual object according to the pose of the visual object within an image of a motion photo.
[0068] Referring to FIG. 7a, an environment (700a) for editing a visual object (711a) corresponding to a head within an image (710a) of a motion photo is illustrated. Within the environment (700a), a difference value representing a difference in posture between the visual object (711a) of the image (710a) of the motion photo and the visual object (721a) of the video frame (720a) of the motion photo may be greater than a threshold value. Based on identifying the difference value greater than the threshold value, the electronic device (101) can edit the visual object (711a) of the image (710a) corresponding to the head using the upscaled visual object (721a) of the video frame (720a) corresponding to the head. The electronic device (101) can obtain an image (730a) including an edited visual object (731a) based on editing a visual object (711a) of an image (710a) corresponding to a head.
[0069] Referring to FIG. 7b, an environment (700b) for editing a visual object (712b) corresponding to a face within an image (710b) of a motion photo is illustrated. Within the environment (700b), a difference value representing a difference in posture between a visual object (711b) of the image (710b) of the motion photo corresponding to a head and a visual object (721b) of the video frame (720b) of the motion photo corresponding to a head may be smaller than a threshold value. Based on identifying the difference value being smaller than the threshold value, the electronic device (101) can edit the visual object (712b) of the image (710b) corresponding to a face using an upscaled visual object (722b) of the video frame (720b) corresponding to a face. The electronic device (101) can obtain an image (730b) containing an edited visual object (732b) corresponding to a face, based on editing a visual object (712b) of an image (710b) corresponding to a face. The edited visual object (732b) may be included in a visual object (731b) of an image (730b) corresponding to a head.
[0070] As described above, when the electronic device (101) edits the visual object of the image of the motion photo using the visual object of the video frame of the motion photo, it can increase the accuracy of the editing of the visual object by selectively editing the head of the visual object or the face of the visual object within the image according to the difference in posture between the visual object of the video frame and the visual object of the image.
[0071] FIGS. 8a and 8b illustrate examples of a user interface for editing one or more visual objects within an image of a motion photo.
[0072] Referring to FIG. 8a, a user interface (UI) of an electronic device (101) for editing an image of a motion photo containing one or more visual objects is shown.
[0073] The electronic device (101) can identify the image (810) of the motion photo based on user input for searching (or accessing) (or loading) the image (810) of the motion photo in relation to the gallery application software. For example, the image (810) may be stored in memory (120) in association with the gallery application software. For example, the image (810) stored in memory (120) may have a first resolution (e.g., high resolution). The electronic device (101) can display the image (810) through a display (140) based on identifying the image (810) of the motion photo. The image (810) may include a first visual object (e.g., a person's head located on the left within the image (810)), a second visual object (e.g., a person's head located in the center within the image (810)), and a third visual object (e.g., a person's head located on the right within the image (810)). The electronic device (101) can display a UI object (811) for editing a visual object within the image (810) of the motion photo through the display (140) while displaying the image (810) through the display (140).
[0074] In one embodiment, the electronic device (101) can identify, according to a defined priority condition, a visual object to be edited (e.g., a second visual object) among the visual objects (e.g., a first visual object, a second visual object, and a third visual object) included in the image (810) of the motion photo, based on the identification of the input (e.g., a touch input) for the UI object (811). In one embodiment, the defined priority condition may include a condition for identifying a face with closed eyes as a first priority among the visual objects included in the image of the motion photo, a condition for identifying a face located in the center of the image as a second priority among the visual objects, a condition for identifying a large face as a third priority among the visual objects, and a condition for identifying a face facing forward as a fourth priority among the visual objects. In one embodiment, the defined priority condition may include a condition for identifying the face of the user of the electronic device (101) as a first priority among the visual objects included in the image of the motion photo, a condition for identifying the face of the user's family as a second priority, and a condition for identifying the face of a person included in the images stored in relation to the gallery application software as a third priority.
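One possible reading of the first set of priority conditions is a lexicographic ranking: closed eyes are considered first, then a central position, then face size, then a forward-facing orientation. The sketch below follows that reading. The ranking rule itself, the `Face` record, its field names, and the function `pick_face_to_edit` are assumptions made for illustration and are not specified in the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Face:
    """Hypothetical per-face attributes used only for this sketch."""
    name: str
    eyes_closed: bool
    centered: bool
    area: float          # face size, e.g. in pixels
    facing_forward: bool


def pick_face_to_edit(faces: List[Face]) -> Face:
    """Pick the face ranked highest by the four example conditions, in order."""
    return min(
        faces,
        key=lambda f: (
            not f.eyes_closed,     # first priority: closed eyes
            not f.centered,        # second priority: center of the image
            -f.area,               # third priority: larger face
            not f.facing_forward,  # fourth priority: facing forward
        ),
    )


left = Face("first", eyes_closed=False, centered=False, area=900.0, facing_forward=True)
center = Face("second", eyes_closed=False, centered=True, area=800.0, facing_forward=True)
right = Face("third", eyes_closed=False, centered=False, area=700.0, facing_forward=True)
chosen = pick_face_to_edit([left, center, right])
```

With no closed eyes in the example, the centered face is chosen, which is consistent with the second visual object being identified as the visual object to be edited in the example of FIG. 8a.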
[0075] For example, the electronic device (101) may display visual information (e.g., UI object (822)) indicating that the visual object (e.g., second visual object) is to be edited, based on identifying the visual object to be edited (e.g., second visual object) according to the priority conditions defined above, through the display (140). For example, the electronic device (101) may display visual information (e.g., UI object (833)) indicating that the other visual object (e.g., third visual object) is to be edited, based on an input (e.g., touch input) to the UI object (e.g., UI object (823)) for changing the visual object to be edited (e.g., second visual object) to another visual object (e.g., third visual object), through the display (140).
[0076] For example, the electronic device (101) may display, through the display (140), a UI object (821) for editing the first visual object within the image (810), a UI object (822) for editing the second visual object, and a UI object (823) for editing the third visual object, based on the identification of an input (e.g., touch input) to the UI object (811). The UI object (822) may indicate that, among the first visual object, the second visual object, and the third visual object included in the image (810), the second visual object has been identified as the visual object to be edited according to the defined priority condition. The electronic device (101) may display, through the display (140), a UI object (824), a UI object (825), a UI object (826), and a UI object (827) representing the second visual object, based on the identification of the second visual object as the visual object to be edited. The UI object (824) may represent the second visual object of the image (810) of the motion photo. The UI object (825) may represent the second visual object of a video frame included in the video of the motion photo. The UI object (825) may be for editing the second visual object of the image (810) using the second visual object of the video frame. The UI object (826) may represent the second visual object of another video frame included in the video of the motion photo. The UI object (826) may be for editing the second visual object of the image (810) using the second visual object of the other video frame. The UI object (827) may represent the second visual object of yet another video frame included in the video of the motion photo. The UI object (827) may be for editing the second visual object of the image (810) using the second visual object of that yet another video frame. For example, the video frames corresponding to the UI object (825), the UI object (826), and the UI object (827) may be video frames identified among the video frames of the motion photo for editing the second visual object of the image (810). For example, the video frames of the motion photo may be stored in memory (120) in association with the gallery application software. For example, the video frames stored in memory (120) may have a second resolution (e.g., low resolution).
[0077] For example, the electronic device (101) may display, through the display (140), a UI object (831) for editing the first visual object within the image (810), a UI object (832) for editing the second visual object, and a UI object (833) for editing the third visual object, based on the identification of an input (e.g., touch input) to the UI object (823). The UI object (833) may indicate that among the first visual object, the second visual object, and the third visual object included in the image (810), the third visual object has been identified as the visual object to be edited. The electronic device (101) may display, through the display (140), a UI object (834), a UI object (835), a UI object (836), and a UI object (837) representing the third visual object, based on the identification of the third visual object as the visual object to be edited. The UI object (834) may represent the third visual object of the image (810) of the motion photo. The UI object (835) may represent the third visual object of a video frame included in the video of the motion photo. The UI object (835) may be used to edit the third visual object of the image (810) using the third visual object of the video frame. The UI object (836) may represent the third visual object of another video frame included in the video of the motion photo. The UI object (836) may be used to edit the third visual object of the image (810) using the third visual object of the other video frame. The UI object (837) may represent the third visual object of yet another video frame included in the video of the motion photo. The UI object (837) may be used to edit the third visual object of the image (810) using the third visual object of that yet another video frame. For example, the video frames corresponding to the UI object (835), the UI object (836), and the UI object (837) may be video frames identified among the video frames of the motion photo to edit the third visual object of the image (810). For example, the video frames of the motion photo may be stored in memory (120) in association with the gallery application software. For example, the video frames stored in memory (120) may have the second resolution (e.g., low resolution).
[0078] Referring to FIG. 8b, another UI (user interface) of an electronic device (101) for editing an image of a motion photo containing one or more visual objects is illustrated.
[0079] For example, the electronic device (101) may edit the third visual object contained within the image (810) based on the identification of an input (e.g., touch input) for the UI object (836) illustrated in FIG. 8a, using the third visual object of the video frame represented by the UI object (836). For example, the electronic device (101) may change the image (810) into an image (840) containing the edited third visual object and display it through a display (140). For example, the electronic device (101) may upscale the third visual object within the video frame based on the input for the UI object (836). For example, the electronic device (101) may edit the third visual object of the image (810) using the upscaled visual object of the video frame. For example, the electronic device (101) can upscale the third visual object within the video frame based on applying the video frame to a trained model available for image processing. For example, the electronic device (101) can identify the upscaled third visual object of the video frame obtained from the trained model based on the upscaling being performed.
[0080] For example, the electronic device (101) may display, through the display (140), a swipe bar (851) comprising a portion of preview images obtained by downsizing video frames included in the video of the motion photo, based on the identification of input (e.g., touch input) for a UI object (841) displayed through the display (140) together with the image (840). For example, the electronic device (101) may display, through the display (140), a UI object (852) representing the third visual object of the video frame identified according to the swipe bar (851). For example, the UI object (852) may be for editing the third visual object of the image (840) using the third visual object of the video frame identified according to the swipe bar (851).
[0081] For example, the electronic device (101) may display, through the display (140), a swipe bar (861) including a portion of the preview images corresponding to the swipe input based on the swipe input for the swipe bar (851). For example, the electronic device (101) may display, through the display (140), a UI object (862) representing the third visual object of the video frame corresponding to the swipe input among the video frames included in the video of the motion photo. For example, the third visual object represented by the UI object (862) may be the third visual object of the video frame identified by the swipe bar (861). For example, the UI object (862) may be for editing the third visual object of the image (840) using the third visual object of the video frame identified by the swipe bar (861).
[0082] The electronic device (101) may correspond to the electronic device (901) described with reference to FIG. 9 below.
[0083] FIG. 9 is a block diagram of an electronic device in a network environment according to various embodiments.
[0084] Referring to FIG. 9, in a network environment (900), an electronic device (901) may communicate with an electronic device (902) through a first network (998) (e.g., a short-range wireless communication network) or with at least one of an electronic device (904) or a server (908) through a second network (999) (e.g., a long-range wireless communication network). According to one embodiment, the electronic device (901) may communicate with the electronic device (904) through a server (908). According to one embodiment, the electronic device (901) may include a processor (920), memory (930), input module (950), sound output module (955), display module (960), audio module (970), sensor module (976), interface (977), connection terminal (978), haptic module (979), camera module (980), power management module (988), battery (989), communication module (990), subscriber identification module (996), or antenna module (997). In some embodiments, at least one of these components (e.g., connection terminal (978)) may be omitted from the electronic device (901), or one or more other components may be added. In some embodiments, some of these components (e.g., sensor module (976), camera module (980), or antenna module (997)) may be integrated into a single component (e.g., display module (960)).
[0085] The processor (920) can control at least one other component (e.g., a hardware or software component) of the electronic device (901) connected to the processor (920) by executing software (e.g., a program (940)), and can perform various data processing or operations. According to one embodiment, as at least part of the data processing or operations, the processor (920) can store commands or data received from other components (e.g., a sensor module (976) or a communication module (990)) in volatile memory (932), process the commands or data stored in volatile memory (932), and store the resulting data in non-volatile memory (934). According to one embodiment, the processor (920) may include a main processor (921) (e.g., a central processing unit or an application processor) or an auxiliary processor (923) that can operate independently or together with it (e.g., a graphics processing unit, a neural processing unit (NPU), an image signal processor, a sensor hub processor, or a communication processor). For example, if the electronic device (901) includes a main processor (921) and an auxiliary processor (923), the auxiliary processor (923) may be configured to use lower power than the main processor (921) or to be specialized for a designated function. The auxiliary processor (923) may be implemented separately from the main processor (921) or as part thereof.
[0086] The auxiliary processor (923) may control at least some of the functions or states associated with at least one component of the electronic device (901) (e.g., display module (960), sensor module (976), or communication module (990)) on behalf of the main processor (921) while the main processor (921) is in an inactive (e.g., sleep) state, or together with the main processor (921) while the main processor (921) is in an active (e.g., application execution) state. According to one embodiment, the auxiliary processor (923) (e.g., image signal processor or communication processor) may be implemented as part of another functionally related component (e.g., camera module (980) or communication module (990)). According to one embodiment, the auxiliary processor (923) (e.g., neural processing unit) may include a hardware structure specialized for processing an artificial intelligence model. The artificial intelligence model may be generated through machine learning. Such learning may be performed, for example, on the electronic device (901) itself where the artificial intelligence model is executed, or through a separate server (e.g., server (908)). The learning algorithm may include, for example, supervised learning, unsupervised learning, semi-supervised learning, or reinforcement learning, but is not limited to the examples described above. The artificial intelligence model may include a plurality of artificial neural network layers. An artificial neural network may be a deep neural network (DNN), a convolutional neural network (CNN), a recurrent neural network (RNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a bidirectional recurrent deep neural network (BRDNN), a deep Q-network, or a combination of two or more of the above, but is not limited to the examples described above. In addition to the hardware structure, the artificial intelligence model may additionally or alternatively include a software structure.
[0087] The memory (930) can store various data used by at least one component of the electronic device (901) (e.g., processor (920) or sensor module (976)). The data may include, for example, software (e.g., program (940)) and input or output data for related commands. The memory (930) may include volatile memory (932) or non-volatile memory (934).
[0088] The program (940) may be stored as software in memory (930) and may include, for example, an operating system (942), middleware (944), or an application (946).
[0089] The input module (950) can receive commands or data to be used for a component of the electronic device (901) (e.g., processor (920)) from outside the electronic device (901) (e.g., user). The input module (950) may include, for example, a microphone, a mouse, a keyboard, a key (e.g., a button), or a digital pen (e.g., a stylus pen).
[0090] The sound output module (955) can output an audio signal to the outside of the electronic device (901). The sound output module (955) may include, for example, a speaker or a receiver. The speaker may be used for general purposes, such as multimedia playback or recording playback. The receiver may be used to receive incoming calls. According to one embodiment, the receiver may be implemented separately from the speaker or as part thereof.
[0091] The display module (960) can visually provide information to the outside (e.g., a user) of the electronic device (901). The display module (960) may include, for example, a display, a holographic device, or a projector, and a control circuit for controlling the corresponding device. According to one embodiment, the display module (960) may include a touch sensor configured to detect a touch, or a pressure sensor configured to measure the intensity of the force generated by the touch.
[0092] The audio module (970) can convert sound into an electrical signal or, conversely, convert an electrical signal into sound. According to one embodiment, the audio module (970) can acquire sound through the input module (950) or output sound through the sound output module (955) or an external electronic device (e.g., electronic device (902)) (e.g., speaker or headphones) connected directly or wirelessly to the electronic device (901).
[0093] The sensor module (976) can detect the operating state of the electronic device (901) (e.g., power or temperature) or the external environmental state (e.g., user state) and generate an electrical signal or data value corresponding to the detected state. According to one embodiment, the sensor module (976) may include, for example, a gesture sensor, a gyro sensor, a barometric pressure sensor, a magnetic sensor, an acceleration sensor, a grip sensor, a proximity sensor, a color sensor, an IR (infrared) sensor, a biosensor, a temperature sensor, a humidity sensor, or an illuminance sensor.
[0094] The interface (977) may support one or more specified protocols that can be used for the electronic device (901) to be connected directly or wirelessly to an external electronic device (e.g., electronic device (902)). According to one embodiment, the interface (977) may include, for example, a high definition multimedia interface (HDMI), a universal serial bus (USB) interface, an SD card interface, or an audio interface.
[0095] The connection terminal (978) may include a connector through which the electronic device (901) can be physically connected to an external electronic device (e.g., electronic device (902)). According to one embodiment, the connection terminal (978) may include, for example, an HDMI connector, a USB connector, an SD card connector, or an audio connector (e.g., a headphone connector).
[0096] The haptic module (979) can convert an electrical signal into a mechanical stimulus (e.g., vibration or movement) or an electrical stimulus that the user can perceive through tactile or kinesthetic senses. According to one embodiment, the haptic module (979) may include, for example, a motor, a piezoelectric element, or an electric stimulation device.
[0097] The camera module (980) can capture still images and video. According to one embodiment, the camera module (980) may include one or more lenses, image sensors, image signal processors, or flashes.
[0098] The power management module (988) can manage power supplied to the electronic device (901). According to one embodiment, the power management module (988) can be implemented, for example, as at least part of a power management integrated circuit (PMIC).
[0099] The battery (989) can supply power to at least one component of the electronic device (901). According to one embodiment, the battery (989) may include, for example, a non-rechargeable primary battery, a rechargeable secondary battery, or a fuel cell.
[0100] The communication module (990) can support the establishment of a direct (e.g., wired) communication channel or a wireless communication channel between the electronic device (901) and an external electronic device (e.g., electronic device (902), electronic device (904), or server (908)), and the performance of communication through the established communication channel. The communication module (990) may include one or more communication processors that operate independently of the processor (920) (e.g., application processor) and support direct (e.g., wired) communication or wireless communication. According to one embodiment, the communication module (990) may include a wireless communication module (992) (e.g., cellular communication module, short-range wireless communication module, or GNSS (global navigation satellite system) communication module) or a wired communication module (994) (e.g., LAN (local area network) communication module, or power line communication module). The corresponding communication module among these communication modules can communicate with an external electronic device (904) through a first network (998) (e.g., a short-range communication network such as Bluetooth, WiFi (wireless fidelity) direct, or IrDA (infrared data association)) or a second network (999) (e.g., a legacy cellular network, a 5G network, a next-generation communication network, the Internet, or a computer network (e.g., a LAN or WAN)). These various types of communication modules may be integrated into a single component (e.g., a single chip) or implemented as multiple separate components (e.g., multiple chips). The wireless communication module (992) can identify or authenticate the electronic device (901) within a communication network such as the first network (998) or the second network (999) using subscriber information (e.g., International Mobile Subscriber Identity (IMSI)) stored in the subscriber identification module (996).
[0101] The wireless communication module (992) can support 5G networks and next-generation communication technologies following 4G networks, for example, new radio (NR) access technology. The NR access technology can support high-speed transmission of high-capacity data (enhanced mobile broadband (eMBB)), minimization of terminal power and connection of multiple terminals (massive machine type communications (mMTC)), or high reliability and low latency (ultra-reliable and low-latency communications (URLLC)). The wireless communication module (992) can support a high-frequency band (e.g., mmWave band) to achieve, for example, a high data transmission rate. The wireless communication module (992) can support various technologies for securing performance in the high-frequency band, such as beamforming, massive MIMO (multiple-input and multiple-output), full-dimensional MIMO (FD-MIMO), array antenna, analog beamforming, or large-scale antenna. The wireless communication module (992) can support various requirements specified in the electronic device (901), an external electronic device (e.g., electronic device (904)), or a network system (e.g., second network (999)). According to one embodiment, the wireless communication module (992) may support a peak data rate (e.g., 20 Gbps or more) for eMBB realization, loss coverage (e.g., 164 dB or less) for mMTC realization, or U-plane latency (e.g., downlink (DL) and uplink (UL) each 0.5 ms or less, or round trip 1 ms or less) for URLLC realization.
[0102] An antenna module (997) can transmit a signal or power to or from an external source (e.g., an external electronic device). According to one embodiment, the antenna module (997) may include an antenna comprising a radiator made of a conductor or a conductive pattern formed on a substrate (e.g., a PCB). According to one embodiment, the antenna module (997) may include a plurality of antennas (e.g., an array antenna). In this case, at least one antenna suitable for a communication method used in a communication network, such as a first network (998) or a second network (999), may be selected from the plurality of antennas, for example, by a communication module (990). A signal or power may be transmitted or received between the communication module (990) and an external electronic device through the selected at least one antenna. According to some embodiments, in addition to the radiator, other components (e.g., a radio frequency integrated circuit (RFIC)) may be additionally formed as part of the antenna module (997).
[0103] According to various embodiments, the antenna module (997) may form a mmWave antenna module. According to one embodiment, the mmWave antenna module may include a printed circuit board, an RFIC disposed on or adjacent to a first surface (e.g., bottom surface) of the printed circuit board and capable of supporting a specified high frequency band (e.g., mmWave band), and a plurality of antennas (e.g., array antennas) disposed on or adjacent to a second surface (e.g., top surface or side surface) of the printed circuit board and capable of transmitting or receiving a signal of the specified high frequency band.
[0104] At least some of the above components can be connected to each other via a communication method between peripheral devices (e.g., bus, GPIO (general purpose input and output), SPI (serial peripheral interface), or MIPI (mobile industry processor interface)) and exchange signals (e.g., commands or data) with each other.
[0105] According to one embodiment, commands or data may be transmitted or received between the electronic device (901) and an external electronic device (904) through a server (908) connected to a second network (999). Each of the external electronic devices (902 or 904) may be a device of the same type as, or a different type from, the electronic device (901). According to one embodiment, all or part of the operations performed on the electronic device (901) may be performed on one or more of the external electronic devices (902, 904, or 908). For example, if the electronic device (901) needs to perform a function or service automatically or in response to a request from a user or another device, the electronic device (901) may, instead of or in addition to performing the function or service itself, request one or more external electronic devices to perform at least part of the function or service. The one or more external electronic devices that receive the request may execute at least part of the requested function or service, or an additional function or service related to the request, and transmit the result of the execution to the electronic device (901). The electronic device (901) may provide the result, as is or after additional processing, as at least part of the response to the request. For this purpose, for example, cloud computing, distributed computing, mobile edge computing (MEC), or client-server computing technology may be used. The electronic device (901) may provide ultra-low latency services using, for example, distributed computing or mobile edge computing. In one embodiment, the external electronic device (904) may include an Internet of Things (IoT) device. The server (908) may be an intelligent server using machine learning and/or neural networks. According to one embodiment, the external electronic device (904) or the server (908) may be included within the second network (999). The electronic device (901) can be applied to intelligent services (e.g., smart home, smart city, smart car, or healthcare) based on 5G communication technology and IoT-related technology.
[0106] The technical problems to be solved in this disclosure are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art to which this disclosure pertains.
[0107] As described above, an electronic device (e.g., electronic device (101)) may include at least one processor (e.g., at least one processor (110)) comprising a processing circuit; and a memory (e.g., memory (120)) that stores instructions and includes one or more storage media. When the instructions are executed individually or collectively by the at least one processor, the electronic device may: identify an image containing a visual object, wherein the image has a first resolution; receive, through the electronic device, an input for editing the visual object of the image; identify a video acquired together with the image based on the input, wherein the video frames of the video have a second resolution lower than the first resolution; among the video frames, identify a video frame containing the visual object to be used for editing the visual object of the image; and upscale the visual object within the video frame based on identifying the video frame. Based on the upscaling being performed, the electronic device may identify the upscaled visual object of the video frame; and edit the visual object of the image using the upscaled visual object of the video frame.
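As a non-limiting illustration, the sequence of operations above (crop the visual object from a lower-resolution video frame, upscale it, and use it to edit the higher-resolution image) can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: images are modeled as nested lists of grayscale values, nearest-neighbour repetition stands in for the trained model, the frame and the image are assumed to share the same field of view so that coordinates scale by an integer factor, and all function names are hypothetical.

```python
from typing import List, Tuple

Pixels = List[List[int]]            # grayscale rows; a stand-in for real image data
Box = Tuple[int, int, int, int]     # (top, left, height, width) in frame coordinates


def crop(pixels: Pixels, box: Box) -> Pixels:
    """Cut the region containing the visual object out of a frame."""
    top, left, height, width = box
    return [row[left:left + width] for row in pixels[top:top + height]]


def upscale_nearest(patch: Pixels, factor: int) -> Pixels:
    """Placeholder for the trained model: repeat each pixel factor x factor times."""
    out: Pixels = []
    for row in patch:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def paste(image: Pixels, patch: Pixels, top: int, left: int) -> Pixels:
    """Return a copy of the image with the patch written at (top, left)."""
    result = [list(row) for row in image]
    for dy, row in enumerate(patch):
        for dx, value in enumerate(row):
            result[top + dy][left + dx] = value
    return result


def edit_with_frame(image: Pixels, frame: Pixels, frame_box: Box, factor: int) -> Pixels:
    """Upscale the object found in the frame and use it to edit the image."""
    top, left, _, _ = frame_box
    upscaled = upscale_nearest(crop(frame, frame_box), factor)
    # Assumption: frame coordinates map to image coordinates by the same factor.
    return paste(image, upscaled, top * factor, left * factor)
```

For instance, with a 4x4 image and a 2x2 frame, `edit_with_frame(image, frame, (0, 1, 1, 1), 2)` replaces the upper-right 2x2 block of the image with the upscaled frame pixel and leaves the original image object unmodified.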
[0108] For example, the electronic device may further include a camera (e.g., at least one camera (130)). The image and the video may be acquired based on user input for capturing the visual object through the camera. The video frame may be a first video frame. The video frames may include a second video frame corresponding to the image. The image and the second video frame may include the visual object captured through the camera substantially simultaneously.
[0109] For example, the upscaling can be performed by adjusting the resolution of the visual object of the video frame from the second resolution to the first resolution.
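As an illustrative example only, the resolution adjustment from the second resolution to the first resolution can be expressed as a pair of scale factors applied to the bounding box of the visual object. The resolutions used in the usage note are hypothetical values chosen for the arithmetic, and the helper names are not taken from the disclosure.

```python
from fractions import Fraction
from typing import Tuple

Resolution = Tuple[int, int]        # (width, height) in pixels
Box = Tuple[int, int, int, int]     # (left, top, width, height) in pixels


def scale_factors(first: Resolution, second: Resolution) -> Tuple[Fraction, Fraction]:
    """Horizontal and vertical factors that map the second resolution to the first."""
    (w1, h1), (w2, h2) = first, second
    # Assumption: "lower resolution" means smaller in both dimensions.
    if w2 >= w1 or h2 >= h1:
        raise ValueError("second resolution must be lower than first resolution")
    return Fraction(w1, w2), Fraction(h1, h2)


def scale_box(box: Box, fx: Fraction, fy: Fraction) -> Box:
    """Map an object's bounding box from frame coordinates to image coordinates."""
    left, top, width, height = box
    return (round(left * fx), round(top * fy), round(width * fx), round(height * fy))
```

With a hypothetical 4000x3000 image and 1000x750 video frames, both factors are 4, so an object occupying a 100x50 region of the frame corresponds to a 400x200 region of the image.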
[0110] For example, the above instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to: upscale the visual object within the video frame based on applying the video frame to a trained model available for image processing; and identify the upscaled visual object of the video frame obtained from the trained model based on the upscaling being performed.
[0111] For example, the above image may be a first image. The trained model may be trained based on the application of a second image having the first resolution and a third image having the second resolution. The second image and the third image may include other visual objects.
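As a non-limiting illustration of a training pair made of an image at the first resolution and an image at the second resolution, the sketch below derives the lower-resolution image by block averaging. The disclosure does not state how the second and third images are obtained; deriving one from the other is an assumption made here only to show what such a pair could look like, and the function names are hypothetical.

```python
from typing import List, Tuple

Pixels = List[List[float]]  # grayscale rows; a stand-in for real image data


def downsample(pixels: Pixels, factor: int) -> Pixels:
    """Average each factor x factor block into a single pixel."""
    height, width = len(pixels), len(pixels[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the factor")
    out: Pixels = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [pixels[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def make_training_pair(high_res: Pixels, factor: int) -> Tuple[Pixels, Pixels]:
    """Return (model input at the lower resolution, target at the higher resolution)."""
    return downsample(high_res, factor), high_res
```

A model trained on such pairs learns a mapping from the lower resolution to the higher resolution that can later be applied to visual objects not seen during training.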
[0112] For example, the video frame may be a first video frame. The video frames may include a second video frame corresponding to the image. The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to: upscale the visual object within the first video frame based on applying the first video frame, the second video frame, and the image to a trained model available for image processing; and identify the upscaled visual object of the first video frame obtained from the trained model based on the upscaling being performed.
[0113] For example, when the above instructions are executed individually or collectively by the at least one processor, the electronic device may cause the trained model to be further trained in relation to the upscaling based on applying the second video frame and the image to the trained model.
[0114] For example, the electronic device may further include a display (e.g., display (140)). The instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to: display the image through the display based on identifying the image; receive an input for playing the video through the electronic device while the image is being displayed; and sequentially display the video frames through the display based on the input for playing the video.
[0115] For example, when the above instructions are executed individually or collectively by the at least one processor, the electronic device may be caused to: based on the input, identify a first value representing the orientation of the visual object of the image; identify a second value representing the orientation of the visual object of the video frame; identify a difference value between the first value and the second value; based on identifying that the difference value is greater than a threshold value, edit the visual object of the image corresponding to the head using the upscaled visual object of the video frame corresponding to the head; and based on identifying that the difference value is less than the threshold value, edit the visual object of the image corresponding to the face using the upscaled visual object of the video frame corresponding to the face.
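As an illustrative example only, the threshold comparison above can be sketched as follows. The sketch assumes that each orientation value is a yaw angle in degrees and that a difference exactly equal to the threshold is handled like the smaller case; neither assumption is stated in the disclosure, and the function name is hypothetical.

```python
def choose_edit_region(image_yaw_deg: float, frame_yaw_deg: float,
                       threshold_deg: float) -> str:
    """Decide whether to replace the whole head or only the face.

    A large orientation difference means the face from the frame would not
    line up with the head in the image, so the whole head is replaced.
    """
    difference = abs(image_yaw_deg - frame_yaw_deg)
    # Assumption: a difference equal to the threshold is treated as "face".
    return "head" if difference > threshold_deg else "face"
```

For a hypothetical threshold of 15 degrees, orientations of 0 and 30 degrees select the head, while orientations of 0 and 10 degrees select the face.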
[0116] For example, when the above instructions are executed individually or collectively by the at least one processor, the electronic device may be caused to identify that the video frame contains the visual object to be used to edit the visual object of the image, based on the identification of the video frame satisfying a reference condition. The reference condition may include a condition associated with the blinking of the eyes of the visual object, a condition associated with the blur of the visual object, and / or a condition associated with the orientation of the visual object.
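As a non-limiting illustration, a reference condition that combines the eye-blink, blur, and orientation conditions can be sketched as a filter over per-frame measurements. The field names, the score ranges, and the default limits are hypothetical, and the sketch requires all three conditions at once although the disclosure also allows them to be used separately.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameInfo:
    index: int
    eyes_open: bool     # hypothetical output of a blink detector
    blur_score: float   # hypothetical: 0.0 is sharp, 1.0 is fully blurred
    yaw_deg: float      # hypothetical orientation of the visual object


def satisfies_reference(frame: FrameInfo, max_blur: float, max_yaw_deg: float) -> bool:
    """True if the frame passes the blink, blur, and orientation conditions."""
    return (frame.eyes_open
            and frame.blur_score <= max_blur
            and abs(frame.yaw_deg) <= max_yaw_deg)


def candidate_frames(frames: List[FrameInfo], max_blur: float = 0.3,
                     max_yaw_deg: float = 20.0) -> List[int]:
    """Indices of the frames whose visual object may be used for editing."""
    return [f.index for f in frames if satisfies_reference(f, max_blur, max_yaw_deg)]
```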
[0117] A non-transient computer-readable storage medium as described above may store one or more programs. The one or more programs may include instructions that, when executed by an electronic device (e.g., electronic device (101)), cause the electronic device to: identify an image containing a visual object, wherein the image has a first resolution; receive, through the electronic device, an input for editing the visual object of the image; based on the input, identify a video acquired together with the image, wherein the video frames of the video have a second resolution lower than the first resolution; among the video frames, identify a video frame containing the visual object to be used for editing the visual object of the image; based on identifying the video frame, upscale the visual object within the video frame; based on the upscaling being performed, identify the upscaled visual object of the video frame; and edit the visual object of the image using the upscaled visual object of the video frame.
[0118] For example, the above one or more programs may include instructions that cause the electronic device to: upscale the visual object within the video frame based on applying the video frame to a trained model available for image processing; and identify the upscaled visual object of the video frame obtained from the trained model based on the upscaling being performed.
[0119] For example, the above image may be a first image. The trained model may be trained based on the application of a second image having the first resolution and a third image having the second resolution. The second image and the third image may include other visual objects.
[0120] For example, the video frame may be a first video frame. The video frames may include a second video frame corresponding to the image. The one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to: upscale the visual object within the first video frame based on applying the first video frame, the second video frame, and the image to a trained model available for image processing; and identify the upscaled visual object of the first video frame obtained from the trained model based on the upscaling being performed.
[0121] For example, the above one or more programs may include instructions that, when executed by the electronic device, cause the electronic device to further train the trained model in relation to the upscaling based on applying the second video frame and the image to the trained model.
[0122] As described above, an electronic device (e.g., electronic device (101)) may include a display (e.g., display (140)), at least one processor (e.g., at least one processor (110)) including a processing circuit; and a memory (e.g., memory (120)) that stores instructions and includes one or more storage media. When the instructions are executed individually or collectively by the at least one processor, the electronic device may: display an image containing a visual object through the display, wherein the image has a first resolution; receive an input through the electronic device for editing the visual object of the image; identify a video acquired together with the image based on the input, wherein the video frames of the video have a second resolution lower than the first resolution; identify, among the video frames, a video frame containing the visual object to be used for editing the visual object of the image; and display a UI (user interface) object representing the visual object of the video frame through the display. Based on input to the UI object, the electronic device may upscale the visual object within the video frame; edit the visual object of the image using the upscaled visual object of the video frame; and display the image including the edited visual object through the display.
[0123] For example, the above instructions, when executed individually or collectively by the at least one processor, may cause the electronic device to: upscale the visual object within the video frame based on applying the video frame to a trained model available for image processing; and identify the upscaled visual object of the video frame obtained from the trained model based on the upscaling being performed.
[0124] For example, when the above instructions are executed individually or collectively by the at least one processor, the electronic device may be caused to: based on the input, display, through the display, a swipe bar including preview images obtained by downsizing the video frames; and based on a swipe input on the swipe bar, display, through the display, a UI object representing the visual object of the video frame that corresponds to the swipe input among the video frames.
[0125] For example, when the above instructions are executed individually or collectively by the at least one processor, the electronic device may be caused to: receive, through the electronic device, an input for playing the video while the image is displayed; and sequentially display the video frames through the display based on the input for playing the video.
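As an illustrative example only, the swipe bar behavior can be sketched as two small calculations: the size of a downsized preview image, and the video frame that corresponds to a horizontal swipe position. The pixel-based coordinates, the evenly divided bar, and the function names are assumptions made for the sketch.

```python
from typing import Tuple


def preview_size(frame_width: int, frame_height: int, bar_height: int) -> Tuple[int, int]:
    """Downsize a video frame to the swipe bar height, keeping its aspect ratio."""
    scale = bar_height / frame_height
    return max(1, round(frame_width * scale)), bar_height


def frame_index_for_swipe(x: float, bar_width: int, frame_count: int) -> int:
    """Map a horizontal swipe position on the bar to a video frame index."""
    if bar_width <= 0 or frame_count <= 0:
        raise ValueError("bar_width and frame_count must be positive")
    # Clamp positions that fall outside the bar to its first or last pixel.
    x = min(max(x, 0), bar_width - 1)
    return min(frame_count - 1, int(x * frame_count / bar_width))
```

For a hypothetical 300-pixel-wide bar covering 30 video frames, each frame occupies 10 pixels, so a swipe position of 150 selects frame index 15.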
[0126] For example, the image may include a first visual object and a second visual object. The visual object may be the first visual object or the second visual object. When the instructions are executed individually or collectively by the at least one processor, the electronic device may be caused to: identify the first visual object to be edited according to a priority condition defined among the first visual object and the second visual object based on the input; display visual information indicating that the first visual object is to be edited through the display based on identifying the first visual object; and display visual information indicating that the second visual object is to be edited through the display based on an input to change the visual object to be edited from the first visual object to the second visual object.
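As a non-limiting illustration, selecting a default visual object according to a priority condition and then changing the target in response to a further input can be sketched as follows. The disclosure does not define the priority condition; the sketch assumes, purely for illustration, that the object with the larger bounding-box area has priority, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VisualObject:
    label: str
    area: int  # hypothetical priority signal: bounding-box area in pixels


def default_target(objects: List[VisualObject]) -> VisualObject:
    """Pick the object to edit first; assumed priority is the largest area."""
    return max(objects, key=lambda obj: obj.area)


def change_target(objects: List[VisualObject], current: VisualObject) -> VisualObject:
    """Move the edit target to the next object, wrapping around at the end."""
    position = objects.index(current)
    return objects[(position + 1) % len(objects)]
```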
[0127] A non-transient computer-readable storage medium as described above may store one or more programs. The one or more programs may include instructions that, when executed by an electronic device (e.g., electronic device (101)) having a display (e.g., display (140)), cause the electronic device to: display, through the display, an image containing a visual object, wherein the image has a first resolution; receive, through the electronic device, an input for editing the visual object of the image; based on the input, identify a video acquired together with the image, wherein the video frames of the video have a second resolution lower than the first resolution; among the video frames, identify a video frame containing the visual object to be used for editing the visual object of the image; display, through the display, a UI (user interface) object representing the visual object of the video frame; based on an input to the UI object, upscale the visual object within the video frame; edit the visual object of the image using the upscaled visual object of the video frame; and display, through the display, the image including the edited visual object.
[0128] The effects obtainable from the present disclosure are not limited to those mentioned above, and other unmentioned effects will be clearly understood by those skilled in the art to which the present disclosure belongs.
[0129] The electronic device according to the various embodiments disclosed in this document may be of various forms. The electronic device may include, for example, a portable communication device (e.g., a smartphone), a computer device, a portable multimedia device, a portable medical device, a camera, a wearable device, or a consumer electronics device. The electronic device according to the embodiments of this document is not limited to the devices described above.
[0130] The various embodiments of this document and the terms used therein are not intended to limit the technical features described in this document to specific embodiments, and should be understood to include various modifications, equivalents, or substitutions of said embodiments. In connection with the description of the drawings, similar reference numerals may be used for similar or related components. The singular form of a noun corresponding to an item may include one or more of said items unless the relevant context clearly indicates otherwise. In this document, phrases such as "A or B," "at least one of A and B," "at least one of A or B," "A, B or C," "at least one of A, B and C," and "at least one of A, B, or C" may each include any one of the items listed together in the corresponding phrase, or all possible combinations thereof. Terms such as "first" and "second," or "1st" and "2nd," may be used simply to distinguish a component from other components and do not limit the components in any other aspect (e.g., importance or order). Where a (e.g., first) component is referred to as "coupled" or "connected" to another (e.g., second) component, with or without the terms "functionally" or "communicatively," it means that the component may be connected to the other component directly (e.g., via a wire), wirelessly, or through a third component.
[0131] The term “module” as used in the various embodiments of this document may include a unit implemented in hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit, for example. A module may be a component formed integrally, or a minimum unit of said component or a part thereof that performs one or more functions. For example, according to one embodiment, a module may be implemented in the form of an application-specific integrated circuit (ASIC).
[0132] Various embodiments of the present document may be implemented as software (e.g., program (940)) comprising one or more instructions stored in a storage medium (e.g., internal memory (936) or external memory (938)) readable by a machine (e.g., electronic device (901)). For example, a processor (e.g., processor (920)) of the machine (e.g., electronic device (901)) may call at least one of the one or more instructions stored in the storage medium and execute it. This enables the machine to operate to perform at least one function according to the at least one called instruction. The one or more instructions may include code generated by a compiler or code that can be executed by an interpreter. The storage medium readable by the machine may be provided in the form of a non-transitory storage medium. Here, "non-transitory" simply means that the storage medium is a tangible device and does not contain a signal (e.g., electromagnetic waves), and the term does not distinguish between cases where data is stored semi-permanently and cases where it is stored temporarily.
[0133] According to one embodiment, the method according to the various embodiments disclosed herein may be provided by being included in a computer program product. The computer program product may be traded between a seller and a buyer as a product. The computer program product may be distributed in the form of a device-readable storage medium (e.g., compact disc read-only memory (CD-ROM)), or distributed online (e.g., download or upload) through an application store (e.g., Play Store™) or directly between two user devices (e.g., smartphones). In the case of online distribution, at least a portion of the computer program product may be temporarily stored or temporarily created on a device-readable storage medium, such as the memory of a manufacturer's server, an application store's server, or a relay server.
[0134] According to various embodiments, each component (e.g., module or program) of the components described above may include a single entity or multiple entities, and some of the multiple entities may be separated and placed in other components. According to various embodiments, one or more of the aforementioned components or operations may be omitted, or one or more other components or operations may be added. Alternatively or additionally, multiple components (e.g., modules or programs) may be integrated into a single component. In this case, the integrated component may perform one or more functions of each of the multiple components in the same or similar manner as those performed by the corresponding component among the multiple components prior to integration. According to various embodiments, operations performed by the module, program, or other components may be executed sequentially, in parallel, iteratively, or heuristically, or one or more of the operations may be executed in a different order or omitted, or one or more other operations may be added.
Claims
1. In an electronic device, At least one processor including a processing circuit; and Memory that stores instructions and includes one or more storage media, When the above instructions are executed individually or collectively by the at least one processor: Identify an image containing a visual object, wherein the image has a first resolution; Receive input for editing the visual object of the above image through the electronic device; Based on the above input, a video acquired together with the image is identified, and the video frames of the video have a second resolution lower than the first resolution; Among the above video frames, identify a video frame containing the visual object to be used to edit the visual object of the image; Based on identifying the above video frame, upscale the visual object within the above video frame; Based on the upscaling being performed, identify the upscaled visual object of the video frame; and To edit the visual object of the above image using the upscaled visual object of the above video frame, The above electronic device, causing, Electronic device.
2. In Claim 1, The above electronic device further includes a camera, The above image and the above video are acquired based on user input for capturing the visual object through the camera, and The above video frame is a first video frame, and The above video frames include a second video frame corresponding to the image, and The above image and the above second video frame include the visual object captured substantially simultaneously through the camera, Electronic device.
3. The electronic device of claim 1, wherein the upscaling is performed by adjusting a resolution of the visual object of the video frame from the second resolution to the first resolution.
4. The electronic device of claim 1, wherein the instructions, when executed individually or collectively by the at least one processor, cause the electronic device to: upscale the visual object in the video frame based on applying the video frame to a trained model usable for image processing; and based on performing the upscaling, identify the upscaled visual object of the video frame obtained from the trained model.
5. The electronic device of claim 4, wherein the image is a first image, wherein the trained model is trained based on applying a second image having the first resolution and a third image having the second resolution, and wherein the second image and the third image include other visual objects.
6. The electronic device of claim 1, wherein the video frame is a first video frame, wherein the video frames include a second video frame corresponding to the image, and wherein the instructions, when executed individually or collectively by the at least one processor, cause the electronic device to: upscale the visual object in the first video frame based on applying the first video frame, the second video frame, and the image to a trained model usable for image processing; and based on performing the upscaling, identify the upscaled visual object of the first video frame obtained from the trained model.
7. The electronic device of claim 6, wherein the instructions, when executed individually or collectively by the at least one processor, cause the electronic device to further train the trained model in relation to the upscaling based on applying the second video frame and the image to the trained model.
8. The electronic device of claim 1, further comprising a display, wherein the instructions, when executed individually or collectively by the at least one processor, cause the electronic device to: based on identifying the image, display the image through the display; while the image is displayed, receive, through the electronic device, an input for playing the video; and based on the input for playing the video, display the video frames sequentially through the display.
9. The electronic device of claim 1, wherein the instructions, when executed individually or collectively by the at least one processor, cause the electronic device to: based on the input, identify a first value representing an orientation of the visual object of the image, identify a second value representing a pose of the visual object of the video frame, and identify a difference value between the first value and the second value; based on identifying that the difference value is greater than a threshold value, edit the visual object of the image corresponding to a head using the upscaled visual object of the video frame corresponding to the head; and based on identifying that the difference value is smaller than the threshold value, edit the visual object of the image corresponding to a face using the upscaled visual object of the video frame corresponding to the face.
10. The electronic device of claim 1, wherein the instructions, when executed individually or collectively by the at least one processor, cause the electronic device to, based on identifying that the video frame satisfies a reference condition, identify that the video frame includes the visual object to be used to edit the visual object of the image, and wherein the reference condition includes a condition associated with eye blinking of the visual object, a condition associated with blur of the visual object, and/or a condition associated with an orientation of the visual object.
11. A non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions that, when executed by an electronic device, cause the electronic device to: identify an image including a visual object, the image having a first resolution; receive, through the electronic device, an input for editing the visual object of the image; based on the input, identify a video acquired together with the image, video frames of the video having a second resolution lower than the first resolution; identify, among the video frames, a video frame including the visual object to be used to edit the visual object of the image; based on identifying the video frame, upscale the visual object in the video frame; based on performing the upscaling, identify an upscaled visual object of the video frame; and edit the visual object of the image using the upscaled visual object of the video frame.
12. The non-transitory computer-readable storage medium of claim 11, wherein the one or more programs comprise instructions that, when executed by the electronic device, cause the electronic device to: upscale the visual object in the video frame based on applying the video frame to a trained model usable for image processing; and based on performing the upscaling, identify the upscaled visual object of the video frame obtained from the trained model.
13. The non-transitory computer-readable storage medium of claim 12, wherein the image is a first image, wherein the trained model is trained based on applying a second image having the first resolution and a third image having the second resolution, and wherein the second image and the third image include other visual objects.
14. The non-transitory computer-readable storage medium of claim 11, wherein the video frame is a first video frame, wherein the video frames include a second video frame corresponding to the image, and wherein the one or more programs comprise instructions that, when executed by the electronic device, cause the electronic device to: upscale the visual object in the first video frame based on applying the first video frame, the second video frame, and the image to a trained model usable for image processing; and based on performing the upscaling, identify the upscaled visual object of the first video frame obtained from the trained model.
15. The non-transitory computer-readable storage medium of claim 14, wherein the one or more programs comprise instructions that, when executed by the electronic device, cause the electronic device to further train the trained model in relation to the upscaling based on applying the second video frame and the image to the trained model.
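As an illustrative aid that forms no part of the claims, the frame selection of claim 10 and the head-or-face decision of claim 9 may be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the `FrameInfo` fields, the numeric limits, the use of yaw in degrees as the first and second values, the choice of the sharpest qualifying frame, and the handling of a difference exactly equal to the threshold are all hypothetical. The sketch also requires all three conditions at once, whereas claim 10 recites them as "and/or".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FrameInfo:
    """Hypothetical per-frame measurements; these field names are not from the claims."""
    index: int
    eyes_open: bool   # condition associated with eye blinking
    blur: float       # condition associated with blur (assumed: 0.0 is sharp)
    yaw_deg: float    # condition associated with orientation (assumed: degrees)


def satisfies_reference_condition(frame: FrameInfo,
                                  max_blur: float = 0.3,
                                  max_abs_yaw_deg: float = 45.0) -> bool:
    """Check one possible reference condition; the limits are assumed values."""
    return (frame.eyes_open
            and frame.blur <= max_blur
            and abs(frame.yaw_deg) <= max_abs_yaw_deg)


def select_frame(frames: Sequence[FrameInfo]) -> Optional[FrameInfo]:
    """Return the sharpest frame that satisfies the reference condition, if any."""
    candidates = [f for f in frames if satisfies_reference_condition(f)]
    return min(candidates, key=lambda f: f.blur, default=None)


def choose_edit_region(image_yaw_deg: float,
                       frame_yaw_deg: float,
                       threshold_deg: float = 15.0) -> str:
    """Pick the region to edit from the difference between the two values.

    A difference above the threshold selects the head; otherwise the face.
    Treating a difference equal to the threshold as "face" is an assumption.
    """
    difference = abs(image_yaw_deg - frame_yaw_deg)
    return "head" if difference > threshold_deg else "face"


if __name__ == "__main__":
    frames = [
        FrameInfo(0, eyes_open=False, blur=0.1, yaw_deg=0.0),
        FrameInfo(1, eyes_open=True, blur=0.5, yaw_deg=0.0),
        FrameInfo(2, eyes_open=True, blur=0.2, yaw_deg=10.0),
        FrameInfo(3, eyes_open=True, blur=0.1, yaw_deg=60.0),
    ]
    chosen = select_frame(frames)
    if chosen is not None:
        print(chosen.index, choose_edit_region(0.0, chosen.yaw_deg))
```

In this sketch, a small pose difference leads to replacing only the face, while a larger difference leads to replacing the whole head, which mirrors the two branches of claim 9.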