Image processing method and device

By extracting and synthesizing the main image from dynamic images, the problem of fixed cutout results in dynamic photos is solved, enabling user-defined editing and personalized creation of dynamic images, thus enhancing playability and visual experience.

WO2026138360A1 · PCT designated stage · Publication Date: 2026-07-02 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: HUAWEI TECH CO LTD
Filing Date: 2025-11-28
Publication Date: 2026-07-02

AI Technical Summary

Technical Problem

The result of background removal in dynamic photos is fixed and cannot meet users' personalized needs, resulting in low playability and user experience.

Method used

By extracting the main image from a moving image and compositing it with other moving images, users can customize and edit moving images, including setting and replacing positions, frame alignment, and matching animation effects to generate new moving images.

Benefits of technology

It enhances the playability and personalization of dynamic images, improves image quality and visual experience, and supports saving and sharing edited images.

✦ Generated by Eureka AI based on patent content.

Figure CN2025138422_02072026_PF_FP_ABST

Abstract

The present application relates to the technical field of image processing, and provides an image processing method and a device. The method is applied to a first device. The method comprises: the first device displays a first dynamic image, the first dynamic image comprising a first subject; the first device may acquire, from the first dynamic image, a first subject image corresponding to the first subject; and, in response to an editing operation for the first subject image, the first device may perform compositing on the first subject image and a second dynamic image to display a third dynamic image, the third dynamic image comprising the first subject. In the embodiments of the present application, a subject image corresponding to a subject in one dynamic image and another dynamic image are subjected to compositing, thus generating and displaying a new dynamic image. The present application can support customized editing of dynamic images by users, thereby improving the playability of dynamic images and improving personalized experience for users.

Description

An image processing method and apparatus

[0001] This application claims priority to Chinese Patent Application No. 202411921628.5, filed on December 23, 2024, entitled “An Image Processing Method and Apparatus”, the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of image processing technology, and in particular to an image processing method and apparatus.

Background Technology

[0003] With the widespread use of electronic devices, people have increasingly higher demands for camera performance and photo quality. As an option for users to take photos, dynamic photos can be captured and displayed. Furthermore, users can utilize background removal tools to identify and separate the subject from the background in dynamic photos.

[0004] However, the cutout results obtained from background removal processing of animated photos are usually fixed, and users can only use fixed cutout results. This fails to meet users' personalized needs for animated photos, resulting in low playability and a poor user experience.

Summary of the Invention

[0005] This application provides an image processing method and apparatus that generate and display a new dynamic image by compositing the subject image corresponding to a subject in one dynamic image with another dynamic image. This supports user-defined editing of dynamic images, improving their playability and the user's personalized experience.

[0006] To achieve the above objectives, this application adopts the following technical solution:

[0007] In a first aspect, this application provides an image processing method applied to a first device. The method includes: the first device displaying a first dynamic image, the first dynamic image including a first subject. The first device can obtain a first subject image corresponding to the first subject from the first dynamic image. In response to an editing operation on the first subject image, the first device can combine the first subject image with a second dynamic image to display a third dynamic image, the third dynamic image including the first subject.

[0008] Thus, the first device in this embodiment can obtain the subject image corresponding to a subject in one dynamic image and composite that subject image with another dynamic image to generate and display a new dynamic image. Users can edit the subject image from any dynamic image, for example by placing it at a chosen position in another dynamic image, or by using it to replace the subject in another dynamic image. This supports user-defined editing of dynamic images and lets users create new dynamic images from existing ones, enhancing the playability of dynamic images and the user's personalized experience.

[0009] In one possible implementation, the first device can display an image editing interface that includes the first subject image. In response to a drag operation on the first subject image, the first device can overlay the first subject image onto a target area in the second dynamic image to obtain the third dynamic image, where the target area is the operation area corresponding to the drag operation. The first device can then display the third dynamic image.

[0010] Thus, in this embodiment, users can edit the subject in any dynamic image, for example by copying or inserting the subject at any position in another dynamic image. This supports user-defined editing of dynamic images, enhancing their playability and the user experience.
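
The drag-and-overlay step described above can be pictured as per-frame alpha compositing. The following is a minimal sketch under stated assumptions, not the method of this application: frames are modelled as row-major grids of RGBA tuples, the background is assumed opaque, both sequences are assumed to already have the same number of frames, and the names `blend`, `overlay_frame`, and `composite_sequence` are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (R, G, B, A), each channel 0..255
Frame = List[List[Pixel]]          # row-major grid of pixels


def blend(src: Pixel, dst: Pixel) -> Pixel:
    """'Source over' blend of one subject pixel onto an opaque background pixel."""
    a = src[3] / 255.0
    r, g, b = (round(s * a + d * (1.0 - a)) for s, d in zip(src[:3], dst[:3]))
    return (r, g, b, 255)


def overlay_frame(subject: Frame, background: Frame, top: int, left: int) -> Frame:
    """Paste one subject frame onto a copy of one background frame.

    (top, left) is the target position, e.g. where the drag operation ended.
    Subject pixels that fall outside the background are ignored.
    """
    out = [row[:] for row in background]  # leave the input frame untouched
    for y, row in enumerate(subject):
        for x, px in enumerate(row):
            ty, tx = top + y, left + x
            if 0 <= ty < len(out) and 0 <= tx < len(out[0]):
                out[ty][tx] = blend(px, out[ty][tx])
    return out


def composite_sequence(subject_frames: List[Frame], background_frames: List[Frame],
                       top: int, left: int) -> List[Frame]:
    """Composite frame i of the subject onto frame i of the background."""
    return [overlay_frame(s, b, top, left)
            for s, b in zip(subject_frames, background_frames)]
```

Fully transparent subject pixels (alpha 0) leave the background unchanged, which is what lets a cut-out subject keep its irregular outline when placed on another image.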

[0011] In one possible implementation, the second dynamic image includes a second subject, and the first device can display an image editing interface that includes the first subject image. In response to a replacement operation on the first subject image, the first device can determine a target region in the second dynamic image, the target region being the subject region corresponding to the second subject. The first device can then overlay the first subject image onto the target region in the second dynamic image to obtain the third dynamic image, and subsequently display the third dynamic image.

[0012] Thus, in this embodiment, users can edit the subject in any dynamic image, for example by using it to replace the subject in another dynamic image. For instance, if a user likes the background of another dynamic image, they can replace the subject in that image with the subject they want, obtaining a satisfactory subject and background in a single dynamic image. This supports user-defined editing of dynamic images, enhancing their playability and the user experience.
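
For the replacement case, the target region is the area occupied by the second subject. One way to picture this, purely as an illustrative assumption, is to take the bounding box of a binary mask of the second subject and scale the first subject image to fit inside it. The mask representation and the names `bounding_box` and `fit_scale` below are hypothetical; the application does not state how the subject region is represented.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # 1 where a pixel belongs to the subject, otherwise 0


def bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, height, width) of the smallest box covering all 1s.

    Returns None when the mask contains no subject pixels.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    width = len(mask[0]) if mask else 0
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    if not rows or not cols:
        return None
    top, left = rows[0], cols[0]
    return top, left, rows[-1] - top + 1, cols[-1] - left + 1


def fit_scale(subject_h: int, subject_w: int, box_h: int, box_w: int) -> float:
    """Uniform scale factor that fits the subject inside the box, keeping aspect ratio."""
    return min(box_h / subject_h, box_w / subject_w)
```

The resulting box gives the position at which the first subject image would be overlaid, and the scale factor keeps it from spilling outside the region the second subject occupied.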

[0013] In one feasible approach, while compositing the first subject image and the second dynamic image, the first device can compare the number of image frames in the first subject image with the number of image frames in the second dynamic image. If the two numbers differ, the first device can perform frame number alignment processing on the first subject image and the second dynamic image.

[0014] Therefore, in this embodiment of the application, when the numbers of image frames in the first subject image and the second dynamic image are inconsistent, frame number alignment processing, that is, frame number balancing, can be performed on the two. This improves the image quality of the third dynamic image, resulting in a better display effect and ultimately enhancing the user's visual experience.

[0015] In one feasible approach, during the frame number alignment process of the first subject image and the second dynamic image, if the number of image frames of the first subject image is less than the number of image frames of the second dynamic image, the first device performs frame number padding on the first subject image according to the number of image frames of the second dynamic image to obtain an updated first subject image.

[0016] If the number of image frames in the first subject image is greater than the number of image frames in the second dynamic image, the first device removes redundant images from the first subject image, then performs semantic analysis and sorting on the remaining images to obtain an intermediate result of the first subject image. The first device can then remove redundant frames from the intermediate result based on the number of image frames in the second dynamic image to obtain an updated first subject image.

[0017] Thus, in this embodiment of the application, when the numbers of image frames in the first subject image and the second dynamic image are inconsistent, frame number balancing can be performed on the two to maintain smooth edge transitions and consistent semantic information during fusion. This improves the image quality of the third dynamic image, resulting in a better display effect and ultimately enhancing the user's visual experience.
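
The two branches of frame number alignment can be sketched as follows. This is only an illustration under assumptions: the application does not say how frames are padded or how redundancy is detected, so the sketch pads by repeating frames at evenly spaced positions, treats identical consecutive frames as redundant, and omits the semantic analysis and sorting step entirely. The names `resample`, `drop_consecutive_duplicates`, and `align_frame_count` are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def resample(frames: Sequence[T], target: int) -> List[T]:
    """Pick `target` frames at evenly spaced positions.

    When target > len(frames) this repeats frames (padding);
    when target < len(frames) this skips frames (trimming).
    """
    if target <= 0 or not frames:
        return []
    n = len(frames)
    return [frames[i * n // target] for i in range(target)]


def drop_consecutive_duplicates(frames: Sequence[T]) -> List[T]:
    """Stand-in for redundant-image removal: drop a frame equal to its predecessor."""
    out: List[T] = []
    for frame in frames:
        if not out or out[-1] != frame:
            out.append(frame)
    return out


def align_frame_count(subject_frames: Sequence[T], target: int) -> List[T]:
    """Return subject frames adjusted to exactly `target` frames."""
    n = len(subject_frames)
    if n == target:
        return list(subject_frames)
    if n < target:
        # Fewer subject frames than background frames: pad.
        return resample(subject_frames, target)
    # More subject frames: remove redundant ones first, then trim to the target.
    return resample(drop_consecutive_duplicates(subject_frames), target)
```

For example, aligning a two-frame subject to four frames yields each frame twice, while aligning six distinct frames to three keeps every second frame.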

[0018] In one feasible approach, when the third dynamic image includes a first subject and a second subject, the animation effects of the first subject and the second subject are matched.

[0019] Thus, in this embodiment of the application, when the third dynamic image includes a first subject and a second subject, the animation effects of the first subject and the second subject can be matched. The two subjects then appear more harmonious and realistic together, enhancing the user's visual experience.

[0020] In one possible implementation, the first device includes a first application, and the method further includes: the first device can store the third dynamic image in a preset image format. When displaying the third dynamic image, the first device can load the third dynamic image in the preset image format through the first application and display it.

[0021] Thus, in this embodiment of the application, the user can save and share the third dynamic image through an application on the first device. Not only can the dynamic image be edited, but the edited dynamic image can also be saved and displayed. This enhances the interactivity between the user and the device, as well as the playability of the dynamic image. Simultaneously, all applications on the phone can load and display the edited dynamic image, maintaining consistency in the user experience.

[0022] In one possible implementation, the first device includes a second application. In response to a sharing operation for the third dynamic image, the first device can send the third dynamic image to a second device via the second application.

[0023] Thus, in this embodiment of the application, users can share the third dynamic image through an application on the first device. Not only can the dynamic image be edited, but it can also be shared after editing. This enhances the interactivity between the user and the device, as well as the playability of the dynamic image. Simultaneously, all applications on the first device can share the completed dynamic image, maintaining consistency in user experience.

[0024] In one possible implementation, the first device includes a first application and a second application. The first device can store a first subject image according to a preset image format and load the first subject image with the preset image format through the first application. The first device can also display the first subject image. Subsequently, in response to a sharing operation for the first subject image, the first device can also send the first subject image to the second device through the second application.

[0025] Thus, in this embodiment of the application, the user can display and share the first subject image through applications on the first device, which can be all applications or some applications. This enhances the interactivity between the user and the device, as well as the playability of dynamic images. In addition, all applications on the first device can display and share the first subject image, maintaining consistency in user experience.

[0026] In one feasible approach, the first application includes an input method application and a gallery application, and the second application includes a third-party application.

[0027] In one possible implementation, when acquiring the first subject image corresponding to the first subject from the first dynamic image, the first device can input the cover image of the first dynamic image into a first model to obtain the region of interest corresponding to the cover image, and then input the region of interest and the first dynamic image into a second model to obtain the first subject image.

[0028] Thus, by utilizing the first model and the second model to obtain the first subject image, the embodiments of this application can improve the speed and accuracy of obtaining the first subject image.
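
The two-stage flow (cover image to region of interest, then region of interest plus dynamic image to subject image) can be sketched as a simple pipeline. Everything here is an assumption for illustration: the models are represented as plain callables, the region of interest as a `(top, left, height, width)` box, and `DynamicImage`, `extract_subject`, and `stub_crop_model` are hypothetical names. The stub merely crops each frame to the box and stands in for whatever the second model actually does.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Frame = List[List[int]]            # toy single-channel frame
Box = Tuple[int, int, int, int]    # (top, left, height, width)


@dataclass
class DynamicImage:
    cover: Frame         # the cover image of the dynamic image
    frames: List[Frame]  # all frames of the dynamic image


FirstModel = Callable[[Frame], Box]
SecondModel = Callable[[Box, DynamicImage], List[Frame]]


def extract_subject(image: DynamicImage, first_model: FirstModel,
                    second_model: SecondModel) -> List[Frame]:
    """Run the two stages in order and return the subject image frames."""
    roi = first_model(image.cover)   # stage 1: cover image -> region of interest
    return second_model(roi, image)  # stage 2: ROI + dynamic image -> subject image


def stub_crop_model(roi: Box, image: DynamicImage) -> List[Frame]:
    """Placeholder second model: crop every frame to the region of interest."""
    top, left, height, width = roi
    return [[row[left:left + width] for row in frame[top:top + height]]
            for frame in image.frames]
```

Running the first model only on the cover image means the costlier per-frame step can be restricted to the region of interest, which is one plausible reading of the speed benefit stated above.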

[0029] In one feasible approach, the first device may also display the main outline of the first subject with a target effect, the target effect including: the main outline of the first subject changing according to a preset rule.

[0030] Thus, in this embodiment, the outline of the first subject can be displayed with the target effect, that is, a regular dynamic effect that follows a preset rule, such as a cyclic flashing effect produced by adjusting transparency. This indicates to the user that the first device is identifying the first subject in the first dynamic image. The visibility of the first subject and the guidance provided during identification are therefore improved, giving the user a better visual experience.
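
A cyclic flashing effect based on transparency can be expressed as an opacity value that varies periodically with time. The sketch below assumes a cosine curve, a one-second period, and an opacity range of 0.2 to 1.0; none of these values come from the application, and `outline_alpha` is a hypothetical name.

```python
import math


def outline_alpha(t_ms: float, period_ms: float = 1000.0,
                  min_alpha: float = 0.2, max_alpha: float = 1.0) -> float:
    """Opacity of the subject outline at time t_ms.

    Starts at max_alpha, falls smoothly to min_alpha at half a period,
    and returns to max_alpha at the end of each period.
    """
    phase = 2.0 * math.pi * (t_ms % period_ms) / period_ms
    return min_alpha + (max_alpha - min_alpha) * (1.0 + math.cos(phase)) / 2.0
```

A renderer would call this once per display refresh with the elapsed time and draw the outline with the returned opacity.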

[0031] In a second aspect, this application provides an electronic device, which includes a display screen, a memory, and one or more processors; the display screen is used to display dynamic images, and the memory is coupled to the processor; wherein, the memory stores computer program code, which includes computer instructions, and when the computer instructions are executed by the processor, the electronic device performs the image processing method as described in the first aspect above.

[0032] In a third aspect, this application provides a computer-readable storage medium storing instructions that, when executed on a computer, enable the computer to perform the image processing method as described in the first aspect above.

[0033] In a fourth aspect, this application provides a computer program product containing instructions that, when executed by an electronic device, cause the electronic device to perform the image processing method described in the first aspect above.

Attached Figure Description

[0034] Figure 1 is a schematic diagram of a related technology provided in an embodiment of this application;

[0035] Figure 2 is a structural schematic diagram of a mobile phone provided in an embodiment of this application;

[0036] Figure 3 is a software structure block diagram of a mobile phone provided in an embodiment of this application;

[0037] Figure 4 is a flowchart illustrating an image processing method provided in an embodiment of this application;

[0038] Figure 5 is a schematic diagram of an image interface provided in an embodiment of this application;

[0039] Figure 6 is an interactive schematic diagram of displaying a first dynamic image provided in an embodiment of this application;

[0040] Figure 7 is a schematic diagram of an interface showing a target effect provided in an embodiment of this application;

[0041] Figure 8 is a schematic diagram of an interaction method for a first subject image provided in an embodiment of this application;

[0042] Figure 9 is a schematic diagram of an image editing interface provided in an embodiment of this application;

[0043] Figure 10 is a schematic diagram of an image editing interface provided in an embodiment of this application;

[0044] Figure 11 is a schematic diagram of an image editing interface provided in an embodiment of this application;

[0045] Figure 12 is a schematic flowchart of a frame number alignment process provided in an embodiment of this application;

[0046] Figure 13 is a schematic diagram of a function menu provided in an embodiment of this application;

[0047] Figure 14 is a schematic diagram of an interaction method for a third dynamic image provided in an embodiment of this application;

[0048] Figure 15 is a schematic diagram of an interface for an interactive method of a third dynamic image provided in an embodiment of this application;

[0049] Figure 16 is a schematic diagram of the structure of a mobile phone provided in an embodiment of this application.

Detailed Implementation

[0050] The technical solutions of the embodiments of this application will be described below with reference to the accompanying drawings. In the description of this application, unless otherwise stated, " / " indicates that the objects before and after are in an "or" relationship. For example, A / B can represent A or B. "And / or" in this application is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A alone, A and B simultaneously, and B alone, where A and B can be singular or plural. Furthermore, in the description of this application, unless otherwise stated, "multiple" refers to two or more. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c can represent: a, b, c, ab, ac, bc, or abc, where a, b, and c can be single or multiple. In addition, in order to clearly describe the technical solutions of the embodiments of this application, the terms "first" and "second" are used in the embodiments of this application to distinguish the same or similar items with basically the same function and effect.

[0051] Those skilled in the art will understand that the terms "first," "second," etc., do not limit the quantity or order of execution, and that "first," "second," etc., are not necessarily different. Furthermore, in some embodiments of this application, words such as "exemplary" or "for example" are used to indicate that something is being described as an example, illustration, or description. Any embodiment or design scheme described as "exemplary" or "for example" in the embodiments of this application should not be construed as being more preferred or advantageous than other embodiments or design schemes. Specifically, the use of words such as "exemplary" or "for example" is intended to present the relevant concepts in a concrete manner for ease of understanding.

[0052] Furthermore, the device architecture and business scenarios described in the embodiments of this application are for the purpose of more clearly illustrating the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided in the embodiments of this application. As those skilled in the art will know, with the evolution of device architecture and the emergence of new business scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.

[0053] With the widespread use of electronic devices, people have increasingly higher demands for camera performance and photo quality. As an option for users to take photos, dynamic photos can be captured and displayed. Furthermore, users can utilize background removal tools to identify and separate the subject from the background in dynamic photos.

[0054] For example, in related technologies, a user can cut out a subject from a live photo and use the cutout result as a sticker. Referring to Figure 1(A), the electronic device can perform cutout processing in response to the user's operation and obtain the cutout result. The electronic device can also display a function menu, as shown in Figure 1(B), which includes a function control for adding a sticker. In response to an operation on that function control, the electronic device saves the cutout result as a sticker. The user can then use the cutout result in a messaging application: the electronic device displays the cutout result through the messaging application, as shown in Figure 1(C), and, in response to an operation on the cutout result, shares it to other devices through the messaging application.

[0055] However, the cutout results obtained from background removal processing on animated photos are fixed, meaning users can only use a fixed cutout result. Furthermore, the application scenarios for these cutout results are limited, such as only being usable within messaging applications. This fails to meet users' personalized needs for animated photos, resulting in low playability and a poor user experience.

[0056] To address the aforementioned problems, embodiments of this application provide an image processing method applied to a first device. The method includes: the first device displaying a first dynamic image, the first dynamic image including a first subject; the first device also acquiring a first subject image corresponding to the first subject from the first dynamic image; and the first device further responding to an editing operation on the first subject image by compositing the first subject image with a second dynamic image to display a third dynamic image, the third dynamic image including the first subject.

[0057] Thus, the first device in this embodiment can obtain the subject image corresponding to a subject in one dynamic image and composite that subject image with another dynamic image to generate and display a new dynamic image. Users can edit the subject image from any dynamic image, for example by placing it at a chosen position in another dynamic image, or by using it to replace the subject in another dynamic image. This supports user-defined editing of dynamic images and lets users create new dynamic images from existing ones, enhancing the playability of dynamic images and the user's personalized experience.

[0058] In the embodiments of this application, the first device may be a mobile phone, tablet computer, wearable device, vehicle-mounted device, augmented reality (AR) / virtual reality (VR) device, laptop computer, ultra-mobile personal computer (UMPC), netbook, personal digital assistant (PDA), etc. The embodiments of this application do not impose any restrictions on the specific type of the first device.

[0059] The operating system installed on the first device includes, but is not limited to, Or other operating systems. This application does not limit the specific type of the first device or the type of operating system if an operating system is installed.

[0060] For example, taking a mobile phone as the first device, Figure 2 shows a structural schematic diagram of the mobile phone 100.

[0061] Mobile phone 100 may include processor 110, external memory interface 120, internal memory 121, Universal Serial Bus (USB) interface 130, charging management module 140, power management module 141, battery 142, antenna 1, antenna 2, mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone jack 170D, sensor module 180, buttons 190, motor 191, indicator 192, camera 193, display screen 194, and Subscriber Identification Module (SIM) card interface 195, etc.

[0062] The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an accelerometer sensor 180E, a distance sensor 180F, a proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and an image sensor 180N.

[0063] It is understood that the structures illustrated in the embodiments of this application do not constitute a specific limitation on the mobile phone 100. In other embodiments of this application, the mobile phone 100 may include more or fewer components than illustrated, or combine some components, or split some components, or have different component arrangements. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

[0064] Processor 110 may include one or more processing units, such as an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, memory, a video codec, a digital signal processor (DSP), a baseband processor, and / or a neural network processing unit (NPU). These different processing units may be independent devices or integrated into one or more processors.

[0065] The controller can serve as the central nervous system and command center of the mobile phone 100. Based on the instruction operation code and timing signals, the controller generates operation control signals to control the fetching and execution of instructions.

[0066] The processor 110 may also include a memory for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. This memory can store instructions or data that the processor 110 has just used or that are used repeatedly. If the processor 110 needs to use the instruction or data again, it can retrieve it directly from the memory. This avoids repeated accesses, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.

[0067] The wireless communication function of mobile phone 100 can be realized through antenna 1, antenna 2, mobile communication module 150, wireless communication module 160, modem processor and baseband processor.

[0068] The wireless communication module 160 can provide solutions for wireless communication applications on the mobile phone 100, including Wireless Local Area Networks (WLAN) (such as Wireless Fidelity (Wi-Fi) networks), Bluetooth (BT), Global Navigation Satellite System (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), and Infrared (IR) technologies. The wireless communication module 160 can be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via antenna 2, modulates and filters the electromagnetic wave signals, and sends the processed signal to processor 110. The wireless communication module 160 can also receive signals to be transmitted from processor 110, modulate and amplify them, and then convert them into electromagnetic waves for radiation via antenna 2.

[0069] In some embodiments, antenna 1 of mobile phone 100 is coupled to mobile communication module 150, and antenna 2 is coupled to wireless communication module 160, enabling mobile phone 100 to communicate with networks and other devices via wireless communication technology. Wireless communication technology may include Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time-Division Synchronous Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and / or IR technologies, etc. GNSS can include the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the Beidou Navigation Satellite System (BDS), the Quasi-Zenith Satellite System (QZSS), and / or Satellite Based Augmentation Systems (SBAS).

[0070] The mobile phone 100 implements display functions through a GPU, a display screen 194, and an application processor. The GPU is a microprocessor for image processing, connected to the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations and for graphics rendering. The processor 110 may include one or more GPUs, which execute program instructions to generate or modify display information.

[0071] The display screen 194 is used to display images, videos, etc. The display screen 194 includes a display panel. In some embodiments, the mobile phone 100 may include one or N display screens 194, where N is a positive integer greater than 1.

[0072] The mobile phone 100 can implement shooting functions through the ISP, the camera 193, the video codec, the GPU, the display screen 194, and the application processor.

[0073] The ISP (Image Signal Processor) is used to process data fed back from the camera 193. For example, when taking a picture, the shutter is opened, and light is transmitted through the lens to the camera's photosensitive element. The light signal is converted into an electrical signal, and the camera's photosensitive element transmits the electrical signal to the ISP for processing, transforming it into an image visible to the naked eye. The ISP can also perform algorithmic optimization on image noise and brightness. The ISP can also optimize parameters such as exposure and color temperature of the shooting scene. In some embodiments, the ISP can be set in the camera 193.

[0074] Camera 193 is used to capture still images or videos. The lens forms an optical image of an object and projects it onto the photosensitive element. The photosensitive element can be a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the light signal into an electrical signal, which is then passed to an ISP for conversion into a digital image signal. The ISP outputs the digital image signal to a DSP for processing. The DSP converts the digital image signal into image signals in standard RGB, YUV, or other formats. In some embodiments, mobile phone 100 may include one or N cameras 193, where N is a positive integer greater than 1.
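
As an illustration of converting between the RGB and YUV formats mentioned above, the sketch below uses the classic BT.601 definitions of luma and colour-difference signals. The choice of BT.601 is an assumption: the application does not say which YUV variant the DSP produces, and `rgb_to_yuv` is a hypothetical helper operating on normalized values in the range 0 to 1.

```python
from typing import Tuple


def rgb_to_yuv(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert normalized RGB (0..1) to analog-style YUV using BT.601 weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma: weighted sum of R, G, B
    u = 0.492 * (b - y)                    # blue colour difference
    v = 0.877 * (r - y)                    # red colour difference
    return y, u, v
```

Any neutral grey (equal R, G, and B) maps to zero colour difference, so only the Y component carries information for such pixels.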

[0075] Here, the camera 193 can be located within the mobile phone 100. Alternatively, it can be a component of the first device. In some implementations, the camera can also be located externally to the first device and connected via wired or wireless means. For example, the camera can connect to the first device via Bluetooth or a mobile hotspot. The first device can control the camera by sending or receiving commands.

[0076] The mobile phone 100 may also include a camera module, which can be located within the camera 193. Alternatively, it can be located in other positions within the mobile phone 100. The camera module includes a lens, a focusing motor, a base, a circuit board, and an image sensor.

[0077] The base is fixedly connected to one side of the circuit board. The focusing motor is located on the side of the base away from the circuit board and is fixedly connected to the periphery of the base. The lens is mounted in the center of the focusing motor. The image sensor is fixed to the side of the circuit board facing the lens.

[0078] The lens is used to capture the light signal reflected from the subject. The focusing motor is used to drive the lens to move in a direction parallel to the optical axis. The optical axis refers to the line passing through the center of the lens. In some embodiments, the mobile phone 100 can control the focusing motor to move the lens to the focusing position, thereby completing the focusing process.

[0079] In some embodiments, the focusing motor may be a voice coil motor (VCM), a shape memory alloy (SMA) motor, a piezo motor (PM), or a stepper motor (STM), etc.

[0080] The external storage interface 120 can be used to connect an external storage card, such as a Micro SD card, to expand the storage capacity of the mobile phone 100. The external storage card communicates with the processor 110 through the external storage interface 120 to perform data storage functions. For example, music, video, and other files can be saved on the external storage card.

[0081] The mobile phone 100 can achieve audio functions such as music playback and recording through the audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone jack 170D, and application processor.

[0082] The image sensor 180N can be used to detect objects within the range captured by the camera, with each photosensitive unit corresponding to a pixel in the image sensor. The image sensor 180N may include a color (red, green, blue, RGB) image sensor, a monochrome image sensor, and an infrared image sensor, etc., but this embodiment does not limit the specific type. The image sensor 180N is used to acquire raw images, which may include RGB images, RYB images, monochrome images, and infrared images, etc.

[0083] For ease of description, the following takes as an example the case where the image sensor 180N is an RGB image sensor and the original image is an RGB image. For instance, the original image can be a single-frame RGB image. Each photosensitive unit is covered with an RGB (red, green, blue) filter. After receiving light, the photosensitive unit generates a current whose magnitude corresponds to the light intensity, so the electrical signal directly output by the photosensitive unit is analog. This analog electrical signal is then converted into a digital signal, and all the resulting digital signals are output as a digital image matrix to a dedicated DSP chip for processing. The RGB image sensor outputs a full-frame image of the captured area in frame format.
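The analog-to-digital step described above can be sketched as follows. This is a minimal illustration, assuming normalized analog readings in the range 0.0 to 1.0 and an 8-bit output; the bit depth and the function name `quantize_readout` are not taken from the application.

```python
def quantize_readout(analog_rows, bits=8):
    """Map normalized analog readings (0.0 to 1.0) to integer codes, row by row.

    Returns a digital image matrix with one code per photosensitive unit.
    """
    max_code = (1 << bits) - 1
    matrix = []
    for row in analog_rows:
        matrix.append([
            # Clamp so that out-of-range readings saturate instead of overflowing.
            max(0, min(max_code, round(value * max_code)))
            for value in row
        ])
    return matrix
```

Each inner list corresponds to one row of photosensitive units, so the result has the same shape as the sensor readout.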

[0084] In some embodiments, multiple image sensors can be arranged in the same camera. For example, a single-lens dual-sensor camera integrates both an RGB image sensor and a motion sensor within a single camera. Other examples include dual-lens dual-sensor cameras and single-lens triple-sensor cameras, where these sensors are used to image the same subject. When the number of lenses is less than the number of sensors, a beam splitter can be placed between the lenses and sensors to distribute the light entering through one lens across multiple sensors, ensuring that each sensor receives light. Furthermore, the number of processors in these cameras can be one or more. This application does not specifically limit the arrangement or number of components.

[0085] In other embodiments, only one image sensor may be provided in the same camera.

[0086] In some embodiments, camera calibration is typically performed before the image sensor leaves the factory to make the image information acquired by the image sensor more accurate.

[0087] The first device provided in this application embodiment can run an operating system (OS). The operating system can be any of the operating systems used in industry, such as an operating system developed based on OpenHarmony, for example... or another operating system, such as the iOS mobile operating system. It can also be an open-source operating system or a derivative of one, such as Linux and other embedded operating systems. It can also be a future operating system, such as an AI operating system based on artificial intelligence. An operating system is a set of interconnected system software programs that manage and control the operation of the first device, manage hardware and software resources, and provide common services for user interaction. In the first device, the operating system connects downwards to the physical devices at the hardware layer and upwards provides a runtime environment for application software.

[0088] An operating system typically includes a kernel layer, a middleware layer, and an application layer. The application layer includes applications, which can include system applications and third-party applications. The middleware layer includes a suite of software providing various services to application developers, or frameworks providing services such as databases, multimedia, and graphics, or capabilities such as distributed scheduling and system scaling. For example, the middleware layer may include a framework layer and / or a system service layer. The framework layer provides application programming interfaces (APIs) and programming frameworks for applications in the application layer. The system service layer includes the system's core capabilities, providing services to applications through the framework layer. The kernel layer is the layer between hardware and software. The kernel layer may include hardware drivers and the operating system kernel. In addition to providing hardware drivers, the kernel layer also supports functions such as memory management and system process management.

[0089] The types and forms of first devices used in daily life vary greatly, and their application scenarios are also very broad. Therefore, depending on the form and function of the first device, the application scenario, and user needs, the operating systems used in these first devices may also differ. The basic functions implemented by the first device provided in this application can be implemented using a general-purpose operating system or a dedicated operating system. To more clearly illustrate the implementation of the embodiments of this application under a specific operating system, the following shows... Based on the architecture, those skilled in the art can deduce the implementation of the embodiments of this application under other specific operating systems, such as... operating systems, etc.

[0090] Figure 3 is a software structure block diagram of a mobile phone 100 according to an embodiment of this application.

[0091] The software architecture of mobile phone 100 can be divided into several layers. In some embodiments, from bottom to top, these layers are: kernel layer, system service layer, framework layer, and application layer. Layers communicate with each other through software interfaces. System functions can be tailored, added, or combined at the subsystem level depending on the deployment scenario of different device forms. Each subsystem can also be tailored, added, or combined at the functional level.

[0092] The kernel layer includes the kernel abstraction layer, the kernel subsystem, and the driver subsystem.

[0093] The Kernel Abstraction Layer (KAL) provides basic kernel capabilities to upper layers by shielding the differences between multiple kernels, including but not limited to process / thread management, memory management, file system, network management, and peripheral device management.

[0094] Kernel Subsystem: Supports the selection of a suitable OS kernel for different resource-constrained devices, including but not limited to Linux kernel, HarmonyOS kernel, LiteOS (Lite Operating System), etc.

[0095] Driver Subsystem: The driver framework is the foundation for the open system hardware ecosystem, providing unified peripheral access capabilities and a framework for driver development and management. The driver framework includes: display drivers, camera drivers, audio drivers, Bluetooth drivers, sensor drivers, etc.

[0096] The system service layer comprises the core capabilities of the system, providing services to applications through the framework layer. This layer includes, but is not limited to, the following subsystems:

[0097] The system's basic capability subsystem set provides fundamental capabilities for the operation, scheduling, and migration of distributed applications across multiple devices. This set may include distributed soft bus, distributed data management, distributed task scheduling, and Ark multi-language runtime; it may also include multi-modal input subsystem, graphics subsystem, security subsystem, and AI business subsystem.

[0098] Basic software service subsystem set: provides public and general software services; the basic software service subsystem set may include event notification subsystem, telephone service subsystem, multimedia subsystem, etc.

[0099] Enhanced software service subsystem suite: Provides differentiated enhanced software services for different devices; the enhanced software service subsystem suite may include smart screen proprietary business subsystem, wearable proprietary business subsystem, IoT proprietary business subsystem, etc.

[0100] Hardware service subsystem set: Provides hardware services; the hardware service subsystem set may include location service subsystem, user IAM (Identity and Access Management) subsystem, wearable proprietary hardware service subsystem, biometric identification subsystem, IoT proprietary hardware service subsystem, etc.

[0101] Distributed task scheduling enables distributed service management (discovery, synchronization, registration, and invocation), supporting remote startup, remote invocation, remote connection, and migration of applications across devices.

[0102] Distributed data management enables data synchronization, data storage, data sharing, and data access across all scenarios and devices.

[0103] The distributed soft bus provides communication-related capabilities for seamless interconnection between multiple devices, including: WLAN service capabilities, Bluetooth service capabilities, soft bus, inter-process communication RPC (Remote Procedure Call), and StarFlash communication capabilities.

[0104] Ark Multilingual Runtime is a unified compilation runtime platform designed to support the joint compilation and execution of multiple programming languages and multiple chip platforms.

[0105] The framework layer provides application programming interfaces (APIs) and programming frameworks for applications in the application layer. The framework layer includes: the ArkUI framework (which provides a complete infrastructure for UI development of system applications, including UI functions such as components, layouts, animations, and interactive events, as well as a real-time interface preview tool), the user application framework, and the Ability framework (an Ability is a lightweight application; the Ability framework schedules and manages the operation and lifecycle of Abilities). Different devices may have different operating systems, and the APIs they support may also differ.

[0106] The HarmonyOS API is designed to support... HarmonyOS API provides a range of open capabilities for application development. It can be configured at the framework layer or independently of it. The HarmonyOS API includes the Audio API, Push API, and Account API, among others.

[0107] In some examples, the ArkUI framework includes a MovingPhotoView component and a mask management framework. The MovingPhotoView component is used to display dynamic images, and the mask management framework is used to manage the dynamic images.

[0108] In some examples, the framework layer also includes a file subsystem, a media framework subsystem, an AI framework, and capability development services.

[0109] The file subsystem includes a media library and a media library service. The media framework subsystem and the media library can store dynamic images.

[0110] The AI framework provides AI capabilities, which can include speech and vision services. These services can include a vision engine and a speech engine. The vision engine provides capabilities such as subject segmentation, face detection, and text recognition. The speech engine provides capabilities such as speech translation, natural language understanding, speech recognition, and speech synthesis.

[0111] Capability development services include image analysis services and image analysis capabilities. The image analysis capabilities can include image text recognition, natural language understanding in images, subject segmentation, and face detection.

[0112] Applications can include system apps and extended / third-party apps. System apps can include the desktop, control bar, settings, contacts, input method, gallery, etc., while extended / third-party apps can include social apps, travel apps, etc.

[0113] The following embodiments illustrate the image processing method provided in this application with reference to the accompanying drawings, taking a mobile phone with the structure shown in Figure 1 as an example. Referring to Figure 4, the method may include:

[0114] S401. The mobile phone displays a first dynamic image, which includes a first subject.

[0115] S402. The mobile phone obtains the first subject image corresponding to the first subject from the first dynamic image.

[0116] In some embodiments of this application, the first dynamic image may be a dynamic image stored on the mobile phone. The first dynamic image may be a dynamic image captured by the user in the camera's live mode, or a dynamic image saved by the user from an application.

[0117] The first dynamic image may include a first subject, which can be understood as a foreground subject in the image, such as the subject being photographed. There can be one or more first subjects. The first subject may include a subject in motion and / or a subject that is not in motion.

[0118] For example, the subject in motion can include people, animals, vehicles, etc. The subject not in motion can include mountains, trees, buildings, etc. Of course, the subject not in motion can also include people, animals, vehicles, etc., meaning that the subject capable of motion is stationary during the shooting process. This application does not limit the specific source or the number of subjects in the first dynamic image.

[0119] In some examples, a user can view the first dynamic image through an application on the mobile phone, such as a gallery application. For instance, referring to Figure 5(A), the phone can display an image interface of the gallery application that includes the first dynamic image. The first dynamic image includes a first subject, which is a puppy.

[0120] Specifically, referring to Figure 6, the user can input an image viewing event. In response to this event, the gallery application sends an image acquisition request to the media library (S601), requesting the first dynamic image. In response, the media library sends the first dynamic image to the gallery application (S602). The first dynamic image includes multiple frames of still images, such as a cover image and a video clip. The gallery application can also create a display control to display the first dynamic image. This display control can be a Moving Photo control. Thus, the image interface can include a Moving Photo control to display the first dynamic image.

[0121] In some embodiments of this application, the gallery application may also send a create-management-object event to the mask management framework in response to an image viewing event. This event instructs the mask management framework to create a management object to manage the first dynamic image. In response, the mask management framework sends a registration detection event to the multi-mode module, which detects and manages all corresponding events. The mask management framework may also send configuration information of the first dynamic image to the first model and the second model, causing the first and second models to perform configuration initialization and prepare to receive all subsequent events related to the first dynamic image.

[0122] The configuration information for the first dynamic image may include its development status, cover image, frame rate, and file descriptors for the video within it. The first model can use this configuration information to perform operations such as environment registration, configuration updates, and engine process initialization. After the first model performs configuration initialization, the second model can also perform configuration initialization. After initialization, the second model sends configuration completion information to the first model, which then sends this information to the mask management framework. The mask management framework can create target nodes based on the configuration completion information and control the Moving Photo control to create target child nodes. These target nodes and child nodes store image information related to subsequent editing of the first dynamic image to achieve the desired display effect.
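The initialization handshake described above can be sketched as follows. This is an illustrative outline only: the class names, the field names of `DynamicImageConfig`, the `"config_complete"` token, and the shape of a target node are all hypothetical, and the application does not disclose the actual interfaces.

```python
from dataclasses import dataclass


@dataclass
class DynamicImageConfig:
    status: str          # the "development status" named in the text; meaning assumed
    cover_image: bytes
    frame_rate: float
    video_fd: int        # file descriptor of the video in the dynamic image


class SecondModel:
    def __init__(self):
        self.config = None

    def initialize(self, config):
        self.config = config
        return "config_complete"   # hypothetical completion message


class FirstModel:
    def __init__(self):
        self.config = None

    def initialize(self, config):
        # Stands in for environment registration and engine process setup.
        self.config = config

    def relay(self, completion):
        # Forwards the second model's completion information upward.
        return completion


class MaskManagementFramework:
    def __init__(self, first_model, second_model):
        self.first_model = first_model
        self.second_model = second_model
        self.target_nodes = []

    def on_create_management_object(self, config):
        self.first_model.initialize(config)                 # first model first
        completion = self.second_model.initialize(config)   # then second model
        if self.first_model.relay(completion) == "config_complete":
            # A target node holds image information for later editing.
            self.target_nodes.append({"cover_image": config.cover_image, "edits": []})
        return len(self.target_nodes)
```

The framework creates a target node only after the completion information has travelled from the second model through the first model, matching the order given in the paragraph.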

[0123] It should be noted that the embodiments of this application do not limit the display process of the first dynamic image and the initialization process of the components in the system.

[0124] Users can edit the first dynamic image to obtain a dynamic image corresponding to the first subject. The editing operation can include a long press on the first subject. In response to the editing operation on the first dynamic image, the phone obtains the first subject image from the first dynamic image; that is, the phone can extract the dynamic image corresponding to the puppy from the first dynamic image.

[0125] In some examples, the mobile phone can use the first model, the second model, and the mask management framework within the system to obtain the first subject image corresponding to the first subject from the first dynamic image. The mobile phone can receive a user long-press event through the mask management framework, perform region-of-interest (ROI) recognition on the cover image in the first dynamic image using the first model, and then use the recognition result and the second model to accurately identify the subject, thereby obtaining the first subject image.

[0126] Specifically, referring to Figure 6, the user can input a long-press event. In response, the mask management framework sends a subject detection request to the first model (S603). This request asks the first model to detect the first subject in the cover image of the first dynamic image, and it includes that cover image. In response, the first model performs subject recognition on the cover image (S604): the mobile phone inputs the cover image into the first model, which outputs the region of interest corresponding to the cover image. The first model then sends a subject detection request to the second model (S605). This request asks the second model to detect the first subject in the first dynamic image, and it includes the cover image and the video in the first dynamic image. In response, the second model performs subject recognition on all images in the first dynamic image (S606) and can send the recognition result to the first model (S607), which can forward it to the mask management framework (S608). The recognition result indicates whether a subject has been identified. When it indicates that a subject has been identified, the first model can send a subject segmentation request to the second model (S609). This request asks for subject segmentation of the first dynamic image to obtain the first subject image, and it includes the region of interest of the cover image and the first dynamic image. In response, the second model performs subject segmentation on the first dynamic image to obtain the first subject image (S610) and can send the first subject image to the first model (S611). The first model can cache the first subject image and send it to the mask management framework (S612). The mask management framework can then control the Moving Photo control to display the first subject image. Continuing to refer to Figure 5(B), the mask management framework can display an image editing interface that includes the Moving Photo control, in which the user can see the first subject image.
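The message sequence S603 to S612 can be outlined in code as follows. This is a sketch under strong simplifying assumptions: frames are small grids of integers, the "first model" is replaced by a bounding box around non-zero pixels, and the "second model" merely crops each frame to that box. None of these placeholders reflects the AI models of the application, and all class and function names are hypothetical.

```python
def bounding_box(frame):
    """Placeholder ROI detector: box around all non-zero pixels, or None."""
    coords = [(r, c) for r, row in enumerate(frame) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows) + 1, max(cols) + 1


class SecondModel:
    def detect(self, frames):
        # S606: subject recognition over all images of the dynamic image.
        return any(bounding_box(frame) is not None for frame in frames)

    def segment(self, roi, frames):
        # S610: placeholder segmentation that crops every frame to the ROI.
        top, left, bottom, right = roi
        return [[row[left:right] for row in frame[top:bottom]] for frame in frames]


class FirstModel:
    def __init__(self, second_model):
        self.second_model = second_model
        self.cache = None

    def handle_detection_request(self, cover, video_frames):
        frames = [cover, *video_frames]
        roi = bounding_box(cover)                                      # S604
        found = roi is not None and self.second_model.detect(frames)  # S605 to S607
        if not found:
            return False, None                                         # S608
        subject_image = self.second_model.segment(roi, frames)         # S609 to S611
        self.cache = subject_image                                     # cached copy
        return True, subject_image                                     # S608 and S612


def on_long_press(first_model, cover, video_frames):
    # S603: the mask management framework forwards the long-press event.
    found, subject_image = first_model.handle_detection_request(cover, video_frames)
    return subject_image if found else None
```

The two return values of `handle_detection_request` correspond to the recognition result and the first subject image that the first model passes back to the mask management framework.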

[0127] Understandably, the Moving Photo control can be located on a new layer, which is a layer above the first moving image.

[0128] In the process of obtaining the first subject image corresponding to the first subject from the first dynamic image, the mobile phone can input the cover image from the first dynamic image into the first model and output the region of interest corresponding to the cover image. The mobile phone can also input the region of interest and the first dynamic image into the second model and output the first subject image.

[0129] The first model can be an AI model, and its input can include the cover image in the first dynamic image. The output of the first model can include the region of interest (ROI) of the cover image.

[0130] It should be noted that the embodiments of this application also train the first model. When training the first model, a large number of dynamic images can be used as training samples. This enables the first model to learn the ability to identify the ROI regions of cover images in dynamic images and the corresponding region types.

[0131] The second model can also be an AI model. The input to the second model can include the region of interest of the cover image and the entire image of the first dynamic image. The output of the second model can include the subject segmentation (also known as instance segmentation) result, i.e., the first subject image.

[0132] It should be noted that the embodiments of this application also train the second model. When training the second model, a large number of dynamic images can be used as training samples to enable the second model to learn the ability to identify the main region in the image. The embodiments of this application do not limit the model types corresponding to the first model and the second model.

[0133] In some examples of this application, the first model and the second model can be deployed inside the mobile phone or outside the mobile phone, such as on a server. Alternatively, the first model can be deployed inside the mobile phone, and the second model can be deployed outside the mobile phone. This application does not impose specific limitations on these aspects.

[0134] Thus, by utilizing the first model and the second model to obtain the first subject image, the embodiments of this application can improve the speed and accuracy of obtaining the first subject image.

[0135] In some embodiments of this application, when the recognition result indicates that a subject has been identified, the mask management framework can control the Moving Photo control to display the first subject image. When the recognition result indicates that no subject has been identified, the mask management framework can control the Moving Photo control to display the first dynamic image again. The mask management framework can also control the Moving Photo control to display the first dynamic image again if no recognition result is received within a first time period. The first time period can be, for example, 500 milliseconds or 600 milliseconds. This application does not limit the specific value of the first time period.
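The display decision in this paragraph can be sketched with a queue and a timeout. The sketch assumes that the first dynamic image is shown again both when no subject is identified and when no recognition result arrives within the first time period; the function name, the return strings, and the use of `queue.Queue` as the delivery channel are illustrative assumptions.

```python
import queue

FIRST_TIME_PERIOD_S = 0.5  # 500 ms, one of the example values given in the text


def choose_display(result_queue, timeout_s=FIRST_TIME_PERIOD_S):
    """Decide what the Moving Photo control should display.

    `result_queue` delivers a bool: True if a subject was identified.
    """
    try:
        subject_identified = result_queue.get(timeout=timeout_s)
    except queue.Empty:
        # No recognition result arrived within the first time period.
        return "first_dynamic_image"
    return "first_subject_image" if subject_identified else "first_dynamic_image"
```

A producer thread running the recognition would call `result_queue.put(True)` or `result_queue.put(False)` when the result is ready.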

[0136] In some embodiments of this application, after the mobile phone obtains the first subject image corresponding to the first subject from the first dynamic image, it can display the subject outline of the first subject with a target effect, the target effect including: the subject outline of the first subject changes according to a preset rule.

[0137] Specifically, when the recognition result indicates that a subject has been identified, the mask management framework can display the subject outline of the first subject with the target effect. The subject outline can be understood as the edge outline of the first subject.

[0138] The preset rule can include dividing the subject outline into multiple outline points and adjusting the transparency of these points sequentially. The transparency of each outline point is adjusted in turn and then restored to its original level; for example, the transparency of each outline point can be increased and then restored, one point after another. This produces a dynamic effect in which the subject outline of the first subject appears to flicker.

[0139] For example, referring to Figures 7(A)-7(B), the transparency of each outline point can start at 0, be adjusted from 0 to 0.5, and then be adjusted from 0.5 back to 0. The preset rule may also include dividing the subject outline into multiple outline points and adjusting the display size of these points sequentially, with the display size of each outline point increasing in turn and then returning to its original size.
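One possible reading of the transparency rule is a triangular pulse per outline point, with each point starting slightly after the previous one. The sketch below assumes this reading; the stagger interval, the pulse period, and the function name are hypothetical, and only the 0 to 0.5 to 0 range comes from the example above.

```python
def point_transparency(index, t, stagger=0.05, period=0.4, peak=0.5):
    """Transparency of outline point `index` at time `t` (seconds).

    Each point pulses 0 -> peak -> 0 once, starting `stagger` seconds
    after the previous point (timing values are assumptions).
    """
    local = t - index * stagger
    if local < 0 or local > period:
        return 0.0                      # before or after this point's pulse
    phase = local / period              # runs from 0 to 1 over one pulse
    # Triangular pulse: rises until phase 0.5, then falls back to 0.
    return peak * (1 - abs(2 * phase - 1))
```

Evaluating the function for every outline point at successive times yields the sequential flicker along the subject outline; repeating the schedule would make the effect cyclical.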

[0140] It should be noted that the embodiments of this application do not specifically limit the preset rules.

[0141] Thus, in this embodiment, the outline of the first subject can be displayed with the target effect, that is, with a regular dynamic effect that follows the preset rule, such as a cyclical flashing effect based on adjusting transparency. This indicates to the user that the first subject in the first dynamic image is being identified. The visibility of the first subject and the guidance provided during identification are therefore improved, giving the user a better visual experience.

[0142] In some embodiments of this application, the mobile phone may store the first subject image according to a preset image format.

[0143] For example, the preset image format may include the GIF format, and the mobile phone may save the first subject image in GIF format to local storage.

[0144] As another example, the preset image format may include the GIF format, and the mobile phone may save the first subject image in GIF format to the gallery application.

[0145] Of course, the preset image format can also include MP4, AVI, and MOV formats, etc. It should be noted that the embodiments of this application do not specifically limit the image format and storage location of the first subject image.

[0146] In some embodiments of this application, the first device includes a first application, and the mobile phone can also load and display the first subject image in the preset image format through the first application.

[0147] Specifically, after the mobile phone stores the first subject image according to a preset image format, the first application on the mobile phone can recognize the first subject image with the preset image format. The first application can also generate a display component and load the first subject image into the display component so that the user can see the first subject image in the first application.

[0148] The first application can be an application installed on the mobile phone, which may include system applications and third-party applications. This application embodiment does not specifically limit the application type of the first application.

[0149] For example, referring to Figure 5(B), after the user performs a long-press operation, that is, after the user releases their finger, the phone can display a function menu corresponding to the first subject image. The function menu includes a save control for the save function, a copy control for the copy function, an insert control for the insert function, a replace control for the replace function, and a share control for the share function. The user can operate the controls in the function menu to edit the first subject image. For example, the user can click the save control. Referring to Figure 5(C), the phone responds to the click by displaying an application list that includes at least one first application.

[0150] In one possible implementation, the first application is an input method application. Referring to Figure 8, the user can select the interaction method for the first subject image by clicking the input method application in the application list. In response to the click, the phone saves the first subject image to local storage in the preset image format. The input method application can then recognize the locally stored first subject image in the preset image format. When the user opens the application interface of the input method application, the interface can load and display the first subject image.

[0151] Users can share the first subject image using the input method application. In response to an operation on the first subject image within the input method application interface, the phone can send the first subject image to other applications. "Animated emoticons" are a common use of dynamic images and can be displayed in other applications.

[0152] In another possible implementation, in response to the user's click, the mobile phone stores the first subject image in the input method application in the preset image format. The preset image format can be any image format supported by the input method application, so the input method application can load and display the first subject image. Specifically, the input method application can directly generate a display component and load the first subject image into it, so that the user can see the first subject image in the input method application.

[0153] In another possible implementation, the first application includes a gallery application and a third-party application (a social media application). In response to the user's click, the phone stores the first subject image in the gallery application in the preset image format. The social media application can recognize the first subject image in the preset image format and can load and display it. Specifically, the social media application can directly generate a display component and load the first subject image into it, displaying it as an "animated emoticon."

[0154] In some embodiments of this application, during the storage of the first subject image, the mask management framework can update the first subject image by sending a request to obtain a dynamic image to the first model. This request includes the first subject image. The first model sends the first subject image to the second model, which then fuses the first subject image frame by frame and outputs the updated first subject image. The mobile phone can then save the updated first subject image. It should be noted that the embodiments of this application do not specifically limit the generation and saving processes of the first subject image.

[0155] In some embodiments of this application, the first device includes a second application, and the mobile phone can also respond to a sharing operation for the first subject image by sending the first subject image to the second device through the second application.

[0156] The second application can be a third-party application.

[0157] For example, a user can click the share control in the function menu to share the first subject image. The mobile phone can send the first subject image to a third-party application. The third-party application can then load and display the first subject image with a preset image format. Specifically, the third-party application can directly generate a display component and load the first subject image into it, displaying it as an "animated emoticon." Alternatively, the third-party application can actively identify and load a locally available first subject image. This application does not specifically limit this approach.

[0158] Thus, in this embodiment of the application, users can save the first subject image through a first application on their mobile phone and share it through a second application. This enhances the interactivity between the user and the device, as well as the playability of dynamic images.

[0159] S403. In response to the editing operation on the first subject image, the mobile phone combines the first subject image with the second dynamic image to display a third dynamic image, the third dynamic image including the first subject.

[0160] In some embodiments of this application, based on the first subject image obtained above, the user can edit the first subject image, such as by selecting a new animated image for personalized creation. Specifically, in response to the editing operation on the first subject image, the mobile phone can combine the first subject image with a second animated image to display a third animated image, which includes the first subject. Thus, the user can copy, replace, or insert the first subject image into a new animated image to generate an edited animated image that still includes the first subject.

[0161] For example, the mobile phone can respond to a user's editing operation on the first animated image by displaying a function menu corresponding to the first main image, as shown in Figure 5(B). The user's editing operation on the first animated image can include a long press operation. After the user releases their finger following the long press operation, the mobile phone can display the function menu corresponding to the first main image. The function menu includes a copy control for the copy function, an insert control for the insert function, and a replace control for the replace function. In response to operations on the function menu, such as clicking the copy control, insert control, or replace control, the mobile phone can combine the first main image with the second animated image to generate and display a third animated image. Thus, the user can edit the first main image by operating the controls in the function menu.

[0162] In some embodiments of this application, the mobile phone supports copying the first subject image to other moving images.

[0163] Specifically, referring to Figure 9(A), the mobile phone can display an image editing interface, which includes a first main image. The user can drag the first main image into a new animated image. Referring to Figure 9(B), the user can drag the first main image to a target area of a second animated image. In response to the dragging operation on the first main image, the mobile phone can overlay the first main image onto the target area of the second animated image to generate a third animated image, where the target area is the operation area corresponding to the dragging operation. The mobile phone can then display the generated third animated image, as shown in Figure 9(C).
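
The overlay step described above can be sketched as follows. This is a minimal illustration only, not the application's implementation: frames are modeled as nested lists of RGB tuples, `None` marks a transparent (non-subject) pixel, the drop point of the drag is taken as the top-left corner of the target area, and the names `overlay_frame` and `composite` are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]                   # background frame: rows of RGB pixels
SubjectFrame = List[List[Optional[Pixel]]]  # None marks a transparent pixel


def overlay_frame(background: Frame, subject: SubjectFrame,
                  top: int, left: int) -> Frame:
    """Return a copy of `background` with `subject` pasted at (top, left)."""
    out = [row[:] for row in background]
    for dy, row in enumerate(subject):
        for dx, px in enumerate(row):
            y, x = top + dy, left + dx
            # Skip transparent pixels and anything that falls outside the frame.
            if px is not None and 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = px
    return out


def composite(second_image: List[Frame], subject_image: List[SubjectFrame],
              drop_point: Tuple[int, int]) -> List[Frame]:
    """Overlay the subject frame by frame (assumes equal frame counts)."""
    if len(second_image) != len(subject_image):
        raise ValueError("frame counts differ; align them first")
    top, left = drop_point
    return [overlay_frame(bg, fg, top, left)
            for bg, fg in zip(second_image, subject_image)]
```

The sketch assumes both images already have the same number of frames; it raises an error otherwise rather than guessing how to align them.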

[0164] In some embodiments of this application, the mobile phone supports inserting a first subject image into other moving images.

[0165] Specifically, referring to Figure 10(A), the mobile phone can display an image editing interface, which includes a first main image. The user can insert the first main image into a new animated image. Referring to Figure 10(B), after displaying the first main image, the mobile phone displays a list of selectable animated images. The list of selectable animated images includes at least one second animated image. The user can select a second animated image from the list of selectable animated images and drag the first main image to the target area of the second animated image, as shown in Figure 10(C). In response to the dragging operation on the first main image, the mobile phone can overlay the first main image onto the target area of the second animated image to obtain a third animated image, where the target area is the operation area corresponding to the dragging operation. Referring to Figure 10(D), the mobile phone can also display the third animated image.

[0166] Thus, in this embodiment, users can edit the subject in any animated image, such as copying or inserting the subject into any position in other animated images. This supports user-defined editing of animated images, enhancing their playability and user experience.

[0167] In some embodiments of this application, when a second subject is included in other moving images, the mobile phone supports replacing the first subject in the first moving image with the second subject in other moving images.

[0168] Specifically, the second animated image may include a second subject. Referring to Figure 11(A), the phone can display an image editing interface, which includes a first subject image. The user can replace the first subject image with the second subject image corresponding to the second subject. Referring to Figure 11(B), after displaying the first subject image, the phone displays a list of selectable animated images. The list of selectable animated images includes at least one second animated image. The user can select a second animated image from the list of selectable animated images and perform a long-press operation on the second subject in the second animated image, as shown in Figure 11(C). In response to the replacement operation on the first subject image, the phone can determine a target area in the second animated image, which is the subject area corresponding to the second subject. That is, the phone can also obtain the second subject image corresponding to the second subject from the second animated image. After determining the subject area corresponding to the second subject, the phone can directly overlay the first subject image onto the target area in the second animated image to obtain a third animated image. Referring to Figure 11(D), the phone can also display a third animated image.
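
One way to picture how the target area for the replacement might be derived is shown below. This is a hedged sketch under stated assumptions: the second subject is assumed to be available as a boolean segmentation mask, the target area is approximated by the mask's bounding box, and the first subject image is centered on that box. The functions `subject_bounding_box` and `placement_for_replacement` are hypothetical names; the application does not specify how the subject area is computed or whether the second subject is erased first.

```python
from typing import List, Optional, Tuple

Mask = List[List[bool]]  # True where the second subject was segmented


def subject_bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return inclusive (top, left, bottom, right) of the True pixels, or None."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, on in enumerate(row) if on]
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)


def placement_for_replacement(mask: Mask, subject_h: int,
                              subject_w: int) -> Optional[Tuple[int, int]]:
    """Top-left corner that centers the first subject on the second subject's box."""
    box = subject_bounding_box(mask)
    if box is None:
        return None
    top, left, bottom, right = box
    box_h = bottom - top + 1
    box_w = right - left + 1
    return top + (box_h - subject_h) // 2, left + (box_w - subject_w) // 2
```

The returned corner can be negative when the first subject is larger than the second subject's area, so a caller would need to clip the overlay to the frame.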

[0169] It should be noted that during the display of the second animated image on the aforementioned mobile phone, the overlay management framework can also control the display of the Moving Photo control. For example, the Moving Photo control can reside on a new layer, which is a layer above the first animated image.

[0170] Thus, in this embodiment, users can edit the subject in any animated image, such as replacing the subject with the subject in other animated images. For example, if a user wants the background in another animated image, they can replace the desired subject with the subject in another animated image to achieve a satisfactory subject and background in a single animated image. This supports user-defined editing of animated images, enhancing the playability of animated images and the user experience.

[0171] In some embodiments of this application, referring to Figure 12, during the process of compositing the first subject image and the second dynamic image, the number of image frames of the first subject image and the second dynamic image can be compared (S1201). If the number of image frames of the first subject image and the number of image frames of the second dynamic image are different, frame number alignment processing can be performed on the first subject image and the second dynamic image (S1202).
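
The comparison in S1201 and the decision that leads to S1202 can be sketched as a small dispatcher. The enum `Alignment` and the function `choose_alignment` are illustrative names, not part of the application.

```python
from enum import Enum


class Alignment(Enum):
    NONE = "none"  # frame counts already match; no alignment needed
    PAD = "pad"    # subject image has fewer frames than the second image
    TRIM = "trim"  # subject image has more frames than the second image


def choose_alignment(subject_frame_count: int,
                     second_image_frame_count: int) -> Alignment:
    """S1201: compare frame counts; the result selects the S1202 branch."""
    if subject_frame_count < 0 or second_image_frame_count < 0:
        raise ValueError("frame counts must be non-negative")
    if subject_frame_count == second_image_frame_count:
        return Alignment.NONE
    if subject_frame_count < second_image_frame_count:
        return Alignment.PAD
    return Alignment.TRIM
```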

[0172] In one possible implementation, if the number of image frames of the first main image is less than the number of image frames of the second dynamic image, the first main image is padded with frames according to the number of image frames of the second dynamic image to obtain an updated first main image.

[0173] In other words, when the number of frames in the first subject image differs from that in the second animated image, the frame counts need to be balanced. If the first subject image has fewer frames than the second animated image, the first subject image can be analyzed together with its adjacent frames (semantic information analysis) and padded to obtain an updated first subject image. This ensures that the number of frames in the updated first subject image matches that of the second animated image. Subsequently, the updated first subject image and the second animated image can be synthesized to obtain a third animated image.
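
The application leaves the padding method open. As a simple stand-in, the sketch below pads by repeating existing frames at evenly spaced positions (nearest-frame resampling); it does not perform the adjacent-frame analysis mentioned above, and `pad_frames` is a hypothetical helper name.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def pad_frames(frames: Sequence[T], target_count: int) -> List[T]:
    """Stretch `frames` to `target_count` entries by repeating frames evenly."""
    if not frames:
        raise ValueError("cannot pad an empty frame sequence")
    if target_count < len(frames):
        raise ValueError("target_count is smaller than the current frame count")
    n = len(frames)
    # Output position i maps back to source index floor(i * n / target_count).
    return [frames[i * n // target_count] for i in range(target_count)]
```

For example, three frames `a, b, c` padded to six become `a, a, b, b, c, c`, which preserves the original order and spreads the repeats evenly.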

[0174] In one possible implementation, when the number of image frames in the first main image is greater than the number of image frames in the second dynamic image, the mobile phone can remove redundant images from the first main image and perform semantic analysis and sorting on the first main image after removing redundant images to obtain an intermediate result of the first main image. It can be understood that the mobile phone can perform data redundancy processing on the first main image, that is, normalize the data of the first main image, and perform semantic analysis and sorting on the first main image after data redundancy processing to obtain an intermediate result of the first main image. Then, the mobile phone can remove redundant frames from the intermediate result of the first main image based on the number of image frames in the second dynamic image to obtain an updated first main image.

[0175] In other words, when the number of image frames in the first main image is greater than the number of image frames in the second dynamic image, redundant sampling and semantic analysis sorting can be performed on the first main image, while redundant image frames are removed. This ensures that the number of image frames in the updated first main image matches the number of image frames in the second dynamic image. Subsequently, the updated first main image and the second dynamic image can be synthesized to obtain the third dynamic image.
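
A rough two-stage sketch of this trimming is given below, under loud assumptions: "redundant images" are approximated as consecutive identical frames, the semantic analysis and sorting step is not modeled at all, and the final reduction uses plain uniform subsampling. `drop_consecutive_duplicates` and `trim_frames` are hypothetical names.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def drop_consecutive_duplicates(frames: Sequence[T]) -> List[T]:
    """Stage 1: remove frames identical to the one immediately before them."""
    out: List[T] = []
    for frame in frames:
        if not out or frame != out[-1]:
            out.append(frame)
    return out


def trim_frames(frames: Sequence[T], target_count: int) -> List[T]:
    """Reduce `frames` to exactly `target_count` entries."""
    if target_count <= 0:
        raise ValueError("target_count must be positive")
    if len(frames) < target_count:
        raise ValueError("fewer frames than target_count; pad instead")
    unique = drop_consecutive_duplicates(frames)
    # If de-duplication removed too much, fall back to the original frames.
    pool = unique if len(unique) >= target_count else list(frames)
    n = len(pool)
    # Stage 2: uniform subsampling down to the target count.
    return [pool[i * n // target_count] for i in range(target_count)]
```

Falling back to the original frames when de-duplication overshoots keeps the output length equal to the target, so the result can be composited frame by frame without a further padding pass.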

[0176] It should be noted that during the synthesis process using the updated first subject image and the second dynamic image, an AI fusion method can be used to synthesize the updated first subject image and the second dynamic image. This application does not impose specific limitations on this aspect.

[0177] Meanwhile, the embodiments of this application do not limit the specific implementation process of the above frame number alignment process.

[0178] Thus, in this embodiment of the application, when the frame counts of the first main image and the second dynamic image are inconsistent, frame count alignment can be performed on the two images to maintain smooth edge transitions and semantic information consistency during the fusion process. This improves the image quality of the third dynamic image, resulting in a better display effect and ultimately enhancing the user's visual experience.

[0179] In some embodiments of this application, when the third dynamic image includes a first subject and a second subject, the animation effects of the first subject and the second subject are matched.

[0180] In one feasible approach, the third animated image includes both the first and second subjects, provided that the second animated image includes the second subject and the user does not replace the second subject with the first subject. During the display of the third animated image, the animation effects of the first and second subjects can conform to the target rules.

[0181] For example, the target rule could include ensuring that the animation effects of the first subject and the second subject are consistent. That is, during the display of the third dynamic image, the animation effects of the first subject and the second subject are identical. For instance, the display speed of the corresponding animations of the first subject and the second subject could be the same, so that the first subject and the second subject appear more visually harmonious.

[0182] Of course, the target rules can include a certain difference in the animation effects of the first and second subjects. This difference can be preset. For example, the first subject is a rabbit, and the second subject is a turtle. In this case, the display speed of the animation corresponding to the first subject should be higher than the display speed of the animation corresponding to the second subject. There can be a certain ratio between the display speed of the animation corresponding to the first subject and the display speed of the animation corresponding to the second subject.
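
The two kinds of target rule described above (identical speeds, or a preset ratio between speeds) can be illustrated with per-frame display durations. This is only a sketch: the application does not define how display speed is represented, and `frame_durations_ms` and its parameters are assumptions made for illustration.

```python
from typing import Tuple


def frame_durations_ms(second_subject_ms: float,
                       speed_ratio: float = 1.0) -> Tuple[float, float]:
    """Return (first_subject_ms, second_subject_ms) per-frame durations.

    speed_ratio is the first subject's display speed divided by the second
    subject's; 1.0 means both animations play at the same speed.
    """
    if second_subject_ms <= 0 or speed_ratio <= 0:
        raise ValueError("durations and ratios must be positive")
    # A faster animation shows each frame for a shorter time.
    return second_subject_ms / speed_ratio, second_subject_ms
```

With a ratio of 2.0 (the rabbit moving twice as fast as the turtle) and a 100 ms frame duration for the second subject, the first subject's frames would each be shown for 50 ms.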

[0183] It should be noted that the embodiments of this application do not specifically limit the target rules.

[0184] Thus, in this embodiment of the application, when the third dynamic image includes a first subject and a second subject, the animation effects of the first subject and the second subject can be matched. This allows the user to experience a more harmonious and realistic viewing experience of the first subject and the second subject, enhancing the user's visual experience.

[0185] In some embodiments of this application, the mobile phone can store a third dynamic image according to a preset image format.

[0186] For example, the preset image format may include the GIF format, and the mobile phone may save the third dynamic image as a GIF in local storage.

[0187] As another example, the preset image format may include the GIF format, and the mobile phone may save the third animated image as a GIF in the gallery application.

[0188] Of course, the preset image format can also include MP4, AVI, and MOV formats, etc. It should be noted that the embodiments of this application do not specifically limit the image format and storage location of the third dynamic image.

[0189] In some embodiments of this application, the mobile phone can also load a third dynamic image with a preset image format through a first application and display the third dynamic image.

[0190] Specifically, after the mobile phone stores the third animated image according to a preset image format, the first application on the phone can recognize the third animated image with the preset image format. The first application can also generate a display component and load the third animated image onto the display component so that the user can see the third animated image in the first application.

[0191] For example, referring to Figure 13(A), a user can edit a third animated image. The phone can display a function menu corresponding to the third animated image, including a save control for the save function and a share control for the share function. In this way, the user can operate on the controls in the function menu to use the third animated image. The user can click the save control. Referring to Figure 13(B), the phone responds to the user's click and displays an application list. The application list includes at least one first application.

[0192] In one possible implementation, the first application is an input method application. Referring to Figure 14, the user can select the interaction method for the third animated image. The user can click on the input method application in the application list. The phone responds to the user's click and saves the third animated image to its local storage according to a preset image format. Furthermore, the input method application can recognize the third animated image with the preset image format saved locally. The user can open the application interface of the input method application, which can load and display the third animated image with the preset image format.

[0193] Users can share the third animated image using the input method application. Referring to Figure 15, the phone can respond to an operation on the third animated image in the input method application interface and send the third animated image to a social application, which can then display it as an "animated emoticon".

[0194] In another possible implementation, the mobile phone responds to the user's click by storing a third animated image in the input method application according to a preset image format. In this way, the input method application can load and display the third animated image with the preset image format.

[0195] In another possible implementation, the first application includes a gallery app and a third-party app (for example, a social media app). In response to the user's click, the phone stores the third animated image in the gallery app according to a preset image format. The social media app can recognize the third animated image with the preset image format. Similarly, the social media app can load and display the third animated image with the preset image format.

[0196] In some embodiments of this application, the first device includes a second application, and the mobile phone can also respond to a sharing operation for the third dynamic image by sending the third dynamic image to the second device through the second application. The second application can be a third-party application.

[0197] For example, a user can click the share control in the function menu to share the third animated image. The phone can then send the third animated image to a third-party application. The third-party application can then load and display the third animated image with a preset image format. Specifically, the third-party application can directly generate a display component and load the third animated image into it, displaying it as an "animated emoticon." Alternatively, the third-party application can actively recognize and load a locally stored third animated image.

[0198] It should be noted that the second application may also include a system application, and the first application and the second application may be the same or different. This application does not specifically limit the application types of the first application and the second application, or the process of saving and sharing the first dynamic image.

[0199] Thus, in this embodiment of the application, users can save and share the third animated image through applications on their mobile phones. Not only can animated images be edited, but the edited animated images can also be saved and shared. This enhances the interactivity between the user and the device, as well as the playability of the animated images. At the same time, all applications on the mobile phone can load the edited animated images, maintaining consistency in the user experience.

[0200] In some solutions, multiple embodiments of this application can be combined, and the combined solution can be implemented. Optionally, some operations in the process of each method embodiment may be combined, and/or the order of some operations may be changed. Furthermore, the execution order between the steps of each process is merely exemplary and does not limit the order in which the steps can be performed; other execution orders are also possible.

[0201] Those skilled in the art will conceive of various ways to reorder the operations described in the embodiments of this application. Furthermore, it should be noted that process details involved in one embodiment of this application are similarly applicable to other embodiments, or different embodiments can be combined.

[0202] Furthermore, some steps in the method embodiments can be equivalently replaced with other possible steps. Alternatively, some steps in the method embodiments may be optional and can be deleted in certain use cases. Or, other possible steps may be added to the method embodiments.

[0203] Furthermore, the various method embodiments can be implemented individually or in combination.

[0204] This application also provides an electronic device, such as the mobile phone described above, as shown in Figure 16. The mobile phone may include one or more processors 1610, a memory 1620, and a communication interface 1630.

[0205] The memory 1620, communication interface 1630, and processor 1610 are coupled together. For example, the memory 1620, communication interface 1630, and processor 1610 can be coupled together via bus 1640.

[0206] The communication interface 1630 is used for data transmission with other devices. The memory 1620 stores computer program code. The computer program code includes computer instructions, which, when executed by the processor 1610, cause the electronic device to perform the relevant method steps in the embodiments of this application.

[0207] Processor 1610 may be a processor or controller, such as a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It may implement or execute the various exemplary logic blocks, modules, and circuits described in conjunction with this disclosure. The processor may also be a combination that implements computational functions, such as a combination of one or more microprocessors, a combination of a DSP and a microprocessor, etc.

[0208] Bus 1640 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The aforementioned bus 1640 can be divided into address bus, data bus, control bus, etc. For ease of illustration, only one thick line is used in Figure 16, but this does not indicate that there is only one bus or one type of bus.

[0209] This application also provides an electronic device, which includes a memory and one or more processors; the memory is coupled to the processors; wherein the memory stores computer program code, which includes computer instructions, and when the computer instructions are executed by the processor, the electronic device performs the relevant method steps in the above method embodiments.

[0210] This application also provides a communication device, which includes a memory and one or more processors; the memory is coupled to the processors; wherein the memory stores computer program code, which includes computer instructions, and when the computer instructions are executed by the processor, the communication device performs the relevant method steps in the above method embodiments.

[0211] This application also provides a computer-readable storage medium storing computer program code. When the processor executes the computer program code, the electronic device executes the relevant method steps in the above method embodiments.

[0212] This application also provides a computer program product containing instructions that, when executed on a computer or processor, cause the computer or processor to perform the relevant method steps as described in the above method embodiments.

[0213] This application also provides a chip system, including: a processor coupled to a memory, the memory being used to store programs or instructions, and when the program or instructions are executed by the processor, the chip system enables the methods in any of the above method embodiments.

[0214] Optionally, the chip system may contain one or more processors. These processors can be implemented in hardware or software. When implemented in hardware, the processor can be a logic circuit, an integrated circuit, etc. When implemented in software, the processor can be a general-purpose processor, implemented by reading software code stored in memory.

[0215] Optionally, the chip system may contain one or more memories. The memory may be integrated with the processor or disposed separately from it; this application embodiment does not limit this. For example, the memory may be a non-transitory memory, such as a read-only memory (ROM), which may be integrated with the processor on the same chip or disposed separately on different chips. This application embodiment does not specifically limit the type of memory or the arrangement of the memory and processor.

[0216] For example, the chip system may be a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a system on chip (SoC), a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), a microcontroller unit (MCU), a programmable logic device (PLD), or another integrated chip.

[0217] The electronic devices, computer storage media, or computer program products provided in this application are all used to execute the corresponding methods provided above. Therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods provided above, which are not repeated here.

[0218] Through the above description of the embodiments, those skilled in the art can clearly understand that, for the sake of convenience and brevity, only the division of the above functional modules is used as an example. In actual applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.

[0219] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of modules or units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another apparatus, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.

[0220] The units described as separate components may or may not be physically separate. A component shown as a unit can be one or more physical units, located in one place or distributed in multiple different locations. Some or all of the units can be selected to achieve the purpose of this embodiment, depending on actual needs.

[0221] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0222] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, or the contributing parts, or all or part of the technical solutions, can be embodied in the form of a software product. This software product is stored in a storage medium and includes several instructions to cause a device (which may be a microcontroller, chip, etc.) or processor to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0223] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any changes or substitutions within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. An image processing method, characterized in that the method is applied to a first device and includes: displaying a first dynamic image, the first dynamic image including a first subject; obtaining a first subject image corresponding to the first subject from the first dynamic image; and, in response to an editing operation on the first subject image, compositing the first subject image with a second dynamic image to display a third dynamic image, the third dynamic image including the first subject.

2. The method according to claim 1, characterized in that, The method further includes: Display an image editing interface, the image editing interface including the first main image; The step of compositing the first subject image with the second dynamic image and displaying the third dynamic image in response to an editing operation on the first subject image includes: In response to a drag operation on the first subject image, the first subject image is overlaid onto a target area in the second dynamic image to obtain the third dynamic image, wherein the target area is the operation area corresponding to the drag operation; The third dynamic image is displayed.

3. The method according to claim 1, characterized in that, The second dynamic image includes a second subject, and the method further includes: Display an image editing interface, the image editing interface including the first main image; The step of compositing the first subject image with the second dynamic image and displaying the third dynamic image in response to an editing operation on the first subject image includes: In response to a replacement operation on the first subject image, a target region in the second dynamic image is determined, wherein the target region is the subject region corresponding to the second subject; The first main image is overlaid onto the target area in the second dynamic image to obtain a third dynamic image; The third dynamic image is displayed.

4. The method according to any one of claims 1-3, characterized in that, The step of combining the first main image with the second dynamic image further includes: Compare the number of image frames of the first main image with the number of image frames of the second dynamic image; If the number of image frames of the first main image is different from the number of image frames of the second dynamic image, frame number alignment processing is performed on the first main image and the second dynamic image.

5. The method according to claim 4, characterized in that, The frame number alignment process for the first main image and the second dynamic image includes: If the number of frames in the first main image is less than the number of frames in the second dynamic image, the first main image is padded with frames according to the number of frames in the second dynamic image to obtain an updated first main image. If the number of image frames of the first main image is greater than the number of image frames of the second dynamic image, redundant images in the first main image are removed, and the first main image after removing redundant images is semantically analyzed and sorted to obtain an intermediate result of the first main image; Based on the number of image frames in the second dynamic image, redundant frames in the intermediate results of the first main image are removed to obtain the updated first main image.

6. The method according to any one of claims 1-5, characterized in that, In the case where the third dynamic image includes the first subject and the second subject, the animation effects of the first subject and the second subject are matched.

7. The method according to any one of claims 1-6, characterized in that the first device includes a first application, and the method further includes: storing the third dynamic image according to a preset image format; wherein displaying the third dynamic image includes: loading the third dynamic image in the preset image format through the first application; and displaying the third dynamic image.

8. The method according to any one of claims 1-7, characterized in that the first device includes a second application, and the method further includes: in response to a sharing operation for the third dynamic image, sending the third dynamic image to a second device via the second application.

9. The method according to any one of claims 1-8, characterized in that the first device includes a first application and a second application, and the method further includes: storing the first subject image according to a preset image format; loading the first subject image in the preset image format through the first application; displaying the first subject image; and, in response to a sharing operation for the first subject image, sending the first subject image to a second device via the second application.

10. The method according to any one of claims 1-9, characterized in that the first application includes an input method application and a gallery application, and the second application includes a third-party application.

11. The method according to any one of claims 1-10, characterized in that the step of obtaining the first subject image corresponding to the first subject from the first dynamic image includes: inputting the cover image of the first dynamic image into a first model, which outputs the region of interest corresponding to the cover image; and inputting the region of interest and the first dynamic image into a second model, which outputs the first subject image.
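
The two-stage extraction of claim 11 can be sketched as a small pipeline. The claim does not specify either model, so both are passed in as plain callables; the stand-in models below (a fixed region of interest and a simple crop) exist only to make the sketch runnable and are assumptions, as are all names and the choice of the first frame as the cover image.

```python
from typing import Callable, List, Sequence, Tuple

Frame = List[List[int]]          # assumed representation: 2D grid of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def extract_subject_image(
    frames: Sequence[Frame],
    roi_model: Callable[[Frame], Box],
    subject_model: Callable[[Box, Sequence[Frame]], List[Frame]],
    cover_index: int = 0,
) -> List[Frame]:
    """Stage 1: cover image -> region of interest. Stage 2: ROI + all frames -> subject image."""
    if not frames:
        raise ValueError("the dynamic image has no frames")
    roi = roi_model(frames[cover_index])
    return subject_model(roi, frames)


def fixed_roi_model(cover: Frame) -> Box:
    """Stand-in for the first model: always returns the same region."""
    return (0, 1, 2, 3)


def crop_subject_model(roi: Box, frames: Sequence[Frame]) -> List[Frame]:
    """Stand-in for the second model: crops every frame to the region of interest."""
    top, left, bottom, right = roi
    return [[row[left:right] for row in frame[top:bottom]] for frame in frames]
```

Keeping the models as injected callables reflects only that the claim treats them as separate stages; an actual second model would presumably produce a per-frame subject mask rather than a rectangular crop.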

12. The method according to any one of claims 1-11, characterized in that, after obtaining the first subject image corresponding to the first subject from the first dynamic image, the method further includes: displaying the outline of the first subject with a target effect, the target effect including: the outline of the first subject changes according to a preset rule.

13. An electronic device, characterized in that the electronic device includes a display screen, a memory, and one or more processors; the display screen is used to display dynamic images, and the memory is coupled to the one or more processors; wherein the memory stores computer program code, the computer program code including computer instructions which, when executed by the one or more processors, cause the electronic device to perform the image processing method as described in any one of claims 1-12.

14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions that, when executed on a computer, cause the computer to perform the image processing method as described in any one of claims 1-12.

15. A computer program product, characterized in that the computer program product includes instructions that, when executed by an electronic device, cause the electronic device to perform the image processing method as described in any one of claims 1-12.