Watermark rendering method and device of media information, electronic equipment and storage medium
By identifying the areas of interest to users in the media information display interface and rendering watermarks in other areas, the problem of watermarks affecting user viewing is solved, achieving effective watermark exposure and improving user experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TENCENT TECHNOLOGY (SHENZHEN) CO LTD
- Filing Date
- 2022-07-26
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies for adding watermarks to media information can negatively impact the user viewing experience and fail to effectively prevent content leaks.
By locating the target object's area of interest in the media information display interface, the watermark area is determined to be in other display areas outside the area of interest, and the watermark rendering result is updated in real time to avoid rendering the watermark in the area of interest to the user.
The watermark remains exposed while the user's viewing experience improves, because the watermark no longer covers the area of interest.
Description
Technical Field
[0001] This application relates to the field of Internet technology, and in particular to a method, apparatus, electronic device and computer-readable storage medium for watermarking media information.
Background Technology
[0002] In related technologies, watermark information is added to media information either by fixing the physical position of the watermark or by rendering the watermark directly according to specific animation rules (such as horizontal movement). Although adding watermarks directly in this way helps prevent content leakage, it degrades the user's viewing experience.
Summary of the Invention
[0003] This application provides a watermark rendering method, apparatus, electronic device, computer-readable storage medium, and computer program product for media information, which can both ensure the exposure of the watermark and improve the user's viewing experience.
[0004] The technical solution of this application embodiment is implemented as follows:
[0005] This application provides a method for watermarking media information, including:
[0006] When the terminal displays media information on the media information display interface, locate the target object's area of interest in the media information display interface;
[0007] Based on the area of interest, the watermark area of the media information is determined; wherein, the watermark area is included in other display areas outside the area of interest in the media information display interface;
[0008] Watermark rendering is applied to the media information displayed in the watermark area, and the area of interest is detected.
[0009] When the detection result indicates a change in the region of interest, the watermark rendering result of the media information is updated based on the changed region of interest.
[0010] This application provides a watermark rendering device for media information, including:
[0011] The positioning module is used to locate the area of interest of the target object in the media information display interface when the terminal displays media information on the media information display interface;
[0012] The determining module is used to determine the watermark area of the media information based on the area of interest; wherein the watermark area is included in other display areas outside the area of interest in the media information display interface;
[0013] The rendering module is used to render the media information displayed in the watermark area with a watermark and to detect the area of interest.
[0014] An update module is used to update the watermark rendering result of the media information based on the changed region of interest when the detection result indicates a change in the region of interest.
[0015] In the above scheme, the positioning module is further used to obtain the convergence point of the target object's gaze in the media information display interface; with the convergence point as the center point of the image, draw the target image according to the target length, and take the area corresponding to the target image in the media information display interface as the area of interest.
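The construction in paragraph [0015] can be illustrated with a minimal Python sketch. It assumes that the target image is a square whose side equals the target length and that the result is clipped to the bounds of the display interface; neither assumption is stated in the application, and the names `Rect` and `area_of_interest` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in display-interface pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float


def area_of_interest(cx: float, cy: float, target_length: float,
                     screen_w: float, screen_h: float) -> Rect:
    """Return a square of side `target_length` centred on the gaze
    convergence point (cx, cy), clipped to the display interface.

    The square shape and the clipping are illustrative assumptions.
    """
    half = target_length / 2
    return Rect(
        left=max(0.0, cx - half),
        top=max(0.0, cy - half),
        right=min(float(screen_w), cx + half),
        bottom=min(float(screen_h), cy + half),
    )
```

For example, a convergence point at (100, 100) with a target length of 50 on a 1920x1080 interface gives the square from (75, 75) to (125, 125).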
[0016] In the above scheme, the rendering module is further configured to obtain the new area of interest of the target object on the media information after watermark rendering; compare the new area of interest with the area of interest to obtain a first overlap rate between the new area of interest and the area of interest; compare the first overlap rate with a first overlap rate threshold to obtain a comparison result; and based on the comparison result, determine a detection result for indicating whether the area of interest has changed.
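The change detection in paragraph [0016] can be sketched as follows. The application does not define how the first overlap rate is computed or which side of the threshold counts as a change, so this sketch assumes intersection-over-union as the overlap rate and treats a rate below the threshold as a change. The function names and the default threshold of 0.5 are hypothetical.

```python
def _area(r):
    """Area of a rectangle given as (left, top, right, bottom)."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def overlap_rate(a, b):
    """Intersection-over-union of two rectangles (an assumed definition
    of the overlap rate)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = _area(a) + _area(b) - inter
    return inter / union if union > 0 else 0.0


def area_of_interest_changed(old, new, threshold=0.5):
    """Report a change when the first overlap rate falls below the
    first overlap rate threshold (assumed comparison direction)."""
    return overlap_rate(old, new) < threshold
```

With this definition, small jitter in the gaze position keeps the overlap rate high and does not trigger re-rendering, while a shift of the gaze to another part of the interface does.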
[0017] In the above scheme, the update module is further configured to determine the position of the changed area of interest in the media information display interface when the detection result indicates that the area of interest has changed; adjust the watermark area based on the position of the changed area of interest to obtain a new watermark area; and re-watermark the media information based on the new watermark area.
[0018] In the above scheme, when the detection result indicates that the area of interest has not changed and the size of the media information display interface has changed, the device further includes a first comparison module. The first comparison module is used to compare the media information display interface with the area of interest to obtain a second overlap rate between the media information display interface and the area of interest; compare the second overlap rate with a second overlap rate threshold; and when the second overlap rate reaches the second overlap rate threshold, perform watermark rendering on the media information displayed on the media information display interface.
[0019] In the above scheme, the first comparison module is further configured to determine the target watermark area based on the area of interest when the second overlap rate is less than the second overlap rate threshold, and to perform watermark rendering on the media information displayed in the target watermark area.
[0020] In the above scheme, the determining module is further used to obtain other areas on the media information display interface besides the area of interest; and determine the other areas as the watermark area.
[0021] In the above scheme, the determining module is further used to obtain other areas on the media information display interface besides the area of interest; compare the area size of the other areas with an area threshold; when the comparison result indicates that the area size of the other areas is greater than the area threshold, select a portion of the other areas as watermark areas.
[0022] In the above scheme, the determining module is further used to obtain the size of the watermark to be rendered in the watermark area; based on the size, at least one rectangular area that matches the watermark to be rendered from the other areas is selected as the watermark area.
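One possible reading of paragraph [0022] is sketched below: the display interface is tiled with candidate rectangles the size of the watermark, and only those that do not intersect the area of interest are kept. The grid tiling is an assumption made for illustration, not a procedure taken from the application, and `watermark_slots` is a hypothetical name.

```python
def _intersects(a, b):
    """True if rectangles a and b, each (left, top, right, bottom),
    share any interior area; touching edges do not count."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def watermark_slots(screen_w, screen_h, roi, wm_w, wm_h):
    """List watermark-sized rectangles that lie entirely outside the
    area of interest `roi`, scanning the interface on a regular grid.

    All sizes are integers in pixels. The grid layout is an assumption.
    """
    slots = []
    for top in range(0, screen_h - wm_h + 1, wm_h):
        for left in range(0, screen_w - wm_w + 1, wm_w):
            slot = (left, top, left + wm_w, top + wm_h)
            if not _intersects(slot, roi):
                slots.append(slot)
    return slots
```

A caller could render the watermark in every returned slot, or in only some of them when the remaining area exceeds the area threshold described in paragraph [0021].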
[0023] In the above scheme, the positioning module is further configured to acquire a facial image of the target object through an image acquisition device; perform gaze analysis on the facial image to obtain a gaze analysis result for indicating the gaze of the target object; and determine the area in the media information display interface that the target object is looking at as the area of interest based on the gaze analysis result.
[0024] In the above scheme, the positioning module is further configured to extract features from the face image to obtain the gaze vector and head pose vector of the target object; adjust the gaze vector based on the head pose vector to obtain a target gaze vector for indicating the gaze of the target object, and use the target gaze vector as the gaze analysis result; locate the area that the target object's eyes are looking at according to the gaze analysis result; when the area is located on the media information display interface, determine the corresponding area in the media information display interface as the area of interest.
[0025] In the above scheme, the positioning module is further configured to collect the voice data of the target object through an audio acquisition device; perform semantic analysis on the voice data to obtain a semantic analysis result; wherein the semantic analysis result includes target text words, which are used to indicate the position that the target object is looking at on the media information display interface; and determine the area of interest of the target object on the media information display interface based on the position indicated by the target text words in the semantic analysis result and the position that the target object is looking at on the media information display interface.
[0026] This application provides an electronic device, including:
[0027] Memory, used to store executable instructions;
[0028] The processor, when executing executable instructions stored in the memory, implements the watermark rendering method for media information provided in the embodiments of this application.
[0029] This application provides a computer-readable storage medium storing executable instructions that, when executed by a processor, implement the watermark rendering method for media information provided in this application.
[0030] This application provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. An electronic device's processor reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the electronic device to perform the media information watermark rendering method provided in this application.
[0031] The embodiments of this application have the following beneficial effects:
[0032] By locating the target object's area of interest in the media information and determining a watermark area outside the area of interest, watermark rendering is applied to the media information displayed in the watermark area. The watermark is thus rendered in real time away from the area the target object is viewing, which ensures watermark exposure without interrupting the viewing experience.
Attached Figure Description
[0033] Figure 1 is a schematic diagram of the architecture of the watermark rendering system 100 for media information provided in an embodiment of this application;
[0034] Figure 2 is a schematic diagram of the structure of the electronic device provided in the embodiments of this application;
[0035] Figure 3 is a flowchart illustrating the watermark rendering method for media information provided in the embodiments of this application;
[0036] Figure 4 is a schematic diagram illustrating the determination of the region of interest provided in an embodiment of this application;
[0037] Figure 5 is a schematic diagram illustrating the determination of the region of interest provided in an embodiment of this application;
[0038] Figure 6 is a schematic diagram of the watermark area for media information determined based on the region of interest, provided in an embodiment of this application;
[0039] Figure 7 is a schematic diagram of the watermark area for media information determined based on the region of interest, provided in an embodiment of this application;
[0040] Figure 8 is a schematic diagram of the watermark area for media information determined based on the region of interest, provided in an embodiment of this application;
[0041] Figure 9 is a schematic diagram of watermarked media information provided in an embodiment of this application;
[0042] Figure 10 is a schematic diagram of watermarked media information provided in an embodiment of this application;
[0043] Figure 11 is a flowchart illustrating the watermark rendering method for media information provided in the embodiments of this application;
[0044] Figure 12 is a schematic diagram of the user viewing area provided in the embodiments of this application;
[0045] Figure 13 is a schematic diagram of the watermark rendering process for media information provided in the embodiments of this application.
Detailed Implementation
[0046] To make the objectives, technical solutions, and advantages of this application clearer, the application will be further described in detail below with reference to the accompanying drawings. The described embodiments should not be regarded as limitations on this application. All other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0047] In the following description, references are made to “some embodiments,” which describe a subset of all possible embodiments. However, it is understood that “some embodiments” may be the same subset or different subsets of all possible embodiments and may be combined with each other without conflict.
[0048] In the following description, the terms "first, second, third" are used merely to distinguish similar objects and do not represent a specific ordering of objects. It is understood that "first, second, third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of this application described herein can be implemented in an order other than that illustrated or described herein.
[0049] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.
[0050] In the implementation of this application, the collection and processing of relevant data should strictly comply with the requirements of relevant laws and regulations, obtain the informed consent or separate consent of the personal information subject, and carry out subsequent data use and processing within the scope of laws and regulations and the authorization of the personal information subject.
[0051] Before the embodiments of this application are described in further detail, the nouns and terms involved in the embodiments are explained; they shall be interpreted as follows.
[0052] 1) Client: Also known as user terminal, it refers to the program that provides local services to users in contrast to the server. Except for some applications that can only run locally, it is generally installed on ordinary client machines and needs to cooperate with the server to run. That is, there needs to be a corresponding server and service program in the network to provide the corresponding services. Thus, a specific communication connection needs to be established between the client and the server to ensure the normal operation of the application.
[0053] 2) Watermarking, a method of protecting digital information, can be understood as adding certain digital information to media information (such as images and videos) to achieve the purpose of authenticating digital multimedia and protecting copyright. Typically, watermark information is hidden within the host (e.g., media information) file and does not affect the objectivity and integrity of the host file. Watermarks are divided into soft watermarks and hard watermarks. Hard watermarks are encoded in the video image and do not require additional rendering during playback, while soft watermarks are independent images or text that need to be dynamically rendered during playback. Examples include images or text information rendered on the video image layer during video playback, such as brand logos or personal account information.
[0054] 3) Eye-tracking technology: a technology that uses a camera to track and locate the movement trajectory of the eyes and, based on the biological characteristics of the eyes, identifies the area on the screen on which the eyes are focused.
[0055] The inventors discovered that in related technologies, when displaying media information, watermarks such as personal names and company names are rendered on the media information to prevent content leakage. This makes it easy to trace the source of the leak if the information is leaked. However, the current practice for watermark rendering is to render according to specific areas, which results in a poor viewing experience for users.
[0056] Based on this, embodiments of this application provide a watermark rendering method, apparatus, electronic device, computer-readable storage medium, and computer program product for media information, which can identify the physical area of the content that the user is interested in on the screen and dynamically avoid rendering the watermark in that area when the terminal performs watermark rendering, thereby avoiding any impact on the user's viewing experience.
[0057] Referring to Figure 1, Figure 1 is a schematic diagram of the architecture of the media information watermark rendering system 100 provided in an embodiment of this application. To support an application scenario of watermark rendering for media information (for example, during a live broadcast on an enterprise's internal network, the area of the player content that the user is gazing at is determined, and the watermark of the enterprise or individual is rendered on the media information in the display areas outside the gaze area), the terminal (terminal 400 is shown as an example) is connected to the server 200 through the network 300. The network 300 can be a wide area network, a local area network, or a combination of the two. The terminal 400 is used by the user to display media information through the client 401 on a display interface (media information display interface 401-1 is shown as an example). The terminal 400 and the server 200 are connected to each other through wired or wireless networks.
[0058] The terminal 400 is used to display media information on the media information display interface;
[0059] Server 200 is used to: locate the area of interest of the target object in the media information display interface when the terminal displays media information on the media information display interface; determine the watermark area of the media information based on the area of interest; wherein the watermark area is included in other display areas outside the area of interest in the media information display interface; perform watermark rendering on the media information displayed in the watermark area, and send the watermark-rendered media information to terminal 400.
[0060] Terminal 400 is also used to display media information rendered with watermarks;
[0061] Server 200 is also used to detect the area of interest of the displayed watermarked media information; when the detection result indicates that the area of interest has changed, the watermark rendering result of the media information is updated according to the changed area of interest, and the updated watermarked media information is sent to terminal 400.
[0062] In some embodiments, server 200 can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDNs), and big data and artificial intelligence platforms. Terminal 400 can be a smartphone, tablet, laptop, desktop computer, set-top box, smart voice interaction device, smart home appliance, vehicle terminal, aircraft, and mobile devices (e.g., mobile phones, portable music players, personal digital assistants, dedicated messaging devices, portable gaming devices, smart speakers, and smartwatches), but is not limited to these. Terminal devices and servers can be directly or indirectly connected via wired or wireless communication, which is not limited in this embodiment.
[0063] Referring to Figure 2, Figure 2 is a schematic diagram of the structure of the electronic device provided in the embodiments of this application. In practical applications, the electronic device can be the server 200 or the terminal 400 shown in Figure 1. The electronic device shown in Figure 2 includes at least one processor 410, a memory 450, at least one network interface 420, and a user interface 430. The various components in terminal 400 are coupled together via a bus system 440. It is understood that the bus system 440 is used to implement communication between these components. In addition to a data bus, the bus system 440 also includes a power bus, a control bus, and a status signal bus. However, for clarity, all buses are labeled as bus system 440 in Figure 2.
[0064] Processor 410 can be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. Among them, the general-purpose processor can be a microprocessor or any conventional processor, etc.
[0065] User interface 430 includes one or more output devices 431 that enable the presentation of media content, including one or more speakers and / or one or more visual displays. User interface 430 also includes one or more input devices 432, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
[0066] The memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state storage, hard disk drives, optical disk drives, etc. The memory 450 may optionally include one or more storage devices physically located away from the processor 410.
[0067] The memory 450 may include volatile memory or non-volatile memory, or both. The non-volatile memory may be read-only memory (ROM), and the volatile memory may be random access memory (RAM). The memory 450 described in this application embodiment is intended to include any suitable type of memory.
[0068] In some embodiments, memory 450 is capable of storing data to support various operations, examples of which include programs, modules, and data structures or subsets or supersets thereof, as illustrated below.
[0069] Operating system 451 includes system programs for handling various basic system services and performing hardware-related tasks, such as the framework layer, core library layer, driver layer, etc., for implementing various basic business functions and handling hardware-based tasks;
[0070] The network communication module 452 is used to reach other electronic devices via one or more (wired or wireless) network interfaces 420, exemplary network interfaces 420 including: Bluetooth, WiFi, and Universal Serial Bus (USB), etc.
[0071] Presentation module 453 is configured to enable the presentation of information (e.g., a user interface for operating peripheral devices and displaying content and information) via one or more output devices 431 (e.g., a display screen, a speaker, etc.) associated with user interface 430.
[0072] The input processing module 454 is used to detect and translate one or more user inputs or interactions from one or more input devices 432.
[0073] In some embodiments, the apparatus provided in this application can be implemented in software. Figure 2 shows the watermark rendering device 455 for media information stored in memory 450. This device can be software in the form of programs and plug-ins, and includes the following software modules: a positioning module 4551, a determining module 4552, a rendering module 4553, and an updating module 4554. These modules are logical and can therefore be arbitrarily combined or further divided according to the functions they implement. The functions of each module will be described below.
[0074] In other embodiments, the apparatus provided in this application can be implemented in hardware. As an example, the watermark rendering apparatus for media information provided in this application can be a processor in the form of a hardware decoding processor, which is programmed to execute the watermark rendering method for media information provided in this application. For example, the processor in the form of a hardware decoding processor can be one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), or other electronic components.
[0075] In some embodiments, the terminal or server can implement the watermark rendering method for media information provided in this application by running a computer program. For example, the computer program can be a native program or software module in an operating system; it can be a native application (APP), that is, a program that needs to be installed in the operating system to run, such as an instant messaging APP or a web browser APP; it can also be a mini-program, that is, a program that only needs to be downloaded into the browser environment to run; or it can be a mini-program that can be embedded in any APP. In short, the above-mentioned computer program can be any form of application, module or plugin.
[0076] Based on the above description of the media information watermark rendering system and electronic device provided in the embodiments of this application, the watermark rendering method for media information provided in the embodiments of this application is described below. In actual implementation, the method can be implemented by the terminal or the server alone, or by the terminal and the server working together. The following description takes as an example the server 200 in Figure 1 executing the watermark rendering method for media information provided in this embodiment of the application. Referring to Figure 3, Figure 3 is a flowchart illustrating the watermark rendering method for media information provided in the embodiments of this application, which will be explained in conjunction with the steps shown in Figure 3.
[0077] Step 101: When the terminal displays media information on the media information display interface, the server locates the area of interest of the target object in the media information display interface.
[0078] It should be noted that the target object can be the user of the terminal, the media information can be, for example, a video or an image, and the media information display interface can be the terminal's display screen. Alternatively, when the media information is displayed using a projection device, the media information display interface can also be a screen or other device outside the terminal used to display the projected media information. When a user runs an application for media information display on the terminal to view the media information, the viewed media information is displayed on the terminal's media information display interface. Thus, while the user is viewing the media information, the server locates the target object's area of interest within the media information display interface in real time.
[0079] In actual implementation, when a terminal displays media information on a media information display interface, there are multiple ways for the server to locate the target object's area of interest in the media information display interface. These ways are explained below.
[0080] In some embodiments, the process of locating the area of interest of a target object in a media information display interface specifically includes: acquiring a facial image of the target object using an image acquisition device; performing gaze analysis on the facial image to obtain gaze analysis results for indicating the gaze of the target object; and determining the area in the media information display interface that the target object is gazing at as the area of interest based on the gaze analysis results.
[0081] In practical implementation, the facial image of the target object is captured as follows: while the user is viewing the media information, the terminal's image acquisition device starts working and detects the user's facial image in real time. Here, the image acquisition device can be a camera, such as a monocular camera, a binocular camera, a depth camera, or a three-dimensional (3D) camera. For example, the camera is activated to scan the target object in its field of view in real time and generates images, i.e., facial images, at a specified frame rate. Alternatively, the image acquisition device can be a radar device such as LiDAR or millimeter-wave radar. LiDAR is a radar device that detects the position, velocity, attitude, shape, and other feature data of a target object by emitting a laser beam. Millimeter-wave radar is a radar device that operates in the millimeter-wave band. The radar device emits detection signals toward the target object in real time, receives the echo signals reflected from the target object, and determines the feature data of the target object based on the difference between the detection signal and the echo signal. The radar device uses multiple transmitters and receivers, and the acquired image is a 3D point cloud image, i.e., the facial image.
[0082] In practice, after acquiring the user's facial image, gaze analysis is performed on the facial image. The methods for obtaining gaze analysis results include at least eye-tracking technology, electromagnetic coil method, electrooculography method, and contact lens method.
[0083] In some embodiments, the process of performing gaze analysis on a face image to obtain gaze analysis results specifically includes: extracting features from the face image to obtain the gaze vector and head pose vector of the target object; adjusting the gaze vector based on the head pose vector to obtain a target gaze vector used to indicate the gaze of the target object; and using the target gaze vector as the gaze analysis result.
[0084] In practice, the process of extracting features from a face image to obtain the gaze vector and head pose vector of the target object involves extracting features from the face image to obtain the display area of the target object's eyes and the infrared reflection spots of the pupils; determining the target object's head pose vector based on the display area of the target object's eyes; and determining the target object's gaze vector through the infrared reflection spots of the target object's pupils.
[0085] The process of determining the gaze vector of a target object specifically includes: first, identifying the location of the user's eyes through a facial image; then, identifying the user's eyeballs and determining the coordinates of the infrared reflection spots of the user's left and right eyeballs in the corresponding facial image; calculating the left eye gaze vector based on the coordinates of the infrared reflection spots of the left eyeball and the center coordinates of the left pupil, and calculating the right eye gaze vector based on the coordinates of the infrared reflection spots of the right eyeball and the center coordinates of the right pupil; and finally, determining the user's gaze vector based on the left and right eye gaze vectors.
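The per-eye calculation in paragraph [0085] can be sketched in Python. The application does not give formulas, so this sketch assumes that each eye's gaze vector is the two-dimensional offset from the infrared reflection spot to the pupil center in image coordinates, and that the user's gaze vector is the mean of the two per-eye vectors. Both choices and both function names are hypothetical.

```python
def eye_gaze_vector(reflection_spot, pupil_center):
    """Offset from the infrared reflection spot to the pupil center,
    both given as (x, y) image coordinates (assumed definition)."""
    return (pupil_center[0] - reflection_spot[0],
            pupil_center[1] - reflection_spot[1])


def combined_gaze_vector(left_vec, right_vec):
    """Mean of the left-eye and right-eye gaze vectors (assumed way of
    combining the two eyes)."""
    return ((left_vec[0] + right_vec[0]) / 2,
            (left_vec[1] + right_vec[1]) / 2)


# Example with made-up coordinates for the two eyes.
left = eye_gaze_vector(reflection_spot=(10, 10), pupil_center=(12, 11))
right = eye_gaze_vector(reflection_spot=(30, 10), pupil_center=(34, 13))
gaze = combined_gaze_vector(left, right)
```

A practical system would additionally calibrate this image-space vector against known screen positions before mapping it to a point on the display interface.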
[0086] It should be noted that during image acquisition, the image acquisition device illuminates the user with an external auxiliary light source, such as an infrared light source, and then detects and records the infrared light reflected from different areas. Then, based on the position of the eyes, the light reflected from the corresponding area of the user's eyes is determined. Here, since the emitted infrared light forms infrared reflection spots after illuminating the human eye, after determining the light reflected from the corresponding area of the user's eyes, the coordinates of the infrared reflection spots of the left eye and the right eye in the face image can be obtained. Then, combined with the coordinates of the left and right pupil centers determined by the position of the eyes, the left eye gaze vector and the right eye gaze vector are calculated respectively to determine the gaze vector.
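The per-eye calculation described above can be sketched as follows. This is a minimal illustration only: the application does not state the formula, so the sketch assumes that each eye's gaze vector is the pupil center minus the infrared reflection spot (glint) in image coordinates, and that the two eyes are combined by averaging. The function names are hypothetical.

```python
from typing import Tuple

Vec2 = Tuple[float, float]


def eye_gaze_vector(glint: Vec2, pupil_center: Vec2) -> Vec2:
    """Per-eye gaze vector, ASSUMED here to be pupil center minus glint."""
    return (pupil_center[0] - glint[0], pupil_center[1] - glint[1])


def combined_gaze_vector(left: Vec2, right: Vec2) -> Vec2:
    """Combine both eyes into one gaze vector; averaging is an assumption."""
    return ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)


# Example with made-up image coordinates (pixels).
left = eye_gaze_vector(glint=(100.0, 50.0), pupil_center=(104.0, 52.0))
right = eye_gaze_vector(glint=(160.0, 50.0), pupil_center=(162.0, 52.0))
gaze = combined_gaze_vector(left, right)
```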
[0087] The process of obtaining the head pose vector of the target object specifically includes: first, identifying the location of the user's eyes in the facial image and determining the outline of the user's eyes; then, based on the outline, determining the area of the left eye and the area of the right eye in the facial image; calculating the difference between the area of the left eye and the area of the right eye; and looking up this difference in a preset mapping relationship between eye-area differences and head pose vectors, thereby determining the head pose vector corresponding to the difference.
[0088] It should be noted that when a user's head turns to the left, the left eye area in the resulting face image is generally smaller than the right eye area; conversely, when the user's head turns to the right, the left eye area is generally larger than the right eye area. Therefore, the left and right eye images can first be extracted from the face image and their areas calculated. The area difference, taken here as the left eye area minus the right eye area, is used to determine the head pose vector that represents the direction of the user's head rotation. If the area difference is greater than zero, the face is facing right; if it is less than zero, the face is facing left. A preset mapping relationship between the area difference of the user's two eyes and the head pose vector allows the head pose vector to be determined rapidly from the currently calculated difference.
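As an illustration of the preset mapping described above, the following sketch looks up a head pose from the eye-area difference. The bucket boundaries and angles are invented for the example, and the head pose is reduced to a single yaw angle for brevity; the application itself only states that a preset mapping exists.

```python
import bisect

# HYPOTHETICAL preset mapping: upper bounds on (left_area - right_area),
# in square pixels, and the yaw angle assigned to each bucket.
AREA_DIFF_BOUNDS = [-200.0, -50.0, 50.0, 200.0]
YAW_DEGREES = [-30.0, -15.0, 0.0, 15.0, 30.0]  # negative = facing left


def head_pose_from_eye_areas(left_area: float, right_area: float) -> float:
    """Return a yaw angle looked up from the eye-area difference."""
    diff = left_area - right_area
    bucket = bisect.bisect_right(AREA_DIFF_BOUNDS, diff)
    return YAW_DEGREES[bucket]


frontal = head_pose_from_eye_areas(400.0, 400.0)   # equal areas
turned_left = head_pose_from_eye_areas(300.0, 420.0)  # left eye smaller
```

A negative difference maps to a negative (leftward) yaw and a positive difference to a positive (rightward) yaw, which matches the sign convention in the paragraph above.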
[0089] It should be noted that for the same gaze vector, different head postures can lead to differences in the final area of focus. Therefore, the gaze vector is adjusted based on the head posture vector to obtain the target gaze vector. Here, the target gaze vector is used to indicate the user's gaze direction when the user's head is facing the media information display interface.
[0090] In practice, after obtaining the gaze analysis results, the process of determining the area in the media information display interface that the target object is looking at as the region of interest based on the gaze analysis results specifically includes: locating the area being gazed at by the target object's eyes based on the gaze analysis results; and when the area is located on the media information display interface, determining the corresponding area in the media information display interface as the region of interest. Here, when the area is not located on the media information display interface, a new gaze analysis is performed on the newly acquired facial image to relocate the area being gazed at by the target object's eyes based on the obtained gaze analysis results.
[0091] In other embodiments, the process of locating the area of interest of a target object in the media information display interface specifically includes: acquiring the voice data of the target object through an audio acquisition device; performing semantic analysis on the voice data to obtain a semantic analysis result, wherein the semantic analysis result includes target text words, which are used to indicate the position that the target object is looking at on the media information display interface; and determining the area of interest of the target object in the media information display interface based on the position indicated by the target text words in the semantic analysis result. Specifically, in response to an audio acquisition command triggered by the target object, the voice data of the target object is acquired through an audio acquisition device, and the voice data is then converted into text to obtain text data corresponding to the voice data; semantic analysis is performed on the text data to obtain the text words in the text data that represent the position that the target object is looking at; the text words are matched with the content of the displayed media information, and when the matching result indicates a successful match, the text words are taken as target text words, and the position that the target object is looking at on the media information display interface is determined based on the position corresponding to the target text words in the displayed media information, this position being taken as the area of interest of the target object in the media information display interface.
[0092] As an example, when the media display interface is an internal meeting interface and the media information is a meeting document, in response to the user-triggered explanation acquisition command, the user's explanation of the meeting document is acquired through an audio acquisition device such as a recording device. By performing semantic analysis on the explanation, the text words in the explanation that represent the position of the target object's gaze are determined. For example, if the explanation is "Let's see the summary section," then "summary" can be used as a text word representing the position of the target object's gaze. Then, this text word is matched with the content in the displayed meeting document. Based on the matching result, it is determined that the meeting document contains the word "summary," so the word "summary" is taken as the target text word. The position of the word "summary" in the displayed meeting document is used to determine the position of the target object's gaze on the meeting interface, thereby determining the user's attention area on the meeting interface.
[0093] As an example, when the media display interface is a song recording interface in a singing application, and the media information is song lyrics, in response to the user's triggered singing recording command, the user's singing content is recorded through an audio acquisition device such as a song recording device. By performing semantic analysis on the singing content, the text words in the singing content that represent the position of the user's gaze are determined. Then, the text words are matched with the displayed lyrics. Based on the matching result, it is determined that the displayed lyrics contain the text words, thus taking the text words as target text words. The position of the target text words in the displayed lyrics is then used to determine the position of the target object's gaze on the song recording interface, thereby determining the user's attention area on the song recording interface.
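The matching step in the two examples above can be sketched as follows. This is a deliberately simplified stand-in: real semantic analysis is replaced by a plain word scan with a crude length filter, and the displayed content is assumed to be available as a list of text lines. The function name and the filter are hypothetical.

```python
from typing import List, Optional, Tuple


def locate_target_word(
    spoken_text: str, displayed_lines: List[str]
) -> Optional[Tuple[str, int, int]]:
    """Return (word, line_index, column) for the first spoken word that
    also appears in the displayed content, or None if nothing matches."""
    for raw in spoken_text.lower().split():
        word = raw.strip(".,!?\"'")
        if len(word) < 4:  # crude stop-word filter (an assumption)
            continue
        for line_index, line in enumerate(displayed_lines):
            column = line.lower().find(word)
            if column != -1:
                return (word, line_index, column)
    return None


document = ["Quarterly Report", "Summary", "Details"]
match = locate_target_word("Let's see the summary section", document)
```

The returned line index and column would then be converted to a position on the display interface by the layout engine, which is outside the scope of this sketch.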
[0094] It should be noted that the user can trigger the audio acquisition command by operating a function item used for audio acquisition, or by voice, for example by speaking a command such as "acquire voice". The ways of triggering the audio acquisition command include but are not limited to these two, and this application embodiment does not limit them.
[0095] In some embodiments, the convergence point of the target object's gaze in the media information display interface can also be obtained; the convergence point is used as the center point of the image, and the target image is drawn according to the target length, with the area corresponding to the target image in the media information display interface as the area of interest. It should be noted that the target image and target length are preset here.
[0096] As an example, see Figure 4, which is a schematic diagram illustrating the determination of the region of interest provided in an embodiment of this application. As shown in Figure 4, when the target image is a rectangle whose length and width are h and w respectively, the rectangle is drawn with point P(X,Y) as the center point and h and w as the length and width. The area corresponding to the drawn rectangle in the media information display interface is then taken as the area of interest.
[0097] As an example, see Figure 5, which is a schematic diagram illustrating the determination of the region of interest provided in an embodiment of this application. As shown in Figure 5, when the target image is a circle and the target length, i.e. the radius of the circle, is r, a circle is drawn with point P(X,Y) as the center point and r as the radius. The area corresponding to the drawn circle in the media information display interface is then taken as the area of interest.
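The two constructions above can be sketched as follows, assuming a simple (x, y, w, h) rectangle type whose (x, y) is the corner with the smallest coordinates. The type and function names are illustrative and not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float  # corner with the smallest x
    y: float  # corner with the smallest y
    w: float
    h: float


def rect_roi(px: float, py: float, w: float, h: float) -> Rect:
    """Rectangle of size w x h centered on the convergence point P(px, py)."""
    return Rect(px - w / 2, py - h / 2, w, h)


def in_circle_roi(qx: float, qy: float, px: float, py: float, r: float) -> bool:
    """True if point Q lies inside the circle of radius r centered on P."""
    return (qx - px) ** 2 + (qy - py) ** 2 <= r ** 2


roi = rect_roi(100, 80, w=40, h=20)
inside = in_circle_roi(103, 84, 100, 80, r=5)
```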
[0098] Step 102: Based on the area of interest, determine the watermark area of the media information; wherein, the watermark area is included in other display areas outside the area of interest in the media information display interface.
[0099] It should be noted that the watermark area can be the entire area of the display area other than the area of interest in the media information display interface, or it can be a part of the area.
[0100] In some embodiments, when the watermark area encompasses the entirety of all display areas other than the area of interest in the media information display interface, the process of determining the watermark area based on the area of interest specifically includes: acquiring other areas on the media information display interface besides the area of interest; and determining these other areas as the watermark area. For example, see Figure 6, which is a schematic diagram of the watermark area of media information determined based on the region of interest, provided in an embodiment of this application. As shown in Figure 6, after determining the rectangular area of interest, the entire area of the media information display interface, excluding the rectangular area, is used as the watermark area for watermark rendering.
[0101] In other embodiments, when the watermark area is a portion of another display area outside the area of interest in the media information display interface, the process of determining the watermark area based on the area of interest specifically includes: acquiring other areas on the media information display interface besides the area of interest; comparing the area size of the other areas with an area threshold; and when the comparison result indicates that the area size of the other areas is greater than the area threshold, selecting a portion of the other areas as the watermark area. For example, see Figure 7, which is a schematic diagram of the watermark area of media information determined based on the region of interest, provided in an embodiment of this application. As shown in Figure 7, after determining the rectangular area of interest, the area outside the rectangular area in the media information display interface is determined, and a rectangular area is selected from this area as the watermark area for watermark rendering.
[0102] In practical implementation, when the watermark area is a portion of a display area outside the focus area in the media information display interface, multiple areas can be selected as the watermark area. Specifically, the size of the watermark to be rendered in the watermark area is obtained; based on the size, at least one rectangular area that matches the watermark to be rendered is selected from the other areas as the watermark area. For example, see Figure 8, which is a schematic diagram of the watermark area of media information determined based on the region of interest, provided in an embodiment of this application. As shown in Figure 8, after determining the rectangular area of interest, the area outside the rectangular area in the media information display interface is determined, and four rectangular areas are selected from this area as watermark areas for watermark rendering.
[0103] It should be noted that at least one rectangular area that is compatible with the watermark to be rendered can be selected from other areas as the watermark area, or at least one circular area that is compatible with the watermark to be rendered can be selected from other areas as the watermark area. The shape of the selected area includes, but is not limited to, the above two types. This application embodiment does not limit this.
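One possible way to select rectangular watermark areas that match the watermark size is sketched below. The application does not prescribe a selection strategy; the sketch assumes the area of interest lies fully inside the interface, splits the remaining area into four bands around it, and keeps the bands large enough to hold the watermark. Rectangles are (x, y, w, h) tuples and all names are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, w, h)


def candidate_watermark_areas(
    screen: Rect, roi: Rect, wm_w: float, wm_h: float
) -> List[Rect]:
    """Return the bands around the ROI that can hold a wm_w x wm_h watermark."""
    sx, sy, sw, sh = screen
    rx, ry, rw, rh = roi
    bands = [
        (sx, sy, rx - sx, sh),                    # band left of the ROI
        (rx + rw, sy, sx + sw - (rx + rw), sh),   # band right of the ROI
        (rx, sy, rw, ry - sy),                    # band on the low-y side
        (rx, ry + rh, rw, sy + sh - (ry + rh)),   # band on the high-y side
    ]
    return [b for b in bands if b[2] >= wm_w and b[3] >= wm_h]


screen = (0, 0, 1000, 600)
roi = (400, 200, 200, 200)
areas = candidate_watermark_areas(screen, roi, wm_w=150, wm_h=50)
wide = candidate_watermark_areas(screen, roi, wm_w=250, wm_h=50)
```

With the small watermark all four bands qualify; with the wider watermark only the left and right bands remain.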
[0104] Step 103: Watermark the media information displayed in the watermark area and detect the area of interest.
[0105] It should be noted that detecting the area of interest specifically includes detecting changes in the area of interest, including but not limited to the size of the area of interest and its position on the media display interface.
[0106] As an example, detecting a change in the size of the region of interest specifically includes: obtaining the new region of interest of the target object relative to the media information after watermark rendering; obtaining the size of the new region of interest and the size of the region of interest; calculating the difference between the two sizes; comparing the absolute value of the difference with a difference threshold to obtain a comparison result; and, based on the comparison result, determining a detection result that indicates whether the region of interest has changed.
[0107] As an example, detecting a change in the position of the region of interest specifically includes: acquiring the new region of interest of the target object relative to the watermarked media information; comparing the new region of interest with the existing region of interest to obtain a first overlap rate; comparing the first overlap rate with a first overlap rate threshold to obtain a comparison result; and, based on the comparison result, determining a detection result that indicates whether the region of interest has changed. Specifically, the first overlap rate can be obtained by acquiring the position of the new region of interest and the position of the existing region of interest and comparing the two positions.
[0108] It should be noted that the detection of the area of interest can be performed in real time or periodically. For new areas of interest, the acquisition time is after the determination time of the area of interest. That is, the user only watches the media information corresponding to the new area of interest on the media information display interface after watching the media information corresponding to the area of interest on the media information display interface.
[0109] Step 104: When the detection result indicates a change in the region of interest, update the watermark rendering result of the media information based on the changed region of interest.
[0110] It should be noted that the detection result can indicate either that the region of interest has changed or that it has not changed. When detecting changes in the position of the region of interest, the process of determining the detection result based on the comparison result specifically includes: when the first overlap rate is less than the first overlap rate threshold, a detection result indicating a change in the region of interest is determined; when the first overlap rate reaches the first overlap rate threshold, a detection result indicating that the region of interest has not changed is determined. When detecting changes in the size of the region of interest, the process of determining the detection result based on the comparison result specifically includes: when the absolute value of the difference reaches the difference threshold, a detection result indicating a change in the region of interest is determined; when the absolute value of the difference is less than the difference threshold, a detection result indicating that the region of interest has not changed is determined. It should be noted that the first overlap rate threshold and the difference threshold can be preset. For example, if the first overlap rate threshold is 90%, a detection result indicating a change in the region of interest is determined when the first overlap rate is 80%, and a detection result indicating no change in the region of interest is determined when the first overlap rate is 95%. Here, after a detection result indicating that the region of interest has changed is determined, the watermark rendering result of the media information is updated based on the new region of interest.
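The two threshold tests above can be sketched as follows. The application does not define how the first overlap rate is computed; this sketch assumes intersection over union of two (x, y, w, h) rectangles, and measures size as rectangle area. The function names and the default threshold of 0.9 (taken from the 90% example) are illustrative.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (x, y, w, h)


def overlap_rate(old: Rect, new: Rect) -> float:
    """Intersection over union of two rectangles (an ASSUMED definition)."""
    ox, oy, ow, oh = old
    nx, ny, nw, nh = new
    iw = max(0.0, min(ox + ow, nx + nw) - max(ox, nx))
    ih = max(0.0, min(oy + oh, ny + nh) - max(oy, ny))
    inter = iw * ih
    union = ow * oh + nw * nh - inter
    return inter / union if union > 0 else 0.0


def position_changed(old: Rect, new: Rect, threshold: float = 0.9) -> bool:
    """Changed when the first overlap rate is below the threshold."""
    return overlap_rate(old, new) < threshold


def size_changed(old: Rect, new: Rect, diff_threshold: float) -> bool:
    """Changed when the absolute area difference reaches the threshold."""
    return abs(new[2] * new[3] - old[2] * old[3]) >= diff_threshold


old_roi = (0, 0, 100, 100)
moved = position_changed(old_roi, (50, 0, 100, 100))
same = position_changed(old_roi, (0, 0, 100, 100))
```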
[0111] In practice, the process of updating the watermark rendering result of the media information based on the changed or new area of interest specifically includes: when the detection result indicates that the area of interest has changed, determining the position of the changed area of interest in the media information display interface; adjusting the watermark area based on the position of the changed area of interest to obtain a new watermark area; and re-watermarking the media information based on the new watermark area.
[0112] For example, see Figure 9, which is a schematic diagram of watermarked media information provided in an embodiment of this application. As shown in Figure 9, the area within the dashed box 901 represents the user's focus area, while the surrounding area is the watermark area; after watermark rendering, the watermark rendering result shown in Figure 9 is presented. When the detection result indicates a change in the region of interest, see Figure 10, which is a schematic diagram of watermarked media information provided in an embodiment of this application. As shown in Figure 10, the dashed box 1001 represents the changed area of interest. After determining the position of the changed area of interest, the watermark area is adjusted based on this position to obtain a new watermark area. The media information is then re-watermarked based on this new watermark area, presenting the watermark rendering result shown in Figure 10.
[0113] It should be noted that the process of adjusting the watermark area based on the changed position of the area of interest is the same as the process of determining the watermark area of media information based on the aforementioned area of interest. Therefore, this embodiment will not be described in detail.
[0114] In some embodiments, when the detection result indicates that the region of interest has not changed, the size of the media information display interface can also be detected. If the size of the media information display interface changes, for example because the interface is made smaller, then after detecting the region of interest, the media information display interface is compared with the region of interest to obtain a second overlap rate between them, and the second overlap rate is compared with a second overlap rate threshold. When the second overlap rate reaches the second overlap rate threshold, watermark rendering is performed on the media information displayed on the media information display interface. When the second overlap rate is less than the second overlap rate threshold, a target watermark region is determined based on the region of interest, and watermark rendering is performed on the media information displayed in the target watermark region. Specifically, the second overlap rate can be obtained as the ratio of the size of the area of interest to the size of the media information display interface. It should be noted that the second overlap rate threshold can also be preset. For example, if the second overlap rate threshold is 50%, when the ratio is 0.7, that is, the second overlap rate is 70%, watermark rendering is performed on the media information displayed on the media information display interface; when the ratio is 0.3, that is, the second overlap rate is 30%, the target watermark area is determined based on the area of interest, and watermark rendering is performed on the media information displayed in the target watermark area.
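The decision on the second overlap rate can be sketched as below, using the ratio of the area of interest to the display interface as described. Rectangles are (x, y, w, h) tuples; the function names and the returned labels are hypothetical.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (x, y, w, h)


def second_overlap_rate(roi: Rect, interface: Rect) -> float:
    """Ratio of the ROI area to the display interface area."""
    return (roi[2] * roi[3]) / (interface[2] * interface[3])


def choose_watermark_scope(
    roi: Rect, interface: Rect, threshold: float = 0.5
) -> str:
    """'whole_interface' when the rate reaches the threshold,
    otherwise 'outside_roi' (labels are illustrative)."""
    if second_overlap_rate(roi, interface) >= threshold:
        return "whole_interface"
    return "outside_roi"


interface = (0, 0, 100, 100)
large = choose_watermark_scope((0, 0, 100, 70), interface)  # rate 0.7
small = choose_watermark_scope((0, 0, 100, 30), interface)  # rate 0.3
```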
[0115] In practice, when the second overlap rate reaches the second overlap rate threshold, the size of the attention area can be adjusted to reduce the second overlap rate between the media information display interface and the attention area, so that the second overlap rate falls below the second overlap rate threshold. Specifically, adjusting the size of the attention area involves obtaining a backup length and the convergence point of the target object's gaze in the media information display interface; using the convergence point as the center point of the image, drawing the target image according to the backup length, and using the area corresponding to the target image in the media information display interface as the adjusted attention area, thereby obtaining the overlap rate between the adjusted attention area and the media information display interface for the subsequent processing described above.
[0116] It should be noted that multiple backup lengths can exist. The backup lengths can be tried one at a time in order from largest to smallest, each being processed as described above, until the overlap rate between the attention area adjusted based on the selected backup length and the media information display interface is less than the second overlap rate threshold. Here, when the overlap rate between the attention area adjusted based on every backup length and the media information display interface still reaches the second overlap rate threshold, watermark rendering is directly applied to the media information displayed on the media information display interface.
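The fallback over backup lengths can be sketched as follows. The sketch assumes the target image is a square whose side is the backup length, that the lengths are tried from largest to smallest, and that clipping at the interface edges can be ignored. Returning None stands for the case where the whole interface is watermarked. All names are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, w, h)


def shrink_roi(
    center: Tuple[float, float],
    backup_lengths: Iterable[float],
    interface_w: float,
    interface_h: float,
    threshold: float = 0.5,
) -> Optional[Rect]:
    """Return the first square ROI whose area ratio is below the threshold,
    trying backup lengths from largest to smallest; None if none qualifies."""
    px, py = center
    for side in sorted(backup_lengths, reverse=True):
        rate = (side * side) / (interface_w * interface_h)
        if rate < threshold:
            return (px - side / 2, py - side / 2, side, side)
    return None


adjusted = shrink_roi((50, 50), [40, 90, 60], 100, 100)
fallback = shrink_roi((50, 50), [90, 80], 100, 100)
```

In the first call the length 90 gives a rate of 0.81 and is rejected, while 60 gives 0.36 and is accepted; in the second call both lengths are rejected.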
[0117] The watermark rendering method for media information provided in the embodiments of this application will be further described below. See Figure 11, which is a flowchart illustrating the watermark rendering method for media information provided in this application embodiment. As shown in Figure 11, the watermark rendering method for media information provided in this application embodiment is implemented collaboratively by the client and the server.
[0118] Step 201: In response to the triggered operation on the media information, the client displays the media information on the media information display interface.
[0119] In practice, the client can be a video playback client set up on the terminal for playing videos, or a live conference client for live streaming meetings. The media information can be documents, videos, or images. When the client is a video playback client set up on the terminal for playing videos, the triggering operation for the media information can be initiated by the user through the client's human-computer interaction interface, triggering the video playback function in the human-computer interaction interface to make the terminal play the corresponding video. When the client is a live conference client set up on the terminal for live streaming meetings, the triggering operation for the media information can be initiated by the user through the client's human-computer interaction interface, triggering the start meeting function in the human-computer interaction interface to make the terminal start live streaming the meeting.
[0120] Step 202: When the client displays media information on the media information display interface, it acquires the face image of the target object through the image acquisition device.
[0121] In practice, the image acquisition device can be a camera. The facial image of the target object can be captured by the camera, which is connected to the terminal for communication. After capturing the facial image of the target object, the camera transmits the facial image of the target object to the terminal, and the terminal automatically uploads it to the client.
[0122] Step 203: The client sends the captured face image to the server.
[0123] Step 204: Based on the received face image, the server locates the area of interest of the target object in the media information display interface.
[0124] Step 205: Obtain the areas other than the area of interest on the media information display interface, and identify the other areas as watermark areas.
[0125] Step 206: Watermark rendering is performed on the media information displayed in the watermark area.
[0126] Step 207: Send the watermarked media information to the client.
[0127] Step 208: The client displays the watermarked media information on the media information display interface, and while displaying the watermarked media information, it collects the face image of the target object again.
[0128] Step 209: Send the re-acquired face image to the server.
[0129] Step 210: Based on the received re-acquired face image, the server determines the new area of interest of the target object for the watermarked media information.
[0130] Step 211: Compare the new region of interest with the region of interest. When the comparison result indicates that the region of interest has changed, adjust the watermark region based on the position of the new region of interest to obtain the new watermark region.
[0131] In practice, comparing the new region of interest with the existing region of interest can be done by comparing at least one of their respective positions and sizes. Here, when the comparison result indicates that the overlap rate between the new region of interest and the existing region of interest is less than a pre-set overlap rate threshold, it is determined that the region of interest has changed.
[0132] Step 212: Based on the new watermark area, update the watermark rendering result of the media information.
[0133] Step 213: Send the updated watermark rendering result of the media information to the client.
[0134] Step 214: The client displays the updated watermark rendering result of the media information on the media information display interface.
[0135] By applying the above embodiments of this application, the watermark area outside the target object's area of interest in the media information is determined by locating the target object's area of interest. Watermark rendering is then performed on the media information displayed in the watermark area, and the watermark rendering result is updated when the area of interest changes. In this way, the target object's viewing area is avoided in real time, allowing watermark rendering in other locations. This ensures that the watermark is rendered correctly while preventing it from affecting the user's viewing experience.
[0136] The following will describe an exemplary application of the embodiments of this application in a real-world application scenario.
[0137] In related technologies, during live streaming on a company's intranet, to prevent content leakage, the name of an individual or company is rendered on the player (i.e., a watermark). This makes it easier to trace the source of the leak if the information is leaked. However, this practice of rendering watermarks according to specific areas, such as using specific horizontal and vertical fixed intervals, can easily obscure some key content, resulting in a poor user experience.
[0138] Based on this, this application provides a watermark rendering method for media information. The physical area of the screen whose content the user's eyes are focused on (the attention area) is identified, and watermark rendering on the terminal dynamically avoids this area so as not to affect the user's viewing experience.
[0139] The following section elaborates on this solution. Specifically, this technical solution requires the viewing device to have a camera acquisition capability and an eye-tracking system (image acquisition device). The eye-tracking system acquires the user's viewing area (attention area), and uses this as the basis for rendering the watermark.
[0140] In practical implementation, an eye-tracking system (image acquisition device) needs to be activated when the video (media information) begins playback. This system has the same lifecycle as the player and runs until the video playback ends. During operation, a camera continuously acquires the user's facial data (facial images) and performs eye recognition to determine the area of interest the user is viewing. The system samples at a limited frame rate so that an excessively high frame rate does not impact device performance. Since the human eye's reaction time is approximately 40 milliseconds, which corresponds to 25 frames per second, a frame rate of no more than 25 frames per second is sufficient. Too low a frame rate will cause noticeable latency. A frame rate between 5 and 25 frames per second can be selected based on the device's performance.
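The relationship between the chosen frame rate and the sampling interval can be illustrated with a small helper. The 5 to 25 frames per second bounds come from the paragraph above; the function name is hypothetical.

```python
def capture_interval_ms(
    requested_fps: float, min_fps: float = 5.0, max_fps: float = 25.0
) -> float:
    """Clamp the requested frame rate to [min_fps, max_fps] and return
    the interval between two captures in milliseconds."""
    fps = max(min_fps, min(max_fps, requested_fps))
    return 1000.0 / fps


fastest = capture_interval_ms(60)  # clamped to 25 fps
slowest = capture_interval_ms(1)   # clamped to 5 fps
```

At the upper bound of 25 frames per second the interval is 40 milliseconds, which matches the reaction time cited above.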
[0141] In practice, the identified user viewing area is typically represented by a rectangular local area. This can be represented using the complete coordinates of its vertices, or by the coordinates of any one point together with the rectangle's width and height, as shown in Figure 12, which is a schematic diagram of the user viewing area provided in the embodiments of this application. As shown in Figure 12, using the coordinates (X, Y) of the bottom left point A, along with the width and height, the area can be represented as follows: point B is (X, Y + height), point C is (X + width, Y + height), and point D is (X + width, Y). Alternatively, other coordinate representations can be used, as long as the coordinates of the four points can be determined. In this way, the user's viewing area can be represented using the coordinates of point A and the width and height of the rectangle.
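The corner coordinates given above follow directly from point A and the rectangle's width and height, as this short sketch shows. The function name is illustrative.

```python
from typing import Dict, Tuple


def rect_corners(
    x: float, y: float, width: float, height: float
) -> Dict[str, Tuple[float, float]]:
    """Corners of the viewing area, with A as the bottom left point."""
    return {
        "A": (x, y),
        "B": (x, y + height),
        "C": (x + width, y + height),
        "D": (x + width, y),
    }


corners = rect_corners(10, 20, width=100, height=50)
```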
[0142] In actual implementation, after the camera and eye-tracking system capture the area viewed by the user, the system needs to promptly transmit the data to the rendering system. If the data is consistent with the previous data, watermark repainting is not triggered; otherwise, watermark repainting is triggered. The operation of this system is independent of the player's state. In other words, if the player is paused and the viewing area data is inconsistent with the previous data, the watermark still needs to be repainted, so that the content the user wants to see is not obscured even when the video is paused.
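The repaint rule above (repaint only when the viewing area data differs from the previous data, regardless of player state) might be sketched as follows. The class `RepaintGate`, its method `on_region`, and the `player_paused` flag are hypothetical names introduced for illustration, and exact equality is assumed as the consistency test.

```python
from typing import Callable, Optional, Tuple

Region = Tuple[float, float, float, float]  # (x, y, width, height)


class RepaintGate:
    """Forward a viewing area to the renderer only when it has changed."""

    def __init__(self, repaint: Callable[[Region], None]) -> None:
        self._repaint = repaint
        self._last: Optional[Region] = None

    def on_region(self, region: Region, player_paused: bool = False) -> bool:
        """Return True if a repaint was triggered.

        player_paused is accepted but deliberately ignored: the check does
        not depend on the player's state.
        """
        if region == self._last:
            return False          # same data as before: no repaint
        self._last = region
        self._repaint(region)     # data changed: repaint the watermark
        return True


if __name__ == "__main__":
    gate = RepaintGate(lambda r: print("repaint around", r))
    gate.on_region((0, 0, 10, 10))
    gate.on_region((0, 0, 10, 10))                      # ignored, unchanged
    gate.on_region((5, 5, 10, 10), player_paused=True)  # still repaints
```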
[0143] It should be noted that, for ease of maintenance, the watermark rendering process should not alter the original watermark coordinate calculation system. Instead, a watermark position validity judgment logic should be added. After the original watermark coordinates are calculated, it should be determined which coordinates exist within the user's viewing area. If all of them are within the user's viewing area, then the coordinates need to be recalculated to obtain a new set of coordinates. If only some are within the user's viewing area, then the coordinates within the user's viewing area should not be rendered, while the watermarks with other coordinates should be rendered normally.
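The validity judgment described above can be sketched as a filter applied after the original coordinate calculation. This is a minimal sketch under stated assumptions: each watermark is reduced to a single coordinate point, `validate_positions` and `inside` are hypothetical names, and `recalculate` stands in for whatever recalculation the original coordinate system provides.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Region = Tuple[float, float, float, float]  # (x, y, width, height), bottom-left origin


def inside(p: Point, region: Region) -> bool:
    """True if point p lies within the viewing area (borders included)."""
    x, y, w, h = region
    return x <= p[0] <= x + w and y <= p[1] <= y + h


def validate_positions(coords: Sequence[Point],
                       region: Region,
                       recalculate: Callable[[Region], Sequence[Point]]) -> List[Point]:
    """Return the watermark coordinates that may actually be rendered."""
    outside = [p for p in coords if not inside(p, region)]
    if coords and not outside:
        # Every original coordinate falls in the viewing area:
        # ask for a new set, and filter it again as a safety net (an assumption).
        return [p for p in recalculate(region) if not inside(p, region)]
    # Some (or none) fall in the viewing area: render only those outside it.
    return outside


if __name__ == "__main__":
    viewing_area = (100, 100, 50, 50)
    fallback = lambda region: [(0, 0), (200, 200)]
    print(validate_positions([(10, 10), (120, 120)], viewing_area, fallback))
    print(validate_positions([(110, 110), (120, 120)], viewing_area, fallback))
```

Keeping the check as a separate step leaves the original coordinate calculation untouched, which is the maintenance concern raised above.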
[0144] Next, refer to Figure 13, which is a schematic diagram of the watermark rendering process for media information provided in the embodiments of this application. Based on Figure 13, the watermark rendering method for media information provided in the embodiments of this application can be implemented by executing steps 1 to 8, as follows.
Step 1: After the user clicks to play the video, the client's video playback scheduling module sends a request for video information to the video backend.
Step 2: After receiving the user's request, the video backend transmits the requested video information to the client. This information includes at least the video playback address and the watermark information. The watermark information can be an image or text: an image is delivered as a network address, and text is delivered as a text message.
Step 3: The obtained video playback address is passed to the player kernel, and the obtained watermark information is passed to the watermark rendering module.
Step 4: After receiving the playback address, the player kernel starts running, reads data from the server, and performs decoding and image/audio rendering. Once all preparations are complete, it notifies the video playback scheduling module to begin video playback.
Step 5: After the video starts playing, the eye-tracking system is notified. This system activates the camera and collects image data (facial images) at a preset frequency.
Step 6: The eye-tracking system calculates the area the user is currently viewing from the collected camera data (facial images). This area information is a rectangular area expressed in coordinates relative to the screen content area. It consists either of four specific coordinate values (the coordinates of the four corners of the rectangle) or of a single coordinate value plus a width and a height. For example, given the bottom-left corner (x, y), a horizontal extent width, and a vertical extent height, the other three corners are the top-left corner (x, y + height), the bottom-right corner (x + width, y), and the top-right corner (x + width, y + height). It should be noted that step 6 is a repetitive process: camera capture and calculation are performed at regular intervals (which can be customized by the program, generally no less than 1 second) to ensure that the acquired information reflects the latest user state, continuing until video viewing ends.
Step 7: The data calculated in step 6 is passed to the watermark rendering module.
Step 8: The watermark rendering module performs its calculation based on the transmitted user viewing area data. The watermark rendering calculation rules must avoid this area; that is, the watermark cannot be rendered within this area, while it must still be rendered outside the area, so that normal user viewing is guaranteed. This step is also a repetitive process: watermark rendering is performed continuously, and the watermark is redrawn each time based on the transmitted user viewing area information.
[0145] By applying the above embodiments of this application, the watermark area outside the target object's area of interest in the media information is determined by locating the target object's area of interest. Watermark rendering is then performed on the media information displayed in the watermark area, and the watermark rendering result is updated when the area of interest changes. In this way, the target object's viewing area is avoided in real time, allowing watermark rendering in other locations. This ensures that the watermark is rendered correctly while preventing it from affecting the user's viewing experience.
[0146] The following description continues to illustrate the exemplary structure of the watermark rendering device 455 for media information provided in the embodiments of this application, implemented as software modules. In some embodiments, as shown in Figure 2, the software modules of the watermark rendering device 455 for media information stored in the memory 440 may include:
[0147] The positioning module 4551 is used to locate the area of interest of the target object in the media information display interface when the terminal displays media information on the media information display interface.
[0148] The determining module 4552 is used to determine the watermark area of the media information based on the area of interest; wherein, the watermark area is included in other display areas outside the area of interest in the media information display interface;
[0149] The rendering module 4553 is used to perform watermark rendering on the media information displayed in the watermark area and to detect the area of interest.
[0150] The update module 4554 is used to update the watermark rendering result of the media information according to the changed region of interest when the detection result indicates that the region of interest has changed.
[0151] In some embodiments, the positioning module 4551 is further configured to obtain the convergence point of the target object's gaze in the media information display interface; draw a target image based on the target length using the convergence point as the center point of the image, and use the area corresponding to the target image in the media information display interface as the area of interest.
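One way to picture the positioning step above is the sketch below. It rests on assumptions not stated in this application: the target image is taken to be a square whose side equals the target length, the interface origin is at (0, 0) in the bottom-left corner, and the square is clipped to the interface bounds. The function name `region_around_gaze` is hypothetical.

```python
from typing import Tuple

Region = Tuple[float, float, float, float]  # (x, y, width, height)


def region_around_gaze(gx: float, gy: float, target_length: float,
                       iface_w: float, iface_h: float) -> Region:
    """Square of side target_length centered on the gaze convergence point,
    clipped so that it stays inside the display interface."""
    half = target_length / 2
    left = max(0.0, gx - half)
    bottom = max(0.0, gy - half)
    right = min(iface_w, gx + half)
    top = min(iface_h, gy + half)
    return (left, bottom, right - left, top - bottom)


if __name__ == "__main__":
    # Gaze in the middle of a 1920x1080 interface, 40-unit target length.
    print(region_around_gaze(960, 540, 40, 1920, 1080))
    # Gaze near the corner: the square is clipped at the interface edge.
    print(region_around_gaze(10, 10, 40, 1920, 1080))
```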
[0152] In some embodiments, the rendering module 4553 is further configured to obtain a new area of interest of the target object on the media information after watermark rendering; compare the new area of interest with the area of interest to obtain a first overlap rate between the new area of interest and the area of interest; compare the first overlap rate with a first overlap rate threshold to obtain a comparison result; and determine a detection result for indicating whether the area of interest has changed based on the comparison result.
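The change detection above can be sketched as follows. This application does not define how the first overlap rate is computed in this passage, so the sketch assumes intersection-over-union of the two rectangles and assumes that an overlap rate below the threshold means the area of interest has changed. The names `overlap_rate` and `region_changed` and the default threshold of 0.8 are illustrative only.

```python
from typing import Tuple

Region = Tuple[float, float, float, float]  # (x, y, width, height)


def overlap_rate(a: Region, b: Region) -> float:
    """Intersection area divided by union area (assumed definition)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0                      # rectangles do not overlap
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union


def region_changed(old: Region, new: Region, threshold: float = 0.8) -> bool:
    """Detection result: True when the overlap falls below the threshold."""
    return overlap_rate(old, new) < threshold


if __name__ == "__main__":
    previous = (0, 0, 10, 10)
    print(region_changed(previous, (0, 0, 10, 10)))   # identical area
    print(region_changed(previous, (5, 0, 10, 10)))   # shifted by half its width
```

Using a threshold rather than strict equality keeps small gaze jitter from triggering a repaint on every sample.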
[0153] In some embodiments, the updating module 4554 is further configured to: determine the position of the changed region of interest in the media information display interface when the detection result indicates that the region of interest has changed; adjust the watermark region based on the position of the changed region of interest to obtain a new watermark region; and re-watermark the media information based on the new watermark region.
[0154] In some embodiments, when the detection result indicates that the region of interest has not changed and the size of the media information display interface has changed, the device further includes a first comparison module, which is used to compare the media information display interface with the region of interest to obtain a second overlap rate between the media information display interface and the region of interest; compare the second overlap rate with a second overlap rate threshold; and when the second overlap rate reaches the second overlap rate threshold, perform watermark rendering on the media information displayed on the media information display interface.
[0155] In some embodiments, the first comparison module is further configured to determine a target watermark region based on the region of interest when the second overlap rate is less than a second overlap rate threshold, and to perform watermark rendering on the media information displayed in the target watermark region.
[0156] In some embodiments, the determining module 4552 is further configured to obtain other areas on the media information display interface besides the area of interest; and determine the other areas as the watermark area.
[0157] In some embodiments, the determining module 4552 is further configured to obtain other areas on the media information display interface besides the area of interest; compare the area size of the other areas with an area threshold; and when the comparison result indicates that the area size of the other areas is greater than the area threshold, select a portion of the other areas as watermark areas.
[0158] In some embodiments, the determining module 4552 is further configured to obtain the size of the watermark to be rendered in the watermark region; and based on the size, select at least one rectangular region from the other regions that is compatible with the watermark to be rendered as the watermark region.
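Selecting rectangular regions that fit the watermark might look like the sketch below. It assumes one particular decomposition that this application does not prescribe: the space around the area of interest is split into four strips (left, right, above, below), and a strip is kept only if it is at least as wide and as tall as the watermark. The function name `candidate_watermark_regions` is hypothetical.

```python
from typing import List, Tuple

Region = Tuple[float, float, float, float]  # (x, y, width, height), bottom-left origin


def candidate_watermark_regions(iface: Region, roi: Region,
                                wm_w: float, wm_h: float) -> List[Region]:
    """Rectangles outside the area of interest that can hold the watermark."""
    ix, iy, iw, ih = iface
    rx, ry, rw, rh = roi
    # Clip the area of interest to the interface.
    left, right = max(ix, rx), min(ix + iw, rx + rw)
    bottom, top = max(iy, ry), min(iy + ih, ry + rh)
    if right <= left or top <= bottom:
        # Area of interest lies outside the interface: the whole interface is free.
        strips = [iface]
    else:
        strips = [
            (ix, iy, left - ix, ih),                    # left of the area of interest
            (right, iy, ix + iw - right, ih),           # right of it
            (left, top, right - left, iy + ih - top),   # above it
            (left, iy, right - left, bottom - iy),      # below it
        ]
    return [s for s in strips if s[2] >= wm_w and s[3] >= wm_h]


if __name__ == "__main__":
    interface = (0, 0, 100, 100)
    interest = (40, 40, 20, 20)
    print(candidate_watermark_regions(interface, interest, wm_w=30, wm_h=10))
```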
[0159] In some embodiments, the positioning module 4551 is further configured to acquire a facial image of the target object using an image acquisition device; perform gaze analysis on the facial image to obtain a gaze analysis result for indicating the gaze of the target object; and determine, based on the gaze analysis result, the area in the media information display interface that the target object is looking at as the attention area.
[0160] In some embodiments, the positioning module 4551 is further configured to extract features from the face image to obtain the gaze vector and head pose vector of the target object; adjust the gaze vector based on the head pose vector to obtain a target gaze vector for indicating the gaze of the target object, and use the target gaze vector as the gaze analysis result; locate the area that the eyes of the target object are looking at according to the gaze analysis result; when the area is located on the media information display interface, determine the corresponding area in the media information display interface as the attention area.
[0161] In some embodiments, the positioning module 4551 is further configured to collect voice data of the target object through an audio acquisition device; perform semantic analysis on the voice data to obtain a semantic analysis result, wherein the semantic analysis result includes target text words, the target text words being used to indicate the position that the target object is looking at on the media information display interface; and determine the area of attention of the target object on the media information display interface based on the position indicated by the target text words in the semantic analysis result, that is, the position that the target object is looking at on the media information display interface.
[0162] This application provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. An electronic device's processor reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the electronic device to perform the media information watermark rendering method described above in this application.
[0163] This application provides a computer-readable storage medium storing executable instructions. When these executable instructions are executed by a processor, they cause the processor to execute the watermark rendering method for media information provided in this application, for example, the watermark rendering method for media information shown in Figure 3.
[0164] In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or it may be a variety of devices including one or any combination of the above-mentioned memories.
[0165] In some embodiments, executable instructions may take the form of a program, software, software module, script, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
[0166] As an example, executable instructions may, but do not necessarily, correspond to files in a file system. They may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple collaborating files (e.g., a file that stores one or more modules, subroutines, or code sections).
[0167] As an example, executable instructions can be deployed to execute on a single electronic device, or on multiple electronic devices located in one location, or on multiple electronic devices distributed across multiple locations and interconnected via a communication network.
[0168] In summary, the embodiments of this application have the following beneficial effects:
[0169] By avoiding, in real time, the content area viewed by the target object, watermark rendering is performed in other locations. This ensures that the watermark is rendered normally while the watermark information does not affect the user's viewing experience.
[0170] The above description is merely an embodiment of this application and is not intended to limit the scope of protection of this application. Any modifications, equivalent substitutions, and improvements made within the spirit and scope of this application are included within the scope of protection of this application.
Claims
1. A method for watermarking media information, characterized in that the method includes: when the terminal displays media information on the media information display interface, capturing a face image of the target object in real time through the image acquisition device; performing gaze analysis on the face image to obtain a gaze analysis result used to indicate the gaze of the target object; based on the gaze analysis result, determining the area in the media information display interface that the target object is looking at as the area of interest; determining the watermark area of the media information based on the area of interest, wherein the watermark area is included in other display areas outside the area of interest in the media information display interface; performing watermark rendering on the media information displayed in the watermark area, wherein the watermark rendering includes: if all the original watermark coordinates corresponding to the watermark area are located within the area of interest, triggering coordinate recalculation to obtain new watermark coordinates, and if only some of the original watermark coordinates are located within the area of interest, rendering the watermark only at the original watermark coordinates outside the area of interest; detecting the area of interest, wherein the detection is independent of the playback status of the media information; and when the detection result indicates a change in the area of interest, updating the watermark rendering result of the media information based on the changed area of interest.
2. The method as described in claim 1, characterized in that, The detection of the region of interest includes: Obtain the new area of interest of the target object for the media information after watermark rendering; The new region of interest is compared with the region of interest to obtain the first overlap rate between the new region of interest and the region of interest; The first overlap rate is compared with the first overlap rate threshold to obtain the comparison result; Based on the comparison results, a detection result is determined to indicate whether the region of interest has changed.
3. The method as described in claim 1, characterized in that, When the detection result indicates a change in the region of interest, updating the watermark rendering result of the media information based on the changed region of interest includes: When the detection result indicates that the area of interest has changed, determine the position of the changed area of interest in the media information display interface. Based on the changed position of the area of interest, the watermark area is adjusted to obtain a new watermark area; Based on the new watermark area, the media information is re-watermarked.
4. The method as described in claim 1, characterized in that, When the detection result indicates that the region of interest has not changed, but the size of the media information display interface has changed, after detecting the region of interest, the method further includes: The media information display interface is compared with the area of interest to obtain a second overlap rate between the media information display interface and the area of interest. The second overlap rate is compared with the second overlap rate threshold. When the second overlap rate reaches the second overlap rate threshold, watermark rendering is applied to the media information displayed on the media information display interface.
5. The method as described in claim 4, characterized in that, The method further includes: When the second overlap rate is less than the second overlap rate threshold, a target watermark region is determined based on the region of interest, and the media information displayed in the target watermark region is watermarked.
6. The method as described in claim 1, characterized in that, Determining the watermark area of the media information based on the area of interest includes: Obtain other areas on the media information display interface besides the area of interest; The other areas are identified as the watermark areas.
7. The method as described in claim 1, characterized in that, Determining the watermark area of the media information based on the area of interest includes: Obtain other areas on the media information display interface besides the area of interest; The area size of the other regions is compared with the area threshold. When the comparison result indicates that the area of the other regions is greater than the area threshold, a portion of the other regions is selected as the watermark region.
8. The method as described in claim 7, characterized in that, The step of selecting a portion of the other regions as the watermark region includes: Obtain the size of the watermark to be rendered in the watermark area; Based on the size, at least one rectangular region that matches the watermark to be rendered is selected from the other regions as the watermark region.
9. The method of claim 1, wherein performing gaze analysis on the face image to obtain gaze analysis results for indicating the gaze of the target object includes: Feature extraction is performed on the face image to obtain the gaze vector and head pose vector of the target object; Based on the head posture vector, the gaze vector is adjusted to obtain a target gaze vector for indicating the gaze of the target object, and the target gaze vector is used as the gaze analysis result; The step of determining the area in the media information display interface that the target object is looking at as the attention area based on the gaze analysis results includes: Based on the gaze analysis results, locate the area that the target object's eyes are looking at; When the area is located on the media information display interface, the corresponding area in the media information display interface is determined as the area of interest.
10. A watermark rendering device for media information, characterized in that the device includes: a positioning module, used to capture a face image of a target object in real time through an image acquisition device when the terminal displays media information on the media information display interface, perform gaze analysis on the face image to obtain a gaze analysis result for indicating the gaze of the target object, and determine, based on the gaze analysis result, the area in the media information display interface that the target object is looking at as the area of interest; a determining module, used to determine the watermark area of the media information based on the area of interest, wherein the watermark area is included in other display areas outside the area of interest in the media information display interface; a rendering module, used to perform watermark rendering on the media information displayed in the watermark area, wherein the watermark rendering includes: if all the original watermark coordinates corresponding to the watermark area are located within the area of interest, triggering coordinate recalculation to obtain new watermark coordinates, and if only some of the original watermark coordinates are located within the area of interest, rendering the watermark only at the original watermark coordinates outside the area of interest; the rendering module being further used to detect the area of interest, wherein the detection is independent of the playback status of the media information; and an update module, used to update the watermark rendering result of the media information based on the changed area of interest when the detection result indicates a change in the area of interest.
11. The apparatus according to claim 10, characterized in that, The rendering module is also used for: Obtain the new area of interest of the target object for the media information after watermark rendering; The new region of interest is compared with the region of interest to obtain the first overlap rate between the new region of interest and the region of interest; The first overlap rate is compared with the first overlap rate threshold to obtain the comparison result; Based on the comparison results, a detection result is determined to indicate whether the region of interest has changed.
12. The apparatus according to claim 10, characterized in that, The update module is also used for: When the detection result indicates that the area of interest has changed, determine the position of the changed area of interest in the media information display interface. Based on the changed position of the area of interest, the watermark area is adjusted to obtain a new watermark area; Based on the new watermark area, the media information is re-watermarked.
13. An electronic device, characterized in that it includes: a memory, used to store executable instructions; and a processor, used to implement, when executing the executable instructions stored in the memory, the watermark rendering method for media information as described in any one of claims 1 to 9.
14. A computer-readable storage medium, characterized in that it stores executable instructions which, when executed by a processor, implement the watermark rendering method for media information as described in any one of claims 1 to 9.
15. A computer program product comprising computer instructions, characterized in that, When the computer instructions are executed by the processor, they implement the watermark rendering method according to any one of claims 1 to 9.