Display device, video playing method and related apparatus

By using video structure analysis and dynamic playback algorithms, key information is identified and redundant segments are removed to generate the optimal playback plan. This solves the conflict between long videos and user time, and enables a complete and coherent viewing experience within a limited time.

CN122269080A · Pending · Publication Date: 2026-06-23 · SHENZHEN TAILIWEI INTELLIGENT TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: SHENZHEN TAILIWEI INTELLIGENT TECHNOLOGY CO LTD
Filing Date: 2026-03-18
Publication Date: 2026-06-23

Abstract

Embodiments of the present application provide a display device, a video playing method and related apparatus, and relate to the technical field of display. The method comprises: obtaining a video stream to be played and a desired viewing time length of a user; performing content analysis on the video stream, and dividing the video stream into a plurality of video segments; determining a plurality of target video segments from the plurality of video segments according to the desired viewing time length, and determining a playing speed of each target video segment, wherein the number of the plurality of target video segments is less than the number of the plurality of video segments; merging the plurality of target video segments into a target video stream, and controlling a display screen to play the target video stream. The method solves the core technical problem that traditional static playback cannot adapt to a user's time budget, and effectively improves the user experience.

Description

Technical Field

[0001] This application relates to the field of display technology, and in particular to a display device, a video playback method, and related apparatus.

Background Technology

[0002] Currently, the duration of a single episode of long-form videos (such as TV dramas) is generally over 45 minutes, while users may only have 10-30 minutes of continuous viewing time available each day. This discrepancy between the length of long-form videos and the amount of time users can watch continuously leads to frequent interruptions and difficulties in connecting the plot.

[0003] While existing technologies offer options such as skipping the intro / outro and adjusting playback speed, they still cannot guarantee that users will have a complete, coherent, and focused viewing experience of the video content within a limited time, resulting in a poor user experience.

Summary of the Invention

[0004] This application provides a display device, a video playback method, and related apparatus to enable users to obtain a complete, coherent, and focused video content viewing experience within a limited time, thereby improving the user experience.

[0005] In a first aspect, embodiments of this application provide a display device, the display device comprising:

[0006] A display screen configured to play a video stream;

[0007] A controller connected to the display screen and configured to:

[0008] Obtain the video stream to be played and the user's expected viewing duration;

[0009] Perform content analysis on the video stream and divide the video stream into multiple video segments;

[0010] Based on the expected viewing duration, determine a plurality of target video segments from the plurality of video segments, and determine the playback speed of each of the target video segments, where the number of target video segments is less than the number of video segments;

[0011] Merge the multiple target video segments into a target video stream, and control the display screen to play the target video stream.

[0012] In some embodiments, the controller is configured to:

[0013] The video stream is decompressed into multiple image frames;

[0014] Obtain the feature difference value between each image frame and its adjacent image frames, and use the image frames with feature difference values greater than a preset threshold as the separation points;

[0015] The video stream is divided into multiple video segments based on the separation points.

[0016] In some embodiments, the controller is configured to:

[0017] Extract the dialogue from each of the video segments, perform semantic analysis on the dialogue, and determine the key information corresponding to each of the video segments;

[0018] Character and emotion recognition are performed on each of the video segments to obtain recognition results;

[0019] Based on the recognition results and the key information, the must-watch video segments are determined from among the multiple video segments;

[0020] Based on the expected viewing time and the must-watch video segments, multiple target video segments are determined from the multiple video segments.

[0021] In some embodiments, the controller is configured to:

[0022] When the total duration of the must-watch video segments exceeds the expected viewing time, the default playback speed of each of the must-watch video segments is adjusted according to the type of the must-watch video segments.

[0023] After the default playback speed is adjusted, if the total duration of the must-watch video segments is still greater than the expected viewing time, the target video segment is determined from the must-watch video segments based on the expected viewing time and the importance score corresponding to each of the must-watch video segments, so that the total duration of the target video segment is less than or equal to the expected viewing time.

[0024] In some embodiments, the controller is configured to:

[0025] A plot summary is generated based on the remaining must-watch video segments; the remaining must-watch video segments are those must-watch video segments that were not selected as target video segments.

[0026] The plot summary is inserted into the target video segments.

[0027] In some embodiments, the controller is configured to:

[0028] When the total duration of the must-watch video segments is less than the expected viewing time, the default playback speed of the must-watch video segments is adjusted according to their type.

[0029] Based on the remaining playback time and the importance score of the remaining video segments, important video segments are determined from the remaining video segments; the remaining video segments are video segments that are not the must-watch video segments.

[0030] The must-watch video segments and the important video segments are used as the target video segments.

[0031] In some embodiments, the controller is configured to:

[0032] The remaining video segments with an importance score greater than a preset threshold are used as the initial important video segments;

[0033] The default playback speed of the initial important video segments is adjusted according to their type.

[0034] Based on the remaining playback time, the top N initial important video segments ranked by importance score are selected as the important video segments, so that the total playback time of the top N initial important video segments is less than or equal to the remaining playback time.
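The top-N selection described in [0032] to [0034] can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_top_n`, the tuple layout, and the use of seconds are hypothetical and do not come from the application. Playback times are assumed to be already adjusted for playback speed, and N is taken as the longest score-ranked prefix that still fits in the remaining playback time.

```python
from typing import List, Tuple

# Each candidate is (segment id, importance score, playback time in seconds).
# The playback time is assumed to already reflect the adjusted playback speed.
Candidate = Tuple[str, float, float]


def select_top_n(candidates: List[Candidate], remaining_time: float) -> List[str]:
    """Return the ids of the top-N candidates by score that fit in remaining_time."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    chosen: List[str] = []
    used = 0.0
    for seg_id, _score, play_time in ranked:
        if used + play_time > remaining_time:
            break  # the next-ranked segment no longer fits, so N is fixed here
        chosen.append(seg_id)
        used += play_time
    return chosen
```

Stopping at the first segment that does not fit keeps the result a strict "top N by score" list; a variant that skips an oversized segment and keeps scanning would fill the time better but would no longer be a pure ranking prefix.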

[0035] In some embodiments, the controller is configured to:

[0036] Based on the content of the video clip and the user's historical viewing preferences, the video clip is scored to obtain an importance score for the video clip.

[0037] In some embodiments, the controller is configured to:

[0038] During the process of controlling the display screen to play the target video stream, user playback feedback information is obtained;

[0039] The target video stream is dynamically adjusted based on the playback feedback information.

[0040] In a second aspect, embodiments of this application provide a video playback method, including:

[0041] Obtain the video stream to be played and the user's expected viewing duration;

[0042] The video stream is subjected to content analysis, and the video stream is divided into multiple video segments;

[0043] Based on the expected viewing duration, a plurality of target video segments are determined from the plurality of video segments, and the playback speed of each of the target video segments is determined; the number of the plurality of target video segments is less than the number of the plurality of video segments.

[0044] Based on the playback speed of each target video segment, multiple target video segments are merged into a target video stream, and the target video stream is played.

[0045] In a third aspect, this application provides an electronic device, including: a memory and a processor;

[0046] The memory is used to store computer instructions; the processor is used to execute the computer instructions stored in the memory to implement the method of any one of the first aspects.

[0047] In a fourth aspect, this application provides a computer-readable storage medium having a computer program stored thereon, the computer program being executed by a processor to implement the method of any of the second aspects.

[0048] In a fifth aspect, this application provides a computer program product, including a computer program that, when executed by a processor, implements the method of any one of the second aspects.

[0049] The display device, video playback method, and related apparatus provided in the embodiments of this application acquire the video stream to be played and the user's expected viewing time; perform content analysis on the video stream, dividing the video stream into multiple video segments; determine multiple target video segments from the multiple video segments according to the expected viewing time, and determine the playback speed of each target video segment, where the number of target video segments is less than the number of video segments; merge the multiple target video segments into a target video stream, and control the display screen to play the target video stream. This method retains the important segments while compressing an originally long video to fit within the user's expected viewing time. At the same time, plot continuity constraints avoid logical gaps caused by cuts, ultimately fulfilling the user's core need to "watch TV by time," solving the core technical problem that traditional static playback cannot adapt to the user's time budget, and effectively improving the user experience.

Attached Figure Description

[0050] Figure 1 is a scenario diagram provided in an embodiment of this application;

[0051] Figure 2 is a schematic diagram of the structure of a display device provided in an embodiment of this application;

[0052] Figure 3 is a first flowchart of a video playback method provided in an embodiment of this application;

[0053] Figure 4 is a schematic diagram of a user input interface provided in an embodiment of this application;

[0054] Figure 5 is a second flowchart of a video playback method provided in an embodiment of this application;

[0055] Figure 6 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0056] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0057] In the embodiments of this application, the terms "first" and "second" are used to distinguish identical or similar items with essentially the same function and effect, without limiting their order. Those skilled in the art will understand that the terms "first" and "second" do not limit the quantity or execution order, and that the terms "first" and "second" do not necessarily imply that they are different.

[0058] It should be noted that, in the embodiments of this application, the terms "exemplary" or "for example" are used to indicate examples, illustrations, or descriptions. Any embodiment or design scheme described as "exemplary" or "for example" in this application should not be construed as being more preferred or advantageous than other embodiments or design schemes. Specifically, the use of terms such as "exemplary" or "for example" is intended to present the relevant concepts in a specific manner.

[0059] With the explosive growth of digital media content, the contradiction between users' demand for long-form videos (such as TV dramas, variety shows, and documentaries) and their fragmented time is becoming increasingly prominent. Modern working professionals, students, and housewives generally experience the pain point of "limited time but strong content needs".

[0060] For example, a single episode of a TV series is usually 45-60 minutes long, while users may only have 10-30 minutes of continuous viewing time per day, leading to frequent interruptions in watching the show and difficulties in connecting the plot.

[0061] While existing technologies offer options such as skipping the intro / outro and adjusting playback speed, they still cannot guarantee that users will have a complete, coherent, and focused viewing experience of the video content within a limited time, resulting in a poor user experience.

[0062] To address the aforementioned issues, this application provides a display device, a video playback method, and related apparatus. By combining user time budget, content analysis technology, and dynamic adjustment algorithms, it achieves intelligent compression of video content and generation of playback strategies. Using the user-inputted available time as a constraint, it leverages video structure analysis, semantic understanding, and duplicate segment detection technologies to identify key information and remove redundant segments from the program content. Furthermore, a dynamic programming algorithm generates an optimal playback plan, ultimately enabling speed-up playback, segment skipping, and seamless transitions. This solution resolves the conflict between video duration and the user's continuous viewing time, achieving an intelligent "watch video by time" experience and effectively improving the user experience.
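The application does not spell out how the dynamic programming step is formulated. One plausible reading, shown here purely as an assumption, is a 0/1 knapsack: each segment has a duration and an importance score, and the plan keeps the subset with the highest total score whose total duration fits the user's time budget. The function name `best_plan` and the use of whole seconds are hypothetical.

```python
from typing import List


def best_plan(durations: List[int], scores: List[float], budget: int) -> List[int]:
    """Return indices of segments maximizing total score within budget seconds.

    Assumed 0/1 knapsack formulation; durations are whole seconds.
    """
    n = len(durations)
    # best[i][t] = highest total score using only the first i segments in t seconds
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d, s = durations[i - 1], scores[i - 1]
        for t in range(budget + 1):
            best[i][t] = best[i - 1][t]  # option 1: skip segment i-1
            if d <= t and best[i - 1][t - d] + s > best[i][t]:
                best[i][t] = best[i - 1][t - d] + s  # option 2: keep it
    # Walk the table backwards to recover which segments were kept.
    chosen: List[int] = []
    t = budget
    for i in range(n, 0, -1):
        if best[i][t] != best[i - 1][t]:
            chosen.append(i - 1)
            t -= durations[i - 1]
    chosen.reverse()  # restore original time order
    return chosen
```

For example, with durations of 300, 200, and 400 seconds, scores of 0.9, 0.4, and 1.0, and a 700-second budget, the sketch keeps the first and third segments. Plot continuity constraints, which the application also mentions, are not modeled here.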

[0063] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties. Furthermore, the collection, storage, use, processing, transmission, provision, disclosure, and application of the relevant data all comply with the relevant laws, regulations, and standards of the relevant countries and regions, have taken necessary confidentiality measures, do not violate public order and good morals, and provide corresponding operation access points for users to choose to authorize or refuse.

[0064] The technical solution of this application and how the technical solution of this application solves the above-mentioned technical problems are described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of this application will now be described with reference to the accompanying drawings.

[0065] Figure 1 is a scenario diagram provided in an embodiment of this application. As shown in Figure 1, after a user selects the video they want to watch, the display device can display multiple time controls on the display interface, each offering a different viewing-time option.

[0066] After the display device obtains the user's desired viewing time, it can process the video to be played based on that time, compressing its playback duration to fit within the desired viewing time. This provides the user with relatively complete, coherent content focused on the key points, so that the user can follow the entire video within a limited time without missing key information.

[0067] For example, users can input their desired viewing time into the display device via voice, remote control, or touch time control. Upon receiving the user's desired viewing time, the display device can adjust the playback speed or other settings of the video to be played, resulting in a playback duration that is less than or equal to the user's desired viewing time.

[0068] After the display device has adjusted the video to be played, it can play the adjusted video to the user through the display screen.

[0069] Figure 2 is a schematic diagram of the structure of a display device 20 provided in an embodiment of this application. As shown in Figure 2, the display device 20 includes a display screen 21 and a controller 22.

[0070] The display screen 21 is configured to play a video stream.

[0071] The controller 22 is configured to adjust the video to be played based on the acquired user's expected viewing duration, so that the playback duration of the video to be played is less than or equal to the user's expected viewing duration.

[0072] The controller 22 is also configured to convert the video stream to be played into a video signal and send the video signal to the display screen 21 so that the display screen 21 plays the video stream.

[0073] The display device provided in this application can have various implementation forms, such as a smart TV, laser projection device, monitor, electronic bulletin board, electronic table, etc. This application does not limit the type of display device.

[0074] The video playback method provided in the embodiments of this application is described below with reference to Figure 3, with the controller as the execution subject.

[0075] Figure 3 is a flowchart of a video playback method provided in an embodiment of this application. As shown in Figure 3, the method includes:

[0076] S301. Obtain the video stream to be played and the user's expected viewing duration.

[0077] In some embodiments, the controller may, in response to a user instruction, determine the identifier of the video to be played and obtain the video stream to be played from the local storage of the display device, a cloud server, or a streaming media platform based on the video identifier.

[0078] For example, the controller can display a video list interface on a screen, showing multiple playable video identifiers (e.g., video name and frame). Users can select the video identifier they wish to watch from the video list interface. After receiving the selected video identifier, the controller can retrieve the corresponding video stream based on that identifier.

[0079] In some embodiments, after determining the identifier of the video to be played, the controller may display the time control interface shown in Figure 1 to obtain the user's expected viewing time.

[0080] In some embodiments, the controller may also provide the user with playback mode options, with different playback modes having different processing strategies for the video to be played.

[0081] For example, as shown in Figure 4, playback modes include Quick View, Highlights, and Full View. Quick View mode focuses on the main storyline and allows for more significant cuts. Highlights mode performs moderate compression while preserving key details. Full View mode removes only advertisements and obviously repetitive segments.

[0082] In some embodiments, users can select either the viewing duration or the playback mode individually, or they can select both the viewing duration and the playback mode simultaneously.

[0083] When a user selects a viewing duration or playback mode individually, the video stream can be processed according to either option. When a user selects both viewing duration and playback mode simultaneously, and provided that the viewing duration and playback mode do not conflict, the video stream can be processed based on the requirements of both options.

[0084] In this context, a conflict between viewing duration and playback mode means that the requirements of both cannot be met at the same time. For example, if a viewing duration of 15 minutes is selected together with Full View mode, a conflict exists. In such cases, the controller can display a prompt message to the user, allowing them to make a different selection.

[0085] In some embodiments, the controller may also automatically recommend viewing duration and / or viewing mode based on time periods, such as automatically recommending "15 minutes + highlights mode" on weekday evenings.

[0086] S302. Perform content analysis on the video stream and divide the video stream into multiple video segments.

[0087] In some embodiments, after acquiring the video stream, the controller can divide the video stream into multiple video segments based on shot boundary detection; each video segment corresponds to one shot.

[0088] In this context, a shot can refer to the smallest semantic unit of video. A shot is a continuous sequence of images recorded by a camera from the start of filming to the stop. Within the same shot, the user sees a continuous flow of action or scene in time and space. The shot boundary is the point where one shot ends and another begins.

[0089] For example, the video stream is decompressed into multiple image frames; the feature difference value between each image frame and its adjacent image frames is obtained, and the image frames with a difference value greater than a preset threshold are used as separation points; the video stream is divided into multiple video segments based on the separation points.

[0090] For example, video decompression algorithms or tools can be used to decompress the video stream, resulting in multiple image frames. For each image frame, its feature values (such as color histogram, edge intensity, motion vector, etc.) can be calculated, and the feature differences between adjacent image frames can be compared.

[0091] Because the content difference between two adjacent frames can be very large when switching instantly from one shot to another, meaning the feature difference values between adjacent image frames can be very large, after determining the feature difference values between adjacent image frames, image frames with feature difference values greater than a preset threshold can be used as separation points (i.e., shot switching points), and then the video stream can be divided into multiple video segments based on these separation points.

[0092] For example, if a video stream has 1000 frames after decompression, and the feature difference between frames 100 and 101 exceeds a preset threshold, the feature difference between frames 400 and 401 exceeds a preset threshold, and the feature difference between frames 750 and 751 exceeds a preset threshold, then frames 100, 400, and 750 can be used as dividing points to divide the video stream into four video segments. That is, frames 1-100 form the first video segment, frames 101-400 form the second video segment, frames 401-750 form the third video segment, and frames 751-1000 form the fourth video segment.
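The single-threshold split in the example above can be sketched in a few lines. This is a minimal sketch, assuming each frame has already been reduced to a feature vector such as a color histogram; the function names and the sum-of-absolute-differences measure are illustrative choices, not taken from the application. Frame numbers are 1-based to match the example.

```python
from typing import List, Sequence, Tuple


def feature_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two per-frame feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_separation_points(features: List[Sequence[float]], threshold: float) -> List[int]:
    """Return 1-based numbers of frames whose difference to the next frame exceeds threshold."""
    points: List[int] = []
    for i in range(len(features) - 1):
        if feature_difference(features[i], features[i + 1]) > threshold:
            points.append(i + 1)  # frame i+1 (1-based) is the last frame of its shot
    return points


def split_into_segments(frame_count: int, points: List[int]) -> List[Tuple[int, int]]:
    """Turn separation points into inclusive (first_frame, last_frame) ranges."""
    segments: List[Tuple[int, int]] = []
    start = 1
    for p in points:
        segments.append((start, p))
        start = p + 1
    segments.append((start, frame_count))
    return segments
```

With 1000 frames whose features change only after frames 100, 400, and 750, `find_separation_points` returns `[100, 400, 750]` and `split_into_segments` yields the four ranges 1-100, 101-400, 401-750, and 751-1000.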

[0093] In some embodiments, the camera transition is not instantaneous, but rather occurs through a brief transition effect lasting several frames or even dozens of frames. The feature differences between these frames are small, but the cumulative feature differences are large. Therefore, a dual-threshold method can be used to detect the separation points.

[0094] For example, when the difference between adjacent image frames is greater than the threshold T1, the image frame is marked as the beginning of a possible transition. If the difference between N consecutive frames is greater than T1, it is determined that the image is in the transition phase. When the difference between adjacent image frames is greater than the threshold T2, the image frame is determined as a separation point. Here, T1 is less than T2.

[0095] It should be understood that the specific sizes of T1 and T2 can be set based on practical experience.
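A minimal sketch of the dual-threshold check follows. It assumes the adjacent-frame differences have already been computed, with `diffs[i - 1]` holding the difference between frame i and frame i + 1 (1-based). The function name, the return shape, and the choice to report transition phases as frame ranges are assumptions made for illustration; the application does not state where a separation point falls inside a gradual transition, so the sketch reports the transition range and leaves that decision open.

```python
from typing import List, Sequence, Tuple


def detect_dual_threshold(
    diffs: Sequence[float], t1: float, t2: float, n: int
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Return (separation_points, transition_ranges) using thresholds t1 < t2.

    separation_points: frames whose difference to the next frame exceeds t2.
    transition_ranges: inclusive ranges of at least n consecutive differences above t1.
    """
    if not t1 < t2:
        raise ValueError("t1 must be smaller than t2")
    separation_points: List[int] = []
    transitions: List[Tuple[int, int]] = []
    run_start = None  # first frame of the current run of differences above t1
    for i, d in enumerate(diffs, start=1):
        if d > t2:
            separation_points.append(i)
        if d > t1:
            if run_start is None:
                run_start = i  # possible start of a transition
        else:
            if run_start is not None and i - run_start >= n:
                transitions.append((run_start, i - 1))
            run_start = None
    # Close a run that reaches the end of the stream.
    if run_start is not None and len(diffs) + 1 - run_start >= n:
        transitions.append((run_start, len(diffs)))
    return separation_points, transitions
```

For instance, with `t1 = 0.2`, `t2 = 0.8`, and `n = 3`, the differences `[0.1, 0.1, 0.9, 0.1, 0.3, 0.3, 0.3, 0.1]` give one hard cut after frame 3 and one transition phase covering frames 5 to 7.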

[0096] S303. Based on the desired viewing duration, determine multiple target video segments from multiple video segments, and determine the playback speed of each target video segment. The number of target video segments is less than the number of video segments.

[0097] In some embodiments, after obtaining each video segment, volume features, station logos, watermarks, and opening / closing templates can be used to match each video segment, identify the video segments corresponding to the opening and closing credits, and remove those video segments.

[0098] In some embodiments, video segments can be matched based on insertion flags and screen features to identify the video segment corresponding to the advertisement and remove that video segment.

[0099] For the remaining video clips, content analysis can be performed on each clip to identify those with higher importance and those that ensure narrative coherence. For example, artificial intelligence or deep learning models can be used to perform visual and semantic analysis on each clip. Based on the results of these analyses, each clip is scored for importance and coherence, and important clips with scores exceeding a threshold, as well as clips with high coherence to important clips, are selected.

[0100] For example, the importance score of a video clip = (semantic importance weight × plot turning point judged by the large language model) + (emotion weight × audio energy peak) + (visual weight × face close-up / motion intensity).
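The weighted sum above can be written directly as code. The sketch below assumes each signal has already been normalized to the range 0 to 1 by upstream models; the class name `ClipSignals`, the field names, and the default weights are hypothetical placeholders, since the application gives no concrete values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipSignals:
    """Per-clip signals, each assumed to be normalized to [0, 1]."""
    plot_turning_point: float  # e.g. judged by a large language model
    audio_energy_peak: float   # proxy for emotional intensity
    visual_intensity: float    # face close-up / motion intensity


def importance_score(
    s: ClipSignals,
    w_semantic: float = 0.5,  # illustrative weight, not from the application
    w_emotion: float = 0.3,   # illustrative weight, not from the application
    w_visual: float = 0.2,    # illustrative weight, not from the application
) -> float:
    """Weighted sum of the semantic, emotion, and visual signals."""
    return (
        w_semantic * s.plot_turning_point
        + w_emotion * s.audio_energy_peak
        + w_visual * s.visual_intensity
    )
```

Keeping the weights as parameters makes it straightforward to tune them per playback mode or per user, although the application does not say whether the weights vary.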

[0101] After obtaining the aforementioned video clips, it can be determined whether the total playback time of each video clip is less than the user's expected playback time. If it is less, then each of the aforementioned video clips is selected as the target video clip. If it is greater, then the playback speed of each video clip can be adjusted. For example, a speed of 1.5-2 times can be used for slower-paced dialogues and everyday scenes, while the original playback speed can be maintained for high-action scenes.

[0102] After adjusting the playback speed of each video segment, it can be determined whether the total playback time of the adjusted video segments is less than the user's expected playback time. If it is less, the video segments with adjusted playback speeds will be used as target video segments. If it is still greater than the user's expected playback time, video segments with lower importance scores and related video segments can be deleted to make the total playback time of the remaining video segments less than or equal to the user's expected playback time, and the remaining video segments will be used as target video segments.
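The three-stage decision in [0101] and [0102] (keep everything if it fits, otherwise speed up slow-paced segments, otherwise drop the lowest-scored segments) can be sketched as below. The `Segment` class, the `plan_playback` function, and the `slow_speed` default of 1.5 (the low end of the 1.5-2x range mentioned above) are assumptions for illustration. Removing segments that are merely related to a dropped segment is not modeled.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Segment:
    start: float      # start time in the original stream, in seconds
    duration: float   # length at 1x speed, in seconds
    score: float      # importance score
    slow_paced: bool  # True for slow dialogue or everyday scenes
    speed: float = 1.0

    @property
    def play_time(self) -> float:
        return self.duration / self.speed


def total_time(segments: List[Segment]) -> float:
    return sum(s.play_time for s in segments)


def plan_playback(segments: List[Segment], budget: float, slow_speed: float = 1.5) -> List[Segment]:
    """Return the target segments in original time order, fitting within budget seconds."""
    # Stage 1: everything already fits at the original speed.
    if total_time(segments) <= budget:
        return sorted(segments, key=lambda s: s.start)
    # Stage 2: speed up slow-paced segments, keep high-action ones at 1x.
    sped = [replace(s, speed=slow_speed) if s.slow_paced else s for s in segments]
    # Stage 3: drop the lowest-scored segments until the plan fits.
    kept = sorted(sped, key=lambda s: s.score, reverse=True)
    while kept and total_time(kept) > budget:
        kept.pop()
    return sorted(kept, key=lambda s: s.start)
```

Returning the segments sorted by `start` matches the later splicing step, which joins target segments in their original time order.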

[0103] S304. Merge multiple target video segments into a target video stream and control the display screen to play the target video stream.

[0104] In some embodiments, after obtaining each target video segment, the target video segments can be spliced together according to their time sequence in the original video stream to obtain the target video stream.

[0105] After obtaining the target video stream, the controller can convert the target video stream into a video signal and send the video signal to the display screen so that the display screen can play the target video stream.

[0106] The video playback method provided in this application involves: acquiring the video stream to be played and the user's expected viewing time; performing content analysis on the video stream to divide it into multiple video segments; determining multiple target video segments from the multiple video segments based on the expected viewing time; determining the playback speed of each target video segment; ensuring the number of target video segments is less than the number of video segments; merging the multiple target video segments into a target video stream; and controlling the display screen to play the target video stream. This method retains the important segments while compressing an originally long video to fit within the user's expected viewing time. At the same time, it avoids logical gaps caused by cuts by constraining the continuity of the storyline, ultimately fulfilling the user's core need to "watch TV by time." This solves the core technical problem that traditional static playback cannot adapt to the user's time budget, effectively improving the user experience.

[0107] Based on the embodiment shown in Figure 3, the video playback method provided in the embodiments of this application is further described below with reference to Figure 5.

[0108] Figure 5 is a second flowchart of a video playback method provided in an embodiment of this application. As shown in Figure 5, the method includes:

[0109] S501. Obtain the video stream to be played and the user's expected viewing duration.

[0110] S502. Perform content analysis on the video stream and divide the video stream into multiple video segments.

[0111] The specific implementations of S501-S502 in this embodiment are similar to those of the corresponding steps in the embodiment shown in Figure 3, and are not repeated here.

[0112] S503. Based on the content of the video clip and the user's historical viewing preferences, score the video clip to obtain the importance score of the video clip.

[0113] In some embodiments, the video clip content in this step may include the plot or information density of the clip, whether the clip is the first appearance of an important character and / or important information, etc.

[0114] For example, a pre-trained model can be used to process the content of a video clip and the user's historical viewing preferences to obtain the plot or information density score, the score for the first appearance of an important character and / or important information, and the user's historical viewing preference score. Then, the obtained scores are weighted and fused to obtain the importance score of the video clip.

[0115] For example, attention mechanisms and optical flow algorithms are used to calculate visual features of video clips, obtaining visual features such as motion intensity. Dialogue is extracted from the video clips and converted into text. Text classifiers and other models are used to analyze the semantics of the text, obtaining semantic features. Then, a pre-trained model is used to process the visual and semantic features to obtain a plot or information density score for the video clip.

[0116] A face detection model is used to perform face recognition on video clips, and an object detection model is used to detect objects in the video clips. The detected faces and objects are input into a feature extraction model to obtain corresponding feature vectors. Then, the obtained face feature vectors and object feature vectors are compared with pre-maintained person vector libraries and object vector libraries, respectively, and a matching score is output. The face matching score and object matching score are then fused to obtain a score for the first appearance of an important person and / or important information.
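The comparison against the pre-maintained vector libraries can be illustrated with a simple similarity search. This sketch assumes cosine similarity as the matching measure and a weighted average as the fusion rule; the application names neither, so both are stand-ins, and the function names and the `face_weight` default are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(
    vector: Sequence[float], library: Dict[str, Sequence[float]]
) -> Tuple[Optional[str], float]:
    """Return the library entry most similar to vector and its matching score."""
    best_name: Optional[str] = None
    best_score = 0.0
    for name, reference in library.items():
        score = cosine_similarity(vector, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


def fused_match_score(face_score: float, object_score: float, face_weight: float = 0.6) -> float:
    """Weighted average of the face and object matching scores (assumed fusion rule)."""
    return face_weight * face_score + (1.0 - face_weight) * object_score
```

A production system would more likely use an approximate nearest-neighbor index than a linear scan, but the scoring idea is the same.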

[0117] The visual and semantic features corresponding to the video, along with the user's interest profile, are input into a user preference prediction model (such as the EMER framework) to obtain the user's historical viewing preference score.

[0118] After obtaining the scores from the above dimensions, the scores from each dimension can be weighted and merged based on the preset weights of each dimension to output the importance score of the video segment.
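The weighted fusion described above can be illustrated with a short sketch. The dimension names and weight values below are hypothetical; the source only states that each dimension has a preset weight. Scores are assumed to lie in [0, 1].

```python
from typing import Dict

# Hypothetical preset weights; the source does not give concrete values.
PRESET_WEIGHTS: Dict[str, float] = {
    "density": 0.5,           # plot or information density score
    "first_appearance": 0.3,  # first appearance of an important person/information
    "preference": 0.2,        # user's historical viewing preference score
}


def importance_score(scores: Dict[str, float],
                     weights: Dict[str, float] = PRESET_WEIGHTS) -> float:
    """Weighted average of the per-dimension scores."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in weights) / total_weight
```

Dividing by the sum of the weights keeps the result on the same scale as the inputs even if the preset weights do not sum to 1.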

[0119] S504. Extract the dialogue from each video segment, perform semantic analysis on the dialogue, and determine the key information corresponding to each video segment.

[0120] In some embodiments, key information may refer to information in a video clip related to important events such as changes in character relationships, key plot revelations, or the outbreak of conflict.

[0121] For example, speech-to-text technology can be used to convert the speech of various video clips into text, and speaker differentiation can be used during the speech-to-text process to obtain the dialogue of different people in the video clips.

[0122] After obtaining the dialogues of different characters in a video clip, natural language processing can be used to process the dialogues and obtain corresponding semantic recognition results. The semantic recognition results can then be parsed based on a pre-defined model or template to extract key information.

[0123] For example, based on the semantic recognition results, preset relationship trigger words can be used to analyze the results and determine changes in relationships between characters within video clips. Preset keyword templates can be used to match the semantic recognition results to identify key plot revelations, conflict outbreaks, and other important events.
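A minimal sketch of trigger-word and keyword-template matching is given below. The event categories follow the text, but the trigger phrases and the function name `extract_key_info` are hypothetical, and plain substring matching stands in for whatever semantic recognition the actual system uses.

```python
from typing import Dict, List

# Hypothetical trigger phrases; a real system would maintain its own lists.
EVENT_TRIGGERS: Dict[str, List[str]] = {
    "relationship_change": ["broke up", "married", "betrayed"],
    "plot_revelation": ["the truth is", "it was you", "secret"],
    "conflict_outbreak": ["how dare you", "fight", "get out"],
}


def extract_key_info(dialogue: List[str]) -> Dict[str, List[str]]:
    """Return, per event type, the dialogue lines containing a trigger phrase."""
    hits: Dict[str, List[str]] = {}
    for line in dialogue:
        lowered = line.lower()
        for event, triggers in EVENT_TRIGGERS.items():
            if any(trigger in lowered for trigger in triggers):
                hits.setdefault(event, []).append(line)
    return hits
```

A segment whose dialogue produces no hits would simply carry no key information in this sketch.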

[0124] S505. Based on key information, determine the must-watch video clips from multiple video clips.

[0125] In some embodiments, a must-watch video clip may refer to a clip, among the video clips, that is of high importance and highly relevant to the progression of the plot.

[0126] For example, character and emotion recognition is performed on each video clip to obtain recognition results; based on the recognition results and key information, the must-watch video clips are determined from multiple video clips.

[0127] For example, based on the dialogues of different characters obtained above and the pre-identified character list, it is possible to count the characters appearing in each video segment and their appearance duration, dialogue duration, and other character information.

[0128] A facial emotion recognition model is used to identify the facial expressions of different characters in the video clips, obtaining their facial emotions. Based on the dialogue obtained earlier, a text sentiment analysis model is used to obtain the textual emotions of each character. Finally, the facial emotions and textual emotions of each character are fused using an emotion fusion algorithm to obtain the overall emotion of each character.

[0129] After obtaining character information, character emotions, and key information from video clips, this information can be input into a large language model, allowing the model to identify video clips that have a significant or decisive impact on the main plot, that is, the must-watch video clips.

[0130] S506. Determine the target video segments based on the must-watch video segments and the expected viewing time.

[0131] After the must-watch video segments are identified, the controller can determine whether their total duration (i.e., the total duration at the default playback speed) is less than the expected viewing time.

[0132] In some embodiments, when the total duration of the must-watch video segments exceeds the expected viewing time, the default playback speed of each must-watch video segment is adjusted according to its type. If, after adjusting the default playback speed, the total duration of the must-watch video segments still exceeds the expected viewing time, a target video segment is determined from the must-watch video segments based on the expected viewing time and the importance score corresponding to each must-watch video segment, so that the total duration of the target video segment is less than or equal to the expected viewing time.

[0133] For example, if the total length of the must-watch video clips is 20 minutes and the expected viewing time is 15 minutes, the default playback speed of must-watch video clips with a slower dialogue pace and of everyday-scene video clips is adjusted to 1.5-2 times speed, while high-action scenes keep the default playback speed or are sped up only slightly (for example, to 1.2 times speed).

[0134] After the default playback speed of each must-watch video segment is adjusted, the total playback duration of the must-watch video segments is recalculated. If it still exceeds the expected viewing time, one or more video segments with the lowest importance scores can be deleted. For example, if the recalculated total playback duration is 17 minutes, one or two of the must-watch video segments with the lowest importance scores can be deleted so that the total playback duration is less than or equal to 15 minutes, and the must-watch video segments that are kept are used as the target video segments.
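The over-budget branch can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the `Segment` class, the segment type labels, and the concrete speed values (picked from the 1.5-2x range in the text) are hypothetical, and the function assumes the caller has already found that the must-watch total exceeds the expected viewing time.

```python
from dataclasses import dataclass
from typing import List

# Assumed speed multiplier per segment type, chosen from the ranges in the text.
SPEED_BY_TYPE = {"slow_dialogue": 1.5, "daily_scene": 2.0, "high_action": 1.0}


@dataclass
class Segment:
    name: str
    duration_min: float  # duration at 1x speed, in minutes
    kind: str            # hypothetical type label
    importance: float    # importance score from the scoring step
    speed: float = 1.0

    @property
    def play_time(self) -> float:
        """Playback time in minutes at the current speed."""
        return self.duration_min / self.speed


def fit_must_watch(segments: List[Segment], budget_min: float) -> List[Segment]:
    """Adjust speed by type, then drop lowest-importance segments until the budget is met."""
    for seg in segments:
        seg.speed = SPEED_BY_TYPE.get(seg.kind, 1.0)
    kept = list(segments)
    while kept and sum(s.play_time for s in kept) > budget_min:
        kept.remove(min(kept, key=lambda s: s.importance))
    return kept  # timeline order of the input is preserved
```

With three segments of 6, 4, and 10 minutes (20 minutes in total) and a 15-minute budget, speed adjustment alone gives 16 minutes in this sketch, so the lowest-scoring segment is dropped.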

[0135] To ensure that deleting must-watch video clips does not impair the user's understanding of the plot, the controller can generate a plot summary based on the remaining must-watch video clips, that is, the must-watch video clips that were not selected as target video clips, and insert the plot summary into the target video clips.

[0136] For example, a text summary is generated from the text corresponding to the remaining (i.e., deleted) must-watch video clips and inserted at the corresponding position among the target video clips.

[0137] In some embodiments, when the total duration of the must-watch video segments is less than the expected viewing time, the default playback speed of the must-watch video segments is adjusted according to their type; important video segments are determined from the remaining video segments based on the remaining playback time and the importance scores of the remaining video segments; the remaining video segments are video segments that are not must-watch video segments; and the must-watch video segments and important video segments are used as target video segments.

[0138] For example, if the total length of the must-watch video clips is 13 minutes and the expected viewing time is 15 minutes, the default playback speed of must-watch video clips with a slower dialogue pace and of everyday-scene video clips can be adjusted to 1.5-2 times speed, while high-action scenes keep the default playback speed or are sped up only slightly (for example, to 1.2 times speed).

[0139] After adjusting the playback speed, the total duration of the must-watch video segments is 10 minutes. At this point, there are 5 minutes of remaining playback time compared to the user's expected viewing time of 15 minutes. The controller can also select some more important video segments from the remaining video segments (non-must-watch video segments).

[0140] For example, the remaining video segments with importance scores greater than a preset threshold are used as initial important video segments; the default playback speed of the initial important video segments is adjusted according to their type; based on the remaining playback time, the top N initial important video segments with the highest importance scores are used as important video segments, so that the total playback time of the top N initial important video segments is less than or equal to the remaining playback time.

[0141] For example, if there are 5 initial important video clips with an importance score greater than the preset threshold, and their total playback time is 8 minutes, then the default playback speed of video clips with slower dialogue pace and daily scene video clips can be adjusted to 1.5-2 times speed. For high-action scenes, the default playback speed can be maintained or the playback speed can be slightly increased (such as 1.2 times speed).

[0142] If the total playback time of the initial important video segments after playback speed adjustment is less than or equal to 5 minutes, then the initial important video segments are considered important video segments. If the total playback time of the initial important video segments after playback speed adjustment is greater than 5 minutes, then one or two video segments with the lowest importance score among the five initial important video segments are deleted, so that the total playback time of the remaining initial important video segments is less than or equal to 5 minutes, and the remaining initial important video segments are considered important video segments.
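The selection of important video segments for the remaining playback time can be sketched as below. The tuple layout and the function name `pick_important` are hypothetical, and play times are assumed to be already speed-adjusted. The sketch takes a strict top-N reading of the text: it stops at the first ranked segment that does not fit rather than searching for smaller segments further down the ranking.

```python
from typing import List, Tuple

# (name, play time in minutes after speed adjustment, importance score)
Candidate = Tuple[str, float, float]


def pick_important(candidates: List[Candidate], remaining_min: float,
                   threshold: float) -> List[str]:
    """Keep the largest top-N prefix (ranked by importance) that fits the remaining time."""
    initial = [c for c in candidates if c[2] > threshold]
    ranked = sorted(initial, key=lambda c: c[2], reverse=True)
    chosen: List[str] = []
    used = 0.0
    for name, play_time, _score in ranked:
        if used + play_time > remaining_min:
            break  # strict top-N: stop at the first segment that does not fit
        chosen.append(name)
        used += play_time
    return chosen
```

The result is in importance order; before merging, the chosen segments would need to be put back into timeline order together with the must-watch segments.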

[0143] Optionally, the controller can also generate corresponding text or voice plot summaries based on the deleted initial important video segments, and insert the text or voice plot summaries into the corresponding positions.

[0144] In some embodiments, when the total duration of the must-watch video segments is less than the expected viewing time, one or more important video segments can be directly selected from the remaining video segments based on the remaining playback time.

[0145] After the must-watch and important video clips are obtained, they can be used as the target video clips.

[0146] S507. Merge multiple target video segments into a target video stream and control the display screen to play the target video stream.

[0147] The specific implementation of S507 in this embodiment is similar to that of the corresponding step in the embodiment shown in Figure 3, and will not be repeated here.

[0148] S508. During the process of controlling the display screen to play the target video stream, obtain the user's playback feedback information and dynamically adjust the target video stream based on the playback feedback information.

[0149] In some embodiments, during the playback of the target video stream, the controller can also acquire real-time playback feedback information from the user and dynamically adjust the subsequent playback plan based on this feedback. For example, the playback feedback information could indicate that the playback is too fast, prompting adjustments such as reducing the speed or extending the viewing time.

[0150] For example, during playback, if a user says "too fast" or "I want to watch this part completely" via voice, the controller can slow down the currently playing segment to normal speed and appropriately increase the playback speed of subsequent minor segments, or delete some less important segments, so that the total still approximates the user's expected viewing time. For example, if a user says "give me 10 more minutes" via voice, the controller can use the newly input time and the remaining time to re-plan subsequent segments, making the content more complete. Furthermore, when a user wants to quickly watch multiple episodes within a limited time, the controller can first generate an independent compression scheme for each episode and then integrate the schemes across episodes to ensure the continuity of key plot points; it can also automatically identify already watched portions and compress them heavily, presenting them only as a "previous episode recap" or summary.
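One way to picture the re-planning after "too fast" feedback is the sketch below. It is a simplified assumption, not the method of this application: the current segment is played at 1x, a single uniform speed is applied to all later segments, and `MAX_SPEED` and the function name are hypothetical. When even the maximum speed is insufficient, the function reports that some segments would have to be deleted.

```python
from typing import List, Tuple

MAX_SPEED = 2.0  # assumed upper bound, matching the 1.5-2x range used earlier


def replan_after_too_fast(current_left_min: float, later_min: List[float],
                          budget_left_min: float) -> Tuple[float, bool]:
    """Return (uniform speed for later segments, whether the budget can still be met)."""
    time_for_later = budget_left_min - current_left_min  # current segment runs at 1x
    total_later = sum(later_min)
    if total_later == 0:
        return 1.0, True
    if time_for_later <= 0:
        return MAX_SPEED, False
    needed = total_later / time_for_later
    speed = min(max(needed, 1.0), MAX_SPEED)
    return speed, needed <= MAX_SPEED
```

A request such as "give me 10 more minutes" would correspond to calling the same function with `budget_left_min` increased by 10, which lowers the required speed.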

[0151] In some embodiments, the process of generating the target video stream can be performed locally by the controller or via a cloud server.

[0152] The video playback method provided in this application uses user-input available time as a constraint. It leverages video structure analysis, semantic understanding, and duplicate segment detection technologies to identify key information and remove redundant segments from program content. A dynamic programming algorithm generates an optimal playback plan, and the playback control module ultimately enables speed-up playback, segment skipping, and seamless transitions. This truly achieves "watching TV by time." Users only need to tell the TV "available viewing time," and the system automatically generates the most suitable content combination and playback strategy, maximizing time utilization efficiency. It significantly improves information acquisition efficiency by automatically skipping advertisements, replays, and low-information segments, greatly compressing the duration while maintaining plot continuity. It significantly reduces the burden of manual operation; users no longer need to frequently fast-forward / rewind to find key points, achieving a comfortable viewing rhythm with a single setting.

[0153] This application also provides an electronic device.

[0154] Figure 6 is a schematic structural diagram of the electronic device 60 provided in the embodiments of this application. As shown in Figure 6, the electronic device may include: a transceiver 601, a processor 602, and a memory 603. The electronic device may be the controller described in any of the above embodiments.

[0155] The processor 602 executes computer execution instructions stored in the memory, causing the processor 602 to perform the scheme in the above embodiments. The processor 602 can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.

[0156] The memory 603 is connected to the processor 602 via the system bus and completes communication between them. The memory 603 is used to store computer program instructions.

[0157] Transceiver 601 can perform the functions of receiving and sending data and instructions.

[0158] Optionally, the electronic device 60 may also include a communication interface to communicate and interact with external or internal devices, such as client devices (e.g., mobile phones, tablets). In specific implementations, if the communication interface, memory 603, and processor 602 are implemented independently, they can be interconnected via a bus to complete communication with each other.

[0159] The system bus can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The system bus can be divided into address bus, data bus, control bus, etc. For ease of representation, only one thick line is used in the diagram, but this does not indicate that there is only one bus or one type of bus. Transceivers are used to enable communication between database access devices and other computers (e.g., clients, read-write libraries, and read-only libraries). Memory may include random access memory (RAM) and may also include non-volatile memory.

[0160] Optionally, in a specific implementation, if the communication interface, memory 603, and processor 602 are integrated on a single chip, then the communication interface, memory 603, and processor 602 can communicate through an internal interface.

[0161] This application also provides a chip for executing instructions, which is used to execute the video playback method described in the above embodiments.

[0162] This application also provides a computer-readable storage medium storing a computer program thereon. When the computer program is executed by a processor, it implements the technical solution of the above-described video playback method embodiments. Its implementation principle and technical effect are similar, and will not be repeated here.

[0163] In one possible implementation, a computer-readable medium may include random access memory (RAM), read-only memory (ROM), compact disc read-only memory (CD-ROM) or other optical disc storage, disk storage or other magnetic storage devices, or any other medium intended to carry or store the required program code in the form of instructions or data structures and accessible by a computer. Furthermore, any connection is appropriately referred to as a computer-readable medium. For example, if software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. As used herein, disks and discs include compact discs, laser discs, optical discs, Digital Versatile Discs (DVDs), floppy disks, and Blu-ray discs, where disks typically reproduce data magnetically, while discs reproduce data optically using lasers. Combinations of the above should also be included within the scope of computer-readable media.

[0164] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the technical solution of the above-described video playback method embodiments. Its implementation principle and technical effects are similar and will not be repeated here.

[0165] In the specific implementation of the aforementioned terminal device or server, it should be understood that the processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), etc. A general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application can be directly manifested as execution by a hardware processor, or execution by a combination of hardware and software modules within the processor.

[0166] Those skilled in the art will understand that all or part of the steps in any of the above method embodiments can be implemented by hardware associated with program instructions. The aforementioned program can be stored in a computer-readable storage medium, and when the program is executed, all or part of the steps in the above method embodiments are performed.

[0167] If the technical solution of this application is implemented in software form and sold or used as a product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the technical solution of this application can be embodied in the form of a software product, which is stored in a storage medium and includes a computer program or several instructions. This computer software product enables a computer device (which may be a personal computer, server, network device, or similar electronic device) to execute all or part of the steps of the methods in the embodiments of this application.

[0168] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that this application is not limited to the described order of actions, as some steps may be performed in other orders or simultaneously according to this application. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily essential to this application.

[0169] It should be further noted that although the steps in the flowchart are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowchart may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these sub-steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.

[0170] It should be understood that the above-described device embodiments are merely illustrative, and the device of this application can also be implemented in other ways. For example, the division of units / modules in the above embodiments is only a logical functional division, and there may be other division methods in actual implementation. For example, multiple units, modules, or components may be combined, or integrated into another system, or some features may be ignored or not executed.

[0171] Furthermore, unless otherwise specified, the functional units / modules in the various embodiments of this application can be integrated into one unit / module, or each unit / module can exist physically separately, or two or more units / modules can be integrated together. The integrated units / modules described above can be implemented in hardware or as software program modules.

[0172] When integrated units / modules are implemented in hardware, the hardware can be digital circuits, analog circuits, etc. The physical implementation of the hardware structure includes, but is not limited to, transistors, memristors, etc. Unless otherwise specified, the processor can be any suitable hardware processor, such as a CPU, GPU, FPGA, DSP, and ASIC, etc. Unless otherwise specified, the storage unit can be any suitable magnetic or magneto-optical storage medium, such as Resistive Random Access Memory (RRAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Enhanced Dynamic Random Access Memory (EDRAM), High-Bandwidth Memory (HBM), Hybrid Memory Cube (HMC), etc.

[0173] If the integrated unit / module is implemented as a software program module and sold or used as an independent product, it can be stored in a computer-readable memory. Based on this understanding, the technical solution of this application, in essence, or the part that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a memory and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), portable hard drive, magnetic disk, or optical disk.

[0174] In the above embodiments, the descriptions of each embodiment have their own emphasis. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments. The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as the combination of these technical features does not contradict each other, it should be considered within the scope of this specification.

[0175] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features therein. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of this application.

Claims

1. A display device, characterized in that the display device comprises: a display screen configured to play a video stream; and a controller connected to the display screen and configured to: obtain a video stream to be played and a user's expected viewing duration; perform content analysis on the video stream and divide the video stream into a plurality of video segments; determine, based on the expected viewing duration, a plurality of target video segments from the plurality of video segments, and determine a playback speed of each of the target video segments, wherein the number of the plurality of target video segments is less than the number of the plurality of video segments; and merge the plurality of target video segments into a target video stream and control the display screen to play the target video stream.

2. The display device according to claim 1, characterized in that the controller is configured to: decompress the video stream into a plurality of image frames; obtain a feature difference value between each image frame and its adjacent image frames, and use the image frames whose feature difference values are greater than a preset threshold as separation points; and divide the video stream into the plurality of video segments based on the separation points.

3. The display device according to claim 2, characterized in that the controller is configured to: extract the dialogue from each of the video segments, perform semantic analysis on the dialogue, and determine the key information corresponding to each of the video segments; perform character and emotion recognition on each of the video segments to obtain recognition results; determine must-watch video segments from among the plurality of video segments based on the recognition results and the key information; and determine the plurality of target video segments from the plurality of video segments based on the expected viewing duration and the must-watch video segments.

4. The display device according to claim 3, characterized in that the controller is configured to: when the total duration of the must-watch video segments exceeds the expected viewing duration, adjust the default playback speed of each of the must-watch video segments according to the type of that must-watch video segment; and if, after the default playback speed is adjusted, the total duration of the must-watch video segments is still greater than the expected viewing duration, determine the target video segments from the must-watch video segments based on the expected viewing duration and the importance score corresponding to each of the must-watch video segments, so that the total duration of the target video segments is less than or equal to the expected viewing duration.

5. The display device according to claim 4, characterized in that the controller is configured to: generate a plot summary based on the remaining must-watch video segments, the remaining must-watch video segments being the must-watch video segments that are not target video segments; and insert the plot summary into the target video segments.

6. The display device according to claim 3, characterized in that the controller is configured to: when the total duration of the must-watch video segments is less than the expected viewing duration, adjust the default playback speed of the must-watch video segments according to their type; determine important video segments from the remaining video segments based on the remaining playback time and the importance scores of the remaining video segments, the remaining video segments being the video segments that are not must-watch video segments; and use the must-watch video segments and the important video segments as the target video segments.

7. The display device according to claim 6, characterized in that the controller is configured to: use the remaining video segments with an importance score greater than a preset threshold as initial important video segments; adjust the default playback speed of the initial important video segments according to their type; and, based on the remaining playback time, select the top N initial important video segments ranked by importance score as the important video segments, so that the total playback time of the top N initial important video segments is less than or equal to the remaining playback time.

8. The display device according to any one of claims 1-7, characterized in that the controller is configured to: score each video segment based on the content of the video segment and the user's historical viewing preferences to obtain an importance score for the video segment.

9. The display device according to any one of claims 1-7, characterized in that the controller is configured to: obtain the user's playback feedback information while controlling the display screen to play the target video stream; and dynamically adjust the target video stream based on the playback feedback information.

10. A video playback method, characterized in that the method comprises: obtaining a video stream to be played and a user's expected viewing duration; performing content analysis on the video stream and dividing the video stream into a plurality of video segments; determining, based on the expected viewing duration, a plurality of target video segments from the plurality of video segments, and determining a playback speed of each of the target video segments, wherein the number of the plurality of target video segments is less than the number of the plurality of video segments; and merging, based on the playback speed of each of the target video segments, the plurality of target video segments into a target video stream, and playing the target video stream.

11. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the method of claim 10.

12. A computer program product, characterized in that it includes a computer program which, when executed by the controller, implements the method of claim 10.