Method and apparatus for training a machine learning classifier

By training a machine learning classifier and updating the initial labels using a confidence learning algorithm, the lag problem in real-time QoE evaluation of streaming video is solved, enabling accurate and real-time user experience quality evaluation of streaming video and improving the accuracy and adaptability of the evaluation.

CN116866323B · Active · Publication Date: 2026-06-19 · HONOR DEVICE CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HONOR DEVICE CO LTD
Filing Date
2022-03-22
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies cannot effectively evaluate the quality of user experience (QoE) of streaming video in real time, especially under HTTP-based adaptive bitrate streaming protocols, where network performance cannot be directly mapped to QoE, resulting in delayed or inaccurate evaluation.

Method used

A training method for a machine learning classifier is adopted. A feature vector set and an initial label set are obtained, and in each round of training the initial labels are updated using a confidence learning algorithm. This improves the robustness and generalization of the machine learning classifier and ultimately enables real-time QoE evaluation of streaming video.

Benefits of technology

It achieves accurate, real-time QoE evaluation of streaming video, improves the robustness and generalization ability of machine learning classifiers, and can dynamically adjust and optimize evaluation results during streaming video playback.

✦ Generated by Eureka AI based on patent content.

Abstract

This application provides a training method and apparatus for a machine learning classifier, relating to the field of artificial intelligence technology. In this application, during each iteration of training, an initial label set corresponding to that iteration is used to train the machine learning classifier. Then, the initial labels in this initial label set that meet preset conditions are updated to generate an initial label set for the next iteration. This process continues until a preset number of iterations is reached, at which point iterative training stops, resulting in a machine learning classifier with good robustness and generalization ability. Thus, when a user watches streaming video online, the machine learning classifier can accurately evaluate the QoE of the streaming video in real time.

Description

Technical Field

[0001] This application relates to the field of artificial intelligence (AI) technology, and in particular to a training method and device for a machine learning classifier.

Background Technology

[0002] Streaming media technology allows video files to be continuously and uninterruptedly transmitted from a server to a terminal device, enabling users to watch videos in real time. To characterize user satisfaction with streaming video services, it is sometimes necessary to evaluate the quality of the streaming video service in real time.

[0003] The Quality of Experience (QoE) defined by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) is typically used as a standard for evaluating user satisfaction with streaming video services. However, streaming video commonly uses HTTP Live Streaming (HLS), an adaptive bitrate streaming protocol based on Hypertext Transfer Protocol (HTTP), to provide video-on-demand and live streaming services. Because HLS employs a progressive and adaptive download strategy, it is impossible to simply map network performance to the QoE of streaming video; that is, real-time QoE evaluation of streaming video cannot be achieved using simple network performance metrics. Therefore, how to perform real-time QoE evaluation of streaming video has become a pressing issue.

Summary of the Invention

[0004] This application provides a training method and device for a machine learning classifier, solving the technical problem of how to perform real-time QoE evaluation on a streaming video that is currently playing.

[0005] To achieve the above objectives, this application adopts the following technical solution:

[0006] In a first aspect, embodiments of this application provide a method for training a machine learning classifier. The method includes:

[0007] Obtain a set of feature vectors, which includes multiple feature vectors obtained from the log files of the streamed video that has finished playing. Each of the multiple feature vectors is used to evaluate the QoE of the streamed video that has finished playing at a certain moment.

[0008] In each iteration of training, a machine learning classifier is trained based on the feature vector set and the initial label set used for this iteration, to obtain the mapping relationship between the feature vector and the QoE evaluation result at each time step. The initial label set includes multiple initial labels, one of which indicates whether the streaming video that has ended playback is stuttering or smooth at a certain time step. The optimization objective of the machine learning classifier is to minimize the loss corresponding to the degree of difference between the initial label and the QoE evaluation result at each time step.

[0009] The initial labels in the initial label set that meet the preset conditions are updated to obtain the updated initial label set, which is used for the next round of iterative training.

[0010] After a preset number of iterations, the final machine learning classifier is obtained, which is used to evaluate the QoE of streaming video in real time.
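The four steps above can be sketched as a short training loop. This is a minimal illustration under stated assumptions, not the patented implementation: the source does not name a classifier type, so a tiny logistic regression stands in for it; labels are assumed to be encoded as 1 for stuttering and 0 for smooth; and the names `train_logistic`, `predict_stutter_prob`, `iterative_training`, and `select_to_update` are hypothetical.

```python
import math


def predict_stutter_prob(weights, bias, x):
    """Predicted probability that the moment with feature vector x is stuttering."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, epochs=200, lr=0.5):
    """Stand-in classifier (assumption): logistic regression fitted by batch gradient descent."""
    n, dim = len(features), len(features[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = predict_stutter_prob(weights, bias, x) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def iterative_training(features, initial_labels, rounds, select_to_update):
    """Train for a preset number of rounds, updating selected labels after each round.

    select_to_update(labels, probs) is a hypothetical callback that returns the
    time indices whose labels meet the preset conditions.
    """
    labels = list(initial_labels)  # leave the caller's label set untouched
    model = None
    for _ in range(rounds):
        model = train_logistic(features, labels)
        probs = [predict_stutter_prob(model[0], model[1], x) for x in features]
        for t in select_to_update(labels, probs):
            labels[t] = 1 - labels[t]  # flip stutter <-> smooth
    return model, labels
```

For instance, `iterative_training(features, labels, 5, selector)` runs five rounds and returns the last trained model together with the label set that would feed a further round.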

[0011] In the above scheme, for each iteration of training the machine learning classifier, after obtaining the mapping relationship between the feature vector and the QoE evaluation result at each time step, the initial labels in the initial label set that meet the preset conditions are updated to obtain the initial labels for the next iteration. That is, each iteration optimizes the initial labels, thus enabling the finally trained machine learning classifier to have good robustness and generalization ability. In this way, when a user watches streaming video online, the final machine learning classifier can make an accurate and real-time evaluation of the QoE of the streaming video.

[0012] In one possible implementation, the preset conditions include:

[0013] The initial label corresponds to a moment within the stutter period.

[0014] It can be understood that, since stuttering usually occurs in the middle of a session, the probability of noisy labels is relatively high during this period. Therefore, updating the initial labels within the stutter period can improve the robustness and generalization of the machine learning classifier trained with those labels.

[0015] In one possible implementation, the stutter period includes any of the following:

[0016] An abrupt-change moment and a preset duration before that moment;

[0017] An abrupt-change moment and a preset duration after that moment;

[0018] An abrupt-change moment, a preset duration before that moment, and a preset duration after that moment;

[0019] Here, an abrupt-change moment is a moment whose initial label differs from the initial label of the immediately preceding moment, or differs from the initial label of the immediately following moment.

[0020] It can be understood that around an abrupt-change moment the streaming video switches from smooth to stuttering, or from stuttering to smooth, so the probability that a label is noisy is relatively high. Correcting the initial labels in the period before and after the abrupt change therefore assists the training of the machine learning classifier.
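As a minimal sketch of how the period around the abrupt-change ("mutation") moments might be located, the function below scans a label sequence and collects the time indices that fall within a preset number of steps before and/or after each change. The function name `stutter_period`, the integer label encoding, and measuring the preset duration in discrete time steps are all assumptions made for illustration.

```python
def stutter_period(labels, before=1, after=1):
    """Return the sorted time indices around each abrupt-change moment.

    A moment t counts as an abrupt-change moment when its label differs from
    the label at t-1 or from the label at t+1. `before` and `after` are the
    preset durations, expressed here as numbers of time steps (assumption).
    """
    n = len(labels)
    selected = set()
    for t in range(n):
        differs_prev = t > 0 and labels[t] != labels[t - 1]
        differs_next = t < n - 1 and labels[t] != labels[t + 1]
        if differs_prev or differs_next:
            lo = max(0, t - before)
            hi = min(n - 1, t + after)
            selected.update(range(lo, hi + 1))
    return sorted(selected)
```

Setting `after=0` corresponds to the first variant (the moment plus the preceding duration), `before=0` to the second, and nonzero values for both to the third.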

[0021] In one possible implementation, the QoE evaluation result corresponding to a given moment includes a prediction label and a predicted stutter probability, or a prediction label and a predicted smoothness probability. A prediction label indicates whether the streamed video that has finished playing was stuttering or smooth at a given moment, a predicted stutter probability indicates the probability that the streamed video that has finished playing was stuttering at a given moment, and a predicted smoothness probability indicates the probability that the streamed video that has finished playing was smooth at a given moment.

[0022] Updating the initial labels in the initial label set that meet the preset conditions to obtain the updated initial label set includes:

[0023] For any target moment within the stutter period: if the predicted label for the target moment indicates stuttering, and the predicted stutter probability for the target moment is less than or equal to the average stutter probability, the predicted stutter probability for the target moment is added to a first set; or, if the predicted label for the target moment indicates smooth playback, and the predicted smoothness probability for the target moment is less than or equal to the average smoothness probability, the predicted smoothness probability for the target moment is added to the first set. The average stutter probability is the average of the predicted stutter probabilities of the moments predicted as stuttering within the stutter period, and the average smoothness probability is the average of the predicted smoothness probabilities of the moments predicted as smooth within the stutter period.

[0024] Determine P probabilities from the first set, where the P probabilities are the P smallest probabilities in the first set, and P is a positive integer.

[0025] Update the P initial labels in the initial label set to obtain the updated initial label set, where the P initial labels correspond to the same moments as the P probabilities.

[0026] It can be understood that by comparing each predicted probability with the corresponding average predicted probability, the initial labels with lower confidence can be selected from the initial label set, and the lowest-confidence labels among them can then be updated.

[0027] In one possible implementation, updating the P initial labels in the initial label set to obtain the updated initial label set includes:

[0028] Update the first-type labels among the P initial labels to second-type labels, and update the second-type labels among the P initial labels to first-type labels, to obtain the updated initial label set;

[0029] A first-type label indicates that the streaming video that has finished playing was stuttering at a certain moment, while a second-type label indicates that it was smooth at a certain moment.

[0030] It can be understood that, for initial labels with low confidence, correcting stuttering labels to smooth labels, and vice versa, assists the training of the machine learning classifier.
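The selection and flipping described in paragraphs [0023] to [0028] might look roughly like the sketch below. It assumes labels encoded as 1 (stuttering) and 0 (smooth), and that the predicted smoothness probability is one minus the predicted stutter probability; the function names `select_low_confidence` and `flip_labels` are hypothetical.

```python
def select_low_confidence(period, pred_labels, stutter_probs, p):
    """Return up to p moments in `period` whose predictions are least confident.

    Confidence is the probability assigned to the predicted class. A moment
    enters the first set when its confidence is at or below the average
    confidence of the moments that share its predicted class.
    """
    conf = {}
    for t in period:
        if pred_labels[t] == 1:
            conf[t] = stutter_probs[t]
        else:
            conf[t] = 1.0 - stutter_probs[t]
    stutter = [t for t in period if pred_labels[t] == 1]
    smooth = [t for t in period if pred_labels[t] == 0]
    first_set = []
    for group in (stutter, smooth):
        if not group:
            continue
        mean = sum(conf[t] for t in group) / len(group)
        first_set.extend(t for t in group if conf[t] <= mean)
    first_set.sort(key=lambda t: conf[t])  # smallest probabilities first
    return first_set[:p]


def flip_labels(labels, moments):
    """Return a copy of `labels` with stutter and smooth swapped at `moments`."""
    updated = list(labels)
    for t in moments:
        updated[t] = 1 - updated[t]
    return updated
```

The returned indices can be passed directly to `flip_labels` to produce the label set for the next round.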

[0031] In one possible implementation, the QoE evaluation result corresponding to a moment includes a predicted stutter probability, which represents the probability that a streaming video that has finished playing is stuttering at a moment.

[0032] Updating the initial labels in the initial label set that meet the preset conditions to obtain the updated initial label set includes:

[0033] For any target moment within the stutter period: determine the predicted stutter probability of the target moment based on the mapping relationship between the feature vector corresponding to each moment and the QoE evaluation result; and determine the cross-entropy loss for the target moment based on the initial label corresponding to the target moment and the predicted stutter probability of the target moment.

[0034] After obtaining the cross-entropy loss corresponding to each moment, update Q initial labels in the initial label set to obtain the updated initial label set, where the cross-entropy losses of the Q initial labels are greater than those of the other initial labels in the initial label set, and Q is a positive integer.

[0035] It can be understood that by comparing the cross-entropy losses of the individual moments, the initial labels with lower confidence can be selected from the initial label set and then updated.

[0036] In one possible implementation, updating the Q initial labels in the initial label set to obtain the updated initial label set includes:

[0037] Update the first-type labels among the Q initial labels to second-type labels, and update the second-type labels among the Q initial labels to first-type labels, to obtain the updated initial label set;

[0038] A first-type label indicates that the streaming video that has finished playing was stuttering at a certain moment, while a second-type label indicates that it was smooth at a certain moment.

[0039] It can be understood that, for initial labels with low confidence, correcting stuttering labels to smooth labels, and vice versa, assists the training of the machine learning classifier.
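The cross-entropy variant in paragraphs [0033] to [0034] can be illustrated with the following sketch. It assumes the standard binary cross-entropy, labels encoded as 1 (stuttering) and 0 (smooth), and a small clipping constant to avoid taking the logarithm of zero; the names `cross_entropy` and `select_highest_loss` are hypothetical.

```python
import math


def cross_entropy(label, stutter_prob, eps=1e-12):
    """Binary cross-entropy between an initial label and a predicted stutter probability."""
    p = min(max(stutter_prob, eps), 1.0 - eps)  # clip to keep log() finite
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def select_highest_loss(period, labels, stutter_probs, q):
    """Return the q moments in `period` with the largest cross-entropy loss."""
    losses = {t: cross_entropy(labels[t], stutter_probs[t]) for t in period}
    return sorted(period, key=lambda t: losses[t], reverse=True)[:q]
```

A large loss means the classifier strongly disagrees with the initial label at that moment, which is why those labels are treated as the least trustworthy.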

[0040] In one possible implementation, the initial set of labels used for the first round of iterative training is either manually labeled or automatically generated based on log files of streaming videos that have finished playing.

[0041] It can be understood that manually labeled initial labels contain errors, and initial labels generated automatically from log files are not real-time. There is therefore a certain error between the initial labels and the true QoE labels; that is, the initial label set contains noisy labels. After the manually labeled or log-file-generated initial label set is used for the first round of training, the noisy labels in the initial label set are gradually corrected through multiple rounds of iterative training, yielding a machine learning classifier with good robustness and generalization.

[0042] In one possible implementation, obtaining the set of feature vectors includes:

[0043] Normalize the offline data, which consists of terminal-side parameters obtained from the log files of streaming video that has finished playing;

[0044] Extract all feature vectors from the normalized offline data;

[0045] Remove redundant feature vectors from all the feature vectors to obtain the set of feature vectors.
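The source does not say which normalization or redundancy criterion is used, so the sketch below picks two common choices purely for illustration: min-max normalization, and dropping any feature whose absolute Pearson correlation with an already kept feature reaches a threshold. The function names, the 0.95 threshold, and the column names in the usage note are all assumptions.

```python
import math


def min_max_normalize(column):
    """Scale a list of values into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def drop_redundant(columns, threshold=0.95):
    """Keep each feature only if it is not highly correlated with a kept one."""
    kept = {}
    for name, col in columns.items():
        if all(abs(pearson(col, other)) < threshold for other in kept.values()):
            kept[name] = col
    return kept
```

For example, given columns `"rtt"` and a rescaled copy of it, `drop_redundant` keeps only the first, since the two carry the same information.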

[0046] In one possible implementation, all feature vectors include at least one of the following:

[0047] The global feature vector is a feature vector extracted from all data, which includes offline data from the initial moment when the video application starts playing the streaming video to time t.

[0048] The window feature vector is a feature vector extracted from a portion of the data, which includes offline data at time t and a preset duration prior to time t.

[0049] Other feature vectors, which are feature vectors that are independent of the network data.
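To make the difference between a global feature and a window feature concrete, the sketch below computes one statistic over each range. Using the mean as the statistic, indexing samples by discrete time step, and the function names are assumptions; the source only specifies which data range each feature type draws on.

```python
def global_feature(samples, t):
    """Mean of all samples from the start of playback up to and including time t."""
    part = samples[: t + 1]
    return sum(part) / len(part)


def window_feature(samples, t, window):
    """Mean of the sample at time t and up to `window` samples preceding it."""
    part = samples[max(0, t - window): t + 1]
    return sum(part) / len(part)
```

With samples `[10, 20, 30, 40, 50]`, the global feature at t = 3 averages the first four values, while a window of 1 averages only the values at t = 2 and t = 3, so the window feature reacts faster to recent changes.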

[0050] In one possible implementation, the terminal-side parameters include at least one of the following:

[0051] Transport layer parameters, which reflect the network transmission status when playing a streaming video that has finished playing;

[0052] Quality of Service (QoS) parameters are used to evaluate a network's ability to provide service for streaming video that has finished playing.

[0053] Terminal parameters are the parameters of the terminal device itself when playing a streaming video that has finished playing.

[0054] In a second aspect, this application provides a training apparatus comprising units / modules for performing the method described in the first aspect above. The apparatus can correspondingly perform the method described in the first aspect. For a detailed description of the units / modules in this apparatus, refer to the description of the first aspect above; for brevity, it is not repeated here.

[0055] In a third aspect, a terminal device is provided, including a processor coupled to a memory, the processor being configured to execute a computer program or instructions stored in the memory, such that the terminal device implements the training method for a machine learning classifier according to any implementation of the first aspect.

[0056] In a fourth aspect, a chip coupled to a memory is provided for reading and executing a computer program stored in the memory to implement the training method for a machine learning classifier according to any implementation of the first aspect.

[0057] In a fifth aspect, a computer-readable storage medium is provided that stores a computer program which, when run on a terminal device, causes the terminal device to execute the training method for a machine learning classifier according to any implementation of the first aspect.

[0058] In a sixth aspect, a computer program product is provided which, when run on a computer, causes the computer to execute the training method for a machine learning classifier according to any implementation of the first aspect.

[0059] It can be understood that, for the beneficial effects of the second to sixth aspects above, reference may be made to the relevant descriptions of the first aspect; they are not repeated here.

Attached Figure Description

[0060] Figure 1 is a schematic diagram of the architecture of a communication system provided in an embodiment of this application;

[0061] Figure 2 is a schematic diagram of the overall process of the QoE evaluation method provided in an embodiment of this application;

[0062] Figure 3 is a flowchart of the training method for a machine learning classifier based on confidence learning provided in an embodiment of this application;

[0063] Figure 4 is a schematic diagram of a cross-entropy loss function provided in an embodiment of this application;

[0064] Figure 5 is a schematic diagram of another cross-entropy loss function provided in an embodiment of this application;

[0065] Figure 6 is a flowchart of a method for updating initial labels based on a confidence learning algorithm, provided in an embodiment of this application;

[0066] Figure 7 is a flowchart of a method for updating initial labels based on a confidence learning algorithm, provided in another embodiment of this application;

[0067] Figure 8 is a schematic diagram of the process of obtaining a set of feature vectors provided in an embodiment of this application;

[0068] Figure 9 is a schematic diagram of the structure of the training device provided in an embodiment of this application;

[0069] Figure 10 is a schematic diagram of the structure of a terminal device provided in an embodiment of this application.

Detailed Implementation

[0070] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are some embodiments of this application, but not all embodiments.

[0071] In the description of this application, unless otherwise stated, " / " means "or". For example, A / B can mean A or B. In the description of this application, "and / or" is merely a way of describing the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent three cases: A alone, A and B simultaneously, and B alone.

[0072] In the specification and claims of this application, the terms "first" and "second," etc., are used to distinguish different objects or to distinguish different treatments of the same object, rather than to describe a specific order of objects. For example, "first type" and "second type," etc., are used to distinguish different types, rather than to describe a specific order of types.

[0073] References to "one embodiment" or "some embodiments" as described in this specification mean that one or more embodiments of this application include a specific feature, structure, or characteristic described in connection with that embodiment. Therefore, the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in still other embodiments," etc., appearing in different parts of this specification do not necessarily refer to the same embodiment, but rather mean "one or more, but not all, embodiments," unless otherwise specifically emphasized. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless otherwise specifically emphasized.

[0074] First, some of the terms used in this application are explained.

[0075] 1. Streaming media refers to media formats such as audio, video, or multimedia that are played continuously in real time over a network using streaming media technology. Streaming media technology compresses continuous video and / or audio information, uploads it to a server, and has the server transmit the compressed files to terminal devices in real time. This allows users to watch streaming media files while downloading them, without having to wait for the entire compressed file to download to their device. In the embodiments of this application, when the media format of the streaming media is video, the video is referred to as streaming video or video streaming media, etc. For ease of explanation, the following embodiments use streaming video as an example, which does not limit the scope of the embodiments of this application.

[0076] 2. QoE refers to the user's overall level of satisfaction with a network service or business under certain objective conditions, that is, the user's perception of its overall quality. It reflects the user's satisfaction with service performance by capturing the user's subjective impression of the quality and performance of equipment, networks, systems, applications, or services.

[0077] In some embodiments, the parameters used to evaluate QoE are simply referred to as QoE parameters. During video playback, the frequency and total duration of video buffering also affect the video QoE assessment. For example, frequent buffering and / or a large total duration, i.e., stuttering events occurring during video playback, will lower the user's QoE. Furthermore, factors influencing QoE assessment may include other factors such as user expectations, user preferences and privacy, user payment fees, and the type of service offered by the application.

[0078] 3. Quality of Service (QoS) is a technical metric applied to a network to ensure or enhance network QoE. QoS can be used to address issues such as network latency and congestion.

[0079] In some embodiments, the parameters used to evaluate QoS are simply referred to as QoS parameters. QoS parameters may include parameters such as system throughput, signal strength, network transmission stability, reliability, transmission latency, latency jitter, packet loss rate, transmission bit rate, bit error rate, transmission failure rate, and security. Based on these QoS parameters, it is possible to assess whether the network communication quality meets the requirements of business communication.

[0080] It should be understood that the QoE and QoS parameters mentioned above are merely illustrative examples and may include other arbitrary parameters. The specific parameters can be determined according to actual usage requirements, and this application embodiment does not impose any limitations.

[0081] It should be noted that both QoE and QoS can be used to measure the overall quality of network services. QoE is related to specific services, and different services have different QoS requirements. For example, some services are more sensitive to latency metrics in QoS, such as Voice over Internet Protocol (VoIP) services; while other services are more sensitive to packet loss rate metrics in QoS, such as file transfer services.

[0082] Based on the above description, since QoE can more completely depict the user's overall level of acceptance of the network service or business used in a certain objective environment, this application adopts QoE as the evaluation standard for user satisfaction with streaming video services.

[0083] In practice, the following QoE evaluation methods are commonly used:

[0084] One method is an offline QoE assessment. During a user's viewing of a streaming video, the terminal device's log file records information about the interactions between the system and the user, automatically capturing data such as the type, content, and timing of these interactions. After the user has finished watching the video, QoE influencing factors are extracted from the log file. Then, a model is built based on these QoE influencing factors to obtain a QoE score. However, this offline assessment method can only predict and assess QoE after the user has finished watching the video, resulting in a time lag.

[0085] Another approach is real-time QoE evaluation. This method includes the following steps: first, extracting the round-trip time (RTT) information of uplink data packets in the video stream and constructing an input vector; then, constructing a neural network model containing convolutional and fully connected layers; next, inputting the constructed input vector into the neural network model, extracting features, applying the fully connected layers, and predicting the video QoE metric; finally, inputting the RTT information of the encrypted traffic of the video to be estimated into the trained neural network model to predict the video QoE metric. These steps involve a large amount of repeated computation, which wastes resources. Furthermore, this approach uses only RTT as an input parameter, so its input covers a single parameter dimension. It is also limited to the Transmission Control Protocol (TCP) and is not applicable to low-latency transport layer protocols such as Quick UDP Internet Connections (QUIC), which is based on the User Datagram Protocol (UDP).

[0086] In view of this, this application provides a method for training a machine learning classifier: In any scenario of playing streaming media video, such as video-on-demand, online live streaming, real-time video conferencing, multimedia news release, online advertising, e-commerce, distance education, telemedicine, or internet radio, the terminal device first obtains terminal-side parameters based on the log file of the finished streaming media video, then extracts offline features from the terminal-side parameters, and then uses the offline features and QoE labels to train a machine learning classifier to obtain the mapping relationship between offline features and QoE evaluation results.

[0087] For example, suppose that from time t_1 to time t_i the terminal device has played at least one streaming video, continuously or intermittently. After the last streaming video finishes playing, the terminal device can obtain offline data from the log file of the at least one streaming video and extract from the offline data the feature vectors for time t_1 to time t_i. Additionally, the terminal device can obtain a manually annotated or automatically generated QoE label set. For any time t from time t_1 to time t_i, the QoE label of time t indicates whether the streaming video is smooth or choppy at time t: one label value indicates that the streaming video at time t is smooth, and the other indicates that the streaming video at time t is choppy.

[0088] Given the feature vector at any time t from time t_1 to time t_i, the training objective is to obtain a machine learning classifier F_ml(·) satisfying the following conditions:

[0089]

[0090]

[0091]

[0092] Here, the input of the classifier is the feature vector used to evaluate the QoE at time t.

[0093] The output of the classifier is the predicted label for time t, i.e., the QoE evaluation result. One value of the predicted label means that the machine learning classifier predicts the streaming video to be smooth at that moment; the other value means that it predicts the streaming video to be choppy at that moment.

[0094] F_ml(·) represents the mapping relationship between the independent variable and the dependent variable, specifically the mapping from the feature vector corresponding to time t to the predicted label.

[0095] L(·) is the loss function, used to represent the "risk" or "loss" of a random event. Here it expresses the loss or risk of outputting a given predicted label when the QoE label has a given value, that is, the loss corresponding to the degree of difference between the QoE label and the predicted label. Specifically, L(·) is a 0-1 loss: when the predicted label equals the QoE label, the loss is 0; otherwise, the loss is 1. It should be understood that the loss or risk is lowest when the two are equal.

[0096] "s.t." is an abbreviation for "subject to," used to indicate that a constraint is satisfied. In this embodiment, "s.t. min L(·)" indicates that the loss function L(·) is to be minimized; that is, the goal of training the machine learning classifier is to minimize the loss function, thereby minimizing the loss or risk.
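The formulas of paragraphs [0089] to [0091] did not survive in this text. Based only on the verbal definitions in the surrounding paragraphs, they plausibly take a form like the one below. The symbols x_t (feature vector at time t), y_t (QoE label at time t), and \hat{y}_t (predicted label at time t) are stand-ins chosen here, not the patent's own notation, and the exact layout of the original formulas is not known.

```latex
\begin{aligned}
\hat{y}_t &= F_{ml}(x_t), \qquad t \in \{t_1, \dots, t_i\} \\
\text{s.t.}\quad & \min\, L(y_t, \hat{y}_t) \\
L(y_t, \hat{y}_t) &=
\begin{cases}
0, & \hat{y}_t = y_t \\
1, & \hat{y}_t \neq y_t
\end{cases}
\end{aligned}
```

Read this way, the classifier is asked to reproduce the QoE label at every moment, and each disagreement costs one unit of loss.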

[0097] However, in practice, because the QoE labels mentioned above are either manually labeled or automatically generated based on log files, manually labeled QoE labels have error issues, and automatically generated QoE labels based on log files have non-real-time issues. Therefore, there is a certain time delay error between the QoE labels and the true QoE labels; that is, there are noisy labels. These labels can interfere with the training of the machine learning classifier. Furthermore, even with true QoE labels, at moments when the QoE label changes abruptly (e.g., the QoE label at time t is stuttering, while at time t-1 it is smooth; or, the QoE label at time t is smooth, while at time t-1 it is stuttering), the difference in real-time network parameters is small, but the corresponding QoE evaluation results at these moments may be drastically different. This causes similar feature vectors to map to different QoE evaluation results, thus interfering with the training of the machine learning classifier. Therefore, under such circumstances, machine learning classifiers trained using the above methods may not possess good robustness and generalization ability.

[0098] To address the issue of poor robustness and generalization of machine learning classifiers, this application improves upon the aforementioned training method by providing a confidence-based learning algorithm to assist in the training of the machine learning classifier. Specifically, it provides a confidence-based machine learning classifier training method. First, a set of feature vectors is obtained from the log file of the streamed video that has finished playing. Then, in each iteration of training, the machine learning classifier is trained using the initial label set corresponding to that iteration; and the initial labels in this initial label set that meet preset conditions are updated to generate the initial label set for the next iteration. This process continues until the preset number of iterations is reached, at which point iterative training stops, resulting in the final machine learning classifier used for real-time evaluation of the QoE of the streamed video. In this training method, because the initial labels in the initial label set that meet preset conditions are updated and optimized in each iteration, the final trained machine learning classifier possesses good robustness and generalization. Thus, when a user watches a streamed video online, the final machine learning classifier can accurately and in real-time evaluate the QoE of the streamed video.

[0099] Figure 1 shows a schematic diagram of the communication system architecture involved in various embodiments of this application. As shown in Figure 1, the communication system includes a terminal device 1 and a server 2. Terminal device 1 can access a wireless local area network (WLAN) through an access point (AP) device 3, for example, by accessing a wireless-fidelity (Wi-Fi) network through a router, thereby establishing a connection with server 2 and exchanging data. Alternatively, terminal device 1 can access a mobile network (also known as a cellular network or mobile data network) through a network device 4, for example, by accessing a mobile network through a base station, thereby establishing a connection with server 2 and exchanging data.

[0100] The terminal device 1 can be a mobile terminal, a non-mobile terminal, a user equipment, or another device or apparatus capable of real-time QoE assessment. For example, a mobile terminal can be a mobile phone, tablet computer, laptop computer, in-vehicle terminal, wearable device, ultra-mobile personal computer (UMPC), netbook, or personal digital assistant (PDA), etc., while a non-mobile terminal can be a personal computer (PC), smart screen, television (TV), automated teller machine (ATM), or self-service machine, etc. As one example, the terminal device 1 can be a smartphone with a built-in Wi-Fi module, or a computer equipped with a wireless network card. This embodiment of the application does not impose any limitations on the specific type of terminal device.

[0101] Server 2 can be a wide area network (WAN) web server and / or a streaming media server, etc. A streaming media server is also called an audio / video (A / V) server. The specific type of server is not limited in this embodiment.

[0102] In some embodiments, data transmission between terminal device 1 and server 2 can be performed via connection-oriented TCP or connectionless User Datagram Protocol (UDP).

[0103] In some embodiments, terminal device 1 and server 2 can exchange uplink and / or downlink data. For example, terminal device 1 can send playback request information for a streaming media file to server 2. As another example, server 2 can send streaming media files to terminal device 1 in real time. Yet another example is that terminal device 1 can send streaming media file data packets to other devices through server 2, and receive media file data packets sent by other devices through server 2.

[0104] In some embodiments, server 2 can be used to provide streaming media files, and terminal device 1 can be used to play the streaming media files. Taking a server including a web server and an A / V server as an example, the streaming media transmission process can specifically include:

[0105] After the A / V server obtains the original video file, it preprocesses it to compress it into a streaming media file. When a user selects a streaming media file for playback via terminal device 1, the web browser and web server exchange control information using HTTP / TCP to retrieve the necessary real-time data from the original file. The web browser then launches the audio / video helper (A / V helper) program, using HTTP to retrieve relevant parameters from the A / V server to initialize the helper program. These parameters include directory information, the encoding type of the A / V data, and the server address related to A / V retrieval. Afterward, the A / V helper program and the A / V server run the streaming media protocol to exchange control information required for A / V transmission. This streaming media protocol can be HTTP Live Streaming (HLS) or Dynamic Adaptive Streaming over HTTP (DASH), providing methods for controlling playback, fast forward, rewind, pause, and recording. The A / V server uses Real-Time Transport Protocol (RTP) / UDP to transmit A / V data to the A / V client program, i.e., the A / V helper program. When terminal device 1 receives the A / V data, the A / V client program can play it. During the playback process, low-level parameters can be collected in real time and input into a pre-built QoE evaluation model, enabling real-time evaluation and prediction of QoE.

[0106] It should be noted that the aforementioned streaming media files can be audio, video, or multimedia files, etc. This embodiment of the application illustrates the process of server 2 sending various data packets of streaming media video to terminal device 1, and terminal device 1 recording the playback information of the media video in a log file. Terminal device 1 then trains a machine learning classifier based on the log file.

[0107] Figure 2 is a schematic diagram of the overall process of the QoE evaluation method provided in the embodiments of this application.

[0108] As shown in Figure 2, the QoE evaluation method can include two parts:

[0109] The first part is the offline training process.

[0110] The offline training process is mainly used to train a machine learning classifier, also known as a machine learning model, for real-time QoE evaluation based on the log files of streaming video that has finished playing.

[0111] The offline training process can include several sub-processes such as offline dataset collection, parameter normalization, offline feature calculation, feature selection, and offline training of machine learning classifiers based on confidence learning algorithms.

[0112] For example, during the playback of streaming video using a video application (APP) (e.g., APP1 and APP2) on a terminal device for video-on-demand, live streaming, real-time video conferencing, multimedia news releases, online advertising, e-commerce, distance education, telemedicine, or internet radio, the terminal device's log file can record information about the interaction between the system and the user. After the streaming video playback ends, the terminal device can obtain an offline dataset based on the log file. This offline dataset includes multiple offline parameters, which are the underlying parameters input during streaming video playback, such as transport layer parameters, QoS parameters, and / or terminal parameters. The terminal device then performs parameter normalization, offline feature calculation, and offline feature selection on the offline dataset sequentially to obtain a feature vector set for QoE evaluation. Next, the terminal device trains a machine learning classifier based on this feature vector set and the initial label set corresponding to each iteration of training, updating the initial labels in the initial label set that meet preset conditions to generate an initial label set for the next iteration. After a preset number of iterations, the iterative training stops, resulting in the final machine learning classifier used for real-time evaluation of the QoE of streaming video.

[0113] The second part is the online assessment (real-time measurement) process.

[0114] The online evaluation process is mainly used to input the online dataset collected in real time during streaming media playback into a pre-built machine learning classifier, thereby outputting the QoE evaluation result at the current moment.

[0115] The online evaluation process can include several sub-processes such as online dataset collection, parameter normalization, online feature calculation, and online inference of machine learning classifiers.

[0116] For example, when a user plays a target streaming video using a video-based app (e.g., app1 and app2) on a terminal device, the terminal device can acquire an online dataset in real time. This online dataset includes multiple online parameters, which are low-level parameters collected online by the terminal device during the playback of the target streaming video, such as transport layer parameters, QoS parameters, and / or terminal parameters. The terminal device then sequentially performs parameter normalization and online feature calculation on the online dataset to obtain a set of feature vectors for QoE evaluation. This set of feature vectors is then input into a pre-trained machine learning classifier to output the QoE evaluation result for the current time step.

[0117] It should be noted that, in the embodiments of this application, the QoE evaluation results can be used to determine a QoE score. For example, the QoE evaluation results from time t1 to time t_i are input into a preset model to obtain a QoE score. The final QoE score is 1, 2, 3, 4, or 5.

[0118] Figure 3 is a flowchart of the training method for a machine learning classifier based on confidence learning, as provided in an embodiment of this application.

[0119] As shown in Figure 3, the method may include the following steps S101 to S104.

[0120] S101. The terminal device obtains the feature vector set.

[0121] The feature vector set comprises multiple feature vectors. Each feature vector in the feature vector set is obtained from the log file of the streaming video that has finished playing, and each feature vector can be used to evaluate the QoE at one moment.

[0122] For example, suppose that the terminal device plays at least one streaming video, continuously or intermittently, during the period from time t1 to time t_i. After playback of the last streaming video ends, the terminal device can obtain the offline dataset corresponding to the at least one streaming video. In the offline dataset, the sample data for time t_i represents the underlying parameters recorded in the log file at time t_i, the sample data for time t_(i-1) represents the underlying parameters recorded in the log file at time t_(i-1), ..., and the sample data for time t1 represents the underlying parameters recorded in the log file at time t1. The terminal device then normalizes each offline parameter in the offline dataset and extracts feature vectors from the normalized offline parameters. Furthermore, the terminal device can select among these feature vectors to obtain the final feature vector, which can be used to evaluate the QoE at time t_i.

[0123] Following the method described above, the feature vector used to evaluate the QoE at time t1, the feature vector used to evaluate the QoE at time t2, ..., and the feature vector used to evaluate the QoE at time t_i can be obtained in sequence. These feature vectors form the feature vector set.

[0124] S102. In each round of iterative training, the terminal device trains a machine learning classifier based on the feature vector set and the initial label set used for this round of iterative training, and obtains the mapping relationship between the feature vector and the QoE evaluation result corresponding to each time step.

[0125] Optionally, the machine learning classifier mentioned above can be any of the following: decision tree, support vector machine (SVM), random forest (RF), reinforcement learning or boosting method (AdaBoost), and gradient boosting decision tree (GBDT), etc.

[0126] Optionally, the aforementioned initial label set may include multiple initial labels. Each initial label indicates whether the streaming video that has finished playing was stuttering or smooth at a given moment. In some embodiments, an initial label of 0 indicates that the streaming video was smooth, and an initial label of 1 indicates that the streaming video was stuttering. In other embodiments, an initial label of 0 indicates that the streaming video was stuttering, and an initial label of 1 indicates that the streaming video was smooth.

[0127] When the machine learning classifier is trained over multiple iterations, in each iteration the terminal device, after obtaining the feature vector set and the initial label set corresponding to that iteration, trains the machine learning classifier to obtain the mapping relationship F_ml(·) between the feature vector corresponding to each time step and the QoE evaluation result. Here, the initial label set used for the current iteration is the label set generated by updating the initial label set used in the previous iteration.

[0128] Optionally, the QoE evaluation result corresponding to a given time point includes any of the following:

[0129] 1) A predicted label;

[0130] 2) A predicted stuttering probability;

[0131] 3) A predicted smoothness probability;

[0132] 4) A predicted label and a predicted stuttering probability;

[0133] 5) A predicted label and a predicted smoothness probability.

[0134] Here, a predicted label indicates whether the streaming video that has finished playing is stuttering or smooth at a given moment, a predicted stuttering probability indicates the probability that the streaming video that has finished playing is stuttering at a given moment, and a predicted smoothness probability indicates the probability that the streaming video that has finished playing is smooth at a given moment.

[0135] Suppose the predicted label at a certain time is used This indicates that the probability of a predicted stutter at a given moment is expressed as... If it is indicated, then it exists:

[0136]

[0137] Among them, when or When, it indicates that the streaming video is smooth; when or At this time, it indicates that the streaming video is buffering.

[0138] Let's assume the predicted label at another time point is... This indicates that the probability of predicting fluency at a given moment is expressed as... If it is indicated, then it exists:

[0139]

[0140] Among them, when or When, it indicates that the streaming video is smooth; when or At this time, it indicates that the streaming video is buffering.

[0141] In this embodiment, the optimization objective of the machine learning classifier is to minimize the loss corresponding to the degree of difference between the initial label and the QoE evaluation result at each time step.

[0142] Examples 1 and 2 below illustrate this optimization objective.

[0143] Example 1: When the QoE evaluation result corresponding to a given moment includes a predicted label, the following relations hold:

[0144]

[0145]

[0146]

[0147] Here, the predicted label is the output of the machine learning classifier for a given moment. One value of the predicted label means that the machine learning classifier predicts the streaming video to be smooth at that moment; the other value means that the machine learning classifier predicts the streaming video to be stuttering at that moment.

[0148] F_ml(·) denotes the mapping relationship between the independent variable and the dependent variable; specifically, it denotes the mapping from the feature vector corresponding to a given moment to the predicted label.

[0149] L(·) is the loss function, which represents the "risk" or "loss" of a random event; here it represents the loss corresponding to the degree of difference between the initial label and the predicted label. Specifically, L(·) is the 0-1 loss function: the loss is 0 when the predicted label equals the initial label, and 1 otherwise. It should be understood that the loss or risk is lowest when the predicted label equals the initial label.

[0150] "s.t." is an abbreviation of "subject to" and indicates that a constraint is satisfied. In this embodiment, "s.t. min" indicates that the minimum of the loss function is sought; that is, the optimization objective of training the machine learning classifier is to minimize the loss corresponding to the degree of difference between the initial label and the predicted label at each time step, thereby bringing the loss or risk to a minimum.
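As an illustration of the optimization objective in Example 1, the following minimal sketch computes the 0-1 loss between initial labels and predicted labels. The function names and the sample labels are assumptions made for illustration, and summing the per-moment losses is only one plausible way to aggregate them; the text above does not state the aggregation.

```python
def zero_one_loss(initial_label: int, predicted_label: int) -> int:
    """0-1 loss: 0 if the predicted label matches the initial label, else 1."""
    return 0 if initial_label == predicted_label else 1


def total_loss(initial_labels: list[int], predicted_labels: list[int]) -> int:
    """Sum of the per-moment 0-1 losses (aggregation by summation is assumed)."""
    if len(initial_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    return sum(
        zero_one_loss(y, y_hat)
        for y, y_hat in zip(initial_labels, predicted_labels)
    )


# Hypothetical labels for five moments (not taken from the source).
initial = [0, 0, 1, 1, 0]
predicted = [0, 1, 1, 1, 0]
loss = total_loss(initial, predicted)
```

A classifier whose predicted labels agree with the initial labels at every moment would give a total loss of 0, which is the minimum this objective can reach.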

[0151] Example 2: When the QoE evaluation result corresponding to a given moment includes a predicted stuttering probability, the following relations hold:

[0152]

[0153]

[0154]

[0155] Here, the predicted stuttering probability is the output of the machine learning classifier for a given moment. A predicted stuttering probability of 0 means that the machine learning classifier predicts that the streaming video will not stutter at that moment; a predicted stuttering probability of 1 means that the machine learning classifier predicts that the streaming video will definitely stutter at that moment. It should be understood that the larger the predicted stuttering probability, the more likely the streaming video is to stutter according to the machine learning classifier.

[0156] F_ml(·) denotes the mapping relationship between the independent variable and the dependent variable; specifically, it denotes the mapping from the feature vector corresponding to a given moment to the predicted stuttering probability.

[0157] L(·) is the loss function, which represents the "risk" or "loss" of a random event; here it represents the loss corresponding to the degree of difference between the predicted stuttering probability and the initial label. It should be understood that the smaller the value of the loss function, the lower the loss or risk.

[0158] "s.t." is an abbreviation of "subject to" and indicates that a constraint is satisfied. In this embodiment, "s.t. min" indicates that the minimum of the loss function is sought; that is, the optimization objective of training the machine learning classifier is to minimize the loss corresponding to the degree of difference between the predicted stuttering probability and the initial label at each time step, thereby bringing the loss or risk to a minimum.

[0159] Because the predicted stuttering probability can be converted into a predicted label, the same final optimization goal is achieved regardless of whether the optimization method of Example 1 or Example 2 is used: minimizing the loss corresponding to the degree of difference between the initial label and the QoE evaluation result at each time step.
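The conversion from a predicted stuttering probability to a predicted label can be sketched as a simple threshold. The threshold of 0.5, the encoding (1 for stuttering, 0 for smooth), and the function name are assumptions for illustration; the text above does not specify how the conversion is performed.

```python
def to_predicted_label(p_stutter: float, threshold: float = 0.5) -> int:
    """Map a predicted stuttering probability to a label.

    Assumed encoding: 1 = stuttering, 0 = smooth.
    The default threshold of 0.5 is an illustrative choice.
    """
    if not 0.0 <= p_stutter <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return 1 if p_stutter >= threshold else 0


# Hypothetical classifier outputs for four moments.
predicted_probs = [0.05, 0.40, 0.75, 0.95]
predicted_labels = [to_predicted_label(p) for p in predicted_probs]
```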

[0160] Optionally, the loss function can be a cross entropy loss function, a hinge loss function, or an exponential loss function, etc.

[0161] For example, when the loss function in Example 2 above is the cross-entropy loss function, the cross-entropy loss function is expressed as:

[0162]

[0163] Here, a is the base of the logarithm. For example, a can be the irrational number e.

[0164] To understand the cross-entropy loss function more intuitively, let a = e below; an example is described with reference to Figure 4 and Figure 5.

[0165] As shown in Figure 4, for the case in which the initial label indicates stuttering, the horizontal axis represents the predicted stuttering probability output by the machine learning classifier, and the vertical axis represents the cross-entropy loss function L. The closer the predicted stuttering probability is to 1, the smaller the cross-entropy loss function L; the closer the predicted stuttering probability is to 0, the larger the cross-entropy loss function L.

[0166] As shown in Figure 5, for the case in which the initial label indicates smooth playback, the horizontal axis represents the predicted stuttering probability output by the machine learning classifier, and the vertical axis represents the cross-entropy loss function L. The closer the predicted stuttering probability is to 0, the smaller the cross-entropy loss function L; the closer the predicted stuttering probability is to 1, the larger the cross-entropy loss function L.

[0167] The cross-entropy loss function L in Figure 4 and Figure 5 characterizes the gap between the predicted stuttering probability and the initial label. The smaller this gap, the smaller the cross-entropy loss function L, and the smaller the "penalty" on the machine learning classifier. Therefore, when training the machine learning classifier, the gap between the predicted stuttering probability and the initial label should be made as small as possible, so as to obtain a machine learning classifier that minimizes the loss function.
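The behavior described for Figure 4 and Figure 5 can be checked numerically. The sketch below assumes the standard binary cross-entropy form with logarithm base a, and assumes that label 1 denotes stuttering and label 0 denotes smooth playback; the function name and the clipping constant `eps` are illustrative and not taken from the source.

```python
import math


def cross_entropy(label: int, p_stutter: float, a: float = math.e,
                  eps: float = 1e-12) -> float:
    """Binary cross-entropy with logarithm base a.

    label: 1 = stuttering, 0 = smooth (assumed encoding).
    p_stutter: predicted stuttering probability in [0, 1].
    """
    # Clip the probability to avoid log(0); eps is an illustrative choice.
    p = min(max(p_stutter, eps), 1.0 - eps)
    return -(label * math.log(p, a) + (1 - label) * math.log(1.0 - p, a))


probs = [0.1, 0.5, 0.9]
# Label indicates stuttering: the loss shrinks as the probability approaches 1.
loss_when_stuttering = [cross_entropy(1, p) for p in probs]
# Label indicates smooth playback: the loss grows as the probability approaches 1.
loss_when_smooth = [cross_entropy(0, p) for p in probs]
```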

[0168] S103. The terminal device updates the initial labels in the initial label set that meet the preset conditions, obtaining an updated initial label set. The updated initial label set is used for the next round of iterative training.

[0169] In this embodiment of the application, the initial labels in the initial label set used for the first round of iterative training are either manually annotated or automatically generated based on the log file of the streaming video that has finished playing.

[0170] For example, in the first round of iterative training, the terminal device trains the machine learning classifier based on the feature vector set and the initial label set corresponding to the first round; after completing the first round, it updates the initial labels in that initial label set that meet the preset conditions to obtain the initial label set used for the second round of iterative training. In the second round of iterative training, the terminal device trains the machine learning classifier based on the feature vector set and the initial label set corresponding to the second round; after completing the second round, it updates the initial labels in that initial label set that meet the preset conditions to obtain the initial label set used for the third round of iterative training. These iterative steps are repeated until the preset number of iterations is reached, at which point iterative training stops.

[0171] Optionally, the aforementioned preset condition may include: the moment corresponding to the initial label falls within the stuttering period. It should be understood that, since stuttering typically occurs in the middle of a session, the probability of noisy labels is higher during this phase. Therefore, updating the initial labels within the stuttering period can improve the robustness and generalization of the machine learning classifier trained with those labels.

[0172] The aforementioned stuttering period may include any of the following:

[0173] 1) The abrupt-change moment and a preset duration before the abrupt-change moment;

[0174] 2) The abrupt-change moment and a preset duration after the abrupt-change moment;

[0175] 3) The abrupt-change moment, a preset duration before the abrupt-change moment, and a preset duration after the abrupt-change moment.

[0176] Here, the initial label corresponding to the abrupt-change moment differs from the initial label corresponding to the previous moment; for example, the previous moment is "smooth" and the abrupt-change moment is "stuttering". Alternatively, the initial label corresponding to the abrupt-change moment differs from the initial label corresponding to the next moment; for example, the abrupt-change moment is "stuttering" and the next moment is "smooth".
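A minimal sketch of how the moments covered by option 3) could be located in a label sequence. It assumes discrete, equally spaced moments indexed from 0, label 1 for stuttering and 0 for smooth, and preset durations expressed as a number of moments; `abrupt_change_moments`, `stuttering_period_indices`, `before`, and `after` are hypothetical names. It treats as an abrupt-change (mutation) moment any moment whose label differs from that of the previous moment, which is only the first of the two alternatives given in paragraph [0176].

```python
def abrupt_change_moments(labels: list[int]) -> list[int]:
    """Indices t whose label differs from the label at t - 1."""
    return [t for t in range(1, len(labels)) if labels[t] != labels[t - 1]]


def stuttering_period_indices(labels: list[int], before: int, after: int) -> list[int]:
    """Indices within `before` moments before and `after` moments after
    each abrupt-change moment, including the abrupt-change moment itself."""
    selected = set()
    for t in abrupt_change_moments(labels):
        start = max(0, t - before)               # clamp at the first moment
        end = min(len(labels) - 1, t + after)    # clamp at the last moment
        selected.update(range(start, end + 1))
    return sorted(selected)


# Hypothetical initial labels: smooth, then stuttering, then smooth again.
labels = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
changes = abrupt_change_moments(labels)
window = stuttering_period_indices(labels, before=1, after=1)
```

Setting `after=0` or `before=0` gives options 1) and 2), respectively.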

[0177] Assume that, in the initial manually annotated labels, the streaming video stutters within the time period [t_s, t_e] and is smooth at other times. Based on the description of the above embodiments, since the initial labels are noisy labels, the stuttering period [t_s, t_e] needs to be calibrated. The calibrated stuttering period is defined in terms of the preset durations T_s and T_e, where n is the preset number of iterations. The preset durations T_s and T_e are preset fixed values related to the specific video service, and they can be equal or unequal. After each round of iterative training of the machine learning classifier, a confidence learning algorithm can be used to update the stuttering period and / or the labels, so as to obtain the initial label set for the next round of iterative training and to generate a new stuttering period. The next round of iterative training is then performed, and so on until the preset number of iterations is reached. It should be understood that, since noisy labels are more likely to exist around the abrupt-change moment, correcting the initial labels within the stuttering period before and after the abrupt-change moment can improve the generalization ability of the machine learning classifier.

[0178] Furthermore, the aforementioned preset condition may also include: the confidence level of the initial label is outside the confidence interval. Among the initial labels located within the stuttering period, some have low confidence levels, which affects the training of the machine learning classifier. Therefore, correcting the initial labels that are outside the confidence interval can further improve the generalization ability of the machine learning classifier. It should be noted that the implementation of correcting the initial labels based on confidence level is described in the following embodiments and is not repeated here.

[0179] Optionally, the training process for a confidence-based machine learning classifier is as follows:

[0180] 1. Take the manually annotated labels as the initial labels, and correct the initial labels corresponding to the period [t_s - T_s, t_s] to stuttering, for use in subsequent training.

[0181] 2. for k = 1 to n // Incrementing for loop: 1 is the initial value of the loop variable, and n is its final value.

[0182] 3. do // The instructions following "do" are executed in each iteration of the loop.

[0183] 4. Based on the feature vector set and the initial label set corresponding to the k-th round of iterative training, train the machine learning classifier F_ml(·), satisfying the following conditions:

[0184]

[0185]

[0186]

[0187] 5. Use the confidence learning algorithm to update the stuttering period and / or the labels, obtaining the initial label set for the next round of iterative training.

[0188] 6. done // End of the loop.

[0189] 7. The machine learning classifier F_ml(·) trained at k = n is the output of the entire algorithm, that is, the final machine learning classifier.
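The loop in steps 2 to 7 can be sketched in Python as follows. This is an illustrative skeleton under several assumptions rather than the method of this application: the classifier is a toy one-feature model, labels use 1 for stuttering and 0 for smooth, and `flip_confident_disagreements` is a placeholder update rule standing in for the confidence learning step. All function names, thresholds, and data are hypothetical.

```python
import math
from typing import Callable, Sequence

# A classifier maps one feature vector to a predicted stuttering probability.
Classifier = Callable[[Sequence[float]], float]


def fit_midpoint(features: list[list[float]], labels: list[int]) -> Classifier:
    """Toy classifier: a logistic curve centred between the two class means.

    Assumes stuttering samples (label 1) have the larger feature values.
    """
    ones = [x[0] for x, y in zip(features, labels) if y == 1]
    zeros = [x[0] for x, y in zip(features, labels) if y == 0]
    if not ones or not zeros:
        raise ValueError("both classes must be present")
    midpoint = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
    return lambda x: 1.0 / (1.0 + math.exp(-(x[0] - midpoint)))


def flip_confident_disagreements(labels: list[int], probs: list[float],
                                 low: float = 0.2, high: float = 0.8) -> list[int]:
    """Placeholder update rule: flip a label only when the classifier
    strongly disagrees with it. The thresholds are illustrative."""
    updated = []
    for y, p in zip(labels, probs):
        if y == 1 and p <= low:
            updated.append(0)
        elif y == 0 and p >= high:
            updated.append(1)
        else:
            updated.append(y)
    return updated


def train_with_label_updates(features, initial_labels, n_rounds, fit, update_labels):
    """Run n_rounds rounds: train on the current labels, then update them."""
    if n_rounds < 1:
        raise ValueError("n_rounds must be at least 1")
    labels = list(initial_labels)
    for _ in range(n_rounds):
        classifier = fit(features, labels)          # step 4: train F_ml
        probs = [classifier(x) for x in features]
        labels = update_labels(labels, probs)       # step 5: labels for next round
    return classifier, labels                       # step 7: classifier from round n


# Hypothetical data: the label at index 2 is noisy (small feature, labelled 1).
features = [[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]]
noisy_labels = [0, 0, 1, 1, 1, 1]
final_classifier, final_labels = train_with_label_updates(
    features, noisy_labels, n_rounds=2,
    fit=fit_midpoint, update_labels=flip_confident_disagreements,
)
```

Passing `fit` and `update_labels` as arguments keeps the loop independent of the particular classifier and of the particular label-update rule.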

[0190] Optionally, the initial label set may include two types of labels: a first type of label indicating that a streaming video that has finished playing was stuttering at a given moment; and a second type of label indicating that a streaming video that has finished playing was smooth at a given moment. Accordingly, updating the initial labels in the initial label set that meet the preset conditions includes: updating the first-type labels in the initial label set that meet the preset conditions to second-type labels, and updating the second-type labels in the initial label set that meet the preset conditions to first-type labels, thereby obtaining an updated initial label set.

[0191] For example, a first-type label that meets the preset conditions, which indicates that the streaming video is stuttering, can be updated to a second-type label; and a second-type label that meets the preset conditions, which indicates that the streaming video is smooth, can be updated to a first-type label.

[0192] S104. After the preset number of iterations is reached, the terminal device obtains the final machine learning classifier.

[0193] The final machine learning classifier is used to evaluate the QoE of streaming video in real time.

[0194] In each iteration of training in the above embodiment, the terminal device uses an initial label set corresponding to the current iteration to train the machine learning classifier. Then, it updates the initial labels in the initial label set that meet preset conditions, generating an initial label set for the next iteration. This process continues until a preset number of iterations is reached, at which point it stops training, resulting in a final machine learning classifier for real-time evaluation of the QoE of streaming video. Thus, when a user watches streaming video online, the terminal device can acquire the online dataset in real time and input the feature vectors of the online dataset into the final machine learning classifier. This allows the final machine learning classifier to make an accurate, real-time evaluation of the QoE, such as predicting whether the currently viewed streaming video is choppy or smooth. Based on the real-time QoE evaluation results, it can determine whether the current network quality meets communication requirements and whether to trigger network acceleration.

[0195] Figure 6 is a flowchart of a method for updating initial labels based on a confidence learning algorithm, provided in an embodiment of this application. After each round of iterative training of the machine learning classifier, assume that the following are known: the initial label set used for this round, the set of predicted labels obtained by the machine learning classifier through cross-validation, the set of predicted stuttering probabilities output by the machine learning classifier, and the stuttering period. Then, as shown in Figure 6, the method for obtaining the updated initial label set used for the next round of iterative training may include the following steps.

[0196] Step S11: The terminal device determines the average stuttering probability and the average smoothness probability.

[0197] The average stuttering probability is the mean of the predicted stuttering probabilities at the moments within the stuttering period that are predicted as stuttering, and the average smoothness probability is the mean of the predicted smoothness probabilities at the moments within the stuttering period that are predicted as smooth.

[0198] Assume that one label value indicates that the streaming video is smooth at time t, and the other label value indicates that the streaming video is stuttering at time t.

[0199] The average stuttering probability is:

[0200]

[0201] The average smoothness probability is:

[0202]

[0203] Here, the terms being averaged are, respectively, the predicted stuttering probability at each moment predicted as stuttering and the predicted smoothness probability at each moment predicted as smooth.

[0204] The first average is taken over the number of samples predicted as stuttering within the stuttering period.

[0205] The second average is taken over the number of samples predicted as smooth within the stuttering period.

[0206] Assume the stuttering period is from time t1 to time t10. Table 1 shows the correspondence between the predicted labels, predicted stuttering probabilities, and predicted smoothness probabilities within the stuttering period, as provided in the embodiments of this application.

[0207] Table 1

[0208]

[0209] From time t1 to time t5, the predicted label indicates stuttering. Based on the predicted stuttering probabilities at these moments, the average stuttering probability is calculated as: (0.6+0.9+0.7+0.7+0.6) / 5 = 0.7.

[0210] From time t6 to time t10, the predicted label indicates smooth playback. Based on the predicted smoothness probabilities at these moments, the average smoothness probability is calculated as: (0.6+0.9+0.7+0.8+0.8) / 5 = 0.76.
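The two averages can be reproduced with a short sketch. The probabilities are the ones listed in paragraphs [0209] and [0210], assigned to t1 through t10 in the order given; the data layout and the names `samples` and `average_probability` are assumptions for illustration. `Decimal` is used so that the arithmetic is exact.

```python
from decimal import Decimal

STUTTER, SMOOTH = "stuttering", "smooth"

# (moment, predicted label, predicted probability of that label)
samples = [
    ("t1", STUTTER, Decimal("0.6")),
    ("t2", STUTTER, Decimal("0.9")),
    ("t3", STUTTER, Decimal("0.7")),
    ("t4", STUTTER, Decimal("0.7")),
    ("t5", STUTTER, Decimal("0.6")),
    ("t6", SMOOTH, Decimal("0.6")),
    ("t7", SMOOTH, Decimal("0.9")),
    ("t8", SMOOTH, Decimal("0.7")),
    ("t9", SMOOTH, Decimal("0.8")),
    ("t10", SMOOTH, Decimal("0.8")),
]


def average_probability(samples, label):
    """Mean predicted probability over the moments predicted as `label`."""
    values = [p for _, predicted, p in samples if predicted == label]
    if not values:
        raise ValueError(f"no moments predicted as {label}")
    return sum(values) / len(values)


avg_stutter = average_probability(samples, STUTTER)  # 3.5 / 5
avg_smooth = average_probability(samples, SMOOTH)    # 3.8 / 5
```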

[0211] Step S12: For the sample at each moment within the stuttering period, the terminal device determines the predicted probability p_t corresponding to the predicted label output by the machine learning classifier.

[0212]

[0213] When the predicted label indicates stuttering, the predicted probability p_t is the predicted stuttering probability; when the predicted label indicates smooth playback, the predicted probability p_t is the predicted smoothness probability.

[0214] After the predicted probability p_t of the sample at each moment within the stuttering period has been determined, the terminal device can execute the following steps S13 to S17 for any moment within the stuttering period (referred to as the target moment).

[0215] Step S13: The terminal device determines whether the predicted label at the target time indicates stuttering or smoothness. If it is stuttering, proceed to step S14 below. If it is smooth, proceed to step S15 below.

[0216] Step S14: The terminal device determines whether the predicted stuttering probability at the target time is less than or equal to the average stuttering probability. If it is less than or equal to the average stuttering probability, proceed to step S16 below; otherwise, it does not need to be added to the first set.

[0217] Step S15: The terminal device determines whether the predicted smoothness probability at the target time is less than or equal to the average smoothness probability. If it is less than or equal to the average smoothness probability, then proceed to step S17 below; otherwise, it does not need to be added to the first set.

[0218] Step S16: The terminal device adds the predicted stuttering probability at the target time to the first set.

[0219] Step S17: The terminal device adds the predicted smoothness probability of the target time to the first set.

[0220] For example, the first set can be denoted as P_dirty. As shown in Table 1 above, the average stutter probability is 0.7. If the predicted stutter probability corresponding to a stutter label is less than or equal to 0.7, the predicted stutter probability is determined to be outside the confidence interval. The predicted stutter probabilities at times t1 and t5 (0.6) are both less than the average stutter probability, while those at times t3 and t4 (0.7) are both equal to it. Therefore, the confidence levels of the predicted stutter probabilities at times t1, t3, t4, and t5 are low, and these four probabilities can be added to the first set P_dirty.

[0221] Additionally, as shown in Table 1, the average smoothness probability is 0.76. If the predicted smoothness probability corresponding to a smoothness label is less than or equal to 0.76, the predicted smoothness probability is considered to be outside the confidence interval. The predicted smoothness probabilities at times t6 (0.6) and t8 (0.7) are less than the average smoothness probability, so their confidence levels are low and they can be added to the first set P_dirty.

[0222] Furthermore, the terminal device can add the moment corresponding to each probability added to the first set P_dirty to the second set Index_dirty. Finally, the terminal device obtains the first set P_dirty = {0.6, 0.7, 0.7, 0.6, 0.6, 0.7} and the second set Index_dirty = {t1, t3, t4, t5, t6, t8}.
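Steps S13 to S17 can be reproduced on the values of the worked example with a short Python sketch. The variable names, the string labels, and the small tolerance EPS are assumptions made for illustration; the tolerance is only there so that a probability equal to the average still passes the "less than or equal" test after floating-point rounding.

```python
import math

# Predicted labels and probabilities for t1..t10, taken from the worked example.
# For a "stutter" label the value is the predicted stutter probability;
# for a "smooth" label it is the predicted smoothness probability.
times = [f"t{k}" for k in range(1, 11)]
labels = ["stutter"] * 5 + ["smooth"] * 5
probs = [0.6, 0.9, 0.7, 0.7, 0.6, 0.6, 0.9, 0.7, 0.8, 0.8]

EPS = 1e-9  # assumption: tolerance for the equality case under float rounding


def class_average(label):
    """Average predicted probability over the moments predicted as `label`."""
    values = [p for lab, p in zip(labels, probs) if lab == label]
    return math.fsum(values) / len(values)


def build_dirty_sets():
    """Collect low-confidence probabilities and the moments they belong to."""
    averages = {lab: class_average(lab) for lab in ("stutter", "smooth")}
    p_dirty, index_dirty = [], []
    for t, lab, p in zip(times, labels, probs):
        # Keep the probability if it does not exceed its class average.
        if p <= averages[lab] + EPS:
            p_dirty.append(p)
            index_dirty.append(t)
    return averages, p_dirty, index_dirty


averages, p_dirty, index_dirty = build_dirty_sets()
print(p_dirty)
print(index_dirty)
```

The two printed lists correspond to the first set P_dirty and the second set Index_dirty of the example.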

[0223] Step S18: The terminal device determines P probabilities from the first set P_dirty.

[0224] Each of the P probabilities is less than the other probabilities in the first set, and P is a positive integer.

[0225] Step S19: The terminal device updates P initial labels in the initial label set to obtain the updated initial label set.

[0226] The P initial labels correspond to the same moments as the P probabilities.

[0227] After determining that the first set P_dirty is {0.6, 0.7, 0.7, 0.6, 0.6, 0.7}, the terminal device can sort the probabilities in the first set in ascending order and take the first X% of them as the P probabilities. It then determines, from the second set Index_dirty = {t1, t3, t4, t5, t6, t8}, the set of moments {t_m} corresponding to the P probabilities, and updates the initial labels in the initial label set that correspond to the moments in {t_m}.

[0228] That is:

[0229]

[0230] Finally, the updated initial label set is obtained. This initial label set is used for the next round of iterative training.
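Steps S18 and S19 might look like the following Python sketch. Several points are assumptions, because the text does not fix them: the value of X, the rounding rule that turns X% into the integer P, the initial labels for t1 to t10 (hypothetical here), and the reading of "update" as flipping between stutter (1) and smooth (0).

```python
import math


def flip_lowest_confidence(initial_labels, p_dirty, index_dirty, x_percent):
    """Flip the initial labels at the P lowest-probability moments.

    initial_labels: dict mapping a moment name to 1 (stutter) or 0 (smooth).
    Returns the updated label dict and the list of moments that were changed.
    """
    # Assumption: P is X% of the first set, rounded up, and at least 1.
    p_count = max(1, math.ceil(len(p_dirty) * x_percent / 100))
    # sorted() is stable, so equal probabilities keep their time order.
    ranked = sorted(zip(p_dirty, index_dirty), key=lambda pair: pair[0])
    chosen = [t for _, t in ranked[:p_count]]
    updated = dict(initial_labels)  # leave the input untouched
    for t in chosen:
        updated[t] = 1 - updated[t]  # stutter <-> smooth
    return updated, chosen


p_dirty = [0.6, 0.7, 0.7, 0.6, 0.6, 0.7]
index_dirty = ["t1", "t3", "t4", "t5", "t6", "t8"]
# Hypothetical initial labels for t1..t10 (not given in the text).
initial_labels = {f"t{k}": (1 if k <= 5 else 0) for k in range(1, 11)}

updated, chosen = flip_lowest_confidence(
    initial_labels, p_dirty, index_dirty, x_percent=50
)
print(chosen)
```

With X = 50, three of the six probabilities are selected, which are the three values of 0.6.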

[0231] Figure 7 is a flowchart of another method for updating initial labels based on a confidence learning algorithm, provided in an embodiment of this application. The method may include the following steps.

[0232] Step S21: Input the feature vector set and the initial label set used for the first round of iterative training. The initial label set Y consists of the initial labels at the moments {t1, t2, ..., t_i, ...}; these labels are either manually labeled or automatically generated based on the log file.

[0233] Step S22: Train the machine learning classifier to learn the mapping relationship F_ml(·) between the feature vector set and the initial label set.

[0234] Step S23: From the moments {t1, t2, ..., t_i, ...}, extract every moment t_{i'} at which the initial label mutates, together with the feature vector set and the initial label set corresponding to the moments [t_{i'-T}, ..., t_{i'+T}] before and after each mutation moment t_{i'}.

[0235] Step S24: Using the mapping relationship F_ml(·), calculate the set of predicted stutter probabilities output by the machine learning classifier for these moments.

[0236] Step S25: Traverse the initial label set and the set of predicted stutter probabilities and, at each moment, calculate the cross-entropy between the initial label y_t and the predicted stutter probability p_t.

[0237] Step S26: Sort the samples in descending order of cross-entropy.

[0238] Step S27: Change the initial labels corresponding to the top X% of samples in the sorting result to the opposite label, for example, change a stutter label to a smooth label, or a smooth label to a stutter label. Here, X is a preset value, such as X = 5, 10, or 20.

[0239] Step S28: Save the modified labels to the initial label set.

[0240] Then, repeat steps S22 to S28 according to the preset number of iterations.

[0241] Step S29: Use the machine learning classifier from the last round as the final machine learning classifier.
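Step S23, which locates the mutation moments and the window of T moments on either side, can be illustrated with a small Python sketch. The function name, the use of list indices in place of moments, and the clipping of windows at the ends of the sequence are assumptions; a mutation is taken to be the first moment that carries the new label.

```python
def mutation_windows(labels, half_width):
    """Find label mutations and the index window around each one.

    labels: list of 0/1 initial labels ordered by time.
    half_width: the T in [t_{i'-T}, ..., t_{i'+T}].
    Returns a list of (mutation_index, window_indices) pairs.
    """
    windows = []
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:  # the label changes at position i
            start = max(0, i - half_width)  # assumption: clip at sequence ends
            stop = min(len(labels) - 1, i + half_width)
            windows.append((i, list(range(start, stop + 1))))
    return windows


labels = [0, 0, 0, 1, 1, 0, 0, 0]  # made-up initial labels
for index, window in mutation_windows(labels, half_width=2):
    print(index, window)
```

Only the samples inside these windows are passed on to the cross-entropy ranking, so labels far from any stutter boundary are never modified.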

[0242] For example, consider the first iteration of training the machine learning classifier based on the confidence learning algorithm. Assume that the initial label mutates at time t_{i'} and that T = 5. The correspondence between the initial label, predicted stutter probability, and cross-entropy at each moment before and after the mutation moment t_{i'} is shown in Table 2.

[0243] Table 2

[0244]

| Time | Initial label | Predicted stutter probability | Cross-entropy | Modified initial label |
| --- | --- | --- | --- | --- |
| t_{i'-5} | 0 | 0.3 | 0.3567 | 0 |
| t_{i'-4} | 0 | 0.2 | 0.2231 | 0 |
| t_{i'-3} | 0 | 0.1 | 0.1054 | 0 |
| t_{i'-2} | 0 | 0.3 | 0.3567 | 0 |
| t_{i'-1} | 0 | 0.6 | 0.9163 | 1 |
| t_{i'} | 1 | 0.6 | 0.5108 | 1 |
| t_{i'+1} | 1 | 0.9 | 0.1054 | 1 |
| t_{i'+2} | 1 | 0.7 | 0.3567 | 1 |
| t_{i'+3} | 1 | 0.7 | 0.3567 | 1 |
| t_{i'+4} | 0 | 0.5 | 0.6931 | 0 |
| t_{i'+5} | 0 | 0.1 | 0.1054 | 0 |

[0245] Referring to Table 2, the machine learning classifier can use the mapping relationship F_ml(·) to output the predicted stutter probability corresponding to each of the moments t_{i'-5}, t_{i'-4}, ..., t_{i'}, ..., t_{i'+4}, t_{i'+5}. Each moment is then traversed to calculate the cross-entropy between the initial label and the predicted stutter probability at that moment.

[0246] Specifically, the cross-entropy between the initial label and the predicted stutter probability at each moment can be calculated using the method provided in the embodiments of Figure 4 and Figure 5 above: when the initial label $y_t = 1$, the cross-entropy is $-\ln p_t$; when the initial label $y_t = 0$, the cross-entropy is $-\ln(1 - p_t)$. The moments are then sorted in descending order of cross-entropy. Assume the initial labels of the top 10% of the sorted samples are changed, i.e., one of the 11 samples in Table 2. Since the cross-entropy of 0.9163 at time t_{i'-1} is the highest among all cross-entropies, the initial label at time t_{i'-1} is changed from "0" to "1". The changed label is then saved to the initial label set; in other words, the label for time t_{i'-1} in the initial label set becomes "1". The machine learning classifier can then be trained iteratively until the final machine learning classifier is obtained.
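The Table 2 example can be checked with a short Python sketch of steps S25 to S27. The labels and probabilities come from Table 2 and the cross-entropy uses the natural logarithm, which reproduces the tabulated values; the function names and the rule for turning X% into a sample count (round to the nearest integer, at least one) are assumptions.

```python
import math

# Initial labels and predicted stutter probabilities from Table 2,
# ordered from t_{i'-5} to t_{i'+5}.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
probs = [0.3, 0.2, 0.1, 0.3, 0.6, 0.6, 0.9, 0.7, 0.7, 0.5, 0.1]


def cross_entropy(y, p):
    """Binary cross-entropy of label y against stutter probability p (0 < p < 1)."""
    return -math.log(p) if y == 1 else -math.log(1.0 - p)


def flip_top_percent(labels, probs, x_percent):
    """Flip the labels of the samples with the largest cross-entropy."""
    losses = [cross_entropy(y, p) for y, p in zip(labels, probs)]
    # Assumption: X% of the window is rounded to the nearest integer, minimum 1.
    count = max(1, round(len(labels) * x_percent / 100))
    order = sorted(range(len(labels)), key=lambda k: losses[k], reverse=True)
    updated = list(labels)
    for k in order[:count]:
        updated[k] = 1 - updated[k]  # stutter <-> smooth
    return losses, updated


losses, updated = flip_top_percent(labels, probs, x_percent=10)
print([round(v, 4) for v in losses])
print(updated)
```

With X = 10 and 11 samples, exactly one label is flipped: the one at t_{i'-1}, whose cross-entropy is the largest.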

[0247] It can be understood that, by changing the initial labels around some mutation moments, these labels are generated iteratively by the classifier itself rather than being fixed to the manually labeled values, which improves the generalization ability of the machine learning classifier.

[0248] Figure 8 is a schematic diagram of the process of obtaining a feature vector set according to an embodiment of this application. As shown in Figure 8, after the terminal device acquires the offline dataset, parameter normalization, offline feature calculation, and offline feature selection can be performed in sequence to obtain the feature vector set. The steps are explained below, taking as an example the acquisition of the feature vector set used to evaluate the QoE at time t_i.

[0249] Step S81: The terminal device obtains the offline dataset.

[0250] The offline dataset can include multiple offline parameters. These offline parameters are low-level parameters that the log file records during streaming video playback, covering the period from the initial playback time of at least one streaming video to time t, and that the terminal device obtains after playback of the at least one streaming video has ended. In other words, the offline parameters are terminal-side parameters obtained from the log file of the streaming video that has finished playing.

[0251] Specifically, in one example, the terminal device can obtain all offline parameters from the initial time t1, when the video app is first used to play a streaming video service, to the time t_i, when streaming video is last played, and these offline parameters form the offline dataset. In another example, the terminal device can obtain the offline parameters of streaming video services played (continuously or intermittently) with a video app from a certain earlier time t1 up to the time t_i at which streaming video was last played. For example, the offline parameters of multiple playbacks of streaming video services using a video app within the past day, week, or month can be obtained. These offline parameters form the offline dataset.

[0252] It should be noted that, when data is collected during continuous playback, the time interval between any two adjacent moments mentioned above can be the same, for example, 1 second. Alternatively, when data is collected during intermittent playback, the time intervals between some adjacent moments may differ.

[0253] Optionally, the aforementioned terminal-side parameters may include at least one of the following parameters:

[0254] The first type is the transport layer parameter. This parameter is the primary parameter for training machine learning models. It reflects the network transmission status when streaming video is played on a terminal device, such as the statistical characteristics of data packets per unit time.

[0255] For example, transport layer parameters may include the number of uplink TCP / UDP packets and the total packet length per unit time. As another example, transport layer parameters may include the number of downlink TCP / UDP packets, the total packet length, the RTT value per unit time, and the number and length of retransmitted packets per unit time, among other transport layer data. Here, "unit time" refers to the time interval between two equally spaced moments. Of course, transport layer parameters may also include other parameters, which are not limited in this embodiment.

[0256] The second type is the QoS parameter. This QoS parameter is also used to train machine learning models. It is used to evaluate the network's ability to provide services for streaming video.

[0257] For example, QoS parameters mainly include several important environmental parameters such as cellular and Wi-Fi signal strength and Wi-Fi negotiation rate. Of course, QoS parameters can also include availability, throughput, network transmission stability, reliability, latency, latency variation, jitter, transmission bit rate, bit error rate, transmission failure rate, security, guaranteed flow bit rate (GFBR), maximum flow bit rate (MFBR), averaging window, aggregate maximum bit rate (AMBR), allocation and retention priority (ARP), etc.

[0258] The third type is terminal parameters. These are the parameters of the terminal device itself when playing streaming video.

[0259] For example, terminal parameters may include audio track information, video track information, brand and model, operating system, identifier (ID), network operator, network access method, and / or IP address. Among these, the audio track information and video track information are network-independent and indicate the playback progress of the audio and video during streaming video playback. In this embodiment, initial labels can be generated based on the audio track information and video track information.

[0260] As an optional implementation, for each video app, the terminal device performs an offline training process based on the offline parameters of the streaming videos played by each video app within a preset time period. This determines the QoE evaluation model corresponding to each video app; that is, different QoE evaluation models are pre-created for different video apps. When a user plays a streaming video online using a video app, the QoE evaluation model corresponding to that video app can be used for real-time QoE evaluation.

[0261] As an alternative implementation, for multiple video apps, the terminal device performs an offline training process based on the offline parameters of the streaming videos played by these apps within a preset time period. This determines one QoE evaluation model shared by the multiple video apps; in other words, a single QoE evaluation model is pre-created for the multiple video apps. When a user plays a streaming video online using any of these video apps, this QoE evaluation model can be used for real-time QoE evaluation.

[0262] Step S82: The terminal device normalizes the offline parameters in the offline dataset.

[0263] Each offline parameter x_t in the offline dataset is normalized:

[0264]

[0265] Where G(·) is the normalization function.

[0266] After the terminal device normalizes each offline parameter in the offline dataset, the offline dataset is updated. Each item of sample data in the updated dataset represents the normalized value of the corresponding offline parameter.

[0267] Optionally, normalization methods may include, but are not limited to, standardization, interval scaling, and discretization.

[0268] 1) Standardization Methods

[0269] If the terminal device adopts the standardization method, each offline parameter has μ subtracted from it and is then divided by σ, so that the processed offline parameters have a mean of 0 and a standard deviation of 1, in line with a standard normal distribution. Here, μ is the expected value of all sample data, and σ is the standard deviation of all sample data.

[0270] 2) Interval scaling method

[0271] Interval scaling methods are suitable for sample data that do not conform to a normal distribution. Specifically, they can include:

[0272] Range scaling method:

[0273] Logarithmic method:

[0274] Square-root method:

[0275] If the terminal device uses the range scaling method, offline parameters whose values differ relatively little can be scaled to the [0,1] range. Here, max is the maximum value among all sample data, and min is the minimum value among all sample data.

[0276] If the terminal device uses the logarithmic method or the square-root method, offline parameters whose values differ greatly and are unbalanced can be compressed into a certain range.

[0277] 3) Discretization method

[0278] If the terminal device uses a discretization method, the continuous values of the offline parameter x_t can be bucketed to obtain discrete parameters. This increases the frequency of identical parameter values, reduces outliers, and effectively filters out abnormal values.

[0279] It should be understood that regardless of which normalization method the terminal device uses to normalize the offline parameters, it can convert dimensional offline parameters into dimensionless offline parameters, i.e., scalar parameters, thereby making the distribution of different parameters more similar, and thus improving the training efficiency and robustness of the model.
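The three families of normalization methods can be sketched in Python as follows. This is a minimal illustration, not the normalization function G(·) of the embodiment: the function names, the use of log(1 + x) for the logarithmic method, the bucket boundaries, and the RTT sample values are all assumptions.

```python
import math
import statistics


def standardize(values):
    """Z-score: subtract the mean and divide by the standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


def range_scale(values):
    """Range (min-max) scaling into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def log_scale(values):
    """Logarithmic compression; log(1 + v) is an assumption that tolerates zeros."""
    return [math.log1p(v) for v in values]


def sqrt_scale(values):
    """Square-root compression for non-negative values."""
    return [math.sqrt(v) for v in values]


def bucketize(values, edges):
    """Discretization: map each value to the index of its bucket.

    edges is an ascending list of hypothetical bucket boundaries.
    """
    return [sum(v >= e for e in edges) for v in values]


rtt_ms = [20.0, 40.0, 60.0, 80.0, 100.0]  # made-up RTT samples
print(range_scale(rtt_ms))
print(bucketize(rtt_ms, edges=[50.0, 90.0]))
```

Note that `standardize` and `range_scale` divide by zero when every sample has the same value, so a real pipeline would need to handle constant parameters separately.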

[0280] Step S83: The terminal device calculates the offline characteristics of the normalized offline parameters.

[0281] In this embodiment of the application, the offline features of the offline parameters can be represented by feature vectors.

[0282] After normalizing the offline parameters of the offline dataset, the terminal device can extract, from the offline parameters covering time t1 to time t_i, the feature vector used to evaluate the QoE at time t_i. The feature vector belongs to R^N, where R represents the real number field and N represents the number of features (the dimension) contained in each item of sample data. That is, the feature vector can be understood as an N-dimensional vector.

[0283] For example, the offline features used to evaluate the QoE at time t_i are represented as:

[0284]

[0285] Where D(·) is the offline feature extraction function.

[0286] Optionally, the offline features of the offline parameters may include, but are not limited to, at least one of window features, global features, and other features.

[0287] 1) Window features

[0288] The window features used to evaluate the QoE at time t_i can be represented as:

[0289]

[0290] Where D_window(·) is the window feature extraction function.

[0291] In this embodiment of the application, the window features depend only on the offline parameters input within the time window [t_i - T1 + 1, t_i]. The duration of this time window is T1, and the time window lies within [t1, t_i].

[0292] Optionally, the terminal device may use statistical methods such as (weighted) average, median, maximum, minimum, and standard deviation, or combinations of them, to calculate the window features from the offline parameters input within the time window [t_i - T1 + 1, t_i].

[0293] It is understandable that, since the duration T1 of the time window can be considered a constant, the time complexity and space complexity of calculating the window features are both O(1), i.e., constant level. Here, time complexity refers to the amount of computation required to execute the algorithm, and space complexity refers to the memory space required to execute the algorithm.
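A window feature calculation over the last T1 samples might be sketched as below. This is not the function D_window(·) itself: the function name, the dictionary of statistics, the shortened window near the start of the session, and the sample values are assumptions for illustration.

```python
import statistics


def window_features(samples, i, t1_len):
    """Statistics over the window [i - T1 + 1, i] of one normalized parameter.

    samples: values ordered by time (index 0 is the first moment).
    i: zero-based index of the moment being evaluated.
    t1_len: window duration T1, counted in samples.
    """
    start = max(0, i - t1_len + 1)  # assumption: shorter window near the start
    window = samples[start:i + 1]
    return {
        "mean": statistics.fmean(window),
        "median": statistics.median(window),
        "max": max(window),
        "min": min(window),
        "std": statistics.pstdev(window),
    }


throughput = [0.2, 0.4, 0.1, 0.8, 0.6, 0.5]  # made-up normalized samples
print(window_features(throughput, i=5, t1_len=3))
```

Because the slice never holds more than T1 values, the work per evaluated moment does not grow with the length of the session, which matches the constant-level complexity stated above.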

[0294] 2) Global features

[0295] The global features used to evaluate the QoE at time t_i can be represented as:

[0296]

[0297] Where D_whole(·) is the global feature extraction function.

[0298] In this embodiment of the application, the global features depend on the offline parameters input within the time period [t1, t_i], which is the total time period corresponding to the offline dataset.

[0299] Optionally, the terminal device may use methods such as (weighted) average, maximum, minimum, skewness, and kurtosis to calculate the global features from the offline parameters input within the time period [t1, t_i]. Skewness is a characteristic number describing the degree of asymmetry of the probability density curve relative to the mean, and kurtosis is a characteristic number describing the height of the peak of the probability density curve at the mean.

[0300] It should be noted that, theoretically, the time complexity and space complexity of calculating the global features are both O(t), i.e., linear in the service duration (t_i - t1): the longer the streaming video service lasts, the higher the time and space cost of calculating the global features.
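For comparison with the window case, a whole-session computation of the listed statistics could look like this Python sketch. It is not the function D_whole(·) itself; the function name and the moment-based definitions of skewness and kurtosis are assumptions, since the text does not give formulas for them.

```python
import statistics


def global_features(samples):
    """Whole-session statistics over the samples from t1 up to t_i."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Central moments of order 2, 3 and 4 (population form).
    m2 = sum((v - mean) ** 2 for v in samples) / n
    m3 = sum((v - mean) ** 3 for v in samples) / n
    m4 = sum((v - mean) ** 4 for v in samples) / n
    return {
        "mean": mean,
        "max": max(samples),
        "min": min(samples),
        # Assumption: moment-based definitions; both are undefined when m2 == 0.
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


print(global_features([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Every call walks the full list of samples, so both the time spent and the memory held grow with the length of the session, as described above.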

[0301] 3) Other characteristics

[0302] Features that are unrelated to network parameters can be referred to as other features. For example, other features include the duration of a streaming video service session.

[0303] It should be noted that the time complexity and space complexity of the real-time attributes used to record the entire streaming video service are both O(1), which is constant level.

[0304] Step S84: The terminal device performs feature selection on the offline features.

[0305] The extracted feature vectors contain some features with relatively high redundancy, and these features make the training of the machine learning classifier more complex and time-consuming. Therefore, the terminal device can adopt a feature selection algorithm to remove the highly redundant features and obtain a refined feature vector, which reduces the training complexity and speeds up training. Here, N' represents the refined dimension, and N' < N. That is, the refined feature vector can be understood as an N'-dimensional vector.

[0306] Optionally, the feature selection algorithm may include but is not limited to: methods based on mutual information, methods based on maximum correlation-minimum redundancy, and feature selection methods based on the wrapper method, etc.

[0307] In the above embodiments, after calculating the offline features of the offline data, by removing the features with high redundancy, a refined set of feature vectors can be obtained, thereby reducing the training complexity of the subsequent machine learning classifier and speeding up the training speed.
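The idea of discarding redundant features can be illustrated with a deliberately simple stand-in. The sketch below is not one of the named algorithms (mutual information, maximum correlation-minimum redundancy, or a wrapper method); it is a greedy Pearson-correlation filter, and the function names, the 0.9 threshold, and the feature values are assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length columns (0.0 if either is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def drop_redundant(columns, threshold=0.9):
    """Keep a feature only if it is not strongly correlated with one already kept.

    columns: list of feature columns, each a list with one value per sample.
    Returns the indices of the retained features (N' <= N of them).
    """
    kept = []
    for j, column in enumerate(columns):
        if all(abs(pearson(column, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


features = [
    [1.0, 2.0, 3.0, 4.0],  # feature 0
    [2.0, 4.0, 6.0, 8.0],  # feature 1, a scaled copy of feature 0
    [1.0, 0.0, 1.0, 0.0],  # feature 2
]
print(drop_redundant(features))
```

Feature 1 carries no information beyond feature 0 and is dropped, so the three-dimensional input is refined to two dimensions. Unlike the methods named above, this filter ignores the labels entirely, so it only removes redundancy and does not rank features by relevance.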

[0308] It should be noted that the above embodiments are described by taking the terminal device as an example of the entity that executes the training method of the machine learning classifier and the QoE evaluation method. It should be understood that in actual implementation, the server may also execute the training method of the machine learning classifier and the QoE evaluation method, which is not limited in the embodiments of the present application.

[0309] The above mainly introduces the solution provided by the embodiments of the present application from the perspective of the terminal device. It can be understood that in order for the terminal device to implement the above functions, it includes the corresponding hardware structure or software module for each function, or a combination of both. Those skilled in the art should easily realize that, combining the units and algorithm steps of each example described in the embodiments disclosed herein, the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed in the form of hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered to exceed the scope of the present application.

[0310] This application embodiment can divide the terminal device into functional modules according to the above method example. For example, each function can be divided into a separate functional module, or two or more functions can be integrated into one processing module. The integrated module can be implemented in hardware or as a software functional module. It should be noted that the module division in this application embodiment is illustrative and only represents one logical functional division. In actual implementation, there may be other division methods. The following description uses the example of dividing each functional module according to each function.

[0311] Figure 9 is a schematic diagram of the training device provided in an embodiment of this application. As shown in Figure 9, the training device 900 may include an acquisition module 901, a training module 902, and an update module 903.

[0312] The acquisition module 901 can be used to acquire a set of feature vectors, which includes multiple feature vectors obtained from the log file of the streaming video that has finished playing. One feature vector is used to evaluate the QoE at a certain moment.

[0313] Training module 902 is used to train a machine learning classifier in each iteration of training, based on the feature vector set and the initial label set used for that iteration. This results in a mapping between the feature vector and the QoE evaluation result at each time step. After a preset number of iterations, the final machine learning classifier is obtained. The initial label set includes multiple initial labels, one of which indicates whether the streaming video was choppy or smooth at a given time. The optimization objective of the machine learning classifier is to minimize the loss corresponding to the difference between the initial label and the QoE evaluation result at each time step. The final machine learning classifier is used to evaluate the QoE of the streaming video in real time.

[0314] The update module 903 can be used to update the initial labels in the initial label set that meet the preset conditions, so as to obtain the updated initial label set, which is used for the next round of iterative training.

[0315] Optionally, the preset conditions include: the time corresponding to the initial label is within the lag period.

[0316] Optionally, training module 902 can be used for:

[0317] For any moment within the lag period, if the predicted label for the target moment indicates lag, and the predicted lag probability for the target moment is less than or equal to the average lag probability, then the predicted lag probability for the target moment is added to the first set; or, if the predicted label for the target moment indicates smoothness, and the predicted smoothness probability for the target moment is less than or equal to the average smoothness probability, then the predicted smoothness probability for the target moment is added to the first set; wherein, the average lag probability is the average of the predicted lag probabilities corresponding to moments predicted as lag within the lag period, and the average smoothness probability is the average of the predicted smoothness probabilities corresponding to moments predicted as smooth within the lag period.

[0318] Determine P probabilities from the first set, where each of the P probabilities is less than the other probabilities in the first set, and P is a positive integer.

[0319] Update the P initial labels in the initial label set to obtain the updated initial label set, where the P initial labels correspond to the same time as the P probabilities.

[0320] Optionally, training module 902 can be used for:

[0321] For any moment within the lag time period, the predicted lag probability of the target moment is determined based on the mapping relationship between the feature vector corresponding to each moment and the QoE evaluation result; and the cross-entropy loss function is determined based on the initial label corresponding to the target moment and the predicted lag probability of the target moment; where the target moment is any moment within the lag time period.

[0322] After obtaining the cross-entropy loss function corresponding to each time step, the Q initial labels in the initial label set are updated to obtain the updated initial label set; where the cross-entropy loss function of the Q initial labels is greater than the cross-entropy loss function of the other initial labels, and Q is a positive integer.

[0323] Optionally, the update module 903 can be specifically used to update the first type of tags in the initial tag set that meet preset conditions to the second type of tags, and to update the second type of tags in the initial tag set that meet preset conditions to the first type of tags, thus obtaining an updated initial tag set. The first type of tags indicates that the streaming video that has finished playing is choppy at a certain moment, and the second type of tags indicates that the streaming video that has finished playing is smooth at a certain moment.

[0324] Optionally, the acquisition module 901 can be used to normalize offline data, which is terminal-side parameters obtained from the log file of a streaming video that has finished playing; extract all feature vectors of the normalized offline data; and remove redundant feature vectors from all the feature vectors to obtain the feature vector set.

[0325] The training apparatus in this application embodiment can correspond to the method described in this application embodiment related to the offline training process, and for the sake of brevity, it will not be described again here.

[0326] Figure 10 is a schematic diagram of the structure of a terminal device provided in an embodiment of this application. As shown in Figure 10, the terminal device may include a processor 201, which is coupled to a memory 203. The processor 201 is used to execute computer programs or instructions stored in the memory so that the terminal device implements the methods in the above embodiments.

[0327] The terminal device may also include a communication bus 202, a communication interface 204, an output device 205, and an input device 206.

[0328] The number of processors 201 can be one or more. A processor 201 may include at least one processing unit. For example, as shown in Figure 10, a processor may include at least one central processing unit (CPU). As another example, the processor may also include an image signal processor (ISP), a digital signal processor (DSP), a video codec, a neural network processing unit (NPU), a graphics processing unit (GPU), an application processor (AP), a modem processor, and / or a baseband processor. In some embodiments, different processing units may be independent devices or integrated into one or more processors.

[0329] The communication bus 202 may include a path for transmitting information between the processor 201, the memory 203, and the communication interface 204.

[0330] Communication interface 204, using any transceiver-like device, is used to communicate with other devices or communication networks, such as Ethernet, radio access network (RAN), or wireless local area network (WLAN). In this embodiment, communication interface 204 is mainly used to communicate with a server, for example, to transmit data packets for streaming video.

[0331] The memory 203 may be a read-only memory (ROM) or other type of static storage device capable of storing static information and instructions, random access memory (RAM) or other type of dynamic storage device capable of storing information and instructions, or electrically erasable programmable read-only memory (EEPROM), compact disk read-only memory (CD-ROM) or other optical disc storage, optical disc storage (including compressed optical discs, laser discs, optical discs, digital versatile optical discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible by a computer, but not limited thereto. The memory may exist independently and be connected to the processor via a bus. The memory may also be integrated with the processor.

[0332] The memory 203 stores the application code to be executed, such as the code of a video application, and execution is controlled by the processor 201. The processor 201 executes the application code stored in the memory 203 to implement the machine learning classifier training method in the above embodiments. In this embodiment of the application, the memory may also be used to store log files, machine learning classifiers, and the like.

[0333] Output device 205 communicates with processor 201 and can display information in various ways, such as displaying a streaming video playback interface. Output device 205 may include a display panel, such as a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), or a quantum dot light-emitting diode (QLED), etc.

[0334] Input device 206 communicates with processor 201 and can receive user input in various ways, such as receiving user input for streaming video playback. Input device 206 can be a mouse, keyboard, touchscreen, or sensing device, etc.

[0335] It should be understood that the terminal device shown in Figure 10 may correspond to the training device shown in Figure 9. Specifically, the processor 201 in the terminal device shown in Figure 10 may correspond to the acquisition module 901, the training module 902, and the update module 903 in the training device shown in Figure 9.

[0336] This application also provides a computer-readable storage medium storing computer instructions. When the computer instructions are run on a terminal device, they cause the terminal device to perform the method described above. The computer instructions can be stored in the computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. Available media can be magnetic media (e.g., floppy disks, hard disks, or magnetic tapes), optical media, or semiconductor media (e.g., solid-state drives (SSDs)).

[0337] This application also provides a computer program product, which includes computer program code that, when run on a computer, causes the computer to perform the methods described in the above embodiments.

[0338] This application also provides a chip coupled to a memory, which is used to read and execute computer programs or instructions stored in the memory to perform the methods described in the above embodiments. This chip can be a general-purpose processor or a dedicated processor.

[0339] It should be noted that the chip can be implemented using the following circuits or devices: one or more field programmable gate arrays (FPGAs), programmable logic devices (PLDs), controllers, state machines, gate logic, discrete hardware components, any other suitable circuits, or any combination of circuits capable of performing the various functions described throughout this application.

[0340] The terminal device, training device, computer-readable storage medium, computer program product, and chip provided in the embodiments of this application are all used to execute the methods provided above. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding methods provided above; these are not repeated here.

[0341] It should be understood that the above description is merely intended to help those skilled in the art better understand the embodiments of this application, and is not intended to limit the scope of the embodiments of this application. Based on the examples given above, those skilled in the art can obviously make various equivalent modifications or changes. For example, some steps in the various embodiments of the above method may be unnecessary, or new steps may be added. Alternatively, any two or more of the above embodiments may be combined. Such modifications, changes, or combinations also fall within the scope of the embodiments of this application.

[0342] It should also be understood that the above description of the embodiments of this application focuses on the differences between the various embodiments. Identical or similar parts that are not mentioned may be cross-referenced between embodiments. For the sake of brevity, they are not repeated here.

[0343] It should also be understood that the sequence number of each process does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

[0344] It should also be understood that in the embodiments of this application, "pre-setting" or "pre-defining" can be achieved by pre-saving, in a device (for example, a terminal device), corresponding code, a table, or other means that can be used to indicate the relevant information. This application does not limit the specific implementation.

[0345] It should also be understood that the methods, situations, categories, and classifications of embodiments in this application are for the convenience of description only and should not constitute a special limitation. Various methods, categories, situations, and features in embodiments can be combined without contradiction.

[0346] It should also be understood that, in the various embodiments of this application, unless otherwise specified or in case of logical conflict, the terms and / or descriptions between different embodiments are consistent and can be referenced by each other, and the technical features in different embodiments can be combined to form new embodiments according to their inherent logical relationships.

[0347] Finally, it should be noted that the above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any changes or substitutions within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A method for training a machine learning classifier, characterized in that the method comprises:
obtaining a feature vector set, wherein the feature vector set includes multiple feature vectors obtained from a log file of a streaming video that has finished playing, and each feature vector is used to evaluate the quality of experience (QoE) of the streaming video that has finished playing at one moment;
in the current round of iterative training, training the machine learning classifier based on the feature vector set and an initial label set to obtain a mapping relationship between the feature vectors and the QoE evaluation result corresponding to each moment, wherein: the initial label set is the label set obtained after the previous round of iterative training and includes multiple initial labels; each initial label indicates whether the streaming video that has finished playing is stuttering or smooth at one moment; the optimization objective of the machine learning classifier is to minimize the loss corresponding to the degree of difference between the initial label and the QoE evaluation result corresponding to each moment; the QoE evaluation result corresponding to a moment includes a predicted stutter probability, or a predicted label and a predicted stutter probability, or a predicted label and a predicted smoothness probability; a predicted label indicates whether the streaming video that has finished playing is stuttering or smooth at one moment; a predicted stutter probability indicates the probability that the streaming video that has finished playing is stuttering at one moment; and a predicted smoothness probability indicates the probability that the streaming video that has finished playing is smooth at one moment;
updating the initial labels in the initial label set that are located within a stutter time period to obtain an updated initial label set, wherein the updated initial label set is used for the next round of iterative training; and
after a preset number of iterations is reached, obtaining the final machine learning classifier, wherein the final machine learning classifier is used to evaluate the QoE of streaming video in real time;
wherein updating the initial labels in the initial label set that are located within the stutter time period comprises:
for any target moment within the stutter time period: if the predicted label for the target moment indicates stuttering, and the predicted stutter probability for the target moment is less than or equal to the average stutter probability, adding the predicted stutter probability for the target moment to a first set; or, if the predicted label for the target moment indicates smoothness, and the predicted smoothness probability for the target moment is less than or equal to the average smoothness probability, adding the predicted smoothness probability for the target moment to the first set; and updating P initial labels in the initial label set, wherein the P initial labels correspond to the same moments as P probabilities, the P probabilities are determined from the first set, and the P probabilities are less than the other probabilities in the first set;
or,
for any target moment within the stutter time period: determining the predicted stutter probability for the target moment based on the mapping relationship, and determining a cross-entropy loss based on the initial label corresponding to the target moment and the predicted stutter probability for the target moment; and, after the cross-entropy loss corresponding to each moment is obtained, updating Q initial labels in the initial label set, wherein the cross-entropy losses of the Q initial labels are greater than the cross-entropy losses of the other initial labels in the initial label set.
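
For illustration only, the round-by-round structure recited in claim 1 can be sketched in Python as follows. The names `fit`, `update_labels`, and `num_rounds`, and the 0/1 label encoding, are hypothetical placeholders; the claim does not prescribe a classifier type or programming interface.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Assumed encoding of the two label values (not specified by the claim).
STUTTER, SMOOTH = 1, 0


def iterative_training(
    features: Sequence[Sequence[float]],
    initial_labels: Sequence[int],
    fit: Callable[[Sequence[Sequence[float]], List[int]], Any],
    update_labels: Callable[[Any, Sequence[Sequence[float]], List[int]], List[int]],
    num_rounds: int,
) -> Tuple[Any, List[int]]:
    """Train for a preset number of rounds, refreshing the labels each round.

    `fit` (hypothetical) trains a classifier on the current labels and returns it.
    `update_labels` (hypothetical) returns the label set for the next round,
    for example by flipping low-confidence labels inside the stutter time period.
    """
    labels = list(initial_labels)  # labels used in the first round
    model = None
    for _ in range(num_rounds):
        model = fit(features, labels)                     # train on current labels
        labels = update_labels(model, features, labels)   # labels for next round
    return model, labels
```

Passing the training and label-update steps as callables keeps the loop independent of whichever of the two update alternatives in claim 1 is chosen.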

2. The method of claim 1, wherein the stutter time period includes any of the following: the transition moment and a preset duration preceding the transition moment; the transition moment and a preset duration following the transition moment; or the transition moment, a preset duration preceding the transition moment, and a preset duration following the transition moment; wherein the initial label corresponding to the transition moment is different from the initial label corresponding to the moment immediately before the transition moment, or the initial label corresponding to the transition moment is different from the initial label corresponding to the moment immediately after the transition moment.
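
As an illustrative sketch, the time period of claim 2 can be located by scanning the initial labels for moments at which the label changes and expanding a window around each one. The function name, the use of list indices as moments, and the expression of the preset durations as a number of time steps (`before`, `after`) are assumptions. The sketch uses the first of the two alternative definitions in the claim (the label differs from that of the immediately preceding moment).

```python
from typing import List, Sequence, Set


def stutter_period_indices(labels: Sequence[int], before: int, after: int) -> List[int]:
    """Return the sorted indices of all moments that fall inside a window
    around a moment where the initial label changes.

    before / after: assumed preset durations, counted in time steps.
    Setting one of them to 0 gives the "preceding only" or "following only"
    variants listed in the claim.
    """
    n = len(labels)
    selected: Set[int] = set()
    for t in range(1, n):
        # A change is detected when the label differs from the previous moment.
        if labels[t] != labels[t - 1]:
            lo = max(0, t - before)      # clamp the window to the sequence start
            hi = min(n - 1, t + after)   # clamp the window to the sequence end
            selected.update(range(lo, hi + 1))
    return sorted(selected)


if __name__ == "__main__":
    initial = [0, 0, 0, 1, 1, 1, 1, 0, 0]
    print(stutter_period_indices(initial, before=1, after=1))
```

Overlapping windows are merged automatically because the indices are collected in a set.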

3. The method of claim 1, wherein the average stutter probability is the average of the predicted stutter probabilities corresponding to the moments predicted as stuttering within the stutter time period, and the average smoothness probability is the average of the predicted smoothness probabilities corresponding to the moments predicted as smooth within the stutter time period.

4. The method of claim 1, wherein the step of updating the P initial labels in the initial label set includes: updating each first-type label among the P initial labels to a second-type label, and updating each second-type label among the P initial labels to a first-type label; wherein the first-type label indicates that the streaming video that has finished playing is stuttering at a certain moment, and the second-type label indicates that the streaming video that has finished playing is smooth at a certain moment.
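
The probability-based update described in claims 1, 3, and 4 can be sketched as follows, for illustration only. The sketch assumes a binary classifier, so the predicted smoothness probability is taken as one minus the predicted stutter probability; the function name, the 0/1 label encoding, and the `period` argument (indices of moments inside the stutter time period) are likewise assumptions.

```python
from typing import List, Sequence, Tuple

# Assumed encoding of the two label types.
STUTTER, SMOOTH = 1, 0


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def update_low_confidence_labels(
    labels: Sequence[int],
    pred_labels: Sequence[int],
    stutter_probs: Sequence[float],
    period: Sequence[int],
    p: int,
) -> List[int]:
    """Flip the P initial labels whose predictions are least confident."""
    # Confidence of a prediction = probability of the predicted class.
    def confidence(t: int) -> float:
        prob = stutter_probs[t]
        return prob if pred_labels[t] == STUTTER else 1.0 - prob

    # Averages per predicted class, computed inside the period only (claim 3).
    avg_stutter = _mean([confidence(t) for t in period if pred_labels[t] == STUTTER])
    avg_smooth = _mean([confidence(t) for t in period if pred_labels[t] == SMOOTH])

    # "First set": probabilities that do not exceed the average of their class.
    first_set: List[Tuple[float, int]] = []
    for t in period:
        avg = avg_stutter if pred_labels[t] == STUTTER else avg_smooth
        if confidence(t) <= avg:
            first_set.append((confidence(t), t))

    # Take the P smallest probabilities and flip the matching initial labels (claim 4).
    first_set.sort()
    updated = list(labels)
    for _, t in first_set[:p]:
        updated[t] = SMOOTH if updated[t] == STUTTER else STUTTER
    return updated
```

The input label list is copied rather than modified in place, so the labels of the current round remain available for comparison.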

5. The method according to claim 1, characterized in that the step of updating the Q initial labels in the initial label set includes: updating each first-type label among the Q initial labels to a second-type label, and updating each second-type label among the Q initial labels to a first-type label; wherein the first-type label indicates that the streaming video that has finished playing is stuttering at a certain moment, and the second-type label indicates that the streaming video that has finished playing is smooth at a certain moment.
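
The loss-based alternative of claims 1 and 5 can be sketched in the same way. This is a hedged illustration that assumes the standard binary cross-entropy with the stuttering label encoded as 1; the function name, the `period` argument, and the clipping constant `eps` are assumptions that do not appear in the claims.

```python
import math
from typing import List, Sequence, Tuple


def update_high_loss_labels(
    labels: Sequence[int],
    stutter_probs: Sequence[float],
    period: Sequence[int],
    q: int,
    eps: float = 1e-12,
) -> List[int]:
    """Flip the Q initial labels with the largest cross-entropy loss.

    labels:        initial labels, 1 = stuttering, 0 = smooth (assumed encoding)
    stutter_probs: predicted stutter probability for each moment
    period:        indices of moments inside the stutter time period
    """
    losses: List[Tuple[float, int]] = []
    for t in period:
        # Clip the probability so that log() never receives 0.
        prob = min(max(stutter_probs[t], eps), 1.0 - eps)
        y = labels[t]
        loss = -(y * math.log(prob) + (1 - y) * math.log(1.0 - prob))
        losses.append((loss, t))

    # Largest loss first: these labels disagree most with the classifier.
    losses.sort(reverse=True)
    updated = list(labels)
    for _, t in losses[:q]:
        updated[t] = 1 - updated[t]  # swap first-type and second-type labels
    return updated
```

A large loss means the classifier assigns a low probability to the current initial label, which is why those labels are the candidates for flipping.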

6. The method of claim 1, wherein the initial label set used for the first round of iterative training is either manually labeled or automatically generated based on log files of streaming videos that have finished playing.

7. The method according to any one of claims 1 to 6, characterized in that obtaining the feature vector set includes: normalizing offline data, wherein the offline data is terminal-side parameters obtained from the log files of streaming video that has finished playing; extracting all feature vectors from the normalized offline data; and removing redundant feature vectors from all the feature vectors to obtain the feature vector set.
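
The claim does not name a normalization method or a redundancy criterion. As one possible illustration, the sketch below assumes min-max normalization per parameter and treats a feature column as redundant when its absolute Pearson correlation with an already retained column reaches a threshold. All function names and the 0.95 threshold are hypothetical.

```python
import math
from typing import List, Sequence


def min_max_normalize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every column (one terminal-side parameter) to the range [0, 1]."""
    out_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to 0.0.
        out_cols.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(r) for r in zip(*out_cols)]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def non_redundant_columns(rows: Sequence[Sequence[float]], threshold: float = 0.95) -> List[int]:
    """Return indices of columns that are not highly correlated with an
    earlier retained column (assumed redundancy criterion)."""
    cols = [list(c) for c in zip(*rows)]
    kept: List[int] = []
    for j, col in enumerate(cols):
        if all(abs(pearson(col, cols[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def select_columns(rows: Sequence[Sequence[float]], kept: Sequence[int]) -> List[List[float]]:
    """Build the final feature vectors from the retained columns only."""
    return [[row[j] for j in kept] for row in rows]
```

In this sketch each row of the normalized data is treated as the feature vector for one moment, so removing redundancy amounts to dropping columns before the vectors are passed to training.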

8. The method of claim 7, wherein the terminal-side parameters include at least one of the following: transport layer parameters, which reflect the network transmission status during playback of the streaming video that has finished playing; quality of service (QoS) parameters, which are used to evaluate the network's ability to provide service for the streaming video that has finished playing; and terminal parameters, which are the parameters of the terminal device itself during playback of the streaming video that has finished playing.

9. A terminal device, comprising a processor coupled to a memory, the processor being configured to execute a computer program or instructions stored in the memory, such that the terminal device implements the training method for a machine learning classifier as described in any one of claims 1 to 8.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when run on a terminal device, causes the terminal device to perform the training method for a machine learning classifier as described in any one of claims 1 to 8.