Method, apparatus, electronic device, medium and program for processing ringback tone media streams

By carrying display attributes in the preset keyframe images of the ringback tone media stream and using SIP signaling to identify the terminal type, the problem of the calling terminal being unable to determine whether to disable or invoke 3D rendering capabilities is solved, enabling the reasonable playback of ringback tone media streams on different terminals and displaying video ringback tones with naked-eye 3D effects.

CN118803142B · Active · Publication Date: 2026-06-26 · CHINA MOBILE COMM LTD RES INST +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHINA MOBILE COMM LTD RES INST
Filing Date
2024-06-21
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

After the calling terminal is upgraded to support naked-eye 3D display, it is impossible to determine whether to disable or activate the 3D rendering capability, resulting in the inability to play video ringback tones according to their type.

Method used

By carrying display attributes, including the type of the ringback tone media stream and the layout type of the 3D video data, in the preset keyframe images of the ringback tone media stream, the calling terminal type is identified using the Contact field in SIP signaling, the type of the ringback tone media stream to be sent is determined, and corresponding processing is performed in the calling terminal.

Benefits of technology

It enables the reasonable disabling or activation of 3D rendering capabilities based on the type of ringback tone media stream in scenarios where naked-eye 3D terminals and 2D terminals coexist, ensuring that the calling terminal can display video ringback tones with naked-eye 3D effects.

✦ Generated by Eureka AI based on patent content.


Abstract

Embodiments of the present application disclose a method, apparatus, electronic device, medium and program for processing a ringback tone media stream. The method, applied to a ringback tone call node, includes: determining the type of the ringback tone media stream to be sent; and, when a preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream, sending the ringback tone media stream to the calling terminal; wherein the display attributes include at least the type of the ringback tone media stream.

Description

Technical Field

[0001] This application belongs to the field of video processing technology, and specifically relates to a method, apparatus, electronic device, medium and program for processing ringback tone media streams.

Background Technology

[0002] In related technologies, the implementation logic of the called party video ringback tone service involves the called party setting the ringback tone content and sending video data to the calling terminal through the ringback tone platform of the called party domain. When the calling terminal is upgraded to a terminal that supports naked-eye 3D display, it can receive 2D or 3D video data. However, the calling terminal cannot know the type of video ringback tone to be played. Therefore, it cannot determine whether to disable or invoke 3D rendering capabilities for the received video data, which is not conducive to playing video ringback tones according to their type.

Summary of the Invention

[0003] This application provides a method, apparatus, electronic device, medium, and program for processing ringback tone media streams.

[0004] This application provides a method for processing ringback tone media streams, applied in a ringback tone call node, the method comprising:

[0005] Determine the type of the ringback tone media stream to be sent;

[0006] When the preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream, the ringback tone media stream is sent to the calling terminal; wherein, the display attributes include at least the type of the ringback tone media stream.

[0007] In some embodiments, when the type of the ringback tone media stream is three-dimensional video data, the display attributes also include the layout type of the three-dimensional video data. It can be seen that after the ringback tone media stream is sent to the calling terminal, if the display attributes also include the layout type of the three-dimensional video data, the calling terminal can process the ringback tone media stream according to the layout type of the three-dimensional video data, which is beneficial for displaying video ringback tones with naked-eye 3D effects on the calling terminal.

[0008] In some embodiments, the preset keyframe image is the first frame image. Thus, after sending the first frame image of the ringback tone media stream to the calling terminal, the calling terminal can determine the display attributes of the ringback tone media stream based on the first frame image, and thereby reasonably determine the image processing method for the non-first frame images of the ringback tone media stream based on the display attributes.

[0009] In some embodiments, determining the type of the ringback tone media stream to be sent includes: receiving Session Initiation Protocol (SIP) signaling sent by the calling terminal, wherein the SIP signaling carries the terminal type of the calling terminal; and determining the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal. It can be seen that the embodiments of this application can utilize the SIP signaling sent by the calling terminal to indicate the terminal type of the calling terminal, thereby allowing the ringback tone calling node to directly determine the terminal type of the calling terminal based on the SIP signaling.

[0010] In some embodiments, the terminal type of the calling terminal is located in the Contact field of the SIP signaling. Since the Contact field in SIP signaling is defined to identify a terminal or call capability, it is suitable for identifying the type of the calling terminal.
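As a rough illustration of carrying the terminal type in the Contact field, the following Python sketch parses the header parameters of a Contact value and checks for a capability tag. The tag name `+g.example.naked-eye-3d` and both function names are hypothetical; the source does not specify which Contact parameter is used. The parser assumes the name-addr form with angle brackets and does not handle quoted values that contain semicolons.

```python
# Hypothetical capability tag; the source does not name the actual parameter.
TAG_3D = "+g.example.naked-eye-3d"


def parse_contact_params(contact_value: str) -> dict:
    """Split a Contact header value into its URI and its header parameters."""
    # Header parameters follow the closing '>' of the name-addr form.
    uri_part, _, param_part = contact_value.partition(">")
    params = {}
    for item in param_part.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        params[name.strip().lower()] = value.strip().strip('"')
    return {"uri": uri_part.lstrip("< ").strip(), "params": params}


def supports_naked_eye_3d(contact_value: str) -> bool:
    """Return True if the (assumed) 3D capability tag is present."""
    return TAG_3D in parse_contact_params(contact_value)["params"]
```

For example, `supports_naked_eye_3d('<sip:uea@example.com>;+g.example.naked-eye-3d')` would report a naked-eye 3D terminal under these assumptions.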

[0011] In some embodiments, determining the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal includes: when the terminal type of the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, determining the type of the ringback tone media stream to be sent as 3D video data; when the terminal type of the calling terminal is a terminal that does not support naked-eye 3D display, or when the called terminal has not subscribed to the 3D video ringback tone service, determining the type of the ringback tone media stream to be sent as 2D video data.

[0012] Understandably, when the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, it means that the condition for playing the 3D video ringback tone on the calling terminal is met. Therefore, the type of the ringback tone media stream to be sent can be reasonably determined as 3D video data. When the calling terminal is a terminal that does not support naked-eye 3D display, or when the called terminal has not subscribed to the 3D video ringback tone service, it means that the condition for playing the 3D video ringback tone on the calling terminal is not met. In this case, the type of the ringback tone media stream to be sent can be reasonably determined as 2D video data.
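The decision rule described above can be sketched in Python as follows. The names `StreamType` and `determine_stream_type` are illustrative and do not come from the source; the two boolean inputs stand for the terminal type reported by the calling terminal and the subscription status of the called terminal.

```python
from enum import Enum


class StreamType(Enum):
    VIDEO_2D = "2D"
    VIDEO_3D = "3D"


def determine_stream_type(caller_supports_3d: bool,
                          callee_subscribed_3d: bool) -> StreamType:
    """Choose the type of the ringback tone media stream to be sent."""
    # 3D is sent only when both conditions hold; every other case falls back to 2D.
    if caller_supports_3d and callee_subscribed_3d:
        return StreamType.VIDEO_3D
    return StreamType.VIDEO_2D
```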

[0013] In some embodiments, before determining the type of the ringback tone media stream to be sent, the method further includes: when the calling terminal's terminal type is a terminal supporting naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has not set the ringback tone content of the 3D video ringback tone, determining whether the ringback tone content of the 2D video ringback tone set by the called terminal can be obtained; when the ringback tone content of the 2D video ringback tone set by the called terminal can be obtained, converting the ringback tone content of the 2D video ringback tone from 2D video data to 3D video data to obtain the ringback tone media stream.

[0014] It can be seen that when the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, the conditions for playing 3D video ringback tones on the calling terminal are met. In this case, if the called terminal has not set 3D video ringback tone content but has set 2D video ringback tone content, the ringback tone media stream can be obtained by converting the 2D video ringback tone content from 2D video data to 3D video data. This allows the calling terminal to display a ringback tone with naked-eye 3D effect based on the received ringback tone media stream.

[0015] In some embodiments, before determining the type of the ringback tone media stream to be sent, the method further includes: when the calling terminal's terminal type is a terminal supporting naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has set the ringback tone content of the 3D video ringback tone, generating the ringback tone media stream according to the ringback tone content of the 3D video ringback tone; when the calling terminal's terminal type is a terminal supporting naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has not set the ringback tone content of the 3D video ringback tone, determining whether the ringback tone content of the 2D video ringback tone set by the called terminal can be obtained; when the ringback tone content of the 2D video ringback tone set by the called terminal cannot be obtained, generating the ringback tone media stream according to the default ringback tone content of the 3D video ringback tone.

[0016] It can be seen that when the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, the conditions for playing a 3D video ringback tone on the calling terminal are met. In this case, if the called terminal sets the ringback tone content for a 3D video ringback tone, the ringback tone media stream can be easily generated based on that content. Conversely, when the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, the conditions for playing a 3D video ringback tone on the calling terminal are also met. In this case, if the called terminal has not set either 3D or 2D video ringback tone content, the ringback tone media stream can be generated based on the default 3D video ringback tone content, thus enabling the calling terminal to display a ringback tone with naked-eye 3D effects.
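A minimal Python sketch of the content-selection order described above, assuming the calling terminal supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service. The function name, its parameters, and the `convert_2d_to_3d` callback are hypothetical; the source does not specify how the 2D-to-3D conversion is performed.

```python
from typing import Callable, Optional


def build_3d_ringback_stream(content_3d: Optional[bytes],
                             content_2d: Optional[bytes],
                             default_3d: bytes,
                             convert_2d_to_3d: Callable[[bytes], bytes]) -> bytes:
    """Pick the source of the 3D ringback tone media stream."""
    # 1) The called terminal has set 3D content: use it directly.
    if content_3d is not None:
        return content_3d
    # 2) Only 2D content can be obtained: convert it to 3D video data.
    if content_2d is not None:
        return convert_2d_to_3d(content_2d)
    # 3) Neither is available: fall back to the default 3D content.
    return default_3d
```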

[0017] This application also provides another method for processing ringback tone media streams, applied in the calling terminal, the method including:

[0018] Receive ringback tone media stream;

[0019] Extract the preset keyframe images from the ringback tone media stream;

[0020] The ringback tone media stream is played according to the display attributes of the ringback tone media stream carried by the preset keyframe image; wherein, the display attributes include at least the type of the ringback tone media stream.
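On a naked-eye 3D calling terminal, the playback decision driven by the display attributes might look like the Python sketch below. `DisplayAttributes` and `plan_playback` are illustrative names, and the string values for the stream type are assumptions rather than values defined by the source.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DisplayAttributes:
    stream_type: str               # assumed values: "2D" or "3D"
    layout: Optional[str] = None   # only meaningful for 3D video data


def plan_playback(attrs: DisplayAttributes) -> Tuple[bool, Optional[str]]:
    """Return (enable_3d_rendering, layout) for a naked-eye 3D terminal."""
    if attrs.stream_type == "3D":
        # Invoke the 3D rendering capability and pass the layout along.
        return True, attrs.layout
    # 2D video data: disable 3D rendering and play as ordinary 2D.
    return False, None
```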

[0021] In some embodiments, when the type of the ringback tone media stream is three-dimensional video data, the display attributes also include the layout type of the three-dimensional video data. It can be seen that after the calling terminal receives the ringback tone media stream, if the display attributes also include the layout type of the three-dimensional video data, the calling terminal can process the ringback tone media stream according to the layout type of the three-dimensional video data, which is beneficial for displaying video ringback tones with naked-eye 3D effects on the calling terminal.

[0022] In some embodiments, the preset keyframe image is the first frame image. Thus, after sending the first frame image of the ringback tone media stream to the calling terminal, the calling terminal can determine the display attributes of the ringback tone media stream based on the first frame image, and thereby reasonably determine the image processing method for the non-first frame images of the ringback tone media stream based on the display attributes.

[0023] In some embodiments, before receiving the ringback tone media stream, the method further includes: sending SIP signaling to the ringback tone calling node, wherein the SIP signaling carries the terminal type of the calling terminal; the terminal type of the calling terminal is used by the ringback tone calling node to determine the type of the ringback tone media stream. It can be seen that the embodiments of this application can utilize the SIP signaling sent by the calling terminal to indicate the terminal type of the calling terminal, thereby allowing the ringback tone calling node to directly determine the terminal type of the calling terminal based on the SIP signaling.

[0024] In some embodiments, the terminal type of the calling terminal is located in the Contact field of the SIP signaling. Since the Contact field in SIP signaling is defined to identify a terminal or call capability, it is suitable for identifying the type of the calling terminal.

[0025] This application embodiment also provides a processing device for ringback tone media streams, applied in a ringback tone call node, the device comprising:

[0026] The determination module is used to determine the type of the ringback tone media stream to be sent;

[0027] A sending module is configured to send the ringback tone media stream to the calling terminal when the preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream; wherein the display attributes include at least the type of the ringback tone media stream.

[0028] This application embodiment also provides another processing device for ringback tone media streams, applied in a calling terminal, the device comprising:

[0029] The transceiver module is used to receive ringback tone media streams;

[0030] The processing module is used to extract preset keyframe images of the ringback tone media stream; and play the ringback tone media stream according to the display attributes of the ringback tone media stream carried by the preset keyframe images; wherein, the display attributes include at least the type of the ringback tone media stream.

[0031] This application also provides an electronic device, which includes a processor and a memory for storing a computer program that can run on the processor; wherein the processor is used to run the computer program to execute any of the above-described methods for processing ringback tone media streams.

[0032] This application also provides a computer storage medium storing a computer program, which, when executed by a processor, implements any of the above-described methods for processing ringback tone media streams.

[0033] This application also provides a computer program product, including a computer program, characterized in that the computer program, when executed by a processor, implements any of the above-described methods for processing ringback tone media streams.

[0034] As can be seen, since the preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream, and the display attributes include at least the type of the ringback tone media stream, after the ringback tone media stream is sent to the calling terminal, the calling terminal can determine whether to disable or call the 3D rendering capability based on the type of the ringback tone media stream, so as to play the ringback tone media stream according to its type.

Attached Figure Description

[0035] Figure 1 is an interaction flowchart of ringback tone media stream transmission in the related technologies;

[0036] Figure 2 is a flowchart of a method for processing ringback tone media streams applied to a ringback tone call node, as provided in an embodiment of this application;

[0037] Figure 3A is a schematic diagram of a first layout type of three-dimensional video data provided in an embodiment of this application;

[0038] Figure 3B is a schematic diagram of a second layout type of three-dimensional video data provided in an embodiment of this application;

[0039] Figure 3C is a schematic diagram of a third layout type of three-dimensional video data provided in an embodiment of this application;

[0040] Figure 3D is a schematic diagram of a fourth layout type of three-dimensional video data provided in an embodiment of this application;

[0041] Figure 4 is a flowchart of a ringback tone calling node sending a ringback tone media stream in an embodiment of this application;

[0042] Figure 5 is a flowchart of establishing a ringback tone media stream in an embodiment of this application;

[0043] Figure 6 is a flowchart of a method for processing ringback tone media streams applied to a calling terminal, as provided in an embodiment of this application;

[0044] Figure 7 is a flowchart of the calling terminal's processing of the ringback tone media stream in an embodiment of this application;

[0045] Figure 8 is an interaction flowchart of establishing a ringback tone call in an embodiment of this application;

[0046] Figure 9 is a schematic structural diagram of a ringback tone media stream processing device applied to a ringback tone call node according to an embodiment of this application;

[0047] Figure 10 is a schematic structural diagram of a ringback tone media stream processing device applied to a calling terminal according to an embodiment of this application;

[0048] Figure 11 is a schematic diagram of the composition structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0049] Glasses-free (naked-eye) 3D display technology encodes and renders 3D video data and then uses optical elements such as lenses and gratings to project separate images to the viewer's left and right eyes, creating a 3D visual experience without wearable devices. Terminals equipped with this technology are called glasses-free 3D terminals; they can display 2D video data in ordinary 2D and 3D video data in 3D. A glasses-free 3D ringback tone is a video ringback tone displayed with a glasses-free 3D effect on a glasses-free 3D terminal. Since call terminals include both 2D terminals (i.e., terminals that do not support glasses-free 3D display) and glasses-free 3D terminals, and ringback tone materials can include both 2D and 3D video data, a ringback tone processing method compatible with different types of terminals and different types of ringback tone materials is needed. For example, a method is needed to play glasses-free 3D ringback tones while remaining compatible with different types of ringback tone materials. Current ringback tone playback technologies only need to play 2D video data on 2D terminals. They do not consider judging the type of the terminal or of the media stream, or executing the related logic, and therefore cannot support glasses-free 3D ringback tone services. The related technologies offer no naked-eye 3D ringback tone technology that is compatible with different types of terminals and different types of ringback tone materials.

[0050] The first technical solution of the related technology provides a method, apparatus, device, terminal, and ringback tone platform for playing video ringback tones, to solve the problem that users in the metaverse cannot interact and communicate with users in the real world. The method is applied to a metaverse device and includes: after receiving a trigger input in a three-dimensional virtual space, instructing the terminal to initiate a call request; the call request carrying the virtual mobile user identification code of the metaverse device; receiving a panoramic video ringback tone sent by the ringback tone platform based on the virtual mobile user identification code; and playing the panoramic video ringback tone in the three-dimensional virtual space. The first technical solution of the related technology discloses a three-dimensional ringback tone method, but it only supports access from metaverse devices and does not consider the simultaneous existence of two-dimensional terminals and naked-eye three-dimensional terminals.

[0051] A second technical solution in the related field provides a video call method, device, and system. The method includes: receiving video and audio data from the other end of the call sent by a local communication device; when the data type identifier in the video data is a two-dimensional video data identifier, converting the video data into three-dimensional video data; displaying the three-dimensional stereoscopic image corresponding to the three-dimensional video data and the video image acquired by the local camera on the same display device; and simultaneously outputting the audio corresponding to the audio data. This achieves a stereoscopic immersive video call using a virtual reality device and a local communication device, improving the user's video call experience. This solution describes a method for stereoscopic immersive video calls and provides a method for indicating data types through data identifiers. However, this solution does not consider the logic for handling two-dimensional and three-dimensional media streams in a three-dimensional ringback tone playback scenario, and only proposes a type identification method without specific implementation logic.

[0052] In the scenario of playing 3D video ringback tones, there are often 2D terminals and naked-eye 3D terminals at the same time. When the ringback tones are transmitted over the network, the network side makes a lot of modifications and overwrites to the ringback tones media stream based on business logic. The technical solutions of related technologies do not take into account such situations and are not adaptable to actual scenarios.

[0053] The technical solutions of the relevant technologies do not provide a reasonable notification method for the types of ringback tone materials. The technical solutions of the relevant technologies provide a three-dimensional transmission system for the other end, but do not provide a method for the ringback tone call node to process media streams in scenarios where two-dimensional terminals and naked-eye three-dimensional terminals coexist, or in scenarios where two-dimensional ringback tone media streams (ringback tone media streams of type two-dimensional video data) and three-dimensional ringback tone media streams (ringback tone media streams of type three-dimensional video data) coexist.

[0054] In the video ringback tone playback process of related technologies, a 2D ringback tone media stream can be played uniformly on the terminal after successful media negotiation. In scenarios where both 2D and glasses-free 3D terminals coexist, it is necessary to play the traditional 2D ringback tone media stream on the 2D terminal and the 3D ringback tone media stream on the glasses-free 3D terminal, thereby achieving compatibility between 2D and 3D ringback tone services. However, in related technologies, the implementation logic of the called party's video ringback tone service involves the called party setting the ringback tone content and sending video data to the calling terminal through the ringback tone platform of the called party's domain. When the calling terminal is upgraded to a terminal supporting glasses-free 3D display, for called parties who have activated a 3D ringback tone package (for playing 3D ringback tone media streams) and set 3D ringback tone content, the calling terminal will receive the 3D ringback tone media stream; for called parties who have activated a 3D ringback tone package but have not set 3D ringback tone content, the calling terminal should receive the 3D ringback tone media stream; for called parties who have not activated a 3D ringback tone package but have set 2D ringback tone content, the calling terminal will receive the 2D ringback tone media stream. The calling terminal cannot know the type of video ringback tone to be played. Therefore, it cannot determine whether to disable or invoke 3D rendering capabilities for the received video data, which is not conducive to playing video ringback tones according to the type requirements.

[0055] The technical problems existing in the related technologies are illustrated below with reference to Figure 1.

[0056] Figure 1 is an interaction flowchart of ringback tone media stream transmission in the related technologies. As shown in Figure 1, the nodes in the calling domain include UEa and the Serving-Call Session Control Function (S-CSCF), while the nodes in the called domain include the ringback tone calling node and UEb. The nodes in the called domain also include the S-CSCF or the Interrogating-Call Session Control Function (I-CSCF). UEa is the calling terminal, and UEb is the called terminal.

[0057] Referring to Figure 1, after a node in the calling domain sends an INVITE message to a node in the called domain, the node in the called domain can reply with a 183 message to the node in the calling domain. Then, the calling and called domains can exchange PRACK and UPDATE messages. After exchanging PRACK and UPDATE messages, UEb can send a 180 message to the I-CSCF or S-CSCF, and the I-CSCF or S-CSCF can reply with a 180 message to the ringback tone calling node. The ringback tone calling node can send an UPDATE message to the calling domain, and UEa can send a 200 message to the ringback tone calling node. After receiving the 200 message, the ringback tone calling node can execute the ringback tone playback process.

[0058] In the interaction flow of Figure 1 for ringback tone media stream transmission, the ringback tone call node cannot obtain the terminal type and cannot determine the type of ringback tone media stream to be sent to the calling terminal; the calling terminal cannot know the type of video ringback tone to be played and cannot determine whether to disable or invoke 3D rendering capabilities for the received ringback tone media stream. The ringback tone call node cannot execute the following logic: 1) Determine the type of the ringback tone media stream to be sent in a scenario where 2D terminals and naked-eye 3D terminals coexist; 2) Determine the ringback tone service type of the called terminal when a 3D ringback tone media stream can be sent; 3) Obtain the ringback tone content when the called terminal subscribes to the 3D ringback tone service; 4) Handle the case where there is no 3D ringback tone content.

[0059] In view of the above-mentioned technical problems, the technical solutions of the embodiments of this application are proposed.

[0060] The embodiments of this application will be further described in detail below with reference to the accompanying drawings and examples. It should be understood that the embodiments provided herein are merely illustrative of the embodiments of this application and are not intended to limit the embodiments of this application. Furthermore, the embodiments provided below are some embodiments for implementing this application, and not all embodiments for implementing this application. Unless otherwise specified, the technical solutions described in the embodiments of this application can be implemented in any combination.

[0061] It should be noted that, in the embodiments of this application, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a method or apparatus that includes a list of elements includes not only the elements expressly described, but also other elements not expressly listed, or elements inherent to implementing the method or apparatus. Without further limitations, an element defined by the phrase "comprising a..." does not exclude the presence of other related elements (e.g., steps in the method or units in the apparatus, such as portions of circuitry, processors, programs, or software, etc.) in the method or apparatus that includes that element.

[0062] This application provides a method for processing ringback tone media streams applied to ringback tone calling nodes, where the ringback tone calling nodes are located on the network side or in the cloud.

[0063] Figure 2 is a flowchart of the method for processing ringback tone media streams applied to the ringback tone call node provided in this application embodiment. As shown in Figure 2, the process includes:

[0064] Step 201: Determine the type of the ringback tone media stream to be sent.

[0065] In this embodiment of the application, the type of ringback tone media stream can be two-dimensional video data or three-dimensional video data. The three-dimensional video data is used for displaying video ringback tones with naked-eye three-dimensional effects on naked-eye three-dimensional terminals, while the two-dimensional video data can be played on two-dimensional terminals.

[0066] Step 202: If the preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream, send the ringback tone media stream to the calling terminal; wherein, the display attributes include at least the type of the ringback tone media stream.

[0067] In some embodiments, the preset keyframe image of the ringback tone media stream can be any frame image of the ringback tone media stream.

[0068] In practical applications, steps 201 to 202 can be implemented based on a processor, which can be at least one of the following: Application Specific Integrated Circuit (ASIC), Digital Signal Processor (DSP), Digital Signal Processing Device (DSPD), Programmable Logic Device (PLD), Field Programmable Gate Array (FPGA), Central Processing Unit (CPU), Controller, Microcontroller, and Microprocessor.

[0069] As can be seen, since the preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream, and the display attributes include at least the type of the ringback tone media stream, after the ringback tone media stream is sent to the calling terminal, the calling terminal can determine whether to disable or call the 3D rendering capability based on the type of the ringback tone media stream, so as to play the ringback tone media stream according to its type.

[0070] In some embodiments of this application, when the type of the ringback tone media stream is three-dimensional video data, the display attributes also include the layout type of the three-dimensional video data.

[0071] For example, referring to Figures 3A to 3D, the layout types of 3D video data can be alternating frames, stacked frames, left-right frames, and top-bottom frames. In the alternating-frame layout, the left-eye image and the right-eye image are transmitted alternately at a rate of 60 frames/second each, so that the total frame rate for the two eyes reaches 120 frames/second. In the stacked-frame layout, the left-eye image and the right-eye image are transmitted in the same frame, and the resolution of each image remains unchanged. In the left-right-frame layout, the left-eye image and the right-eye image are transmitted in the same frame, and the horizontal resolution of each image is halved. In the top-bottom-frame layout, the left-eye image and the right-eye image are transmitted in the same frame, and the vertical resolution of each image is halved.
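To make the layout types more concrete, the following Python sketch separates the two eye views for the left-right, top-bottom, and alternating layouts, with a frame modeled as a list of pixel rows. Which half (or which frame parity) holds the left-eye image is not stated in the source; the sketch assumes the left or top half, and the even-indexed frames, carry the left-eye image. All function names are illustrative.

```python
def split_left_right(frame):
    """Left-right layout: each eye gets half of the horizontal resolution."""
    half = len(frame[0]) // 2
    left = [row[:half] for row in frame]    # assumed: left half is the left eye
    right = [row[half:] for row in frame]
    return left, right


def split_top_bottom(frame):
    """Top-bottom layout: each eye gets half of the vertical resolution."""
    half = len(frame) // 2
    return frame[:half], frame[half:]       # assumed: top half is the left eye


def split_alternating(frames):
    """Alternating layout: eye views arrive as consecutive full frames."""
    return frames[0::2], frames[1::2]       # assumed: even frames are the left eye
```

With a 120 frames/second alternating stream, `split_alternating` yields two sequences of 60 frames per second of video each, which matches the frame-rate arithmetic given above.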

[0072] For example, the type of ringback tone media stream and the layout type of 3D video data can be characterized by information such as the average grayscale value of the preset keyframe image, the number of the preset keyframe image, and the letter of the preset keyframe image. For example, when the preset keyframe image is the first frame image, the ringback tone calling node can set the average grayscale value of the first frame image to 0, 20, 40, 60, or 80, and encode and send the first frame image. After parsing the first frame image, the calling terminal can calculate the average grayscale value of the first frame image. When the average grayscale value of the first frame image is 0, the type of the ringback tone media stream can be determined to be two-dimensional video data; when the average grayscale value of the first frame image is 20, the type of the ringback tone media stream can be determined to be three-dimensional video data and the layout type of the three-dimensional video data is alternating frame layout; when the average grayscale value of the first frame image is 40, the type of the ringback tone media stream can be determined to be three-dimensional video data and the layout type of the three-dimensional video data is stacked frame layout; when the average grayscale value of the first frame image is 60, the type of the ringback tone media stream can be determined to be three-dimensional video data and the layout type of the three-dimensional video data is left-right frame layout; when the average grayscale value of the first frame image is 80, the type of the ringback tone media stream can be determined to be three-dimensional video data and the layout type of the three-dimensional video data is top-bottom frame layout.
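As a non-limiting illustration, the grayscale scheme of this example can be sketched in Python as follows. The values 0, 20, 40, 60, and 80 and their meanings are taken from the paragraph above. The frame representation (a list of pixel rows), the function names, the 16x16 frame size, and the tolerance used to absorb small shifts in the average are assumptions introduced only for this sketch.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]
Attrs = Tuple[str, Optional[str]]  # (stream type, layout type or None)

# Average grayscale value of the first frame -> display attributes.
GRAY_TO_ATTRS = {
    0: ("2D", None),
    20: ("3D", "alternating"),
    40: ("3D", "stacked"),
    60: ("3D", "left_right"),
    80: ("3D", "top_bottom"),
}
ATTRS_TO_GRAY = {attrs: gray for gray, attrs in GRAY_TO_ATTRS.items()}


def make_first_frame(stream_type: str, layout: Optional[str],
                     width: int = 16, height: int = 16) -> Frame:
    """Build a uniform frame whose average grayscale encodes the attributes."""
    gray = ATTRS_TO_GRAY[(stream_type, layout)]
    return [[gray] * width for _ in range(height)]


def decode_first_frame(frame: Frame, tolerance: float = 5.0) -> Attrs:
    """Recover the attributes from the average grayscale of the first frame.

    Assumption: the decoded average may deviate slightly from the value that
    was set, so the nearest code value within `tolerance` is accepted.
    """
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    nearest = min(GRAY_TO_ATTRS, key=lambda g: abs(g - mean))
    if abs(nearest - mean) > tolerance:
        raise ValueError(f"average grayscale {mean:.1f} matches no known code")
    return GRAY_TO_ATTRS[nearest]
```

Spacing the code values 20 apart leaves room for such a tolerance. Whether any deviation occurs in practice depends on the codec and is not addressed by this sketch.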

[0073] It can be seen that after the ringback tone media stream is sent to the calling terminal, if the display attributes also include the layout type of the 3D video data, the calling terminal can process the ringback tone media stream according to the layout type of the 3D video data, which is beneficial for displaying video ringback tones with naked-eye 3D effects on the calling terminal.

[0074] In some embodiments of this application, the preset keyframe image is the first frame image. In this way, after the first frame image of the ringback tone media stream is sent to the calling terminal, the calling terminal can determine the display attributes of the ringback tone media stream from the first frame image, and then determine how to process the subsequent (non-first) frame images of the ringback tone media stream based on those display attributes.

[0075] When a ringback tone media stream is established, the media format of 3D video data is the same as that of 2D video data; the two differ only in content. There is therefore no substantial difference in the signaling used to transmit 3D and 2D video data on the communication link, and signaling is not suitable for identifying the type of the ringback tone media stream. To address this problem, embodiments of this application can carry the type of the ringback tone media stream and the layout type of the 3D video data in the first frame image.

[0076] Figure 4 is a flowchart of a ringback tone calling node sending a ringback tone media stream in an embodiment of this application. As shown in Figure 4, the process may include:

[0077] Step 401: Determine the type of the ringback tone media stream to be sent.

[0078] Step 402: When the ringback tone media stream is 3D video data, determine the layout type of the 3D video data.

[0079] Step 403: The type of the ringback tone media stream and the layout type of the 3D video data are carried by the first frame image of the ringback tone media stream.

[0080] Step 404: Append subsequent images to be sent after the first frame image.

[0081] Step 405: Encode and send the ringback tone media stream.
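As a non-limiting illustration, steps 402 to 404 can be sketched in Python as follows. The function name, the frame representation, and the `encode_attributes` callback (which turns the display attributes into a first frame image by whatever scheme is in use) are assumptions of this sketch. Encoding and sending (step 405) are outside its scope.

```python
from typing import Callable, List, Optional

Frame = List[List[int]]  # rows of pixel values, standing in for an image


def assemble_ringback_stream(
    content_frames: List[Frame],
    stream_type: str,
    layout: Optional[str],
    encode_attributes: Callable[[str, Optional[str]], Frame],
) -> List[Frame]:
    """Prepend a first frame carrying the display attributes to the content."""
    if stream_type not in ("2D", "3D"):
        raise ValueError(f"unknown stream type: {stream_type!r}")
    if stream_type == "3D":
        if layout is None:  # step 402: a 3D stream needs a layout type
            raise ValueError("a 3D stream requires a layout type")
    else:
        layout = None  # the layout type is only meaningful for 3D video data
    first_frame = encode_attributes(stream_type, layout)  # step 403
    return [first_frame, *content_frames]  # step 404
```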

[0082] In some embodiments of this application, the process of determining the type of the ringback tone media stream to be sent includes: receiving SIP signaling sent by the calling terminal, wherein the SIP signaling carries the terminal type of the calling terminal; and determining the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal.

[0083] For example, SIP signaling can be an INVITE message or a 183 message.

[0084] As can be seen, the embodiments of this application can use the SIP signaling sent by the calling terminal to indicate the terminal type of the calling terminal, so that the ringback tone calling node can directly determine the terminal type of the calling terminal based on the SIP signaling.

[0085] In some embodiments of this application, the terminal type of the calling terminal is located in the Contact field of the SIP signaling. Since the Contact field in SIP signaling is defined as identifying a terminal or call capability, it is suitable for identifying the type of the calling terminal.

[0086] In some embodiments of this application, the process of determining the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal may include:

[0087] When the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, the type of the ringback tone media stream to be sent is determined to be 3D video data.

[0088] When the calling terminal is a terminal that does not support naked-eye 3D display, or when the called terminal has not subscribed to the 3D video ringback tone service, the type of the ringback tone media stream to be sent is determined to be 2D video data.

[0089] Understandably, when the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, it means that the condition for playing the 3D video ringback tone on the calling terminal is met. Therefore, the type of the ringback tone media stream to be sent can be reasonably determined as 3D video data. When the calling terminal is a terminal that does not support naked-eye 3D display, or when the called terminal has not subscribed to the 3D video ringback tone service, it means that the condition for playing the 3D video ringback tone on the calling terminal is not met. In this case, the type of the ringback tone media stream to be sent can be reasonably determined as 2D video data.
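As a non-limiting illustration, this decision rule reduces to a single condition. In the Python sketch below, the function and parameter names are hypothetical.

```python
def decide_stream_type(caller_supports_3d: bool, callee_subscribed_3d: bool) -> str:
    """Return "3D" only when both conditions for 3D playback are met."""
    if caller_supports_3d and callee_subscribed_3d:
        return "3D"
    # Either the calling terminal cannot display naked-eye 3D, or the called
    # terminal has not subscribed to the 3D video ringback tone service.
    return "2D"
```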

[0090] In some embodiments of this application, the method further includes, before determining the type of the ringback tone media stream to be sent:

[0091] When the calling terminal is a terminal that supports naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has not set the ringback tone content of the 3D video ringback tone, determine whether the ringback tone content of the 2D video ringback tone set by the called terminal can be obtained.

[0092] When the ringback tone content of the 2D video ringback tone set by the called terminal can be obtained, that content is converted from 2D video data to 3D video data to obtain the ringback tone media stream.

[0093] It can be seen that when the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, the conditions for playing 3D video ringback tones on the calling terminal are met. In this case, if the called terminal has not set 3D video ringback tone content but has set 2D video ringback tone content, the ringback tone media stream can be obtained by converting the 2D video ringback tone content from 2D video data to 3D video data. This allows the calling terminal to display a ringback tone with naked-eye 3D effect based on the received ringback tone media stream.

[0094] In some embodiments of this application, the method further includes, before determining the type of the ringback tone media stream to be sent:

[0095] When the calling terminal is a terminal that supports naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has set the ringback tone content of the 3D video ringback tone, the ringback tone media stream is generated according to the ringback tone content of the 3D video ringback tone.

[0096] If the calling terminal is a terminal that supports naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has not set the ringback tone content of the 3D video ringback tone, it is determined whether the ringback tone content of the 2D video ringback tone set by the called terminal can be obtained; if the ringback tone content of the 2D video ringback tone set by the called terminal cannot be obtained, a ringback tone media stream is generated according to the default 3D video ringback tone content.

[0097] It can be seen that when the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, the conditions for playing a 3D video ringback tone on the calling terminal are met. In this case, if the called terminal has set 3D video ringback tone content, the ringback tone media stream can be generated directly from that content. If, under the same conditions, the called terminal has set neither 3D nor 2D video ringback tone content, the ringback tone media stream can be generated from the default 3D video ringback tone content, thus enabling the calling terminal to display a ringback tone with naked-eye 3D effects.

[0098] In some embodiments, the 3D video ringback tone service can be a 3D ringback tone package service, and the 2D video ringback tone service can be a 2D ringback tone package service. Figure 5 is a flowchart of establishing a ringback tone media stream in an embodiment of this application. As shown in Figure 5, the process may include:

[0099] Step 501: Determine the terminal type of the calling terminal.

[0100] Step 502: Determine whether the terminal type is a terminal that supports naked-eye 3D display. If not, proceed to step 503; if yes, proceed to step 505.

[0101] Here, if the calling terminal is a terminal that supports naked-eye 3D display, then the calling terminal has the ability to display both 3D and 2D video ringback tones.

[0102] Step 503: Obtain the ringback tone content of the two-dimensional video ringback tone.

[0103] Step 504: Establish ringback tone media stream.

[0104] Here, a ringback tone media stream can be established based on the ringback tone content of the two-dimensional video ringback tone obtained in step 503. The type of the ringback tone media stream established in step 504 is two-dimensional video data.

[0105] Step 505: Obtain the ringback tone package subscription data of the called terminal.

[0106] Step 506: Determine whether the called terminal has subscribed to the 3D video ringback tone service. If yes, proceed to step 507; otherwise, proceed to step 503.

[0107] In other embodiments, if the called terminal has not subscribed to the 3D video ringback tone service, it can be handled according to preset exception logic.

[0108] Step 507: Obtain the ringback tone content of the called terminal.

[0109] Step 508: Determine whether the called terminal has set a 3D video ringback tone. If yes, proceed to step 509; otherwise, proceed to step 511.

[0110] Step 509: Obtain the ringback tone content of the 3D video ringback tone.

[0111] Step 510: Establish ringback tone media stream.

[0112] Here, a ringback tone media stream can be established based on the ringback tone content of the 3D video ringback tone. The type of the ringback tone media stream established in step 510 is 3D video data.

[0113] Step 511: Determine whether the ringback tone content of the two-dimensional video ringback tone set by the called terminal can be obtained. If not, proceed to step 512; if yes, proceed to step 513.

[0114] Step 512: Obtain the default 3D video ringback tone content, and then proceed to step 510.

[0115] Step 513: Obtain the ringback tone content of the 3D video ringback tone through video conversion, and then execute step 510.

[0116] Here, the content of a two-dimensional video ringback tone can be converted from two-dimensional video data to three-dimensional video data to obtain the content of a three-dimensional video ringback tone.
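As a non-limiting illustration, the branching of steps 501 to 513 can be sketched in Python as follows. The `CalleeProfile` structure, the placeholder `DEFAULT_3D_CONTENT`, and the stand-in `convert_2d_to_3d` function are hypothetical and exist only to make the control flow concrete. The sketch returns the stream type together with the selected content; the case where no 2D content exists on the 2D branch is not addressed by the flow and is passed through as `None`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CalleeProfile:
    subscribed_3d: bool               # from the subscription data (step 505)
    content_3d: Optional[str] = None  # 3D ringback tone content, if set
    content_2d: Optional[str] = None  # 2D ringback tone content, if obtainable


DEFAULT_3D_CONTENT = "default-3d"  # placeholder for the default 3D content


def convert_2d_to_3d(content_2d: str) -> str:
    """Stand-in for a real 2D-to-3D video conversion."""
    return f"converted({content_2d})"


def select_ringback_content(caller_supports_3d: bool,
                            callee: CalleeProfile) -> Tuple[str, Optional[str]]:
    # Steps 502 and 506: fall back to 2D when either condition fails.
    if not caller_supports_3d or not callee.subscribed_3d:
        return "2D", callee.content_2d  # steps 503-504
    if callee.content_3d is not None:  # step 508
        return "3D", callee.content_3d  # steps 509-510
    if callee.content_2d is not None:  # step 511
        return "3D", convert_2d_to_3d(callee.content_2d)  # step 513
    return "3D", DEFAULT_3D_CONTENT  # step 512
```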

[0117] Figure 6 is a flowchart of a method for processing ringback tone media streams applied to a calling terminal, as provided in an embodiment of this application. As shown in Figure 6, the process includes:

[0118] Step 601: Receive the ringback tone media stream.

[0119] Step 602: Extract the preset keyframe images of the ringback tone media stream.

[0120] Step 603: Play the ringback tone media stream according to the display attributes of the ringback tone media stream carried by the preset keyframe image; wherein, the display attributes include at least the type of the ringback tone media stream.

[0121] In practical applications, steps 601 to 603 can be implemented based on a processor, which can be at least one of ASIC, DSP, DSPD, PLD, FPGA, CPU, controller, microcontroller, and microprocessor.

[0122] As can be seen, the preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream, and the display attributes include at least the type of the ringback tone media stream. Therefore, after the calling terminal receives the ringback tone media stream, it can determine whether to disable or invoke the 3D rendering capability based on the type of the ringback tone media stream, and play the ringback tone media stream according to its type.

[0123] In some embodiments of this application, when the type of the ringback tone media stream is three-dimensional video data, the display attributes also include the layout type of the three-dimensional video data.

[0124] It can be seen that after the calling terminal receives the ringback tone media stream, if the display attributes also include the layout type of the 3D video data, the calling terminal can process the ringback tone media stream according to the layout type of the 3D video data, which is beneficial for displaying video ringback tones with naked-eye 3D effects on the calling terminal.

[0125] In some embodiments of this application, the preset keyframe image is the first frame image. In this way, after the first frame image of the ringback tone media stream is sent to the calling terminal, the calling terminal can determine the display attributes of the ringback tone media stream from the first frame image, and then determine how to process the subsequent (non-first) frame images of the ringback tone media stream based on those display attributes.

[0126] Figure 7 is a flowchart of the calling terminal processing the ringback tone media stream in an embodiment of this application. As shown in Figure 7, the process may include:

[0127] Step 701: Receive the ringback tone media stream.

[0128] Step 702: Extract the first frame image of the ringback tone media stream.

[0129] Step 703: Determine the type of the ringback tone media stream and the layout type of the 3D video data by parsing the first frame image.

[0130] Step 704: Process the ringback tone media stream according to its type and the layout type of the 3D video data.

[0131] For example, the calling terminal can determine whether to disable or invoke 3D rendering capabilities based on the type of the ringback tone media stream, and can enable different 3D display rendering methods for different layout types of 3D video data. For example, 3D rendering capabilities can be disabled by disabling the 3D display software development kit (SDK), and invoked by calling the 3D display SDK.

[0132] For example, the calling terminal can parse and render the non-first frame images of the ringback tone media stream based on the parsing results of the first frame image.
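As a non-limiting illustration, steps 702 to 704 can be sketched in Python as follows. The function name, the `decode_attributes` callback (which parses the first frame image), and the renderer callbacks (which stand in for calls into, or around, a 3D display SDK) are assumptions of this sketch. The sketch also assumes that the first frame image itself is not rendered, which the source does not state.

```python
from typing import Callable, Dict, List, Optional, Tuple

Frame = List[List[int]]
Attrs = Tuple[str, Optional[str]]  # (stream type, layout type or None)


def process_ringback_stream(
    frames: List[Frame],
    decode_attributes: Callable[[Frame], Attrs],
    renderers_3d: Dict[str, Callable[[Frame], str]],
    render_2d: Callable[[Frame], str],
) -> Tuple[bool, List[str]]:
    """Return (whether 3D rendering is invoked, rendered non-first frames)."""
    first, rest = frames[0], frames[1:]             # step 702
    stream_type, layout = decode_attributes(first)  # step 703
    if stream_type == "3D":                         # step 704
        render = renderers_3d[layout]  # one rendering method per layout type
        use_3d = True
    else:
        render = render_2d             # 3D rendering capability stays disabled
        use_3d = False
    return use_3d, [render(frame) for frame in rest]
```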

[0133] In some embodiments of this application, before receiving the ringback tone media stream, the calling terminal may also send SIP signaling to the ringback tone calling node. The SIP signaling carries the terminal type of the calling terminal. The terminal type of the calling terminal is used by the ringback tone calling node to determine the type of the ringback tone media stream.

[0134] As can be seen, the embodiments of this application can use the SIP signaling sent by the calling terminal to indicate the terminal type of the calling terminal, so that the ringback tone calling node can directly determine the terminal type of the calling terminal based on the SIP signaling.

[0135] In some embodiments of this application, the terminal type of the calling terminal is located in the Contact field of the SIP signaling. Since the Contact field in SIP signaling is defined as identifying a terminal or call capability, it is suitable for identifying the type of the calling terminal.

[0136] In a communication network in which 2D terminals, naked-eye 3D terminals, 2D video ringback tone services, 3D video ringback tone services, and 2D and 3D video ringback tone content coexist, when a ringback tone call is triggered in the called domain, the terminal type, the ringback tone service type, and the ringback tone content differ from call to call. The network-side ringback tone calling node and the terminal therefore need to identify the terminal type, the ringback tone service type, and the ringback tone content. When 3D video ringback tone content is missing, a ringback tone call based on 3D video data is established by converting 2D video data to 3D video data.

[0137] Figure 8 is a flowchart of the interactive process of establishing a ringback tone call in an embodiment of this application. As shown in Figure 8, the process includes:

[0138] Step 81: The node in the calling domain sends an INVITE message to the node in the called domain.

[0139] Step 82: The node in the called domain can reply with a 183 message to the node in the calling domain.

[0140] Here, information such as the terminal type of the calling terminal can be carried in the INVITE message and the 183 message.

[0141] For example, the following code indicating the terminal type of the calling terminal can be added to the INVITE message or 183 message:

[0142] Allow:INVITE,ACK,OPTIONS,CANCEL,BYE,UPDATE,INFO,REFER,NOTIFY,MESSAGE,PRACK

[0143] Contact:

[0144] +g.3gpp.icsi-ref="urn%3Aurn-7%3A3gpp-service.ims.icsi.mmtel";

[0145] audio;

[0146] video;

[0147] +sip.instance="<urn:gsma:imei:35293609-715199-0>";

[0148] ue.type = g.3gpp.ue.3D

[0149] ...
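As a non-limiting illustration, a ringback tone calling node might read the terminal type from the Contact field as in the following Python sketch. The parameter name `ue.type` and the value `g.3gpp.ue.3D` are taken from the example above. The function name is hypothetical, and the sketch assumes that a missing parameter or any other value indicates a terminal without naked-eye 3D display, which the source does not specify.

```python
import re

# Matches the ue.type parameter, allowing optional spaces around "=".
UE_TYPE_PATTERN = re.compile(r"ue\.type\s*=\s*([^;\s]+)")


def caller_supports_3d(contact_header: str) -> bool:
    """Return True if the Contact field marks the terminal as naked-eye 3D.

    Assumption: absence of ue.type, or any value other than g.3gpp.ue.3D,
    is treated as a terminal that does not support naked-eye 3D display.
    """
    match = UE_TYPE_PATTERN.search(contact_header)
    return match is not None and match.group(1) == "g.3gpp.ue.3D"
```

A simple pattern search is used here for brevity; a production node would more likely rely on the header parsing of its SIP stack.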

[0150] Step 83: The calling domain and the called domain exchange PRACK and UPDATE messages.

[0151] Step 84: The called terminal sends a 180 message to the I-CSCF or S-CSCF.

[0152] Step 85: The I-CSCF or S-CSCF can reply with a 180 message to the ringback tone calling node.

[0153] Step 86: Obtain the ringback tone content and generate the first frame image of the ringback tone media stream.

[0154] The ringback tone content obtained in this step can be either two-dimensional video ringback tone content or three-dimensional video ringback tone content. The method for generating the first frame image of the ringback tone media stream has been described in the aforementioned content.

[0155] Step 87: The ringback tone calling node sends an UPDATE message to the calling terminal.

[0156] Here, the UPDATE message sent by the ringback tone calling node to the calling terminal can carry the ringback tone media stream; after receiving the ringback tone media stream, the calling terminal can parse and render the image of the ringback tone media stream according to the parsing result of the first frame image of the ringback tone media stream.

[0157] Step 88: The calling terminal replies to the ringback tone calling node with a 200 OK response to the UPDATE message.

[0158] This application embodiment can be applied to scenarios such as terminals, communication services, ringback tone services, and 3D video processing. It enables the network side and the terminal to notify each other of the terminal type and the ringback tone media stream type. This solves two problems in the video ringback tone architecture: the network side cannot know the terminal type and so cannot determine the type of ringback tone media stream to send, and the terminal cannot know the type of media stream sent by the network and so cannot determine whether to invoke the 3D display capability. In addition, it provides decision logic for the called party's video ringback tone media stream that is compatible with both 2D terminals and naked-eye 3D terminals, and covers the exception handling logic for the case where the called party has subscribed to the 3D video ringback tone service but has not set 3D video ringback tone content.

[0159] In summary, this application proposes a method for displaying ringback tones with naked-eye 3D effects based on SIP signaling. It provides methods for the terminal to report its terminal type and for the network (cloud) side to notify the terminal of the ringback tone media stream type and the layout type of the 3D video data. Furthermore, it provides business logic for both 2D and 3D video ringback tones applicable to ringback tone calling nodes. This application also proposes corresponding processing logic for scenarios where the called terminal has subscribed to the 3D video ringback tone service but has not set the 3D video ringback tone content. This application thus provides a naked-eye 3D ringback tone processing mode for communication networks in which 2D terminals, naked-eye 3D terminals, 2D video ringback tone services, 3D video ringback tone services, 2D video ringback tone content, and 3D video ringback tone content coexist, which is of value to communication operators' ringback tone services.

[0160] This application proposes to add a field identifier to the INVITE or 183 stage of the video ringback tone process to notify the calling terminal type. The first frame of the Real-Time Transport Protocol (RTP) media stream identifies the type of the ringback tone media stream and the layout type of the three-dimensional video data. This enables the processing of two-dimensional video ringback tones, three-dimensional video ringback tones, and different layout types of three-dimensional video data, and provides the relevant processing logic for three-dimensional video ringback tones by the ringback tone calling node.

[0161] Those skilled in the art will understand that, in the above-described method of the specific implementation, the order in which each step is written does not imply a strict execution order and does not constitute any limitation on the implementation process. The specific execution order of each step should be determined by its function and possible internal logic.

[0162] Figure 9 is a schematic diagram of the structure of a ringback tone media stream processing device applied to a ringback tone calling node according to an embodiment of this application. As shown in Figure 9, the device includes:

[0163] The determination module 901 is used to determine the type of the ringback tone media stream to be sent;

[0164] The sending module 902 is used to send the ringback tone media stream to the calling terminal when the preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream; wherein the display attributes include at least the type of the ringback tone media stream.

[0165] In some embodiments, when the type of the ringback tone media stream is three-dimensional video data, the display attributes also include the layout type of the three-dimensional video data.

[0166] In some embodiments, the preset keyframe image is the first frame image.

[0167] In some embodiments, the determining module 901 is used to determine the type of the ringback tone media stream to be sent, including:

[0168] Receive SIP signaling sent by the calling terminal, wherein the SIP signaling carries the terminal type of the calling terminal;

[0169] The type of the ringback tone media stream to be sent is determined based on the terminal type of the calling terminal.

[0170] In some embodiments, the terminal type of the calling terminal is located in the Contact field of the SIP signaling.

[0171] In some embodiments, the determining module 901 is configured to determine the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal, including:

[0172] When the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, the type of the ringback tone media stream to be sent is determined to be 3D video data.

[0173] When the calling terminal is a terminal that does not support naked-eye 3D display, or when the called terminal has not subscribed to the 3D video ringback tone service, the type of the ringback tone media stream to be sent is determined to be 2D video data.

[0174] In some embodiments, the device further includes an acquisition module, which is configured to determine whether the ringback tone content of the two-dimensional video ringback tone set by the called terminal can be acquired when the calling terminal is a terminal type that supports naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has not set the ringback tone content of the 3D video ringback tone; and when the ringback tone content of the two-dimensional video ringback tone set by the called terminal can be acquired, the ringback tone content of the two-dimensional video ringback tone is converted from two-dimensional video data to three-dimensional video data to obtain the ringback tone media stream.

[0175] In some embodiments, the device further includes an acquisition module, which is configured to generate the ringback tone media stream based on the ringback tone content of the 3D video ringback tone when the calling terminal is a terminal type that supports naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has set the ringback tone content of the 3D video ringback tone; and when the calling terminal is a terminal type that supports naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has not set the ringback tone content of the 3D video ringback tone, the module determines whether the ringback tone content of the 2D video ringback tone set by the called terminal can be acquired; and when the ringback tone content of the 2D video ringback tone set by the called terminal cannot be acquired, the module generates the ringback tone media stream based on the default ringback tone content of the 3D video ringback tone.

[0176] Figure 10 is a schematic diagram of the structure of a ringback tone media stream processing device applied to the calling terminal according to an embodiment of this application. As shown in Figure 10, the device includes:

[0177] Transceiver module 1001 is used to receive ringback tone media streams;

[0178] The processing module 1002 is used to extract preset keyframe images of the ringback tone media stream; and play the ringback tone media stream according to the display attributes of the ringback tone media stream carried by the preset keyframe images; wherein, the display attributes include at least the type of the ringback tone media stream.

[0179] In some embodiments, when the type of the ringback tone media stream is three-dimensional video data, the display attributes also include the layout type of the three-dimensional video data.

[0180] In some embodiments, the preset keyframe image is the first frame image.

[0181] In some embodiments, the transceiver module 1001 is further configured to send SIP signaling to the ringback tone calling node before receiving the ringback tone media stream, wherein the SIP signaling carries the terminal type of the calling terminal; the terminal type of the calling terminal is used by the ringback tone calling node to determine the type of the ringback tone media stream.

[0182] In some embodiments, the terminal type of the calling terminal is located in the Contact field of the SIP signaling.

[0183] It should be noted that the description of the above device embodiments is similar to the description of the above method embodiments, and has similar beneficial effects. For technical details not disclosed in the device embodiments of this application, please refer to the description of the method embodiments of this application for understanding.

[0184] It should be noted that, in the embodiments of this application, if the above methods are implemented as software functional modules and sold or used as independent products, they can also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of this application, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a terminal, server, etc.) to execute all or part of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), magnetic disks, or optical disks. Thus, the embodiments of this application are not limited to any specific hardware and software combination.

[0185] Correspondingly, this application embodiment further provides a computer program product, the computer program product including computer executable instructions, which are used to implement any of the ringback tone media stream processing methods provided in this application embodiment.

[0186] Accordingly, this application embodiment further provides a computer storage medium storing computer-executable instructions, which are used to implement any of the ringback tone media stream processing methods provided in the above embodiments.

[0187] This application also provides an electronic device. Figure 11 is a schematic diagram of the composition structure of an electronic device provided in an embodiment of this application. As shown in Figure 11, the electronic device 110 may include:

[0188] Memory 111 is used to store executable instructions;

[0189] The processor 112 is used to execute executable instructions stored in the memory 111 to implement any of the above-described methods for processing ringback tone media streams.

[0190] The processor 112 mentioned above can be at least one of ASIC, DSP, DSPD, PLD, FPGA, CPU, controller, microcontroller, and microprocessor.

[0191] The aforementioned computer-readable storage medium or memory 111 may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM), etc.; it may also be various terminals that include one or any combination of the above-mentioned memories, such as mobile phones, computers, tablet devices, personal digital assistants, etc.

[0192] In some embodiments, the functions or modules of the apparatus provided in this application can be used to perform the methods described in the above method embodiments. For the specific implementation, refer to the description of the above method embodiments; for the sake of brevity, it is not repeated here.
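To make the terminal-side modules more concrete, the following Python sketch shows one way a processing module might read the display attributes from the preset keyframe image and then disable or invoke 3D rendering. It is only a sketch under assumptions: the `Frame` and `Renderer` classes, the dictionary used to carry the attributes, the layout value `side_by_side`, and the fallback to 2D when no attributes are found are all hypothetical and are not specified by this application.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DisplayAttributes:
    stream_type: str               # "2d" or "3d"
    layout: Optional[str] = None   # layout type of the 3D video data, if any


@dataclass
class Frame:
    """Hypothetical decoded frame; 'attributes' stands in for data carried in the image."""
    is_preset_keyframe: bool
    attributes: Optional[Dict[str, str]] = None


class Renderer:
    """Hypothetical renderer whose 3D capability can be invoked or disabled."""

    def __init__(self) -> None:
        self.mode_3d = False
        self.layout: Optional[str] = None

    def configure(self, attrs: DisplayAttributes) -> None:
        if attrs.stream_type == "3d":
            # Invoke 3D rendering and record the layout type of the 3D data.
            self.mode_3d = True
            self.layout = attrs.layout
        else:
            # Disable 3D rendering for 2D video data.
            self.mode_3d = False
            self.layout = None


def extract_display_attributes(frames: List[Frame]) -> Optional[DisplayAttributes]:
    """Return the attributes carried by the first preset keyframe that has them."""
    for frame in frames:
        if frame.is_preset_keyframe and frame.attributes is not None:
            return DisplayAttributes(
                stream_type=frame.attributes["type"],
                layout=frame.attributes.get("layout"),
            )
    return None


def prepare_playback(frames: List[Frame], renderer: Renderer) -> Renderer:
    attrs = extract_display_attributes(frames)
    if attrs is None:
        # Assumption of this sketch: treat a stream without attributes as 2D.
        attrs = DisplayAttributes(stream_type="2d")
    renderer.configure(attrs)
    return renderer


if __name__ == "__main__":
    stream = [Frame(True, {"type": "3d", "layout": "side_by_side"}), Frame(False)]
    r = prepare_playback(stream, Renderer())
    print(r.mode_3d, r.layout)
```

Keeping attribute extraction separate from renderer configuration mirrors the split between extracting the preset keyframe image and playing the stream according to its display attributes.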

[0193] The description of the various embodiments above tends to emphasize the differences between them. For aspects that are the same or similar, the embodiments may be referred to one another; for the sake of brevity, these aspects are not repeated here.

[0194] The methods disclosed in the various method embodiments provided in this application can be arbitrarily combined to obtain new method embodiments without conflict.

[0195] The features disclosed in the various product embodiments provided in this application can be arbitrarily combined without conflict to obtain new product embodiments.

[0196] The features disclosed in the various method or device embodiments provided in this application can be arbitrarily combined without conflict to obtain new method or device embodiments.

[0197] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0198] The embodiments of this application have been described above with reference to the accompanying drawings. However, this application is not limited to the specific embodiments described above, which are merely illustrative and not restrictive. Under the guidance of this application, those skilled in the art can devise many other forms without departing from the spirit of this application and the scope of the claims. All of these forms fall within the protection scope of this application.

Claims

1. A method for processing ringback tone media streams, characterized in that, when applied to a ringback tone calling node, the method includes: determining the type of the ringback tone media stream to be sent; and, when the preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream, sending the ringback tone media stream to the calling terminal; wherein the display attributes include at least the type of the ringback tone media stream, and the type of the ringback tone media stream is used to enable the calling terminal to determine whether to disable or invoke the 3D rendering capability; wherein determining the type of the ringback tone media stream to be sent includes: receiving Session Initiation Protocol (SIP) signaling sent by the calling terminal, wherein the SIP signaling carries the terminal type of the calling terminal; and determining the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal; and wherein determining the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal includes: if the calling terminal is playing a 3D video ringback tone, determining the type of the ringback tone media stream to be sent as 3D video data; if the calling terminal is not playing a 3D video ringback tone, determining the type of the ringback tone media stream to be sent as 2D video data.

2. The method according to claim 1, characterized in that, when the type of the ringback tone media stream is three-dimensional video data, the display attributes also include the layout type of the three-dimensional video data.

3. The method according to claim 1, characterized in that the preset keyframe image is the first frame image.

4. The method according to claim 1, characterized in that the terminal type of the calling terminal is located in the Contact field of the SIP signaling.

5. The method according to claim 1, characterized in that determining the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal includes: when the calling terminal is a terminal that supports naked-eye 3D display and the called terminal has subscribed to the 3D video ringback tone service, determining the type of the ringback tone media stream to be sent as 3D video data; and when the calling terminal is a terminal that does not support naked-eye 3D display, or when the called terminal has not subscribed to the 3D video ringback tone service, determining the type of the ringback tone media stream to be sent as 2D video data.

6. The method according to claim 5, characterized in that, before determining the type of the ringback tone media stream to be sent, the method further includes: when the calling terminal is a terminal that supports naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has not set the ringback tone content of the 3D video ringback tone, determining whether the ringback tone content of the 2D video ringback tone set by the called terminal can be obtained; and when the ringback tone content of the 2D video ringback tone set by the called terminal can be obtained, converting the ringback tone content of the 2D video ringback tone from 2D video data to 3D video data to obtain the ringback tone media stream.

7. The method according to claim 5, characterized in that, before determining the type of the ringback tone media stream to be sent, the method further includes: when the calling terminal is a terminal that supports naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has set the ringback tone content of the 3D video ringback tone, generating the ringback tone media stream according to the ringback tone content of the 3D video ringback tone; and when the calling terminal is a terminal that supports naked-eye 3D display, the called terminal has subscribed to the 3D video ringback tone service, and the called terminal has not set the ringback tone content of the 3D video ringback tone, determining whether the ringback tone content of the 2D video ringback tone set by the called terminal can be obtained, and, if the ringback tone content of the 2D video ringback tone set by the called terminal cannot be obtained, generating the ringback tone media stream according to the default 3D video ringback tone content.

8. A method for processing ringback tone media streams, characterized in that, when applied to a calling terminal, the method includes: receiving a ringback tone media stream; extracting the preset keyframe image from the ringback tone media stream; and playing the ringback tone media stream according to the display attributes of the ringback tone media stream carried by the preset keyframe image; wherein the display attributes include at least the type of the ringback tone media stream, and the type of the ringback tone media stream is used to enable the calling terminal to determine whether to disable or invoke the 3D rendering capability; wherein, before receiving the ringback tone media stream, the method further includes: sending Session Initiation Protocol (SIP) signaling to the ringback tone calling node, wherein the SIP signaling carries the terminal type of the calling terminal, and the terminal type of the calling terminal is used by the ringback tone calling node to determine the type of the ringback tone media stream; and wherein, if the calling terminal is playing a three-dimensional video ringback tone, the ringback tone calling node is used to determine the type of the ringback tone media stream to be sent as three-dimensional video data; if the calling terminal is not playing a three-dimensional video ringback tone, the ringback tone calling node is used to determine the type of the ringback tone media stream to be sent as two-dimensional video data.

9. The method according to claim 8, characterized in that, when the type of the ringback tone media stream is three-dimensional video data, the display attributes also include the layout type of the three-dimensional video data.

10. The method according to claim 8, characterized in that the preset keyframe image is the first frame image.

11. The method according to claim 8, characterized in that the terminal type of the calling terminal is located in the Contact field of the SIP signaling.

12. A processing device for ringback tone media streams, characterized in that the device is applied to a ringback tone calling node and includes: a determination module, used to determine the type of the ringback tone media stream to be sent; and a sending module, configured to send the ringback tone media stream to the calling terminal when the preset keyframe image of the ringback tone media stream carries the display attributes of the ringback tone media stream; wherein the display attributes include at least the type of the ringback tone media stream, and the type of the ringback tone media stream is used to enable the calling terminal to determine whether to disable or invoke the 3D rendering capability; wherein the determination module being used to determine the type of the ringback tone media stream to be sent includes: receiving Session Initiation Protocol (SIP) signaling sent by the calling terminal, wherein the SIP signaling carries the terminal type of the calling terminal; and determining the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal; and wherein the determination module being used to determine the type of the ringback tone media stream to be sent based on the terminal type of the calling terminal includes: if the calling terminal is playing a three-dimensional video ringback tone, determining the type of the ringback tone media stream to be sent as three-dimensional video data; if the calling terminal is not playing a three-dimensional video ringback tone, determining the type of the ringback tone media stream to be sent as two-dimensional video data.

13. A processing device for ringback tone media streams, characterized in that the device is applied to a calling terminal and includes: a transceiver module, used to receive a ringback tone media stream; and a processing module, used to extract the preset keyframe image of the ringback tone media stream and to play the ringback tone media stream according to the display attributes of the ringback tone media stream carried by the preset keyframe image; wherein the display attributes include at least the type of the ringback tone media stream, and the type of the ringback tone media stream is used to enable the calling terminal to determine whether to disable or invoke the 3D rendering capability; wherein the transceiver module is further configured to send Session Initiation Protocol (SIP) signaling to the ringback tone calling node before receiving the ringback tone media stream, the SIP signaling carries the terminal type of the calling terminal, and the terminal type of the calling terminal is used by the ringback tone calling node to determine the type of the ringback tone media stream; and wherein, if the calling terminal is playing a 3D video ringback tone, the ringback tone calling node is configured to determine the type of the ringback tone media stream to be sent as 3D video data; if the calling terminal is not playing a 3D video ringback tone, the ringback tone calling node is configured to determine the type of the ringback tone media stream to be sent as 2D video data.

14. An electronic device, characterized in that the electronic device includes a processor and a memory for storing computer programs capable of running on the processor; wherein the processor is used to run the computer program to perform the method according to any one of claims 1 to 11.

15. A computer storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the computer program implements the method according to any one of claims 1 to 11.

16. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the method according to any one of claims 1 to 11.