SDP signaling transmission method and device, and storage medium

By using predefined instruction templates and media templates and employing identifiers for lightweight transmission, the problem of low transmission efficiency caused by excessively large SDP signaling size is solved, and efficient signaling transmission is achieved in bandwidth-constrained environments.

CN122247971A · Pending · Publication Date: 2026-06-19 · SHENZHEN JOOAN TECH CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHENZHEN JOOAN TECH CO LTD
Filing Date
2026-05-25
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In existing audio and video session methods, the SDP signaling volume is too large, resulting in low transmission efficiency in network environments with limited bandwidth or high packet loss rates, which affects user experience.

Method used

By using predefined instruction templates and media templates, lightweight transmission is achieved using instruction template identifiers and media template identifiers to generate target SDP signaling, reducing the amount of signaling data and avoiding the repeated transmission of fixed content.

Benefits of technology

It improves the transmission efficiency of SDP signaling, reduces bandwidth usage and transmission latency, and enhances transmission reliability and reconnection capability in weak network environments.

✦ Generated by Eureka AI based on patent content.


Abstract

This application provides an SDP signaling transmission method, apparatus, and storage medium. The method parses the signaling data sent by the transmitting end to extract the instruction template identifier, media template identifier, and dynamically changing field set, eliminating the need for the transmitting end to transmit the complete SDP text and reducing the amount of signaling data transmitted. By retrieving the instruction template corresponding to the instruction template identifier from a preset instruction template library, the redundant overhead of fixed-frame content is eliminated. By retrieving the target media template corresponding to the media template identifier from a preset media template library, a short identifier replaces the lengthy encoding / decoding attribute set, compressing the signaling volume. By filling the fields from the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling, the receiving end can complete the signaling reconstruction locally, further reducing the amount of SDP signaling data transmitted and thus improving the transmission efficiency of SDP signaling.

Description

Technical Field

[0001] This application relates to the field of communication technology, and in particular to an SDP signaling transmission method, apparatus and storage medium.

Background Technology

[0002] In the fields of IoT and embedded real-time communication, devices (such as IPC (Internet Protocol Camera), smart locks, and wearable devices) typically establish audio and video sessions with peers through WebRTC (Web Real-Time Communications) or SIP (Session Initiation Protocol). These protocols rely on SDP (Session Description Protocol) for media capability negotiation and ICE (Interactive Connectivity Establishment) connection establishment.

[0003] However, in the existing audio and video session methods described above, the complete SDP information usually includes the session description, media format list, ICE candidate addresses, fingerprint information, ICE credentials, and so on, and its size is generally 10k to 20k bytes. Signaling of this size consumes considerable bandwidth and power on embedded devices. Moreover, in network environments with limited bandwidth or a high packet loss rate, large SDP signaling is prone to transmission timeouts, which cause ICE restart failures and session disconnections and seriously affect the user experience.

[0004] Therefore, how to achieve lightweight transmission of SDP signaling and improve the transmission efficiency of SDP signaling is a technical problem that urgently needs to be solved in this field.

Summary of the Invention

[0005] This application provides an SDP signaling transmission method, apparatus, and storage medium, aiming to solve the technical problem of low SDP signaling transmission efficiency caused by the excessive size of SDP signaling, and to improve the transmission efficiency of SDP signaling.

[0006] In a first aspect, this application provides an SDP signaling transmission method, which includes the following steps:
parsing the signaling data sent by the sending end to extract the instruction template identifier, media template identifier, and dynamically changing field set;
retrieving the instruction template corresponding to the instruction template identifier from the preset instruction template library;
retrieving the target media template corresponding to the media template identifier from the preset media template library;
filling the fields in the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling.

[0007] In a second aspect, this application also provides an SDP signaling transmission device, comprising:
a data parsing module, used to parse the signaling data sent by the sending end and extract the instruction template identifier, media template identifier, and dynamically changing field set;
an instruction template acquisition module, used to retrieve the instruction template corresponding to the instruction template identifier from a preset instruction template library;
a media template acquisition module, used to retrieve the target media template corresponding to the media template identifier from a preset media template library;
a signaling generation module, used to fill the fields in the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling.

[0008] In a third aspect, this application also provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the SDP signaling transmission method described above.

[0009] This application provides an SDP signaling transmission method, apparatus, and storage medium. The method parses the signaling data sent by the sender, extracting instruction template identifiers, media template identifiers, and dynamically changing field sets. This eliminates the need for the sender to transmit the complete SDP text, replacing it with lightweight identifiers and a few dynamic fields, thus reducing the amount of signaling data transmitted. By retrieving the instruction template corresponding to the instruction template identifier from a preset instruction template library, network transmission of standard session structure fields is avoided, eliminating the redundant overhead of fixed-frame content. By retrieving the target media template corresponding to the media template identifier from a preset media template library, the lengthy encoding/decoding attribute set is replaced with a short identifier, compressing the signaling volume. By filling the fields from the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling, the receiver can complete the signaling reconstruction locally without relying on historical caches. This avoids the parsing, splicing, and transmission latency of the complete SDP signaling, reducing the amount of SDP signaling data transmitted and thus improving the transmission efficiency of SDP signaling.

Attached Figure Description

[0010] To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments will be briefly introduced below. Obviously, the drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0011] Figure 1 is a flowchart illustrating an embodiment of the SDP signaling transmission method provided in this application; Figure 2 is a schematic diagram of the structure of the first embodiment of the SDP signaling transmission device provided in this application; Figure 3 is a schematic block diagram of the structure of the computer device provided in the embodiments of this application.

[0012] The realization of the purpose, functional features and advantages of this application will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed Implementation

[0013] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0014] The flowchart shown in the attached diagram is for illustrative purposes only and does not necessarily include all content and operations / steps, nor does it necessarily have to be performed in the order described. For example, some operations / steps can be broken down, combined, or partially merged, so the actual execution order may change depending on the actual situation.

[0015] The following detailed description of some embodiments of this application is provided in conjunction with the accompanying drawings. Unless otherwise specified, the following embodiments and features can be combined with each other.

[0016] The SDP signaling transmission method provided in this application is used for two or more devices communicating with each other. For ease of understanding, the two or more devices communicating with each other are divided into a receiving end and a sending end. The sending end is the device that sends SDP signaling, and the receiving end is the device that receives and reconstructs SDP signaling.

[0017] In one embodiment, upon initial pairing with the sending end, at least one instruction template and at least one target media template are predefined; a unique instruction template identifier is assigned to each instruction template to construct the instruction template library; and a unique media template identifier is assigned to each target media template to construct the media template library.

[0018] In one embodiment, before the device leaves the factory, or when the receiving end and the sending end are first paired, at least one instruction template and at least one target media template are predefined at both ends.

[0019] The instruction template defines the overall structural framework of SDP signaling and marks the filling positions of dynamically changing fields within this framework. The instruction template contains fixed session structure fields, including session description lines, session origin lines, session time lines, media packet attribute lines, media description lines, ICE authentication attribute lines, fingerprint attribute lines, and codec attribute placeholders.

[0020] The target media template contains fixed media capability fields, including audio encoding configuration, video encoding configuration, RTP (Real-time Transport Protocol) header extension attributes, and RTCP (RTP Control Protocol) feedback attributes. Specifically, the target media template includes instruction header information, an audio template, and a video template. The audio template includes audio encoding configuration and additional audio attributes, and the video template includes video encoding configuration and additional video attributes.

[0021] When predefining instruction templates, a unique instruction template identifier is assigned to each instruction template to construct the instruction template library. The instruction template identifier is used to identify the structural type of the instruction template during signaling transmission.

[0022] Similarly, when predefining target media templates, a unique media template identifier is assigned to each target media template to build a media template library. The media template identifier is used to identify the encoding / decoding capability type of the target media template during signaling transmission.

[0023] For example, both the instruction template identifier and the media template identifier can adopt hierarchical encoding rules. The instruction template identifier indicates the overall structure type of the SDP signaling, while the media template identifier indicates the specific combination of media codec capabilities used.
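As an illustration of hierarchical identifier encoding, the sketch below splits an identifier such as MEDIA_V_H264 into its levels. The underscore-separated layout (family, media kind, codec) is inferred from the example identifiers used in this description and is an assumption; the application does not fix a concrete encoding rule. The names TemplateId and parse_template_id are hypothetical.

```python
from typing import NamedTuple, Optional


class TemplateId(NamedTuple):
    family: str           # "SDP" (instruction template) or "MEDIA" (media template)
    kind: str             # e.g. "AVV" for SDP; "A" or "V" for MEDIA
    codec: Optional[str]  # e.g. "H264"; None for instruction templates


def parse_template_id(identifier: str) -> TemplateId:
    """Split an identifier into its hierarchy levels (assumed layout)."""
    parts = identifier.split("_")
    if parts[0] == "SDP" and len(parts) == 2:
        return TemplateId("SDP", parts[1], None)
    if parts[0] == "MEDIA" and len(parts) == 3 and parts[1] in ("A", "V"):
        return TemplateId("MEDIA", parts[1], parts[2])
    raise ValueError(f"unrecognized template identifier: {identifier!r}")
```

Under this assumed layout, the first level tells the receiving end which library to search, and the remaining levels select the entry within it.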

[0024] During the initial handshake, the receiving and sending ends can negotiate and determine the template identifiers (including the instruction template identifier and the media template identifier) to be used in this session through lightweight methods (such as device capability exchange or QR code pairing), and store the negotiated template identifiers in the context of this session.

[0025] In one specific embodiment, it is assumed that the two communicating parties (the receiving end and the sending end) are an embedded IPC camera and a mobile terminal APP. Before the device leaves the factory, an instruction template library and a media template library can be pre-installed in the local memory of the IPC camera. In the instruction template library, the following can be defined: the instruction template identifier SDP_AV represents a structure framework of 1 audio channel + 1 video channel; the instruction template identifier SDP_A represents a structure framework of audio-only communication; and the instruction template identifier SDP_AVV represents a structure framework of 1 audio channel + 2 video channels. In the media template library, the following can be defined: the media template identifier MEDIA_A_MPEG4 represents the corresponding MPEG4 audio encoding configuration (payload type 96-100); the media template identifier MEDIA_A_PCMU represents the corresponding PCMU audio encoding configuration (payload type 0); the media template identifier MEDIA_V_H264 represents the corresponding H264 video encoding configuration (payload type 123-120-102-103, etc.); the media template identifier MEDIA_V_H265 represents the corresponding H265 video encoding configuration (payload type 125-122-124-121).
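The two libraries of this embodiment can be pictured as identifier-keyed tables. The following is a minimal sketch, assuming a small in-memory registry; the class name TemplateLibrary and the dictionary layout of each entry are hypothetical, while the identifiers and payload types are the ones listed above.

```python
class TemplateLibrary:
    """Identifier-keyed registry that rejects duplicate identifiers."""

    def __init__(self):
        self._templates = {}

    def register(self, identifier, template):
        # Each template must receive a unique identifier.
        if identifier in self._templates:
            raise ValueError(f"duplicate template identifier: {identifier}")
        self._templates[identifier] = template

    def get(self, identifier):
        return self._templates[identifier]


instruction_library = TemplateLibrary()
instruction_library.register("SDP_A", {"audio": 1, "video": 0})
instruction_library.register("SDP_AV", {"audio": 1, "video": 1})
instruction_library.register("SDP_AVV", {"audio": 1, "video": 2})

media_library = TemplateLibrary()
media_library.register("MEDIA_A_MPEG4", {"kind": "audio", "payload_types": [96, 97, 98, 99, 100]})
media_library.register("MEDIA_A_PCMU", {"kind": "audio", "payload_types": [0]})
# Only the payload types named in the text are listed for H264 ("etc." is not expanded).
media_library.register("MEDIA_V_H264", {"kind": "video", "payload_types": [123, 120, 102, 103]})
media_library.register("MEDIA_V_H265", {"kind": "video", "payload_types": [125, 122, 124, 121]})
```

Because both ends hold the same tables, only the keys need to travel over the network.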

[0026] During the initial pairing, the IPC camera and the mobile terminal APP can exchange device capability information by scanning a QR code, negotiate and determine that the instruction template identifier SDP_AVV and the media template identifier combination MEDIA_A_MPEG4+MEDIA_V_H264+MEDIA_V_H265 will be used for this session, and write the negotiation result into the session context of both parties for reuse during subsequent ICE reconstruction.
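The negotiation and the session context can be sketched as follows. The application does not specify a negotiation algorithm, so the rule used here (keep the identifiers both sides support, in the local order of preference, and take the first instruction template) is an assumption, as are the names SessionContext, negotiate, store_negotiation and SESSIONS.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class SessionContext:
    instruction_template_id: str
    media_template_ids: List[str] = field(default_factory=list)


def negotiate(local_ids: Iterable[str], peer_ids: Iterable[str]) -> List[str]:
    """Keep the local identifiers the peer also supports, in local order."""
    peer = set(peer_ids)
    return [identifier for identifier in local_ids if identifier in peer]


# Session contexts keyed by a session identifier (hypothetical key).
SESSIONS: Dict[str, SessionContext] = {}


def store_negotiation(session_id: str, instruction_ids: List[str],
                      media_ids: List[str]) -> SessionContext:
    """Write the negotiation result into the session context for later reuse."""
    if not instruction_ids:
        raise ValueError("no instruction template supported by both ends")
    context = SessionContext(instruction_ids[0], list(media_ids))
    SESSIONS[session_id] = context
    return context
```

For example, if the camera prefers SDP_AVV and the APP also supports it, the stored context holds SDP_AVV together with the agreed media template identifiers, and a later ICE reconstruction can read them back from SESSIONS.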

[0027] This embodiment predefines instruction templates and target media templates before the device leaves the factory or during the initial pairing and assigns unique identifiers to each to build a template library. This allows both communicating parties to determine the session structure type and media encoding / decoding capability combination simply by negotiating template identifiers during the initial handshake. As a result, the fixed session structure fields and media capability fields that constitute the majority of the complete SDP signaling are pre-placed locally. In subsequent signaling interactions, only short identifiers need to be transmitted to replace lengthy standard fields and encoding / decoding configurations, avoiding the repeated transmission of fixed content. This reduces signaling volume, bandwidth usage, and transmission latency. At the same time, the session context reuse mechanism provides a foundation for subsequent interactive connection reconstruction, further improving transmission efficiency and reconnection reliability in weak network environments.

[0028] Please refer to Figure 1, which is a flowchart illustrating an embodiment of the SDP signaling transmission method provided in this application.

[0029] As shown in Figure 1, the SDP signaling transmission method includes steps S101 to S104.

[0030] S101. Parse the signaling data sent by the sending end and extract the instruction template identifier, media template identifier, and dynamically changing field set.

[0031] In one embodiment, after receiving signaling data sent by the sending end, the receiving end first performs protocol parsing on the signaling data to identify its encapsulation format and structure fields. The signaling data may be encapsulated using a lightweight data exchange format, including an instruction template identifier field, a media template identifier field, and a set of dynamically changing fields.

[0032] Specifically, the instruction template identifier in the signaling data is read to determine the overall structural framework type of this SDP signaling; the media template identifier in the signaling data is read to determine the media codec capability combination used in this session; and the set of dynamically changing fields in the signaling data is read to obtain the field values that have changed during this session or during this interactive connection reconstruction.

[0033] The instruction template identifier is used to uniquely identify a preset instruction template. The instruction template defines the overall structural framework of SDP signaling, including data such as the session description line, session origin line, session time line, media packet attribute line, media description line, ICE authentication attribute line, fingerprint attribute line, and the padding positions for each field. For example, the instruction template identifier "SDP_AV" represents a structure of 1 audio channel + 1 video channel, and the instruction template identifier "SDP_AVV" represents a structure of 1 audio channel + 2 video channels. In ICE reconstruction scenarios, if the instruction template already negotiated and determined in the current session is reused, the instruction template identifier can be omitted from transmission, and the receiving end can directly read it from the session context.
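The fallback described for ICE reconstruction can be sketched as below: the identifier carried in the message takes precedence, and the session context is consulted only when the message omits it. The key names "template" and "instruction_template_id" and the function name are assumptions made for illustration.

```python
def resolve_instruction_template_id(message: dict, session_context: dict) -> str:
    """Return the identifier from the message, else from the session context."""
    template_id = message.get("template")
    if template_id is not None:
        return template_id
    try:
        # Reuse the identifier negotiated earlier in this session.
        return session_context["instruction_template_id"]
    except KeyError:
        raise ValueError(
            "no instruction template identifier in message or session context"
        ) from None
```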

[0034] The media template identifier is used to uniquely identify the preset target media template. The target media template contains fixed media capability fields, including audio encoding configuration, video encoding configuration, RTP header extension attributes, and RTCP feedback attributes.

[0035] Media template identifiers include audio template identifiers and / or video template identifiers. For example, "MEDIA_A_MPEG4" corresponds to an MPEG4 audio encoding configuration, and "MEDIA_V_H264" corresponds to an H264 video encoding configuration. Signaling data may contain one or more media template identifiers, each corresponding to a different media stream.

[0036] The dynamically changing field set refers to the set of fields whose values change during each session establishment or interactive connection reconstruction and cannot be fixed in the template in advance. The dynamically changing field set includes at least the session version number (located in the session origin line, incrementing with each update), ICE authentication credentials (including ice-ufrag and ice-pwd, regenerated with each interactive connection reconstruction), fingerprint value (DTLS fingerprint information, used for encrypted channel authentication), IP address (the device's network connection address, updated during network switching), SSRC value (synchronization source identifier, used to distinguish different audio and video streams), and media stream identification information (including media stream name and CNAME identifier), etc.

[0037] Generally, the dynamically changing field set can be encapsulated in the signaling data in the form of key-value pairs, and the receiving end parses it and fills it into the corresponding filling position according to the preset mapping rules.
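Step S101 can be sketched as follows, assuming the lightweight data exchange format is JSON and that the identifiers travel under the keys "template" and "media"; both the format and the key names are assumptions, since the application leaves them open. Every remaining key-value pair is treated as part of the dynamically changing field set.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def parse_signaling(raw: bytes) -> Tuple[Optional[str], List[str], Dict[str, Any]]:
    """Split signaling data into (instruction id, media ids, dynamic fields)."""
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("signaling data must be a JSON object")
    instruction_id = message.pop("template", None)  # may be omitted on ICE reconstruction
    media_ids = list(message.pop("media", []))
    return instruction_id, media_ids, message       # the rest are dynamic fields
```

A message such as {"template": "SDP_AV", "media": ["MEDIA_A_PCMU", "MEDIA_V_H264"], "version": 3} would yield the two identifiers plus a dynamic field set containing only the version number.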

[0038] S102. Retrieve the instruction template corresponding to the instruction template identifier from the preset instruction template library.

[0039] In one embodiment, after completing the data parsing of the signaling data, the receiving end performs a local template retrieval operation based on the extracted template identifier.

[0040] To retrieve the instruction template, the receiving end uses the instruction template identifier as an index to retrieve the corresponding instruction template from its locally preset instruction template library. This instruction template library is pre-stored in the receiving end's local memory and contains one or more instruction templates. Each instruction template defines a specific type of overall SDP signaling structure framework, which includes instruction header information, audio description information, and video description information.

[0041] The instruction header information includes a session description line, session origin line, session time line, media packet attribute line, and extended allowed attribute line. The audio description information includes an audio media description line, audio connection information line, audio RTCP attribute line, audio ICE authentication attribute line, audio fingerprint attribute line, audio media stream identifier line, and audio codec attribute placeholders. The video description information includes a video media description line, video connection information line, video RTCP attribute line, video ICE authentication attribute line, video fingerprint attribute line, video media stream identifier line, and video codec attribute placeholders.

[0042] Fill points are marked in each attribute row of the instruction template to indicate the fill positions corresponding to dynamically changing fields and media template content.

[0043] S103. Retrieve the target media template corresponding to the media template identifier from the preset media template library.

[0044] In one embodiment, the media template identifier includes an audio template identifier and / or a video template identifier; the target media template includes a target audio template and / or a target video template.

[0045] Specifically, the media template identifier can contain only an audio template identifier, indicating that the target SDP signaling currently being generated only needs the audio template corresponding to that audio template identifier. Similarly, the media template identifier can contain only a video template identifier, indicating that the target SDP signaling currently being generated only needs the video template corresponding to that video template identifier. The media template identifier can also contain both audio and video template identifiers, with one or more of each, indicating that the target SDP signaling currently being generated contains both audio and video templates, with one or more of each.

[0046] The receiving end uses the media template identifier as an index to retrieve and call up the corresponding target media template from a locally preset media template library. When the signaling data contains an audio template identifier, the receiving end retrieves at least one target audio template from the media template library based on the audio template identifier; when the signaling data contains a video template identifier, the receiving end retrieves at least one target video template from the media template library based on the video template identifier.
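The retrieval in step S103 can be sketched as a keyed lookup that also checks the template kind. The library layout, the "kind" field and the function name are assumptions; the identifiers and payload types are taken from the embodiment in paragraph [0025].

```python
from typing import Dict, Iterable, List

# Locally preset media template library (assumed layout).
MEDIA_TEMPLATE_LIBRARY: Dict[str, dict] = {
    "MEDIA_A_MPEG4": {"kind": "audio", "payload_types": [96, 97, 98, 99, 100]},
    "MEDIA_V_H264": {"kind": "video", "payload_types": [123, 120, 102, 103]},
    "MEDIA_V_H265": {"kind": "video", "payload_types": [125, 122, 124, 121]},
}


def retrieve_media_templates(audio_ids: Iterable[str], video_ids: Iterable[str],
                             library: Dict[str, dict] = MEDIA_TEMPLATE_LIBRARY
                             ) -> Dict[str, List[dict]]:
    """Look up each identifier and return the target audio and video templates."""
    def lookup(identifier: str, expected_kind: str) -> dict:
        template = library.get(identifier)
        if template is None:
            raise KeyError(f"unknown media template identifier: {identifier}")
        if template["kind"] != expected_kind:
            raise ValueError(f"{identifier} is not a template of kind {expected_kind}")
        return template

    return {
        "audio": [lookup(i, "audio") for i in audio_ids],
        "video": [lookup(i, "video") for i in video_ids],
    }
```

Either list may be empty, which covers the audio-only and video-only cases described above.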

[0047] The target audio template includes audio configuration information and additional audio attributes. The audio configuration information includes one or more audio encoding configurations (such as encoding configurations for audio formats like MPEG4, RED, PCMU, and PCMA); the additional audio attributes include audio level indication, absolute send time, transport-wide congestion control, media identification / MID, and absolute capture time.

[0048] The target video template includes video configuration information and additional video attributes. The video configuration information includes one or more video encoding configurations, such as H.265 or H.264 encoding configurations. Additional video attributes include timestamp offset, absolute send time, video orientation, transport-wide congestion control, playout delay, video content type, video timing, color space, media ID, RTP stream ID, repaired RTP stream ID, and absolute capture time.

[0049] In one example, assume that the receiving end parses the signaling data to obtain the instruction template identifier SDP_AVV, the audio template identifier MEDIA_A_MPEG4, the first video template identifier MEDIA_V_H265, and the second video template identifier MEDIA_V_H264, and performs the corresponding template retrieval operation based on the parsing results.

[0050] Specifically, the instruction template with the instruction template identifier SDP_AVV is retrieved from the instruction template library. The following shows the instruction header information and the audio information configuration in the instruction template:

SDP_AVV:  # Name of this instruction template (template number AVV). This SDP instruction includes 1 audio media (A) and 2 video media (VV).
v=0
o=- {value-random number} {value-version number} IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=extmap-allow-mixed
a=msid-semantic: WMS myKvsVideoStream

Audio information:

m=audio 9 UDP/TLS/RTP/SAVPF {value-audio PT}  # Template number; for example, entering 96, 97, 98, 99, 100 here retrieves the related MPEG4 encoding configuration.
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:{value-ice account}  # The account and password for this audio information, for easy encryption and reading.
a=ice-pwd:{value-ice password}
a=ice-options:trickle renomination
a=fingerprint:sha-256 {value-fingerprint}  # The encryption method for this audio information
a=setup:actpass
a=mid:0
{value-MEDIA_A_EXTMAP}  # Read the corresponding additional audio attributes from the corresponding audio template.
{value-audio codec related configuration}  # Read the corresponding audio encoding configuration from the corresponding audio template.
a=ssrc:{value-ssrc value} cname:{nTYj/ZdKKye1YYE/}  # ssrc is the ID of the audio/video stream, used to distinguish different audio/video streams; cname is the name of the audio/video stream.
a=ssrc:{value-ssrc value} msid:myKvsVideoStream AudioTrack_0

The following shows the configuration of the two video channels in the instruction template:

m=video 9 UDP/TLS/RTP/SAVPF {value-video PT}
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:{value-ice account}
a=ice-pwd:{value-ice password}
a=ice-options:trickle renomination
a=fingerprint:sha-256 {value-fingerprint}
a=setup:actpass
a=mid:1
{value-MEDIA_V_EXTMAP}
{value-video codec related configuration}
a=ssrc:{value-ssrc value} cname:
a=ssrc:{value-ssrc value} msid:myKvsVideoStream VideoTrack_0

m=video 9 UDP/TLS/RTP/SAVPF {value-video PT}
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:{value-ice account}
a=ice-pwd:{value-ice password}
a=ice-options:trickle renomination
a=fingerprint:sha-256 {value-fingerprint}
a=setup:actpass
a=mid:1
{value-MEDIA_V_EXTMAP}
{value-video codec related configuration}
a=ssrc:{value-ssrc value} cname:
a=ssrc:{value-ssrc value} msid:myKvsVideoStream VideoTrack_0

This instruction template defines an SDP structure framework that includes one audio media (A) and two video media (VV). Each media description has reserved positions for filling in ICE authentication attributes, fingerprint attributes, codec attributes, and extended attributes.

[0051] Then, the target audio template corresponding to the audio template identifier is retrieved from the media template library, such as the audio template with template number MEDIA_A_MPEG4. This audio template contains MPEG4 audio encoding configuration with payload type 96-100 and additional audio attribute MEDIA_A_EXTMAP.

[0052] Retrieve the first target video template corresponding to the first video template identifier from the media template library, such as the video template with template number MEDIA_V_H265. This video template contains H265 video encoding configuration with payload types 125 and 122, as well as additional video attributes MEDIA_V_EXTMAP.

[0053] Retrieve the second target video template corresponding to the second video template identifier from the media template library, such as the video template with template number MEDIA_V_H264. This video template contains H264 video encoding configuration with payload types 123 and 120, as well as additional video attributes MEDIA_V_EXTMAP.

[0054] After the above retrieval is completed, the receiving end establishes an association mapping between the instruction template and the target audio template and the target video template, providing a complete template data foundation for subsequent dynamic field filling and generation of target SDP signaling.

[0055] S104. Fill each field in the dynamically changing field set and the target media template into the corresponding filling position of the instruction template to generate target SDP signaling.

[0056] After retrieving the instruction template and target media template, according to the preset field mapping rules in the instruction template, the dynamically changing field set obtained by parsing and the retrieved target media template content are sequentially filled into the corresponding filling positions of the instruction template to generate complete target SDP signaling.

[0057] Further, obtain the field mapping rules corresponding to the instruction template; according to the field mapping rules, fill each field in the dynamically changing field set and the target media template into the corresponding filling position of the instruction template to generate the target SDP signaling.

[0058] The field mapping rules are defined within the instruction template, using placeholders to mark the filling position of each field. Each placeholder corresponds one-to-one with a dynamic field, ensuring accurate filling. Multi-level nested mappings are also supported to adapt to complex media negotiation scenarios. During the filling process, field types and value ranges are automatically validated, and invalid values are flagged or replaced with default values to ensure that the generated SDP signaling is syntactically legal and semantically compliant. For example, the instruction template placeholders include numeric placeholders (such as {value-random number}, {value-version number}, {value-ice account}, {value-ice password}, {value-fingerprint}, {value-IP address}, {value-ssrc value}) and template placeholders (such as {value-MEDIA_A_EXTMAP}, {value-audio codec related configuration}, {value-MEDIA_V_EXTMAP}, {value-video codec related configuration}). Numeric placeholders correspond to specific field values in the dynamically changing field set, while template placeholders correspond to attribute sets in the target media template.
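Placeholder filling with validation can be sketched as below. The regular expression, the per-field validator callables and the default table are assumptions made for illustration; the application only states that types and value ranges are checked and that invalid values may be replaced with defaults. Nested mappings are not covered by this sketch.

```python
import re

# Matches placeholders of the form {value-<name>}.
PLACEHOLDER = re.compile(r"\{value-([^{}]+)\}")


def fill_template(template, values, validators=None, defaults=None):
    """Replace each {value-<name>} with a validated value or its default."""
    validators = validators or {}
    defaults = defaults or {}

    def substitute(match):
        name = match.group(1)
        value = values.get(name)
        is_valid = validators.get(name, lambda v: True)
        if value is None or not is_valid(value):
            if name not in defaults:
                raise ValueError(f"missing or invalid value for placeholder: {name}")
            value = defaults[name]  # fall back to the default value
        return str(value)

    return PLACEHOLDER.sub(substitute, template)


origin_line = "o=- {value-random number} {value-version number} IN IP4 127.0.0.1"
filled = fill_template(
    origin_line,
    {"random number": 6404891221946631881, "version number": 3},
    validators={"version number": lambda v: isinstance(v, int) and v >= 0},
    defaults={"version number": 0},
)
# filled == "o=- 6404891221946631881 3 IN IP4 127.0.0.1"
```

Raising an error when no default exists is one possible reading of "flagged"; a real implementation might log the field and continue instead.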

[0059] In one embodiment, according to the field mapping rules, the session version number from the dynamically changing field set is filled into the session origin line of the instruction template, replacing the {value-version number} placeholder; the ICE authentication credentials (ice-ufrag and ice-pwd) are filled into the ICE authentication attribute lines of the audio description information and video description information, respectively, replacing the {value-ice account} and {value-ice password} placeholders; the fingerprint value is filled into the fingerprint attribute lines of the audio description information and video description information, replacing the {value-fingerprint} placeholder; the IP address is filled into the connection information line and the RTCP attribute line, replacing the {value-IP address} placeholder; and the SSRC value and media stream identifier information are filled into the SSRC attribute line, replacing the {value-ssrc value} placeholder and the corresponding media stream name placeholder.
At the same time, the additional audio attributes (MEDIA_A_EXTMAP) from the target audio template are filled into the extended attribute position of the audio description information, replacing the placeholder {value-MEDIA_A_EXTMAP}; the audio encoding configuration from the target audio template is filled into the codec attribute position of the audio description information in order of payload type value, replacing the placeholder {value-audio codec related configuration}; the additional video attributes (MEDIA_V_EXTMAP) from the target video template are filled into the extended attribute position of the video description information, replacing the placeholder {value-MEDIA_V_EXTMAP}; and the video encoding configuration from the target video template is filled into the codec attribute position of the video description information in order of payload type value, replacing the placeholder {value-video codec related configuration}.
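
The expansion of template placeholders can be illustrated with a short sketch. It assumes that a media template is stored as a mapping from payload type value to attribute lines, which the text does not state; the dictionary layout and the function names `expand_codecs` and `fill_audio_section` are hypothetical, and only a few of the attribute lines from the example in this application are included.

```python
# Hypothetical media template library; the storage layout is an assumption.
MEDIA_TEMPLATES = {
    "MEDIA_A_EXTMAP": [
        "a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level",
        "a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid",
    ],
    "MEDIA_A_MPEG4": {
        98: ["a=rtpmap:98 mpeg4-generic/16000", "a=rtcp-fb:98 nack"],
        96: ["a=rtpmap:96 mpeg4-generic/8000", "a=rtcp-fb:96 nack"],
    },
}


def expand_codecs(template_id: str) -> str:
    """Emit codec attribute lines in ascending payload type order."""
    per_pt = MEDIA_TEMPLATES[template_id]
    return "\n".join(line for pt in sorted(per_pt) for line in per_pt[pt])


def fill_audio_section(section: str) -> str:
    """Swap the two audio template placeholders for media template content."""
    section = section.replace(
        "{value-MEDIA_A_EXTMAP}", "\n".join(MEDIA_TEMPLATES["MEDIA_A_EXTMAP"])
    )
    return section.replace(
        "{value-audio codec related configuration}", expand_codecs("MEDIA_A_MPEG4")
    )


audio = fill_audio_section(
    "a=mid:0\n{value-MEDIA_A_EXTMAP}\n"
    "{value-audio codec related configuration}\na=rtcp-mux"
)
```

Sorting by payload type here is one reading of "in order of payload type value"; a template library could equally store the lines in a fixed, pre-ordered list.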

[0060] In one specific embodiment, the receiving end performs a filling operation based on the retrieved instruction template SDP_AVV, target audio template MEDIA_A_MPEG4, first target video template MEDIA_V_H265, and second target video template MEDIA_V_H264.

[0061] Specifically, the content of the transmitted SDP signaling data is as follows:
{
  Random number: 6404891221946631881
  Version number: 3
  Template number: "SDP_AVV"
  ice username: "rx7B", # account for audio and video information
  ice password: "hK2ijt6d5CpjIgSJf1x1Jc6A", # password for audio and video information
  Fingerprint: "CD:BB:10:C7:70:C3:BB:3E:01:1D:87:39:BF:AC:72:22:48:DF:13:08:7A:52:77:8D:DF:5C:1C:8D:70:D3:EA:F3", # encryption method for audio and video information
  Audio: {
    Extmap: "MEDIA_A_EXTMAP",
    Support: [ { pt: 96, MEDIA_A_MPEG4 }, { pt: 101, MEDIA_A_RED }, ... ], # This contains the template numbers to be called for this audio information. The instruction template can then use these numbers to directly call the specific audio/video encoding configuration and the corresponding additional audio/video attributes from the media template.
    Ssrc: 2868833191 # Audio stream ID
    cname: nTYj/ZdKKye1YYE/ # The name of the audio stream
  },
  Video: {
    Extmap: "MEDIA_V_EXTMAP",
    Support: [ { pt: 125, MEDIA_V_MP5 }, { pt: 123, MEDIA_A_MP4 }, ... ],
    Ssrc: [237618651, 1090309381] # The IDs of the two video streams
  }
}
The transmitted data includes a random number (such as 6404891221946631881), a version number (such as 3), a template number (such as SDP_AVV), the ICE credentials (account and password), the fingerprint (encryption method) for the audio and video information, and the audio and video description information.
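
As an illustration of the parsing step, the sketch below splits such a message into the instruction template identifier, the media template identifiers, and the dynamically changing field set. The JSON encoding and every key name (`template`, `support`, `ice_ufrag`, and so on) are assumptions made for the example; this application does not specify a wire format.

```python
import json

# Hypothetical JSON rendering of the signaling data; key names are illustrative.
RAW = json.dumps({
    "template": "SDP_AVV",
    "random": 6404891221946631881,
    "version": 3,
    "ice_ufrag": "rx7B",
    "ice_pwd": "hK2ijt6d5CpjIgSJf1x1Jc6A",
    "audio": {
        "extmap": "MEDIA_A_EXTMAP",
        "support": [{"pt": 96, "template": "MEDIA_A_MPEG4"}],
        "ssrc": 2868833191,
    },
})


def parse_signaling(raw: str):
    """Split a message into (instruction id, media template ids, dynamic fields)."""
    msg = json.loads(raw)
    instruction_id = msg.pop("template")
    media_ids, dynamic = [], {}
    for key, value in msg.items():
        if isinstance(value, dict):  # a media section such as "audio" or "video"
            media_ids.append(value["extmap"])
            media_ids += [entry["template"] for entry in value["support"]]
            dynamic[f"{key}_ssrc"] = value["ssrc"]
        else:
            dynamic[key] = value
    return instruction_id, media_ids, dynamic


instruction_id, media_ids, dynamic = parse_signaling(RAW)
```

Only short identifiers and per-session values cross the network in this form; the attribute lines they stand for stay in the receiver's local template libraries.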

[0062] Based on the above instruction template example, the random number is filled into the session origin line o=- {value-random number} {value-version number} IN IP4 127.0.0.1, replacing the {value-random number} placeholder, and the version number 3 is filled into the same line, replacing the {value-version number} placeholder, forming o=- 6404891221946631881 3 IN IP4 127.0.0.1. The ICE username "rx7B" is filled into a=ice-ufrag:{value-ice account}, forming a=ice-ufrag:rx7B; the ICE password "hK2ijt6d5CpjIgSJf1x1Jc6A" is filled into a=ice-pwd:{value-ice password}, forming a=ice-pwd:hK2ijt6d5CpjIgSJf1x1Jc6A; and the fingerprint value is filled into a=fingerprint:sha-256 {value-fingerprint}, forming a complete fingerprint attribute line. This generates the following instruction header information:
v=0
o=- 6404891221946631881 3 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=extmap-allow-mixed
a=msid-semantic: WMS myKvsVideoStream
m=audio 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 0 8 118 105 13 110 127 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:rx7B
a=ice-pwd:hK2ijt6d5CpjIgSJf1x1Jc6A
a=ice-options:trickle renomination
a=fingerprint:sha-256 CD:BB:10:C7:70:C3:BB:3E:01:1D:87:39:BF:AC:72:22:48:DF:13:08:7A:52:77:8D:DF:5C:1C:8D:70:D3:EA:F3
a=setup:actpass
a=mid:0
Then, the MEDIA_A_EXTMAP content (the extmap:1 to extmap:5 attribute lines) in the target audio template is filled into the {value-MEDIA_A_EXTMAP} position; the MPEG4 audio encoding configuration (rtpmap, rtcp-fb, and fmtp attribute lines) corresponding to the payload type values [96, 97, 98, 99, 100] is filled into the {value-audio codec related configuration} position in sequence; and the SSRC value 2868833191 and the media stream name are filled into the a=ssrc:{value-ssrc value} cname:{nTYj/ZdKKye1YYE/} and a=ssrc:{value-ssrc value} msid:myKvsVideoStream AudioTrack_0 positions, yielding the target audio description information shown below:
a=rtpmap:96 mpeg4-generic/8000
a=rtcp-fb:96 transport-cc
a=rtcp-fb:96 nack
a=rtpmap:97 red/8000
a=fmtp:97 96/96
a=rtpmap:98 mpeg4-generic/16000
a=rtcp-fb:98 transport-cc
a=rtcp-fb:98 nack
a=rtpmap:99 red/16000
a=fmtp:99 98/98
a=rtpmap:100 mpeg4-generic/48000
a=rtcp-fb:100 transport-cc
a=rtcp-fb:100 nack
a=rtpmap:101 red/48000
a=fmtp:101 100/100
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:118 PCMA/16000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 PCMU/16000
a=rtpmap:127 telephone-event/48000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000 # Specific information retrieved from the audio template
a=ssrc:2868833191 cname:nTYj/ZdKKye1YYE/ # Audio stream number
a=ssrc:2868833191 msid:myKvsVideoStream AudioTrack_0
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:5 http://www.webrtc.org/experiments/rtp-hdrext/abs-capture-time
a=sendrecv
a=msid:myKvsVideoStream AudioTrack_0
a=rtcp-mux # Additional audio attributes
For the first target video template, the same ICE authentication credentials, fingerprint value, and IP address are filled in; the MEDIA_V_EXTMAP content from the first target video template is filled into the corresponding positions; and the H265 video encoding configuration corresponding to the payload type values [125, 122] is filled into the codec attribute positions in sequence, yielding the first target video description information.
For the second target video template, the same dynamic field filling operation is performed; the MEDIA_V_EXTMAP content from the second target video template is filled into the corresponding positions; and the H264 video encoding configuration corresponding to the payload type values [123, 120] is filled into the codec attribute positions in sequence, yielding the second target video description information. It is understood that the target video templates are filled in the same way as the target audio template. The filling method and results for the target video templates can be understood by reference to the filling example for the target audio template and are not elaborated upon in this embodiment.

[0063] After the above filling operation is completed, each placeholder is replaced by the actual field value or template content. The generated instruction header information, target audio description information, first target video description information, and second target video description information constitute a complete target SDP signaling text, which includes the standard v=0 session description line, o=session origin line, s=session name line, t=time description line, audio m=media description line, video m=media description line, and all a=attribute lines, which can be directly used for subsequent ICE connectivity checks and media stream transmission establishment.
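
A receiver may wish to confirm that filling is complete before handing the text to ICE. The sketch below is one possible check, not part of this application: it assumes the {value-...} placeholder syntax used in the examples, and the function name `check_sdp` and the list of required line prefixes are illustrative.

```python
import re

# Line types named in the text as part of the complete target SDP signaling.
REQUIRED_PREFIXES = ("v=", "o=", "s=", "t=", "m=")


def check_sdp(text: str) -> list:
    """Return a list of problems found in a reconstructed SDP text."""
    problems = []
    leftover = re.findall(r"\{value-[^{}]*\}", text)
    if leftover:
        problems.append(f"unfilled placeholders: {leftover}")
    lines = text.splitlines()
    for prefix in REQUIRED_PREFIXES:
        if not any(line.startswith(prefix) for line in lines):
            problems.append(f"missing {prefix} line")
    return problems


good = (
    "v=0\no=- 6404891221946631881 3 IN IP4 127.0.0.1\ns=-\nt=0 0\n"
    "m=audio 9 UDP/TLS/RTP/SAVPF 96\na=mid:0"
)
bad = "v=0\ns=-\nt=0 0\na=ice-ufrag:{value-ice account}"
```

Here `check_sdp(good)` returns an empty list, while `check_sdp(bad)` reports one unfilled placeholder and the missing o= and m= lines.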

[0064] In one embodiment, existing SDP signaling transmission optimization schemes still depend on a historical cache. Such schemes (for example, differential transmission based on historical SDP) require both communicating parties to maintain session context and cache historical SDP. However, in scenarios such as stateless sessions (device restart or loss of session state), first-time session establishment, cross-device switching, and memory-constrained devices, the historical cache may not be available.

[0065] To address the historical cache dependency in the existing technology, this application further proposes the following: when an interactive connection restart event (that is, an ICE restart) is triggered, an incremental request packet sent by the sending end is received; the incremental request packet is parsed to determine the incremental instruction template, the incremental media template, and the incremental dynamic change field set; and each field in the incremental dynamic change field set and the incremental media template are filled into the incremental instruction template to generate the reconstructed SDP signaling.

[0066] In one embodiment, during the session, an interactive connection restart event is triggered when a network switch, IP address change, or other event causing the existing transmission path to fail occurs. In this case, the sending end does not need to retransmit the complete SDP signaling; instead, it constructs a lightweight incremental request packet and sends it to the receiving end.

[0067] After receiving the incremental request packet, the receiving end performs a parsing operation. Specifically, it identifies whether the incremental request packet contains an incremental instruction template identifier. If it does, the incremental instruction template identifier is extracted; otherwise, the instruction template identifier already negotiated in the current session context is reused. It also identifies whether the incremental request packet contains an incremental media template identifier. If it does, the incremental media template identifier is extracted; otherwise, the media template identifier already negotiated in the current session context is reused. Finally, it extracts the incremental dynamic change field set from the incremental request packet, including the new session version number, the new interactive connection establishment authentication credentials (new ice-ufrag and ice-pwd), and the updated network address.
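
The fallback logic described above can be sketched as follows. The `SessionContext` class, the packet key names, and the sample credential values are hypothetical and serve only to illustrate how missing identifiers are taken from the current session context.

```python
from dataclasses import dataclass


@dataclass
class SessionContext:
    """Identifiers negotiated at session setup; no SDP text is cached."""
    instruction_id: str
    media_ids: list


def resolve_incremental(packet: dict, ctx: SessionContext):
    """Return (instruction id, media ids, dynamic fields) for a packet."""
    packet = dict(packet)  # avoid mutating the caller's packet
    # Use the identifiers in the packet if present, else reuse the context.
    instruction_id = packet.pop("template", ctx.instruction_id)
    media_ids = packet.pop("media", ctx.media_ids)
    return instruction_id, media_ids, packet


ctx = SessionContext("SDP_AV", ["MEDIA_A_MPEG4", "MEDIA_V_H264"])
# Illustrative values only: new version number, ICE credentials, and address.
packet = {
    "version": 4,
    "ice_ufrag": "newU",
    "ice_pwd": "newPasswordValue0123456",
    "ip": "10.0.0.7",
}
resolved = resolve_incremental(packet, ctx)
```

Because the packet above carries no identifiers, the call resolves to the SDP_AV instruction template and the two media templates already recorded in the context, and everything left in the packet is treated as the incremental dynamic change field set.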

[0068] After parsing the incremental request packet, the receiving end retrieves the corresponding incremental instruction template from its local preset instruction template library using the incremental instruction template identifier as an index; and retrieves the corresponding incremental media template from its local preset media template library using the incremental media template identifier as an index. The structures of the incremental instruction template and the incremental media template are consistent with the template structure retrieved during the initial session establishment, ensuring consistency in the SDP signaling format.

[0069] Subsequently, the receiving end fills each field from the incremental dynamic change field set, together with the incremental media template, into the corresponding positions according to the preset field mapping rules in the incremental instruction template. The filling logic is the same as that used during the initial session establishment. Specifically, the new session version number is filled into the session origin line, the new interactive connection establishment authentication credentials are filled into the ICE authentication attribute lines, the updated network address is filled into the connection information line, and the codec configuration and extended attributes in the incremental media template are filled into the corresponding codec attribute lines and extended attribute lines. After the filling is completed, the reconstructed SDP signaling is generated. The reconstructed SDP signaling contains the updated session parameters and can be used directly to perform interactive connection connectivity checks and complete the reconstruction of the transmission path.

[0070] In one specific embodiment, the receiving end and the sending end have completed the initial session establishment. Assume that the instruction template identifier recorded in the current session context is SDP_AV, and the media template identifiers are MEDIA_A_MPEG4 and MEDIA_V_H264. When the sending end switches from a WiFi network to a 4G network, an interactive connection restart event is triggered.

[0071] The sending end constructs an incremental request packet containing: a new session version number, a new ICE username, a new ICE password, and an updated network address. Since the current session's SDP_AV instruction template and MEDIA_A_MPEG4 and MEDIA_V_H264 media templates are reused, the transmission of the instruction template identifier and media template identifier is omitted in this incremental request packet.

[0072] After receiving the incremental request packet, the receiving end parses it: no incremental instruction template identifier is detected, so SDP_AV in the session context is reused; no incremental media template identifier is detected, so MEDIA_A_MPEG4 and MEDIA_V_H264 in the session context are reused; and the incremental dynamic change field set (version number, ICE username, ICE password, IP address) is extracted.

[0073] The receiving end retrieves the SDP_AV incremental instruction template from the instruction template library and the MEDIA_A_MPEG4 and MEDIA_V_H264 incremental media templates from the media template library. It fills the version number into the session origin line; the new ICE authentication credentials into the ICE authentication attribute lines of the audio and video description information; the IP address into the connection information line; and the media template content into the corresponding codec attribute lines and extended attribute lines, thereby generating the reconstructed SDP signaling.

[0074] After generating the reconstructed SDP signaling, the receiving end performs a new interactive connection connectivity check based on the signaling, establishes a transmission channel with the updated network address using the new ICE authentication credentials, and completes the interactive connection reconstruction. Throughout the reconstruction process, the receiving end does not call any historical SDP cache; it relies only on the locally pre-built template library and the lightweight incremental request packet to complete the signaling reconstruction, reducing the signaling transmission time from the order of seconds in traditional schemes to the order of milliseconds.

[0075] This embodiment provides an SDP signaling transmission method. This method parses the signaling data sent by the sender, extracting instruction template identifiers, media template identifiers, and dynamically changing field sets. This eliminates the need for the sender to transmit the complete SDP text, replacing it with lightweight identifiers and a few dynamic fields, thus reducing the amount of signaling data transmitted. By retrieving the instruction template corresponding to the instruction template identifier from a preset instruction template library, network transmission of standard session structure fields is avoided, eliminating the redundant overhead of fixed-frame content. By retrieving the target media template corresponding to the media template identifier from a preset media template library, the lengthy encoding / decoding attribute set is replaced with a short identifier, compressing the signaling volume. By filling the fields from the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling, the receiver can complete the signaling reconstruction locally without relying on historical caches. This avoids the parsing, splicing, and transmission latency of the complete SDP signaling, reducing the amount of SDP signaling data transmitted and thus improving the transmission efficiency of SDP signaling.

[0076] Please refer to Figure 2, which is a schematic diagram of the structure of a first embodiment of an SDP signaling transmission device provided in this application. The SDP signaling transmission device is used to execute the aforementioned SDP signaling transmission method.

[0077] As shown in Figure 2, the SDP signaling transmission device 200 includes: a data parsing module 201, an instruction template acquisition module 202, a media template acquisition module 203, and a signaling generation module 204.

[0078] The data parsing module 201 is used to parse the signaling data sent by the sending end and extract the instruction template identifier, the media template identifier, and the dynamically changing field set; the instruction template acquisition module 202 is used to retrieve the instruction template corresponding to the instruction template identifier from a preset instruction template library; the media template acquisition module 203 is used to retrieve the target media template corresponding to the media template identifier from a preset media template library; and the signaling generation module 204 is used to fill the fields in the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling.
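
The cooperation of the four modules can be sketched as a single class. This is a simplified illustration under stated assumptions: the class name, the method names, the `{name}` placeholder syntax, and the template identifiers SDP_A and MEDIA_A_PCMU are hypothetical, and each template is reduced to one or two lines.

```python
class SdpSignalingDevice:
    """Sketch of device 200: four modules chained in order."""

    def __init__(self, instruction_library: dict, media_library: dict):
        self.instruction_library = instruction_library
        self.media_library = media_library

    def parse(self, data: dict):
        """Data parsing module 201: identifiers plus dynamic fields."""
        data = dict(data)
        return data.pop("template"), data.pop("media"), data

    def get_instruction_template(self, template_id: str) -> str:
        """Instruction template acquisition module 202."""
        return self.instruction_library[template_id]

    def get_media_template(self, media_id: str) -> str:
        """Media template acquisition module 203."""
        return self.media_library[media_id]

    def generate(self, data: dict) -> str:
        """Signaling generation module 204: fill template and fields."""
        template_id, media_id, fields = self.parse(data)
        sdp = self.get_instruction_template(template_id)
        sdp = sdp.replace("{media}", self.get_media_template(media_id))
        for name, value in fields.items():
            sdp = sdp.replace("{" + name + "}", str(value))
        return sdp


device = SdpSignalingDevice(
    {"SDP_A": "v=0\no=- {random} {version} IN IP4 127.0.0.1\n{media}"},
    {"MEDIA_A_PCMU": "a=rtpmap:0 PCMU/8000"},
)
sdp = device.generate(
    {"template": "SDP_A", "media": "MEDIA_A_PCMU", "random": 1, "version": 3}
)
```

Keeping the two template libraries as constructor arguments mirrors the separation between the preset instruction template library and the preset media template library described in the method embodiments.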

[0079] It should be noted that, as those skilled in the art will understand, for convenience and brevity of description, the specific working processes of the above-described apparatus and modules may be understood by reference to the corresponding processes in the aforementioned SDP signaling transmission method embodiments and are not repeated here.

[0080] The apparatus provided in the above embodiments can be implemented as a computer program, which can run on, for example, the computer device shown in Figure 3.

[0081] Please refer to Figure 3, which is a schematic block diagram of a computer device provided in an embodiment of this application. The computer device may be a server.

[0082] Referring to Figure 3, the computer device includes a processor, a memory, and a network interface connected via a system bus, wherein the memory may include a non-volatile storage medium and an internal memory.

[0083] Non-volatile storage media can store operating systems and computer programs. These computer programs include program instructions that, when executed, cause the processor to perform any SDP signaling transmission method.

[0084] The processor provides computing and control capabilities, supporting the operation of the entire computer device.

[0085] Internal memory provides an environment for the execution of computer programs stored in non-volatile storage media. When the computer program is executed by the processor, it enables the processor to execute any SDP signaling transmission method.

[0086] The network interface is used for network communication, such as sending assigned tasks. Those skilled in the art will understand that the structure shown in Figure 3 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, combine certain components, or have different component arrangements.

[0087] It should be understood that the processor can be a Central Processing Unit (CPU), but it can also be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. Among these, a general-purpose processor can be a microprocessor or any conventional processor.

[0088] In one embodiment, the processor is configured to run a computer program stored in memory to perform the following steps: parsing the signaling data sent by the sending end to extract the instruction template identifier, the media template identifier, and the dynamically changing field set; retrieving the instruction template corresponding to the instruction template identifier from the preset instruction template library; retrieving the target media template corresponding to the media template identifier from the preset media template library; and filling the fields in the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling.

[0089] The embodiments of this application also provide a computer-readable storage medium storing a computer program, the computer program including program instructions, and the processor executing the program instructions to implement any of the SDP signaling transmission methods provided in the embodiments of this application.

[0090] The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, such as the hard disk or memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the computer device.

[0091] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in this application, and these modifications or substitutions should all be covered within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. An SDP signaling transmission method, characterized in that the method comprises: parsing the signaling data sent by the sending end to extract the instruction template identifier, the media template identifier, and the dynamically changing field set; retrieving the instruction template corresponding to the instruction template identifier from the preset instruction template library; retrieving the target media template corresponding to the media template identifier from the preset media template library; and filling the fields in the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling.

2. The SDP signaling transmission method according to claim 1, wherein the media template identifier includes an audio template identifier and/or a video template identifier; and the target media template includes a target audio template and/or a target video template.

3. The SDP signaling transmission method according to claim 2, wherein the step of retrieving the target media template corresponding to the media template identifier from the preset media template library includes: based on the audio template identifier, retrieving at least one target audio template from the media template library; and/or, based on the video template identifier, retrieving at least one target video template from the media template library.

4. The SDP signaling transmission method according to claim 2, wherein the target audio template includes audio configuration information and additional audio attributes; and the target video template includes video configuration information and additional video attributes.

5. The SDP signaling transmission method according to claim 1, characterized in that the step of filling each field in the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling includes: obtaining the field mapping rules corresponding to the instruction template; and, according to the field mapping rules, filling each field in the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling.

6. The SDP signaling transmission method according to claim 1, characterized in that, after filling the fields in the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate the target SDP signaling, the method further includes: when an interactive connection restart event is triggered, receiving an incremental request packet sent by the sending end; parsing the incremental request packet to determine the incremental instruction template, the incremental media template, and the incremental dynamic change field set; and filling each field in the incremental dynamic change field set and the incremental media template into the incremental instruction template to generate reconstructed SDP signaling.

7. The SDP signaling transmission method according to claim 1, characterized in that, before parsing the signaling data sent by the sending end and extracting the instruction template identifier, the media template identifier, and the dynamically changing field set, the method further includes: when pairing with the sending end for the first time, predefining at least one instruction template and at least one target media template; assigning a unique instruction template identifier to each instruction template to construct the instruction template library; and assigning a unique media template identifier to each target media template to construct the media template library.

8. The SDP signaling transmission method according to claim 1, characterized in that the dynamically changing field set includes at least the session version number, the interactive connection establishment authentication credentials, the fingerprint value, the IP address, the synchronization source identifier value, and the media stream identifier information.

9. An SDP signaling transmission device, characterized in that the SDP signaling transmission device includes: a data parsing module, used to parse the signaling data sent by the sending end and extract the instruction template identifier, the media template identifier, and the dynamically changing field set; an instruction template acquisition module, used to retrieve the instruction template corresponding to the instruction template identifier from a preset instruction template library; a media template acquisition module, used to retrieve the target media template corresponding to the media template identifier from a preset media template library; and a signaling generation module, used to fill the fields in the dynamically changing field set and the target media template into the corresponding filling positions of the instruction template to generate target SDP signaling.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, wherein the computer program, when executed by a processor, implements the steps of the SDP signaling transmission method according to any one of claims 1 to 8.