Data encapsulation method and electronic device

By adding a custom box to the image file format to store T.35 information, the problem that existing formats cannot carry HDR metadata is solved, and the correct display and tone mapping of HDR images are achieved.

WO2025152050A9 · PCT designated stage · Publication Date: 2026-06-18 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
HUAWEI TECH CO LTD
Filing Date
2024-01-16
Publication Date
2026-06-18

Figure CN2024072644_18062026_PF_FP_ABST

Abstract

Provided in the embodiments of the present application are a data encapsulation method and an electronic device. The method comprises: first, acquiring T.35 information, wherein the T.35 information comprises a country code, a terminal provider code and a terminal provider oriented code; and then encapsulating the T.35 information into an image file format, wherein the image file format comprises a first box, and the first box comprises the T.35 information. In this way, a file can carry the T.35 information; and when the T.35 information carries multimedia metadata (comprising HDR metadata), correct display of an HDR image is completed by means of restoring and correctly using the HDR metadata at a decoding end.

Description

Data encapsulation method and electronic device

Technical Field

[0001] This application relates to the field of data processing, and more particularly to a data encapsulation method and an electronic device.

Background

[0002] Currently, image file formats such as the High Efficiency Image File Format (HEIF) and the AV1 Image File Format (AVIF) can encapsulate some metadata. However, these formats do not support encapsulating the High Dynamic Range (HDR) metadata that HDR standards commonly carry in T.35 information. T.35 information can carry information specified in ITU-T (International Telecommunication Union Telecommunication Standardization Sector) Recommendation T.35, "Procedure for the allocation of ITU-T defined codes for non-standard facilities." Because existing image file formats cannot carry T.35 information, the correct HDR metadata cannot be recovered and used at the decoding end (e.g., to perform tone mapping based on the correct HDR metadata), and HDR images therefore cannot be displayed correctly.

[0003] Summary of the Invention

[0004] In view of this, this application provides a data encapsulation method and an electronic device.

[0005] In a first aspect, embodiments of this application provide a data encapsulation method, the method comprising: firstly, acquiring T.35 information; wherein the T.35 information includes: a country code, a terminal provider code, and a terminal provider oriented code; and then, encapsulating the T.35 information into an image file format; wherein the image file format includes a first box, the first box including the T.35 information.

[0006] Specifically, this application can pre-define T.35 information and its position within the image file format according to the specifications / standards of the image file format. Then, the T.35 information can be encapsulated into the image file format according to the pre-defined T.35 information and its position. This enables the image file format to carry T.35 information. When the T.35 information carries multimedia metadata (which may include HDR metadata), the HDR metadata can be recovered and correctly used at the decoding end to achieve the correct display of the HDR image.

[0007] For example, this application can add a custom box to the existing image file format. This new box can be used to store T.35 information, and the position of the new box can be defined. To distinguish the new box from the existing boxes in the image file format, the new box can be called the first box (in some scenarios, it can also be called the it35 Box, or other names; this application does not limit this). In other words, this application can encapsulate T.35 information into the first box of the image file format.
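As an illustration of encapsulating T.35 information into a first box, the following is a minimal sketch only. It assumes a box made of a 32-bit big-endian size, a four-character type and a body, and it assumes a hypothetical body layout (1-byte country code, 2-byte terminal provider code, then the terminal provider oriented code bytes). The function names, the field widths and the example values are assumptions for illustration and are not defined by this application.

```python
import struct


def make_box(box_type: bytes, body: bytes) -> bytes:
    """Serialize a simple box: 32-bit big-endian size, 4-byte type, then the body."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly 4 bytes")
    return struct.pack(">I4s", 8 + len(body), box_type) + body


def make_it35_box(country_code: int, provider_code: int,
                  provider_oriented_code: bytes) -> bytes:
    """Build a hypothetical 'it35' first box holding the T.35 fields.

    Assumed layout (not taken from the application): 1-byte country code,
    2-byte terminal provider code, then the provider oriented code bytes.
    """
    body = struct.pack(">BH", country_code, provider_code) + provider_oriented_code
    return make_box(b"it35", body)


# Illustrative values only.
it35_box = make_it35_box(0xB5, 0x003C, b"\x00\x01")
print(it35_box.hex())
```

The box type 'it35' follows the name mentioned above; any other four-character type could be used in the same way.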

[0008] For example, image file formats may include, but are not limited to, JPEG, WebP, PNG, HEIF, AVIF, and JXL, etc., and this application does not limit them. This application uses HEIF as an example for illustration.

[0009] For example, a box is a general mechanism for organizing and storing data, enabling files to contain various types of information in a flexible manner. The hierarchy and content of boxes are defined by relevant standard specifications to ensure correct file parsing and processing. The term "box" may have other names in other image file formats or newly developed image file formats; this application does not limit the name of the box.
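The box mechanism described above can be sketched as a simple walk over consecutive boxes. This is a simplified illustration that assumes each box starts with a 32-bit big-endian size followed by a four-character type; 64-bit sizes and boxes that extend to the end of the file are deliberately rejected here. The function name iter_boxes and the sample bytes are hypothetical.

```python
import struct
from typing import Iterator, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (type, body) for each box laid out back to back in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        # Sizes 0 and 1 have special meanings in real files; this sketch skips them.
        if size < 8 or offset + size > len(data):
            raise ValueError(f"unsupported or malformed box at offset {offset}")
        yield box_type.decode("ascii"), data[offset + 8:offset + size]
        offset += size


# Two boxes: a 12-byte 'ftyp' with a 4-byte body and an empty 'mdat'.
sample = struct.pack(">I4s", 12, b"ftyp") + b"heic" + struct.pack(">I4s", 8, b"mdat")
boxes = list(iter_boxes(sample))
print(boxes)
```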

[0010] According to the first aspect, the image file format also includes a second box, which includes any one of the following: a metadata box or a streaming data box; the first box is nested within the second box.

[0011] For example, a metadata box, which can be used to store multimedia metadata, can be called a MetaBox (meta).

[0012] For example, a streaming data box, which can be used to store a sequence of images, can be called a MovieBox (moov).

[0013] In other words, the first box can be nested within a MetaBox or a MovieBox; this application does not impose any restrictions on this.

[0014] According to the first aspect, or any implementation of the first aspect above, the second box includes multiple third boxes, and the multiple third boxes have a hierarchical relationship; the first box is nested in any level of the third box.

[0015] For example, when the second box is a metadata box, the metadata box may include multiple third boxes, including but not limited to: item information box (iinf), item information entry box (infe), item property container box (ipco), item properties association box (ipma), and item location box (iloc), etc.; the hierarchical relationship of these multiple third boxes can be as shown in structure 1 in the specific embodiment; the first box can be nested in any level of the third box.

[0016] When the second box is a streaming data box, the streaming data box may include multiple third boxes, including but not limited to: SampleDescriptionBox (stsd), Sample Group Description Box (sgpd), metadata Sample Entry Box, timed metadata Track Box, Track Box, etc. The hierarchical relationship of multiple third boxes can be as shown in structure 2 in the specific implementation; the first box can be nested in any level of third box.
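Because the first box may sit at any level of the third boxes, a reader has to search the box hierarchy rather than a fixed position. The sketch below models boxes as an in-memory tree and returns the path to the first box. The tree shown is an assumed example (it is not structure 1 or structure 2 of the specific embodiments), and the 'it35' type and the names Box and find_path are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Box:
    type: str
    children: List["Box"] = field(default_factory=list)


def find_path(root: Box, target: str,
              prefix: Tuple[str, ...] = ()) -> Optional[Tuple[str, ...]]:
    """Depth-first search returning the chain of box types leading to target."""
    path = prefix + (root.type,)
    if root.type == target:
        return path
    for child in root.children:
        found = find_path(child, target, path)
        if found is not None:
            return found
    return None


# Assumed example hierarchy inside a metadata box.
meta = Box("meta", [
    Box("iinf", [Box("infe")]),
    Box("iprp", [Box("ipco", [Box("it35")]), Box("ipma")]),
    Box("iloc"),
])
path = find_path(meta, "it35")
print(path)
```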

[0017] According to the first aspect, or any implementation of the first aspect above, the image file format further includes an item identifier and a media box. The item identifier indicates that the T.35 information is an item. The T.35 information includes header information and a payload. The header information includes a country code, a terminal provider code, and a terminal provider oriented code. The payload includes multimedia metadata. The first box includes the header information of the T.35 information, and the media box includes the payload of the T.35 information.
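The split between header information in the first box and payload in the media box can be sketched as follows. This is an illustration under stated assumptions: the header is taken to be 5 bytes long, the location record is a plain dictionary standing in for whatever the file format uses to locate item data, and the offset is counted from the start of the media box. All names and values are hypothetical.

```python
import struct

HEADER_LEN = 5  # assumed: 1-byte country code + 2-byte provider code + 2-byte oriented code


def make_box(box_type: bytes, body: bytes) -> bytes:
    """Serialize a simple box: 32-bit big-endian size, 4-byte type, then the body."""
    return struct.pack(">I4s", 8 + len(body), box_type) + body


def encapsulate(t35_message: bytes):
    """Put the T.35 header in the first box and the payload in the media box."""
    header, payload = t35_message[:HEADER_LEN], t35_message[HEADER_LEN:]
    first_box = make_box(b"it35", header)
    media_box = make_box(b"mdat", payload)
    # Hypothetical location record; the offset is relative to the media box start.
    location = {"item_id": 1, "offset": 8, "length": len(payload)}
    return first_box, media_box, location


def recover(first_box: bytes, media_box: bytes, location: dict) -> bytes:
    """Rebuild the T.35 message at the decoding end."""
    header = first_box[8:]
    start = location["offset"]
    return header + media_box[start:start + location["length"]]


message = bytes([0xB5, 0x00, 0x3C, 0x00, 0x01]) + b"hdr-metadata"
first_box, media_box, location = encapsulate(message)
restored = recover(first_box, media_box, location)
print(restored == message)
```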

[0018] For example, an item is data that does not require periodic processing (i.e., untimed data), as opposed to sample data. An item identifier can be called an item ID.

[0019] For example, the payload data can be called the payload, and the header information can be called the header.

[0020] For example, the metadata for multimedia may include HDR metadata.

[0021] For example, HDR metadata may include: static metadata and / or dynamic metadata.

[0022] For example, static metadata includes the color space of the mastering display, the maximum and minimum brightness of the mastering display, the maximum brightness of the video sequence, and the maximum average brightness of the video sequence. Functionally, it gives the display device a reference for displaying the content correctly and reproducing the creator's intent. Within a video sequence, static metadata does not change over time. For example, the maximum brightness of a video sequence is the maximum, over all frames, of each frame's maximum brightness.
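The two sequence-level brightness values mentioned above can be sketched numerically. The sketch assumes each frame is given as a flat list of per-pixel brightness values in nits; the function names and the sample values are illustrative and are not taken from this application.

```python
from typing import List, Sequence


def sequence_max_brightness(frames: Sequence[Sequence[float]]) -> float:
    """Maximum over all frames of each frame's maximum brightness."""
    return max(max(frame) for frame in frames)


def sequence_max_average_brightness(frames: Sequence[Sequence[float]]) -> float:
    """Maximum over all frames of each frame's average brightness."""
    return max(sum(frame) / len(frame) for frame in frames)


# Three tiny "frames" of two pixels each (illustrative values in nits).
frames: List[List[float]] = [[100.0, 400.0], [50.0, 1000.0], [200.0, 200.0]]
peak = sequence_max_brightness(frames)
peak_average = sequence_max_average_brightness(frames)
print(peak, peak_average)
```

Both values are computed once for the whole sequence, which is why they can be carried as static metadata.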

[0023] For example, dynamic metadata includes most of the static metadata and also adds metadata related to tone mapping on the display device. Functionally, it contains information closely related to the display and tone mapping processes, such as indicating the selection of tone mapping algorithms and parameters from high dynamic range to relatively low dynamic range. In video sequences, dynamic metadata often changes over time; for example, the maximum brightness of the image and the maximum brightness of the main display change during scene transitions.

[0024] For example, a media box, which can be used to store image data, can be called a MediaDataBox(mdat).

[0025] This approach is suitable for situations where T.35 information carries static metadata.

[0026] It should be understood that multimedia metadata can be image metadata, audio metadata, video metadata, etc., and this application embodiment does not limit this.

[0027] According to the first aspect, or any implementation of the first aspect above, the T.35 information is an item, and the metadata box includes an item information box; the first box is nested in the item information box, and the first box is an item information entry box.

[0028] For example, the item information box, which can be used to store item information, can be called an ItemInfoBox (iinf box).

[0029] For example, the item information entry box, which can be used to store item information entries, can be called an ItemInfoEntry (infe box). That is, the first box is an item information entry box.

[0030] This approach is suitable for situations where T.35 information carries static metadata.

[0031] According to the first aspect, or any of the above implementations of the first aspect, the T.35 information is an item property, and the metadata box includes an item property container box and an item property association box; the first box is nested in the item property container box, and the item property association box includes the association information between the T.35 information and the corresponding item.

[0032] For example, an item property can simply be called a property.

[0033] For example, the item property container box, which can be used to store item properties, can be called an ItemPropertyContainerBox (ipco box).

[0034] For example, the item property association box, which can be used to store item property association information, can be called the item properties association box (ipma box).

[0035] For example, a property is an attribute associated with an item.

[0036] This approach is suitable for situations where T.35 information carries static metadata.
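The relationship between the container of properties and the association information can be sketched with two plain data structures. This is a simplified model, not a parser: the property list stands in for the ipco box, the mapping stands in for the ipma box, property indices are treated as 1-based, and the item IDs and the 'it35' property type are hypothetical.

```python
from typing import Dict, List

# Stand-in for the ipco box: an ordered list of property types.
ipco: List[str] = ["ispe", "colr", "it35"]

# Stand-in for the ipma box: item ID -> 1-based indices into ipco.
ipma: Dict[int, List[int]] = {1: [1, 2, 3], 2: [1, 2]}


def properties_of(item_id: int) -> List[str]:
    """Return the property types associated with an item."""
    return [ipco[index - 1] for index in ipma.get(item_id, [])]


def carries_t35(item_id: int) -> bool:
    """True if the item is associated with the T.35 property."""
    return "it35" in properties_of(item_id)


print(properties_of(1), carries_t35(1), carries_t35(2))
```

Storing the T.35 property once and associating it with several items avoids repeating the same data for each item.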

[0037] According to the first aspect, or any implementation of the first aspect above, the T.35 information is a sample entry, the streaming data box includes a track box, and the track box includes a sample description box; the first box is nested in the sample description box.

[0038] For example, a sample entry can be called a Sample Entry. Here, a Sample is all the data associated with a single time point. Alternatively, a Sample is data at a specific point in time, or timed data, or time-varying data.

[0039] For example, a track box can also be called a Track Box. A track is a timed sequence of related samples in an ISO base media file. For media data, a track corresponds to a sequence of images or sampled audio; for hint tracks, a track corresponds to a streaming channel.

[0040] It should be noted that in HEIF, continuous or timed media or metadata streams form a track, while static media or metadata are stored as items.

[0041] For example, the sample description box, which can be used to store sample entries, can be called a SampleDescriptionBox (stsd Box).

[0042] This approach is suitable for situations where T.35 information carries static metadata.

[0043] According to the first aspect, or any implementation of the first aspect above, the T.35 information is a sample group entry, the streaming data box includes a track box, and the track box includes a sample group description box; the first box is nested in the sample group description box.

[0044] For example, the sample group description box can store attributes describing the sample group and can be called the Sample Group Description Box (sgpd Box).

[0045] All entries in a SampleToGroupBox or CompactSampleToGroupBox with the same grouping_type_parameter value should map to the same type of T.35 message. This means that the T.35 sample group entries share at least the same ITU-T T.35 country code, terminal provider code, and terminal provider oriented code, as well as the same T.35-specific type and version information (if applicable). This is particularly useful when the metadata carried by the T.35 message is not frequent (i.e., the number of identical metadata entries is small) and therefore will not cause problems due to the size of the MovieBox.

[0046] This approach reduces the redundancy of storing the same metadata in each sample and is suitable for situations where T.35 information carries static or dynamic metadata.
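The redundancy reduction can be sketched as follows: each distinct metadata value is stored once as a group description entry, and samples refer to it by index. The sketch assumes per-sample metadata is given as byte strings, uses 1-based entry indices, and expresses the sample-to-group mapping as runs of (sample count, entry index). The function name and sample data are hypothetical, and no box serialization is attempted.

```python
from itertools import groupby
from typing import Dict, List, Sequence, Tuple


def build_sample_groups(
    per_sample: Sequence[bytes],
) -> Tuple[List[bytes], List[Tuple[int, int]]]:
    """Deduplicate per-sample metadata into entries plus run-length mapping."""
    entries: List[bytes] = []
    index_of: Dict[bytes, int] = {}
    indices: List[int] = []
    for metadata in per_sample:
        if metadata not in index_of:
            entries.append(metadata)
            index_of[metadata] = len(entries)  # 1-based entry index
        indices.append(index_of[metadata])
    # Collapse consecutive samples that share an entry into (count, index) runs.
    runs = [(len(list(group)), index) for index, group in groupby(indices)]
    return entries, runs


entries, runs = build_sample_groups([b"A", b"A", b"B", b"A"])
print(entries, runs)
```

Four samples are described by two stored entries, so identical metadata is not repeated per sample.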

[0047] According to the first aspect, or any implementation of the first aspect above, the T.35 information is a sample entry, the streaming data box includes a track box, and the track box includes a sample description box and a sample group description box; the T.35 information includes header information and a payload, the header information includes a country code, a terminal provider code and a terminal provider oriented code, and the payload includes multimedia metadata; the first box includes the header information of the T.35 information, and the first box is nested in the sample group description box; the sample description box includes the payload of the T.35 information.

[0048] This allows for the processing of T.35 metadata that is very frequent (with a large number of identical metadata entries).

[0049] This approach is suitable for situations where T.35 information carries either static or dynamic metadata. Dynamic metadata is part of the payload of the T.35 information.

[0050] According to the first aspect, or any implementation of the first aspect above, the T.35 information is a sample entry, the streaming data box includes a timed metadata track box, the timed metadata track box includes a metadata sample entry box, and the first box is nested in the metadata sample entry box.

[0051] For example, the timed metadata track box, which can be used to store timed metadata, can be called a timed metadata Track Box.

[0052] For example, the metadata sample entry box, which can be used to store timed metadata entries, can be called a metadata Sample Entry Box.

[0053] This approach is suitable for situations where T.35 information carries dynamic metadata.

[0054] According to the first aspect, or any implementation of the first aspect above, the method further includes: acquiring encoded multimedia data; encapsulating the encoded multimedia data into an image file format, the image file format further including a media box or a streaming data box, the media box or streaming data box including the encoded multimedia data. In this way, encoded multimedia data and T.35 information can be encapsulated into the same image file format.

[0055] According to the first aspect, or any implementation of the first aspect above, the T.35 information also includes multimedia metadata.

[0056] In a second aspect, embodiments of this application provide a data acquisition method, the method comprising: first, receiving an image file format; wherein the image file format includes a first box, the first box including T.35 information; then, reading the T.35 information from the first box of the image file format; wherein the T.35 information includes: a country code, a terminal provider code, and a terminal provider oriented code.
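The reading side can be sketched as locating the first box and extracting the three fields. This is a minimal sketch that only scans top-level boxes and assumes the same hypothetical layout as an encapsulating side would use: a box type of 'it35' whose body holds a 1-byte country code, a 2-byte terminal provider code, and then the terminal provider oriented code. The names and sample bytes are assumptions for illustration.

```python
import struct
from typing import NamedTuple


class T35Header(NamedTuple):
    country_code: int
    terminal_provider_code: int
    terminal_provider_oriented_code: bytes


def read_it35(file_bytes: bytes) -> T35Header:
    """Scan top-level boxes and parse the first hypothetical 'it35' box found."""
    offset = 0
    while offset + 8 <= len(file_bytes):
        size, box_type = struct.unpack_from(">I4s", file_bytes, offset)
        if size < 8 or offset + size > len(file_bytes):
            raise ValueError(f"unsupported or malformed box at offset {offset}")
        if box_type == b"it35":
            body = file_bytes[offset + 8:offset + size]
            # Assumed layout: 1-byte country code, 2-byte provider code, rest.
            country, provider = struct.unpack_from(">BH", body, 0)
            return T35Header(country, provider, body[3:])
        offset += size
    raise KeyError("no it35 box found")


sample_file = (
    struct.pack(">I4s", 12, b"ftyp") + b"heic"
    + struct.pack(">I4s", 13, b"it35") + bytes([0xB5, 0x00, 0x3C, 0x00, 0x01])
)
header = read_it35(sample_file)
print(header)
```

A full reader would descend into container boxes as well, since the first box may be nested at any level.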

[0057] According to the second aspect, the image file format also includes a second box, which includes any one of the following: a metadata box or a streaming data box; the first box is nested within the second box.

[0058] According to the second aspect, or any implementation of the second aspect above, the second box includes multiple third boxes, and the multiple third boxes have a hierarchical relationship; the first box is nested in any level of the third box.

[0059] According to the second aspect, or any implementation of the second aspect above, the image file format also includes an item identifier and a media box. The item identifier indicates that the T.35 information is an item. The T.35 information includes header information and a payload. The header information includes a country code, a terminal provider code, and a terminal provider oriented code. The payload includes multimedia metadata. The first box includes the header information of the T.35 information, and the media box includes the payload of the T.35 information.

[0060] According to the second aspect, or any implementation of the second aspect above, the T.35 information is an item, and the metadata box includes an item information box; the first box is nested in the item information box, and the first box is an item information entry box.

[0061] According to the second aspect, or any implementation of the second aspect above, the T.35 information is an item property, and the metadata box includes an item property container box and an item property association box; the first box is nested in the item property container box, and the item property association box includes the association information between the T.35 information and the corresponding item.

[0062] According to the second aspect, or any implementation of the second aspect above, the T.35 information is a sample entry, the streaming data box includes a track box, and the track box includes a sample description box; the first box is nested in the sample description box.

[0063] According to the second aspect, or any implementation of the second aspect above, the T.35 information is a sample group entry, the streaming data box includes a track box, and the track box includes a sample group description box; the first box is nested in the sample group description box.

[0064] According to the second aspect, or any implementation of the second aspect above, the T.35 information is a sample entry, the streaming data box includes a track box, and the track box includes a sample description box and a sample group description box; the T.35 information includes header information and a payload, the header information includes a country code, a terminal provider code and a terminal provider oriented code, and the payload includes multimedia metadata; the first box includes the header information of the T.35 information, and the first box is nested in the sample group description box; the sample description box includes the payload of the T.35 information.

[0065] According to the second aspect, or any implementation of the second aspect above, the T.35 information is a sample entry, the streaming data box includes a timed metadata track box, the timed metadata track box includes a metadata sample entry box, and the first box is nested in the metadata sample entry box.

[0066] According to the second aspect, or any implementation of the second aspect above, the image file format further includes a media box or a streaming data box, the media box or streaming data box including encoded multimedia data; the method further includes: obtaining the encoded multimedia data from the media box or streaming data box of the image file format; decoding the encoded multimedia data to obtain reconstructed multimedia data.

[0067] The second aspect and any implementation thereof correspond to the first aspect and any implementation thereof, respectively. The technical effects of the second aspect and any implementation thereof are similar to those of the first aspect and any implementation thereof, and will not be repeated here.

[0068] In a third aspect, embodiments of this application provide an image file format, which is encapsulated according to the first aspect or any implementation thereof.

[0069] The third aspect and any implementation thereof correspond to the first aspect and any implementation thereof, respectively. The technical effects of the third aspect and any implementation thereof are similar to those of the first aspect and any implementation thereof, and will not be repeated here.

[0070] In a fourth aspect, embodiments of this application provide a data encapsulation device, which includes:

[0071] An acquisition module is used to acquire T.35 information; wherein the T.35 information includes: a country code, a terminal provider code, and a terminal provider oriented code.

[0072] An encapsulation module is used to encapsulate T.35 information into an image file format; wherein the image file format includes a first box, and the first box includes T.35 information.

[0073] For example, the data encapsulation device described above can be used to perform the method in the first aspect or any possible implementation of the first aspect.

[0074] The fourth aspect and any implementation thereof correspond to the first aspect and any implementation thereof, respectively. The technical effects of the fourth aspect and any implementation thereof are similar to those of the first aspect and any implementation thereof, and will not be repeated here.

[0075] In a fifth aspect, embodiments of this application provide a data reading device, which includes:

[0076] A receiving module is used to receive an image file format; wherein the image file format includes a first box, and the first box includes T.35 information;

[0077] A reading module is used to read T.35 information from the first box of the image file format; wherein the T.35 information includes: a country code, a terminal provider code, and a terminal provider oriented code.

[0078] The fifth aspect and any implementation thereof correspond to the second aspect and any implementation thereof, respectively. The technical effects of the fifth aspect and any implementation thereof are similar to those of the second aspect and any implementation thereof, and will not be repeated here.

[0079] In a sixth aspect, embodiments of this application provide an electronic device, including: a memory and a processor, the memory being coupled to the processor; the memory storing program instructions, which, when executed by the processor, cause the electronic device to perform the method in the first aspect or any possible implementation thereof.

[0080] The sixth aspect and any implementation thereof correspond to the first aspect and any implementation thereof, respectively. The technical effects of the sixth aspect and any implementation thereof are similar to those of the first aspect and any implementation thereof, and will not be repeated here.

[0081] In a seventh aspect, embodiments of this application provide an electronic device, including: a memory and a processor, the memory being coupled to the processor; the memory storing program instructions, which, when executed by the processor, cause the electronic device to perform the method in the second aspect or any possible implementation thereof.

[0082] The seventh aspect and any implementation thereof correspond to the second aspect and any implementation thereof, respectively. The technical effects corresponding to the seventh aspect and any implementation thereof are similar to those corresponding to the second aspect and any implementation thereof, and will not be repeated here.

[0083] In an eighth aspect, embodiments of this application provide a chip including one or more interface circuits and one or more processors; the one or more processors receive or send data through the one or more interface circuits, and when the one or more processors execute computer instructions, the method in the first aspect or any possible implementation thereof is performed.

[0084] The eighth aspect and any implementation thereof correspond to the first aspect and any implementation thereof, respectively. The technical effects corresponding to the eighth aspect and any implementation thereof are similar to those corresponding to the first aspect and any implementation thereof, and will not be repeated here.

[0085] In a ninth aspect, embodiments of this application provide a chip including one or more interface circuits and one or more processors; the one or more processors receive or send data through the one or more interface circuits, and when the one or more processors execute computer instructions, the method in the second aspect or any possible implementation thereof is performed.

[0086] The ninth aspect and any implementation thereof correspond to the second aspect and any implementation thereof, respectively. The technical effects corresponding to the ninth aspect and any implementation thereof are similar to those corresponding to the second aspect and any implementation thereof, and will not be repeated here.

[0087] In a tenth aspect, embodiments of this application provide a computer-readable storage medium storing a computer program that, when run on a computer or processor, causes the computer or processor to perform the method of the first aspect or any possible implementation thereof.

[0088] The tenth aspect and any implementation thereof correspond to the first aspect and any implementation thereof, respectively. The technical effects corresponding to the tenth aspect and any implementation thereof are similar to those corresponding to the first aspect and any implementation thereof, and will not be repeated here.

[0089] In an eleventh aspect, embodiments of this application provide a computer-readable storage medium storing a computer program that, when run on a computer or processor, causes the computer or processor to perform the method in the second aspect or any possible implementation thereof.

[0090] The eleventh aspect and any implementation thereof correspond to the second aspect and any implementation thereof, respectively. The technical effects corresponding to the eleventh aspect and any implementation thereof can be found in the technical effects corresponding to the second aspect and any implementation thereof, as described above, and will not be repeated here.

[0091] In a twelfth aspect, embodiments of this application provide a computer program product including computer instructions that, when executed by a computer or processor, cause the computer or processor to perform the method in the first aspect or any possible implementation thereof.

[0092] The twelfth aspect and any implementation thereof correspond to the first aspect and any implementation thereof, respectively. The technical effects of the twelfth aspect and any implementation thereof are similar to those of the first aspect and any implementation thereof, and will not be repeated here.

[0093] In a thirteenth aspect, embodiments of this application provide a computer program product including computer instructions that, when executed by a computer or processor, cause the computer or processor to perform the method in the second aspect or any possible implementation thereof.

[0094] The thirteenth aspect and any implementation thereof correspond to the second aspect and any implementation thereof, respectively. The technical effects corresponding to the thirteenth aspect and any implementation thereof are similar to those corresponding to the second aspect and any implementation thereof, and will not be repeated here.

[0095] In a fourteenth aspect, embodiments of this application provide a computer-readable storage medium storing an image file format as described in the third aspect or any implementation thereof.

[0096] The fourteenth aspect and any implementation thereof correspond to the third aspect and any implementation thereof, respectively. The technical effects of the fourteenth aspect and any implementation thereof are similar to those of the third aspect and any implementation thereof, and will not be repeated here.

[0097] In a fifteenth aspect, embodiments of this application provide an apparatus for storing image file formats, the apparatus comprising: a receiver and at least one storage medium, the receiver being used to receive image file formats; and at least one storage medium being used to store the image file formats described in the third aspect or any implementation thereof.

[0098] The fifteenth aspect and any implementation thereof correspond to the third aspect and any implementation thereof, respectively. The technical effects of the fifteenth aspect and any implementation thereof are similar to those of the third aspect and any implementation thereof, and will not be repeated here.

[0099] In a sixteenth aspect, embodiments of this application provide an apparatus for transmitting an image file format, the apparatus comprising: a transmitter and at least one storage medium, the at least one storage medium being used to store the image file format in the third aspect or any implementation thereof; the transmitter being used to obtain the image file format from the storage medium and transmit the image file format to an end-side device via the transmission medium.

[0100] The sixteenth aspect and any implementation thereof correspond to the third aspect and any implementation thereof, respectively. The technical effects of the sixteenth aspect and any implementation thereof are similar to those of the third aspect and any implementation thereof, and will not be repeated here.

[0101] In a seventeenth aspect, embodiments of this application provide a system for distributing image file formats. The system includes: at least one storage medium for storing the image file formats described in the third aspect or any implementation thereof; and a streaming media device for obtaining a target image file format from the at least one storage medium and sending the target image file format to an end-side device, wherein the streaming media device includes a content server or a content distribution server.

[0102] The seventeenth aspect and any implementation thereof correspond to the third aspect and any implementation thereof, respectively. The technical effects corresponding to the seventeenth aspect and any implementation thereof are similar to those corresponding to the third aspect and any implementation thereof, and will not be repeated here.

[0103] In an eighteenth aspect, embodiments of this application provide a data structure including a first box, the first box including T.35 information; wherein the T.35 information includes: a country code, a terminal provider code, and a terminal provider oriented code.

[0104] For example, the data structure can be in an image file format.

[0105] According to the eighteenth aspect, the data structure also includes a second box, which includes any one of the following: a metadata box or a streaming data box; the first box is nested within the second box.

[0106] According to the eighteenth aspect, or any implementation of the eighteenth aspect above, the second box includes multiple third boxes that have a hierarchical relationship; the first box is nested in a third box at any level.

[0107] According to the eighteenth aspect, or any implementation of the eighteenth aspect above, the data structure further includes a project identifier and a media box, wherein the project identifier indicates that the T.35 information is a project; the T.35 information includes header information and a payload, wherein the header information includes a country code, a terminal provider code, and a terminal provider oriented code, and the payload includes multimedia metadata; the first box includes the header information of the T.35 information, and the media box includes the payload of the T.35 information.

[0108] According to the eighteenth aspect, or any implementation of the eighteenth aspect above, the T.35 information is a project, and the metadata box includes an image information box; the first box is nested within the image information box, and the first box is a project information entry box.

[0109] According to the eighteenth aspect, or any implementation of the eighteenth aspect above, the T.35 information is a project attribute, and the metadata box includes a project attribute container box and a project attribute association box; the first box is nested in the project attribute container box, and the project attribute association box includes the association information between the T.35 information and the corresponding project.

[0110] According to the eighteenth aspect, or any implementation of the eighteenth aspect above, the T.35 information is a sample entry, the streaming data box includes a track box, and the track box includes a sample description box; the first box is nested in the sample description box.

[0111] According to the eighteenth aspect, or any implementation of the eighteenth aspect above, the T.35 information is a sample group entry, the streaming data box includes a track box, and the track box includes a sample group description box; the first box is nested in the sample group description box.

[0112] According to the eighteenth aspect, or any implementation of the eighteenth aspect above, the T.35 information is a sample entry; the streaming data box includes a track box, and the track box includes a sample description box and a sample group description box; the T.35 information includes header information and a payload; the header information includes a country code, a terminal provider code, and a terminal provider oriented code; the payload includes multimedia metadata; the first box includes the header information of the T.35 information, and the first box is nested within the sample group description box; the sample description box includes the payload of the T.35 information.

[0113] According to the eighteenth aspect, or any implementation of the eighteenth aspect above, the T.35 information is a sample entry, the streaming data box includes a timed metadata track box, the timed metadata track box includes a metadata sample entry box, and the first box is nested in the metadata sample entry box.

[0114] According to the eighteenth aspect, or any implementation thereof, the data structure further includes a media box or a streaming data box, which includes encoded multimedia data.

Attached Figure Description

[0115] Figure 1A is a schematic diagram illustrating an exemplary application scenario;

[0116] Figure 1B is an exemplary end-to-end process diagram of HDR video;

[0117] Figure 2A is a schematic diagram of an exemplary data encapsulation process 200;

[0118] Figure 2B is a schematic diagram illustrating an exemplary image file format;

[0119] Figure 2C is a schematic diagram illustrating an exemplary image file format;

[0120] Figure 3 is a schematic diagram of an exemplary data acquisition process 300;

[0121] Figure 4 is a schematic diagram of an exemplary data encapsulation device 400;

[0122] Figure 5 is a schematic diagram of an exemplary data reading device 500;

[0123] Figure 6 is a schematic block diagram of an encoding / decoding system used in an embodiment of this application;

[0124] Figure 7A is a block diagram of a content delivery system for implementing content distribution services as used in an embodiment of this application;

[0125] Figure 7B is a schematic diagram of an example structure of terminal device 2106 in Figure 7A;

[0126] Figure 8A is a schematic diagram of the workflow of a streaming media system used in an embodiment of this application;

[0127] Figure 8B is a schematic diagram of a streaming media system architecture used in an embodiment of this application;

[0128] Figure 9A is a schematic diagram of a possible system architecture applicable to the embodiments of this application;

[0129] Figure 9B is a schematic diagram of the structure of an image processing system provided in an embodiment of this application;

[0130] Figure 10 is a schematic diagram of the structure of an exemplary device.

Detailed Implementation

[0131] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0132] In this document, the term "and/or" merely describes an association between related objects and indicates that three relationships are possible. For example, A and/or B can represent three situations: A exists alone, A and B exist simultaneously, and B exists alone.

[0133] The terms "first" and "second," etc., used in the specification and claims of this application are used to distinguish different objects, not to describe a specific order of objects. For example, "first target object" and "second target object," etc., are used to distinguish different target objects, not to describe a specific order of target objects.

[0134] In the embodiments of this application, the terms "exemplary" or "for example" are used to indicate that something is an example, illustration, or description. Any embodiment or design that is described as "exemplary" or "for example" in the embodiments of this application should not be construed as being more preferred or advantageous than other embodiments or designs. Rather, the use of the terms "exemplary" or "for example" is intended to present the relevant concepts in a specific manner.

[0135] In the description of the embodiments in this application, unless otherwise stated, "multiple" means two or more. For example, multiple processing units means two or more processing units; multiple systems means two or more systems.

[0136] Figure 1A is a schematic diagram of an exemplary application scenario.

[0137] Referring to Figure 1A, by way of example, the first electronic device may include an image generation module, an information generation module, an image encoding module (or an image encoder), an encapsulation module, and a sending module.

[0138] For example, the image encoding module can be a software module or a hardware module, which is not limited in the embodiments of this application.

[0139] For example, the image generation module may include, but is not limited to, an image acquisition module, an image editing module, etc., which is not limited in the embodiments of this application.

[0140] It should be understood that Figure 1A shows only one example of a first electronic device. The first electronic device in other embodiments of this application may have more or fewer modules than shown in Figure 1A, which is not limited in the embodiments of this application.

[0141] Referring to Figure 1A, the second electronic device may, by way of example, include a display module, an image decoding module (or image decoder), a decapsulation module, and a receiving module. For example, the image decoding module may be a software module or a hardware module, which is not limited in the embodiments of this application. It should be understood that Figure 1A shows only one example of a second electronic device, and the second electronic device in other embodiments of this application may have more or fewer modules than shown in Figure 1A, which is not limited in the embodiments of this application.

[0142] Referring again to Figure 1A, by way of example, after the image generation module of the first electronic device generates an image, it can output the generated image to the image encoding module and the information generation module. Subsequently, the information generation module can generate T.35 information associated with the image and output the T.35 information to the encapsulation module. The image encoding module can encode the image to obtain an image bitstream (a bitstream may also be called a bit stream) and output the encoded image bitstream to the encapsulation module. Then, the encapsulation module can encapsulate the image bitstream and the T.35 information into an image file format and output the image file format to the sending module. The sending module can then send this image file format to the second electronic device.

[0143] It should be noted that T.35 information is also binary; thus, it can be directly encapsulated / embedded into image file formats.

[0144] Subsequently, the receiving module of the second electronic device receives the image file format and outputs it to the decapsulation module. The decapsulation module then decapsulates the image file format, obtaining the image bitstream and T.35 information. Next, the decapsulation module outputs the image bitstream to the image decoding module and the T.35 information to the display module. Then, the image decoding module decodes the image bitstream to obtain the reconstructed image and outputs it to the display module, which displays the reconstructed image based on the T.35 information.

[0145] For example, the first electronic device includes, but is not limited to: a server, a PC (Personal Computer), a laptop computer, a tablet computer, a mobile phone, and a watch.

[0146] For example, the second electronic device includes, but is not limited to: PC, laptop, tablet, mobile phone, watch, head-mounted display (such as virtual reality (VR) / augmented reality (AR) headset, etc.).

[0147] For example, when the first electronic device is used for encoding and the second electronic device is used for decoding, the first electronic device can be referred to as the encoding end and the second electronic device can be referred to as the decoding end.

[0148] For example, T.35 information may include the information that multimedia is required to carry under the ITU-T T.35 specification (i.e., "Procedure for the allocation of ITU-T defined codes for non-standard facilities"): a country code, a terminal provider code, and a terminal provider oriented code. For example, the multimedia involved in this application may include, but is not limited to, images, videos, audio, and text (such as subtitles and bullet comments in a video).

[0149] For example, the T.35 information may also include metadata about the image. In one possible approach, the image encoded by the first electronic device in Figure 1A may be an HDR image; the corresponding T.35 information may also include HDR metadata.

[0150] For ease of understanding, SDR and HDR are introduced first.

[0151] Dynamic range is used in many fields to represent the ratio of the maximum to the minimum value of a variable. In digital imaging, dynamic range represents the ratio between the maximum and minimum grayscale values within the displayable range of an image. The dynamic range in nature is quite large: the brightness of a night scene under the stars is approximately 0.001 cd/m², while the sun's brightness reaches 1,000,000,000 cd/m², resulting in a dynamic range on the order of 1,000,000,000 / 0.001 = 10¹². However, in real-world natural scenes, the brightness of the sun and that of starlight do not occur simultaneously. For real-world natural scenes, the dynamic range is typically between 10⁻³ and 10⁶. Currently, most color digital images use one byte (8 bits) to store each of the R, G, and B channels, so each channel represents 256 grayscale levels, from 0 to 255. This 0-255 range is the image's dynamic range. Since the dynamic range of a real-world scene is between 10⁻³ and 10⁶, it is called high dynamic range (HDR); the dynamic range of ordinary images is, in contrast, low dynamic range (LDR). The imaging process of a digital camera is essentially a mapping from the high dynamic range of the real world to the low dynamic range of the photograph.
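By way of illustration only, the arithmetic in the preceding paragraph can be checked with a short Python sketch. The constant and function names are hypothetical; the luminance figures are the ones quoted in the paragraph, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

# Luminance figures quoted in the text, in cd/m^2 (exact rationals).
STARLIT_NIGHT = Fraction(1, 1000)   # 0.001 cd/m^2
SUN = Fraction(1_000_000_000)       # 1,000,000,000 cd/m^2


def orders_of_magnitude(ratio: Fraction) -> int:
    """Return n such that ratio == 10**n; ratio must be an exact power of ten >= 1."""
    n = 0
    while ratio > 1:
        ratio /= 10
        n += 1
    if ratio != 1:
        raise ValueError("ratio is not an exact power of ten")
    return n


# Ratio between the brightest and darkest figures quoted for nature.
natural_ratio = SUN / STARLIT_NIGHT
natural_orders = orders_of_magnitude(natural_ratio)

# Number of grayscale levels per channel when one byte stores each channel.
levels_8bit = 2 ** 8
```

Here `natural_orders` is the exponent of the sun-to-starlight ratio, and `levels_8bit` is the number of levels available to an 8-bit channel, which shows how far apart the two ranges are.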

[0152] Standard dynamic range (SDR) images are the counterpart to high dynamic range (HDR) images. Traditional 8-bit images in formats like JPEG can be considered standard dynamic range images. Before the advent of cameras capable of capturing HDR images, traditional cameras could only record light information within a certain range by controlling exposure values. Since the maximum illuminance information of a display device cannot match the brightness information of the real world, and we view images through display devices, a photoelectric transfer function is needed. Early display devices used cathode ray tube (CRT) displays, and their photoelectric transfer function was the Gamma function. This photoelectric transfer function based on the Gamma function was defined in the ITU-R Recommendation BT.1886 standard.

[0153] However, with the upgrading of display devices, the illuminance range of display devices continues to increase. Existing consumer-grade HDR displays have an illuminance of 600 cd/m², while high-end HDR displays can reach 2000 cd/m², far exceeding the illuminance of SDR displays. The photoelectric conversion function in the ITU-R Recommendation BT.1886 standard cannot adequately represent the display performance of HDR devices. Therefore, an improved electro-optical transfer function is needed to adapt to the upgrade of display devices. The idea of the photoelectric transfer function comes from the mapping function in the tone mapping algorithm; by making appropriate adjustments to the mapping function, the photoelectric transfer function is obtained. Currently, common photoelectric conversion functions include the perceptual quantizer (PQ) function, the gamma function, the logarithm (log) function, and the hybrid log-gamma (HLG) function; these are conversion functions specified in the AVS standard.

[0154] Achieving a good experience with HDR images and videos requires an end-to-end process. Refer to Figure 1B, which illustrates the end-to-end process of HDR video.

[0155] Since the dynamic range of an image acquisition module is fixed under specific shooting conditions, to obtain images with a higher dynamic range, multiple images with different exposures acquired at the same time are generally combined to create a multi-exposure image, which is then used to obtain an image with a higher dynamic range, also known as an HDR image. The bit width of such HDR images is typically greater than 10 bits, thus enabling them to accommodate scenes with a high dynamic range.

[0156] Referring to Figure 1B, by way of example, material production can refer to the process of synthesizing an HDR image using multiple images with different exposures captured at the same time; and synthesizing the original HDR video using HDR images that are sequential in time; the original HDR video obtained from material production can be called material.

[0157] Next, the HDR image / video production module of the image generation module can perform editing, color grading, and other processing on the source material to obtain the final HDR video (i.e., the Master). Then, the metadata generation module can generate metadata based on the final HDR video. On the one hand, it can use part of the metadata (which can be called metadata 1), along with the country code, terminal provider code, and terminal provider oriented code, to generate T.35 information. On the other hand, another part of the metadata (which can be called metadata 2) can be output to the video encoding module; the video encoding module encodes the HDR video and metadata 2 to obtain a bitstream, which is then output to the encapsulation module. The encapsulation module can then encapsulate the bitstream and the T.35 information into an image file format and distribute / transmit this image file format over the network. After receiving this image file format, the decoding end can call the decapsulation module to decapsulate the image file format, obtaining the bitstream and the T.35 information. Then, it can call the video decoding module to decode the bitstream, obtaining the reconstructed HDR video and the reconstructed metadata 2. Finally, the display module can display the reconstructed HDR video based on the reconstructed metadata 2 and the metadata carried by the T.35 information.

[0158] For example, T.35 information may include static metadata or dynamic metadata. As another example, T.35 information may also include HDR display mapping information, etc. It should be understood that other information may also be carried through T.35 information, and this application does not limit this.

[0159] For example, static metadata includes the color space of the mastering display, the maximum and minimum brightness of the mastering display, the maximum brightness of the video sequence, and the maximum average brightness of the video sequence. Functionally, it serves as a reference for the display so that content is displayed correctly and the creator's intent is reproduced. In a video sequence, this is static data that does not change over time. For example, the maximum brightness of a video sequence is the maximum value of the maximum brightness across all frames.

[0160] For example, dynamic metadata includes most of the static metadata and also adds metadata related to tone mapping on the display device. Functionally, it contains information closely related to the display and tone mapping processes, such as indicating the selection of tone mapping algorithms and parameters from high dynamic range to relatively low dynamic range. In video sequences, dynamic metadata often changes over time; for example, the maximum brightness of the image or the maximum brightness of the mastering display changes during scene transitions.
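As a minimal sketch of how sequence-level static values can be derived from per-frame values, the following Python example takes the maximum over all frames, as stated above for the maximum brightness of a video sequence. The names `FrameStats` and `sequence_static_metadata` are hypothetical, and treating the maximum average brightness as the largest per-frame average is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class FrameStats:
    """Hypothetical per-frame luminance statistics, in cd/m^2."""
    max_luminance: float
    average_luminance: float


def sequence_static_metadata(frames: Sequence[FrameStats]) -> Dict[str, float]:
    """Derive values that stay constant for the whole sequence."""
    if not frames:
        raise ValueError("at least one frame is required")
    return {
        # Maximum of the per-frame maximum brightness across all frames.
        "max_sequence_luminance": max(f.max_luminance for f in frames),
        # Assumption: largest per-frame average brightness.
        "max_average_luminance": max(f.average_luminance for f in frames),
    }


frames = [
    FrameStats(max_luminance=450.0, average_luminance=80.0),
    FrameStats(max_luminance=980.0, average_luminance=120.0),
    FrameStats(max_luminance=700.0, average_luminance=150.0),
]
static = sequence_static_metadata(frames)
```

Because the result depends on every frame, it can only be finalized after the whole sequence is known, which is consistent with it being carried once rather than per frame.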

[0161] For example, this application can be applied to various image service scenarios (e.g., mobile phone photography, cloud photo album, image transcoding service, etc.), various video / audio service scenarios (e.g., live streaming, video-on-demand, etc.), etc., and the embodiments of this application are not limited thereto. This application uses multimedia as an example for illustration.

[0162] The following describes the process of encapsulating T.35 information into an image file format.

[0163] First, T.35 information and its location within the image file format can be defined according to the specifications / standards of the image file format.

[0164] For example, image file formats may include, but are not limited to, JPEG, WebP, PNG, HEIF, AVIF, and JXL, etc., and this application does not limit them.

[0165] For example, this application can add a custom box to an existing image file format. The new box can be used to store T.35 information, and its position can be defined. To distinguish the new box from the boxes already present in the image file format, the new box is called the first box. In some scenarios, it can also be called the T35 Box, which can be represented by the four characters "it35" (that is, "it35" denotes the box carrying T.35 information), or by other names, which are not limited in this application.

[0166] For example, a box is a general mechanism for organizing and storing data, enabling files to contain various types of information in a flexible manner. The hierarchy and content of boxes are defined by relevant standard specifications to ensure the correct parsing and processing of files.

[0167] This application uses HEIF image file format as an example for illustration.

[0168] First, some of the terms used in HEIF are explained:

[0169] Box: An object-oriented building block defined by a unique type identifier and length.

[0170] Container Box: A box whose sole purpose is to contain and combine a set of related boxes.

[0171] Item: Data that does not require timed processing, as opposed to sample data.

[0172] Property: These are attributes related to the item.

[0173] A sample is all the data associated with a single point in time. Alternatively, a sample refers to dynamic data.

[0174] A track is a timed sequence of related samples in an ISO base media file. For multimedia, a track corresponds to a sequence of images or sampled audio; for hint tracks, a track corresponds to a streaming channel.

[0175] Sample grouping is the process of assigning each sample in a track to a sample group according to grouping criteria.

[0176] It should be noted that in HEIF, continuous or timed media or metadata streams form a track, while static media or metadata are stored as items.

[0177] The definitions of each syntax in HEIF can be found in ISO / IEC 23008-12:2022, second edition, and will not be repeated here.

[0178] For example, HEIF has the following basic design:

[0179] 1. Still images are stored as items. Typically, image items are encoded independently and do not depend on any other items in the decoding process. If predictive encoded image items with encoding dependencies exist, this will be explicitly stated. Any number of image items can be contained in the same file.

[0180] 2. Image sequences are stored as tracks. An image sequence track can be indicated to be displayed either as a timed sequence (video or burst animation) or in a non-timed manner, such as an image gallery. When there are encoding dependencies between images, image sequence tracks can be used instead of image items.

[0181] Each static element in the HEIF file is an item. For example, an "item" refers to data that does not require timed processing, as opposed to sample data (which is time-dependent), and its description is defined by the boxes contained within a MetaBox.

[0182] For items, HEIF files can contain coded items, such as an image encoded in HEVC; HEIF files can also contain derived items, such as image overlays; HEIF files can also contain metadata items, such as EXIF information.

[0183] Each item can also have one or more associated properties.

[0184] There are structures that link items to one another and that link items to their properties.

[0185] For example, HEIF specifies a box structure format from which codec-specific image formats can be derived.

[0186] For example, in HEIF, a box is a data structure used to organize and store information within a file. A box is a hierarchical structure, with each box having a unique identifier and length. Boxes can contain other boxes, forming nested structures. Different types of boxes are used to store different types of information, such as image data, metadata, configuration information, etc.
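The idea of a box as a typed, length-prefixed unit that can contain other boxes can be sketched as follows. This is a minimal illustration, not a HEIF implementation: it assumes the basic ISO base media file layout of a 32-bit big-endian size (covering the 8-byte header plus the content) followed by a four-character type, and it does not handle 64-bit or open-ended sizes. The function names `make_box` and `iter_boxes`, the container type `demo`, and the placeholder payload bytes are hypothetical.

```python
import struct
from typing import Iterator, Tuple

HEADER_SIZE = 8  # 4-byte size + 4-byte type


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Serialize one box: size (header included), type, then content."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly four bytes")
    return struct.pack(">I4s", HEADER_SIZE + len(payload), box_type) + payload


def iter_boxes(data: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (type, content) for each box laid out back to back in data."""
    offset = 0
    while offset < len(data):
        if offset + HEADER_SIZE > len(data):
            raise ValueError("truncated box header")
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < HEADER_SIZE or offset + size > len(data):
            raise ValueError("unsupported or invalid box size")
        yield box_type, data[offset + HEADER_SIZE : offset + size]
        offset += size


# A box of type 'it35' with placeholder content, nested in a container box.
inner = make_box(b"it35", b"\x01\x02\x03")
outer = make_box(b"demo", inner)
```

Calling `iter_boxes` on the content of `outer` yields the nested `it35` box, which is how a reader descends one level of the hierarchy at a time.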

[0187] Structurally, a Box can be divided into a Full Box and a Container Box; a Container Box can contain nested Boxes, while a Full Box can only contain specific defined content and cannot contain nested Boxes.

[0188] From a content perspective, Boxes can be categorized based on the content they store:

[0189] A box used to describe file types and compatibility: FileTypeBox(ftyp) (also known as a file type box);

[0190] A box used to store multimedia metadata: MetaBox (meta) (also known as a metadata box);

[0191] A Box used to store image data: MediaDataBox(mdat) (also known as a media data box);

[0192] A Box used to store image sequences: MovieBox (moov) (also known as a streaming data box);

[0193] Boxes used to store item information, such as: item information box (iinf) (also known as image information box), etc.

[0194] Boxes used to store item information entries, such as the item information entry box (infe) (also known as the project information entry box), etc.

[0195] Boxes used to store item properties, such as the item property container box (ipco) (also known as the project attribute container box).

[0196] Boxes used to store the associations between items and their properties, such as the item properties association box (ipma) (also known as the project attribute association box).

[0197] Boxes used to store samples, such as track boxes.

[0198] A box used to store sample entries, such as SampleDescriptionBox(stsd), etc.

[0199] A box used to store descriptions of sample group attributes, such as the Sample Group Description Box (sgpd) (also known as the sample group description box).

[0200] Boxes used to store timed metadata, such as timed metadata track boxes.

[0201] Boxes used to store timed metadata entries, such as the metadata sample entry box.

[0202] The ftyp Box, meta Box, mdat Box, and moov Box are the four top-level boxes in HEIF; the ftyp Box and mdat Box are full boxes, while the meta Box and moov Box are container boxes. It should be noted that the moov Box is an optional box in HEIF.

[0203] For example, the item information box (iinf), item information entry box (infe), item property container box (ipco), item properties association box (ipma), and item location box (iloc) are nested in a meta box.

[0204] For example, SampleDescriptionBox (stsd) and Sample Group Description Box (sgpd) are nested within Track Boxes. Metadata Sample Entry Boxes are nested within Timed Metadata Track Boxes. Track Boxes and Timed Metadata Track Boxes are nested within Moov Boxes.
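The nesting relationships listed above can be sketched as a small tree. The following Python example is illustrative only: it models the boxes named in the preceding paragraphs as nested dictionaries and looks up the chain of containers leading to a given box. The placement of infe inside iinf and the four-character code `trak` for the track box follow the usual ISO base media file conventions and are assumptions here; intermediate boxes that a real file would contain are omitted, and `find_path` is a hypothetical helper.

```python
from typing import Dict, List, Optional, Tuple

# Simplified model: four-character code -> nested child boxes.
HEIF_TREE: Dict[str, dict] = {
    "ftyp": {},
    "meta": {
        "iinf": {"infe": {}},
        "ipco": {},
        "ipma": {},
        "iloc": {},
    },
    "mdat": {},
    "moov": {"trak": {"stsd": {}, "sgpd": {}}},
}


def find_path(tree: Dict[str, dict], target: str,
              prefix: Tuple[str, ...] = ()) -> Optional[List[str]]:
    """Return the chain of boxes from the top level down to target, if any."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == target:
            return list(path)
        found = find_path(children, target, path)
        if found is not None:
            return found
    return None


infe_path = find_path(HEIF_TREE, "infe")
sgpd_path = find_path(HEIF_TREE, "sgpd")
```

A path such as `infe_path` makes explicit which container boxes a parser has to open before it reaches the box that carries the information of interest.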

[0205] For example, a simplified HEIF hierarchy can be as follows:

[0206] Structure 1:

[0207] For example, a simplified HEIF hierarchy can be as follows:

[0208] Structure 2:

[0209] It should be understood that Structure 1 and Structure 2 are merely examples of the hierarchical structure of HEIF. The hierarchical structure of HEIF in this application may include more or fewer boxes, or more or fewer levels than shown in Structure 1 and Structure 2; this application does not impose any limitations in this regard. Furthermore, this application does not impose any limitations on the position of each box in the hierarchical structure of HEIF.

[0210] For example, it35 can be nested within any existing Box of HEIF.

[0211] For example, it35 can be nested within any existing Box at any level of HEIF.

[0212] For example, the definition of it35 and its position in HEIF can be as follows:

[0213] Definition Method 1: In the image file format, add a new custom metadata element called "T35Information". This element can be defined as an item (that is, the T.35 information is defined as an item), such as a metadata item. The type of this metadata item can be defined as the newly added value "it35" (item_type = 'it35').

[0214] Definition Method 1 (1):

[0215] T35Information (also known as it35) can have syntax similar to the following:

[0216] Syntax 1:

[0217] For example, the semantics of syntax 1 can be as follows:

[0218] itu_t_t35_country_code should be a single byte whose value is specified as a country code by Rec. ITU-T T.35 Annex A, or the country code extension value 0xFF.

[0219] itu_t_t35_country_code_extension_byte, if present, should be a single byte whose value is specified as a country code by Rec. ITU-T T.35 Annex B.

[0220] itu_t_t35_payload should be a payload containing data registered in accordance with Rec. ITU-T T.35 (which may include multimedia metadata, such as dynamic metadata or static metadata of HDR images).

[0221] The ITU-T T.35 terminal provider code and terminal provider oriented code shall be contained in the first one or more bytes of itu_t_t35_payload, in the format specified by the governing body that issued the terminal provider code. Any remaining itu_t_t35_payload data shall be data whose syntax and semantics are defined by the entity identified by the ITU-T T.35 country code, terminal provider code, and terminal provider oriented code.

[0222] The length of itu_t_t35_payload is the number of bytes remaining in the item.
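The semantics above can be sketched as a small reader. This is an illustrative sketch under stated assumptions, not the syntax of this application: it assumes that the extension byte is present exactly when the country code equals 0xFF, and it leaves the terminal provider code and terminal provider oriented code inside the payload, since their format is set by the issuing body. The names `T35Information` (as a Python class), `parse_t35_item`, and the example byte values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

COUNTRY_CODE_EXTENSION = 0xFF  # value signalling that an extension byte follows


@dataclass(frozen=True)
class T35Information:
    country_code: int
    country_code_extension_byte: Optional[int]
    payload: bytes  # begins with the terminal provider code(s)


def parse_t35_item(item_data: bytes) -> T35Information:
    """Split item data into country code, optional extension byte, and payload."""
    if not item_data:
        raise ValueError("empty item data")
    country_code = item_data[0]
    offset = 1
    extension = None
    # Assumption: the extension byte exists only when the country code is 0xFF.
    if country_code == COUNTRY_CODE_EXTENSION:
        if len(item_data) < 2:
            raise ValueError("missing country code extension byte")
        extension = item_data[1]
        offset = 2
    # The payload is whatever remains in the item.
    return T35Information(country_code, extension, item_data[offset:])


plain = parse_t35_item(bytes([0x26, 0x00, 0x04, 0x01]))
extended = parse_t35_item(bytes([0xFF, 0x12, 0xAA]))
```

In the first call, all bytes after the country code form the payload; in the second, one extra byte is consumed as the extension before the payload begins.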

[0223] For example, in HEIF, when T.35 information is defined as a metadata item and the syntax of the T.35 Box is syntax 1, a simplified possible hierarchical structure of HEIF is given based on the above "Structure 1":

[0224] Structure 3:

[0225] The structural differences between Structure 3 and Structure 1 are marked in bold.

[0226] Referring to structure 3, T.35 Box is an infe Box; T.35 Box can be nested in iinf.

[0227] Definition Method 1 (2):

[0228] T35Information (also known as it35) can have syntax similar to the following:

[0229] Syntax 2:

[0230] It should be noted that in Definition Method 1 (2), a data block T35DataBlock also needs to be defined to store the payload of the T.35 information; T35DataBlock can have a syntax similar to the following.

[0231] Syntax 3:

[0232] aligned(8) class T35DataBlock() {

[0233]     bit(8) itu_t_t35_payload[];

[0234] }

[0235] For the semantics of syntaxes 2 and 3, refer to the semantics of syntax 1; they are not repeated here.

[0236] For example, in HEIF, when T.35 information is defined as a metadata item and the syntax of the T.35 Box is syntax 2, a simplified possible hierarchical structure of HEIF is given based on the above "Structure 1":

[0237] Structure 4:

[0238] The T.35 information includes a header and a payload. The header includes the country code, the terminal provider code, and the terminal provider targeting code. The payload may include multimedia metadata, such as dynamic metadata or static metadata of an HDR image.

[0239] Referring to Structure 4, the T.35 Box (including only the header) is an infe Box; the T.35 Box can be nested within the iinf Box. The payload can be nested within the mdat Box.

[0240] Definition Method 2: In the image file format, add a new custom metadata element called "T35Information". This element can be defined as an item property (that is, the T.35 information is defined as an item property). For example, the item property can belong to and be associated with an item (an image). In HEIF, an image is also an item, such as a derived item or a coded image item, and any such item can have properties. A defined property can be explicitly specified in the property box (for example, the T.35 property is placed in the container box itemPropertyContainerBox) and assigned to a specific image. Since each item property is a box (Box) or a full box (FullBox), the box type (Boxtype) of the item property specifies the type of the property. Therefore, the type of this metadata element (item property) can be defined as the newly added value "it35" (Box_type = 'it35'). According to the property definition, this item property box should be located in the itemPropertyContainerBox container.

[0241] The T.35 Box can be defined as follows:

[0242] Box Type: 'it35'

[0243] Property type: Descriptive item property

[0244] Container: itemPropertyContainerBox

[0245] For example, T35Information (i.e., it35) can have a syntax similar to the following:

[0246] Syntax 3:

[0247] For the semantics of syntax 3, refer to the semantics of syntax 1; they are not repeated here.

[0248] For example, when defining T.35 information as an item property in HEIF, a simplified possible hierarchical structure of HEIF is given based on "Structure 1" above:

[0249] Structure 5:

[0250] The structural differences between Structure 5 and Structure 1 are marked in bold.

[0251] In other words, the T.35 Box is nested within the ipco Box, and the association information between the T.35 information and the corresponding items is added to the ipma Box.
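The relationship between the property container and the association box can be sketched with a small in-memory model. The following Python example is illustrative only: it stores property boxes in an ordered list standing in for ipco and records, per item, the 1-based positions of the properties that apply to it, standing in for ipma. The 1-based indexing follows the usual HEIF convention; the class name `ItemProperties`, its methods, and the placeholder payload are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

PropertyBox = Tuple[bytes, bytes]  # (box type, box content)


@dataclass
class ItemProperties:
    # Stand-in for ipco: property boxes in file order.
    ipco: List[PropertyBox] = field(default_factory=list)
    # Stand-in for ipma: item_ID -> 1-based indices into ipco.
    ipma: Dict[int, List[int]] = field(default_factory=dict)

    def add_property(self, box_type: bytes, content: bytes) -> int:
        """Append a property box and return its 1-based index."""
        self.ipco.append((box_type, content))
        return len(self.ipco)

    def associate(self, item_id: int, property_index: int) -> None:
        """Record that the property at property_index applies to item_id."""
        if not 1 <= property_index <= len(self.ipco):
            raise IndexError("property index out of range")
        self.ipma.setdefault(item_id, []).append(property_index)

    def properties_of(self, item_id: int) -> List[PropertyBox]:
        """Resolve the association entries of an item to property boxes."""
        return [self.ipco[i - 1] for i in self.ipma.get(item_id, [])]


props = ItemProperties()
t35_index = props.add_property(b"it35", b"\x01\x02")  # placeholder content
props.associate(item_id=1, property_index=t35_index)
```

Keeping the property in the container and referring to it by index means that one `it35` property can be shared by several image items without duplicating its content.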

[0252] Definition Method 3: In the image file format, add a new custom element and define this element as a sample entry. That is, define the T.35 information as a sample entry.

[0253] A Sample Entry is a box type that contains information describing a sample. Each Sample Entry corresponds to a sample within a track. This box is primarily used to describe the format and encoding method of the media data.

[0254] Since Sample Entry describes different data formats and encoding methods, the T.35 information defined here is a property independent of the codec.

[0255] Since the Sample Entry is a Box, this sample entry can be named "T35InformationBox". The box type of the Sample Entry Box is defined as the newly added value "it35" (Box_type = 'it35'). The Sample Entry can be placed inside a SampleDescriptionBox('stsd').

[0256] For example, T35InformationBox (also known as it35) can have syntax similar to the following:

[0257] Grammar 4:

[0258] For the semantics of grammar 4, refer to the semantic explanation of grammar 1; they are not repeated here.

[0259] For example, in HEIF, when T.35 information is defined as Sample Entry, a simplified possible hierarchical structure of HEIF is given based on "Structure 2" above:

[0260] Structure 6:

[0261] Among them, the structural differences between structure 6 and structure 2 are marked, as shown in bold.

[0262] In other words, the T.35 box is nested within the STSD.

[0263] For example, regarding Structure 6: when an image sequence is present, a track must be defined to store it, so an STSD exists at the lowest level of the hierarchy. The STSD contains codec information such as HVC1. In this approach the it35 box is placed alongside the codec box as a sibling, which makes the metadata codec-independent.

[0264] It should be noted that in Definition Method 3, it35 (the Sample Entry) is independent of the other Sample Entry boxes and does not restrict the codec, which makes it widely applicable.

[0265] It should be noted that Definition Methods 1, 2 and 3 are applicable to T.35 information carrying static metadata.

[0266] Definition Method 4: In the image file format, add a new custom element and define this element as a Sample Group Entry. That is, define T.35 as a Sample Group Entry.

[0267] T.35 information defined as a Sample Group Entry can carry dynamic metadata. Since the Sample Group Entry is a Box, this sample group entry can be named "T35SampleGroupEntry". The grouping type of this sample group is defined as the newly added value "it35" (grouping_type = 'it35'). The Sample Group Entry can be placed in a SampleGroupDescriptionBox('sgpd'), and the sample group can be associated with a track.

[0268] Definition Method 4 (1): All data carried by the T.35 message (including header and payload) is stored in SampleGroupDescriptionBox('sgpd'), not in the sample (i.e. Sample Entry in stsd).

[0269] For example, each sample in a track can be associated with zero or more sample group descriptions, each defining a different type of T.35 information record. The same T.35 information may apply to different samples.

[0270] grouping_type='it35' is defined as the grouping criterion for T.35 metadata.

[0271] A track's SampleTableBox('stbl') or TrackFragmentBox('traf') can contain zero or more SampleToGroupBox('sbgp') or CompactSampleToGroupBox('csgp') with grouping_type='it35', but with different grouping_type_parameter entries.

[0272] For example, T35SampleGroupEntry (i.e., it35) can have the following syntax:

[0273] Grammar 5:

[0274] For the semantics of grammar 5, refer to the semantic explanation of grammar 1; they are not repeated here.

[0275] For example, in HEIF, when T.35 information is defined as SampleGroupEntry, a simplified possible hierarchical structure of HEIF is given based on "Structure 2" above:

[0276] Structure 7:

[0277] In other words, the T.35 box is embedded into the SGPD.

[0278] All entries in a SampleToGroupBox or CompactSampleToGroupBox with the same grouping_type_parameter value should map to the same type of T.35 message; that is, the corresponding T.35 sample group entries share at least the same ITU-T T.35 country code, terminal provider code, and terminal provider oriented code, as well as the same T.35 specific type and version information (if applicable).

[0279] For example, sgpd contains five T.35 information entries with IDs 1, 2, 3, 4, and 5, which come from HDR Vivid, HDR Vivid, Dolby Vision, Dolby Vision, and Dolby Vision, respectively.

[0280] Definition Method 4(1) reduces the redundancy of storing the same metadata in every sample. It is most useful when the metadata carried by the T.35 information is infrequent, so that the size of the MovieBox does not become a problem.

[0281] Definition 4(2): The payload carried by the T.35 message is stored in the sample, while the SampleGroupDescriptionBox is only used to indicate the header of the T.35 message. The T.35 message is carried in the sample. The sample grouping mechanism is used to identify samples that hold a specific T.35 message. This method allows for the processing of very frequent (a large number of identical metadata) T.35 metadata.

[0282] Each sample in the track can be associated with zero or more sample group descriptions, each description defining a record of a different type of T.35 information. The same T.35 information may apply to different samples.

[0283] grouping_type='it35' is defined as the grouping criterion for T.35 metadata.

[0284] The SampleTableBox or TrackFragmentBox of a track can contain zero or more SampleToGroupBoxes or CompactSampleToGroupBoxes with grouping_type='it35'.

[0285] Each sample group description should record exactly one T.35 message; that is, it should map to a T.35 sample group entry with at least the same ITU-T T.35 country code, terminal provider code, and terminal provider oriented code, as well as the same T.35 type and version (if applicable).

[0286] The syntax for T35SampleGroupEntry (i.e., it35) in definition method 4(2) can be as follows:

[0287] Grammar 6:

[0288] For example, in HEIF, when T.35 information is defined as SampleGroupEntry, a simplified possible hierarchical structure of HEIF is given based on "Structure 2" above:

[0289] Structure 8:

[0290] For example, the first sbgp has two child indices, pointing to id=1 and id=2 in sgpd, with grouping_type_parameter=0 (this groups the two HDR Vivid entries together).

[0291] The second sbgp has three child indices, pointing to id=3, id=4, and id=5 in sgpd, with grouping_type_parameter=1 (this groups the three Dolby Vision entries together).
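The mapping from samples to sgpd entries in this example can be sketched in Python as follows. The sketch assumes the usual run-length form of a SampleToGroupBox, in which each entry is a pair (sample_count, group_description_index), indices are 1-based, and index 0 means the sample belongs to no group of this type. The function name `resolve_groups`, the sample counts, and the track length of five samples are hypothetical and are not taken from Structure 8.

```python
from typing import Dict, List, Optional, Tuple

def resolve_groups(runs: List[Tuple[int, int]], sample_count: int) -> List[Optional[int]]:
    """Expand (sample_count, group_description_index) runs into one index per sample.

    Index 0 means "no group of this type" and is reported as None.
    Samples not covered by any run are also reported as None.
    """
    out: List[Optional[int]] = []
    for count, index in runs:
        out.extend([index if index != 0 else None] * count)
    out.extend([None] * (sample_count - len(out)))
    return out[:sample_count]

# The five sgpd entries from the example above, keyed by their 1-based id.
sgpd_entries: Dict[int, str] = {
    1: "HDR Vivid", 2: "HDR Vivid",
    3: "Dolby Vision", 4: "Dolby Vision", 5: "Dolby Vision",
}

# One sbgp per grouping_type_parameter; the run lengths are made up for illustration.
sbgp_by_parameter: Dict[int, List[Tuple[int, int]]] = {
    0: [(2, 1), (3, 2)],          # points to id=1 and id=2
    1: [(1, 3), (2, 4), (2, 5)],  # points to id=3, id=4 and id=5
}

per_sample = {p: resolve_groups(runs, 5) for p, runs in sbgp_by_parameter.items()}
# T.35 information types that apply to the first sample of the track.
first_sample_types = [sgpd_entries[per_sample[p][0]] for p in sorted(per_sample)]
```

With these assumed runs, every sample resolves to one HDR Vivid entry and one Dolby Vision entry, which illustrates how a sample can be associated with more than one sample group description.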

[0292] In other words, in definition method four (1), the header and payload of the T.35 information are both placed in the Sample Group Description Box. In definition method four (2), the Sample Group Description Box only stores the header of the T.35 information, and the payload of the T.35 information is stored in the sample (i.e., the sample entry of stsd).

[0293] Definition Method 5: A timed metadata track can be used to store dynamic metadata carried by T.35 information, such as HDR dynamic tone mapping metadata or film grain metadata that changes per frame. It can also be used to store less frequent (fewer instances of identical metadata) T.35 metadata. The duration of a T.35 sample can be set to cover multiple samples of the referenced track, that is, the track to which the T.35 message applies.
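How one T.35 sample can cover several samples of the referenced track can be sketched in Python as follows. This is an illustration under stated assumptions only: both tracks are assumed to share one timescale and start at time 0, and the function name `metadata_sample_index` and all durations are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Optional

def metadata_sample_index(time: int, durations: List[int]) -> Optional[int]:
    """Return the index of the metadata sample whose interval contains `time`.

    Sample i covers [start_i, start_i + durations[i]); None is returned when
    `time` lies outside the metadata track.
    """
    ends = list(accumulate(durations))
    if time < 0 or not ends or time >= ends[-1]:
        return None
    return bisect_right(ends, time)

# Two T.35 metadata samples: the first lasts 3000 ticks, the second 1000 ticks.
metadata_durations = [3000, 1000]
# Four samples of the referenced track, each 1000 ticks long.
referenced_sample_times = [0, 1000, 2000, 3000]

mapping = [metadata_sample_index(t, metadata_durations) for t in referenced_sample_times]
```

In this example the first metadata sample applies to the first three samples of the referenced track, and the second metadata sample applies to the fourth.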

[0294] When a timed metadata track is used, the referenced track should not include, in a SampleEntry, a T35InformationBox that has the same ITU-T T.35 country code, terminal provider code, and terminal provider oriented code.

[0295] For example, the T.35 information can be defined as a sample entry named "T35InfoSampleEntry", with the code 'it35'.

[0296] For example, a sample entry for T.35 information should contain a T35InfoConfigurationBox that describes the T.35 information present in each sample, and its type can be represented by 'T35C'.

[0297] For example, the syntax of T.35 can be as follows:

[0298] Grammar 7:

[0299] For example, the syntax of T35CommmonHeaderInfoBox can be as follows:

[0300] Grammar 8:

[0301] For example, the syntax of T35InfoSample can be as follows:

[0302] Grammar 9:

[0303] Semantic explanations in grammars 7 through 9:

[0304] t35_common is the T.35 information common to all samples that refer to this sample entry. It should include the ITU-T T.35 country code, terminal provider code, and terminal provider oriented code. It should also include the portion of the ITU-T T.35 message that specifies the message version. It may also include any other data, following the terminal provider oriented code, that is common to all samples referring to this sample entry.

[0305] itu_t_t35_country_code should be a single byte, the value of which is specified by the country code specification in Annex A of ITU-T T.35, or the country code extension 0xFF.

[0306] If present, itu_t_t35_country_code_extension_byte should be one byte, and its value is specified by the country code specification in Annex B of ITU-T T.35.

[0307] itu_t_t35_common should be a payload containing the ITU-T T.35 terminal provider code and terminal provider oriented code, as specified in the ITU-T T.35 specification. It should also include the portion of the ITU-T T.35 message that specifies the message version. It may also include any other data, following the terminal provider oriented code, that is common to all samples referring to this sample entry.

[0308] t35_remaining_payload should be the remaining payload, containing data registered under the ITU-T T.35 country code, terminal provider code, and terminal provider oriented code given in T35InfoSampleEntry. Concatenating the data from T35CommmonHeaderInfoBox with the data from each T35InfoSample yields the complete ITU-T T.35 message.
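The concatenation described in the semantics above can be sketched in Python as follows. Because grammars 7 through 9 are not reproduced here, the byte order is an assumption: the country code comes first, followed by the extension byte only when the country code is 0xFF, then itu_t_t35_common, then t35_remaining_payload. The function names and example values are hypothetical.

```python
from typing import Optional

def build_common_header(country_code: int, common: bytes,
                        extension_byte: Optional[int] = None) -> bytes:
    """Assemble the part of the T.35 message shared by all samples (assumed layout)."""
    if not 0 <= country_code <= 0xFF:
        raise ValueError("country code must fit in one byte")
    header = bytes([country_code])
    if country_code == 0xFF:
        # 0xFF signals that a country code extension byte follows.
        if extension_byte is None:
            raise ValueError("country code 0xFF requires an extension byte")
        header += bytes([extension_byte])
    elif extension_byte is not None:
        raise ValueError("extension byte is only allowed with country code 0xFF")
    return header + common

def reassemble_t35_message(common_header: bytes, remaining_payload: bytes) -> bytes:
    """Concatenate the common header with one sample's remaining payload."""
    return common_header + remaining_payload

# Hypothetical values: provider code 0x0004 and provider oriented code 0x0005.
common_header = build_common_header(0x26, bytes([0x00, 0x04, 0x00, 0x05]))
sample_payloads = [b"\x01\x02", b"\x03"]
messages = [reassemble_t35_message(common_header, p) for p in sample_payloads]
```

Storing the common header once in the sample entry means that each sample only needs to carry the bytes that actually change.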

[0309] For example, defining T.35 information as sample entries, a simplified possible hierarchical structure of HEIF is given based on "Structure 2" above:

[0310] Structure 9:

[0311] Similarly, this application can also define the T.35 box and its location within the JPEG Universal Metadata Box Format (JUMBF). JUMBF is specified in Part 5 of JPEG Systems, which defines a framework that lets JPEG standards add universal metadata and allows for future extensions. An extension can be, for example, metadata, a supplementary image, or another element in addition to the base image. JUMBF also defines the basic concept of a box structure; therefore, in JUMBF the T.35 information can likewise be defined as a box, and the location of the T.35 box can be defined according to the JUMBF specification (that is, the T.35 box is placed in an appropriate location specified by JUMBF). This is not elaborated further here.

[0312] In this way, once the image file format has been extended with the definition of the T.35 information and of its location, T.35 information can be encapsulated into the image file format (or image file); see the data encapsulation process 200 below.

[0313] Figure 2A is a schematic diagram of an exemplary data encapsulation process 200.

[0314] S201, obtain T.35 information; wherein the T.35 information includes: country code, terminal provider code, and terminal provider oriented code.

[0315] For example, the T.35 information may also carry multimedia metadata. It should be understood that multimedia metadata is optional information carried by the T.35 information, while the country code, terminal provider code, and terminal provider oriented code are mandatory.
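The distinction between mandatory and optional information can be sketched as a small Python data structure. The class name `T35Info`, its field names, and the example values are hypothetical and serve only to illustrate which parts must always be present.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class T35Info:
    # Mandatory information: constructing the object without these fails.
    country_code: int
    terminal_provider_code: int
    terminal_provider_oriented_code: int
    # Optional information, for example HDR metadata.
    multimedia_metadata: Optional[bytes] = None

    def carries_metadata(self) -> bool:
        """Report whether optional multimedia metadata is present."""
        return self.multimedia_metadata is not None

header_only = T35Info(0x26, 0x0004, 0x0005)
with_metadata = T35Info(0x26, 0x0004, 0x0005, multimedia_metadata=b"\x01\x02")
```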

[0316] S202, encapsulate the T.35 information into an image file format; wherein the image file format includes a first box, and the first box includes the T.35 information.

[0317] For example, T.35 information can be encapsulated into an image file format according to any one of Definition Methods 1 through 5 described above.

[0318] For example, according to any one of the above definition methods one to five, the image file format obtained in S202 may include a first box, and the first box includes T.35 information.

[0319] For example, when the image file format obtained in S202 is an HEIF file, according to any one of the above definition methods 1 (1) and 2 to 5, the image file format obtained in S202 may also include a second box. The second box may be a metadata box or a streaming data box, and the first box may be nested in the second box; as shown in Figure 2B.

[0320] For example, the second box includes multiple third boxes, which have a hierarchical relationship; the first box is nested within any level of the third box.

[0321] In other words, when the second box is a metadata box, the metadata box can include multiple third boxes, including but not limited to: item information box (iinfBox), item information entry box (infe Box), item property container box (ipco Box), item properties association box (ipma Box), and item location box (iloc Box), etc.; the hierarchical relationship of these multiple third boxes can be as shown in structure 1; the first box can be nested in any level of third box (as shown in structure 3, structure 4 or structure 5).

[0322] When the second box is a streaming data box, the streaming data box can include multiple third boxes, including but not limited to: SampleDescriptionBox (stsd Box), Sample Group Description Box (sgpd Box), metadata Sample Entry Box, timed metadata Track Box, Track Box, etc. The hierarchical relationship of multiple third boxes can be as shown in Structure 2; the first box can be nested in any level of third box (as shown in Structure 6, Structure 7, Structure 8, or Structure 9).

[0323] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method one (2), the image file format obtained in S202 also includes an item identifier and a media box. The item identifier indicates that the T.35 information is an item, the first box includes the header information of the T.35 information, and the media box includes the payload of the T.35 information. The first box is nested in the image information box, as shown in Figure 2C.

[0324] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method one (1), the T.35 information is the item, the first box is nested in the image information box, and the first box is the item information entry box.

[0325] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method two, the T.35 information is an item property; the first box is nested in the item property container box, and the item properties association box includes the association information between the T.35 information and the corresponding item.

[0326] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method three, the T.35 information is a sample entry, and the first box is nested in the sample description box.

[0327] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method four (1), the T.35 information is a sample group entry, and the first box is nested in the sample group description box.

[0328] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method four (2), the T.35 information is a sample entry, the first box includes the header information of the T.35 information, the first box is nested in the sample group description box; the sample description box includes the payload of the T.35 information.

[0329] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method five, the T.35 information is a sample entry, and the first box is nested in the metadata sample entry box.

[0330] For example, encoded multimedia data can also be acquired; the encoded multimedia data can be encapsulated in an image file format, the image file format further including a media box or streaming data box, the media box or streaming data box including the encoded multimedia data.

[0331] Figure 3 is a schematic diagram of an exemplary data acquisition process 300.

[0332] S301, Receive image file format; wherein, the image file format includes a first box, and the first box includes T.35 information.

[0333] S302, read T.35 information from the first box of the image file format; wherein, the T.35 information includes: country code, terminal provider code and terminal provider oriented code.

[0334] For example, T.35 information can be read from the first box of an image file format.
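Reading the first box can be sketched in Python as a recursive walk over the box hierarchy. This is a simplified illustration: it assumes 32-bit box sizes only, descends only into the container boxes listed in `CONTAINER_SKIP`, and treats 'it35' as a plain box whose body is the T.35 information. The names `walk_boxes`, `find_t35`, and `CONTAINER_SKIP` are hypothetical.

```python
import struct
from typing import Iterator, List, Optional, Tuple

# Bytes to skip after the 8-byte box header before child boxes begin.
# 'meta' is a FullBox (version and flags); 'stsd' is a FullBox plus entry_count.
CONTAINER_SKIP = {b"meta": 4, b"iprp": 0, b"ipco": 0, b"moov": 0, b"trak": 0,
                  b"mdia": 0, b"minf": 0, b"stbl": 0, b"stsd": 8}

Path = Tuple[bytes, ...]

def walk_boxes(data: bytes, start: int = 0, end: Optional[int] = None,
               path: Path = ()) -> Iterator[Tuple[Path, bytes]]:
    """Yield (path, body) for every box, descending into known containers."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        if size < 8 or pos + size > end:
            # size 0 (to end of file) and size 1 (64-bit size) are not handled here.
            raise ValueError(f"unsupported or corrupt box size at offset {pos}")
        here = path + (box_type,)
        yield here, data[pos + 8:pos + size]
        if box_type in CONTAINER_SKIP:
            child_start = pos + 8 + CONTAINER_SKIP[box_type]
            yield from walk_boxes(data, child_start, pos + size, here)
        pos += size

def find_t35(data: bytes) -> List[Tuple[Path, bytes]]:
    """Return the nesting path and body of every 'it35' box found."""
    return [(p, body) for p, body in walk_boxes(data) if p[-1] == b"it35"]
```

The returned path shows where the first box is nested, for example `(b"meta", b"iprp", b"ipco", b"it35")` when the T.35 information is stored as an item property.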

[0335] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method one (2), the item identifier can be read from the image file format, the payload of the T.35 information can be read from the media box according to the item identifier, and the header information of the T.35 information can be read from the first box nested in the image information box.

[0336] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method one (1), T.35 information can be read from the first box nested in the image information box.

[0337] For example, when the T.35 box and its position in the image file format are defined as shown in definition method two, T.35 information can be read from the first box nested within the item property container box. Furthermore, the association information between the T.35 information and the corresponding item can be read from the item properties association box; thus, the item associated with the T.35 information can be determined.

[0338] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method three, the T.35 information can be read from the first box nested in the sample description box.

[0339] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method four (1), the T.35 information can be read from the first box nested in the sample group description box.

[0340] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method four (2), the header information of the T.35 information can be read from the first box nested in the sample group description box, and the payload of the T.35 information can be read from the sample description box.

[0341] For example, when the T.35 box and the position of the T.35 box in the image file format are defined as shown in definition method five, T.35 information can be read from the first box nested in the metadata sample entry box.

[0342] In addition, encoded multimedia data can be read from the media box or streaming data box included in the image file format; the encoded multimedia data is then decoded to obtain reconstructed multimedia data. Afterwards, the reconstructed multimedia data can be played according to the T.35 information.

[0343] For example, this application embodiment also provides an image file format that can be encapsulated according to the data encapsulation process 200.

[0344] Figure 4 is a schematic diagram of an exemplary data encapsulation device 400. The data encapsulation device 400 can be used to perform the methods of the foregoing embodiments; for the beneficial effects it can achieve, refer to the beneficial effects of the corresponding methods provided above, which are not repeated here.

[0345] For example, the data encapsulation device 400 may include:

[0346] The acquisition module 401 is used to acquire T.35 information; wherein, the T.35 information includes: country code, terminal provider code and terminal provider oriented code;

[0347] The encapsulation module 402 is used to encapsulate T.35 information into an image file format; wherein the image file format includes a first box, and the first box includes T.35 information.

[0348] Figure 5 is a schematic diagram of an exemplary data reading device 500. The data reading device 500 can be used to perform the methods of the foregoing embodiments; for the beneficial effects it can achieve, refer to the beneficial effects of the corresponding methods provided above, which are not repeated here.

[0349] For example, the data reading device 500 may include:

[0350] The receiving module 501 is used to receive an image file format; wherein the image file format includes a first box, and the first box includes T.35 information;

[0351] The reading module 502 is used to read T.35 information from the first box of the image file format; wherein, the T.35 information includes: country code, terminal provider code and terminal provider oriented code.

[0352] The encoding and decoding system used in this application is described below with reference to Figure 6. Figure 6 is a schematic block diagram of an encoding and decoding system used in an embodiment of this application, such as a video encoding and decoding system 10 (or simply encoding and decoding system 10) that can utilize the technology of this application. The video encoder 20 (or simply encoder 20) and video decoder 30 (or simply decoder 30) in the video encoding and decoding system 10 represent devices that can be used to perform the techniques according to the various examples described in this application.

[0353] As shown in Figure 6, the encoding and decoding system 10 includes a source device 12, which provides encoded data 21, such as encoded images, to a destination device 14 for decoding the encoded data.

[0354] The source device 12 includes an encoder 20, and optionally may also include an image source 16, an image preprocessor 18 (or preprocessing unit), a communication interface or a communication unit 22.

[0355] Image source 16 may include or may be any type of image capture device for capturing real-world images, and/or any type of image generation device, such as a computer graphics processor for generating computer animation images, or any type of device for acquiring and/or providing real-world images, computer-generated images (e.g., screen content or virtual reality (VR) images), and/or any combination thereof (e.g., augmented reality (AR) images). The image source may be any type of memory or storage device storing any of the images described above.

[0356] To distinguish the processing performed by the preprocessor 18 or the preprocessing unit 18, the image or image data 17 may also be referred to as the raw image or raw image data 17.

[0357] The preprocessor 18 receives (raw) image data 17 and preprocesses the image data 17 to obtain a preprocessed image 19 or preprocessed image data 19. For example, the preprocessing performed by the preprocessor 18 may include cropping, color format conversion (e.g., from RGB to YCbCr), color correction, or noise reduction. It is understood that the preprocessing unit 18 may be an optional component.

[0358] The video encoder 20 is used to receive preprocessed image data 19 and provide encoded image data 21.

[0359] The communication interface 22 in the source device 12 can be used to: receive encoded image data 21 and send encoded image data 21 (or other arbitrarily processed version) to another device such as the destination device 14 or any other device via the communication channel 13 for storage or direct reconstruction.

[0360] Destination device 14 includes decoder 30 (e.g., video decoder 30), and optionally may include communication interface or communication unit 28, post-processor 32 (or post-processing unit 32) and display device 34.

[0361] The communication interface 28 in the destination device 14 is used to receive encoded image data 21 (or other processed versions) directly from the source device 12 or from any other source device such as a storage device, for example, the storage device is an encoded image data storage device, and to provide the encoded image data 21 to the decoder 30.

[0362] Communication interfaces 22 and 28 can be used to send or receive encoded image data 21 or encoded data through a direct communication link between source device 12 and destination device 14, such as a direct wired or wireless connection, or through any type of network, such as a wired network, a wireless network or any combination thereof, any type of private network and public network or any combination thereof.

[0363] For example, the communication interface 22 can be used to encapsulate the encoded image data 21 into a suitable format such as a message, and / or process the encoded image data using any type of transmission encoding or processing, so as to transmit it on a communication link or communication network.

[0364] Communication interface 28 corresponds to communication interface 22. For example, it can be used to receive transmitted data and process the transmitted data using any type of corresponding transmission decoding or processing and / or decapsulation to obtain encoded image data 21.

[0365] Both communication interface 22 and communication interface 28 can be configured as a one-way communication interface, as indicated by the arrow pointing from the source device 12 to the destination device 14 in Figure 6, or as a two-way communication interface, and can be used to send and receive messages to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or data transmission, such as encoded image data transmission.

[0366] Decoder 30 is used to receive encoded image data 21 and provide decoded image data 31 or decoded image 31.

[0367] The post-processor 32 in the destination device 14 is used to post-process the decoded image data 31 (also known as reconstructed image data), such as the decoded image 31, to obtain post-processed image data 33, such as the post-processed image 33. The post-processing performed by the post-processing unit 32 may include, for example, color format conversion (e.g., from YCbCr to RGB), color adjustment, trimming or resampling, or any other processing to generate the decoded image data 31 for display by the display device 34, etc.

[0368] The display device 34 in the destination device 14 is used to receive the post-processed image data 33 to display the image to a user or viewer. The display device 34 can be or includes any type of display for representing the reconstructed image, such as an integrated or external display screen or monitor. For example, the display screen may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS), a digital light processor (DLP), or any other type of display screen.

[0369] Although Figure 6 illustrates source device 12 and destination device 14 as independent devices, device embodiments may also include both source device 12 and destination device 14, or the functions of both source device 12 and destination device 14 simultaneously; that is, they may include both source device 12 or its corresponding functions and destination device 14 or its corresponding functions. In these embodiments, source device 12 or its corresponding functions and destination device 14 or its corresponding functions may be implemented using the same hardware and / or software, or through separate hardware and / or software, or any combination thereof.

[0370] As described, the presence and (accurate) division of different units or functions in the source device 12 and / or destination device 14 shown in Figure 6 may vary depending on the actual device and application, which is obvious to those skilled in the art.

[0371] The content provisioning system for the content distribution service applied in this application is described below with reference to Figure 7A, which is a block diagram of a content provisioning system for implementing the content distribution service applied in an embodiment of this application. The content provisioning system 2100 includes a capture device 2102, a terminal device 2106, and (optionally) a display 2126. The capture device 2102 communicates with the terminal device 2106 via a communication link 2104. The communication link may include the aforementioned communication channel 13. The communication link 2104 includes, but is not limited to, WiFi, Ethernet, wired, wireless (3G / 4G / 5G), USB, or any combination thereof.

[0372] The capture device 2102 generates data and can encode the data using the encoding method shown in the above embodiments. Alternatively, the capture device 2102 can distribute the data to a streaming media server (not shown in the figure), where the server encodes the data and transmits the encoded data to the terminal device 2106. The capture device 2102 includes, but is not limited to, cameras, smartphones or tablets, computers or laptops, video conferencing systems, PDAs, in-vehicle devices, or any combination thereof. For example, the capture device 2102 may include the source device 12 described above. When the data includes video, the video encoder 20 in the capture device 2102 can actually perform video encoding processing. When the data includes audio (i.e., sound), the audio encoder 20 in the capture device 2102 can actually perform audio encoding processing. In some practical scenarios, the capture device 2102 distributes the encoded video data and encoded audio data by multiplexing them together. In other practical scenarios, such as in a video conferencing system, the encoded audio data and encoded video data are not multiplexed. The capture device 2102 distributes the encoded audio data and encoded video data to the terminal device 2106 separately.

[0373] The terminal device 2106 in the content providing system 2100 receives and regenerates the encoded data. The terminal device 2106 can be a device with data reception and recovery capabilities, such as a smartphone or tablet computer 2108, a computer or laptop computer 2110, a network video recorder (NVR) / digital video recorder (DVR) 2112, a television 2114, a set-top box (STB) 2116, a video conferencing system 2118, a personal digital assistant (PDA) 2122, an in-vehicle device 2124, or any combination thereof, or such devices capable of decoding the encoded data. For example, the terminal device 2106 may include the aforementioned destination device 14. When the encoded data includes video, the video decoder 30 in the terminal device prioritizes video decoding. When the encoded data includes audio, the audio decoder in the terminal device prioritizes audio decoding. The terminal device 2106 can be a video playback application, streaming media playback application, streaming media playback platform, or live streaming platform running on the terminal device.

[0374] For terminal devices with displays, such as smartphones or tablets 2108, computers or laptops 2110, NVRs / DVRs 2112, televisions 2114, PDAs 2122, or in-vehicle devices 2124, the terminal device can send the decoded data to its display. For terminal devices without displays, such as STBs 2116 and video conferencing systems 2118, the device is connected to an external display 2126 to receive and display the decoded data.

[0375] When encoding or decoding, the various devices in this system can use the image encoding device or image decoding device shown in the above embodiments.

[0376] Figure 7B is a schematic diagram of an example structure of terminal device 2106 in Figure 7A. After terminal device 2106 receives the bitstream from capture device 2102, protocol processing unit 2202 analyzes the transmission protocol of the bitstream. The protocol includes, but is not limited to, Real-Time Streaming Protocol (RTSP), Hypertext Transfer Protocol (HTTP), HTTP Live Streaming Protocol (HLS), MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH), Real-time Transport Protocol (RTP), Real-Time Messaging Protocol (RTMP), or any combination thereof.

[0377] After processing the stream, the protocol processing unit 2202 generates a stream file. The file is output to the demultiplexing unit 2204. The demultiplexing unit 2204 can separate the multiplexed data into encoded audio data and encoded video data. As mentioned above, in other practical scenarios, such as in a video conferencing system, the encoded audio data and encoded video data are not multiplexed. In this case, the encoded data is transmitted to the video decoder 2206 and the audio decoder 2208 without passing through the demultiplexing unit 2204.

[0378] Through demultiplexing, a video elementary stream (ES), an audio ES, and optional subtitles are generated. Video decoder 2206, including video decoder 30 as described in the above embodiments, decodes the video ES using the decoding method shown in the above embodiments to generate video frames and sends the data to synchronization unit 2212. Audio decoder 2208 decodes the audio ES to generate audio frames and sends the data to synchronization unit 2212. Alternatively, the video frames can be stored in a buffer (not shown) before being sent to synchronization unit 2212. Similarly, audio frames can be stored in a buffer (not shown) before being sent to synchronization unit 2212.

[0379] Synchronization unit 2212 synchronizes video and audio frames and provides the video / audio to video / audio display 2214. For example, synchronization unit 2212 synchronizes the presentation of video and audio information. Information can be encoded in the syntax using timestamps related to the presentation of the encoded audio and visual data and timestamps related to the transmission of the data stream itself.
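The timestamp-based synchronization described above can be illustrated with a minimal sketch. It assumes that each decoded frame carries a presentation timestamp in milliseconds and that both frame lists are sorted by timestamp; the names `Frame`, `pair_by_timestamp`, and `tolerance_ms` are hypothetical and do not appear in this application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    pts_ms: int      # presentation timestamp; the millisecond unit is an assumption
    payload: bytes   # decoded frame data (opaque for this sketch)


def pair_by_timestamp(video: List[Frame], audio: List[Frame],
                      tolerance_ms: int = 20) -> List[Tuple[Frame, Frame]]:
    """Pair each video frame with the nearest audio frame within the tolerance.

    Both lists are assumed to be sorted by pts_ms.
    """
    pairs: List[Tuple[Frame, Frame]] = []
    j = 0
    for v in video:
        # Move forward while the next audio frame is at least as close to v.
        while (j + 1 < len(audio)
               and abs(audio[j + 1].pts_ms - v.pts_ms) <= abs(audio[j].pts_ms - v.pts_ms)):
            j += 1
        if audio and abs(audio[j].pts_ms - v.pts_ms) <= tolerance_ms:
            pairs.append((v, audio[j]))
    return pairs
```

A video frame with no audio frame inside the tolerance is simply left unpaired in this sketch; an actual synchronization unit may instead delay, drop, or repeat frames.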

[0380] If the bitstream includes subtitles, the subtitle decoder 2210 decodes the subtitles, synchronizes them with the video and audio frames, and provides the video / audio / subtitles to the video / audio / subtitle display 2216.

[0381] The present invention is not limited to the above-described system. The image encoding device or image decoding device in the above embodiments can be used in other systems such as automobiles.

[0382] The following description, with reference to Figure 8A, illustrates a streaming media system applicable to an embodiment of this application. Figure 8A is a schematic diagram of the workflow of a streaming media system applicable to an embodiment of this application.

[0383] The streaming media system includes a content creation module that generates the necessary content data, such as video or audio. The system also includes a video encoding module that encodes the generated content using an encoder. It further includes a video streaming transmission module that transmits the encoded video as a stream. Optionally, the video stream format can be converted to a stream format commonly used in OTT (over-the-top) devices. These protocols include, but are not limited to, Real-Time Streaming Protocol (RTSP), Hypertext Transfer Protocol (HTTP), HTTP Live Streaming Protocol (HLS), MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH), Real-time Transport Protocol (RTP), Real-Time Messaging Protocol (RTMP), or any combination thereof. Optionally, video stream storage can be performed, storing the original format and / or multiple converted stream formats for later use. The streaming media system also includes a video stream encapsulation module, used to encapsulate the video stream to generate an encapsulated video stream, which can be called a video streaming media packet. For example, a video streaming media packet can be generated based on a transcoded video stream or a stored video stream. The streaming media system also includes a content distribution network (CDN), which is used to distribute the video streaming media packet to multiple OTT devices, such as mobile phones, computers, tablets, and home projectors.

[0384] It should be noted that video encoding, video streaming, video transcoding, video storage, video streaming media packet generation, and content delivery networks can all be implemented on cloud servers.

[0385] The following describes an exemplary streaming media system architecture of this application with reference to Figure 8B. This streaming media system architecture includes: client devices, content delivery networks, and cloud servers.

[0386] Users on client devices send playback or replay requests to the cloud platform. Optionally, the content of the request can be the title of the movie or TV show to be played.

[0387] The cloud platform makes a decision and responds to the client by sending the address of the requested content on the CDN. Optionally, the content sent to the client can be a URL (uniform resource locator). Specifically, the playback application service in the cloud platform checks user authorization and permissions, and then considers individual client characteristics and current network conditions to determine which specific files are needed to process the playback request. It should be noted that the Content Delivery Network (CDN) periodically reports its operational status, learned routes, and available content (files) to the cache control service in the cloud platform.

[0388] The client then requests content from the CDN based on the address, and the CDN provides the content to the client, thus fulfilling the client's request.
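The request flow of paragraphs [0386] to [0388] can be sketched as follows. This is a simplified illustration only: the class names `CloudPlatform` and `Cdn`, their methods, and the example URL are hypothetical, and the sketch omits the client characteristics, network conditions, and CDN status reports that the cloud platform also takes into account.

```python
from typing import Dict, Optional, Set


class CloudPlatform:
    """Hypothetical playback application service: checks authorization, returns a CDN address."""

    def __init__(self, catalog: Dict[str, str], authorized_users: Set[str]) -> None:
        self.catalog = catalog                    # title -> address (URL) of the content on the CDN
        self.authorized_users = authorized_users

    def handle_playback_request(self, user: str, title: str) -> Optional[str]:
        if user not in self.authorized_users:
            return None                           # user is not permitted to play the content
        return self.catalog.get(title)            # None if the title is unknown


class Cdn:
    """Hypothetical CDN: serves stored files by address."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = files

    def fetch(self, url: str) -> bytes:
        return self.files[url]


# Usage: the client first asks the cloud platform, then requests the content from the CDN.
url = "https://cdn.example.com/movie-a/stream"    # placeholder address
platform = CloudPlatform({"Movie A": url}, {"alice"})
cdn = Cdn({url: b"encoded-stream"})
address = platform.handle_playback_request("alice", "Movie A")
content = cdn.fetch(address) if address is not None else None
```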

[0389] The system architecture applicable to the embodiments of this application is described below with reference to Figure 9A. Figure 9A is a schematic diagram of a possible system architecture applicable to the embodiments of this application. The system architecture of the embodiments of this application includes: a front-end device, a transmission link, and a terminal display device.

[0390] The front-end device is used to capture or create HDR / SDR content (e.g., HDR / SDR video or images).

[0391] In one possible embodiment, the front-end device can also be used to extract corresponding metadata from the HDR content. This metadata may include global mapping information, local mapping information, and dynamic and static metadata corresponding to the HDR content. The front-end device can then send the HDR content and metadata to the terminal display device via a transmission link. Specifically, the HDR content and metadata may be transmitted as a single data packet or as two separate data packets; this embodiment does not impose specific limitations.

[0392] Optionally, the terminal display device can receive metadata and HDR content, and extract the global mapping information, local mapping information, and terminal display device information contained in the corresponding metadata based on the HDR content. It then obtains a mapping curve to perform global and local tone mapping on the HDR content, converting it into display content adapted to the HDR display device or SDR device in the terminal display device, and displays it. It should be understood that in different embodiments, the terminal display device may include a display device with a display capability having a lower or higher dynamic range than the HDR content generated by the front-end device; this application does not limit this.

[0393] Optionally, the front-end device and the terminal display device in this application can be independent and different physical devices. For example, the front-end device can be a video capture device or a video production device, wherein the video capture device can be a camera, a video processing machine, or other similar device. The terminal display device can be virtual reality (VR) glasses, a mobile phone, a tablet, a television, a projector, or another device with video playback capabilities.

[0394] Optionally, the transmission link between the front-end device and the terminal display device can be a wireless connection or a wired connection. The wireless connection can employ technologies such as Long Term Evolution (LTE), 5th Generation (5G) mobile communication, and future mobile communication technologies. Wireless connections may also include technologies such as Wireless-Fidelity (WiFi), Bluetooth, and Near Field Communication (NFC). Wired connections can include Ethernet connections, local area network (LAN) connections, etc. No specific limitations are imposed.

[0395] This application can also integrate the functions of the front-end device and the terminal display device onto the same physical device, such as a mobile phone or tablet with video recording capabilities. This application can also integrate some functions of the front-end device and some functions of the terminal display device onto the same physical device. No specific limitations are imposed in this regard.

[0396] The end-to-end image processing system provided in this application embodiment is described below with reference to Figure 9B. This system can be applied to the system architecture shown in Figure 9A. Figure 9B is a schematic diagram of the structure of an image processing system provided in this application embodiment. In Figure 9B, HDR / SDR content is exemplarily represented by HDR video. The image processing system includes: an HDR preprocessing module, an HDR video encoding module, an HDR video decoding module, and a tone mapping module.

[0397] The HDR preprocessing module and HDR video encoding module can be located in the front-end device shown in Figure 9A, while the HDR video decoding module and tone mapping module can be located in the terminal display device shown in Figure 9A.

[0398] The HDR preprocessing module extracts dynamic metadata (e.g., maximum, minimum, average, and range of brightness) from the HDR video, determines mapping curve parameters based on the dynamic metadata and the display capabilities of the target display device, writes these parameters into the dynamic metadata to obtain HDR metadata, and then transmits it. The HDR video can be either captured directly or processed by a colorist; the display capabilities of the target display device refer to the range of brightness it can display.
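The extraction of dynamic metadata described in paragraph [0398] can be illustrated with a minimal sketch. It assumes that the luminance of a frame is available as a flat sequence of values in nits; the function name `extract_dynamic_metadata` and the dictionary keys are hypothetical, and the derivation of the mapping curve parameters is not shown because this application does not fix a formula for it here.

```python
from typing import Dict, Sequence


def extract_dynamic_metadata(luminance_nits: Sequence[float]) -> Dict[str, float]:
    """Compute the maximum, minimum, average, and range of brightness for one frame."""
    if not luminance_nits:
        raise ValueError("at least one luminance sample is required")
    lo = min(luminance_nits)
    hi = max(luminance_nits)
    avg = sum(luminance_nits) / len(luminance_nits)
    return {"max": hi, "min": lo, "avg": avg, "range": hi - lo}
```

For example, the samples `[0.0, 100.0, 500.0, 1000.0]` yield a maximum of 1000.0, a minimum of 0.0, an average of 400.0, and a range of 1000.0.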

[0399] HDR video encoding module: Used to encode HDR video and HDR metadata according to video compression standards (e.g., AVS or HEVC standards) and output the corresponding bitstream (AVS or HEVC bitstream).

[0400] HDR video decoding module: used to decode the generated bitstream (AVS bitstream or HEVC bitstream) according to the standard corresponding to the bitstream format, and output the decoded HDR video and HDR metadata.

[0401] The tone mapping module is used to generate a mapping curve based on the parameters of the mapping curve in the decoded HDR metadata, and to perform tone mapping (i.e., HDR adaptation processing or SDR adaptation processing) on the decoded HDR video. The tone-mapped HDR-adapted video is then sent to the HDR display terminal for display, or the SDR-adapted video is sent to the SDR display terminal for display.
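The generation of a mapping curve from parameters and its application can be sketched as follows. The form of the curve is not specified in this paragraph, so the sketch substitutes an extended Reinhard-style curve purely for illustration; the parameter names `source_peak_nits` and `target_peak_nits` and both function names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def build_mapping_curve(params: Dict[str, float]) -> Callable[[float], float]:
    """Build an illustrative curve that maps source_peak_nits exactly onto target_peak_nits."""
    source_peak = params["source_peak_nits"]   # assumed to come from the decoded HDR metadata
    target_peak = params["target_peak_nits"]   # assumed display capability of the terminal
    white = source_peak / target_peak          # source peak expressed in units of the target peak

    def curve(luminance_nits: float) -> float:
        n = luminance_nits / target_peak
        # Extended Reinhard form: compresses highlights, maps 0 to 0 and source peak to target peak.
        return target_peak * n * (1.0 + n / (white * white)) / (1.0 + n)

    return curve


def tone_map(luminance_nits: Sequence[float], params: Dict[str, float]) -> List[float]:
    """Apply the mapping curve to every luminance sample."""
    curve = build_mapping_curve(params)
    return [curve(value) for value in luminance_nits]
```

With a source peak of 1000 nits and a target peak of 100 nits, an input of 1000 nits maps to 100 nits and an input of 0 nits stays at 0, with intermediate values compressed monotonically.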

[0402] For example, the HDR preprocessing module may exist in the video capture device or the video production device.

[0403] For example, the HDR video encoding module may be present in a video capture device or a video production device.

[0404] For example, HDR video decoding modules can exist in set-top boxes, television display devices, mobile terminal display devices, and video conversion devices for online live streaming and online video applications.

[0405] For example, the tone mapping module can exist in set-top boxes, television display devices, mobile terminal display devices, and video conversion devices such as live streaming and online video applications. More specifically, the tone mapping module can exist as a chip or software program in set-top boxes, television displays, and mobile terminal displays, and as a software program in video conversion devices such as live streaming and online video applications.

[0406] In one possible embodiment, when both the tone mapping module and the HDR video decoding module are present in the set-top box, the set-top box can complete the functions of receiving, decoding, and tone mapping the video stream. The set-top box sends the decoded video data to the display device for display through the high definition multimedia interface (HDMI), so that users can enjoy the video content.

[0407] In one example, Figure 10 shows a schematic block diagram of a device 1000 according to an embodiment of the present application. The device 1000 may include a processor 1001 and a transceiver / transceiver pin 1002, and optionally, a memory 1003.

[0408] The various components of device 1000 are coupled together via bus 1004, which includes a data bus, a power bus, a control bus, and a status signal bus. However, for clarity, all buses are referred to as bus 1004 in the figure.

[0409] Optionally, the memory 1003 can be used to store instructions from the foregoing method embodiments. The processor 1001 can be used to execute the instructions in the memory 1003, control the receive pin to receive signals, and control the transmit pin to transmit signals.

[0410] The device 1000 may be an electronic device or a chip of an electronic device in the above method embodiments.

[0411] The electronic device may be a terminal device or a server.

[0412] All relevant content of each step involved in the above method embodiments can be referenced from the functional description of the corresponding functional module, and will not be repeated here.

[0413] This application also provides a chip, including one or more interface circuits and one or more processors. The one or more processors receive or send data through the one or more interface circuits, and when the one or more processors execute computer instructions, the related method steps are performed to implement the methods in the above embodiments. The interface circuit is a transceiver / transceiver pin 1002.

[0414] This embodiment also provides a computer-readable storage medium storing computer instructions. When the computer instructions are executed on an electronic device, the electronic device performs the aforementioned method steps to implement the methods described in the above embodiments.

[0415] This embodiment also provides a computer program product containing computer instructions that, when executed by a computer or processor, cause the computer to perform the aforementioned related steps to implement the methods described in the above embodiments.

[0416] In addition, embodiments of this application also provide an apparatus, which may specifically be a chip, component, or module. The apparatus may include a connected processor and a memory; wherein the memory is used to store computer execution instructions, and when the apparatus is running, the processor may execute the computer execution instructions stored in the memory to cause the chip to execute the methods in the above-described method embodiments.

[0417] In this embodiment, the electronic device, computer-readable storage medium, computer program product, and chip are all used to execute the corresponding methods provided above. Therefore, for the beneficial effects that can be achieved, refer to the beneficial effects of the corresponding methods provided above; they are not repeated here.

[0418] Through the above description of the embodiments, those skilled in the art will understand that, for the sake of convenience and brevity, only the division of the above functional modules is used as an example. In actual applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.

[0419] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of modules or units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another apparatus, or some features may be ignored or not executed. Furthermore, the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.

[0420] The units described as separate components may or may not be physically separate. A component shown as a unit can be one or more physical units; that is, it can be located in one place or distributed in multiple different locations. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0421] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0422] Any content in the various embodiments of this application, as well as any content in the same embodiment, can be freely combined. Any combination of the above content is within the scope of this application.

[0423] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence, or the parts that contribute to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. This software product is stored in a storage medium and includes several instructions to cause a device (which may be a microcontroller, chip, etc.) or processor to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0424] The steps of the methods or algorithms described in conjunction with the embodiments of this application can be implemented in hardware or by a processor executing software instructions. The software instructions can consist of corresponding software modules, which can be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disks, portable hard disks, CD-ROMs, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to a processor, enabling the processor to read information from and write information to the storage medium. Of course, the storage medium can also be a component of the processor. The processor and the storage medium can reside in an ASIC.

[0425] Those skilled in the art will recognize that the functions described in the embodiments of this application in one or more of the above examples can be implemented using hardware, software, firmware, or any combination thereof. When implemented using software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer-readable storage media and communication media, wherein communication media include any medium that facilitates the transmission of a computer program from one place to another. Storage media can be any available medium that can be accessed by a general-purpose or special-purpose computer.

[0426] The embodiments of this application have been described above with reference to the accompanying drawings. However, this application is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of this application without departing from the spirit and scope of the claims, and all of these forms are within the protection scope of this application.
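For illustration only, the encapsulation of T.35 information into a box and its subsequent reading can be sketched as follows. The sketch assumes the common box layout of a 32-bit big-endian size followed by a four-character type. The box type `t35x`, the field widths (one byte for the country code, two bytes each for the terminal provider code and the terminal provider oriented code), and the names `T35Info`, `write_t35_box`, and `read_t35_box` are all hypothetical and are not defined by this application.

```python
import struct
from typing import NamedTuple

BOX_TYPE = b"t35x"       # hypothetical four-character code for the first box
HEADER_FMT = ">BHH"      # assumed widths: country code 1 byte, the two provider codes 2 bytes each
BOX_HEADER_SIZE = 8      # 4-byte size + 4-byte type
T35_HEADER_SIZE = struct.calcsize(HEADER_FMT)


class T35Info(NamedTuple):
    country_code: int
    terminal_provider_code: int
    terminal_provider_oriented_code: int
    payload: bytes       # e.g., multimedia metadata such as HDR metadata


def write_t35_box(info: T35Info) -> bytes:
    """Serialize the T.35 information as one box: size, type, header fields, payload."""
    body = struct.pack(HEADER_FMT, info.country_code, info.terminal_provider_code,
                       info.terminal_provider_oriented_code) + info.payload
    return struct.pack(">I4s", BOX_HEADER_SIZE + len(body), BOX_TYPE) + body


def read_t35_box(data: bytes) -> T35Info:
    """Parse a box produced by write_t35_box and recover the T.35 information."""
    if len(data) < BOX_HEADER_SIZE + T35_HEADER_SIZE:
        raise ValueError("buffer too short for a T.35 box")
    size, box_type = struct.unpack_from(">I4s", data, 0)
    if box_type != BOX_TYPE:
        raise ValueError("not a T.35 box")
    if size < BOX_HEADER_SIZE + T35_HEADER_SIZE or size > len(data):
        raise ValueError("invalid box size")
    country, provider, oriented = struct.unpack_from(HEADER_FMT, data, BOX_HEADER_SIZE)
    payload = data[BOX_HEADER_SIZE + T35_HEADER_SIZE:size]
    return T35Info(country, provider, oriented, payload)
```

In a complete file, such a box would be nested inside a metadata box or a streaming data box as described in the embodiments; the sketch shows only the innermost box.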

Claims

1. A data encapsulation method, characterized in that the method comprises: obtaining T.35 information, wherein the T.35 information includes a country code, a terminal provider code, and a terminal provider oriented code; and encapsulating the T.35 information into an image file format, wherein the image file format includes a first box, and the first box includes the T.35 information.

2. The method according to claim 1, characterized in that the image file format further includes a second box, the second box including any one of the following: a metadata box or a streaming data box; and the first box is nested within the second box.

3. The method according to claim 2, characterized in that the second box includes multiple third boxes having a hierarchical relationship; and the first box is nested within a third box at any level.

4. The method according to claim 1, characterized in that the image file format further includes an item identifier and a media box, wherein the item identifier indicates that the T.35 information is an item; the T.35 information includes header information and a payload, the header information includes the country code, the terminal provider code, and the terminal provider oriented code, and the payload includes multimedia metadata; and the first box includes the header information of the T.35 information, and the media box includes the payload of the T.35 information.

5. The method according to claim 2 or 3, characterized in that the T.35 information is an item, and the metadata box includes an image information box; and the first box is nested within the image information box, and the first box is an item information entry box.

6. The method according to claim 2 or 3, characterized in that the T.35 information is an item property, and the metadata box includes an item property container box and an item property association box; and the first box is nested within the item property container box, and the item property association box includes association information between the T.35 information and the corresponding item.

7. The method according to claim 2 or 3, characterized in that the T.35 information is a sample entry, the streaming data box includes a track box, and the track box includes a sample description box; and the first box is nested within the sample description box.

8. The method according to claim 2 or 3, characterized in that the T.35 information is a sample group entry, the streaming data box includes a track box, and the track box includes a sample group description box; and the first box is nested within the sample group description box.

9. The method according to claim 2 or 3, characterized in that the T.35 information is a sample entry; the streaming data box includes a track box, and the track box includes a sample description box and a sample group description box; the T.35 information includes header information and a payload; the header information includes the country code, the terminal provider code, and the terminal provider oriented code; the payload includes multimedia metadata; the first box includes the header information of the T.35 information, and the first box is nested within the sample group description box; and the sample description box includes the payload of the T.35 information.

10. The method according to claim 2 or 3, characterized in that the T.35 information is a sample entry, the streaming data box includes a timed metadata track box, and the timed metadata track box includes a metadata sample entry box; and the first box is nested within the metadata sample entry box.

11. The method according to any one of claims 1 to 10, characterized in that the method further comprises: obtaining encoded multimedia data; and encapsulating the encoded multimedia data into the image file format, wherein the image file format further includes a media box or a streaming data box, and the media box or streaming data box contains the encoded multimedia data.

12. The method according to any one of claims 1 to 11, characterized in that the T.35 information further includes multimedia metadata.

13. A data acquisition method, characterized in that the method comprises: receiving an image file format, wherein the image file format includes a first box, and the first box includes T.35 information; and reading the T.35 information from the first box of the image file format, wherein the T.35 information includes a country code, a terminal provider code, and a terminal provider oriented code.

14. The method according to claim 13, characterized in that the image file format further includes a second box, the second box including any one of the following: a metadata box or a streaming data box; and the first box is nested within the second box.

15. The method according to claim 14, characterized in that the second box includes multiple third boxes having a hierarchical relationship; and the first box is nested within a third box at any level.

16. The method according to claim 13, characterized in that the image file format further includes an item identifier and a media box, wherein the item identifier indicates that the T.35 information is an item; the T.35 information includes header information and a payload, the header information includes the country code, the terminal provider code, and the terminal provider oriented code, and the payload includes multimedia metadata; and the first box includes the header information of the T.35 information, and the media box includes the payload of the T.35 information.

17. The method according to claim 14 or 15, characterized in that the T.35 information is an item, and the metadata box includes an image information box; and the first box is nested within the image information box, and the first box is an item information entry box.

18. The method according to claim 14 or 15, characterized in that the T.35 information is an item property, and the metadata box includes an item property container box and an item property association box; and the first box is nested within the item property container box, and the item property association box includes association information between the T.35 information and the corresponding item.

19. The method according to claim 14 or 15, characterized in that the T.35 information is a sample entry, the streaming data box includes a track box, and the track box includes a sample description box; and the first box is nested within the sample description box.

20. The method according to claim 14 or 15, characterized in that the T.35 information is a sample group entry, the streaming data box includes a track box, and the track box includes a sample group description box; and the first box is nested within the sample group description box.

21. The method according to claim 14 or 15, characterized in that the T.35 information is a sample entry; the streaming data box includes a track box, and the track box includes a sample description box and a sample group description box; the T.35 information includes header information and a payload; the header information includes the country code, the terminal provider code, and the terminal provider oriented code; the payload includes multimedia metadata; the first box includes the header information of the T.35 information, and the first box is nested within the sample group description box; and the sample description box includes the payload of the T.35 information.

22. The method according to claim 14 or 15, characterized in that the T.35 information is a sample entry, the streaming data box includes a timed metadata track box, and the timed metadata track box includes a metadata sample entry box; and the first box is nested within the metadata sample entry box.

23. The method according to any one of claims 13 to 22, characterized in that the image file format further includes a media box or a streaming data box, and the media box or streaming data box includes encoded multimedia data; and the method further comprises: obtaining the encoded multimedia data from the media box or streaming data box of the image file format; and decoding the encoded multimedia data to obtain reconstructed multimedia data.

24. The method according to any one of claims 14 to 23, characterized in that the T.35 information further includes multimedia metadata.

25. An image file format, characterized in that the image file format is obtained by encapsulation according to the method of any one of claims 1 to 12.

26. An electronic device, characterized by comprising: a memory and a processor, wherein the memory is coupled to the processor; and the memory stores program instructions that, when executed by the processor, cause the electronic device to perform the method according to any one of claims 1 to 12.

27. An electronic device, characterized by comprising: a memory and a processor, wherein the memory is coupled to the processor; and the memory stores program instructions that, when executed by the processor, cause the electronic device to perform the method according to any one of claims 13 to 24.

28. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed on a computer or processor, causes the computer or processor to perform the method according to any one of claims 1 to 24.

29. A computer program product, characterized in that the computer program product includes computer instructions that, when executed by a computer or processor, cause the steps of the method according to any one of claims 1 to 24 to be performed.

30. A computer-readable storage medium, characterized in that the computer-readable storage medium stores an image file format, and the image file format is obtained by encapsulation according to the method of any one of claims 1 to 12.