
Method and System for Providing an XML Binary Format

Binary formatting applied to XML and other tag-based data descriptions addresses the unnecessary overhead that piece-by-piece parsing of an XML document imposes on the recipient computing device, and generally reduces the size of the resulting file.

Inactive Publication Date: 2009-05-21
MICROSOFT TECH LICENSING LLC
101 Cites, 9 Cited by

AI Technical Summary

Benefits of technology

[0007]In view of the foregoing, the present invention provides a way of incorporating binary formatting into a tag-based description language, such as XML. The binary formatting is achieved by tokenizing the tag and attribute names into variable-sized numeric tokens, thereby obviating the need for repetitive or redundant storage of lengthy Unicode words, etc. The binary formatting minimizes parsing time and the generation of overhead incident to the formatting and parsing of data. Parsing time and, where applicable, XML generation time are thereby substantially decreased, and the size of the resulting file generally decreases too.
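The idea of replacing repeated names with numeric tokens can be sketched as follows. This is a minimal illustration only: the `NameTable` class, the `encode_varint` helper, and the 7-bit continuation-bit integer layout are assumptions made for the example and are not taken from the patent text, which does not specify the token encoding in this excerpt.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, low group first.

    Assumed layout (hypothetical): the high bit of each byte marks
    that another byte follows, so small tokens occupy a single byte.
    """
    if n < 0:
        raise ValueError("token numbers must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)         # last group
            return bytes(out)


class NameTable:
    """Hypothetical table mapping tag/attribute names to numeric tokens."""

    def __init__(self) -> None:
        self._tokens: dict[str, int] = {}

    def token_for(self, name: str) -> tuple[int, bool]:
        """Return (token, is_new); tokens are assigned 1, 2, 3, ... on first use."""
        if name in self._tokens:
            return self._tokens[name], False
        token = len(self._tokens) + 1
        self._tokens[name] = token
        return token, True
```

Under these assumptions, a name such as "description" is written out once when its token is first assigned, and every later occurrence costs only the bytes of `encode_varint(token)` rather than the full name.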

Problems solved by technology

One prevailing problem with XML is that parsing of the XML document by the recipient computing device generates unnecessary overhead, and thus inserts time delays in the process.
The piece-by-piece process of dividing the document into individual elements, attributes, and other pieces, also known as “tokenization”, can consume a considerable amount of time.
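To make the piece-by-piece process concrete, the sketch below uses Python's standard `xml.parsers.expat` module to split a small document into start-element, text, and end-element pieces. The `tokenize` function name and the sample input are illustrative assumptions; the patent does not prescribe any particular parser.

```python
import xml.parsers.expat


def tokenize(xml_text: str) -> list:
    """Split an XML string into a flat list of parse events."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver contiguous character data in one piece

    def on_start(name, attrs):
        events.append(("start", name, dict(attrs)))

    def on_end(name):
        events.append(("end", name))

    def on_text(data):
        events.append(("text", data))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return events
```

Every element name and attribute name in the input is scanned as text each time it occurs, which is the repeated work that a pre-tokenized binary format aims to avoid.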



Examples


Example 1-1

An XML Document

[0043]

 Verbatim DataLife MF 2HD 10 3.5” black  floppy disks 

[0044]This document is text and might well be stored in a text file. The file can be edited with any standard text editor, such as BBEdit, UltraEdit, Emacs, or vi. A special XML editor is unnecessary. Then again, this document might not be a file at all. It might be a record in a database. It might be assembled on the fly by a CGI query to a web server and exist only in a computer's memory. It might even be stored in multiple files and assembled at runtime. Even if it isn't in a file, however, the document is a text document that can be read and transmitted by any software capable of reading and transmitting text.

[0045]Programs that try to understand the contents of the XML document, i.e., programs that do not merely treat it as any other text file, use an XML parser to read the document. The parser is responsible for dividing the document into individual elements, attributes, and other piec...

Examples

[0078]In Examples A and B below, a text-based representation precedes a binary formatted or tokenized representation in accordance with the present invention. White space is ignored to make the encoding less verbose, and the character count does not include the white space. In Example A, the binary representation reduces the overall byte count from 41 to 34. In Example B, the binary representation reduces the overall byte count from 293 to 212.

Example A

[0079]

<a><b>foo</b><bar/></a><a><b>text</b></a>

STREAM              DESCRIPTION
0x08                Name definition for ‘a’ gets name token 1
0x01 ‘a’            Text data
0x01                Element
0x01                Name token 1 for ‘a’
0x08                Name definition for ‘b’ gets name token 2
0x01 ‘b’            Text data
0x01                Element
0x02                Name token 2 for ‘b’
0x89                Text data token with singular bit set (END of ‘b’)
0x03 ‘f’‘o’‘o’      Text data
0x08                Name definition for ‘bar’ gets name token 3
0x03 ‘b’‘a’‘r’      Text data
0x81                Element with empty bit set
0x03                Name token 3 for ‘bar’
0x00                END of ‘bar’ (no attributes)
0x00                END of ‘a’
0x01                Element
0x01                Name token 1 for ‘a’
0x01                Element
0x02                Name token 2 for ‘b’
0x89                Text data token with singular bit set (END of ‘b’)
0x04 ‘t’‘e’‘x’‘t’   Text data
0x00                END of ‘a’
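The stream in Example A can be walked mechanically. The sketch below is an illustrative decoder, not the patented implementation: the token values (0x00 END, 0x01 element, 0x08 name definition, 0x09 text data, 0x80 as the empty/singular bit), the single-byte lengths and name tokens, and the function name `decode` are all inferred from the byte listing in Example A and should be read as assumptions. Attributes are not handled because the example contains none.

```python
# Token values inferred from the Example A listing (assumptions).
END, ELEMENT, NAME_DEF, TEXT = 0x00, 0x01, 0x08, 0x09
FLAG = 0x80  # "empty" bit on an element token, "singular" bit on a text token

# The Example A byte stream, transcribed row by row from the listing.
STREAM = b"".join([
    bytes([0x08, 0x01]), b"a",     # define 'a' -> name token 1
    bytes([0x01, 0x01]),           # element, name token 1
    bytes([0x08, 0x01]), b"b",     # define 'b' -> name token 2
    bytes([0x01, 0x02]),           # element, name token 2
    bytes([0x89, 0x03]), b"foo",   # text, singular bit set (ends 'b')
    bytes([0x08, 0x03]), b"bar",   # define 'bar' -> name token 3
    bytes([0x81, 0x03]),           # empty element, name token 3
    bytes([0x00]),                 # END of 'bar'
    bytes([0x00]),                 # END of 'a'
    bytes([0x01, 0x01]),           # element, name token 1
    bytes([0x01, 0x02]),           # element, name token 2
    bytes([0x89, 0x04]), b"text",  # text, singular bit set (ends 'b')
    bytes([0x00]),                 # END of 'a'
])


def decode(stream: bytes) -> str:
    """Rebuild markup text from the tokenized stream."""
    names = {}   # name token -> name, numbered in order of definition
    stack = []   # open elements as (name, is_empty)
    out = []
    pos = 0

    def read_text() -> str:
        nonlocal pos
        length = stream[pos]  # assumed: one-byte length prefix
        text = stream[pos + 1:pos + 1 + length].decode("ascii")
        pos += 1 + length
        return text

    def close() -> None:
        name, empty = stack.pop()
        out.append("/>" if empty else f"</{name}>")

    while pos < len(stream):
        token = stream[pos]
        pos += 1
        kind, flagged = token & 0x7F, bool(token & FLAG)
        if kind == NAME_DEF:
            names[len(names) + 1] = read_text()
        elif kind == ELEMENT:
            name = names[stream[pos]]  # assumed: one-byte name token
            pos += 1
            stack.append((name, flagged))
            out.append(f"<{name}" if flagged else f"<{name}>")
        elif kind == TEXT:
            out.append(read_text())
            if flagged:  # singular bit: the text also closes its element
                close()
        elif kind == END:
            close()
        else:
            raise ValueError(f"unknown token 0x{token:02x}")
    return "".join(out)
```

Under these assumptions, `decode(STREAM)` yields `<a><b>foo</b><bar/></a><a><b>text</b></a>`, which is 41 characters long and so agrees with the text-form byte count quoted for Example A.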


Abstract

A technique for incorporating binary formatting into a tag-based description language, such as XML, is provided. The binary formatting is achieved by tokenizing the tag and attribute names into variable-sized numeric tokens, thereby obviating the need for repetitive or redundant storage of lengthy Unicode words, etc. The binary formatting minimizes parsing time and the generation of overhead incident to the formatting and parsing of data. Parsing time is thereby substantially decreased, and the size of the resulting file generally decreases too.

Description

CROSS-REFERENCE TO RELATED APPLICATION

[0001]This application is a continuation of U.S. patent application Ser. No. 09/838,436 (MSFT-0323/167389.04) filed Apr. 19, 2001 entitled “Method and System for Providing an XML Binary Format,” which is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION

[0002]The present invention relates to tag-based descriptions of data. More particularly, the present invention relates to binary formatting of tag-based data descriptions. The present invention is suited for, but by no means limited to, methods and systems for tokenizing text-based data formats such as XML.

BACKGROUND OF THE INVENTION

[0003]XML, the Extensible Markup Language, is a W3C-endorsed standard for document markup. It defines a generic syntax used to mark up data with simple and complex human-readable tags. It provides a self-describing standard format for computer documents. This format is flexible enough to be customized for domains as diverse as web sites, electron...


Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F17/00, G06F17/21, G06F17/22
CPC: G06F17/22, G06F17/218, G06F40/117, G06F40/12
Inventor: CSERI, ISTVAN; SEELIGER, OLIVER NICHOLAS; LAYMAN, ANDREW J.
Owner: MICROSOFT TECH LICENSING LLC