Systems and methods for efficient compression, representation and decompression of variable tabular data

A technology relating to tabular data and compressors, applied in file systems, electronic digital data processing, and digital data information retrieval, that can address problems such as a lack of advanced features.

Pending Publication Date: 2022-05-27
KONINKLIJKE PHILIPS NV

AI Technical Summary

Problems solved by technology

However, many of these approaches are based on disk-based array management tools (e.g., TileDB and HDF5), which lack advanced features including, but not limited to, metadata, links, and attribute-specific index processing.



Examples


Example 2

[0162] …

[0163] Dataset 2 (gene expression data) →

[0164] Annotation file (sample 1)

[0165] Annotation file (sample 2)

[0166] …

[0167] …

[0168] In one embodiment, different annotation files can be merged together to achieve improved compression and analysis performance, for example, as follows:

[0169] Dataset Group (Large Studies) →

[0170] Dataset 1 (variant call data) →

[0171] Annotation files (all samples)

[0172] Dataset 2 (gene expression data) →

[0173] Annotation files (all samples)

[0174] …
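The benefit of merging annotation files can be illustrated with a small experiment. The following is a minimal sketch, not the method of the source: it assumes that the annotation files of different samples share most of their feature identifiers, uses synthetic rows (hash-derived identifiers are a made-up stand-in), and uses zlib purely as a convenient general-purpose compressor. The names `annotation_file` and `compressed_sizes` are hypothetical.

```python
import hashlib
import zlib


def annotation_file(sample: str, n_rows: int = 200) -> bytes:
    """Build a synthetic tab-delimited annotation file for one sample."""
    rows = []
    for i in range(n_rows):
        # Hypothetical feature identifier, identical across samples.
        feature_id = hashlib.sha256(str(i).encode()).hexdigest()[:32]
        rows.append(f"chr1\t{1000 + 10 * i}\t{feature_id}\t{sample}")
    return "\n".join(rows).encode()


def compressed_sizes(files):
    """Return (total size when compressed separately, size when merged first)."""
    separate = sum(len(zlib.compress(f)) for f in files)
    merged = len(zlib.compress(b"\n".join(files)))
    return separate, merged


separate, merged = compressed_sizes(
    [annotation_file("sample1"), annotation_file("sample2")]
)
print(f"separate: {separate} bytes, merged: {merged} bytes")
```

With data of this shape, the merged stream lets the compressor reuse content from the first file when encoding the second, which is the kind of cross-file redundancy the merged layout is meant to exploit.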

[0175] To implement this embodiment, the system processor can augment the existing dataset header structure with additional fields to support the data type (sequence / variant / gene expression / …), the number of annotation files in the dataset, and the byte offset of each file. When a compressor is shared across annotation files or across datasets, the compressor's parameters can be stored at the dataset level or the dataset group level, respective...
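The additional header fields described in this paragraph can be sketched as a small binary record. This is an illustrative layout only: the field widths, the little-endian byte order, the numeric data-type codes, and the `DatasetHeader` name are all assumptions, since the source names the fields but not their encoding.

```python
import struct
from dataclasses import dataclass
from typing import List

# Hypothetical numeric codes; the source only lists the categories.
DATA_TYPES = {"sequence": 0, "variant": 1, "gene_expression": 2}


@dataclass
class DatasetHeader:
    data_type: int            # one of the DATA_TYPES values
    file_offsets: List[int]   # byte offset of each annotation file

    def pack(self) -> bytes:
        # Assumed layout: u8 data type, u32 file count, one u64 offset per file.
        head = struct.pack("<BI", self.data_type, len(self.file_offsets))
        body = struct.pack(f"<{len(self.file_offsets)}Q", *self.file_offsets)
        return head + body

    @classmethod
    def unpack(cls, raw: bytes) -> "DatasetHeader":
        data_type, count = struct.unpack_from("<BI", raw, 0)
        offsets = struct.unpack_from(f"<{count}Q", raw, struct.calcsize("<BI"))
        return cls(data_type, list(offsets))


header = DatasetHeader(DATA_TYPES["variant"], [0, 4096, 10240])
raw = header.pack()
restored = DatasetHeader.unpack(raw)
```

Storing one offset per file lets a reader seek directly to a single annotation file without scanning the ones before it. Shared compressor parameters would sit in a separate field at the dataset or dataset group level, which this sketch does not model.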


Abstract

A method for controlling data compression includes accessing genomic annotation data in one of a plurality of first file formats; extracting attributes from the genomic annotation data; dividing the genomic annotation data into chunks; and processing the extracted attributes and chunks into correlated information. The method further includes selecting different compressors for the attributes and chunks identified in the correlated information, and generating a file in a second file format that includes the correlated information together with information indicating the different compressors for the attributes and chunks indicated in the correlated information. The information indicating the different compressors is processed into the second file format to allow selective decompression of the attributes and chunks indicated in the correlated information.
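The flow summarized in the abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the claimed file format: the standard-library codecs (zlib, bz2, lzma) stand in for whatever compressors an implementation would select, and the index layout and the function names `write_container` and `read_block` are hypothetical.

```python
import bz2
import lzma
import zlib

# Stand-in codecs; the source does not name specific compressors.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def write_container(rows, attr_codecs, chunk_rows):
    """Split rows into chunks and compress each attribute of each chunk
    with its own codec. Returns (index, payload); the index records the
    codec, byte offset and length of every compressed block."""
    index, payload = [], bytearray()
    for chunk_id, start in enumerate(range(0, len(rows), chunk_rows)):
        chunk = rows[start:start + chunk_rows]
        for attr, codec in attr_codecs.items():
            raw = "\n".join(str(r[attr]) for r in chunk).encode()
            blob = CODECS[codec][0](raw)
            index.append({"chunk": chunk_id, "attr": attr, "codec": codec,
                          "offset": len(payload), "length": len(blob)})
            payload += blob
    return index, bytes(payload)


def read_block(index, payload, chunk_id, attr):
    """Decompress only the requested (chunk, attribute) block."""
    for entry in index:
        if entry["chunk"] == chunk_id and entry["attr"] == attr:
            start = entry["offset"]
            blob = payload[start:start + entry["length"]]
            return CODECS[entry["codec"]][1](blob).decode().split("\n")
    raise KeyError((chunk_id, attr))


rows = [{"chrom": "chr1", "pos": 100 + i, "gene": f"G{i % 3}"} for i in range(10)]
index, payload = write_container(
    rows, {"chrom": "zlib", "pos": "lzma", "gene": "bz2"}, chunk_rows=4
)
positions = read_block(index, payload, chunk_id=1, attr="pos")
```

Because the index stores the codec alongside each block's location, a reader can fetch one attribute of one chunk without touching the rest of the payload.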

Description

[0001] Cross References to Related Applications

[0002] This application is related to U.S. Provisional Patent Application US 62/923141, filed October 18, 2019, which is hereby incorporated by reference in its entirety for all purposes.

[0003] This application is related to U.S. Provisional Patent Application US 62/923113, filed October 18, 2019, which is hereby incorporated by reference in its entirety for all purposes.

[0004] This application is related to U.S. Provisional Patent Application US 62/956941 (Attorney Docket No. 2019P00831US01), entitled "Customizable Delimited Text Compression Framework," filed concurrently with this application, which is hereby incorporated by reference in its entirety for all purposes.

Technical Field

[0005] One or more embodiments herein relate to the management of compression and decompression of information including, but not limited to, tabular data and delimited text files.

Background

[0006] Genomic ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B50/50, G16B50/10, H03M7/30, G06F16/13, G06F16/174
CPC: G16B50/50, G16B50/10, H03M7/30, H03M7/6082, H03M7/70, G06F16/1744, G06F16/13, H03M7/3079, G06F21/604
Inventor: S·尚达科, 张贻谦
Owner: KONINKLIJKE PHILIPS NV