Associated data compressing method friendly to query

An associated data compression method, applied in the field of big data, that addresses the reduced query efficiency and aggravated performance problems caused by large data sets, and improves the compression ratio.

Active Publication Date: 2017-05-24

WUHAN UNIV OF SCI & TECH +1

5 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

Although ever more storage media are available for increasingly large linked data sets, large data sets not only reduce query efficiency but also exacerbate performance problems in other common processes, such as RDF publishing and exchange.

Embodiment Construction

[0047] The technical solution of the present invention will be described in detail below in conjunction with the drawings and embodiments.

[0048] The technical solution provided by the present invention is an associated data set compression algorithm based on a relational matrix, specifically comprising the following steps:

[0049] 1. Define the memory model of triples, including three data segments of subject S, predicate P and object O;

[0050] 2. Input the associated data in N-Triples format and parse it to obtain a set of triples;

[0051] The detailed process is as follows:

[0052] 2.1. Filter out empty lines and lines starting with "#";

[0053] 2.2. Read each line of data and split the string on spaces;

[0054] 2.3. Map the segmented data to the subject, predicate and object of the triple to construct a triple;
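
Steps 1 and 2 above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the names `Triple` and `parse_ntriples` are hypothetical, and limiting the split to the first two spaces (so that literal objects containing spaces stay in one piece) is an assumption that goes beyond the plain "split by spaces" wording of step 2.2.

```python
from typing import Iterable, List, NamedTuple


class Triple(NamedTuple):
    """Memory model from step 1: subject S, predicate P, object O."""
    subject: str
    predicate: str
    obj: str


def parse_ntriples(lines: Iterable[str]) -> List[Triple]:
    """Parse N-Triples lines into Triple records (hypothetical helper)."""
    triples: List[Triple] = []
    for raw in lines:
        line = raw.strip()
        # Step 2.1: skip empty lines and comment lines starting with "#".
        if not line or line.startswith("#"):
            continue
        # Drop the terminating "." of an N-Triples statement.
        if line.endswith("."):
            line = line[:-1].rstrip()
        # Step 2.2: split on whitespace. Assumption: at most two splits,
        # so a literal object such as "hello world" is not broken apart.
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # malformed line, ignored in this sketch
        # Step 2.3: map the three fields to subject, predicate, object.
        triples.append(Triple(*parts))
    return triples
```

For example, `parse_ntriples(open("data.nt", encoding="utf-8"))` would return a list of `Triple` records, where `data.nt` is a placeholder file name.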

[0055] 3. Build a dictionary and replace the terms of each triple with IDs;

[0056] The detailed process is as follows:

[0057] 3.1. Flatten the triples obtained in the previous s...
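
The details of step 3 are cut off in this excerpt, so the following is only a generic dictionary-encoding sketch, not the patent's procedure. The function names, the sorted ordering of terms and the 1-based IDs are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

StrTriple = Tuple[str, str, str]
IdTriple = Tuple[int, int, int]


def build_dictionary(triples: Iterable[StrTriple]) -> Dict[str, int]:
    """Flatten the triples into a set of distinct terms and number them.

    Assumption: terms are sorted so that the IDs are deterministic.
    """
    terms = sorted({term for triple in triples for term in triple})
    return {term: term_id for term_id, term in enumerate(terms, start=1)}


def encode(triples: Iterable[StrTriple], dictionary: Dict[str, int]) -> List[IdTriple]:
    """Replace every subject, predicate and object with its dictionary ID."""
    return [(dictionary[s], dictionary[p], dictionary[o]) for s, p, o in triples]
```

A single shared dictionary is used here for simplicity; separate dictionaries per position would be an equally plausible reading of the truncated text.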

Abstract

The invention relates to a query-friendly associated data compression method. The method comprises the following steps: defining a relation mining rule and mining the potential association relations among triples; defining a compression query memory model consisting of a subject vector, a predicate vector and an object matrix; defining a serialization mode for the compression query memory model, with serialization and deserialization implemented using three auxiliary symbols; defining a mode of executing SPARQL queries on the compression query memory model, in which subjects and predicates are looked up by binary search and objects by linear traversal; and defining a scheme for the slow queries caused by an overly large object matrix, in which a large data block is divided into several small data blocks. Compared with most existing compression schemes, an associated data set processed by the method has a higher compression ratio, and SPARQL queries can be executed directly on the compressed data.
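
The abstract does not give the exact layout of the memory model, so the sketch below is one possible reading and not the patented structure. It assumes ID-encoded triples, a sorted subject vector, a sorted predicate vector, and an object matrix whose cell at (subject row, predicate column) holds a list of object IDs. The class and method names are hypothetical.

```python
from bisect import bisect_left
from typing import Iterable, List, Tuple

IdTriple = Tuple[int, int, int]


class CompressedStore:
    """Hypothetical subject vector + predicate vector + object matrix."""

    def __init__(self, triples: Iterable[IdTriple]) -> None:
        triples = list(triples)
        self.subjects: List[int] = sorted({s for s, _, _ in triples})
        self.predicates: List[int] = sorted({p for _, p, _ in triples})
        # objects[i][j] lists the objects of (subjects[i], predicates[j]).
        self.objects: List[List[List[int]]] = [
            [[] for _ in self.predicates] for _ in self.subjects
        ]
        for s, p, o in triples:
            row = self._find(self.subjects, s)
            col = self._find(self.predicates, p)
            self.objects[row][col].append(o)

    @staticmethod
    def _find(vector: List[int], key: int) -> int:
        """Binary search; returns the index of key, or -1 if absent."""
        i = bisect_left(vector, key)
        return i if i < len(vector) and vector[i] == key else -1

    def objects_of(self, subject: int, predicate: int) -> List[int]:
        """Pattern (s, p, ?o): binary search on both vectors."""
        row = self._find(self.subjects, subject)
        col = self._find(self.predicates, predicate)
        if row < 0 or col < 0:
            return []
        return list(self.objects[row][col])

    def pairs_with_object(self, obj: int) -> List[Tuple[int, int]]:
        """Pattern (?s, ?p, o): linear traversal of the object matrix."""
        hits = []
        for i, row in enumerate(self.objects):
            for j, cell in enumerate(row):
                if obj in cell:
                    hits.append((self.subjects[i], self.predicates[j]))
        return hits
```

The linear traversal in `pairs_with_object` grows with the size of the matrix, which is consistent with the abstract's motivation for dividing a large data block into smaller ones; that partitioning is not shown here.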

Description

Technical field

[0001] The invention relates to the field of big data and is used for the storage, transmission and querying of massive RDF, LOD and knowledge graph-related data. In particular, it relates to a query-friendly method for compressing associated data.

Background

[0002] Many associated data compression schemes exist, but most of them are not query-friendly. The generally accepted compression scheme is HDT, which has a high compression ratio but must be decompressed before it can be queried. Inspired by HDT, many compression techniques based on it have also been proposed, such as HDT FoQ, WaterFowl and HDT++. These techniques share a common feature: a high compression ratio, but poor support for queries.

[0003] There are also some query-friendly schemes, such as the BitMat method. This compression scheme uses a three-dimensional matrix to express triplet relations...

Application Information

IPC(8): G06F17/30

CPC: G06F16/33, G06F16/374

Inventor: 顾进广, 彭燊, 黄智生, 符海东, 梅琨

Owner: WUHAN UNIV OF SCI & TECH
