Multivariate big data fusion method and system based on BLOB (Binary Large OBject)

A fusion method based on big data technology, applied in the fields of databases, the semantic web, and big data. It addresses the lack of support for binary large objects and is said to enhance coupling.

Active Publication Date: 2017-09-01
COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI

AI Technical Summary

Problems solved by technology

The RDF model lacks support for binary large objects (BLOBs).


Examples


Embodiment 1

[0041] Embodiment 1, multivariate big data fusion management model RDF-B

[0042] RDF-B is an extension of the RDF model (https://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/), so it retains all features of the RDF model, such as triples and directed graphs. The definition of the model consists of 3 parts:

[0043] (1) RDF-B model: Like the standard RDF model, the RDF-B model uses triples of the form <subject, predicate, object> to express a resource's attributes (predicates) and their values (objects). An attribute value can be of text, Boolean, numeric, or time type, or even another resource. The difference is that an RDF-B attribute value can also be of BLOB type;

[0044] (2) Attributes of a BLOB attribute value: A BLOB-type attribute value has its own attributes: content, length, digest, and a 32-bit mark, as shown in Table 1.

[0045] Table 1 is the attribute table of the BLOB attribute value

[0046] ...
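
As a rough illustration of the four attributes named in [0044], the following Java sketch models a BLOB attribute value as an immutable class. The class name `BlobValue`, the accessor names, and the choice of a hex string for the digest are assumptions made for this example. The source does not show how the model represents these attributes internally.

```java
import java.util.Arrays;

/** Hypothetical holder for the four attributes of a BLOB attribute value. */
final class BlobValue {
    private final byte[] content; // raw binary content
    private final long length;    // content length in bytes
    private final String digest;  // digest of the content (assumed: hex string)
    private final int mark;       // 32-bit mark

    BlobValue(byte[] content, String digest, int mark) {
        this.content = Arrays.copyOf(content, content.length); // defensive copy
        this.length = content.length;
        this.digest = digest;
        this.mark = mark;
    }

    byte[] content() { return Arrays.copyOf(content, content.length); }
    long length() { return length; }
    String digest() { return digest; }
    int mark() { return mark; }
}
```

The length is derived from the content rather than passed in, so the two cannot disagree.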

Embodiment 2

[0051] Embodiment 2, the text expression method of RDF-B attribute value

[0052] Non-BLOB attribute values in the RDF-B model use the standard RDF text expression: a literal expresses the attribute value and usually consists of two parts, a lexical form and a datatype IRI. The former represents the text of the attribute value and the latter its type, for example: "hello"^^xsd:string, "1"^^xsd:integer.

[0053] For BLOB attribute values, RDF-B uses the following expressions:

[0054] <…> ::= "<…>"^^<…>

[0055] <…> ::= content:<content-value>,length:<length-value>,digest:<digest-value>,mark:<mark-value>

[0056] Here, content-value, length-value, digest-value, and mark-value correspond respectively to the content, length, digest, and 32-bit mark of the BLOB attribute value.
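
The field layout in [0055] and [0056] can be sketched in Java as a formatter and a parser for the quoted part of a BLOB literal. Several details here are assumptions, since the source does not state them: the content is Base64-encoded, the mark is written as eight hex digits, and `rdfb:blob` is a hypothetical placeholder for the datatype name. The class name `BlobLiteralText` is also hypothetical.

```java
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

final class BlobLiteralText {
    // Hypothetical datatype name; the source does not show the actual IRI.
    static final String DATATYPE = "rdfb:blob";

    /** Builds "content:...,length:...,digest:...,mark:..."^^DATATYPE. */
    static String format(byte[] content, String digestHex, int mark) {
        // Assumed encoding: Base64 never contains ',', ':' or '"'.
        String contentValue = Base64.getEncoder().encodeToString(content);
        String body = "content:" + contentValue
                + ",length:" + content.length
                + ",digest:" + digestHex
                + ",mark:" + String.format("%08x", mark);
        return "\"" + body + "\"^^" + DATATYPE;
    }

    /** Splits the quoted part of a literal back into its named fields. */
    static Map<String, String> parse(String literal) {
        int close = literal.lastIndexOf("\"^^");
        if (!literal.startsWith("\"") || close < 1) {
            throw new IllegalArgumentException("not a typed literal: " + literal);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : literal.substring(1, close).split(",")) {
            int colon = pair.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("missing ':' in field: " + pair);
            }
            fields.put(pair.substring(0, colon).trim(), pair.substring(colon + 1).trim());
        }
        return fields;
    }
}
```

Embedding the content inline like this only suits small values. It is shown here purely to make the four-field layout concrete.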

[0057] The following code shows two RDF-B triples. The first line is a standard RDF triple, and the second line contains a BLOB attribute value.

...

Embodiment 3

[0063] Embodiment 3, the creation method of BLOB attribute value

[0064] The invention provides methods for creating BLOB attribute values from different data sources:

[0065] Literal create(byte[] bytes): generates a BLOB attribute value from a byte array;

[0066] Literal create(File file): generates a BLOB attribute value from a file;

[0067] Literal create(InputStreamSource source): generates a BLOB attribute value from an input stream data source;

[0068] Literal create(String text): generates a BLOB attribute value from a text string.

[0069] Taking create(File file) as an example, the pseudocode is as follows:

[0070] val bl = new BlobLiteral {

[0071] val mark32 = IOUtils.readBytes(openStream(), 32); // read the leading 32-bit mark

[0072] def openStream() = new FileInputStream(fileName); // open the file input stream

[0073] def getLength() = file.length(); // get the length of the file

[0074] def getDigest() = DigestUtils.md5Hex(op...
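
The truncated pseudocode above derives a mark, a length, and an MD5 digest from a file. A minimal Java sketch of the same idea follows, using only JDK classes in place of the IOUtils and DigestUtils helpers. It is not the patent's implementation. The class name `BlobSummary` is hypothetical, and the mark is assumed to be the first four bytes of the content read as a big-endian integer, zero-padded for shorter inputs.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical summary of a BLOB: length, MD5 hex digest, and 32-bit mark. */
final class BlobSummary {
    final long length;
    final String digest;
    final int mark;

    private BlobSummary(long length, String digest, int mark) {
        this.length = length;
        this.digest = digest;
        this.mark = mark;
    }

    /** Opens the file and delegates to the stream-based variant. */
    static BlobSummary create(File file) {
        try (InputStream in = new FileInputStream(file)) {
            return create(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the stream once, tracking length, digest, and leading bytes. */
    static BlobSummary create(InputStream in) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] head = new byte[4]; // assumed mark: first 4 bytes, zero-padded
            int headLen = 0;
            long total = 0;
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                for (int i = 0; i < n && headLen < head.length; i++) {
                    head[headLen++] = buf[i];
                }
                md5.update(buf, 0, n);
                total += n;
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest()) {
                hex.append(String.format("%02x", b));
            }
            return new BlobSummary(total, hex.toString(), ByteBuffer.wrap(head).getInt());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Reading the stream in a single pass avoids opening the file three times, once each for the mark, the length, and the digest.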


Abstract

The invention discloses a multivariate big data fusion method and system based on BLOBs (Binary Large OBjects). The method comprises the following steps: 1) an RDF-B data model is created on the basis of the RDF (Resource Description Framework) data model; the RDF-B model uses triples to express each resource's attributes and their attribute values, where an attribute value may be of BLOB type, and a BLOB-type attribute value comprises content, length, digest, and mark information; and 2) the RDF-B data model generates a triple for each piece of received data and stores it in a front-end storage system. If the data is of BLOB type, the RDF-B data model generates a 4-tuple <handle, length, digest, mark> from the BLOB-type attribute value in the corresponding triple and uses it as the attribute value of that triple; the triple is then stored in the front-end storage system, and the data content is stored in a back-end storage system according to the handle.
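
The storage split described in the abstract can be sketched with two in-memory collections standing in for the front-end and back-end storage systems. This is an illustration only. The class name `SplitStore`, the `blob-N` handle scheme, and the textual form of the 4-tuple are assumptions, and the source does not say which storage systems are actually used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SplitStore {
    /** A stored triple; the object is a plain literal or a BLOB 4-tuple. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    private final List<Triple> frontEnd = new ArrayList<>();     // triples only
    private final Map<String, byte[]> backEnd = new HashMap<>(); // handle -> content
    private int nextHandle = 1;

    /** Non-BLOB data: the triple goes to the front end unchanged. */
    void addLiteral(String subject, String predicate, String literal) {
        frontEnd.add(new Triple(subject, predicate, literal));
    }

    /** BLOB data: content goes to the back end, the 4-tuple to the front end. */
    String addBlob(String subject, String predicate, byte[] content, String digest, int mark) {
        String handle = "blob-" + nextHandle++; // hypothetical handle scheme
        backEnd.put(handle, content.clone());
        String tuple = "<" + handle + "," + content.length + "," + digest
                + "," + String.format("%08x", mark) + ">";
        frontEnd.add(new Triple(subject, predicate, tuple));
        return handle;
    }

    /** Looks up BLOB content by handle; returns null for unknown handles. */
    byte[] content(String handle) {
        byte[] c = backEnd.get(handle);
        return c == null ? null : c.clone();
    }

    List<Triple> triples() {
        return new ArrayList<>(frontEnd);
    }
}
```

Because the front end holds only the handle, length, digest, and mark, a query over triples never has to load the binary content itself.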

Description

Technical field

[0001] The invention relates to the technical fields of big data, databases, and the semantic web, and proposes a multivariate big data fusion method and system.

Background

[0002] With the emergence of massive, multivariate, heterogeneous data, traditional database technology (usually meaning relational database systems) cannot manage massive unstructured or semi-structured data sets well. Big data technologies such as NoSQL and Hadoop can efficiently solve the management and processing of web-scale unstructured information in a distributed environment, and have deepened the application of big data.

[0003] Unlike big data, the semantic web and linked data approach the problem from the perspective of information organization: by introducing rich formal semantics, they improve the correlation and comprehensibility of data, and have gradually become a powerfu...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/13, G06F16/33
Inventors: 沈志宏, 黎建辉, 周园春, 侯艳飞, 胡良霖
Owner: COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI