Parallel Reasoning Algorithm for Streaming RDF Data Based on Spark Streaming

The technique applies to reasoning over massive streaming RDF data. It addresses three problems in existing approaches: low efficiency on OWL Horst rules, inapplicability to complex rule sets, and the high communication cost of pseudo-bidirectional networks. It ensures the completeness of reasoning, reduces the number of tasks, and allows instance data to be read quickly.

Active Publication Date: 2022-03-11
FUZHOU UNIV

AI Technical Summary

Problems solved by technology

The IDRM algorithm is modeled specifically for RDFS rules, so it is inefficient for OWL Horst rule reasoning.
Slider is designed only for RDFS rules, so it is not suitable for complex OWL Horst rule reasoning.
The PRAS algorithm reasons over streaming data through a pseudo-bidirectional network, but the high communication cost of that network makes it inefficient on large volumes of streaming data.



Embodiment Construction

[0036] The technical solution of the present invention will be specifically described below in conjunction with the accompanying drawings.

[0037] The present invention provides a parallel reasoning algorithm for streaming RDF data based on Spark Streaming, comprising the following steps:

[0038] Step S1: construct the rule-connected variable relationship table from the OWL Horst inference rules. In the iterative parallel inference stage, the new batch of data in the Spark Streaming data stream and the data generated by the previous round of inference are obtained at regular intervals as input; the input schema (pattern) data and instance data are classified and stored in the corresponding Redis cluster;

[0039] Step S2: using the rule-connected variable relationship table, determine which rules can be activated in this round of reasoning, and combine them with the corresponding instance data to generate inferred data;

[0040] Step S3, delete and store the duplicate data generated in...
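
The loop described in steps S1 to S3 can be illustrated with a small, single-process sketch. It is not the patented implementation: plain Python sets stand in for the Redis cluster and for Spark Streaming batches, only three illustrative rules are included (subclass transitivity, type inheritance, and transitive properties), and the hypothetical `active_rules` function is a much simpler stand-in for the rule-connected variable relationship table, whose layout is not given in this excerpt.

```python
from typing import Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]

SUB = "rdfs:subClassOf"
TYPE = "rdf:type"
TRANSITIVE = "owl:TransitiveProperty"


def classify(triples: Iterable[Triple]) -> Tuple[Set[Triple], Set[Triple]]:
    """S1 (simplified): split input into schema triples and instance triples."""
    schema, instance = set(), set()
    for s, p, o in triples:
        is_schema = p == SUB or (p == TYPE and o == TRANSITIVE)
        (schema if is_schema else instance).add((s, p, o))
    return schema, instance


def active_rules(delta: Set[Triple], trans_props: Set[str]) -> Set[str]:
    """S2 (simplified): a rule is activated when the new data can match one of its antecedents."""
    preds = {p for _, p, _ in delta}
    active = set()
    if SUB in preds:
        active |= {"sc_trans", "type_inherit"}
    if TYPE in preds:
        # An rdf:type triple may be an instance typing or a transitivity declaration.
        active |= {"type_inherit", "prop_trans"}
    if preds & trans_props:
        active.add("prop_trans")
    return active


def infer_round(schema: Set[Triple], instance: Set[Triple], active: Set[str]) -> Set[Triple]:
    """Apply every activated rule once over the current store."""
    sub = {(s, o) for s, p, o in schema if p == SUB}
    trans = {s for s, p, o in schema if p == TYPE and o == TRANSITIVE}
    out: Set[Triple] = set()
    if "sc_trans" in active:
        out |= {(a, SUB, c) for a, b in sub for b2, c in sub if b == b2}
    if "type_inherit" in active:
        out |= {(x, TYPE, b) for x, p, a in instance if p == TYPE
                for a2, b in sub if a == a2}
    if "prop_trans" in active:
        out |= {(x, p, z) for x, p, y in instance if p in trans
                for y2, p2, z in instance if p2 == p and y2 == y}
    return out


def reason(batch: Iterable[Triple],
           schema: Optional[Set[Triple]] = None,
           instance: Optional[Set[Triple]] = None) -> Tuple[Set[Triple], Set[Triple]]:
    """Process one batch, iterating until no new triple is derived."""
    schema = set() if schema is None else schema
    instance = set() if instance is None else instance
    delta = set(batch)
    while delta:
        new_schema, new_instance = classify(delta)
        schema |= new_schema
        instance |= new_instance
        trans = {s for s, p, o in schema if p == TYPE and o == TRANSITIVE}
        derived = infer_round(schema, instance, active_rules(delta, trans))
        # S3 (simplified): drop triples already stored; the rest feed the next round.
        delta = derived - schema - instance
    return schema, instance


if __name__ == "__main__":
    first_batch = [
        ("A", SUB, "B"), ("B", SUB, "C"), ("x", TYPE, "A"),
        ("ex:partOf", TYPE, TRANSITIVE),
        ("a", "ex:partOf", "b"), ("b", "ex:partOf", "c"),
    ]
    schema, instance = reason(first_batch)
    # A later batch reuses the stored state, as a streaming job would.
    schema, instance = reason([("c", "ex:partOf", "d")], schema, instance)
    print(sorted(instance))
```

Passing the returned sets back into `reason` for each new batch mirrors the way the described method combines new stream data with previously inferred data; the loop stops because every round can only add triples from a finite set.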


Abstract

The invention relates to a parallel reasoning algorithm for streaming RDF data based on Spark Streaming. First, the rule-connected variable relationship table is constructed from the OWL Horst inference rules. In the iterative parallel inference stage, the new batch of data in the stream and the data generated by the previous round of inference are obtained at regular intervals as input, and the input schema (pattern) data and instance data are classified and stored in the corresponding Redis cluster. Next, the rule-connected variable relationship table is used to determine which rules can be activated in this round of reasoning, and these rules are combined with the corresponding instance data to generate inferred data. Finally, the duplicate data generated in this round is deleted, the remaining data is stored, and the iteration ends. The invention reduces the number of MapReduce tasks by using Spark for iterative reasoning over streaming data; the rule-connected variable relationship table stores the data and the new data generated during reasoning, which ensures the completeness of the algorithm; and the instance-triple storage scheme exploits the characteristics of Redis to trade space for time, so that instance data can be read quickly.
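
One way to picture the space-for-time trade in the instance-triple storage scheme is to write each triple under several keys, so that the lookups a rule join needs become single set reads. The sketch below is an assumption for illustration only: the key layout (`p:`, `ps:`, `po:`) and the helper names are hypothetical, and the `SetStore` class is an in-memory stand-in that mimics the semantics of the Redis `SADD` and `SMEMBERS` set commands rather than a real Redis client.

```python
from collections import defaultdict
from typing import Set


class SetStore:
    """In-memory stand-in for a Redis set store (SADD / SMEMBERS semantics)."""

    def __init__(self) -> None:
        self._sets = defaultdict(set)

    def sadd(self, key: str, *members: str) -> int:
        """Add members to the set at key; return how many were newly added."""
        target = self._sets[key]
        before = len(target)
        target.update(members)
        return len(target) - before

    def smembers(self, key: str) -> Set[str]:
        """Return a copy of the set at key (empty if the key is absent)."""
        return set(self._sets.get(key, ()))


def store_instance(store: SetStore, s: str, p: str, o: str) -> bool:
    """Write one triple under three keys; return True if the triple was new.

    Storing the same fact three times costs space but makes each lookup
    below a single set read. The separators are naive and would need
    escaping for real IRIs.
    """
    is_new = store.sadd(f"p:{p}", f"{s}|{o}") == 1
    store.sadd(f"ps:{p}:{s}", o)
    store.sadd(f"po:{p}:{o}", s)
    return is_new


def objects_of(store: SetStore, s: str, p: str) -> Set[str]:
    """All o such that (s, p, o) is stored."""
    return store.smembers(f"ps:{p}:{s}")


def subjects_of(store: SetStore, p: str, o: str) -> Set[str]:
    """All s such that (s, p, o) is stored."""
    return store.smembers(f"po:{p}:{o}")


if __name__ == "__main__":
    store = SetStore()
    store_instance(store, "a", "ex:partOf", "b")
    store_instance(store, "b", "ex:partOf", "c")
    # Join step of a transitive-property rule: who is part of "b", and what is "b" part of?
    print(subjects_of(store, "ex:partOf", "b"), objects_of(store, "b", "ex:partOf"))
```

The return value of `store_instance` doubles as a duplicate check, which is one plausible way the duplicate removal in the final step could be combined with storage.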

Description

Technical field

[0001] The invention belongs to the technical field of massive streaming RDF data reasoning, and in particular relates to a parallel reasoning algorithm for streaming RDF data based on Spark Streaming.

Background art

[0002] Most existing reasoning methods based on OWL rules process fixed-size static data sets centrally. Because of the limits of the centralized processing mechanism, these algorithms are inefficient on massive real-time data. In response to this growing demand, many researchers have proposed RDF stream reasoning architectures. Barbieri D.F. et al. [1] proposed an incremental reasoning algorithm based on streams and rich background knowledge: expiry-time information is added to RDF triples, and when new streaming data arrives, reasoning is performed on the new data, expired explicit facts are terminated, and invalid triples are deleted. The IDRM [2] algorithm can efficiently and scalably perform RDF...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/2455, G06N5/04
CPC: G06N5/046
Inventors: 汪璟玢, 陈晓曦
Owner: FUZHOU UNIV