Distributed parallel reasoning algorithm for OWL Horst rules combined with the Spark platform

A distributed, rule-based technique applied in the field of the Semantic Web. It addresses time-consuming job startup, reasoning over rules without checking whether they can be activated, and repeated redundant computation, and thereby reduces overhead.

Active Publication Date: 2017-08-04

FUZHOU UNIV

3 Cites, 7 Cited by


Problems solved by technology

J. Urbani et al. use WebPIE to reason over the RDFS / OWL rule sets, which supports parallel reasoning over big data. However, the algorithm launches one or more MapReduce jobs for each rule, and because job startup is relatively time-consuming, overall reasoning efficiency degrades as the number of RDFS / OWL rules grows.

Gu Rong et al. proposed an efficient and scalable MapReduce-based semantic reasoning engine (YARM), which completes reasoning over the RDFS rules within a single MapReduce job. However, the algorithm is not suited to reasoning over the more complex OWL rules.

In addition, when the new triples generated by a rule contain duplicates, YARM performs many redundant computations and generates useless data.

Wang Jingbing et al. proposed a distributed parallel reasoning algorithm for RDF data based on Rete. The algorithm uses the RDF ontology data to construct a list of schema triples and a rule tag model. In the RDFS / OWL reasoning stage, it implements the alpha stage and the beta stage with MapReduce, which yields a distributed version of the Rete algorithm. However, joining in the beta network consumes considerable memory and repeated iterations are inefficient, so the algorithm is limited by cluster memory and by the platform.

Gu Rong et al. proposed an efficient Spark-based parallel reasoning engine (Cichlid), which optimizes the parallel reasoning algorithm using the RDD programming model. However, the algorithm does not check whether a rule can be activated before reasoning over it, which wastes reasoning effort and causes redundant data transmission.

[0004] Due to the rapid growth of Semantic Web data, memory-limited centralized environments are no longer suitable for reasoning over large-scale data.

Distributed inference engines that support data-parallel reasoning already exist, but they launch many time-consuming MapReduce jobs, cannot reason over the complex OWL Horst rules, perform redundant computations that generate useless data, consume a large amount of memory, and are inefficient over multiple iterations. As a result, they cannot carry out RDFS / OWL reasoning efficiently and correctly as the amount of data increases.


Embodiment Construction

[0015] The present invention will be further explained below in conjunction with the accompanying drawings and specific embodiments.

[0016] The present invention provides a distributed parallel reasoning algorithm for OWL Horst rules combined with the Spark platform (the DPRS algorithm), which mainly comprises the following steps:

[0017] 1. Load the schema triple sets Pj_RDD, Ok_RDD and Rulem_linkvar_RDD and broadcast them.

[0018] 2. Build the rule tag model Flag_Rulem and broadcast it.

[0019] 3. According to Flag_Rulem, execute the reasoning of the OWL Horst rules in parallel and output the intermediate results.

[0020] 4. Remove duplicate triples.

[0021] 5. If new schema triples have been generated, go to step 2; if new instance triples have been generated, go to step 3; otherwise the algorithm ends.

[0022] The overall framework of the present invention is shown in Figure 1.
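The five-step loop can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the patented implementation: ordinary sets stand in for Spark RDDs and broadcast variables, only `rdfs:subClassOf` triples are treated as schema triples, and the two rules, the `RULES` table, and the helper names (`build_rule_flags`, `dprs`) are hypothetical.

```python
# Hypothetical sketch of the DPRS control loop; sets stand in for RDDs.
SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"

def is_schema(triple):
    # Assumption: only subClassOf triples count as schema triples here.
    return triple[1] == SUBCLASS

def rule_subclass_transitivity(schema, instances):
    # (a subClassOf b), (b subClassOf c) => (a subClassOf c)
    return {(a, SUBCLASS, c)
            for (a, _, b) in schema for (b2, _, c) in schema if b == b2}

def rule_type_inheritance(schema, instances):
    # (c subClassOf d), (x type c) => (x type d)
    supers = {}
    for (c, _, d) in schema:
        supers.setdefault(c, set()).add(d)
    return {(x, TYPE, d)
            for (x, p, c) in instances if p == TYPE
            for d in supers.get(c, ())}

RULES = {"sc-trans": rule_subclass_transitivity,
         "type-inh": rule_type_inheritance}

def build_rule_flags(schema):
    # Step 2: flag a rule only if its schema premise can be matched at all.
    has_subclass = any(p == SUBCLASS for (_, p, _) in schema)
    return {name: has_subclass for name in RULES}

def dprs(triples):
    # Step 1: split the input into schema triples and instance triples.
    schema = {t for t in triples if is_schema(t)}
    instances = {t for t in triples if not is_schema(t)}
    flags, rebuild_flags = {}, True
    while True:
        if rebuild_flags:
            flags = build_rule_flags(schema)          # step 2
        derived = set()
        for name, rule in RULES.items():              # step 3
            if flags[name]:
                derived |= rule(schema, instances)
        new = derived - schema - instances            # step 4: deduplicate
        if not new:
            return schema | instances                 # step 5: fixpoint reached
        new_schema = {t for t in new if is_schema(t)}
        schema |= new_schema
        instances |= new - new_schema
        rebuild_flags = bool(new_schema)              # step 5: back to 2 or 3
```

Calling `dprs` on a small set such as `{("A", SUBCLASS, "B"), ("B", SUBCLASS, "C"), ("x", TYPE, "A")}` runs the loop until no rule produces a triple that is not already known. In this sketch the rule flags are rebuilt only after a pass that produced new schema triples, mirroring the jump back to step 2.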

[0023] Definition 1. (SchemaTriple) means that the...


Abstract

The invention discloses a distributed parallel reasoning algorithm for OWL Horst rules combined with the Spark platform. Exploiting the characteristics of Spark RDDs and following the principle of the TREAT algorithm, the method constructs the alpha registers Om_RDD or Pt_RDD corresponding to the schema triples from the RDF ontology data and broadcasts them, and builds a rule tag model. The schema premises of each rule are joined in advance to generate the corresponding joined schema triple set Rulem_linkvar_RDD, which speeds up matching during reasoning. In the OWL Horst reasoning stage, the alpha stage of the TREAT algorithm is implemented with MapReduce to achieve distributed parallel reasoning over multiple rules, and the reasoning results are then deduplicated. The alpha registers and the rule tag model filter out a large number of instance triples, which reduces the key-value pairs output in the Map stage and therefore reduces useless network transmission.
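The filtering effect described in the abstract can be illustrated with a small map-side example. The sketch below is an assumption-laden stand-in, not the patent's code: a plain dictionary plays the role of a broadcast alpha register, the single rule shown is the sub-property rule, and the function names (`build_alpha_register`, `map_stage`, `reduce_stage`) are hypothetical.

```python
SUBPROP = "rdfs:subPropertyOf"

def build_alpha_register(schema_triples):
    # Hypothetical alpha register: property -> set of its super-properties.
    register = {}
    for s, p, o in schema_triples:
        if p == SUBPROP:
            register.setdefault(s, set()).add(o)
    return register

def map_stage(instance_triples, register, use_filter=True):
    # Emit (key, value) pairs for the rule
    # (p subPropertyOf q), (s p o) => (s q o).
    emitted = []
    for s, p, o in instance_triples:
        if use_filter and p not in register:
            continue  # dropped on the map side, never sent over the network
        emitted.append((p, (s, o)))
    return emitted

def reduce_stage(pairs, register):
    # Complete the rule by looking the key up in the register.
    derived = set()
    for p, (s, o) in pairs:
        for q in register.get(p, ()):
            derived.add((s, q, o))
    return derived

schema = [("ex:hasMother", SUBPROP, "ex:hasParent")]
instances = [("ex:ann", "ex:hasMother", "ex:eve"),
             ("ex:ann", "ex:age", "30"),
             ("ex:bob", "ex:name", "Bob")]

register = build_alpha_register(schema)
filtered = map_stage(instances, register)
unfiltered = map_stage(instances, register, use_filter=False)
```

With the register consulted in the map stage, only the triple whose predicate has a super-property is emitted, while the unfiltered variant emits all three. Both variants lead to the same derived triples in the reduce stage, so the filter changes the amount of data shuffled, not the result.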

Description

Technical field

[0001] The invention belongs to the technical field of the Semantic Web, and in particular relates to a distributed parallel reasoning algorithm for OWL Horst rules combined with the Spark platform.

Background

[0002] The RDF and OWL standards of the Semantic Web have been widely used in various fields, such as general knowledge (DBpedia), medical and life sciences (LODD), bioinformatics (UniProt), geographic information systems (Linkedgeodata), and semantic search engines (Watson). As Semantic Web applications have spread, a large amount of semantic information has been produced. Because this data is complex and large in scale, efficiently discovering the information hidden in it through parallel semantic reasoning is an urgent problem. Due to the rapid growth of Semantic Web data, memory-limited centralized environments are no longer suitable for reasoning over large-scale data. [0...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F17/30, G06F17/27

CPC: G06F16/24564, G06F40/30

Inventor: 汪璟玢, 叶怡新

Owner: FUZHOU UNIV
