Just a collection of my thoughts about the possiblity of developing (from scratch) an implementation of the RETE algorithm that integrates much more cleanly with RDFLib than Pychinko. Mostly as a way to avoid the heavy software-dependency burden associated with using Pychinko for forward-chaining. The simplicity of the RETE algorithm was the main motivation.

An alpha node filters RDF triples against a triple pattern. The triple pattern corresponds to a quoted statement (from the antecedent) of a rule. The triple is passed on by the alpha node if it matches the triple pattern.
class TriplePattern:
def __init__(self,(self,predicate,object)):
self.subject = subject
self.predicate = predicate
self.object = object
class ReteFact:
def __init__(self,(self,predicate,object)):
self.subject = subject
self.predicate = predicate
self.object = object
self.alphaNode = None #The alpha node that passed this Fact on (used for testing variable consistency)
class AlphaNode:
def __init__(self,triplePattern):
self.triplePattern = triplePattern
self.alphaMemory = Set() #The set of facts that passed the test
self.betaNode = None #The BetaNode fed by this AlphaNode
def test(self,aReteFact):
.. if fact passes attach this AlphaNode and add it to self.alphaMemory ..
The AlphaNode class could support a mechanism for registering certain built-in predicates with functions to invoke instead of test. This would be similar to the way PyChinko relies on CWM built-ins for implementing specialized Fact tests with the exception that arbitrary extensions can be built to work with the core algorithm.
A beta node is associated with two incoming sets of facts. The beta node is also associated with an outgoing set of facts which are updated with only those incoming facts that have consistent variable bindings. That is, for each of the variables in common, they must be bound to the same value. The outgoing set is used for the left incoming set of the beta node below it.
class BetaNode:
def __init__(self,leftNode,rightNode,terminalNode=False,consequent=None):
self.terminalNode = terminalNode #Boolean value indicating if this is the bottom of the RETE network or not
self.consequent = consequent is None and [] or consequent #The consequent is a list of TriplePatterns that will be fired if this is a terminal node and propagation is 'successful'
self.leftNode = leftNode
self.rightNode = rightNode #The incoming right input of a BetaNode is always an AlphaNode
self.betaMemory = Set() #The set of facts (from the memory of both incoming nodes) with consistent bindings
self.betaNode = None #The BetaNode fed by this BetaNode
def propagate(self):
.. checks for consistent variable bindings between memory of incoming nodes ..
class ReteNetwork:
"""
"""
def __init__(self,ruleGraphs):
self.topLevelAlphaNodes = []
self.inferredFacts = []
def serialize(self):
"Serialize network in N3 - for persistence"
pass
def reset(self):
"Reset the network by emptying the memory associated with all contained nodes"
pass
def buildNetw
def feedFact(self,aFact):
"Passes a fact to the top level alpha nodes, propagates through the network and adds all inferred facts to self.inferredFacts (if any)"
pass
The BetaNode is either a terminal (i.e., any facts that 'pass' will trigger the entire rule) or it isn't. If it's a terminal, it's associated with the consequent of the rule (a set of TriplePatterns). If it isn't a terminal, it is associated with a betaMemory which will hold all facts that pass the propagation test.
The propagation test checks for variables that are common between the memory of both incoming nodes. It propagates facts where these variables are bound to the same value. If this is a terminal node, then for each propagated fact, the consequent is filled in with the values for each variable and the triplePatterns should be 'fired.'
For the sake of reuse, documentation, and/or unit testing, a compiled RETE network can be expressed in RDF using a vocabulary specifically tailored for the components of the RETE alghorithm.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://metacognition.info/ontologies/ReteVocabulary.owl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix log: <http://www.w3.org/2000/10/swap/log#>.
:AlphaNode a owl:Class.
:BetaNode a owl:Class.
:TerminalNode a owl:Class.
rdfs:subClassOf :BetaNode.
:alphaNodeTest a owl:ObjectProperty;
rdfs:comment "The alpha node test is a triple pattern in the rule antecedent";
rdfs:domain :AlphaNode;
rdfs:range rdf:Statement.
:leftInputOf a owl:ObjectProperty;
rdfs:domain owl:unionOf (:AlphaNode :BetaNode);
rdfs:range :BetaNode;
owl:inverseOf :leftInput.
:rightInputOf a owl:ObjectProperty;
rdfs:domain owl:unionOf (:AlphaNode :BetaNode);
rdfs:range :BetaNode;
owl:inverseOf :rightInput.
:leftInput a owl:ObjectProperty;
owl:inverseOf :leftInputOf;
rdfs:domain :BetaNode;
rdfs:range owl:unionOf (:AlphaNode :BetaNode).
:rightInput a owl:ObjectProperty;
owl:inverseOf :rightInputOf;
rdfs:domain :BetaNode;
rdfs:range owl:unionOf (:AlphaNode :BetaNode).
:ruleConsequent a owl:ObjectProperty;
rdfs:comment "The set of consequents (associated with the rule) that will be 'fired' when facts propagate past this Terminal Node";
rdfs:domain :TerminalNode;
rdfs:range log:Formula;
for subject,predicate,object in antecedentFormula:
if not AlphaNode(TriplePattern((subject,predicate,object))) in network:
.. add to network if there was no existing multi input pattern that had matched so far..
.. otherwise connect to existing multi input pattern (or alpha node)..
else:
if ..previous iteration was also in the network..:
..continue interation..
else:
..begin match for multi input pattern..
Compiled networks can be processed into the Graphviz .dot format, from which diagrams of the compiled networks can be generated into the various output formats supported by Graphviz.
Considering the very simple and specific behavior associated with the components of the algorithm, it would seem that fully regressive unit-tests could be easily written for the AlphaNode,BetaNode,ReteNetwork classes to facilitate test-driven development.
To be continued