Musing on Intelligence – Part 8: Symbolic Processing


Symbolizing Sensory Inputs

In electronics, it is well understood that signals can be analog or digital, with a digital signal being a discretized analog signal. The greatest benefit of converting analog signals to digital is that quality is preserved in noisy environments. For instance, digital audio keeps its crystal-clear sound quality even when copied repeatedly.

Neural systems have a similar mix of analog and digital. The electrochemical process inside a neuron is an analog, dynamic process, while inter-neuron transmission is pulse-coded, or effectively digital. The neuron cell body sums the incoming pulses from upstream neurons into a membrane potential. When that potential crosses a threshold, the neuron fires an action potential, which is transmitted as a pulse to downstream neurons.
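The sum-then-threshold behavior described above can be sketched in a few lines of Python. This is only a toy illustration: the function name, the weights and the threshold value are assumptions made for the example, and real neurons also integrate over time, which is ignored here.

```python
def fires(incoming_pulses, weights, threshold=1.0):
    """Toy threshold unit: 1 if the weighted sum of pulses reaches the threshold."""
    # Analog part: accumulate the weighted incoming pulses into one potential.
    potential = sum(w * x for w, x in zip(weights, incoming_pulses))
    # Digital part: the output is all-or-nothing.
    return 1 if potential >= threshold else 0


# Hypothetical weights for three upstream neurons.
weights = [0.6, 0.5, 0.9]
print(fires([1, 1, 0], weights))  # potential 1.1 reaches the threshold -> 1
print(fires([1, 0, 0], weights))  # potential 0.6 stays below it -> 0
```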


Digitization happens not only at the cellular level but also at the network level. Interconnected neurons in a layer may cross-inhibit one another, creating a competitive network in which a single winner emerges at steady state. This is in effect the “category” layer (F2) in the Adaptive Resonance Theory (ART) network discussed in my previous blog post, where each node in the F2 layer represents a symbol.
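A rough Python sketch of such a competitive layer follows. It is not the ART F2 dynamics, only a minimal cross-inhibition loop under assumed values: the update rule, the inhibition strength of 0.2 and the starting activations are all made up for illustration. Each node is suppressed in proportion to the total activity of its rivals until only one node remains active.

```python
def winner_take_all(activations, inhibition=0.2, steps=50):
    """Iterate mutual inhibition; with distinct inputs one node stays active."""
    a = list(activations)
    for _ in range(steps):
        total = sum(a)
        # Each node loses activity in proportion to the activity of the others,
        # and activity is clamped at zero (a silent node stays silent).
        a = [max(0.0, x - inhibition * (total - x)) for x in a]
    return a


final = winner_take_all([0.5, 0.8, 0.6])
winner = max(range(len(final)), key=lambda i: final[i])
print(winner)  # index of the surviving node; here the node that started at 0.8
```

The surviving node plays the role of the symbol: the graded input pattern has been reduced to a discrete choice. Exact ties are not handled in this sketch.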

Expert System for Symbolic Processing

When it comes to processing abstract symbolic information, a neural network may not be the most efficient mechanism. Expert systems, which have been used for decades for symbolic knowledge representation and processing, may be much more efficient at performing high-level reasoning. In this blog post, I will use one expert system tool, the C Language Integrated Production System (CLIPS), as an example to illustrate how symbolic systems can reason and plan through mechanisms called forward chaining and backward chaining.

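First, a brief introduction to CLIPS, a tool for building expert systems. Typically a CLIPS expert system is made of rules and facts. Rules have the following general form, where the patterns and actions shown are placeholders:

    (defrule myrule "my comment"
        (pattern-1)
        (pattern-2)
        =>
        (action-1)
        (action-2))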

Whenever the patterns before the => arrow (the left-hand side) are satisfied, the actions after it (the right-hand side) are executed. Knowledge of causal relationships can be captured as CLIPS facts, each of which is essentially an ordered list of tokens, as the example below illustrates:

(cause high-wind effect tree-fall context outdoor prob 0.5)

In this fact, cause, effect, context and prob are keywords, and high-wind, tree-fall, outdoor and 0.5 are their values. The fact expresses a piece of knowledge: if one is outdoors and there is high wind, there is a 50% chance that trees will fall.
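To make the keyword/value layout concrete, here is a small Python sketch that splits such a fact into pairs. The helper name parse_causal_fact is hypothetical and is not part of CLIPS. It only works for the flat, alternating layout used in this post.

```python
def parse_causal_fact(fact):
    """Split '(cause X effect Y context Z prob P)' into a keyword -> value dict."""
    tokens = fact.strip("() \n").split()
    # Even positions hold keywords, odd positions hold their values.
    pairs = dict(zip(tokens[0::2], tokens[1::2]))
    pairs["prob"] = float(pairs["prob"])
    return pairs


fact = "(cause high-wind effect tree-fall context outdoor prob 0.5)"
print(parse_causal_fact(fact))
# {'cause': 'high-wind', 'effect': 'tree-fall', 'context': 'outdoor', 'prob': 0.5}
```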

Forward Chaining in CLIPS

Now, consider this rule:

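    (defrule forward-chaining "forward-chaining"
        ; LHS: an asserted cause and a piece of knowledge with that cause
        (cause ?o)
        (cause ?o effect ?e context ?c prob ?p)
        =>
        ; RHS: the effect becomes a new cause
        (assert (cause ?e)))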

There are two patterns on the left-hand side (LHS) that must be satisfied for the rule to “fire” and execute the actions on the right-hand side (RHS). The first LHS pattern, (cause ?o), matches a fact asserted by application logic outside the expert system, or by other rules. The second LHS pattern matches a piece of cause-effect knowledge with the same cause. If an initial cause is asserted and there is prior knowledge matching that cause, the RHS action asserts a new fact (cause ?e), which may in turn trigger other knowledge whose cause is ?e. To illustrate, assume there is another piece of knowledge in the system stating that there is a 30% probability that fallen trees will block roads if this happens in a city. Now there are two pieces of knowledge in the system:

(cause high-wind effect tree-fall context outdoor prob 0.5)
(cause tree-fall effect road-blocked context in-city prob 0.3)

With the forward-chaining rule, the first fact together with the initial cause, (cause high-wind), generates an interim effect, tree-fall. Together with the second fact, tree-fall then triggers road-blocked. More simply put, we have the inference sequence:

    high-wind → tree-fall → road-blocked


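In summary, forward chaining enables a machine with prior knowledge to anticipate city road blocks when there is high wind, with a probability of 0.5 × 0.3 = 0.15, or 15%, if the two links are treated as independent. The simple rule above does not compute this number itself; the code in the Implementation Example section carries the probabilities along the chain.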
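For readers without CLIPS at hand, the same inference can be sketched in Python. This is an illustrative analogue, not how CLIPS works internally: the tuple layout, the forward_chain function and the agenda list are assumptions made for this example. Like the rule above it ignores the context field, and it keeps only the first path found to each effect.

```python
# The two pieces of knowledge from the post: (cause, effect, context, prob).
KNOWLEDGE = [
    ("high-wind", "tree-fall", "outdoor", 0.5),
    ("tree-fall", "road-blocked", "in-city", 0.3),
]


def forward_chain(initial_cause, knowledge):
    """Return {event: probability} for every event reachable from initial_cause."""
    reached = {initial_cause: 1.0}
    agenda = [initial_cause]
    while agenda:
        cause = agenda.pop()
        for k_cause, effect, _context, prob in knowledge:
            if k_cause == cause and effect not in reached:
                # Multiply along the chain, treating the links as independent.
                reached[effect] = reached[cause] * prob
                agenda.append(effect)
    return reached


for event, prob in forward_chain("high-wind", KNOWLEDGE).items():
    print(event, prob)
```

Running it lists high-wind, tree-fall and road-blocked, with road-blocked at 0.5 × 0.3.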

Backward Chaining in CLIPS

While forward chaining enables machines to anticipate probable outcomes based on an initial observation, backward chaining enables machines to find probable root causes based on observations.

A backward-chaining rule looks like this:

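    (defrule back-chaining "back-chaining"
        ; LHS: an observed effect and a piece of knowledge with that effect
        (effect ?e)
        (cause ?o effect ?e context ?c prob ?p)
        =>
        ; RHS: the cause becomes a new effect to be explained
        (assert (effect ?o)))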

Here the assertion of (effect ?e) starts the ball rolling, and the facts are matched on effect rather than on cause. Given the same two facts as before and an initial observed effect of (effect road-blocked), the root-cause-finding sequence is as follows, where a potential root cause of the road block is a tree knocked down by high wind:

    road-blocked ← tree-fall ← high-wind
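The backward direction can be sketched in Python as well, again only as an analogue of the CLIPS rule with hypothetical names (backward_chain, frontier). It walks from an observed effect back through every known cause and returns the events in the order they were discovered.

```python
# Same knowledge as before: (cause, effect, context, prob).
KNOWLEDGE = [
    ("high-wind", "tree-fall", "outdoor", 0.5),
    ("tree-fall", "road-blocked", "in-city", 0.3),
]


def backward_chain(observed_effect, knowledge):
    """Return the observed effect followed by every candidate cause behind it."""
    chain = [observed_effect]
    frontier = [observed_effect]
    while frontier:
        effect = frontier.pop()
        for cause, k_effect, _context, _prob in knowledge:
            # Match on the effect field instead of the cause field.
            if k_effect == effect and cause not in chain:
                chain.append(cause)
                frontier.append(cause)
    return chain


print(backward_chain("road-blocked", KNOWLEDGE))
# ['road-blocked', 'tree-fall', 'high-wind']
```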


Imagine a multitude of causal relationships stored in the knowledge base as facts. The forward and backward chaining rules could then help anticipate probable outcomes, develop hypotheses, or even diagnose root causes. The knowledge base can grow over time, much like how a person learns.

Implementation Example

The forward and backward chaining formulations above still lack the ability to track all the possible paths. The CLIPS code below tracks the paths and uses features of the CLIPS Object-Oriented Language (COOL). In this code, the CAUSAL class represents the cause-effect relationship, and it has the same four components: cause, effect, context and probability:

;  CAUSAL is a class with cause, effect, context and prob
;  as properties
(defclass CAUSAL (is-a USER)
 (slot cause (create-accessor read-write))
 (slot effect (create-accessor read-write))
 (slot context (create-accessor read-write))
 (slot prob (create-accessor read-write)))

; Once some (cause xyz) is asserted, a forward-chain
; will get started
(defrule forward-chain-starter
 ?f <- (cause ?c)
 =>
 ; start a chain at ?c with probability 1
 (assert (forward-chain ?c 1)))

; Once some (effect xyz) is asserted, a backward-chain
; will get started
(defrule backward-chain-starter
 ?f <- (effect ?e)
 =>
 ; start a chain at ?e with probability 1
 (assert (backward-chain ?e 1)))

;  Rule that performs forward chaining
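(defrule forward-chaining
 ; $?s is the path so far, ?c its last node, ?p1 its probability
 ?f <- (forward-chain $?s ?c ?p1)
 ?obj <- (object (is-a CAUSAL) (cause ?c) (prob ?p2))
 =>
 (bind ?e (send ?obj get-effect))
 ; extend the path by ?e and multiply the probabilities
 (assert (forward-chain $?s ?c ?e (* ?p1 ?p2))))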

;  Rule that performs backward chaining and
;  updates the backward-chain
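(defrule backward-chaining
 ; $?s is the path so far, ?e its last node, ?p1 its probability
 ?f <- (backward-chain $?s ?e ?p1)
 ?obj <- (object (is-a CAUSAL) (effect ?e) (prob ?p2))
 =>
 (bind ?c (send ?obj get-cause))
 ; extend the path by ?c and multiply the probabilities
 (assert (backward-chain $?s ?e ?c (* ?p1 ?p2))))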

;  Arbitrary initial facts to emulate externally
;  asserted facts that kick-start both forward and backward
;  chaining.
(deffacts starting-facts
 (effect E)
 (cause A))

;  A set of arbitrary knowledge; could be expanded in real time
;  by other rules
(definstances starting-instances
 (a of CAUSAL (cause A) (effect B) (prob 0.5))
 (b of CAUSAL (cause B) (effect C) (prob 0.1))
 (c of CAUSAL (cause C) (effect D) (prob 0.2))
 (d of CAUSAL (cause B) (effect E) (prob 0.8))
 (e of CAUSAL (cause C) (effect E) (prob 0.3)))

In the above CLIPS code, the starting query can be a (cause ?c) or an (effect ?e) fact. When the program runs, the default facts defined by the (deffacts ...) construct cause the rules forward-chain-starter and backward-chain-starter to fire. Once they fire, a forward-chain fact and a backward-chain fact are asserted, which triggers the forward-chaining and backward-chaining rules to fire repeatedly and extend the chains. When no more rules can fire, the result is every possible forward chain that starts from (cause A) and every backward chain that ends in (effect E). The results look like this:

CLIPS> (clear)
CLIPS> (load planner.clp)
Defining defclass: CAUSAL
Defining defrule: forward-chain-starter +j+j
Defining defrule: backward-chain-starter +j+j
Defining defrule: forward-chaining +j+j+j
Defining defrule: backward-chaining +j+j+j
Defining deffacts: starting-facts
Defining definstances: starting-instances
CLIPS> (reset)
CLIPS> (run)
CLIPS> (facts)
f-0     (initial-fact)
f-1     (effect E)
f-2     (cause A)
f-3     (forward-chain A 1)
f-4     (forward-chain A B 0.5)
f-5     (forward-chain A B E 0.4)
f-6     (forward-chain A B C 0.05)
f-7     (forward-chain A B C E 0.015)
f-8     (forward-chain A B C D 0.01)
f-9     (backward-chain E 1)
f-10    (backward-chain E C 0.3)
f-11    (backward-chain E C B 0.03)
f-12    (backward-chain E C B A 0.015)
f-13    (backward-chain E B 0.8)
f-14    (backward-chain E B A 0.4)
For a total of 15 facts.

Note that every possible forward and backward chain is listed, and that there are two A→E paths: A→B→E with probability 0.4 and A→B→C→E with probability 0.015.
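As a cross-check on the listing, the same enumeration can be sketched in Python. This is an independent illustration rather than a port of the CLIPS program: the extend generator and the CAUSAL list are names assumed for this example, and the "not in chain" guard against cycles is an addition that the CLIPS rules do not have.

```python
# (cause, effect, prob) triples mirroring the five CAUSAL instances.
CAUSAL = [
    ("A", "B", 0.5),
    ("B", "C", 0.1),
    ("C", "D", 0.2),
    ("B", "E", 0.8),
    ("C", "E", 0.3),
]


def extend(chain, prob, links, forward=True):
    """Yield (chain, prob), then every longer chain grown from its last node."""
    yield chain, prob
    for cause, effect, p in links:
        # Forward chaining follows cause -> effect; backward follows effect -> cause.
        src, dst = (cause, effect) if forward else (effect, cause)
        if src == chain[-1] and dst not in chain:
            yield from extend(chain + [dst], prob * p, links, forward)


forward = list(extend(["A"], 1.0, CAUSAL))
backward = list(extend(["E"], 1.0, CAUSAL, forward=False))

# Keep only the forward chains that end in E, rounding away float noise.
a_to_e = [(chain, round(prob, 6)) for chain, prob in forward if chain[-1] == "E"]
print(len(forward), len(backward))
print(a_to_e)
```

The six forward and six backward chains correspond to facts f-3 through f-14 in the CLIPS listing, although they are generated in a different order.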

In the next blog post, I will discuss how to model neural dynamics for dynamic neural processing.
