Meta-level Representations in the IDP Knowledge Base System

Towards Bootstrapping Inference Engine Development

Motivation

  • Declarative approach works
  • Declarative systems hard to develop + maintain
    • Developed in imperative languages (C++, python ...)
    • Many case distinctions that each need to be solved efficiently
    • ASP Competition System Track participants
      • 2009: 16
      • 2011: 11
      • 2013: 10
  
We can do better: develop declarative systems declaratively

Bootstrapping

For software development (cf. Doug Engelbart):
"use the tool you are developing to improve that tool "


  • Develop IDP by (a.o.)
    • solving internal problems using IDP
    • improving the workflow using IDP
    • ...

IDP's Language: FO(ID)

  • First Order Logic extended with
    • types
    • definitions
    $\Rightarrow$ able to express transitive closure of relations


\[\left\{\begin{array}{lll} reaches(x,y) & \leftarrow & edge(x,y). \\ reaches(x,y) & \leftarrow & reaches(x,z) \wedge reaches(z,y). \\ \end{array}\right\}\]

IDP

  • Uses language $FO({ID})$
  • Knowledge base system
  • Supports a range of inference tasks
    • (Optimal) Model expansion
    • Theorem proving
    • ...

  • Goal: improve inferences offered by IDP using IDP's inferences themselves

    Model Expansion (MX)


    • Input: 
      • Theory T
      • Structure S
      • Output vocabulary Vout
    • Output:
      • Structure M over Vout such that there is a M' with
        • M' is a two-valued structure
        • M' more precise than S and M
        • M' satisfies T

    Spanning Tree Example

    Input structure with: $start/1$, $edge/2$, $time/1$, $fuel/1$
    Output vocabulary: $selEdge/2$, $totalTime/1$

    \[ \begin{array}{l} \left\{\begin{array}{lll} reachable(x) &\leftarrow & start(x). \\ reachable(y) & \leftarrow & reachable(x) \wedge edge(x,y). \end{array}\right\} \\ \forall x : reachable(x) \Rightarrow visit(x). \\ \left\{\begin{array}{lll} visit(x) &\leftarrow & start(x). \\ visit(y) &\leftarrow & visit(x) \wedge selEdge(x,y). \end{array}\right\} \\ \forall x~y : selEdge(x,y) \Rightarrow edge(x,y). \\ \{ totalTime(t) \leftarrow t = sum\{ x~y : selEdge(x,y) : time(x,y)\}. \} \\ \{ totalFuel(f) \leftarrow f = sum\{ x~y : selEdge(x,y) : fuel(x,y)\}. \} \\ \end{array} \]

    Generate Meta-level Representation


    • Input:  $FO({ID})$ problem specification (vocabulary, theory, structure)
    • Output: structure representing the problem specification


    • Approaches:
      • Herbrand functions
      • Reified representation over a limited domain

    ... using Herbrand functions


    E.g. function ``$\leftarrow(Formula,Formula):Rule$'' constructs rules

    $\leftarrow(visit(y),\wedge(visit(x),selEdge(x,y)))$
    • Problems:
      • Represent a list of objects (e.g. $\wedge([Formula]):Formula$)
      • Infinite domain of Herbrand terms

    ... using Reification


    \[\begin{array}{llllll} \text{type Formula} & = & \{\phi_1, \phi_2 \dots\} & \text{ type Rule} &=&\{r_1, r_2 \dots\} \\ \text{type Symbol} & = & \{\sigma_1, \sigma_2 \dots\} & \text{ type Index}&=&\{1, 2 \dots\} \\ \end{array}\]
    $head(r_1) = \phi_1$
    $body(r_1) = \phi_2$
    $predform(\phi_1,\sigma_1)$
    $\dots$

    • Advantage: finite domain, reified lists of arguments
    • Disadvantage: for transformations: need (over)estimation on new elements to be created

    ... using Reification


    • Very detailed representation
    • Often, only high-level info needed
    • $\Rightarrow$ create more abstract representation for specific purposes

    Bootstrapping IDP



    1. Recursion over negation analysis
    2. Optimization  of MX workflow
    3. Splitting definitions transformation

    1. Recursion over negation


    • If no Recursion Over Negation (RON)
    • $\Rightarrow$ exactly 1 model given opens
      $\Rightarrow$ optimizations are possible

    • Detect RON using MX

    1. Recursion over negation (2)

    • Recursion over negation in a definition if there is a defined symbol that depends negatively on another defined symbol
    • Symbol x  depends on y  if there is a rule defining x with
      • y  occurring in the rule's body, or
      • a symbol depending on y  in the rule's body

    $\Rightarrow$ Analyse transitive closure of
    direct dependencies in rules

    Encoding Direct Dependencies

    \[\left\{\begin{array}{lll} P & \leftarrow & \lnot Q \wedge R. \\ Q & \leftarrow & R. \\ R & \leftarrow & \lnot P. \end{array}\right\}\]

    • $P$ depends negatively on $Q$, positively on $R$
      • $\definecolor{myorange}{RGB}{255,153,0} head(r_1) = P~\wedge~inbody(r_1,Q,{\color{myorange}{n}})~\wedge~inbody(r_1,R,{\color{myorange}{p}})$
    • $Q$ depends positively on $R$
      • $head(r_2) = Q~\wedge~inbody(r_2,R,{\color{myorange}{p}})$
    • $\dots$

    1. Recursion over negation (3)

    • Encode direct dependencies in definitions
    • Calculate transitive closure of dependencies
    • Check for negative (indirect) dependencies

    Spanning Tree Example

    Input structure with: $start/1$, $edge/2$, $time/1$, $fuel/1$
    Output vocabulary: $selEdge/2$, $totalTime/1$

    \[ \begin{array}{l} \left\{\begin{array}{lll} reachable(x) &\leftarrow & start(x). \\ reachable(y) & \leftarrow & reachable(x) \wedge edge(x,y). \end{array}\right\} \\ \forall x : reachable(x) \Rightarrow visit(x). \\ \left\{\begin{array}{lll} visit(x) &\leftarrow & start(x). \\ visit(y) &\leftarrow & visit(x) \wedge selEdge(x,y). \end{array}\right\} \\ \forall x~y : selEdge(x,y) \Rightarrow edge(x,y). \\ \{ totalTime(t) \leftarrow t = sum\{ x~y : selEdge(x,y) : time(x,y)\}. \} \\ \{ totalFuel(f) \leftarrow f = sum\{ x~y : selEdge(x,y) : fuel(x,y)\}. \} \\ \end{array} \]

    2. Pre and post processing MX


    • Optimal workflow of MX depends on 
      • input structure S
      • output vocabulary Vout
      • dependencies between definitions
      • $\dots$
    • Extract special cases and solve them efficiently
    • E.g. evaluate definitions depending on completely given information

    • $\Rightarrow$ Split MX workflow

    2. Pre- and postprocessing MX (2)

    Four cases:
    • Preprocess
    • Search
    • Postprocess
    • Forget
    • Easy to express declaratively


      $\Rightarrow$ Define these 4 concepts using $FO({ID})$
      $\Rightarrow$ Calculate their model using ${IDP}$

    Spanning Tree Example

    \[\left\{\begin{array}{lll} reachable(x) &\leftarrow & start(x). \\ reachable(y) & \leftarrow & reachable(x) \wedge edge(x,y). \\ visit(x) &\leftarrow & start(x). \\ visit(y) & \leftarrow & visit(x) \wedge selEdge(x,y). \end{array}\right\}\]
    $\forall x : reachable(x) \Rightarrow visit(x).$

    $\dots$

    3. Splitting definitionS

    • Previous tasks depend on definition structure

    • $\Rightarrow$ split definitions

    Splitting Definitions

    • Only allowed if no dependency loops are broken

    \[\left\{\begin{array}{lll} samedef(r,r). & & \\ samedef(r,r') & \leftarrow & in(r) = in(r') \wedge head(r) = head(r'). \\ samedef(r,r') & \leftarrow & in(r) = in(r') \wedge \\ & & dep(in(r), head(r),head(r')) \wedge \\ & & dep(in(r), head(r'),head(r)). \\ \end{array}\right\}\]

    Conclusion

    • Declarative approach works
    • Declarative systems hard to develop + maintain
    • For some tasks (optimization, analysis) development using bootstrapping is fast , easy , and sufficiently efficient

    Reference

    Meta-level Representations in the IDP Knowledge Base System: Towards Bootstrapping Inference Engine Development
    B. Bogaerts, J. Jansen, B. De Cat, G. Janssens, M. Bruynooghe and M. Denecker