SOFA (Simple Ontology Framework API)

SOFA Design Whitepaper

by Alex Alishevskikh
<alexeya (at) gmail (dot) com>

Abstract

SOFA (Simple Ontology Framework API) is a Java API that represents an object model of an abstract, language-independent specification of knowledge, known as an ontology. It is intended for use by developers of Semantic Web, Information Retrieval, and Knowledge Base applications and other ontology-driven software. SOFA provides a simplified and highly abstract model of an ontology that is independent of any specific ontology representation language and operates on ontologies at a conceptual rather than syntactic level. This allows SOFA-based applications to work with ontologies described in diverse languages and considerably simplifies software development.

The purpose of this document is to present the basic SOFA design principles to developers of SOFA-based ontology software. It may also be helpful to those who want to examine SOFA before joining its development.

1. Introduction

The suggested software is a set of reusable Java APIs that aims to provide developers of ontology applications with the following tools:

Key goals

And after all,

Key principles

Design principles

Implementation principles

Terminology

2. Structure of API

The SOFA API is divided into the following conceptual modules:



The SOFA API design scheme

3. Ontology Object Model

The Ontology Model API is the heart of the entire platform. It is a set of interfaces and their implementations that represent an Ontology Object Model for manipulating ontologies programmatically in an object-oriented way. At the conceptual level, the model is consistent with W3C OWL (Web Ontology Language).

Conceptual Model of an Abstract Ontology

Ontology

An ontology is a formal representation of knowledge about an area of interest. In SOFA, an ontology is considered a set of individuals (Things) that encapsulate sequences of axioms and facts. An ontology is intended to be a knowledge repository that is responsible for creating, storing, retrieving, and removing Things. It also provides a uniform namespace for all Things that belong to the ontology.

Ontologies can also have non-logical annotations that can be used to record human-readable labels, comments, versioning info and other non-logical information associated with an ontology.

Things

A Thing is a logically meaningful member of an ontology that encapsulates knowledge about a specific item within an area of interest.

The model of a Thing can be considered a set of statements declaring facts about a given item. The character of these statements is specified by Relations (predicates). The set of Relations allowed to participate in a specific Thing is the union of the sets of Relations declared for (that is, having in their domain) each ontology Concept of which this Thing is an instance.
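
The union rule above can be illustrated with a minimal sketch. The class names (ConceptSketch, AllowedRelations) and the use of plain strings for Relation names are assumptions made for this example only; they are not part of the SOFA API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a Concept: a name plus the names of the
// Relations whose domain includes this Concept.
class ConceptSketch {
    final String name;
    final Set<String> domainRelations;

    ConceptSketch(String name, Set<String> domainRelations) {
        this.name = name;
        this.domainRelations = domainRelations;
    }
}

class AllowedRelations {
    // Union of the Relations declared for every Concept the Thing is an instance of.
    static Set<String> of(List<ConceptSketch> conceptsOfThing) {
        Set<String> allowed = new LinkedHashSet<>();
        for (ConceptSketch concept : conceptsOfThing) {
            allowed.addAll(concept.domainRelations);
        }
        return allowed;
    }
}
```

A Thing that is an instance of two Concepts may therefore carry any Relation declared for either of them; a Relation declared for both is counted once.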

Each Thing has the following built-in properties:

Concepts

Concepts are special Things that provide a hierarchical classification of other Things. A specific Concept defines a group of Things that belong together because they share some relation types. A Thing belonging to a specific Concept is called an instance of this Concept.

Concepts can be organized in a specialization hierarchy using subconcept axioms. More general Concepts are extended by their subconcepts, which represent more specific notions. Multiple inheritance is allowed, i.e. a Concept can be the direct subconcept of more than one superconcept.
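
As a rough illustration of a hierarchy with multiple inheritance, the sketch below collects every direct and indirect superconcept of a Concept. The ConceptHierarchy class and its map-of-names representation are hypothetical and chosen only to keep the example short; SOFA's own interfaces are not shown here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ConceptHierarchy {
    // Direct superconcepts keyed by concept name (hypothetical representation).
    private final Map<String, List<String>> directSuperconcepts;

    ConceptHierarchy(Map<String, List<String>> directSuperconcepts) {
        this.directSuperconcepts = directSuperconcepts;
    }

    // Returns all direct and indirect superconcepts. The "seen" set makes the
    // traversal visit a shared ancestor only once, which matters when a
    // Concept inherits from it along several paths.
    Set<String> allSuperconcepts(String concept) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo =
                new ArrayDeque<>(directSuperconcepts.getOrDefault(concept, List.of()));
        while (!todo.isEmpty()) {
            String next = todo.pop();
            if (seen.add(next)) {
                todo.addAll(directSuperconcepts.getOrDefault(next, List.of()));
            }
        }
        return seen;
    }
}
```

With AmphibiousVehicle declared as a subconcept of both Car and Boat, and both of those as subconcepts of Vehicle, the method yields Car, Boat, and Vehicle, with Vehicle reported once.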

Relations

Relations are Things that specify relationships between Things, or between Things and actual data values. A Relation specification includes the domain of the Relation (the set of Concepts to whose instances the Relation can be applied) and a range, which specifies the Concepts or data types whose instances are allowed to be the targets of the Relation. Like Concepts, Relations can be organized in a specialization hierarchy using subrelation axioms.
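
A domain and range check might look like the following sketch. It assumes a datatype-valued Relation whose range is a single Java class; the RelationSketch class and its accepts method are invented for illustration and do not reflect actual SOFA signatures.

```java
import java.util.Set;

class RelationSketch {
    final String name;
    final Set<String> domain;      // names of Concepts whose instances may carry this Relation
    final Class<?> datatypeRange;  // assumed: the range is one Java class of data values

    RelationSketch(String name, Set<String> domain, Class<?> datatypeRange) {
        this.name = name;
        this.domain = domain;
        this.datatypeRange = datatypeRange;
    }

    // A statement is acceptable when the subject is an instance of at least one
    // Concept in the domain and the value is an instance of the range class.
    boolean accepts(Set<String> conceptsOfSubject, Object value) {
        boolean inDomain = conceptsOfSubject.stream().anyMatch(domain::contains);
        return inDomain && datatypeRange.isInstance(value);
    }
}
```

For a hypothetical hasAge Relation with domain {Person} and range Integer, a Person with the value 42 is accepted, while a string value or a subject that is not a Person is rejected.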

Relation attributes

Relations may have the following attributes, which play a role in ontology inferencing:

Restrictions

A Concept can state restrictions on how specific Relations may be applied to its instances. There are the following types of restrictions:

Inference rules

An implementation provides an inference mechanism for deriving implications from the axioms that are explicitly stated in the model. The basic inference rules are:

Integrity conditions

An implementation checks ontology integrity, which means the integrity of all Things belonging to the ontology. Integrity is evaluated as the truth of the following conditions:

Some kinds of inconsistency lie beyond the scope of integrity and must be treated as exceptional situations (errors):

By convention, the model must not allow these situations.

Interoperability of the Ontology instances

Client applications must be able to manipulate a number of distinct ontology instances at once. It should be possible for ontology members to have relations with members of other ontology instances. The ontology model implementation provides a transparent mechanism for such interaction between distinct ontologies.

Ontology Storage Model

The Ontology Storage Model API provides an abstract model of a storage utility for the Ontology Model implementation. The SOFA model implementation uses the storage model interface to store and retrieve the data (sets of statements) that make up the ontology internals. Client applications usually should not call the storage API directly, bypassing the ontology model.

The interface part of the storage model represents an abstract ontology storage utility. It is independent of the specific way data is stored in a particular physical storage back-end. That is the responsibility of a storage model implementation, which knows how to interact with a specific storage mechanism.

Storage model implementations

In-memory storage

The default storage model implementation is a simple in-memory storage. It is a minimalistic storage utility based on the default implementations of the java.util.Collection and java.util.Map interface families. This implementation is intended mainly for testing and experimental purposes, and it can use the ontology serialization mechanism for long-term storage.
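
The separation between an abstract storage interface and a collections-based back-end can be sketched as follows. StatementStore and InMemoryStatementStore are hypothetical names, and the subject/relation/value shape of a statement is an assumption made for the example; the actual SOFA storage interfaces may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical storage abstraction: the model would depend only on this interface.
interface StatementStore {
    void add(String subject, String relation, Object value);
    List<Object> get(String subject, String relation);
    boolean remove(String subject, String relation, Object value);
}

// Back-end built only on java.util collections: subject -> relation -> values.
class InMemoryStatementStore implements StatementStore {
    private final Map<String, Map<String, List<Object>>> statements = new HashMap<>();

    @Override
    public void add(String subject, String relation, Object value) {
        statements.computeIfAbsent(subject, s -> new HashMap<>())
                  .computeIfAbsent(relation, r -> new ArrayList<>())
                  .add(value);
    }

    @Override
    public List<Object> get(String subject, String relation) {
        Map<String, List<Object>> bySubject = statements.get(subject);
        if (bySubject == null || !bySubject.containsKey(relation)) {
            return new ArrayList<>();
        }
        return new ArrayList<>(bySubject.get(relation)); // defensive copy
    }

    @Override
    public boolean remove(String subject, String relation, Object value) {
        Map<String, List<Object>> bySubject = statements.get(subject);
        if (bySubject == null) {
            return false;
        }
        List<Object> values = bySubject.get(relation);
        return values != null && values.remove(value);
    }
}
```

Because callers see only the interface, a database-backed implementation could replace the in-memory one without changes to the code that uses it.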

Persistent storage

The main production storage model implementation is a persistent storage. It is built on JDBC (Java Database Connectivity) to store and retrieve ontology axioms in relational database management systems (RDBMS). This approach makes a wide range of proven database software available as persistent storage for ontology models. It also brings the characteristics required of enterprise-quality storage, such as reliability, scalability, transaction support, and security.

In addition, a persistent storage model can be implemented on other suitable data storage back-ends, e.g. object-oriented DBMSs, native XML databases, BerkeleyDB-like databases, etc.

Interoperability storage

A special class of storage model implementations consists of adapters to existing applications and information systems. The adapters translate ontology axioms into the structures of the external data model and vice versa. This provides a transparent way for ontology applications to interact with these systems, and gives an ontological representation of their data and a way to manipulate it.

Ontology Serialization

Reversible representation means that the model can be serialized to an external format and later restored (deserialized) from that format. This mechanism also makes it possible to use existing ontologies and provides interoperability with external agents.

The SOFA ontology model is independent of specific languages, but it can be expressed in any language with sufficient expressive power to describe the model. Because the SOFA model is conceptually consistent with the semantics of the W3C Web Ontology Language (positioned as an industry standard for ontology representation), the model can be represented entirely in that language's syntax. The same is largely true for DAML+OIL (the predecessor of OWL and still the most popular ontology definition language). Other languages may lose some details of the SOFA ontology model.

The Ontology Serialization package includes modules that serialize the ontology model to specific languages and restore it from serialized form. The primary modules are:

Programming aspects

The Ontology Model API is a hierarchy of Java objects. The root notion (Thing) is represented by objects with a root interface (Thing), which provides the basic getter and setter methods for statements and built-in relations. The sub-notions (Concept and Relation) are represented by specialized subinterfaces of Thing, extended with methods for their specific needs.

Instantiation of ontology objects is provided by the Ontology interface.
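
The shape described above (a root Thing interface, a specialized Concept subinterface, and an Ontology that instantiates both) can be sketched as below. All method names and the Basic* classes are assumptions made for illustration, not the actual SOFA signatures; a Relation subinterface would follow the same pattern as Concept and is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical root interface: basic access to a Thing's statements.
interface Thing {
    String getId();
    void add(String relationId, Object value);
    List<Object> list(String relationId);
}

// Hypothetical subinterface adding Concept-specific methods.
interface Concept extends Thing {
    void addSuperconcept(Concept parent);
    List<Concept> getSuperconcepts();
}

// Hypothetical factory and namespace for all Things.
interface Ontology {
    Concept createConcept(String id);
    Thing createThing(String id, Concept concept);
    Thing getThing(String id);
}

class BasicThing implements Thing {
    private final String id;
    private final Map<String, List<Object>> statements = new HashMap<>();

    BasicThing(String id) { this.id = id; }

    public String getId() { return id; }

    public void add(String relationId, Object value) {
        statements.computeIfAbsent(relationId, r -> new ArrayList<>()).add(value);
    }

    public List<Object> list(String relationId) {
        List<Object> values = statements.get(relationId);
        return values == null ? new ArrayList<>() : new ArrayList<>(values);
    }
}

class BasicConcept extends BasicThing implements Concept {
    private final List<Concept> superconcepts = new ArrayList<>();

    BasicConcept(String id) { super(id); }

    public void addSuperconcept(Concept parent) { superconcepts.add(parent); }

    public List<Concept> getSuperconcepts() { return new ArrayList<>(superconcepts); }
}

class BasicOntology implements Ontology {
    private final Map<String, Thing> things = new HashMap<>();

    public Concept createConcept(String id) {
        Concept concept = new BasicConcept(id);
        things.put(id, concept);
        return concept;
    }

    public Thing createThing(String id, Concept concept) {
        Thing thing = new BasicThing(id);
        thing.add("instanceOf", concept); // "instanceOf" is an invented relation id
        things.put(id, thing);
        return thing;
    }

    public Thing getThing(String id) { return things.get(id); }
}
```

Client code in this sketch never calls a constructor of a Thing or Concept directly; it asks the Ontology for new objects, which lets the Ontology register them in its namespace.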

Java data types

The model provides a transparent way to get and set arbitrary Java objects as the datatype values of a Thing's statements. It also provides an automatic mapping from Java classes to the datatype ranges of Relations.

Events and event listeners

Changes to the model raise events. A client application can track particular events by registering event listeners, which are notified when events of a specified class occur and execute tasks to handle them.

The event mechanism is based on the Java events framework (the java.util.EventObject class and the java.util.EventListener interface).
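
A listener mechanism built on java.util.EventObject and java.util.EventListener might be structured as in this sketch. The event class, listener interface, and ObservableModel are hypothetical names chosen for the example; only the two java.util types come from the text above.

```java
import java.util.ArrayList;
import java.util.EventListener;
import java.util.EventObject;
import java.util.List;

// Hypothetical event carrying the id of the Thing that changed.
class ThingChangedEvent extends EventObject {
    private final String thingId;

    ThingChangedEvent(Object source, String thingId) {
        super(source);
        this.thingId = thingId;
    }

    String getThingId() { return thingId; }
}

// Hypothetical listener interface; EventListener is only a marker interface.
interface ThingChangeListener extends EventListener {
    void thingChanged(ThingChangedEvent event);
}

class ObservableModel {
    private final List<ThingChangeListener> listeners = new ArrayList<>();

    void addListener(ThingChangeListener listener) { listeners.add(listener); }

    // Simulates a model change and notifies every registered listener.
    void touch(String thingId) {
        ThingChangedEvent event = new ThingChangedEvent(this, thingId);
        for (ThingChangeListener listener : listeners) {
            listener.thingChanged(event);
        }
    }
}
```

Because the listener interface has a single method, a client can register a lambda such as `event -> log(event.getThingId())`, where `log` stands for whatever handling the application needs.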

Exceptions

When the model encounters an illegal action, a failure, or another situation that may be considered abnormal, it throws an exception. A client application can catch the exception and handle it in an appropriate way. The exception mechanism is based on the Java exceptions framework (the java.lang.Throwable class hierarchy).
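
As a small sketch of this pattern, the example below throws a checked exception when a client tries to register the same identifier twice. Treating a duplicate identifier as an illegal action is an assumption made for the example, and ModelException and ThingRegistry are invented names rather than SOFA classes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical checked exception for illegal actions on the model.
class ModelException extends Exception {
    ModelException(String message) {
        super(message);
    }
}

class ThingRegistry {
    private final Set<String> ids = new HashSet<>();

    // Rejects an identifier that is already in use (assumed to be an illegal action).
    void register(String id) throws ModelException {
        if (!ids.add(id)) {
            throw new ModelException("Thing already exists: " + id);
        }
    }
}
```

A client would wrap the call in a try/catch block and decide there whether to report the problem, retry with another identifier, or abort the operation.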


$Id: index.html,v 1.1.1.1 2005/02/14 07:58:51 alexeya Exp $