SOFA (Simple Ontology Framework API)

Getting Started with the SOFA API

by Alex Alishevskikh
<alexeya (at) gmail (dot) com>

API version: 0.3

This document is a brief tutorial overview of the key aspects of programming with the SOFA API. For more details, please refer to the SOFA API documentation included in the distribution.

  1. Basics
  2. Concepts and instances
  3. Relations. Reasoning and inferencing
  4. Ontology validation
  5. Interoperability of ontologies
  6. Advanced techniques
  7. Ontology representation

Basics

Creating new ontology

The SOFA Reference Implementation includes the OntoConnector class ( org.semanticweb.sofa.client.OntoConnector ), which provides a mechanism for Ontology instantiation. This singleton class keeps a single pool of multiple Ontology instances during the session and provides a common way to create new Ontologies and retrieve existing ones.

import org.semanticweb.sofa.*;
import org.semanticweb.sofa.client.OntoConnector;
...
OntoConnector ontoConnector = OntoConnector.getInstance();
Ontology onto = ontoConnector.createOntology("http://example.com/ontology");

The code above creates a new blank Ontology implementation with the default in-memory storage model. To instantiate an Ontology, we need to provide a namespace identifier (a valid URI). Later we can address this Ontology in the pool by its namespace identifier:

Ontology onto_copy = ontoConnector.getOntology("http://example.com/ontology");                                            

— this returns a reference to the existing Ontology, assigned to the variable onto_copy.

Creating the concepts and subconcepts.

Let's create a simple hierarchy of geographical concepts:

Concept geographical = onto.createConcept("Geographical");
Concept region = geographical.createSubConcept("Region");
Concept point = geographical.createSubConcept("Point");

We have the following structure of concepts:

"Region" and "Point" are the subconcepts of "Geographical" superconcept. The "Geographical" concept is top-level, i.e. it has no superconcepts itself.

Creating the relations.

Let's define the ontology Relations between instances of our concepts.

Relation locatedIn = geographical.createRelation("LocatedIn");
locatedIn.addRange(region);
Relation nearTo = geographical.createRelation("NearTo");
nearTo.addRange(geographical);

We have created two Relation objects, "LocatedIn" and "NearTo", which belong to the "Geographical" concept instances (we say these relations are in the domain of that concept). As "Region" and "Point" are both subconcepts of "Geographical", these relations belong to their instances as well (in other words, "Region" and "Point" inherit those relations from the "Geographical" superconcept). The relationships between concepts and relations in our ontology are illustrated by the following diagram:

We called the addRange() method to limit the possible targets of our relations. Thus, we declared that the "LocatedIn" relation may have only instances of the "Region" concept as values, and the "NearTo" relation may have instances of the "Geographical" concept as values (including instances of their subconcepts). We used the addRange() method with a Concept argument to specify that these relations take instances of ontology concepts as values. To define a relation that refers to objects of a Java data type, we use the addRange() method with a java.lang.Class argument:

Relation name = geographical.createRelation("Name");
name.addRange(String.class);

We have created a relation whose values are limited to objects of the java.lang.String class.

New domain concepts can be added to a relation later with the addDomainConcept() method of Relation or with the addRelation() method of Concept. We can browse the domain concepts of a relation with the domainConcepts() method, which returns an iterator over ontology concepts:

for (Iterator i = locatedIn.domainConcepts(true); i.hasNext();)
System.out.println(((Concept)i.next()).getId());

causes the output:

Region
Geographical
Point

The boolean argument of domainConcepts() allows us to get all implicit (indirect) domain concepts. If we used false instead, we would get only the direct domain concept, "Geographical".

Similarly, the ranges() method of the Relation returns an iterator over the ranges of a given relation. We can apply the instanceof Java operator to check if a current value is a Concept range or not:

for (Iterator i = someRelation.ranges(true); i.hasNext();) {
    Object range = i.next();
    if (range instanceof Concept)
        System.out.println(((Concept)range).getId());
    else
        System.out.println(((Class)range).getName());
}

Creating the instances. Setting and getting the relations.

Now we are going to create specific individuals as instances of the ontology concepts:

Thing eriador = region.createInstance("Eriador");
Thing annuminas = point.createInstance("Annuminas");
Thing nenuial = point.createInstance("Nenuial");

And set the values for the relations of our new instances:

annuminas.set(locatedIn, eriador);
nenuial.set(locatedIn, eriador);
annuminas.set(nearTo, nenuial);
nenuial.set(nearTo, annuminas);
annuminas.set(name, "Annuminas city");
nenuial.set(name, "Nenuial lake");
eriador.set(name, "Eriador");

We have set the following relations in our ontology:

Let's test our assignments and try to retrieve the relations values with the get() method:

System.out.println(
    annuminas.get(name) +
    " is located in " +
    ((Thing)annuminas.get(locatedIn)).get(name) +
    ", near to " +
    ((Thing)annuminas.get(nearTo)).get(name)
);

We will see:

Annuminas city is located in Eriador, near to Nenuial lake                                            

Note that when we retrieve an instance value, we have to cast the result of the get() method to the Thing interface, because it is declared to return java.lang.Object.

Setting and getting multiple values. The restrictions.

To set more than one value for a relation, we should use the add() method:

nenuial.add(name, "Nenuial lake"); // first 'add()' acts as 'set()'
nenuial.add(name, "Evendim lake");

or we can use the setAll() or addAll() methods, which allow us to assign an array of values (or a java.util.Collection) at once:

nenuial.setAll(name, new String[] {"Nenuial lake", "Evendim lake"});                                            

The difference between set() and add() (and between setAll() and addAll()) is that set() and setAll() clear all previous values of the given relation before making new assignments.
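The set()/add() semantics can be sketched with a plain multimap in ordinary Java. This is a sketch of the behaviour only, not of SOFA internals; the class and method bodies here are illustrative:

```java
import java.util.*;

public class MultiValue {
    private final Map<String, List<Object>> values = new HashMap<>();

    // set(): clears all previous values of the relation, then assigns
    void set(String relation, Object value) {
        List<Object> fresh = new ArrayList<>();
        fresh.add(value);
        values.put(relation, fresh);
    }

    // add(): appends to the existing values of the relation
    void add(String relation, Object value) {
        values.computeIfAbsent(relation, k -> new ArrayList<>()).add(value);
    }

    // list(): returns all current values of the relation
    List<Object> list(String relation) {
        return values.getOrDefault(relation, Collections.emptyList());
    }

    public static void main(String[] args) {
        MultiValue nenuial = new MultiValue();
        nenuial.add("Name", "Nenuial lake"); // first add() acts as set()
        nenuial.add("Name", "Evendim lake");
        System.out.println(nenuial.list("Name").size()); // 2
        nenuial.set("Name", "Nenuial lake"); // clears both previous values
        System.out.println(nenuial.list("Name").size()); // 1
    }
}
```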

To retrieve multiple values, there is the list() method, which returns a java.util.Collection of values:

Collection results = nenuial.list(name);
for (Iterator i = results.iterator(); i.hasNext();)
System.out.println(i.next());

But what if we need to limit the number of possible relation values? Then we should use restrictions:

geographical.setRestrictionOn(name, 0, 1);                                                                                                              

The last two arguments of the setRestrictionOn() method are the minimal and maximal cardinality of the relation values. In our case, at most one (optional) value of "Name" will be accepted for "Geographical" instances (including the instances of "Region" and "Point").

Additionally, we can restrict a relation by a set of predefined values:

Relation color = onto.createRelation("Color");
color.addRange(String.class);
Concept fruit = onto.createConcept("Fruit");
Set availColors = new HashSet();
availColors.add("red"); availColors.add("green"); availColors.add("orange");
fruit.setRestrictionOn(color, availColors, 1, 1);

We have used the setRestrictionOn() method with four arguments to limit the possible values of the "Color" relation on the "Fruit" concept. The second argument of this variant of setRestrictionOn() is a java.util.Collection containing the allowed values.

Note that setting restrictions (like other ontology constraints) does not prevent us from violating them. They are just conditions of ontology integrity which can be checked post factum (see the "Ontology validation" section below).

Browsing an ontology

As you may have noticed, we always pass a string parameter when we create new ontology members. This is a unique identifier (ID) addressing the individual within an ontology. The SOFA Reference Implementation puts some conventional constraints on identifiers.

We can use IDs to retrieve Things from an ontology:

Thing t = onto.getThing("Eriador");                                            

This method returns null if the given member is not found. If we need to get Concept or Relation instances, we should use the special methods:

Concept c = onto.getConcept("Region");
Relation r = onto.getRelation("Name");

The two methods above return null if the specified Thing is found but is not a Concept or a Relation, respectively. To be sure, we can retrieve a Thing object and call the boolean isConcept() or isRelation() methods:

Thing t = onto.getThing("FooBar");
if (t != null) {
if (t.isConcept())
System.out.println("FooBar is a Concept");
else
if (t.isRelation())
System.out.println("FooBar is a Relation");
else
System.out.println("FooBar is a simple Thing");
}

If isConcept() or isRelation() returns true, we are able to call the toConcept() or toRelation() methods to reconstruct the Concept or Relation object from the Thing source:

Thing t = onto.getThing("FooBar"); 
Concept c = null; Relation r = null;
if (t != null) {
if (t.isConcept())
c = t.toConcept();
else
if (t.isRelation())
r = t.toRelation();
}

Note that we cannot use the usual Java cast operation here.

There are three methods of the Ontology interface which return iterators over the ontology members:

We can use these methods to get a complete listing of our ontology:

System.out.println("All individuals:");
for (Iterator i = onto.things(); i.hasNext();)
System.out.println("t"+((Thing)i.next()).getId());
System.out.println("nAll concepts:");
for (Iterator i = onto.concepts(); i.hasNext();)
System.out.println("t"+((Concept)i.next()).getId());
System.out.println("nAll relations:");
for (Iterator i = onto.relations(); i.hasNext();)
System.out.println("t"+((Relation)i.next()).getId());

We will see the following output:

All individuals:
	LocatedIn
	Nenuial
	Region
	Eriador
	Name
	Geographical
	Point
	Annuminas
	NearTo
	

All concepts:
	Region
	Geographical
	Point

All relations:
	LocatedIn
	Name
	NearTo

Instead of iterators, we can also use the getThings(), getConcepts() and getRelations() methods, which return Collections of individuals.

As you may have noticed, the result of the things() method includes a strange Thing object with an empty string as its identifier. This is a special Thing representing a recursive link to the ontology itself. Indeed, an Ontology is a Thing too, and it always contains an instance of itself as a member. We can verify this fact with a simple test:

Thing o = onto.getThing("");
if (o.isOntology()) {
    Ontology onto2 = o.toOntology();
    System.out.println(onto2.equals(onto));
}

Annotations

Along with relations carrying logical meaning, Things (and Ontologies) may keep some annotational information for representational and documentation purposes. Special methods are defined by the Thing interface to set this information.

The corresponding getter methods return the values of the annotation items, or empty strings if a specific item has not been set.

As we learned before, any Ontology is a Thing too, so we can set labels, comments and version info for Ontology instances as well:

onto.setLabel("My first ontology");
onto.setComment("This is my first ontology");
onto.setVersionInfo("1.0");

Concepts and instances

The concepts hierarchy. Multiple inheritance.

Let's extend our concepts structure:

Concept city = point.createSubConcept("City");
Concept lake = point.createSubConcept("Lake");
Concept state = region.createSubConcept("State");

Now we have the following concepts tree:

The subConcepts() method returns an iterator over the subconcepts of the given concept. It has a boolean argument which defines whether indirect subconcepts should be returned as well:

for (Iterator i = geographical.subConcepts(false); i.hasNext();)
System.out.println(((Concept)i.next()).getId());
System.out.println("----");
for (Iterator i = geographical.subConcepts(true); i.hasNext();)
System.out.println(((Concept)i.next()).getId());

We will see the following output:

Region
Point
----
Lake
Region
State
City
Point                                            

If the boolean argument is false, the subConcepts() method iterates over the set of directly known subconcepts only. Otherwise, it recursively iterates over direct subconcepts, their subconcepts and so on, until a closed set of all subconcepts is built.

The superConcepts() method returns an iterator over the superconcepts of the given concept. The boolean argument defines whether indirect superconcepts should be returned as well (much as with the subConcepts() method):

for (Iterator i = city.superConcepts(true); i.hasNext();)
    System.out.println(((Concept)i.next()).getId());

returns:

__THING_CONCEPT
Geographical
Point

Don't be surprised by the first, strange-looking concept. This is a built-in system concept which is the root of the SOFA concepts hierarchy and an implicit superconcept of any custom concept. In fact, "Geographical", as a top-level concept, is a direct subconcept of "__THING_CONCEPT". You may usually ignore it, but if you want to know more about these nuts and bolts, please refer to the "Advanced techniques" section below.

The set of relations in the domain of a specific concept is a union of the relations in the domains of all its superconcepts. The definedRelations() method returns an iterator over the relations in the domain:

for (Iterator i = city.definedRelations(true); i.hasNext();)     
System.out.println(((Relation)i.next()).getId());

causes the output:

LocatedIn
Name
NearTo
__VERSIONINFO_REL
__INSTANCEOF_REL
__COMMENT_REL
__LABEL_REL

We used the true argument to get an iterator including the relations in the domains of all superconcepts (including the four system relations in the domain of "__THING_CONCEPT"). As there are no relations defined on the "City" concept directly, we would get an empty iterator if we used the false argument.

Thus, if we need to define a common relation for a few distinct concepts, we have to define it on their nearest common superconcept. But sometimes this can lead to logical inconsistencies. Suppose we want to add a common relation about inhabitant people for "City" and "State" instances. Defining it in the domain of the "Geographical" superconcept is a bad idea, because we don't want this relation for lakes, rivers, mountains and many other geographical objects. In this case, multiple inheritance of concepts helps. We can create a separate "Residence" concept having this relation in its domain and add "City" and "State" as subconcepts of "Residence" (in addition to "Point" and "Region" respectively):

Concept residence = onto.createConcept("Residence"); 
Concept people = onto.createConcept("People");
Relation inhabitantPeople = residence.createRelation("InhabitantPeople");
inhabitantPeople.addRange(people);
residence.addSubConcept(city);
residence.addSubConcept(state);

We have got the following ontology:

We used the addSubConcept() method to add an existing concept as a new subconcept. Alternatively, we could call the addSuperConcept() method:

city.addSuperConcept(residence);
state.addSuperConcept(residence);

Instances. Multiple concept references.

If we need to get a list of instances of a given concept, we can use the instances() method, which returns an iterator over all direct instances of the concept, or over direct instances plus the instances of all its subconcepts (if the boolean argument is true):

for (Iterator i = region.instances(false); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
System.out.println("----");
for (Iterator i = geographical.instances(true); i.hasNext();)
System.out.println(((Thing)i.next()).getId());

This causes the output:

Eriador
----
Eriador
Annuminas
Nenuial

The concepts() method returns an iterator over the concepts referenced by a given instance. The boolean argument defines whether all superconcepts of the direct concepts are included as well:

annuminas.removeConcept(point);
annuminas.addConcept(city);
for (Iterator i = annuminas.concepts(false); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
System.out.println("----");
for (Iterator i = annuminas.concepts(true); i.hasNext();)
System.out.println(((Thing)i.next()).getId());

causes:

City
----
City
Residence
Point
Geographical
__THING_CONCEPT

An instance can directly reference more than one ontology concept. Turning back to the multiple inheritance case, we could use the multiple concept references technique instead:

Concept residence = onto.createConcept("Residence");  
Concept people = onto.createConcept("People");
Relation inhabitantPeople = residence.createRelation("InhabitantPeople");
inhabitantPeople.addRange(people);
annuminas = city.createInstance("Annuminas");
annuminas.addConcept(residence);

Now we have got the following concepts and instances structure:

The addConcept() method adds an instance to the specified concept. In the last line of the code above we could have called residence.addInstance(annuminas) as well. As you may have noticed, many methods have such inverse variants (addSubConcept() and addSuperConcept(), addDomainConcept() and addDefinedRelation(), etc.).

Relations. Reasoning and inferencing

An ontology is not a static information model. It uses explicitly stated initial facts for automatic inferencing of sets of implied facts about the subjects of an ontology. We have already met a sort of these implications when exploring indirect sub- and superconcepts, instances, domain concepts and ranges. Now we will learn how to implement inferencing with our custom relations.

Relations generalization and subrelations

Relations can form hierarchical structures, just as concepts can. Each relation can be a subrelation of another, more general relation (its superrelation). A new subrelation inherits the sets of domain concepts and ranges of its ancestor and can refine them for more specific needs. The relation generalization principle plays an important role in ontology inferencing due to the fact that a statement with a subrelation implies a set of implicit statements with all its ancestors.

Suppose, for example, we need to add a new relation to the "Region" concept declaring that a given "Region" instance is a part of another. Obviously, this relation is a special case of "LocatedIn", since every part of a region is also located in it. Thus, to keep our ontology logically consistent, we would have to maintain and change these two relations in sync.

But we can avoid it, if we declare our new relation as a subrelation of the "LocatedIn":

Relation partOf = locatedIn.createSubRelation("PartOf");
partOf.setRange(region);
Thing arnor = state.createInstance("Arnor");
arnor.set(partOf, eriador);

"LocatedIn" relation has been added automatically due to the fact that "PartOf" is a subrelation of "LocatedIn". To retrieve such implicit relation values, we should use the list() method with two arguments:

Collection results = arnor.list(locatedIn, Thing.INCLUDE_SUBPROPERTIES);
for (Iterator i = results.iterator(); i.hasNext();)
System.out.println(((Thing)i.next()).getId());

It causes "Eriador" in output. The second argument of list() is a sum of predefined integer constants specifying the types of implicit relations which will be included into a result. There are two special constants:

Transitive relations

Let's consider the following code:

Thing eriador = region.createInstance("Eriador");
Thing arnor = state.createInstance("Arnor");
Thing annuminas = city.createInstance("Annuminas");
annuminas.set(locatedIn, arnor);
arnor.set(locatedIn, eriador);

By common sense, if Annuminas is located in Arnor and Arnor is located in Eriador, then Annuminas is located in Eriador as well. But this obvious fact is not present in our ontology unless we declare it explicitly: annuminas.set(locatedIn, eriador). To avoid such redundant statements, we can declare the "LocatedIn" relation as transitive:

locatedIn.setTransitive(true);
annuminas.set(locatedIn, arnor);
arnor.set(locatedIn, eriador);
for (Iterator i = annuminas.list(locatedIn, Thing.INCLUDE_TRANSITIVE).iterator(); i.hasNext();)
System.out.println(((Thing)i.next()).getId());

The output will be:

Arnor
Eriador

As you see, the second value ("Eriador") has been added implicitly, because "LocatedIn" is a transitive relation and we used the Thing.INCLUDE_TRANSITIVE constant as an argument of list(). In the diagram below, the implicit relation is shown as a dashed line:

If the relation P is transitive, then (x P y) and (y P z) imply (x P z).
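The rule above amounts to computing a transitive closure over the stored assertions. A minimal pure-Java sketch of this kind of inference (illustrative only, not SOFA's actual implementation):

```java
import java.util.*;

public class TransitiveClosure {
    // Direct assertions of a transitive relation, e.g. "LocatedIn"
    static final Map<String, Set<String>> direct = new HashMap<>();

    static void set(String x, String y) {
        direct.computeIfAbsent(x, k -> new LinkedHashSet<>()).add(y);
    }

    // All values implied by transitivity: follow direct edges repeatedly
    static Set<String> listTransitive(String x) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> queue =
            new ArrayDeque<>(direct.getOrDefault(x, Collections.emptySet()));
        while (!queue.isEmpty()) {
            String y = queue.poll();
            if (result.add(y)) // visit each value only once
                queue.addAll(direct.getOrDefault(y, Collections.emptySet()));
        }
        return result;
    }

    public static void main(String[] args) {
        set("Annuminas", "Arnor");
        set("Arnor", "Eriador");
        System.out.println(listTransitive("Annuminas")); // [Arnor, Eriador]
    }
}
```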

Symmetric relations

Do you remember that we wrote:

annuminas.set(nearTo, nenuial);
nenuial.set(nearTo, annuminas);

Our common sense tells us again that the second set() is redundant here. Indeed, if Annuminas city is near to Nenuial lake, then the lake is near to the city as well. To obtain that implicit statement, we should declare the "NearTo" relation as symmetric:

nearTo.setSymmetric(true);
annuminas.set(nearTo, nenuial);
System.out.println(nenuial.getId() + " is near to: ");
for (Iterator i = nenuial.list(nearTo, Thing.INCLUDE_SYMMETRIC).iterator(); i.hasNext();)
System.out.println(((Thing)i.next()).getId());

We will see:

Nenuial is near to: 
Annuminas

If the relation P is symmetric, then (x P y) implies (y P x).

Inverse relations

Let's assume we need to get every geographical object located in a given region. The objects know the region they are located in (due to the "LocatedIn" relation), but a region itself doesn't know which objects it contains. To make this possible for regions, we can create a new relation on the "Region" concept and declare it as the inverse of the "LocatedIn" relation:

Relation hasLocation = region.createRelation("HasLocation"); 
hasLocation.setInverseOf(locatedIn, true);
arnor.set(locatedIn, eriador);
nenuial.set(locatedIn, eriador);
annuminas.set(locatedIn, arnor);
System.out.println("Eriador has the following locations:");
for (Iterator i = eriador.list(hasLocation,
Thing.INCLUDE_INVERSED + Thing.INCLUDE_TRANSITIVE).iterator(); i.hasNext();)
System.out.println(((Thing)i.next()).getId());

We will see:

Eriador has the following locations:
Arnor
Nenuial
Annuminas

Why is Annuminas there, if we declared only that it is in Arnor? It is because we declared the "LocatedIn" relation as transitive and used the sum of the INCLUDE_INVERSED and INCLUDE_TRANSITIVE constants in the list() argument (see the section on transitive relations above). The following diagram illustrates the explicit and implicit relationships between these instances:

The relation P is the inverse of the relation P' if (x P y) implies (y P' x). Obviously, if a relation is the inverse of itself, it is a symmetric relation.

Ontology Validation

Setting various ontology constraints (domain concepts for relations, ranges and restrictions for relation values) does not mean we are unable to break these constraints. In fact, we can infringe any constraint rule, and SOFA will not even warn us about it. SOFA does not guard ontology integrity and does not try to prevent "illegal" assignments. There are a few reasons for this behaviour.

Instead of preventing ontology inconsistency, SOFA provides tools for checking the validity of separate assertions, Things and whole ontologies. Using these tools, a client application can test ontology integrity when needed and get information about possible integrity problems.

The org.semanticweb.sofa.validation package contains the OntoChecker class, which provides a few static methods for ontology validation:

The test() method checks the validity of a single assertion on a given Thing with a given Relation and value:

import org.semanticweb.sofa.validation.OntoChecker;
import org.semanticweb.sofa.validation.Problem;
...
int result = OntoChecker.test(thing, relation, value);

The method returns an integer value, which is 0 if the assertion is valid, or an error code > 0 otherwise. Error codes are represented as constants of the Problem class.

The testThing() method checks the validity of every statement of a given Thing. It returns a set of Problem objects which encapsulate information about each invalid statement:

Set problems = OntoChecker.testThing(thing);
if (!problems.isEmpty()) {
	System.out.println(thing.getId() + " has the following errors:");
	for (Iterator i = problems.iterator(); i.hasNext();) {
		Problem problem = (Problem)i.next();
		switch (problem.getErrCode()) {
			case Problem.DOMAIN_ERROR:
				System.out.println("Domain error"); break;
			case Problem.RANGE_ERROR:
				System.out.println("Range error"); break;
			case Problem.RESTRICTION_ERROR:
				System.out.println("Restriction error"); break;
		}
		System.out.println("\tRelation: " + problem.getRelation().getId());
		System.out.println("\tValue: " + problem.getValue());
		System.out.println();
	}
}

To check a whole ontology, OntoChecker has the testOntology() method, which also returns a set of Problem objects. To get the Thing which has a given problem, the getThing() method may be used:

Set problems = OntoChecker.testOntology(onto);
for (Iterator i = problems.iterator(); i.hasNext();) {
	Problem problem = (Problem)i.next();
	switch (problem.getErrCode()) {
		case Problem.DOMAIN_ERROR:
			System.out.println("Domain error"); break;
		case Problem.RANGE_ERROR:
			System.out.println("Range error"); break;
		case Problem.RESTRICTION_ERROR:
			System.out.println("Restriction error"); break;
	}
	System.out.println("\tThing: " + problem.getThing().getId());
	System.out.println("\tRelation: " + problem.getRelation().getId());
	System.out.println("\tValue: " + problem.getValue());
	System.out.println();
}

Note that checking a whole ontology may be a time-consuming operation.

Interoperability of ontologies

In the SOFA Reference Implementation, the OntoConnector singleton class ( org.semanticweb.sofa.client.OntoConnector) serves as a central point for intercommunication between ontologies. Earlier in this tutorial we considered OntoConnector as a factory for creating new Ontology instances. In this section, we will discuss its role in the interoperability and management of ontologies in detail.

Using multiple Ontology instances

In our applications, we can use more than one Ontology instance simultaneously. OntoConnector provides a mechanism of interoperability between Ontology instances created during the same session, so the Things of different ontologies are visible to each other, as shown in the example:

OntoConnector ontoConnector = OntoConnector.getInstance();
Ontology onto1 = ontoConnector.createOntology("http://example.com/onto1");
Ontology onto2 = ontoConnector.createOntology("http://example.com/onto2");
Concept concept1 = onto1.createConcept("Concept1");
Concept concept2 = onto2.createConcept("Concept2");
Thing thing = concept1.createInstance("Thing");
thing.addConcept(concept2);

The Thing in the example above is an instance of two concepts belonging to different ontologies. We can test which ontology a Thing belongs to using the getOntology() method, which returns the parent Ontology instance of a given Thing:

for (Iterator i = thing.concepts(false); i.hasNext();) {
    Ontology o = ((Thing)i.next()).getOntology();
    System.out.println(o.getNameSpace());
}

We will get:

http://example.com/onto1
http://example.com/onto2

This technique can be useful for multilevel ontology engineering, with separate architectural layers of a reusable ontology schema (containing concept and relation definitions) and instance collections.

Retrieving ontologies and Things

If we know the namespace identifier of an existing ontology, we can retrieve it from the OntoConnector pool:

Ontology o = ontoConnector.getOntology("http://example.com/onto1");                                        

Moreover, we can retrieve a specific Thing with a known qualified identifier:

Thing t = ontoConnector.getThing("http://example.com/onto1#Foo");                                        

The last method allows us to pick Things directly from the OntoConnector, without creating an Ontology instance variable.

Managing ontologies in the OntoConnector pool

In fact, managing ontologies means that we can create new Ontology instances (which we already did before) and remove existing instances from the OntoConnector pool. To remove an ontology, OntoConnector has the removeOntology() method:

OntoConnector ontoConnector = OntoConnector.getInstance();
ontoConnector.removeOntology("http://example.com/onto2");

All relations to the Things of the removed ontology become indeterminate after this operation. However, those relations are not discarded: if we restore an ontology with the same namespace and Thing identifiers, the relations will work again.

To explore the OntoConnector pool, we can use the getNameSpaces() method, which returns a set of namespace values (java.net.URI objects) of the ontologies currently in the pool. The example below shows how to get a collection of all Ontology instances:

Vector ontologies = new Vector();
for (Iterator i = ontoConnector.getNameSpaces().iterator(); i.hasNext();) {
    Ontology o = ontoConnector.getOntology((URI)i.next());
    ontologies.add(o);
}

To check whether an ontology with a specified namespace exists in the OntoConnector pool, we can use the contains() method:

if (ontoConnector.contains(NS))
System.out.println("Ontology "+NS+" exists.");

Fetching ontologies on demand

NOTE: This feature is supported by SOFA 0.3 and higher.

When ontologies interoperate across multiple sessions, the same set of ontologies must be restored in the OntoConnector pool to keep all external relations satisfied. The OntoConnector simplifies this process with an on-demand ontology fetching mechanism.

By default, the OntoConnector is in "client mode", which means that it acts as a client application which retrieves ontologies from external sources and parses them into the pool automatically when needed. In this mode, when an unsatisfied relation (i.e. one having no target Thing in the OntoConnector pool) occurs, the OntoConnector asks another singleton class, OntoClient ( org.semanticweb.sofa.client.OntoClient), to find and retrieve the ontology. OntoClient interprets the given ontology namespace identifier as the URL of an ontology resource and tries to fetch and parse it.

Let's consider this behavior with an example. Suppose, we call:

Thing foo = ontoConnector.getThing("file:/home/ontologies/onto1.owl#Foo");                                        

The algorithm of OntoConnector's actions in this case is as follows:

  1. If an ontology with the namespace identifier "file:/home/ontologies/onto1.owl" exists in the OntoConnector pool, the Thing with identifier "Foo" is extracted and returned.
  2. Otherwise (an ontology with the given namespace doesn't exist), the OntoConnector calls the fetch() method of the OntoClient instance, passing the namespace URI as an argument.
  3. The fetch() method tries to read the resource (a local file, in this case) from the URL "file:/home/ontologies/onto1.owl" and parse it into a new in-memory ontology model using the SOFA serialization mechanism (see the section called "Ontology representation"). If the URL contains a network protocol scheme ("http:"), OntoClient will create a network connection to download the specified remote resource.
  4. If the ontology resource is read and parsed successfully, the OntoConnector puts the fetched Ontology instance into the pool. The requested Thing is extracted and returned.
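The distinction in step 3 between local files and network resources comes down to the URL scheme. A sketch of such a check with java.net.URI (illustrative only; OntoClient's actual test may differ):

```java
import java.net.URI;

public class SchemeCheck {
    // Decide whether a namespace URL points to a network resource
    static boolean isNetwork(String namespace) {
        String scheme = URI.create(namespace).getScheme();
        return "http".equals(scheme);
    }

    public static void main(String[] args) {
        System.out.println(isNetwork("file:/home/ontologies/onto1.owl")); // false
        System.out.println(isNetwork("http://example.com/onto1"));        // true
    }
}
```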

To disable this behavior, the "client mode" may be turned off:

ontoConnector.setClientModeOn(false);                                        

If the "client mode" is off, the OntoConnector will not call the OntoClient automatically and will return a null value if the requested ontology instance doesn't exist. We can control the process of ontology loading manually by explicitly calling the fetch() method:

import java.net.URI;
import org.semanticweb.sofa.client.OntoClient;
...
OntoClient ontoClient = OntoClient.getInstance();
Ontology onto = null;
try {
    onto = ontoClient.fetch(new URI("http://example.com/onto1"));
}
catch (Exception ex) {
    System.err.println("Failed to fetch an ontology.");
    ex.printStackTrace();
}

URI aliases

In real life, ontology namespace identifiers rarely match the actual locations of the ontology resources. The OntoClient allows setting a mapping between namespace identifiers and the real URLs of the resources with the setAlias() static method:

OntoClient.setAlias("http://example.com/onto1", "file:/home/ontologies/onto1.owl");                                        

After setting this alias, a request to fetch an ontology with the "http://example.com/onto1" namespace will cause reading and parsing of the local file with the specified path.

Aliases are set globally and remain active until the session is finished. To reuse them, it is possible to get all aliases as a java.util.Properties object:

Properties allAliases = OntoClient.aliases();                                        

The aliases may be stored as a common Java "properties" file and restored with the setAliases() method:

Properties aliases = new Properties();
aliases.load(inputStream);
OntoClient.setAliases(aliases);
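The persistence part of this pattern relies only on the standard java.util.Properties API. A minimal round-trip sketch (the class name and streams are illustrative; the OntoClient calls are omitted):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.Properties;

// Round-tripping an alias mapping through the standard Java "properties"
// file format (illustrative; the OntoClient calls themselves are omitted).
public class AliasStoreDemo {
    public static void main(String[] args) throws Exception {
        Properties aliases = new Properties();
        aliases.setProperty("http://example.com/onto1", "file:/home/ontologies/onto1.owl");

        // Store the aliases as a common Java "properties" file...
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        aliases.store(out, "ontology aliases");

        // ...and restore them in a later session
        Properties restored = new Properties();
        restored.load(new ByteArrayInputStream(out.toByteArray()));
        System.out.println(restored.getProperty("http://example.com/onto1"));
    }
}
```

In a real application the streams would be a FileOutputStream and FileInputStream over the same properties file.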

Ontology readers

When parsing an external resource, the OntoClient selects an appropriate SOFA serializer depending on the content type of the resource. Two content types are supported by default.

New readers for other resource types can be implemented and registered with the OntoClient by calling the registerReader() method:

import org.semanticweb.serialize.Reader;
...
Reader reader = new MyCustomReader();
OntoClient.registerReader("text/x-mycustomtype", reader);

Advanced techniques

System ontology, metaconcepts and relation concepts

The SOFA Reference Implementation is based on the built-in System Ontology, which provides a metavocabulary for defining all custom ontologies. Generally, we don't need to address the System Ontology directly, because it underlies the SOFA interfaces implementation and is hidden behind their methods. However, there are special cases requiring such advanced techniques.

Metaconcepts

Every ontology concept is a Thing, so it may have relation statements and be an instance of other concepts. Such "concepts of concepts" are called "metaconcepts". Every custom concept in the SOFA Reference Implementation is an instance of the common system metaconcept. This metaconcept provides the basic domain of the concept relations (such as "hasSubConcept", "hasRelation", "hasRestriction", etc.) masked by the Concept interface methods (see the diagram above). To extend the built-in concept functionality (e.g. to define new concept relations), we can create our own metaconcepts by subconcepting the system one:

import org.semanticweb.sofa.vocabulary.SOFA;
import org.semanticweb.sofa.vocabulary.Vocabulary;
...
Vocabulary sysVocabulary = SOFA.getVocabulary();
Concept sysMetaConcept = sysVocabulary.concept_Concept();
Concept myMetaConcept = sysMetaConcept.createSubConcept("MyMetaConcept");

Suppose we need to define a new relation (e.g., a number of instances) on the concepts which will be instances of our metaconcept:

Relation numOfInstances = myMetaConcept.createRelation("NumOfInstances");
numOfInstances.addRange(Integer.class);

Now we can instantiate new concepts with our metaconcept and set our new relation on them:

Concept concept1 = myMetaConcept.createInstance("Concept1").toConcept();
concept1.set(numOfInstances, new Integer(0));

Of course, we have to maintain the value of the instance count manually. To automate it, we could extend the Concept interface implementation class and override the addInstance(), createInstance() and removeInstance() methods, but such techniques are beyond the scope of this document.

Relation concepts

Every relation in the SOFA Reference Implementation is an instance of the "Relation" System Ontology concept. To extend its domain with our own concept and instantiate relations of a new type, we can create a subconcept of the system one:

Concept sysRelationConcept = sysVocabulary.relation_Concept();
Concept myRelationConcept = sysRelationConcept.createSubConcept("MyRelationConcept");
Relation prop1 = myRelationConcept.createInstance("Prop1").toRelation();

We can define new relations in a domain of our relation concept to extend existing functionality of the relation instances.

Pluggable ontology storage model implementations. Using relational databases for ontology models

The SOFA Reference Implementation has a pluggable ontology storage model mechanism. A specific ontology model is an implementation of the OntologyModel interface from the org.semanticweb.sofa.model package. The default implementation provides an in-memory storage model backed by the Java Collections framework. To create a new ontology with a custom model, we pass the OntologyModel instance as an argument of the OntoConnector.createOntology() method:

OntologyModel model = new MyCustomOntologyModel();
OntoConnector ontoConnector = OntoConnector.getInstance();
Ontology onto = ontoConnector.createOntology(model, "http://example.com/ontology");

The single-argument createOntology() method creates an ontology with the default in-memory model.

The optional org.semanticweb.sofa.model.rdb package contains an ontology model implementation for storing ontologies in a relational database using the JDBC API. To instantiate an RDB ontology model, we have to get a JDBC Connection instance and pass it as an argument of the OntologyModel implementation constructor:

import org.semanticweb.sofa.rdb.OntologyRDBModel;
import java.sql.*;
...
OntologyModel rdbModel = null;
Connection conn = null;
try {
    // Load the JDBC driver and open a connection to the database
    Class.forName("com.mysql.jdbc.Driver").newInstance();
    conn = DriverManager.getConnection("jdbc:mysql://localhost/dummy?user=root");
    rdbModel = new OntologyRDBModel(conn, "table1");
}
catch (Exception ex) {
    ex.printStackTrace();
}
Ontology onto = OntoConnector.getInstance().createOntology(rdbModel, "http://example.com/ontology");

The second argument of the OntologyRDBModel constructor is the name of the database table which will be used for storing this ontology model. If the given table does not exist, it will be created. Otherwise, createOntology() will return the existing ontology stored in that database table.

Ontology representation

For storage or transfer purposes, an ontology can be represented in an external textual form (that is, serialized).

Ontology representation is reversible, i.e. an ontology can be serialized into an external format and then restored from it. This mechanism makes it possible to reuse existing ontologies and provides interoperability with external agents.

The SOFA ontology model is independent of any specific language, but it can be interpreted in terms of any language expressive enough to describe that model. As the SOFA model is conceptually consistent with the semantics of the W3C Web Ontology Language (OWL), positioned as an industry standard of ontology representation, the model can be entirely represented in that language's syntax. Much the same is true for DAML+OIL (the predecessor of OWL and still a widely used ontology definition language).

The Ontology Serialization package ( org.semanticweb.sofa.serialize.* ) includes the modules providing serialization of the ontology model with specific languages and restoring it from a serialized form. It supports the following ontology languages:

OWL (Web Ontology Language)

The org.semanticweb.sofa.serialize.owl package contains the serializer and deserializer classes for a limited subset of the W3C Web Ontology Language (OWL). It provides guaranteed reversible serialization of the SOFA ontology model without any losses.

The OWLWriter and OWLReader classes serialize and deserialize an ontology as an OWL document:

import org.semanticweb.sofa.serialize.owl.*;
...
OWLWriter.getWriter().write(onto, "file:///ontology.owl");
OWLReader.getReader().read(onto, "file:///ontology.owl");

DAML+OIL

The org.semanticweb.sofa.serialize.daml package contains the serializer and deserializer for a limited subset of the DAML+OIL language. It provides reversible serialization with high reliability; the only known inconsistency is the lack of a symmetric relation attribute in the DAML syntax.

The DAMLWriter and DAMLReader classes serialize and deserialize an ontology as a DAML+OIL document:

import org.semanticweb.sofa.serialize.daml.*; 
...
DAMLWriter.getWriter().write(onto, "file:///ontology.daml");
DAMLReader.getReader().read(onto, "file:///ontology.daml");

RDF+RDF-Schema

The org.semanticweb.sofa.serialize.rdfs package contains the serializer and deserializer for a limited subset of the W3C RDF-Schema language. This language has no syntactic features to describe restrictions or transitive, symmetric and inverse relations, so all these aspects will be ignored.

The RDFSWriter and RDFSReader classes serialize and deserialize an ontology as an RDF+RDF-Schema document:

import org.semanticweb.sofa.serialize.rdfs.*;  
...
RDFSWriter.getWriter().write(onto, "file:///ontology.rdfs");
RDFSReader.getReader().read(onto, "file:///ontology.rdfs");

As any RDF-Schema ontology may be considered a special case of an OWL ontology, the OWLReader may be used for reading RDF-Schema as well.

N-Triples serializer

Since the 0.3 release, SOFA includes an Ontology Model serialization mechanism based on the plain-text N-Triples syntax. It addresses the low-level storage model directly, so it works many times faster than the other serializers (which analyze high-level ontology constructs in order to interpret them in terms of a specific language). In fact, it is approximately 20 times faster for writing and 10 times faster for reading the same Ontology Model compared with the XML-based (OWL, DAML and RDF) serializers. As the resulting N-Triples text is an exact snapshot of the low-level storage model, it is guaranteed to produce exactly the same model when parsed back. This makes it well suited for internal persistent storage of in-memory models between sessions.

Usage of the N-Triples serializer has a conventional pattern:

import org.semanticweb.sofa.serialize.n3.*;
...
N3Writer.getWriter().write(onto, "file:testonto.nt");
N3Reader.getReader().read(onto, "file:testonto.nt");

The drawback of the N-Triples serializer is that it relies on the SOFA Reference Implementation (unlike the other serializers), so it will probably not work with other implementations of the SOFA interfaces.

Java datatypes mapping

The SOFA serialization package provides a transparent, reversible textual representation of datatype relation values. The mechanism of Java object serialization is based on a mapping of predefined Java classes to XML-Schema datatypes, as shown in the table below:

In the table, "xsd:" abbreviates the XML-Schema namespace "http://www.w3.org/2001/XMLSchema#".

Java class          XML-Schema datatype   String representation form     Parseable formats
java.lang.String    xsd:string            as is                          Any string
java.lang.Boolean   xsd:boolean           "true" | "false"               "true" | "false" | "1" | "0"
java.lang.Integer   xsd:int               Integer.toString()             Number string suitable for Integer.decode()
java.lang.Long      xsd:long              Long.toString()                Number string suitable for Long.decode()
java.lang.Short     xsd:short             Short.toString()               Number string suitable for Short.decode()
java.lang.Byte      xsd:byte              Byte.toString()                Number string suitable for Byte.decode()
java.lang.Float     xsd:float             Float.toString()               Number string suitable for Float.valueOf()
java.lang.Double    xsd:double            Double.toString()              Number string suitable for Double.valueOf()
java.util.Date      xsd:dateTime          Date and time in ISO 8601      ISO 8601; DateFormat.parse(); Date() constructor
java.net.URI        xsd:anyURI            URI.toString() (RFC 2396)      URI string suitable for URI.create()

There are 10 basic Java classes with built-in support in the ontology serializers and deserializers. Objects of other Java classes are represented in a Base64-encoded binary serialized form. Their datatype range is represented as a URN with a special format, "java:classname", where "classname" is the qualified Java class name (for instance, "java:java.awt.image.BufferedImage").
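The parseable formats in the table correspond to plain JDK calls. The sketch below exercises a few of them; it is illustrative only and is not the SOFA deserializer itself:

```java
import java.net.URI;
import java.text.SimpleDateFormat;
import java.util.Base64;
import java.util.Date;

// Exercising some of the parseable formats from the table above with
// standard JDK calls (illustrative; not the SOFA deserializer).
public class DatatypeMappingDemo {
    public static void main(String[] args) throws Exception {
        Integer i = Integer.decode("0x2A");          // decode() also accepts hex and octal forms
        Boolean b = Boolean.valueOf("true");
        Double d = Double.valueOf("3.14");
        URI uri = URI.create("http://example.com/ontology");

        // ISO 8601 date and time, the representation used for java.util.Date values
        Date date = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss").parse("2005-03-09T06:56:58");

        // Classes without a built-in mapping fall back to a Base64-encoded binary form
        String encoded = Base64.getEncoder().encodeToString("payload".getBytes());

        System.out.println(i + " " + b + " " + d + " " + uri.getHost() + " " + date + " " + encoded);
    }
}
```

Note that java.util.Base64 is the modern JDK utility; the 2005-era implementation would have used a different Base64 codec, but the encoded form is the same.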

Visualising an ontology with the DOTWriter

The org.semanticweb.sofa.serialize.visual package contains a special class DOTWriter which represents an ontology as a directed graph described by Graphviz DOT language syntax:

import org.semanticweb.sofa.serialize.visual.DOTWriter;
...
DOTWriter.getWriter().write(onto, "file:///ontology.dot");

The result can be transformed into PostScript, SVG, PNG, GIF, JPEG and other graphic formats using the DOT interpreter, which is a part of the AT&T Graphviz package. You can obtain Graphviz for free from http://www.research.att.com/sw/tools/graphviz


$Id: index.html,v 1.2 2005/03/09 06:56:58 alexeya Exp $