Getting Started with the SOFA API
by Alex Alishevskikh
<alexeya (at) gmail (dot) com>
API version: 0.3
This document is a brief tutorial overview of the key aspects of programming with the SOFA API. For more details, please refer to the SOFA API documentation included in the distribution.
- Basics
- Concepts and instances
- Relations. Reasoning and inferencing
- Ontology validation
- Interoperability of ontologies
- Advanced techniques
- Ontology representation
Basics
Creating new ontology
The SOFA Reference Implementation includes the OntoConnector class (
org.semanticweb.sofa.client.OntoConnector ) providing a mechanism for
Ontology instantiation. This is a singleton class which keeps a single
pool of multiple Ontology instances during the session and provides a
common way for creating new and getting existing Ontologies.
import org.semanticweb.sofa.*;
import org.semanticweb.sofa.client.OntoConnector;
...
OntoConnector ontoConnector = OntoConnector.getInstance();
Ontology onto = ontoConnector.createOntology("http://example.com/ontology");
The code above creates a new blank Ontology with the default in-memory storage model. To instantiate an Ontology we need to provide a namespace identifier (a valid URI). Later we can address this Ontology in the pool by its namespace identifier:
Ontology onto_copy = ontoConnector.getOntology("http://example.com/ontology");
This creates a new reference to the existing Ontology in the variable
onto_copy.
Creating the concepts and subconcepts.
Let's create the simple hierarchy of geographical concepts:
Concept geographical = onto.createConcept("Geographical");
Concept region = geographical.createSubConcept("Region");
Concept point = geographical.createSubConcept("Point");
We have the following structure of concepts:
"Region" and "Point" are the subconcepts of "Geographical" superconcept. The "Geographical" concept is top-level, i.e. it has no superconcepts itself.
Creating the relations.
Let's define the ontology Relations between instances of our concepts.
Relation locatedIn = geographical.createRelation("LocatedIn");
locatedIn.addRange(region);
Relation nearTo = geographical.createRelation("NearTo");
nearTo.addRange(geographical);
We have created two Relation objects, "LocatedIn" and "NearTo", which belong to instances of the "Geographical" concept (these relations are said to be in the domain of that concept). As "Region" and "Point" are both subconcepts of "Geographical", these relations belong to their instances as well (in other words, "Region" and "Point" inherit those relations from the "Geographical" superconcept). The relationships between concepts and relations in our ontology are illustrated by the following diagram:
We called the addRange() method to limit the possible targets of
our relations. Thus, we declared that the "LocatedIn" relation may only have
instances of the "Region" concept as values, and the "NearTo"
relation may have instances of the "Geographical" concept as values
(including instances of its subconcepts). We used the
addRange() method with a Concept argument to specify
that these relations take instances of ontology concepts as values. To
define a relation that refers to objects of a Java data type, we
have to call addRange() with a
java.lang.Class instance as the argument:
Relation name = geographical.createRelation("Name");
name.addRange(String.class);
We have created a relation whose values are limited to objects of the
java.lang.String class.
New domain concepts can be added to a relation later with
addDomainConcept() method of Relation or with
addRelation() method of Concept. We can browse the
domain concepts of a relation with domainConcepts() method
which returns an iterator over ontology concepts:
for (Iterator i = locatedIn.domainConcepts(true); i.hasNext();)
System.out.println(((Concept)i.next()).getId());
causes the output:
Region
Geographical
Point
A boolean argument of the domainConcepts() allows us to get
all implicit (indirect) domain concepts. We would get the "Geographical"
concept only (direct domain concept) if we used the false
argument instead.
Similarly, the ranges() method of the Relation
returns an iterator over the ranges of a given relation. We can apply
the instanceof Java operator to check if a current value is
a Concept range or not:
for (Iterator i = someRelation.ranges(true); i.hasNext();) {
    Object range = i.next();
    if (range instanceof Concept)
        System.out.println(((Concept)range).getId());
    else
        System.out.println(((Class)range).getName());
}
Creating the instances. Setting and getting the relations.
Now we are going to create specific individuals as instances of ontology concepts:
Thing eriador = region.createInstance("Eriador");
Thing annuminas = point.createInstance("Annuminas");
Thing nenuial = point.createInstance("Nenuial");
And set the values for relations of our new instances:
annuminas.set(locatedIn, eriador);
nenuial.set(locatedIn, eriador);
annuminas.set(nearTo, nenuial);
nenuial.set(nearTo, annuminas);
annuminas.set(name, "Annuminas city");
nenuial.set(name, "Nenuial lake");
eriador.set(name, "Eriador");
We have set the following relations in our ontology:
Let's test our assignments and try to retrieve the relations values with
the get() method:
System.out.println(
annuminas.get(name) +
" is located in " +
((Thing)annuminas.get(locatedIn)).get(name) +
", near to " +
((Thing)annuminas.get(nearTo)).get(name)
);
We will see:
Annuminas city is located in Eriador, near to Nenuial lake
Note that when we retrieve an instance value, we have to cast the
result of the get() method to the Thing interface,
because get() returns a java.lang.Object.
Setting and getting multiple values. The restrictions.
To set more than one value to relation we should use the add()
method:
nenuial.add(name, "Nenuial lake"); // first 'add()' acts as 'set()'
nenuial.add(name, "Evendim lake");
Alternatively, we can use the setAll() or addAll() methods,
which assign an array of values (or a
java.util.Collection) at once:
nenuial.setAll(name, new String[] {"Nenuial lake", "Evendim lake"});
The difference between set() and add() (and
between setAll() and addAll()) is that
set() and setAll() clear all previous values of the given
relation before making new assignments.
To retrieve multiple values there is the list() method
which returns a java.util.Collection of values:
Collection results = nenuial.list(name);
for (Iterator i = results.iterator(); i.hasNext();)
System.out.println(i.next());
But what if we need to limit a number of possible relation values? Then we should use restrictions:
geographical.setRestrictionOn(name, 0, 1);
The last two arguments of the setRestrictionOn() method are the
minimum and maximum cardinality of the relation values. In our case,
at most one (optional) value of "Name" will be accepted for
"Geographical" instances (including the instances of "Region" and
"Point").
Additionally, we can restrict a relation by a set of predefined values:
Relation color = onto.createRelation("Color");
color.addRange(String.class);
Concept fruit = onto.createConcept("Fruit");
Set availColors = new HashSet();
availColors.add("red"); availColors.add("green"); availColors.add("orange");
fruit.setRestrictionOn(color, availColors, 1, 1);
We have used the setRestrictionOn() method with four
arguments to limit the possible values of the "Color" relation of the
"Fruit" concept. The second argument of this variant of
setRestrictionOn() is a java.util.Collection
containing allowed instances.
Note that setting restrictions (as well as other ontology constraints) does not prevent us from violating them. They are just conditions of ontology integrity which can be checked after the fact (see the "Ontology validation" section below).
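To make the idea of an after-the-fact check concrete, here is a minimal standalone sketch of what a cardinality and allowed-values restriction amounts to. It does not use the SOFA API: the Restriction class and its isSatisfiedBy() method are hypothetical names introduced only for this illustration, and SOFA's own checking is done by the OntoChecker class described later.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not a SOFA class.
final class Restriction {
    private final int min;             // minimum number of values
    private final int max;             // maximum number of values
    private final Set<Object> allowed; // null means "any value is allowed"

    Restriction(int min, int max, Collection<?> allowed) {
        this.min = min;
        this.max = max;
        this.allowed = (allowed == null) ? null : new HashSet<Object>(allowed);
    }

    // True if the values respect the cardinality bounds and the allowed set.
    boolean isSatisfiedBy(Collection<?> values) {
        if (values.size() < min || values.size() > max) return false;
        return allowed == null || allowed.containsAll(values);
    }

    public static void main(String[] args) {
        Restriction name = new Restriction(0, 1, null);
        Restriction color = new Restriction(1, 1, Arrays.asList("red", "green", "orange"));
        // Two names exceed the maximum cardinality of 1.
        System.out.println(name.isSatisfiedBy(Arrays.asList("Nenuial lake", "Evendim lake")));
        // "red" is one of the allowed colors; "blue" is not.
        System.out.println(color.isSatisfiedBy(Arrays.asList("red")));
        System.out.println(color.isSatisfiedBy(Arrays.asList("blue")));
    }
}
```

The values themselves can be stored regardless of the outcome; the check only reports whether the stored state is consistent with the restriction.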
Browsing an ontology
As you may have noticed, we always pass a string parameter when we create new ontology members. This is a unique identifier (ID) addressing the individual within an ontology. The SOFA Reference Implementation puts some conventional constraints on identifiers:
- Two or more Things with the same ID value may not exist within one ontology (in fact, this causes an exception at runtime).
- An ID value must not be an empty string (below, you will see why).
- An ID value must not start with two underscore ("_") characters, which are reserved for internal purposes.
- As IDs are used as parts of Uniform Resource Identifiers (URIs), they must contain only alphabetic characters, digits and characters from the string "_-!.~'()*" (as defined by the RFC 2396 specification).
Correct identifiers: Foo, 123, foo-and-bar, _foo123, foo.bar, o'kay!, (foo), ***FOO***
Wrong identifiers: Foo&Bar, __foo, #Hello, "Foo", 100%, /usr/bin, foo and bar
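The identifier rules listed above can be checked mechanically. The following standalone sketch is not part of the SOFA API: the IdRules class and isValidId() method are hypothetical, and "alphabetic characters" is assumed here to mean ASCII letters.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a SOFA class: checks the ID conventions listed above.
final class IdRules {
    // One or more ASCII letters, digits, or characters from _ - ! . ~ ' ( ) *
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_\\-!.~'()*]+");

    static boolean isValidId(String id) {
        if (id == null || id.isEmpty()) return false; // the empty ID is not allowed
        if (id.startsWith("__")) return false;        // reserved for internal purposes
        return ALLOWED.matcher(id).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Foo", "_foo123", "o'kay!", "***FOO***", "__foo", "Foo&Bar", "foo and bar"};
        for (String s : samples)
            System.out.println(s + " -> " + isValidId(s));
    }
}
```

The sketch does not check uniqueness within an ontology, since that depends on the ontology's current contents.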
We can use ID's to retrieve the Things from an ontology:
Thing t = onto.getThing("Eriador");
This method returns null if the given member is not found.
If we need to get the Concept or Relation
instances, we should use the special methods:
Concept c = onto.getConcept("Region");
Relation r = onto.getRelation("Name");
Two methods above will return null if a specified Thing is
found but is not a Concept or a Relation respectively. To be sure, we
can retrieve a Thing object and call the isConcept() or
isRelation() boolean methods:
Thing t = onto.getThing("FooBar");
if (t != null) {
    if (t.isConcept())
        System.out.println("FooBar is a Concept");
    else if (t.isRelation())
        System.out.println("FooBar is a Relation");
    else
        System.out.println("FooBar is a simple Thing");
}
If isConcept() or isRelation() returns true, we
can call the toConcept() or toRelation()
method to obtain the Concept or Relation object from the
Thing:
Thing t = onto.getThing("FooBar");
Concept c = null; Relation r = null;
if (t != null) {
    if (t.isConcept())
        c = t.toConcept();
    else if (t.isRelation())
        r = t.toRelation();
}
Note that we cannot resort to usual Java cast operation here.
There are three methods of the Ontology interface which
return iterators over the ontology members:
- things(): Returns an iterator over all members of this ontology.
- concepts(): Returns an iterator over all ontology concepts.
- relations(): Returns an iterator over all ontology relations.
We can use these methods to get a complete listing of our ontology:
System.out.println("All individuals:");
for (Iterator i = onto.things(); i.hasNext();)
    System.out.println("\t"+((Thing)i.next()).getId());
System.out.println("\nAll concepts:");
for (Iterator i = onto.concepts(); i.hasNext();)
    System.out.println("\t"+((Concept)i.next()).getId());
System.out.println("\nAll relations:");
for (Iterator i = onto.relations(); i.hasNext();)
    System.out.println("\t"+((Relation)i.next()).getId());
We will see the following output:
All individuals:
    LocatedIn
    Nenuial
    Region

    Eriador
    Name
    Geographical
    Point
    Annuminas
    NearTo

All concepts:
    Region
    Geographical
    Point

All relations:
    LocatedIn
    Name
    NearTo
Instead of iterators, we can also use the getThings(),
getConcepts() and getRelations() methods which return
Collections of individuals.
As you may have noticed, the result of the things() method
includes a strange Thing object with an empty string as its identifier.
This is a special Thing representing a recursive link to the ontology
itself. Indeed, an Ontology is a Thing too and it always contains an
instance of itself as a member. We can verify this with a
simple test:
Thing o = onto.getThing("");
if (o.isOntology()) {
Ontology onto2 = o.toOntology();
System.out.println(onto2.equals(onto));
}
Annotations
Along with relations that carry logical meaning, Things (and Ontologies) may keep annotation information for presentation and documentation purposes. Special methods are defined by the Thing interface to set this information:
- setLabel(String): A label is a string that provides a textual representation of a Thing (e.g. for labeling the Thing in user interfaces). The label value affects the string form of the Thing object: if a label is set, the toString() method returns the value of the label.
- setComment(String): A comment string may be defined for documentation purposes.
- setVersionInfo(String): Version info is a string containing a version description in an arbitrary format (CVS/RCS tags, for example). Applications can set and check the version information to be sure they use the right versions of ontologies.
The corresponding getter methods return the values of the annotation items, or empty strings if a specific item has not been set:
- String getLabel()
- String getComment()
- String getVersionInfo()
As we learned before, any Ontology is a Thing too, so we can set labels, comments and version info for Ontology instances as well:
onto.setLabel("My first ontology");
onto.setComment("This is my first ontology");
onto.setVersionInfo("1.0");
Concepts and instances
The concepts hierarchy. Multiple inheritance.
Let's extend our concepts structure:
Concept city = point.createSubConcept("City");
Concept lake = point.createSubConcept("Lake");
Concept state = region.createSubConcept("State");
Now we have the following concepts tree:
The subConcepts() method returns an iterator over the
subconcepts of the given concept. Its boolean argument specifies
whether indirect subconcepts should be returned in addition to direct ones:
for (Iterator i = geographical.subConcepts(false); i.hasNext();)
System.out.println(((Concept)i.next()).getId());
System.out.println("----");
for (Iterator i = geographical.subConcepts(true); i.hasNext();)
System.out.println(((Concept)i.next()).getId());
We will see the following output:
Region
Point
----
Lake
Region
State
City
Point
If the boolean argument is false, the subConcepts()
method iterates over the set of direct subconcepts only. Otherwise,
it recursively iterates over the direct subconcepts, their subconcepts and
so on, until the closed set of all subconcepts has been built.
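The recursive collection described above can be sketched without SOFA as a walk over a map of direct subconcepts. The ConceptTree class below is a hypothetical stand-in that uses plain strings for concepts; it is not how SOFA stores its hierarchy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not a SOFA class.
final class ConceptTree {
    private final Map<String, List<String>> direct = new HashMap<>();

    void addSubConcept(String parent, String child) {
        direct.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    // indirect == false: direct subconcepts only; true: the closed set of all subconcepts.
    Set<String> subConcepts(String concept, boolean indirect) {
        Set<String> result = new LinkedHashSet<>();
        collect(concept, indirect, result);
        return result;
    }

    private void collect(String concept, boolean indirect, Set<String> result) {
        for (String child : direct.getOrDefault(concept, Collections.emptyList())) {
            // add() returns false for a concept already seen (e.g. reached via two parents)
            if (result.add(child) && indirect) collect(child, true, result);
        }
    }

    // Builds the hierarchy used in this tutorial.
    static ConceptTree tutorialTree() {
        ConceptTree t = new ConceptTree();
        t.addSubConcept("Geographical", "Region");
        t.addSubConcept("Geographical", "Point");
        t.addSubConcept("Point", "City");
        t.addSubConcept("Point", "Lake");
        t.addSubConcept("Region", "State");
        return t;
    }

    public static void main(String[] args) {
        ConceptTree t = tutorialTree();
        System.out.println(t.subConcepts("Geographical", false));
        System.out.println(t.subConcepts("Geographical", true));
    }
}
```

The order of the elements in this sketch follows insertion order and need not match the order SOFA's iterators use.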
The superConcepts() method returns an iterator over all
superconcepts of the given concept. The boolean argument defines that
not only direct superconcepts should be returned (much as with
subConcepts() method):
for (Iterator i = city.superConcepts(true); i.hasNext();)
    System.out.println(((Concept)i.next()).getId());
returns:
__THING_CONCEPT
Geographical
Point
Don't be surprised by the first, strange-looking concept. This is a built-in
system concept which is the root of the SOFA concept hierarchy and an implicit
superconcept of any custom concept. In fact, "Geographical", as a
top-level concept, is a direct subconcept of "__THING_CONCEPT".
You can usually ignore it, but if you want to know more about those nuts
and bolts, please refer to the "Advanced techniques" section below.
The relations in the domain of a specific concept are the union of
the relations defined on it directly and the relations in the domains of
all its superconcepts. The definedRelations() method returns an iterator over
the relations in the domain:
for (Iterator i = city.definedRelations(true); i.hasNext();)
System.out.println(((Relation)i.next()).getId());
causes the output:
LocatedIn
Name
NearTo
__VERSIONINFO_REL
__INSTANCEOF_REL
__COMMENT_REL
__LABEL_REL
We used the true argument to get an iterator including the
relations in the domains of all superconcepts (including four system
relations in domain of "__THING_CONCEPT"). As there
are no relations defined on the "City" concept directly, we would have
an empty iterator if we had used the false argument.
Thus, if we need to define a common relation for a few distinct concepts, we have to define it on their nearest common superconcept. But sometimes this leads to logical inconsistencies. Suppose we want to add a common relation about inhabitants for "City" and "State" instances. Defining it in the domain of the "Geographical" superconcept is a bad idea, because we don't want this relation for lakes, rivers, mountains and many other geographical objects. In that case, multiple concept inheritance helps. We can create a separate "Residence" concept having this relation in its domain and add "City" and "State" as subconcepts of "Residence" (in addition to "Point" and "Region" respectively):
Concept residence = onto.createConcept("Residence");
Concept people = onto.createConcept("People");
Relation inhabitantPeople = residence.createRelation("InhabitantPeople");
inhabitantPeople.addRange(people);
residence.addSubConcept(city);
residence.addSubConcept(state);
We have got the following ontology:
We used the addSubConcept() method to add an existing
concept as a new subconcept. Alternatively, we could call the
addSuperConcept() method:
city.addSuperConcept(residence);
state.addSuperConcept(residence);
Instances. Multiple concept references.
If we need to get a list of instances of the given concept, we can use
the instances() method which returns an iterator over all
direct instances of this concept or over direct instances plus instances
of all its subconcepts (if the boolean argument is true):
for (Iterator i = region.instances(false); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
System.out.println("----");
for (Iterator i = geographical.instances(true); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
This produces the output:
Eriador
----
Eriador
Annuminas
Nenuial
The concepts() method returns an iterator over the concepts
referenced by a given instance. The boolean argument defines that all
superconcepts of the direct concepts will be included as well:
annuminas.removeConcept(point); annuminas.addConcept(city);
for (Iterator i = annuminas.concepts(false); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
System.out.println("----");
for (Iterator i = annuminas.concepts(true); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
causes:
City
----
City
Residence
Point
Geographical
__THING_CONCEPT
An instance can directly reference more than one ontology concept. Returning to the multiple inheritance case, we could use multiple concept references instead:
Concept residence = onto.createConcept("Residence");
Concept people = onto.createConcept("People");
Relation inhabitantPeople = residence.createRelation("InhabitantPeople");
inhabitantPeople.addRange(people);
annuminas = city.createInstance("Annuminas");
annuminas.addConcept(residence);
Now we have got the following concepts and instances structure:
The addConcept() method adds an instance to the specified
concept. In the last line of the code above we could call
residence.addInstance(annuminas) instead. As you may have noticed,
many methods have such inverse variants (addSubConcept()
and addSuperConcept(), addDomainConcept() and
addDefinedRelation(), etc.).
Relations. Reasoning and inferencing
An ontology is not a static information model. It uses explicitly stated initial facts to automatically infer sets of implied facts about the subjects of the ontology. We have already met some of these implications when exploring indirect sub- and superconcepts, instances, domain concepts and ranges. Now we will see how to use inference with our custom relations.
Relations generalization and subrelations
Like concepts, relations can be arranged in hierarchies. Each relation can be a subrelation of another, more general relation (its superrelation). A new subrelation inherits the domain concepts and ranges of its ancestor and can refine them for more specific needs. Relation generalization plays an important role in ontology inference because a statement made with a subrelation implies a set of implicit statements with all of its ancestors.
Suppose, for example, we need to add a new relation to the "Region" concept which declares that the given "Region" instance is a part of another. This relation is clearly a special case of "LocatedIn", since all parts of a region are located in it. Thus, to keep our ontology logically consistent, we would have to maintain and change these two relations in sync.
But we can avoid it, if we declare our new relation as a subrelation of the "LocatedIn":
Relation partOf = locatedIn.createSubRelation("PartOf");
partOf.setRange(region);
Thing arnor = state.createInstance("Arnor");
arnor.set(partOf, eriador);
The corresponding "LocatedIn" statement has been added automatically because
"PartOf" is a subrelation of "LocatedIn". To retrieve such implicit
relation values, we should use the list() method with two
arguments:
Collection results = arnor.list(locatedIn, Thing.INCLUDE_SUBPROPERTIES);
for (Iterator i = results.iterator(); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
This prints "Eriador". The second argument of
list() is a sum of predefined integer constants specifying the
types of implicit relations which will be included in the result. There
are two special constants:
- Thing.INCLUDE_ALL: Every possible implicit relation will be included.
- Thing.DIRECT_ONLY: No implicit relations will be included (as with the single-argument list() method).
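A common way to make constants combinable by summation is to give each one a distinct power of two. The sketch below illustrates that general technique only. The numeric values are assumptions made for this example, the InferenceFlags class is hypothetical, and the tutorial does not state the actual values of the Thing constants.

```java
// Hypothetical flag values for illustration; the real constants are defined
// by SOFA's Thing interface and their numeric values are not given here.
final class InferenceFlags {
    static final int DIRECT_ONLY = 0;
    static final int INCLUDE_SUBPROPERTIES = 1;
    static final int INCLUDE_TRANSITIVE = 2;
    static final int INCLUDE_SYMMETRIC = 4;
    static final int INCLUDE_INVERSED = 8;
    static final int INCLUDE_ALL =
        INCLUDE_SUBPROPERTIES + INCLUDE_TRANSITIVE + INCLUDE_SYMMETRIC + INCLUDE_INVERSED;

    // True if the combined value contains the given single flag.
    static boolean has(int flags, int flag) {
        return (flags & flag) != 0;
    }

    public static void main(String[] args) {
        int flags = INCLUDE_INVERSED + INCLUDE_TRANSITIVE;
        System.out.println(has(flags, INCLUDE_TRANSITIVE)); // requested
        System.out.println(has(flags, INCLUDE_SYMMETRIC));  // not requested
    }
}
```

Because each flag occupies its own bit, adding distinct flags never produces a value that collides with another combination.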
Transitive relations
Let's consider the following code:
Thing eriador = region.createInstance("Eriador");
Thing arnor = state.createInstance("Arnor");
Thing annuminas = city.createInstance("Annuminas");
annuminas.set(locatedIn, arnor);
arnor.set(locatedIn, eriador);
By common sense, if Annuminas is located in Arnor and Arnor is
located in Eriador, then Annuminas is located in Eriador as well. But
this obvious fact is not present in our ontology unless we
declare it explicitly: annuminas.set(locatedIn, eriador).
To avoid such redundant statements we can declare the "LocatedIn"
relation as transitive:
locatedIn.setTransitive(true);
annuminas.set(locatedIn, arnor);
arnor.set(locatedIn, eriador);
for (Iterator i = annuminas.list(locatedIn, Thing.INCLUDE_TRANSITIVE).iterator(); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
The output will be:
Arnor
Eriador
As you can see, the second value ("Eriador") has been added implicitly,
because "LocatedIn" is a transitive relation and we passed the
Thing.INCLUDE_TRANSITIVE constant as an argument to list().
In the diagram below, the implicit relation is shown as a dashed line:
If the relation P is transitive then (x P y) and (y P z) imply (x P z).
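The transitivity rule can be illustrated with a small standalone sketch that follows explicit statements until no new values appear. The TransitiveDemo class is hypothetical, uses plain strings instead of Things, and says nothing about how SOFA implements its inference.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of one transitive relation, not a SOFA class.
final class TransitiveDemo {
    // Explicit statements: subject -> directly stated values.
    private final Map<String, Set<String>> stated = new HashMap<>();

    void set(String subject, String value) {
        stated.computeIfAbsent(subject, k -> new LinkedHashSet<>()).add(value);
    }

    // All values reachable from the subject: (x P y) and (y P z) imply (x P z).
    Set<String> listTransitive(String subject) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(subject);
        while (!pending.isEmpty()) {
            String current = pending.remove();
            for (String next : stated.getOrDefault(current, Collections.emptySet())) {
                // Only follow values not seen before, so cycles terminate.
                if (result.add(next)) pending.add(next);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TransitiveDemo locatedIn = new TransitiveDemo();
        locatedIn.set("Annuminas", "Arnor");
        locatedIn.set("Arnor", "Eriador");
        System.out.println(locatedIn.listTransitive("Annuminas")); // [Arnor, Eriador]
    }
}
```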
Symmetric relations
Recall that we wrote:
annuminas.set(nearTo, nenuial);
nenuial.set(nearTo, annuminas);
Our common sense tells us again that the second set() is
excessive here. Indeed, if Annuminas city is near to Nenuial lake, then
this lake is near to that city as well. To make that implicit statement,
we should declare the "NearTo" relation as symmetric:
nearTo.setSymmetric(true);
annuminas.set(nearTo, nenuial);
System.out.println(nenuial.getId() + " is near to: ");
for (Iterator i = nenuial.list(nearTo, Thing.INCLUDE_SYMMETRIC).iterator(); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
We will see:
Nenuial is near to:
Annuminas
If the relation P is symmetric then (x P y) implies (y P x).
Inverse relations
Let's assume we need to get every geographic object located in a given region. The objects know the region where they are located (through the "LocatedIn" relation), but a region itself doesn't know which objects it contains. To make this possible, we can create a new relation on the "Region" concept and declare it as the inverse of the "LocatedIn" relation:
Relation hasLocation = region.createRelation("HasLocation");
hasLocation.setInverseOf(locatedIn, true);
arnor.set(locatedIn, eriador);
nenuial.set(locatedIn, eriador);
annuminas.set(locatedIn, arnor);
System.out.println("Eriador has the following locations:");
for (Iterator i = eriador.list(hasLocation,
Thing.INCLUDE_INVERSED + Thing.INCLUDE_TRANSITIVE).iterator(); i.hasNext();)
System.out.println(((Thing)i.next()).getId());
We will see:
Eriador has the following locations:
Arnor
Nenuial
Annuminas
Why is Annuminas there, if we declared that it is in Arnor? It is
because we declared the "LocatedIn" relation as transitive and used the
sum of the INCLUDE_INVERSED and INCLUDE_TRANSITIVE
constants as the list() argument (see the section on transitive
relations above). The following diagram illustrates the explicit and
implicit relationships between these instances:
The relation P is the inverse of the relation P' if (x P y) implies (y P' x). It follows that a relation which is the inverse of itself is a symmetric relation.
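The definition above, including the remark about symmetric relations, can be sketched over plain maps. The InverseDemo class and its method names are hypothetical and independent of the SOFA API; transitive expansion is left out to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not a SOFA class. A relation is a map: x -> { y | x P y }.
final class InverseDemo {
    // Records the explicit statement (x P y).
    static void state(Map<String, Set<String>> relation, String x, String y) {
        relation.computeIfAbsent(x, k -> new LinkedHashSet<>()).add(y);
    }

    // Builds P' such that (x P y) implies (y P' x).
    static Map<String, Set<String>> inverse(Map<String, Set<String>> relation) {
        Map<String, Set<String>> inv = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : relation.entrySet())
            for (String y : e.getValue())
                state(inv, y, e.getKey());
        return inv;
    }

    // A relation that equals its own inverse is symmetric.
    static boolean isSymmetric(Map<String, Set<String>> relation) {
        return relation.equals(inverse(relation));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> locatedIn = new LinkedHashMap<>();
        state(locatedIn, "Arnor", "Eriador");
        state(locatedIn, "Nenuial", "Eriador");
        state(locatedIn, "Annuminas", "Arnor");
        // Direct "HasLocation" values of Eriador, without transitive expansion.
        System.out.println(inverse(locatedIn).get("Eriador")); // [Arnor, Nenuial]
        System.out.println(isSymmetric(locatedIn));            // false
    }
}
```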
Ontology validation
Setting various ontology constraints (domain concepts for relations, ranges and restrictions for relation values) does not mean we are unable to break them. In fact, we can violate any constraint rule and SOFA will not even warn us about it. SOFA doesn't enforce ontology integrity and doesn't try to prevent "illegal" assignments. There are a few reasons for this behaviour:
- Constraint rules may be changed in future and an ontology will become inconsistent anyway.
- It would be impossible to implement an ontology development tool if SOFA did not permit an interim unstable state of an ontology.
- Testing each new assignment could decrease the performance (it may need long and complex inferencing to retrieve implicit constraint rules).
Instead of preventing ontology inconsistency, SOFA provides tools for checking the validity of separate assertions, Things and whole ontologies. Using these tools, a client application can test ontology integrity when needed and get information about possible integrity problems.
The org.semanticweb.sofa.validation package contains the
OntoChecker class which provides a few static methods for ontology
validation:
The test() method checks the validity of a single assertion on a
given Thing with a given Relation and value:
import org.semanticweb.sofa.validation.OntoChecker;
import org.semanticweb.sofa.validation.Problem;
...
int result = OntoChecker.test(thing, relation, value);
The method returns an integer value which is 0 if the assertion is
valid, or an error code greater than 0 otherwise. Error codes are represented as
constants of the Problem class:
- Problem.DOMAIN_ERROR: The Thing is not an instance of a concept having this relation in its domain.
- Problem.RANGE_ERROR: The type of the value is not allowed by the range of this relation.
- Problem.RESTRICTION_ERROR: The value is not allowed by a restriction set on a concept of the given Thing.
The testThing() method checks the validity of every statement of a
given Thing. It returns a set of Problem objects which
encapsulate information about each invalid statement:
Set problems = OntoChecker.testThing(thing);
if (!problems.isEmpty()) {
System.out.println(thing.getId()+ " has the following errors:");
for (Iterator i = problems.iterator(); i.hasNext();) {
Problem problem = (Problem)i.next();
switch (problem.getErrCode()) {
case Problem.DOMAIN_ERROR:
System.out.println("Domain error"); break;
case Problem.RANGE_ERROR:
System.out.println("Range error"); break;
case Problem.RESTRICTION_ERROR:
System.out.println("Restriction error"); break;
}
System.out.println("\tRelation: " + problem.getRelation().getId());
System.out.println("\tValue: " + problem.getValue());
System.out.println();
}
}
To check a whole ontology, OntoChecker has the
testOntology() method which returns a set of the Problem
objects. To get a Thing which has a given problem, the getThing()
method may be used:
Set problems = OntoChecker.testOntology(onto);
for (Iterator i = problems.iterator(); i.hasNext();) {
Problem problem = (Problem)i.next();
switch (problem.getErrCode()) {
case Problem.DOMAIN_ERROR:
System.out.println("Domain error"); break;
case Problem.RANGE_ERROR:
System.out.println("Range error"); break;
case Problem.RESTRICTION_ERROR:
System.out.println("Restriction error"); break;
}
System.out.println("\tThing: " + problem.getThing().getId());
System.out.println("\tRelation: " + problem.getRelation().getId());
System.out.println("\tValue: " + problem.getValue());
System.out.println();
}
Note that checking a whole ontology may be a time-consuming operation.
Interoperability of ontologies
In the SOFA Reference Implementation, the OntoConnector singleton class (
org.semanticweb.sofa.client.OntoConnector) serves as a central point
for communication between ontologies. Earlier in this tutorial, we
considered OntoConnector as a factory for creating new Ontology
instances. In this section, we will discuss its role in
interoperability and ontology management in detail.
Using multiple Ontology instances
In our applications, we can use more than one Ontology instance simultaneously. OntoConnector provides a mechanism for interoperability between ontology instances created during the same session. So, the Things of different ontologies are visible to each other, as shown in the example:
OntoConnector ontoConnector = OntoConnector.getInstance();
Ontology onto1 = ontoConnector.createOntology("http://example.com/onto1");
Ontology onto2 = ontoConnector.createOntology("http://example.com/onto2");
Concept concept1 = onto1.createConcept("Concept1");
Concept concept2 = onto2.createConcept("Concept2");
Thing thing = concept1.createInstance("Thing");
thing.addConcept(concept2);
The Thing in the example above is an instance of two concepts belonging
to different ontologies. We can test which ontology a Thing belongs to,
using the getOntology() method. This method returns a
parent ontology instance of a given Thing:
for (Iterator i = thing.concepts(false); i.hasNext();) {
Ontology o = ((Thing)i.next()).getOntology();
System.out.println(o.getNameSpace());
}
We will get:
http://example.com/onto1
http://example.com/onto2
This technique can be useful for multilevel ontology engineering with separate architectural layers for a reusable ontology schema (containing concept and relation definitions) and for collections of instances.
Retrieving ontologies and Things
If we know the namespace identifier of an existing ontology, we can retrieve that ontology from the OntoConnector pool:
Ontology o = ontoConnector.getOntology("http://example.com/onto1");
Moreover, we can retrieve a specific Thing with a known qualified identifier:
Thing t = ontoConnector.getThing("http://example.com/onto1#Foo");
The last method lets us pick Things directly from the OntoConnector, without creating an Ontology instance variable.
Managing ontologies in the OntoConnector pool
In fact, managing the ontologies means that we can create new Ontology
instances (what we already did before) and remove existing instances
from the OntoConnector pool. To remove an ontology, OntoConnector has
removeOntology() method:
OntoConnector ontoConnector = OntoConnector.getInstance();
ontoConnector.removeOntology("http://example.com/onto2");
All relations to the Things of the removed ontology become unresolved after this operation. However, those relations are not discarded: if we restore an ontology with the same namespace and Thing identifiers, the relations will work again.
To explore the OntoConnector pool we can use getNameSpaces()
method which returns a set of namespace values (java.net.URI
objects) of the ontologies currently existing in the pool. The example below
shows how to get a collection of all Ontology instances:
Vector ontologies = new Vector();
for (Iterator i = ontoConnector.getNameSpaces().iterator(); i.hasNext();) {
Ontology o = ontoConnector.getOntology((URI)i.next());
ontologies.add(o);
}
To check if an ontology with the specified namespace exists in the
OntoConnector pool we can use contains() method:
if (ontoConnector.contains(NS))
System.out.println("Ontology "+NS+" exists.");
Fetching ontologies on demand
NOTE: This feature is supported by SOFA 0.3 and higher.
When ontologies interoperate with each other across multiple sessions, the same set of ontologies must be restored in the OntoConnector pool to keep all external relations satisfied. The OntoConnector simplifies this process with an on-demand ontology fetching mechanism.
By default, the OntoConnector is in "client mode", which means that
it acts as a client which retrieves ontologies from
external sources and parses them into the pool automatically when
needed. In this mode, when an unsatisfied relation (i.e. one having no
target Thing in the OntoConnector pool) occurs, the OntoConnector asks
another singleton class called OntoClient (
org.semanticweb.sofa.client.OntoClient) to find and retrieve the
ontology. OntoClient interprets the given ontology namespace identifier
as the URL of an ontology resource and tries to fetch and parse it.
Let's consider this behavior with an example. Suppose we call:
Thing foo = ontoConnector.getThing("file:/home/ontologies/onto1.owl#Foo");
The algorithm of the OntoConnector actions in this case is as follows:
- If an ontology with the namespace identifier "file:/home/ontologies/onto1.owl" exists in the OntoConnector pool, the Thing with the identifier "Foo" is extracted and returned.
- Otherwise (no ontology with the given namespace exists), the OntoConnector calls the fetch() method of the OntoClient instance, passing the namespace URI as the argument of this method.
- The fetch() method tries to read a resource (a local file, in this case) from the URL "file:/home/ontologies/onto1.owl" and parse it into a new in-memory ontology model using the SOFA serialization mechanism (see the section called "Ontology representation"). If the URL contains a network protocol scheme ("http:"), OntoClient creates a network connection to download the specified remote resource.
- If the ontology resource is read and parsed successfully, the OntoConnector puts the fetched ontology instance into the pool. The requested Thing is extracted and returned.
To disable this behavior the "client mode" may be turned off:
ontoConnector.setClientModeOn(false);
If the "client mode" is off, the OntoConnector will not call the
OntoClient automatically and will return a null value if a
requested ontology instance doesn't exist. We can control the process
of ontology loading manually by explicitly calling the fetch()
method:
import java.net.URI;
import org.semanticweb.sofa.client.OntoClient;
...
OntoClient ontoClient = OntoClient.getInstance();
Ontology onto = null;
try {
onto = ontoClient.fetch(new URI("http://example.com/onto1"));
}
catch (Exception ex) {
System.err.println("Failed to fetch an ontology.");
ex.printStackTrace();
}
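The lookup steps and the "client mode" switch described in this section can be pictured with a small self-contained sketch. It does not use the SOFA classes: PoolSketch, Onto and the fetcher function are hypothetical stand-ins for the OntoConnector pool, an ontology and OntoClient.fetch(), and the real implementation may well differ in detail.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for an ontology pool; this is not the SOFA OntoConnector. */
class PoolSketch {
    /** Hypothetical stand-in for an ontology: local identifier -> thing. */
    static final class Onto {
        final Map<String, Object> things = new HashMap<>();
    }

    private final Map<URI, Onto> pool = new HashMap<>();
    private final Function<URI, Onto> fetcher; // plays the role of OntoClient.fetch()
    private boolean clientModeOn = true;       // on by default, as in the text

    PoolSketch(Function<URI, Onto> fetcher) {
        this.fetcher = fetcher;
    }

    void setClientModeOn(boolean on) {
        clientModeOn = on;
    }

    boolean contains(URI namespace) {
        return pool.containsKey(namespace);
    }

    /** Resolves "namespace#id", fetching the ontology first when client mode allows it. */
    Object getThing(String qualifiedId) {
        int hash = qualifiedId.lastIndexOf('#');
        if (hash < 0) {
            return null; // not a qualified identifier
        }
        URI namespace = URI.create(qualifiedId.substring(0, hash));
        String localId = qualifiedId.substring(hash + 1);

        Onto onto = pool.get(namespace);          // step 1: look in the pool
        if (onto == null && clientModeOn) {
            onto = fetcher.apply(namespace);      // steps 2-3: fetch and parse
            if (onto != null) {
                pool.put(namespace, onto);        // step 4: keep it in the pool
            }
        }
        return onto == null ? null : onto.things.get(localId);
    }
}
```

In this sketch, the first getThing() call for an unknown namespace invokes the fetcher once and later calls are served from the pool. With client mode off, the same call simply returns null.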
URI aliases
In real life, the ontology namespace identifiers rarely match the real
locations of the ontology resources. The OntoClient lets us set a
mapping between namespace identifiers and the real URLs of the resources
with the setAlias() static method:
OntoClient.setAlias("http://example.com/onto1", "file:/home/ontologies/onto1.owl");
After setting this alias, a request to fetch an ontology with the
"http://example.com/onto1" namespace will cause the local file at the
specified path to be read and parsed.
Aliases are set globally and remain active until the session is finished. To
reuse them, it is possible to get all aliases as a
java.util.Properties object:
Properties allAliases = OntoClient.aliases();
The aliases may be stored as an ordinary Java "properties" file and
restored with the setAliases() method:
Properties aliases = new Properties();
aliases.load(inputStream);
OntoClient.setAliases(aliases);
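The round trip through a "properties" file can be tried with the JDK alone. The sketch below is only an illustration under assumptions: it supposes that each property key is a namespace identifier and each value is the resource URL (the layout of the Properties object is not spelled out above), and the class AliasFileSketch is hypothetical. It writes to a string rather than a file so that it is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper showing a store/load round trip of alias mappings. */
class AliasFileSketch {
    /** Serializes the mappings in the standard Java properties text format. */
    static String store(Properties aliases) {
        StringWriter out = new StringWriter();
        try {
            aliases.store(out, "namespace identifier = resource URL (assumed layout)");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Parses the mappings back from properties-format text. */
    static Properties load(String text) {
        Properties aliases = new Properties();
        try {
            aliases.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return aliases;
    }

    public static void main(String[] args) {
        Properties aliases = new Properties();
        aliases.setProperty("http://example.com/onto1", "file:/home/ontologies/onto1.owl");

        String text = store(aliases);
        System.out.println(text); // ':' characters are written in escaped form

        Properties restored = load(text);
        System.out.println(restored.getProperty("http://example.com/onto1"));
    }
}
```

Note that Properties.store() escapes the ':' characters of the URIs, so a hand-written aliases file should escape them in keys as well.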
Ontology readers
When parsing an external resource, the OntoClient selects an appropriate SOFA reader depending on the content type of the resource. Two content types are supported by default:
- text/xml (the synonyms application/xml and application/xml+rdf are supported as well): the OWLReader class (org.semanticweb.sofa.serialize.owl.OWLReader) is used for parsing the resources.
- text/plain: the N3Reader class (org.semanticweb.sofa.serialize.n3.N3Reader) is used for parsing the resources.
New readers for other resource types can be implemented and registered
with the OntoClient by calling the registerReader() method:
import org.semanticweb.sofa.serialize.Reader;
...
Reader reader = new MyCustomReader();
OntoClient.registerReader("text/x-mycustomtype", reader);
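Selecting a reader by content type amounts to a lookup table keyed by the MIME type. The sketch below illustrates the idea only: ReaderRegistrySketch and OntoReader are hypothetical names, and stripping parameters such as "; charset=UTF-8" and ignoring case are assumptions of this sketch, not documented OntoClient behavior.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical content-type registry; not the SOFA OntoClient. */
class ReaderRegistrySketch {
    /** Hypothetical reader abstraction standing in for a SOFA reader. */
    interface OntoReader {
        String name();
    }

    private final Map<String, OntoReader> readers = new HashMap<>();

    /** Associates a reader with a content type, replacing any earlier one. */
    void register(String contentType, OntoReader reader) {
        readers.put(normalize(contentType), reader);
    }

    /** Returns the reader registered for the content type, or null if none. */
    OntoReader select(String contentType) {
        return readers.get(normalize(contentType));
    }

    /** Drops parameters after ';' and lower-cases the bare type (an assumption). */
    static String normalize(String contentType) {
        int semi = contentType.indexOf(';');
        String bare = semi < 0 ? contentType : contentType.substring(0, semi);
        return bare.trim().toLowerCase(Locale.ROOT);
    }
}
```

Registering the same reader under several keys is one simple way to model synonyms such as text/xml and application/xml.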
Advanced techniques
System ontology, metaconcepts and relation concepts
The SOFA Reference Implementation is based on the built-in System Ontology, which provides a metavocabulary for defining all custom ontologies. Generally, we don't need to address the System Ontology directly, because it underlies the implementation of the SOFA interfaces and is hidden behind their methods. However, there are special cases that require such advanced techniques.
Metaconcepts
Every ontology concept is a Thing, so it may have relation statements
and be an instance of other concepts. Such "concepts of concepts"
are called "metaconcepts". Every custom concept in the SOFA
Reference Implementation is an instance of the common system
metaconcept. This metaconcept provides the basic domain of the concept
relations (such as "hasSubConcept", "hasRelation", "hasRestriction",
etc.) masked by the Concept interface methods (see the
diagram above). To extend the built-in concept functionality (e.g. to define
new concept relations) we can create our own metaconcepts by
creating subconcepts of the system one:
import org.semanticweb.sofa.vocabulary.SOFA;
import org.semanticweb.sofa.vocabulary.Vocabulary;
...
Vocabulary sysVocabulary = SOFA.getVocabulary();
Concept sysMetaConcept = sysVocabulary.concept_Concept();
Concept myMetaConcept = sysMetaConcept.createSubConcept("MyMetaConcept");
Suppose we need to define a new relation (e.g., the number of instances) on the concepts that will be instances of our metaconcept:
Relation numOfInstances = myMetaConcept.createRelation("NumOfInstances");
numOfInstances.addRange(Integer.class);
Now we can instantiate new concepts with our metaconcept and set our new relation on them:
Concept concept1 = myMetaConcept.createInstance("Concept1").toConcept();
concept1.set(numOfInstances, new Integer(0));
Of course, we have to maintain the value of the instance count manually. To
make it automatic, we could extend the Concept interface
implementation class and override the addInstance(),
createInstance() and removeInstance() methods. But
such techniques are beyond the scope of this document.
Relation concepts
Every relation in the SOFA Reference Implementation is an instance of the "Relation" System Ontology concept. To instantiate relations of a new type, we can create our own subconcept of the system one:
Concept sysRelationConcept = sysVocabulary.relation_Concept();
Concept myRelationConcept = sysRelationConcept.createSubConcept("MyRelationConcept");
Relation prop1 = myRelationConcept.createInstance("Prop1").toRelation();
We can then define new relations in the domain of our relation concept to extend the existing functionality of the relation instances.
Pluggable ontology storage model implementations. Using relational databases for ontology models
The SOFA Reference Implementation has a pluggable ontology
storage model mechanism. A specific ontology model is an implementation of
the OntologyModel interface of the org.semanticweb.sofa.model
package. The default implementation provides an in-memory storage
model backed by the Java Collections framework. To create a new ontology with
a custom model, we need to pass the OntologyModel instance as an
argument of the OntoConnector.createOntology() method:
OntologyModel model = new MyCustomOntologyModel();
OntoConnector ontoConnector = OntoConnector.getInstance();
Ontology onto = ontoConnector.createOntology(model, "http://example.com/ontology");
The single-argument createOntology() method creates an
ontology with the default in-memory model.
The optional org.semanticweb.sofa.model.rdb package contains an
ontology model implementation for storing ontologies in a relational
database using the JDBC API. To instantiate an RDB ontology
model we have to get a JDBC Connection instance and pass it as an
argument of the OntologyModel implementation constructor:
import org.semanticweb.sofa.model.OntologyModel;
import org.semanticweb.sofa.model.rdb.OntologyRDBModel;
import java.sql.*;
...
OntologyModel rdbModel = null;
Connection conn = null;
try {
Class.forName("com.mysql.jdbc.Driver").newInstance();
conn = DriverManager.getConnection("jdbc:mysql://localhost/dummy?user=root");
rdbModel = new OntologyRDBModel(conn, "table1");
}
catch(Exception ex) {
ex.printStackTrace();
}
Ontology onto = OntoConnector.getInstance().createOntology(rdbModel, "http://example.com/ontology");
The second argument of the OntologyRDBModel constructor is
the name of the database table which will be used for storing this ontology
model. If the given table does not exist, it will be created. Otherwise,
createOntology() will return the existing ontology
stored in that database table.
Ontology representation
For storage or transfer purposes an ontology can be represented in an external textual form (that is, serialized).
Ontology representation is reversible, i.e. an ontology can be serialized to an external format and then restored from this format. This mechanism also makes it possible to use existing ontologies and provides interoperability with external agents.
The SOFA ontology model is independent of specific languages, but it can be interpreted in terms of any language expressive enough to describe that model. As the SOFA model is conceptually consistent with the semantics of the W3C Web Ontology Language (OWL, positioned as an industry standard for ontology representation), the model can be represented entirely in this language's syntax. The same is largely true of DAML+OIL (the predecessor of OWL and still the most popular ontology definition language).
The Ontology Serialization package (
org.semanticweb.sofa.serialize.* ) includes the modules that
serialize the ontology model to specific languages and
restore it from a serialized form. It supports the following ontology
languages:
- W3C OWL (Web Ontology Language).
- DAML+OIL: provides reversible serialization with high reliability.
- RDF + RDF-Schema: provides reversible serialization of the main aspects of the SOFA ontology model.
- N-Triples: provides reversible serialization of the internal structure of the SOFA model.
OWL (Web Ontology Language)
The org.semanticweb.sofa.serialize.owl package contains the
serializer and deserializer classes for a limited subset of the W3C
Web Ontology Language (OWL). It guarantees lossless, reversible
serialization of the SOFA ontology model.
The OWLWriter and OWLReader classes serialize
and deserialize an ontology as an OWL document:
import org.semanticweb.sofa.serialize.owl.*;
...
OWLWriter.getWriter().write(onto, "file:///ontology.owl");
OWLReader.getReader().read(onto, "file:///ontology.owl");
DAML+OIL
The org.semanticweb.sofa.serialize.daml package contains
the serializer and deserializer for a limited subset of the DAML+OIL
language. It provides reversible serialization with high reliability.
The only known inconsistency is the lack of a symmetric relation
attribute in the DAML syntax.
The write() method of DAMLWriter and the read() method of DAMLReader serialize and deserialize an ontology as a DAML+OIL document:
import org.semanticweb.sofa.serialize.daml.*;
...
DAMLWriter.getWriter().write(onto, "file:///ontology.daml");
DAMLReader.getReader().read(onto, "file:///ontology.daml");
RDF+RDF-Schema
The org.semanticweb.sofa.serialize.rdfs package contains
the serializer and deserializer for a limited subset of the W3C
RDF-Schema language. This language has no syntactic features to describe
restrictions or transitive, symmetric and inverse relations, so all
these aspects will be ignored.
The write() method of RDFSWriter and the read() method of RDFSReader serialize and deserialize an ontology as an RDF+RDF-Schema document:
import org.semanticweb.sofa.serialize.rdfs.*;
...
RDFSWriter.getWriter().write(onto, "file:///ontology.rdfs");
RDFSReader.getReader().read(onto, "file:///ontology.rdfs");
Because any RDF ontology may be considered a special case of an OWL ontology, the OWLReader may be used for reading RDF-Schema as well.
N-Triples serializer
Since the 0.3 release, SOFA includes an Ontology Model serialization mechanism based on the plain-text N-Triples syntax. It works directly on the low-level storage model, so it is many times faster than the other serializers (which analyze the high-level ontology constructs in order to express them in the terms of a specific language). In fact, it is approximately 20 times faster for writing and 10 times faster for reading the same Ontology Model than the XML-based (OWL, DAML and RDF) serializers. Because the resulting N-Triples text is an exact snapshot of the low-level storage model, parsing it back is guaranteed to produce exactly the same model. This makes it well suited to persisting in-memory models between sessions.
The N-Triples serializer is used in the same way as the other serializers:
import org.semanticweb.sofa.serialize.n3.*;
...
N3Writer.getWriter().write(onto, "file:testonto.nt");
N3Reader.getReader().read(onto, "file:testonto.nt");
The drawback of the N-Triples serializer is that, unlike the other serializers, it relies on the SOFA Reference Implementation, so it will probably not work with other implementations of the SOFA interfaces.
Java datatypes mapping
The SOFA serialization package provides a transparent, reversible representation of datatype relation values in a textual format. The serialization of Java objects is based on a mapping of predefined Java classes to XML-Schema datatypes, as shown in the table below:
| Java class | XML-Schema datatype | String representation form | Parseable formats |
|---|---|---|---|
| java.lang.String | http://www.w3.org/2001/XMLSchema#string | as is | Any string |
| java.lang.Boolean | http://www.w3.org/2001/XMLSchema#boolean | "true" or "false" | "true", "false", "1" or "0" |
| java.lang.Integer | http://www.w3.org/2001/XMLSchema#int | Integer.toString() | String form of a number, suitable for the Integer.decode() method |
| java.lang.Long | http://www.w3.org/2001/XMLSchema#long | Long.toString() | String form of a number, suitable for the Long.decode() method |
| java.lang.Short | http://www.w3.org/2001/XMLSchema#short | Short.toString() | String form of a number, suitable for the Short.decode() method |
| java.lang.Byte | http://www.w3.org/2001/XMLSchema#byte | Byte.toString() | String form of a number, suitable for the Byte.decode() method |
| java.lang.Float | http://www.w3.org/2001/XMLSchema#float | Float.toString() | String form of a number, suitable for the Float.valueOf() method |
| java.lang.Double | http://www.w3.org/2001/XMLSchema#double | Double.toString() | String form of a number, suitable for the Double.valueOf() method |
| java.util.Date | http://www.w3.org/2001/XMLSchema#dateTime | Date and time in ISO 8601 format | |
| java.net.URI | http://www.w3.org/2001/XMLSchema#anyURI | A URI string as defined by RFC 2396 (URI.toString()) | A URI string suitable for the URI.create() method |
Ten basic Java classes have built-in support in the ontology
serializers and deserializers. Objects of other Java classes are
represented in a Base64-encoded binary serialized form. Their datatype
range is represented as a URN with a special format, "java:classname",
where "classname" is the qualified Java class name (for
instance, "java:java.awt.image.BufferedImage").
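To make the mapping more concrete, here is a minimal sketch that uses only the JDK. It parses a few of the lexical forms from the table with the methods named there, and it shows one possible Base64 encoding of a serialized object. The class DatatypeSketch and its methods are hypothetical, and the exact byte layout that SOFA writes for non-basic classes is an assumption, not something confirmed by this document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Base64;

/** Hypothetical illustration of the datatype mapping; not the SOFA serializer. */
class DatatypeSketch {
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    /** Parses a lexical value for a handful of the datatypes listed in the table. */
    static Object parse(String datatype, String text) {
        if (!datatype.startsWith(XSD)) {
            throw new IllegalArgumentException("Unsupported datatype: " + datatype);
        }
        switch (datatype.substring(XSD.length())) {
            case "string": return text;
            case "boolean":
                if (text.equals("true") || text.equals("1")) return Boolean.TRUE;
                if (text.equals("false") || text.equals("0")) return Boolean.FALSE;
                throw new IllegalArgumentException("Not a boolean: " + text);
            case "int":    return Integer.decode(text);
            case "long":   return Long.decode(text);
            case "double": return Double.valueOf(text);
            case "anyURI": return URI.create(text);
            default:
                throw new IllegalArgumentException("Unsupported datatype: " + datatype);
        }
    }

    /** Builds the "java:classname" datatype name used for other classes. */
    static String javaDatatype(Class<?> type) {
        return "java:" + type.getName();
    }

    /** Java-serializes an object and Base64-encodes the bytes (assumed layout). */
    static String encode(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Reverses encode(): Base64-decodes the text and deserializes the object. */
    static Object decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because Integer.decode() and its siblings accept hexadecimal and octal prefixes, a value such as "0x1F" is a parseable int under this scheme, although the serialized form produced by Integer.toString() is always decimal.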
Visualising an ontology with the DOTWriter
The org.semanticweb.sofa.serialize.visual package contains
a special class DOTWriter which represents an ontology as a
directed graph described by Graphviz DOT language syntax:
import org.semanticweb.sofa.serialize.visual.DOTWriter;
...
DOTWriter.getWriter().write(onto, "file:///ontology.dot");
The result can be transformed into PostScript, SVG, PNG, GIF, JPEG and other graphic formats using the DOT interpreter, which is part of the AT&T Graphviz package. Graphviz can be obtained for free from http://www.research.att.com/sw/tools/graphviz
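For readers unfamiliar with the DOT language, the sketch below shows what a directed graph of a concept hierarchy looks like in that syntax. It is an illustration only: DotSketch is a hypothetical class, and the graph that the SOFA DOTWriter actually produces (node styles, relation edges, labels) is not described here and may look quite different.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical DOT generator for a concept hierarchy; not the SOFA DOTWriter. */
class DotSketch {
    /** Wraps an identifier in double quotes, escaping backslashes and quotes. */
    static String quote(String id) {
        return "\"" + id.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders a superconcept -> subconcepts map as a Graphviz digraph. */
    static String toDot(Map<String, List<String>> subConcepts) {
        StringBuilder dot = new StringBuilder("digraph ontology {\n");
        for (Map.Entry<String, List<String>> entry : subConcepts.entrySet()) {
            for (String sub : entry.getValue()) {
                dot.append("  ").append(quote(entry.getKey()))
                   .append(" -> ").append(quote(sub)).append(";\n");
            }
        }
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        // The small geographical hierarchy from the beginning of this tutorial.
        Map<String, List<String>> hierarchy = new LinkedHashMap<>();
        hierarchy.put("Geographical", Arrays.asList("Region", "Point"));
        System.out.print(toDot(hierarchy));
    }
}
```

A DOT file can then be rendered with the Graphviz command-line tool, for example dot -Tpng ontology.dot -o ontology.png.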
$Id: index.html,v 1.2 2005/03/09 06:56:58 alexeya Exp $
