Posted to users@jena.apache.org by Panagiotis Papadakos <pa...@gmail.com> on 2013/01/23 19:46:28 UTC

listInstances OntClass problem

Hi.


Assume the attached RDF file. This is a simple hierarchy

Manufacturer -> Europe (subclass) -> Germany (subclass) -> Audi (subclass)
-> Instance,

which has been created using inference for subclass and type from Jena.

The problem I have is that if I get the root class (in this example,
Manufacturer) and call the listInstances(true) method, I get an empty set
(I assume that I should get Manufacturer, since it has a direct instance).

Furthermore, if I use the transitive reduction of the above and no
inference in the model, and this time call listInstances(false), I again
get an empty result (again, I assume that I should get the Manufacturer
class).

Finally, I am really interested in very fast access from classes or
subclasses to instances, and from instances to classes and subclasses.
Can you give me any hints?

Thank you!

import java.io.InputStream;
import java.util.HashSet;
import java.util.Set;

// Package names are those of Jena 2.x; later releases use org.apache.jena.*
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.util.FileManager;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;

public class JenaTest {
    private OntModel model;
    // Set the file name up front so the error messages below can report it.
    private String inputFileName = "/home/mythos/Virtuoso/WISE_Demo_Data/data/TestInference.rdf";

    public JenaTest() {

        // Create a simple RDFS++ Reasoner.
        //StringBuilder sb = new StringBuilder();

        //sb.append("[rdfs9:   (?x rdfs:subClassOf ?y), (?a rdf:type ?x) -> (?a rdf:type ?y)] ");
        //sb.append("[rdfs11:  (?x rdfs:subClassOf ?y), (?y rdfs:subClassOf ?z) -> (?x rdfs:subClassOf ?z)] ");

        //Reasoner reasoner = new GenericRuleReasoner(Rule.parseRules(sb.toString()));

        // an OWL/RDF model that performs no reasoning at all:
        model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);

        //Model base = ModelFactory.createDefaultModel();
        //model = ModelFactory.createOntologyModel(OntModelSpec.RDFS_MEM, ModelFactory.createInfModel(reasoner, base));

        // using the FileManager to find the input file:
        InputStream in = FileManager.get().open(inputFileName);

        if (in == null) {
            System.out.println("File: \"" + inputFileName + "\" not found");
            return; // nothing to read; leave the model empty
        }

        // read the RDF/XML file
        try {
            model.read(in, "");
        } catch (Exception e) {
            System.err.println("Could not read file \"" + inputFileName + "\". It's either malformed or corrupted\n");
        }
    }

    // Local names of the hierarchy root classes that have at least one instance.
    public Set<String> getAllTopClassesWithInstances() {
        Set<String> top = new HashSet<String>();

        ExtendedIterator<OntClass> it = model.listHierarchyRootClasses();
        while (it.hasNext()) {
            OntClass cl = it.next();
            if (cl.listInstances(false).hasNext()) {
                top.add(cl.getLocalName());
            }
        }

        return top;
    }

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        JenaTest test = new JenaTest();
        Set<String> top = test.getAllTopClassesWithInstances();
        System.out.println(top);
    }
}

-- 

Panagiotis Papadakos,


PhD Student

Computer Science Department

University of Crete

http://www.csd.uoc.gr/~papadako

Re: listInstances OntClass problem

Posted by Ian Dickinson <ia...@epimorphics.com>.
Hi Panagiotis,

On 23/01/13 18:46, Panagiotis Papadakos wrote:
> Assume the attached RDF file. This is a simple
>
> Manufacturer -> Europe (subclass) -> Germany (subclass) -> Audi
> (subclass) -> Instance,
>
> which has been created using inference for subclass and type from jena.
>
> The problem that I have is that if I get the root class (in this example
> Manufacturer)
>
> and call the listInstances(true) method, I get an empty set (I assume
> that I should
>
> get Manufacturer, since it has a direct instance).
I think you have several confusions here.  You are right that, in your 
ontology, Manufacturer is the root class. However I think your modelling 
is confused, because Germany is not a sub-class of Manufacturer. If it 
was, then every instance of a Germany would also be an instance of a 
Manufacturer. That may possibly be true in some domain, but I would 
think that it is unlikely, and it certainly doesn't really match reality.

> Furthermore, if I use the transitive reduction of the above code and use
> no inference
>
> in the model, the listInstances(false) method this time, again I get an
> empty result
>
> (I assume that again I should get the Manufacturer class).
The problem here is that you are mixing ontology languages. You have 
declared instances of rdfs:Class, but then are using an ontology model 
with the OWL profile. So without the inference engine's help, you don't 
get any results from listHierarchyRootClasses(), which in turn gives you 
no instances in the output.

The answer to this is to split up your code so that each method only has 
a single responsibility, and write JUnit tests for each one. Then you 
would see when your expectations (this hierarchy has some root classes) 
are violated (but the code isn't finding them).

> Finally, I am really interested in very fast access from classes or
> subclasses to instances and from instances to classes and
> subclasses. Can you provide me any hints?
My hint would be: get it working, before you worry about getting it 
working quickly.

http://c2.com/cgi/wiki?PrematureOptimization

When you have the core of your application running, use a profiler to 
find out where it runs slowly, and then optimize those bits. It almost 
certainly won't be where you expect. If necessary, add some performance 
tests to your unit and functional test suite ... which will of course be 
quite extensive by then :)

There were some other problems with your code, so I've attached a 
fixed-up version that addresses some of the problems and shows you that 
you can use either inference or an RDFS model.

Regards,
Ian


-- 
____________________________________________________________
Ian Dickinson                   Epimorphics Ltd, Bristol, UK
mailto:ian@epimorphics.com        http://www.epimorphics.com
cell: +44-7786-850536              landline: +44-1275-399069
------------------------------------------------------------
Epimorphics Ltd.  is a limited company registered in England
(no. 7016688). Registered address: Court Lodge, 105 High St,
               Portishead, Bristol BS20 6PT, UK


Re: listInstances OntClass problem

Posted by Dave Reynolds <da...@gmail.com>.
Have you tried with OntModelSpec.RDFS_MEM?

You seem to be using an OWL spec but the types in your sample ontology 
are declared as rdfs:Class not owl:Class.

Dave



Re: listInstances OntClass problem

Posted by Panagiotis Papadakos <pa...@gmail.com>.
Thanks for the reply Dave.

It seems that Fig. 5 of jena.apache.org/documentation/ontology/index.html
explains what you describe in your email. I still believe, though, that
the javadocs should be clearer about this.

Thanks again

Regards
Papadakos Panagiotis


On Tue, Jan 29, 2013 at 3:24 PM, Dave Reynolds <da...@gmail.com>wrote:

> On 24/01/13 13:08, Panagiotis Papadakos wrote:
>
>> Ian and Dave, thank you both for your help.
>>
>> I didn't post the correct code and I am sorry for this.
>>
>> Regarding the ontology, I know it is not correct.
>> Maybe changing Europe to European, Germany to German, etc. would be
>> better.
>>
>> Now regarding the listInstances method, I still believe something is
>> wrong either in the code, in the API or in my way of thinking.
>>
>> listInstances is supposed to return the instances, either direct or
>> instances of its subclasses. Unfortunately if I use a simple RDF_MEM
>> model with no inference, listInstances(false) for the Manufacturer class
>> returns no result. Somehow I feel this is wrong.
>> I was thinking that internally, since there is no inference, jena should
>> visit each subclass, and the subclasses of them, etc. getting the direct
>> instances of each one and returning all the instances of the class and
>> its subclasses. Is this correct?
>>
>
> No.
>
> The notion is that reasoning is the job of the reasoner and that the
> OntAPI provides convenient access to that, but doesn't duplicate it. There
> are a few special cases but in general if you want reasoning then configure
> a reasoner.
>
>
>  Now regarding listInstances(true), I am supposing that it should return
>> all direct instances of the class, even if these instances are also
>> instances of a subclass (which for example can happen if I load the
>> TestInference.rdf file).
>>
>
> No. That's the point of direct, as it says in the javadoc setting
> direct=true means "excluding sub-classes of this class".
>
> If something is also an instance of a subclass of C then it is not a
> direct instance of C and should not be returned by listInstances(true).
>
> Dave
>
>


-- 
http://www.flickr.com/photos/papadako

Re: listInstances OntClass problem

Posted by Dave Reynolds <da...@gmail.com>.
On 24/01/13 13:08, Panagiotis Papadakos wrote:
> Ian and Dave, thank you both for your help.
>
> I didn't post the correct code and I am sorry for this.
>
> Regarding the ontology, I know it is not correct.
> Maybe changing Europe to European, Germany to German, etc. would be better.
>
> Now regarding the listInstances method, I still believe something is
> wrong either in the code, in the API or in my way of thinking.
>
> listInstances is supposed to return the instances, either direct or
> instances of its subclasses. Unfortunately if I use a simple RDF_MEM
> model with no inference, listInstances(false) for the Manufacturer class
> returns no result. Somehow I feel this is wrong.
> I was thinking that internally, since there is no inference, jena should
> visit each subclass, and the subclasses of them, etc. getting the direct
> instances of each one and returning all the instances of the class and
> its subclasses. Is this correct?

No.

The notion is that reasoning is the job of the reasoner and that the 
OntAPI provides convenient access to that, but doesn't duplicate it. 
There are a few special cases but in general if you want reasoning then 
configure a reasoner.
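
To make "the job of the reasoner" concrete, here is a toy sketch in plain Java (deliberately not Jena code) that applies the two rules from the commented-out lines of the original post, rdfs9 and rdfs11, until nothing new is derived. The class name ToyRdfsClosure and its helper methods are hypothetical, and the data is only a guess at the shape of the hierarchy described in the thread. It shows that Manufacturer has no instances in the asserted triples and gains one only after the rules have run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only: plain Java collections, not the Jena API.
class ToyRdfsClosure {
    static final String TYPE = "rdf:type";
    static final String SUB = "rdfs:subClassOf";

    // A triple is a 3-element list [subject, predicate, object].
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    // Apply rdfs9 and rdfs11 repeatedly until no new triple is derived.
    static Set<List<String>> closure(Set<List<String>> asserted) {
        Set<List<String>> all = new HashSet<>(asserted);
        boolean changed = true;
        while (changed) {
            List<List<String>> derived = new ArrayList<>();
            for (List<String> a : all) {
                if (!a.get(1).equals(SUB)) continue; // a is (?x rdfs:subClassOf ?y)
                for (List<String> b : all) {
                    // rdfs9: b is (?a rdf:type ?x), so derive (?a rdf:type ?y)
                    if (b.get(1).equals(TYPE) && b.get(2).equals(a.get(0))) {
                        derived.add(triple(b.get(0), TYPE, a.get(2)));
                    }
                    // rdfs11: b is (?y rdfs:subClassOf ?z), so derive (?x rdfs:subClassOf ?z)
                    if (b.get(1).equals(SUB) && b.get(0).equals(a.get(2))) {
                        derived.add(triple(a.get(0), SUB, b.get(2)));
                    }
                }
            }
            changed = all.addAll(derived);
        }
        return all;
    }

    // Subjects that have an explicit (subject rdf:type cls) triple in the set.
    static Set<String> instancesOf(Set<List<String>> triples, String cls) {
        Set<String> out = new TreeSet<>();
        for (List<String> t : triples) {
            if (t.get(1).equals(TYPE) && t.get(2).equals(cls)) out.add(t.get(0));
        }
        return out;
    }

    public static void main(String[] args) {
        Set<List<String>> asserted = new HashSet<>();
        asserted.add(triple("Europe", SUB, "Manufacturer"));
        asserted.add(triple("Germany", SUB, "Europe"));
        asserted.add(triple("Audi", SUB, "Germany"));
        asserted.add(triple("ID21_Car_Audi_A3", TYPE, "Audi"));

        System.out.println(instancesOf(asserted, "Manufacturer"));          // []
        System.out.println(instancesOf(closure(asserted), "Manufacturer")); // [ID21_Car_Audi_A3]
    }
}
```

The first lookup corresponds to asking a model with no inference; the second to asking one where a reasoner has already added the entailed rdf:type triples.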

> Now regarding listInstances(true), I am supposing that it should return
> all direct instances of the class, even if these instances are also
> instances of a subclass (which for example can happen if I load the
> TestInference.rdf file).

No. That's the point of direct, as it says in the javadoc setting 
direct=true means "excluding sub-classes of this class".

If something is also an instance of a subclass of C then it is not a 
direct instance of C and should not be returned by listInstances(true).
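
One way to read the rule above is the following toy model in plain Java. It is not Jena's implementation; the class ToyDirectInstances, its fields, and the data (modelled on how the two attached files are described in this thread) are all assumptions made for illustration. An individual counts as a direct instance of a class only if it is typed with that class and with none of its strict subclasses.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only: plain Java collections, not the Jena API.
class ToyDirectInstances {
    // class -> all of its strict subclasses (assumed already transitively closed)
    final Map<String, Set<String>> subclassesOf = new HashMap<>();
    // individual -> every class it is typed with in the data
    final Map<String, Set<String>> typesOf = new HashMap<>();

    Set<String> listInstances(String cls, boolean direct) {
        Set<String> subs = subclassesOf.getOrDefault(cls, Collections.<String>emptySet());
        Set<String> out = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : typesOf.entrySet()) {
            Set<String> types = e.getValue();
            if (!types.contains(cls)) continue;
            // true when the individual is also typed with some subclass of cls
            boolean inSubclass = !Collections.disjoint(types, subs);
            if (!direct || !inSubclass) out.add(e.getKey());
        }
        return out;
    }

    static Set<String> setOf(String... xs) {
        return new HashSet<>(Arrays.asList(xs));
    }

    public static void main(String[] args) {
        ToyDirectInstances m = new ToyDirectInstances();
        m.subclassesOf.put("Manufacturer", setOf("Europe", "Germany", "Audi"));
        m.subclassesOf.put("Europe", setOf("Germany", "Audi"));
        m.subclassesOf.put("Germany", setOf("Audi"));

        // As described for TestInference.rdf: inferred types are stored with the instance.
        m.typesOf.put("ID21_Car_Audi_A3", setOf("Manufacturer", "Europe", "Germany", "Audi"));
        System.out.println(m.listInstances("Manufacturer", false)); // [ID21_Car_Audi_A3]
        System.out.println(m.listInstances("Manufacturer", true));  // []
        System.out.println(m.listInstances("Audi", true));          // [ID21_Car_Audi_A3]

        // As described for Test_Super.rdf: the instance is typed as Manufacturer only.
        m.typesOf.put("ID21_Car_Audi_A3", setOf("Manufacturer"));
        System.out.println(m.listInstances("Manufacturer", true));  // [ID21_Car_Audi_A3]
    }
}
```

Under this reading, both observations reported in the thread are the expected results rather than a bug.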

Dave


Re: Re: listInstances OntClass problem

Posted by Panagiotis Papadakos <pa...@gmail.com>.
Thanks for the reply Chris.

Can somebody please describe what is the expected behaviour of
listInstances as explained in my previous email?

Thank you.

Regards Panagiotis




On Thu, Jan 24, 2013 at 3:46 PM, Chris Dollin
<ch...@epimorphics.com>wrote:

> On Thursday, January 24, 2013 03:08:35 PM Panagiotis Papadakos wrote:
>
> > Finally, regarding  performance,I would like to know if jena internally
> > uses indices that can provide me fast access from classes to instances
> and
> > from instances to classes. Else I probably will have to implement them.
>
> (a) Don't assume in advance that there will be a performance problem
>     that's the one you expect.
>
> (b) listStatements(null, RDF.type, MyClass) will deliver you an iterator
>     over all the statements which say that the subject is an instance
>     of MyClass.
>
>     Implementations of Model, or rather, the underlying Graph, will
>     (typically) have indexes that make this reasonably fast. The built-in
>     memory graph has S, P, and O indexes; TDB has SP, PO and SO
>     (I think) indexes as well.
>
>     And you can run listStatements the other way to find classes
>     given their instances.
>
> Chris
>
> --
> "I know it was late, but Mountjoy never bothers,                /Archer's
> Goon/
>  so long as it's the full two thousand words."
>
> Epimorphics Ltd, http://www.epimorphics.com
> Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20
> 6PT
> Epimorphics Ltd. is a limited company registered in England (number
> 7016688)
>
>


-- 
http://www.flickr.com/photos/papadako

Re: Re: listInstances OntClass problem

Posted by Chris Dollin <ch...@epimorphics.com>.
On Thursday, January 24, 2013 03:08:35 PM Panagiotis Papadakos wrote:

> Finally, regarding  performance,I would like to know if jena internally
> uses indices that can provide me fast access from classes to instances and
> from instances to classes. Else I probably will have to implement them.

(a) Don't assume in advance that any performance problem you run into
    will be the one you expect.

(b) listStatements(null, RDF.type, MyClass) will deliver you an iterator
    over all the statements which say that the subject is an instance
    of MyClass.

    Implementations of Model, or rather, the underlying Graph, will
    (typically) have indexes that make this reasonably fast. The built-in
    memory graph has S, P, and O indexes; TDB has SP, PO and SO
    (I think) indexes as well.

    And you can run listStatements the other way to find classes
    given their instances.
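
    As a rough picture of why such lookups can be cheap, here is a toy
    triple store in plain Java with one hash index per position. It is a
    sketch only, not how Jena's memory graph is actually written; the
    class ToyTripleIndex and its find method are hypothetical, with null
    acting as a wildcard in the same spirit as the listStatements call
    above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: plain Java collections, not the Jena API.
class ToyTripleIndex {
    private final List<String[]> all = new ArrayList<>();
    private final Map<String, List<String[]>> bySubject = new HashMap<>();
    private final Map<String, List<String[]>> byPredicate = new HashMap<>();
    private final Map<String, List<String[]>> byObject = new HashMap<>();

    void add(String s, String p, String o) {
        String[] t = {s, p, o};
        all.add(t);
        bySubject.computeIfAbsent(s, k -> new ArrayList<>()).add(t);
        byPredicate.computeIfAbsent(p, k -> new ArrayList<>()).add(t);
        byObject.computeIfAbsent(o, k -> new ArrayList<>()).add(t);
    }

    // null is a wildcard. Pick one index to narrow the candidates,
    // then filter the candidates on the remaining positions.
    List<String[]> find(String s, String p, String o) {
        List<String[]> candidates;
        if (s != null) {
            candidates = bySubject.getOrDefault(s, Collections.<String[]>emptyList());
        } else if (o != null) {
            candidates = byObject.getOrDefault(o, Collections.<String[]>emptyList());
        } else if (p != null) {
            candidates = byPredicate.getOrDefault(p, Collections.<String[]>emptyList());
        } else {
            candidates = all;
        }
        List<String[]> out = new ArrayList<>();
        for (String[] t : candidates) {
            if ((s == null || t[0].equals(s))
                    && (p == null || t[1].equals(p))
                    && (o == null || t[2].equals(o))) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ToyTripleIndex g = new ToyTripleIndex();
        g.add("ID21_Car_Audi_A3", "rdf:type", "Audi");
        g.add("ID21_Car_Audi_A3", "rdf:type", "Manufacturer");
        g.add("Audi", "rdfs:subClassOf", "Germany");

        // class -> instances: prints ID21_Car_Audi_A3
        for (String[] t : g.find(null, "rdf:type", "Audi")) System.out.println(t[0]);
        // instance -> classes: prints Audi, then Manufacturer
        for (String[] t : g.find("ID21_Car_Audi_A3", "rdf:type", null)) System.out.println(t[2]);
    }
}
```

    Either direction touches only the triples sharing the given object or
    subject, rather than scanning the whole store.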

Chris

-- 
"I know it was late, but Mountjoy never bothers,                /Archer's Goon/
 so long as it's the full two thousand words."

Epimorphics Ltd, http://www.epimorphics.com
Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
Epimorphics Ltd. is a limited company registered in England (number 7016688)


Re: listInstances OntClass problem

Posted by Panagiotis Papadakos <pa...@gmail.com>.
Ian and Dave, thank you both for your help.

I didn't post the correct code and I am sorry for this.

Regarding the ontology, I know it is not correct.
Maybe changing Europe to European, Germany to German, etc. would be better.

Now regarding the listInstances method, I still believe something is wrong
either in the code, in the API or in my way of thinking.

listInstances is supposed to return the instances, either direct or
instances of its subclasses. Unfortunately, if I use a simple RDFS_MEM model
with no inference, listInstances(false) for the Manufacturer class returns
no result. Somehow I feel this is wrong.
I was thinking that internally, since there is no inference, jena should
visit each subclass, and the subclasses of them, etc. getting the direct
instances of each one and returning all the instances of the class and its
subclasses. Is this correct?

Now regarding listInstances(true), I am supposing that it should return all
direct instances of the class, even if these instances are also instances
of a subclass (which, for example, can happen if I load the
TestInference.rdf file). But it seems this is not the case. For example, no
instance is returned for Manufacturer, although one exists
(ID21_Car_Audi_A3...). The problem seems to be that this instance is also an
instance of the subclasses. If I limit ID21_Car_Audi_A3 to be an instance of
Manufacturer only (as in the attached Test_Super.rdf), then it is returned
correctly by listInstances(true).

Finally, regarding performance, I would like to know whether Jena
internally uses indexes that give fast access from classes to instances
and from instances to classes. Otherwise I will probably have to implement
them myself.

Please, keep in mind that I am a real newbie to RDF....

Thank you very much for your time

Regards
Papadakos Panagiotis





-- 
http://www.flickr.com/photos/papadako