Posted to dev@jena.apache.org by Adam Jacobs <ja...@hotmail.com> on 2017/12/17 16:57:06 UTC

DatasetFactory.createGeneral()

I've noticed a couple of differences between DatasetFactory.create() and DatasetFactory.createGeneral().
My understanding is that the former is intended to create deep copies of its graphs whereas the latter maintains shallow links.
The DatasetFactory has quite a few creational methods, and their documentation tends to be brief; but a couple of behaviors stood out to me.

     1. In the general dataset, containsNamedModel() always returns false for the default graph.
          This seems unexpected whether from the perspective of copies or links.
     2. In the general dataset, addNamedModel() appears to perform the same function as replaceNamedModel().
          This is understandable for simplicity; but from the memory dataset perspective, a combined view of identically-named graphs would be expected.
          ~ Fun with words: can we say that "identically-named graphs" are homonymous?  ;) ~

Following is some code demonstrating the two points above. Are these behaviors intentional?

    import org.apache.jena.query.Dataset;
    import org.apache.jena.query.DatasetFactory;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.sparql.core.Quad;

    // Class name is illustrative only.
    public class DatasetFactoryDemo {
        public static void main(String... args) {
            Model m1 = ModelFactory.createDefaultModel();
            m1.add(m1.createResource(), m1.createProperty("foo"), m1.createResource());
            Model m2 = ModelFactory.createDefaultModel();
            m2.add(m2.createResource(), m2.createProperty("bar"), m2.createResource());

            Dataset memory = DatasetFactory.create();
            Dataset general = DatasetFactory.createGeneral();

            memory.getDefaultModel().add(m1);
            general.getDefaultModel().add(m1);
            memory.addNamedModel("bar", m2);
            general.addNamedModel("bar", m2);

            //Memory dataset reports the default graph as a named model. General dataset does not.
            System.out.println(memory.containsNamedModel(Quad.defaultGraphIRI.getURI()));  //true
            System.out.println(general.containsNamedModel(Quad.defaultGraphIRI.getURI()));  //false

            memory.addNamedModel("bar", ModelFactory.createDefaultModel());
            general.addNamedModel("bar", ModelFactory.createDefaultModel());

            //Memory dataset add == merge. General dataset add == replace.
            System.out.println(memory.getNamedModel("bar"));  //merge
            System.out.println(general.getNamedModel("bar"));  //replace
        }
    }

Re: DatasetFactory.createGeneral()

Posted by Andy Seaborne <an...@apache.org>.
Hi Adam,

Thanks for the details.

On 17/12/17 16:57, Adam Jacobs wrote:
> I've noticed a couple of differences between DatasetFactory.create() and DatasetFactory.createGeneral().
> My understanding is that the former is intended to create deep copies of its graphs whereas the latter maintains shallow links.
> The DatasetFactory has quite a few creational methods, and their documentation tends to be brief; but a couple of behaviors stood out to me.
> 
>       1. In the general dataset, containsNamedModel() always returns false for the default graph.
>            This seems unexpected whether from the perspective of copies or links.
>       2. In the general dataset, addNamedModel() appears to perform the same function as replaceNamedModel().
>            This is understandable for simplicity; but from the memory dataset perspective, a combined view of identically-named graphs would be expected.
>            ~ Fun with words: can we say that "identically-named graphs" are homonymous?  ;) ~

Very dangerous territory to talk about "naming" and named graphs :-)

> 
> Following is some code demonstrating the two points above. Are these behaviors intentional?
> 
>      public static void main(String... args) {
>          Model m1 = ModelFactory.createDefaultModel();
>          m1.add(m1.createResource(), m1.createProperty("foo"), m1.createResource());
>          Model m2 = ModelFactory.createDefaultModel();
>          m2.add(m2.createResource(), m2.createProperty("bar"), m2.createResource());

See also DatasetFactory.createTxnMem()

> 
>          Dataset memory = DatasetFactory.create();
>          Dataset general = DatasetFactory.createGeneral();
> 
>          memory.getDefaultModel().add(m1);
>          general.getDefaultModel().add(m1);
>          memory.addNamedModel("bar", m2);
>          general.addNamedModel("bar", m2);
> 
>          //Memory model contains default graph. General model does not.
>          System.out.println(memory.containsNamedModel(Quad.defaultGraphIRI.getURI()));  //true
>          System.out.println(general.containsNamedModel(Quad.defaultGraphIRI.getURI()));  //false

Bug and looks fixable.
Please file a JIRA

Ditto Quad.unionGraph.

> 
>          memory.addNamedModel("bar", ModelFactory.createDefaultModel());
>          general.addNamedModel("bar", ModelFactory.createDefaultModel());
> 
>          //Memory model add == merge. General model add == replace.
>          System.out.println(memory.getNamedModel("bar"));  //merge
>          System.out.println(general.getNamedModel("bar"));  //replace

"Feature" (depending on how you want to look at it) and poor documentation.

This is a consequence of the use case for "general" for linking to 
graphs. It replaces a link because the graph may be an inference one so 
merging into the existing one does not work.  HDT is another case 
needing "general" - and the graphs aren't updateable.

In my view, "general", and its links design, should be downplayed in 
favour of a storage dataset (TIM) as the default.  But we need a 
migration strategy to get there; "general" is behind ja:RDFDataset. We 
failed to start the migration to the in-memory transactional one at this 
release.

     Andy

>      }
>
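
The link-versus-copy distinction described in the reply can be sketched without Jena at all. The following is a toy model under stated assumptions, not Jena's implementation: a "graph" is just a Set<String> of triples, and the class names LinkVsCopy, LinkDataset and CopyDataset are hypothetical. It only illustrates why a dataset that links to caller-supplied graphs ends up replacing on add, while a dataset that owns its storage can merge.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are Jena classes.
public class LinkVsCopy {

    /** Link-style dataset: keeps a reference to the caller's graph. */
    static class LinkDataset {
        private final Map<String, Set<String>> graphs = new HashMap<>();

        void addNamed(String name, Set<String> graph) {
            // The graph already stored under this name may be read-only or
            // computed (e.g. inference), so it cannot be merged into.
            // Re-pointing the name at the new graph is the only safe option.
            graphs.put(name, graph);
        }

        Set<String> getNamed(String name) {
            return graphs.get(name);
        }
    }

    /** Copy-style dataset: owns its storage and copies triples in. */
    static class CopyDataset {
        private final Map<String, Set<String>> graphs = new HashMap<>();

        void addNamed(String name, Set<String> graph) {
            // The dataset owns the stored set, so adding merges into it.
            graphs.computeIfAbsent(name, k -> new HashSet<>()).addAll(graph);
        }

        Set<String> getNamed(String name) {
            return graphs.get(name);
        }
    }

    public static void main(String[] args) {
        Set<String> m2 = new HashSet<>();
        m2.add("_:a <bar> _:b");

        LinkDataset general = new LinkDataset();
        CopyDataset memory = new CopyDataset();
        general.addNamed("bar", m2);
        memory.addNamed("bar", m2);

        // Add an empty graph under the same name to both datasets.
        general.addNamed("bar", new HashSet<>());
        memory.addNamed("bar", new HashSet<>());

        System.out.println(general.getNamed("bar").size()); // 0: link replaced
        System.out.println(memory.getNamedModelSize());
    }
}
```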