Posted to dev@jena.apache.org by Andy Seaborne <an...@apache.org> on 2015/01/24 19:34:43 UTC

Jena3 : core/arq split

[[
oaj = org.apache.jena
chhj = com.hp.hpl.jena
]]

One major possible change target is the core/arq split.

Much of this comes down to where quads/datasets go in the package tree.
They started as a SPARQL (1.0) feature but are now RDF 1.1 and parser
related.

The general idea is to move dataset/quad support to core, move parsers to
core (separate into their own package later??) and have jena-arq be
SPARQL only.

The question is how much change to go through to achieve that.

Possibility 1 : Less change

Move DatasetGraph* to oaj.dataset.*

API visible:

Migrate Dataset from chhj.query.Dataset to oaj.rdf.dataset (cf.
oaj.rdf.model)

Move DatasetGraph and Quad to oaj.dataset (cf. oaj.graph)

Try to leave an indirection class at chhj.query.Dataset somehow.


Possibility 2 : More change, more disruption (but one time)

Pull oaj.rdf.model up to oaj.rdf and put Dataset there.  This is the 
"RDF API".

Use oaj.graph for DatasetGraph and Quad.

Hmm - actually writing this down, I am tending towards possibility 2 if 
that works as cleanly as it sounds.

	Andy
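
To make the cost of Possibility 2 concrete, here is a minimal sketch of what the import churn might look like for user code. It is only an illustration: the class name ImportMigrator and the target names under org.apache.jena.rdf and org.apache.jena.graph are assumptions taken from the proposal above, not the layout of any released Jena version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Rewrites Jena 2 import lines to the layout sketched as Possibility 2 (hypothetical). */
final class ImportMigrator {

    // Old name -> proposed name. Keys ending in '.' are package prefixes,
    // other keys are exact class names. Order matters: specific entries first.
    private static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("com.hp.hpl.jena.query.Dataset", "org.apache.jena.rdf.Dataset");
        RENAMES.put("com.hp.hpl.jena.sparql.core.DatasetGraph", "org.apache.jena.graph.DatasetGraph");
        RENAMES.put("com.hp.hpl.jena.sparql.core.Quad", "org.apache.jena.graph.Quad");
        RENAMES.put("com.hp.hpl.jena.rdf.model.", "org.apache.jena.rdf.");
        RENAMES.put("com.hp.hpl.jena.", "org.apache.jena.");
    }

    /** Returns the migrated import line, or the line unchanged if no rule applies. */
    static String migrate(String line) {
        String trimmed = line.trim();
        if (!trimmed.startsWith("import ") || !trimmed.endsWith(";")) {
            return line; // not a plain import statement
        }
        // Static imports keep their "static " prefix here and so match no rule.
        String name = trimmed.substring("import ".length(), trimmed.length() - 1).trim();
        for (Map.Entry<String, String> rule : RENAMES.entrySet()) {
            String from = rule.getKey();
            if (from.endsWith(".")) {
                if (name.startsWith(from)) {
                    return "import " + rule.getValue() + name.substring(from.length()) + ";";
                }
            } else if (name.equals(from)) {
                return "import " + rule.getValue() + ";";
            }
        }
        return line;
    }

    public static void main(String[] args) {
        System.out.println(migrate("import com.hp.hpl.jena.query.Dataset;"));
        System.out.println(migrate("import com.hp.hpl.jena.rdf.model.Model;"));
        System.out.println(migrate("import com.hp.hpl.jena.sparql.core.Quad;"));
    }
}
```

Under Possibility 1 the table would instead point at oaj.rdf.dataset and oaj.dataset, and the rdf.model rule would become a plain chhj to oaj prefix swap.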


Re: Jena3 : core/arq split

Posted by Stian Soiland-Reyes <st...@apache.org>.
+1 to all, are you reading my thoughts? ;+)
On 26 Jan 2015 19:28, "Rob Vesse" <rv...@dotnetrdf.org> wrote:

> Andy
>
> I would prefer proposal two; Jena 3 will be disruptive regardless (if only
> because of the time people spend updating import statements).  A few other
> more minor changes to import statements and POM definitions wouldn't be
> too big of a deal IMHO.
>
> I would be strongly against leaving old package names with redirects, since
> that only encourages people not to bother migrating code properly and
> simply to update the version in the POM, unaware that other changes
> have happened (e.g. RDF 1.1).  A one-time disruptive
> migration forward to Jena 3 that makes me actually have to consider the
> impact of the migration on my existing code is strongly preferable to a
> staggered migration.
>
> In that vein I would suggest that the IO components be moved into their
> own package (jena-riot I assume?) at the same time; again the principle is
> to make people take a single larger disruptive migration rather than
> requiring many smaller migrations.  If Core needs to have some way of
> wiring in IO automatically then I suggest we do it via the Java 7+
> ServiceLoader mechanism.  I'm already using it a little in the Elephas IO
> modules and it works pretty nicely, and I would be willing to help get this
> set up for Jena 3 IO as necessary.
>
> I suppose the IO wiring comes back to the question of whether Model.read()
> and Model.write() are still relevant or if we force everyone over to using
> RDFDataMgr (which would be my preference), since the IO module has to rely
> on Core anyway for the relevant data model APIs, and having Core somehow
> rely on IO is an ugly circular dependency (or gets us into the same
> problems we have now).  Of course the alternative solution to that is to
> have the Resource API also broken out into its own module so that Core
> really is only the core low-level data structures.
>
> With regards to packaging if people are using higher level POM artifacts
> like apache-jena-libs then the module changes should remain fairly
> transparent to them.
>
> Rob

Re: Jena3 : core/arq split

Posted by Claude Warren <cl...@xenei.com>.
+1 for option 2 as well.  Let's get all the pain done at once.

On Mon, Jan 26, 2015 at 10:57 PM, Rob Vesse <rv...@dotnetrdf.org> wrote:

> Comments inline:
>
> On 26/01/2015 14:12, "Stian Soiland-Reyes" <st...@apache.org> wrote:
>
> >If we move out jena-riot, what is the gain? It relies on jena-core, and the
> >core kind of needs read/write for everyday use. Core is not abstract like
> >the Commons RDF API.
>
> Well, the real "core" is.  The basic interfaces and classes, i.e. Node,
> Triple, Graph, DatasetGraph, Dataset, are fairly self-contained and
> relatively abstract.  If we are talking about the Model, Resource,
> Ontology API then those are a lot more complex.
>
> It's also perfectly possible to use these APIs without ever needing any IO
> (though perhaps unusual).
>
> >
> >Could we at least call it jena-io if it goes solo? I know it also does
> >streaming, but don't make it too hard to find ;-).
> >
> >Just today there was an email on one of the LOD lists where someone bailed
> >out of Jena because it needed 4 jena-* JARs to do a remote SPARQL query.
> >("the whole Jena stack"). How people survive without dependency management
> >is beyond me, but not everyone is in Maven land :-).
>
> If they think Jena is bad (23 distinct modules) clearly they haven't seen
> the list of Sesame artifacts lately (78 distinct modules) ;)
>
> Side note:  This sort of thing makes me both laugh and cry.  Users want a
> user-friendly domain-specific API but then balk as soon as they realise
> that it means actually needing more than one library (because apparently
> modularisation is bad practice in the minds of end users).  Like you say,
> if you are a serious developer, how you get by without using any kind of
> proper build/package management tool really blows my mind.
>
> >
> >I can however see one compelling argument for putting RIOT as a new module
> >- if we are able to make both Core and ARQ work without it, and it also can
> >reduce the list of external dependencies for users of those (e.g. avoid
> >jsonld-java, thrift, httpclient?)
>
> Yes reducing unnecessary dependencies for those that don't need them is
> always valuable
>
> Rob


-- 
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren

Re: Jena3 : core/arq split

Posted by Stian Soiland-Reyes <st...@apache.org>.
The jena-osgi.jar is also such an uberjar :)

Don't tell anyone!

On 27 January 2015 at 13:12, Andy Seaborne <an...@apache.org> wrote:
> On 26/01/15 22:57, Rob Vesse wrote:
>>
>> Comments inline:
>>
>> On 26/01/2015 14:12, "Stian Soiland-Reyes" <st...@apache.org> wrote:
>>
>>> If we move out jena-riot, what is the gain? It relies on jena-core, and the
>>> core kind of needs read/write for everyday use. Core is not abstract like
>>> the Commons RDF API.
>>
>>
>> Well, the real "core" is.  The basic interfaces and classes, i.e. Node,
>> Triple, Graph, DatasetGraph, Dataset, are fairly self-contained and
>> relatively abstract.  If we are talking about the Model, Resource,
>> Ontology API then those are a lot more complex.
>>
>> It's also perfectly possible to use these APIs without ever needing any IO
>> (though perhaps unusual).
>>
>>>
>>> Could we at least call it jena-io if it goes solo? I know it also does
>>> streaming, but don't make it too hard to find ;-).
>>>
>>> Just today there was an email on one of the LOD lists where someone bailed
>>> out of Jena because it needed 4 jena-* JARs to do a remote SPARQL query.
>>> ("the whole Jena stack"). How people survive without dependency management
>>> is beyond me, but not everyone is in Maven land :-).
>>
>>
>> If they think Jena is bad (23 distinct modules) clearly they haven't seen
>> the list of Sesame artifacts lately (78 distinct modules) ;)
>
>
> We could produce an uber jar of iri/core/arq/tdb.
>
> Of course we already have an uber jar + dependencies - it's called Fuseki!
> "java -cp fusekijar commandline" is so convenient working on remote servers.
>
>> Side note:  This sort of thing makes me both laugh and cry.  Users want a
>> user-friendly domain-specific API but then balk as soon as they realise
>> that it means actually needing more than one library (because apparently
>> modularisation is bad practice in the minds of end users).  Like you say,
>> if you are a serious developer, how you get by without using any kind of
>> proper build/package management tool really blows my mind.
>
>
> Or using classpath "lib/*".
>
>         Andy
>
>
>>
>>>
>>> I can however see one compelling argument for putting RIOT as a new module
>>> - if we are able to make both Core and ARQ work without it, and it also can
>>> reduce the list of external dependencies for users of those (e.g. avoid
>>> jsonld-java, thrift, httpclient?)
>>
>>
>> Yes reducing unnecessary dependencies for those that don't need them is
>> always valuable
>>
>> Rob



-- 
Stian Soiland-Reyes
Apache Taverna (incubating)
http://orcid.org/0000-0001-9842-9718

Re: Jena3 : core/arq split

Posted by Andy Seaborne <an...@apache.org>.
On 26/01/15 22:57, Rob Vesse wrote:
> Comments inline:
>
> On 26/01/2015 14:12, "Stian Soiland-Reyes" <st...@apache.org> wrote:
>
>> If we move out jena-riot, what is the gain? It relies on jena-core, and the
>> core kind of needs read/write for everyday use. Core is not abstract like
>> the Commons RDF API.
>
> Well, the real "core" is.  The basic interfaces and classes, i.e. Node,
> Triple, Graph, DatasetGraph, Dataset, are fairly self-contained and
> relatively abstract.  If we are talking about the Model, Resource,
> Ontology API then those are a lot more complex.
>
> It's also perfectly possible to use these APIs without ever needing any IO
> (though perhaps unusual).
>
>>
>> Could we at least call it jena-io if it goes solo? I know it also does
>> streaming, but don't make it too hard to find ;-).
>>
>> Just today there was an email on one of the LOD lists where someone bailed
>> out of Jena because it needed 4 jena-* JARs to do a remote SPARQL query.
>> ("the whole Jena stack"). How people survive without dependency management
>> is beyond me, but not everyone is in Maven land :-).
>
> If they think Jena is bad (23 distinct modules) clearly they haven't seen
> the list of Sesame artifacts lately (78 distinct modules) ;)

We could produce an uber jar of iri/core/arq/tdb.

Of course we already have an uber jar + dependencies - it's called 
Fuseki!  "java -cp fusekijar commandline" is so convenient working on 
remote servers.

> Side note:  This sort of thing makes me both laugh and cry.  Users want a
> user-friendly domain-specific API but then balk as soon as they realise
> that it means actually needing more than one library (because apparently
> modularisation is bad practice in the minds of end users).  Like you say,
> if you are a serious developer, how you get by without using any kind of
> proper build/package management tool really blows my mind.

Or using classpath "lib/*".

	Andy

>
>>
>> I can however see one compelling argument for putting RIOT as a new module
>> - if we are able to make both Core and ARQ work without it, and it also can
>> reduce the list of external dependencies for users of those (e.g. avoid
>> jsonld-java, thrift, httpclient?)
>
> Yes reducing unnecessary dependencies for those that don't need them is
> always valuable
>
> Rob


Re: Jena3 : core/arq split

Posted by Rob Vesse <rv...@dotnetrdf.org>.
Comments inline:

On 26/01/2015 14:12, "Stian Soiland-Reyes" <st...@apache.org> wrote:

>If we move out jena-riot, what is the gain? It relies on jena-core, and the
>core kind of needs read/write for everyday use. Core is not abstract like
>the Commons RDF API.

Well, the real "core" is.  The basic interfaces and classes, i.e. Node,
Triple, Graph, DatasetGraph, Dataset, are fairly self-contained and
relatively abstract.  If we are talking about the Model, Resource,
Ontology API then those are a lot more complex.

It's also perfectly possible to use these APIs without ever needing any IO
(though perhaps unusual).

>
>Could we at least call it jena-io if it goes solo? I know it also does
>streaming, but don't make it too hard to find ;-).
>
>Just today there was an email on one of the LOD lists where someone bailed
>out of Jena because it needed 4 jena-* JARs to do a remote SPARQL query.
>("the whole Jena stack"). How people survive without dependency management
>is beyond me, but not everyone is in Maven land :-).

If they think Jena is bad (23 distinct modules) clearly they haven't seen
the list of Sesame artifacts lately (78 distinct modules) ;)

Side note:  This sort of thing makes me both laugh and cry.  Users want a
user-friendly domain-specific API but then balk as soon as they realise
that it means actually needing more than one library (because apparently
modularisation is bad practice in the minds of end users).  Like you say,
if you are a serious developer, how you get by without using any kind of
proper build/package management tool really blows my mind.

>
>I can however see one compelling argument for putting RIOT as a new module
>- if we are able to make both Core and ARQ work without it, and it also can
>reduce the list of external dependencies for users of those (e.g. avoid
>jsonld-java, thrift, httpclient?)

Yes reducing unnecessary dependencies for those that don't need them is
always valuable

Rob






Re: Jena3 : core/arq split

Posted by Stian Soiland-Reyes <st...@apache.org>.
If we move out jena-riot, what is the gain? It relies on jena-core, and the
core kind of needs read/write for everyday use. Core is not abstract like
the Commons RDF API.

Could we at least call it jena-io if it goes solo? I know it also does
streaming, but don't make it too hard to find ;-).

Just today there was an email on one of the LOD lists where someone bailed
out of Jena because it needed 4 jena-* JARs to do a remote SPARQL query.
("the whole Jena stack"). How people survive without dependency management
is beyond me, but not everyone is in Maven land :-).

I can however see one compelling argument for putting RIOT as a new module
- if we are able to make both Core and ARQ work without it, and it also can
reduce the list of external dependencies for users of those (e.g. avoid
jsonld-java, thrift, httpclient?)
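
For illustration, one way Core could run without an IO module and still pick one up when it is on the classpath is the ServiceLoader wiring Rob suggested in this thread. The sketch below uses only the JDK. RdfReaderProvider and ReaderRegistry are hypothetical names invented for this example and are not Jena interfaces.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical SPI that an IO module (e.g. RIOT) would implement; not a Jena type. */
interface RdfReaderProvider {
    /** True if this provider can parse the named syntax, e.g. "TURTLE". */
    boolean supports(String syntaxName);

    /** Parses the stream; a real SPI would also take a sink for the parsed triples. */
    void read(InputStream in, String baseUri);
}

/** Core-side lookup: compiles and runs with no IO module present. */
final class ReaderRegistry {

    /** Collects every provider registered under META-INF/services on the classpath. */
    static List<RdfReaderProvider> discover() {
        List<RdfReaderProvider> found = new ArrayList<>();
        for (RdfReaderProvider provider : ServiceLoader.load(RdfReaderProvider.class)) {
            found.add(provider);
        }
        return found;
    }

    /** Returns the first provider supporting the syntax, or null if none is installed. */
    static RdfReaderProvider forSyntax(String syntaxName) {
        for (RdfReaderProvider provider : discover()) {
            if (provider.supports(syntaxName)) {
                return provider;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RdfReaderProvider turtle = forSyntax("TURTLE");
        System.out.println(turtle == null
                ? "No IO module on the classpath"
                : "Found " + turtle.getClass().getName());
    }
}
```

An IO jar would register itself by shipping a file under META-INF/services named after the fully qualified name of the interface, with the implementing class name as its content. Core then depends only on the interface, so the dependency between Core and IO runs in one direction.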
On 26 Jan 2015 19:28, "Rob Vesse" <rv...@dotnetrdf.org> wrote:

> Andy
>
> I would prefer proposal two, Jena 3 will be disruptive regardless (if only
> because of the time people spend updating import statements).  A few other
> more minor changes to import statements and POM definitions wouldn't be
> too big of a deal IMHO
>
> I would be strongly against leaving old package names with redirects since
> it only encourages people to not bother migrating code properly and just
> to simply update the version in the POM and not be aware that there are
> other changes that happened (e.g. RDF 1.1).  A one time disruptive
> migration forward to Jena 3 that makes me actually have to consider the
> impact of the migration on my existing code is strongly preferable to a
> staggered migration
>
> In that vein I would suggest that the IO components be moved into their
> own package (jena-riot I assume?) at the same time, again the principle is
> to make people take a single larger disruptive migration rather than
> requiring many smaller migrations.  If Core needs to have some way of
> wiring in IO automatically then I suggest we do it via the Java 7+
> ServiceLoader mechanism, I'm already using it a little in the Elephas IO
> modules and it works pretty nice and I would be willing to help get this
> set up for Jena 3 IO as necessary.
>
> I suppose the IO wiring comes back to the question of whether Model.read()
> and Model.write() are still relevant or if we force everyone over to using
> RDFDataMgr (which would be my preference) since the IO module has to rely
> on Core anyway for the relevant data model APIs and having Core somehow
> rely on IO is an ugly circular dependency (or gets us into the same
> problems we have now).  Of course the alternative solution to that is to
> have the Resource API also broken out into its own module so that Core
> really is only the core low level data structures.
>
> With regards to packaging if people are using higher level POM artifacts
> like apache-jena-libs then the module changes should remain fairly
> transparent to them.
>
> Rob
>
> On 24/01/2015 10:34, "Andy Seaborne" <an...@apache.org> wrote:
>
> >[[
> >oaj = org.apache.jena
> >chhj = com.hp.hpl.jena
> >]]
> >
> >One major possible change target is the core/arq split.
> >
> >Much of this comes down to where quads/datasets go in the package tree.
> >  They started as a SPARQL (1.0) feature but are now RDF 1.1 and parser
> >related.
> >
> >The general idea is move dataset/quad support to core, move parsers to
> >core (separate into their own package later??) and have jena-arq be
> >SPARQL only.
> >
> >The question is how much change to go through to achieve that
> >
> >Possibility 1 : Less change
> >
> >Move DatasetGraph* to oaj.dataset.*
> >
> >API visible:
> >
> >Migrate Dataset from chhj.query.Dataset to oaj.rdf.dataset (c.f.
> >oaj.rdf.model)
> >
> >Move DatasetGraph and Quad to oaj.dataset (c.f. oaj.graph)
> >
> >Try to leave indirection class in chhj.query.Dataset somehow.
> >
> >
> >Possibility 2 : More change, more disruption (but one time)
> >
> >Pull oaj.rdf.model up to oaj.rdf and put Dataset there.  This is the
> >"RDF API".
> >
> >Use oaj.graph for DatasetGraph and Quad.
> >
> >Hmm - actually writing this down, I am tending towards possibility 2 if
> >that works as cleanly as it sounds.
> >
> >       Andy
> >

Re: Jena3 : core/arq split

Posted by Andy Seaborne <an...@apache.org>.
On 26/01/15 19:27, Rob Vesse wrote:
> Andy
>
> I would prefer proposal two, Jena 3 will be disruptive regardless (if only
> because of the time people spend updating import statements).  A few other
> more minor changes to import statements and POM definitions wouldn't be
> too big of a deal IMHO

Agreed.

> I would be strongly against leaving old package names with redirects since
> it only encourages people to not bother migrating code properly and just
> to simply update the version in the POM and not be aware that there are
> other changes that happened (e.g. RDF 1.1).  A one time disruptive
> migration forward to Jena 3 that makes me actually have to consider the
> impact of the migration on my existing code is strongly preferable to a
> staggered migration

You persuade me ...

>
> In that vein I would suggest that the IO components be moved into their
> own package (jena-riot I assume?) at the same time, again the principle is
> to make people take a single larger disruptive migration rather than
> requiring many smaller migrations.  If Core needs to have some way of
> wiring in IO automatically then I suggest we do it via the Java 7+
> ServiceLoader mechanism, I'm already using it a little in the Elephas IO
> modules and it works pretty nice and I would be willing to help get this
> set up for Jena 3 IO as necessary.

Yes - it works nicely once it's set up.

I think whether to have jena-riot or packages in core is by and large an 
internal project issue.  The code isn't that large, and RIOT has few 
dependencies.  A separate module does enforce a clean separation.

I can think of one catch - it would be nice (and only "nice") to add 
enum-like constants to the model.read operations to avoid the 
string-confusion.

Currently, those constants are in oaj.riot.Lang.

Some subclassed ones in core would work ... but that feels like taking on 
technical debt even before a Jena3 is released.
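
To make the "string-confusion" point concrete, here is a minimal sketch in 
plain Java.  Everything in it is hypothetical: the Syntax enum, the read 
methods and the accepted names are illustrative stand-ins, not the 
Model.read signatures and not oaj.riot.Lang.  It only shows that a 
misspelt string is caught at run time while a constant is checked by the 
compiler.

```java
import java.util.Locale;

public class SyntaxConstantsSketch {

    /** Hypothetical stand-in for enum-like syntax constants. */
    public enum Syntax { TURTLE, NTRIPLES, RDFXML }

    /** String-keyed lookup: a misspelt name is only detected at run time. */
    static Syntax parseName(String name) {
        switch (name.toUpperCase(Locale.ROOT)) {
            case "TURTLE":
            case "TTL":
                return Syntax.TURTLE;
            case "N-TRIPLES":
            case "NT":
                return Syntax.NTRIPLES;
            case "RDF/XML":
                return Syntax.RDFXML;
            default:
                throw new IllegalArgumentException("Unknown syntax name: " + name);
        }
    }

    /** String variant, in the style of a read operation that takes a syntax name. */
    static String read(String source, String syntaxName) {
        return read(source, parseName(syntaxName));
    }

    /** Constant variant: an invalid syntax cannot be written down at all. */
    static String read(String source, Syntax syntax) {
        // A real implementation would parse here; the sketch just reports the choice.
        return "reading " + source + " as " + syntax;
    }

    public static void main(String[] args) {
        System.out.println(read("data.ttl", Syntax.TURTLE));
        System.out.println(read("data.ttl", "ttl"));
        try {
            read("data.ttl", "TURTEL"); // typo compiles, fails only when executed
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The catch described above is where such constants would live: if they stay 
in the IO module, a read operation in core cannot name them without 
depending on IO.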

> I suppose the IO wiring comes back to the question of whether Model.read()
> and Model.write() are still relevant or if we force everyone over to using
> RDFDataMgr (which would be my preference) since the IO module has to rely
> on Core anyway for the relevant data model APIs and having Core somehow
> rely on IO is an ugly circular dependency (or gets us into the same
> problems we have now).  Of course the alternative solution to that is to
> have the Resource API also broken out into its own module so that Core
> really is only the core low level data structures.
>
> With regards to packaging if people are using higher level POM artifacts
> like apache-jena-libs then the module changes should remain fairly
> transparent to them.

Yes!

>
> Rob
>
> On 24/01/2015 10:34, "Andy Seaborne" <an...@apache.org> wrote:
>
>> [[
>> oaj = org.apache.jena
>> chhj = com.hp.hpl.jena
>> ]]
>>
>> One major possible change target is the core/arq split.
>>
>> Much of this comes down to where quads/datasets go in the package tree.
>>   They started as a SPARQL (1.0) feature but are now RDF 1.1 and parser
>> related.
>>
>> The general idea is move dataset/quad support to core, move parsers to
>> core (separate into their own package later??) and have jena-arq be
>> SPARQL only.
>>
>> The question is how much change to go through to achieve that
>>
>> Possibility 1 : Less change
>>
>> Move DatasetGraph* to oaj.dataset.*
>>
>> API visible:
>>
>> Migrate Dataset from chhj.query.Dataset to oaj.rdf.dataset (c.f.
>> oaj.rdf.model)
>>
>> Move DatasetGraph and Quad to oaj.dataset (c.f. oaj.graph)
>>
>> Try to leave indirection class in chhj.query.Dataset somehow.
>>
>>
>> Possibility 2 : More change, more disruption (but one time)
>>
>> Pull oaj.rdf.model up to oaj.rdf and put Dataset there.  This is the
>> "RDF API".
>>
>> Use oaj.graph for DatasetGraph and Quad.
>>
>> Hmm - actually writing this down, I am tending towards possibility 2 if
>> that works as cleanly as it sounds.
>>
>> 	Andy
>>


Re: Jena3 : core/arq split

Posted by Rob Vesse <rv...@dotnetrdf.org>.
Andy

I would prefer proposal two.  Jena 3 will be disruptive regardless (if only
because of the time people spend updating import statements), so a few other
minor changes to import statements and POM definitions wouldn't be
too big a deal IMHO.

I would be strongly against leaving old package names with redirects, since
that only encourages people not to bother migrating code properly: they
simply update the version in the POM and remain unaware of the
other changes that happened (e.g. RDF 1.1).  A one-time disruptive
migration forward to Jena 3 that makes me actually consider the
impact of the migration on my existing code is strongly preferable to a
staggered migration.

In that vein I would suggest that the IO components be moved into their
own package (jena-riot I assume?) at the same time.  Again, the principle is
to make people take a single larger disruptive migration rather than
requiring many smaller migrations.  If Core needs some way of
wiring in IO automatically then I suggest we do it via the Java 7+
ServiceLoader mechanism.  I'm already using it a little in the Elephas IO
modules and it works pretty nicely, and I would be willing to help get this
set up for Jena 3 IO as necessary.
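
As a rough illustration of the ServiceLoader wiring suggested above, the
following is a minimal sketch using only the JDK.  The RdfReaderProvider
interface and the find method are hypothetical names invented for this
example; they are not existing Jena or Elephas APIs.  The assumption is that
Core would own the interface and an IO module would register an
implementation in a META-INF/services file, so Core never references IO
classes directly.

```java
import java.util.Optional;
import java.util.ServiceLoader;

public class IoWiringSketch {

    /** Hypothetical SPI that Core would define; an IO module would implement it. */
    public interface RdfReaderProvider {
        /** True if this provider can parse the named syntax, e.g. "TURTLE". */
        boolean handles(String syntaxName);
    }

    /**
     * Ask ServiceLoader for every provider registered on the classpath through
     * a META-INF/services file named after the interface's binary name, and
     * return the first one that handles the requested syntax.
     */
    public static Optional<RdfReaderProvider> find(String syntaxName) {
        for (RdfReaderProvider provider : ServiceLoader.load(RdfReaderProvider.class)) {
            if (provider.handles(syntaxName)) {
                return Optional.of(provider);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // With no IO module on the classpath nothing is registered, so Core can
        // report the missing module instead of failing to link.
        System.out.println(find("TURTLE").isPresent()
                ? "reader found"
                : "no IO module on classpath");
    }
}
```

The dependency then runs one way only: the IO module depends on Core for the
interface, and Core discovers IO at run time.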

I suppose the IO wiring comes back to the question of whether Model.read()
and Model.write() are still relevant or whether we force everyone over to
using RDFDataMgr (which would be my preference).  The IO module has to rely
on Core anyway for the relevant data model APIs, and having Core somehow
rely on IO is an ugly circular dependency (or gets us into the same
problems we have now).  Of course the alternative solution to that is to
have the Resource API also broken out into its own module so that Core
really is only the core low-level data structures.

With regard to packaging, if people are using higher-level POM artifacts
like apache-jena-libs then the module changes should remain fairly
transparent to them.

Rob

On 24/01/2015 10:34, "Andy Seaborne" <an...@apache.org> wrote:

>[[
>oaj = org.apache.jena
>chhj = com.hp.hpl.jena
>]]
>
>One major possible change target is the core/arq split.
>
>Much of this comes down to where quads/datasets go in the package tree.
>  They started as a SPARQL (1.0) feature but are now RDF 1.1 and parser
>related.
>
>The general idea is move dataset/quad support to core, move parsers to
>core (separate into their own package later??) and have jena-arq be
>SPARQL only.
>
>The question is how much change to go through to achieve that
>
>Possibility 1 : Less change
>
>Move DatasetGraph* to oaj.dataset.*
>
>API visible:
>
>Migrate Dataset from chhj.query.Dataset to oaj.rdf.dataset (c.f.
>oaj.rdf.model)
>
>Move DatasetGraph and Quad to oaj.dataset (c.f. oaj.graph)
>
>Try to leave indirection class in chhj.query.Dataset somehow.
>
>
>Possibility 2 : More change, more disruption (but one time)
>
>Pull oaj.rdf.model up to oaj.rdf and put Dataset there.  This is the
>"RDF API".
>
>Use oaj.graph for DatasetGraph and Quad.
>
>Hmm - actually writing this down, I am tending towards possibility 2 if
>that works as cleanly as it sounds.
>
>	Andy
>