Posted to dev@jena.apache.org by Andy Seaborne <an...@apache.org> on 2015/12/08 09:38:50 UTC

Re: A release?

On 23/11/15 15:31, A. Soroka wrote:
> A weightless +1 from me, of course! {grin} I’m excited to see this stuff get in front of people.
>
> I’m not sure what you mean by “Fuseki integration”, Andy. Do you mean to make the new dataset impl available as an option in Fuseki (presumably in both the web admin UI and the CLI for the standalone server)?

Fuseki should change to using the new in-memory txn dataset when 
creating new in-memory setups.  So the template-driven assemblers (UI) 
and from the command line (--mem) will change.  Existing setups are 
preserved because the template is used to create an assembler when first 
used, not each start-up.

	Andy

>
> ---
> A. Soroka
> The University of Virginia Library
>
>> On Nov 23, 2015, at 10:24 AM, Andy Seaborne <an...@apache.org> wrote:
>>
>> We have:
>>
>> * in-memory transactional dataset from Adam is in the codebase
>> (details on progress below)
>> * configurable lucene analyzers
>> * general maintenance.
>> * Improved start-up and configuration to wire the jars together.
>>
>> Shall we do a release? This fits our 3-6 months cycle.
>>
>> As we know, getting people to try unreleased builds has limited effect (generally for point-bug fixes, less so for new features) so my suggestion is to release soon as v3.0.1 and describe the in-memory transactional dataset as "RC".  It's not fully integrated (test: Fuseki is not using it yet) and getting that done and stable isn't an instant task to do. At the same time, a release will let people try it.
>>
>> So we expose the in-memory transactional dataset at the API level now, introduce the deprecations so people can see what's coming, and complete/stabilize the integration to make a v3.1.0 in a few months' time.
>>
>> 	Andy
>>
>>
>> Progress on the in-memory transactional dataset
>> https://pony-poc.apache.org/thread.html/Zj36f2pbaszgdnj
>>
>> If you want to try it, then DatasetFactory.createTxnMem() is the entry point.
>>
>> --------------------------------------------------------
>>
>> Outstanding tasks:
>> * Dependency version management in jena-parent.
>>
>> Done
>>
>> * more testing [done], and the organisation of the tests
>>
>> Testing done; organisation of tests is partially done. There is a wider task to organise the dataset tests we have into some sort of order, but that's non-blocking.
>>
>> * documentation
>> * code cleaning for deprecation of DatasetFactory.createMem
>> * Same migration for DatasetGraphFactory.createMem
>> * Fuseki integration
>>
>> --------------------------------------------------------
>>
>


Re: A release?

Posted by "A. Soroka" <aj...@virginia.edu>.
Okay, after trying three browsers, I found one that works (the others kept redirecting to https). I have some suggestions (and some guilt, since I should have written this without prompting). Shall I send a patch against the Markdown in SVN?

---
A. Soroka
The University of Virginia Library

> On Dec 9, 2015, at 8:03 AM, Andy Seaborne <an...@apache.org> wrote:
> 
> On 09/12/15 12:58, A. Soroka wrote:
>> Is it possible for non-committers to see that? I get a login dialog with the message “ASF Committers”.
> 
> The HTML on staging is public - I don't use my id to log in.
> 
> curl http://jena.staging.apache.org/documentation/rdf/datasets.html
> 
> works from where I am - no password, no login.
> 
> Have you got a CMS image setup from another time? That can get things confused. (Note: http, not https.)
> 
> Or the markdown:
> 
> https://svn.apache.org/repos/asf/jena/site/trunk/content/documentation/rdf/datasets.md
> 
> 	Andy
> 
>> 
>> ---
>> A. Soroka
>> The University of Virginia Library
>> 
>>> On Dec 9, 2015, at 5:20 AM, Andy Seaborne <an...@apache.org> wrote:
>>> 
>>> I put some documentation in at
>>> 
>>> http://jena.staging.apache.org/documentation/rdf/datasets.html
>>> 
>>> because the announcement needs to link to something.  I hope I got it right.  This can be changed between releases -- the bytes aren't in the release itself.
>>> 
>>> (The file name was chosen so that more general text could go there and this becomes a section of documentation about dataset usage in Jena)
>>> 
>>> 	Andy
>>> 
>>> On 08/12/15 14:53, A. Soroka wrote:
>>>> That should certainly “enlist” a number of RC testers. {grin}
>>>> 
>>>> I will stand by for bug fixes.
>>>> 
>>>> ---
>>>> A. Soroka
>>>> The University of Virginia Library
>>>> 


Re: A release?

Posted by Andy Seaborne <an...@apache.org>.
On 09/12/15 12:58, A. Soroka wrote:
> Is it possible for non-committers to see that? I get a login dialog with the message “ASF Committers”.

The HTML on staging is public - I don't use my id to log in.

curl http://jena.staging.apache.org/documentation/rdf/datasets.html

works from where I am - no password, no login.

Have you got a CMS image setup from another time? That can get things 
confused. (Note: http, not https.)

Or the markdown:

https://svn.apache.org/repos/asf/jena/site/trunk/content/documentation/rdf/datasets.md

	Andy



Re: A release?

Posted by "A. Soroka" <aj...@virginia.edu>.
Is it possible for non-committers to see that? I get a login dialog with the message “ASF Committers”.

---
A. Soroka
The University of Virginia Library

> On Dec 9, 2015, at 5:20 AM, Andy Seaborne <an...@apache.org> wrote:
> 
> I put some documentation in at
> 
> http://jena.staging.apache.org/documentation/rdf/datasets.html
> 
> because the announcement needs to link to something.  I hope I got it right.  This can be changed between releases -- the bytes aren't in the release itself.
> 
> (The file name was chosen so that more general text could go there and this becomes a section of documentation about dataset usage in Jena)
> 
> 	Andy
> 


Re: A release?

Posted by "A. Soroka" <aj...@virginia.edu>.
“We” are the people having this conversation. It’s often very hard for me to understand what you are saying (e.g. “If it makes a difference, it can the implementation for Dataset(Graph)Factory.create()…”), so I need to confirm carefully what we are talking about.

If I understand it so far, you want me to execute some TupleTable implementations that use simple datastructures that would not support transactionality but would operate with low overhead so that we can check the performance of a DatasetGraphInMemory using them?

---
A. Soroka
The University of Virginia Library

> On Dec 10, 2015, at 1:07 PM, Andy Seaborne <an...@apache.org> wrote:
> 
> On 10/12/15 17:41, A. Soroka wrote:
>> Ah, okay, cool. But was my other interpretation correct? A DatasetGraphInMemory using simple (non-transaction-supporting) TupleTables would be good for DatasetGraphFactory::create?
>> 
> 
> "so we are saying" - who is "we"?!
> 
> Maybe - if it's appreciably faster, yes; if it's not, there's less point and it adds to maintenance.
> 
> That's why I was interested in a performance comparison.
> 
>> ---
>> A. Soroka
>> The University of Virginia Library
>> 
>>> On Dec 10, 2015, at 12:38 PM, Andy Seaborne <an...@apache.org> wrote:
>>> 
>>> On 10/12/15 17:18, A. Soroka wrote:
>>>> Okay, so we are saying that a Dataset around a DatasetGraphInMemory using simple (non-transaction-supporting) TupleTables would be good for DatasetFactory::create or ::createGeneral? (Or both— I’m not totally clear about the difference between them.)
>>> 
>>> General holds graphs from other storages as a ptr to that graph.  It is a collection of graphs, no quads.  "GRAPH ?g" is a loop.
>>> 
>>> 	Andy
>>> 
>>>> 
>>>> ---
>>>> A. Soroka
>>>> The University of Virginia Library
>>>> 
>>>>> On Dec 10, 2015, at 10:25 AM, Andy Seaborne <an...@apache.org> wrote:
>>>>> 
>>>>> On 10/12/15 13:31, A. Soroka wrote:
>>>>>> To the first question, I think the answer is yes. It was very much my intention that TupleTable and its subtypes would provide opportunity to explore different useful structures (e.g. Claude Warren’s ideas about Bloom filters). A [Triple|Quad]Table that uses ordinary maps/structures in the same way as the current impl uses persistent structures should be very easy. Would you like me to cut such classes to have them available for non-transactional cases?
>>>>>> 
>>>>>> I’m not quite sure what your last sentence means; is it about the semantics of the factory methods?
>>>>> 
>>>>> There are 4 factory methods in each factory
>>>>> 
>>>>> * DatasetFactory.create()
>>>>> The new place for in-memory, non-transactional dataset with graph-copy semantics.  Currently, it is a general dataset with "add graph" overridden to do a copy-in.
>>>>> 
>>>>> * DatasetFactory.createTxnMem()
>>>>> The new place for in-memory, transactional dataset (with graph-copy semantics).
>>>>> 
>>>>> * DatasetFactory.createMem()
>>>>> The old place for in-memory datasets - goes to createGeneral()
>>>>> Deprecated because if replaced by create() then details have changed.
>>>>> 
>>>>> * DatasetFactory.createGeneral()
>>>>> The new place for general datasets.
>>>>> 
>>>>>   Andy
>>>>> 
>>>>>> 
>>>>>> ---
>>>>>> A. Soroka
>>>>>> The University of Virginia Library
>>>>>> 
>>>>>>> On Dec 10, 2015, at 8:15 AM, Andy Seaborne <an...@apache.org> wrote:
>>>>>>> 
>>>>>>> On 09/12/15 21:00, A. Soroka wrote:
>>>>>>>> Cool. The extension points are basically TripleTable and QuadTable, because of the constructor DatasetGraphInMemory(QuadTable, TripleTable). Someone could offer their own impls and use that (public) constructor.
>>>>>>> 
>>>>>>> Does that mean it would be easy to do a non-transactional version, that used hash maps (or used hash maps at the top level at least)?
>>>>>>> 
>>>>>>> That would make an excellent performance comparison.
>>>>>>> 
>>>>>>> If it makes a difference, it can the implementation for Dataset(Graph)Factory.create() and have the copy-in semantics for "add graph".
>>>>>>> 
>>>>>>> 	Andy
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 


Re: A release?

Posted by Andy Seaborne <an...@apache.org>.
On 10/12/15 17:41, A. Soroka wrote:
> Ah, okay, cool. But was my other interpretation correct? A DatasetGraphInMemory using simple (non-transaction-supporting) TupleTables would be good for DatasetGraphFactory::create?
>

"so we are saying" - who is "we"?!

Maybe - if it's appreciably faster, yes; if it's not, there's less point 
and it adds to maintenance.

That's why I was interested in a performance comparison.



Re: A release?

Posted by "A. Soroka" <aj...@virginia.edu>.
Ah, okay, cool. But was my other interpretation correct? A DatasetGraphInMemory using simple (non-transaction-supporting) TupleTables would be good for DatasetGraphFactory::create?

---
A. Soroka
The University of Virginia Library

> On Dec 10, 2015, at 12:38 PM, Andy Seaborne <an...@apache.org> wrote:
> 
> On 10/12/15 17:18, A. Soroka wrote:
>> Okay, so we are saying that a Dataset around a DatasetGraphInMemory using simple (non-transaction-supporting) TupleTables would be good for DatasetFactory::create or ::createGeneral? (Or both— I’m not totally clear about the difference between them.)
> 
> General holds graphs from other storages as a ptr to that graph.  It is a collection of graphs, no quads.  "GRAPH ?g" is a loop.
> 
> 	Andy
> 


Re: A release?

Posted by Andy Seaborne <an...@apache.org>.
On 10/12/15 17:18, A. Soroka wrote:
> Okay, so we are saying that a Dataset around a DatasetGraphInMemory using simple (non-transaction-supporting) TupleTables would be good for DatasetFactory::create or ::createGeneral? (Or both— I’m not totally clear about the difference between them.)

General holds graphs from other storages as a ptr to that graph.  It is 
a collection of graphs, no quads.  "GRAPH ?g" is a loop.

	Andy
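
To make the "GRAPH ?g is a loop" remark concrete, here is a minimal toy sketch in plain Java. It does not use the Jena API: a graph is modelled as a set of triple strings, and GeneralDatasetSketch, addGraph and graphsContaining are hypothetical names invented for this illustration. The assumption is only what is stated above: a general dataset is a collection of graphs held by reference, with no quad index, so matching over a variable graph name means asking each graph in turn.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only; none of these names are Jena classes or methods.
class GeneralDatasetSketch {
    // Graph name -> graph, held by reference. A "graph" here is a set of triple strings.
    private final Map<String, Set<String>> graphs = new LinkedHashMap<>();

    void addGraph(String name, Set<String> graph) {
        graphs.put(name, graph); // keeps a pointer to the caller's graph, no copy
    }

    // Rough analogue of GRAPH ?g { <triple> }: there is no quad index,
    // so the only option is to loop over the graphs and ask each one.
    List<String> graphsContaining(String triple) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : graphs.entrySet()) {
            if (e.getValue().contains(triple)) {
                names.add(e.getKey());
            }
        }
        return names;
    }

    // Small demonstration: returns the names of the graphs that contain "a p b".
    static List<String> demo() {
        GeneralDatasetSketch ds = new GeneralDatasetSketch();
        ds.addGraph("g1", new HashSet<>(Arrays.asList("a p b")));
        ds.addGraph("g2", new HashSet<>(Arrays.asList("c p d")));
        ds.addGraph("g3", new HashSet<>(Arrays.asList("a p b", "c p d")));
        return ds.graphsContaining("a p b");
    }
}
```

The cost of such a lookup grows with the number of graphs, which is the trade-off against a store that indexes quads directly.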



Re: A release?

Posted by "A. Soroka" <aj...@virginia.edu>.
Okay, so we are saying that a Dataset around a DatasetGraphInMemory using simple (non-transaction-supporting) TupleTables would be good for DatasetFactory::create or ::createGeneral? (Or both— I’m not totally clear about the difference between them.)

---
A. Soroka
The University of Virginia Library

> On Dec 10, 2015, at 10:25 AM, Andy Seaborne <an...@apache.org> wrote:
> 
> On 10/12/15 13:31, A. Soroka wrote:
>> To the first question, I think the answer is yes. It was very much my intention that TupleTable and its subtypes would provide opportunity to explore different useful structures (e.g. Claude Warren’s ideas about Bloom filters). A [Triple|Quad]Table that uses ordinary maps/structures in the same way as the current impl uses persistent structures should be very easy. Would you like me to cut such classes to have them available for non-transactional cases?
>> 
>> I’m not quite sure what your last sentence means; is it about the semantics of the factory methods?
> 
> There are 4 factory methods in each factory
> 
> * DatasetFactory.create()
> The new place for in-memory, non-transactional dataset with graph-copy semantics.  Currently, it is a general dataset with "add graph" overridden to do a copy-in.
> 
> * DatasetFactory.createTxnMem()
> The new place for in-memory, transactional dataset (with graph-copy semantics).
> 
> * DatasetFactory.createMem()
> The old place for in-memory datasets - goes to createGeneral()
> Deprecated because if replaced by create() then details have changed.
> 
> * DatasetFactory.createGeneral()
> The new place for general datasets.
> 
>   Andy
> 


Re: A release?

Posted by Andy Seaborne <an...@apache.org>.
On 10/12/15 13:31, A. Soroka wrote:
> To the first question, I think the answer is yes. It was very much my intention that TupleTable and its subtypes would provide opportunity to explore different useful structures (e.g. Claude Warren’s ideas about Bloom filters). A [Triple|Quad]Table that uses ordinary maps/structures in the same way as the current impl uses persistent structures should be very easy. Would you like me to cut such classes to have them available for non-transactional cases?
>
> I’m not quite sure what your last sentence means; is it about the semantics of the factory methods?

There are 4 factory methods in each factory

* DatasetFactory.create()
The new place for in-memory, non-transactional dataset with graph-copy 
semantics.  Currently, it is a general dataset with "add graph" 
overridden to do a copy-in.

* DatasetFactory.createTxnMem()
The new place for in-memory, transactional dataset (with graph-copy 
semantics).

* DatasetFactory.createMem()
The old place for in-memory datasets; it now goes to createGeneral().
Deprecated because replacing it with create() would change the details 
of its behaviour.

* DatasetFactory.createGeneral()
The new place for general datasets.
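
The difference between a dataset that keeps a link to the graph it is given and one that copies the graph in on "add graph" can be shown with a minimal sketch in plain Java. This is a toy model, not Jena code: graphs are just sets of strings, and the names AddGraphSemantics, ToyDataset and sizeAfterLateUpdate are made up for the illustration. It assumes that a general dataset holds a reference to the added graph, while the copy-in variant takes a snapshot at the time of the call.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only: a "graph" is a set of strings, a "dataset" maps names to graphs. */
final class AddGraphSemantics {

    /** Hypothetical dataset; the flag selects copy-in or link behaviour for addGraph. */
    static final class ToyDataset {
        private final boolean copyIn;
        private final Map<String, Set<String>> named = new HashMap<>();

        ToyDataset(boolean copyIn) {
            this.copyIn = copyIn;
        }

        void addGraph(String name, Set<String> graph) {
            // Copy-in: snapshot the triples now. Link: keep the caller's object.
            named.put(name, copyIn ? new HashSet<String>(graph) : graph);
        }

        Set<String> getGraph(String name) {
            return named.get(name);
        }
    }

    /** Size of the stored graph after the caller adds a triple to its own graph post-add. */
    static int sizeAfterLateUpdate(boolean copyIn) {
        Set<String> graph = new HashSet<>();
        graph.add("s p o1");

        ToyDataset ds = new ToyDataset(copyIn);
        ds.addGraph("g", graph);

        graph.add("s p o2"); // change made after the graph was added
        return ds.getGraph("g").size();
    }
}
```

With copy-in, the later change to the caller's graph is not visible through the dataset; with linking, it is.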

    Andy

>
> ---
> A. Soroka
> The University of Virginia Library
>
>> On Dec 10, 2015, at 8:15 AM, Andy Seaborne <an...@apache.org> wrote:
>>
>> On 09/12/15 21:00, A. Soroka wrote:
>>> Cool. The extension points are basically TripleTable and QuadTable, because of the constructor DatasetGraphInMemory(QuadTable, TripleTable). Someone could offer their own impls and use that (public) constructor.
>>
>> Does that mean it would be easy to do a non-transactional version, that used hash maps (or used hash maps at the top level at least)?
>>
>> That would make an excellent performance comparison.
>>
>> If it makes a difference, it can be the implementation for Dataset(Graph)Factory.create() and have the copy-in semantics for "add graph".
>>
>> 	Andy
>


Re: A release?

Posted by "A. Soroka" <aj...@virginia.edu>.
To the first question, I think the answer is yes. It was very much my intention that TupleTable and its subtypes would provide opportunity to explore different useful structures (e.g. Claude Warren’s ideas about Bloom filters). A [Triple|Quad]Table that uses ordinary maps/structures in the same way as the current impl uses persistent structures should be very easy. Would you like me to cut such classes to have them available for non-transactional cases?

I’m not quite sure what your last sentence means; is it about the semantics of the factory methods?
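
As a rough idea of what a triple table built on ordinary maps might look like, here is a minimal non-transactional sketch. It is self-contained and does not implement Jena's TripleTable interface: the class name HashMapTripleTable, the use of plain strings for nodes, and the add/delete/find signatures are all assumptions made for the illustration. It uses a hash map at the top level only (keyed by subject), with null acting as a wildcard in find.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy, non-transactional triple store; nodes are plain strings (an assumption). */
final class HashMapTripleTable {

    // Top-level hash map keyed by subject; each value holds that subject's triples.
    private final Map<String, Set<List<String>>> bySubject = new HashMap<>();

    void add(String s, String p, String o) {
        bySubject.computeIfAbsent(s, k -> new HashSet<>()).add(Arrays.asList(s, p, o));
    }

    void delete(String s, String p, String o) {
        Set<List<String>> triples = bySubject.get(s);
        if (triples == null) {
            return;
        }
        triples.remove(Arrays.asList(s, p, o));
        if (triples.isEmpty()) {
            bySubject.remove(s); // drop empty buckets so the map does not grow forever
        }
    }

    /** Returns matching triples; a null argument matches anything in that position. */
    List<List<String>> find(String s, String p, String o) {
        List<Set<List<String>>> candidates = new ArrayList<>();
        if (s == null) {
            candidates.addAll(bySubject.values()); // no subject given: scan every bucket
        } else if (bySubject.containsKey(s)) {
            candidates.add(bySubject.get(s));      // subject given: one hash lookup
        }

        List<List<String>> out = new ArrayList<>();
        for (Set<List<String>> triples : candidates) {
            for (List<String> t : triples) {
                boolean pOk = (p == null) || p.equals(t.get(1));
                boolean oOk = (o == null) || o.equals(t.get(2));
                if (pOk && oOk) {
                    out.add(t);
                }
            }
        }
        return out;
    }
}
```

A real comparison would need further indexes (by predicate and object) and the actual node types, but this shape is enough to see where mutable maps replace persistent structures.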

---
A. Soroka
The University of Virginia Library

> On Dec 10, 2015, at 8:15 AM, Andy Seaborne <an...@apache.org> wrote:
> 
> On 09/12/15 21:00, A. Soroka wrote:
>> Cool. The extension points are basically TripleTable and QuadTable, because of the constructor DatasetGraphInMemory(QuadTable, TripleTable). Someone could offer their own impls and use that (public) constructor.
> 
> Does that mean it would be easy to do a non-transactional version, that used hash maps (or used hash maps at the top level at least)?
> 
> That would make an excellent performance comparison.
> 
> If it makes a difference, it can be the implementation for Dataset(Graph)Factory.create() and have the copy-in semantics for "add graph".
> 
> 	Andy


Re: A release?

Posted by Andy Seaborne <an...@apache.org>.
On 09/12/15 21:00, A. Soroka wrote:
> Cool. The extension points are basically TripleTable and QuadTable, because of the constructor DatasetGraphInMemory(QuadTable, TripleTable). Someone could offer their own impls and use that (public) constructor.

Does that mean it would be easy to do a non-transactional version, that 
used hash maps (or used hash maps at the top level at least)?

That would make an excellent performance comparison.

If it makes a difference, it can be the implementation for 
Dataset(Graph)Factory.create() and have the copy-in semantics for "add 
graph".

	Andy

Re: A release?

Posted by "A. Soroka" <aj...@virginia.edu>.
Cool. The extension points are basically TripleTable and QuadTable, because of the constructor DatasetGraphInMemory(QuadTable, TripleTable). Someone could offer their own impls and use that (public) constructor.
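
The constructor-based extension point can be sketched as follows. This is a toy analogue, not the Jena classes: ToyTripleTable, ToyQuadTable, ToyDataset and the list-backed implementations are hypothetical names, nodes are plain strings, and the routing of the default graph to the triple table and named graphs to the quad table is an assumption about how the split works.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy analogue of a dataset whose storage is supplied through its constructor. */
final class PluggableTables {

    /** Hypothetical stand-in for a triple table; not Jena's interface. */
    interface ToyTripleTable {
        void add(String s, String p, String o);
        int size();
    }

    /** Hypothetical stand-in for a quad table; not Jena's interface. */
    interface ToyQuadTable {
        void add(String g, String s, String p, String o);
        int size();
    }

    /** The dataset only talks to the interfaces, so any implementation can be plugged in. */
    static final class ToyDataset {
        private final ToyQuadTable quads;
        private final ToyTripleTable triples;

        ToyDataset(ToyQuadTable quads, ToyTripleTable triples) {
            this.quads = quads;
            this.triples = triples;
        }

        /** A null graph name means the default graph (assumed routing, see lead-in). */
        void add(String graphName, String s, String p, String o) {
            if (graphName == null) {
                triples.add(s, p, o);
            } else {
                quads.add(graphName, s, p, o);
            }
        }

        int defaultGraphSize() {
            return triples.size();
        }

        int namedGraphsSize() {
            return quads.size();
        }
    }

    /** One possible custom implementation, backed by a plain list. */
    static final class ListTripleTable implements ToyTripleTable {
        private final List<String[]> rows = new ArrayList<>();

        public void add(String s, String p, String o) {
            rows.add(new String[] {s, p, o});
        }

        public int size() {
            return rows.size();
        }
    }

    /** Matching list-backed quad table. */
    static final class ListQuadTable implements ToyQuadTable {
        private final List<String[]> rows = new ArrayList<>();

        public void add(String g, String s, String p, String o) {
            rows.add(new String[] {g, s, p, o});
        }

        public int size() {
            return rows.size();
        }
    }
}
```

Swapping in a different storage strategy then means writing new table implementations and passing them to the constructor, with no change to the dataset class itself.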

---
A. Soroka
The University of Virginia Library

> On Dec 9, 2015, at 3:38 PM, Andy Seaborne <an...@apache.org> wrote:
> 
> On 09/12/15 15:19, A. Soroka wrote:
>> Here:
>> 
>> https://gist.github.com/ajs6f/463117f0d0e094ffefc5
>> 
>> are some suggested edits for that doc, mostly just typo-fixes and suggestions for clarification.
> 
> Done, thanks.
> 
>> Other questions from me:
>> 
>> 1) Is it worth creating a component in Jira to track tickets for this stuff?
> 
> I think incoming JIRAs won't be simple to associate with this one component.  We can see how it goes, but the issues are more likely to be about the various differences between the datasets.  Just a guess.
> 
>> 
>> 2) Shall I write up a simple page describing the impl and more importantly, extension points?
> 
> Always useful, not urgent.
> 
> (What are the extension points?)
> 
> 	Andy
> 
>> 
>> ---
>> A. Soroka
>> The University of Virginia Library
>> 
>>> On Dec 9, 2015, at 5:20 AM, Andy Seaborne <an...@apache.org> wrote:
>>> 
>>> I put some documentation in at
>>> 
>>> http://jena.staging.apache.org/documentation/rdf/datasets.html
>>> 
>>> because the announcement needs to link to something.  I hope I got it right.  This can be changed between releases -- the bytes aren't in the release itself.
>>> 
>>> (The file name was chosen so that more general text could go there and this becomes a section of documentation about dataset usage in Jena)
>>> 
>>> 	Andy
>>> 
>>> On 08/12/15 14:53, A. Soroka wrote:
>>>> That should certainly “enlist” a number of RC testers. {grin}
>>>> 
>>>> I will stand by for bug fixes.
>>>> 
>>>> ---
>>>> A. Soroka
>>>> The University of Virginia Library
>>>> 
>>>>> On Dec 8, 2015, at 3:38 AM, Andy Seaborne <an...@apache.org> wrote:
>>>>> 
>>>>> On 23/11/15 15:31, A. Soroka wrote:
>>>>>> A weightless +1 from me, of course! {grin} I’m excited to see this stuff get in front of people.
>>>>>> 
>>>>>> I’m not sure what you mean by “Fuseki integration”, Andy. Do you mean to make the new dataset impl available as an option in Fuseki (presumably in both the web admin UI and the CLI for the standalone server)?
>>>>> 
>>>>> Fuseki should change to using the new in-memory txn dataset when creating new in-memory setups.  So the template-driven assemblers (UI) and from the command line (--mem) will change.  Existing setups are preserved because the template is used to create an assembler when first used, not each start-up.
>>>>> 
>>>>> 	Andy
>>>>> 
>>>>>> 
>>>>>> ---
>>>>>> A. Soroka
>>>>>> The University of Virginia Library
>>>>>> 
>>>>>>> On Nov 23, 2015, at 10:24 AM, Andy Seaborne <an...@apache.org> wrote:
>>>>>>> 
>>>>>>> We have:
>>>>>>> 
>>>>>>> * in-memory transactional dataset from Adam is in the codebase
>>>>>>> (details on progress below)
>>>>>>> * configurable lucene analyzers
>>>>>>> * general maintenance.
>>>>>>> * Improved start-up and configuration to wire the jars together.
>>>>>>> 
>>>>>>> Shall we do a release? This fits our 3-6 months cycle.
>>>>>>> 
>>>>>>> As we know, getting people to try unreleased builds has limited effect (generally for point-bug fixes, less so for new features) so my suggestion is to release soon as v3.0.1 and describe the in-memory transactional dataset as "RC".  It's not fully integrated (test: Fuseki is not using it yet) and getting that done and stable isn't an instant task to do. At the same time, a release will let people try it.
>>>>>>> 
>>>>>>> So we expose the in-memory transactional dataset at the API level now, introduce the deprecations so people can see what's coming, and complete/stabilize the integration to make a v3.1.0 in a few months' time.
>>>>>>> 
>>>>>>> 	Andy
>>>>>>> 
>>>>>>> 
>>>>>>> Progress on the in-memory transactional dataset
>>>>>>> https://pony-poc.apache.org/thread.html/Zj36f2pbaszgdnj
>>>>>>> 
>>>>>>> If you want to try it, then DatasetFactory.createTxnMem() is the entry point.
>>>>>>> 
>>>>>>> --------------------------------------------------------
>>>>>>> 
>>>>>>> Outstanding tasks:
>>>>>>> * Dependency version management in jena-parent.
>>>>>>> 
>>>>>>> Done
>>>>>>> 
>>>>>>> * more testing [done], and the organisation of the tests
>>>>>>> 
>>>>>>> Test done, organisation of tests is partially done. There is a wider task to organise the dataset tests we have into some sort of order but that's non-blocking.
>>>>>>> 
>>>>>>> * documentation
>>>>>>> * code cleaning for deprecation of DatasetFactory.createMem
>>>>>>> * Same migration for DatasetGraphFactory.createMem
>>>>>>> * Fuseki integration
>>>>>>> 
>>>>>>> --------------------------------------------------------
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 


Re: A release?

Posted by Andy Seaborne <an...@apache.org>.
On 09/12/15 15:19, A. Soroka wrote:
> Here:
>
> https://gist.github.com/ajs6f/463117f0d0e094ffefc5
>
> are some suggested edits for that doc, mostly just typo-fixes and suggestions for clarification.

Done, thanks.

> Other questions from me:
>
> 1) Is it worth creating a component in Jira to track tickets for this stuff?

I think incoming JIRAs won't be simple to associate with this one 
component.  We can see how it goes, but the issues are more likely to be 
about the various differences between the datasets.  Just a guess.

>
> 2) Shall I write up a simple page describing the impl and more importantly, extension points?

Always useful, not urgent.

(What are the extension points?)

	Andy

>
> ---
> A. Soroka
> The University of Virginia Library
>
>> On Dec 9, 2015, at 5:20 AM, Andy Seaborne <an...@apache.org> wrote:
>>
>> I put some documentation in at
>>
>> http://jena.staging.apache.org/documentation/rdf/datasets.html
>>
>> because the announcement needs to link to something.  I hope I got it right.  This can be changed between releases -- the bytes aren't in the release itself.
>>
>> (The file name was chosen so that more general text could go there and this becomes a section of documentation about dataset usage in Jena)
>>
>> 	Andy
>>
>> On 08/12/15 14:53, A. Soroka wrote:
>>> That should certainly “enlist” a number of RC testers. {grin}
>>>
>>> I will stand by for bug fixes.
>>>
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>>
>>>> On Dec 8, 2015, at 3:38 AM, Andy Seaborne <an...@apache.org> wrote:
>>>>
>>>> On 23/11/15 15:31, A. Soroka wrote:
>>>>> A weightless +1 from me, of course! {grin} I’m excited to see this stuff get in front of people.
>>>>>
>>>>> I’m not sure what you mean by “Fuseki integration”, Andy. Do you mean to make the new dataset impl available as an option in Fuseki (presumably in both the web admin UI and the CLI for the standalone server)?
>>>>
>>>> Fuseki should change to using the new in-memory txn dataset when creating new in-memory setups.  So the template-driven assemblers (UI) and from the command line (--mem) will change.  Existing setups are preserved because the template is used to create an assembler when first used, not each start-up.
>>>>
>>>> 	Andy
>>>>
>>>>>
>>>>> ---
>>>>> A. Soroka
>>>>> The University of Virginia Library
>>>>>
>>>>>> On Nov 23, 2015, at 10:24 AM, Andy Seaborne <an...@apache.org> wrote:
>>>>>>
>>>>>> We have:
>>>>>>
>>>>>> * in-memory transactional dataset from Adam is in the codebase
>>>>>> (details on progress below)
>>>>>> * configurable lucene analyzers
>>>>>> * general maintenance.
>>>>>> * Improved start-up and configuration to wire the jars together.
>>>>>>
>>>>>> Shall we do a release? This fits our 3-6 months cycle.
>>>>>>
>>>>>> As we know, getting people to try unreleased builds has limited effect (generally for point-bug fixes, less so for new features) so my suggestion is to release soon as v3.0.1 and describe the in-memory transactional dataset as "RC".  It's not fully integrated (test: Fuseki is not using it yet) and getting that done and stable isn't an instant task to do. At the same time, a release will let people try it.
>>>>>>
>>>>>> So we expose the in-memory transactional dataset at the API level now, introduce the deprecations so people can see what's coming, and complete/stabilize the integration to make a v3.1.0 in a few months' time.
>>>>>>
>>>>>> 	Andy
>>>>>>
>>>>>>
>>>>>> Progress on the in-memory transactional dataset
>>>>>> https://pony-poc.apache.org/thread.html/Zj36f2pbaszgdnj
>>>>>>
>>>>>> If you want to try it, then DatasetFactory.createTxnMem() is the entry point.
>>>>>>
>>>>>> --------------------------------------------------------
>>>>>>
>>>>>> Outstanding tasks:
>>>>>> * Dependency version management in jena-parent.
>>>>>>
>>>>>> Done
>>>>>>
>>>>>> * more testing [done], and the organisation of the tests
>>>>>>
>>>>>> Test done, organisation of tests is partially done. There is a wider task to organise the dataset tests we have into some sort of order but that's non-blocking.
>>>>>>
>>>>>> * documentation
>>>>>> * code cleaning for deprecation of DatasetFactory.createMem
>>>>>> * Same migration for DatasetGraphFactory.createMem
>>>>>> * Fuseki integration
>>>>>>
>>>>>> --------------------------------------------------------
>>>>>>
>>>>>
>>>>
>>>
>>
>


Re: A release?

Posted by "A. Soroka" <aj...@virginia.edu>.
Here:

https://gist.github.com/ajs6f/463117f0d0e094ffefc5

are some suggested edits for that doc, mostly just typo-fixes and suggestions for clarification. Other questions from me:

1) Is it worth creating a component in Jira to track tickets for this stuff?

2) Shall I write up a simple page describing the impl and more importantly, extension points?

---
A. Soroka
The University of Virginia Library

> On Dec 9, 2015, at 5:20 AM, Andy Seaborne <an...@apache.org> wrote:
> 
> I put some documentation in at
> 
> http://jena.staging.apache.org/documentation/rdf/datasets.html
> 
> because the announcement needs to link to something.  I hope I got it right.  This can be changed between releases -- the bytes aren't in the release itself.
> 
> (The file name was chosen so that more general text could go there and this becomes a section of documentation about dataset usage in Jena)
> 
> 	Andy
> 
> On 08/12/15 14:53, A. Soroka wrote:
>> That should certainly “enlist” a number of RC testers. {grin}
>> 
>> I will stand by for bug fixes.
>> 
>> ---
>> A. Soroka
>> The University of Virginia Library
>> 
>>> On Dec 8, 2015, at 3:38 AM, Andy Seaborne <an...@apache.org> wrote:
>>> 
>>> On 23/11/15 15:31, A. Soroka wrote:
>>>> A weightless +1 from me, of course! {grin} I’m excited to see this stuff get in front of people.
>>>> 
>>>> I’m not sure what you mean by “Fuseki integration”, Andy. Do you mean to make the new dataset impl available as an option in Fuseki (presumably in both the web admin UI and the CLI for the standalone server)?
>>> 
>>> Fuseki should change to using the new in-memory txn dataset when creating new in-memory setups.  So the template-driven assemblers (UI) and from the command line (--mem) will change.  Existing setups are preserved because the template is used to create an assembler when first used, not each start-up.
>>> 
>>> 	Andy
>>> 
>>>> 
>>>> ---
>>>> A. Soroka
>>>> The University of Virginia Library
>>>> 
>>>>> On Nov 23, 2015, at 10:24 AM, Andy Seaborne <an...@apache.org> wrote:
>>>>> 
>>>>> We have:
>>>>> 
>>>>> * in-memory transactional dataset from Adam is in the codebase
>>>>> (details on progress below)
>>>>> * configurable lucene analyzers
>>>>> * general maintenance.
>>>>> * Improved start-up and configuration to wire the jars together.
>>>>> 
>>>>> Shall we do a release? This fits our 3-6 months cycle.
>>>>> 
>>>>> As we know, getting people to try unreleased builds has limited effect (generally for point-bug fixes, less so for new features) so my suggestion is to release soon as v3.0.1 and describe the in-memory transactional dataset as "RC".  It's not fully integrated (test: Fuseki is not using it yet) and getting that done and stable isn't an instant task to do. At the same time, a release will let people try it.
>>>>> 
>>>>> So we expose the in-memory transactional dataset at the API level now, introduce the deprecations so people can see what's coming, and complete/stabilize the integration to make a v3.1.0 in a few months' time.
>>>>> 
>>>>> 	Andy
>>>>> 
>>>>> 
>>>>> Progress on the in-memory transactional dataset
>>>>> https://pony-poc.apache.org/thread.html/Zj36f2pbaszgdnj
>>>>> 
>>>>> If you want to try it, then DatasetFactory.createTxnMem() is the entry point.
>>>>> 
>>>>> --------------------------------------------------------
>>>>> 
>>>>> Outstanding tasks:
>>>>> * Dependency version management in jena-parent.
>>>>> 
>>>>> Done
>>>>> 
>>>>> * more testing [done], and the organisation of the tests
>>>>> 
>>>>> Test done, organisation of tests is partially done. There is a wider task to organise the dataset tests we have into some sort of order but that's non-blocking.
>>>>> 
>>>>> * documentation
>>>>> * code cleaning for deprecation of DatasetFactory.createMem
>>>>> * Same migration for DatasetGraphFactory.createMem
>>>>> * Fuseki integration
>>>>> 
>>>>> --------------------------------------------------------
>>>>> 
>>>> 
>>> 
>> 
> 


Re: A release?

Posted by Andy Seaborne <an...@apache.org>.
I put some documentation in at

http://jena.staging.apache.org/documentation/rdf/datasets.html

because the announcement needs to link to something.  I hope I got it 
right.  This can be changed between releases -- the bytes aren't in the 
release itself.

(The file name was chosen so that more general text could go there and 
this becomes a section of documentation about dataset usage in Jena)

	Andy

On 08/12/15 14:53, A. Soroka wrote:
> That should certainly “enlist” a number of RC testers. {grin}
>
> I will stand by for bug fixes.
>
> ---
> A. Soroka
> The University of Virginia Library
>
>> On Dec 8, 2015, at 3:38 AM, Andy Seaborne <an...@apache.org> wrote:
>>
>> On 23/11/15 15:31, A. Soroka wrote:
>>> A weightless +1 from me, of course! {grin} I’m excited to see this stuff get in front of people.
>>>
>>> I’m not sure what you mean by “Fuseki integration”, Andy. Do you mean to make the new dataset impl available as an option in Fuseki (presumably in both the web admin UI and the CLI for the standalone server)?
>>
>> Fuseki should change to using the new in-memory txn dataset when creating new in-memory setups.  So the template-driven assemblers (UI) and from the command line (--mem) will change.  Existing setups are preserved because the template is used to create an assembler when first used, not each start-up.
>>
>> 	Andy
>>
>>>
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>>
>>>> On Nov 23, 2015, at 10:24 AM, Andy Seaborne <an...@apache.org> wrote:
>>>>
>>>> We have:
>>>>
>>>> * in-memory transactional dataset from Adam is in the codebase
>>>> (details on progress below)
>>>> * configurable lucene analyzers
>>>> * general maintenance.
>>>> * Improved start-up and configuration to wire the jars together.
>>>>
>>>> Shall we do a release? This fits our 3-6 months cycle.
>>>>
>>>> As we know, getting people to try unreleased builds has limited effect (generally for point-bug fixes, less so for new features) so my suggestion is to release soon as v3.0.1 and describe the in-memory transactional dataset as "RC".  It's not fully integrated (test: Fuseki is not using it yet) and getting that done and stable isn't an instant task to do. At the same time, a release will let people try it.
>>>>
>>>> So we expose the in-memory transactional dataset at the API level now, introduce the deprecations so people can see what's coming, and complete/stabilize the integration to make a v3.1.0 in a few months' time.
>>>>
>>>> 	Andy
>>>>
>>>>
>>>> Progress on the in-memory transactional dataset
>>>> https://pony-poc.apache.org/thread.html/Zj36f2pbaszgdnj
>>>>
>>>> If you want to try it, then DatasetFactory.createTxnMem() is the entry point.
>>>>
>>>> --------------------------------------------------------
>>>>
>>>> Outstanding tasks:
>>>> * Dependency version management in jena-parent.
>>>>
>>>> Done
>>>>
>>>> * more testing [done], and the organisation of the tests
>>>>
>>>> Test done, organisation of tests is partially done. There is a wider task to organise the dataset tests we have into some sort of order but that's non-blocking.
>>>>
>>>> * documentation
>>>> * code cleaning for deprecation of DatasetFactory.createMem
>>>> * Same migration for DatasetGraphFactory.createMem
>>>> * Fuseki integration
>>>>
>>>> --------------------------------------------------------
>>>>
>>>
>>
>


Re: A release?

Posted by "A. Soroka" <aj...@virginia.edu>.
That should certainly “enlist” a number of RC testers. {grin}

I will stand by for bug fixes.

---
A. Soroka
The University of Virginia Library

> On Dec 8, 2015, at 3:38 AM, Andy Seaborne <an...@apache.org> wrote:
> 
> On 23/11/15 15:31, A. Soroka wrote:
>> A weightless +1 from me, of course! {grin} I’m excited to see this stuff get in front of people.
>> 
>> I’m not sure what you mean by “Fuseki integration”, Andy. Do you mean to make the new dataset impl available as an option in Fuseki (presumably in both the web admin UI and the CLI for the standalone server)?
> 
> Fuseki should change to using the new in-memory txn dataset when creating new in-memory setups.  So the template-driven assemblers (UI) and from the command line (--mem) will change.  Existing setups are preserved because the template is used to create an assembler when first used, not each start-up.
> 
> 	Andy
> 
>> 
>> ---
>> A. Soroka
>> The University of Virginia Library
>> 
>>> On Nov 23, 2015, at 10:24 AM, Andy Seaborne <an...@apache.org> wrote:
>>> 
>>> We have:
>>> 
>>> * in-memory transactional dataset from Adam is in the codebase
>>> (details on progress below)
>>> * configurable lucene analyzers
>>> * general maintenance.
>>> * Improved start-up and configuration to wire the jars together.
>>> 
>>> Shall we do a release? This fits our 3-6 months cycle.
>>> 
>>> As we know, getting people to try unreleased builds has limited effect (generally for point-bug fixes, less so for new features) so my suggestion is to release soon as v3.0.1 and describe the in-memory transactional dataset as "RC".  It's not fully integrated (test: Fuseki is not using it yet) and getting that done and stable isn't an instant task to do. At the same time, a release will let people try it.
>>> 
>>> So we expose the in-memory transactional dataset at the API level now, introduce the deprecations so people can see what's coming, and complete/stabilize the integration to make a v3.1.0 in a few months' time.
>>> 
>>> 	Andy
>>> 
>>> 
>>> Progress on the in-memory transactional dataset
>>> https://pony-poc.apache.org/thread.html/Zj36f2pbaszgdnj
>>> 
>>> If you want to try it, then DatasetFactory.createTxnMem() is the entry point.
>>> 
>>> --------------------------------------------------------
>>> 
>>> Outstanding tasks:
>>> * Dependency version management in jena-parent.
>>> 
>>> Done
>>> 
>>> * more testing [done], and the organisation of the tests
>>> 
>>> Test done, organisation of tests is partially done. There is a wider task to organise the dataset tests we have into some sort of order but that's non-blocking.
>>> 
>>> * documentation
>>> * code cleaning for deprecation of DatasetFactory.createMem
>>> * Same migration for DatasetGraphFactory.createMem
>>> * Fuseki integration
>>> 
>>> --------------------------------------------------------
>>> 
>> 
>