Posted to users@jena.apache.org by Dick Murray <da...@gmail.com> on 2019/10/23 09:58:22 UTC

Detecting writes natively to a DatasetGraph since a particular epoch

Hi.

Is it possible to natively detect whether a write has occurred to a
DatasetGraph since a particular epoch?

For the purposes of caching, if I perform an expensive read from a
DatasetGraph, knowing whether I need to invalidate the cache is very useful.
Does TDB or the Mem natively track if a write has occurred?

Currently I am wrapping the Transactional but am interested if this can be
shimmed to the underlying SPI.

Regards DickM
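
A minimal sketch of the wrapping approach mentioned above, in plain Java with
no Jena dependency. SimpleTransactional is a hypothetical stand-in for Jena's
Transactional, and EpochTransactional, NoOpStore and writeEpoch() are
illustrative names, not Jena API. The wrapper bumps a counter each time a
write transaction commits; a cache can record the counter when it is filled
and treat itself as stale once the counter has moved.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a transactional store; not the Jena interface.
interface SimpleTransactional {
    void begin(boolean write);
    void commit();
    void abort();
}

// Wrapper that counts committed write transactions.
class EpochTransactional implements SimpleTransactional {
    private final SimpleTransactional delegate;
    private final AtomicLong epoch = new AtomicLong();
    // Whether the current thread's transaction is a write transaction.
    private final ThreadLocal<Boolean> inWrite = ThreadLocal.withInitial(() -> false);

    EpochTransactional(SimpleTransactional delegate) { this.delegate = delegate; }

    @Override public void begin(boolean write) {
        delegate.begin(write);
        inWrite.set(write);
    }

    @Override public void commit() {
        delegate.commit();
        // Only a committed write can change what later readers see.
        if (inWrite.get()) epoch.incrementAndGet();
        inWrite.set(false);
    }

    @Override public void abort() {
        delegate.abort();
        inWrite.set(false);
    }

    // Number of write transactions committed through this wrapper so far.
    long writeEpoch() { return epoch.get(); }
}

// Trivial store used only to exercise the wrapper.
class NoOpStore implements SimpleTransactional {
    @Override public void begin(boolean write) {}
    @Override public void commit() {}
    @Override public void abort() {}
}
```

This is deliberately conservative: it counts every committed write
transaction, even one that changed nothing, and it cannot see writes that
bypass the wrapper.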

Re: Detecting writes natively to a DatasetGraph since a particular epoch

Posted by Andy Seaborne <an...@apache.org>.

On 25/10/2019 16:02, ajs6f wrote:
>> On Oct 24, 2019, at 7:17 AM, Andy Seaborne <an...@apache.org> wrote:
>>
>> On 23/10/2019 15:31, ajs6f wrote:
>>> Hi Dick!
>>> I'm afraid the answer for TDB2 and TIM is no, not out of the box. Both use MVCC techniques, not buffering operations. Andy can tell you about TDB1.
>>> This might (maybe) be helpful:
>>> https://github.com/apache/jena/commit/241060548a9fca777b7d40f1f216ae7ed930b20e
>>> Before Andy and I got TIM into the design it now has (multiple MVCC indexes connected transactionally) I did another design that _does_ use buffering of operations. Since we didn't go with it, we cut it out of the codebase, but the history is there if you find it useful.
>>
>> Cool - might be a good starting point for buffering.  I had a need for a buffering dataset, which keeps the change, makes find() work as if the change happened but does not pass add/delete through until requested.
> 
> That's just what that code did, so it might be just what you want.
> 
> ajs6f
> 

It looks like a good basis.

It went through some iterations - the one at that commit records the
change and calls super.add(quad).
It creates an undo log so that abort() can be done by reversing the 
changes.  It does not need to do anything special for find() because the 
underlying store is, aside from visibility foo, up-to-date.
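
As a rough illustration of the undo-log style just described (not the code
at that commit), here is a plain-Java sketch. A Set<String> stands in for
the underlying store and strings stand in for quads; UndoLogStore and its
method names are hypothetical. Each change is applied immediately and its
inverse is pushed onto a log, so abort() only has to replay the log
backwards.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Undo-log sketch: changes go straight to the store; abort() reverses them.
class UndoLogStore {
    private final Set<String> store = new HashSet<>();   // stands in for the underlying store
    private final Deque<Runnable> undoLog = new ArrayDeque<>();

    void add(String quad) {
        // Record an undo action only if the add really changed the store.
        if (store.add(quad)) undoLog.push(() -> store.remove(quad));
    }

    void delete(String quad) {
        // Record an undo action only if the delete really changed the store.
        if (store.remove(quad)) undoLog.push(() -> store.add(quad));
    }

    // No special handling needed: the store already reflects every change.
    boolean contains(String quad) { return store.contains(quad); }

    // Changes are already in place, so committing just forgets how to undo them.
    void commit() { undoLog.clear(); }

    void abort() {
        // Reverse the changes, most recent first.
        while (!undoLog.isEmpty()) undoLog.pop().run();
    }
}
```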

We did at least talk about a redo style - delaying the changes until 
commit. That would also be good input; I'll go look.

     Andy

Re: Detecting writes natively to a DatasetGraph since a particular epoch

Posted by ajs6f <aj...@apache.org>.
> On Oct 24, 2019, at 7:17 AM, Andy Seaborne <an...@apache.org> wrote:
> 
> On 23/10/2019 15:31, ajs6f wrote:
>> Hi Dick!
>> I'm afraid the answer for TDB2 and TIM is no, not out of the box. Both use MVCC techniques, not buffering operations. Andy can tell you about TDB1.
>> This might (maybe) be helpful:
>> https://github.com/apache/jena/commit/241060548a9fca777b7d40f1f216ae7ed930b20e
>> Before Andy and I got TIM into the design it now has (multiple MVCC indexes connected transactionally) I did another design that _does_ use buffering of operations. Since we didn't go with it, we cut it out of the codebase, but the history is there if you find it useful.
> 
> Cool - might be a good starting point for buffering.  I had a need for a buffering dataset, which keeps the change, makes find() work as if the change happened but does not pass add/delete through until requested.

That's just what that code did, so it might be just what you want.

ajs6f


Re: Detecting writes natively to a DatasetGraph since a particular epoch

Posted by Andy Seaborne <an...@apache.org>.
TDB2 recently got TransactionListener to tap into the transaction
lifecycle. That was for a specific need (for abort to undo some changes
made ahead of time) but the idea might be general.
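
To illustrate how a lifecycle callback could drive cache invalidation, here
is a plain-Java sketch. It does not use the real TDB2 TransactionListener,
whose methods differ; WriteCommitListener, NotifyingStore and
InvalidatingCache are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical lifecycle callback; the real TDB2 TransactionListener differs.
interface WriteCommitListener {
    void writeCommitted();
}

// Stand-in for a store that notifies listeners when a write transaction commits.
class NotifyingStore {
    private final List<WriteCommitListener> listeners = new ArrayList<>();

    void addListener(WriteCommitListener listener) { listeners.add(listener); }

    void commitWrite() {
        for (WriteCommitListener listener : listeners) listener.writeCommitted();
    }
}

// Cache of one expensive read, dropped whenever a write commits.
class InvalidatingCache<T> implements WriteCommitListener {
    private final Supplier<T> expensiveRead;
    private T value;
    private boolean valid = false;

    InvalidatingCache(Supplier<T> expensiveRead) { this.expensiveRead = expensiveRead; }

    @Override public synchronized void writeCommitted() { valid = false; }

    synchronized T get() {
        if (!valid) {
            // Recompute only after a write has invalidated the cached value.
            value = expensiveRead.get();
            valid = true;
        }
        return value;
    }
}
```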

In RDF Delta, there is DatasetGraphChanges which enables you to get
changes, which includes begin-commit.

Tapping into the Transactional does seem like a good way to go though.

     Andy

On 23/10/2019 15:31, ajs6f wrote:
> Hi Dick!
> 
> I'm afraid the answer for TDB2 and TIM is no, not out of the box. Both use MVCC techniques, not buffering operations. Andy can tell you about TDB1.
> 
> This might (maybe) be helpful:
> 
> https://github.com/apache/jena/commit/241060548a9fca777b7d40f1f216ae7ed930b20e
> 
> Before Andy and I got TIM into the design it now has (multiple MVCC indexes connected transactionally) I did another design that _does_ use buffering of operations. Since we didn't go with it, we cut it out of the codebase, but the history is there if you find it useful.

Cool - might be a good starting point for buffering.  I had a need for a 
buffering dataset, which keeps the change, makes find() work as if the 
change happened but does not pass add/delete through until requested.
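
A minimal plain-Java sketch of that kind of buffering, under the assumption
that a Set<String> can stand in for the underlying dataset and strings for
quads. BufferingStore, flush() and discard() are hypothetical names, and
contains() plays the role of find(). Adds and deletes are held in two side
sets; lookups consult them first, and nothing reaches the underlying store
until flush() is called.

```java
import java.util.HashSet;
import java.util.Set;

// Buffering (redo-style) sketch: changes are held back until flush().
class BufferingStore {
    private final Set<String> base;                       // underlying store, untouched until flush()
    private final Set<String> added = new HashSet<>();    // buffered adds, none of them in base
    private final Set<String> deleted = new HashSet<>();  // buffered deletes, all of them in base

    BufferingStore(Set<String> base) { this.base = base; }

    void add(String quad) {
        deleted.remove(quad);                       // cancel a pending delete, if any
        if (!base.contains(quad)) added.add(quad);
    }

    void delete(String quad) {
        added.remove(quad);                         // cancel a pending add, if any
        if (base.contains(quad)) deleted.add(quad);
    }

    // Answers as if the buffered changes had already been applied.
    boolean contains(String quad) {
        if (deleted.contains(quad)) return false;
        return added.contains(quad) || base.contains(quad);
    }

    // Pass the buffered changes through to the underlying store.
    void flush() {
        base.removeAll(deleted);
        base.addAll(added);
        discard();
    }

    // Drop the buffered changes without touching the underlying store.
    void discard() {
        added.clear();
        deleted.clear();
    }
}
```

The sketch assumes nothing else modifies the underlying set while changes
are buffered.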

> 
> ajs6f
> 

Re: Detecting writes natively to a DatasetGraph since a particular epoch

Posted by ajs6f <aj...@apache.org>.
Hi Dick!

I'm afraid the answer for TDB2 and TIM is no, not out of the box. Both use MVCC techniques, not buffering operations. Andy can tell you about TDB1. 

This might (maybe) be helpful:

https://github.com/apache/jena/commit/241060548a9fca777b7d40f1f216ae7ed930b20e

Before Andy and I got TIM into the design it now has (multiple MVCC indexes connected transactionally) I did another design that _does_ use buffering of operations. Since we didn't go with it, we cut it out of the codebase, but the history is there if you find it useful. 

ajs6f
