Posted to dev@jena.apache.org by GitBox <gi...@apache.org> on 2019/12/07 15:31:29 UTC

[GitHub] [jena] afs commented on issue #646: JENA-1785: A newly created node can remain invisible after commit

afs commented on issue #646: JENA-1785: A newly created node can remain invisible after commit
URL: https://github.com/apache/jena/pull/646#issuecomment-562860832
 
 
   Firstly - thank you very much for working on this and the discussion. The more eyes on concurrency and transaction issues the better.
   
   This approach looks like a good one.
   I think it can be simplified, removing some lock use.
   
   1/
   The principle that, when there is a writer, only the writer can update the not-present cache is good. When there isn't a writer, any reader can update the not-present cache, and as far as I can see there is no need to track the data version. This is because, when there is no writer, the readers can see the whole node table. If they find a node/nodeid that isn't in the RDF data, all that happens is that its use in a triple pattern fails. The not-present cache just speeds that up; that cache doesn't have to be very large either.
   
   So the rule is: either it's the active writer, or it's any reader while no writer is active.
   
   ```
   private boolean inTopLevelTxn() {
       Transaction t = txn.get();
       if ( t == null )
           return false;
       // Either t is the write transaction or is a reader and no W txn is active.
       if ( t.isWriteTxn() )
           return true;
       return !hasActiveWriteTransaction.get();
   }
   ```
   
   2/
   
   That means `hasActiveWriteTransaction` can be an `AtomicBoolean`, managed in `updateStart`/`updateCommit`/`updateAbort`. This removes the need for `synchronized` because only one value is needed.
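   
   As a rough illustration of 1/ and 2/ together, here is a standalone sketch. It is not the code on the branch: `MissCacheGuard`, `TxnKind` and `mayUpdateNotPresentCache` are hypothetical names made up for this example, and the caller's transaction is passed in explicitly instead of being looked up via `txn.get()`. Only `hasActiveWriteTransaction` and the `updateStart`/`updateCommit`/`updateAbort` hooks correspond to names used above.
   
   ```java
   import java.util.concurrent.atomic.AtomicBoolean;
   
   // Hypothetical: the kind of transaction the calling thread is in.
   enum TxnKind { NONE, READ, WRITE }
   
   // Hypothetical stand-in for the part of the node table that guards the not-present cache.
   class MissCacheGuard {
       // The single piece of shared state: true only while a write transaction is active.
       private final AtomicBoolean hasActiveWriteTransaction = new AtomicBoolean(false);
   
       // Called at the start and end of the (single) write transaction.
       void updateStart()  { hasActiveWriteTransaction.set(true); }
       void updateCommit() { hasActiveWriteTransaction.set(false); }
       void updateAbort()  { hasActiveWriteTransaction.set(false); }
   
       // The rule: the active writer may always update the cache;
       // a reader may update it only while no writer is active.
       boolean mayUpdateNotPresentCache(TxnKind kind) {
           if ( kind == TxnKind.NONE )
               return false;
           if ( kind == TxnKind.WRITE )
               return true;
           return !hasActiveWriteTransaction.get();
       }
   }
   ```
   
   Because the flag is a single value that is set and read atomically, no `synchronized` block is needed around it in this sketch.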
   
   I've made these changes for discussion on a scratch branch:
   
   https://github.com/apache/jena/compare/master...afs:jena1785_tdb2_misscache
   
   Two commits: the first is PR646, the second is the suggestions above (commit message "AFS-PR646"), currently commit https://github.com/apache/jena/commit/d7063a3ad5aa2751a57b8e3356b2fee122f45595
   
   3/
   There are changes in the test suite to build a test dataset with smaller cache sizes. Then the caches can be fully cycled in a reasonable time.
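   
   To show why the smaller size matters, here is a generic sketch using a bounded LRU map from the JDK. It is not Jena's cache implementation and not the test code on the branch; `SmallLruCache` is a hypothetical name. With a capacity of 4, writing 8 distinct keys is enough to evict every original entry, whereas a cache sized for production would need far more writes before anything is cycled out.
   
   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;
   
   // Hypothetical bounded LRU cache, used only to illustrate cache cycling.
   class SmallLruCache<K, V> extends LinkedHashMap<K, V> {
       private final int maxSize;
   
       SmallLruCache(int maxSize) {
           // accessOrder = true gives least-recently-used eviction order.
           super(16, 0.75f, true);
           this.maxSize = maxSize;
       }
   
       // Evict the eldest entry once the cache grows past its limit.
       @Override
       protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
           return size() > maxSize;
       }
   }
   ```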
