Posted to dev@jena.apache.org by Claude Warren <cl...@xenei.com> on 2015/01/27 07:44:07 UTC

Assembler black magic????

I have spent a bit of time in the last few days looking at assemblers,
particularly with respect to the Mode object.  It seems this object gives a
hint as to whether to create a new object or reuse an old one, but
only one assembler seems to use it (File, I think).

From what I recall, TDB doesn't like it if you try to open 2 different
connections to the same directory, but the assembler seems to do exactly
that on every invocation of the DatasetTDB assembler.

For example using the following assembler file:


my:dataset rdf:type tdb:DatasetTDB;
    tdb:location "/tmp/myApp" ;
    tdb:unionDefaultGraph true ;
    .


my:modelA  rdf:type tdb:GraphTDB ;
    tdb:dataset my:dataset ;
    tdb:graphName <http://example/A> ;
    .

my:modelB  rdf:type tdb:GraphTDB ;
    tdb:dataset my:dataset ;
    tdb:graphName <http://example/B> ;
    .

my:dataset will be created 2x.  Once for my:modelA and once for my:modelB.
Obviously this is not a problem as the system works, but how?

Claude

Re: Assembler black magic????

Posted by Claude Warren <cl...@xenei.com>.
Perhaps "assemblers" is another space where we can clean up for v3.

But for v2, there are no magical caches or the like that would let me code
the Mode to say "don't bother calling the real assembler if one of these has
already been built".  I have to handle it in my assembler (or in some code
the assembler depends on, a la TDB).

I thought that might be the answer, but I wanted to be sure before I went
down that path.

Thx,
Claude

Re: Assembler black magic????

Posted by Andy Seaborne <an...@apache.org>.
On 27/01/15 06:44, Claude Warren wrote:
> I have spent a bit of time in the last few days looking at assemblers
> particularly with respect to the Mode object.  It seems this object gives a
> hint as to whether or not to create a new object or reuse an old one.  but
> only one assembler seems to use it (File I think).

IMO that design doesn't work that well:

1/ There are two sharing cases: "must share" (functional requirement) 
and "can share" (efficiency).  These are not distinguished.

2/ Creating objects is not just about assemblers.  If something is a 
"must share" that invariant must be enforced for API calls as well.

>  From what I recall, TDB doesn't like it if you try to open 2 different
> connections to the same directory, but the assembler seems like it does
> that in that every invocation of the DatasetTDB assembler.

TDB takes care of all that -- "must share" enforcement.

The assembler calls TDBFactory. Magic happens. No violation of the 
sharing rules for TDB.

> For example using the following assembler file:
>
>
> my:dataset rdf:type tdb:DatasetTDB;
>      tdb:location "/tmp/myApp" ;
>      tdb:unionDefaultGraph true ;
>      .
>
>
> my:modelA  rdf:type tdb:GraphTDB ;
>      tdb:dataset my:dataset ;
>      tdb:graphName <http://example/A> ;
>      .
>
> my:modelB  rdf:type tdb:GraphTDB ;
>      tdb:dataset my:dataset ;
>      tdb:graphName <http://example/B> ;
>      .
>
> my:dataset will be created 2x.  Once for my:modelA and once for my:modelB.
> Obviously this is not a problem as the system works, but how?
>
> Claude

Rob added an on-disk lock file for further checking, across JVMs and
for weird and wacky ways to name the same disk space with different
names (e.g. symbolic links).
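
To illustrate the idea of a cross-JVM check, here is a minimal sketch. It
is not TDB's lock file implementation, which this thread does not describe.
It swaps in OS-level file locks from java.nio, and the names DirectoryLock,
example.lock and HELD_IN_THIS_JVM are hypothetical. toRealPath() resolves
symbolic links, so two different names for the same directory map to the
same key.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical guard: at most one holder per directory, within and across JVMs.
final class DirectoryLock implements AutoCloseable {
    // Assumed lock file name, chosen for this sketch only.
    private static final String LOCK_FILE_NAME = "example.lock";
    // Directories already locked by this JVM, keyed by symlink-resolved path.
    private static final Set<Path> HELD_IN_THIS_JVM = ConcurrentHashMap.newKeySet();

    private final Path realDir;
    private final FileChannel channel;
    private final FileLock lock;

    private DirectoryLock(Path realDir, FileChannel channel, FileLock lock) {
        this.realDir = realDir;
        this.channel = channel;
        this.lock = lock;
    }

    static DirectoryLock acquire(Path dir) {
        Path real;
        try {
            real = dir.toRealPath(); // resolves symbolic links and ".." segments
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // In-JVM check first, so a second channel is never opened on a held lock file.
        if (!HELD_IN_THIS_JVM.add(real)) {
            throw new IllegalStateException("Location already in use in this JVM: " + real);
        }
        FileChannel ch = null;
        try {
            ch = FileChannel.open(real.resolve(LOCK_FILE_NAME),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            FileLock l = ch.tryLock(); // null if another process holds the lock
            if (l == null) {
                throw new IllegalStateException("Location locked by another process: " + real);
            }
            return new DirectoryLock(real, ch, l);
        } catch (IOException e) {
            abandon(real, ch);
            throw new UncheckedIOException(e);
        } catch (RuntimeException e) {
            abandon(real, ch);
            throw e;
        }
    }

    // Undo a failed acquire: forget the directory and close the channel if it was opened.
    private static void abandon(Path real, FileChannel ch) {
        HELD_IN_THIS_JVM.remove(real);
        if (ch != null) {
            try {
                ch.close();
            } catch (IOException ignored) {
                // Nothing useful to do while already failing.
            }
        }
    }

    @Override
    public void close() {
        try {
            lock.release();
            channel.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            HELD_IN_THIS_JVM.remove(realDir);
        }
    }
}
```

A second DirectoryLock.acquire(...) on the same directory, under any name,
fails with IllegalStateException until the first holder calls close().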

There is duplication of the surface objects (Dataset; presentation API)
but it does not matter: each one just holds a pointer to a
DatasetGraphTransaction (duplicated), which holds a pointer to a
StoreConnection (not duplicated).

StoreConnection is the important class. It contains a system-wide
Location->StoreConnection registry so StoreConnections get shared.
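
The registry pattern can be sketched in a few lines. This is an
illustration only, not the TDB source: SharedStore and DatasetHandle are
hypothetical stand-ins for the shared connection and the duplicated surface
object, and a plain String is assumed as the location key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the shared, per-location storage object.
final class SharedStore {
    // Process-wide registry: at most one SharedStore per location key.
    private static final Map<String, SharedStore> REGISTRY = new ConcurrentHashMap<>();

    final String location;

    private SharedStore(String location) {
        this.location = location;
    }

    // Returns the existing store for this location, creating it only on first use.
    static SharedStore connect(String location) {
        return REGISTRY.computeIfAbsent(location, SharedStore::new);
    }
}

// Hypothetical thin wrapper: cheap to duplicate, holds only a pointer to the store.
final class DatasetHandle {
    final SharedStore store;

    DatasetHandle(String location) {
        this.store = SharedStore.connect(location);
    }
}

class RegistryDemo {
    public static void main(String[] args) {
        DatasetHandle a = new DatasetHandle("/tmp/myApp");
        DatasetHandle b = new DatasetHandle("/tmp/myApp");
        System.out.println(a != b);             // true: two distinct wrappers
        System.out.println(a.store == b.store); // true: one shared store
    }
}
```

Because the sharing rule lives in connect(...) rather than in an assembler,
it holds for direct API calls as well as for assembler-built objects.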

Actually, DatasetGraphTransaction also has per-thread pointers to
DatasetGraphTxn, the transactional view. But fear not about duplication
of transactions, because co-ordination is done in StoreConnection and
its associated TransactionManager.
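
A rough sketch of that split, under stated assumptions: TxnCoordinator and
TxnView are hypothetical names, and the "coordination" here is only a
counter of active transactions, far simpler than a real transaction
manager. The point is that each wrapper keeps its own per-thread state
while every wrapper reports to the one shared coordinator.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical shared coordinator: one per store, however many wrappers exist.
final class TxnCoordinator {
    private final AtomicInteger active = new AtomicInteger();

    void begin() {
        active.incrementAndGet();
    }

    void end() {
        active.decrementAndGet();
    }

    int activeCount() {
        return active.get();
    }
}

// Hypothetical wrapper: per-thread transaction flag, coordination delegated.
final class TxnView {
    private final TxnCoordinator coordinator;
    private final ThreadLocal<Boolean> inTxn = ThreadLocal.withInitial(() -> false);

    TxnView(TxnCoordinator coordinator) {
        this.coordinator = coordinator;
    }

    void begin() {
        if (inTxn.get()) {
            throw new IllegalStateException("Already in a transaction on this thread");
        }
        inTxn.set(true);
        coordinator.begin();
    }

    void end() {
        if (!inTxn.get()) {
            throw new IllegalStateException("No transaction on this thread");
        }
        inTxn.set(false);
        coordinator.end();
    }

    boolean isInTransaction() {
        return inTxn.get();
    }
}
```

Two TxnView instances built over the same TxnCoordinator track their own
state independently, yet the coordinator sees every begin() and end().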

I have been working on a new transaction coordinator (for Lizard). It is
a separate component, not tied to Datasets, which TransactionManager is.
Lessons learned from TransactionManager put into the new one include a lot
of internal consistency checking, less use of thread locals, less
locking, and the introduction of per-transaction state management.

	Andy