
Posted to dev@jena.apache.org by Claude Warren <cl...@xenei.com> on 2014/01/01 08:43:10 UTC

Re: Using Graph from another JVM -- A solution.

I thought some more about my requirements and I think that making Node (and
Triple) Serializable would solve my problem.  I'll take a look at what it
would take to implement that.
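
As a rough illustration of the idea (this is not Jena code): RMI marshals
arguments and return values with Java serialization, so node and triple
types that implement java.io.Serializable can cross the JVM boundary
without a separate wire format.  SketchNode, SketchTriple and roundTrip
are hypothetical names invented for this sketch; the real Node and Triple
classes carry more node kinds and more state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

public class SerializableTripleSketch {

    /** Hypothetical stand-in for a node: a kind plus its lexical form. */
    static final class SketchNode implements Serializable {
        private static final long serialVersionUID = 1L;

        enum Kind { URI, BLANK, LITERAL }

        final Kind kind;
        final String value;

        SketchNode(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof SketchNode)) return false;
            SketchNode other = (SketchNode) o;
            return kind == other.kind && value.equals(other.value);
        }

        @Override public int hashCode() { return Objects.hash(kind, value); }
    }

    /** Hypothetical stand-in for a triple: three serializable nodes. */
    static final class SketchTriple implements Serializable {
        private static final long serialVersionUID = 1L;

        final SketchNode s, p, o;

        SketchTriple(SketchNode s, SketchNode p, SketchNode o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }

        @Override public boolean equals(Object x) {
            if (!(x instanceof SketchTriple)) return false;
            SketchTriple t = (SketchTriple) x;
            return s.equals(t.s) && p.equals(t.p) && o.equals(t.o);
        }

        @Override public int hashCode() { return Objects.hash(s, p, o); }
    }

    /** Serialize and deserialize in memory, roughly what RMI does on the wire. */
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SketchTriple t = new SketchTriple(
                new SketchNode(SketchNode.Kind.URI, "http://example/s"),
                new SketchNode(SketchNode.Kind.URI, "http://example/p"),
                new SketchNode(SketchNode.Kind.LITERAL, "hello"));
        SketchTriple copy = roundTrip(t);
        // The copy is a distinct object that is equal by value.
        System.out.println(copy.equals(t) && copy != t);
    }
}
```

Value-based equals/hashCode matters here: the copy that arrives in the
other JVM is a different object, so identity comparison cannot be relied on.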




On Mon, Dec 30, 2013 at 8:21 PM, Andy Seaborne <an...@apache.org> wrote:

> On 30/12/13 19:39, Claude Warren wrote:
>
>> With a Node interface I can implement a serializable node that handles all
>> the core node types.  It means I only have to convert the node once.
>>   Without an interface I have to convert to a serializable format and then
>> convert back to "native" form.
>>
>
> So it is a single class? Lucky it's only the concrete types! - there are
> subclasses of variable in at least two places.  Quite a bit of instanceof
> Node_RuleVariable in the reasoner code.
>
> I took a look at the code bases looking for other instanceof tests. I
> found OWLDLProfile, OWLProfile and OWLLiteProfile that do instanceof tests
> when they should be doing .isXXX tests.  Changed.
>
> Other than that, there does not seem to be much code that makes use of the
> class hierarchy although my looking was not systematic.  Of course, other
> extension code may be doing so.
>
>
>         Andy
>
>  On Mon, Dec 30, 2013 at 7:24 PM, Andy Seaborne <an...@apache.org> wrote:
>>
>>  On 30/12/13 18:58, Claude Warren wrote:
>>>
>>>> I did a quick Node (Interface) and NodeImpl implementation while working
>>>> on the RMI code.  (It made some things easier.)  There was not much
>>>> change to the code to put in an interface that has the current methods
>>>> of Node.  I would like to move this into the current code base, but if
>>>> we decide not to do that I can work around it.
>>>>
>>>>
>>> This would be better done on a branch for discussion.  I'm -1 to just
>>> putting it into trunk.
>>>
>>> "not much change" needs a migration strategy because this is going to
>>> affect all modules, and it's not just the project's code either.
>>>
>>> What does it make easier?
>>>
>>>          Andy
>>>
>>>
>>>
>>>
>>>> On Mon, Dec 30, 2013 at 6:23 PM, Andy Seaborne <an...@apache.org> wrote:
>>>>
>>>>> PS
>>>>>
>>>>> http://mail-archives.apache.org/mod_mbox/jena-dev/201207.mbox/%3C5009735B.5020908@apache.org%3E
>>>>>
>>>>>
>>>>>
>>>>> On 30/12/13 18:21, Andy Seaborne wrote:
>>>>>
>>>>>   On 30/12/13 16:28, Claude Warren wrote:
>>>>>
>>>>>>
>>>>>>> For RMI I am only implementing a Graph.
>>>>>>>
>>>>>>> It may make sense to wrap model and dataset in order to achieve better
>>>>>>> performance (e.g. wrapping a TDB model/dataset may provide better
>>>>>>> performance than creating a model against multiple graphs on the
>>>>>>> client side), but for now it is just a Graph.
>>>>>>
>>>>>> Will it be more performant in any measurable way?
>>>>>>
>>>>>>> I did have to create a model wrapper for the Security code, but that
>>>>>>> is another kettle of fish.
>>>>>>>
>>>>>>> My plan is to complete the RMI implementation - 90% or so complete
>>>>>>> now - and add security to it (so you can restrict RMI access to
>>>>>>> specific graphs etc).
>>>>>>>
>>>>>>> Is there any issue with turning on the UUID inside the NodeFactory? I
>>>>>>> see that there is code for this.
>>>>>>>
>>>>>>> I would also like to see Node changed to an interface -- but that is
>>>>>>> another discussion -- I think it will keep the core cleaner as things
>>>>>>> like Node_Null won't pollute it.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> I agree about interfaces (that's why NodeFactory has gone in) but the
>>>>>> detail of the exact contract needs to be clear.
>>>>>>
>>>>>> One argument for them is holding per-storage info in a Node impl but
>>>>>> that is limited in a system like Jena where Node.equals is global and
>>>>>> determined by RDF semantics.
>>>>>>
>>>>>> I'm looking to simplify Graph/Triple/Node, so get rid of AnonIds (a
>>>>>> nuisance - they show up in the RDF API). And TripleMatch.  Some
>>>>>> renaming to sane length method names.  Extension for graphs as nodes
>>>>>> (nested graphs) and module-specific Nodes to reuse the storage (they
>>>>>> never leave a model - they help reuse things like "Triple" and "Graph"
>>>>>> - I found them useful in ARQ/TDB etc for example "this pattern slot is
>>>>>> defined").
>>>>>>
>>>>>> There is lots of potential flexibility that is not used and I think we
>>>>>> know now that some of that is not of any use and it just confuses.
>>>>>>
>>>>>> By the way, abstract interface classes (i.e. all methods abstract) are
>>>>>> reported as a bit faster than interfaces.
>>>>>>
>>>>>> The most important factor to me is that we do realistic steps so we do
>>>>>> not get caught with an unresourceable transition from Jena2 to Jena3.
>>>>>> I think we should only consider things that people will resource.
>>>>>>
>>>>>> Node_NULL is not used anywhere - @deprecate and delete!
>>>>>>
>>>>>> (Looks like it is left over from RDB days.)
>>>>>>
>>>>>>        Andy
>>>>>>
>>>>>> JENA-189
>>>>>>
>>>>>>
>>>>>>   Claude
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Dec 30, 2013 at 3:27 PM, Andy Seaborne <an...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>    On 29/12/13 20:40, Claude Warren wrote:
>>>>>>>
>>>>>>>
>>>>>>>>> The RMI simply exposes an existing graph implementation on a remote
>>>>>>>>> system.
>>>>>>>>>
>>>>>>>>> The normal disclaimers apply but given the standard Jena
>>>>>>>>> configuration:
>>>>>>>>>
>>>>>>>>> NodeFactory.createAnon() uses UID to create an id that would be
>>>>>>>>> passed to the graph on the remote server where the anon would be
>>>>>>>>> recreated.
>>>>>>>>>
>>>>>>>>> The result is that both the client and the server have the same
>>>>>>>>> anon id for the blank node.
>>>>>>>>>
>>>>>>>>> Am I missing something?
>>>>>>>>
>>>>>>>> Only that UIDs are, strictly, only unique for the machine they are
>>>>>>>> allocated on.  RMI etc can pass them around but they only safely
>>>>>>>> identify things on the same machine as their origin (they aren't long
>>>>>>>> enough for wider uniqueness).  It's the UID user's responsibility not
>>>>>>>> to present them on a non-origin machine.
>>>>>>>>
>>>>>>>> Ideally, Jena3, I'd like to use UUIDs, and then store only two longs
>>>>>>>> for blank nodes.  Then they are globally safe as well as being
>>>>>>>> smaller.
>>>>>>>>
>>>>>>>> Out of curiosity - why do you need to extend to Model?  Is there a
>>>>>>>> client-side implementation of graph and then it's just a case of
>>>>>>>> wrapping a Graph just like any other graph?  Or am I missing
>>>>>>>> something?
>>>>>>>>
>>>>>>>> Another issue in parsing is keeping label->bnode mapping.  Labels
>>>>>>>> must be matched to any previous use in the parser run.
>>>>>>>>
>>>>>>>> The RIOT parsers do not use jena-core UID generation for bnode ids.
>>>>>>>> If it's a map of label to node allocated, there is a growing data
>>>>>>>> structure.  Something that we occasionally get reports of being a
>>>>>>>> problem as the map grows for very large parser runs.
>>>>>>>>
>>>>>>>> Instead, RIOT allocates a large number (122 bits of random) and xors
>>>>>>>> it with the label.  So the internal id is calculated from the label
>>>>>>>> and is unique yet there is no growing data structure.
>>>>>>>>
>>>>>>>>            Andy
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>    Claude
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Dec 29, 2013 at 7:43 PM, Andy Seaborne <an...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>     On 29/12/13 16:58, Claude Warren wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> Greetings,
>>>>>>>>>>>
>>>>>>>>>>> I have an initial implementation of an RMI based Graph that allows
>>>>>>>>>>> one JVM to access a graph in a different JVM.  I hope to extend
>>>>>>>>>>> this to the Model level in the near future.  I just wanted to know
>>>>>>>>>>> if anyone was interested in this project.
>>>>>>>>>>>
>>>>>>>>>>> Claude
>>>>>>>>>>
>>>>>>>>>> The perennial question ...
>>>>>>>>>>
>>>>>>>>>> How do you treat blank nodes?
>>>>>>>>>>
>>>>>>>>>>             Andy
>
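
To make the "store only two longs" remark in the quoted text concrete,
here is a minimal sketch using java.util.UUID from the JDK.
BlankNodeIdSketch is a hypothetical class written for this note, not
anything in Jena; it only shows that a UUID-based id needs two long
fields and can be rebuilt from them on another machine.

```java
import java.util.UUID;

public class BlankNodeIdSketch {

    // The whole id is two longs; no String label is kept.
    private final long mostSigBits;
    private final long leastSigBits;

    BlankNodeIdSketch(long mostSigBits, long leastSigBits) {
        this.mostSigBits = mostSigBits;
        this.leastSigBits = leastSigBits;
    }

    /** Allocate a new id from a random (version 4) UUID. */
    static BlankNodeIdSketch fresh() {
        UUID u = UUID.randomUUID();
        return new BlankNodeIdSketch(u.getMostSignificantBits(),
                                     u.getLeastSignificantBits());
    }

    long mostSigBits()  { return mostSigBits; }
    long leastSigBits() { return leastSigBits; }

    /** Rebuild the UUID view, e.g. for printing. */
    UUID toUuid() { return new UUID(mostSigBits, leastSigBits); }

    @Override public boolean equals(Object o) {
        if (!(o instanceof BlankNodeIdSketch)) return false;
        BlankNodeIdSketch other = (BlankNodeIdSketch) o;
        return mostSigBits == other.mostSigBits
            && leastSigBits == other.leastSigBits;
    }

    @Override public int hashCode() {
        return 31 * Long.hashCode(mostSigBits) + Long.hashCode(leastSigBits);
    }

    @Override public String toString() { return toUuid().toString(); }

    public static void main(String[] args) {
        BlankNodeIdSketch id = fresh();
        // Simulate sending the two longs to another JVM and rebuilding the id.
        BlankNodeIdSketch rebuilt =
                new BlankNodeIdSketch(id.mostSigBits(), id.leastSigBits());
        System.out.println(id + " equal after rebuild: " + id.equals(rebuilt));
    }
}
```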


-- 
I like: Like Like - The likeliest place on the web<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
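
A sketch of the label-to-id scheme described in the quoted text for the
RIOT parsers: one random seed per parser run, combined with each label,
so the same label always yields the same id and no label->node map has to
be kept.  The thread only says the random number is xor'ed with the
label; reducing the label to 128 bits with MD5 first is an assumption made
for this sketch and may not match what RIOT actually does.
LabelToIdSketch and idFor are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

public class LabelToIdSketch {

    // Per-run seed; a random (version 4) UUID carries 122 random bits.
    private final long seedHi;
    private final long seedLo;

    LabelToIdSketch(UUID seed) {
        this.seedHi = seed.getMostSignificantBits();
        this.seedLo = seed.getLeastSignificantBits();
    }

    /** Same seed + same label gives the same id; nothing is stored per label. */
    String idFor(String label) {
        ByteBuffer digest = ByteBuffer.wrap(md5(label)); // 16 bytes = two longs
        long hi = digest.getLong() ^ seedHi;
        long lo = digest.getLong() ^ seedLo;
        return String.format("%016x%016x", hi, lo);
    }

    // Assumption of this sketch: hash the label down to 128 bits with MD5.
    private static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5")
                                .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LabelToIdSketch run = new LabelToIdSketch(UUID.randomUUID());
        System.out.println(run.idFor("b0"));
        System.out.println(run.idFor("b0").equals(run.idFor("b0")));
    }
}
```

A second parser run would use a different seed, so the label "b0" in two
different files maps to two different ids.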

Re: Using Graph from another JVM -- A solution.

Posted by Claude Warren <cl...@xenei.com>.

A quick review of netty + thrift would lead me to believe that converting
the RMI to netty+thrift would not be too difficult.


On Wed, Jan 1, 2014 at 12:22 PM, Andy Seaborne <an...@apache.org> wrote:

> On 01/01/14 07:43, Claude Warren wrote:
>
>> I thought some more about my requirements and I think that making Node
>> (and
>> Triple) Serializable would solve my problem.  I'll take a look at what it
>> would take to implement that.
>>
>
> If it's an interface (Jena3), then your alternative implementation
> approach *should* work; would even be a good test of the architecture.
> Won't it take a minimum of serialization methods in the top of the
> implementing class hierarchy?
>
> I don't know if Java8 default methods work with those serialization
> methods - I'd guess "no" as serialization is treated specially, the
> methods must have an exact signature and include "private" (IIRC+checking
> the javadoc).  RMI is one of those early Java technologies that hasn't
> changed much in ages.
>
> (If there is only one implementation, the JIT will treat all methods as
> final which helps optimization.)
>
> The RMI graph is a special case of a more general design where the graph
> (dataset) can be distributed across more than one machine.  I considered
> RMI for Lizard but rejected it because it's RPC, not streams.  RPC+small
> objects do not make for efficiency; RPC has latency issues for tightly
> coupled systems and any call is one copy in, one copy out at an absolute
> minimum. [All a blast from the past for me!]  I may use it for control
> but if the system is already using netty+thrift (current plan),
> adding RMI is duplication.
>
> New Year's Resolution - start Jena3!
>
>
>         Andy



Re: Using Graph from another JVM -- A solution.

Posted by Andy Seaborne <an...@apache.org>.

On 01/01/14 07:43, Claude Warren wrote:
> I thought some more about my requirements and I think that making Node (and
> Triple) Serializable would solve my problem.  I'll take a look at what it
> would take to implement that.

If it's an interface (Jena3), then your alternative implementation 
approach *should* work; would even be a good test of the architecture. 
Won't it take a minimum of serialization methods in the top of the 
implementing class hierarchy?
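
A small sketch of why an interface plus .isXXX tests lets an alternative
implementation slot in, while instanceof tests on a concrete class do
not.  NodeLike, NativeUriNode and WireNode are hypothetical names made up
for this example; they are not the Jena classes.

```java
import java.io.Serializable;

public class NodeInterfaceSketch {

    /** Hypothetical node contract; a real one would have many more methods. */
    interface NodeLike {
        boolean isURI();
        boolean isBlank();
        String label();
    }

    /** Stand-in for a "native" concrete URI node class. */
    static final class NativeUriNode implements NodeLike {
        private final String uri;
        NativeUriNode(String uri) { this.uri = uri; }
        @Override public boolean isURI()   { return true; }
        @Override public boolean isBlank() { return false; }
        @Override public String label()    { return uri; }
    }

    /** Alternative implementation, e.g. one built to travel over RMI. */
    static final class WireNode implements NodeLike, Serializable {
        private static final long serialVersionUID = 1L;
        private final boolean uri;
        private final String label;
        WireNode(boolean uri, String label) {
            this.uri = uri;
            this.label = label;
        }
        @Override public boolean isURI()   { return uri; }
        @Override public boolean isBlank() { return !uri; }
        @Override public String label()    { return label; }
    }

    /** Class-based test: silently misclassifies any other implementation. */
    static String describeByClass(NodeLike n) {
        return (n instanceof NativeUriNode) ? "uri" : "not-uri";
    }

    /** Contract-based test: works for every implementation. */
    static String describeByContract(NodeLike n) {
        return n.isURI() ? "uri" : "not-uri";
    }

    public static void main(String[] args) {
        NodeLike wire = new WireNode(true, "http://example/s");
        System.out.println("instanceof says: " + describeByClass(wire));
        System.out.println("isURI() says:    " + describeByContract(wire));
    }
}
```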

I don't know if Java8 default methods work with those serialization 
methods - I'd guess "no" as serialization is treated specially, the 
methods must have an exact signature and include "private" (IIRC+checking 
the javadoc).  RMI is one of those early Java technologies that hasn't 
changed much in ages.
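
The guess can be checked with a small experiment.  The sketch below
counts calls to a private writeObject declared in the class itself and
to a same-named default method inherited from an interface; Java
serialization looks for a private method with the exact signature on the
class, so the expectation is that only the former is invoked.  All class
and field names here are hypothetical and exist only for this test.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

public class SerializationHookSketch {

    static final AtomicInteger PRIVATE_HOOK_CALLS = new AtomicInteger();
    static final AtomicInteger DEFAULT_METHOD_CALLS = new AtomicInteger();

    /** Hook declared the way the Serializable javadoc describes it. */
    static final class WithPrivateHook implements Serializable {
        private static final long serialVersionUID = 1L;
        String value = "x";

        private void writeObject(ObjectOutputStream out) throws IOException {
            PRIVATE_HOOK_CALLS.incrementAndGet();
            out.defaultWriteObject();
        }
    }

    /** Same method name and parameter, but as a (public) default method. */
    interface HookAsDefault extends Serializable {
        default void writeObject(ObjectOutputStream out) throws IOException {
            DEFAULT_METHOD_CALLS.incrementAndGet();
        }
    }

    static final class WithDefaultMethod implements HookAsDefault {
        private static final long serialVersionUID = 1L;
        String value = "y";
    }

    /** Serialize to a throwaway in-memory buffer. */
    static void serialize(Serializable obj) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        serialize(new WithPrivateHook());
        serialize(new WithDefaultMethod());
        System.out.println("private hook calls:   " + PRIVATE_HOOK_CALLS.get());
        System.out.println("default method calls: " + DEFAULT_METHOD_CALLS.get());
    }
}
```

So a Node interface cannot supply the serialization hooks itself; they
would have to sit in the implementing classes.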

(If there is only one implementation, the JIT will treat all methods as 
final which helps optimization.)

The RMI graph is a special case of a more general design where the graph 
(dataset) can be distributed across more than one machine.  I considered 
RMI for Lizard but rejected it because it's RPC, not streams.  RPC+small 
objects do not make for efficiency; RPC has latency issues for tightly 
coupled systems and any call is one copy in, one copy out at an 
absolute minimum. [All a blast from the past for me!]  I may use it for 
control but if the system is already using netty+thrift 
(current plan), adding RMI is duplication.

New Year's Resolution - start Jena3!

	Andy
