Posted to dev@commonsrdf.apache.org by Stian Soiland-Reyes <st...@apache.org> on 2016/06/02 13:24:04 UTC

Re: [apache/incubator-commonsrdf] TripleOrQuad -> TripleLike, GraphLike (bc639bb)

On 2 June 2016 at 12:22, Andy Seaborne <no...@github.com> wrote:
> I appreciate the attempt to support generalized RDF.

Great! :-)  I thought it was too ambitious..

So what do you think about having the generalized TripleLike using
RDFTerm as base in its generics, or should it be super-generic?


E.g. in my proposal you would need to make SPARQLVariable a
subclass of RDFTerm to have a TripleLike representation of

    ?x rdfs:label "fred" .

..but by enforcing RDFTerm you couldn't do, say, a
TripleLike<java.util.UUID, java.net.URI, java.lang.String> - which I
know was also discussed, but going that generic would perhaps limit the
usefulness for anyone consuming TripleLike instances. (However, they
could still plug it into a compatible GraphLike instance.)
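
A minimal Java sketch of what the RDFTerm-bounded variant could look
like. The interfaces here are cut-down stand-ins, not the real Commons
RDF API, and SimpleTerm and TriplePattern are hypothetical names made up
for illustration:

```java
import java.util.Objects;

// Simplified stand-in for the Commons RDF RDFTerm interface.
interface RDFTerm {
    String ntriplesString();
}

// The proposal: every position is bounded by RDFTerm ("generalized RDF").
interface TripleLike<S extends RDFTerm, P extends RDFTerm, O extends RDFTerm> {
    S getSubject();
    P getPredicate();
    O getObject();
}

// Hypothetical term type; "?x" is SPARQL syntax, not real N-Triples.
final class SPARQLVariable implements RDFTerm {
    private final String name;
    SPARQLVariable(String name) { this.name = Objects.requireNonNull(name); }
    @Override public String ntriplesString() { return "?" + name; }
}

// Hypothetical catch-all for IRIs and literals, holding the serialized form.
final class SimpleTerm implements RDFTerm {
    private final String serialized;
    SimpleTerm(String serialized) { this.serialized = Objects.requireNonNull(serialized); }
    @Override public String ntriplesString() { return serialized; }
}

// A triple pattern such as: ?x rdfs:label "fred" .
final class TriplePattern implements TripleLike<SPARQLVariable, SimpleTerm, SimpleTerm> {
    private final SPARQLVariable subject;
    private final SimpleTerm predicate;
    private final SimpleTerm object;

    TriplePattern(SPARQLVariable subject, SimpleTerm predicate, SimpleTerm object) {
        this.subject = Objects.requireNonNull(subject);
        this.predicate = Objects.requireNonNull(predicate);
        this.object = Objects.requireNonNull(object);
    }
    @Override public SPARQLVariable getSubject() { return subject; }
    @Override public SimpleTerm getPredicate() { return predicate; }
    @Override public SimpleTerm getObject() { return object; }

    @Override public String toString() {
        return subject.ntriplesString() + " " + predicate.ntriplesString()
                + " " + object.ntriplesString() + " .";
    }
}

// Rejected by the compiler under the RDFTerm bound:
// TripleLike<java.util.UUID, java.net.URI, String> t = null;
```

With the bound in place, a consumer can at least rely on every position
being an RDFTerm, which the super-generic variant cannot promise.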



> How do these interfaces avoid the problems of Quad extends Triple?


Yes, QuadImpl could still theoretically extend TripleImpl, just like
it could still theoretically extend LiteralImpl - both of which would
give incompatible .equals() and .hashCode() semantics. So you would be
breaking the textual contract even if it compiles.


Do you think we should explicitly allow a QuadImpl to implement both
Quad and Triple? In which case we must modify the .equals() for both
to consider that.


I think we can agree that these quads differ:

  <g1> { <s> <p> <o> .   }

  <g2> { <s> <p> <o> .  }

(And blank node-identified graphs should be compared using the existing
BlankNode.equals() semantics - perhaps let's not get into that now!)


And that these quads in the default graph are equal:

  { <s> <p> <o> . }

  { <s> <p> <o> . }


But the question is if this Quad:

  { <s> <p> <o> . }

is equal to this Triple:

  <s> <p> <o> .


or even broader, if this Triple:

  <s> <p> <o> .

is equal to this quad:

  <g1> { <s> <p> <o> .   }


The last case is the difficult one, as that gives you hierarchical
equivalence where both the g1 and g2 quads could be equal to the
triple, but they are not equal to each other.

So that breaks the Object.equals() contract:

http://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#equals-java.lang.Object-

which requires equality to be symmetric (the comparison direction does
not matter) and transitive (if A equals B and B equals C, then A equals
C), so we don't want to do that - for instance it would make a mixed
Collection of Quad and Triple behave oddly.
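
To make the problem concrete, here is a small Java sketch of the
"difficult" case. NaiveTriple and NaiveQuad are hypothetical classes
(plain strings stand in for RDF terms) that deliberately implement the
broken semantics where any quad equals the matching triple:

```java
import java.util.Objects;

// Hypothetical classes; plain strings stand in for RDF terms.
class NaiveTriple {
    final String s, p, o;
    NaiveTriple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }

    @Override public boolean equals(Object other) {
        if (!(other instanceof NaiveTriple)) return false;
        NaiveTriple t = (NaiveTriple) other;
        return s.equals(t.s) && p.equals(t.p) && o.equals(t.o);
    }
    @Override public int hashCode() { return Objects.hash(s, p, o); }
}

class NaiveQuad extends NaiveTriple {
    final String g;
    NaiveQuad(String g, String s, String p, String o) { super(s, p, o); this.g = g; }

    @Override public boolean equals(Object other) {
        if (other instanceof NaiveQuad) {
            // Quad vs quad: the graph name counts.
            return super.equals(other) && g.equals(((NaiveQuad) other).g);
        }
        // Quad vs plain triple: the graph name is ignored.
        return super.equals(other);
    }
    // hashCode() is inherited, so "equal" objects still hash alike.
}

class EqualsContractDemo {
    public static void main(String[] args) {
        NaiveTriple t = new NaiveTriple("s", "p", "o");
        NaiveQuad q1 = new NaiveQuad("g1", "s", "p", "o");
        NaiveQuad q2 = new NaiveQuad("g2", "s", "p", "o");
        System.out.println("q1.equals(t)  = " + q1.equals(t));
        System.out.println("t.equals(q2)  = " + t.equals(q2));
        System.out.println("q1.equals(q2) = " + q1.equals(q2));
    }
}
```

The first two comparisons are true and the third is false, so equality
is symmetric here but not transitive, which is exactly what the
Object.equals() contract forbids.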




The two options as I see it are:

a) A Quad can never be equal to a Triple or vice versa

b) A Quad can only be equal to a Triple if it's in the default graph
(and vice versa)


a) is easiest for semantics, but does beg for some Quad.asTriple()
converter method (and vice versa?) and means you can't have a QuadImpl
implements Triple,Quad.  (But Quad can still implement TripleLike.)

This is my current favourite and what this branch suggests.
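
A rough Java sketch of option a), assuming plain strings as terms and a
null constructor argument to mean the default graph (both are
simplifications for illustration, not what the branch does):

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical, simplified classes; strings stand in for RDF terms.
final class TripleImpl {
    private final String subject, predicate, object;

    TripleImpl(String subject, String predicate, String object) {
        this.subject = Objects.requireNonNull(subject);
        this.predicate = Objects.requireNonNull(predicate);
        this.object = Objects.requireNonNull(object);
    }

    @Override public boolean equals(Object other) {
        // Only another triple can be equal; a quad never is.
        if (!(other instanceof TripleImpl)) return false;
        TripleImpl t = (TripleImpl) other;
        return subject.equals(t.subject) && predicate.equals(t.predicate)
                && object.equals(t.object);
    }
    @Override public int hashCode() { return Objects.hash(subject, predicate, object); }
}

final class QuadImpl {
    private final Optional<String> graphName;
    private final String subject, predicate, object;

    // graphName == null means the default graph (an assumption of this sketch).
    QuadImpl(String graphName, String subject, String predicate, String object) {
        this.graphName = Optional.ofNullable(graphName);
        this.subject = Objects.requireNonNull(subject);
        this.predicate = Objects.requireNonNull(predicate);
        this.object = Objects.requireNonNull(object);
    }

    Optional<String> getGraphName() { return graphName; }

    // Explicit conversion replaces any implicit quad/triple equality.
    TripleImpl asTriple() { return new TripleImpl(subject, predicate, object); }

    @Override public boolean equals(Object other) {
        if (!(other instanceof QuadImpl)) return false;
        QuadImpl q = (QuadImpl) other;
        return graphName.equals(q.graphName) && subject.equals(q.subject)
                && predicate.equals(q.predicate) && object.equals(q.object);
    }
    @Override public int hashCode() {
        return Objects.hash(graphName, subject, predicate, object);
    }
}
```

A caller who wants to compare across the two kinds has to say so, e.g.
quad.asTriple().equals(triple), which keeps both equals() methods simple.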


b) requires that Triple.equals() specifically checks for instanceof
Quad and if so ensures getGraphName() is Optional.empty().  This
allows QuadImpl implements Triple,Quad
which could then be added to both a Graph and a Dataset.
If we go with this, should the interface Quad extend Triple (all quads
are triples), or do we allow some Quads to not be Triples?  (I think
you mentioned that the implementation should be free to choose.)
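
For comparison, a Java sketch of option b). Again the interfaces are
simplified stand-ins with string terms, and Quad does not extend Triple
here; that is just one of the choices left open above:

```java
import java.util.Objects;
import java.util.Optional;

// Simplified stand-ins for the real interfaces; strings stand in for terms.
interface Triple {
    String getSubject();
    String getPredicate();
    String getObject();
}

interface Quad {
    Optional<String> getGraphName();
    String getSubject();
    String getPredicate();
    String getObject();
}

final class TripleImpl implements Triple {
    private final String s, p, o;
    TripleImpl(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    @Override public String getSubject() { return s; }
    @Override public String getPredicate() { return p; }
    @Override public String getObject() { return o; }

    @Override public boolean equals(Object other) {
        if (!(other instanceof Triple)) return false;
        // A Quad only counts as this Triple when it sits in the default graph.
        if (other instanceof Quad && ((Quad) other).getGraphName().isPresent()) return false;
        Triple t = (Triple) other;
        return s.equals(t.getSubject()) && p.equals(t.getPredicate())
                && o.equals(t.getObject());
    }
    @Override public int hashCode() { return Objects.hash(s, p, o); }
}

final class QuadImpl implements Triple, Quad {
    private final Optional<String> graphName;
    private final String s, p, o;

    QuadImpl(Optional<String> graphName, String s, String p, String o) {
        this.graphName = Objects.requireNonNull(graphName);
        this.s = s; this.p = p; this.o = o;
    }
    @Override public Optional<String> getGraphName() { return graphName; }
    @Override public String getSubject() { return s; }
    @Override public String getPredicate() { return p; }
    @Override public String getObject() { return o; }

    @Override public boolean equals(Object other) {
        if (other instanceof Quad) {
            Quad q = (Quad) other;
            return graphName.equals(q.getGraphName()) && s.equals(q.getSubject())
                    && p.equals(q.getPredicate()) && o.equals(q.getObject());
        }
        if (other instanceof Triple) {
            Triple t = (Triple) other;
            // Equal to a plain Triple only from the default graph.
            return !graphName.isPresent() && s.equals(t.getSubject())
                    && p.equals(t.getPredicate()) && o.equals(t.getObject());
        }
        return false;
    }
    // The graph name is left out so a default-graph quad hashes like its triple.
    @Override public int hashCode() { return Objects.hash(s, p, o); }
}
```

Note the cost: every implementation of Triple and Quad has to follow the
same instanceof dance, and hashCode() can no longer use the graph name.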


> This is a general Java issue - value-based equality breaks when adding a field (so being in a set breaks).

Yes, but there are many kinds of extra fields that could presumably be
added to a TripleImpl, e.g. a database connection - but as we
specialize Object.equals() rather than .equalsTriple(),
implementations have not got any room to add fields to their .equals()
or .hashCode() -- unless checking those fields implicitly checks the
subject/predicate/object equivalence, like a .getInternalUUID().

Even if an implementation added "If both are TripleImpl, then also
check that getSessionId() are equal" - then that would violate the
Triple.equals() contract unless we add weasel-words to it; e.g. don't
say when they are equal, just the conditions that hold for something
that IS equal. E.g. "If the Triple.equals() method returns true, then
getSubject(), getPredicate() and getObject() must also be equals()
between the two instances" -- allowing non-equality even when they DO
match. I think having that option could be very confusing, though, and
it goes against our goal of cross-implementation compatibility.


> If a class adds a field, then value-based equality can not be made to work. Two "equals" objects, here TripleLike<S,P,O>, can have different G field for QuadLike<S,P,O,G>.

Which is why neither TripleLike nor QuadLike specifies .equals() or
.hashCode() :)  They merely say that there are
subject/predicate/object/graphname getters, not even that the object
is a Triple as specified in the RDF specs. These can thus also be used
by a more specific SpecialTriple whose .equals(), say, also checks
getTimestamp().
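
As a sketch in Java: since TripleLike carries no equality contract, a
SpecialTriple is free to define its own. The RDFTerm bounds are left off
the generics and strings are used as terms, purely to keep the example
short:

```java
import java.time.Instant;
import java.util.Objects;

// Simplified: getters only, no equals()/hashCode() contract.
// (The RDFTerm bounds from the proposal are omitted for brevity.)
interface TripleLike<S, P, O> {
    S getSubject();
    P getPredicate();
    O getObject();
}

// Hypothetical implementation whose equality also covers a timestamp.
final class SpecialTriple implements TripleLike<String, String, String> {
    private final String s, p, o;
    private final Instant timestamp;

    SpecialTriple(String s, String p, String o, Instant timestamp) {
        this.s = Objects.requireNonNull(s);
        this.p = Objects.requireNonNull(p);
        this.o = Objects.requireNonNull(o);
        this.timestamp = Objects.requireNonNull(timestamp);
    }
    @Override public String getSubject() { return s; }
    @Override public String getPredicate() { return p; }
    @Override public String getObject() { return o; }
    Instant getTimestamp() { return timestamp; }

    @Override public boolean equals(Object other) {
        if (!(other instanceof SpecialTriple)) return false;
        SpecialTriple t = (SpecialTriple) other;
        // Allowed here: TripleLike does not promise Triple.equals() semantics.
        return s.equals(t.s) && p.equals(t.p) && o.equals(t.o)
                && timestamp.equals(t.timestamp);
    }
    @Override public int hashCode() { return Objects.hash(s, p, o, timestamp); }
}
```

Such a class could not also implement Triple without breaking the
Triple.equals() contract, but as a TripleLike it is fine.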


> Adding an operation to an interface which is modelling a new field does not change the situation. Should the value of the getGraphName method influence the outcome of .equals? If viewed as a TripleLike then "no", if as a QuadLike, then "yes". But a quad is both.

Most Java interfaces don't specify any equality constraints, but those
that do will use the weasel approach, which allows extensions to check
further properties, but provide some general guarantees IF two objects
are equal.

So a quad is clearly TripleLike in that it has
subject/predicate/object, but it can also have more.

Even our own RDFTerm and BlankNodeOrIRI don't specify equality
constraints. Those are something we have added only to the RDF classes
where we consider it important to have consistent equality across
implementations.   This can be considered both our Value Proposition
and our Achilles Heel :)


-- 
Stian Soiland-Reyes
Apache Taverna (incubating), Apache Commons
http://orcid.org/0000-0001-9842-9718

Re: [apache/incubator-commonsrdf] TripleOrQuad -> TripleLike, GraphLike (bc639bb)

Posted by Andy Seaborne <an...@apache.org>.
Your objective seems to be to be able to work with quads and triples mixed.

The way to do this is to create a container (a tagged union type : 
enum-like) that holds a triple or a quad, not to make the hierarchy 
model this (because of the equals issue).
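
A minimal Java sketch of such a container. All class names are
hypothetical and strings stand in for RDF terms; the point is only that
the holder, not the type hierarchy, records which kind it carries:

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical value classes; strings stand in for RDF terms.
final class SimpleTriple {
    final String s, p, o;
    SimpleTriple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    @Override public boolean equals(Object other) {
        if (!(other instanceof SimpleTriple)) return false;
        SimpleTriple t = (SimpleTriple) other;
        return s.equals(t.s) && p.equals(t.p) && o.equals(t.o);
    }
    @Override public int hashCode() { return Objects.hash(s, p, o); }
}

final class SimpleQuad {
    final String g, s, p, o; // g == null means the default graph in this sketch
    SimpleQuad(String g, String s, String p, String o) {
        this.g = g; this.s = s; this.p = p; this.o = o;
    }
    @Override public boolean equals(Object other) {
        if (!(other instanceof SimpleQuad)) return false;
        SimpleQuad q = (SimpleQuad) other;
        return Objects.equals(g, q.g) && s.equals(q.s) && p.equals(q.p) && o.equals(q.o);
    }
    @Override public int hashCode() { return Objects.hash(g, s, p, o); }
}

// Tagged union: holds exactly one of a triple or a quad.
final class TripleOrQuadHolder {
    private final SimpleTriple triple; // non-null only when holding a triple
    private final SimpleQuad quad;     // non-null only when holding a quad

    private TripleOrQuadHolder(SimpleTriple triple, SimpleQuad quad) {
        this.triple = triple;
        this.quad = quad;
    }
    static TripleOrQuadHolder of(SimpleTriple t) {
        return new TripleOrQuadHolder(Objects.requireNonNull(t), null);
    }
    static TripleOrQuadHolder of(SimpleQuad q) {
        return new TripleOrQuadHolder(null, Objects.requireNonNull(q));
    }

    boolean isTriple() { return triple != null; }
    Optional<SimpleTriple> asTriple() { return Optional.ofNullable(triple); }
    Optional<SimpleQuad> asQuad() { return Optional.ofNullable(quad); }

    @Override public boolean equals(Object other) {
        if (!(other instanceof TripleOrQuadHolder)) return false;
        TripleOrQuadHolder h = (TripleOrQuadHolder) other;
        // Same tag and same content; a triple holder never equals a quad holder.
        return Objects.equals(triple, h.triple) && Objects.equals(quad, h.quad);
    }
    @Override public int hashCode() { return Objects.hash(triple, quad); }
}
```

A mixed collection would then be a Collection<TripleOrQuadHolder>, and
the triple and quad classes keep their own unrelated equals().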

Re:

TripleLike<S extends RDFTerm, ...
seems like a good design -- it's "generalised RDF" as defined by the specs.

TripleLike<S extends Object,...
would require everything else to change and limit interoperability
(working with two systems at the same time).

    Andy

 > Do you think we should explicitly allow a QuadImpl to implement both
 > Quad and Triple?

No - I think that is incompatible with Java's equals() requirements.

(this is now the discussion from last time)


