Posted to dev@tinkerpop.apache.org by pieter <pi...@gmail.com> on 2015/12/12 15:01:20 UTC

RE: rdf questions

Hi,

I know many RDF vendors are TinkerPop providers.

Can it work in the other direction, i.e. can an RDF dataset be loaded
into a TinkerPop database?
Is it possible to load any RDF dataset into TinkerPop without loss?

Is this something TinkerPop is interested in?

Thanks
Pieter

Re: rdf questions

Posted by Stephen Mallette <sp...@gmail.com>.

I don't see why you couldn't load RDF into a graph.  I guess we don't have
a native RDF loader.  Seems like you could load RDF with BLVP
(BulkLoaderVertexProgram) and a ScriptInputFormat.  It probably wouldn't be
hard to add an RdfInputFormat or something to avoid the need for a custom
script.  I guess Kuppitz will correct me if there's a better way.  Either
way, I'm +0 on including it at the moment, but I don't think much in raw
RDF, so perhaps others would find it valuable in some way.
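
To make the "custom script" idea concrete, here is a minimal sketch in plain
Python of what a naive loader would have to do.  It is not the
ScriptInputFormat or BLVP API; the function name, the regular expression and
the sample IRIs are assumptions made for illustration.  It handles only
simple N-Triples lines (IRIs and plain literals), turning resources into
vertices, statements with a literal object into vertex properties, and
statements with a resource object into edges.

```python
import re

# Matches one simple N-Triples line: <s> <p> <o> .  or  <s> <p> "literal" .
# Simplification (assumption): no blank nodes, datatypes, language tags or escapes.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.\s*$')

def load_ntriples(lines):
    """Build a toy property graph from simple N-Triples lines."""
    vertices = {}   # IRI -> {predicate IRI: [literal values]}
    edges = []      # (subject IRI, predicate IRI, object IRI)
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        m = TRIPLE.match(line)
        if m is None:
            raise ValueError("unsupported line: " + line)
        s, p, o_iri, o_lit = m.groups()
        props = vertices.setdefault(s, {})
        if o_iri is not None:
            # Resource object: the statement becomes an edge.
            vertices.setdefault(o_iri, {})
            edges.append((s, p, o_iri))
        else:
            # Literal object: the statement becomes a vertex property.
            props.setdefault(p, []).append(o_lit)
    return vertices, edges

sample = [
    '<http://example.org/a> <http://example.org/knows> <http://example.org/b> .',
    '<http://example.org/a> <http://example.org/name> "Pieter" .',
]
vertices, edges = load_ntriples(sample)
```

A mapping this simple drops anything it cannot express, such as named
graphs, so it is a starting point rather than a lossless loader.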

On Sat, Dec 12, 2015 at 9:01 AM, pieter <pi...@gmail.com> wrote:

> Hi,
>
> I know many rdf vendors are TinkerPop providers.
>
> Can it work in the other direction, i.e. can a rdf dataset be loaded
> into a TinkerPop database?
> Is it possible to load any rdf dataset into TinkerPop without loss?
>
> Is this something TinkerPop is interested in?
>
> Thanks
> Pieter
>
>
>

Re: rdf questions

Posted by pieter-gmail <pi...@gmail.com>.

Thanks to @Mike and @Joshua for the information.

Interesting stuff. Admittedly I need to do some homework on RDF to
understand things better.

Having a keen interest in UML, I noticed this OMG spec,
http://www.omg.org/spec/ODM/1.1/PDF/ which includes an RDF metamodel
specification.  A brief scan of the spec shows that the MOF also had some
trouble with the metamodeling of RDF semantics.

Maybe some day PG too will get some attention from the OMG, for what it's
worth.

Thanks
Pieter

On 23/12/2015 20:43, Mike Personick wrote:
> Here is Blazegraph's TP3 RDF*/PG mapping just for reference.  Different
> from the original TP2 mapping, which did not use RDF*.
>
> // vertex (id="a", label="person")
> pg:a rdfs:label "A" .
>
> // vertex property (single or set)
> pg:a pg:key1 "val" .
>
> // vertex property (list)
> pg:a pg:key2 _:b1 .
> _:b1 rdf:value "val" .
> _:b1 rdf:li 0 .
>
> // vertex property property
> <<pg:a pg:key1 "val">> pg:acl "public" .
> <<pg:a pg:key2 _:b1>> pg:acl "private" .
>
> // edge (id="x", from="a", to="b", label="knows")
> pg:a pg:x pg:b .
> <<pg:a pg:x pg:b>> rdfs:label "knows" .
>
> // edge property
> <<pg:a pg:x pg:b>> pg:key "val" .
>
> Here is a link to Olaf and Bryan's original work on RDF*:
>
> http://arxiv.org/abs/1406.3399
>

Re: rdf questions

Posted by Mike Personick <mi...@systap.com>.

Here is Blazegraph's TP3 RDF*/PG mapping just for reference.  Different
from the original TP2 mapping, which did not use RDF*.

// vertex (id="a", label="person")
pg:a rdfs:label "A" .

// vertex property (single or set)
pg:a pg:key1 "val" .

// vertex property (list)
pg:a pg:key2 _:b1 .
_:b1 rdf:value "val" .
_:b1 rdf:li 0 .

// vertex property property
<<pg:a pg:key1 "val">> pg:acl "public" .
<<pg:a pg:key2 _:b1>> pg:acl "private" .

// edge (id="x", from="a", to="b", label="knows")
pg:a pg:x pg:b .
<<pg:a pg:x pg:b>> rdfs:label "knows" .

// edge property
<<pg:a pg:x pg:b>> pg:key "val" .

Here is a link to Olaf and Bryan's original work on RDF*:

http://arxiv.org/abs/1406.3399
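
As a rough illustration of the edge rows in the listing above, the following
Python sketch prints the RDF* statements for one edge.  The function names
are hypothetical and this is not Blazegraph code; it only reproduces the
textual pattern shown in the listing, with no escaping of literals.

```python
def quote(value):
    """Render a string as a plain double-quoted literal (no escaping)."""
    return '"' + value + '"'

def edge_to_rdf_star(edge_id, src, dst, label, properties):
    """Return the RDF* statements for one edge, following the listing's pattern."""
    # The edge id is used as the predicate of the base triple.
    triple = "pg:%s pg:%s pg:%s" % (src, edge_id, dst)
    lines = [triple + " ."]
    # The edge label and edge properties are statements about the base triple.
    lines.append("<<%s>> rdfs:label %s ." % (triple, quote(label)))
    for key in sorted(properties):
        lines.append("<<%s>> pg:%s %s ." % (triple, key, quote(properties[key])))
    return lines

statements = edge_to_rdf_star("x", "a", "b", "knows", {"key": "val"})
print("\n".join(statements))
```

Because the edge id sits in the predicate position, two edges with the same
endpoints and the same label still produce two distinct base triples.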


On Tue, Dec 22, 2015 at 4:33 PM, Joshua Shinavier <jo...@fortytwo.net> wrote:

> On Tue, Dec 22, 2015 at 8:39 AM, Mike Personick <mi...@systap.com> wrote:
>
> > Neither generic RDF -> PG nor PG -> generic RDF can be lossless.
> >
>
>
> Both can be lossless: you can translate any RDF graph or dataset into a PG
> graph, and any PG graph into an RDF graph such that you can recover the
> original graph exactly, having lost no information.  What you can't have is
> a one-to-one mapping between RDF graphs and PG graphs.
>
>
>
>
> > Even with reification you can't solve the problem that PG allows multiple
> > edge instances with the same (s, p, o).  Same from, to, and edge label.
> > Olaf and I went back and forth on this point quite a bit and we agreed
> that
> > this made the two models irreconcilable without using some specific RDF
> > schema to keep track of edge ids.
>
>
>
> You said it: use edge ids.  See the first example from [3].  A dataset
> alternative I mentioned is to create one named graph per statement, but
> that would be pretty unusual.
>
>
>
>
> >   PG -> RDF cannot be lossless without a
> > custom RDF schema for edge identifiers.
>
>
>
> How are edge identifiers different than URIs or blank node IDs?  Their
> syntax is opaque to either data model, but you do need a property to
> connect the edge resource with the id resource.  Other vocabulary elements
> are also needed, as you can't get away from mapping into a schema, in
> either direction.
>
>
>
>
> >   There are other things about PG
> > that force a conversion to RDF to require a RDF/PG schema, such as
> > Cardinality.list.  RDF lends itself well to Cardinality.single and
> > Cardinality.set, list not so much.
> >
> > The reverse is true is well, RDF -> PG is not lossless either, since
> there
> > are many things you can do in RDF that you cannot do with PG.  One
> example
> > is edges connecting edges.  Another example is unlimited depth of
> property
> > properties with RDF* or old-school reification.
> >
>
>
> Yes, and that's not even getting into named graphs, which are important for
> SPARQL and most real applications.
>
>
>
> > Long and short of it - you can have a feature limited PG implementation
> > that works with some kinds of generic RDF, or you can have a full
> featured
> > PG implementation that only works on RDF graphs conforming to some
> specific
> > schema to deal with the impedance mismatches between RDF and PG.
>
>
>
> You can have PG views of any RDF data, or RDF views of any PG data, but you
> can't have it both ways at once because the data models aren't equivalent.
>
>
>
>
> >   What
> > might be nice in the future is decide on a standardized RDF/PG schema so
> > that each vendor doesn't do it differently.
> >
>
>
> PropertyGraphSail was probably the first PG-->RDF mapper [1].  It suggests
> a vocabulary of five terms. SailGraph likewise has a handful of terms (some
> of which, like "ng" and "kind", could use some tweaking) which could serve
> as a starting point.
>
> Best,
>
> Josh
>
>
> [1] https://groups.google.com/forum/#!topic/gremlin-users/Ov91RPkajBI

Re: rdf questions

Posted by Joshua Shinavier <jo...@fortytwo.net>.

On Tue, Dec 22, 2015 at 8:39 AM, Mike Personick <mi...@systap.com> wrote:

> Neither generic RDF -> PG nor PG -> generic RDF can be lossless.
>


Both can be lossless: you can translate any RDF graph or dataset into a PG
graph, and any PG graph into an RDF graph such that you can recover the
original graph exactly, having lost no information.  What you can't have is
a one-to-one mapping between RDF graphs and PG graphs.




> Even with reification you can't solve the problem that PG allows multiple
> edge instances with the same (s, p, o).  Same from, to, and edge label.
> Olaf and I went back and forth on this point quite a bit and we agreed that
> this made the two models irreconcilable without using some specific RDF
> schema to keep track of edge ids.



You said it: use edge ids.  See the first example from [3].  A dataset
alternative I mentioned is to create one named graph per statement, but
that would be pretty unusual.
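
A small Python sketch of the point about edge ids, under stated assumptions:
triples are plain tuples, an RDF graph is a Python set, and the terms
"ex:tail", "ex:label" and "ex:head" are made-up vocabulary for illustration
rather than the actual PropertyGraphSail terms.  Two parallel "knows" edges
collapse into one statement in the compact style, but survive a round trip
when each edge is a resource identified by its id.

```python
def edges_as_statements(edges):
    """Compact style: each edge becomes one (s, p, o) triple in a set."""
    return {(src, label, dst) for (_eid, src, label, dst) in edges}

def edges_as_resources(edges):
    """Verbose style: each edge is a resource keyed by its id."""
    triples = set()
    for eid, src, label, dst in edges:
        triples.add((eid, "ex:tail", src))     # hypothetical term: source vertex
        triples.add((eid, "ex:label", label))  # hypothetical term: edge label
        triples.add((eid, "ex:head", dst))     # hypothetical term: target vertex
    return triples

def resources_to_edges(triples):
    """Invert edges_as_resources, recovering (id, src, label, dst) tuples."""
    parts = {}
    for subject, predicate, obj in triples:
        parts.setdefault(subject, {})[predicate] = obj
    return sorted(
        (eid, p["ex:tail"], p["ex:label"], p["ex:head"])
        for eid, p in parts.items()
    )

# Two distinct edges with the same from, to, and label.
edges = [("e1", "a", "knows", "b"), ("e2", "a", "knows", "b")]
compact = edges_as_statements(edges)
verbose = edges_as_resources(edges)
```

Here the compact set holds a single triple, while resources_to_edges(verbose)
returns both original edges.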




>   PG -> RDF cannot be lossless without a
> custom RDF schema for edge identifiers.



How are edge identifiers different than URIs or blank node IDs?  Their
syntax is opaque to either data model, but you do need a property to
connect the edge resource with the id resource.  Other vocabulary elements
are also needed, as you can't get away from mapping into a schema, in
either direction.




>   There are other things about PG
> that force a conversion to RDF to require a RDF/PG schema, such as
> Cardinality.list.  RDF lends itself well to Cardinality.single and
> Cardinality.set, list not so much.
>
> The reverse is true is well, RDF -> PG is not lossless either, since there
> are many things you can do in RDF that you cannot do with PG.  One example
> is edges connecting edges.  Another example is unlimited depth of property
> properties with RDF* or old-school reification.
>


Yes, and that's not even getting into named graphs, which are important for
SPARQL and most real applications.



> Long and short of it - you can have a feature limited PG implementation
> that works with some kinds of generic RDF, or you can have a full featured
> PG implementation that only works on RDF graphs conforming to some specific
> schema to deal with the impedance mismatches between RDF and PG.



You can have PG views of any RDF data, or RDF views of any PG data, but you
can't have it both ways at once because the data models aren't equivalent.




>   What
> might be nice in the future is decide on a standardized RDF/PG schema so
> that each vendor doesn't do it differently.
>


PropertyGraphSail was probably the first PG-->RDF mapper [1].  It suggests
a vocabulary of five terms. SailGraph likewise has a handful of terms (some
of which, like "ng" and "kind", could use some tweaking) which could serve
as a starting point.

Best,

Josh


[1] https://groups.google.com/forum/#!topic/gremlin-users/Ov91RPkajBI






Re: rdf questions

Posted by Mike Personick <mi...@systap.com>.

Neither generic RDF -> PG nor PG -> generic RDF can be lossless.

Even with reification you can't solve the problem that PG allows multiple
edge instances with the same (s, p, o).  Same from, to, and edge label.
Olaf and I went back and forth on this point quite a bit and we agreed that
this made the two models irreconcilable without using some specific RDF
schema to keep track of edge ids.  PG -> RDF cannot be lossless without a
custom RDF schema for edge identifiers.  There are other things about PG
that force a conversion to RDF to require an RDF/PG schema, such as
Cardinality.list.  RDF lends itself well to Cardinality.single and
Cardinality.set, list not so much.
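
To illustrate why list cardinality needs extra schema, here is a minimal
Python sketch.  It assumes triples are tuples and a graph is a Python set;
the blank-node naming scheme and the "ex:index" term are hypothetical.  An
RDF graph is a set of triples, so repeating the same value on the same key
collapses unless each element gets its own node carrying the value and its
position.

```python
def as_set_triples(subject, key, values):
    """Set-style encoding: one triple per distinct value."""
    return {(subject, key, v) for v in values}

def as_list_triples(subject, key, values):
    """List-style encoding: one node per element, carrying value and index."""
    triples = set()
    for index, value in enumerate(values):
        node = "_:%s_%s_%d" % (subject, key, index)  # hypothetical blank-node label
        triples.add((subject, key, node))
        triples.add((node, "rdf:value", value))
        triples.add((node, "ex:index", index))       # hypothetical position term
    return triples

def read_list(triples, subject, key):
    """Recover the ordered values written by as_list_triples."""
    nodes = [o for (s, p, o) in triples if s == subject and p == key]
    by_node = {}
    for s, p, o in triples:
        if s in nodes:
            by_node.setdefault(s, {})[p] = o
    ordered = sorted(by_node.values(), key=lambda d: d["ex:index"])
    return [d["rdf:value"] for d in ordered]

values = ["red", "blue", "red"]
flat = as_set_triples("a", "color", values)
listed = as_list_triples("a", "color", values)
```

The set-style encoding keeps two triples for three values, whereas
read_list(listed, "a", "color") returns the original order with the
duplicate intact.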

The reverse is true as well: RDF -> PG is not lossless either, since there
are many things you can do in RDF that you cannot do with PG.  One example
is edges connecting edges.  Another example is unlimited depth of property
properties with RDF* or old-school reification.

Long and short of it - you can have a feature-limited PG implementation
that works with some kinds of generic RDF, or you can have a full-featured
PG implementation that only works on RDF graphs conforming to some specific
schema to deal with the impedance mismatches between RDF and PG.  What
might be nice in the future is to decide on a standardized RDF/PG schema so
that each vendor doesn't do it differently.



On Mon, Dec 21, 2015 at 10:56 PM, pieter-gmail <pi...@gmail.com>
wrote:

> Thanks for the explanation.
> Cheers
> Pieter

Re: rdf questions

Posted by pieter-gmail <pi...@gmail.com>.

Thanks for the explanation.
Cheers
Pieter

On 21/12/2015 23:59, Joshua Shinavier wrote:
> Hi Pieter,
>
> Yes, it is possible to map RDF graphs, and also RDF datasets (collections
> of graphs with names), to a property graph data model without loss.
> GraphSail [1] had to do this in order to use Blueprints-based DBs as triple
> stores, querying over the RDF data and retrieving it.  GraphSail uses a
> mapping almost identical to that of SailGraph [2] (see a schematic on that
> page), which maps RDF to property graphs.  For the "opposite" of GraphSail
> and SailGraph (i.e. arbitrary property graphs to RDF), see
> PropertyGraphSail [3].
>
> Olaf Hartig discusses some incompatibilities between PG and RDF in his
> paper.  Some essential things to keep in mind:
> *) In mapping between PG and RDF, you are forced to treat edges either as
> resources or as statements.  If edges are statements, then any edge
> properties are lost in the PG-->RDF mapping (unless you were to do
> something a little weird with named graphs: one graph per statement).  If
> edges are vertices, the RDF format is quite verbose and is not symmetrical
> with a useful RDF-->PG mapping.  PropertyGraphSail supports two styles of
> mapping: one "verbose" (edge-reified) and the other compact (edges as
> statements).
> *) A straightforward RDF(datasets)-->PG mapping treats resources as
> vertices and statements as edges or as properties depending on the object,
> but this is more complicated if you want to preserve named graph metadata,
> as you can't attach metadata to PG properties.  You already have a bit of a
> problem if you want to do anything graph-like with named graph metadata, as
> PG is not a hypergraph data model (no edges from edges).
>
> Best,
>
> Josh
>
>
> [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
> [2] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
> [3]
> https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation
>
> On Mon, Dec 21, 2015 at 11:28 AM, pieter-gmail <pi...@gmail.com>
> wrote:
>
>> Thanks, I have just started on the rdf path.
>>
>> When you say the RDF data model and PG data model are not 100% aligned,
>> does that mean that mapping some RDF models to a PG model loses
>> information, or just that it increases complexity and reduces efficiency?
>>
>> Does the same hold the other way around, from a PG model to an RDF model?
>>
>> I'll have a look at your implementation to understand things better.
>>
>> Cheers
>> Pieter
>>
>> On 21/12/2015 18:46, Mike Personick wrote:
>>> The RDF data model and the PG data model are not 100% aligned.  I know
>>> there have been a few academic papers on the subject.  For Blazegraph I am
>>> using a PG schema built on top of raw RDF.  But a raw RDF graph would not
>>> work with the Blazegraph TP3 interface if it doesn't follow the PG schema.
>>>
>>> On Mon, Dec 21, 2015 at 3:22 AM, pieter-gmail <pi...@gmail.com>
>>> wrote:
>>>
>>>> Found this recently, fyi
>>>>
>>>> http://arxiv.org/abs/1409.3288
>>>>
>>>> Cheers
>>>> Pieter
>>>>
>>>> On 12/12/2015 16:01, pieter wrote:
>>>>> Hi,
>>>>>
>>>>> I know many rdf vendors are TinkerPop providers.
>>>>>
>>>>> Can it work in the other direction, i.e. can a rdf dataset be loaded
>>>>> into a TinkerPop database?
>>>>> Is it possible to load any rdf dataset into TinkerPop without loss?
>>>>>
>>>>> Is this something TinkerPop is interested in?
>>>>>
>>>>> Thanks
>>>>> Pieter
>>>>>
>>>>>
>>

Re: rdf questions

Posted by Joshua Shinavier <jo...@fortytwo.net>.

Hi Pieter,

Yes, it is possible to map RDF graphs, and also RDF datasets (collections
of graphs with names), to a property graph data model without loss.
GraphSail [1] had to do this in order to use Blueprints-based DBs as triple
stores, querying over the RDF data and retrieving it.  GraphSail uses a
mapping almost identical to that of SailGraph [2] (see a schematic on that
page), which maps RDF to property graphs.  For the "opposite" of GraphSail
and SailGraph (i.e. arbitrary property graphs to RDF), see
PropertyGraphSail [3].

Olaf Hartig discusses some incompatibilities between PG and RDF in his
paper.  Some essential things to keep in mind:
*) In mapping between PG and RDF, you are forced to treat edges either as
resources or as statements.  If edges are statements, then any edge
properties are lost in the PG-->RDF mapping (unless you were to do
something a little weird with named graphs: one graph per statement).  If
edges are resources, the RDF format is quite verbose and is not symmetrical
with a useful RDF-->PG mapping.  PropertyGraphSail supports two styles of
mapping: one "verbose" (edge-reified) and the other compact (edges as
statements).
*) A straightforward RDF(datasets)-->PG mapping treats resources as
vertices and statements as edges or as properties depending on the object,
but this is more complicated if you want to preserve named graph metadata,
as you can't attach metadata to PG properties.  You already have a bit of a
problem if you want to do anything graph-like with named graph metadata, as
PG is not a hypergraph data model (no edges from edges).
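
To make the two points above concrete, here is a minimal Python sketch of
both directions. It is only an illustration under stated assumptions, not
the mapping GraphSail, SailGraph, or PropertyGraphSail actually use: the
example.org namespace, the Literal marker class, the function names, and
the tail/label/head predicates are all hypothetical, and named graphs are
ignored entirely.

```python
# Hypothetical namespace, for illustration only.
EX = "http://example.org/"


class Literal(str):
    """Marks an RDF literal so it can be told apart from a resource IRI."""


def rdf_to_pg(triples):
    """Resources become vertices. A statement becomes an edge when its object
    is a resource, or a vertex property when its object is a literal."""
    vertices = {}  # IRI -> {property key -> list of values}
    edges = []     # (subject IRI, predicate IRI, object IRI)
    for s, p, o in triples:
        props = vertices.setdefault(s, {})
        if isinstance(o, Literal):
            props.setdefault(p, []).append(str(o))
        else:
            vertices.setdefault(o, {})
            edges.append((s, p, o))
    return vertices, edges


def pg_edge_to_rdf(edge_id, out_v, label, in_v, props, reify):
    """Compact style: one statement per edge, so edge properties are dropped.
    Verbose style: the edge becomes a resource carrying its own statements."""
    if not reify:
        return [(out_v, EX + label, in_v)]
    e = EX + "edge/" + str(edge_id)
    triples = [
        (e, EX + "tail", out_v),
        (e, EX + "label", Literal(label)),
        (e, EX + "head", in_v),
    ]
    # Edge properties survive only in the reified form.
    triples += [(e, EX + k, Literal(str(v))) for k, v in sorted(props.items())]
    return triples


if __name__ == "__main__":
    data = [
        (EX + "alice", EX + "knows", EX + "bob"),
        (EX + "alice", EX + "name", Literal("Alice")),
    ]
    print(rdf_to_pg(data))
    print(pg_edge_to_rdf(1, EX + "alice", "knows", EX + "bob", {"since": 2015}, reify=False))
    print(pg_edge_to_rdf(1, EX + "alice", "knows", EX + "bob", {"since": 2015}, reify=True))
```

In the compact call the "since" property has nowhere to go, while the
reified call keeps it at the cost of four statements for a single edge.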

Best,

Josh

[1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
[2] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
[3]
https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation

On Mon, Dec 21, 2015 at 11:28 AM, pieter-gmail <pi...@gmail.com>
wrote:

> Thanks, I have just started on the rdf path.
>
> When you say the RDF data model and PG data model are not 100% aligned,
> does that mean that mapping some RDF models to a PG model loses
> information, or just that it increases complexity and reduces efficiency?
>
> Does the same hold the other way around, from a PG model to an RDF model?
>
> I'll have a look at your implementation to understand things better.
>
> Cheers
> Pieter
>
> On 21/12/2015 18:46, Mike Personick wrote:
> > The RDF data model and the PG data model are not 100% aligned.  I know
> > there have been a few academic papers on the subject.  For Blazegraph I am
> > using a PG schema built on top of raw RDF.  But a raw RDF graph would not
> > work with the Blazegraph TP3 interface if it doesn't follow the PG schema.
> >
> > On Mon, Dec 21, 2015 at 3:22 AM, pieter-gmail <pi...@gmail.com>
> > wrote:
> >
> >> Found this recently, fyi
> >>
> >> http://arxiv.org/abs/1409.3288
> >>
> >> Cheers
> >> Pieter
> >>
> >> On 12/12/2015 16:01, pieter wrote:
> >>> Hi,
> >>>
> >>> I know many rdf vendors are TinkerPop providers.
> >>>
> >>> Can it work in the other direction, i.e. can a rdf dataset be loaded
> >>> into a TinkerPop database?
> >>> Is it possible to load any rdf dataset into TinkerPop without loss?
> >>>
> >>> Is this something TinkerPop is interested in?
> >>>
> >>> Thanks
> >>> Pieter
> >>>
> >>>
> >>
>
>

Re: rdf questions

Posted by pieter-gmail <pi...@gmail.com>.

Thanks, I have just started on the rdf path.

When you say the RDF data model and PG data model are not 100% aligned,
does that mean that mapping some RDF models to a PG model loses
information, or just that it increases complexity and reduces efficiency?

Does the same hold the other way around, from a PG model to an RDF model?

I'll have a look at your implementation to understand things better.

Cheers
Pieter

On 21/12/2015 18:46, Mike Personick wrote:
> The RDF data model and the PG data model are not 100% aligned.  I know
> there have been a few academic papers on the subject.  For Blazegraph I am
> using a PG schema built on top of raw RDF.  But a raw RDF graph would not
> work with the Blazegraph TP3 interface if it doesn't follow the PG schema.
>
> On Mon, Dec 21, 2015 at 3:22 AM, pieter-gmail <pi...@gmail.com>
> wrote:
>
>> Found this recently, fyi
>>
>> http://arxiv.org/abs/1409.3288
>>
>> Cheers
>> Pieter
>>
>> On 12/12/2015 16:01, pieter wrote:
>>> Hi,
>>>
>>> I know many rdf vendors are TinkerPop providers.
>>>
>>> Can it work in the other direction, i.e. can a rdf dataset be loaded
>>> into a TinkerPop database?
>>> Is it possible to load any rdf dataset into TinkerPop without loss?
>>>
>>> Is this something TinkerPop is interested in?
>>>
>>> Thanks
>>> Pieter
>>>
>>>
>>

Re: rdf questions

Posted by Mike Personick <mi...@systap.com>.

The RDF data model and the PG data model are not 100% aligned.  I know
there have been a few academic papers on the subject.  For Blazegraph I am
using a PG schema built on top of raw RDF.  But a raw RDF graph would not
work with the Blazegraph TP3 interface if it doesn't follow the PG schema.

On Mon, Dec 21, 2015 at 3:22 AM, pieter-gmail <pi...@gmail.com>
wrote:

> Found this recently, fyi
>
> http://arxiv.org/abs/1409.3288
>
> Cheers
> Pieter
>
> On 12/12/2015 16:01, pieter wrote:
> > Hi,
> >
> > I know many rdf vendors are TinkerPop providers.
> >
> > Can it work in the other direction, i.e. can a rdf dataset be loaded
> > into a TinkerPop database?
> > Is it possible to load any rdf dataset into TinkerPop without loss?
> >
> > Is this something TinkerPop is interested in?
> >
> > Thanks
> > Pieter
> >
> >
>
>

Re: rdf questions

Posted by pieter-gmail <pi...@gmail.com>.

Found this recently, fyi

http://arxiv.org/abs/1409.3288

Cheers
Pieter

On 12/12/2015 16:01, pieter wrote:
> Hi,
>
> I know many rdf vendors are TinkerPop providers.
>
> Can it work in the other direction, i.e. can a rdf dataset be loaded
> into a TinkerPop database?
> Is it possible to load any rdf dataset into TinkerPop without loss?
>
> Is this something TinkerPop is interested in?
>
> Thanks
> Pieter
>
>