Posted to dev@tinkerpop.apache.org by Stephen Mallette <sp...@gmail.com> on 2018/10/03 17:06:16 UTC

[DISCUSS] ReferenceStrategy

We currently have this situation where users get a fair bit of
inconsistency around the contents of graph elements depending on a matrix
of different usage options that we offer. Here are just a few "options" as
examples:

1. Use embedded graph mode in OLTP and you likely get the implementation
version of a Vertex/Edge with accessible properties (e.g. TinkerVertex,
Neo4jVertex, etc)
2. Use embedded graph mode in OLAP and you get ReferenceVertex/Edge with no
properties
3. Use bytecode based requests with Gremlin Server and you get
ReferenceVertex/Edge with no properties
4. Use script based requests with Gremlin Server and you get
DetachedVertex/Edge with properties
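
To make the difference in the list above concrete, here is a minimal sketch of the two element shapes. It is not TinkerPop code: the Python classes below only borrow the names ReferenceVertex and DetachedVertex, and the to_reference helper and the sample data are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ReferenceVertex:
    """Hypothetical stand-in: carries only id and label, no properties."""
    id: Any
    label: str


@dataclass
class DetachedVertex:
    """Hypothetical stand-in: id and label plus a snapshot of properties."""
    id: Any
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


def to_reference(vertex: DetachedVertex) -> ReferenceVertex:
    """Drop the properties, keeping just enough to identify the element."""
    return ReferenceVertex(vertex.id, vertex.label)


# What a script-based request might hand back (case 4 above).
marko = DetachedVertex(1, "person", {"name": "marko", "age": 29})

# What a bytecode-based request might hand back (case 3 above).
marko_ref = to_reference(marko)
```

Code written against the first shape (marko.properties) has nothing to read on the second, which is the inconsistency being described.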

All this chaos developed out of a healthy evolution of our view of "how
Gremlin should work and how it fits in the graph community." As I've
lamented before and I'll lament again, if we'd foreseen "bytecode", then
a lot of things would have been more unified.

Anyway, irrespective of how we got here, I think this matrix of choices
ends up making things quite confusing for users.

To unify and simplify in 3.4.0, I think we could introduce a
ReferenceStrategy which would coerce graph elements to the "Reference".
Then we have some choices:

1. In the most extreme case, it could be installed as a default strategy.
All graphs would return the same stuff in whatever situation they were
used. Obviously, that breaks just about every test in existence and
probably half of the code on the planet that uses TinkerPop - everyone
ready to not get Amazon packages delivered on time anymore over this? But,
we're consistent! :)
2. We install it as a default strategy in graphs in Gremlin Server. Still a
breaking change but at least Gremlin Server is completely consistent. Users
can uninstall the strategy if they don't like it and stuff will work as it
always did.
3. We simply supply ReferenceStrategy as an option and let users install it
for themselves to help bring greater consistency to their installations.
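
As a rough illustration of options 2 and 3, the sketch below models a strategy that can be installed on, and removed from, a traversal source. Everything in it is hypothetical: the Element and TraversalSource classes, the reference_strategy function and the with_strategies/without_strategies methods are only loosely modeled on the idea of installing and uninstalling strategies, and say nothing about how ReferenceStrategy would actually be implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Tuple


@dataclass
class Element:
    """Hypothetical graph element with id, label and properties."""
    id: Any
    label: str
    properties: Dict[str, Any]


Strategy = Callable[[Element], Element]


def reference_strategy(element: Element) -> Element:
    """Coerce an element to a 'reference': same id and label, no properties."""
    return Element(element.id, element.label, {})


class TraversalSource:
    """Hypothetical source that applies its strategies to every result."""

    def __init__(self, elements: Iterable[Element],
                 strategies: Tuple[Strategy, ...] = ()) -> None:
        self._elements = list(elements)
        self._strategies = tuple(strategies)

    def with_strategies(self, *strategies: Strategy) -> "TraversalSource":
        # Return a new source with the extra strategies installed.
        return TraversalSource(self._elements, self._strategies + strategies)

    def without_strategies(self, *strategies: Strategy) -> "TraversalSource":
        # Return a new source with the named strategies removed.
        kept = tuple(s for s in self._strategies if s not in strategies)
        return TraversalSource(self._elements, kept)

    def V(self) -> List[Element]:
        results = []
        for element in self._elements:
            for strategy in self._strategies:
                element = strategy(element)
            results.append(element)
        return results


embedded = TraversalSource([Element(1, "person", {"name": "marko"})])

# Option 2: the server installs the strategy by default.
server_default = embedded.with_strategies(reference_strategy)

# A user who does not like it uninstalls it and gets the old behavior back.
opted_out = server_default.without_strategies(reference_strategy)
```

Under option 3 nothing is installed by default, and a user who wants consistency calls with_strategies themselves.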

I think we should consider option 2. Unless someone has other options to
consider, it seems like the easiest starting point for this that will
actually have an impact. There could be devils lying in wait in the
details, but I thought I'd feel folks out a bit on the general idea.

Thanks,

Stephen

Re: [DISCUSS] ReferenceStrategy

Posted by Stephen Mallette <sp...@gmail.com>.
i don't think so. we've been consistent in saying that we don't want to add
properties to elements when a full Gremlin Traversal Machine isn't in the
host programming language of the particular GLV.

On Wed, Oct 24, 2018 at 10:05 AM bryncooke@gmail.com <br...@gmail.com>
wrote:

> Just for completeness is it an option to support detached elements for
> GLVs?
>
> If GLVs did support detached elements then this would significantly reduce
> the impact of this change for users as this would mostly align with what
> they have now.
>
>
>

Re: [DISCUSS] ReferenceStrategy

Posted by br...@gmail.com, br...@gmail.com.

On 2018/10/22 18:21:13, Stephen Mallette <sp...@gmail.com> wrote: 
> https://issues.apache.org/jira/browse/TINKERPOP-2075
> 
Just for completeness, is it an option to support detached elements for GLVs?

If GLVs did support detached elements, then this would significantly reduce the impact of this change for users, as it would mostly align with what they have now.



Re: [DISCUSS] ReferenceStrategy

Posted by Stephen Mallette <sp...@gmail.com>.
https://issues.apache.org/jira/browse/TINKERPOP-2075

On Mon, Oct 8, 2018 at 1:37 PM Stephen Mallette <sp...@gmail.com>
wrote:

> > I'd like to propose 2a. Update Gremlin Server and TinkerGraph to behave
> > in the desired way, this would set the tone for all Graph implementations
> > to eventually adopt this consistent behaviour.
>
> a little scary, but i see your point. if we went that far then i guess we
> would also do neo4j. i will say that this change will be on the same order
> of effort as 1 because it will break a lot of tests in the test suite. in
> that sense, i'm not sure we're ready for that much change along 3.4.0. i'd
> propose that we consider "2a" only after we get "2" into place. Then we can
> determine how much damage 1 would do. Maybe it's not as bad as I think. An
> interesting side-effect of 1 is that it makes our Java test suite tests
> become more in line with the GLV tests - they would be roughly 1 to 1 match
> in assertions after all this.
>
>

Re: [DISCUSS] ReferenceStrategy

Posted by Stephen Mallette <sp...@gmail.com>.
> I'd like to propose 2a. Update Gremlin Server and TinkerGraph to behave
> in the desired way, this would set the tone for all Graph implementations
> to eventually adopt this consistent behaviour.

a little scary, but i see your point. if we went that far then i guess we
would also do neo4j. i will say that this change will be on the same order
of effort as 1 because it will break a lot of tests in the test suite. in
that sense, i'm not sure we're ready for that much change in 3.4.0. i'd
propose that we consider "2a" only after we get "2" into place. Then we can
determine how much damage 1 would do. Maybe it's not as bad as I think. An
interesting side-effect of 1 is that it brings our Java test suite tests
more in line with the GLV tests - they would be a roughly 1 to 1 match
in assertions after all this.


On Fri, Oct 5, 2018 at 11:21 AM bryncooke@gmail.com <br...@gmail.com>
wrote:

>
>
> The thing that worries me is that to go for option 2 still leaves
> TinkerGraph behaving in a way that is not consistent over the wire.
>
> The first thing that many users will do is fire up gremlin console and
> create a TinkerGraph without Gremlin Server. This immediately give them a
> false impression that properties on elements are available.
>
> I'd like to propose 2a. Update Gremlin Server and TinkerGraph to behave in
> the desired way, this would set the tone for all Graph implementations to
> eventually adopt this consistent behaviour.
>
> Bryn
>

Re: [DISCUSS] ReferenceStrategy

Posted by br...@gmail.com, br...@gmail.com.



The thing that worries me is that going for option 2 still leaves TinkerGraph behaving in a way that is not consistent with what comes over the wire.

The first thing that many users will do is fire up the Gremlin Console and create a TinkerGraph without Gremlin Server. This immediately gives them a false impression that properties on elements are available.

I'd like to propose 2a: update Gremlin Server and TinkerGraph to behave in the desired way. This would set the tone for all Graph implementations to eventually adopt this consistent behaviour.

Bryn