Posted to dev@tinkerpop.apache.org by Marko Rodriguez <ok...@gmail.com> on 2016/01/30 18:09:49 UTC

[DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Hello,

With TinkerPop 3.1.1 about to be put up for VOTE, we can start to turn our attention toward 3.1.2 and 3.2.0.

I was thinking it would be good to have a planning session to organize JIRA and discuss order of operations. However, JIRA planning sessions are a bit boring as they are too "nitty gritty," so perhaps we can use this thread to discuss what we (as individuals) would like to accomplish for 3.1.2 and 3.2.0 in general. This way, we have summaries of everyone's desires, and the specifics can then be shaken out in JIRA. As such, here are my desires:

TinkerPop 3.1.2
	* Test a new shuffle optimization idea in SparkGraphComputer and, if it's efficient, use it.
	* Benchmark GiraphGraphComputer at scale and optimize it where need be.

TinkerPop 3.2.0
	* Gremlin DSLs -- e.g. social.people().aged(36).who().know().person("daniel").who().worksFor().company("cisco")
	* TraversalSource API redesign. g = graph.traversal().withComputer(…).withStrategy(…).withBulk(…). The current TraversalSourceBuilder model is horrible.
	* OLTP/OLAP-mixed traversal -- e.g. OLAP[g.V().out()]OLTP[limit(10)]OLAP[out().values("name").order()]OLTP[sample(1)]
	* GraphComputer API additions for intelligent data access -- e.g. g.V().count() does not need to grab all the edges of the graph!
	* Bulking beyond Long -- support BigInteger, Complex numbers, Doubles, etc.
	* Redesign TraverserRequirements -- this is a rat's nest that didn't really work out as planned, and it's inefficient. I think I can make this a lot simpler.
	* ServerGraph/ServerStep/ServerStrategy -- like OLAP, but for GremlinServer -- e.g. [GraphStep, VertexStep, ServerStep] (collaborate with GremlinServer people on this).
	* Scope.local & Scope.global rethinking -- count(local), dedup(local) … too many -- this is not manageable! What about g.V().groupCount().inside(order().limit(10)) instead of g.V().groupCount().order(local).limit(local,10)?
	* Clean up HadoopGraph configurations -- Why do we have both gremlin.spark.graphInputRDD and gremlin.hadoop.graphInputFormat? We should just have one configuration: gremlin.hadoop.graphInputClass.
	* Publish a tutorial on the Gremlin VM and compiling other languages to it. I would really like to have the gremlin-examples/ package that Jason/Stephen were talking about.
	* Optimize Gryo serialization and SparkGraphComputer's GryoSerializer.

Those are the big-ticket items that I would like to get handled for the next versions of TinkerPop.

What are your thoughts on these and what are your thoughts on what you plan to accomplish in this next push?

Take care,
Marko.

http://markorodriguez.com
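
To make the "Bulking beyond Long" item in the list above a little more concrete, here is a minimal sketch. It assumes bulking means what it usually does in TinkerPop: equal traversers are merged and a count (the bulk) records how many were merged. The class and method names are hypothetical and are not TinkerPop's API; the sketch only shows the bulk being carried as a BigInteger so that it can grow past Long.MAX_VALUE.

```java
import java.math.BigInteger;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of traverser bulking; not TinkerPop's actual classes. */
public class BulkSketch {

    /** Merge equal values, summing their bulks with arbitrary-precision arithmetic. */
    public static <T> Map<T, BigInteger> merge(List<Map.Entry<T, BigInteger>> traversers) {
        Map<T, BigInteger> bulks = new LinkedHashMap<>();
        for (Map.Entry<T, BigInteger> traverser : traversers) {
            // Equal values collapse into one entry whose bulk is the sum of the parts.
            bulks.merge(traverser.getKey(), traverser.getValue(), BigInteger::add);
        }
        return bulks;
    }

    public static void main(String[] args) {
        BigInteger maxLong = BigInteger.valueOf(Long.MAX_VALUE);
        Map<String, BigInteger> merged = merge(List.of(
                Map.entry("marko", maxLong),
                Map.entry("daniel", BigInteger.ONE),
                Map.entry("marko", BigInteger.ONE)));
        // The bulk for "marko" is Long.MAX_VALUE + 1, which a long could not hold.
        System.out.println(merged);
    }
}
```

Supporting Doubles or complex numbers as well would presumably need a more general "bulk" abstraction than a single numeric type; that design question is left open here.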

Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Ted Wilmes <tw...@gmail.com>.

Hey Marko,
Was this the ticket you were thinking of?
https://issues.apache.org/jira/browse/TINKERPOP-1014

I think that could definitely make a difference.  I'll try to get the JMH
bit finished in the next few weeks so we have a good base to work from.

I'll also put an investigation ticket in with some concrete examples for
the code-gen idea.

--Ted


On Tue, Feb 2, 2016 at 10:16 AM, Marko Rodriguez <ok...@gmail.com>
wrote:

> Hello Ted,
>
> Dude, all of those items you list are so crucial to my efforts. If that
> infrastructure existed, I would have so much more confidence in each
> release.
>
> > * TinkerPop-965 - optimize strategy application
>
> This will be huge. I have a ticket around somewhere about getting rid of
> the TraversalRequirements-concept which should help to reduce clock-cycles.
>
> > * I'd like to familiarize myself with the OLAP side of things and
> hopefully
> > begin to help out a bit with those tickets.
>
> I can teach you this area of the codebase. There are lots of little nitty
> gritty areas here and there that could use work. Moreover, small
> optimizations lead to huge performance benefits because in OLAP, when you
> are processing billions of edges, if each edge is 100k less in size, well,
> thats (Xbillion * 100k)-less serialization/network traffic/etc. I think
> getting our Serializers micro-micro would be epic.
>
> > * Explore possibility of introducing code generation into certain steps
> to
> > cut down on traversal execution overhead.  Granted, the gains would need
> to
> > outweigh the cost of compilation of generated code.  Other folks have had
> > good success with this technique in certain scenarios.  See the code
> > generation portion of Spark's Tungsten project for one example [1].
>
> Hmmm… Can you provide an example? (ticket)
>
> Thanks,
> Marko.
>
> http://markorodriguez.com

Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Marko Rodriguez <ok...@gmail.com>.

Hello Ted,

Dude, all of those items you list are so crucial to my efforts. If that infrastructure existed, I would have so much more confidence in each release.

> * TinkerPop-965 - optimize strategy application

This will be huge. I have a ticket around somewhere about getting rid of the TraverserRequirements concept, which should help to reduce clock cycles.

> * I'd like to familiarize myself with the OLAP side of things and hopefully
> begin to help out a bit with those tickets.

I can teach you this area of the codebase. There are lots of little nitty gritty areas here and there that could use work. Moreover, small optimizations lead to huge performance benefits because in OLAP, when you are processing billions of edges, if each edge is 100k less in size, well, that's (Xbillion * 100k) less serialization/network traffic/etc. I think getting our Serializers micro-micro would be epic.

> * Explore possibility of introducing code generation into certain steps to
> cut down on traversal execution overhead.  Granted, the gains would need to
> outweigh the cost of compilation of generated code.  Other folks have had
> good success with this technique in certain scenarios.  See the code
> generation portion of Spark's Tungsten project for one example [1].

Hmmm… Can you provide an example? (ticket)

Thanks,
Marko.

http://markorodriguez.com
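
The "(Xbillion * 100k)" estimate above is a simple product: total saving = number of edges times saving per edge. A small worked version follows. The edge count and the per-edge saving are made-up numbers chosen only to show the scale, and the unit of the per-edge saving is left unspecified, as it is in the message.

```java
/** Worked arithmetic for the per-edge saving argument; all numbers are hypothetical. */
public class SerializationSavings {

    /** Total saving = edge count x saving per edge; throws ArithmeticException on long overflow. */
    public static long totalSaved(long edgeCount, long savedPerEdge) {
        return Math.multiplyExact(edgeCount, savedPerEdge);
    }

    public static void main(String[] args) {
        long edges = 2_000_000_000L; // hypothetical: X = 2 billion edges
        long savedPerEdge = 100L;    // hypothetical saving per edge, in whatever unit is measured
        System.out.println(totalSaved(edges, savedPerEdge)); // prints 200000000000
    }
}
```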


Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Ted Wilmes <tw...@gmail.com>.

Hi guys,
Here are a few things I've been thinking about for 3.1.2 and 3.2.

TinkerPop 3.1.2
* TinkerPop-1016 - finish first set of JMH benchmarks
* TinkerPop-965 - optimize strategy application
* Study Ferma/TinkerPop benchmark, see where the performance deltas are
coming from, and create tickets as necessary

TinkerPop 3.2
* I'd like to familiarize myself with the OLAP side of things and hopefully
begin to help out a bit with those tickets.

Profiling results ultimately need to drive targeted performance
improvements but I've been thinking about experimenting with a few more
"out-there" ideas:
* Explore possibility of introducing code generation into certain steps to
cut down on traversal execution overhead.  Granted, the gains would need to
outweigh the cost of compilation of generated code.  Other folks have had
good success with this technique in certain scenarios.  See the code
generation portion of Spark's Tungsten project for one example [1].
* Take code generation one step further and explore the possibility of a
"code generation" strategy that, given a traversal (maybe only simple ones
to start), generates pure Java code which can then be compiled on the fly
and executed in a highly performant manner.

--Ted

[1]
https://databricks.com/blog/2015/04/28/project-tungsten-bringing-spark-closer-to-bare-metal.html

On Mon, Feb 1, 2016 at 5:01 PM, Marko Rodriguez <ok...@gmail.com>
wrote:

> Hi,
>
> Please bring this up on the respective ticket and we can discuss there.
> This way we don't steal this thread from 3.1.2 and 3.2.0 planning.
>
> Thanks,
> Marko.
>
> http://markorodriguez.com

Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Marko Rodriguez <ok...@gmail.com>.

Hi,

Please bring this up on the respective ticket and we can discuss there. This way we don't steal this thread from 3.1.2 and 3.2.0 planning.

Thanks,
Marko.

http://markorodriguez.com

On Feb 1, 2016, at 2:30 PM, Marvin Froeder <ve...@gmail.com> wrote:

> Any plans on making the return methods generic so we can specialize them?
> 
> For instance, instead of
> public interface Graph {
> public Iterator<Vertex> vertices(final Object... vertexIds);
> }
> to have
> public interface Graph<V extends Vertex> {
> public Iterator<V> vertices(final Object... vertexIds);
> }
> 
> 
> That way, orientdb-gremlin can expose custom operations and even enforce
> types for things like Element.id() and many other creative thinking =D
> 
> 
> On Tue, Feb 2, 2016 at 3:52 AM, Marko Rodriguez <ok...@gmail.com>
> wrote:
> 
>> Hi,
>> 
>> I think 3.2.0 can include breaking changes if need be. However, I believe
>> all the things that I want to do will be have @Deprecated backwards
>> compatible solutions.
>> 
>> Marko.
>> 
>> http://markorodriguez.com
>> 
>> On Feb 1, 2016, at 4:25 AM, Stephen Mallette <sp...@gmail.com> wrote:
>> 
>>> Is 3.2.0 going to be considered a "breaking" version in the sense that we
>>> need to alter some APIs? or will it be possible to do 3.2.0 without that?
>>> I'm in favor of a breaking version for 3.2.0 so that we can try to clean
>> up
>>> some old code especially if we have other changes driving that.
>>> 
>>> On Sat, Jan 30, 2016 at 7:55 PM, Marko Rodriguez <ok...@gmail.com>
>>> wrote:
>>> 
>>>> Hello Pieter,
>>>> 
>>>>> A tad selfish I know,
>>>>> but https://issues.apache.org/jira/browse/TINKERPOP-968 is what I am
>>>>> waiting for.
>>>> 
>>>> The things I listed are what I care about and what I plan to work on. If
>>>> you have things you care about, you can work on those. If you are
>> unsure of
>>>> a development strategy, perhaps you can get others excited about your
>> idea
>>>> with a [DISCUSS], work through pros/cons, get some buy in, etc. From
>> there,
>>>> develop the idea, test it, document it, and ultimately provide a PR to
>> get
>>>> it merged into a release line.
>>>> 
>>>>       http://tinkerpop.apache.org/docs/3.1.1-SNAPSHOT/dev/developer/
>>>> 
>>>> SIDENOTE: A few people emailed me personally saying comments to the
>>>> effect: "Please deliver X, Y, Z feature." Note, if you want something
>> done,
>>>> do it. If you don't know how to do it, learn it. If you don't know how
>> to
>>>> learn it, ask and we can point you in the right direction. If you don't
>>>> know how to ask -- I know you are lying cause you asked me to deliver
>> X, Y,
>>>> Z. Gotcha!
>>>> 
>>>> Take care,
>>>> Marko.
>>>> 
>>>> http://markorodriguez.com
>>>> 
>>>>> 
>>>>> Cheers
>>>>> Pieter
>>>>> 
>>>>> On 30/01/2016 19:09, Marko Rodriguez wrote:
>>>>>> Hello,
>>>>>> 
>>>>>> With TinkerPop 3.1.1 about to be put up for VOTE, we can start to turn
>>>> our attentions towards 3.1.2 and 3.2.0.
>>>>>> 
>>>>>> I was thinking it would be good to have a planning session to organize
>>>> JIRA and discuss order of operations. However, JIRA planning sessions
>> are a
>>>> bit boring as they are too "nitty gritty," so perhaps we can use this
>>>> thread to discuss what we (as individuals) would like to accomplish for
>>>> 3.1.2 and 3.2.0 in general. This way, we have more summaries of
>> everyone's
>>>> desires and then the specifics can be shakin' out in JIRA. As such, here
>>>> are my desires:
>>>>>> 
>>>>>> TinkerPop 3.1.2
>>>>>>    * Test a new shuffle optimization idea in SparkGraphComputer and
>>>> if its efficient, use it.
>>>>>>    * Benchmark GiraphGraphComputer at scale and optimize it where
>>>> need be.
>>>>>> 
>>>>>> TinkerPop 3.2.0
>>>>>>    * Gremlin DSLs -- e.g.
>>>> 
>> social.people().aged(36).who().know().person("daniel").who().worksFor().company("cisco")
>>>>>>    * TraversalSource API redesign. g =
>>>> graph.traversal().withComputer(…).withStrategy(…).withBulk(…). The
>> current
>>>> TraversalSourceBuilder model is horrible.
>>>>>>    * OLTP/OLAP-mixed traversal -- e.g.
>>>> 
>> OLAP[g.V().out()]OLTP[limit(10)]OLAP[out().values("name").order()]OLTP[sample(1)]
>>>>>>    * GraphComputer API additions for intelligent data access -- e.g.
>>>> g.V().count() does not need to grab all the edges of the graph!
>>>>>>    * Bulking beyond Long -- support BigInteger, Complex numbers,
>>>> Doubles, etc.
>>>>>>    * Redesign TraverserRequirements -- this is a rats nest that
>>>> didn't really work out as planned and its inefficient. I think I can
>> make
>>>> this a lot more simple.
>>>>>>    * ServerGraph/ServerStep/ServerStrategy -- like OLAP, but for
>>>> GremlinServer -- e.g. [GraphStep, VertexStep, ServerStep] (collaborate
>> with
>>>> GremlinServer people on this).
>>>>>>    * Scope.local & Scope.global rethinking -- count(local),
>>>> dedup(local) … too many -- this is not manageable! What about
>>>> g.V().groupCount().inside(order().limit(10)) instead of
>>>> g.V().groupCount().order(local).limit(local,10).
>>>>>>    * Clean up HadoopGraph configurations -- Why do we have
>>>> gremlin.spark.graphInputRDD and gremlin.hadoop.graphInputFormat. We
>> should
>>>> just have one configuration: gremlin.hadoop.graphInputClass.
>>>>>>    * Publish a tutorial on the Gremlin VM and compiling other
>>>> languages to it. I would really like to have the gremlin-examples/
>> package
>>>> that Jason/Stephen were talking about.
>>>>>>    * Optimize Gryo serialization and SparkGraphComputer's
>>>> GryoSerializer.
>>>>>> 
>>>>>> Those are the big ticket items that I would like to get handle for the
>>>> next versions of TinkerPop.
>>>>>> 
>>>>>> What are your thoughts on these and what are your thoughts on what you
>>>> plan to accomplish in this next push?
>>>>>> 
>>>>>> Take care,
>>>>>> Marko.
>>>>>> 
>>>>>> http://markorodriguez.com
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>> 
>>

Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Marvin Froeder <ve...@gmail.com>.

I created a JIRA ticket for the idea I'm proposing:
https://issues.apache.org/jira/browse/TINKERPOP-1124

I would be willing to work on it.

On Tue, Feb 2, 2016 at 10:30 AM, Marvin Froeder <ve...@gmail.com> wrote:

> Any plans on making the return methods generic so we can specialize them?
>
> For instance, instead of
> public interface Graph {
> public Iterator<Vertex> vertices(final Object... vertexIds);
> }
> to have
> public interface Graph<V extends Vertex> {
> public Iterator<V> vertices(final Object... vertexIds);
> }
>
>
> That way, orientdb-gremlin can expose custom operations and even enforce
> types for things like Element.id() and many other creative thinking =D

Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Marvin Froeder <ve...@gmail.com>.

Are there any plans to make the return types generic so that providers can specialize them?

For instance, instead of

    public interface Graph {
        public Iterator<Vertex> vertices(final Object... vertexIds);
    }

to have

    public interface Graph<V extends Vertex> {
        public Iterator<V> vertices(final Object... vertexIds);
    }


That way, orientdb-gremlin can expose custom operations and even enforce
types for things like Element.id(), among many other creative uses =D
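
To make the proposal concrete, here is a minimal, self-contained Java sketch. It is not TinkerPop code: Vertex and Graph below are reduced stand-ins for the real interfaces, and DemoVertex, DemoGraph, addVertex() and kind() are hypothetical names made up for illustration. It only shows that a provider could return its own vertex type, with a typed id() and extra operations, without casts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Reduced stand-in for the real Vertex interface (assumption: only id() is needed here).
interface Vertex {
    Object id();
}

// The proposed shape: the vertex type becomes a type parameter of Graph.
interface Graph<V extends Vertex> {
    Iterator<V> vertices(final Object... vertexIds);
}

// Hypothetical provider vertex: narrows id() to String and adds a custom operation.
final class DemoVertex implements Vertex {
    private final String id;

    DemoVertex(final String id) {
        this.id = id;
    }

    @Override
    public String id() { // covariant return type: String instead of Object
        return id;
    }

    // Hypothetical provider-specific operation: the text before the first ':'.
    // Assumes the id has the form "kind:number".
    String kind() {
        return id.substring(0, id.indexOf(':'));
    }
}

// Hypothetical provider graph backed by an in-memory list.
final class DemoGraph implements Graph<DemoVertex> {
    private final List<DemoVertex> store = new ArrayList<>();

    DemoVertex addVertex(final String id) {
        final DemoVertex vertex = new DemoVertex(id);
        store.add(vertex);
        return vertex;
    }

    @Override
    public Iterator<DemoVertex> vertices(final Object... vertexIds) {
        // No ids means "all vertices"; otherwise keep only the requested ids.
        final List<Object> wanted = Arrays.asList(vertexIds);
        final List<DemoVertex> result = new ArrayList<>();
        for (final DemoVertex vertex : store) {
            if (wanted.isEmpty() || wanted.contains(vertex.id())) {
                result.add(vertex);
            }
        }
        return result.iterator();
    }
}
```

With this shape, graph.vertices("company:7").next().kind() compiles without a cast. The cost is that every existing Graph implementation and caller would have to pick up the new type parameter, so it would likely count as a breaking change.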


On Tue, Feb 2, 2016 at 3:52 AM, Marko Rodriguez <ok...@gmail.com>
wrote:

> Hi,
>
> I think 3.2.0 can include breaking changes if need be. However, I believe
> all the things that I want to do will have @Deprecated backwards
> compatible solutions.
>
> Marko.
>
> http://markorodriguez.com

Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Marko Rodriguez <ok...@gmail.com>.

Hi,

I think 3.2.0 can include breaking changes if need be. However, I believe everything I want to do will have @Deprecated, backwards-compatible solutions.
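
As a rough illustration of what such a solution could look like (a sketch only; SourceSketch, addStrategy() and withStrategy() are hypothetical names, not the TinkerPop API), the old method stays in place, is marked @Deprecated, and simply delegates to its replacement:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class used only to illustrate the deprecation pattern.
final class SourceSketch {
    private final List<String> strategies;

    SourceSketch() {
        this(Collections.emptyList());
    }

    private SourceSketch(final List<String> strategies) {
        this.strategies = strategies;
    }

    // New fluent method: returns a new source with one more strategy name.
    SourceSketch withStrategy(final String name) {
        final List<String> next = new ArrayList<>(strategies);
        next.add(name);
        return new SourceSketch(Collections.unmodifiableList(next));
    }

    /**
     * Old entry point, kept so that existing callers still compile.
     *
     * @deprecated use {@link #withStrategy(String)} instead.
     */
    @Deprecated
    SourceSketch addStrategy(final String name) {
        return withStrategy(name); // same behavior, new implementation
    }

    List<String> strategies() {
        return strategies;
    }
}
```

Existing callers keep working and get a compiler warning pointing at the replacement, and the old method can be dropped in a later release.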

Marko.

http://markorodriguez.com

On Feb 1, 2016, at 4:25 AM, Stephen Mallette <sp...@gmail.com> wrote:

> Is 3.2.0 going to be considered a "breaking" version in the sense that we
> need to alter some APIs? or will it be possible to do 3.2.0 without that?
> I'm in favor of a breaking version for 3.2.0 so that we can try to clean up
> some old code especially if we have other changes driving that.

Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Stephen Mallette <sp...@gmail.com>.

Is 3.2.0 going to be considered a "breaking" version in the sense that we
need to alter some APIs? Or will it be possible to do 3.2.0 without that?
I'm in favor of a breaking version for 3.2.0 so that we can clean up
some old code, especially if we have other changes driving that.

On Sat, Jan 30, 2016 at 7:55 PM, Marko Rodriguez <ok...@gmail.com>
wrote:

> Hello Pieter,
>
> > A tad selfish I know,
> > but https://issues.apache.org/jira/browse/TINKERPOP-968 is what I am
> > waiting for.
>
> The things I listed are what I care about and what I plan to work on. If
> you have things you care about, you can work on those. If you are unsure of
> a development strategy, perhaps you can get others excited about your idea
> with a [DISCUSS], work through pros/cons, get some buy in, etc. From there,
> develop the idea, test it, document it, and ultimately provide a PR to get
> it merged into a release line.
>
>         http://tinkerpop.apache.org/docs/3.1.1-SNAPSHOT/dev/developer/
>
> SIDENOTE: A few people emailed me personally saying comments to the
> effect: "Please deliver X, Y, Z feature." Note, if you want something done,
> do it. If you don't know how to do it, learn it. If you don't know how to
> learn it, ask and we can point you in the right direction. If you don't
> know how to ask -- I know you are lying cause you asked me to deliver X, Y,
> Z. Gotcha!
>
> Take care,
> Marko.
>
> http://markorodriguez.com

Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Marko Rodriguez <ok...@gmail.com>.

Hello Pieter,

> A tad selfish I know,
> but https://issues.apache.org/jira/browse/TINKERPOP-968 is what I am
> waiting for.

The things I listed are what I care about and what I plan to work on. If you have things you care about, you can work on those. If you are unsure of a development strategy, perhaps you can get others excited about your idea with a [DISCUSS] thread, work through the pros/cons, get some buy-in, etc. From there, develop the idea, test it, document it, and ultimately provide a PR to get it merged into a release line.

	http://tinkerpop.apache.org/docs/3.1.1-SNAPSHOT/dev/developer/

SIDENOTE: A few people emailed me personally with comments to the effect of: "Please deliver X, Y, Z feature." Note, if you want something done, do it. If you don't know how to do it, learn it. If you don't know how to learn it, ask and we can point you in the right direction. If you don't know how to ask -- I know you are lying because you asked me to deliver X, Y, Z. Gotcha!

Take care,
Marko.

http://markorodriguez.com


Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by pieter-gmail <pi...@gmail.com>.

A tad selfish I know,
but https://issues.apache.org/jira/browse/TINKERPOP-968 is what I am
waiting for.

Cheers
Pieter


Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Marko Rodriguez <ok...@gmail.com>.

Hi Marvin,

> How can I get involved in gremlin dsl?
> I have a history with querydsl and would love to see the same behavior for
> graphs.

Here is the ticket we have so far.

	https://issues.apache.org/jira/browse/TINKERPOP-786

Here are some avenues forward:

	1. You do not like the approach in TINKERPOP-786
		- You can provide a [DISCUSS] email related to this ticket and pitch an approach that you feel is better.
	2. You do like the approach in TINKERPOP-786
		- You can comment on the ticket with extended ideas, notes, comments, etc.

I think once we have an approach that people are happy with, we move forward with development. Question: are you a developer?

Take care,
Marko.

http://markorodriguez.com




> 
> On Sun, 31 Jan 2016 06:10 Marko Rodriguez <ok...@gmail.com> wrote:
> 
>> Hello,
>> 
>> With TinkerPop 3.1.1 about to be put up for VOTE, we can start to turn our
>> attentions towards 3.1.2 and 3.2.0.
>> 
>> I was thinking it would be good to have a planning session to organize
>> JIRA and discuss order of operations. However, JIRA planning sessions are a
>> bit boring as they are too "nitty gritty," so perhaps we can use this
>> thread to discuss what we (as individuals) would like to accomplish for
>> 3.1.2 and 3.2.0 in general. This way, we have more summaries of everyone's
>> desires and then the specifics can be shakin' out in JIRA. As such, here
>> are my desires:
>> 
>> TinkerPop 3.1.2
>>        * Test a new shuffle optimization idea in SparkGraphComputer and
>> if its efficient, use it.
>>        * Benchmark GiraphGraphComputer at scale and optimize it where
>> need be.
>> 
>> TinkerPop 3.2.0
>>        * Gremlin DSLs -- e.g.
>> social.people().aged(36).who().know().person("daniel").who().worksFor().company("cisco")
>>        * TraversalSource API redesign. g =
>> graph.traversal().withComputer(…).withStrategy(…).withBulk(…). The current
>> TraversalSourceBuilder model is horrible.
>>        * OLTP/OLAP-mixed traversal -- e.g.
>> OLAP[g.V().out()]OLTP[limit(10)]OLAP[out().values("name").order()]OLTP[sample(1)]
>>        * GraphComputer API additions for intelligent data access -- e.g.
>> g.V().count() does not need to grab all the edges of the graph!
>>        * Bulking beyond Long -- support BigInteger, Complex numbers,
>> Doubles, etc.
>>        * Redesign TraverserRequirements -- this is a rats nest that
>> didn't really work out as planned and its inefficient. I think I can make
>> this a lot more simple.
>>        * ServerGraph/ServerStep/ServerStrategy -- like OLAP, but for
>> GremlinServer -- e.g. [GraphStep, VertexStep, ServerStep] (collaborate with
>> GremlinServer people on this).
>>        * Scope.local & Scope.global rethinking -- count(local),
>> dedup(local) … too many -- this is not manageable! What about
>> g.V().groupCount().inside(order().limit(10)) instead of
>> g.V().groupCount().order(local).limit(local,10).
>>        * Clean up HadoopGraph configurations -- Why do we have
>> gremlin.spark.graphInputRDD and gremlin.hadoop.graphInputFormat. We should
>> just have one configuration: gremlin.hadoop.graphInputClass.
>>        * Publish a tutorial on the Gremlin VM and compiling other
>> languages to it. I would really like to have the gremlin-examples/ package
>> that Jason/Stephen were talking about.
>>        * Optimize Gryo serialization and SparkGraphComputer's
>> GryoSerializer.
>> 
>> Those are the big-ticket items that I would like to get a handle on for
>> the next versions of TinkerPop.
>> 
>> What are your thoughts on these and what are your thoughts on what you
>> plan to accomplish in this next push?
>> 
>> Take care,
>> Marko.
>> 
>> http://markorodriguez.com
>> 
>>

Re: [DISCUSS] TinkerPop 3.1.2 and 3.2.0 Planning

Posted by Marvin Froeder <ve...@gmail.com>.

How can I get involved in the Gremlin DSL work?

I have a history with Querydsl and would love to see the same behavior for
graphs.

On Sun, 31 Jan 2016 06:10 Marko Rodriguez <ok...@gmail.com> wrote:

> Hello,
>
> With TinkerPop 3.1.1 about to be put up for VOTE, we can start to turn our
> attentions towards 3.1.2 and 3.2.0.
>
> I was thinking it would be good to have a planning session to organize
> JIRA and discuss order of operations. However, JIRA planning sessions are a
> bit boring as they are too "nitty gritty," so perhaps we can use this
> thread to discuss what we (as individuals) would like to accomplish for
> 3.1.2 and 3.2.0 in general. This way, we have more summaries of everyone's
> desires and then the specifics can be shakin' out in JIRA. As such, here
> are my desires:
>
> TinkerPop 3.1.2
>         * Test a new shuffle optimization idea in SparkGraphComputer and
> if it's efficient, use it.
>         * Benchmark GiraphGraphComputer at scale and optimize it where
> need be.
>
> TinkerPop 3.2.0
>         * Gremlin DSLs -- e.g.
> social.people().aged(36).who().know().person("daniel").who().worksFor().company("cisco")
>         * TraversalSource API redesign. g =
> graph.traversal().withComputer(…).withStrategy(…).withBulk(…). The current
> TraversalSourceBuilder model is horrible.
>         * OLTP/OLAP-mixed traversal -- e.g.
> OLAP[g.V().out()]OLTP[limit(10)]OLAP[out().values("name").order()]OLTP[sample(1)]
>         * GraphComputer API additions for intelligent data access -- e.g.
> g.V().count() does not need to grab all the edges of the graph!
>         * Bulking beyond Long -- support BigInteger, Complex numbers,
> Doubles, etc.
>         * Redesign TraverserRequirements -- this is a rat's nest that
> didn't really work out as planned and it's inefficient. I think I can make
> this a lot more simple.
>         * ServerGraph/ServerStep/ServerStrategy -- like OLAP, but for
> GremlinServer -- e.g. [GraphStep, VertexStep, ServerStep] (collaborate with
> GremlinServer people on this).
>         * Scope.local & Scope.global rethinking -- count(local),
> dedup(local) … too many -- this is not manageable! What about
> g.V().groupCount().inside(order().limit(10)) instead of
> g.V().groupCount().order(local).limit(local,10).
>         * Clean up HadoopGraph configurations -- Why do we have
> gremlin.spark.graphInputRDD and gremlin.hadoop.graphInputFormat? We should
> just have one configuration: gremlin.hadoop.graphInputClass.
>         * Publish a tutorial on the Gremlin VM and compiling other
> languages to it. I would really like to have the gremlin-examples/ package
> that Jason/Stephen were talking about.
>         * Optimize Gryo serialization and SparkGraphComputer's
> GryoSerializer.
>
> Those are the big-ticket items that I would like to get a handle on for
> the next versions of TinkerPop.
>
> What are your thoughts on these and what are your thoughts on what you
> plan to accomplish in this next push?
>
> Take care,
> Marko.
>
> http://markorodriguez.com
>
>