Posted to dev@tinkerpop.apache.org by Marko Rodriguez <ok...@gmail.com> on 2015/09/23 20:35:42 UTC

Gremlin benchmarking over releases.

Hello,

I think it would be important to have an integration test that benchmarks Gremlin (over TinkerGraph).

What does this look like?

	1. We have a collection of traversals that span the various uses of Gremlin (writes, reads, paths, aggregates, etc.).
	2. We have a scale-free graph (250k edges?) in TinkerGraph that we run this traversal set against.
	3. We save the results of this benchmark to a stats/-style directory that gets pushed to the repository (a rough sketch of steps 1-3 follows this list).
		- ???/stats/marko-09-23-2015-macosx-3.0.1.txt
		- ???/stats/marko-09-23-2015-macosx-3.1.0-SNAPSHOT.txt
		- etc.
	4. We can then look at how queries become better or worse with each release (and in SNAPSHOTs).
		- A cross-file visualization of these benchmarks would be great so we can easily see which aspects of Gremlin are getting better or worse.
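
Just to make this concrete, here is a very rough sketch (plain Java against TinkerGraph) of what a harness for steps 1-3 could look like. Everything in it -- the preferential-attachment generator, the tiny traversal set, the stats file name -- is a placeholder for illustration, not a proposed design:

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph;
import java.io.PrintWriter;
import java.util.*;
import java.util.function.Supplier;

public class GremlinBenchmarkSketch {

    public static void main(final String[] args) throws Exception {
        final TinkerGraph graph = TinkerGraph.open();
        generateScaleFree(graph, 50_000, 5);   // ~250k edges via preferential attachment
        final GraphTraversalSource g = graph.traversal();

        // a tiny, illustrative traversal set -- the real one would span writes,
        // reads, paths, aggregates, etc.
        final Map<String, Supplier<Object>> traversals = new LinkedHashMap<>();
        traversals.put("count-vertices", () -> g.V().count().next());
        traversals.put("out-out-count", () -> g.V().out().out().count().next());
        traversals.put("path-limit-1k", () -> g.V().out().out().path().limit(1000).toList().size());
        traversals.put("edge-label-group-count", () -> g.E().label().groupCount().next());

        // time each traversal and write a simple stats file that could be
        // committed under a stats/ directory (the file name is just an example)
        try (final PrintWriter out = new PrintWriter("stats-example-3.1.0-SNAPSHOT.txt")) {
            for (final Map.Entry<String, Supplier<Object>> entry : traversals.entrySet()) {
                final long start = System.nanoTime();
                entry.getValue().get();
                out.println(entry.getKey() + "\t" + ((System.nanoTime() - start) / 1_000_000) + "ms");
            }
        }
    }

    // naive preferential attachment: each new vertex attaches to 'edgesPerVertex'
    // existing vertices chosen proportionally to their current degree
    private static void generateScaleFree(final TinkerGraph graph, final int vertices, final int edgesPerVertex) {
        final Random random = new Random(1234);
        final List<Vertex> degreeWeighted = new ArrayList<>();
        degreeWeighted.add(graph.addVertex());
        for (int i = 1; i < vertices; i++) {
            final Vertex v = graph.addVertex();
            for (int j = 0; j < edgesPerVertex; j++) {
                final Vertex target = degreeWeighted.get(random.nextInt(degreeWeighted.size()));
                v.addEdge("knows", target);
                degreeWeighted.add(target);
                degreeWeighted.add(v);
            }
        }
    }
}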

Why only TinkerGraph?

	1. We don't want to benchmark a database/disk. The point is to benchmark Gremlin against itself over time.
	2. TinkerGraph doesn't evolve. It's probably the most stable piece of code in TP3 -- thus, it's a good baseline system.

Is anyone interested in working on something like this? Note that I'm not versed in the best practices for doing this, so if there is a better way to accomplish the goal of benchmarking Gremlin over releases, let's do that. However, let's start simple and grow, so we can get something working and providing us insights ASAP.

Finally, if this is generally a good idea, I can make a ticket in JIRA for this.

Thoughts?
Marko.

http://markorodriguez.com


Re: Gremlin benchmarking over releases.

Posted by Stephen Mallette <sp...@gmail.com>.
There are performance tests in the test suite and you can wire them up just
as you do the normal tests - note how TinkerGraph does them:

https://github.com/apache/incubator-tinkerpop/blob/ad27fce579a182de3ebf886fdbd85d5960852bdd/tinkergraph-gremlin/src/test/java/org/apache/tinkerpop/gremlin/tinkergraph/structure/TinkerGraphStructurePerformanceTest.java

https://github.com/apache/incubator-tinkerpop/blob/ad27fce579a182de3ebf886fdbd85d5960852bdd/tinkergraph-gremlin/src/test/java/org/apache/tinkerpop/gremlin/tinkergraph/structure/groovy/TinkerGraphGroovyEnvironmentPerformanceTest.java

There's not much meat to them, I'm afraid, but it's what we have right now.
I think that expanding on these in the way Marko described would make some
sense.  For TinkerPop core purposes we can focus on tracking TinkerGraph
only, but if those who implement the core interfaces want to make use of
the benchmarking tests, I think it would be good if we allowed that.
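
A rough sketch of what an expanded, Gremlin-focused benchmark might look like -- assuming a junit-benchmarks style harness, which I believe is what those linked tests are based on; the class and method names here are purely illustrative:

import com.carrotsearch.junitbenchmarks.BenchmarkOptions;
import com.carrotsearch.junitbenchmarks.BenchmarkRule;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TestRule;

public class GremlinTraversalPerformanceTest {

    @Rule
    public TestRule benchmarkRun = new BenchmarkRule();

    // the small built-in "modern" toy graph -- a generated scale-free graph
    // would stand in for it in a real benchmark
    private final GraphTraversalSource g = TinkerFactory.createModern().traversal();

    @BenchmarkOptions(benchmarkRounds = 20, warmupRounds = 5)
    @Test
    public void readHeavyTraversal() {
        g.V().out().out().values("name").toList();
    }

    @BenchmarkOptions(benchmarkRounds = 20, warmupRounds = 5)
    @Test
    public void pathTraversal() {
        g.V().both().both().path().limit(100).toList();
    }
}

Each such method could map to one entry in the stats files Marko described, so a given traversal's timings can be tracked release over release.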




Re: Gremlin benchmarking over releases.

Posted by Ran Magen <rm...@gmail.com>.
Hopping on the discussion: is there a standard way for vendors to test
performance? Do the performance suites give some kind of result? Can they be
used to compare different implementations?

Re: Gremlin benchmarking over releases.

Posted by Stephen Mallette <sp...@gmail.com>.
We already have a pattern for "performance tests" that uses some
benchmarking libs, so it wouldn't be hard to extend on that.  We'd discussed
this kind of thing before on the list (or perhaps in an old ticket) - Daniel
Kuppitz had some reservations about it, though I don't remember what they
were, so we never extended it.  I kinda like the idea of storing the stats
(labelled as you have it) in the repo itself - that seems like a good idea.
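
For reference, that existing wiring is just the usual suite pattern -- a provider points the performance suite at its GraphProvider the same way it does for the structure and process suites. A sketch from memory (the suite class name in particular may not be exact):

import org.apache.tinkerpop.gremlin.GraphProviderClass;
import org.apache.tinkerpop.gremlin.structure.StructurePerformanceSuite;
import org.apache.tinkerpop.gremlin.tinkergraph.TinkerGraphProvider;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph;
import org.junit.runner.RunWith;

// opt-in is declarative: run the performance suite against this provider's graph
@RunWith(StructurePerformanceSuite.class)
@GraphProviderClass(provider = TinkerGraphProvider.class, graph = TinkerGraph.class)
public class TinkerGraphStructurePerformanceTest {
}

Any implementer could reuse the same pattern against its own GraphProvider.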
