You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by artem-aliev <gi...@git.apache.org> on 2017/10/17 18:19:32 UTC
[GitHub] tinkerpop pull request #734: TINKERPOP-1801: fix profile() timing in OLAP by...
GitHub user artem-aliev opened a pull request:
https://github.com/apache/tinkerpop/pull/734
TINKERPOP-1801: fix profile() timing in OLAP by adding worker iterati…
…on timings to step metrics
this is a simple fix that do not change any API
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1801
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tinkerpop/pull/734.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #734
----
commit 827ea9cfd57202612518e5e6bcff18f601dd2018
Author: artemaliev <artem.aliev@gmail,com>
Date: 2017-10-17T18:00:31Z
TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration timings to step metrics
this is a simple fix that do not change any API
----
---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Posted by artem-aliev <gi...@git.apache.org>.
Github user artem-aliev commented on the issue:
https://github.com/apache/tinkerpop/pull/734
Something strange happened, during rebases. I'll fix.
---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Posted by artem-aliev <gi...@git.apache.org>.
Github user artem-aliev commented on the issue:
https://github.com/apache/tinkerpop/pull/734
I have fixed test failures.
TinkerPopComputer does not call ComputerPorgram.execute methods if spit has no vertices.
For example: modern graph has 6 vertices but computer has 8 cores, there will be two empty splits.
TraversalVertexProgram use execute step to setup next profiling step, so it is not setup side effects properly for empty splits.
So tests did not filed in docker but failed on computer with more then 6 cores.
The fix add check that profile side effects were regester properly before using
---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Posted by okram <gi...@git.apache.org>.
Github user okram commented on the issue:
https://github.com/apache/tinkerpop/pull/734
This is a nice update @artem-aliev because it doesn't change API and it is general for all `GraphComputer` implementations. Great! A couple things please for a solid VOTE.
1. Please update the `CHANGELOG.asciidoc` with the change you made.
2. In this PR discussion, please provide a `CUT/PASTE` of what the new metrics `toString()` looks like so people can judge its merits.
Thank you.
---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:
https://github.com/apache/tinkerpop/pull/734
@okram if all looks good to you after artem's changes you should be free to merge this now:
All tests pass with `docker/build.sh -t -n -i`
VOTE +1
---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Posted by artem-aliev <gi...@git.apache.org>.
Github user artem-aliev commented on the issue:
https://github.com/apache/tinkerpop/pull/734
The fix add iteration time to the appropriate step. Iteration time is a time between workerIterationStart() and workerIterationEnd() callbacks.
So before fix the timing looks like
```
gremlin> g.V().out().out().count().profile()
==>Traversal Metrics
Step Count Traversers Time (ms) % Dur
=============================================================================================================
GraphStep(vertex,[]) 9962 9962 70.873 48.95
VertexStep(OUT,vertex) 1012657 3745 37.132 25.65
VertexStep(OUT,edge) 2101815 6192 36.751 25.39
CountGlobalStep 1 1 0.018 0.01
>TOTAL - - 144.775 -
```
While query runs 10s seconds.
After the fix:
```
gremlin> g.V().out().out().count().profile()
==>Traversal Metrics
Step Count Traversers Time (ms) % Dur
=============================================================================================================
GraphStep(vertex,[]) 9962 9962 14186.809 97.43
VertexStep(OUT,vertex) 1012657 3745 340.051 2.34
VertexStep(OUT,edge) 2101815 6192 33.684 0.23
CountGlobalStep 1 1 0.004 0.00
>TOTAL - - 14560.549 -
```
That shows that most of the time for this OLAP query was spend in the initial iteration (actually star graph creation).
There still could be some inconsistencies because
1. No computer specific setup time is measured
2. Spark has a lot of lazy staff, so most of Spark RDD setup are counted as a first step.
3. The algorithm could fail to assign timing to the right step in bas of sophisticated queries.
By the way new timing is pretty close to the wall clock time.
---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the issue:
https://github.com/apache/tinkerpop/pull/734
VOTE: +1
---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Posted by okram <gi...@git.apache.org>.
Github user okram commented on the issue:
https://github.com/apache/tinkerpop/pull/734
VOTE +1
---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:
https://github.com/apache/tinkerpop/pull/734
This doesn't build for me with just regular `mvn clean install`
---
[GitHub] tinkerpop pull request #734: TINKERPOP-1801: fix profile() timing in OLAP by...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/tinkerpop/pull/734
---