You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by artem-aliev <gi...@git.apache.org> on 2017/10/17 18:19:32 UTC

[GitHub] tinkerpop pull request #734: TINKERPOP-1801: fix profile() timing in OLAP by...

GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/734

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iterati…

    …on timings to step metrics
    
    this is a simple fix that do not change any API

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1801

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/734.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #734
    
----
commit 827ea9cfd57202612518e5e6bcff18f601dd2018
Author: artemaliev <artem.aliev@gmail,com>
Date:   2017-10-17T18:00:31Z

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration timings to step metrics
    this is a simple fix that do not change any API

----


---

[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...

Posted by artem-aliev <gi...@git.apache.org>.
Github user artem-aliev commented on the issue:

    https://github.com/apache/tinkerpop/pull/734
  
    Something strange happened, during rebases. I'll fix.


---

[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...

Posted by artem-aliev <gi...@git.apache.org>.
Github user artem-aliev commented on the issue:

    https://github.com/apache/tinkerpop/pull/734
  
    I have fixed test failures.
        TinkerPopComputer does not call ComputerPorgram.execute methods if spit has no vertices.
        For example: modern graph has 6 vertices but computer has 8 cores, there will be two empty splits.
        TraversalVertexProgram use execute step to setup next profiling step, so it is not setup side effects properly for empty splits.
        So tests did not filed in docker but failed on computer with more then 6 cores.
        The fix add check that profile side effects were regester properly before using


---

[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the issue:

    https://github.com/apache/tinkerpop/pull/734
  
    This is a nice update @artem-aliev because it doesn't change API and it is general for all `GraphComputer` implementations. Great! A couple things please for a solid VOTE.
    
    1. Please update the `CHANGELOG.asciidoc` with the change you made.
    2. In this PR discussion, please provide a `CUT/PASTE` of what the new metrics `toString()` looks like so people can judge its merits.
    
    Thank you.


---

[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:

    https://github.com/apache/tinkerpop/pull/734
  
    @okram if all looks good to you after artem's changes you should be free to merge this now:
    
    All tests pass with `docker/build.sh -t -n -i`
    
    VOTE +1


---

[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...

Posted by artem-aliev <gi...@git.apache.org>.
Github user artem-aliev commented on the issue:

    https://github.com/apache/tinkerpop/pull/734
  
    The fix add iteration time to the appropriate step. Iteration time is a time between workerIterationStart() and workerIterationEnd() callbacks.
    So before fix the timing looks like 
    ```
    gremlin> g.V().out().out().count().profile()
    ==>Traversal Metrics
    Step                                                               Count  Traversers       Time (ms)    % Dur
    =============================================================================================================
    GraphStep(vertex,[])                                                9962        9962          70.873    48.95
    VertexStep(OUT,vertex)                                           1012657        3745          37.132    25.65
    VertexStep(OUT,edge)                                             2101815        6192          36.751    25.39
    CountGlobalStep                                                        1           1           0.018     0.01
                                                >TOTAL                     -           -         144.775        -
    ```
    While query runs 10s seconds.
    After the fix:
    ```
    gremlin> g.V().out().out().count().profile()
    ==>Traversal Metrics
    Step                                                               Count  Traversers       Time (ms)    % Dur
    =============================================================================================================
    GraphStep(vertex,[])                                                9962        9962       14186.809    97.43
    VertexStep(OUT,vertex)                                           1012657        3745         340.051     2.34
    VertexStep(OUT,edge)                                             2101815        6192          33.684     0.23
    CountGlobalStep                                                        1           1           0.004     0.00
                                                >TOTAL                     -           -       14560.549        -
    ```
    That shows that most of the time for this OLAP query was spend in the initial iteration (actually star graph creation).
    There still could be some inconsistencies because
    1. No computer specific setup time is measured
    2. Spark has a lot of lazy staff, so most of Spark RDD setup are counted as a first step.
    3. The algorithm could fail to assign timing to the right step in bas of sophisticated queries.
    
    By the way new timing is pretty close to the wall clock time. 



---

[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...

Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the issue:

    https://github.com/apache/tinkerpop/pull/734
  
    VOTE: +1


---

[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the issue:

    https://github.com/apache/tinkerpop/pull/734
  
    VOTE +1


---

[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:

    https://github.com/apache/tinkerpop/pull/734
  
    This doesn't build for me with just regular `mvn clean install`


---

[GitHub] tinkerpop pull request #734: TINKERPOP-1801: fix profile() timing in OLAP by...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/tinkerpop/pull/734


---