Posted to dev@tinkerpop.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/17 18:07:07 UTC

[jira] [Commented] (TINKERPOP-1801) OLAP profile() step return incorrect timing

    [ https://issues.apache.org/jira/browse/TINKERPOP-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208050#comment-16208050 ] 

ASF GitHub Bot commented on TINKERPOP-1801:
-------------------------------------------

GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/733

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration timings to step metrics
    
    This is a simple fix that does not change any API.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1801

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/733.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #733
    
----
commit 827ea9cfd57202612518e5e6bcff18f601dd2018
Author: artemaliev <artem.aliev@gmail,com>
Date:   2017-10-17T18:00:31Z

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration timings to step metrics
    this is a simple fix that do not change any API

----


>  OLAP profile() step return incorrect timing
> --------------------------------------------
>
>                 Key: TINKERPOP-1801
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1801
>             Project: TinkerPop
>          Issue Type: Bug
>    Affects Versions: 3.3.0, 3.2.6
>            Reporter: Artem Aliev
>
> ProfileStep measures the time spent inside its next()/hasNext() calls, assuming that the steps call one another recursively.
> A GraphComputer, however, executes the traversal through message passing (RDD joins on Spark).
> There, next() does not recursively call the following steps; it only generates a message, and most of the time is spent on message passing (the RDD join).
> On a graph computer, the time between ProfileStep calls should therefore be measured, not the time inside them.
> Another approach is to collect Spark statistics with a SparkListener and add the Spark stage timings to the profiler metrics. That would work only for Spark, but would give a better representation of step costs.
> The simple fix is to measure the time between OLAP iterations and add it to the profiled step.
> This does not take computer setup time into account, but it is precise enough for long-running queries.
> To reproduce:
> TinkerPop 3.2.6, Gremlin console:
> {code}
> plugin activated: tinkerpop.server
> plugin activated: tinkerpop.utilities
> plugin activated: tinkerpop.spark
> plugin activated: tinkerpop.tinkergraph
> gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-grateful-gryo.properties')
> gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
> ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
> gremlin> g.V().out().out().count().profile()
> ==>Traversal Metrics
> Step                                                               Count  Traversers       Time (ms)    % Dur
> =============================================================================================================
> GraphStep(vertex,[])                                                 808         808           2.025    18.35
> VertexStep(OUT,vertex)                                              8049         562           4.430    40.14
> VertexStep(OUT,edge)                                              327370        7551           4.581    41.50
> CountGlobalStep                                                        1           1           0.001     0.01
>                                             >TOTAL                     -           -          11.038        -
> gremlin> clock(1){g.V().out().out().count().next() }
> ==>3421.92758
> gremlin>
> {code}
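
The gap between the profiled total (about 11 ms) and the clock() result (about 3422 ms) can be illustrated with a small, self-contained Java sketch. It is not the code from the pull request and does not use any TinkerPop or Spark API: StepTimer, WorkerIterationClock and simulate() are hypothetical names, and a fake clock stands in for real step work and message passing. It only shows the idea described in the issue, that timing a whole worker iteration captures the message-passing cost that timing inside the step misses.

```java
import java.util.function.LongSupplier;

final class IterationTimingSketch {

    /** Hypothetical stand-in for a profiled step's duration metric. */
    static final class StepTimer {
        private long totalNanos;
        void add(long nanos) { totalNanos += nanos; }
        long totalNanos() { return totalNanos; }
    }

    /** Hypothetical helper that charges a whole worker iteration to a step. */
    static final class WorkerIterationClock {
        private final LongSupplier nanoClock;
        private long startedAt;

        WorkerIterationClock(LongSupplier nanoClock) { this.nanoClock = nanoClock; }

        void iterationStart() { startedAt = nanoClock.getAsLong(); }

        void iterationEnd(StepTimer timer) { timer.add(nanoClock.getAsLong() - startedAt); }
    }

    /**
     * Simulates OLAP iterations with a fake clock.
     * Returns {time measured inside the step, time measured per whole iteration}.
     */
    static long[] simulate(long stepWorkNanos, long messagePassingNanos, int iterations) {
        final long[] now = {0L};                      // fake clock, advanced by hand
        StepTimer insideStep = new StepTimer();
        StepTimer wholeIteration = new StepTimer();
        WorkerIterationClock clock = new WorkerIterationClock(() -> now[0]);

        for (int i = 0; i < iterations; i++) {
            clock.iterationStart();

            // What a timer inside the step sees: only the step's own work.
            long before = now[0];
            now[0] += stepWorkNanos;
            insideStep.add(now[0] - before);

            // Message passing (for example an RDD join) happens outside the step.
            now[0] += messagePassingNanos;

            clock.iterationEnd(wholeIteration);
        }
        return new long[] {insideStep.totalNanos(), wholeIteration.totalNanos()};
    }

    public static void main(String[] args) {
        long[] totals = simulate(5L, 1_000L, 3);
        System.out.println("inside step: " + totals[0] + " ns, whole iteration: " + totals[1] + " ns");
    }
}
```

With 5 ns of step work and 1000 ns of message passing per iteration, the in-step timer accumulates only the step work, while the per-iteration timer also includes the message passing. As the issue notes, computer setup time is still outside either measurement.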



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)