Posted to dev@tinkerpop.apache.org by "Marko A. Rodriguez (JIRA)" <ji...@apache.org> on 2017/10/31 15:08:00 UTC
[jira] [Closed] (TINKERPOP-1801) OLAP profile() step return incorrect timing
[ https://issues.apache.org/jira/browse/TINKERPOP-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marko A. Rodriguez closed TINKERPOP-1801.
-----------------------------------------
Resolution: Fixed
Assignee: Marko A. Rodriguez
Fix Version/s: 3.3.1, 3.2.7
> OLAP profile() step return incorrect timing
> --------------------------------------------
>
> Key: TINKERPOP-1801
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1801
> Project: TinkerPop
> Issue Type: Bug
> Components: hadoop
> Affects Versions: 3.3.0, 3.2.6
> Reporter: Artem Aliev
> Assignee: Marko A. Rodriguez
> Fix For: 3.2.7, 3.3.1
>
>
> ProfileStep measures the time spent inside its next()/hasNext() calls, which assumes that each call recursively pulls from the other steps.
> A GraphComputer, however, executes the traversal through message passing (RDD joins in Spark).
> So next() does not recursively call the other steps; it generates a message instead, and most of the time is taken by message passing (the RDD join).
> Thus, on a graph computer, the time between ProfileStep invocations should be measured, not the time inside them.
> Another approach is to collect Spark statistics with a SparkListener and add the Spark stage timings to the profiler metrics. That would work only for Spark, but it would give a better representation of step costs.
> The simple fix is to measure the time between OLAP iterations and add it to the profiled step.
> This does not take computer setup time into account, but it is precise enough for long-running queries.
> To reproduce:
> TinkerPop 3.2.6 Gremlin Console:
> {code}
> plugin activated: tinkerpop.server
> plugin activated: tinkerpop.utilities
> plugin activated: tinkerpop.spark
> plugin activated: tinkerpop.tinkergraph
> gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-grateful-gryo.properties')
> gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
> ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
> gremlin> g.V().out().out().count().profile()
> ==>Traversal Metrics
> Step                                                               Count  Traversers       Time (ms)    % Dur
> =============================================================================================================
> GraphStep(vertex,[])                                                 808         808           2.025    18.35
> VertexStep(OUT,vertex)                                              8049         562           4.430    40.14
> VertexStep(OUT,edge)                                              327370        7551           4.581    41.50
> CountGlobalStep                                                        1           1           0.001     0.01
> >TOTAL                                                                 -           -          11.038        -
> gremlin> clock(1){g.V().out().out().count().next() }
> ==>3421.92758
> gremlin>
> {code}
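
The "simple fix" described in the issue (measure the time between OLAP iterations and add it to the profiled step) can be sketched roughly as follows. This is only an illustration of the idea, not the change that was committed for TINKERPOP-1801: the class IterationTimer, its methods, and the step ids are hypothetical names, and the clock is injected so the bookkeeping can be checked without a real graph computer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not a TinkerPop class): charges the wall-clock time
 * that elapses between OLAP iterations to the step being profiled, instead
 * of timing only the body of next()/hasNext().
 */
public class IterationTimer {
    private final LongSupplier nanoClock;              // e.g. System::nanoTime
    private final Map<String, Long> nanosByStep = new LinkedHashMap<>();
    private long lastMark;

    public IterationTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.lastMark = nanoClock.getAsLong();         // start of the first iteration
    }

    /** Call when an iteration ends; the time since the previous mark goes to stepId. */
    public void iterationFinished(String stepId) {
        long now = nanoClock.getAsLong();
        nanosByStep.merge(stepId, now - lastMark, Long::sum);
        lastMark = now;
    }

    /** Accumulated nanoseconds attributed to stepId (0 if never charged). */
    public long nanosFor(String stepId) {
        return nanosByStep.getOrDefault(stepId, 0L);
    }

    /** Sum of all attributed time; setup before the first mark is not included. */
    public long totalNanos() {
        long total = 0L;
        for (long nanos : nanosByStep.values()) {
            total += nanos;
        }
        return total;
    }
}
```

In real use the timer would be constructed with System::nanoTime and iterationFinished would be called once per iteration. Time measured this way includes the message passing between steps, which is the part missing from the profile() output above (11.038 ms total, against 3421.92758 ms reported by clock(1) for the same traversal).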