You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by "Marc de Lignie (JIRA)" <ji...@apache.org> on 2018/06/06 19:58:00 UTC

[jira] [Comment Edited] (TINKERPOP-1967) Add a connectedComponent() step

    [ https://issues.apache.org/jira/browse/TINKERPOP-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503837#comment-16503837 ] 

Marc de Lignie edited comment on TINKERPOP-1967 at 6/6/18 7:57 PM:
-------------------------------------------------------------------

The comparison between the OLTP query and the connectedComponent step was more productive. Below you see figures for graphs with random components of fixed size with an edge/vertex ratio of 6, for which the two algorithms have the same order of magnitude runtime. This means the OLTP query could still be useful for networks with long strings of vertices. So, I will do some more work to produce some comparison graphs, add them to my work and rework them into a PR against TINKERPOP-1967.

OLTP: Duration: 0.064
 Component sizes: [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
 Step: Duration: 0.052
 Component sizes: [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
 OLTP: Duration: 0.134
 Component sizes: [200, 200, 200, 200, 200, 200, 200, 200, 200, 200]
 Step: Duration: 0.096
 Component sizes: [200, 200, 200, 200, 200, 200, 200, 200, 200, 200]
 OLTP: Duration: 0.269
 Component sizes: [400, 400, 400, 400, 400, 400, 400, 400, 400, 400]
 Step: Duration: 0.129
 Component sizes: [400, 400, 400, 400, 400, 400, 400, 400, 400, 400]
 OLTP: Duration: 0.585
 Component sizes: [800, 800, 800, 800, 800, 800, 800, 800, 800, 800]
 Step: Duration: 0.252
 Component sizes: [800, 800, 800, 800, 800, 800, 800, 800, 800, 800]
 OLTP: Duration: 1.183
 Component sizes: [1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600]
 Step: Duration: 0.559
 Component sizes: [1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600]
 OLTP: Duration: 2.375
 Component sizes: [3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200]
 Step: Duration: 1.125
 Component sizes: [3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200]
 OLTP: Duration: 4.986
 Component sizes: [6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400]
 Step: Duration: 2.326
 Component sizes: [6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400]
 OLTP: Duration: 10.507
 Component sizes: [12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800]
 Step: Duration: 6.308
 Component sizes: [12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800]
 OLTP: Duration: 22.265
 Component sizes: [25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600]
 Step: Duration: 16.212
 Component sizes: [25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600]


was (Author: hadoopmarc):
The comparison between the OLTP query and the connectedComponent step was more productive. Below you see figures for graphs with random components of fixed size with a edge/vertex ratio of 6, for which the two algorithms have the same order of magnitude runtime. This means the OLTP query could still be useful for networks with long strings of vertices. So, I will do some more work to produce some comparison graphs, add them to my work and rework them into a PR against TTINKERPOP-1967.

OLTP: Duration: 0.064
Component sizes: [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
Step: Duration: 0.052
Component sizes: [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
OLTP: Duration: 0.134
Component sizes: [200, 200, 200, 200, 200, 200, 200, 200, 200, 200]
Step: Duration: 0.096
Component sizes: [200, 200, 200, 200, 200, 200, 200, 200, 200, 200]
OLTP: Duration: 0.269
Component sizes: [400, 400, 400, 400, 400, 400, 400, 400, 400, 400]
Step: Duration: 0.129
Component sizes: [400, 400, 400, 400, 400, 400, 400, 400, 400, 400]
OLTP: Duration: 0.585
Component sizes: [800, 800, 800, 800, 800, 800, 800, 800, 800, 800]
Step: Duration: 0.252
Component sizes: [800, 800, 800, 800, 800, 800, 800, 800, 800, 800]
OLTP: Duration: 1.183
Component sizes: [1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600]
Step: Duration: 0.559
Component sizes: [1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600]
OLTP: Duration: 2.375
Component sizes: [3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200]
Step: Duration: 1.125
Component sizes: [3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200, 3200]
OLTP: Duration: 4.986
Component sizes: [6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400]
Step: Duration: 2.326
Component sizes: [6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400, 6400]
OLTP: Duration: 10.507
Component sizes: [12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800]
Step: Duration: 6.308
Component sizes: [12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800, 12800]
OLTP: Duration: 22.265
Component sizes: [25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600]
Step: Duration: 16.212
Component sizes: [25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600, 25600]

> Add a connectedComponent() step
> -------------------------------
>
>                 Key: TINKERPOP-1967
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1967
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.3.3
>            Reporter: stephen mallette
>            Assignee: stephen mallette
>            Priority: Minor
>             Fix For: 3.4.0
>
>
> Given TINKERPOP-1852 we should probably just simplify and improve performance of connected component identification. Implementing this will involved the creation of {{ConnectedComponentVertexProgram}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)