Posted to dev@tinkerpop.apache.org by "Stephen Mallette (Jira)" <ji...@apache.org> on 2023/02/16 13:11:00 UTC

[jira] [Commented] (TINKERPOP-2869) Inconsistent results when using dedup()

    [ https://issues.apache.org/jira/browse/TINKERPOP-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689755#comment-17689755 ] 

Stephen Mallette commented on TINKERPOP-2869:
---------------------------------------------

It's a little hard to tell because your sample data is fairly complex, but I sense this is expected behavior. I don't think you can expect those queries to return the same result, because in one you start with both "A" and "B" vertices and in the other you start with just "A". With {{bothE()}}, the {{dedup()}} simply keeps the first occurrence of each edge it sees and ignores the rest. If you start with "A" and "B" vertices, that means sometimes you will traverse to "A" or "B" edges for that unique edge, but if you start with just "A" it means you will only ever get "B" edges. This changes the nature of the query and the number of traversers, as you can see in the different profiles:

{code}
gremlin> g.V().as('x').bothE().dedup().otherV().select('x').hasLabel('A').profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[])@[x]                                        10          10           0.186     2.92
VertexStep(BOTH,edge)                                                100         100           0.884    13.86
DedupGlobalStep(null,null)                                            50          50           1.194    18.72
EdgeOtherVertexStep                                                   50          50           1.390    21.80
SelectOneStep(last,x,null)                                            50          50           2.109    33.07
HasStep([~label.eq(A)])                                               31          31           0.614     9.63
                                            >TOTAL                     -           -           6.379        -
gremlin> g.V().as('x').hasLabel('A').bothE().dedup().otherV().select('x').profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[~label.eq(A)])@[x]                             5           5           0.188     7.21
VertexStep(BOTH,edge)                                                 50          50           0.146     5.61
DedupGlobalStep(null,null)                                            36          36           1.054    40.41
EdgeOtherVertexStep                                                   36          36           0.121     4.66
SelectOneStep(last,x,null)                                            36          36           1.099    42.11
                                            >TOTAL                     -           -           2.610        -
{code}
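
The first-seen behavior of {{dedup()}} can be modeled outside TinkerPop. The following is a minimal sketch in plain Python, not Gremlin or TinkerPop code: the three-vertex graph, the vertex iteration order, and the helper names {{both_e}} and {{dedup_edges}} are all assumptions made for illustration, and it is not the dataset from the issue. {{otherV()}} is left out of the model because it changes neither the number of traversers nor the value of 'x'.

```python
# Toy model of the two traversals; plain Python, not TinkerPop code.
# The graph below is a hypothetical example, not the dataset from the issue.

# Vertices in iteration order: (id, label). The "B" vertex happens to come first.
VERTICES = [("b1", "B"), ("a1", "A"), ("a2", "A")]
LABEL = dict(VERTICES)

# Edges as (edge id, out-vertex, in-vertex).
EDGES = [("e1", "a1", "b1"), ("e2", "a1", "a2")]


def both_e(starts):
    """Model of as('x').bothE(): yield (x, edge id) for every incident edge."""
    for x in starts:
        for edge_id, out_v, in_v in EDGES:
            if x in (out_v, in_v):
                yield x, edge_id


def dedup_edges(traversers):
    """Model of dedup() on edges: keep only the first traverser per edge id."""
    seen = set()
    for x, edge_id in traversers:
        if edge_id not in seen:
            seen.add(edge_id)
            yield x, edge_id


all_ids = [v for v, _ in VERTICES]
a_ids = [v for v in all_ids if LABEL[v] == "A"]

# Models g.V().as('x').bothE().dedup().otherV().select('x').hasLabel('A')
filter_last = [x for x, _ in dedup_edges(both_e(all_ids)) if LABEL[x] == "A"]

# Models g.V().as('x').hasLabel('A').bothE().dedup().otherV().select('x')
filter_first = [x for x, _ in dedup_edges(both_e(a_ids))]

print(len(filter_last), len(filter_first))
```

In this model the script prints {{1 2}}. When all vertices are used as starts, edge e1 is first reached from the "B" vertex, so its only surviving traverser carries a "B" value for 'x' and the later label filter drops it. When only "A" vertices are used as starts, the same edge survives with an "A" value for 'x'.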

If you still think there is a problem (it's possible I've misinterpreted this), perhaps you could come up with a simpler example that demonstrates it more clearly?

> Inconsistent results when using dedup()
> ---------------------------------------
>
>                 Key: TINKERPOP-2869
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2869
>             Project: TinkerPop
>          Issue Type: Bug
>    Affects Versions: 3.6.1
>            Reporter: Yuancheng
>            Priority: Major
>
>  
> {code:java}
> g.V().as('x').hasLabel('A').bothE().dedup().otherV().select('x').count()
> result: 36{code}
> {code:java}
> g.V().as('x').bothE().dedup().otherV().select('x').hasLabel('A').count()
> result: 31  {code}
> Dataset: same as https://issues.apache.org/jira/browse/TINKERPOP-2867
> Path results:
> {code:java}
> path[v[0], e[58][0-X->15], v[15], v[0]]
> path[v[0], e[60][0-X->12], v[12], v[0]]
> path[v[0], e[30][0-X->3], v[3], v[0]]
> path[v[0], e[49][0-Y->27], v[27], v[0]]
> path[v[0], e[52][0-Y->21], v[21], v[0]]
> path[v[0], e[78][0-Y->24], v[24], v[0]]
> path[v[0], e[39][27-X->0], v[27], v[0]]
> path[v[0], e[61][3-X->0], v[3], v[0]]
> path[v[0], e[70][3-Y->0], v[3], v[0]]
> path[v[0], e[40][3-Y->0], v[3], v[0]]
> path[v[0], e[59][3-Y->0], v[3], v[0]]
> path[v[3], e[53][3-X->21], v[21], v[3]]
> path[v[3], e[31][3-X->6], v[6], v[3]]
> path[v[3], e[74][3-Y->12], v[12], v[3]]
> path[v[3], e[67][21-X->3], v[21], v[3]]
> path[v[3], e[71][21-Y->3], v[21], v[3]]
> path[v[3], e[56][24-Y->3], v[24], v[3]]
> path[v[3], e[41][6-Y->3], v[6], v[3]]
> path[v[6], e[32][6-X->9], v[9], v[6]]
> path[v[6], e[50][6-X->24], v[24], v[6]]
> path[v[6], e[62][6-X->12], v[12], v[6]]
> path[v[6], e[77][6-Y->21], v[21], v[6]] (MISSED)
> path[v[6], e[69][27-X->6], v[27], v[6]]
> path[v[6], e[55][18-X->6], v[18], v[6]] (MISSED)
> path[v[6], e[51][15-Y->6], v[15], v[6]]
> path[v[6], e[72][24-Y->6], v[24], v[6]]
> path[v[6], e[42][9-Y->6], v[9], v[6]]
> path[v[9], e[33][9-X->12], v[12], v[9]]
> path[v[9], e[63][9-X->18], v[18], v[9]] (MISSED)
> path[v[9], e[73][24-Y->9], v[24], v[9]] (MISSED)
> path[v[9], e[43][12-Y->9], v[12], v[9]]
> path[v[12], e[64][12-X->15], v[15], v[12]]
> path[v[12], e[34][12-X->15], v[15], v[12]]
> path[v[12], e[54][12-Y->21], v[21], v[12]] (MISSED)
> path[v[12], e[65][15-X->12], v[15], v[12]]
> path[v[12], e[44][15-Y->12], v[15], v[12]] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)