Posted to dev@drill.apache.org by "Jinfeng Ni (JIRA)" <ji...@apache.org> on 2013/11/05 20:40:17 UTC

[jira] [Created] (DRILL-275) hash-to-random-exchange causes incorrect row count to be returned.

Jinfeng Ni created DRILL-275:
--------------------------------

             Summary: hash-to-random-exchange causes incorrect row count to be returned.
                 Key: DRILL-275
                 URL: https://issues.apache.org/jira/browse/DRILL-275
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Jinfeng Ni
            Priority: Minor


I have the following physical plan:

{
    head:{
        type:"APACHE_DRILL_PHYSICAL",
        version:"1",
        generator:{
            type:"manual"
        }
    },
    graph:[
    {
        pop : "parquet-scan",
        @id : 1,
        entries : [ {
            path : "nation.parquet"
        } ],
        storageengine : {
            type : "parquet",
            dfsName : "file:///"
        },
        ref : "_MAP",
        fragmentPointer : 0
    }, {
        @id:2,
        child: 1,
        pop:"project",
        exprs: [
            { ref: "hkey", expr:"_MAP.N_REGIONKEY"}
        ]
    }, {
        @id: 3,
        child: 2,
        pop: "hash-to-random-exchange",
        expr: "hash(hkey)"
    }, {
        @id: 4,
        child: 3,
        pop: "union-exchange"
    }, {
        @id: 5,
        child: 4,
        pop: "screen"
    }
    ]
}

Submitting the above physical plan through the submit_plan tool produces the following output:

------------------
| hkey           |
------------------
| 1              |
| 1              |
| 1              |
| 3              |
| 3              |
| 1              |
| 3              |
| 3              |
| 3              |
| 1              |
------------------
| hkey           |
------------------
| 2              |
| 2              |
| 2              |
| 2              |
| 2              |
------------------
| hkey           |
------------------
| 0              |
| 0              |
| 0              |
| 0              |
| 0              |
------------------
| hkey           |
------------------
| 4              |
| 4              |
| 4              |
| 4              |
| 4              |
------------------
Got 50 records in 930.671021 seconds

Note that the output contains 25 rows (in four batches of 10, 5, 5, and 5), but the summary message reports 50 records.
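
As a quick sanity check of that arithmetic, the following Python sketch tallies the rows per printed batch. The values are hand-copied from the output above, and the variable names are illustrative only; nothing here is part of Drill or submit_plan.

```python
# hkey values hand-copied from the four printed batches above.
batches = [
    [1, 1, 1, 3, 3, 1, 3, 3, 3, 1],
    [2, 2, 2, 2, 2],
    [0, 0, 0, 0, 0],
    [4, 4, 4, 4, 4],
]

# Number of rows actually printed across all batches.
printed_rows = sum(len(batch) for batch in batches)

# Count taken from the "Got 50 records" summary line.
reported_records = 50

print(printed_rows)                     # 25
print(reported_records / printed_rows)  # 2.0
```

The reported count is exactly twice the number of printed rows. That is consistent with each record being counted twice somewhere along the exchange path, although the output alone does not establish where.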

If I remove hash-to-random-exchange from the plan, the reported record count matches the number of rows returned:

------------------
| hkey           |
------------------
| 0              |
| 1              |
| 1              |
| 1              |
| 4              |
| 0              |
| 3              |
| 3              |
| 2              |
| 2              |
| 4              |
| 4              |
| 2              |
| 4              |
| 0              |
| 0              |
| 0              |
| 1              |
| 2              |
| 3              |
| 4              |
| 2              |
| 3              |
| 3              |
| 1              |
------------------
Got 25 records in 3.375000 seconds.
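
To check whether the two runs differ in anything other than the reported count, the rows can be compared as multisets. This is a minimal sketch with values hand-copied from the two listings in this report; the variable names are illustrative.

```python
from collections import Counter

# hkey values from the run with hash-to-random-exchange (batches concatenated).
with_exchange = [
    1, 1, 1, 3, 3, 1, 3, 3, 3, 1,
    2, 2, 2, 2, 2,
    0, 0, 0, 0, 0,
    4, 4, 4, 4, 4,
]

# hkey values from the run without hash-to-random-exchange.
without_exchange = [
    0, 1, 1, 1, 4, 0, 3, 3, 2, 2, 4, 4, 2,
    4, 0, 0, 0, 1, 2, 3, 4, 2, 3, 3, 1,
]

# Compare as multisets: row order is ignored, duplicates are respected.
same_rows = Counter(with_exchange) == Counter(without_exchange)

print(same_rows)                                  # True
print(sorted(Counter(without_exchange).items()))  # [(0, 5), (1, 5), (2, 5), (3, 5), (4, 5)]
```

Both runs return the same 25 values, five for each region key from 0 to 4, so the discrepancy appears limited to the record count in the summary line.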





