Posted to dev@drill.apache.org by "Jinfeng Ni (JIRA)" <ji...@apache.org> on 2013/11/05 20:40:17 UTC
[jira] [Created] (DRILL-275) hash-to-random-exchange cause incorrect row count returned.
Jinfeng Ni created DRILL-275:
--------------------------------
Summary: hash-to-random-exchange cause incorrect row count returned.
Key: DRILL-275
URL: https://issues.apache.org/jira/browse/DRILL-275
Project: Apache Drill
Issue Type: Bug
Reporter: Jinfeng Ni
Priority: Minor
I have the following physical plan:
{
  head:{
    type:"APACHE_DRILL_PHYSICAL",
    version:"1",
    generator:{
      type:"manual"
    }
  },
  graph:[
    {
      pop : "parquet-scan",
      @id : 1,
      entries : [ {
        path : "nation.parquet"
      } ],
      storageengine : {
        type : "parquet",
        dfsName : "file:///"
      },
      ref : "_MAP",
      fragmentPointer : 0
    }, {
      @id:2,
      child: 1,
      pop:"project",
      exprs: [
        { ref: "hkey", expr:"_MAP.N_REGIONKEY"}
      ]
    }, {
      @id: 3,
      child: 2,
      pop: "hash-to-random-exchange",
      expr: "hash(hkey)"
    }, {
      @id: 4,
      child: 3,
      pop: "union-exchange"
    }, {
      @id: 5,
      child: 4,
      pop: "screen"
    }
  ]
}
Submitting the above physical plan through the submit_plan tool produces the following output:
------------------
| hkey |
------------------
| 1 |
| 1 |
| 1 |
| 3 |
| 3 |
| 1 |
| 3 |
| 3 |
| 3 |
| 1 |
------------------
| hkey |
------------------
| 2 |
| 2 |
| 2 |
| 2 |
| 2 |
------------------
| hkey |
------------------
| 0 |
| 0 |
| 0 |
| 0 |
| 0 |
------------------
| hkey |
------------------
| 4 |
| 4 |
| 4 |
| 4 |
| 4 |
------------------
Got 50 records in 930.671021 seconds
Notice that the output contains only 25 rows (10 + 5 + 5 + 5 across the four batches), but the summary message reports 50 records.
If I remove hash-to-random-exchange from the plan, the output and the reported count are both correct:
------------------
| hkey |
------------------
| 0 |
| 1 |
| 1 |
| 1 |
| 4 |
| 0 |
| 3 |
| 3 |
| 2 |
| 2 |
| 4 |
| 4 |
| 2 |
| 4 |
| 0 |
| 0 |
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 2 |
| 3 |
| 3 |
| 1 |
------------------
Got 25 records in 3.375000 seconds.
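The mismatch described above can be checked mechanically. The following is a minimal Python sketch, not part of Drill or of the submit_plan tool: it assumes the console output has been captured as text in the layout shown in this report, and the helper names (count_table_rows, reported_count, compare_counts) are hypothetical. It counts the printed data rows and compares them with the number in the "Got N records" line.

```python
import re


def count_table_rows(output: str) -> int:
    """Count printed data rows, skipping '| hkey |' headers and dashed rules."""
    rows = 0
    for line in output.splitlines():
        # Assumption: a data row is a single integer between pipes, e.g. "| 3 |".
        if re.fullmatch(r"\|\s*-?\d+\s*\|", line.strip()):
            rows += 1
    return rows


def reported_count(output: str) -> int:
    """Extract N from the 'Got N records in ... seconds' summary line."""
    match = re.search(r"Got (\d+) records", output)
    if match is None:
        raise ValueError("no 'Got N records' line found")
    return int(match.group(1))


def compare_counts(output: str) -> tuple[int, int, bool]:
    """Return (printed rows, reported rows, whether the two agree)."""
    printed = count_table_rows(output)
    reported = reported_count(output)
    return printed, reported, printed == reported
```

Run against the first output in this report, compare_counts would be expected to show 25 printed rows against 50 reported; against the second output, both counts should be 25.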