Posted to dev@drill.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2013/05/19 00:37:15 UTC

[jira] [Commented] (DRILL-61) Logical plan operator "collapsesegment" produces wrong results

    [ https://issues.apache.org/jira/browse/DRILL-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661440#comment-13661440 ] 

Julian Hyde commented on DRILL-61:
----------------------------------

By the way, I ran these queries using "java org.apache.drill.exec.ref.ReferenceImplementation.main --stdout query.json" from IntelliJ. It should also work from the command line, if you get the classpath right.
                
> Logical plan operator "collapsesegment" produces wrong results
> --------------------------------------------------------------
>
>                 Key: DRILL-61
>                 URL: https://issues.apache.org/jira/browse/DRILL-61
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Julian Hyde
>
> Logical plan operator "collapsesegment" produces wrong results. There is a null value present -- maybe it is responsible.
> Query:
> {
>   "head" : {
>     "type" : "apache_drill_logical_plan",
>     "version" : 1,
>     "generator" : {
>       "type" : "manual",
>       "info" : "na"
>     }
>   },
>   "storage" : [ {
>     "type" : "queue",
>     "name" : "queue"
>   }, {
>     "type" : "classpath",
>     "name" : "donuts-json"
>   } ],
>   "query" : [ {
>     "op" : "scan",
>     "@id" : 1,
>     "memo" : "initial_scan",
>     "storageengine" : "donuts-json",
>     "selection" : {
>       "path" : "/employees.json",
>       "type" : "JSON"
>     },
>     "ref" : "_MAP"
>   }, {
>     "op" : "project",
>     "input" : 1,
>     "@id" : 2,
>     "projections" : [ {
>       "ref" : "output.deptId",
>       "expr" : "_MAP.deptId"
>     } ]
>   },  {
>     op: "segment",
>     "input" : 2,
>     "@id" : 3,
>     ref: "segment",
>     exprs: ["deptId"]
>   }, {
>     "input" : 3,
>     "@id" : 4,
>     op: "collapsingaggregate",
>     within: "segment",
>     carryovers: [ "deptId" ],
>     aggregations: [
>                 { ref: "typeCount",  expr: "count(1)" }
>               ]
>   }, {
>     "op" : "store",
>     "input" : 4,
>     "@id" : 5,
>     "memo" : "output sink",
>     "target" : {
>       "number" : 0
>     },
>     "partition" : null,
>     "storageEngine" : "queue"
>   } ]
> }
> gives result
> {  "typeCount" : 2,  "deptId" : 34 }
> {   "typeCount" : 2,  "deptId" : null }
> {   "typeCount" : 1,   "deptId" : 31 }
> {  "typeCount" : 1,   "deptId" : 31 }
> I think the correct result would be
> {  "typeCount" : 2,  "deptId" : 33 }
> {   "typeCount" : 2,  "deptId" : 34 }
> {   "typeCount" : 1,   "deptId" : null }
> {  "typeCount" : 1,   "deptId" : 31 }
> Note that the "segment" operator is working correctly. A similar query with the "collapsingaggregate" operator removed:
> {
>   "head" : {
>     "type" : "apache_drill_logical_plan",
>     "version" : 1,
>     "generator" : {
>       "type" : "manual",
>       "info" : "na"
>     }
>   },
>   "storage" : [ {
>     "type" : "queue",
>     "name" : "queue"
>   }, {
>     "type" : "classpath",
>     "name" : "donuts-json"
>   } ],
>   "query" : [ {
>     "op" : "scan",
>     "@id" : 1,
>     "memo" : "initial_scan",
>     "storageengine" : "donuts-json",
>     "selection" : {
>       "path" : "/employees.json",
>       "type" : "JSON"
>     },
>     "ref" : "_MAP"
>   }, {
>     "op" : "project",
>     "input" : 1,
>     "@id" : 2,
>     "projections" : [ {
>       "ref" : "output.deptId",
>       "expr" : "_MAP.deptId"
>     } ]
>   },  {
>     op: "segment",
>     "input" : 2,
>     "@id" : 3,
>     ref: "segment",
>     exprs: ["deptId"]
>   }, {
>     "op" : "store",
>     "input" : 3,
>     "@id" : 5,
>     "memo" : "output sink",
>     "target" : {
>       "number" : 0
>     },
>     "partition" : null,
>     "storageEngine" : "queue"
>   } ]
> }
> gives
> {  "segment" : 1,  "deptId" : 33 }
> {  "segment" : 1,  "deptId" : 33 }
> {  "segment" : 2,  "deptId" : 34 }
> {  "segment" : 2,  "deptId" : 34 }
> {  "segment" : 3,  "deptId" : null }
> {  "segment" : 4,  "deptId" : 31 }
> It is reasonable to assume that these are the records flowing into the "collapsingaggregate" ROP in the first query.
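
To make the expected result concrete, here is a minimal sketch in Python of what I would expect a collapsing count to do with those segmented records: emit one record per segment, carrying over "deptId" and counting the rows. This is not Drill code. The function name collapsing_count and its parameters are made up for illustration, and it assumes that the records of a segment arrive contiguously, as they do in the "segment" output above.

```python
from itertools import groupby

# Records as emitted by the "segment" operator in the second query above.
segmented = [
    {"segment": 1, "deptId": 33},
    {"segment": 1, "deptId": 33},
    {"segment": 2, "deptId": 34},
    {"segment": 2, "deptId": 34},
    {"segment": 3, "deptId": None},
    {"segment": 4, "deptId": 31},
]


def collapsing_count(records, within="segment", carryover="deptId"):
    """Hypothetical stand-in for a collapsing aggregate with count(1).

    Assumes the rows of each segment are contiguous in the input.
    Emits one output record per segment: the row count plus the
    carryover value taken from the first row of that segment.
    """
    out = []
    for _, group in groupby(records, key=lambda r: r[within]):
        rows = list(group)
        out.append({"typeCount": len(rows), carryover: rows[0][carryover]})
    return out


result = collapsing_count(segmented)
for row in result:
    print(row)
```

Under those assumptions the sketch yields one row each for deptId 33, 34, null and 31, with counts 2, 2, 1 and 1, which is the result I think the first query should give. The null deptId forms its own segment and is carried over like any other value.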
