Posted to dev@drill.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2013/05/19 00:37:15 UTC
[jira] [Commented] (DRILL-61) Logical plan operator "collapsesegment" produces wrong results
[ https://issues.apache.org/jira/browse/DRILL-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661440#comment-13661440 ]
Julian Hyde commented on DRILL-61:
----------------------------------
By the way, I ran these queries using "java org.apache.drill.exec.ref.ReferenceImplementation.main --stdout query.json" from IntelliJ. It should also work from the command line, if you get the classpath right.
> Logical plan operator "collapsesegment" produces wrong results
> --------------------------------------------------------------
>
> Key: DRILL-61
> URL: https://issues.apache.org/jira/browse/DRILL-61
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Julian Hyde
>
> Logical plan operator "collapsesegment" (the "collapsingaggregate" op in the query below) produces wrong results. The input contains a null deptId value, which may be responsible.
> Query:
> {
>   "head" : {
>     "type" : "apache_drill_logical_plan",
>     "version" : 1,
>     "generator" : {
>       "type" : "manual",
>       "info" : "na"
>     }
>   },
>   "storage" : [ {
>     "type" : "queue",
>     "name" : "queue"
>   }, {
>     "type" : "classpath",
>     "name" : "donuts-json"
>   } ],
>   "query" : [ {
>     "op" : "scan",
>     "@id" : 1,
>     "memo" : "initial_scan",
>     "storageengine" : "donuts-json",
>     "selection" : {
>       "path" : "/employees.json",
>       "type" : "JSON"
>     },
>     "ref" : "_MAP"
>   }, {
>     "op" : "project",
>     "input" : 1,
>     "@id" : 2,
>     "projections" : [ {
>       "ref" : "output.deptId",
>       "expr" : "_MAP.deptId"
>     } ]
>   }, {
>     op: "segment",
>     "input" : 2,
>     "@id" : 3,
>     ref: "segment",
>     exprs: ["deptId"]
>   }, {
>     "input" : 3,
>     "@id" : 4,
>     op: "collapsingaggregate",
>     within: "segment",
>     carryovers: [ "deptId" ],
>     aggregations: [
>       { ref: "typeCount", expr: "count(1)" }
>     ]
>   }, {
>     "op" : "store",
>     "input" : 4,
>     "@id" : 5,
>     "memo" : "output sink",
>     "target" : {
>       "number" : 0
>     },
>     "partition" : null,
>     "storageEngine" : "queue"
>   } ]
> }
> gives result
> { "typeCount" : 2, "deptId" : 34 }
> { "typeCount" : 2, "deptId" : null }
> { "typeCount" : 1, "deptId" : 31 }
> { "typeCount" : 1, "deptId" : 31 }
> I think the correct result would be
> { "typeCount" : 2, "deptId" : 33 }
> { "typeCount" : 2, "deptId" : 34 }
> { "typeCount" : 1, "deptId" : null }
> { "typeCount" : 1, "deptId" : 31 }
> Note that the "segment" operator is working correctly. A similar query with the "collapsingaggregate" step removed:
> {
>   "head" : {
>     "type" : "apache_drill_logical_plan",
>     "version" : 1,
>     "generator" : {
>       "type" : "manual",
>       "info" : "na"
>     }
>   },
>   "storage" : [ {
>     "type" : "queue",
>     "name" : "queue"
>   }, {
>     "type" : "classpath",
>     "name" : "donuts-json"
>   } ],
>   "query" : [ {
>     "op" : "scan",
>     "@id" : 1,
>     "memo" : "initial_scan",
>     "storageengine" : "donuts-json",
>     "selection" : {
>       "path" : "/employees.json",
>       "type" : "JSON"
>     },
>     "ref" : "_MAP"
>   }, {
>     "op" : "project",
>     "input" : 1,
>     "@id" : 2,
>     "projections" : [ {
>       "ref" : "output.deptId",
>       "expr" : "_MAP.deptId"
>     } ]
>   }, {
>     op: "segment",
>     "input" : 2,
>     "@id" : 3,
>     ref: "segment",
>     exprs: ["deptId"]
>   }, {
>     "op" : "store",
>     "input" : 3,
>     "@id" : 5,
>     "memo" : "output sink",
>     "target" : {
>       "number" : 0
>     },
>     "partition" : null,
>     "storageEngine" : "queue"
>   } ]
> }
> gives
> { "segment" : 1, "deptId" : 33 }
> { "segment" : 1, "deptId" : 33 }
> { "segment" : 2, "deptId" : 34 }
> { "segment" : 2, "deptId" : 34 }
> { "segment" : 3, "deptId" : null }
> { "segment" : 4, "deptId" : 31 }
> It is reasonable to assume that these are the records flowing into the "collapsingaggregate" ROP in the first query.
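One way to check the expected result is to apply a count-per-segment collapse by hand to the six segmented records quoted above. The following is a minimal Python sketch of that check, not Drill's implementation: the function name `collapse_count` is hypothetical, and taking the carryover value from the first record of each segment is an assumption about the operator's intended semantics.

```python
# Records emitted by the "segment" operator, copied from the output quoted above.
# JSON null is represented as Python None.
segmented = [
    {"segment": 1, "deptId": 33},
    {"segment": 1, "deptId": 33},
    {"segment": 2, "deptId": 34},
    {"segment": 2, "deptId": 34},
    {"segment": 3, "deptId": None},
    {"segment": 4, "deptId": 31},
]


def collapse_count(records, within="segment", carryover="deptId"):
    """Emit one record per segment: a row count plus the carried-over field.

    Assumption: the carryover value is taken from the first record
    seen in each segment.
    """
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for rec in records:
        seg = rec[within]
        if seg not in groups:
            groups[seg] = {"typeCount": 0, carryover: rec[carryover]}
        groups[seg]["typeCount"] += 1
    return list(groups.values())


collapsed = collapse_count(segmented)
for row in collapsed:
    print(row)
```

Under these assumptions the sketch yields one row per segment, with counts 2, 2, 1, 1 for deptId 33, 34, null, 31, which agrees with the result the issue describes as correct.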
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira