Posted to dev@drill.apache.org by Julian Hyde <ju...@gmail.com> on 2013/03/26 21:59:31 UTC

Trivial project?

I would like to generate a trivial project operator that returns the entire input record. What expression should I use?

When implementing "select * from donuts", here is the logical plan I am currently generating:

{
  "head" : {
    "type" : "apache_drill_logical_plan",
    "version" : 1,
    "generator" : {
      "type" : "manual",
      "info" : "na"
    }
  },
  "storage" : [ {
    "type" : "classpath",
    "name" : "donuts-json"
  }, {
    "type" : "queue",
    "name" : "queue"
  } ],
  "query" : [ {
    "op" : "scan",
    "@id" : 1,
    "memo" : "initial_scan",
    "storageengine" : "donuts-json",
    "selection" : {
      "path" : "/donuts.json",
      "type" : "JSON"
    },
    "ref" : "donuts"
  }, {
    "op" : "project",
    "@id" : 2,
    "input" : 1,
    "exprs" : [ {
      "ref" : "output._MAP",
      "expr" : null
    } ]
  }, {
    "op" : "store",
    "@id" : 3,
    "memo" : "output sink",
    "input" : 2,
    "target" : {
      "number" : 0
    },
    "partition" : null,
    "storageEngine" : "queue"
  } ]
}

The

      "expr" : null

causes the reference implementation to barf (not surprisingly). What should I put in place of "null"?

Julian

Re: Trivial project?

Posted by Timothy Chen <tn...@gmail.com>.
Hi Julian,

Have you tried just putting in the field name you're looking for?

If you want all fields, you should probably just get rid of the project.
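
A rough sketch of what dropping the project could look like for a plan generator, assuming the "query" array has the shape shown in Julian's message (each operator is a map with an "@id" and at most one "input"). The helper name drop_operator is hypothetical and not part of Drill; it removes one operator and re-points its consumers at that operator's input:

```python
import copy


def drop_operator(query, op_id):
    """Return a copy of `query` without operator `op_id`.

    Hypothetical helper, not a Drill API. Assumes every operator is a dict
    with an "@id" and at most one integer "input", as in the quoted plan.
    """
    removed = next(op for op in query if op["@id"] == op_id)
    upstream = removed.get("input")  # what the removed operator was reading
    result = []
    for op in query:
        if op["@id"] == op_id:
            continue  # skip the operator being removed
        op = copy.deepcopy(op)  # leave the caller's plan untouched
        if op.get("input") == op_id:
            op["input"] = upstream  # bypass the removed operator
        result.append(op)
    return result


# Abbreviated version of the plan's "query" section.
query = [
    {"op": "scan", "@id": 1, "ref": "donuts"},
    {"op": "project", "@id": 2, "input": 1},
    {"op": "store", "@id": 3, "input": 2},
]
trimmed = drop_operator(query, 2)
```

With the project removed, the store operator reads directly from the scan.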

Tim


On Tue, Mar 26, 2013 at 1:59 PM, Julian Hyde <ju...@gmail.com> wrote:

> I would like to generate a trivial project operator that returns the
> entire input record. What expression should I use?

Re: Trivial project?

Posted by Julian Hyde <ju...@gmail.com>.
Thanks, David and Tim.

From David's reply, it seems that I can create a trivial project as long as I know the name of the one and only one "holder field" from the source.

From Tim's reply, it seems that there is no identity expression. That is going to make logical expressions more difficult to work with, especially since Drill is dynamically typed. (Just as arithmetic was difficult before someone invented zero.)

In the SQL-to-LE translator, I think I can glean the holder expression from the context, so I should be able to use David's approach.
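
A minimal sketch of that idea, assuming the holder field is simply the "ref" of the upstream scan operator. The function name trivial_project is hypothetical; the "projections" key and the "output._MAP" target are taken from David's example rather than from any documented Drill API:

```python
import json


def trivial_project(scan_op, op_id, target="output._MAP"):
    """Build a project operator that passes the scan's whole record through.

    Hypothetical helper, not a Drill API. The holder field name is read
    from the scan operator's "ref" (an assumption about where it lives).
    """
    return {
        "op": "project",
        "@id": op_id,
        "input": scan_op["@id"],
        "projections": [{"ref": target, "expr": scan_op["ref"]}],
    }


scan = {"op": "scan", "@id": 1, "ref": "donuts"}
project = trivial_project(scan, 2)
print(json.dumps(project, indent=2))
```

The translator would only need to carry the scan's "ref" along as context when it emits the project.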

Julian

Re: Trivial project?

Posted by David Alves <da...@gmail.com>.
Hi Julian

If you really must use "Project" for whatever reason, this will project all fields of donuts into output._MAP:
…
{
   "op" : "project",
   "@id" : 2,
   "input" : 1,
   "projections" : [ {
       "ref" : "output._MAP",
       "expr" : "donuts"
     } ]
 }
…
resulting in:
…
{ "_MAP" : { "id" : "0005", "sales" : 700, "name" : "Apple Fritter", "topping" : [ { "id" : "5002", "type" : "Glazed" } ], "ppu" : 1.0, "type" : "donut", "batters" : { "batter" : [ { "id" : "1001", "type" : "Regular" } ] } } }
…
	HTH.

-david

	
On Mar 26, 2013, at 3:59 PM, Julian Hyde <ju...@gmail.com> wrote:

> I would like to generate a trivial project operator that returns the entire input record. What expression should I use?