Posted to user@hive.apache.org by "Lavelle, Shawn" <Sh...@osii.com> on 2016/03/08 20:32:25 UTC

Simple UDFS and IN Operator

Hello All,
   I hope that this question isn’t too rudimentary – but I’m relatively new to HIVE.

   In Hive 0.11, I’ve written a UDF that returns a list of Integers. I’d like to use this in the WHERE clause of a query, something like SELECT * FROM <table> WHERE <col> in ( getList() ). (The extra parentheses are needed to pass the parser.)  Is such a thing possible?  Typing literal values into the IN list works, but those have WritableConstantIntObjectInspectors, whereas the list returned by my UDF (despite my best efforts) has an element inspector of WritableIntObjectInspector, and that doesn’t work.

  So, two questions: should it work? (The HIVE I’m working on is heavily modified :/ ) And how might I accomplish this?  Joins would be ideal, but we haven’t upgraded yet.

  Thank you for your insight,

~ Shawn M Lavelle




Shawn Lavelle
Software Development

4101 Arrowhead Drive
Medina, Minnesota 55340-9457
Phone: 763 551 0559
Fax: 763 551 0750
Email: Shawn.Lavelle@osii.com
Website: www.osii.com


Re: Simple UDFS and IN Operator

Posted by Gopal Vijayaraghavan <go...@apache.org>.
 
>   In Hive 0.11, I've written a UDF that returns a list of Integers. I'd
>like to use this in a WHERE clause of a query, something like SELECT *
>FROM <table> WHERE <col>  in ( getList()).
...
> Joins would be ideal, but we haven't upgraded yet.

IN() is actually rewritten into a JOIN (distinct ...) internally, but if
that is your only goal, Hive should still allow you to do that using the
array functions.

where array_contains(getList(), <col>);
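
To make the suggestion concrete, here is a minimal sketch in plain Java (no Hive dependencies) of the row-level semantics: a filter written as array_contains(getList(), <col>) keeps the same rows as <col> IN (...) over the same values. The class name, the helper methods, and the sample data are all hypothetical, and getList() only stands in for the UDF from the question. It models the predicate alone, not how Hive plans or pushes it down.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: models the predicate, not Hive internals.
class ArrayContainsSketch {

    // Stand-in for the UDF from the question; the values are made up.
    static List<Integer> getList() {
        return Arrays.asList(3, 5, 8);
    }

    // Mirrors array_contains(list, value): true when the list holds the value.
    // A null value never matches here (an assumption of this sketch).
    static boolean arrayContains(List<Integer> list, Integer value) {
        return value != null && list != null && list.contains(value);
    }

    // Keeps the column values that pass WHERE array_contains(getList(), col).
    static List<Integer> filterWithArrayContains(List<Integer> column) {
        List<Integer> list = getList();
        List<Integer> kept = new ArrayList<>();
        for (Integer value : column) {
            if (arrayContains(list, value)) {
                kept.add(value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> column = Arrays.asList(1, 3, 4, 5, null, 8, 9);
        System.out.println(filterWithArrayContains(column));
    }
}
```

The point of the sketch is only that the membership test is evaluated per row against whatever the function returns, so it does not need the list elements to be compile-time constants.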

Cheers,
Gopal


RE: Simple UDFS and IN Operator

Posted by "Lavelle, Shawn" <Sh...@osii.com>.
Thanks Edward, (and Gopal),

   This fits with what I was seeing.  I have modified GenericIN to accept a list (by having it look at the listElementObjInspector), but the result still fails to be accepted as a partition key expression.  I traced that to IndexPredicateAnalyzer.analyzeExpr, where it’s looking for an ExprNodeColumnDesc and an ExprNodeConstantDesc. The list returns a WritableListObjectInspector, and Java’s collections include an immutable list, which will yield a WritableConstantListObjectInspector, but the element object inspector does not pick up the members of the list as constant expressions.   (I tried unmodifiableList too, even Collections.unmodifiableList(ImmutableList.copyOf(retArray)), but that doesn’t get me a constant object inspector.)
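
The check described above can be pictured with a toy model in plain Java. This is not Hive's code: Expr, Column, Constant, FuncCall, and isPushable are hypothetical stand-ins that only mimic an analyzer looking for a column paired with constants, on the assumption that anything else is rejected for partition pruning.

```java
import java.util.Arrays;
import java.util.List;

// Toy model, NOT Hive's classes: every name here is hypothetical.
class PredicateAnalyzerSketch {

    interface Expr {}

    // A reference to a table column, e.g. a partition key.
    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
    }

    // A literal known at planning time.
    static final class Constant implements Expr {
        final Object value;
        Constant(Object value) { this.value = value; }
    }

    // A call whose result is only known once it is evaluated.
    static final class FuncCall implements Expr {
        final String name;
        FuncCall(String name) { this.name = name; }
    }

    // Accepts "column IN (operands)" only when every operand is a Constant,
    // mimicking an analyzer that wants a column on one side and constants on the other.
    static boolean isPushable(Expr left, List<Expr> operands) {
        if (!(left instanceof Column)) {
            return false;
        }
        for (Expr operand : operands) {
            if (!(operand instanceof Constant)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Expr col = new Column("part_key");
        List<Expr> literals = Arrays.<Expr>asList(new Constant(1), new Constant(2));
        List<Expr> udfResult = Arrays.<Expr>asList(new FuncCall("getList"));
        System.out.println("literals pushable: " + isPushable(col, literals));
        System.out.println("UDF list pushable: " + isPushable(col, udfResult));
    }
}
```

In this toy model the UDF result fails the check regardless of which Java collection class wraps the returned values, because the test looks at the kind of expression node rather than at the collection.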

   I’m guessing bad things™ will happen if I open the analyzeExpr to non-constant expressions?

  Any further insights?

   The array_contains() method won’t work with the modified HIVE I’m working on :/  If I could get the IN operator to work on lists returned from UDFs, as a join like it does now, I’d be set. However, since IN is handled in the ANTLR grammar, I can see how that’s not going to be the right solution.

   Thanks for your help,

~ Shawn M Lavelle

From: Edward Capriolo [mailto:edlinuxguru@gmail.com]
Sent: Tuesday, March 08, 2016 6:10 PM
To: user@hive.apache.org
Subject: Re: Simple UDFS and IN Operator

The IN UDF is a special one in that unlike many others there is support in the ANTLR language and parsers for it. The rough answer is it can be done but it is not as direct as making other UDFs.






Re: Simple UDFS and IN Operator

Posted by Edward Capriolo <ed...@gmail.com>.
The IN UDF is a special one in that unlike many others there is support in
the ANTLR language and parsers for it. The rough answer is it can be done
but it is not as direct as making other UDFs.

