You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Nitin Pawar <ni...@gmail.com> on 2015/08/20 11:16:49 UTC

Re: query behaviors with subquery in clause

any help guys ?

On Thu, Aug 13, 2015 at 2:52 PM, Nitin Pawar <ni...@gmail.com>
wrote:

> Hi,
>
> right now hive does not support the equality clause in sub-queries.
> for ex:  select * from A where date = (select max(date) from B)
>
> It though supports IN clause
> select * from A where date in (select max(date) from B)
>
> in table A the table is partitioned by date column so i was hoping that
> when I apply IN clause it would look only for that partition but it is
> reading the entire table
>
> select * from A where date='2015-08-09' ... reads one partition
> select * from A where date in ('2015-08-09') ... reads one partitions
> select * from A where date in (select max(date) from B) ... reads all
> partitions from A
>
> am I missing anything error or am i doing something wrong ?
>
> --
> Nitin Pawar
>



-- 
Nitin Pawar

Re: query behaviors with subquery in clause

Posted by Nitin Pawar <ni...@gmail.com>.
Thanks Noam.
As we are doing this via oozie, it will be either EL Action of something
else

I will just get around with a temp table and do a join with temp table with
date column

On Thu, Aug 20, 2015 at 5:27 PM, Noam Hasson <no...@kenshoo.com>
wrote:

> I observed in other situation, when ever you run queries where you don't
> specify statistics partitions, Hive doesn't pre-compute which one to take
> so it will take all the table.
>
> I would suggest implementing the max date by code in a separate query.
>
>
> On Thu, Aug 20, 2015 at 12:16 PM, Nitin Pawar <ni...@gmail.com>
> wrote:
>
>> any help guys ?
>>
>> On Thu, Aug 13, 2015 at 2:52 PM, Nitin Pawar <ni...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> right now hive does not support the equality clause in sub-queries.
>>> for ex:  select * from A where date = (select max(date) from B)
>>>
>>> It though supports IN clause
>>> select * from A where date in (select max(date) from B)
>>>
>>> in table A the table is partitioned by date column so i was hoping that
>>> when I apply IN clause it would look only for that partition but it is
>>> reading the entire table
>>>
>>> select * from A where date='2015-08-09' ... reads one partition
>>> select * from A where date in ('2015-08-09') ... reads one partitions
>>> select * from A where date in (select max(date) from B) ... reads all
>>> partitions from A
>>>
>>> am I missing anything error or am i doing something wrong ?
>>>
>>> --
>>> Nitin Pawar
>>>
>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>
> This e-mail, as well as any attached document, may contain material which
> is confidential and privileged and may include trademark, copyright and
> other intellectual property rights that are proprietary to Kenshoo Ltd,
>  its subsidiaries or affiliates ("Kenshoo"). This e-mail and its
> attachments may be read, copied and used only by the addressee for the
> purpose(s) for which it was disclosed herein. If you have received it in
> error, please destroy the message and any attachment, and contact us
> immediately. If you are not the intended recipient, be aware that any
> review, reliance, disclosure, copying, distribution or use of the contents
> of this message without Kenshoo's express permission is strictly prohibited.




-- 
Nitin Pawar

Re: query behaviors with subquery in clause

Posted by Noam Hasson <no...@kenshoo.com>.
I observed in other situation, when ever you run queries where you don't
specify statistics partitions, Hive doesn't pre-compute which one to take
so it will take all the table.

I would suggest implementing the max date by code in a separate query.


On Thu, Aug 20, 2015 at 12:16 PM, Nitin Pawar <ni...@gmail.com>
wrote:

> any help guys ?
>
> On Thu, Aug 13, 2015 at 2:52 PM, Nitin Pawar <ni...@gmail.com>
> wrote:
>
>> Hi,
>>
>> right now hive does not support the equality clause in sub-queries.
>> for ex:  select * from A where date = (select max(date) from B)
>>
>> It though supports IN clause
>> select * from A where date in (select max(date) from B)
>>
>> in table A the table is partitioned by date column so i was hoping that
>> when I apply IN clause it would look only for that partition but it is
>> reading the entire table
>>
>> select * from A where date='2015-08-09' ... reads one partition
>> select * from A where date in ('2015-08-09') ... reads one partitions
>> select * from A where date in (select max(date) from B) ... reads all
>> partitions from A
>>
>> am I missing anything error or am i doing something wrong ?
>>
>> --
>> Nitin Pawar
>>
>
>
>
> --
> Nitin Pawar
>

-- 
This e-mail, as well as any attached document, may contain material which 
is confidential and privileged and may include trademark, copyright and 
other intellectual property rights that are proprietary to Kenshoo Ltd, 
 its subsidiaries or affiliates ("Kenshoo"). This e-mail and its 
attachments may be read, copied and used only by the addressee for the 
purpose(s) for which it was disclosed herein. If you have received it in 
error, please destroy the message and any attachment, and contact us 
immediately. If you are not the intended recipient, be aware that any 
review, reliance, disclosure, copying, distribution or use of the contents 
of this message without Kenshoo's express permission is strictly prohibited.