You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Narendra <un...@gmail.com> on 2010/12/28 08:09:13 UTC

Question of Hive Partitioning

Hi,

Hive wiki says "Hive currently does input pruning only if the partition
predicates are specified in the WHERE clause closest to the table_reference
in the FROM clause..." what does this mean?

I have a query in which I am joining 4 tables two of which need to be pruned
by partition. The partition predicates are specified in the WHERE clause.
However, I see that only one of the table gets pruned by the partition
whereas the other one goes for a full table scan.

Have a great holiday season!
Narendra

Re: Question of Hive Partitioning

Posted by Narendra <un...@gmail.com>.
I guess it's the left outer that's making it do a full table scan...but
why???

On Tue, Dec 28, 2010 at 2:34 PM, Narendra <un...@gmail.com> wrote:

> Thanks for the response.
>
> The query is something like this -
>
> select
> a.a_val1,
> count(a.a_val2),
> sum(b.b_val),
> c.c_val,
> d.d_val
> from
> table1 a
> left outer join table2 b on (a.someval = b.someval),
> left outer join table3 c on (a.someval2 = c.someval2),
> left outer join table4 d on (a.someval3 = d.someval3)
> where
> a.dt > '2010-12-28 00:00:00' -- table1 partition
> and b.dt > '2010-12-28 00:00:00' -- table2 partition
> group by a_val1, c_val, d_val
>
> Thanks,
> Narendra
>
>   On Tue, Dec 28, 2010 at 1:06 PM, yongqiang he <he...@gmail.com>wrote:
>
>> Can you post your query? Is there " or " connecting the partitions'
>> predicate with others?
>>
>> Thanks
>> Yongqiang
>>  On Mon, Dec 27, 2010 at 11:09 PM, Narendra <un...@gmail.com> wrote:
>> > Hi,
>> >
>> > Hive wiki says "Hive currently does input pruning only if the partition
>> > predicates are specified in the WHERE clause closest to the
>> table_reference
>> > in the FROM clause..." what does this mean?
>> >
>> > I have a query in which I am joining 4 tables two of which need to be
>> pruned
>> > by partition. The partition predicates are specified in the WHERE
>> clause.
>> > However, I see that only one of the table gets pruned by the partition
>> > whereas the other one goes for a full table scan.
>> >
>> > Have a great holiday season!
>> > Narendra
>>
>
>

Re: Question of Hive Partitioning

Posted by Narendra <un...@gmail.com>.
Thanks for the response.

The query is something like this -

select
a.a_val1,
count(a.a_val2),
sum(b.b_val),
c.c_val,
d.d_val
from
table1 a
left outer join table2 b on (a.someval = b.someval),
left outer join table3 c on (a.someval2 = c.someval2),
left outer join table4 d on (a.someval3 = d.someval3)
where
a.dt > '2010-12-28 00:00:00' -- table1 partition
and b.dt > '2010-12-28 00:00:00' -- table2 partition
group by a_val1, c_val, d_val

Thanks,
Narendra

On Tue, Dec 28, 2010 at 1:06 PM, yongqiang he <he...@gmail.com>wrote:

> Can you post your query? Is there " or " connecting the partitions'
> predicate with others?
>
> Thanks
> Yongqiang
>  On Mon, Dec 27, 2010 at 11:09 PM, Narendra <un...@gmail.com> wrote:
> > Hi,
> >
> > Hive wiki says "Hive currently does input pruning only if the partition
> > predicates are specified in the WHERE clause closest to the
> table_reference
> > in the FROM clause..." what does this mean?
> >
> > I have a query in which I am joining 4 tables two of which need to be
> pruned
> > by partition. The partition predicates are specified in the WHERE clause.
> > However, I see that only one of the table gets pruned by the partition
> > whereas the other one goes for a full table scan.
> >
> > Have a great holiday season!
> > Narendra
>

Re: Question of Hive Partitioning

Posted by yongqiang he <he...@gmail.com>.
Can you post your query? Is there " or " connecting the partitions'
predicate with others?

Thanks
Yongqiang
On Mon, Dec 27, 2010 at 11:09 PM, Narendra <un...@gmail.com> wrote:
> Hi,
>
> Hive wiki says "Hive currently does input pruning only if the partition
> predicates are specified in the WHERE clause closest to the table_reference
> in the FROM clause..." what does this mean?
>
> I have a query in which I am joining 4 tables two of which need to be pruned
> by partition. The partition predicates are specified in the WHERE clause.
> However, I see that only one of the table gets pruned by the partition
> whereas the other one goes for a full table scan.
>
> Have a great holiday season!
> Narendra