You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Narendra <un...@gmail.com> on 2010/12/28 08:09:13 UTC
Question of Hive Partitioning
Hi,
Hive wiki says "Hive currently does input pruning only if the partition
predicates are specified in the WHERE clause closest to the table_reference
in the FROM clause..." what does this mean?
I have a query in which I am joining 4 tables two of which need to be pruned
by partition. The partition predicates are specified in the WHERE clause.
However, I see that only one of the table gets pruned by the partition
whereas the other one goes for a full table scan.
Have a great holiday season!
Narendra
Re: Question of Hive Partitioning
Posted by Narendra <un...@gmail.com>.
I guess it's the left outer that's making it do a full table scan...but
why???
On Tue, Dec 28, 2010 at 2:34 PM, Narendra <un...@gmail.com> wrote:
> Thanks for the response.
>
> The query is something like this -
>
> select
> a.a_val1,
> count(a.a_val2),
> sum(b.b_val),
> c.c_val,
> d.d_val
> from
> table1 a
> left outer join table2 b on (a.someval = b.someval),
> left outer join table3 c on (a.someval2 = c.someval2),
> left outer join table4 d on (a.someval3 = d.someval3)
> where
> a.dt > '2010-12-28 00:00:00' -- table1 partition
> and b.dt > '2010-12-28 00:00:00' -- table2 partition
> group by a_val1, c_val, d_val
>
> Thanks,
> Narendra
>
> On Tue, Dec 28, 2010 at 1:06 PM, yongqiang he <he...@gmail.com>wrote:
>
>> Can you post your query? Is there " or " connecting the partitions'
>> predicate with others?
>>
>> Thanks
>> Yongqiang
>> On Mon, Dec 27, 2010 at 11:09 PM, Narendra <un...@gmail.com> wrote:
>> > Hi,
>> >
>> > Hive wiki says "Hive currently does input pruning only if the partition
>> > predicates are specified in the WHERE clause closest to the
>> table_reference
>> > in the FROM clause..." what does this mean?
>> >
>> > I have a query in which I am joining 4 tables two of which need to be
>> pruned
>> > by partition. The partition predicates are specified in the WHERE
>> clause.
>> > However, I see that only one of the table gets pruned by the partition
>> > whereas the other one goes for a full table scan.
>> >
>> > Have a great holiday season!
>> > Narendra
>>
>
>
Re: Question of Hive Partitioning
Posted by Narendra <un...@gmail.com>.
Thanks for the response.
The query is something like this -
select
a.a_val1,
count(a.a_val2),
sum(b.b_val),
c.c_val,
d.d_val
from
table1 a
left outer join table2 b on (a.someval = b.someval),
left outer join table3 c on (a.someval2 = c.someval2),
left outer join table4 d on (a.someval3 = d.someval3)
where
a.dt > '2010-12-28 00:00:00' -- table1 partition
and b.dt > '2010-12-28 00:00:00' -- table2 partition
group by a_val1, c_val, d_val
Thanks,
Narendra
On Tue, Dec 28, 2010 at 1:06 PM, yongqiang he <he...@gmail.com>wrote:
> Can you post your query? Is there " or " connecting the partitions'
> predicate with others?
>
> Thanks
> Yongqiang
> On Mon, Dec 27, 2010 at 11:09 PM, Narendra <un...@gmail.com> wrote:
> > Hi,
> >
> > Hive wiki says "Hive currently does input pruning only if the partition
> > predicates are specified in the WHERE clause closest to the
> table_reference
> > in the FROM clause..." what does this mean?
> >
> > I have a query in which I am joining 4 tables two of which need to be
> pruned
> > by partition. The partition predicates are specified in the WHERE clause.
> > However, I see that only one of the table gets pruned by the partition
> > whereas the other one goes for a full table scan.
> >
> > Have a great holiday season!
> > Narendra
>
Re: Question of Hive Partitioning
Posted by yongqiang he <he...@gmail.com>.
Can you post your query? Is there " or " connecting the partitions'
predicate with others?
Thanks
Yongqiang
On Mon, Dec 27, 2010 at 11:09 PM, Narendra <un...@gmail.com> wrote:
> Hi,
>
> Hive wiki says "Hive currently does input pruning only if the partition
> predicates are specified in the WHERE clause closest to the table_reference
> in the FROM clause..." what does this mean?
>
> I have a query in which I am joining 4 tables two of which need to be pruned
> by partition. The partition predicates are specified in the WHERE clause.
> However, I see that only one of the table gets pruned by the partition
> whereas the other one goes for a full table scan.
>
> Have a great holiday season!
> Narendra