You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Darapaneni, Hazarath" <hd...@ea.com> on 2014/03/24 20:34:12 UTC

hive partition questions

Hi Team,


I have a scenario that I need help from you guys, please check and let me know, if it's feasible or not.

Scenario:

I have a table called page_view and it has 10 partitions with date (2013-12-20 to 2013-12-30) and I need to select latest partition without giving date as where condition.

If I do  select * from page_view then it should got to latest partition as default (in this case its 2013-12-30) , is there a way to do this in hive configuration.

Thanks,
Hazarath.

Re: hive partition questions

Posted by Selina Zhang <se...@yahoo-inc.com>.
Try left semi join? It should equal to Xuefu¹s solution.

select * from jobs t left semi join (select max(dt) d from jobs) m on
m.d=t.dt;

Thanks,
Selina

On 3/24/14, 5:41 PM, "Darapaneni, Hazarath" <hd...@ea.com> wrote:

>I have tried Join but it's taking too much time. Thanks Zhang for your
>inputs.
>
>Thanks,
>Hazarath.
>
>-----Original Message-----
>From: Xuefu Zhang [mailto:xzhang@cloudera.com]
>Sent: Monday, March 24, 2014 7:36 PM
>To: dev@hive.apache.org
>Subject: Re: hive partition questions
>
>I'm on latest trunk, as named above. 0.9 might not have IN support.
>However, you might be able to work around using join instead.
>
>
>On Mon, Mar 24, 2014 at 4:26 PM, Darapaneni, Hazarath
><hd...@ea.com>wrote:
>
>> Which trunk are you in, am in 0.9.
>>
>> Thanks,
>> Hazarath.
>>
>> -----Original Message-----
>> From: Xuefu Zhang [mailto:xzhang@cloudera.com]
>> Sent: Monday, March 24, 2014 6:07 PM
>> To: dev@hive.apache.org
>> Subject: Re: hive partition questions
>>
>> I'm on latest trunk, and the following similar query worked for me:
>>
>> select * from test as X where X.d in (select max(T.d) from test as T);
>>
>>
>> On Mon, Mar 24, 2014 at 4:04 PM, Darapaneni, Hazarath
>> <hdarapaneni@ea.com
>> >wrote:
>>
>> > I have tried but no luck..
>> >
>> > FAILED: Parse Error: line 1:27 mismatched input 'as' expecting EOF
>>near '
>> > page_view'  and took out as and tried and it gave me  same syntax
>> > error as earlier one.
>> >
>> > FAILED: Parse Error: line 1:48 cannot recognize input near 'select'
>>'max'
>> > '(' in expression specification
>> >
>> > Thanks,
>> > Hazarath.
>> >
>> > -----Original Message-----
>> > From: Xuefu Zhang [mailto:xzhang@cloudera.com]
>> > Sent: Monday, March 24, 2014 5:53 PM
>> > To: dev@hive.apache.org
>> > Subject: Re: hive partition questions
>> >
>> > You may try:
>> >
>> > select * from page_view as PV1 where PV1.dt IN (select max(dt) from
>> > page_view);
>> >
>> >
>> > --Xuefu
>> >
>> >
>> > On Mon, Mar 24, 2014 at 3:47 PM, Darapaneni, Hazarath
>> > <hdarapaneni@ea.com
>> > >wrote:
>> >
>> > > Thanks Zhang,
>> > >
>> > > Is there a way to access ma partition in hive without specifying
>> > > as
>> > value.
>> > > Like as sub query .. select * from page_view where dt=(select
>> > > max(dt) from page_view)..i tried this but its failing with syntax.
>> > >
>> > > Thanks,
>> > > Hazarath.
>> > >
>> > > -----Original Message-----
>> > > From: Xuefu Zhang [mailto:xzhang@cloudera.com]
>> > > Sent: Monday, March 24, 2014 3:21 PM
>> > > To: dev@hive.apache.org
>> > > Subject: Re: hive partition questions
>> > >
>> > > As far as I know, select * from table gives you all data across
>> > > all partitions, and I don't think there is a config option to ask
>> > > Hive to read "latest" partition.
>> > >
>> > > --Xuefu
>> > >
>> > >
>> > > On Mon, Mar 24, 2014 at 12:34 PM, Darapaneni, Hazarath
>> > > <hd...@ea.com>wrote:
>> > >
>> > > > Hi Team,
>> > > >
>> > > >
>> > > > I have a scenario that I need help from you guys, please check
>> > > > and let me know, if it's feasible or not.
>> > > >
>> > > > Scenario:
>> > > >
>> > > > I have a table called page_view and it has 10 partitions with
>> > > > date
>> > > > (2013-12-20 to 2013-12-30) and I need to select latest partition
>> > > > without giving date as where condition.
>> > > >
>> > > > If I do  select * from page_view then it should got to latest
>> > > > partition as default (in this case its 2013-12-30) , is there a
>> > > > way to do this in hive configuration.
>> > > >
>> > > > Thanks,
>> > > > Hazarath.
>> > > >
>> > >
>> >
>>


RE: hive partition questions

Posted by "Darapaneni, Hazarath" <hd...@ea.com>.
I have tried Join but it's taking too much time. Thanks Zhang for your inputs.

Thanks,
Hazarath.

-----Original Message-----
From: Xuefu Zhang [mailto:xzhang@cloudera.com] 
Sent: Monday, March 24, 2014 7:36 PM
To: dev@hive.apache.org
Subject: Re: hive partition questions

I'm on latest trunk, as named above. 0.9 might not have IN support.
However, you might be able to work around using join instead.


On Mon, Mar 24, 2014 at 4:26 PM, Darapaneni, Hazarath <hd...@ea.com>wrote:

> Which trunk are you in, am in 0.9.
>
> Thanks,
> Hazarath.
>
> -----Original Message-----
> From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> Sent: Monday, March 24, 2014 6:07 PM
> To: dev@hive.apache.org
> Subject: Re: hive partition questions
>
> I'm on latest trunk, and the following similar query worked for me:
>
> select * from test as X where X.d in (select max(T.d) from test as T);
>
>
> On Mon, Mar 24, 2014 at 4:04 PM, Darapaneni, Hazarath 
> <hdarapaneni@ea.com
> >wrote:
>
> > I have tried but no luck..
> >
> > FAILED: Parse Error: line 1:27 mismatched input 'as' expecting EOF near '
> > page_view'  and took out as and tried and it gave me  same syntax 
> > error as earlier one.
> >
> > FAILED: Parse Error: line 1:48 cannot recognize input near 'select' 'max'
> > '(' in expression specification
> >
> > Thanks,
> > Hazarath.
> >
> > -----Original Message-----
> > From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> > Sent: Monday, March 24, 2014 5:53 PM
> > To: dev@hive.apache.org
> > Subject: Re: hive partition questions
> >
> > You may try:
> >
> > select * from page_view as PV1 where PV1.dt IN (select max(dt) from 
> > page_view);
> >
> >
> > --Xuefu
> >
> >
> > On Mon, Mar 24, 2014 at 3:47 PM, Darapaneni, Hazarath 
> > <hdarapaneni@ea.com
> > >wrote:
> >
> > > Thanks Zhang,
> > >
> > > Is there a way to access ma partition in hive without specifying 
> > > as
> > value.
> > > Like as sub query .. select * from page_view where dt=(select
> > > max(dt) from page_view)..i tried this but its failing with syntax.
> > >
> > > Thanks,
> > > Hazarath.
> > >
> > > -----Original Message-----
> > > From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> > > Sent: Monday, March 24, 2014 3:21 PM
> > > To: dev@hive.apache.org
> > > Subject: Re: hive partition questions
> > >
> > > As far as I know, select * from table gives you all data across 
> > > all partitions, and I don't think there is a config option to ask 
> > > Hive to read "latest" partition.
> > >
> > > --Xuefu
> > >
> > >
> > > On Mon, Mar 24, 2014 at 12:34 PM, Darapaneni, Hazarath
> > > <hd...@ea.com>wrote:
> > >
> > > > Hi Team,
> > > >
> > > >
> > > > I have a scenario that I need help from you guys, please check 
> > > > and let me know, if it's feasible or not.
> > > >
> > > > Scenario:
> > > >
> > > > I have a table called page_view and it has 10 partitions with 
> > > > date
> > > > (2013-12-20 to 2013-12-30) and I need to select latest partition 
> > > > without giving date as where condition.
> > > >
> > > > If I do  select * from page_view then it should got to latest 
> > > > partition as default (in this case its 2013-12-30) , is there a 
> > > > way to do this in hive configuration.
> > > >
> > > > Thanks,
> > > > Hazarath.
> > > >
> > >
> >
>

Re: hive partition questions

Posted by Xuefu Zhang <xz...@cloudera.com>.
I'm on latest trunk, as named above. 0.9 might not have IN support.
However, you might be able to work around using join instead.


On Mon, Mar 24, 2014 at 4:26 PM, Darapaneni, Hazarath <hd...@ea.com>wrote:

> Which trunk are you in, am in 0.9.
>
> Thanks,
> Hazarath.
>
> -----Original Message-----
> From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> Sent: Monday, March 24, 2014 6:07 PM
> To: dev@hive.apache.org
> Subject: Re: hive partition questions
>
> I'm on latest trunk, and the following similar query worked for me:
>
> select * from test as X where X.d in (select max(T.d) from test as T);
>
>
> On Mon, Mar 24, 2014 at 4:04 PM, Darapaneni, Hazarath <hdarapaneni@ea.com
> >wrote:
>
> > I have tried but no luck..
> >
> > FAILED: Parse Error: line 1:27 mismatched input 'as' expecting EOF near '
> > page_view'  and took out as and tried and it gave me  same syntax
> > error as earlier one.
> >
> > FAILED: Parse Error: line 1:48 cannot recognize input near 'select' 'max'
> > '(' in expression specification
> >
> > Thanks,
> > Hazarath.
> >
> > -----Original Message-----
> > From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> > Sent: Monday, March 24, 2014 5:53 PM
> > To: dev@hive.apache.org
> > Subject: Re: hive partition questions
> >
> > You may try:
> >
> > select * from page_view as PV1 where PV1.dt IN (select max(dt) from
> > page_view);
> >
> >
> > --Xuefu
> >
> >
> > On Mon, Mar 24, 2014 at 3:47 PM, Darapaneni, Hazarath
> > <hdarapaneni@ea.com
> > >wrote:
> >
> > > Thanks Zhang,
> > >
> > > Is there a way to access ma partition in hive without specifying as
> > value.
> > > Like as sub query .. select * from page_view where dt=(select
> > > max(dt) from page_view)..i tried this but its failing with syntax.
> > >
> > > Thanks,
> > > Hazarath.
> > >
> > > -----Original Message-----
> > > From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> > > Sent: Monday, March 24, 2014 3:21 PM
> > > To: dev@hive.apache.org
> > > Subject: Re: hive partition questions
> > >
> > > As far as I know, select * from table gives you all data across all
> > > partitions, and I don't think there is a config option to ask Hive
> > > to read "latest" partition.
> > >
> > > --Xuefu
> > >
> > >
> > > On Mon, Mar 24, 2014 at 12:34 PM, Darapaneni, Hazarath
> > > <hd...@ea.com>wrote:
> > >
> > > > Hi Team,
> > > >
> > > >
> > > > I have a scenario that I need help from you guys, please check and
> > > > let me know, if it's feasible or not.
> > > >
> > > > Scenario:
> > > >
> > > > I have a table called page_view and it has 10 partitions with date
> > > > (2013-12-20 to 2013-12-30) and I need to select latest partition
> > > > without giving date as where condition.
> > > >
> > > > If I do  select * from page_view then it should got to latest
> > > > partition as default (in this case its 2013-12-30) , is there a
> > > > way to do this in hive configuration.
> > > >
> > > > Thanks,
> > > > Hazarath.
> > > >
> > >
> >
>

RE: hive partition questions

Posted by "Darapaneni, Hazarath" <hd...@ea.com>.
Which trunk are you in, am in 0.9.

Thanks,
Hazarath.

-----Original Message-----
From: Xuefu Zhang [mailto:xzhang@cloudera.com] 
Sent: Monday, March 24, 2014 6:07 PM
To: dev@hive.apache.org
Subject: Re: hive partition questions

I'm on latest trunk, and the following similar query worked for me:

select * from test as X where X.d in (select max(T.d) from test as T);


On Mon, Mar 24, 2014 at 4:04 PM, Darapaneni, Hazarath <hd...@ea.com>wrote:

> I have tried but no luck..
>
> FAILED: Parse Error: line 1:27 mismatched input 'as' expecting EOF near '
> page_view'  and took out as and tried and it gave me  same syntax 
> error as earlier one.
>
> FAILED: Parse Error: line 1:48 cannot recognize input near 'select' 'max'
> '(' in expression specification
>
> Thanks,
> Hazarath.
>
> -----Original Message-----
> From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> Sent: Monday, March 24, 2014 5:53 PM
> To: dev@hive.apache.org
> Subject: Re: hive partition questions
>
> You may try:
>
> select * from page_view as PV1 where PV1.dt IN (select max(dt) from 
> page_view);
>
>
> --Xuefu
>
>
> On Mon, Mar 24, 2014 at 3:47 PM, Darapaneni, Hazarath 
> <hdarapaneni@ea.com
> >wrote:
>
> > Thanks Zhang,
> >
> > Is there a way to access ma partition in hive without specifying as
> value.
> > Like as sub query .. select * from page_view where dt=(select 
> > max(dt) from page_view)..i tried this but its failing with syntax.
> >
> > Thanks,
> > Hazarath.
> >
> > -----Original Message-----
> > From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> > Sent: Monday, March 24, 2014 3:21 PM
> > To: dev@hive.apache.org
> > Subject: Re: hive partition questions
> >
> > As far as I know, select * from table gives you all data across all 
> > partitions, and I don't think there is a config option to ask Hive 
> > to read "latest" partition.
> >
> > --Xuefu
> >
> >
> > On Mon, Mar 24, 2014 at 12:34 PM, Darapaneni, Hazarath
> > <hd...@ea.com>wrote:
> >
> > > Hi Team,
> > >
> > >
> > > I have a scenario that I need help from you guys, please check and 
> > > let me know, if it's feasible or not.
> > >
> > > Scenario:
> > >
> > > I have a table called page_view and it has 10 partitions with date
> > > (2013-12-20 to 2013-12-30) and I need to select latest partition 
> > > without giving date as where condition.
> > >
> > > If I do  select * from page_view then it should got to latest 
> > > partition as default (in this case its 2013-12-30) , is there a 
> > > way to do this in hive configuration.
> > >
> > > Thanks,
> > > Hazarath.
> > >
> >
>

Re: hive partition questions

Posted by Xuefu Zhang <xz...@cloudera.com>.
I'm on latest trunk, and the following similar query worked for me:

select * from test as X where X.d in (select max(T.d) from test as T);


On Mon, Mar 24, 2014 at 4:04 PM, Darapaneni, Hazarath <hd...@ea.com>wrote:

> I have tried but no luck..
>
> FAILED: Parse Error: line 1:27 mismatched input 'as' expecting EOF near '
> page_view'  and took out as and tried and it gave me  same syntax error as
> earlier one.
>
> FAILED: Parse Error: line 1:48 cannot recognize input near 'select' 'max'
> '(' in expression specification
>
> Thanks,
> Hazarath.
>
> -----Original Message-----
> From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> Sent: Monday, March 24, 2014 5:53 PM
> To: dev@hive.apache.org
> Subject: Re: hive partition questions
>
> You may try:
>
> select * from page_view as PV1 where PV1.dt IN (select max(dt) from
> page_view);
>
>
> --Xuefu
>
>
> On Mon, Mar 24, 2014 at 3:47 PM, Darapaneni, Hazarath <hdarapaneni@ea.com
> >wrote:
>
> > Thanks Zhang,
> >
> > Is there a way to access ma partition in hive without specifying as
> value.
> > Like as sub query .. select * from page_view where dt=(select max(dt)
> > from page_view)..i tried this but its failing with syntax.
> >
> > Thanks,
> > Hazarath.
> >
> > -----Original Message-----
> > From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> > Sent: Monday, March 24, 2014 3:21 PM
> > To: dev@hive.apache.org
> > Subject: Re: hive partition questions
> >
> > As far as I know, select * from table gives you all data across all
> > partitions, and I don't think there is a config option to ask Hive to
> > read "latest" partition.
> >
> > --Xuefu
> >
> >
> > On Mon, Mar 24, 2014 at 12:34 PM, Darapaneni, Hazarath
> > <hd...@ea.com>wrote:
> >
> > > Hi Team,
> > >
> > >
> > > I have a scenario that I need help from you guys, please check and
> > > let me know, if it's feasible or not.
> > >
> > > Scenario:
> > >
> > > I have a table called page_view and it has 10 partitions with date
> > > (2013-12-20 to 2013-12-30) and I need to select latest partition
> > > without giving date as where condition.
> > >
> > > If I do  select * from page_view then it should got to latest
> > > partition as default (in this case its 2013-12-30) , is there a way
> > > to do this in hive configuration.
> > >
> > > Thanks,
> > > Hazarath.
> > >
> >
>

RE: hive partition questions

Posted by "Darapaneni, Hazarath" <hd...@ea.com>.
I have tried but no luck..

FAILED: Parse Error: line 1:27 mismatched input 'as' expecting EOF near ' page_view'  and took out as and tried and it gave me  same syntax error as earlier one.

FAILED: Parse Error: line 1:48 cannot recognize input near 'select' 'max' '(' in expression specification

Thanks,
Hazarath.

-----Original Message-----
From: Xuefu Zhang [mailto:xzhang@cloudera.com] 
Sent: Monday, March 24, 2014 5:53 PM
To: dev@hive.apache.org
Subject: Re: hive partition questions

You may try:

select * from page_view as PV1 where PV1.dt IN (select max(dt) from page_view);


--Xuefu


On Mon, Mar 24, 2014 at 3:47 PM, Darapaneni, Hazarath <hd...@ea.com>wrote:

> Thanks Zhang,
>
> Is there a way to access ma partition in hive without specifying as value.
> Like as sub query .. select * from page_view where dt=(select max(dt) 
> from page_view)..i tried this but its failing with syntax.
>
> Thanks,
> Hazarath.
>
> -----Original Message-----
> From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> Sent: Monday, March 24, 2014 3:21 PM
> To: dev@hive.apache.org
> Subject: Re: hive partition questions
>
> As far as I know, select * from table gives you all data across all 
> partitions, and I don't think there is a config option to ask Hive to 
> read "latest" partition.
>
> --Xuefu
>
>
> On Mon, Mar 24, 2014 at 12:34 PM, Darapaneni, Hazarath
> <hd...@ea.com>wrote:
>
> > Hi Team,
> >
> >
> > I have a scenario that I need help from you guys, please check and 
> > let me know, if it's feasible or not.
> >
> > Scenario:
> >
> > I have a table called page_view and it has 10 partitions with date
> > (2013-12-20 to 2013-12-30) and I need to select latest partition 
> > without giving date as where condition.
> >
> > If I do  select * from page_view then it should got to latest 
> > partition as default (in this case its 2013-12-30) , is there a way 
> > to do this in hive configuration.
> >
> > Thanks,
> > Hazarath.
> >
>

Re: hive partition questions

Posted by Xuefu Zhang <xz...@cloudera.com>.
You may try:

select * from page_view as PV1 where PV1.dt IN (select max(dt) from
page_view);


--Xuefu


On Mon, Mar 24, 2014 at 3:47 PM, Darapaneni, Hazarath <hd...@ea.com>wrote:

> Thanks Zhang,
>
> Is there a way to access ma partition in hive without specifying as value.
> Like as sub query .. select * from page_view where dt=(select max(dt) from
> page_view)..i tried this but its failing with syntax.
>
> Thanks,
> Hazarath.
>
> -----Original Message-----
> From: Xuefu Zhang [mailto:xzhang@cloudera.com]
> Sent: Monday, March 24, 2014 3:21 PM
> To: dev@hive.apache.org
> Subject: Re: hive partition questions
>
> As far as I know, select * from table gives you all data across all
> partitions, and I don't think there is a config option to ask Hive to read
> "latest" partition.
>
> --Xuefu
>
>
> On Mon, Mar 24, 2014 at 12:34 PM, Darapaneni, Hazarath
> <hd...@ea.com>wrote:
>
> > Hi Team,
> >
> >
> > I have a scenario that I need help from you guys, please check and let
> > me know, if it's feasible or not.
> >
> > Scenario:
> >
> > I have a table called page_view and it has 10 partitions with date
> > (2013-12-20 to 2013-12-30) and I need to select latest partition
> > without giving date as where condition.
> >
> > If I do  select * from page_view then it should got to latest
> > partition as default (in this case its 2013-12-30) , is there a way to
> > do this in hive configuration.
> >
> > Thanks,
> > Hazarath.
> >
>

RE: hive partition questions

Posted by "Darapaneni, Hazarath" <hd...@ea.com>.
Thanks Zhang,

Is there a way to access ma partition in hive without specifying as value. Like as sub query .. select * from page_view where dt=(select max(dt) from page_view)..i tried this but its failing with syntax.

Thanks,
Hazarath.

-----Original Message-----
From: Xuefu Zhang [mailto:xzhang@cloudera.com] 
Sent: Monday, March 24, 2014 3:21 PM
To: dev@hive.apache.org
Subject: Re: hive partition questions

As far as I know, select * from table gives you all data across all partitions, and I don't think there is a config option to ask Hive to read "latest" partition.

--Xuefu


On Mon, Mar 24, 2014 at 12:34 PM, Darapaneni, Hazarath
<hd...@ea.com>wrote:

> Hi Team,
>
>
> I have a scenario that I need help from you guys, please check and let 
> me know, if it's feasible or not.
>
> Scenario:
>
> I have a table called page_view and it has 10 partitions with date
> (2013-12-20 to 2013-12-30) and I need to select latest partition 
> without giving date as where condition.
>
> If I do  select * from page_view then it should got to latest 
> partition as default (in this case its 2013-12-30) , is there a way to 
> do this in hive configuration.
>
> Thanks,
> Hazarath.
>

Re: hive partition questions

Posted by Xuefu Zhang <xz...@cloudera.com>.
As far as I know, select * from table gives you all data across all
partitions, and I don't think there is a config option to ask Hive to read
"latest" partition.

--Xuefu


On Mon, Mar 24, 2014 at 12:34 PM, Darapaneni, Hazarath
<hd...@ea.com>wrote:

> Hi Team,
>
>
> I have a scenario that I need help from you guys, please check and let me
> know, if it's feasible or not.
>
> Scenario:
>
> I have a table called page_view and it has 10 partitions with date
> (2013-12-20 to 2013-12-30) and I need to select latest partition without
> giving date as where condition.
>
> If I do  select * from page_view then it should got to latest partition as
> default (in this case its 2013-12-30) , is there a way to do this in hive
> configuration.
>
> Thanks,
> Hazarath.
>