You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by "@Sanjiv Singh" <sa...@gmail.com> on 2016/06/23 00:42:18 UTC

Optimize Hive Query

Hi All,

I am running performance issue with below query. Its took 2-3 hours to
complete in hive.

Try tried to partition and bucketing changes on this tables, but without
luck.

Please help me in optimizing this query.

what schema level changes can be done ?
other parameters recommendations ?


*Below are complete details :*

*Hive Table DDL :*

CREATE TABLE `tuning_dd_key`(
>   m_d_key smallint,
>   sb_gu_key bigint,
>   t_ev_st_dt date,
>   a_z_key int,
>   c_dt date,
>   e_p_dt date,
>   sq_nbr int);


*Total data size : *

> 250 GB


*Long running query :*


> SELECT
> sb_gu_key, m_d_key, t_ev_st_dt,
> LAG( t_ev_st_dt )  OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY
>  t_ev_st_dt ) AS LAG_START_DT,
> a_z_key,
> c_dt,
> e_p_dt,
> sq_nbr,
> CASE WHEN LAG( t_ev_st_dt )  OVER ( PARTITION BY  m_d_key , sb_gu_key
>  ORDER BY  t_ev_st_dt ) IS NULL  OR a_z_key <> LAG( a_z_key , 1 , -999 )
>  OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt )  THEN 'S'
>  ELSE NULL  END AS ST_FLAG
> FROM  `PRDDB`.tuning_dd_key ;



*More info :*

number of distinct value in column m_d_key : 29
> number of distinct value in column sb_gu_key : 15434343





Regards
Sanjiv Singh
Mob :  +091 9990-447-339

Re: Optimize Hive Query

Posted by Jörn Franke <jo...@gmail.com>.
The query looks a little bit too complex from what it is supposed to do. Can you reformulate and restrict the data in a where clause (highest restriction first). Another hint would be to use the Orc format (with indexes and optionally bloom filters) with snappy compression as well as sorting the data on the column you choose to restrict in the where part.

> On 23 Jun 2016, at 02:42, @Sanjiv Singh <sa...@gmail.com> wrote:
> 
> Hi All,
> 
> I am running performance issue with below query. Its took 2-3 hours to complete in hive.
> 
> Try tried to partition and bucketing changes on this tables, but without luck.
> 
> Please help me in optimizing this query. 
> 
> what schema level changes can be done ? 
> other parameters recommendations ?
> 
> 
> Below are complete details :
> 
> Hive Table DDL :
> 
>> CREATE TABLE `tuning_dd_key`(
>>   m_d_key smallint,
>>   sb_gu_key bigint, 
>>   t_ev_st_dt date, 
>>   a_z_key int, 
>>   c_dt date, 
>>   e_p_dt date, 
>>   sq_nbr int);
>   
> Total data size : 
>> 250 GB  
>   
> Long running query :
>   
>> SELECT 
>> sb_gu_key, m_d_key, t_ev_st_dt, 
>> LAG( t_ev_st_dt )  OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt ) AS LAG_START_DT, 
>> a_z_key, 
>> c_dt, 
>> e_p_dt, 
>> sq_nbr, 
>> CASE WHEN LAG( t_ev_st_dt )  OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt ) IS NULL  OR a_z_key <> LAG( a_z_key , 1 , -999 )  OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt )  THEN 'S'  ELSE NULL  END AS ST_FLAG 
>> FROM  `PRDDB`.tuning_dd_key ;
>  
> 
> More info :
> 
>> number of distinct value in column m_d_key : 29
>> number of distinct value in column sb_gu_key : 15434343
> 
> 
> 
> 
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339

RE: Optimize Hive Query

Posted by "Markovitz, Dudu" <dm...@paypal.com>.
Thanks, I wanted to rule out skewedness over m_d_key,sb_gu_key

Dudu

From: @Sanjiv Singh [mailto:sanjiv.is.on@gmail.com]
Sent: Thursday, June 23, 2016 11:55 PM
To: user@hive.apache.org; Markovitz, Dudu <dm...@paypal.com>; sanjiv singh (ME) <sa...@gmail.com>
Subject: Re: Optimize Hive Query

Hi Dudu,

find below query response.

Query :
select      m_d_key,sb_gu_key   ,count (*)   as cnt
from        tuning_dd_key
group by    m_d_key,sb_gu_key
order by    cnt desc
limit       100;

Output :

16    9042668          1361
16    8063808          1361
16    8569864          1361
16    8909889          1361
16    9864785          1361
16    8269717          1361
16    10180282        1361
16    8913062          1361
16    8418183          1361
16    8003791          1361
16    10201084        1361
16    8470942          1361
16    9234223          1361
16    8330286          1361
16    12966192        1361
16    9008767          1361
16    8902598          1361
16    9878885          1361
16    8741214          1361
16    8732856          1361
16    9692696          1361
16    8072042          1361
16    8802681          1361
16    14087558        1361
16    9027186          1361
16    9587342          1361
16    9699202          1361
16    8542344          1361
16    9680544          1361
16    8903570          1361
16    9542542          1361
4      3576041          1361
16    9126774          1361
16    9957826          1361
16    8345331          1361
16    9756883          1361
16    9399702          1361
18    9403442          1361
16    9746288          1361
16    9435202          1361
16    9069894          1361
16    9920826          1361
16    8765877          1361
16    8813448          1361
18    9635460          1361
16    8463714          1361
16    8166965          1361
16    9597903          1361
16    9432100          1361
16    8847857          1361
16    13953068        1361
16    8744451          1361
16    8089463          1361
16    9674902          1361
16    8418200          1361
16    8028509          1361
16    9243086          1361
16    8892184          1361
16    8801594          1361
16    9849079          1361
16    8556753          1361
16    8979232          1361
16    8081946          1361
16    8724046          1361
16    9984434          1361
16    8651659          1361
16    9116866          1361
1      17870072        1361
16    8860630          1361
16    9888398          1361
16    9463782          1361
16    9602127          1361
16    9353325          1361
16    7991816          1361
16    9920420          1361
16    8497624          1361
16    8987980          1361
16    8234751          1361
16    8389490          1361
18    9975575          1361
16    8026536          1361
16    8790618          1361
16    9846791          1361
16    8363833          1361
16    9025525          1361
16    9241297          1361
16    8712487          1361
16    8692003          1361
16    9316523          1361
16    8124338          1361
16    9941027          1361
16    9547973          1361
16    8007742          1361
16    8418425          1361
16    8944940          1361
16    8890232          1361
16    9248984          1361
16    9784461          1361
16    9009374          1361
16    8395861          1361


Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Thu, Jun 23, 2016 at 4:01 AM, Markovitz, Dudu <dm...@paypal.com>> wrote:

Could you also add the results of the following query?



Thanks



Dudu





select      m_d_key

           ,sb_gu_key

           ,count (*)   as cnt



from        tuning_dd_key



group by    m_d_key

           ,sb_gu_key



order by    cnt desc



limit       100

;



-----Original Message-----
From: Gopal Vijayaraghavan [mailto:gopal@hortonworks.com<ma...@hortonworks.com>] On Behalf Of Gopal Vijayaraghavan
Sent: Thursday, June 23, 2016 9:45 AM
To: user@hive.apache.org<ma...@hive.apache.org>
Subject: Re: Optimize Hive Query





> Long running query :



Are you running this on MapReduce or Tez?



Please post the output of explain - if you are seeing > 1 shuffle edge in your query while having only one window for OVER(), that might be the reason.



OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)



The multiple PTF operators should have been collapsed by the reduce sink-deduplication.



Cheers,

Gopal






Re: Optimize Hive Query

Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Dudu,

find below query response.

Query :

> select      m_d_key,sb_gu_key   ,count (*)   as cnt
> from        tuning_dd_key
> group by    m_d_key,sb_gu_key
> order by    cnt desc
> limit       100;


Output :

16 9042668 1361
> 16 8063808 1361
> 16 8569864 1361
> 16 8909889 1361
> 16 9864785 1361
> 16 8269717 1361
> 16 10180282 1361
> 16 8913062 1361
> 16 8418183 1361
> 16 8003791 1361
> 16 10201084 1361
> 16 8470942 1361
> 16 9234223 1361
> 16 8330286 1361
> 16 12966192 1361
> 16 9008767 1361
> 16 8902598 1361
> 16 9878885 1361
> 16 8741214 1361
> 16 8732856 1361
> 16 9692696 1361
> 16 8072042 1361
> 16 8802681 1361
> 16 14087558 1361
> 16 9027186 1361
> 16 9587342 1361
> 16 9699202 1361
> 16 8542344 1361
> 16 9680544 1361
> 16 8903570 1361
> 16 9542542 1361
> 4 3576041 1361
> 16 9126774 1361
> 16 9957826 1361
> 16 8345331 1361
> 16 9756883 1361
> 16 9399702 1361
> 18 9403442 1361
> 16 9746288 1361
> 16 9435202 1361
> 16 9069894 1361
> 16 9920826 1361
> 16 8765877 1361
> 16 8813448 1361
> 18 9635460 1361
> 16 8463714 1361
> 16 8166965 1361
> 16 9597903 1361
> 16 9432100 1361
> 16 8847857 1361
> 16 13953068 1361
> 16 8744451 1361
> 16 8089463 1361
> 16 9674902 1361
> 16 8418200 1361
> 16 8028509 1361
> 16 9243086 1361
> 16 8892184 1361
> 16 8801594 1361
> 16 9849079 1361
> 16 8556753 1361
> 16 8979232 1361
> 16 8081946 1361
> 16 8724046 1361
> 16 9984434 1361
> 16 8651659 1361
> 16 9116866 1361
> 1 17870072 1361
> 16 8860630 1361
> 16 9888398 1361
> 16 9463782 1361
> 16 9602127 1361
> 16 9353325 1361
> 16 7991816 1361
> 16 9920420 1361
> 16 8497624 1361
> 16 8987980 1361
> 16 8234751 1361
> 16 8389490 1361
> 18 9975575 1361
> 16 8026536 1361
> 16 8790618 1361
> 16 9846791 1361
> 16 8363833 1361
> 16 9025525 1361
> 16 9241297 1361
> 16 8712487 1361
> 16 8692003 1361
> 16 9316523 1361
> 16 8124338 1361
> 16 9941027 1361
> 16 9547973 1361
> 16 8007742 1361
> 16 8418425 1361
> 16 8944940 1361
> 16 8890232 1361
> 16 9248984 1361
> 16 9784461 1361
> 16 9009374 1361
> 16 8395861 1361



Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Thu, Jun 23, 2016 at 4:01 AM, Markovitz, Dudu <dm...@paypal.com>
wrote:

> Could you also add the results of the following query?
>
>
>
> Thanks
>
>
>
> Dudu
>
>
>
>
>
> select      m_d_key
>
>            ,sb_gu_key
>
>            ,count (*)   as cnt
>
>
>
> from        tuning_dd_key
>
>
>
> group by    m_d_key
>
>            ,sb_gu_key
>
>
>
> order by    cnt desc
>
>
>
> limit       100
>
> ;
>
>
>
> -----Original Message-----
> From: Gopal Vijayaraghavan [mailto:gopal@hortonworks.com] On Behalf Of
> Gopal Vijayaraghavan
> Sent: Thursday, June 23, 2016 9:45 AM
> To: user@hive.apache.org
> Subject: Re: Optimize Hive Query
>
>
>
>
>
> > Long running query :
>
>
>
> Are you running this on MapReduce or Tez?
>
>
>
> Please post the output of explain - if you are seeing > 1 shuffle edge in
> your query while having only one window for OVER(), that might be the
> reason.
>
>
>
> OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
>
>
>
> The multiple PTF operators should have been collapsed by the reduce
> sink-deduplication.
>
>
>
> Cheers,
>
> Gopal
>
>
>
>
>

RE: Optimize Hive Query

Posted by "Markovitz, Dudu" <dm...@paypal.com>.
Could you also add the results of the following query?



Thanks



Dudu





select      m_d_key

           ,sb_gu_key

           ,count (*)   as cnt



from        tuning_dd_key



group by    m_d_key

           ,sb_gu_key



order by    cnt desc



limit       100

;



-----Original Message-----
From: Gopal Vijayaraghavan [mailto:gopal@hortonworks.com] On Behalf Of Gopal Vijayaraghavan
Sent: Thursday, June 23, 2016 9:45 AM
To: user@hive.apache.org
Subject: Re: Optimize Hive Query





> Long running query :



Are you running this on MapReduce or Tez?



Please post the output of explain - if you are seeing > 1 shuffle edge in your query while having only one window for OVER(), that might be the reason.



OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)



The multiple PTF operators should have been collapsed by the reduce sink-deduplication.



Cheers,

Gopal





Re: Optimize Hive Query

Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Sanjiv,

Normally when it comes to this, I will try to find the section of the code
which cause the largest lag

SELECT
> sb_gu_key, m_d_key, t_ev_st_dt,
> LAG( t_ev_st_dt )  OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY
>  t_ev_st_dt ) AS LAG_START_DT,
> a_z_key,
> c_dt,
> e_p_dt,
> sq_nbr,
> CASE WHEN LAG( t_ev_st_dt )  OVER ( PARTITION BY  m_d_key , sb_gu_key
>  ORDER BY  t_ev_st_dt ) IS NULL  OR a_z_key <> LAG( a_z_key , 1 , -999 )
>  OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt )  THEN 'S'
>  ELSE NULL  END AS ST_FLAG
> FROM  `PRDDB`.tuning_dd_key ;




From the above query which part is the most time consuming?

For example is the LAG function the most consuming section that takers the
lion's hare of the query?

Just execute the code and comment out LAG(t_ev_st_td) .....  first

I suspect

CASE WHEN LAG( t_ev_st_dt )  OVER ( PARTITION BY  m_d_key , sb_gu_key
 ORDER BY  t_ev_st_dt ) IS NULL  OR a_z_key <> LAG( a_z_key , 1 , -999 )
 OVER ( PARTITION BY

is the other possible candidate as well with that OR than can cause the
issue

For example you can do the following to measure the timing

select from_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') AS
StartTime;
SELECT COUNT(1) FROM PRDDB`.tuning_dd_key
WHERE (LAG( t_ev_st_dt )  OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER
BY  t_ev_st_dt ) IS NULL)
OR
a_z_key <> LAG( a_z_key , 1 , -999 )  OVER ( PARTITION BY  m_d_key ,
sb_gu_key  ORDER BY  t_ev_st_dt )
select from_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') AS EndTime;


HTH

Dr Mich Talebzadeh



LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Any and all responsibility for any loss, damage or
destruction of data or any other property which may arise from relying on
this email's technical content is explicitly disclaimed. The author will in
no case be liable for any monetary damages arising from such loss, damage
or destruction.



On 24 June 2016 at 22:34, @Sanjiv Singh <sa...@gmail.com> wrote:

> Hi Vijay,
>
> Please help me on this....let me know you need other info.
>
>
>
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339
>
> On Thu, Jun 23, 2016 at 12:41 PM, @Sanjiv Singh <sa...@gmail.com>
> wrote:
>
>> Hi Gopal,
>>
>> I am using Tez as execution engine.
>>
>> DAG :
>>
>> +--------------------------------------------------------+--+
>> |
>>                                   Explain
>>                                                                         |
>> +---------------------------------------------------------+--+
>> | Plan not optimized by CBO.
>>         |
>> |
>>                            |
>> | Vertex dependency in root stage
>>                             |
>> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
>>                                                           |
>> |
>>                                  |
>> | Stage-0
>>                               |
>> |    Fetch Operator
>>                                                               |
>> |       limit:-1
>>                                   |
>> |       Stage-1
>>                                 |
>> |          Reducer 2
>>                                           |
>> |          File Output Operator [FS_55596]
>>                                                         |
>> |             compressed:false
>>                                                              |
>> |             Statistics:Num rows: 6357592675 Data size: 54076899328
>> Basic stats: COMPLETE Column stats: NONE          |
>> |
>> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
>> format:":"org.apache.hadoop.mapred.TextInputFormat","output
>> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}  |
>> |             Select Operator [SEL_55594]
>>                 |
>> |
>>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>>                                    |
>> |                Statistics:Num rows: 6357592675 Data size: 54076899328
>> Basic stats: COMPLETE Column stats: NONE                       |
>> |                PTF Operator [PTF_55593]
>>                                    |
>> |                   Function definitions:[{"Input
>> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
>> _col1","name:":"windowingtablefunction","order by:":"_col2"}]      |
>> |                   Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE           |
>> |                   Select Operator [SEL_55592]
>>                                           |
>> |                   |
>>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
>>                                                                   |
>> |                   |  Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE                    |
>> |                   |<-Map 1 [SIMPLE_EDGE] vectorized
>>                                  |
>> |                      Reduce Output Operator [RS_55597]
>>                                |
>> |                         key expressions:m_d_key (type: smallint),
>> sb_gu_key (type: bigint), t_ev_st_dt (type: date)         |
>> |                         Map-reduce partition columns:m_d_key (type:
>> smallint), sb_gu_key (type: bigint)            |
>> |                         sort order:+++
>>        |
>> |                         Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE
>>  |
>> |                         value expressions:ad_zn_key (type: int), c_dt
>> (type: date), e_p_dt (type: date), sq_nbr (type: int)           |
>> |                         TableScan [TS_55590]
>>                                                             |
>> |                            ACID table:true
>>                                            |
>> |                            alias:tuning_dd_key
>>                         |
>> |                            Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE                      |
>> |
>>
>>                 |
>>
>> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>>
>> Regards
>> Sanjiv Singh
>> Mob :  +091 9990-447-339
>>
>> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
>> wrote:
>>
>>>
>>> > Long running query :
>>>
>>> Are you running this on MapReduce or Tez?
>>>
>>> Please post the output of explain - if you are seeing > 1 shuffle edge in
>>> your query while having only one window for OVER(), that might be the
>>> reason.
>>>
>>> OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
>>>
>>>
>>> The multiple PTF operators should have been collapsed by the reduce
>>> sink-deduplication.
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>
>

Re: Optimize Hive Query

Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi,

Curious to see if this issue been resolved (performance) after compaction?

Thanks

Dr Mich Talebzadeh



LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 26 June 2016 at 21:11, @Sanjiv Singh <sa...@gmail.com> wrote:

> Thanks Gopal for your inputs ....For now I have create NON ACID table and
> loaded data ....see below from logs proper group splits happening .
>
> 2016-06-25 12:52:00,160 [INFO] [InputInitializer {Map 1} #0]
> |tez.HiveSplitGenerator|: Number of grouped splits: 512
>
>
> On compaction issue , Compaction enabled with two workers. why compaction
> not happened ? will check metastore logs.
>
> I have too many ACID tables on hive and how many worker should be
> configured ? currently it is 2.
>
> Thanks a lot once again.
>
>
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339
>
> On Fri, Jun 24, 2016 at 9:14 PM, @Sanjiv Singh <sa...@gmail.com>
> wrote:
>
>> Thanks Gopal for your inputs. Let me run compaction explicitly on table
>> then see how query works.
>>
>>
>>
>> Let
>>
>> Regards
>> Sanjiv Singh
>> Mob :  +091 9990-447-339
>>
>> On Fri, Jun 24, 2016 at 7:53 PM, Gopal Vijayaraghavan <go...@apache.org>
>> wrote:
>>
>>>
>>> > Yes for this tables, ACID enabled.  it has only 256 files for each
>>> >buckets. these are create only when data initially loaded in this table.
>>>
>>> Yes, the initial load goes in as an insert DELTA too - that requires
>>> another compaction to move into base files.
>>>
>>> The fact that they haven't been automatically compacted yet, suggests
>>> that
>>> the compactor isn't working for some reason (check hive metastore logs).
>>>
>>> > One thing that I am not able to understand that its is running with 1
>>> >MAPPER.
>>>
>>> The size of deltas shows up as 0, till the compaction goes through - in
>>> Hive2, it will be -1 which will be correctly interpreted as "unknown
>>> size".
>>>
>>>
>>> > | -rw-r--r--   3 H56473 hdfs  215973009 2016-06-23 17:38
>>>
>>> >/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
>>> >0000  |
>>>
>>> Clearly an issue due to the lack of compaction - I see a single delta
>>> with
>>> 255 buckets and no base_* files at all.
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>

Re: Optimize Hive Query

Posted by Eugene Koifman <ek...@hortonworks.com>.
if you have many acid tables you almost certainly want more than 2 workers.  If you have 2 workers (and a single metastore instance) you can run at most 2 compaction jobs at a time.  Unless the tables are very small, compaction may fall behind if it's configured to run too serially.

In order for compactions to run automatically, at a minimum you have to have hive.compactor.initiator.on=true for one standalone metastore instance.
hive.compactor.delta.num.threshold determines when compaction is triggered for a given table/partition.
There is more details in https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration

Look for log messages in metastore.log form Initiator/Cleaner classes.  If you don't see any, it must be disabled.

SHOW COMPACTIONS is a command you can run at CLI to see if there are any currently running.

you can also use ALTER TABLE (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionCompact) to launch compaction on demand.

could you send results of dfs -ls /apps/hive/warehouse/PRDDB.db/tuning_dd_key

thanks,
Eugene


From: "@Sanjiv Singh" <sa...@gmail.com>>
Reply-To: "user@hive.apache.org<ma...@hive.apache.org>" <us...@hive.apache.org>>, "sanjiv.is.on@gmail.com<ma...@gmail.com>" <sa...@gmail.com>>
Date: Sunday, June 26, 2016 at 1:11 PM
To: Gopal Vijayaraghavan <go...@apache.org>>
Cc: "user@hive.apache.org<ma...@hive.apache.org>" <us...@hive.apache.org>>
Subject: Re: Optimize Hive Query

Thanks Gopal for your inputs ....For now I have create NON ACID table and loaded data ....see below from logs proper group splits happening .

2016-06-25 12:52:00,160 [INFO] [InputInitializer {Map 1} #0] |tez.HiveSplitGenerator|: Number of grouped splits: 512


On compaction issue , Compaction enabled with two workers. why compaction not happened ? will check metastore logs.

I have too many ACID tables on hive and how many worker should be configured ? currently it is 2.

Thanks a lot once again.


Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Fri, Jun 24, 2016 at 9:14 PM, @Sanjiv Singh <sa...@gmail.com>> wrote:
Thanks Gopal for your inputs. Let me run compaction explicitly on table then see how query works.



Let

Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Fri, Jun 24, 2016 at 7:53 PM, Gopal Vijayaraghavan <go...@apache.org>> wrote:

> Yes for this tables, ACID enabled.  it has only 256 files for each
>buckets. these are create only when data initially loaded in this table.

Yes, the initial load goes in as an insert DELTA too - that requires
another compaction to move into base files.

The fact that they haven't been automatically compacted yet, suggests that
the compactor isn't working for some reason (check hive metastore logs).

> One thing that I am not able to understand that its is running with 1
>MAPPER.

The size of deltas shows up as 0, till the compaction goes through - in
Hive2, it will be -1 which will be correctly interpreted as "unknown size".


> | -rw-r--r--   3 H56473 hdfs  215973009 2016-06-23 17:38
>/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
>0000  |

Clearly an issue due to the lack of compaction - I see a single delta with
255 buckets and no base_* files at all.

Cheers,
Gopal











Re: Optimize Hive Query

Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Thanks Gopal for your inputs ....For now I have create NON ACID table and
loaded data ....see below from logs proper group splits happening .

2016-06-25 12:52:00,160 [INFO] [InputInitializer {Map 1} #0]
|tez.HiveSplitGenerator|: Number of grouped splits: 512


On compaction issue , Compaction enabled with two workers. why compaction
not happened ? will check metastore logs.

I have too many ACID tables on hive and how many worker should be
configured ? currently it is 2.

Thanks a lot once again.


Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Fri, Jun 24, 2016 at 9:14 PM, @Sanjiv Singh <sa...@gmail.com>
wrote:

> Thanks Gopal for your inputs. Let me run compaction explicitly on table
> then see how query works.
>
>
>
> Let
>
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339
>
> On Fri, Jun 24, 2016 at 7:53 PM, Gopal Vijayaraghavan <go...@apache.org>
> wrote:
>
>>
>> > Yes for this tables, ACID enabled.  it has only 256 files for each
>> >buckets. these are create only when data initially loaded in this table.
>>
>> Yes, the initial load goes in as an insert DELTA too - that requires
>> another compaction to move into base files.
>>
>> The fact that they haven't been automatically compacted yet, suggests that
>> the compactor isn't working for some reason (check hive metastore logs).
>>
>> > One thing that I am not able to understand that its is running with 1
>> >MAPPER.
>>
>> The size of deltas shows up as 0, till the compaction goes through - in
>> Hive2, it will be -1 which will be correctly interpreted as "unknown
>> size".
>>
>>
>> > | -rw-r--r--   3 H56473 hdfs  215973009 2016-06-23 17:38
>>
>> >/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
>> >0000  |
>>
>> Clearly an issue due to the lack of compaction - I see a single delta with
>> 255 buckets and no base_* files at all.
>>
>> Cheers,
>> Gopal
>>
>>
>>
>>
>>
>>
>>
>>
>>
>

Re: Optimize Hive Query

Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Thanks Gopal for your inputs. Let me run compaction explicitly on table
then see how query works.



Let

Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Fri, Jun 24, 2016 at 7:53 PM, Gopal Vijayaraghavan <go...@apache.org>
wrote:

>
> > Yes for this tables, ACID enabled.  it has only 256 files for each
> >buckets. these are create only when data initially loaded in this table.
>
> Yes, the initial load goes in as an insert DELTA too - that requires
> another compaction to move into base files.
>
> The fact that they haven't been automatically compacted yet, suggests that
> the compactor isn't working for some reason (check hive metastore logs).
>
> > One thing that I am not able to understand that its is running with 1
> >MAPPER.
>
> The size of deltas shows up as 0, till the compaction goes through - in
> Hive2, it will be -1 which will be correctly interpreted as "unknown size".
>
>
> > | -rw-r--r--   3 H56473 hdfs  215973009 2016-06-23 17:38
> >/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
> >0000  |
>
> Clearly an issue due to the lack of compaction - I see a single delta with
> 255 buckets and no base_* files at all.
>
> Cheers,
> Gopal
>
>
>
>
>
>
>
>
>

Re: Optimize Hive Query

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> Yes for this tables, ACID enabled.  it has only 256 files for each
>buckets. these are create only when data initially loaded in this table.

Yes, the initial load goes in as an insert DELTA too - that requires
another compaction to move into base files.

The fact that they haven't been automatically compacted yet, suggests that
the compactor isn't working for some reason (check hive metastore logs).

> One thing that I am not able to understand that its is running with 1
>MAPPER.  

The size of deltas shows up as 0, till the compaction goes through - in
Hive2, it will be -1 which will be correctly interpreted as "unknown size".

 
> | -rw-r--r--   3 H56473 hdfs  215973009 2016-06-23 17:38
>/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
>0000  |

Clearly an issue due to the lack of compaction - I see a single delta with
255 buckets and no base_* files at all.

Cheers,
Gopal









Re: Optimize Hive Query

Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Vijay,

Yes for this tables, ACID enabled.  it has only 256 files for each buckets.
these are create only when data initially loaded in this table.
There is not transaction done after that.

I see that all file for buckets are also in equal size.


One thing that I am not able to understand that its is running with 1
MAPPER.

I will try options you provided , will get insight after that.

Please find below response of command "dfs -ls <path>"


dfs -ls /apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|
    DFS Output
                   |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| Found 256 items

                  |
| -rw-r--r--   3 H56473 hdfs  215973009 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00000
 |
| -rw-r--r--   3 H56473 hdfs  215253561 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00001
 |
| -rw-r--r--   3 H56473 hdfs  215470193 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00002
 |
| -rw-r--r--   3 H56473 hdfs  215455699 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00003
 |
| -rw-r--r--   3 H56473 hdfs  215620935 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00004
 |
| -rw-r--r--   3 H56473 hdfs  215704945 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00005
 |
| -rw-r--r--   3 H56473 hdfs  215842552 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00006
 |
| -rw-r--r--   3 H56473 hdfs  215808371 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00007
 |
| -rw-r--r--   3 H56473 hdfs  215969399 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00008
 |
| -rw-r--r--   3 H56473 hdfs  216130285 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00009
 |
| -rw-r--r--   3 H56473 hdfs  216168481 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00010
 |
| -rw-r--r--   3 H56473 hdfs  216152569 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00011
 |
| -rw-r--r--   3 H56473 hdfs  216088033 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00012
 |
| -rw-r--r--   3 H56473 hdfs  216264972 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00013
 |
| -rw-r--r--   3 H56473 hdfs  216267772 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00014
 |
| -rw-r--r--   3 H56473 hdfs  216385428 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00015
 |
| -rw-r--r--   3 H56473 hdfs  216399617 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00016
 |
| -rw-r--r--   3 H56473 hdfs  216544838 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00017
 |
| -rw-r--r--   3 H56473 hdfs  216542321 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00018
 |
| -rw-r--r--   3 H56473 hdfs  216520174 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00019
 |
| -rw-r--r--   3 H56473 hdfs  216513782 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00020
 |
| -rw-r--r--   3 H56473 hdfs  216546732 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00021
 |
| -rw-r--r--   3 H56473 hdfs  216474592 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00022
 |
| -rw-r--r--   3 H56473 hdfs  216503547 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00023
 |
| -rw-r--r--   3 H56473 hdfs  216590266 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00024
 |
| -rw-r--r--   3 H56473 hdfs  216692713 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00025
 |
| -rw-r--r--   3 H56473 hdfs  216593362 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00026
 |
| -rw-r--r--   3 H56473 hdfs  216736077 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00027
 |
| -rw-r--r--   3 H56473 hdfs  216691738 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00028
 |
| -rw-r--r--   3 H56473 hdfs  216751031 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00029
 |
| -rw-r--r--   3 H56473 hdfs  216748097 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00030
 |
| -rw-r--r--   3 H56473 hdfs  216561861 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00031
 |
| -rw-r--r--   3 H56473 hdfs  216827688 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00032
 |
| -rw-r--r--   3 H56473 hdfs  216785788 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00033
 |
| -rw-r--r--   3 H56473 hdfs  216824525 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00034
 |
| -rw-r--r--   3 H56473 hdfs  216744531 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00035
 |
| -rw-r--r--   3 H56473 hdfs  216801179 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00036
 |
| -rw-r--r--   3 H56473 hdfs  216843923 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00037
 |
| -rw-r--r--   3 H56473 hdfs  216841512 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00038
 |
| -rw-r--r--   3 H56473 hdfs  216786403 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00039
 |
| -rw-r--r--   3 H56473 hdfs  216828047 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00040
 |
| -rw-r--r--   3 H56473 hdfs  216825787 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00041
 |
| -rw-r--r--   3 H56473 hdfs  216876891 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00042
 |
| -rw-r--r--   3 H56473 hdfs  216955603 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00043
 |
| -rw-r--r--   3 H56473 hdfs  216875485 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00044
 |
| -rw-r--r--   3 H56473 hdfs  216935044 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00045
 |
| -rw-r--r--   3 H56473 hdfs  216819647 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00046
 |
| -rw-r--r--   3 H56473 hdfs  216951906 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00047
 |
| -rw-r--r--   3 H56473 hdfs  216992043 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00048
 |
| -rw-r--r--   3 H56473 hdfs  216893267 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00049
 |
| -rw-r--r--   3 H56473 hdfs  216887568 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00050
 |
| -rw-r--r--   3 H56473 hdfs  216932512 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00051
 |
| -rw-r--r--   3 H56473 hdfs  216995951 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00052
 |
| -rw-r--r--   3 H56473 hdfs  216980201 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00053
 |
| -rw-r--r--   3 H56473 hdfs  217022321 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00054
 |
| -rw-r--r--   3 H56473 hdfs  217061082 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00055
 |
| -rw-r--r--   3 H56473 hdfs  216915085 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00056
 |
| -rw-r--r--   3 H56473 hdfs  216973899 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00057
 |
| -rw-r--r--   3 H56473 hdfs  216947693 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00058
 |
| -rw-r--r--   3 H56473 hdfs  216978410 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00059
 |
| -rw-r--r--   3 H56473 hdfs  216966029 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00060
 |
| -rw-r--r--   3 H56473 hdfs  217062158 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00061
 |
| -rw-r--r--   3 H56473 hdfs  217012664 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00062
 |
| -rw-r--r--   3 H56473 hdfs  217126561 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00063
 |
| -rw-r--r--   3 H56473 hdfs  217066999 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00064
 |
| -rw-r--r--   3 H56473 hdfs  217052347 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00065
 |
| -rw-r--r--   3 H56473 hdfs  217054965 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00066
 |
| -rw-r--r--   3 H56473 hdfs  217155817 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00067
 |
| -rw-r--r--   3 H56473 hdfs  217163685 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00068
 |
| -rw-r--r--   3 H56473 hdfs  217075833 2016-06-23 17:44
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00069
 |
| -rw-r--r--   3 H56473 hdfs  217110271 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00070
 |
| -rw-r--r--   3 H56473 hdfs  217099877 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00071
 |
| -rw-r--r--   3 H56473 hdfs  217015698 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00072
 |
| -rw-r--r--   3 H56473 hdfs  217077120 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00073
 |
| -rw-r--r--   3 H56473 hdfs  216985477 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00074
 |
| -rw-r--r--   3 H56473 hdfs  217026304 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00075
 |
| -rw-r--r--   3 H56473 hdfs  217071942 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00076
 |
| -rw-r--r--   3 H56473 hdfs  217043625 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00077
 |
| -rw-r--r--   3 H56473 hdfs  217065652 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00078
 |
| -rw-r--r--   3 H56473 hdfs  217016300 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00079
 |
| -rw-r--r--   3 H56473 hdfs  216937153 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00080
 |
| -rw-r--r--   3 H56473 hdfs  217133015 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00081
 |
| -rw-r--r--   3 H56473 hdfs  217200238 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00082
 |
| -rw-r--r--   3 H56473 hdfs  217054759 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00083
 |
| -rw-r--r--   3 H56473 hdfs  217009417 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00084
 |
| -rw-r--r--   3 H56473 hdfs  217120440 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00085
 |
| -rw-r--r--   3 H56473 hdfs  216987824 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00086
 |
| -rw-r--r--   3 H56473 hdfs  217053581 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00087
 |
| -rw-r--r--   3 H56473 hdfs  217139935 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00088
 |
| -rw-r--r--   3 H56473 hdfs  217133938 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00089
 |
| -rw-r--r--   3 H56473 hdfs  217082768 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00090
 |
| -rw-r--r--   3 H56473 hdfs  217163033 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00091
 |
| -rw-r--r--   3 H56473 hdfs  217160879 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00092
 |
| -rw-r--r--   3 H56473 hdfs  217118996 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00093
 |
| -rw-r--r--   3 H56473 hdfs  217075658 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00094
 |
| -rw-r--r--   3 H56473 hdfs  217155885 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00095
 |
| -rw-r--r--   3 H56473 hdfs  217073602 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00096
 |
| -rw-r--r--   3 H56473 hdfs  217048454 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00097
 |
| -rw-r--r--   3 H56473 hdfs  217073480 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00098
 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|
    DFS Output
                   |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| -rw-r--r--   3 H56473 hdfs  217008473 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00099
 |
| -rw-r--r--   3 H56473 hdfs  217055083 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00100
 |
| -rw-r--r--   3 H56473 hdfs  217170178 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00101
 |
| -rw-r--r--   3 H56473 hdfs  217231866 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00102
 |
| -rw-r--r--   3 H56473 hdfs  217065415 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00103
 |
| -rw-r--r--   3 H56473 hdfs  217082898 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00104
 |
| -rw-r--r--   3 H56473 hdfs  217091916 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00105
 |
| -rw-r--r--   3 H56473 hdfs  217104784 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00106
 |
| -rw-r--r--   3 H56473 hdfs  217230988 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00107
 |
| -rw-r--r--   3 H56473 hdfs  217165086 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00108
 |
| -rw-r--r--   3 H56473 hdfs  217182369 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00109
 |
| -rw-r--r--   3 H56473 hdfs  217118862 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00110
 |
| -rw-r--r--   3 H56473 hdfs  217180020 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00111
 |
| -rw-r--r--   3 H56473 hdfs  217208232 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00112
 |
| -rw-r--r--   3 H56473 hdfs  217199476 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00113
 |
| -rw-r--r--   3 H56473 hdfs  217059597 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00114
 |
| -rw-r--r--   3 H56473 hdfs  217198679 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00115
 |
| -rw-r--r--   3 H56473 hdfs  217091012 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00116
 |
| -rw-r--r--   3 H56473 hdfs  217190091 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00117
 |
| -rw-r--r--   3 H56473 hdfs  217212151 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00118
 |
| -rw-r--r--   3 H56473 hdfs  217177634 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00119
 |
| -rw-r--r--   3 H56473 hdfs  217173936 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00120
 |
| -rw-r--r--   3 H56473 hdfs  217091743 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00121
 |
| -rw-r--r--   3 H56473 hdfs  217141298 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00122
 |
| -rw-r--r--   3 H56473 hdfs  217132204 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00123
 |
| -rw-r--r--   3 H56473 hdfs  217246598 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00124
 |
| -rw-r--r--   3 H56473 hdfs  217151074 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00125
 |
| -rw-r--r--   3 H56473 hdfs  217159300 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00126
 |
| -rw-r--r--   3 H56473 hdfs  217065983 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00127
 |
| -rw-r--r--   3 H56473 hdfs  217193747 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00128
 |
| -rw-r--r--   3 H56473 hdfs  217228013 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00129
 |
| -rw-r--r--   3 H56473 hdfs  217375225 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00130
 |
| -rw-r--r--   3 H56473 hdfs  217335197 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00131
 |
| -rw-r--r--   3 H56473 hdfs  217182537 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00132
 |
| -rw-r--r--   3 H56473 hdfs  217229480 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00133
 |
| -rw-r--r--   3 H56473 hdfs  217283215 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00134
 |
| -rw-r--r--   3 H56473 hdfs  217372431 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00135
 |
| -rw-r--r--   3 H56473 hdfs  217234042 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00136
 |
| -rw-r--r--   3 H56473 hdfs  217176898 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00137
 |
| -rw-r--r--   3 H56473 hdfs  217301958 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00138
 |
| -rw-r--r--   3 H56473 hdfs  217451069 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00139
 |
| -rw-r--r--   3 H56473 hdfs  217314824 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00140
 |
| -rw-r--r--   3 H56473 hdfs  217236714 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00141
 |
| -rw-r--r--   3 H56473 hdfs  217198689 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00142
 |
| -rw-r--r--   3 H56473 hdfs  217300105 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00143
 |
| -rw-r--r--   3 H56473 hdfs  217322560 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00144
 |
| -rw-r--r--   3 H56473 hdfs  217251308 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00145
 |
| -rw-r--r--   3 H56473 hdfs  217268420 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00146
 |
| -rw-r--r--   3 H56473 hdfs  217250943 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00147
 |
| -rw-r--r--   3 H56473 hdfs  217317036 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00148
 |
| -rw-r--r--   3 H56473 hdfs  217249339 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00149
 |
| -rw-r--r--   3 H56473 hdfs  217199279 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00150
 |
| -rw-r--r--   3 H56473 hdfs  217201217 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00151
 |
| -rw-r--r--   3 H56473 hdfs  217254531 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00152
 |
| -rw-r--r--   3 H56473 hdfs  217325584 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00153
 |
| -rw-r--r--   3 H56473 hdfs  217361150 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00154
 |
| -rw-r--r--   3 H56473 hdfs  217108797 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00155
 |
| -rw-r--r--   3 H56473 hdfs  217250257 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00156
 |
| -rw-r--r--   3 H56473 hdfs  217207462 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00157
 |
| -rw-r--r--   3 H56473 hdfs  217283114 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00158
 |
| -rw-r--r--   3 H56473 hdfs  217336632 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00159
 |
| -rw-r--r--   3 H56473 hdfs  217240497 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00160
 |
| -rw-r--r--   3 H56473 hdfs  217253770 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00161
 |
| -rw-r--r--   3 H56473 hdfs  217304981 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00162
 |
| -rw-r--r--   3 H56473 hdfs  217337919 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00163
 |
| -rw-r--r--   3 H56473 hdfs  217246615 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00164
 |
| -rw-r--r--   3 H56473 hdfs  217175520 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00165
 |
| -rw-r--r--   3 H56473 hdfs  217196985 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00166
 |
| -rw-r--r--   3 H56473 hdfs  217251912 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00167
 |
| -rw-r--r--   3 H56473 hdfs  217220392 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00168
 |
| -rw-r--r--   3 H56473 hdfs  217197378 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00169
 |
| -rw-r--r--   3 H56473 hdfs  217348762 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00170
 |
| -rw-r--r--   3 H56473 hdfs  217328016 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00171
 |
| -rw-r--r--   3 H56473 hdfs  217142176 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00172
 |
| -rw-r--r--   3 H56473 hdfs  217184677 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00173
 |
| -rw-r--r--   3 H56473 hdfs  217137840 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00174
 |
| -rw-r--r--   3 H56473 hdfs  217256754 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00175
 |
| -rw-r--r--   3 H56473 hdfs  217220424 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00176
 |
| -rw-r--r--   3 H56473 hdfs  217185562 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00177
 |
| -rw-r--r--   3 H56473 hdfs  217194575 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00178
 |
| -rw-r--r--   3 H56473 hdfs  217188097 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00179
 |
| -rw-r--r--   3 H56473 hdfs  217158674 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00180
 |
| -rw-r--r--   3 H56473 hdfs  217222821 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00181
 |
| -rw-r--r--   3 H56473 hdfs  217170798 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00182
 |
| -rw-r--r--   3 H56473 hdfs  217112615 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00183
 |
| -rw-r--r--   3 H56473 hdfs  217247444 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00184
 |
| -rw-r--r--   3 H56473 hdfs  217136592 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00185
 |
| -rw-r--r--   3 H56473 hdfs  217142003 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00186
 |
| -rw-r--r--   3 H56473 hdfs  217125721 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00187
 |
| -rw-r--r--   3 H56473 hdfs  217223243 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00188
 |
| -rw-r--r--   3 H56473 hdfs  217130702 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00189
 |
| -rw-r--r--   3 H56473 hdfs  217183947 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00190
 |
| -rw-r--r--   3 H56473 hdfs  216992034 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00191
 |
| -rw-r--r--   3 H56473 hdfs  217092469 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00192
 |
| -rw-r--r--   3 H56473 hdfs  217105133 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00193
 |
| -rw-r--r--   3 H56473 hdfs  217004917 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00194
 |
| -rw-r--r--   3 H56473 hdfs  217078557 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00195
 |
| -rw-r--r--   3 H56473 hdfs  217142035 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00196
 |
| -rw-r--r--   3 H56473 hdfs  216993711 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00197
 |
| -rw-r--r--   3 H56473 hdfs  217113515 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00198
 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|
    DFS Output
                   |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| -rw-r--r--   3 H56473 hdfs  216982052 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00199
 |
| -rw-r--r--   3 H56473 hdfs  217086795 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00200
 |
| -rw-r--r--   3 H56473 hdfs  217093281 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00201
 |
| -rw-r--r--   3 H56473 hdfs  217019781 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00202
 |
| -rw-r--r--   3 H56473 hdfs  217042372 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00203
 |
| -rw-r--r--   3 H56473 hdfs  217057324 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00204
 |
| -rw-r--r--   3 H56473 hdfs  216952020 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00205
 |
| -rw-r--r--   3 H56473 hdfs  217032930 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00206
 |
| -rw-r--r--   3 H56473 hdfs  217092936 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00207
 |
| -rw-r--r--   3 H56473 hdfs  216983923 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00208
 |
| -rw-r--r--   3 H56473 hdfs  216968290 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00209
 |
| -rw-r--r--   3 H56473 hdfs  217005680 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00210
 |
| -rw-r--r--   3 H56473 hdfs  216967777 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00211
 |
| -rw-r--r--   3 H56473 hdfs  217012052 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00212
 |
| -rw-r--r--   3 H56473 hdfs  217030822 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00213
 |
| -rw-r--r--   3 H56473 hdfs  216879098 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00214
 |
| -rw-r--r--   3 H56473 hdfs  216938981 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00215
 |
| -rw-r--r--   3 H56473 hdfs  216888064 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00216
 |
| -rw-r--r--   3 H56473 hdfs  217006890 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00217
 |
| -rw-r--r--   3 H56473 hdfs  216882204 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00218
 |
| -rw-r--r--   3 H56473 hdfs  216878124 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00219
 |
| -rw-r--r--   3 H56473 hdfs  216924030 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00220
 |
| -rw-r--r--   3 H56473 hdfs  216870359 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00221
 |
| -rw-r--r--   3 H56473 hdfs  216965696 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00222
 |
| -rw-r--r--   3 H56473 hdfs  216835873 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00223
 |
| -rw-r--r--   3 H56473 hdfs  216901023 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00224
 |
| -rw-r--r--   3 H56473 hdfs  216800511 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00225
 |
| -rw-r--r--   3 H56473 hdfs  216831982 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00226
 |
| -rw-r--r--   3 H56473 hdfs  216800692 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00227
 |
| -rw-r--r--   3 H56473 hdfs  216857795 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00228
 |
| -rw-r--r--   3 H56473 hdfs  216788102 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00229
 |
| -rw-r--r--   3 H56473 hdfs  216853182 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00230
 |
| -rw-r--r--   3 H56473 hdfs  216762991 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00231
 |
| -rw-r--r--   3 H56473 hdfs  216757552 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00232
 |
| -rw-r--r--   3 H56473 hdfs  216654495 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00233
 |
| -rw-r--r--   3 H56473 hdfs  216588262 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00234
 |
| -rw-r--r--   3 H56473 hdfs  216648141 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00235
 |
| -rw-r--r--   3 H56473 hdfs  216687726 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00236
 |
| -rw-r--r--   3 H56473 hdfs  216613018 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00237
 |
| -rw-r--r--   3 H56473 hdfs  216561040 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00238
 |
| -rw-r--r--   3 H56473 hdfs  216561957 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00239
 |
| -rw-r--r--   3 H56473 hdfs  216508756 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00240
 |
| -rw-r--r--   3 H56473 hdfs  216416846 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00241
 |
| -rw-r--r--   3 H56473 hdfs  216404453 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00242
 |
| -rw-r--r--   3 H56473 hdfs  216439975 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00243
 |
| -rw-r--r--   3 H56473 hdfs  216347834 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00244
 |
| -rw-r--r--   3 H56473 hdfs  216338215 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00245
 |
| -rw-r--r--   3 H56473 hdfs  216234103 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00246
 |
| -rw-r--r--   3 H56473 hdfs  216350621 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00247
 |
| -rw-r--r--   3 H56473 hdfs  216290393 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00248
 |
| -rw-r--r--   3 H56473 hdfs  216131030 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00249
 |
| -rw-r--r--   3 H56473 hdfs  216065350 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00250
 |
| -rw-r--r--   3 H56473 hdfs  215942146 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00251
 |
| -rw-r--r--   3 H56473 hdfs  215876602 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00252
 |
| -rw-r--r--   3 H56473 hdfs  215906085 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00253
 |
| -rw-r--r--   3 H56473 hdfs  215650113 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00254
 |
| -rw-r--r--   3 H56473 hdfs  215941082 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00255
 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
257 rows selected (0.065 seconds)


Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Fri, Jun 24, 2016 at 6:16 PM, Gopal Vijayaraghavan <go...@apache.org>
wrote:

>
>
> > Please help me on this....let me know you need other info.
>
> Are the ORC tables fully compacted? Looks like you're running a version of
> Hive-ACID, which does not perform well without compacting delta files.
>
> dfs -ls <path>;
>
> should tell you whether there are any delta_* files in the list.
>
> > |                            ACID table:true
> >                                           |
> > |                            alias:tuning_dd_key
> >                        |
>
> Other than that, it does look like you have only 1 shuffle and if you did
> run this using Tez, I recommend using
>
> <https://github.com/apache/tez/tree/master/tez-tools/swimlanes>
>
> to find out the slowest task & find more information about it. You will
> get a diagram which looks like this
>
> <http://www.slideshare.net/t3rmin4t0r/tez8-ui-walkthrough/20>
>
>
> The longest bar of that is the slowest task. I have another version of it,
> which is unreleased yet (is a bit hard to explain) giving an image which
> looks like
>
> <
> http://people.apache.org/~gopalv/q21_suppliers_who_kept_orders_waiting.svg
> >
>
>
> <https://github.com/t3rmin4t0r/tez-swimlanes/blob/master/vertex.py>
>
> which is better at finding the 1-2 reducers skewing in a large DAG like
> q21 [1].
>
>
> Cheers,
> Gopal
> [1] -
> <
> http://people.apache.org/~gopalv/tpch-plans/q21_suppliers_who_kept_orders_
> waiting.svg>
>
>
>

Re: Optimize Hive Query

Posted by Gopal Vijayaraghavan <go...@apache.org>.

> Please help me on this....let me know you need other info.

Are the ORC tables fully compacted? Looks like you're running a version of
Hive-ACID, which does not perform well without compacting delta files.

dfs -ls <path>;

should tell you whether there are any delta_* files in the list.

> |                            ACID table:true
>                                           |
> |                            alias:tuning_dd_key
>                        |

Other than that, it does look like you have only 1 shuffle and if you did
run this using Tez, I recommend using

<https://github.com/apache/tez/tree/master/tez-tools/swimlanes>

to find out the slowest task & find more information about it. You will
get a diagram which looks like this

<http://www.slideshare.net/t3rmin4t0r/tez8-ui-walkthrough/20>


The longest bar of that is the slowest task. I have another version of it,
which is unreleased yet (is a bit hard to explain) giving an image which
looks like 

<http://people.apache.org/~gopalv/q21_suppliers_who_kept_orders_waiting.svg
>


<https://github.com/t3rmin4t0r/tez-swimlanes/blob/master/vertex.py>

which is better at finding the 1-2 reducers skewing in a large DAG like
q21 [1].


Cheers,
Gopal
[1] - 
<http://people.apache.org/~gopalv/tpch-plans/q21_suppliers_who_kept_orders_
waiting.svg>



Re: Optimize Hive Query

Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Vijay,

Please help me on this....let me know you need other info.



Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Thu, Jun 23, 2016 at 12:41 PM, @Sanjiv Singh <sa...@gmail.com>
wrote:

> Hi Gopal,
>
> I am using Tez as execution engine.
>
> DAG :
>
> +--------------------------------------------------------+--+
> |
>                                   Explain
>                                                                         |
> +---------------------------------------------------------+--+
> | Plan not optimized by CBO.
>       |
> |
>                            |
> | Vertex dependency in root stage
>                             |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
>                                                         |
> |
>                                  |
> | Stage-0
>                               |
> |    Fetch Operator
>                                                               |
> |       limit:-1
>                                 |
> |       Stage-1
>                                 |
> |          Reducer 2
>                                         |
> |          File Output Operator [FS_55596]
>                                                       |
> |             compressed:false
>                                                            |
> |             Statistics:Num rows: 6357592675 Data size: 54076899328 Basic
> stats: COMPLETE Column stats: NONE          |
> |
> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
> format:":"org.apache.hadoop.mapred.TextInputFormat","output
> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}  |
> |             Select Operator [SEL_55594]
>                 |
> |
>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>                                    |
> |                Statistics:Num rows: 6357592675 Data size: 54076899328
> Basic stats: COMPLETE Column stats: NONE                       |
> |                PTF Operator [PTF_55593]
>                                    |
> |                   Function definitions:[{"Input
> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
> _col1","name:":"windowingtablefunction","order by:":"_col2"}]      |
> |                   Statistics:Num rows: 6357592675 Data size: 54076899328
> Basic stats: COMPLETE Column stats: NONE           |
> |                   Select Operator [SEL_55592]
>                                           |
> |                   |
>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
>                                                                   |
> |                   |  Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE                    |
> |                   |<-Map 1 [SIMPLE_EDGE] vectorized
>                                  |
> |                      Reduce Output Operator [RS_55597]
>                              |
> |                         key expressions:m_d_key (type: smallint),
> sb_gu_key (type: bigint), t_ev_st_dt (type: date)         |
> |                         Map-reduce partition columns:m_d_key (type:
> smallint), sb_gu_key (type: bigint)            |
> |                         sort order:+++
>      |
> |                         Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE
>  |
> |                         value expressions:ad_zn_key (type: int), c_dt
> (type: date), e_p_dt (type: date), sq_nbr (type: int)           |
> |                         TableScan [TS_55590]
>                                                           |
> |                            ACID table:true
>                                          |
> |                            alias:tuning_dd_key
>                       |
> |                            Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE                      |
> |
>
>                 |
>
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339
>
> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
> wrote:
>
>>
>> > Long running query :
>>
>> Are you running this on MapReduce or Tez?
>>
>> Please post the output of explain - if you are seeing > 1 shuffle edge in
>> your query while having only one window for OVER(), that might be the
>> reason.
>>
>> OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
>>
>>
>> The multiple PTF operators should have been collapsed by the reduce
>> sink-deduplication.
>>
>> Cheers,
>> Gopal
>>
>>
>>
>

Re: Optimize Hive Query

Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Mich,

I tried the same without any luck. I don't see any improvement.



Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Thu, Jun 23, 2016 at 5:38 PM, @Sanjiv Singh <sa...@gmail.com>
wrote:

> Thanks Mich. for your inputs.
>
> Let me try that as well. Will post response.
>
>
>

Re: Optimize Hive Query

Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Thanks Mich. for your inputs.

Let me try that as well. Will post response.

Re: Optimize Hive Query

Posted by Mich Talebzadeh <mi...@gmail.com>.
Funny enough it is pretty close to similar ORC transactional tables I have.
Standard with 256 buckets with two columns as below

number of distinct value in column m_d_key : 29
> number of distinct value in column sb_gu_key : 15434343


You have also vectorised data taking 1024 rows at once.

Still the optimizer does not tell me much. Also I don't use TEZ. I use
Spark as the execution engine.  From my experience (and I am sure there
will be plenty who will disagree with me :)), the optimiser does not make
much difference, it is the execution engine than delivers the performance.

The other alternative is that when you populate the table insert the data
sorted by m_d_key, sb_gu_key, t_ev_st_dt to ensure that that the optimizer
will be better off.

Also may help if you add t_ev_st_dt   as the third column of the bucket as
the LAG() function is using it in ORDER BY


CLUSTERED BY (m_d_key, sb_gu_key, t_ev_st_dt)  INTO 256 BUCKETS
STORED AS ORC
TBLPROPERTIES (
  "transactional"="true",
  "orc.create.index"="true",
  "orc.bloom.filter.columns"="m_d_key, sb_gu_key, t_ev_st_dt",
  "orc.bloom.filter.fpp"="0.05",
  "orc.stripe.size"="16777216",
  "orc.row.index.stride"="10000"
)

Others may have better ideas.

HTH


Dr Mich Talebzadeh



LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com



On 23 June 2016 at 20:30, @Sanjiv Singh <sa...@gmail.com> wrote:

> Hi Mich ,
>
> Please find below output of command.
>
> desc formatted tuning_dd_key ;
>
>
> +-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
> |           col_name            |
>   data_type                                           |        comment
>    |
>
> +-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
> | # col_name                    | data_type
>                                                       | comment
>   |
> |                               | NULL
>                                                      | NULL
>  |
> | m_d_key                       | smallint
>                                                      |
>   |
> | sb_gu_key                     | bigint
>                                                      |
>   |
> | t_ev_st_dt                    | date
>                                                      |
>   |
> | ad_zn_key                     | int
>                                                       |
>   |
> | c_dt                          | date
>                                                      |
>   |
> | e_p_dt                        | date
>                                                      |
>   |
> | sq_nbr                        | int
>                                                       |
>   |
> |                               | NULL
>                                                      | NULL
>  |
> | # Detailed Table Information  | NULL
>                                                      | NULL
>  |
> | Database:                     | PRDDB
>                                                       | NULL
>    |
> | CreateTime:                   | Thu Jun 23 11:03:53 EDT 2016
>                                                      | NULL
>  |
> | LastAccessTime:               | UNKNOWN
>                                                       | NULL
>    |
> | Protect Mode:                 | None
>                                                      | NULL
>  |
> | Retention:                    | 0
>                                                       | NULL
>    |
> | Table Type:                   | MANAGED_TABLE
>                                                       | NULL
>    |
> | Table Parameters:             | NULL
>                                                      | NULL
>  |
> |                               | COLUMN_STATS_ACCURATE
>                                                       | true
>    |
> |                               | numFiles
>                                                      | 256
>   |
> |                               | numRows
>                                                       | 6357592675
>    |
> |                               | rawDataSize
>                                                       | 0
>   |
> |                               | totalSize
>                                                       | 54076898961
>   |
> |                               | transactional
>                                                       | true
>    |
> |                               | transient_lastDdlTime
>                                                       | 1466694970
>    |
> |                               | NULL
>                                                      | NULL
>  |
> | # Storage Information         | NULL
>                                                      | NULL
>  |
> | SerDe Library:                |
> org.apache.hadoop.hive.ql.io.orc.OrcSerde
>                   | NULL                  |
> | InputFormat:                  |
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                   | NULL                  |
> | OutputFormat:                 |
> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
>                    | NULL                  |
> | Compressed:                   | No
>                                                      | NULL
>  |
> | Num Buckets:                  | 256
>                                                       | NULL
>    |
> | Bucket Columns:               | [sbsc_guid_key, mas_div_key]
>                                                      | NULL
>  |
> | Sort Columns:                 | []
>                                                      | NULL
>  |
> | Storage Desc Params:          | NULL
>                                                      | NULL
>  |
> |                               | serialization.format
>                                                      | 1
>   |
>
> +-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
>
>
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339
>
> On Thu, Jun 23, 2016 at 12:47 PM, Mich Talebzadeh <
> mich.talebzadeh@gmail.com> wrote:
>
>> Do you also have the output from
>>
>> desc formatted tuning_dd_key
>>
>>  and send the output please?
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 23 June 2016 at 17:41, @Sanjiv Singh <sa...@gmail.com> wrote:
>>
>>> Hi Gopal,
>>>
>>> I am using Tez as execution engine.
>>>
>>> DAG :
>>>
>>> +--------------------------------------------------------+--+
>>> |
>>>                                     Explain
>>>                                                                           |
>>> +---------------------------------------------------------+--+
>>> | Plan not optimized by CBO.
>>>         |
>>> |
>>>                              |
>>> | Vertex dependency in root stage
>>>                               |
>>> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
>>>                                                           |
>>> |
>>>                                    |
>>> | Stage-0
>>>                                 |
>>> |    Fetch Operator
>>>                                                                 |
>>> |       limit:-1
>>>                                   |
>>> |       Stage-1
>>>                                   |
>>> |          Reducer 2
>>>                                           |
>>> |          File Output Operator [FS_55596]
>>>                                                         |
>>> |             compressed:false
>>>                                                              |
>>> |             Statistics:Num rows: 6357592675 Data size: 54076899328
>>> Basic stats: COMPLETE Column stats: NONE          |
>>> |
>>> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
>>> format:":"org.apache.hadoop.mapred.TextInputFormat","output
>>> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}  |
>>> |             Select Operator [SEL_55594]
>>>                   |
>>> |
>>>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>>>                                    |
>>> |                Statistics:Num rows: 6357592675 Data size: 54076899328
>>> Basic stats: COMPLETE Column stats: NONE                       |
>>> |                PTF Operator [PTF_55593]
>>>                                      |
>>> |                   Function definitions:[{"Input
>>> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
>>> _col1","name:":"windowingtablefunction","order by:":"_col2"}]      |
>>> |                   Statistics:Num rows: 6357592675 Data size:
>>> 54076899328 Basic stats: COMPLETE Column stats: NONE           |
>>> |                   Select Operator [SEL_55592]
>>>                                             |
>>> |                   |
>>>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
>>>                                                                   |
>>> |                   |  Statistics:Num rows: 6357592675 Data size:
>>> 54076899328 Basic stats: COMPLETE Column stats: NONE                    |
>>> |                   |<-Map 1 [SIMPLE_EDGE] vectorized
>>>                                    |
>>> |                      Reduce Output Operator [RS_55597]
>>>                                |
>>> |                         key expressions:m_d_key (type: smallint),
>>> sb_gu_key (type: bigint), t_ev_st_dt (type: date)         |
>>> |                         Map-reduce partition columns:m_d_key (type:
>>> smallint), sb_gu_key (type: bigint)            |
>>> |                         sort order:+++
>>>        |
>>> |                         Statistics:Num rows: 6357592675 Data size:
>>> 54076899328 Basic stats: COMPLETE Column stats: NONE
>>>  |
>>> |                         value expressions:ad_zn_key (type: int), c_dt
>>> (type: date), e_p_dt (type: date), sq_nbr (type: int)           |
>>> |                         TableScan [TS_55590]
>>>                                                             |
>>> |                            ACID table:true
>>>                                            |
>>> |                            alias:tuning_dd_key
>>>                         |
>>> |                            Statistics:Num rows: 6357592675 Data size:
>>> 54076899328 Basic stats: COMPLETE Column stats: NONE                      |
>>> |
>>>
>>>                   |
>>>
>>> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>>>
>>> Regards
>>> Sanjiv Singh
>>> Mob :  +091 9990-447-339
>>>
>>> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <gopalv@apache.org
>>> > wrote:
>>>
>>>>
>>>> > Long running query :
>>>>
>>>> Are you running this on MapReduce or Tez?
>>>>
>>>> Please post the output of explain - if you are seeing > 1 shuffle edge
>>>> in
>>>> your query while having only one window for OVER(), that might be the
>>>> reason.
>>>>
>>>> OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
>>>>
>>>>
>>>> The multiple PTF operators should have been collapsed by the reduce
>>>> sink-deduplication.
>>>>
>>>> Cheers,
>>>> Gopal
>>>>
>>>>
>>>>
>>>
>>
>

Re: Optimize Hive Query

Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Mich ,

Please find below output of command.

desc formatted tuning_dd_key ;

+-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
|           col_name            |
data_type                                           |        comment
 |
+-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
| # col_name                    | data_type
                                                    | comment
|
|                               | NULL
                                                     | NULL
 |
| m_d_key                       | smallint
                                                     |
  |
| sb_gu_key                     | bigint
                                                     |
  |
| t_ev_st_dt                    | date
                                                     |
  |
| ad_zn_key                     | int
                                                    |
|
| c_dt                          | date
                                                     |
  |
| e_p_dt                        | date
                                                     |
  |
| sq_nbr                        | int
                                                    |
|
|                               | NULL
                                                     | NULL
 |
| # Detailed Table Information  | NULL
                                                     | NULL
 |
| Database:                     | PRDDB
                                                    | NULL
 |
| CreateTime:                   | Thu Jun 23 11:03:53 EDT 2016
                                                     | NULL
 |
| LastAccessTime:               | UNKNOWN
                                                    | NULL
 |
| Protect Mode:                 | None
                                                     | NULL
 |
| Retention:                    | 0
                                                    | NULL
 |
| Table Type:                   | MANAGED_TABLE
                                                    | NULL
 |
| Table Parameters:             | NULL
                                                     | NULL
 |
|                               | COLUMN_STATS_ACCURATE
                                                    | true
 |
|                               | numFiles
                                                     | 256
  |
|                               | numRows
                                                    | 6357592675
 |
|                               | rawDataSize
                                                    | 0
|
|                               | totalSize
                                                    | 54076898961
|
|                               | transactional
                                                    | true
 |
|                               | transient_lastDdlTime
                                                    | 1466694970
 |
|                               | NULL
                                                     | NULL
 |
| # Storage Information         | NULL
                                                     | NULL
 |
| SerDe Library:                | org.apache.hadoop.hive.ql.io.orc.OrcSerde
                                                    | NULL
 |
| InputFormat:                  |
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                  | NULL                  |
| OutputFormat:                 |
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                   | NULL                  |
| Compressed:                   | No
                                                     | NULL
 |
| Num Buckets:                  | 256
                                                    | NULL
 |
| Bucket Columns:               | [sbsc_guid_key, mas_div_key]
                                                     | NULL
 |
| Sort Columns:                 | []
                                                     | NULL
 |
| Storage Desc Params:          | NULL
                                                     | NULL
 |
|                               | serialization.format
                                                     | 1
  |
+-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+


Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Thu, Jun 23, 2016 at 12:47 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com
> wrote:

> Do you also have the output from
>
> desc formatted tuning_dd_key
>
>  and send the output please?
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 23 June 2016 at 17:41, @Sanjiv Singh <sa...@gmail.com> wrote:
>
>> Hi Gopal,
>>
>> I am using Tez as execution engine.
>>
>> DAG :
>>
>> +--------------------------------------------------------+--+
>> |
>>                                   Explain
>>                                                                         |
>> +---------------------------------------------------------+--+
>> | Plan not optimized by CBO.
>>         |
>> |
>>                            |
>> | Vertex dependency in root stage
>>                             |
>> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
>>                                                           |
>> |
>>                                  |
>> | Stage-0
>>                               |
>> |    Fetch Operator
>>                                                               |
>> |       limit:-1
>>                                   |
>> |       Stage-1
>>                                 |
>> |          Reducer 2
>>                                           |
>> |          File Output Operator [FS_55596]
>>                                                         |
>> |             compressed:false
>>                                                              |
>> |             Statistics:Num rows: 6357592675 Data size: 54076899328
>> Basic stats: COMPLETE Column stats: NONE          |
>> |
>> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
>> format:":"org.apache.hadoop.mapred.TextInputFormat","output
>> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}  |
>> |             Select Operator [SEL_55594]
>>                 |
>> |
>>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>>                                    |
>> |                Statistics:Num rows: 6357592675 Data size: 54076899328
>> Basic stats: COMPLETE Column stats: NONE                       |
>> |                PTF Operator [PTF_55593]
>>                                    |
>> |                   Function definitions:[{"Input
>> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
>> _col1","name:":"windowingtablefunction","order by:":"_col2"}]      |
>> |                   Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE           |
>> |                   Select Operator [SEL_55592]
>>                                           |
>> |                   |
>>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
>>                                                                   |
>> |                   |  Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE                    |
>> |                   |<-Map 1 [SIMPLE_EDGE] vectorized
>>                                  |
>> |                      Reduce Output Operator [RS_55597]
>>                                |
>> |                         key expressions:m_d_key (type: smallint),
>> sb_gu_key (type: bigint), t_ev_st_dt (type: date)         |
>> |                         Map-reduce partition columns:m_d_key (type:
>> smallint), sb_gu_key (type: bigint)            |
>> |                         sort order:+++
>>        |
>> |                         Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE
>>  |
>> |                         value expressions:ad_zn_key (type: int), c_dt
>> (type: date), e_p_dt (type: date), sq_nbr (type: int)           |
>> |                         TableScan [TS_55590]
>>                                                             |
>> |                            ACID table:true
>>                                            |
>> |                            alias:tuning_dd_key
>>                         |
>> |                            Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE                      |
>> |
>>
>>                 |
>>
>> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>>
>> Regards
>> Sanjiv Singh
>> Mob :  +091 9990-447-339
>>
>> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
>> wrote:
>>
>>>
>>> > Long running query :
>>>
>>> Are you running this on MapReduce or Tez?
>>>
>>> Please post the output of explain - if you are seeing > 1 shuffle edge in
>>> your query while having only one window for OVER(), that might be the
>>> reason.
>>>
>>> OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
>>>
>>>
>>> The multiple PTF operators should have been collapsed by the reduce
>>> sink-deduplication.
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>
>

Re: Optimize Hive Query

Posted by Mich Talebzadeh <mi...@gmail.com>.
Do you also have the output from

desc formatted tuning_dd_key

 and send the output please?

Dr Mich Talebzadeh



LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com



On 23 June 2016 at 17:41, @Sanjiv Singh <sa...@gmail.com> wrote:

> Hi Gopal,
>
> I am using Tez as execution engine.
>
> DAG :
>
> +--------------------------------------------------------+--+
> |
>                                   Explain
>                                                                         |
> +---------------------------------------------------------+--+
> | Plan not optimized by CBO.
>       |
> |
>                            |
> | Vertex dependency in root stage
>                             |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
>                                                         |
> |
>                                  |
> | Stage-0
>                               |
> |    Fetch Operator
>                                                               |
> |       limit:-1
>                                 |
> |       Stage-1
>                                 |
> |          Reducer 2
>                                         |
> |          File Output Operator [FS_55596]
>                                                       |
> |             compressed:false
>                                                            |
> |             Statistics:Num rows: 6357592675 Data size: 54076899328 Basic
> stats: COMPLETE Column stats: NONE          |
> |
> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
> format:":"org.apache.hadoop.mapred.TextInputFormat","output
> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}  |
> |             Select Operator [SEL_55594]
>                 |
> |
>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>                                    |
> |                Statistics:Num rows: 6357592675 Data size: 54076899328
> Basic stats: COMPLETE Column stats: NONE                       |
> |                PTF Operator [PTF_55593]
>                                    |
> |                   Function definitions:[{"Input
> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
> _col1","name:":"windowingtablefunction","order by:":"_col2"}]      |
> |                   Statistics:Num rows: 6357592675 Data size: 54076899328
> Basic stats: COMPLETE Column stats: NONE           |
> |                   Select Operator [SEL_55592]
>                                           |
> |                   |
>  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
>                                                                   |
> |                   |  Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE                    |
> |                   |<-Map 1 [SIMPLE_EDGE] vectorized
>                                  |
> |                      Reduce Output Operator [RS_55597]
>                              |
> |                         key expressions:m_d_key (type: smallint),
> sb_gu_key (type: bigint), t_ev_st_dt (type: date)         |
> |                         Map-reduce partition columns:m_d_key (type:
> smallint), sb_gu_key (type: bigint)            |
> |                         sort order:+++
>      |
> |                         Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE
>  |
> |                         value expressions:ad_zn_key (type: int), c_dt
> (type: date), e_p_dt (type: date), sq_nbr (type: int)           |
> |                         TableScan [TS_55590]
>                                                           |
> |                            ACID table:true
>                                          |
> |                            alias:tuning_dd_key
>                       |
> |                            Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE                      |
> |
>
>                 |
>
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339
>
> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
> wrote:
>
>>
>> > Long running query :
>>
>> Are you running this on MapReduce or Tez?
>>
>> Please post the output of explain - if you are seeing > 1 shuffle edge in
>> your query while having only one window for OVER(), that might be the
>> reason.
>>
>> OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
>>
>>
>> The multiple PTF operators should have been collapsed by the reduce
>> sink-deduplication.
>>
>> Cheers,
>> Gopal
>>
>>
>>
>

Re: Optimize Hive Query

Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Gopal,

I am using Tez as execution engine.

DAG :

+--------------------------------------------------------+--+
|
                                Explain
                                                                      |
+---------------------------------------------------------+--+
| Plan not optimized by CBO.
      |
|
                         |
| Vertex dependency in root stage
                          |
| Reducer 2 <- Map 1 (SIMPLE_EDGE)
                                                        |
|
                               |
| Stage-0
                            |
|    Fetch Operator
                                                            |
|       limit:-1
                                |
|       Stage-1
                              |
|          Reducer 2
                                        |
|          File Output Operator [FS_55596]
                                                      |
|             compressed:false
                                                           |
|             Statistics:Num rows: 6357592675 Data size: 54076899328 Basic
stats: COMPLETE Column stats: NONE          |
|
table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
format:":"org.apache.hadoop.mapred.TextInputFormat","output
format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}  |
|             Select Operator [SEL_55594]
              |
|
 outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
                                   |
|                Statistics:Num rows: 6357592675 Data size: 54076899328
Basic stats: COMPLETE Column stats: NONE                       |
|                PTF Operator [PTF_55593]
                                 |
|                   Function definitions:[{"Input
definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
_col1","name:":"windowingtablefunction","order by:":"_col2"}]      |
|                   Statistics:Num rows: 6357592675 Data size: 54076899328
Basic stats: COMPLETE Column stats: NONE           |
|                   Select Operator [SEL_55592]
                                        |
|                   |
 outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
                                                                  |
|                   |  Statistics:Num rows: 6357592675 Data size:
54076899328 Basic stats: COMPLETE Column stats: NONE                    |
|                   |<-Map 1 [SIMPLE_EDGE] vectorized
                               |
|                      Reduce Output Operator [RS_55597]
                             |
|                         key expressions:m_d_key (type: smallint),
sb_gu_key (type: bigint), t_ev_st_dt (type: date)         |
|                         Map-reduce partition columns:m_d_key (type:
smallint), sb_gu_key (type: bigint)            |
|                         sort order:+++
     |
|                         Statistics:Num rows: 6357592675 Data size:
54076899328 Basic stats: COMPLETE Column stats: NONE
 |
|                         value expressions:ad_zn_key (type: int), c_dt
(type: date), e_p_dt (type: date), sq_nbr (type: int)           |
|                         TableScan [TS_55590]
                                                          |
|                            ACID table:true
                                         |
|                            alias:tuning_dd_key
                      |
|                            Statistics:Num rows: 6357592675 Data size:
54076899328 Basic stats: COMPLETE Column stats: NONE                      |
|

              |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+

Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
wrote:

>
> > Long running query :
>
> Are you running this on MapReduce or Tez?
>
> Please post the output of explain - if you are seeing > 1 shuffle edge in
> your query while having only one window for OVER(), that might be the
> reason.
>
> OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
>
>
> The multiple PTF operators should have been collapsed by the reduce
> sink-deduplication.
>
> Cheers,
> Gopal
>
>
>

Re: Optimize Hive Query

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> Long running query :

Are you running this on MapReduce or Tez?

Please post the output of explain - if you are seeing > 1 shuffle edge in
your query while having only one window for OVER(), that might be the
reason.

OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
 

The multiple PTF operators should have been collapsed by the reduce
sink-deduplication.

Cheers,
Gopal