You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by "@Sanjiv Singh" <sa...@gmail.com> on 2016/06/23 00:42:18 UTC
Optimize Hive Query
Hi All,
I am running performance issue with below query. Its took 2-3 hours to
complete in hive.
Try tried to partition and bucketing changes on this tables, but without
luck.
Please help me in optimizing this query.
what schema level changes can be done ?
other parameters recommendations ?
*Below are complete details :*
*Hive Table DDL :*
CREATE TABLE `tuning_dd_key`(
> m_d_key smallint,
> sb_gu_key bigint,
> t_ev_st_dt date,
> a_z_key int,
> c_dt date,
> e_p_dt date,
> sq_nbr int);
*Total data size : *
> 250 GB
*Long running query :*
> SELECT
> sb_gu_key, m_d_key, t_ev_st_dt,
> LAG( t_ev_st_dt ) OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY
> t_ev_st_dt ) AS LAG_START_DT,
> a_z_key,
> c_dt,
> e_p_dt,
> sq_nbr,
> CASE WHEN LAG( t_ev_st_dt ) OVER ( PARTITION BY m_d_key , sb_gu_key
> ORDER BY t_ev_st_dt ) IS NULL OR a_z_key <> LAG( a_z_key , 1 , -999 )
> OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt ) THEN 'S'
> ELSE NULL END AS ST_FLAG
> FROM `PRDDB`.tuning_dd_key ;
*More info :*
number of distinct value in column m_d_key : 29
> number of distinct value in column sb_gu_key : 15434343
Regards
Sanjiv Singh
Mob : +091 9990-447-339
Re: Optimize Hive Query
Posted by Jörn Franke <jo...@gmail.com>.
The query looks a little bit too complex from what it is supposed to do. Can you reformulate and restrict the data in a where clause (highest restriction first). Another hint would be to use the Orc format (with indexes and optionally bloom filters) with snappy compression as well as sorting the data on the column you choose to restrict in the where part.
> On 23 Jun 2016, at 02:42, @Sanjiv Singh <sa...@gmail.com> wrote:
>
> Hi All,
>
> I am running performance issue with below query. Its took 2-3 hours to complete in hive.
>
> Try tried to partition and bucketing changes on this tables, but without luck.
>
> Please help me in optimizing this query.
>
> what schema level changes can be done ?
> other parameters recommendations ?
>
>
> Below are complete details :
>
> Hive Table DDL :
>
>> CREATE TABLE `tuning_dd_key`(
>> m_d_key smallint,
>> sb_gu_key bigint,
>> t_ev_st_dt date,
>> a_z_key int,
>> c_dt date,
>> e_p_dt date,
>> sq_nbr int);
>
> Total data size :
>> 250 GB
>
> Long running query :
>
>> SELECT
>> sb_gu_key, m_d_key, t_ev_st_dt,
>> LAG( t_ev_st_dt ) OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt ) AS LAG_START_DT,
>> a_z_key,
>> c_dt,
>> e_p_dt,
>> sq_nbr,
>> CASE WHEN LAG( t_ev_st_dt ) OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt ) IS NULL OR a_z_key <> LAG( a_z_key , 1 , -999 ) OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt ) THEN 'S' ELSE NULL END AS ST_FLAG
>> FROM `PRDDB`.tuning_dd_key ;
>
>
> More info :
>
>> number of distinct value in column m_d_key : 29
>> number of distinct value in column sb_gu_key : 15434343
>
>
>
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
RE: Optimize Hive Query
Posted by "Markovitz, Dudu" <dm...@paypal.com>.
Thanks, I wanted to rule out skewedness over m_d_key,sb_gu_key
Dudu
From: @Sanjiv Singh [mailto:sanjiv.is.on@gmail.com]
Sent: Thursday, June 23, 2016 11:55 PM
To: user@hive.apache.org; Markovitz, Dudu <dm...@paypal.com>; sanjiv singh (ME) <sa...@gmail.com>
Subject: Re: Optimize Hive Query
Hi Dudu,
find below query response.
Query :
select m_d_key,sb_gu_key ,count (*) as cnt
from tuning_dd_key
group by m_d_key,sb_gu_key
order by cnt desc
limit 100;
Output :
16 9042668 1361
16 8063808 1361
16 8569864 1361
16 8909889 1361
16 9864785 1361
16 8269717 1361
16 10180282 1361
16 8913062 1361
16 8418183 1361
16 8003791 1361
16 10201084 1361
16 8470942 1361
16 9234223 1361
16 8330286 1361
16 12966192 1361
16 9008767 1361
16 8902598 1361
16 9878885 1361
16 8741214 1361
16 8732856 1361
16 9692696 1361
16 8072042 1361
16 8802681 1361
16 14087558 1361
16 9027186 1361
16 9587342 1361
16 9699202 1361
16 8542344 1361
16 9680544 1361
16 8903570 1361
16 9542542 1361
4 3576041 1361
16 9126774 1361
16 9957826 1361
16 8345331 1361
16 9756883 1361
16 9399702 1361
18 9403442 1361
16 9746288 1361
16 9435202 1361
16 9069894 1361
16 9920826 1361
16 8765877 1361
16 8813448 1361
18 9635460 1361
16 8463714 1361
16 8166965 1361
16 9597903 1361
16 9432100 1361
16 8847857 1361
16 13953068 1361
16 8744451 1361
16 8089463 1361
16 9674902 1361
16 8418200 1361
16 8028509 1361
16 9243086 1361
16 8892184 1361
16 8801594 1361
16 9849079 1361
16 8556753 1361
16 8979232 1361
16 8081946 1361
16 8724046 1361
16 9984434 1361
16 8651659 1361
16 9116866 1361
1 17870072 1361
16 8860630 1361
16 9888398 1361
16 9463782 1361
16 9602127 1361
16 9353325 1361
16 7991816 1361
16 9920420 1361
16 8497624 1361
16 8987980 1361
16 8234751 1361
16 8389490 1361
18 9975575 1361
16 8026536 1361
16 8790618 1361
16 9846791 1361
16 8363833 1361
16 9025525 1361
16 9241297 1361
16 8712487 1361
16 8692003 1361
16 9316523 1361
16 8124338 1361
16 9941027 1361
16 9547973 1361
16 8007742 1361
16 8418425 1361
16 8944940 1361
16 8890232 1361
16 9248984 1361
16 9784461 1361
16 9009374 1361
16 8395861 1361
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Thu, Jun 23, 2016 at 4:01 AM, Markovitz, Dudu <dm...@paypal.com>> wrote:
Could you also add the results of the following query?
Thanks
Dudu
select m_d_key
,sb_gu_key
,count (*) as cnt
from tuning_dd_key
group by m_d_key
,sb_gu_key
order by cnt desc
limit 100
;
-----Original Message-----
From: Gopal Vijayaraghavan [mailto:gopal@hortonworks.com<ma...@hortonworks.com>] On Behalf Of Gopal Vijayaraghavan
Sent: Thursday, June 23, 2016 9:45 AM
To: user@hive.apache.org<ma...@hive.apache.org>
Subject: Re: Optimize Hive Query
> Long running query :
Are you running this on MapReduce or Tez?
Please post the output of explain - if you are seeing > 1 shuffle edge in your query while having only one window for OVER(), that might be the reason.
OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
The multiple PTF operators should have been collapsed by the reduce sink-deduplication.
Cheers,
Gopal
Re: Optimize Hive Query
Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Dudu,
find below query response.
Query :
> select m_d_key,sb_gu_key ,count (*) as cnt
> from tuning_dd_key
> group by m_d_key,sb_gu_key
> order by cnt desc
> limit 100;
Output :
16 9042668 1361
> 16 8063808 1361
> 16 8569864 1361
> 16 8909889 1361
> 16 9864785 1361
> 16 8269717 1361
> 16 10180282 1361
> 16 8913062 1361
> 16 8418183 1361
> 16 8003791 1361
> 16 10201084 1361
> 16 8470942 1361
> 16 9234223 1361
> 16 8330286 1361
> 16 12966192 1361
> 16 9008767 1361
> 16 8902598 1361
> 16 9878885 1361
> 16 8741214 1361
> 16 8732856 1361
> 16 9692696 1361
> 16 8072042 1361
> 16 8802681 1361
> 16 14087558 1361
> 16 9027186 1361
> 16 9587342 1361
> 16 9699202 1361
> 16 8542344 1361
> 16 9680544 1361
> 16 8903570 1361
> 16 9542542 1361
> 4 3576041 1361
> 16 9126774 1361
> 16 9957826 1361
> 16 8345331 1361
> 16 9756883 1361
> 16 9399702 1361
> 18 9403442 1361
> 16 9746288 1361
> 16 9435202 1361
> 16 9069894 1361
> 16 9920826 1361
> 16 8765877 1361
> 16 8813448 1361
> 18 9635460 1361
> 16 8463714 1361
> 16 8166965 1361
> 16 9597903 1361
> 16 9432100 1361
> 16 8847857 1361
> 16 13953068 1361
> 16 8744451 1361
> 16 8089463 1361
> 16 9674902 1361
> 16 8418200 1361
> 16 8028509 1361
> 16 9243086 1361
> 16 8892184 1361
> 16 8801594 1361
> 16 9849079 1361
> 16 8556753 1361
> 16 8979232 1361
> 16 8081946 1361
> 16 8724046 1361
> 16 9984434 1361
> 16 8651659 1361
> 16 9116866 1361
> 1 17870072 1361
> 16 8860630 1361
> 16 9888398 1361
> 16 9463782 1361
> 16 9602127 1361
> 16 9353325 1361
> 16 7991816 1361
> 16 9920420 1361
> 16 8497624 1361
> 16 8987980 1361
> 16 8234751 1361
> 16 8389490 1361
> 18 9975575 1361
> 16 8026536 1361
> 16 8790618 1361
> 16 9846791 1361
> 16 8363833 1361
> 16 9025525 1361
> 16 9241297 1361
> 16 8712487 1361
> 16 8692003 1361
> 16 9316523 1361
> 16 8124338 1361
> 16 9941027 1361
> 16 9547973 1361
> 16 8007742 1361
> 16 8418425 1361
> 16 8944940 1361
> 16 8890232 1361
> 16 9248984 1361
> 16 9784461 1361
> 16 9009374 1361
> 16 8395861 1361
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Thu, Jun 23, 2016 at 4:01 AM, Markovitz, Dudu <dm...@paypal.com>
wrote:
> Could you also add the results of the following query?
>
>
>
> Thanks
>
>
>
> Dudu
>
>
>
>
>
> select m_d_key
>
> ,sb_gu_key
>
> ,count (*) as cnt
>
>
>
> from tuning_dd_key
>
>
>
> group by m_d_key
>
> ,sb_gu_key
>
>
>
> order by cnt desc
>
>
>
> limit 100
>
> ;
>
>
>
> -----Original Message-----
> From: Gopal Vijayaraghavan [mailto:gopal@hortonworks.com] On Behalf Of
> Gopal Vijayaraghavan
> Sent: Thursday, June 23, 2016 9:45 AM
> To: user@hive.apache.org
> Subject: Re: Optimize Hive Query
>
>
>
>
>
> > Long running query :
>
>
>
> Are you running this on MapReduce or Tez?
>
>
>
> Please post the output of explain - if you are seeing > 1 shuffle edge in
> your query while having only one window for OVER(), that might be the
> reason.
>
>
>
> OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
>
>
>
> The multiple PTF operators should have been collapsed by the reduce
> sink-deduplication.
>
>
>
> Cheers,
>
> Gopal
>
>
>
>
>
RE: Optimize Hive Query
Posted by "Markovitz, Dudu" <dm...@paypal.com>.
Could you also add the results of the following query?
Thanks
Dudu
select m_d_key
,sb_gu_key
,count (*) as cnt
from tuning_dd_key
group by m_d_key
,sb_gu_key
order by cnt desc
limit 100
;
-----Original Message-----
From: Gopal Vijayaraghavan [mailto:gopal@hortonworks.com] On Behalf Of Gopal Vijayaraghavan
Sent: Thursday, June 23, 2016 9:45 AM
To: user@hive.apache.org
Subject: Re: Optimize Hive Query
> Long running query :
Are you running this on MapReduce or Tez?
Please post the output of explain - if you are seeing > 1 shuffle edge in your query while having only one window for OVER(), that might be the reason.
OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
The multiple PTF operators should have been collapsed by the reduce sink-deduplication.
Cheers,
Gopal
Re: Optimize Hive Query
Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Sanjiv,
Normally when it comes to this, I will try to find the section of the code
which cause the largest lag
SELECT
> sb_gu_key, m_d_key, t_ev_st_dt,
> LAG( t_ev_st_dt ) OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY
> t_ev_st_dt ) AS LAG_START_DT,
> a_z_key,
> c_dt,
> e_p_dt,
> sq_nbr,
> CASE WHEN LAG( t_ev_st_dt ) OVER ( PARTITION BY m_d_key , sb_gu_key
> ORDER BY t_ev_st_dt ) IS NULL OR a_z_key <> LAG( a_z_key , 1 , -999 )
> OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt ) THEN 'S'
> ELSE NULL END AS ST_FLAG
> FROM `PRDDB`.tuning_dd_key ;
From the above query which part is the most time consuming?
For example is the LAG function the most consuming section that takers the
lion's hare of the query?
Just execute the code and comment out LAG(t_ev_st_td) ..... first
I suspect
CASE WHEN LAG( t_ev_st_dt ) OVER ( PARTITION BY m_d_key , sb_gu_key
ORDER BY t_ev_st_dt ) IS NULL OR a_z_key <> LAG( a_z_key , 1 , -999 )
OVER ( PARTITION BY
is the other possible candidate as well with that OR than can cause the
issue
For example you can do the following to measure the timing
select from_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') AS
StartTime;
SELECT COUNT(1) FROM PRDDB`.tuning_dd_key
WHERE (LAG( t_ev_st_dt ) OVER ( PARTITION BY m_d_key , sb_gu_key ORDER
BY t_ev_st_dt ) IS NULL)
OR
a_z_key <> LAG( a_z_key , 1 , -999 ) OVER ( PARTITION BY m_d_key ,
sb_gu_key ORDER BY t_ev_st_dt )
select from_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') AS EndTime;
HTH
Dr Mich Talebzadeh
LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
http://talebzadehmich.wordpress.com
*Disclaimer:* Any and all responsibility for any loss, damage or
destruction of data or any other property which may arise from relying on
this email's technical content is explicitly disclaimed. The author will in
no case be liable for any monetary damages arising from such loss, damage
or destruction.
On 24 June 2016 at 22:34, @Sanjiv Singh <sa...@gmail.com> wrote:
> Hi Vijay,
>
> Please help me on this....let me know you need other info.
>
>
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
>
> On Thu, Jun 23, 2016 at 12:41 PM, @Sanjiv Singh <sa...@gmail.com>
> wrote:
>
>> Hi Gopal,
>>
>> I am using Tez as execution engine.
>>
>> DAG :
>>
>> +--------------------------------------------------------+--+
>> |
>> Explain
>> |
>> +---------------------------------------------------------+--+
>> | Plan not optimized by CBO.
>> |
>> |
>> |
>> | Vertex dependency in root stage
>> |
>> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
>> |
>> |
>> |
>> | Stage-0
>> |
>> | Fetch Operator
>> |
>> | limit:-1
>> |
>> | Stage-1
>> |
>> | Reducer 2
>> |
>> | File Output Operator [FS_55596]
>> |
>> | compressed:false
>> |
>> | Statistics:Num rows: 6357592675 Data size: 54076899328
>> Basic stats: COMPLETE Column stats: NONE |
>> |
>> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
>> format:":"org.apache.hadoop.mapred.TextInputFormat","output
>> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"} |
>> | Select Operator [SEL_55594]
>> |
>> |
>> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>> |
>> | Statistics:Num rows: 6357592675 Data size: 54076899328
>> Basic stats: COMPLETE Column stats: NONE |
>> | PTF Operator [PTF_55593]
>> |
>> | Function definitions:[{"Input
>> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
>> _col1","name:":"windowingtablefunction","order by:":"_col2"}] |
>> | Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE |
>> | Select Operator [SEL_55592]
>> |
>> | |
>> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
>> |
>> | | Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE |
>> | |<-Map 1 [SIMPLE_EDGE] vectorized
>> |
>> | Reduce Output Operator [RS_55597]
>> |
>> | key expressions:m_d_key (type: smallint),
>> sb_gu_key (type: bigint), t_ev_st_dt (type: date) |
>> | Map-reduce partition columns:m_d_key (type:
>> smallint), sb_gu_key (type: bigint) |
>> | sort order:+++
>> |
>> | Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE
>> |
>> | value expressions:ad_zn_key (type: int), c_dt
>> (type: date), e_p_dt (type: date), sq_nbr (type: int) |
>> | TableScan [TS_55590]
>> |
>> | ACID table:true
>> |
>> | alias:tuning_dd_key
>> |
>> | Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE |
>> |
>>
>> |
>>
>> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>>
>> Regards
>> Sanjiv Singh
>> Mob : +091 9990-447-339
>>
>> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
>> wrote:
>>
>>>
>>> > Long running query :
>>>
>>> Are you running this on MapReduce or Tez?
>>>
>>> Please post the output of explain - if you are seeing > 1 shuffle edge in
>>> your query while having only one window for OVER(), that might be the
>>> reason.
>>>
>>> OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
>>>
>>>
>>> The multiple PTF operators should have been collapsed by the reduce
>>> sink-deduplication.
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>
>
Re: Optimize Hive Query
Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi,
Curious to see if this issue been resolved (performance) after compaction?
Thanks
Dr Mich Talebzadeh
LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
http://talebzadehmich.wordpress.com
*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
On 26 June 2016 at 21:11, @Sanjiv Singh <sa...@gmail.com> wrote:
> Thanks Gopal for your inputs ....For now I have create NON ACID table and
> loaded data ....see below from logs proper group splits happening .
>
> 2016-06-25 12:52:00,160 [INFO] [InputInitializer {Map 1} #0]
> |tez.HiveSplitGenerator|: Number of grouped splits: 512
>
>
> On compaction issue , Compaction enabled with two workers. why compaction
> not happened ? will check metastore logs.
>
> I have too many ACID tables on hive and how many worker should be
> configured ? currently it is 2.
>
> Thanks a lot once again.
>
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
>
> On Fri, Jun 24, 2016 at 9:14 PM, @Sanjiv Singh <sa...@gmail.com>
> wrote:
>
>> Thanks Gopal for your inputs. Let me run compaction explicitly on table
>> then see how query works.
>>
>>
>>
>> Let
>>
>> Regards
>> Sanjiv Singh
>> Mob : +091 9990-447-339
>>
>> On Fri, Jun 24, 2016 at 7:53 PM, Gopal Vijayaraghavan <go...@apache.org>
>> wrote:
>>
>>>
>>> > Yes for this tables, ACID enabled. it has only 256 files for each
>>> >buckets. these are create only when data initially loaded in this table.
>>>
>>> Yes, the initial load goes in as an insert DELTA too - that requires
>>> another compaction to move into base files.
>>>
>>> The fact that they haven't been automatically compacted yet, suggests
>>> that
>>> the compactor isn't working for some reason (check hive metastore logs).
>>>
>>> > One thing that I am not able to understand that its is running with 1
>>> >MAPPER.
>>>
>>> The size of deltas shows up as 0, till the compaction goes through - in
>>> Hive2, it will be -1 which will be correctly interpreted as "unknown
>>> size".
>>>
>>>
>>> > | -rw-r--r-- 3 H56473 hdfs 215973009 2016-06-23 17:38
>>>
>>> >/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
>>> >0000 |
>>>
>>> Clearly an issue due to the lack of compaction - I see a single delta
>>> with
>>> 255 buckets and no base_* files at all.
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
Re: Optimize Hive Query
Posted by Eugene Koifman <ek...@hortonworks.com>.
if you have many acid tables you almost certainly want more than 2 workers. If you have 2 workers (and a single metastore instance) you can run at most 2 compaction jobs at a time. Unless the tables are very small, compaction may fall behind if it's configured to run too serially.
In order for compactions to run automatically, at a minimum you have to have hive.compactor.initiator.on=true for one standalone metastore instance.
hive.compactor.delta.num.threshold determines when compaction is triggered for a given table/partition.
There is more details in https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration
Look for log messages in metastore.log form Initiator/Cleaner classes. If you don't see any, it must be disabled.
SHOW COMPACTIONS is a command you can run at CLI to see if there are any currently running.
you can also use ALTER TABLE (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionCompact) to launch compaction on demand.
could you send results of dfs -ls /apps/hive/warehouse/PRDDB.db/tuning_dd_key
thanks,
Eugene
From: "@Sanjiv Singh" <sa...@gmail.com>>
Reply-To: "user@hive.apache.org<ma...@hive.apache.org>" <us...@hive.apache.org>>, "sanjiv.is.on@gmail.com<ma...@gmail.com>" <sa...@gmail.com>>
Date: Sunday, June 26, 2016 at 1:11 PM
To: Gopal Vijayaraghavan <go...@apache.org>>
Cc: "user@hive.apache.org<ma...@hive.apache.org>" <us...@hive.apache.org>>
Subject: Re: Optimize Hive Query
Thanks Gopal for your inputs ....For now I have create NON ACID table and loaded data ....see below from logs proper group splits happening .
2016-06-25 12:52:00,160 [INFO] [InputInitializer {Map 1} #0] |tez.HiveSplitGenerator|: Number of grouped splits: 512
On compaction issue , Compaction enabled with two workers. why compaction not happened ? will check metastore logs.
I have too many ACID tables on hive and how many worker should be configured ? currently it is 2.
Thanks a lot once again.
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Fri, Jun 24, 2016 at 9:14 PM, @Sanjiv Singh <sa...@gmail.com>> wrote:
Thanks Gopal for your inputs. Let me run compaction explicitly on table then see how query works.
Let
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Fri, Jun 24, 2016 at 7:53 PM, Gopal Vijayaraghavan <go...@apache.org>> wrote:
> Yes for this tables, ACID enabled. it has only 256 files for each
>buckets. these are create only when data initially loaded in this table.
Yes, the initial load goes in as an insert DELTA too - that requires
another compaction to move into base files.
The fact that they haven't been automatically compacted yet, suggests that
the compactor isn't working for some reason (check hive metastore logs).
> One thing that I am not able to understand that its is running with 1
>MAPPER.
The size of deltas shows up as 0, till the compaction goes through - in
Hive2, it will be -1 which will be correctly interpreted as "unknown size".
> | -rw-r--r-- 3 H56473 hdfs 215973009 2016-06-23 17:38
>/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
>0000 |
Clearly an issue due to the lack of compaction - I see a single delta with
255 buckets and no base_* files at all.
Cheers,
Gopal
Re: Optimize Hive Query
Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Thanks Gopal for your inputs ....For now I have create NON ACID table and
loaded data ....see below from logs proper group splits happening .
2016-06-25 12:52:00,160 [INFO] [InputInitializer {Map 1} #0]
|tez.HiveSplitGenerator|: Number of grouped splits: 512
On compaction issue , Compaction enabled with two workers. why compaction
not happened ? will check metastore logs.
I have too many ACID tables on hive and how many worker should be
configured ? currently it is 2.
Thanks a lot once again.
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Fri, Jun 24, 2016 at 9:14 PM, @Sanjiv Singh <sa...@gmail.com>
wrote:
> Thanks Gopal for your inputs. Let me run compaction explicitly on table
> then see how query works.
>
>
>
> Let
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
>
> On Fri, Jun 24, 2016 at 7:53 PM, Gopal Vijayaraghavan <go...@apache.org>
> wrote:
>
>>
>> > Yes for this tables, ACID enabled. it has only 256 files for each
>> >buckets. these are create only when data initially loaded in this table.
>>
>> Yes, the initial load goes in as an insert DELTA too - that requires
>> another compaction to move into base files.
>>
>> The fact that they haven't been automatically compacted yet, suggests that
>> the compactor isn't working for some reason (check hive metastore logs).
>>
>> > One thing that I am not able to understand that its is running with 1
>> >MAPPER.
>>
>> The size of deltas shows up as 0, till the compaction goes through - in
>> Hive2, it will be -1 which will be correctly interpreted as "unknown
>> size".
>>
>>
>> > | -rw-r--r-- 3 H56473 hdfs 215973009 2016-06-23 17:38
>>
>> >/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
>> >0000 |
>>
>> Clearly an issue due to the lack of compaction - I see a single delta with
>> 255 buckets and no base_* files at all.
>>
>> Cheers,
>> Gopal
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
Re: Optimize Hive Query
Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Thanks Gopal for your inputs. Let me run compaction explicitly on table
then see how query works.
Let
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Fri, Jun 24, 2016 at 7:53 PM, Gopal Vijayaraghavan <go...@apache.org>
wrote:
>
> > Yes for this tables, ACID enabled. it has only 256 files for each
> >buckets. these are create only when data initially loaded in this table.
>
> Yes, the initial load goes in as an insert DELTA too - that requires
> another compaction to move into base files.
>
> The fact that they haven't been automatically compacted yet, suggests that
> the compactor isn't working for some reason (check hive metastore logs).
>
> > One thing that I am not able to understand that its is running with 1
> >MAPPER.
>
> The size of deltas shows up as 0, till the compaction goes through - in
> Hive2, it will be -1 which will be correctly interpreted as "unknown size".
>
>
> > | -rw-r--r-- 3 H56473 hdfs 215973009 2016-06-23 17:38
> >/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
> >0000 |
>
> Clearly an issue due to the lack of compaction - I see a single delta with
> 255 buckets and no base_* files at all.
>
> Cheers,
> Gopal
>
>
>
>
>
>
>
>
>
Re: Optimize Hive Query
Posted by Gopal Vijayaraghavan <go...@apache.org>.
> Yes for this tables, ACID enabled. it has only 256 files for each
>buckets. these are create only when data initially loaded in this table.
Yes, the initial load goes in as an insert DELTA too - that requires
another compaction to move into base files.
The fact that they haven't been automatically compacted yet, suggests that
the compactor isn't working for some reason (check hive metastore logs).
> One thing that I am not able to understand that its is running with 1
>MAPPER.
The size of deltas shows up as 0, till the compaction goes through - in
Hive2, it will be -1 which will be correctly interpreted as "unknown size".
> | -rw-r--r-- 3 H56473 hdfs 215973009 2016-06-23 17:38
>/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_0
>0000 |
Clearly an issue due to the lack of compaction - I see a single delta with
255 buckets and no base_* files at all.
Cheers,
Gopal
Re: Optimize Hive Query
Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Vijay,
Yes for this tables, ACID enabled. it has only 256 files for each buckets.
these are create only when data initially loaded in this table.
There is not transaction done after that.
I see that all file for buckets are also in equal size.
One thing that I am not able to understand that its is running with 1
MAPPER.
I will try options you provided , will get insight after that.
Please find below response of command "dfs -ls <path>"
dfs -ls /apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|
DFS Output
|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| Found 256 items
|
| -rw-r--r-- 3 H56473 hdfs 215973009 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00000
|
| -rw-r--r-- 3 H56473 hdfs 215253561 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00001
|
| -rw-r--r-- 3 H56473 hdfs 215470193 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00002
|
| -rw-r--r-- 3 H56473 hdfs 215455699 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00003
|
| -rw-r--r-- 3 H56473 hdfs 215620935 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00004
|
| -rw-r--r-- 3 H56473 hdfs 215704945 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00005
|
| -rw-r--r-- 3 H56473 hdfs 215842552 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00006
|
| -rw-r--r-- 3 H56473 hdfs 215808371 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00007
|
| -rw-r--r-- 3 H56473 hdfs 215969399 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00008
|
| -rw-r--r-- 3 H56473 hdfs 216130285 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00009
|
| -rw-r--r-- 3 H56473 hdfs 216168481 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00010
|
| -rw-r--r-- 3 H56473 hdfs 216152569 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00011
|
| -rw-r--r-- 3 H56473 hdfs 216088033 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00012
|
| -rw-r--r-- 3 H56473 hdfs 216264972 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00013
|
| -rw-r--r-- 3 H56473 hdfs 216267772 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00014
|
| -rw-r--r-- 3 H56473 hdfs 216385428 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00015
|
| -rw-r--r-- 3 H56473 hdfs 216399617 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00016
|
| -rw-r--r-- 3 H56473 hdfs 216544838 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00017
|
| -rw-r--r-- 3 H56473 hdfs 216542321 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00018
|
| -rw-r--r-- 3 H56473 hdfs 216520174 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00019
|
| -rw-r--r-- 3 H56473 hdfs 216513782 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00020
|
| -rw-r--r-- 3 H56473 hdfs 216546732 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00021
|
| -rw-r--r-- 3 H56473 hdfs 216474592 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00022
|
| -rw-r--r-- 3 H56473 hdfs 216503547 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00023
|
| -rw-r--r-- 3 H56473 hdfs 216590266 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00024
|
| -rw-r--r-- 3 H56473 hdfs 216692713 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00025
|
| -rw-r--r-- 3 H56473 hdfs 216593362 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00026
|
| -rw-r--r-- 3 H56473 hdfs 216736077 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00027
|
| -rw-r--r-- 3 H56473 hdfs 216691738 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00028
|
| -rw-r--r-- 3 H56473 hdfs 216751031 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00029
|
| -rw-r--r-- 3 H56473 hdfs 216748097 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00030
|
| -rw-r--r-- 3 H56473 hdfs 216561861 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00031
|
| -rw-r--r-- 3 H56473 hdfs 216827688 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00032
|
| -rw-r--r-- 3 H56473 hdfs 216785788 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00033
|
| -rw-r--r-- 3 H56473 hdfs 216824525 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00034
|
| -rw-r--r-- 3 H56473 hdfs 216744531 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00035
|
| -rw-r--r-- 3 H56473 hdfs 216801179 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00036
|
| -rw-r--r-- 3 H56473 hdfs 216843923 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00037
|
| -rw-r--r-- 3 H56473 hdfs 216841512 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00038
|
| -rw-r--r-- 3 H56473 hdfs 216786403 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00039
|
| -rw-r--r-- 3 H56473 hdfs 216828047 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00040
|
| -rw-r--r-- 3 H56473 hdfs 216825787 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00041
|
| -rw-r--r-- 3 H56473 hdfs 216876891 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00042
|
| -rw-r--r-- 3 H56473 hdfs 216955603 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00043
|
| -rw-r--r-- 3 H56473 hdfs 216875485 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00044
|
| -rw-r--r-- 3 H56473 hdfs 216935044 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00045
|
| -rw-r--r-- 3 H56473 hdfs 216819647 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00046
|
| -rw-r--r-- 3 H56473 hdfs 216951906 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00047
|
| -rw-r--r-- 3 H56473 hdfs 216992043 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00048
|
| -rw-r--r-- 3 H56473 hdfs 216893267 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00049
|
| -rw-r--r-- 3 H56473 hdfs 216887568 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00050
|
| -rw-r--r-- 3 H56473 hdfs 216932512 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00051
|
| -rw-r--r-- 3 H56473 hdfs 216995951 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00052
|
| -rw-r--r-- 3 H56473 hdfs 216980201 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00053
|
| -rw-r--r-- 3 H56473 hdfs 217022321 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00054
|
| -rw-r--r-- 3 H56473 hdfs 217061082 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00055
|
| -rw-r--r-- 3 H56473 hdfs 216915085 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00056
|
| -rw-r--r-- 3 H56473 hdfs 216973899 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00057
|
| -rw-r--r-- 3 H56473 hdfs 216947693 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00058
|
| -rw-r--r-- 3 H56473 hdfs 216978410 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00059
|
| -rw-r--r-- 3 H56473 hdfs 216966029 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00060
|
| -rw-r--r-- 3 H56473 hdfs 217062158 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00061
|
| -rw-r--r-- 3 H56473 hdfs 217012664 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00062
|
| -rw-r--r-- 3 H56473 hdfs 217126561 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00063
|
| -rw-r--r-- 3 H56473 hdfs 217066999 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00064
|
| -rw-r--r-- 3 H56473 hdfs 217052347 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00065
|
| -rw-r--r-- 3 H56473 hdfs 217054965 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00066
|
| -rw-r--r-- 3 H56473 hdfs 217155817 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00067
|
| -rw-r--r-- 3 H56473 hdfs 217163685 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00068
|
| -rw-r--r-- 3 H56473 hdfs 217075833 2016-06-23 17:44
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00069
|
| -rw-r--r-- 3 H56473 hdfs 217110271 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00070
|
| -rw-r--r-- 3 H56473 hdfs 217099877 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00071
|
| -rw-r--r-- 3 H56473 hdfs 217015698 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00072
|
| -rw-r--r-- 3 H56473 hdfs 217077120 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00073
|
| -rw-r--r-- 3 H56473 hdfs 216985477 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00074
|
| -rw-r--r-- 3 H56473 hdfs 217026304 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00075
|
| -rw-r--r-- 3 H56473 hdfs 217071942 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00076
|
| -rw-r--r-- 3 H56473 hdfs 217043625 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00077
|
| -rw-r--r-- 3 H56473 hdfs 217065652 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00078
|
| -rw-r--r-- 3 H56473 hdfs 217016300 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00079
|
| -rw-r--r-- 3 H56473 hdfs 216937153 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00080
|
| -rw-r--r-- 3 H56473 hdfs 217133015 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00081
|
| -rw-r--r-- 3 H56473 hdfs 217200238 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00082
|
| -rw-r--r-- 3 H56473 hdfs 217054759 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00083
|
| -rw-r--r-- 3 H56473 hdfs 217009417 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00084
|
| -rw-r--r-- 3 H56473 hdfs 217120440 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00085
|
| -rw-r--r-- 3 H56473 hdfs 216987824 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00086
|
| -rw-r--r-- 3 H56473 hdfs 217053581 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00087
|
| -rw-r--r-- 3 H56473 hdfs 217139935 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00088
|
| -rw-r--r-- 3 H56473 hdfs 217133938 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00089
|
| -rw-r--r-- 3 H56473 hdfs 217082768 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00090
|
| -rw-r--r-- 3 H56473 hdfs 217163033 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00091
|
| -rw-r--r-- 3 H56473 hdfs 217160879 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00092
|
| -rw-r--r-- 3 H56473 hdfs 217118996 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00093
|
| -rw-r--r-- 3 H56473 hdfs 217075658 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00094
|
| -rw-r--r-- 3 H56473 hdfs 217155885 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00095
|
| -rw-r--r-- 3 H56473 hdfs 217073602 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00096
|
| -rw-r--r-- 3 H56473 hdfs 217048454 2016-06-23 17:43
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00097
|
| -rw-r--r-- 3 H56473 hdfs 217073480 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00098
|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|
DFS Output
|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| -rw-r--r-- 3 H56473 hdfs 217008473 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00099
|
| -rw-r--r-- 3 H56473 hdfs 217055083 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00100
|
| -rw-r--r-- 3 H56473 hdfs 217170178 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00101
|
| -rw-r--r-- 3 H56473 hdfs 217231866 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00102
|
| -rw-r--r-- 3 H56473 hdfs 217065415 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00103
|
| -rw-r--r-- 3 H56473 hdfs 217082898 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00104
|
| -rw-r--r-- 3 H56473 hdfs 217091916 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00105
|
| -rw-r--r-- 3 H56473 hdfs 217104784 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00106
|
| -rw-r--r-- 3 H56473 hdfs 217230988 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00107
|
| -rw-r--r-- 3 H56473 hdfs 217165086 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00108
|
| -rw-r--r-- 3 H56473 hdfs 217182369 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00109
|
| -rw-r--r-- 3 H56473 hdfs 217118862 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00110
|
| -rw-r--r-- 3 H56473 hdfs 217180020 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00111
|
| -rw-r--r-- 3 H56473 hdfs 217208232 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00112
|
| -rw-r--r-- 3 H56473 hdfs 217199476 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00113
|
| -rw-r--r-- 3 H56473 hdfs 217059597 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00114
|
| -rw-r--r-- 3 H56473 hdfs 217198679 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00115
|
| -rw-r--r-- 3 H56473 hdfs 217091012 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00116
|
| -rw-r--r-- 3 H56473 hdfs 217190091 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00117
|
| -rw-r--r-- 3 H56473 hdfs 217212151 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00118
|
| -rw-r--r-- 3 H56473 hdfs 217177634 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00119
|
| -rw-r--r-- 3 H56473 hdfs 217173936 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00120
|
| -rw-r--r-- 3 H56473 hdfs 217091743 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00121
|
| -rw-r--r-- 3 H56473 hdfs 217141298 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00122
|
| -rw-r--r-- 3 H56473 hdfs 217132204 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00123
|
| -rw-r--r-- 3 H56473 hdfs 217246598 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00124
|
| -rw-r--r-- 3 H56473 hdfs 217151074 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00125
|
| -rw-r--r-- 3 H56473 hdfs 217159300 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00126
|
| -rw-r--r-- 3 H56473 hdfs 217065983 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00127
|
| -rw-r--r-- 3 H56473 hdfs 217193747 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00128
|
| -rw-r--r-- 3 H56473 hdfs 217228013 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00129
|
| -rw-r--r-- 3 H56473 hdfs 217375225 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00130
|
| -rw-r--r-- 3 H56473 hdfs 217335197 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00131
|
| -rw-r--r-- 3 H56473 hdfs 217182537 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00132
|
| -rw-r--r-- 3 H56473 hdfs 217229480 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00133
|
| -rw-r--r-- 3 H56473 hdfs 217283215 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00134
|
| -rw-r--r-- 3 H56473 hdfs 217372431 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00135
|
| -rw-r--r-- 3 H56473 hdfs 217234042 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00136
|
| -rw-r--r-- 3 H56473 hdfs 217176898 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00137
|
| -rw-r--r-- 3 H56473 hdfs 217301958 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00138
|
| -rw-r--r-- 3 H56473 hdfs 217451069 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00139
|
| -rw-r--r-- 3 H56473 hdfs 217314824 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00140
|
| -rw-r--r-- 3 H56473 hdfs 217236714 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00141
|
| -rw-r--r-- 3 H56473 hdfs 217198689 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00142
|
| -rw-r--r-- 3 H56473 hdfs 217300105 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00143
|
| -rw-r--r-- 3 H56473 hdfs 217322560 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00144
|
| -rw-r--r-- 3 H56473 hdfs 217251308 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00145
|
| -rw-r--r-- 3 H56473 hdfs 217268420 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00146
|
| -rw-r--r-- 3 H56473 hdfs 217250943 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00147
|
| -rw-r--r-- 3 H56473 hdfs 217317036 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00148
|
| -rw-r--r-- 3 H56473 hdfs 217249339 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00149
|
| -rw-r--r-- 3 H56473 hdfs 217199279 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00150
|
| -rw-r--r-- 3 H56473 hdfs 217201217 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00151
|
| -rw-r--r-- 3 H56473 hdfs 217254531 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00152
|
| -rw-r--r-- 3 H56473 hdfs 217325584 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00153
|
| -rw-r--r-- 3 H56473 hdfs 217361150 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00154
|
| -rw-r--r-- 3 H56473 hdfs 217108797 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00155
|
| -rw-r--r-- 3 H56473 hdfs 217250257 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00156
|
| -rw-r--r-- 3 H56473 hdfs 217207462 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00157
|
| -rw-r--r-- 3 H56473 hdfs 217283114 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00158
|
| -rw-r--r-- 3 H56473 hdfs 217336632 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00159
|
| -rw-r--r-- 3 H56473 hdfs 217240497 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00160
|
| -rw-r--r-- 3 H56473 hdfs 217253770 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00161
|
| -rw-r--r-- 3 H56473 hdfs 217304981 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00162
|
| -rw-r--r-- 3 H56473 hdfs 217337919 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00163
|
| -rw-r--r-- 3 H56473 hdfs 217246615 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00164
|
| -rw-r--r-- 3 H56473 hdfs 217175520 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00165
|
| -rw-r--r-- 3 H56473 hdfs 217196985 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00166
|
| -rw-r--r-- 3 H56473 hdfs 217251912 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00167
|
| -rw-r--r-- 3 H56473 hdfs 217220392 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00168
|
| -rw-r--r-- 3 H56473 hdfs 217197378 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00169
|
| -rw-r--r-- 3 H56473 hdfs 217348762 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00170
|
| -rw-r--r-- 3 H56473 hdfs 217328016 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00171
|
| -rw-r--r-- 3 H56473 hdfs 217142176 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00172
|
| -rw-r--r-- 3 H56473 hdfs 217184677 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00173
|
| -rw-r--r-- 3 H56473 hdfs 217137840 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00174
|
| -rw-r--r-- 3 H56473 hdfs 217256754 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00175
|
| -rw-r--r-- 3 H56473 hdfs 217220424 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00176
|
| -rw-r--r-- 3 H56473 hdfs 217185562 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00177
|
| -rw-r--r-- 3 H56473 hdfs 217194575 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00178
|
| -rw-r--r-- 3 H56473 hdfs 217188097 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00179
|
| -rw-r--r-- 3 H56473 hdfs 217158674 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00180
|
| -rw-r--r-- 3 H56473 hdfs 217222821 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00181
|
| -rw-r--r-- 3 H56473 hdfs 217170798 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00182
|
| -rw-r--r-- 3 H56473 hdfs 217112615 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00183
|
| -rw-r--r-- 3 H56473 hdfs 217247444 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00184
|
| -rw-r--r-- 3 H56473 hdfs 217136592 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00185
|
| -rw-r--r-- 3 H56473 hdfs 217142003 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00186
|
| -rw-r--r-- 3 H56473 hdfs 217125721 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00187
|
| -rw-r--r-- 3 H56473 hdfs 217223243 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00188
|
| -rw-r--r-- 3 H56473 hdfs 217130702 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00189
|
| -rw-r--r-- 3 H56473 hdfs 217183947 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00190
|
| -rw-r--r-- 3 H56473 hdfs 216992034 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00191
|
| -rw-r--r-- 3 H56473 hdfs 217092469 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00192
|
| -rw-r--r-- 3 H56473 hdfs 217105133 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00193
|
| -rw-r--r-- 3 H56473 hdfs 217004917 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00194
|
| -rw-r--r-- 3 H56473 hdfs 217078557 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00195
|
| -rw-r--r-- 3 H56473 hdfs 217142035 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00196
|
| -rw-r--r-- 3 H56473 hdfs 216993711 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00197
|
| -rw-r--r-- 3 H56473 hdfs 217113515 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00198
|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|
DFS Output
|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| -rw-r--r-- 3 H56473 hdfs 216982052 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00199
|
| -rw-r--r-- 3 H56473 hdfs 217086795 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00200
|
| -rw-r--r-- 3 H56473 hdfs 217093281 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00201
|
| -rw-r--r-- 3 H56473 hdfs 217019781 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00202
|
| -rw-r--r-- 3 H56473 hdfs 217042372 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00203
|
| -rw-r--r-- 3 H56473 hdfs 217057324 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00204
|
| -rw-r--r-- 3 H56473 hdfs 216952020 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00205
|
| -rw-r--r-- 3 H56473 hdfs 217032930 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00206
|
| -rw-r--r-- 3 H56473 hdfs 217092936 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00207
|
| -rw-r--r-- 3 H56473 hdfs 216983923 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00208
|
| -rw-r--r-- 3 H56473 hdfs 216968290 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00209
|
| -rw-r--r-- 3 H56473 hdfs 217005680 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00210
|
| -rw-r--r-- 3 H56473 hdfs 216967777 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00211
|
| -rw-r--r-- 3 H56473 hdfs 217012052 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00212
|
| -rw-r--r-- 3 H56473 hdfs 217030822 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00213
|
| -rw-r--r-- 3 H56473 hdfs 216879098 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00214
|
| -rw-r--r-- 3 H56473 hdfs 216938981 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00215
|
| -rw-r--r-- 3 H56473 hdfs 216888064 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00216
|
| -rw-r--r-- 3 H56473 hdfs 217006890 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00217
|
| -rw-r--r-- 3 H56473 hdfs 216882204 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00218
|
| -rw-r--r-- 3 H56473 hdfs 216878124 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00219
|
| -rw-r--r-- 3 H56473 hdfs 216924030 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00220
|
| -rw-r--r-- 3 H56473 hdfs 216870359 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00221
|
| -rw-r--r-- 3 H56473 hdfs 216965696 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00222
|
| -rw-r--r-- 3 H56473 hdfs 216835873 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00223
|
| -rw-r--r-- 3 H56473 hdfs 216901023 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00224
|
| -rw-r--r-- 3 H56473 hdfs 216800511 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00225
|
| -rw-r--r-- 3 H56473 hdfs 216831982 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00226
|
| -rw-r--r-- 3 H56473 hdfs 216800692 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00227
|
| -rw-r--r-- 3 H56473 hdfs 216857795 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00228
|
| -rw-r--r-- 3 H56473 hdfs 216788102 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00229
|
| -rw-r--r-- 3 H56473 hdfs 216853182 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00230
|
| -rw-r--r-- 3 H56473 hdfs 216762991 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00231
|
| -rw-r--r-- 3 H56473 hdfs 216757552 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00232
|
| -rw-r--r-- 3 H56473 hdfs 216654495 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00233
|
| -rw-r--r-- 3 H56473 hdfs 216588262 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00234
|
| -rw-r--r-- 3 H56473 hdfs 216648141 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00235
|
| -rw-r--r-- 3 H56473 hdfs 216687726 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00236
|
| -rw-r--r-- 3 H56473 hdfs 216613018 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00237
|
| -rw-r--r-- 3 H56473 hdfs 216561040 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00238
|
| -rw-r--r-- 3 H56473 hdfs 216561957 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00239
|
| -rw-r--r-- 3 H56473 hdfs 216508756 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00240
|
| -rw-r--r-- 3 H56473 hdfs 216416846 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00241
|
| -rw-r--r-- 3 H56473 hdfs 216404453 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00242
|
| -rw-r--r-- 3 H56473 hdfs 216439975 2016-06-23 17:40
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00243
|
| -rw-r--r-- 3 H56473 hdfs 216347834 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00244
|
| -rw-r--r-- 3 H56473 hdfs 216338215 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00245
|
| -rw-r--r-- 3 H56473 hdfs 216234103 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00246
|
| -rw-r--r-- 3 H56473 hdfs 216350621 2016-06-23 17:39
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00247
|
| -rw-r--r-- 3 H56473 hdfs 216290393 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00248
|
| -rw-r--r-- 3 H56473 hdfs 216131030 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00249
|
| -rw-r--r-- 3 H56473 hdfs 216065350 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00250
|
| -rw-r--r-- 3 H56473 hdfs 215942146 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00251
|
| -rw-r--r-- 3 H56473 hdfs 215876602 2016-06-23 17:41
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00252
|
| -rw-r--r-- 3 H56473 hdfs 215906085 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00253
|
| -rw-r--r-- 3 H56473 hdfs 215650113 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00254
|
| -rw-r--r-- 3 H56473 hdfs 215941082 2016-06-23 17:38
/apps/hive/warehouse/PRDDB.db/tuning_dd_key/delta_0001570_0001570/bucket_00255
|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
257 rows selected (0.065 seconds)
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Fri, Jun 24, 2016 at 6:16 PM, Gopal Vijayaraghavan <go...@apache.org>
wrote:
>
>
> > Please help me on this....let me know you need other info.
>
> Are the ORC tables fully compacted? Looks like you're running a version of
> Hive-ACID, which does not perform well without compacting delta files.
>
> dfs -ls <path>;
>
> should tell you whether there are any delta_* files in the list.
>
> > | ACID table:true
> > |
> > | alias:tuning_dd_key
> > |
>
> Other than that, it does look like you have only 1 shuffle and if you did
> run this using Tez, I recommend using
>
> <https://github.com/apache/tez/tree/master/tez-tools/swimlanes>
>
> to find out the slowest task & find more information about it. You will
> get a diagram which looks like this
>
> <http://www.slideshare.net/t3rmin4t0r/tez8-ui-walkthrough/20>
>
>
> The longest bar of that is the slowest task. I have another version of it,
> which is unreleased yet (is a bit hard to explain) giving an image which
> looks like
>
> <
> http://people.apache.org/~gopalv/q21_suppliers_who_kept_orders_waiting.svg
> >
>
>
> <https://github.com/t3rmin4t0r/tez-swimlanes/blob/master/vertex.py>
>
> which is better at finding the 1-2 reducers skewing in a large DAG like
> q21 [1].
>
>
> Cheers,
> Gopal
> [1] -
> <
> http://people.apache.org/~gopalv/tpch-plans/q21_suppliers_who_kept_orders_
> waiting.svg>
>
>
>
Re: Optimize Hive Query
Posted by Gopal Vijayaraghavan <go...@apache.org>.
> Please help me on this....let me know you need other info.
Are the ORC tables fully compacted? Looks like you're running a version of
Hive-ACID, which does not perform well without compacting delta files.
dfs -ls <path>;
should tell you whether there are any delta_* files in the list.
> | ACID table:true
> |
> | alias:tuning_dd_key
> |
Other than that, it does look like you have only 1 shuffle and if you did
run this using Tez, I recommend using
<https://github.com/apache/tez/tree/master/tez-tools/swimlanes>
to find out the slowest task & find more information about it. You will
get a diagram which looks like this
<http://www.slideshare.net/t3rmin4t0r/tez8-ui-walkthrough/20>
The longest bar of that is the slowest task. I have another version of it,
which is unreleased yet (is a bit hard to explain) giving an image which
looks like
<http://people.apache.org/~gopalv/q21_suppliers_who_kept_orders_waiting.svg
>
<https://github.com/t3rmin4t0r/tez-swimlanes/blob/master/vertex.py>
which is better at finding the 1-2 reducers skewing in a large DAG like
q21 [1].
Cheers,
Gopal
[1] -
<http://people.apache.org/~gopalv/tpch-plans/q21_suppliers_who_kept_orders_
waiting.svg>
Re: Optimize Hive Query
Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Vijay,
Please help me on this....let me know you need other info.
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Thu, Jun 23, 2016 at 12:41 PM, @Sanjiv Singh <sa...@gmail.com>
wrote:
> Hi Gopal,
>
> I am using Tez as execution engine.
>
> DAG :
>
> +--------------------------------------------------------+--+
> |
> Explain
> |
> +---------------------------------------------------------+--+
> | Plan not optimized by CBO.
> |
> |
> |
> | Vertex dependency in root stage
> |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
> |
> |
> |
> | Stage-0
> |
> | Fetch Operator
> |
> | limit:-1
> |
> | Stage-1
> |
> | Reducer 2
> |
> | File Output Operator [FS_55596]
> |
> | compressed:false
> |
> | Statistics:Num rows: 6357592675 Data size: 54076899328 Basic
> stats: COMPLETE Column stats: NONE |
> |
> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
> format:":"org.apache.hadoop.mapred.TextInputFormat","output
> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"} |
> | Select Operator [SEL_55594]
> |
> |
> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
> |
> | Statistics:Num rows: 6357592675 Data size: 54076899328
> Basic stats: COMPLETE Column stats: NONE |
> | PTF Operator [PTF_55593]
> |
> | Function definitions:[{"Input
> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
> _col1","name:":"windowingtablefunction","order by:":"_col2"}] |
> | Statistics:Num rows: 6357592675 Data size: 54076899328
> Basic stats: COMPLETE Column stats: NONE |
> | Select Operator [SEL_55592]
> |
> | |
> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
> |
> | | Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE |
> | |<-Map 1 [SIMPLE_EDGE] vectorized
> |
> | Reduce Output Operator [RS_55597]
> |
> | key expressions:m_d_key (type: smallint),
> sb_gu_key (type: bigint), t_ev_st_dt (type: date) |
> | Map-reduce partition columns:m_d_key (type:
> smallint), sb_gu_key (type: bigint) |
> | sort order:+++
> |
> | Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE
> |
> | value expressions:ad_zn_key (type: int), c_dt
> (type: date), e_p_dt (type: date), sq_nbr (type: int) |
> | TableScan [TS_55590]
> |
> | ACID table:true
> |
> | alias:tuning_dd_key
> |
> | Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE |
> |
>
> |
>
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
>
> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
> wrote:
>
>>
>> > Long running query :
>>
>> Are you running this on MapReduce or Tez?
>>
>> Please post the output of explain - if you are seeing > 1 shuffle edge in
>> your query while having only one window for OVER(), that might be the
>> reason.
>>
>> OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
>>
>>
>> The multiple PTF operators should have been collapsed by the reduce
>> sink-deduplication.
>>
>> Cheers,
>> Gopal
>>
>>
>>
>
Re: Optimize Hive Query
Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Mich,
I tried the same without any luck. I don't see any improvement.
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Thu, Jun 23, 2016 at 5:38 PM, @Sanjiv Singh <sa...@gmail.com>
wrote:
> Thanks Mich. for your inputs.
>
> Let me try that as well. Will post response.
>
>
>
Re: Optimize Hive Query
Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Thanks Mich. for your inputs.
Let me try that as well. Will post response.
Re: Optimize Hive Query
Posted by Mich Talebzadeh <mi...@gmail.com>.
Funny enough it is pretty close to similar ORC transactional tables I have.
Standard with 256 buckets with two columns as below
number of distinct value in column m_d_key : 29
> number of distinct value in column sb_gu_key : 15434343
You have also vectorised data taking 1024 rows at once.
Still the optimizer does not tell me much. Also I don't use TEZ. I use
Spark as the execution engine. From my experience (and I am sure there
will be plenty who will disagree with me :)), the optimiser does not make
much difference, it is the execution engine than delivers the performance.
The other alternative is that when you populate the table insert the data
sorted by m_d_key, sb_gu_key, t_ev_st_dt to ensure that that the optimizer
will be better off.
Also may help if you add t_ev_st_dt as the third column of the bucket as
the LAG() function is using it in ORDER BY
CLUSTERED BY (m_d_key, sb_gu_key, t_ev_st_dt) INTO 256 BUCKETS
STORED AS ORC
TBLPROPERTIES (
"transactional"="true",
"orc.create.index"="true",
"orc.bloom.filter.columns"="m_d_key, sb_gu_key, t_ev_st_dt",
"orc.bloom.filter.fpp"="0.05",
"orc.stripe.size"="16777216",
"orc.row.index.stride"="10000"
)
Others may have better ideas.
HTH
Dr Mich Talebzadeh
LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
http://talebzadehmich.wordpress.com
On 23 June 2016 at 20:30, @Sanjiv Singh <sa...@gmail.com> wrote:
> Hi Mich ,
>
> Please find below output of command.
>
> desc formatted tuning_dd_key ;
>
>
> +-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
> | col_name |
> data_type | comment
> |
>
> +-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
> | # col_name | data_type
> | comment
> |
> | | NULL
> | NULL
> |
> | m_d_key | smallint
> |
> |
> | sb_gu_key | bigint
> |
> |
> | t_ev_st_dt | date
> |
> |
> | ad_zn_key | int
> |
> |
> | c_dt | date
> |
> |
> | e_p_dt | date
> |
> |
> | sq_nbr | int
> |
> |
> | | NULL
> | NULL
> |
> | # Detailed Table Information | NULL
> | NULL
> |
> | Database: | PRDDB
> | NULL
> |
> | CreateTime: | Thu Jun 23 11:03:53 EDT 2016
> | NULL
> |
> | LastAccessTime: | UNKNOWN
> | NULL
> |
> | Protect Mode: | None
> | NULL
> |
> | Retention: | 0
> | NULL
> |
> | Table Type: | MANAGED_TABLE
> | NULL
> |
> | Table Parameters: | NULL
> | NULL
> |
> | | COLUMN_STATS_ACCURATE
> | true
> |
> | | numFiles
> | 256
> |
> | | numRows
> | 6357592675
> |
> | | rawDataSize
> | 0
> |
> | | totalSize
> | 54076898961
> |
> | | transactional
> | true
> |
> | | transient_lastDdlTime
> | 1466694970
> |
> | | NULL
> | NULL
> |
> | # Storage Information | NULL
> | NULL
> |
> | SerDe Library: |
> org.apache.hadoop.hive.ql.io.orc.OrcSerde
> | NULL |
> | InputFormat: |
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> | NULL |
> | OutputFormat: |
> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> | NULL |
> | Compressed: | No
> | NULL
> |
> | Num Buckets: | 256
> | NULL
> |
> | Bucket Columns: | [sbsc_guid_key, mas_div_key]
> | NULL
> |
> | Sort Columns: | []
> | NULL
> |
> | Storage Desc Params: | NULL
> | NULL
> |
> | | serialization.format
> | 1
> |
>
> +-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
>
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
>
> On Thu, Jun 23, 2016 at 12:47 PM, Mich Talebzadeh <
> mich.talebzadeh@gmail.com> wrote:
>
>> Do you also have the output from
>>
>> desc formatted tuning_dd_key
>>
>> and send the output please?
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 23 June 2016 at 17:41, @Sanjiv Singh <sa...@gmail.com> wrote:
>>
>>> Hi Gopal,
>>>
>>> I am using Tez as execution engine.
>>>
>>> DAG :
>>>
>>> +--------------------------------------------------------+--+
>>> |
>>> Explain
>>> |
>>> +---------------------------------------------------------+--+
>>> | Plan not optimized by CBO.
>>> |
>>> |
>>> |
>>> | Vertex dependency in root stage
>>> |
>>> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
>>> |
>>> |
>>> |
>>> | Stage-0
>>> |
>>> | Fetch Operator
>>> |
>>> | limit:-1
>>> |
>>> | Stage-1
>>> |
>>> | Reducer 2
>>> |
>>> | File Output Operator [FS_55596]
>>> |
>>> | compressed:false
>>> |
>>> | Statistics:Num rows: 6357592675 Data size: 54076899328
>>> Basic stats: COMPLETE Column stats: NONE |
>>> |
>>> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
>>> format:":"org.apache.hadoop.mapred.TextInputFormat","output
>>> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"} |
>>> | Select Operator [SEL_55594]
>>> |
>>> |
>>> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>>> |
>>> | Statistics:Num rows: 6357592675 Data size: 54076899328
>>> Basic stats: COMPLETE Column stats: NONE |
>>> | PTF Operator [PTF_55593]
>>> |
>>> | Function definitions:[{"Input
>>> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
>>> _col1","name:":"windowingtablefunction","order by:":"_col2"}] |
>>> | Statistics:Num rows: 6357592675 Data size:
>>> 54076899328 Basic stats: COMPLETE Column stats: NONE |
>>> | Select Operator [SEL_55592]
>>> |
>>> | |
>>> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
>>> |
>>> | | Statistics:Num rows: 6357592675 Data size:
>>> 54076899328 Basic stats: COMPLETE Column stats: NONE |
>>> | |<-Map 1 [SIMPLE_EDGE] vectorized
>>> |
>>> | Reduce Output Operator [RS_55597]
>>> |
>>> | key expressions:m_d_key (type: smallint),
>>> sb_gu_key (type: bigint), t_ev_st_dt (type: date) |
>>> | Map-reduce partition columns:m_d_key (type:
>>> smallint), sb_gu_key (type: bigint) |
>>> | sort order:+++
>>> |
>>> | Statistics:Num rows: 6357592675 Data size:
>>> 54076899328 Basic stats: COMPLETE Column stats: NONE
>>> |
>>> | value expressions:ad_zn_key (type: int), c_dt
>>> (type: date), e_p_dt (type: date), sq_nbr (type: int) |
>>> | TableScan [TS_55590]
>>> |
>>> | ACID table:true
>>> |
>>> | alias:tuning_dd_key
>>> |
>>> | Statistics:Num rows: 6357592675 Data size:
>>> 54076899328 Basic stats: COMPLETE Column stats: NONE |
>>> |
>>>
>>> |
>>>
>>> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>>>
>>> Regards
>>> Sanjiv Singh
>>> Mob : +091 9990-447-339
>>>
>>> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <gopalv@apache.org
>>> > wrote:
>>>
>>>>
>>>> > Long running query :
>>>>
>>>> Are you running this on MapReduce or Tez?
>>>>
>>>> Please post the output of explain - if you are seeing > 1 shuffle edge
>>>> in
>>>> your query while having only one window for OVER(), that might be the
>>>> reason.
>>>>
>>>> OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
>>>>
>>>>
>>>> The multiple PTF operators should have been collapsed by the reduce
>>>> sink-deduplication.
>>>>
>>>> Cheers,
>>>> Gopal
>>>>
>>>>
>>>>
>>>
>>
>
Re: Optimize Hive Query
Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Mich ,
Please find below output of command.
desc formatted tuning_dd_key ;
+-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
| col_name |
data_type | comment
|
+-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
| # col_name | data_type
| comment
|
| | NULL
| NULL
|
| m_d_key | smallint
|
|
| sb_gu_key | bigint
|
|
| t_ev_st_dt | date
|
|
| ad_zn_key | int
|
|
| c_dt | date
|
|
| e_p_dt | date
|
|
| sq_nbr | int
|
|
| | NULL
| NULL
|
| # Detailed Table Information | NULL
| NULL
|
| Database: | PRDDB
| NULL
|
| CreateTime: | Thu Jun 23 11:03:53 EDT 2016
| NULL
|
| LastAccessTime: | UNKNOWN
| NULL
|
| Protect Mode: | None
| NULL
|
| Retention: | 0
| NULL
|
| Table Type: | MANAGED_TABLE
| NULL
|
| Table Parameters: | NULL
| NULL
|
| | COLUMN_STATS_ACCURATE
| true
|
| | numFiles
| 256
|
| | numRows
| 6357592675
|
| | rawDataSize
| 0
|
| | totalSize
| 54076898961
|
| | transactional
| true
|
| | transient_lastDdlTime
| 1466694970
|
| | NULL
| NULL
|
| # Storage Information | NULL
| NULL
|
| SerDe Library: | org.apache.hadoop.hive.ql.io.orc.OrcSerde
| NULL
|
| InputFormat: |
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
| NULL |
| OutputFormat: |
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
| NULL |
| Compressed: | No
| NULL
|
| Num Buckets: | 256
| NULL
|
| Bucket Columns: | [sbsc_guid_key, mas_div_key]
| NULL
|
| Sort Columns: | []
| NULL
|
| Storage Desc Params: | NULL
| NULL
|
| | serialization.format
| 1
|
+-------------------------------+-----------------------------------------------------------------------------------------------+-----------------------+--+
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Thu, Jun 23, 2016 at 12:47 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com
> wrote:
> Do you also have the output from
>
> desc formatted tuning_dd_key
>
> and send the output please?
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 23 June 2016 at 17:41, @Sanjiv Singh <sa...@gmail.com> wrote:
>
>> Hi Gopal,
>>
>> I am using Tez as execution engine.
>>
>> DAG :
>>
>> +--------------------------------------------------------+--+
>> |
>> Explain
>> |
>> +---------------------------------------------------------+--+
>> | Plan not optimized by CBO.
>> |
>> |
>> |
>> | Vertex dependency in root stage
>> |
>> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
>> |
>> |
>> |
>> | Stage-0
>> |
>> | Fetch Operator
>> |
>> | limit:-1
>> |
>> | Stage-1
>> |
>> | Reducer 2
>> |
>> | File Output Operator [FS_55596]
>> |
>> | compressed:false
>> |
>> | Statistics:Num rows: 6357592675 Data size: 54076899328
>> Basic stats: COMPLETE Column stats: NONE |
>> |
>> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
>> format:":"org.apache.hadoop.mapred.TextInputFormat","output
>> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"} |
>> | Select Operator [SEL_55594]
>> |
>> |
>> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>> |
>> | Statistics:Num rows: 6357592675 Data size: 54076899328
>> Basic stats: COMPLETE Column stats: NONE |
>> | PTF Operator [PTF_55593]
>> |
>> | Function definitions:[{"Input
>> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
>> _col1","name:":"windowingtablefunction","order by:":"_col2"}] |
>> | Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE |
>> | Select Operator [SEL_55592]
>> |
>> | |
>> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
>> |
>> | | Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE |
>> | |<-Map 1 [SIMPLE_EDGE] vectorized
>> |
>> | Reduce Output Operator [RS_55597]
>> |
>> | key expressions:m_d_key (type: smallint),
>> sb_gu_key (type: bigint), t_ev_st_dt (type: date) |
>> | Map-reduce partition columns:m_d_key (type:
>> smallint), sb_gu_key (type: bigint) |
>> | sort order:+++
>> |
>> | Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE
>> |
>> | value expressions:ad_zn_key (type: int), c_dt
>> (type: date), e_p_dt (type: date), sq_nbr (type: int) |
>> | TableScan [TS_55590]
>> |
>> | ACID table:true
>> |
>> | alias:tuning_dd_key
>> |
>> | Statistics:Num rows: 6357592675 Data size:
>> 54076899328 Basic stats: COMPLETE Column stats: NONE |
>> |
>>
>> |
>>
>> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>>
>> Regards
>> Sanjiv Singh
>> Mob : +091 9990-447-339
>>
>> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
>> wrote:
>>
>>>
>>> > Long running query :
>>>
>>> Are you running this on MapReduce or Tez?
>>>
>>> Please post the output of explain - if you are seeing > 1 shuffle edge in
>>> your query while having only one window for OVER(), that might be the
>>> reason.
>>>
>>> OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
>>>
>>>
>>> The multiple PTF operators should have been collapsed by the reduce
>>> sink-deduplication.
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>
>
Re: Optimize Hive Query
Posted by Mich Talebzadeh <mi...@gmail.com>.
Do you also have the output from
desc formatted tuning_dd_key
and send the output please?
Dr Mich Talebzadeh
LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
http://talebzadehmich.wordpress.com
On 23 June 2016 at 17:41, @Sanjiv Singh <sa...@gmail.com> wrote:
> Hi Gopal,
>
> I am using Tez as execution engine.
>
> DAG :
>
> +--------------------------------------------------------+--+
> |
> Explain
> |
> +---------------------------------------------------------+--+
> | Plan not optimized by CBO.
> |
> |
> |
> | Vertex dependency in root stage
> |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)
> |
> |
> |
> | Stage-0
> |
> | Fetch Operator
> |
> | limit:-1
> |
> | Stage-1
> |
> | Reducer 2
> |
> | File Output Operator [FS_55596]
> |
> | compressed:false
> |
> | Statistics:Num rows: 6357592675 Data size: 54076899328 Basic
> stats: COMPLETE Column stats: NONE |
> |
> table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
> format:":"org.apache.hadoop.mapred.TextInputFormat","output
> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"} |
> | Select Operator [SEL_55594]
> |
> |
> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
> |
> | Statistics:Num rows: 6357592675 Data size: 54076899328
> Basic stats: COMPLETE Column stats: NONE |
> | PTF Operator [PTF_55593]
> |
> | Function definitions:[{"Input
> definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
> _col1","name:":"windowingtablefunction","order by:":"_col2"}] |
> | Statistics:Num rows: 6357592675 Data size: 54076899328
> Basic stats: COMPLETE Column stats: NONE |
> | Select Operator [SEL_55592]
> |
> | |
> outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
> |
> | | Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE |
> | |<-Map 1 [SIMPLE_EDGE] vectorized
> |
> | Reduce Output Operator [RS_55597]
> |
> | key expressions:m_d_key (type: smallint),
> sb_gu_key (type: bigint), t_ev_st_dt (type: date) |
> | Map-reduce partition columns:m_d_key (type:
> smallint), sb_gu_key (type: bigint) |
> | sort order:+++
> |
> | Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE
> |
> | value expressions:ad_zn_key (type: int), c_dt
> (type: date), e_p_dt (type: date), sq_nbr (type: int) |
> | TableScan [TS_55590]
> |
> | ACID table:true
> |
> | alias:tuning_dd_key
> |
> | Statistics:Num rows: 6357592675 Data size:
> 54076899328 Basic stats: COMPLETE Column stats: NONE |
> |
>
> |
>
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
>
> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
> wrote:
>
>>
>> > Long running query :
>>
>> Are you running this on MapReduce or Tez?
>>
>> Please post the output of explain - if you are seeing > 1 shuffle edge in
>> your query while having only one window for OVER(), that might be the
>> reason.
>>
>> OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
>>
>>
>> The multiple PTF operators should have been collapsed by the reduce
>> sink-deduplication.
>>
>> Cheers,
>> Gopal
>>
>>
>>
>
Re: Optimize Hive Query
Posted by "@Sanjiv Singh" <sa...@gmail.com>.
Hi Gopal,
I am using Tez as execution engine.
DAG :
+--------------------------------------------------------+--+
|
Explain
|
+---------------------------------------------------------+--+
| Plan not optimized by CBO.
|
|
|
| Vertex dependency in root stage
|
| Reducer 2 <- Map 1 (SIMPLE_EDGE)
|
|
|
| Stage-0
|
| Fetch Operator
|
| limit:-1
|
| Stage-1
|
| Reducer 2
|
| File Output Operator [FS_55596]
|
| compressed:false
|
| Statistics:Num rows: 6357592675 Data size: 54076899328 Basic
stats: COMPLETE Column stats: NONE |
|
table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
format:":"org.apache.hadoop.mapred.TextInputFormat","output
format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"} |
| Select Operator [SEL_55594]
|
|
outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
|
| Statistics:Num rows: 6357592675 Data size: 54076899328
Basic stats: COMPLETE Column stats: NONE |
| PTF Operator [PTF_55593]
|
| Function definitions:[{"Input
definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
_col1","name:":"windowingtablefunction","order by:":"_col2"}] |
| Statistics:Num rows: 6357592675 Data size: 54076899328
Basic stats: COMPLETE Column stats: NONE |
| Select Operator [SEL_55592]
|
| |
outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
|
| | Statistics:Num rows: 6357592675 Data size:
54076899328 Basic stats: COMPLETE Column stats: NONE |
| |<-Map 1 [SIMPLE_EDGE] vectorized
|
| Reduce Output Operator [RS_55597]
|
| key expressions:m_d_key (type: smallint),
sb_gu_key (type: bigint), t_ev_st_dt (type: date) |
| Map-reduce partition columns:m_d_key (type:
smallint), sb_gu_key (type: bigint) |
| sort order:+++
|
| Statistics:Num rows: 6357592675 Data size:
54076899328 Basic stats: COMPLETE Column stats: NONE
|
| value expressions:ad_zn_key (type: int), c_dt
(type: date), e_p_dt (type: date), sq_nbr (type: int) |
| TableScan [TS_55590]
|
| ACID table:true
|
| alias:tuning_dd_key
|
| Statistics:Num rows: 6357592675 Data size:
54076899328 Basic stats: COMPLETE Column stats: NONE |
|
|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
Regards
Sanjiv Singh
Mob : +091 9990-447-339
On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <go...@apache.org>
wrote:
>
> > Long running query :
>
> Are you running this on MapReduce or Tez?
>
> Please post the output of explain - if you are seeing > 1 shuffle edge in
> your query while having only one window for OVER(), that might be the
> reason.
>
> OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
>
>
> The multiple PTF operators should have been collapsed by the reduce
> sink-deduplication.
>
> Cheers,
> Gopal
>
>
>
Re: Optimize Hive Query
Posted by Gopal Vijayaraghavan <go...@apache.org>.
> Long running query :
Are you running this on MapReduce or Tez?
Please post the output of explain - if you are seeing > 1 shuffle edge in
your query while having only one window for OVER(), that might be the
reason.
OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt)
The multiple PTF operators should have been collapsed by the reduce
sink-deduplication.
Cheers,
Gopal