Posted to user@tez.apache.org by "r7raul1984@163.com" <r7...@163.com> on 2015/05/05 09:58:04 UTC
hive sql on tez run forever
When I change the SQL WHERE condition to (where t.update_time >= '2015-05-04'), the query returns a result after a short while, because t.update_time >= '2015-05-04' filters out many rows during the table scan. But when I change the WHERE condition to (where t.update_time >= '2015-05-04' or length(t8.end_user_id)>0), the query runs forever, as follows:
Status: Running (Executing on YARN cluster with App id application_1419300485749_1419769)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Map 10 .........   SUCCEEDED      3          3        0        0       0       0
Map 11 .........   SUCCEEDED    151        151        0        0       0       0
Map 12 .........   SUCCEEDED      1          1        0        0       0       0
Map 13 .........   SUCCEEDED     76         76        0        0       0       0
Map 5 ..........   SUCCEEDED     11         11        0        0       0       0
Map 7 ..........   SUCCEEDED    156        156        0        0       0       0
Map 9 ..........   SUCCEEDED     10         10        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
Reducer 3 .....      RUNNING    642        641        1        0       0       0
Reducer 4            RUNNING   1009          0       89      920       0       0
Reducer 6 ......   SUCCEEDED      3          3        0        0       0       0
Reducer 8 ......   SUCCEEDED    203        203        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 11/13 [==============>>------------] 55% ELAPSED TIME: 307.54 s
What is the root cause?
r7raul1984@163.com
Re: hive sql on tez run forever
Posted by Hitesh Shah <hi...@apache.org>.
@ xqflying,
There were a few shuffle issues fixed post 0.5.3 which you might be hitting.
TEZ-2214. FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging
TEZ-1923. FetcherOrderedGrouped gets into infinite loop due to memory pressure
You can probably try the 0.5.4 release (once it comes out within the next week or so) or try applying the patches from the JIRAs above.
thanks
— Hitesh
On May 10, 2015, at 1:12 AM, xqflying@163.com wrote:
> I have encountered a similar problem before. As a workaround, I used MR instead. I was using Tez 0.5.3 at that time, and the shuffle kept running forever. Which version of Tez are you using?
Re: Re: hive sql on tez run forever
Posted by xq...@163.com.
I have encountered a similar problem before. As a workaround, I used MR instead. I was using Tez 0.5.3 at that time, and the shuffle kept running forever. Which version of Tez are you using?
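For anyone who wants to try the same workaround: a minimal sketch, assuming the query is run from the Hive CLI or Beeline and that the engine is switched per session through the hive.execution.engine property. The placeholder comment stands in for the actual query, which was only attached to the original mail and is not reproduced here.

```sql
-- Fall back to MapReduce for the current session only.
set hive.execution.engine=mr;

-- ... run the problematic query here ...

-- Switch back to Tez once the query has finished.
set hive.execution.engine=tez;
```

This only sidesteps the hang; it does not explain it, and the query will likely run slower on MapReduce than it would on a healthy Tez setup.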
On 2015-05-06 06:02, Hitesh Shah wrote:
This might be a mail that is better suited for the user@hive mailing list to start with.
thanks
— Hitesh
Re: hive sql on tez run forever
Posted by Hitesh Shah <hi...@apache.org>.
This might be a mail that is better suited for the user@hive mailing list to start with.
thanks
— Hitesh