Posted to user@hive.apache.org by Stephen Sprague <sp...@gmail.com> on 2016/02/18 06:42:17 UTC

Tez issues with beeline via HS2

Hi guys,
it was suggested i post to the user@hive group rather than the user@tez
group for this one.  Here's my issue. My query hangs when using beeline via
HS2 (but works with the local beeline client).  I'd like to overcome that.

This is my query:

   beeline -u 'jdbc:hive2://dwrdevnn1.sv2.trulia.com:10001/default;auth=noSasl sprague nopwd org.apache.hive.jdbc.HiveDriver' <<SQL
   set hive.execution.engine=tez;
   select count(*) from omniture.hit_data where date_key=20160210;
SQL

 I'll show the log snippet from HS2 below but first these known facts.


   * beeline -u jdbc:hive2:// works for engine=tez (local client)

   * beeline -u jdbc:hive2://dwrdevnn1.sv2.trulia.com:10001/default does not work (HS2 client).

   * both work for engine=mr, so it does appear to be tez-related.

   * nothing gets submitted to yarn (see the check after this list).

   * i used Gopal's fragment for tez-site.xml (https://github.com/t3rmin4t0r/tez-autobuild/blob/llap/tez-site.xml.frag)

   * i've restarted the hiveserver2 process with VERBOSE logging

   * i've bounced the yarn RM.

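for what it's worth, here's one way to double-check that nothing reaches the RM while the query hangs (this uses the stock yarn CLI, run from any cluster node during the hang):

   # anything the RM has accepted or is running shows up here; an empty
   # list while the beeline query hangs means no Tez AM was ever submitted
   yarn application -list -appStates ACCEPTED,RUNNING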

and here's the tail end of hive.log (from the HS2 process)

2016-02-17 21:19:25,839 INFO  [HiveServer2-Handler-Pool: Thread-38]: ppd.OpProcFactory (OpProcFactory.java:logExpr(707)) - Pushdown Predicates of FIL For Alias : hit_data
2016-02-17 21:19:25,839 INFO  [HiveServer2-Handler-Pool: Thread-38]: ppd.OpProcFactory (OpProcFactory.java:logExpr(710)) -      (date_key = 20160210)
2016-02-17 21:19:25,840 INFO  [HiveServer2-Handler-Pool: Thread-38]: ppd.OpProcFactory (OpProcFactory.java:process(382)) - Processing for TS(18)
2016-02-17 21:19:25,840 INFO  [HiveServer2-Handler-Pool: Thread-38]: ppd.OpProcFactory (OpProcFactory.java:logExpr(707)) - Pushdown Predicates of TS For Alias : hit_data
2016-02-17 21:19:25,841 INFO  [HiveServer2-Handler-Pool: Thread-38]: ppd.OpProcFactory (OpProcFactory.java:logExpr(710)) -      (date_key = 20160210)
2016-02-17 21:19:25,851 INFO  [HiveServer2-Handler-Pool: Thread-38]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=partition-retrieving from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner>
2016-02-17 21:19:25,856 INFO  [HiveServer2-Handler-Pool: Thread-38]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(746)) - 3: get_partitions_by_expr : db=omniture tbl=hit_data
2016-02-17 21:19:25,857 INFO  [HiveServer2-Handler-Pool: Thread-38]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(371)) - ugi=dwr  ip=unknown-ip-addr  cmd=get_partitions_by_expr : db=omniture tbl=hit_data
2016-02-17 21:19:26,305 INFO  [HiveServer2-Handler-Pool: Thread-38]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=partition-retrieving start=1455772765851 end=1455772766305 duration=454 from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner>
2016-02-17 21:19:26,319 INFO  [HiveServer2-Handler-Pool: Thread-38]: optimizer.ColumnPrunerProcFactory (ColumnPrunerProcFactory.java:pruneReduceSinkOperator(817)) - RS 22 oldColExprMap: {VALUE._col0=Column[_col0]}
2016-02-17 21:19:26,320 INFO  [HiveServer2-Handler-Pool: Thread-38]: optimizer.ColumnPrunerProcFactory (ColumnPrunerProcFactory.java:pruneReduceSinkOperator(866)) - RS 22 newColExprMap: {VALUE._col0=Column[_col0]}
2016-02-17 21:19:26,340 INFO  [HiveServer2-Handler-Pool: Thread-38]: annotation.StatsRulesProcFactory (StatsRulesProcFactory.java:updateStats(1824)) - STATS-GBY[23]: Equals 0 in number of rows. 0 rows will be set to 1
2016-02-17 21:19:26,343 INFO  [HiveServer2-Handler-Pool: Thread-38]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=compile start=1455772765310 end=1455772766343 duration=1033 from=org.apache.hadoop.hive.ql.Driver>

and that's it. this is where it stalls out. nothing more in hive.log; nothing more on the client. so it's kinda strange.


wondering what further steps i can take to trace down the problem.  Any
ideas?

Cheers,
Stephen
PS Here's the client session i'm looking at:
issuing: !connect jdbc:hive2://dwrdevnn1.sv2.trulia.com:10001/default;auth=noSasl sprague nopwd org.apache.hive.jdbc.HiveDriver '' ''
Connecting to jdbc:hive2://dwrdevnn1.sv2.trulia.com:10001/default;auth=noSasl
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Executing command:
         set hive.execution.engine=tez;
         select count(*) from omniture.hit_data where date_key=20160210;

Getting log thread is interrupted, since query is done!
No rows affected (0.112 seconds)
<hangs here forever>

Re: Tez issues with beeline via HS2

Posted by Gopal Vijayaraghavan <go...@apache.org>.

> * i used Gopal's fragment for tez-site.xml
> (https://github.com/t3rmin4t0r/tez-autobuild/blob/llap/tez-site.xml.frag)


Please check that tez.lib.uris is filled out properly.
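For reference, it should point at the tez tarball on HDFS, something along these lines (the path below is only an example; use wherever you actually uploaded it):

<property>
  <name>tez.lib.uris</name>
  <!-- example location only; point this at the tez tarball on HDFS -->
  <value>hdfs:///apps/tez/tez.tar.gz</value>
</property>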


I suspect it's all already set up since the CLI mode works anyway, but
cross-check that the HS2 classpath does have a tez-site.xml in it.
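One rough way to eyeball that from the outside (a sketch; assumes a single HS2 java process with the classpath on its command line):

# split the running HS2 command line into classpath entries and look for
# the conf directories and tez entries it was actually started with
HS2_PID=$(pgrep -f HiveServer2)
ps -wwo args= -p "$HS2_PID" | tr ' :' '\n\n' | grep -i 'conf\|tez'
# then check that one of the conf dirs it prints really holds the file, e.g.
ls /etc/hive/conf/tez-site.xml   # example path; use whatever the above shows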

>   * nothing gets submitted to yarn.

I have a dumb question, which I didn't think about when you asked this in
the Tez list - Is date_key a partition column?
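Easy enough to check from beeline if you're not sure:

-- partition columns are listed in a separate "# Partition Information"
-- section of the describe output
describe formatted omniture.hit_data;
-- or list the partitions directly (errors out on an unpartitioned table)
show partitions omniture.hit_data;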

You can try turning off these optimizations, which can cause some dumb
stuff (particularly with S3/Azure):

set hive.optimize.null.scan=false;
set hive.optimize.metadataonly=false;
set hive.fetch.task.conversion=none;
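
Applied to your repro, that would look something like this (same connect string as in your report):

beeline -u 'jdbc:hive2://dwrdevnn1.sv2.trulia.com:10001/default;auth=noSasl sprague nopwd org.apache.hive.jdbc.HiveDriver' <<SQL
set hive.execution.engine=tez;
-- disable the metadata-only/null-scan/fetch-task shortcuts so the query
-- is forced to plan and submit a real Tez DAG
set hive.optimize.null.scan=false;
set hive.optimize.metadataonly=false;
set hive.fetch.task.conversion=none;
select count(*) from omniture.hit_data where date_key=20160210;
SQL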

And assuming your goal is to run this as fast as possible (it reads the
count/min/max per partition straight from the stats):

set hive.compute.query.using.stats=true;
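
(With that on, a count over a single partition can be answered from metastore stats without launching a DAG at all, which is worth keeping in mind when "nothing gets submitted to yarn" is the symptom.)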


> and that's it. this is where it stalls out.  nothing more in hive.log;
>nothing more on the client.  so its kinda strange.
...
> wondering what further steps i can take to trace down the problem.  Any
>ideas?

Log lines with TezSession* would be the ones to look into, even at INFO
log levels.
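
Something along these lines against the HS2 log narrows it down (the log path is whatever your hive-log4j settings point at):

# show the Tez session lifecycle around the hang
grep -n 'TezSession' /var/log/hive/hive.log | tail -20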

The log level doesn't seem to have been lowered, FWIW, so a jstack
definitely helps.
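
Roughly like this (the output file is arbitrary; the handler thread from your log, Thread-38, is the one to look at in the dump):

# find the HiveServer2 pid and dump all of its thread stacks
HS2_PID=$(pgrep -f HiveServer2)
jstack "$HS2_PID" > /tmp/hs2-stack.txt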

I have noticed issues with some YARN setups (Pivotal installs, for
instance) which are configured to enforce strict queues (and occasionally
to reject such logins).

At least for capacity scheduler, I know the config looks like

yarn.scheduler.capacity.root.default.state=STOPPED


to prevent anyone from starting any jobs in the "default" queue.
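
If that turns out to be the case, flipping the queue back on in capacity-scheduler.xml and refreshing is enough; no RM restart needed:

# in capacity-scheduler.xml, set
#   yarn.scheduler.capacity.root.default.state=RUNNING
# then push the change to the running RM
yarn rmadmin -refreshQueues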

Cheers,
Gopal