Posted to dev@hive.apache.org by "Carl Steinbach (JIRA)" <ji...@apache.org> on 2013/12/06 04:59:38 UTC

[jira] [Commented] (HIVE-5972) Hiveserver2 is much slower than hiveserver1

    [ https://issues.apache.org/jira/browse/HIVE-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840939#comment-13840939 ] 

Carl Steinbach commented on HIVE-5972:
--------------------------------------

{quote}
1. hiveserver1 running on 2CPUs, 8G mem, took about 8 hours
2. hiveserver2 running on 4CPUs, 16 mem, took about 13 hours and 27min (never successful on machine with 2CPUs, 8G mem)
{quote}

How large was the result set that your query generated? How much time did HS1/HS2 spend executing the query on the cluster, as opposed to the time the client spent fetching the result set through HS1/HS2?

bq. But I cannot understand why hiveserver2 is much slower than hiveserver1, because from doc, hs2 support concurrency, it should be faster than hs1, isn't it?

In this context "concurrency" means that HiveServer2 is able to handle multiple concurrent client connections, something that HS1 can't do. Also, regardless of whether you're using HS1 or HS2, the actual work of executing the query is delegated to the underlying MR cluster. The HS1/HS2 server coordinates query execution on the cluster and then acts as a relay through which the client fetches the results of the query. It's this last step (the result set fetch) which is known to be slower on HS2, and we're tracking the task of fixing this performance regression in HIVE-3746.

So far the information you have provided makes me think that this is a duplicate of HIVE-3746. Please let us know if you think some other issue is causing the performance regression.
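
One way to answer the execute-versus-fetch question above is to time the two phases separately on the client side. The following is only a minimal sketch: the helper name time_query_phases is hypothetical, it assumes a Python DB-API style cursor (execute/fetchmany), and it uses the standard library's sqlite3 as a stand-in so that it runs anywhere. Against HS1/HS2 you would need a DB-API compatible Hive client, and exactly where "execute" ends and "fetch" begins depends on how that driver behaves.

```python
import sqlite3
import time


def time_query_phases(cursor, sql, batch_size=1000):
    """Return (execute_seconds, fetch_seconds, row_count) for one query.

    Hypothetical helper: works with any DB-API style cursor that
    provides execute() and fetchmany().
    """
    start = time.perf_counter()
    cursor.execute(sql)              # phase 1: submit and run the query
    executed = time.perf_counter()

    row_count = 0
    while True:                      # phase 2: pull the result set in batches
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        row_count += len(rows)
    fetched = time.perf_counter()

    return executed - start, fetched - executed, row_count


if __name__ == "__main__":
    # sqlite3 is used purely as a stand-in data source for the sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5000)])
    execute_s, fetch_s, n = time_query_phases(conn.cursor(), "SELECT id FROM t")
    print(f"rows={n} execute={execute_s:.4f}s fetch={fetch_s:.4f}s")
```

If most of the wall-clock time falls in the fetch phase, that would be consistent with the result set fetch slowdown tracked in HIVE-3746; if it falls in the execute phase, something else is likely involved.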

> Hiveserver2 is much slower than hiveserver1
> -------------------------------------------
>
>                 Key: HIVE-5972
>                 URL: https://issues.apache.org/jira/browse/HIVE-5972
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 0.10.0
>            Reporter: a bc
>            Priority: Critical
>
> We are building an MS SQL cube via a linked server connecting to hiveserver with Cloudera's ODBC driver.
> There are two test results:
> 1. hiveserver1 running on 2CPUs, 8G mem, took about 8 hours
> 2. hiveserver2 running on 4CPUs, 16 mem, took about 13 hours and 27min (never successful on machine with 2CPUs, 8G mem)
>  In both cases, almost all CPUs are busy while building the cube.
> But I cannot understand why hiveserver2 is much slower than hiveserver1, because from doc, hs2 support concurrency, it should be faster than hs1, isn't it?
>  
> Thanks.
>  
> CDH4.3 on CentOS6.


