Posted to user@hive.apache.org by Abhishek Gupta <ab...@gmail.com> on 2019/01/17 07:35:34 UTC

Hive2 Interactive LLAP fails to start

I have been trying to start Hive2 Interactive on HDP 2.6.5, which ships
with Hive 2.1.0. I followed the guidelines in the following article on
sizing and tuning LLAP with YARN:
https://community.hortonworks.com/articles/149486/llap-sizing-and-setup.html

However, the LLAP daemons fail to start, and Hive Interactive gives up
after the configured number of retries.
I also followed the troubleshooting steps in the following article, and the
error does not seem to be sizing-related:
https://community.hortonworks.com/articles/149899/investigating-when-llap-doesnt-start.html

My cluster setup, in brief:

hive.llap.daemon.yarn.container.mb = 10 GB
llap_heap_size = 7 GB
In-Memory Cache per Daemon = 2 GB
Headroom = 1 GB
(hive.llap.daemon.yarn.container.mb should equal llap_heap_size + In-Memory
Cache per Daemon + Headroom, i.e. 7 + 2 + 1 = 10 GB)
YARN container max = 14.6 GB, min = 1 GB
Number of executors per LLAP daemon = 1
YARN pre-emption enabled
Retries for checking LLAP status = 20
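
The sizing rule above can be checked with a short calculation. This is only a sketch: the function name check_llap_sizing and its parameters are hypothetical (they are not part of Hive or Ambari), and the figures are the ones listed above, converted on the assumption that 1 GB = 1024 MB.

```python
def check_llap_sizing(container_mb, heap_mb, cache_mb, headroom_mb,
                      yarn_min_mb, yarn_max_mb):
    """Return a list of sizing problems; an empty list means the rule holds.

    Hypothetical helper for illustration only, not a Hive or Ambari API.
    """
    problems = []

    # Rule from the post: container = heap + in-memory cache + headroom.
    expected_mb = heap_mb + cache_mb + headroom_mb
    if container_mb != expected_mb:
        problems.append(
            "container is %d MB but heap + cache + headroom is %d MB"
            % (container_mb, expected_mb)
        )

    # The daemon container must also fit within the YARN container limits.
    if not (yarn_min_mb <= container_mb <= yarn_max_mb):
        problems.append(
            "container of %d MB is outside the YARN range %d-%d MB"
            % (container_mb, yarn_min_mb, yarn_max_mb)
        )

    return problems


GB = 1024  # MB per GB (assumed conversion)

problems = check_llap_sizing(
    container_mb=10 * GB,
    heap_mb=7 * GB,
    cache_mb=2 * GB,
    headroom_mb=1 * GB,
    yarn_min_mb=1 * GB,
    yarn_max_mb=int(14.6 * GB),
)
print(problems if problems else "sizing is consistent")
```

With the values from this cluster the list comes back empty, which matches the observation that the failure does not look sizing-related.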

The startup script logs and the relevant YARN container logs are linked below:
https://pastebin.com/raw/4ywJHp4M
https://pastebin.com/raw/AiJkfwUR

Following is a snapshot of the Slider AM page:

Application: llap0
Status: all containers allocated
Total number of containers: 5
Create time: 17 Jan 2019 05:40:15 GMT
Running since: 17 Jan 2019 05:40:15 GMT
Time last flexed: N/A
Application storage path: hdfs://hsft/user/hive/.slider/cluster/llap0/database
Application configuration path: hdfs://hsft/user/hive/.slider/cluster/llap0/snapshot

Component Instances
Component         Desired  Actual  Outstanding Requests  Failed  Failed to start  Placement
LLAP              4        4       0                     0       0
slider-appmaster  1        1       0                     0       0



Application Container Diagnostics
Container ID                                Component  State  Exit Code  Logs  Diagnostics
container_e27_1547703090681_0001_01_000002  LLAP       3      -1000      Logs
container_e27_1547703090681_0001_01_000003  LLAP       3      -1000      Logs
container_e27_1547703090681_0001_01_000004  LLAP       3      -1000      Logs
container_e27_1547703090681_0001_01_000005  LLAP       3      -1000      Logs

Re: Hive2 Interactive LLAP fails to start

Posted by Abhishek Gupta <ab...@gmail.com>.
Gentle reminder.

On Thu, Jan 17, 2019 at 3:45 PM Abhishek Gupta <ab...@gmail.com> wrote:

> One error that stands out comes from NetUtil.py, which fails to connect to
> the Slider agent service. The OS Python version is 2.7.6, where the ssl
> module does not have the attribute `_create_unverified_context`. This
> results in the following error when starting LLAP for Hive2:
>
> INFO 2019-01-17 09:04:59,889 NetUtil.py:66 - Failed to connect to
> https://shdp-ycn04:38097/ws/v1/slider/agents/ due to 'module' object has
> no attribute '_create_unverified_context'
> INFO 2019-01-17 09:04:59,889 NetUtil.py:85 - Server at
> https://shdp-ycn04:38097/ws/v1/slider/agents/ is not reachable, sleeping
> for 10 seconds...
>
> On Thu, Jan 17, 2019 at 1:05 PM Abhishek Gupta <ab...@gmail.com>
> wrote:
>
>> I have been trying to start Hive2 Interactive on HDP 2.6.5 that ships
>> with Hive 2.1.0. I have followed the guidelines mentioned in the following
>> articles about how to size and tune LLAP with Yarn
>>
>> https://community.hortonworks.com/articles/149486/llap-sizing-and-setup.html
>>
>> But LLAP daemon fails to start and Hive interactive gives up after
>> configured retries.
>> Also followed the following steps to troubleshoot and the error doesn't
>> seem to be sizing related.
>>
>> https://community.hortonworks.com/articles/149899/investigating-when-llap-doesnt-start.html
>>
>> Following is my cluster setup in brief:
>>
>> hive.llap.daemon.yarn.container.mb= 10 GB
>> llap_heap_size= 7 GB
>> In-Memory Cache per Daemon= 2 GB
>> Head room= 1 GB
>> (hive.llap.daemon.yarn.container.mb should be = llap_heap_size +
>> In-Memory Cache per Daemon + Headroom)
>> Yarn container max= 14.6 GB, min 1 GB
>> Number of executors per LLAP Daemon= 1
>> Yarn pre-emption enabled
>> Retries for checking LLAP status= 20
>>
>> Attached the startup script logs and Yarn container relevant logs.
>> https://pastebin.com/raw/4ywJHp4M
>> https://pastebin.com/raw/AiJkfwUR
>>
>> Following is a snapshot of the Slider AM page
>>                Application: llap0
>> Status all containers allocated
>> Total number of containers 5
>> Create time: 17 Jan 2019 05:40:15 GMT
>> Running since: 17 Jan 2019 05:40:15 GMT
>> Time last flexed: N/A
>> Application storage path: hdfs://hsft/user/hive/.slider/cluster/llap0/database
>>
>> Application configuration path: hdfs://hsft/user/hive/.slider/cluster/llap0/snapshot
>>
>> Component Instances
>> *Component* *Desired* *Actual* *Outstanding Requests* *Failed* *Failed
>> to start* *Placement*
>> LLAP 4 4 0 0 0
>> slider-appmaster 1 1 0 0 0
>>
>>
>>
>> Application Container Diagnostics
>> *Container ID* *Component* *State* *Exit Code* *Logs* *Diagnostics*
>> container_e27_1547703090681_0001_01_000002 LLAP 3 -1000 Logs
>> container_e27_1547703090681_0001_01_000003 LLAP 3 -1000 Logs
>> container_e27_1547703090681_0001_01_000004 LLAP 3 -1000 Logs
>> container_e27_1547703090681_0001_01_000005 LLAP 3 -1000 Logs

Re: Hive2 Interactive LLAP fails to start

Posted by Abhishek Gupta <ab...@gmail.com>.
One error that stands out comes from NetUtil.py, which fails to connect to
the Slider agent service. The OS Python version is 2.7.6, where the ssl
module does not have the attribute `_create_unverified_context`. This
results in the following error when starting LLAP for Hive2:

INFO 2019-01-17 09:04:59,889 NetUtil.py:66 - Failed to connect to
https://shdp-ycn04:38097/ws/v1/slider/agents/ due to 'module' object has no
attribute '_create_unverified_context'
INFO 2019-01-17 09:04:59,889 NetUtil.py:85 - Server at
https://shdp-ycn04:38097/ws/v1/slider/agents/ is not reachable, sleeping
for 10 seconds...
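
A quick way to confirm whether a given interpreter is affected is to test for the attribute directly. `ssl._create_unverified_context` was added in Python 2.7.9, so it is absent on 2.7.6. The helper name has_unverified_context below is hypothetical, and the sketch assumes it is run with the same Python interpreter that the Slider agent uses.

```python
import ssl
import sys


def has_unverified_context(ssl_module=ssl):
    """Return True if the given ssl module exposes _create_unverified_context.

    Hypothetical helper for illustration; it only inspects the module and
    does not open any connection.
    """
    return hasattr(ssl_module, "_create_unverified_context")


# Report the interpreter version alongside the result, since the attribute
# only exists from Python 2.7.9 onwards.
print("Python %s" % sys.version.split()[0])
print("ssl._create_unverified_context available: %s" % has_unverified_context())
```

If this reports False on the nodes that run the LLAP containers, the interpreter predates 2.7.9, which would be consistent with the NetUtil.py message above.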

On Thu, Jan 17, 2019 at 1:05 PM Abhishek Gupta <ab...@gmail.com> wrote:

> I have been trying to start Hive2 Interactive on HDP 2.6.5 that ships with
> Hive 2.1.0. I have followed the guidelines mentioned in the following
> articles about how to size and tune LLAP with Yarn
>
> https://community.hortonworks.com/articles/149486/llap-sizing-and-setup.html
>
> But LLAP daemon fails to start and Hive interactive gives up after
> configured retries.
> Also followed the following steps to troubleshoot and the error doesn't
> seem to be sizing related.
>
> https://community.hortonworks.com/articles/149899/investigating-when-llap-doesnt-start.html
>
> Following is my cluster setup in brief:
>
> hive.llap.daemon.yarn.container.mb= 10 GB
> llap_heap_size= 7 GB
> In-Memory Cache per Daemon= 2 GB
> Head room= 1 GB
> (hive.llap.daemon.yarn.container.mb should be = llap_heap_size + In-Memory
> Cache per Daemon + Headroom)
> Yarn container max= 14.6 GB, min 1 GB
> Number of executors per LLAP Daemon= 1
> Yarn pre-emption enabled
> Retries for checking LLAP status= 20
>
> Attached the startup script logs and Yarn container relevant logs.
> https://pastebin.com/raw/4ywJHp4M
> https://pastebin.com/raw/AiJkfwUR
>
> Following is a snapshot of the Slider AM page
>                Application: llap0
> Status all containers allocated
> Total number of containers 5
> Create time: 17 Jan 2019 05:40:15 GMT
> Running since: 17 Jan 2019 05:40:15 GMT
> Time last flexed: N/A
> Application storage path: hdfs://hsft/user/hive/.slider/cluster/llap0/database
>
> Application configuration path: hdfs://hsft/user/hive/.slider/cluster/llap0/snapshot
>
> Component Instances
> *Component* *Desired* *Actual* *Outstanding Requests* *Failed* *Failed to
> start* *Placement*
> LLAP 4 4 0 0 0
> slider-appmaster 1 1 0 0 0
>
>
>
> Application Container Diagnostics
> *Container ID* *Component* *State* *Exit Code* *Logs* *Diagnostics*
> container_e27_1547703090681_0001_01_000002 LLAP 3 -1000 Logs
> container_e27_1547703090681_0001_01_000003 LLAP 3 -1000 Logs
> container_e27_1547703090681_0001_01_000004 LLAP 3 -1000 Logs
> container_e27_1547703090681_0001_01_000005 LLAP 3 -1000 Logs
