Posted to issues@hive.apache.org by "Hari Sekhon (JIRA)" <ji...@apache.org> on 2018/10/03 16:24:00 UTC

[jira] [Updated] (HIVE-20666) HiveServer2 Interactive LLAP (re)connect to already running Yarn llap0 app

     [ https://issues.apache.org/jira/browse/HIVE-20666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HIVE-20666:
-------------------------------
    Summary: HiveServer2 Interactive LLAP (re)connect to already running Yarn llap0 app  (was: HiveServer2 Interactive LLAP reconnect to already running Yarn app)

> HiveServer2 Interactive LLAP (re)connect to already running Yarn llap0 app
> --------------------------------------------------------------------------
>
>                 Key: HIVE-20666
>                 URL: https://issues.apache.org/jira/browse/HIVE-20666
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2, llap
>    Affects Versions: 1.2.1
>            Reporter: Hari Sekhon
>            Priority: Major
>
> Improve HiveServer2 Interactive LLAP so that it (re)connects to an already running Hive LLAP YARN app.
> Currently HiveServer2 Interactive startup may fail with the following error if it cannot get enough containers on the queue:
> {code:java}
> WARN cli.LlapStatusServiceDriver: Watch timeout 200s exhausted before desired state RUNNING is attained.
> 2018-10-01 16:26:55,624 - LLAP app 'llap0' in 'RUNNING_PARTIAL' state. Live Instances : '3'. Desired Instances : '4' after 212.498996019 secs.
> 2018-10-01 16:26:55,624 - App state is RUNNING_PARTIAL. Live Instances : '3', Desired Instance : '4'
> 2018-10-01 16:26:55,624 - LLAP app 'llap0' deployment unsuccessful.
> 2018-10-01 16:26:55,625 - Stopping LLAP
> 2018-10-01 16:26:55,625 - call[['slider', 'stop', u'llap0']] {'logoutput': True, 'user': 'hive', 'stderr': -1}{code}
> Meanwhile, I could see 5 containers from a previous Hive LLAP invocation on the YARN scheduler page. This is the only HiveServer2 Interactive instance, so it appears that it was not (re)connecting to and making use of the already running llap0 app. It is also possible that the containers were simply slow to allocate because the cluster was operating at 100% capacity, and so were not fully initialized when the deployment was declared unsuccessful. Either way, the error output does not give enough detail about the state of the llap0 app to tell which happened.
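
The requested behaviour can be illustrated with a minimal sketch. It assumes the status line format quoted in the log above and is not the actual HiveServer2 or Ambari code. The names `parse_status`, `should_reuse` and `min_live` are hypothetical, and the policy (keep a RUNNING or RUNNING_PARTIAL app that has at least `min_live` live instances instead of stopping it) is only one possible reading of the request.

```python
import re
from typing import NamedTuple, Optional

# Matches the status line format quoted in the log above (assumed stable).
STATUS_RE = re.compile(
    r"LLAP app '(?P<app>[^']+)' in '(?P<state>[A-Z_]+)' state\. "
    r"Live Instances : '(?P<live>\d+)'\. "
    r"Desired Instances : '(?P<desired>\d+)'"
)


class LlapStatus(NamedTuple):
    app: str
    state: str
    live: int
    desired: int


def parse_status(line: str) -> Optional[LlapStatus]:
    """Extract app name, state and instance counts from one log line."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return LlapStatus(m["app"], m["state"], int(m["live"]), int(m["desired"]))


def should_reuse(status: LlapStatus, min_live: int = 1) -> bool:
    """Hypothetical policy: reuse the running app rather than stopping it
    when it is at least partially up with min_live or more live instances."""
    return status.state in ("RUNNING", "RUNNING_PARTIAL") and status.live >= min_live


line = ("2018-10-01 16:26:55,624 - LLAP app 'llap0' in 'RUNNING_PARTIAL' state. "
        "Live Instances : '3'. Desired Instances : '4' after 212.498996019 secs.")
status = parse_status(line)
if status is not None and should_reuse(status):
    print(f"reuse {status.app}: {status.live}/{status.desired} instances live")
```

Under this policy the llap0 app from the log, with 3 of 4 instances live, would be kept and reused rather than stopped.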


