Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2018/03/15 18:02:00 UTC

[jira] [Updated] (HIVE-18966) LLAP should not shut down when some random thread goes down

     [ https://issues.apache.org/jira/browse/HIVE-18966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-18966:
------------------------------------
    Description: 
LlapDaemonUncaughtExceptionHandler shuts down LLAP on any thread failure.
Not sure it's the best idea... 3rd party libraries like permitted UDFs or Storage Handlers (e.g. Druid recently) can have errors that should not bring the entire daemon down.
Perhaps we can go by thread name pattern?
Overall logging the error, unless it's an OOM or other Error, might be better.

We can also add error handling to important threads like schedulers, if it's missing, that converts an exception into a critical one telling the handler to shut everything down.

  was:
LlapDaemonUncaughtExceptionHandler shuts down LLAP on any thread failure.
Not sure it's the best idea... 3rd party libraries like permitted UDFs or Storage Handlers (e.g. Druid recently) can have errors that should not bring the entire daemon down.
Perhaps we can go by thread name pattern?
Overall logging the error, unless it's an OOM or other Error, might be better.

We can also add error handling to important threads like schedulers, if it's missing.


> LLAP should not shut down when some random thread goes down
> -----------------------------------------------------------
>
>                 Key: HIVE-18966
>                 URL: https://issues.apache.org/jira/browse/HIVE-18966
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Major
>
> LlapDaemonUncaughtExceptionHandler shuts down LLAP on any thread failure.
> Not sure it's the best idea... 3rd party libraries like permitted UDFs or Storage Handlers (e.g. Druid recently) can have errors that should not bring the entire daemon down.
> Perhaps we can go by thread name pattern?
> Overall logging the error, unless it's an OOM or other Error, might be better.
> We can also add error handling to important threads like schedulers, if it's missing, that converts an exception into a critical one telling the handler to shut everything down.
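
The proposal above could be sketched roughly as follows. This is only an illustration of the idea, not the actual LlapDaemonUncaughtExceptionHandler code: the class name SelectiveUncaughtExceptionHandler, the CriticalThreadFailure marker exception, the critical() wrapper and the shutdownAction callback are all hypothetical names made up for this example. The handler logs ordinary exceptions and only triggers shutdown for an Error (such as OOM) or for an exception that a critical thread has explicitly escalated. The thread name pattern idea is not covered here.

```java
public class SelectiveUncaughtExceptionHandler implements Thread.UncaughtExceptionHandler {

  /** Hypothetical marker: critical threads wrap their failures in this to request a shutdown. */
  public static class CriticalThreadFailure extends RuntimeException {
    public CriticalThreadFailure(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Assumption: the daemon supplies whatever actually performs the shutdown.
  private final Runnable shutdownAction;

  public SelectiveUncaughtExceptionHandler(Runnable shutdownAction) {
    this.shutdownAction = shutdownAction;
  }

  /** An Error (e.g. OutOfMemoryError) or an explicitly escalated failure is fatal. */
  static boolean isFatal(Throwable t) {
    return t instanceof Error || t instanceof CriticalThreadFailure;
  }

  @Override
  public void uncaughtException(Thread thread, Throwable e) {
    if (isFatal(e)) {
      System.err.println("Fatal failure in thread " + thread.getName() + ", shutting down: " + e);
      shutdownAction.run();
    } else {
      // E.g. a bug in a UDF or storage handler: log it and keep the daemon up.
      System.err.println("Non-fatal failure in thread " + thread.getName() + ", continuing: " + e);
    }
  }

  /** Wraps an important task (e.g. a scheduler loop) so that its exceptions are escalated. */
  public static Runnable critical(String name, Runnable task) {
    return () -> {
      try {
        task.run();
      } catch (RuntimeException ex) {
        throw new CriticalThreadFailure(name + " failed", ex);
      }
    };
  }
}
```

With something like this, a scheduler thread would be started as new Thread(SelectiveUncaughtExceptionHandler.critical("scheduler", loop)), while threads running third-party code would be left unwrapped so that their exceptions are only logged.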



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)