You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2017/10/17 19:54:06 UTC

[jira] [Comment Edited] (HIVE-17431) change configuration handling in TezSessionState

    [ https://issues.apache.org/jira/browse/HIVE-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208230#comment-16208230 ] 

Sergey Shelukhin edited comment on HIVE-17431 at 10/17/17 7:53 PM:
-------------------------------------------------------------------

W.r.t. the logical problems (the other stuff I will address in due course) - looking at code in branch-2, before all the fancy changes, I see many paths where the user config would be applied; however, those are all for a new, or failed and reopened, session.
The main pool use path goes like this:
TezTask calls getSession with its unique config. That calls canWork..., that only checks doAs and queue name; then calls the private getSession/2. That again checks doAs and queue, also validates the config; however, if doAs and queue check out, it returns the session from the pool without ever using conf again.
Then, the only way conf interacts with the session is if it's not open (shouldn't happen with pool sessions on main path), or to add resources. Adding resources passes the conf from TezTask around, and doesn't interact with session's own config. Only resources are changed based on the external (to the session) conf object.
So, on this path the user config is never applied (beyond what's sent with the DAG, if anything).

So it seems like it's inconsistent between special cases (custom queue, doAs, reopen, pool session not open - the new session will be opened with the new conf from TezTask) and the normal case where the session will not get the new config, it will just come from the pool with whatever config it had. When the files are missing it looks like they are added without reopen (first in in TezTask::updateSession, and then eventually via calling addAppMasterLocalFiles in submit).
Given that the normal case appears to work, I think the whole new config propagation is unnecessary (beyond doAs, queue, and file list verification). Does this make sense?

cc [~hagleitn] [~vikram.dixit] [~sseth] I'm not sure who's more familiar with this logic :)





was (Author: sershe):
W.r.t. the logical problems (the other stuff I will address in due course) - looking at code in branch-2, before all the fancy changes, I see many paths where the user config would be applied; however, those are all for a new, or failed and reopened, session.
The main pool use path goes like this:
TezTask calls getSession with its unique config. That calls canWork..., that only checks doAs and queue name; then calls the private getSession/2. That again checks doAs and queue, also validates the config; however, if doAs and queue check out, it returns the session from the pool without ever using conf again.
Then, the only way conf interacts with the session is if it's not open (shouldn't happen with pool sessions on main path), or to add resources. Adding resources passes the conf that's passed in from TezTask around, and doesn't interact with session config it seems. Only resources are changed.
So, on this path the user config is never applied (beyond what's sent with the DAG, if anything).

So it seems like it's inconsistent between special cases (custom queue, doAs, reopen, pool session not open - the new session will be opened with the new conf) and the normal case where the session will not get the new conf. When the files are missing it looks like they are added without reopen (first in in TezTask::updateSession, and then eventually via calling addAppMasterLocalFiles in submit).
Given that the normal case appears to work, I think the whole new config propagation is unnecessary (beyond doAs, queue, and file list verification). Does this make sense?

cc [~hagleitn] [~vikram.dixit] [~sseth] I'm not sure who's more familiar with this logic :)




> change configuration handling in TezSessionState
> ------------------------------------------------
>
>                 Key: HIVE-17431
>                 URL: https://issues.apache.org/jira/browse/HIVE-17431
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-17431.patch
>
>
> The configuration is only set when opening the session; that seems unnecessary - it could be set in the ctor and made final. E.g. when updating the session and localizing new resources we may theoretically open the session with a new config, but we don't update the config and only update the files if the session is already open, which seems to imply that it's ok to not update the config. 
> In most cases, the session is opened only once or reopened without intending to change the config (e.g. if it times out).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)