Posted to issues@hive.apache.org by "BELUGA BEHR (JIRA)" <ji...@apache.org> on 2018/10/01 16:26:00 UTC
[jira] [Assigned] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
BELUGA BEHR reassigned HIVE-20665:
----------------------------------
Assignee: BELUGA BEHR
> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 2.3.2, 3.1.0, 4.0.0
> Reporter: BELUGA BEHR
> Assignee: BELUGA BEHR
> Priority: Major
>
> When parallel tasks are enabled in Hive, all of the resulting queries share the same Hive configuration. This is problematic because each query modifies the same {{HiveConf}} object with values such as the query ID and query text. The queries overwrite each other's values, and the unsynchronized writes can cause {{ConcurrentModificationException}} failures.
> {code:java|title=SQLOperation.java}
> public Object run() throws HiveSQLException {
>   Hive.set(parentHive, false);
>   // TODO: can this result in cross-thread reuse of session state?
>   SessionState.setCurrentSessionState(parentSessionState);
>   PerfLogger.setPerfLogger(SessionState.getPerfLogger());
>   LogUtils.registerLoggingContext(queryState.getConf());
>   try {
>     if (asyncPrepare) {
>       prepare(queryState);
>     }
>     runQuery();
>   } catch (HiveSQLException e) {
>     // ...
> {code}
> [Code Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]
> From this code we can see that every thread launched calls {{setCurrentSessionState}} with the same parent session state.
> {code:java|title=SessionState.java}
> /**
>  * Sets the given session state in the thread local var for sessions.
>  */
> public static void setCurrentSessionState(SessionState startSs) {
>   tss.get().attach(startSs);
> }
> // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
> private static class SessionStates {
>   private SessionState state;
>   private HiveConf conf;
>   private void attach(SessionState state) {
>     this.state = state;
>     attach(state.getConf());
>   }
>   private void attach(HiveConf conf) {
>     // -- SHALLOW COPY HERE, ALL THREADS SHARING SAME REFERENCE -- //
>     this.conf = conf;
>     ClassLoader classLoader = conf.getClassLoader();
>     if (classLoader != null) {
>       Thread.currentThread().setContextClassLoader(classLoader);
>     }
>   }
> }
> {code}
> [Code Here|https://github.com/apache/hive/blob/7795c0a7dc59941671f8845d78b16d9e5ddc9ea3/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L540-L556]
> Ensure that each thread gets its own copy of the {{HiveConf}} object to use and modify, rather than a reference to the shared one.
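
The idea of giving each thread its own configuration copy can be sketched without any Hive dependencies. The example below is only an illustration under stated assumptions, not the actual patch: DemoConf is a hypothetical stand-in for HiveConf, PerThreadConfDemo and the "demo.*" keys are invented for the sketch, and the copy is taken on the submitting thread before the task starts so that no thread ever writes to the shared parent object. In Hive itself the copy would presumably go through the existing HiveConf(HiveConf) copy constructor, but where exactly that copy should happen is left open here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for HiveConf: a mutable, non-thread-safe key/value store.
final class DemoConf {
    private final Map<String, String> props = new HashMap<>();

    DemoConf() {}

    // Copy constructor: the new object owns a separate map, so writes to the
    // copy are never visible through the original (and vice versa).
    DemoConf(DemoConf other) {
        this.props.putAll(other.props);
    }

    void set(String key, String value) { props.put(key, value); }

    String get(String key) { return props.get(key); }
}

final class PerThreadConfDemo {

    // Launches n parallel tasks. Each task receives its own copy of the parent
    // configuration, writes its query ID into that copy, and reads it back.
    static List<String> runTasks(DemoConf parent, int n) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final String queryId = "query-" + i;
                // Copy on the submitting thread, before the task runs, so the
                // shared parent is only ever read from a single thread.
                final DemoConf own = new DemoConf(parent);
                futures.add(pool.submit(() -> {
                    own.set("demo.query.id", queryId);
                    return own.get("demo.query.id");
                }));
            }
            List<String> seen = new ArrayList<>();
            for (Future<String> f : futures) {
                seen.add(f.get());
            }
            return seen;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        DemoConf parent = new DemoConf();
        parent.set("demo.parallel", "true");
        System.out.println(runTasks(parent, 4));         // each task sees only its own ID
        System.out.println(parent.get("demo.query.id")); // parent was never modified
    }
}
```

With the shallow assignment from the quoted attach() method, every task would instead write "demo.query.id" into the same map, so the last writer wins and concurrent writers can corrupt or trip over each other's updates. The copy costs one map duplication per task, which is the trade-off for isolation.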
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)