Posted to commits@impala.apache.org by kw...@apache.org on 2019/02/26 05:29:17 UTC

[impala] 03/04: IMPALA-8243: Fix racy access to nonPartFieldSchemas_

This is an automated email from the ASF dual-hosted git repository.

kwho pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit fced1cc1bbe62bb94b82962aea0560df8ed00d2d
Author: Bharath Vissapragada <bh...@cloudera.com>
AuthorDate: Sat Feb 23 14:25:39 2019 -0800

    IMPALA-8243: Fix racy access to nonPartFieldSchemas_
    
    ** Please refer to the jira for the full stacktrace.
    
    When constructing the HMS partition state for a given partition,
    we leak a reference to the HdfsTable#nonPartFieldSchemas_ list.
    The construction happens under the table lock, but once the thread
    exits the lock scope, the source list can be modified concurrently by
    another operation (say, a refresh), which interferes with the original
    thread if it then accesses the list.
    
    The fix is to make a shallow copy of the source list so that later
    changes to the source list do not affect the original caller.
    
    This was found by a stress test running refresh operations and
    GetPartialCatalogObject() calls under heavy concurrency.
    
    Testing:
    ---------
    - I tried several combinations of operations in the unit-test
    framework but couldn't reproduce the stack trace, probably because
    the operations are very short-lived.
    
    - However, after deploying the patched jar on the stress test cluster,
    the exception did not recur.
    
    Change-Id: I7d68b54af2ba954cf0ffa7b2533cde7be835be77
    Reviewed-on: http://gerrit.cloudera.org:8080/12572
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
    Reviewed-by: Paul Rogers <pr...@cloudera.com>
---
 fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java b/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
index b9e5c55..56a2340 100644
--- a/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
+++ b/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
@@ -859,7 +859,10 @@ public class HdfsPartition implements FeFsPartition, PrunablePartition {
     // Update the serde library class based on the currently used file format.
     org.apache.hadoop.hive.metastore.api.StorageDescriptor storageDescriptor =
         new org.apache.hadoop.hive.metastore.api.StorageDescriptor(
-            table_.getNonPartitionFieldSchemas(),
+          // Make a shallow copy of the field schemas instead of passing a reference to
+          // the source list since it could potentially be modified once the current
+          // thread is out of table lock scope.
+            new ArrayList<>(table_.getNonPartitionFieldSchemas()),
             getLocation(),
             cachedMsPartitionDescriptor_.sdInputFormat,
             cachedMsPartitionDescriptor_.sdOutputFormat,
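
The aliasing problem that the hunk above addresses can be illustrated with a minimal, single-threaded sketch. It is not Impala code: the ShallowCopyDemo and Table classes, the String elements, and the in-place refresh() are hypothetical stand-ins, and the sketch assumes the owner mutates its list in place. It only shows that a caller holding the owner's list sees later modifications, while a caller holding a new ArrayList built from it does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShallowCopyDemo {
  // Hypothetical stand-in for a table that owns a mutable schema list.
  static class Table {
    private final List<String> fieldSchemas =
        new ArrayList<>(Arrays.asList("id", "name"));

    // Returns the internal list itself, so callers alias the table's state.
    List<String> getFieldSchemas() {
      return fieldSchemas;
    }

    // Assumption for illustration: a refresh rebuilds the list in place.
    void refresh() {
      fieldSchemas.clear();
      fieldSchemas.add("id");
    }
  }

  public static void main(String[] args) {
    Table table = new Table();

    // Leaked reference: the same list object the table keeps mutating.
    List<String> leaked = table.getFieldSchemas();
    // Shallow copy: a new list holding the same element references.
    List<String> copied = new ArrayList<>(table.getFieldSchemas());

    table.refresh();

    System.out.println("leaked: " + leaked); // prints "leaked: [id]"
    System.out.println("copied: " + copied); // prints "copied: [id, name]"
  }
}
```

The copy is shallow, so the elements themselves are still shared. It protects the caller from structural changes to the source list (adds, removes, clears), not from mutation of the individual element objects.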