Posted to issues-all@impala.apache.org by "Gabor Kaszab (JIRA)" <ji...@apache.org> on 2019/03/14 14:00:00 UTC
[jira] [Commented] (IMPALA-8243) ConcurrentModificationException in Catalog stress tests
[ https://issues.apache.org/jira/browse/IMPALA-8243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792672#comment-16792672 ]
Gabor Kaszab commented on IMPALA-8243:
--------------------------------------
Hey [~bharathv],
I see this was submitted. Can this be resolved and the fix version set to 3.2?
fced1cc IMPALA-8243: Fix racy access to nonPartFieldSchemas_
> ConcurrentModificationException in Catalog stress tests
> -------------------------------------------------------
>
> Key: IMPALA-8243
> URL: https://issues.apache.org/jira/browse/IMPALA-8243
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.1.0
> Reporter: bharath v
> Assignee: bharath v
> Priority: Blocker
>
> The following is the full stack trace from the Catalog server logs.
> {noformat}
> 14:09:29.474424 14829 jni-util.cc:256] java.util.ConcurrentModificationException
> java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
> java.util.ArrayList$Itr.next(ArrayList.java:851)
> org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1449)
> org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1278)
> org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1144)
> org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1062)
> org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:919)
> org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:815)
> org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:862)
> org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:759)
> org.apache.impala.thrift.TPartialPartitionInfo.write(TPartialPartitionInfo.java:665)
> org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:731)
> org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:624)
> org.apache.impala.thrift.TPartialTableInfo.write(TPartialTableInfo.java:543)
> org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:977)
> org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:857)
> org.apache.impala.thrift.TGetPartialCatalogObjectResponse.write(TGetPartialCatalogObjectResponse.java:739)
> org.apache.thrift.TSerializer.serialize(TSerializer.java:79)
> org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:233)
> {noformat}
> It looks like the bug is in the following piece of code.
> {noformat}
> /**
>  * Returns a Hive-compatible partition object that may be used in calls to the
>  * metastore.
>  */
> public org.apache.hadoop.hive.metastore.api.Partition toHmsPartition() {
>   if (cachedMsPartitionDescriptor_ == null) return null;
>   Preconditions.checkNotNull(table_.getNonPartitionFieldSchemas());
>   // Update the serde library class based on the currently used file format.
>   org.apache.hadoop.hive.metastore.api.StorageDescriptor storageDescriptor =
>       new org.apache.hadoop.hive.metastore.api.StorageDescriptor(
>           table_.getNonPartitionFieldSchemas(), // <===== Reference to the actual field schema list.
>           getLocation(),
>           cachedMsPartitionDescriptor_.sdInputFormat,
>           cachedMsPartitionDescriptor_.sdOutputFormat,
>           cachedMsPartitionDescriptor_.sdCompressed,
> {noformat}
> It appears we are leaking a reference to {{nonPartFieldSchemas_}} into the Thrift object. Once the thread leaves the lock scope, another thread (load(), for example) can modify the source list, and the serialization code can then throw {{ConcurrentModificationException}}.
> While the stack above is Catalog-v2 only, it is possible that some other threads can race in a similar fashion.
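
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Impala code: FakeTable, its column list, and the method names are hypothetical stand-ins for the table and {{nonPartFieldSchemas_}}. It triggers the same fail-fast iterator check from a single thread so the result is deterministic, and it contrasts handing out the internal list with handing out a copy taken under the lock. A defensive copy is one common remedy; it is not necessarily what the committed fix does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class LeakedListSketch {
  // Hypothetical stand-in for a table that owns a mutable schema list.
  static class FakeTable {
    private final List<String> cols_ = new ArrayList<>(Arrays.asList("id", "name"));

    // Leaks the internal list: callers keep using it after the lock is released.
    synchronized List<String> getColsByReference() { return cols_; }

    // Returns a snapshot that is copied while the lock is still held.
    synchronized List<String> getColsCopy() { return new ArrayList<>(cols_); }

    // Simulates a reload that structurally modifies the internal list.
    synchronized void reload() { cols_.add("added_by_reload"); }
  }

  // Starts iterating 'view' (as a serializer would), reloads the table in the
  // middle of the iteration, and reports whether the iteration then fails.
  static boolean failsAfterReload(FakeTable table, List<String> view) {
    Iterator<String> it = view.iterator();
    it.next();
    table.reload();
    try {
      it.next();
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    FakeTable leaked = new FakeTable();
    System.out.println("by reference fails: "
        + failsAfterReload(leaked, leaked.getColsByReference()));  // true

    FakeTable copied = new FakeTable();
    System.out.println("copy fails: "
        + failsAfterReload(copied, copied.getColsCopy()));  // false
  }
}
```

The copy only helps if it is made while the lock protecting the source list is held; copying after the lock is released would iterate the shared list and reopen the same race.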