You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2022/11/11 10:01:00 UTC

[jira] [Commented] (IMPALA-11720) FileMetadataLoaderTest is flaky due to IOException: Filesystem closed

    [ https://issues.apache.org/jira/browse/IMPALA-11720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632224#comment-17632224 ] 

Quanlong Huang commented on IMPALA-11720:
-----------------------------------------

This is a subtle bug that the FileSystem object is first created in Path#getFileSystem(), and then unintentionally closed by InitLibhdfs() in the c++ code. So the following usages on the FileSystem object will raise this IOException of "Filesystem closed".

It all happens in FileMetadataLoader#load():
{code:java}
public void load() throws CatalogException, IOException {
  Preconditions.checkState(loadStats_ == null, "already loaded");
  loadStats_ = new LoadStats(partDir_);
  FileSystem fs = partDir_.getFileSystem(CONF);  // Create FileSystem object here since it's the first get
  ...
  boolean listWithLocations = FileSystemUtil.supportsStorageIds(fs) &&
      (oldFdsByRelPath_.isEmpty() || forceRefreshLocations);  // Trigger static code of FileSystemUtil here{code}
In FE tests that not extend FrontendTestBase, BackendConfig.INSTANCE is not initialized. So the static code of FileSystemUtil will init it:
[https://github.com/apache/impala/blob/645df57af8db40bff19407b7006b469bc3298737/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L869-L873]

This finally inits FeSupport here:
[https://github.com/apache/impala/blob/b6d6e86f1ec3fc45dccdbdc0933a235668dae2c4/fe/src/main/java/org/apache/impala/service/FeSupport.java#L494]
The stacktrace:
{noformat}
22/11/11 16:28:36 INFO service.FeSupport: Loading libfesupport.so
java.lang.Throwable
        at org.apache.impala.service.FeSupport.loadLibrary(FeSupport.java:490)
        at org.apache.impala.service.FeSupport.LookupSymbol(FeSupport.java:213)
        at org.apache.impala.service.FeSupport.LookupSymbol(FeSupport.java:222)
        at org.apache.impala.catalog.Function.lookupSymbol(Function.java:475)
        at org.apache.impala.catalog.ScalarFunction.createBuiltin(ScalarFunction.java:185)
        at org.apache.impala.catalog.ScalarFunction.createBuiltinOperator(ScalarFunction.java:173)
        at org.apache.impala.catalog.ScalarFunction.createBuiltinOperator(ScalarFunction.java:168)
        at org.apache.impala.analysis.ArithmeticExpr.initBuiltins(ArithmeticExpr.java:105)
        at org.apache.impala.catalog.BuiltinsDb.initBuiltins(BuiltinsDb.java:105)
        at org.apache.impala.catalog.BuiltinsDb.<init>(BuiltinsDb.java:94)
        at org.apache.impala.catalog.BuiltinsDb.getInstance(BuiltinsDb.java:79)
        at org.apache.impala.analysis.SqlScanner.init(SqlScanner.java:752)
        at org.apache.impala.analysis.SqlScanner.<clinit>(SqlScanner.java:768)
        at org.apache.impala.service.BackendConfig.create(BackendConfig.java:60)
        at org.apache.impala.service.BackendConfig.create(BackendConfig.java:53)
        at org.apache.impala.common.FileSystemUtil.<clinit>(FileSystemUtil.java:870)
        at org.apache.impala.catalog.FileMetadataLoader.load(FileMetadataLoader.java:176)
        at org.apache.impala.catalog.FileMetadataLoaderTest.testLoadMissingDirectory(FileMetadataLoaderTest.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:236)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:386)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:323)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:143){noformat}
It then goes into JniUtil::InitLibhdfs():
{noformat}
Java_org_apache_impala_service_FeSupport_NativeFeInit()
-> InitCommonRuntime()
   -> JniUtil::InitLibhdfs(){noformat}
It closes the FileSystem object:
{code:cpp}
void JniUtil::InitLibhdfs() {
  // make random libhdfs calls to make sure that the context class loader isn't
  // null; see xxx for an explanation
  hdfsFS fs = hdfsConnect("default", 0);
  hdfsDisconnect(fs);
}
{code}
Note that hdfsDisconnect() is the cpp equivalent for FileSystem#close():
{code:c}
int hdfsDisconnect(hdfsFS fs)
{
    // JAVA EQUIVALENT:
    //  fs.close()
{code}
[https://github.com/apache/hadoop/blob/5bb11cecea136acccac2563b37021b554e517012/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c#L869-L872]

> FileMetadataLoaderTest is flaky due to IOException: Filesystem closed
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-11720
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11720
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>
> IMPALA-11699 recently reverts the changes in FileMetadataLoaderTest that extends it to  FrontendTestBase. This causes tests in it become flaky if run individually. E.g. the following commands all fail:
> {code:java}
> (pushd fe && mvn test -Dtest=FileMetadataLoaderTest#testRecursiveLoading)
> (pushd fe && mvn test -Dtest=FileMetadataLoaderTest#testHudiParquetLoading)
> (pushd fe && mvn test -Dtest=FileMetadataLoaderTest#testIcebergLoading)
> (pushd fe && mvn test -Dtest=FileMetadataLoaderTest#testAcidMinorCompactionLoading)
> (pushd fe && mvn test -Dtest=FileMetadataLoaderTest#testLoadMissingDirectory)
> (pushd fe && mvn test -Dtest=FileMetadataLoaderTest#testSkipHiddenDirectories){code}
> and the failures are the same, i.e. due to java.io.IOException: Filesystem closed
> {noformat}
> $ (pushd fe && mvn test -Dtest=FileMetadataLoaderTest#testLoadMissingDirectory)
> ...
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.601 s <<< FAILURE! - in org.apache.impala.catalog.FileMetadataLoaderTest
> [ERROR] testLoadMissingDirectory(org.apache.impala.catalog.FileMetadataLoaderTest)  Time elapsed: 1.565 s  <<< ERROR!
> java.io.IOException: Filesystem closed
> 	at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:483)
> 	at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1662)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1266)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208)
> 	at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2145)
> 	at org.apache.impala.common.FileSystemUtil.listLocatedStatusIterator(FileSystemUtil.java:804)
> 	at org.apache.impala.common.FileSystemUtil$RecursingIterator.<init>(FileSystemUtil.java:975)
> 	at org.apache.impala.common.FileSystemUtil$RecursingIterator.<init>(FileSystemUtil.java:958)
> 	at org.apache.impala.common.FileSystemUtil.listFiles(FileSystemUtil.java:768)
> 	at org.apache.impala.catalog.FileMetadataLoader.load(FileMetadataLoader.java:187)
> 	at org.apache.impala.catalog.FileMetadataLoaderTest.testLoadMissingDirectory(FileMetadataLoaderTest.java:192)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:236)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:386)
> 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:323)
> 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:143)
> {noformat}
> However, any test runs after the failure will succeed. If we run the test twice (by adding -Dsurefire.rerunFailingTestsCount=2), the second run also succeed:
> {noformat}
> $ (pushd fe && mvn test -Dtest=FileMetadataLoaderTest#testLoadMissingDirectory -Dsurefire.rerunFailingTestsCount=2)
> ...
> [WARNING] Flakes: 
> [WARNING] org.apache.impala.catalog.FileMetadataLoaderTest.testLoadMissingDirectory(org.apache.impala.catalog.FileMetadataLoaderTest)
> [ERROR]   Run 1: FileMetadataLoaderTest.testLoadMissingDirectory:192 » IO Filesystem closed
> [INFO]   Run 2: PASS
> [INFO] 
> [INFO] 
> [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Flakes: 1
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org