Posted to dev@accumulo.apache.org by Mark Jens <ma...@gmail.com> on 2021/11/30 09:04:35 UTC

Consistent IT tests failures on Linux ARM64

Hello Accumulo community,

At my job we are considering using Linux ARM64 servers, and I've been tasked
with testing Accumulo on them.

I am facing timeout-related issues with several integration tests (ITs):


[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
 Time elapsed: 420.122 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420
seconds
at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@11.0.11
/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.11
/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
at java.base@11.0.11
/java.util.concurrent.FutureTask.get(FutureTask.java:190)
at
app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at java.base@11.0.11
/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11
/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at
app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at
app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at
app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11
/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)

[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
 Time elapsed: 420.122 s  <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited
test-SendThread(localhost:44251)
at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.11
/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.11
/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
at
app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)

[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
 Time elapsed: 420.011 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420
seconds
at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
at
app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
at
app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
at
app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
at
app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
at
app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
at
app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
at
app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at java.base@11.0.11
/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11
/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at
app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at
app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at
app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11
/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)

[INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Running org.apache.accumulo.test.functional.BinaryIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
65.034 s - in org.apache.accumulo.test.functional.BinaryIT
[INFO] Running org.apache.accumulo.test.functional.PermissionsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
[INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Running org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
255.108 s - in org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Running org.apache.accumulo.test.functional.RestartStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
[INFO] Running org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.289 s - in org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Running org.apache.accumulo.test.functional.BulkNewIT
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
[INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Running org.apache.accumulo.test.functional.BulkIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
122.959 s - in org.apache.accumulo.test.functional.BulkIT
[INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Running
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
219.253 s - in
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Running org.apache.accumulo.test.functional.VisibilityIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
[INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Running org.apache.accumulo.test.functional.SummaryIT
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
111.552 s - in org.apache.accumulo.test.functional.SummaryIT
[INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Running org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
71.934 s - in org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
307.904 s <<< FAILURE! - in
org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
 Time elapsed: 240.011 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 240
seconds
at java.base@11.0.11/java.lang.Object.wait(Native Method)
at java.base@11.0.11/java.lang.Object.wait(Object.java:328)
at java.base@11.0.11/java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
at
app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
at
app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at java.base@11.0.11
/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11
/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at
app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at
app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at
app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11
/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)

[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
 Time elapsed: 240.012 s  <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited
test-SendThread(localhost:39285)
at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.11
/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.11
/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
at
app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)

[INFO] Running org.apache.accumulo.test.functional.MetadataIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
97.987 s - in org.apache.accumulo.test.functional.MetadataIT
[INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Running org.apache.accumulo.test.AuditMessageIT
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
165.169 s - in org.apache.accumulo.test.AuditMessageIT
[INFO] Running
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
0.039 s - in
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]
org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
[ERROR]   Run 1:
ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 »
TestTimedOut
[ERROR]   Run 2:
ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »  Appears
to ...
[INFO]
[ERROR]   ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
TestTimedOut test t...
[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
[ERROR]   Run 1:
ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut
tes...
[ERROR]   Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
 Appears to be stuck...
[INFO]
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
[ERROR]   Run 1:
HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
» TestTimedOut
[ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be stuck in
thread Time-limited te...
[INFO]
[ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
[ERROR]   Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
TestTimedOut test timed ...
[ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in thread
Time-limited test-SendThread(...

These tests fail consistently at every build attempt!

The tests fail even when executed separately, e.g.:
mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test


I am using the current 'main' branch of Accumulo.
JDK 11.0.11
Maven: 3.8.2
OS: Ubuntu 20.04.3 ARM64

Is there anything that could be done to fix these problems?
For example, some configuration settings?

P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
ARM64 is a supported platform since the JVM supports it.

Thanks!

Mark

Re: Consistent IT tests failures on Linux ARM64

Posted by Mark Jens <ma...@gmail.com>.
On Fri, 3 Dec 2021 at 10:46, Mark Jens <ma...@gmail.com> wrote:

> I've just run a few more tests:
>
> 1) with the improvement
>
> 1.1) [INFO] Running
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 132.823 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>
> 1.2) Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 114.933 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>
> 2) without
>
> [INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [ERROR] Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed:
> 577.537 s <<< FAILURE! - in
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [ERROR]
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>  Time elapsed: 420.095 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 420
> seconds
> at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
> at java.base@11.0.11
> /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> at java.base@11.0.11
> /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> at java.base@11.0.11
> /java.util.concurrent.FutureTask.get(FutureTask.java:190)
> at
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
>
> I am going to investigate ExternalCompaction_2_IT and
> ExternalCompaction_3_IT too. These are the other very slow tests
>

I haven't had much luck with those so far:

mvn clean verify
-Dit.test=ExternalCompaction_2_IT#testSplitCancelsExternalCompaction
-Dtimeout.factor=3 -o -Pfast-build -N

[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
1,800.775 s <<< FAILURE! - in
org.apache.accumulo.test.compaction.ExternalCompaction_2_IT
[ERROR]
org.apache.accumulo.test.compaction.ExternalCompaction_2_IT.testSplitCancelsExternalCompaction
 Time elapsed: 1,800.024 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 1800
seconds
at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
at
app//org.apache.accumulo.fate.util.UtilWaitThread.sleep(UtilWaitThread.java:33)
at
app//org.apache.accumulo.test.compaction.ExternalCompactionTestUtils.confirmCompactionCompleted(ExternalCompactionTestUtils.java:329)
at
app//org.apache.accumulo.test.compaction.ExternalCompaction_2_IT.testSplitCancelsExternalCompaction(ExternalCompaction_2_IT.java:118)
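
A note on the 1800-second limit in the trace above: it is consistent with a
base timeout multiplied by -Dtimeout.factor=3. The Java sketch below shows
that kind of scaling. The class ScaledTimeout, its methods, and the
600-second base are assumptions made for illustration (600 is chosen only
because 600 x 3 = 1800); none of them is taken from Accumulo's test harness.

```java
import java.time.Duration;

/** Hypothetical helper for illustration; not Accumulo's actual test harness code. */
final class ScaledTimeout {

  /** Reads a positive integer factor from the given system property, defaulting to 1. */
  static int factor(String property) {
    String raw = System.getProperty(property);
    if (raw == null) {
      return 1;
    }
    try {
      int parsed = Integer.parseInt(raw.trim());
      return parsed > 0 ? parsed : 1;
    } catch (NumberFormatException e) {
      return 1; // fall back to the unscaled timeout on malformed input
    }
  }

  /** Multiplies the base timeout by the factor. */
  static Duration scale(Duration base, int factor) {
    return base.multipliedBy(factor);
  }

  public static void main(String[] args) {
    // Assumed base of 600 s; only the factor comes from the command line in this thread.
    Duration base = Duration.ofSeconds(600);
    Duration effective = scale(base, factor("timeout.factor"));
    System.out.println("effective timeout: " + effective.getSeconds() + " s");
  }
}
```

A larger factor only helps a test that is slow but still making progress. A
test that waits on something that never completes will hit any limit.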

For some reason this test always times out, no matter how much time I give
it.
Thread dumps of all processes do not show anything interesting, or at least
I can't find anything suspicious.
I've uploaded them at
https://gist.github.com/markjens/326b681c3f8a9c1b6400f55847fe7716
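
Because a single thread dump can miss where the time goes, one option is to
sample stacks repeatedly and count how often a method of interest shows up.
A minimal in-process sketch in Java follows. StackSampler and its methods
are hypothetical names, and it only inspects threads of its own JVM, unlike
the dumps above, which were taken from separate processes.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical in-process stack sampler, for illustration only. */
final class StackSampler {

  /** Returns true if any frame in the stack belongs to className.methodName. */
  static boolean contains(StackTraceElement[] stack, String className, String methodName) {
    for (StackTraceElement frame : stack) {
      if (frame.getClassName().equals(className) && frame.getMethodName().equals(methodName)) {
        return true;
      }
    }
    return false;
  }

  /** Takes snapshots of all live threads and counts those in which some thread is inside the method. */
  static int countSamplesInMethod(String className, String methodName, int samples,
      long intervalMillis) {
    int hits = 0;
    for (int i = 0; i < samples; i++) {
      for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
        if (contains(stack, className, methodName)) {
          hits++;
          break; // count each snapshot at most once
        }
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break; // stop sampling if interrupted
      }
    }
    return hits;
  }

  /** Busy loop standing in for an expensive computation. */
  static long spin(AtomicBoolean stop) {
    long x = 0;
    while (!stop.get()) {
      x = x * 31 + 7;
    }
    return x;
  }

  public static void main(String[] args) {
    AtomicBoolean stop = new AtomicBoolean(false);
    Thread worker = new Thread(() -> spin(stop), "busy-worker");
    worker.setDaemon(true);
    worker.start();
    int hits = countSamplesInMethod(StackSampler.class.getName(), "spin", 5, 100);
    stop.set(true);
    System.out.println("snapshots with spin() on a stack: " + hits + " of 5");
  }
}
```

If a method appears in most snapshots it is a plausible hot spot. If it
appears only occasionally, the time is probably going elsewhere.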

-Pfast-build is
<profile>
  <id>fast-build</id>
  <properties>
    <checkstyle.skip>true</checkstyle.skip>
    <spotbugs.skip>true</spotbugs.skip>
    <maven.gitcommitid.skip>true</maven.gitcommitid.skip>
    <jacoco.skip>true</jacoco.skip>
    <enforcer.skip>true</enforcer.skip>
    <maven.javadoc.skip>true</maven.javadoc.skip>
    <gpg.skip>true</gpg.skip>
    <license.skip>true</license.skip>
  </properties>
</profile>
I have it in my ~/.m2/settings.xml to speed up the builds.
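
For readers following the quoted discussion below about caching the
ServerInfo credentials (https://github.com/apache/accumulo/pull/2374), here
is a minimal Java sketch of the general idea: compute an expensive,
immutable value once and reuse it. This is not the code from that PR.
CachedCredentials, memoize and expensiveToken are hypothetical names, and
the 5000 rounds of SHA-512 are only a stand-in for the sha512Crypt work
seen in the thread dumps.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical illustration of caching an expensive, immutable value. */
final class CachedCredentials {

  /** Counts how often the expensive derivation actually runs. */
  static final AtomicInteger COMPUTATIONS = new AtomicInteger();

  /** Wraps a supplier so the delegate runs at most once (assumes it never returns null). */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T result = value;
        if (result == null) {
          synchronized (this) {
            result = value;
            if (result == null) {
              result = delegate.get();
              value = result;
            }
          }
        }
        return result;
      }
    };
  }

  /** Stand-in for an expensive token derivation: repeated SHA-512 over the instance id. */
  static byte[] expensiveToken(String instanceId) {
    COMPUTATIONS.incrementAndGet();
    try {
      MessageDigest sha512 = MessageDigest.getInstance("SHA-512");
      byte[] state = instanceId.getBytes(StandardCharsets.UTF_8);
      for (int round = 0; round < 5000; round++) {
        state = sha512.digest(state); // digest() also resets the instance for the next round
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available in this JVM", e);
    }
  }

  public static void main(String[] args) {
    Supplier<byte[]> token = memoize(() -> expensiveToken("hypothetical-instance-id"));
    for (int i = 0; i < 1000; i++) {
      token.get(); // only the first call pays for the hashing
    }
    System.out.println("derivations performed: " + COMPUTATIONS.get());
  }
}
```

The sketch hands out the same byte array on every call, so callers must
treat it as read-only. That is only safe when the cached value is truly
immutable for the lifetime of the process.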


>
>
> On Thu, 2 Dec 2021 at 17:40, Christopher <ct...@apache.org> wrote:
>
>> I don't see any reason it would break anything else, and I'm not opposed
>> to making a change there to avoid repeated calls to the security provider
>> to create the credentials, but I strongly doubt that this would
>> fix the performance problem with that IT. I've seen that test pass
>> very quickly before, without your change. I think it might be a
>> coincidence. I think if you were to capture a thread dump at other
>> times, you wouldn't always see it in that code, but you'd find it busy
>> doing other work instead. If it does fix it permanently, though, I'd
>> be pleasantly surprised. Regardless, I think we can move forward with
>> your PR, either way, because it does avoid unnecessary recomputation
>> of immutable credentials in ServerInfo.
>>
>> On Thu, Dec 2, 2021 at 7:23 AM Mark Jens <ma...@gmail.com> wrote:
>> >
>> > Please review https://github.com/apache/accumulo/pull/2374
>> > By caching the ServerInfo's Credentials, ConcurrentDeleteTableIT passes
>> > almost 6 times faster now!
>> > I am running the whole test suite now to check that it doesn't break
>> > anything else.
>> >
>> > On Thu, 2 Dec 2021 at 13:49, Mark Jens <ma...@gmail.com> wrote:
>> >
>> > > Reducing the log output did not reduce the test run time:
>> > >
>> > > diff --git test/src/main/resources/log4j2-test.properties
>> > > test/src/main/resources/log4j2-test.properties
>> > > index 9124914f7a..810c7bf06f 100644
>> > > --- test/src/main/resources/log4j2-test.properties
>> > > +++ test/src/main/resources/log4j2-test.properties
>> > > @@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
>> > >  appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
>> > >
>> > >  logger.01.name = org.apache.accumulo.core
>> > > -logger.01.level = debug
>> > > +logger.01.level = info
>> > >
>> > >  logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
>> > >  logger.02.level = info
>> > > @@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
>> > >  logger.25.level = info
>> > >
>> > >  logger.26.name = org.apache.hadoop.minikdc
>> > > -logger.26.level = debug
>> > > +logger.26.level = info
>> > >
>> > >
>> > > @@ -169,6 +169,6 @@ logger.metrics.level = info
>> > >  logger.metrics.additivity = false
>> > >  logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput
>> > >
>> > > -rootLogger.level = debug
>> > > +rootLogger.level = info
>> > >  rootLogger.appenderRef.console.ref = STDOUT
>> > >
>> > > [INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 785.503 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>> > >
>> > >
>> > > On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:
>> > >
>> > >> Hi again,
>> > >>
>> > >> Here are the thread dumps as promised:
>> > >>
>> > >> 1) Both TabletServers are very busy hashing (SHA-512 digest compression)
>> > >> at tablet close time. The following stacks were dumped at ~5-second
>> > >> intervals:
>> > >>
>> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable  [0x0000fffe8f3fd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11/SHA5.java:232)
>> > >>         at sun.security.provider.SHA5.implCompress(java.base@11.0.11/SHA5.java:221)
>> > >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:124)
>> > >>         at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>> > >>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>> > >>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>> > >>         at org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
>> > >>         at org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
>> > >>         at org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
>> > >>         at org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
>> > >>         - locked <0x00000000f1585830> (a org.apache.accumulo.tserver.tablet.Tablet)
>> > >>         at org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
>> > >>         at org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
>> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown Source)
>> > >>         at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11/ThreadPoolExecutor.java:1128)
>> > >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11/ThreadPoolExecutor.java:628)
>> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable  [0x0000fffe8f3fd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11/DigestBase.java:149)
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11/DigestBase.java:144)
>> > >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:131)
>> > >>         at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         ...
>> > >>
>> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable  [0x0000fffe8f3fd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11/ByteArrayAccess.java:449)
>> > >>         at sun.security.provider.SHA5.implDigest(java.base@11.0.11/SHA5.java:131)
>> > >>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11/DigestBase.java:210)
>> > >>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11/DigestBase.java:189)
>> > >>         at java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11/MessageDigest.java:639)
>> > >>         at java.security.MessageDigest.digest(java.base@11.0.11/MessageDigest.java:385)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         ...
>> > >>
>> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0
>> cpu=86499.01ms
>> > >> elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
>> > >>  [0x0000fffe8f3fd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at
>> > >>
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> > >> /DigestBase.java:149)
>> > >>         at
>> > >>
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> > >> /DigestBase.java:144)
>> > >>         at
>> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> > >> /DigestBase.java:131)
>> > >>         at
>> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> > >> /MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11
>> > >> /MessageDigest.java:345)
>> > >>         at
>> > >>
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>> > >>         at
>> > >>
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at
>> org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         ...
>> > >>
>> > >> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0
>> cpu=109551.37ms
>> > >> elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
>> > >>  [0x0000fffe7bffd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at
>> > >>
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> > >> /DigestBase.java:149)
>> > >>         at
>> > >>
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> > >> /DigestBase.java:144)
>> > >>         at
>> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> > >> /DigestBase.java:131)
>> > >>         at
>> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> > >> /MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11
>> > >> /MessageDigest.java:345)
>> > >>         at
>> > >>
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
>> > >>         at
>> > >>
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at
>> org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>> > >>         at
>> > >>
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> > >>
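>> > >> Notice that ClientContext.getProperties(ClientContext.java:236) calls
>> > >> ServerInfo.getProperties(), which in most of the dumps goes through
>> > >> ServerInfo.getPrincipal(ServerInfo.java:148), but in the last one it
>> > >> goes through ServerInfo.getAuthenticationToken(ServerInfo.java:153).
>> > >> Both paths lead to (a lot of ?!) hashing: the "compress" frames are
>> > >> the SHA-512 digest working inside Sha2Crypt.sha512Crypt(), not data
>> > >> compression.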
>> > >>
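
[Illustration] The dumps above show several threads each reaching
SystemCredentials$SystemToken.hashInstanceConfigs() and burning CPU in
Sha2Crypt. One common way to avoid repeating an expensive computation like
this is to compute it once and reuse the result. The following is only a
minimal sketch of that idea, not Accumulo's code and not a proposed patch:
the names CachedTokenSketch, expensiveHash and memoize are hypothetical, and
the 5000-round SHA-512 loop is just a stand-in for an expensive hash, not the
sha512crypt algorithm itself.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch: compute an expensive hash once and reuse it.
class CachedTokenSketch {
  // Counts how often the expensive path actually runs.
  static final AtomicInteger HASH_CALLS = new AtomicInteger();

  // Stand-in for an expensive credential hash (assumption: NOT sha512crypt).
  static byte[] expensiveHash(String confText) {
    HASH_CALLS.incrementAndGet();
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] out = confText.getBytes(StandardCharsets.UTF_8);
      for (int i = 0; i < 5000; i++) {
        out = md.digest(out); // digest() also resets md for the next round
      }
      return out;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Thread-safe lazy holder: the delegate runs at most once, provided it
  // returns a non-null value.
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T v = value;
        if (v == null) {
          synchronized (this) {
            v = value;
            if (v == null) {
              v = delegate.get();
              value = v;
            }
          }
        }
        return v;
      }
    };
  }

  public static void main(String[] args) {
    Supplier<byte[]> token = memoize(() -> expensiveHash("instance.secret=example"));
    token.get();
    token.get(); // second call is served from the cached value
    System.out.println("hash computed " + HASH_CALLS.get() + " time(s)");
  }
}
```

Whether caching is appropriate for the real system token depends on when the
instance configuration can change, which this sketch does not model.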
>> > >> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
>> > >> level should not be DEBUG?
>> > >>
>> > >> Most of its threads either wait for notifications from ZooKeeper:
>> > >>
>> > >> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
>> > >> cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
>> > >> Object.wait()  [0x0000fffebb7fc000]
>> > >>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>> > >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>> > >>         - waiting on <no object reference available>
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
>> > >>         - waiting to re-lock in wait() <0x00000000f1427458> (a
>> > >> org.apache.accumulo.fate.ZooStore)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
>> > >>         at
>> > >>
>> org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
>> > >>         at
>> > >> org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
>> > >>         at
>> > >>
>> org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
>> > >>         at
>> > >>
>> org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
>> > >> ...
>> > >>
>> > >> or wait for data:
>> > >> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0
>> cpu=7440.91ms
>> > >> elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
>> > >>  [0x0000fffebadfd000]
>> > >>    java.lang.Thread.State: WAITING (on object monitor)
>> > >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>> > >>         - waiting on <no object reference available>
>> > >>         at java.lang.Object.wait(java.base@11.0.11
>> /Object.java:328)
>> > >>         at
>> > >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>> > >>         - waiting to re-lock in wait() <0x00000000f9bf42d8> (a
>> > >> org.apache.zookeeper.ClientCnxn$Packet)
>> > >>         at
>> > >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>> > >>         at
>> > >> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
>> > >> Source)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
>> > >> Source)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
>> > >>         at
>> org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
>> > >>         at
>> > >> org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
>> > >>         at
>> > >>
>> org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
>> > >>         at
>> > >> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
>> > >>         at
>> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at
>> > >>
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
>> > >> Source)
>> > >>         at
>> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
>> > >> /ThreadPoolExecutor.java:1128)
>> > >>         at
>> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
>> > >> /ThreadPoolExecutor.java:628)
>> > >>         at
>> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at
>> > >>
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
>> > >> Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >> "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
>> > >> elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
>> > >>  [0x0000ffff20f50000]
>> > >>    java.lang.Thread.State: WAITING (on object monitor)
>> > >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>> > >>         - waiting on <no object reference available>
>> > >>         at java.lang.Object.wait(java.base@11.0.11
>> /Object.java:328)
>> > >>         at
>> > >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>> > >>         - waiting to re-lock in wait() <0x00000000fa781138> (a
>> > >> org.apache.zookeeper.ClientCnxn$Packet)
>> > >>         at
>> > >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>> > >>         at
>> org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
>> > >> Source)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
>> > >> Source)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>> > >>         at
>> > >>
>> org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
>> > >>         at
>> > >>
>> org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
>> > >>         at
>> > >>
>> org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
>> > >>         at
>> > >>
>> org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
>> > >>         at
>> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at
>> > >>
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
>> > >> Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >> 3) SimpleGarbageCollector is also busy getting credentials:
>> > >>
>> > >>  "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
>> > >> tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at
>> > >>
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> > >> /DigestBase.java:149)
>> > >>         at
>> > >>
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> > >> /DigestBase.java:144)
>> > >>         at
>> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> > >> /DigestBase.java:131)
>> > >>         at
>> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> > >> /MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11
>> > >> /MessageDigest.java:345)
>> > >>         at
>> > >>
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
>> > >>         at
>> > >>
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         at
>> > >>
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> > >>         at
>> > >>
>> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
>> > >>         at
>> > >>
>> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
>> > >>         at
>> > >>
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
>> > >>         at
>> > >>
>> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
>> > >>         at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at
>> > >>
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
>> > >> Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >>
>> > >> "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s
>> > >> tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at java.util.Arrays.hashCode(java.base@11.0.11
>> /Arrays.java:4685)
>> > >>         at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
>> > >>         at
>> java.security.Provider$ServiceKey.hashCode(java.base@11.0.11
>> > >> /Provider.java:1107)
>> > >>         at
>> java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11
>> > >> /ConcurrentHashMap.java:936)
>> > >>         at java.security.Provider.getService(java.base@11.0.11
>> > >> /Provider.java:1282)
>> > >>         at sun.security.jca.ProviderList.getService(java.base@11.0.11
>> > >> /ProviderList.java:380)
>> > >>         at sun.security.jca.GetInstance.getInstance(java.base@11.0.11
>> > >> /GetInstance.java:157)
>> > >>         at java.security.Security.getImpl(java.base@11.0.11
>> > >> /Security.java:700)
>> > >>         at java.security.MessageDigest.getInstance(java.base@11.0.11
>> > >> /MessageDigest.java:178)
>> > >>         at
>> > >>
>> org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
>> > >>         at
>> > >>
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
>> > >>         at
>> > >>
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>> > >>         at
>> > >>
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>> > >>         at
>> > >>
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> > >>         at
>> > >>
>> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>> > >>         at
>> > >>
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>> > >>         at
>> > >>
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>> > >>         at
>> > >>
>> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
>> > >>         at
>> > >>
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
>> > >>         at
>> > >>
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
>> > >>         at
>> > >>
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
>> > >>         at
>> > >>
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
>> > >>         at
>> > >>
>> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
>> > >>         at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at
>> > >>
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
>> > >> Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >>
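
[Illustration] Each thread header in the dumps above carries a cpu= and an
elapsed= field, so how busy a thread has been can be estimated as CPU time
divided by the thread's wall-clock lifetime. The following is a minimal
sketch of that arithmetic; the class and method names are hypothetical, and
it assumes the header has been re-joined onto a single line (the mail client
wrapped them here).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: estimate a thread's CPU share from a JDK 11
// thread-dump header line containing "cpu=<ms>ms elapsed=<s>s".
class ThreadCpuShareSketch {
  private static final Pattern HEADER =
      Pattern.compile("cpu=([0-9.]+)ms\\s+elapsed=([0-9.]+)s");

  // Returns cpu time / elapsed time as a fraction (1.0 = one full core).
  static double cpuShare(String header) {
    Matcher m = HEADER.matcher(header);
    if (!m.find()) {
      throw new IllegalArgumentException("no cpu/elapsed fields in: " + header);
    }
    double cpuMs = Double.parseDouble(m.group(1));
    double elapsedMs = Double.parseDouble(m.group(2)) * 1000.0;
    return cpuMs / elapsedMs;
  }

  public static void main(String[] args) {
    // Values copied from two headers quoted in this message.
    String worker = "\"tablet migration-Worker-1\" #4380 daemon prio=5 os_prio=0 "
        + "cpu=81174.59ms elapsed=89.01s";
    String gc = "\"gc\" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s";
    System.out.printf("worker: %.2f%n", cpuShare(worker));
    System.out.printf("gc:     %.2f%n", cpuShare(gc));
  }
}
```

With the numbers quoted above, the "tablet migration-Worker-1" thread comes
out at about 0.91 of one core, while the "gc" thread is at about 0.07.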
>> > >> 4) Nothing interesting for Initialize, Main and ZooKeeperServerMain
>> > >> processes
>> > >>
>> > >>
>> > >> I'm not saying that the above are problematic. You know how Accumulo
>> > >> works. It is up to you to decide whether something should be
>> improved.
>> > >>
>> > >> Regards,
>> > >> Mark
>> > >>
>> > >>
>> > >> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com>
>> wrote:
>> > >>
>> > >>>
>> > >>>
>> > >>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org>
>> wrote:
>> > >>>
>> > >>>> It looks like the tests are timing out. This happens frequently
>> when
>> > >>>> running on resource-constrained systems. You can give the test more
>> > >>>> time by increasing the timeout factor: `mvn clean verify
>> > >>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
>> > >>>> -Dtimeout.factor=3`
>> > >>>>
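
[Illustration] To make the effect of -Dtimeout.factor concrete, here is a
minimal sketch of how a test could scale a base timeout by an integer read
from that system property. This is not Accumulo's actual test harness: the
class and method names and the 60 second base are hypothetical, and falling
back to a factor of 1 on missing or invalid input is an assumption.

```java
import java.time.Duration;

// Hypothetical sketch of scaling a test timeout by a "timeout.factor"
// system property.
class TimeoutFactorSketch {
  // Parses the factor; missing, non-numeric or < 1 values fall back to 1.
  static int parseFactor(String raw) {
    if (raw == null) {
      return 1;
    }
    try {
      int factor = Integer.parseInt(raw.trim());
      return factor < 1 ? 1 : factor;
    } catch (NumberFormatException e) {
      return 1;
    }
  }

  // Multiplies the base timeout by the parsed factor.
  static Duration scaled(Duration base, String rawFactor) {
    return base.multipliedBy(parseFactor(rawFactor));
  }

  public static void main(String[] args) {
    Duration base = Duration.ofSeconds(60); // hypothetical base timeout
    Duration effective = scaled(base, System.getProperty("timeout.factor"));
    System.out.println("effective timeout: " + effective.getSeconds() + " s");
  }
}
```

Run with java -Dtimeout.factor=3 TimeoutFactorSketch to see the 60 second
base stretched to 180 seconds.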
>> > >>>> There's nothing we know of that would change the way our tests work
>> > >>>> due to ARM64, but you may have issues because of limited RAM, slow
>> CPU
>> > >>>> speeds, slow disk I/O, busy background processes, or other
>> > >>>> resource-related issues. I don't think most of the currently active
>> > >>>> developers use ARM64, or have access to a test machine to
>> reproduce or
>> > >>>>
>> > >>>
>> > >>> In case anyone wants to test on Linux ARM64 you could easily use
>> Oracle
>> > >>> Cloud for free.
>> > >>>
>> > >>>
>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
>> > >>> explains how to create a VM and how to use this VM as a Github
>> Actions
>> > >>> runner.
>> > >>>
>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
>> > >>> mentions this article.
>> > >>>
>> > >>>
>> > >>>> experiment with Accumulo there, so you may have to do some of your
>> own
>> > >>>> troubleshooting. If you can rule out resource-constraint issues,
>> and
>> > >>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is
>> known
>> > >>>> flaky and sometimes times out on x86_64 as well), you could create
>> a
>> > >>>> bug ticket with more details at
>> > >>>> https://github.com/apache/accumulo/issues ; there is an issue
>> template
>> > >>>> specifically for broken and/or flaky tests that you can select when
>> > >>>> creating a new ticket.
>> > >>>>
>> > >>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com>
>> > >>>> wrote:
>> > >>>> >
>> > >>>> > Hi dev1,
>> > >>>> >
>> > >>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>> > >>>> >
>> > >>>> > > Some of those tests are trying to stress conditions that
>> require a
>> > >>>> lot of
>> > >>>> > > resources to replicate specific conditions. Have you tried to
>> run
>> > >>>> those
>> > >>>> > > individual tests in isolation so that you are not competing for
>> > >>>> resources?
>> > >>>> > > Do they always fail, or are the failures transient?
>> > >>>> > >
>> > >>>> >
>> > >>>> > Q: Have you tried to run those individual tests in isolation so
>> that
>> > >>>> you
>> > >>>> > are not competing for resources?
>> > >>>> > A: This is what I meant by the following:
>> > >>>> > ---------------------
>> > >>>> > The tests fail even when executed separately, e.g.:
>> > >>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf
>> :accumulo-test
>> > >>>> > ---------------------
>> > >>>> >
>> > >>>> > Q: Do they always fail, or are the failures transient?
>> > >>>> > A: I also tried to explain that with "These tests fail
>> consistently at
>> > >>>> > every build attempt!"
>> > >>>> >
>> > >>>> > Mark
>> > >>>> >
>> > >>>> > >
>> > >>>> > > -----Original Message-----
>> > >>>> > > From: Mark Jens <ma...@gmail.com>
>> > >>>> > > Sent: Tuesday, November 30, 2021 4:05 AM
>> > >>>> > > To: dev@accumulo.apache.org
>> > >>>> > > Subject: Consistent IT tests failures on Linux ARM64
>> > >>>> > >
>> > >>>> > > Hello Accumulo community,
>> > >>>> > >
>> > >>>> > > At my job we consider using Linux ARM64 servers and I've been
>> > >>>> tasked to
>> > >>>> > > test Accumulo.
>> > >>>> > >
>> > >>>> > > I face some timeout related issues with several IT tests:
>> > >>>> > >
>> > >>>> > >
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > >>>> > >  Time elapsed: 420.122 s  <<< ERROR!
>> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
>> after
>> > >>>> 420
>> > >>>> > > seconds at java.base@11.0.11
>> /jdk.internal.misc.Unsafe.park(Native
>> > >>>> Method)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > >>>> > > Method)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > >>>> > > at java.base@11.0.11
>> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >>>> > >
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > >>>> > >  Time elapsed: 420.122 s  <<< ERROR!
>> > >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>> > >>>> > > test-SendThread(localhost:44251)
>> > >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>> > >>>> > > java.base@11.0.11
>> > >>>> > >
>> /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>> > >>>> > > at java.base@11.0.11/sun.nio.ch
>> > >>>> .SelectorImpl.select(SelectorImpl.java:136)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>> > >>>> > > at
>> > >>>> > >
>> > >>>>
>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >>>> > >
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
>> > >>>> > >  Time elapsed: 420.011 s  <<< ERROR!
>> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
>> after
>> > >>>> 420
>> > >>>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native
>> Method)
>> > >>>> at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
>> > >>>> > > at
>> > >>>>
>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
>> > >>>> > > at
>> > >>>>
>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > >>>> > > Method)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > >>>> > > at java.base@11.0.11
>> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >>>> > >
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.ScannerContextIT
>> > >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 102.909 s - in
>> org.apache.accumulo.test.functional.ScannerContextIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.KerberosRenewalIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 504.472 s - in
>> org.apache.accumulo.test.functional.KerberosRenewalIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.BatchWriterFlushIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 62.132 s - in
>> org.apache.accumulo.test.functional.BatchWriterFlushIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.PermissionsIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.ZookeeperRestartIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 37.37 s - in
>> org.apache.accumulo.test.functional.ZookeeperRestartIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 23.046 s - in
>> > >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>> > >>>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 255.108 s - in
>> > >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.RestartStressIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 78.359 s - in
>> org.apache.accumulo.test.functional.RestartStressIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 59.289 s - in
>> > >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
>> > >>>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BloomFilterIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 135.298 s - in
>> org.apache.accumulo.test.functional.BloomFilterIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BinaryStressIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 38.626 s - in
>> org.apache.accumulo.test.functional.BinaryStressIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.ClassLoaderIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.LogicalTimeIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 116.819 s - in
>> org.apache.accumulo.test.functional.LogicalTimeIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.SplitRecoveryIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 25.421 s - in
>> org.apache.accumulo.test.functional.SplitRecoveryIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BigRootTabletIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 96.86 s - in
>> org.apache.accumulo.test.functional.BigRootTabletIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
>> > >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 238.409 s - in
>> > >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
>> > >>>> > > [INFO] Running
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 219.253 s - in
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>> > >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 489.863 s - in
>> > >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
>> > >>>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.ManagerFailoverIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 47.089 s - in
>> org.apache.accumulo.test.functional.ManagerFailoverIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BackupManagerIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 22.943 s - in
>> org.apache.accumulo.test.functional.BackupManagerIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.TabletMetadataIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 46.728 s - in
>> org.apache.accumulo.test.functional.TabletMetadataIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.LateLastContactIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 46.648 s - in
>> org.apache.accumulo.test.functional.LateLastContactIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 71.934 s - in
>> > >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.HalfDeadTServerIT
>> > >>>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 307.904 s <<< FAILURE! - in
>> > >>>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
>> > >>>> > > [ERROR]
>> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > >>>> > >  Time elapsed: 240.011 s  <<< ERROR!
>> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
>> after
>> > >>>> 240
>> > >>>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native
>> Method)
>> > >>>> at
>> > >>>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
>> > >>>> > > at java.base@11.0.11
>> > >>>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > >>>> > > Method)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > >>>> > > at java.base@11.0.11
>> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >>>> > >
>> > >>>> > > [ERROR]
>> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > >>>> > >  Time elapsed: 240.012 s  <<< ERROR!
>> > >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>> > >>>> > > test-SendThread(localhost:39285)
>> > >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>> > >>>> > > java.base@11.0.11
>> > >>>> > >
>> /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>> > >>>> > > at java.base@11.0.11/sun.nio.ch
>> > >>>> .SelectorImpl.select(SelectorImpl.java:136)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>> > >>>> > > at
>> > >>>> > >
>> > >>>>
>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >>>> > >
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
>> > >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 43.91 s - in
>> > >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.DeleteRowsSplitIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 113.928 s - in
>> org.apache.accumulo.test.functional.DeleteRowsSplitIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
>> > >>>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
>> > >>>> > > [INFO] Running
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>> > >>>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1,
>> Time
>> > >>>> elapsed:
>> > >>>> > > 0.039 s - in
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>> > >>>> > > [INFO]
>> > >>>> > > [INFO] Results:
>> > >>>> > > [INFO]
>> > >>>> > > [ERROR] Errors:
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
>> > >>>> > > [ERROR]   Run 1:
>> > >>>> > >
>> ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178
>> > >>>> »
>> > >>>> > > TestTimedOut
>> > >>>> > > [ERROR]   Run 2:
>> > >>>> > >
>> ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »
>> > >>>> Appears
>> > >>>> > > to ...
>> > >>>> > > [INFO]
>> > >>>> > > [ERROR]
>>  ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
>> > >>>> > > TestTimedOut test t...
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > >>>> > > [ERROR]   Run 1:
>> > >>>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
>> > >>>> TestTimedOut
>> > >>>> > > tes...
>> > >>>> > > [ERROR]   Run 2:
>> > >>>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
>> > >>>> > >  Appears to be stuck...
>> > >>>> > > [INFO]
>> > >>>> > > [ERROR]
>> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > >>>> > > [ERROR]   Run 1:
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
>> > >>>> > > » TestTimedOut
>> > >>>> > > [ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be
>> > >>>> stuck in
>> > >>>> > > thread Time-limited te...
>> > >>>> > > [INFO]
>> > >>>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
>> > >>>> > > [ERROR]   Run 1:
>> > >>>> SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
>> > >>>> > > TestTimedOut test timed ...
>> > >>>> > > [ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in
>> thread
>> > >>>> > > Time-limited test-SendThread(...
>> > >>>> > >
>> > >>>> > > These tests fail consistently at every build attempt!
>> > >>>> > >
>> > >>>> > > The tests fail even when executed separately, e.g.:
>> > >>>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf
>> :accumulo-test
>> > >>>> > >
>> > >>>> > >
>> > >>>> > > I am using the current 'main' branch of Accumulo.
>> > >>>> > > JDK 11.0.11
>> > >>>> > > Maven: 3.8.2
>> > >>>> > > OS: Ubuntu 20.04.3 ARM64
>> > >>>> > >
>> > >>>> > > Is there anything that could be done to fix these problems ?
>> > >>>> > > For example some config settings ?!
>> > >>>> > >
>> > >>>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read
>> that
>> > >>>> Linux
>> > >>>> > > ARM64 is a supported platform since the JVM supports it.
>> > >>>> > >
>> > >>>> > > Thanks!
>> > >>>> > >
>> > >>>> > > Mark
>> > >>>> > >
>> > >>>>
>> > >>>
>>
>

Re: Consistent IT tests failures on Linux ARM64

Posted by Mark Jens <ma...@gmail.com>.
I've just run a few more tests:

1) with the improvement

1.1) [INFO] Running
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
132.823 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT

1.2) Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
114.933 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT

2) without the improvement

[INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[ERROR] Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed:
577.537 s <<< FAILURE! - in
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
 Time elapsed: 420.095 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420
seconds
at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@11.0.11
/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.11
/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
at java.base@11.0.11
/java.util.concurrent.FutureTask.get(FutureTask.java:190)
at
app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)

I am going to investigate ExternalCompaction_2_IT and
ExternalCompaction_3_IT too. These are the other very slow tests.
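
For reference, "the improvement" above is the caching of ServerInfo's
Credentials from https://github.com/apache/accumulo/pull/2374 (quoted below).
Roughly, the idea is to compute the credentials once and reuse them instead of
re-hashing on every call. Here is a minimal sketch of that pattern, not the
actual Accumulo code: the names CachedCredentialsSketch, Memoized and
deriveCredentials are made up for illustration, and the string token only
stands in for the real SHA-512 based derivation seen in the thread dumps.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a server context whose credentials are costly to derive. */
final class CachedCredentialsSketch {

  /** Thread-safe memoizing wrapper: the delegate runs at most once (it must not return null). */
  static final class Memoized<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private volatile T value; // stays null until the first computation completes

    Memoized(Supplier<T> delegate) {
      this.delegate = delegate;
    }

    @Override
    public T get() {
      T result = value;
      if (result == null) {
        synchronized (this) {
          result = value;
          if (result == null) {
            result = delegate.get();
            value = result;
          }
        }
      }
      return result;
    }
  }

  // Counts how often the expensive derivation actually runs.
  static final AtomicInteger computations = new AtomicInteger();

  /** Placeholder for the expensive hashing (assumption: the real code hashes instance configs). */
  static String deriveCredentials() {
    computations.incrementAndGet();
    return "system-token";
  }

  public static void main(String[] args) {
    Supplier<String> credentials = new Memoized<>(CachedCredentialsSketch::deriveCredentials);
    for (int i = 0; i < 1000; i++) {
      credentials.get(); // only the first call invokes deriveCredentials()
    }
    System.out.println("computations = " + computations.get());
  }
}
```

The double-checked locking on a volatile field keeps repeated reads lock-free
after the first call, which matters if the getter sits on a hot path such as
the one in the tablet close stacks.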


On Thu, 2 Dec 2021 at 17:40, Christopher <ct...@apache.org> wrote:

> I don't see any reason it would break anything else, and I'm not opposed to
> making a change there to avoid repeated calls to the security provider
> to create the credentials, but I strongly doubt that this would
> fix the performance problem with that IT. I've seen that test pass
> very quickly before, without your change. I think it might be a
> coincidence. I think if you were to capture a thread dump at other
> times, you wouldn't always see it in that code, but you'd find it busy
> doing other work instead. If it does fix it permanently, though, I'd
> be pleasantly surprised. Regardless, I think we can move forward with
> your PR, either way, because it does avoid unnecessary recomputation
> of immutable credentials in ServerInfo.
>
> On Thu, Dec 2, 2021 at 7:23 AM Mark Jens <ma...@gmail.com> wrote:
> >
> > Please review https://github.com/apache/accumulo/pull/2374
> > By caching the ServerInfo's Credentials ConcurrentDeleteTableIT passes
> > almost 6 times faster now!
> > I am running the whole test suite now to check that it doesn't break
> > anything else.
> >
> > On Thu, 2 Dec 2021 at 13:49, Mark Jens <ma...@gmail.com> wrote:
> >
> > > Reducing the log output did not reduce the test run time:
> > >
> > > diff --git test/src/main/resources/log4j2-test.properties
> > > test/src/main/resources/log4j2-test.properties
> > > index 9124914f7a..810c7bf06f 100644
> > > --- test/src/main/resources/log4j2-test.properties
> > > +++ test/src/main/resources/log4j2-test.properties
> > > @@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
> > >  appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
> > >
> > >  logger.01.name = org.apache.accumulo.core
> > > -logger.01.level = debug
> > > +logger.01.level = info
> > >
> > >  logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
> > >  logger.02.level = info
> > > @@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
> > >  logger.25.level = info
> > >
> > >  logger.26.name = org.apache.hadoop.minikdc
> > > -logger.26.level = debug
> > > +logger.26.level = info
> > >
> > >
> > > @@ -169,6 +169,6 @@ logger.metrics.level = info
> > >  logger.metrics.additivity = false
> > >  logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput
> > >
> > > -rootLogger.level = debug
> > > +rootLogger.level = info
> > >  rootLogger.appenderRef.console.ref = STDOUT
> > >
> > > [INFO] Running
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 785.503 s - in
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> > >
> > >
> > > On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:
> > >
> > >> Hi again,
> > >>
> > >> Here are the thread dumps as promised:
> > >>
> > >> 1) Both TabletServers are very busy hashing (SHA-512 compress rounds) at
> > >> tablet close time. The following stacks were dumped at ~5 second intervals:
> > >>
> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0
> cpu=68425.44ms
> > >> elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable
> > >>  [0x0000fffe8f3fd000]
> > >>    java.lang.Thread.State: RUNNABLE
> > >>         at
> sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11
> > >> /SHA5.java:232)
> > >>         at sun.security.provider.SHA5.implCompress(java.base@11.0.11
> > >> /SHA5.java:221)
> > >>         at
> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> > >> /DigestBase.java:124)
> > >>         at
> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> > >> /MessageDigest.java:623)
> > >>         at java.security.MessageDigest.update(java.base@11.0.11
> > >> /MessageDigest.java:345)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >>         at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> > >>         at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> > >>         at
> > >>
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> > >>         at
> > >>
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> > >>         at
> > >>
> org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
> > >>         at
> > >>
> org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
> > >>         at
> > >>
> org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
> > >>         at
> > >>
> org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
> > >>         - locked <0x00000000f1585830> (a
> > >> org.apache.accumulo.tserver.tablet.Tablet)
> > >>         at
> > >> org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
> > >>         at
> > >>
> org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
> > >>         at
> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >>         at
> > >>
> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> > >> Source)
> > >>         at
> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> > >> /ThreadPoolExecutor.java:1128)
> > >>         at
> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> > >> /ThreadPoolExecutor.java:628)
> > >>         at
> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >>         at
> > >>
> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> > >> Source)
> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0
> cpu=72485.20ms
> > >> elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable
> > >>  [0x0000fffe8f3fd000]
> > >>    java.lang.Thread.State: RUNNABLE
> > >>         at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> > >> /DigestBase.java:149)
> > >>         at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> > >> /DigestBase.java:144)
> > >>         at
> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> > >> /DigestBase.java:131)
> > >>         at
> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> > >> /MessageDigest.java:623)
> > >>         at java.security.MessageDigest.update(java.base@11.0.11
> > >> /MessageDigest.java:345)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >>         ...
> > >>
> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0
> cpu=81174.59ms
> > >> elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
> > >>  [0x0000fffe8f3fd000]
> > >>    java.lang.Thread.State: RUNNABLE
> > >>         at
> sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
> > >> /ByteArrayAccess.java:449)
> > >>         at sun.security.provider.SHA5.implDigest(java.base@11.0.11
> > >> /SHA5.java:131)
> > >>         at
> sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> > >> /DigestBase.java:210)
> > >>         at
> sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> > >> /DigestBase.java:189)
> > >>         at
> > >> java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
> > >> /MessageDigest.java:639)
> > >>         at java.security.MessageDigest.digest(java.base@11.0.11
> > >> /MessageDigest.java:385)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >>         ...
> > >>
> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0
> cpu=86499.01ms
> > >> elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
> > >>  [0x0000fffe8f3fd000]
> > >>    java.lang.Thread.State: RUNNABLE
> > >>         at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> > >> /DigestBase.java:149)
> > >>         at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> > >> /DigestBase.java:144)
> > >>         at
> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> > >> /DigestBase.java:131)
> > >>         at
> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> > >> /MessageDigest.java:623)
> > >>         at java.security.MessageDigest.update(java.base@11.0.11
> > >> /MessageDigest.java:345)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >>         ...
> > >>
> > >> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0
> cpu=109551.37ms
> > >> elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
> > >>  [0x0000fffe7bffd000]
> > >>    java.lang.Thread.State: RUNNABLE
> > >>         at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> > >> /DigestBase.java:149)
> > >>         at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> > >> /DigestBase.java:144)
> > >>         at
> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> > >> /DigestBase.java:131)
> > >>         at
> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> > >> /MessageDigest.java:623)
> > >>         at java.security.MessageDigest.update(java.base@11.0.11
> > >> /MessageDigest.java:345)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
> > >>         at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >>         at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> > >>         at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> > >>         at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> > >>
> > >> Notice that ClientContext.getProperties(ClientContext.java:236) most of
> > >> the time ends up in ServerInfo.getPrincipal(ServerInfo.java:148), but in
> > >> the last dump it ends up in
> > >> ServerInfo.getAuthenticationToken(ServerInfo.java:153).
> > >> Both paths lead to (a lot of?!) digest compression.
> > >>
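
To illustrate the observation above (every getProperties() call re-deriving the
system credentials through sha512Crypt), here is a minimal sketch of memoizing an
expensive derivation so that it runs once. It is not Accumulo code: the class
MemoizedTokenSketch, its methods, and the iterated SHA-512 stand-in for
Sha2Crypt are all hypothetical, and whether caching is safe in Accumulo (for
example, if the instance configuration can change) is an open assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch: compute an expensive hash once and reuse the result. */
final class MemoizedTokenSketch {

  /** Counts how often the expensive derivation actually runs. */
  static final AtomicInteger HASH_CALLS = new AtomicInteger();

  /** Stand-in for an expensive derivation: iterated SHA-512 (not Sha2Crypt itself). */
  static byte[] expensiveHash(String input, int rounds) {
    HASH_CALLS.incrementAndGet();
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = input.getBytes(StandardCharsets.UTF_8);
      for (int i = 0; i < rounds; i++) {
        state = md.digest(state); // digest() also resets md for the next round
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Wraps a supplier so that the delegate is invoked at most once. */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T v = value;
        if (v == null) {
          synchronized (this) {
            v = value;
            if (v == null) {
              v = delegate.get();
              value = v;
            }
          }
        }
        return v;
      }
    };
  }

  public static void main(String[] args) {
    // "instance-config" is a placeholder input, not a real Accumulo value.
    Supplier<byte[]> token = memoize(() -> expensiveHash("instance-config", 5000));
    for (int i = 0; i < 1000; i++) {
      token.get(); // only the first call pays for the hashing
    }
    System.out.println("hash computed " + HASH_CALLS.get() + " time(s)");
  }
}
```

With a wrapper like this, repeated lookups return the cached value instead of
re-running the digest loop on every call.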
> > >> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
> > >> level should not be DEBUG?!
> > >>
> > >> Most of its threads either wait for notifications from ZooKeeper:
> > >>
> > >> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0 cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in Object.wait()  [0x0000fffebb7fc000]
> > >>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
> > >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
> > >>         - waiting on <no object reference available>
> > >>         at org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
> > >>         - waiting to re-lock in wait() <0x00000000f1427458> (a org.apache.accumulo.fate.ZooStore)
> > >>         at org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
> > >>         at org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
> > >>         at org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
> > >>         at org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
> > >>         at org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
> > >>         ...
> > >>
> > >> or wait for data:
> > >> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()  [0x0000fffebadfd000]
> > >>    java.lang.Thread.State: WAITING (on object monitor)
> > >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
> > >>         - waiting on <no object reference available>
> > >>         at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> > >>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> > >>         - waiting to re-lock in wait() <0x00000000f9bf42d8> (a org.apache.zookeeper.ClientCnxn$Packet)
> > >>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> > >>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown Source)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown Source)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
> > >>         at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
> > >>         at org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
> > >>         at org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
> > >>         at org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown Source)
> > >>         at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11/ThreadPoolExecutor.java:1128)
> > >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11/ThreadPoolExecutor.java:628)
> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown Source)
> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >> "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()  [0x0000ffff20f50000]
> > >>    java.lang.Thread.State: WAITING (on object monitor)
> > >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
> > >>         - waiting on <no object reference available>
> > >>         at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> > >>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> > >>         - waiting to re-lock in wait() <0x00000000fa781138> (a org.apache.zookeeper.ClientCnxn$Packet)
> > >>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> > >>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown Source)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown Source)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
> > >>         at org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
> > >>         at org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
> > >>         at org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown Source)
> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >> 3) SimpleGarbageCollector is also busy getting credentials:
> > >>
> > >> "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
> > >>    java.lang.Thread.State: RUNNABLE
> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11/DigestBase.java:149)
> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11/DigestBase.java:144)
> > >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:131)
> > >>         at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
> > >>         at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >>         at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> > >>         at org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
> > >>         at org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
> > >>         at org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown Source)
> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >>
> > >> "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
> > >>    java.lang.Thread.State: RUNNABLE
> > >>         at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
> > >>         at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
> > >>         at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11/Provider.java:1107)
> > >>         at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11/ConcurrentHashMap.java:936)
> > >>         at java.security.Provider.getService(java.base@11.0.11/Provider.java:1282)
> > >>         at sun.security.jca.ProviderList.getService(java.base@11.0.11/ProviderList.java:380)
> > >>         at sun.security.jca.GetInstance.getInstance(java.base@11.0.11/GetInstance.java:157)
> > >>         at java.security.Security.getImpl(java.base@11.0.11/Security.java:700)
> > >>         at java.security.MessageDigest.getInstance(java.base@11.0.11/MessageDigest.java:178)
> > >>         at org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >>         at org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> > >>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> > >>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> > >>         at org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
> > >>         at org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown Source)
> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >>
> > >> 4) Nothing interesting for Initialize, Main and ZooKeeperServerMain
> > >> processes
> > >>
> > >>
> > >> I'm not saying that the above are problematic. You know how Accumulo
> > >> works. It is up to you to decide whether something should be improved.
> > >>
> > >> Regards,
> > >> Mark
> > >>
> > >>
> > >> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:
> > >>
> > >>>
> > >>>
> > >>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
> > >>>
> > >>>> It looks like the tests are timing out. This happens frequently when
> > >>>> running on resource-constrained systems. You can give the test more
> > >>>> time by increasing the timeout factor: `mvn clean verify
> > >>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
> > >>>> -Dtimeout.factor=3`
> > >>>>
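
As a rough illustration of what a timeout factor does, the sketch below scales
a base timeout by an integer read from the timeout.factor system property (the
property name comes from the mvn command above). The class TimeoutScaling, the
60-second base value, and the fallback to 1 for missing or invalid input are
assumptions made for this example; the way Accumulo's own test harness reads
and applies the property may differ.

```java
import java.time.Duration;

/** Hypothetical sketch: scale a base test timeout by a system-property factor. */
final class TimeoutScaling {

  // Property name taken from the -Dtimeout.factor=3 flag in the command above.
  static final String FACTOR_PROPERTY = "timeout.factor";

  /** Parses the factor; falls back to 1 when it is missing, malformed, or below 1. */
  static int parseFactor(String raw) {
    if (raw == null) {
      return 1;
    }
    try {
      return Math.max(1, Integer.parseInt(raw.trim()));
    } catch (NumberFormatException e) {
      return 1;
    }
  }

  /** Returns the base timeout multiplied by the parsed factor. */
  static Duration scaled(Duration base, String rawFactor) {
    return base.multipliedBy(parseFactor(rawFactor));
  }

  public static void main(String[] args) {
    Duration base = Duration.ofSeconds(60); // hypothetical base timeout
    Duration effective = scaled(base, System.getProperty(FACTOR_PROPERTY));
    System.out.println("effective timeout: " + effective.getSeconds() + " s");
  }
}
```

Run with -Dtimeout.factor=3, this sketch reports 180 s for the assumed 60 s
base; without the property it leaves the base timeout unchanged.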
> > >>>> There's nothing we know of that would change the way our tests work
> > >>>> due to ARM64, but you may have issues because of limited RAM, slow CPU
> > >>>> speeds, slow disk I/O, busy background processes, or other
> > >>>> resource-related issues. I don't think most of the currently active
> > >>>> developers use ARM64, or have access to a test machine to reproduce or
> > >>>>
> > >>>
> > >>> In case anyone wants to test on Linux ARM64, you could easily use
> > >>> Oracle Cloud for free.
> > >>>
> > >>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
> > >>> explains how to create a VM and how to use this VM as a GitHub Actions
> > >>> runner.
> > >>>
> > >>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
> > >>> mentions this article.
> > >>>
> > >>>
> > >>>> experiment with Accumulo there, so you may have to do some of your own
> > >>>> troubleshooting. If you can rule out resource-constraint issues, and
> > >>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
> > >>>> flaky and sometimes times out on x86_64 as well), you could create a
> > >>>> bug ticket with more details at
> > >>>> https://github.com/apache/accumulo/issues ; there is an issue template
> > >>>> specifically for broken and/or flaky tests that you can select when
> > >>>> creating a new ticket.
> > >>>>
> > >>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com>
> > >>>> wrote:
> > >>>> >
> > >>>> > Hi dev1,
> > >>>> >
> > >>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
> > >>>> >
> > >>>> > > Some of those tests are trying to stress conditions that require a
> > >>>> > > lot of resources to replicate specific conditions. Have you tried to
> > >>>> > > run those individual tests in isolation so that you are not competing
> > >>>> > > for resources? Do they always fail, or are the failures transient?
> > >>>> > >
> > >>>> >
> > >>>> > Q: Have you tried to run those individual tests in isolation so that
> > >>>> > you are not competing for resources?
> > >>>> > A: This is what I meant by the following:
> > >>>> > ---------------------
> > >>>> > The tests fail even when executed separately, e.g.:
> > >>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> > >>>> > ---------------------
> > >>>> >
> > >>>> > Q: Do they always fail, or are the failures transient?
> > >>>> > A: I also tried to explain that with "These tests fail consistently at
> > >>>> > every build attempt!"
> > >>>> >
> > >>>> > Mark
> > >>>> >
> > >>>> > >
> > >>>> > > -----Original Message-----
> > >>>> > > From: Mark Jens <ma...@gmail.com>
> > >>>> > > Sent: Tuesday, November 30, 2021 4:05 AM
> > >>>> > > To: dev@accumulo.apache.org
> > >>>> > > Subject: Consistent IT tests failures on Linux ARM64
> > >>>> > >
> > >>>> > > Hello Accumulo community,
> > >>>> > >
> > >>>> > > At my job we consider using Linux ARM64 servers and I've been
> > >>>> tasked to
> > >>>> > > test Accumulo.
> > >>>> > >
> > >>>> > > I face some timeout related issues with several IT tests:
> > >>>> > >
> > >>>> > >
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > >>>> > >  Time elapsed: 420.122 s  <<< ERROR!
> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
> after
> > >>>> 420
> > >>>> > > seconds at java.base@11.0.11
> /jdk.internal.misc.Unsafe.park(Native
> > >>>> Method)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > >>>> > > Method)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>> > > at java.base@11.0.11
> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >>>> > >
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > >>>> > >  Time elapsed: 420.122 s  <<< ERROR!
> > >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> > >>>> > > test-SendThread(localhost:44251)
> > >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> > >>>> > > java.base@11.0.11
> > >>>> > >
> /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > >>>> > > at java.base@11.0.11/sun.nio.ch
> > >>>> .SelectorImpl.select(SelectorImpl.java:136)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > >>>> > > at
> > >>>> > >
> > >>>>
> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> > >>>> > >
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
> > >>>> > >  Time elapsed: 420.011 s  <<< ERROR!
> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
> after
> > >>>> 420
> > >>>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native
> Method)
> > >>>> at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> > >>>> > > at
> > >>>>
> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> > >>>> > > at
> > >>>>
> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > >>>> > > Method)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>> > > at java.base@11.0.11
> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >>>> > >
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.ScannerContextIT
> > >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 102.909 s - in
> org.apache.accumulo.test.functional.ScannerContextIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.KerberosRenewalIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 504.472 s - in
> org.apache.accumulo.test.functional.KerberosRenewalIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.BatchWriterFlushIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 62.132 s - in
> org.apache.accumulo.test.functional.BatchWriterFlushIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.ZookeeperRestartIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 37.37 s - in
> org.apache.accumulo.test.functional.ZookeeperRestartIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 23.046 s - in
> > >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > >>>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 255.108 s - in
> > >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.RestartStressIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 78.359 s - in
> org.apache.accumulo.test.functional.RestartStressIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 59.289 s - in
> > >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> > >>>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.BinaryStressIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.SplitRecoveryIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 25.421 s - in
> org.apache.accumulo.test.functional.SplitRecoveryIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.BigRootTabletIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
> > >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 238.409 s - in
> > >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
> > >>>> > > [INFO] Running
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 219.253 s - in
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
> > >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 489.863 s - in
> > >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
> > >>>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.ManagerFailoverIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 47.089 s - in
> org.apache.accumulo.test.functional.ManagerFailoverIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.BackupManagerIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 22.943 s - in
> org.apache.accumulo.test.functional.BackupManagerIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.TabletMetadataIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 46.728 s - in
> org.apache.accumulo.test.functional.TabletMetadataIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.LateLastContactIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 46.648 s - in
> org.apache.accumulo.test.functional.LateLastContactIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 71.934 s - in
> > >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.HalfDeadTServerIT
> > >>>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 307.904 s <<< FAILURE! - in
> > >>>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
> > >>>> > > [ERROR]
> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > >>>> > >  Time elapsed: 240.011 s  <<< ERROR!
> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
> after
> > >>>> 240
> > >>>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native
> Method)
> > >>>> at
> > >>>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
> > >>>> > > at java.base@11.0.11
> > >>>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > >>>> > > Method)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>> > > at java.base@11.0.11
> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >>>> > >
> > >>>> > > [ERROR]
> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > >>>> > >  Time elapsed: 240.012 s  <<< ERROR!
> > >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> > >>>> > > test-SendThread(localhost:39285)
> > >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> > >>>> > > java.base@11.0.11
> > >>>> > >
> /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > >>>> > > at java.base@11.0.11/sun.nio.ch
> > >>>> .SelectorImpl.select(SelectorImpl.java:136)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > >>>> > > at
> > >>>> > >
> > >>>>
> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> > >>>> > >
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
> > >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 43.91 s - in
> > >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 113.928 s - in
> org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
> > >>>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
> > >>>> > > [INFO] Running
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > >>>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time
> > >>>> elapsed:
> > >>>> > > 0.039 s - in
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > >>>> > > [INFO]
> > >>>> > > [INFO] Results:
> > >>>> > > [INFO]
> > >>>> > > [ERROR] Errors:
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> > >>>> > > [ERROR]   Run 1:
> > >>>> > >
> ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178
> > >>>> »
> > >>>> > > TestTimedOut
> > >>>> > > [ERROR]   Run 2:
> > >>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> »
> > >>>> Appears
> > >>>> > > to ...
> > >>>> > > [INFO]
> > >>>> > > [ERROR]
>  ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
> > >>>> > > TestTimedOut test t...
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > >>>> > > [ERROR]   Run 1:
> > >>>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
> > >>>> TestTimedOut
> > >>>> > > tes...
> > >>>> > > [ERROR]   Run 2:
> > >>>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
> > >>>> > >  Appears to be stuck...
> > >>>> > > [INFO]
> > >>>> > > [ERROR]
> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > >>>> > > [ERROR]   Run 1:
> > >>>> > >
> > >>>> > >
> > >>>>
> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
> > >>>> > > » TestTimedOut
> > >>>> > > [ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be
> > >>>> stuck in
> > >>>> > > thread Time-limited te...
> > >>>> > > [INFO]
> > >>>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
> > >>>> > > [ERROR]   Run 1:
> > >>>> SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
> > >>>> > > TestTimedOut test timed ...
> > >>>> > > [ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in
> thread
> > >>>> > > Time-limited test-SendThread(...
> > >>>> > >
> > >>>> > > These tests fail consistently at every build attempt!
> > >>>> > >
> > >>>> > > The tests fail even when executed separately, e.g.:
> > >>>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf
> :accumulo-test
> > >>>> > >
> > >>>> > >
> > >>>> > > I am using the current 'main' branch of Accumulo.
> > >>>> > > JDK 11.0.11
> > >>>> > > Maven: 3.8.2
> > >>>> > > OS: Ubuntu 20.04.3 ARM64
> > >>>> > >
> > >>>> > > Is there anything that could be done to fix these problems ?
> > >>>> > > For example some config settings ?!
> > >>>> > >
> > >>>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read
> that
> > >>>> Linux
> > >>>> > > ARM64 is a supported platform since the JVM supports it.
> > >>>> > >
> > >>>> > > Thanks!
> > >>>> > >
> > >>>> > > Mark
> > >>>> > >
> > >>>>
> > >>>
>

Re: Consistent IT tests failures on Linux ARM64

Posted by Christopher <ct...@apache.org>.
I don't see any reason it would break anything else, and I'm not opposed to
making a change there to avoid repeated calls to the security provider
to create the credentials, but I'm strongly skeptical that this would
fix the performance problem with that IT. I've seen that test pass
very quickly before, without your change, so I think it might be a
coincidence. I think if you were to capture a thread dump at other
times, you wouldn't always see it in that code, but you'd find it busy
doing other work instead. If it does fix it permanently, though, I'd
be pleasantly surprised. Either way, I think we can move forward with
your PR, because it does avoid unnecessary recomputation
of immutable credentials in ServerInfo.
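
One way to check the hunch above would be to sample stack traces
repeatedly and count how often a given frame shows up. The following is
only a minimal sketch under my own assumptions: StackSampler and its
methods are hypothetical names, it samples the JVM it runs in (via the
standard ThreadMXBean API) rather than a separate tserver process, and
the class-name prefix to look for is whatever you pass in.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: samples this JVM's threads and counts hits on a frame. */
final class StackSampler {
  private StackSampler() {}

  /** True if any frame of the thread belongs to a class whose name starts with classPrefix. */
  static boolean hasFrame(ThreadInfo info, String classPrefix) {
    if (info == null) {
      return false;
    }
    for (StackTraceElement frame : info.getStackTrace()) {
      if (frame.getClassName().startsWith(classPrefix)) {
        return true;
      }
    }
    return false;
  }

  /**
   * Takes `samples` thread dumps, `intervalMillis` apart, and returns how many
   * of those dumps contained at least one thread with a matching frame.
   */
  static int countSamplesWithFrame(String classPrefix, int samples, long intervalMillis) {
    ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    int hits = 0;
    for (int i = 0; i < samples; i++) {
      // false, false: skip lock details, we only need the stack frames
      for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
        if (hasFrame(info, classPrefix)) {
          hits++;
          break;
        }
      }
      if (i < samples - 1) {
        try {
          Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt and stop sampling
          break;
        }
      }
    }
    return hits;
  }
}
```

For example, countSamplesWithFrame(
"org.apache.accumulo.server.security.SystemCredentials", 20, 5000)
would take 20 samples 5 seconds apart. For the separate processes that
the tests start, repeated jstack runs should give the same kind of data.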

On Thu, Dec 2, 2021 at 7:23 AM Mark Jens <ma...@gmail.com> wrote:
>
> Please review https://github.com/apache/accumulo/pull/2374
> By caching the ServerInfo's Credentials, ConcurrentDeleteTableIT passes
> almost 6 times faster now!
> I am running the whole test suite now to check that it doesn't break
> anything else.
>
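
For readers following along, caching an immutable value that is
expensive to build could look roughly like the sketch below. This is not
the code from the PR; ExpensiveCredentials and CachedCredentials are
hypothetical names, and the supplier stands in for whatever actually
builds the credentials.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for an immutable, expensive-to-build credentials object. */
final class ExpensiveCredentials {
  final String principal;

  ExpensiveCredentials(String principal) {
    this.principal = principal;
  }
}

/** Hypothetical holder: builds the value on first use, then returns the cached instance. */
final class CachedCredentials {
  private final Supplier<ExpensiveCredentials> factory;
  private volatile ExpensiveCredentials cached;

  CachedCredentials(Supplier<ExpensiveCredentials> factory) {
    this.factory = factory;
  }

  ExpensiveCredentials get() {
    ExpensiveCredentials result = cached;
    if (result == null) {
      synchronized (this) {
        result = cached;
        if (result == null) {
          // Only the first caller pays for the expensive computation.
          result = factory.get();
          cached = result;
        }
      }
    }
    return result;
  }
}
```

This only makes sense when the cached value really is immutable for the
lifetime of the holder; otherwise stale credentials would be handed out.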
> On Thu, 2 Dec 2021 at 13:49, Mark Jens <ma...@gmail.com> wrote:
>
> > Reducing the log output did not reduce the test run time:
> >
> > diff --git test/src/main/resources/log4j2-test.properties
> > test/src/main/resources/log4j2-test.properties
> > index 9124914f7a..810c7bf06f 100644
> > --- test/src/main/resources/log4j2-test.properties
> > +++ test/src/main/resources/log4j2-test.properties
> > @@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
> >  appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
> >
> >  logger.01.name = org.apache.accumulo.core
> > -logger.01.level = debug
> > +logger.01.level = info
> >
> >  logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
> >  logger.02.level = info
> > @@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
> >  logger.25.level = info
> >
> >  logger.26.name = org.apache.hadoop.minikdc
> > -logger.26.level = debug
> > +logger.26.level = info
> >
> >
> > @@ -169,6 +169,6 @@ logger.metrics.level = info
> >  logger.metrics.additivity = false
> >  logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput
> >
> > -rootLogger.level = debug
> > +rootLogger.level = info
> >  rootLogger.appenderRef.console.ref = STDOUT
> >
> > [INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 785.503 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> >
> >
> > On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:
> >
> >> Hi again,
> >>
> >> Here are the thread dumps as promised:
> >>
> >> 1) Both TabletServers are very busy compressing at close time. The
> >> following stacks were dumped at ~5-second intervals:
> >>
> >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms
> >> elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable
> >>  [0x0000fffe8f3fd000]
> >>    java.lang.Thread.State: RUNNABLE
> >>         at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11
> >> /SHA5.java:232)
> >>         at sun.security.provider.SHA5.implCompress(java.base@11.0.11
> >> /SHA5.java:221)
> >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:124)
> >>         at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >>         at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >>         at
> >> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >>         at
> >> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> >>         at
> >> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> >>         at
> >> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> >>         at
> >> org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
> >>         at
> >> org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
> >>         at
> >> org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
> >>         at
> >> org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
> >>         - locked <0x00000000f1585830> (a
> >> org.apache.accumulo.tserver.tablet.Tablet)
> >>         at
> >> org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
> >>         at
> >> org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
> >>         at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >>         at
> >> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> >> Source)
> >>         at
> >> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> >> /ThreadPoolExecutor.java:1128)
> >>         at
> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> >> /ThreadPoolExecutor.java:628)
> >>         at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >>         at
> >> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> >> Source)
> >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms
> >> elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable
> >>  [0x0000fffe8f3fd000]
> >>    java.lang.Thread.State: RUNNABLE
> >>         at
> >> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> >> /DigestBase.java:149)
> >>         at
> >> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> >> /DigestBase.java:144)
> >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:131)
> >>         at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >>         at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >>         ...
> >>
> >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms
> >> elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
> >>  [0x0000fffe8f3fd000]
> >>    java.lang.Thread.State: RUNNABLE
> >>         at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
> >> /ByteArrayAccess.java:449)
> >>         at sun.security.provider.SHA5.implDigest(java.base@11.0.11
> >> /SHA5.java:131)
> >>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> >> /DigestBase.java:210)
> >>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> >> /DigestBase.java:189)
> >>         at
> >> java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
> >> /MessageDigest.java:639)
> >>         at java.security.MessageDigest.digest(java.base@11.0.11
> >> /MessageDigest.java:385)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >>         ...
> >>
> >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=86499.01ms
> >> elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
> >>  [0x0000fffe8f3fd000]
> >>    java.lang.Thread.State: RUNNABLE
> >>         at
> >> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> >> /DigestBase.java:149)
> >>         at
> >> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> >> /DigestBase.java:144)
> >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:131)
> >>         at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >>         at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >>         ...
> >>
> >> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0 cpu=109551.37ms
> >> elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
> >>  [0x0000fffe7bffd000]
> >>    java.lang.Thread.State: RUNNABLE
> >>         at
> >> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> >> /DigestBase.java:149)
> >>         at
> >> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> >> /DigestBase.java:144)
> >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:131)
> >>         at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >>         at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> >>         at
> >> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >>
> >> Notice that ClientContext.getProperties(ClientContext.java:236) goes
> >> through ServerInfo.getProperties, which most of the time calls
> >> ServerInfo.getPrincipal(ServerInfo.java:148), but in the last dump it
> >> calls ServerInfo.getAuthenticationToken(ServerInfo.java:153).
> >> Both lead to (a lot of?!) compressing.
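
A note on why these frames are CPU-heavy: the stacks go through
Sha2Crypt.sha512Crypt, and SHA-crypt style hashes deliberately run the
digest for many rounds (5000 by default, as far as I know). The sketch
below is a simplified stand-in, not the real sha512crypt algorithm;
IteratedHashDemo and iteratedSha512 are hypothetical names. It only
illustrates that repeating an iterated hash on every call adds up.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical demo: a simplified iterated SHA-512, NOT the real sha512crypt. */
final class IteratedHashDemo {
  private IteratedHashDemo() {}

  /** Hashes the input once, then re-hashes (state + input) for the remaining rounds. */
  static byte[] iteratedSha512(byte[] input, int rounds) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = md.digest(input);
      for (int i = 1; i < rounds; i++) {
        md.update(state);
        md.update(input);
        state = md.digest(); // digest() also resets md for the next round
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available in this JVM", e);
    }
  }

  public static void main(String[] args) {
    byte[] input = "instance-config".getBytes(StandardCharsets.UTF_8);
    long start = System.nanoTime();
    // Recompute on every call, as an uncached getter would.
    for (int call = 0; call < 100; call++) {
      iteratedSha512(input, 5000);
    }
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    System.out.println("100 uncached calls x 5000 rounds took " + elapsedMs + " ms");
  }
}
```

The absolute timing will differ per machine and JVM, so the numbers are
only meaningful when compared on the same host.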
> >>
> >> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
> >> level should not be DEBUG?!
> >>
> >> Most of its threads either wait for notifications from ZooKeeper:
> >>
> >> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
> >> cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
> >> Object.wait()  [0x0000fffebb7fc000]
> >>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
> >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
> >>         - waiting on <no object reference available>
> >>         at
> >> org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
> >>         - waiting to re-lock in wait() <0x00000000f1427458> (a
> >> org.apache.accumulo.fate.ZooStore)
> >>         at
> >> org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
> >>         at
> >> org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
> >>         at
> >> org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
> >>         at
> >> org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
> >>         at
> >> org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
> >>         ...
> >>
> >> or wait for data:
> >> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms
> >> elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
> >>  [0x0000fffebadfd000]
> >>    java.lang.Thread.State: WAITING (on object monitor)
> >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
> >>         - waiting on <no object reference available>
> >>         at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> >>         at
> >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> >>         - waiting to re-lock in wait() <0x00000000f9bf42d8> (a
> >> org.apache.zookeeper.ClientCnxn$Packet)
> >>         at
> >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> >>         at
> >> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
> >> Source)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> >> Source)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
> >>         at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
> >>         at
> >> org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
> >>         at
> >> org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
> >>         at
> >> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
> >>         at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >>         at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> >> Source)
> >>         at
> >> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> >> /ThreadPoolExecutor.java:1128)
> >>         at
> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> >> /ThreadPoolExecutor.java:628)
> >>         at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >>         at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> >> Source)
> >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >> "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
> >> elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
> >>  [0x0000ffff20f50000]
> >>    java.lang.Thread.State: WAITING (on object monitor)
> >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
> >>         - waiting on <no object reference available>
> >>         at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> >>         at
> >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> >>         - waiting to re-lock in wait() <0x00000000fa781138> (a
> >> org.apache.zookeeper.ClientCnxn$Packet)
> >>         at
> >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> >>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
> >> Source)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> >> Source)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> >>         at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
> >>         at
> >> org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
> >>         at
> >> org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
> >>         at
> >> org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
> >>         at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >>         at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> >> Source)
> >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >> 3) SimpleGarbageCollector is also busy getting credentials:
> >>
> >>  "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
> >> tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
> >>    java.lang.Thread.State: RUNNABLE
> >>         at
> >> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> >> /DigestBase.java:149)
> >>         at
> >> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> >> /DigestBase.java:144)
> >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:131)
> >>         at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >>         at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> >>         at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >>         at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >>         at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >>         at
> >> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >>         at
> >> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> >>         at
> >> org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
> >>         at
> >> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
> >>         at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
> >>         at
> >> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
> >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >>         at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
> >> Source)
> >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >>
> >> 3151 "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s
> >> tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
> >> 3152    java.lang.Thread.State: RUNNABLE
> >> 3153   at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
> >> 3154   at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
> >> 3155   at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11
> >> /Provider.java:1107)
> >> 3156   at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11
> >> /ConcurrentHashMap.java:936)
> >> 3157   at java.security.Provider.getService(java.base@11.0.11
> >> /Provider.java:1282)
> >> 3158   at sun.security.jca.ProviderList.getService(java.base@11.0.11
> >> /ProviderList.java:380)
> >> 3159   at sun.security.jca.GetInstance.getInstance(java.base@11.0.11
> >> /GetInstance.java:157)
> >> 3160   at java.security.Security.getImpl(java.base@11.0.11
> >> /Security.java:700)
> >> 3161   at java.security.MessageDigest.getInstance(java.base@11.0.11
> >> /MessageDigest.java:178)
> >> 3162   at
> >> org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
> >> 3163   at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
> >> 3164   at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >> 3165   at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >> 3166   at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >> 3167   at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >> 3168   at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >> 3169   at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >> 3170   at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >> 3171   at
> >> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> >> 3172   at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> >> 3173   at
> >> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >> 3174   at
> >> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> >> 3175   at
> >> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> >> 3176   at
> >> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> >> 3177   at
> >> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
> >> 3178   at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
> >> 3179   at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
> >> 3180   at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
> >> 3181   at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
> >> 3182   at
> >> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
> >> 3183   at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >> 3184   at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
> >> Source)
> >> 3185   at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >>
> >> 4) Nothing interesting for Initialize, Main and ZooKeeperServerMain
> >> processes
> >>
> >>
> >> I'm not saying that the above are problematic. You know how Accumulo
> >> works. It is up to you to decide whether something should be improved.
> >>
> >> Regards,
> >> Mark
> >>
> >>
> >> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:
> >>
> >>>
> >>>
> >>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
> >>>
> >>>> It looks like the tests are timing out. This happens frequently when
> >>>> running on resource-constrained systems. You can give the test more
> >>>> time by increasing the timeout factor: `mvn clean verify
> >>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
> >>>> -Dtimeout.factor=3`
> >>>>
> >>>> There's nothing we know of that would change the way our tests work
> >>>> due to ARM64, but you may have issues because of limited RAM, slow CPU
> >>>> speeds, slow disk I/O, busy background processes, or other
> >>>> resource-related issues. I don't think most of the currently active
> >>>> developers use ARM64, or have access to a test machine to reproduce or
> >>>>
> >>>
> >>> In case anyone wants to test on Linux ARM64 you could easily use Oracle
> >>> Cloud for free.
> >>>
> >>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
> >>> explains how to create a VM and how to use this VM as a Github Actions
> >>> runner.
> >>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
> >>> mentions this article.
> >>>
> >>>
> >>>> experiment with Accumulo there, so you may have to do some of your own
> >>>> troubleshooting. If you can rule out resource-constraint issues, and
> >>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
> >>>> flaky and sometimes times out on x86_64 as well), you could create a
> >>>> bug ticket with more details at
> >>>> https://github.com/apache/accumulo/issues ; there is an issue template
> >>>> specifically for broken and/or flaky tests that you can select when
> >>>> creating a new ticket.
> >>>>
> >>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com>
> >>>> wrote:
> >>>> >
> >>>> > Hi dev1,
> >>>> >
> >>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
> >>>> >
> >>>> > > Some of those tests are trying to stress conditions that require a
> >>>> lot of
> >>>> > > resources to replicate specific conditions. Have you tried to run
> >>>> those
> >>>> > > individual tests in isolation so that you are not competing for
> >>>> resources?
> >>>> > > Do they always fail, or are the failures transient?
> >>>> > >
> >>>> >
> >>>> > Q: Have you tried to run those individual tests in isolation so that
> >>>> you
> >>>> > are not competing for resources?
> >>>> > A: This is what I meant by the following:
> >>>> > ---------------------
> >>>> > The tests fail even when executed separately, e.g.:
> >>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> >>>> > ---------------------
> >>>> >
> >>>> > Q: Do they always fail, or are the failures transient?
> >>>> > A: I also tried to explain that with "These tests fail consistently at
> >>>> > every build attempt!"
> >>>> >
> >>>> > Mark
> >>>> >
> >>>> > >
> >>>> > > -----Original Message-----
> >>>> > > From: Mark Jens <ma...@gmail.com>
> >>>> > > Sent: Tuesday, November 30, 2021 4:05 AM
> >>>> > > To: dev@accumulo.apache.org
> >>>> > > Subject: Consistent IT tests failures on Linux ARM64
> >>>> > >
> >>>> > > Hello Accumulo community,
> >>>> > >
> >>>> > > At my job we consider using Linux ARM64 servers and I've been
> >>>> tasked to
> >>>> > > test Accumulo.
> >>>> > >
> >>>> > > I face some timeout related issues with several IT tests:
> >>>> > >
> >>>> > >
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> >>>> > >  Time elapsed: 420.122 s  <<< ERROR!
> >>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
> >>>> 420
> >>>> > > seconds at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native
> >>>> Method)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> >>>> > > at java.base@11.0.11
> >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>> > > Method)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>> > > at java.base@11.0.11
> >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >>>> > >
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> >>>> > >  Time elapsed: 420.122 s  <<< ERROR!
> >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> >>>> > > test-SendThread(localhost:44251)
> >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> >>>> > > java.base@11.0.11
> >>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> >>>> > > at java.base@11.0.11
> >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> >>>> > > at java.base@11.0.11/sun.nio.ch
> >>>> .SelectorImpl.select(SelectorImpl.java:136)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> >>>> > > at
> >>>> > >
> >>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> >>>> > >
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
> >>>> > >  Time elapsed: 420.011 s  <<< ERROR!
> >>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
> >>>> 420
> >>>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
> >>>> at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> >>>> > > at
> >>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> >>>> > > at
> >>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> >>>> > > at java.base@11.0.11
> >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>> > > Method)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>> > > at java.base@11.0.11
> >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >>>> > >
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
> >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.BatchWriterFlushIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.ZookeeperRestartIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 23.046 s - in
> >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> >>>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 255.108 s - in
> >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 59.289 s - in
> >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> >>>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
> >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 238.409 s - in
> >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
> >>>> > > [INFO] Running
> >>>> > >
> >>>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 219.253 s - in
> >>>> > >
> >>>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
> >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 489.863 s - in
> >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
> >>>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 71.934 s - in
> >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
> >>>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 307.904 s <<< FAILURE! - in
> >>>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
> >>>> > > [ERROR]
> >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> >>>> > >  Time elapsed: 240.011 s  <<< ERROR!
> >>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
> >>>> 240
> >>>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native Method)
> >>>> at
> >>>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
> >>>> > > at java.base@11.0.11
> >>>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
> >>>> > > at java.base@11.0.11
> >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>> > > Method)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>> > > at java.base@11.0.11
> >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >>>> > >
> >>>> > > [ERROR]
> >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> >>>> > >  Time elapsed: 240.012 s  <<< ERROR!
> >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> >>>> > > test-SendThread(localhost:39285)
> >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> >>>> > > java.base@11.0.11
> >>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> >>>> > > at java.base@11.0.11
> >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> >>>> > > at java.base@11.0.11/sun.nio.ch
> >>>> .SelectorImpl.select(SelectorImpl.java:136)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> >>>> > > at
> >>>> > >
> >>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> >>>> > >
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
> >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 43.91 s - in
> >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
> >>>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
> >>>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
> >>>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
> >>>> > > [INFO] Running
> >>>> > >
> >>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> >>>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time
> >>>> elapsed:
> >>>> > > 0.039 s - in
> >>>> > >
> >>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> >>>> > > [INFO]
> >>>> > > [INFO] Results:
> >>>> > > [INFO]
> >>>> > > [ERROR] Errors:
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> >>>> > > [ERROR]   Run 1:
> >>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178
> >>>> »
> >>>> > > TestTimedOut
> >>>> > > [ERROR]   Run 2:
> >>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »
> >>>> Appears
> >>>> > > to ...
> >>>> > > [INFO]
> >>>> > > [ERROR]   ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
> >>>> > > TestTimedOut test t...
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> >>>> > > [ERROR]   Run 1:
> >>>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
> >>>> TestTimedOut
> >>>> > > tes...
> >>>> > > [ERROR]   Run 2:
> >>>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
> >>>> > >  Appears to be stuck...
> >>>> > > [INFO]
> >>>> > > [ERROR]
> >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> >>>> > > [ERROR]   Run 1:
> >>>> > >
> >>>> > >
> >>>> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
> >>>> > > » TestTimedOut
> >>>> > > [ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be
> >>>> stuck in
> >>>> > > thread Time-limited te...
> >>>> > > [INFO]
> >>>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
> >>>> > > [ERROR]   Run 1:
> >>>> SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
> >>>> > > TestTimedOut test timed ...
> >>>> > > [ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in thread
> >>>> > > Time-limited test-SendThread(...
> >>>> > >
> >>>> > > These tests fail consistently at every build attempt!
> >>>> > >
> >>>> > > The tests fail even when executed separately, e.g.:
> >>>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> >>>> > >
> >>>> > >
> >>>> > > I am using the current 'main' branch of Accumulo.
> >>>> > > JDK 11.0.11
> >>>> > > Maven: 3.8.2
> >>>> > > OS: Ubuntu 20.04.3 ARM64
> >>>> > >
> >>>> > > Is there anything that could be done to fix these problems ?
> >>>> > > For example some config settings ?!
> >>>> > >
> >>>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that
> >>>> Linux
> >>>> > > ARM64 is a supported platform since the JVM supports it.
> >>>> > >
> >>>> > > Thanks!
> >>>> > >
> >>>> > > Mark
> >>>> > >
> >>>>
> >>>

Re: Consistent IT tests failures on Linux ARM64

Posted by Mark Jens <ma...@gmail.com>.
Please review https://github.com/apache/accumulo/pull/2374
By caching the ServerInfo's Credentials, ConcurrentDeleteTableIT now passes
almost 6 times faster!
I am running the whole test suite now to check that it doesn't break
anything else.
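
For readers following along, the general idea of caching an expensive
credential computation can be sketched roughly as below. This is not the
code from the pull request: the class CachedCredentialsSketch, the helper
memoize(), and expensiveHash() are hypothetical names made up for
illustration, and the repeated SHA-512 loop is only a stand-in for the
Sha2Crypt work seen in the thread dumps. The sketch also assumes the hashed
inputs never change while the process runs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical illustration only; not Accumulo's actual classes.
public class CachedCredentialsSketch {

  /** Wraps a supplier so the expensive computation runs at most once. */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T result = value;
        if (result == null) {
          synchronized (this) {
            result = value;
            if (result == null) {
              // First caller pays the cost; later callers reuse the value.
              result = delegate.get();
              value = result;
            }
          }
        }
        return result;
      }
    };
  }

  /** Counts how often the expensive path is actually taken. */
  static final AtomicInteger HASH_CALLS = new AtomicInteger();

  /** Stand-in for costly token generation: many rounds of SHA-512. */
  static byte[] expensiveHash(String input) {
    HASH_CALLS.incrementAndGet();
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] data = input.getBytes(StandardCharsets.UTF_8);
      for (int i = 0; i < 5000; i++) {
        data = md.digest(data);
      }
      return data;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Supplier<byte[]> credentials = memoize(() -> expensiveHash("instance-config"));
    for (int i = 0; i < 1000; i++) {
      credentials.get(); // only the first call hashes
    }
    System.out.println("hash computations: " + HASH_CALLS.get());
  }
}
```

With a wrapper like this, a frequently called accessor pays the hashing
cost once instead of on every call. Whether that is safe depends on the
hashed configuration staying constant for the lifetime of the process.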

On Thu, 2 Dec 2021 at 13:49, Mark Jens <ma...@gmail.com> wrote:

> Reducing the log output did not reduce the test run time:
>
> diff --git test/src/main/resources/log4j2-test.properties
> test/src/main/resources/log4j2-test.properties
> index 9124914f7a..810c7bf06f 100644
> --- test/src/main/resources/log4j2-test.properties
> +++ test/src/main/resources/log4j2-test.properties
> @@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
>  appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
>
>  logger.01.name = org.apache.accumulo.core
> -logger.01.level = debug
> +logger.01.level = info
>
>  logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
>  logger.02.level = info
> @@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
>  logger.25.level = info
>
>  logger.26.name = org.apache.hadoop.minikdc
> -logger.26.level = debug
> +logger.26.level = info
>
>
> @@ -169,6 +169,6 @@ logger.metrics.level = info
>  logger.metrics.additivity = false
>  logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput
>
> -rootLogger.level = debug
> +rootLogger.level = info
>  rootLogger.appenderRef.console.ref = STDOUT
>
> [INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 785.503 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>
>
> On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:
>
>> Hi again,
>>
>> Here are the thread dumps as promised:
>>
>> 1) Both TabletServers are very busy hashing (SHA-512 compression rounds)
>> when closing tablets. The following stacks were dumped at ~5 second
>> intervals:
>>
>> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms
>> elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable
>>  [0x0000fffe8f3fd000]
>>    java.lang.Thread.State: RUNNABLE
>>         at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11
>> /SHA5.java:232)
>>         at sun.security.provider.SHA5.implCompress(java.base@11.0.11
>> /SHA5.java:221)
>>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:124)
>>         at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>>         at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>>         at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>>         at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>>         at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>>         at
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>>         at
>> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>>         at
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>>         at
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>>         at
>> org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
>>         at
>> org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
>>         at
>> org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
>>         at
>> org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
>>         - locked <0x00000000f1585830> (a
>> org.apache.accumulo.tserver.tablet.Tablet)
>>         at
>> org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
>>         at
>> org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
>>         at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>>         at
>> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
>> Source)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
>> /ThreadPoolExecutor.java:1128)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
>> /ThreadPoolExecutor.java:628)
>>         at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>>         at
>> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
>> Source)
>>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms
>> elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable
>>  [0x0000fffe8f3fd000]
>>    java.lang.Thread.State: RUNNABLE
>>         at
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> /DigestBase.java:149)
>>         at
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> /DigestBase.java:144)
>>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:131)
>>         at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>>         at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>>         at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>>         at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>>         at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>>         ...
>>
>> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms
>> elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
>>  [0x0000fffe8f3fd000]
>>    java.lang.Thread.State: RUNNABLE
>>         at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
>> /ByteArrayAccess.java:449)
>>         at sun.security.provider.SHA5.implDigest(java.base@11.0.11
>> /SHA5.java:131)
>>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
>> /DigestBase.java:210)
>>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
>> /DigestBase.java:189)
>>         at
>> java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
>> /MessageDigest.java:639)
>>         at java.security.MessageDigest.digest(java.base@11.0.11
>> /MessageDigest.java:385)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>>         at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>>         at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>>         at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>>         ...
>>
>> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=86499.01ms
>> elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
>>  [0x0000fffe8f3fd000]
>>    java.lang.Thread.State: RUNNABLE
>>         at
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> /DigestBase.java:149)
>>         at
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> /DigestBase.java:144)
>>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:131)
>>         at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>>         at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>>         at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>>         at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>>         at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>>         ...
>>
>> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0 cpu=109551.37ms
>> elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
>>  [0x0000fffe7bffd000]
>>    java.lang.Thread.State: RUNNABLE
>>         at
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> /DigestBase.java:149)
>>         at
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> /DigestBase.java:144)
>>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:131)
>>         at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>>         at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>>         at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>>         at
>> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>>         at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>>         at
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>>
>> Notice that ClientContext.getProperties(ClientContext.java:236) most of
>> the time ends up in ServerInfo.getPrincipal(ServerInfo.java:148), but in
>> the last dump it ends up in
>> ServerInfo.getAuthenticationToken(ServerInfo.java:153).
>> Both paths lead to (a lot of?!) SHA-512 hashing.
>>
>> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
>> level should not be DEBUG?!
>>
>> Most of its threads either wait for notifications from ZooKeeper:
>>
>> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
>> cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
>> Object.wait()  [0x0000fffebb7fc000]
>>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>>         - waiting on <no object reference available>
>>         at
>> org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
>>         - waiting to re-lock in wait() <0x00000000f1427458> (a
>> org.apache.accumulo.fate.ZooStore)
>>         at
>> org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
>>         at
>> org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
>>         at
>> org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
>>         at
>> org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
>>         at
>> org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
>> ...
>>
>> or wait for data:
>> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms
>> elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
>>  [0x0000fffebadfd000]
>>    java.lang.Thread.State: WAITING (on object monitor)
>>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>>         - waiting on <no object reference available>
>>         at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
>>         at
>> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>>         - waiting to re-lock in wait() <0x00000000f9bf42d8> (a
>> org.apache.zookeeper.ClientCnxn$Packet)
>>         at
>> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>>         at
>> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
>> Source)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
>> Source)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
>>         at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
>>         at
>> org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
>>         at
>> org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
>>         at
>> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
>>         at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>>         at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
>> Source)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
>> /ThreadPoolExecutor.java:1128)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
>> /ThreadPoolExecutor.java:628)
>>         at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>>         at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
>> Source)
>>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>> "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
>> elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
>>  [0x0000ffff20f50000]
>>    java.lang.Thread.State: WAITING (on object monitor)
>>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>>         - waiting on <no object reference available>
>>         at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
>>         at
>> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>>         - waiting to re-lock in wait() <0x00000000fa781138> (a
>> org.apache.zookeeper.ClientCnxn$Packet)
>>         at
>> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
>> Source)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
>> Source)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>>         at
>> org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
>>         at
>> org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
>>         at
>> org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
>>         at
>> org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
>>         at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>>         at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
>> Source)
>>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>> 3) SimpleGarbageCollector is also busy getting credentials:
>>
>>  "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
>> tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
>>    java.lang.Thread.State: RUNNABLE
>>         at
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> /DigestBase.java:149)
>>         at
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> /DigestBase.java:144)
>>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:131)
>>         at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>>         at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>>         at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>>         at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>>         at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>>         at
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>>         at
>> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>>         at
>> org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
>>         at
>> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
>>         at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
>>         at
>> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
>>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>>         at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
>> Source)
>>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>>
>> "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s
>> tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
>>    java.lang.Thread.State: RUNNABLE
>>         at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
>>         at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
>>         at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11
>> /Provider.java:1107)
>>         at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11
>> /ConcurrentHashMap.java:936)
>>         at java.security.Provider.getService(java.base@11.0.11
>> /Provider.java:1282)
>>         at sun.security.jca.ProviderList.getService(java.base@11.0.11
>> /ProviderList.java:380)
>>         at sun.security.jca.GetInstance.getInstance(java.base@11.0.11
>> /GetInstance.java:157)
>>         at java.security.Security.getImpl(java.base@11.0.11
>> /Security.java:700)
>>         at java.security.MessageDigest.getInstance(java.base@11.0.11
>> /MessageDigest.java:178)
>>         at
>> org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
>>         at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>>         at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>>         at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>>         at
>> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>>         at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>>         at
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>>         at
>> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>>         at
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>>         at
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>>         at
>> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
>>         at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
>>         at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
>>         at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
>>         at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
>>         at
>> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
>>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>>         at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
>> Source)
>>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>>
>> 4) Nothing interesting in the Initialize, Main and ZooKeeperServerMain
>> processes.
>>
>>
>> I'm not saying that the above are problematic. You know how Accumulo
>> works. It is up to you to decide whether something should be improved.
>>
>> Regards,
>> Mark
>>
>>
>> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
>>>
>>>> It looks like the tests are timing out. This happens frequently when
>>>> running on resource-constrained systems. You can give the test more
>>>> time by increasing the timeout factor: `mvn clean verify
>>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
>>>> -Dtimeout.factor=3`
>>>>
>>>> There's nothing we know of that would change the way our tests work
>>>> due to ARM64, but you may have issues because of limited RAM, slow CPU
>>>> speeds, slow disk I/O, busy background processes, or other
>>>> resource-related issues. I don't think most of the currently active
>>>> developers use ARM64, or have access to a test machine to reproduce or
>>>>
>>>
>>> In case anyone wants to test on Linux ARM64 you could easily use Oracle
>>> Cloud for free.
>>>
>>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
>>> explains how to create a VM and how to use this VM as a Github Actions
>>> runner.
>>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
>>> mentions this article.
>>>
>>>
>>>> experiment with Accumulo there, so you may have to do some of your own
>>>> troubleshooting. If you can rule out resource-constraint issues, and
>>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
>>>> flaky and sometimes times out on x86_64 as well), you could create a
>>>> bug ticket with more details at
>>>> https://github.com/apache/accumulo/issues ; there is an issue template
>>>> specifically for broken and/or flaky tests that you can select when
>>>> creating a new ticket.
>>>>
>>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Hi dev1,
>>>> >
>>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>>>> >
>>>> > > Some of those tests are trying to stress conditions that require a
>>>> lot of
>>>> > > resources to replicate specific conditions. Have you tried to run
>>>> those
>>>> > > individual tests in isolation so that you are not competing for
>>>> resources?
>>>> > > Do they always fail, or are the failures transient?
>>>> > >
>>>> >
>>>> > Q: Have you tried to run those individual tests in isolation so that
>>>> you
>>>> > are not competing for resources?
>>>> > A: This is what I meant by the following:
>>>> > ---------------------
>>>> > The tests fail even when executed separately, e.g.:
>>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>>>> > ---------------------
>>>> >
>>>> > Q: Do they always fail, or are the failures transient?
>>>> > A: I also tried to explain that with "These tests fail consistently at
>>>> > every build attempt!"
>>>> >
>>>> > Mark
>>>> >
>>>> > >
>>>> > > -----Original Message-----
>>>> > > From: Mark Jens <ma...@gmail.com>
>>>> > > Sent: Tuesday, November 30, 2021 4:05 AM
>>>> > > To: dev@accumulo.apache.org
>>>> > > Subject: Consistent IT tests failures on Linux ARM64
>>>> > >
>>>> > > Hello Accumulo community,
>>>> > >
>>>> > > At my job we consider using Linux ARM64 servers and I've been
>>>> tasked to
>>>> > > test Accumulo.
>>>> > >
>>>> > > I face some timeout related issues with several IT tests:
>>>> > >
>>>> > >
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>>>> > >  Time elapsed: 420.122 s  <<< ERROR!
>>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>>>> 420
>>>> > > seconds at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native
>>>> Method)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
>>>> > > at java.base@11.0.11
>>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>> > > Method)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> > > at java.base@11.0.11
>>>> /java.lang.reflect.Method.invoke(Method.java:566)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>>>> > >
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>>>> > >  Time elapsed: 420.122 s  <<< ERROR!
>>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>>>> > > test-SendThread(localhost:44251)
>>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>>>> > > java.base@11.0.11
>>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>>>> > > at java.base@11.0.11
>>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>>>> > > at java.base@11.0.11/sun.nio.ch
>>>> .SelectorImpl.select(SelectorImpl.java:136)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>>>> > > at
>>>> > >
>>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>>>> > >
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
>>>> > >  Time elapsed: 420.011 s  <<< ERROR!
>>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>>>> 420
>>>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
>>>> at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
>>>> > > at
>>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
>>>> > > at
>>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
>>>> > > at java.base@11.0.11
>>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>> > > Method)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> > > at java.base@11.0.11
>>>> /java.lang.reflect.Method.invoke(Method.java:566)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>>>> > >
>>>> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
>>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.BatchWriterFlushIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.ZookeeperRestartIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 23.046 s - in
>>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>>>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 255.108 s - in
>>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 59.289 s - in
>>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
>>>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.GarbageCollectorIT
>>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 238.409 s - in
>>>> org.apache.accumulo.test.functional.GarbageCollectorIT
>>>> > > [INFO] Running
>>>> > >
>>>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 219.253 s - in
>>>> > >
>>>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 489.863 s - in
>>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
>>>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 71.934 s - in
>>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
>>>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
>>>> elapsed:
>>>> > > 307.904 s <<< FAILURE! - in
>>>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
>>>> > > [ERROR]
>>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>>>> > >  Time elapsed: 240.011 s  <<< ERROR!
>>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>>>> 240
>>>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native Method)
>>>> at
>>>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
>>>> > > at java.base@11.0.11
>>>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
>>>> > > at java.base@11.0.11
>>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>> > > Method)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> > > at java.base@11.0.11
>>>> /java.lang.reflect.Method.invoke(Method.java:566)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>>>> > >
>>>> > > [ERROR]
>>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>>>> > >  Time elapsed: 240.012 s  <<< ERROR!
>>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>>>> > > test-SendThread(localhost:39285)
>>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>>>> > > java.base@11.0.11
>>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>>>> > > at java.base@11.0.11
>>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>>>> > > at java.base@11.0.11/sun.nio.ch
>>>> .SelectorImpl.select(SelectorImpl.java:136)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>>>> > > at
>>>> > >
>>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>>>> > >
>>>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
>>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 43.91 s - in
>>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
>>>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
>>>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
>>>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
>>>> > > [INFO] Running
>>>> > >
>>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>>>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time
>>>> elapsed:
>>>> > > 0.039 s - in
>>>> > >
>>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>>>> > > [INFO]
>>>> > > [INFO] Results:
>>>> > > [INFO]
>>>> > > [ERROR] Errors:
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
>>>> > > [ERROR]   Run 1:
>>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178
>>>> »
>>>> > > TestTimedOut
>>>> > > [ERROR]   Run 2:
>>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »
>>>> Appears
>>>> > > to ...
>>>> > > [INFO]
>>>> > > [ERROR]   ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
>>>> > > TestTimedOut test t...
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>>>> > > [ERROR]   Run 1:
>>>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
>>>> TestTimedOut
>>>> > > tes...
>>>> > > [ERROR]   Run 2:
>>>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
>>>> > >  Appears to be stuck...
>>>> > > [INFO]
>>>> > > [ERROR]
>>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>>>> > > [ERROR]   Run 1:
>>>> > >
>>>> > >
>>>> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
>>>> > > » TestTimedOut
>>>> > > [ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be
>>>> stuck in
>>>> > > thread Time-limited te...
>>>> > > [INFO]
>>>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
>>>> > > [ERROR]   Run 1:
>>>> SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
>>>> > > TestTimedOut test timed ...
>>>> > > [ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in thread
>>>> > > Time-limited test-SendThread(...
>>>> > >
>>>> > > These tests fail consistently at every build attempt!
>>>> > >
>>>> > > The tests fail even when executed separately, e.g.:
>>>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>>>> > >
>>>> > >
>>>> > > I am using the current 'main' branch of Accumulo.
>>>> > > JDK 11.0.11
>>>> > > Maven: 3.8.2
>>>> > > OS: Ubuntu 20.04.3 ARM64
>>>> > >
>>>> > > Is there anything that could be done to fix these problems ?
>>>> > > For example some config settings ?!
>>>> > >
>>>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that
>>>> Linux
>>>> > > ARM64 is a supported platform since the JVM supports it.
>>>> > >
>>>> > > Thanks!
>>>> > >
>>>> > > Mark
>>>> > >
>>>>
>>>

Re: Consistent IT tests failures on Linux ARM64

Posted by Mark Jens <ma...@gmail.com>.
Reducing the log output did not reduce the test run time:

diff --git test/src/main/resources/log4j2-test.properties
test/src/main/resources/log4j2-test.properties
index 9124914f7a..810c7bf06f 100644
--- test/src/main/resources/log4j2-test.properties
+++ test/src/main/resources/log4j2-test.properties
@@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
 appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n

 logger.01.name = org.apache.accumulo.core
-logger.01.level = debug
+logger.01.level = info

 logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
 logger.02.level = info
@@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
 logger.25.level = info

 logger.26.name = org.apache.hadoop.minikdc
-logger.26.level = debug
+logger.26.level = info


@@ -169,6 +169,6 @@ logger.metrics.level = info
 logger.metrics.additivity = false
 logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput

-rootLogger.level = debug
+rootLogger.level = info
 rootLogger.appenderRef.console.ref = STDOUT

[INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
785.503 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
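
Since the log level made no difference, the thread dumps quoted below suggest another place to look: the "tablet migration" worker spends almost all of its elapsed time on CPU inside Sha2Crypt.sha512Crypt, reached through SystemCredentials.get on every ServerInfo.getProperties call. The following is only a minimal sketch of that cost pattern and of memoizing the result. It is not Accumulo's code and not the real crypt(3) algorithm. The class CachedHashSketch, its methods, and the input string are hypothetical, and the 5000 rounds are an assumption taken from the documented default of SHA-512 crypt.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class CachedHashSketch {

  // Counts how many times the expensive hash was actually computed.
  public static final AtomicInteger COMPUTATIONS = new AtomicInteger();

  // Stand-in for an expensive password-style hash: SHA-512 applied repeatedly.
  // This only approximates the cost profile of a crypt-style hash.
  public static byte[] iteratedSha512(String input, int rounds) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = input.getBytes(StandardCharsets.UTF_8);
      for (int i = 0; i < rounds; i++) {
        state = md.digest(state); // digest() also resets md for the next round
      }
      COMPUTATIONS.incrementAndGet();
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available in this JVM", e);
    }
  }

  // Wraps a supplier so the delegate runs at most once (double-checked locking).
  public static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T v = value;
        if (v == null) {
          synchronized (this) {
            v = value;
            if (v == null) {
              v = delegate.get();
              value = v;
            }
          }
        }
        return v;
      }
    };
  }

  public static void main(String[] args) {
    // Hypothetical input; 5000 rounds is an assumed value for illustration.
    Supplier<byte[]> uncached = () -> iteratedSha512("instance-config", 5000);
    Supplier<byte[]> cached = memoize(uncached);

    long t0 = System.nanoTime();
    for (int i = 0; i < 20; i++) {
      uncached.get(); // recomputes the hash on every call
    }
    long t1 = System.nanoTime();
    for (int i = 0; i < 20; i++) {
      cached.get(); // computes once, then returns the stored value
    }
    long t2 = System.nanoTime();

    System.out.printf("uncached: %d ms, cached: %d ms, computations: %d%n",
        (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, COMPUTATIONS.get());
  }
}
```

Running it on both an x86_64 and an ARM64 machine would give a rough idea of whether plain SHA-512 throughput differs between the two JVMs. Whether caching is acceptable for the real system token is a separate question for the Accumulo developers.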


On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:

> Hi again,
>
> Here are the thread dumps as promised:
>
> 1) Both TabletServers are very busy with SHA-512 compression at tablet
> close time. The following stacks were dumped at ~5 sec intervals:
>
> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms
> elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable
>  [0x0000fffe8f3fd000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11
> /SHA5.java:232)
>         at sun.security.provider.SHA5.implCompress(java.base@11.0.11
> /SHA5.java:221)
>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:124)
>         at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
>         at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>         at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>         at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>         at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>         at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>         at
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>         at
> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>         at
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>         at
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>         at
> org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
>         at
> org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
>         at
> org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
>         at
> org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
>         - locked <0x00000000f1585830> (a
> org.apache.accumulo.tserver.tablet.Tablet)
>         at
> org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
>         at
> org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>         at
> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> Source)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> /ThreadPoolExecutor.java:1128)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> /ThreadPoolExecutor.java:628)
>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>         at
> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> Source)
>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms
> elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable
>  [0x0000fffe8f3fd000]
>    java.lang.Thread.State: RUNNABLE
>         at
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> /DigestBase.java:149)
>         at
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> /DigestBase.java:144)
>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:131)
>         at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
>         at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>         at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>         at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>         at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>         at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>         ...
>
> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms
> elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
>  [0x0000fffe8f3fd000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
> /ByteArrayAccess.java:449)
>         at sun.security.provider.SHA5.implDigest(java.base@11.0.11
> /SHA5.java:131)
>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> /DigestBase.java:210)
>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> /DigestBase.java:189)
>         at
> java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
> /MessageDigest.java:639)
>         at java.security.MessageDigest.digest(java.base@11.0.11
> /MessageDigest.java:385)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>         at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>         at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>         at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>         at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>         ...
>
> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=86499.01ms
> elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
>  [0x0000fffe8f3fd000]
>    java.lang.Thread.State: RUNNABLE
>         at
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> /DigestBase.java:149)
>         at
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> /DigestBase.java:144)
>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:131)
>         at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
>         at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>         at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>         at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>         at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>         at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>         ...
>
> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0 cpu=109551.37ms
> elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
>  [0x0000fffe7bffd000]
>    java.lang.Thread.State: RUNNABLE
>         at
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> /DigestBase.java:149)
>         at
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> /DigestBase.java:144)
>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:131)
>         at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
>         at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
>         at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>         at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>         at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>         at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>         at
> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>         at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>         at
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>
> Notice that ServerInfo.getProperties, reached from
> ClientContext.getProperties(ClientContext.java:236), usually calls
> ServerInfo.getPrincipal(ServerInfo.java:148), but in the last dump it calls
> ServerInfo.getAuthenticationToken(ServerInfo.java:153).
> Both lead to (a lot of?!) SHA-512 compressing.
>
> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
> level should not be DEBUG?!
>
> Most of its threads either wait for notifications from ZooKeeper:
>
> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
> cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
> Object.wait()  [0x0000fffebb7fc000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>         - waiting on <no object reference available>
>         at
> org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
>         - waiting to re-lock in wait() <0x00000000f1427458> (a
> org.apache.accumulo.fate.ZooStore)
>         at
> org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
>         at
> org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
>         at org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
>         at
> org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
>         at
> org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
> ...
>
> or wait for data:
> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms
> elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
>  [0x0000fffebadfd000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>         - waiting on <no object reference available>
>         at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
>         at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>         - waiting to re-lock in wait() <0x00000000f9bf42d8> (a
> org.apache.zookeeper.ClientCnxn$Packet)
>         at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>         at
> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
>         at
> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
>         at
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
> Source)
>         at
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> Source)
>         at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>         at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>         at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>         at
> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
>         at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
>         at
> org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
>         at
> org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
>         at
> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
>         at
> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>         at
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> Source)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> /ThreadPoolExecutor.java:1128)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> /ThreadPoolExecutor.java:628)
>  878805   at
> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>  878806   at
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> Source)
>  878807   at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
> 908220 "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
> elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
>  [0x0000ffff20f50000]
>  908221    java.lang.Thread.State: WAITING (on object monitor)
>  908222   at java.lang.Object.wait(java.base@11.0.11/Native Method)
>  908223   - waiting on <no object reference available>
>  908224   at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
>  908225   at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>  908226   - waiting to re-lock in wait() <0x00000000fa781138> (a
> org.apache.zookeeper.ClientCnxn$Packet)
>  908227   at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>  908228   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
>  908229   at
> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
>  908230   at
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
> Source)
>  908231   at
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> Source)
>  908232   at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>  908233   at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>  908234   at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>  908235   at
> org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
>  908236   at
> org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
>  908237   at
> org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
>  908238   at
> org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
>  908239   at
> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>  908240   at
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> Source)
>  908241   at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
> 3) SimpleGarbageCollector is also busy getting credentials:
>
>  "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
> tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
> 2503    java.lang.Thread.State: RUNNABLE
> 2504   at
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> /DigestBase.java:149)
> 2505   at
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> /DigestBase.java:144)
> 2506   at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:131)
> 2507   at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
> 2508   at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
> 2509   at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> 2510   at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> 2511   at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> 2512   at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> 2513   at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> 2514   at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> 2515   at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> 2516   at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> 2517   at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> 2518   at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> 2519   at
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> 2520   at
> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> 2521   at
> org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
> 2522   at
> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
> 2523   at
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
> 2524   at
> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
> 2525   at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> 2526   at
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
> Source)
> 2527   at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
>
> 3151 "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s
> tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
> 3152    java.lang.Thread.State: RUNNABLE
> 3153   at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
> 3154   at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
> 3155   at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11
> /Provider.java:1107)
> 3156   at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11
> /ConcurrentHashMap.java:936)
> 3157   at java.security.Provider.getService(java.base@11.0.11
> /Provider.java:1282)
> 3158   at sun.security.jca.ProviderList.getService(java.base@11.0.11
> /ProviderList.java:380)
> 3159   at sun.security.jca.GetInstance.getInstance(java.base@11.0.11
> /GetInstance.java:157)
> 3160   at java.security.Security.getImpl(java.base@11.0.11
> /Security.java:700)
> 3161   at java.security.MessageDigest.getInstance(java.base@11.0.11
> /MessageDigest.java:178)
> 3162   at
> org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
> 3163   at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
> 3164   at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> 3165   at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> 3166   at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> 3167   at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> 3168   at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> 3169   at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> 3170   at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> 3171   at
> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> 3172   at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> 3173   at
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> 3174   at
> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> 3175   at
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> 3176   at
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> 3177   at
> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
> 3178   at
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
> 3179   at
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
> 3180   at
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
> 3181   at
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
> 3182   at
> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
> 3183   at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> 3184   at
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
> Source)
> 3185   at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
>
> 4) Nothing interesting in the Initialize, Main and ZooKeeperServerMain
> processes.
>
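The thread dumps quoted above can be summarized mechanically. The following is only a minimal sketch, assuming the dump lines keep the `java.lang.Thread.State: <STATE>` form shown in the excerpts; the class name `ThreadStateTally` and the sample lines are illustrative and not part of Accumulo or the JDK tooling.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadStateTally {
    // Matches the state name that follows "java.lang.Thread.State:" in a dump line.
    private static final Pattern STATE =
        Pattern.compile("java\\.lang\\.Thread\\.State: (\\w+)");

    // Counts how many threads were seen in each state; lines without a state are ignored.
    static Map<String, Integer> tally(List<String> dumpLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : dumpLines) {
            Matcher m = STATE.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample lines copied from the excerpts above (leading numbers are editor line numbers).
        List<String> sample = List.of(
            " 878648    java.lang.Thread.State: TIMED_WAITING (on object monitor)",
            " 878782    java.lang.Thread.State: WAITING (on object monitor)",
            " 908221    java.lang.Thread.State: WAITING (on object monitor)",
            "2503    java.lang.Thread.State: RUNNABLE");
        System.out.println(tally(sample)); // {RUNNABLE=1, TIMED_WAITING=1, WAITING=2}
    }
}
```

Run over a whole dump, a tally like this makes it easier to see whether a process is mostly waiting (as the Manager threads are) or mostly RUNNABLE (as the "gc" thread is).
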
>
> I'm not saying that the above are problematic. You know how Accumulo
> works. It is up to you to decide whether something should be improved.
>
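For readers wondering what "compute once and reuse" would look like if the repeated hashing in 1) and 3) did turn out to matter: the following is a generic memoization sketch, not something proposed in this thread and not Accumulo's code. `expensiveHash` is a hypothetical stand-in for a costly derivation such as the sha512Crypt call in the traces, and `memoize` is an illustrative helper. It is only valid under the assumption that the hashed inputs do not change during the lifetime of the process.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class MemoizedTokenSketch {
    // Counts how often the expensive computation really runs.
    static final AtomicInteger hashCalls = new AtomicInteger();

    // Hypothetical stand-in for an expensive derivation: many rounds of SHA-512.
    // This is NOT the algorithm used by SystemCredentials.
    static byte[] expensiveHash(String input) {
        hashCalls.incrementAndGet();
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            byte[] out = input.getBytes(StandardCharsets.UTF_8);
            for (int i = 0; i < 5000; i++) {
                out = md.digest(out); // digest() also resets md for the next round
            }
            return out;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Wraps a supplier so that the delegate runs at most once (double-checked locking).
    static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private volatile T value;

            @Override
            public T get() {
                T v = value;
                if (v == null) {
                    synchronized (this) {
                        v = value;
                        if (v == null) {
                            v = delegate.get();
                            value = v;
                        }
                    }
                }
                return v;
            }
        };
    }

    public static void main(String[] args) {
        Supplier<byte[]> token = memoize(() -> expensiveHash("instance-config"));
        for (int i = 0; i < 1000; i++) {
            token.get(); // only the first call pays for the hashing
        }
        System.out.println("hash computed " + hashCalls.get() + " time(s)");
    }
}
```

Whether caching is acceptable for a credential derived from configuration is a design decision for the maintainers; the sketch only shows the mechanics.
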
> Regards,
> Mark
>
>
> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:
>
>>
>>
>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
>>
>>> It looks like the tests are timing out. This happens frequently when
>>> running on resource-constrained systems. You can give the test more
>>> time by increasing the timeout factor: `mvn clean verify
>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
>>> -Dtimeout.factor=3`
>>>
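As an illustration of what a timeout factor does, here is a minimal sketch. It assumes the `-Dtimeout.factor` value reaches the test JVM as a system property and is simply multiplied into a base timeout; the class and method names are hypothetical and this is not the actual Accumulo test harness code. The 420-second base is the value reported in the failure output of this thread.

```java
import java.time.Duration;

public class TimeoutFactorSketch {
    // Reads the (assumed) system property; falls back to 1 when missing or invalid.
    static int timeoutFactor() {
        String raw = System.getProperty("timeout.factor", "1");
        try {
            return Math.max(Integer.parseInt(raw.trim()), 1);
        } catch (NumberFormatException e) {
            return 1;
        }
    }

    // Scales a base timeout by the given factor.
    static Duration scaledTimeout(Duration base, int factor) {
        return base.multipliedBy(factor);
    }

    public static void main(String[] args) {
        Duration base = Duration.ofSeconds(420);
        // With -Dtimeout.factor=3 this prints "1260 s"; without the property, "420 s".
        System.out.println(scaledTimeout(base, timeoutFactor()).getSeconds() + " s");
    }
}
```
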
>>> There's nothing we know of that would change the way our tests work
>>> due to ARM64, but you may have issues because of limited RAM, slow CPU
>>> speeds, slow disk I/O, busy background processes, or other
>>> resource-related issues. I don't think most of the currently active
>>> developers use ARM64, or have access to a test machine to reproduce or
>>>
>>
>> In case anyone wants to test on Linux ARM64, Oracle Cloud can be used
>> for free.
>>
>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
>> explains how to create a VM and how to use this VM as a GitHub Actions
>> runner.
>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
>> mentions this article.
>>
>>
>>> experiment with Accumulo there, so you may have to do some of your own
>>> troubleshooting. If you can rule out resource-constraint issues, and
>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
>>> flaky and sometimes times out on x86_64 as well), you could create a
>>> bug ticket with more details at
>>> https://github.com/apache/accumulo/issues ; there is an issue template
>>> specifically for broken and/or flaky tests that you can select when
>>> creating a new ticket.
>>>
>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
>>> >
>>> > Hi dev1,
>>> >
>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>>> >
>>> > > Some of those tests are trying to stress conditions that require a
>>> lot of
>>> > > resources to replicate specific conditions. Have you tried to run
>>> those
>>> > > individual tests in isolation so that you are not competing for
>>> resources?
>>> > > Do they always fail, or are the failures transient?
>>> > >
>>> >
>>> > Q: Have you tried to run those individual tests in isolation so that
>>> you
>>> > are not competing for resources?
>>> > A: This is what I meant by the following:
>>> > ---------------------
>>> > The tests fail even when executed separately, e.g.:
>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>>> > ---------------------
>>> >
>>> > Q: Do they always fail, or are the failures transient?
>>> > A: I also tried to explain that with "These tests fail consistently at
>>> > every build attempt!"
>>> >
>>> > Mark
>>> >
>>> > >
>>> > > -----Original Message-----
>>> > > From: Mark Jens <ma...@gmail.com>
>>> > > Sent: Tuesday, November 30, 2021 4:05 AM
>>> > > To: dev@accumulo.apache.org
>>> > > Subject: Consistent IT tests failures on Linux ARM64
>>> > >
>>> > > Hello Accumulo community,
>>> > >
>>> > > At my job we consider using Linux ARM64 servers and I've been tasked
>>> to
>>> > > test Accumulo.
>>> > >
>>> > > I face some timeout related issues with several IT tests:
>>> > >
>>> > >
>>> > > [ERROR]
>>> > >
>>> > >
>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>>> > >  Time elapsed: 420.122 s  <<< ERROR!
>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>>> 420
>>> > > seconds at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native
>>> Method)
>>> > > at java.base@11.0.11
>>> > > /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>>> > > at java.base@11.0.11
>>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
>>> > > at java.base@11.0.11
>>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
>>> > > at java.base@11.0.11
>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>> > > Method)
>>> > > at java.base@11.0.11
>>> > >
>>> > >
>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> > > at java.base@11.0.11
>>> > >
>>> > >
>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> > > at java.base@11.0.11
>>> /java.lang.reflect.Method.invoke(Method.java:566)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>>> > > at java.base@11.0.11
>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>>> > >
>>> > > [ERROR]
>>> > >
>>> > >
>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>>> > >  Time elapsed: 420.122 s  <<< ERROR!
>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>>> > > test-SendThread(localhost:44251)
>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>>> > > java.base@11.0.11
>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>>> > > at java.base@11.0.11
>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>>> > > at java.base@11.0.11/sun.nio.ch
>>> .SelectorImpl.select(SelectorImpl.java:136)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>>> > > at
>>> > >
>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>>> > >
>>> > > [ERROR]
>>> > >
>>> > >
>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
>>> > >  Time elapsed: 420.011 s  <<< ERROR!
>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>>> 420
>>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
>>> at
>>> > >
>>> > >
>>> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
>>> > > at
>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
>>> > > at
>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
>>> > > at java.base@11.0.11
>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>> > > Method)
>>> > > at java.base@11.0.11
>>> > >
>>> > >
>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> > > at java.base@11.0.11
>>> > >
>>> > >
>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> > > at java.base@11.0.11
>>> /java.lang.reflect.Method.invoke(Method.java:566)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>>> > > at java.base@11.0.11
>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>>> > >
>>> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
>>> > > [INFO] Running
>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 23.046 s - in
>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>>> > > [INFO] Running
>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 255.108 s - in
>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
>>> > > [INFO] Running
>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 59.289 s - in
>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
>>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
>>> > > [INFO] Running
>>> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 219.253 s - in
>>> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
>>> > > [INFO] Running
>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 489.863 s - in
>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
>>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
>>> > > [INFO] Running
>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 71.934 s - in
>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
>>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
>>> elapsed:
>>> > > 307.904 s <<< FAILURE! - in
>>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
>>> > > [ERROR]
>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>>> > >  Time elapsed: 240.011 s  <<< ERROR!
>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>>> 240
>>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native Method) at
>>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
>>> > > at java.base@11.0.11
>>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
>>> > > at java.base@11.0.11
>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>> > > Method)
>>> > > at java.base@11.0.11
>>> > >
>>> > >
>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> > > at java.base@11.0.11
>>> > >
>>> > >
>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> > > at java.base@11.0.11
>>> /java.lang.reflect.Method.invoke(Method.java:566)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>>> > > at
>>> > >
>>> > >
>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>>> > > at java.base@11.0.11
>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>>> > >
>>> > > [ERROR]
>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>>> > >  Time elapsed: 240.012 s  <<< ERROR!
>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>>> > > test-SendThread(localhost:39285)
>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>>> > > java.base@11.0.11
>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>>> > > at java.base@11.0.11
>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>>> > > at java.base@11.0.11/sun.nio.ch
>>> .SelectorImpl.select(SelectorImpl.java:136)
>>> > > at
>>> > >
>>> > >
>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>>> > > at
>>> > >
>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>>> > >
>>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
>>> > > [INFO] Running
>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
>>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
>>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
>>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time
>>> elapsed:
>>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
>>> > > [INFO] Running
>>> > >
>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time
>>> elapsed:
>>> > > 0.039 s - in
>>> > >
>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>>> > > [INFO]
>>> > > [INFO] Results:
>>> > > [INFO]
>>> > > [ERROR] Errors:
>>> > > [ERROR]
>>> > >
>>> > >
>>> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
>>> > > [ERROR]   Run 1:
>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 »
>>> > > TestTimedOut
>>> > > [ERROR]   Run 2:
>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »
>>> Appears
>>> > > to ...
>>> > > [INFO]
>>> > > [ERROR]   ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
>>> > > TestTimedOut test t...
>>> > > [ERROR]
>>> > >
>>> > >
>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>>> > > [ERROR]   Run 1:
>>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
>>> TestTimedOut
>>> > > tes...
>>> > > [ERROR]   Run 2:
>>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
>>> > >  Appears to be stuck...
>>> > > [INFO]
>>> > > [ERROR]
>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>>> > > [ERROR]   Run 1:
>>> > >
>>> > >
>>> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
>>> > > » TestTimedOut
>>> > > [ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be
>>> stuck in
>>> > > thread Time-limited te...
>>> > > [INFO]
>>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
>>> > > [ERROR]   Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2
>>> »
>>> > > TestTimedOut test timed ...
>>> > > [ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in thread
>>> > > Time-limited test-SendThread(...
>>> > >
>>> > > These tests fail consistently at every build attempt!
>>> > >
>>> > > The tests fail even when executed separately, e.g.:
>>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>>> > >
>>> > >
>>> > > I am using the current 'main' branch of Accumulo.
>>> > > JDK 11.0.11
>>> > > Maven: 3.8.2
>>> > > OS: Ubuntu 20.04.3 ARM64
>>> > >
>>> > > Is there anything that could be done to fix these problems ?
>>> > > For example some config settings ?!
>>> > >
>>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that
>>> Linux
>>> > > ARM64 is a supported platform since the JVM supports it.
>>> > >
>>> > > Thanks!
>>> > >
>>> > > Mark
>>> > >
>>>
>>

Re: Consistent IT tests failures on Linux ARM64

Posted by Mark Jens <ma...@gmail.com>.
Hi again,

Here are the thread dumps as promised:

1) Both TabletServers are very busy in the SHA-512 compress functions when
closing tablets. The following stacks were dumped at ~5 second intervals:

"tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms
elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable
 [0x0000fffe8f3fd000]
   java.lang.Thread.State: RUNNABLE
        at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11
/SHA5.java:232)
        at sun.security.provider.SHA5.implCompress(java.base@11.0.11
/SHA5.java:221)
        at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
/DigestBase.java:124)
        at
java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
/MessageDigest.java:623)
        at java.security.MessageDigest.update(java.base@11.0.11
/MessageDigest.java:345)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
        at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
        at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
        at
org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
        at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
        at
org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
        at
org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
        at
org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
        at
org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
        at
org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
        at
org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
        at
org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
        at
org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
        - locked <0x00000000f1585830> (a
org.apache.accumulo.tserver.tablet.Tablet)
        at org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
        at
org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
        at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
        at
io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
Source)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
/ThreadPoolExecutor.java:1128)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
/ThreadPoolExecutor.java:628)
        at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
        at
io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
Source)
        at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)

"tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms
elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable
 [0x0000fffe8f3fd000]
   java.lang.Thread.State: RUNNABLE
        at
sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
/DigestBase.java:149)
        at
sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
/DigestBase.java:144)
        at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
/DigestBase.java:131)
        at
java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
/MessageDigest.java:623)
        at java.security.MessageDigest.update(java.base@11.0.11
/MessageDigest.java:345)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
        at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
        at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
        at
org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
        at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
        ...

"tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms
elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
 [0x0000fffe8f3fd000]
   java.lang.Thread.State: RUNNABLE
        at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
/ByteArrayAccess.java:449)
        at sun.security.provider.SHA5.implDigest(java.base@11.0.11
/SHA5.java:131)
        at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
/DigestBase.java:210)
        at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
/DigestBase.java:189)
        at
java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
/MessageDigest.java:639)
        at java.security.MessageDigest.digest(java.base@11.0.11
/MessageDigest.java:385)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
        at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
        at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
        at
org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
        at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
        ...

"tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=86499.01ms
elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
 [0x0000fffe8f3fd000]
   java.lang.Thread.State: RUNNABLE
        at
sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
/DigestBase.java:149)
        at
sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
/DigestBase.java:144)
        at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
/DigestBase.java:131)
        at
java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
/MessageDigest.java:623)
        at java.security.MessageDigest.update(java.base@11.0.11
/MessageDigest.java:345)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
        at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
        at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
        at
org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
        at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
        ...

"tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0 cpu=109551.37ms
elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
 [0x0000fffe7bffd000]
   java.lang.Thread.State: RUNNABLE
        at
sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
/DigestBase.java:149)
        at
sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
/DigestBase.java:144)
        at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
/DigestBase.java:131)
        at
java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
/MessageDigest.java:623)
        at java.security.MessageDigest.update(java.base@11.0.11
/MessageDigest.java:345)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
        at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
        at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
        at
org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
        at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
        at
org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)

Notice that ClientContext.getProperties(ClientContext.java:236) calls
ServerInfo.getProperties, which in most of the dumps is inside
ServerInfo.getPrincipal(ServerInfo.java:148), but in the last one is inside
ServerInfo.getAuthenticationToken(ServerInfo.java:153).
Both paths lead to (a lot of?!) SHA-512 compressing.
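
Every stack above reaches SystemCredentials.get through Crypt.crypt, so the
token appears to be recomputed on each call. Below is a minimal sketch of
caching such a value, assuming it depends only on an instance ID. The class
and method names are hypothetical, and the iterated SHA-512 loop is only a
stand-in for sha512Crypt. This is not Accumulo's code and not a proposed patch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not Accumulo code): hashes once per instance ID. */
final class CachedTokenSketch {
  // Counts how often the expensive hash really runs (demonstration only).
  static final AtomicInteger HASH_RUNS = new AtomicInteger();
  private static final ConcurrentMap<String, byte[]> CACHE = new ConcurrentHashMap<>();

  // Stand-in for the costly sha512Crypt call: many chained SHA-512 rounds.
  static byte[] expensiveHash(String instanceId) {
    HASH_RUNS.incrementAndGet();
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = instanceId.getBytes(StandardCharsets.UTF_8);
      for (int round = 0; round < 5000; round++) {
        state = md.digest(state); // digest() also resets md for the next round
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available in this JVM", e);
    }
  }

  // Returns the cached token; the hash runs only on the first request per ID.
  static byte[] tokenFor(String instanceId) {
    return CACHE.computeIfAbsent(instanceId, CachedTokenSketch::expensiveHash);
  }

  public static void main(String[] args) {
    for (int i = 0; i < 1000; i++) {
      tokenFor("hypothetical-instance-id");
    }
    System.out.println("hash runs: " + HASH_RUNS.get());
  }
}
```

The sketch hands out the same array to every caller, so real code would
probably return a copy or an immutable wrapper. Whether caching is acceptable
here also depends on when the hashed configuration can change, which I cannot
judge.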

2) The "Manager" process writes ~200 MB of logs. Maybe the default log level
should not be DEBUG?!

Most of its threads either wait for notifications from ZooKeeper:

"Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
Object.wait()  [0x0000fffebb7fc000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(java.base@11.0.11/Native Method)
        - waiting on <no object reference available>
        at
org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
        - waiting to re-lock in wait() <0x00000000f1427458> (a
org.apache.accumulo.fate.ZooStore)
        at
org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
        at
org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
        at org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
        at
org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
        at
org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
...

or wait for data:
"Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms
elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
 [0x0000fffebadfd000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@11.0.11/Native Method)
        - waiting on <no object reference available>
        at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
        at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
        - waiting to re-lock in wait() <0x00000000f9bf42d8> (a
org.apache.zookeeper.ClientCnxn$Packet)
        at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
        at
org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
Source)
        at
org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
Source)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
        at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
        at
org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
        at
org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
        at
org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
        at
io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
        at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
Source)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
/ThreadPoolExecutor.java:1128)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
/ThreadPoolExecutor.java:628)
        at
io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
        at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
Source)
        at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)

"Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
 [0x0000ffff20f50000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@11.0.11/Native Method)
        - waiting on <no object reference available>
        at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
        at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
        - waiting to re-lock in wait() <0x00000000fa781138> (a
org.apache.zookeeper.ClientCnxn$Packet)
        at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
        at
org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
Source)
        at
org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
Source)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
        at
org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
        at
org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
        at
org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
        at
org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
        at
io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
        at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
Source)
        at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)

3) SimpleGarbageCollector is also busy getting credentials:

 "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
   java.lang.Thread.State: RUNNABLE
        at
sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
/DigestBase.java:149)
        at
sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
/DigestBase.java:144)
        at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
/DigestBase.java:131)
        at
java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
/MessageDigest.java:623)
        at java.security.MessageDigest.update(java.base@11.0.11
/MessageDigest.java:345)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
        at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
        at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
        at
org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
        at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
        at
org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
        at
org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
        at
org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
        at
org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
        at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
        at
org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
        at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
        at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
Source)
        at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)


"gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s
tid=0x0000ffff28295800 nid=0x32dac5 runnable  [0x0000ffff3a5fb000]
   java.lang.Thread.State: RUNNABLE
        at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
        at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
        at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11
/Provider.java:1107)
        at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11
/ConcurrentHashMap.java:936)
        at java.security.Provider.getService(java.base@11.0.11
/Provider.java:1282)
        at sun.security.jca.ProviderList.getService(java.base@11.0.11
/ProviderList.java:380)
        at sun.security.jca.GetInstance.getInstance(java.base@11.0.11
/GetInstance.java:157)
        at java.security.Security.getImpl(java.base@11.0.11
/Security.java:700)
        at java.security.MessageDigest.getInstance(java.base@11.0.11
/MessageDigest.java:178)
        at
org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
        at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
        at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
        at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
        at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
        at
org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
        at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
        at
org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
        at
org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
        at
org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
        at
org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
        at
org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
        at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
        at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
        at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
        at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
        at
org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
        at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
        at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
Source)
        at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
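
To check whether plain SHA-512 is unusually slow on a given machine, a rough
timing loop like the one below could be run on both the ARM64 and an x86_64
host and the numbers compared. This is only a sketch: the class name and the
round counts are arbitrary choices of mine, it uses the JDK's MessageDigest
directly rather than commons-codec's Sha2Crypt, and a single timing like this
is not a rigorous benchmark.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical timing sketch (not part of Accumulo): chained SHA-512 digests. */
final class Sha512TimingSketch {
  // Runs `rounds` chained SHA-512 digests and returns the final state.
  static byte[] chainedDigest(int rounds) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = new byte[64]; // start from 64 zero bytes
      for (int i = 0; i < rounds; i++) {
        state = md.digest(state); // digest() resets md after each round
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available in this JVM", e);
    }
  }

  public static void main(String[] args) {
    chainedDigest(200_000); // warm-up so the JIT compiles the hot loop
    long start = System.nanoTime();
    chainedDigest(1_000_000);
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    System.out.println("1,000,000 SHA-512 digests took " + elapsedMs + " ms");
  }
}
```

If the two machines differ by a large factor here, the timeouts may simply
follow from the hashing cost. If not, the time is probably lost elsewhere.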


4) Nothing interesting in the Initialize, Main and ZooKeeperServerMain
processes.


I'm not saying that the above are problematic. You know how Accumulo works.
It is up to you to decide whether something should be improved.

Regards,
Mark


On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:

>
>
> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
>
>> It looks like the tests are timing out. This happens frequently when
>> running on resource-constrained systems. You can give the test more
>> time by increasing the timeout factor: `mvn clean verify
>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
>> -Dtimeout.factor=3`
>>
>> There's nothing we know of that would change the way our tests work
>> due to ARM64, but you may have issues because of limited RAM, slow CPU
>> speeds, slow disk I/O, busy background processes, or other
>> resource-related issues. I don't think most of the currently active
>> developers use ARM64, or have access to a test machine to reproduce or
>>
>
> In case anyone wants to test on Linux ARM64 you could easily use Oracle
> Cloud for free.
>
> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
> explains how to create a VM and how to use this VM as a Github Actions
> runner.
> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
> mentions this article.
>
>
>> experiment with Accumulo there, so you may have to do some of your own
>> troubleshooting. If you can rule out resource-constraint issues, and
>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
>> flaky and sometimes times out on x86_64 as well), you could create a
>> bug ticket with more details at
>> https://github.com/apache/accumulo/issues ; there is an issue template
>> specifically for broken and/or flaky tests that you can select when
>> creating a new ticket.
>>
>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
>> >
>> > Hi dev1,
>> >
>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>> >
>> > > Some of those tests are trying to stress conditions that require a
>> lot of
>> > > resources to replicate specific conditions. Have you tried to run
>> those
>> > > individual tests in isolation so that you are not competing for
>> resources?
>> > > Do they always fail, or are the failures transient?
>> > >
>> >
>> > Q: Have you tried to run those individual tests in isolation so that you
>> > are not competing for resources?
>> > A: This is what I mean with the following:
>> > ---------------------
>> > The tests fail even when executed separately, e.g.:
>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>> > ---------------------
>> >
>> > Q: Do they always fail, or are the failures transient?
>> > A: I also tried to explain that with "These tests fail consistently at
>> > every build attempt!"
>> >
>> > Mark
>> >
>> > >
>> > > -----Original Message-----
>> > > From: Mark Jens <ma...@gmail.com>
>> > > Sent: Tuesday, November 30, 2021 4:05 AM
>> > > To: dev@accumulo.apache.org
>> > > Subject: Consistent IT tests failures on Linux ARM64
>> > >
>> > > Hello Accumulo community,
>> > >
>> > > At my job we consider using Linux ARM64 servers and I've been tasked
>> to
>> > > test Accumulo.
>> > >
>> > > I face some timeout related issues with several IT tests:
>> > >
>> > >
>> > > [ERROR]
>> > >
>> > >
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > >  Time elapsed: 420.122 s  <<< ERROR!
>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>> 420
>> > > seconds at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native
>> Method)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
>> > > at java.base@11.0.11
>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > > Method)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >
>> > > [ERROR]
>> > >
>> > >
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > >  Time elapsed: 420.122 s  <<< ERROR!
>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>> > > test-SendThread(localhost:44251)
>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>> > > java.base@11.0.11
>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>> > > at java.base@11.0.11
>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>> > > at java.base@11.0.11/sun.nio.ch
>> .SelectorImpl.select(SelectorImpl.java:136)
>> > > at
>> > >
>> > >
>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>> > > at
>> > >
>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >
>> > > [ERROR]
>> > >
>> > >
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
>> > >  Time elapsed: 420.011 s  <<< ERROR!
>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>> 420
>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native Method) at
>> > >
>> > >
>> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
>> > > at
>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
>> > > at
>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
>> > > at java.base@11.0.11
>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > > Method)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >
>> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
>> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
>> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
>> > > [INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.CreateManyScannersIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 255.108 s - in
>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
>> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 59.289 s - in
>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
>> elapsed:
>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
>> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
>> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
>> > > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
>> > > [INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
>> > > [INFO] Running
>> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 219.253 s - in
>> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
>> > > [INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
>> elapsed:
>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
>> > > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
>> > > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
>> > > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 71.934 s - in
>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>> > > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
>> elapsed:
>> > > 307.904 s <<< FAILURE! - in
>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
>> > > [ERROR]
>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > >  Time elapsed: 240.011 s  <<< ERROR!
>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>> 240
>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native Method) at
>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
>> > > at java.base@11.0.11
>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
>> > > at java.base@11.0.11
>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > > Method)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >
>> > > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > >  Time elapsed: 240.012 s  <<< ERROR!
>> > > java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:39285)
>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
>> > > at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>> > > at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>> > > at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
>> > > at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>> > > at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >
>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
>> > > [INFO] Running
>> > > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time
>> elapsed:
>> > > 0.039 s - in
>> > > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>> > > [INFO]
>> > > [INFO] Results:
>> > > [INFO]
>> > > [ERROR] Errors:
>> > > [ERROR] org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
>> > > [ERROR]   Run 1: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 » TestTimedOut
>> > > [ERROR]   Run 2: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction » Appears to ...
>> > > [INFO]
>> > > [ERROR]   ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 » TestTimedOut test t...
>> > > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > > [ERROR]   Run 1: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut tes...
>> > > [ERROR]   Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »  Appears to be stuck...
>> > > [INFO]
>> > > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > > [ERROR]   Run 1: HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2 » TestTimedOut
>> > > [ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be stuck in thread Time-limited te...
>> > > [INFO]
>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
>> > > [ERROR]   Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 » TestTimedOut test timed ...
>> > > [ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in thread Time-limited test-SendThread(...
>> > >
>> > > These tests fail consistently at every build attempt!
>> > >
>> > > The tests fail even when executed separately, e.g.:
>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>> > >
>> > >
>> > > I am using the current 'main' branch of Accumulo.
>> > > JDK 11.0.11
>> > > Maven: 3.8.2
>> > > OS: Ubuntu 20.04.3 ARM64
>> > >
>> > > Is there anything that could be done to fix these problems ?
>> > > For example some config settings ?!
>> > >
>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that
>> Linux
>> > > ARM64 is a supported platform since the JVM supports it.
>> > >
>> > > Thanks!
>> > >
>> > > Mark
>> > >
>>
>

Re: Consistent IT tests failures on Linux ARM64

Posted by Mark Jens <ma...@gmail.com>.
On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:

> It looks like the tests are timing out. This happens frequently when
> running on resource-constrained systems. You can give the test more
> time by increasing the timeout factor: `mvn clean verify
> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
> -Dtimeout.factor=3`
>
> There's nothing we know of that would change the way our tests work
> due to ARM64, but you may have issues because of limited RAM, slow CPU
> speeds, slow disk I/O, busy background processes, or other
> resource-related issues. I don't think most of the currently active
> developers use ARM64, or have access to a test machine to reproduce or
>

If anyone wants to test on Linux ARM64, Oracle Cloud can be used for free.
https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
explains how to create a VM and how to use it as a GitHub Actions
runner.
https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
mentions this article.


> experiment with Accumulo there, so you may have to do some of your own
> troubleshooting. If you can rule out resource-constraint issues, and
> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
> flaky and sometimes times out on x86_64 as well), you could create a
> bug ticket with more details at
> https://github.com/apache/accumulo/issues ; there is an issue template
> specifically for broken and/or flaky tests that you can select when
> creating a new ticket.
>
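
To illustrate the `-Dtimeout.factor=3` suggestion quoted above, here is a minimal Java sketch of how a test harness could scale a base timeout by a system property. This is not Accumulo's actual implementation: the class and method names are hypothetical, and it assumes the factor is a positive integer that multiplies the base timeout. The 420-second value is the timeout reported in the log; treating it as the unscaled base is also an assumption.

```java
import java.time.Duration;

// Hypothetical sketch: scale a test timeout by the "timeout.factor" system property.
public class TimeoutFactorSketch {

    // Parse the factor; fall back to 1 when it is missing, malformed, or below 1.
    public static int parseFactor(String raw) {
        if (raw == null) {
            return 1;
        }
        try {
            int factor = Integer.parseInt(raw.trim());
            return factor >= 1 ? factor : 1;
        } catch (NumberFormatException e) {
            return 1;
        }
    }

    // Multiply the base timeout by the factor.
    public static Duration scale(Duration base, int factor) {
        return base.multipliedBy(factor);
    }

    public static void main(String[] args) {
        // Read the property as it would be passed with -Dtimeout.factor=3.
        int factor = parseFactor(System.getProperty("timeout.factor"));
        Duration base = Duration.ofSeconds(420); // assumed base, taken from the log
        System.out.println("effective timeout: " + scale(base, factor).getSeconds() + " s");
    }
}
```

Under these assumptions, running with `-Dtimeout.factor=3` would give the test 1260 seconds instead of 420 before it is reported as timed out.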
> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
> >
> > Hi dev1,
> >
> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
> >
> > > > Some of those tests are trying to stress conditions that require a lot of
> > > > resources to replicate specific conditions. Have you tried to run those
> > > > individual tests in isolation so that you are not competing for resources?
> > > > Do they always fail, or are the failures transient?
> > >
> >
> > Q: Have you tried to run those individual tests in isolation so that you
> > are not competing for resources?
> > A: This is what I mean with the following:
> > ---------------------
> > The tests fail even when executed separately, e.g.:
> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> > ---------------------
> >
> > Q: Do they always fail, or are the failures transient?
> > A: I also tried to explain that with "These tests fail consistently at
> > every build attempt!"
> >
> > Mark
> >
> > >
> > > -----Original Message-----
> > > From: Mark Jens <ma...@gmail.com>
> > > Sent: Tuesday, November 30, 2021 4:05 AM
> > > To: dev@accumulo.apache.org
> > > Subject: Consistent IT tests failures on Linux ARM64
> > >
> > > Hello Accumulo community,
> > >
> > > At my job we consider using Linux ARM64 servers and I've been tasked to
> > > test Accumulo.
> > >
> > > I face some timeout related issues with several IT tests:
> > >
> > >
> > > [ERROR]
> > >
> > >
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > >  Time elapsed: 420.122 s  <<< ERROR!
> > > org.junit.runners.model.TestTimedOutException: test timed out after 420
> > > seconds at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native
> Method)
> > > at java.base@11.0.11
> > > /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> > > at java.base@11.0.11
> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> > > at java.base@11.0.11
> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
> > > at
> > >
> > >
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> > > at java.base@11.0.11
> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > > Method)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > > at
> > >
> > >
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > > at java.base@11.0.11
> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >
> > > > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > > >  Time elapsed: 420.122 s  <<< ERROR!
> > > > java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:44251)
> > > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> > > > at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > > > at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > > > at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
> > > > at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > > > at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> > >
> > > [ERROR]
> > >
> > >
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
> > >  Time elapsed: 420.011 s  <<< ERROR!
> > > org.junit.runners.model.TestTimedOutException: test timed out after 420
> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native Method) at
> > >
> > >
> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> > > at
> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> > > at
> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> > > at
> > >
> > >
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> > > at java.base@11.0.11
> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > > Method)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > > at
> > >
> > >
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > > at java.base@11.0.11
> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >
> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
> > > [INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> > > [INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
> > > [INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
> > > [INFO] Running
> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 255.108 s - in
> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
> > > [INFO] Running
> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 59.289 s - in
> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT

Re: Consistent IT tests failures on Linux ARM64

Posted by Christopher <ct...@apache.org>.
It looks like the tests are timing out. This happens frequently when
running on resource-constrained systems. You can give the test more
time by increasing the timeout factor: `mvn clean verify
-Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
-Dtimeout.factor=3`
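
As a rough illustration of what the factor does, here is a minimal sketch. It is not Accumulo's actual test harness code: the class and method names are hypothetical, and it simply assumes that each test's base timeout is multiplied by the integer value of the `timeout.factor` system property.

```java
import java.time.Duration;

public class TimeoutFactorSketch {

  // Hypothetical helper: scale a base timeout by a factor given as a string,
  // e.g. the value of the "timeout.factor" system property. Missing or
  // invalid values fall back to a factor of 1.
  static Duration scaled(Duration base, String factorProperty) {
    int factor = 1;
    if (factorProperty != null) {
      try {
        factor = Math.max(1, Integer.parseInt(factorProperty.trim()));
      } catch (NumberFormatException e) {
        // keep the default factor of 1
      }
    }
    return base.multipliedBy(factor);
  }

  public static void main(String[] args) {
    // 420 seconds is the timeout reported in the failing test output above.
    Duration base = Duration.ofSeconds(420);
    Duration effective = scaled(base, System.getProperty("timeout.factor"));
    System.out.println("effective timeout: " + effective.getSeconds() + " s");
  }
}
```

Under that assumption, a test that times out after 420 seconds would get 1260 seconds with -Dtimeout.factor=3.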

There's nothing we know of that would change the way our tests work
due to ARM64, but you may have issues because of limited RAM, slow CPU
speeds, slow disk I/O, busy background processes, or other
resource-related issues. I don't think most of the currently active
developers use ARM64, or have access to a test machine to reproduce or
experiment with Accumulo there, so you may have to do some of your own
troubleshooting. If you can rule out resource-constraint issues, and
it isn't already a known flaky test (ConcurrentDeleteTableIT is known
flaky and sometimes times out on x86_64 as well), you could create a
bug ticket with more details at
https://github.com/apache/accumulo/issues ; there is an issue template
specifically for broken and/or flaky tests that you can select when
creating a new ticket.

On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
>
> Hi dev1,
>
> On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>
> > Some of those tests are trying to stress conditions that require a lot of
> > resources to replicate specific conditions. Have you tried to run those
> > individual tests in isolation so that you are not competing for resources?
> > Do they always fail, or are the failures transient?
> >
>
> Q: Have you tried to run those individual tests in isolation so that you
> are not competing for resources?
> A: This is what I mean with the following:
> ---------------------
> The tests fail even when executed separately, e.g.:
> mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> ---------------------
>
> Q: Do they always fail, or are the failures transient?
> A: I also tried to explain that with "These tests fail consistently at
> every build attempt!"
>
> Mark
>

Re: Consistent IT tests failures on Linux ARM64

Posted by Mark Jens <ma...@gmail.com>.
On Tue, 30 Nov 2021 at 18:34, Mike Miller <mm...@apache.org> wrote:

> There have been issues with that IT so it is possible it is unrelated to
> your architecture.
> https://github.com/apache/accumulo/pull/2304
> https://github.com/apache/accumulo/issues/1841


Issue #1841 is exactly what I experience!
My test machine has 8 CPU cores @ 2.6GHz and 16GB RAM.
With -Dtimeout.factor=3, ConcurrentDeleteTableIT passes in 768.612 s (i.e.
about 13 minutes).
I am now running the whole IT suite. The CPUs are mostly idle, spiking to
at most 20%, and only 4GB of RAM is in use.

Once the tests finish I will re-run ConcurrentDeleteTableIT and take
some thread dumps to see where it blocks.
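
For reference, a thread dump of an already running JVM is usually taken with the JDK's `jstack <pid>` tool. The same information can also be collected from inside a JVM. The following is only a sketch of that second approach using the standard `java.lang.management` API; the class name `ThreadDumpSketch` is made up for illustration and nothing like it is part of Accumulo.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

  // Build a plain-text dump of all live threads in the current JVM.
  static String dump() {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    StringBuilder sb = new StringBuilder();
    // false, false: skip lock and synchronizer details, which not every JVM supports
    for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
      if (info == null) {
        continue;
      }
      sb.append('"').append(info.getThreadName()).append("\" ")
          .append(info.getThreadState()).append('\n');
      // Print every frame; ThreadInfo.toString() only shows the first few.
      for (StackTraceElement frame : info.getStackTrace()) {
        sb.append("    at ").append(frame).append('\n');
      }
      sb.append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.print(dump());
  }
}
```

Threads that show the same frames across several dumps taken a few seconds apart are the likely place where the test is blocked.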



Re: Consistent IT tests failures on Linux ARM64

Posted by Mike Miller <mm...@apache.org>.
There have been issues with that IT so it is possible it is unrelated to
your architecture.
https://github.com/apache/accumulo/pull/2304
https://github.com/apache/accumulo/issues/1841

On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:

> Hi dev1,
>
> On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>
> > Some of those tests are trying to stress conditions that require a lot of
> > resources to replicate specific conditions. Have you tried to run those
> > individual tests in isolation so that you are not competing for
> resources?
> > Do they always fail, or are the failures transient?
> >
>
> Q: Have you tried to run those individual tests in isolation so that you
> are not competing for resources?
> A: This is what I mean with the following:
> ---------------------
> The tests fail even when executed separately, e.g.:
> mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> ---------------------
>
> Q: Do they always fail, or are the failures transient?
> A: I also tried to explain that with "These tests fail consistently at
> every build attempt!"
>
> Mark
>
> >
> > -----Original Message-----
> > From: Mark Jens <ma...@gmail.com>
> > Sent: Tuesday, November 30, 2021 4:05 AM
> > To: dev@accumulo.apache.org
> > Subject: Consistent IT tests failures on Linux ARM64
> >
> > Hello Accumulo community,
> >
> > At my job we consider using Linux ARM64 servers and I've been tasked to
> > test Accumulo.
> >
> > I face some timeout related issues with several IT tests:
> >
> >
> > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete  Time elapsed: 420.122 s  <<< ERROR!
> > org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
> > at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
> > at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> > at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >
> > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete  Time elapsed: 420.122 s  <<< ERROR!
> > java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:44251)
> > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> > at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
> > at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> >
> > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps  Time elapsed: 420.011 s  <<< ERROR!
> > org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
> > at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
> > at app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> > at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> > at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> > at app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> > at app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> > at app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> > at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> > at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> > at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >
> > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
> > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
> > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
> > [INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
> > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> > [INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
> > [INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
> > [INFO] Running org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 255.108 s - in org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
> > [INFO] Running org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 59.289 s - in org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
> > [INFO] Running org.apache.accumulo.test.functional.BulkIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
> > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
> > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
> > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
> > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
> > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
> > [INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
> > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
> > [INFO] Running
> > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 219.253 s - in
> > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
> > [INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
> > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
> > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
> > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
> > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
> > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
> > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
> > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
> > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
> > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
> > [INFO] Running org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 71.934 s - in org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
> > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
> > 307.904 s <<< FAILURE! - in
> > org.apache.accumulo.test.functional.HalfDeadTServerIT
> > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> >  Time elapsed: 240.011 s  <<< ERROR!
> > org.junit.runners.model.TestTimedOutException: test timed out after 240 seconds
> > at java.base@11.0.11/java.lang.Object.wait(Native Method)
> > at java.base@11.0.11/java.lang.Object.wait(Object.java:328)
> > at java.base@11.0.11/java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
> > at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
> > at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >
> > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> >  Time elapsed: 240.012 s  <<< ERROR!
> > java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:39285)
> > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> > at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
> > at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> >
> > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
> > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
> > [INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
> > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
> > [INFO] Running org.apache.accumulo.test.AuditMessageIT
> > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
> > [INFO] Running
> > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
> > 0.039 s - in
> > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > [INFO]
> > [INFO] Results:
> > [INFO]
> > [ERROR] Errors:
> > [ERROR] org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> > [ERROR]   Run 1: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 » TestTimedOut
> > [ERROR]   Run 2: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »  Appears to ...
> > [INFO]
> > [ERROR]   ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 » TestTimedOut test t...
> > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > [ERROR]   Run 1: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut tes...
> > [ERROR]   Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »  Appears to be stuck...
> > [INFO]
> > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > [ERROR]   Run 1: HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2 » TestTimedOut
> > [ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be stuck in thread Time-limited te...
> > [INFO]
> > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
> > [ERROR]   Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 » TestTimedOut test timed ...
> > [ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in thread Time-limited test-SendThread(...
> >
> > These tests fail consistently at every build attempt!
> >
> > The tests fail even when executed separately, e.g.:
> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> >
> >
> > I am using the current 'main' branch of Accumulo.
> > JDK 11.0.11
> > Maven: 3.8.2
> > OS: Ubuntu 20.04.3 ARM64
> >
> > Is there anything that could be done to fix these problems ?
> > For example some config settings ?!
> >
> > P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
> > ARM64 is a supported platform since the JVM supports it.
> >
> > Thanks!
> >
> > Mark
> >
>

Re: Consistent IT tests failures on Linux ARM64

Posted by Mark Jens <ma...@gmail.com>.
Hi dev1,

On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:

> Some of those tests are trying to stress conditions that require a lot of
> resources to replicate specific conditions. Have you tried to run those
> individual tests in isolation so that you are not competing for resources?
> Do they always fail, or are the failures transient?
>

Q: Have you tried to run those individual tests in isolation so that you
are not competing for resources?
A: Yes. That is what I meant by the following:
---------------------
The tests fail even when executed separately, e.g.:
mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
---------------------

Q: Do they always fail, or are the failures transient?
A: They always fail. That is what I meant by "These tests fail consistently
at every build attempt!"
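
To compare runs, the summary lines in the output can be checked mechanically
rather than by eye. Below is a minimal Java sketch that parses one
"Tests run: ..." line in the format shown in the log. The class name
SummaryLine and its fields are made up for illustration and are not part of
Accumulo or Maven. It assumes the quote markers have been stripped first; the
pattern tolerates the line wrap after "Time elapsed:".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Accumulo or Maven: parses one
// "Tests run: ..." summary line as it appears in the build output.
class SummaryLine {
    // \s+ after "Time elapsed:" also matches a newline added by mail wrapping.
    private static final Pattern SUMMARY = Pattern.compile(
        "Tests run: (\\d+), Failures: (\\d+), Errors: (\\d+), Skipped: (\\d+),"
            + " Time elapsed:\\s+([0-9.]+) s");

    final int run;
    final int failures;
    final int errors;
    final int skipped;
    final double seconds;

    private SummaryLine(Matcher m) {
        run = Integer.parseInt(m.group(1));
        failures = Integer.parseInt(m.group(2));
        errors = Integer.parseInt(m.group(3));
        skipped = Integer.parseInt(m.group(4));
        seconds = Double.parseDouble(m.group(5));
    }

    // Returns null when the text contains no summary line.
    static SummaryLine parse(String text) {
        Matcher m = SUMMARY.matcher(text);
        return m.find() ? new SummaryLine(m) : null;
    }

    // A class counts as failed if it reports any failure or error.
    boolean failed() {
        return failures > 0 || errors > 0;
    }

    public static void main(String[] args) {
        SummaryLine s = parse("[ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0,"
            + " Time elapsed: 307.904 s <<< FAILURE! - in"
            + " org.apache.accumulo.test.functional.HalfDeadTServerIT");
        System.out.println("failed=" + s.failed() + " errors=" + s.errors
            + " seconds=" + s.seconds);
    }
}
```

Feeding every summary line from two builds through parse() and comparing the
failed() results per class would show whether the same classes fail each time.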

Mark

>
> -----Original Message-----
> From: Mark Jens <ma...@gmail.com>
> Sent: Tuesday, November 30, 2021 4:05 AM
> To: dev@accumulo.apache.org
> Subject: Consistent IT tests failures on Linux ARM64
>
> Hello Accumulo community,
>
> At my job we consider using Linux ARM64 servers and I've been tasked to
> test Accumulo.
>
> I face some timeout related issues with several IT tests:
>
>
> [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>  Time elapsed: 420.122 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
> at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
> at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> at java.base@11.0.11/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> at java.base@11.0.11/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>
> [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>  Time elapsed: 420.122 s  <<< ERROR!
> java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:44251)
> at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
> at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>
> [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
>  Time elapsed: 420.011 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
> at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
> at app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> at app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> at app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> at app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>
> [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
> [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
> [INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
> [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> [INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
> [INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
> [INFO] Running org.apache.accumulo.test.functional.CreateInitialSplitsIT
> [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 255.108 s - in org.apache.accumulo.test.functional.CreateInitialSplitsIT
> [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
> [INFO] Running org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 59.289 s - in org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
> [INFO] Running org.apache.accumulo.test.functional.BulkIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 122.959 s - in org.apache.accumulo.test.functional.BulkIT
> [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
> [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
> [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
> [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
> [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
> [INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
> [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
> [INFO] Running
> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 219.253 s - in
> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
> [INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
> [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
> [INFO] Running org.apache.accumulo.test.functional.SummaryIT
> [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
> [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
> [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
> [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
> [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
> [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
> [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
> [INFO] Running org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 71.934 s - in org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
> [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
> 307.904 s <<< FAILURE! - in
> org.apache.accumulo.test.functional.HalfDeadTServerIT
> [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>  Time elapsed: 240.011 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 240 seconds
> at java.base@11.0.11/java.lang.Object.wait(Native Method)
> at java.base@11.0.11/java.lang.Object.wait(Object.java:328)
> at java.base@11.0.11/java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
> at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
> at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
> at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>
> [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>  Time elapsed: 240.012 s  <<< ERROR!
> java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:39285)
> at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
> at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>
> [INFO] Running org.apache.accumulo.test.functional.MetadataIT
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
> [INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
> [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
> [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
> [INFO] Running org.apache.accumulo.test.AuditMessageIT
> [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 165.169 s - in org.apache.accumulo.test.AuditMessageIT
> [INFO] Running
> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
> 0.039 s - in
> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> [INFO]
> [INFO] Results:
> [INFO]
> [ERROR] Errors:
> [ERROR] org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> [ERROR]   Run 1: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 » TestTimedOut
> [ERROR]   Run 2: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »  Appears to ...
> [INFO]
> [ERROR]   ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 » TestTimedOut test t...
> [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> [ERROR]   Run 1: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut tes...
> [ERROR]   Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »  Appears to be stuck...
> [INFO]
> [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> [ERROR]   Run 1: HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2 » TestTimedOut
> [ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be stuck in thread Time-limited te...
> [INFO]
> [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
> [ERROR]   Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 » TestTimedOut test timed ...
> [ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in thread Time-limited test-SendThread(...
>
> These tests fail consistently at every build attempt!
>
> The tests fail even when executed separately, e.g.:
> mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>
>
> I am using the current 'main' branch of Accumulo.
> JDK 11.0.11
> Maven: 3.8.2
> OS: Ubuntu 20.04.3 ARM64
>
> Is there anything that could be done to fix these problems ?
> For example some config settings ?!
>
> P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
> ARM64 is a supported platform since the JVM supports it.
>
> Thanks!
>
> Mark
>

RE: Consistent IT tests failures on Linux ARM64

Posted by dev1 <de...@etcoleman.com>.
Some of those tests stress the system and need a lot of resources to reproduce specific conditions. Have you tried to run those individual tests in isolation so that you are not competing for resources? Do they always fail, or are the failures transient?
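
For a rough idea of what a JVM on the test host has to work with, a small check like the following could be run there. This is only a sketch: the class name HostResources is made up for illustration, it reports what the JVM sees (CPU count and maximum heap), and it says nothing about how much the Accumulo ITs actually need.

```java
// Hypothetical helper for illustration: reports the CPU count and the
// maximum heap size visible to the JVM it runs in.
class HostResources {
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the most heap the JVM will try to use, in bytes.
        long maxHeapMiB = rt.maxMemory() / (1024 * 1024);
        return "cpus=" + rt.availableProcessors() + " maxHeapMiB=" + maxHeapMiB;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Comparing this output between the ARM64 machine and a machine where the tests pass would show whether the two hosts differ much in CPU count or heap.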

-----Original Message-----
From: Mark Jens <ma...@gmail.com> 
Sent: Tuesday, November 30, 2021 4:05 AM
To: dev@accumulo.apache.org
Subject: Consistent IT tests failures on Linux ARM64

Hello Accumulo community,

At my job we are considering using Linux ARM64 servers, and I've been tasked with testing Accumulo.

I am facing timeout-related issues with several ITs:


[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
 Time elapsed: 420.122 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@11.0.11
/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.11
/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
at java.base@11.0.11
/java.util.concurrent.FutureTask.get(FutureTask.java:190)
at
app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at java.base@11.0.11
/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11
/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at
app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at
app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at
app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11
/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)

[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
 Time elapsed: 420.122 s  <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited
test-SendThread(localhost:44251)
at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.11
/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.11
/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
at
app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)

[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
 Time elapsed: 420.011 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
at
app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
at
app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
at
app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
at
app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
at
app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
at
app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
at
app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
at
app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at java.base@11.0.11
/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11
/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at
app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at
app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at
app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11
/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)

[INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Running org.apache.accumulo.test.functional.BinaryIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
65.034 s - in org.apache.accumulo.test.functional.BinaryIT
[INFO] Running org.apache.accumulo.test.functional.PermissionsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
[INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Running org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
255.108 s - in org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Running org.apache.accumulo.test.functional.RestartStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
[INFO] Running org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.289 s - in org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Running org.apache.accumulo.test.functional.BulkNewIT
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
[INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Running org.apache.accumulo.test.functional.BulkIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
122.959 s - in org.apache.accumulo.test.functional.BulkIT
[INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Running
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
219.253 s - in
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Running org.apache.accumulo.test.functional.VisibilityIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
[INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Running org.apache.accumulo.test.functional.SummaryIT
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
111.552 s - in org.apache.accumulo.test.functional.SummaryIT
[INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Running org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
71.934 s - in org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
307.904 s <<< FAILURE! - in
org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
 Time elapsed: 240.011 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 240 seconds
at java.base@11.0.11/java.lang.Object.wait(Native Method)
at java.base@11.0.11/java.lang.Object.wait(Object.java:328)
at java.base@11.0.11/java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
at
app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
at
app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at java.base@11.0.11
/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11
/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at
app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at
app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at
app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11
/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)

[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
 Time elapsed: 240.012 s  <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited
test-SendThread(localhost:39285)
at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.11
/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.11
/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
at
app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)

[INFO] Running org.apache.accumulo.test.functional.MetadataIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
97.987 s - in org.apache.accumulo.test.functional.MetadataIT
[INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Running org.apache.accumulo.test.AuditMessageIT
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
165.169 s - in org.apache.accumulo.test.AuditMessageIT
[INFO] Running
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
0.039 s - in
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]
org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
[ERROR]   Run 1:
ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 » TestTimedOut
[ERROR]   Run 2:
ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »  Appears to ...
[INFO]
[ERROR]   ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
TestTimedOut test t...
[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
[ERROR]   Run 1:
ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut tes...
[ERROR]   Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
 Appears to be stuck...
[INFO]
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
[ERROR]   Run 1:
HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
» TestTimedOut
[ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be stuck in
thread Time-limited te...
[INFO]
[ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
[ERROR]   Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
TestTimedOut test timed ...
[ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in thread
Time-limited test-SendThread(...

These tests fail consistently at every build attempt!

The tests fail even when executed separately, e.g.:
mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test


I am using the current 'main' branch of Accumulo.
JDK 11.0.11
Maven: 3.8.2
OS: Ubuntu 20.04.3 ARM64

Is there anything that could be done to fix these problems?
For example, some config settings?

P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
ARM64 is a supported platform since the JVM supports it.

Thanks!

Mark