Posted to dev@accumulo.apache.org by Mark Jens <ma...@gmail.com> on 2021/11/30 09:04:35 UTC
Consistent IT tests failures on Linux ARM64
Hello Accumulo community,
At my job we are considering Linux ARM64 servers, and I've been tasked with
testing Accumulo on them.
I am seeing timeout-related failures in several integration tests (ITs):
[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete Time elapsed: 420.122 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
    at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
    at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
    at java.base@11.0.11/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
    at java.base@11.0.11/java.util.concurrent.FutureTask.get(FutureTask.java:190)
    at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
    at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
    at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
    at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
    at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete Time elapsed: 420.122 s <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:44251)
    at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
    at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
    at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
    at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
    at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
    at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps Time elapsed: 420.011 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
    at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
    at app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
    at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
    at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
    at app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
    at app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
    at app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
    at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
    at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
    at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
    at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
    at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
    at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
    at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
    at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
    at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
    at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
    at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
    at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
[INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Running org.apache.accumulo.test.functional.BinaryIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
65.034 s - in org.apache.accumulo.test.functional.BinaryIT
[INFO] Running org.apache.accumulo.test.functional.PermissionsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
[INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Running org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
255.108 s - in org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Running org.apache.accumulo.test.functional.RestartStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
[INFO] Running org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.289 s - in org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Running org.apache.accumulo.test.functional.BulkNewIT
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
[INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Running org.apache.accumulo.test.functional.BulkIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
122.959 s - in org.apache.accumulo.test.functional.BulkIT
[INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Running
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
219.253 s - in
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Running org.apache.accumulo.test.functional.VisibilityIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
[INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Running org.apache.accumulo.test.functional.SummaryIT
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
111.552 s - in org.apache.accumulo.test.functional.SummaryIT
[INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Running org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
71.934 s - in org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 307.904 s <<< FAILURE! - in org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover Time elapsed: 240.011 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 240 seconds
    at java.base@11.0.11/java.lang.Object.wait(Native Method)
    at java.base@11.0.11/java.lang.Object.wait(Object.java:328)
    at java.base@11.0.11/java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
    at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
    at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
    at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
    at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
    at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
    at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover Time elapsed: 240.012 s <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:39285)
    at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
    at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
    at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
    at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
    at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
    at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
[INFO] Running org.apache.accumulo.test.functional.MetadataIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
97.987 s - in org.apache.accumulo.test.functional.MetadataIT
[INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Running org.apache.accumulo.test.AuditMessageIT
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
165.169 s - in org.apache.accumulo.test.AuditMessageIT
[INFO] Running
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
0.039 s - in
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
[ERROR]   Run 1: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 » TestTimedOut
[ERROR]   Run 2: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction » Appears to ...
[INFO]
[ERROR] ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 » TestTimedOut test t...
[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
[ERROR]   Run 1: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut tes...
[ERROR]   Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete » Appears to be stuck...
[INFO]
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
[ERROR]   Run 1: HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2 » TestTimedOut
[ERROR]   Run 2: HalfDeadTServerIT.testRecover » Appears to be stuck in thread Time-limited te...
[INFO]
[ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
[ERROR]   Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 » TestTimedOut test timed ...
[ERROR]   Run 2: SslIT.adminStop » Appears to be stuck in thread Time-limited test-SendThread(...
These tests fail consistently at every build attempt!
The tests fail even when executed separately, e.g.:
mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
I am using the current 'main' branch of Accumulo.
JDK 11.0.11
Maven: 3.8.2
OS: Ubuntu 20.04.3 ARM64
Is there anything that can be done to fix these problems, for example by
changing some configuration settings?
P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
ARM64 is a supported platform since the JVM supports it.
Thanks!
Mark
Re: Consistent IT tests failures on Linux ARM64
Posted by Mark Jens <ma...@gmail.com>.
On Fri, 3 Dec 2021 at 10:46, Mark Jens <ma...@gmail.com> wrote:
> I've just run a few more tests:
>
> 1) with the improvement
>
> 1.1) [INFO] Running
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 132.823 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>
> 1.2) Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 114.933 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>
> 2) without
>
> [INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [ERROR] Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed:
> 577.537 s <<< FAILURE! - in
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [ERROR]
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> Time elapsed: 420.095 s <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 420
> seconds
> at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
> at java.base@11.0.11
> /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> at java.base@11.0.11
> /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> at java.base@11.0.11
> /java.util.concurrent.FutureTask.get(FutureTask.java:190)
> at
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
>
> I am going to investigate ExternalCompaction_2_IT and
> ExternalCompaction_3_IT too. These are the other very slow tests
>
I haven't had much luck with those so far:
mvn clean verify -Dit.test=ExternalCompaction_2_IT#testSplitCancelsExternalCompaction -Dtimeout.factor=3 -o -Pfast-build -N
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 1,800.775 s <<< FAILURE! - in org.apache.accumulo.test.compaction.ExternalCompaction_2_IT
[ERROR] org.apache.accumulo.test.compaction.ExternalCompaction_2_IT.testSplitCancelsExternalCompaction Time elapsed: 1,800.024 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 1800 seconds
    at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
    at app//org.apache.accumulo.fate.util.UtilWaitThread.sleep(UtilWaitThread.java:33)
    at app//org.apache.accumulo.test.compaction.ExternalCompactionTestUtils.confirmCompactionCompleted(ExternalCompactionTestUtils.java:329)
    at app//org.apache.accumulo.test.compaction.ExternalCompaction_2_IT.testSplitCancelsExternalCompaction(ExternalCompaction_2_IT.java:118)
For some reason this test always times out, no matter how much time I give
it.
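The -Dtimeout.factor=3 flag above is meant to give slow machines more time. As a rough illustration only, here is a minimal Java sketch of how a test harness could apply such a factor, assuming it is a plain integer multiplier on a base timeout. The class ScaledTimeout, the method scale and the 10-minute base are hypothetical and are not taken from Accumulo's test code.

```java
import java.time.Duration;

class ScaledTimeout {
    // Hypothetical helper: multiply a base timeout by an integer factor.
    // A missing, non-numeric, or non-positive factor falls back to 1.
    static Duration scale(Duration base, String factorProperty) {
        int factor = 1;
        try {
            if (factorProperty != null) {
                factor = Math.max(1, Integer.parseInt(factorProperty.trim()));
            }
        } catch (NumberFormatException e) {
            // Keep the default factor when the property is not a number.
        }
        return base.multipliedBy(factor);
    }

    public static void main(String[] args) {
        // Assumed base timeout, chosen only for illustration.
        Duration base = Duration.ofMinutes(10);
        Duration effective = scale(base, System.getProperty("timeout.factor"));
        System.out.println("Effective timeout: " + effective.getSeconds() + " s");
    }
}
```

A larger factor only helps a test that is slow but still making progress. A test that never finishes, as seems to be the case here, will hit any limit.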
Thread dumps on all processes do not show anything interesting. Or at least
I don't find anything suspicious.
I've uploaded them at
https://gist.github.com/markjens/326b681c3f8a9c1b6400f55847fe7716
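For anyone who wants to repeat the sampling inside a single JVM, the following is a minimal sketch that prints the stacks of matching threads a few times, so one can see whether a thread stays in the same code between samples. It uses only Thread.getAllStackTraces() from the JDK. The class name StackSampler and the name filter are hypothetical, and it does not replace dumps taken from the separate server processes.

```java
import java.util.Map;

class StackSampler {
    // Format the current stack of every live thread whose name contains the filter.
    static String sample(String nameFilter) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().contains(nameFilter)) {
                continue;
            }
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        // Take three samples one second apart and compare them by eye.
        for (int i = 0; i < 3; i++) {
            System.out.print(sample("main"));
            Thread.sleep(1000);
        }
    }
}
```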
The -Pfast-build profile is defined as:
<profile>
  <id>fast-build</id>
  <properties>
    <checkstyle.skip>true</checkstyle.skip>
    <spotbugs.skip>true</spotbugs.skip>
    <maven.gitcommitid.skip>true</maven.gitcommitid.skip>
    <jacoco.skip>true</jacoco.skip>
    <enforcer.skip>true</enforcer.skip>
    <maven.javadoc.skip>true</maven.javadoc.skip>
    <gpg.skip>true</gpg.skip>
    <license.skip>true</license.skip>
  </properties>
</profile>
I keep it in my ~/.m2/settings.xml to speed up the builds.
>
>
> On Thu, 2 Dec 2021 at 17:40, Christopher <ct...@apache.org> wrote:
>
>> I don't see any reason it would break anything else, and I'm not opposed to
>> making a change there to avoid repeated calls to the security provider
>> to create the credentials, but I'm strongly skeptical that this would
>> fix the performance problem with that IT. I've seen that test pass
>> very quickly before, without your change. I think it might be a
>> coincidence. I think if you were to capture a thread dump at other
>> times, you wouldn't always see it in that code, but you'd find it busy
>> doing other work instead. If it does fix it permanently, though, I'd
>> be pleasantly surprised. Regardless, I think we can move forward with
>> your PR, because it does avoid unnecessary recomputation
>> of immutable credentials in ServerInfo.
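To illustrate the idea of computing an immutable, expensive value once and reusing it, here is a minimal Java sketch. It is not the code from the PR and does not use Accumulo's ServerInfo or SystemCredentials classes. The class CachedToken, the iterated SHA-512 stand-in for the expensive hash, and the round count of 5000 are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class CachedToken {
    // Counts how often the expensive computation runs (for illustration only).
    static final AtomicInteger computations = new AtomicInteger();

    // Stand-in for an expensive, deterministic hash: iterated SHA-512.
    static byte[] expensiveHash(String input) {
        computations.incrementAndGet();
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            byte[] digest = input.getBytes(StandardCharsets.UTF_8);
            for (int i = 0; i < 5000; i++) {
                digest = md.digest(digest); // digest() also resets md for the next round
            }
            return digest;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Memoizing supplier using double-checked locking.
    // Note: a delegate that returns null would be called again on the next get().
    static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private volatile T value;

            @Override
            public T get() {
                T v = value;
                if (v == null) {
                    synchronized (this) {
                        v = value;
                        if (v == null) {
                            v = delegate.get();
                            value = v;
                        }
                    }
                }
                return v;
            }
        };
    }

    public static void main(String[] args) {
        Supplier<byte[]> token = memoize(() -> expensiveHash("instance-config"));
        for (int i = 0; i < 1000; i++) {
            token.get(); // only the first call pays for the hash
        }
        System.out.println("hash computed " + computations.get() + " time(s)");
    }
}
```

Caching like this is only safe when the inputs to the hash never change for the lifetime of the object, which is the premise stated above for the credentials.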
>>
>> On Thu, Dec 2, 2021 at 7:23 AM Mark Jens <ma...@gmail.com> wrote:
>> >
>> > Please review https://github.com/apache/accumulo/pull/2374
>> > By caching the ServerInfo's Credentials, ConcurrentDeleteTableIT now passes
>> > almost 6 times faster!
>> > I am running the whole test suite now to check that it doesn't break
>> > anything else.
>> >
>> > On Thu, 2 Dec 2021 at 13:49, Mark Jens <ma...@gmail.com> wrote:
>> >
>> > > Reducing the log output did not reduce the test run time:
>> > >
>> > > diff --git test/src/main/resources/log4j2-test.properties
>> > > test/src/main/resources/log4j2-test.properties
>> > > index 9124914f7a..810c7bf06f 100644
>> > > --- test/src/main/resources/log4j2-test.properties
>> > > +++ test/src/main/resources/log4j2-test.properties
>> > > @@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
>> > > appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
>> > >
>> > > logger.01.name = org.apache.accumulo.core
>> > > -logger.01.level = debug
>> > > +logger.01.level = info
>> > >
>> > > logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
>> > > logger.02.level = info
>> > > @@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
>> > > logger.25.level = info
>> > >
>> > > logger.26.name = org.apache.hadoop.minikdc
>> > > -logger.26.level = debug
>> > > +logger.26.level = info
>> > >
>> > >
>> > > @@ -169,6 +169,6 @@ logger.metrics.level = info
>> > > logger.metrics.additivity = false
>> > > logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput
>> > >
>> > > -rootLogger.level = debug
>> > > +rootLogger.level = info
>> > > rootLogger.appenderRef.console.ref = STDOUT
>> > >
>> > > [INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 785.503 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>> > >
>> > >
>> > > On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:
>> > >
>> > >> Hi again,
>> > >>
>> > >> Here are the thread dumps as promised:
>> > >>
>> > >> 1) Both TabletServers are very busy computing SHA-512 digests (the
>> > >> implCompress frames) when closing tablets. The following stacks were
>> > >> dumped at ~5 second intervals:
>> > >>
>> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable [0x0000fffe8f3fd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11/SHA5.java:232)
>> > >>         at sun.security.provider.SHA5.implCompress(java.base@11.0.11/SHA5.java:221)
>> > >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:124)
>> > >>         at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>> > >>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>> > >>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>> > >>         at org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
>> > >>         at org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
>> > >>         at org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
>> > >>         at org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
>> > >>         - locked <0x00000000f1585830> (a org.apache.accumulo.tserver.tablet.Tablet)
>> > >>         at org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
>> > >>         at org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
>> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown Source)
>> > >>         at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11/ThreadPoolExecutor.java:1128)
>> > >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11/ThreadPoolExecutor.java:628)
>> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable [0x0000fffe8f3fd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11/DigestBase.java:149)
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11/DigestBase.java:144)
>> > >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:131)
>> > >>         at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         ...
>> > >>
>> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable [0x0000fffe8f3fd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11/ByteArrayAccess.java:449)
>> > >>         at sun.security.provider.SHA5.implDigest(java.base@11.0.11/SHA5.java:131)
>> > >>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11/DigestBase.java:210)
>> > >>         at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11/DigestBase.java:189)
>> > >>         at java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11/MessageDigest.java:639)
>> > >>         at java.security.MessageDigest.digest(java.base@11.0.11/MessageDigest.java:385)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         ...
>> > >>
>> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=86499.01ms elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable [0x0000fffe8f3fd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11/DigestBase.java:149)
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11/DigestBase.java:144)
>> > >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:131)
>> > >>         at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         ...
>> > >>
>> > >> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0 cpu=109551.37ms elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable [0x0000fffe7bffd000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11/DigestBase.java:149)
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11/DigestBase.java:144)
>> > >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:131)
>> > >>         at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> > >>
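A note on reading the thread headers in these dumps: in JDK 11 thread dumps, cpu= is the CPU time the thread has consumed and elapsed= is the wall-clock time since it was created, so their ratio approximates how busy the thread has been. The sketch below is plain Java with a hypothetical class name (ThreadCpuRatio); it is not part of Accumulo and only illustrates the arithmetic. For the #4380 thread (cpu=81174.59ms, elapsed=89.01s) the ratio is roughly 0.91, which suggests the thread spent most of its life hashing.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadCpuRatio {
    // Matches the cpu=...ms and elapsed=...s fields of a JDK 11 thread-dump header.
    private static final Pattern HEADER =
        Pattern.compile("cpu=([0-9.]+)ms\\s+elapsed=([0-9.]+)s");

    /** Returns CPU time divided by wall-clock lifetime, or -1 if the fields are absent. */
    public static double cpuRatio(String header) {
        Matcher m = HEADER.matcher(header);
        if (!m.find()) {
            return -1;
        }
        double cpuSeconds = Double.parseDouble(m.group(1)) / 1000.0;
        double elapsedSeconds = Double.parseDouble(m.group(2));
        return cpuSeconds / elapsedSeconds;
    }

    public static void main(String[] args) {
        // Header of the #4380 thread, joined back onto a single line.
        String header = "\"tablet migration-Worker-1\" #4380 daemon prio=5 os_prio=0 "
            + "cpu=81174.59ms elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable";
        System.out.println(String.format(Locale.ROOT, "cpu/elapsed = %.2f", cpuRatio(header)));
    }
}
```

By the same arithmetic the "gc" thread further down (cpu=15495.47ms, elapsed=209.43s) has been on a CPU for well under a tenth of its lifetime, so the two cases are not equally hot.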
>> > >> Notice that ServerInfo.getProperties, reached from
>> > >> ClientContext.getProperties(ClientContext.java:236), most of the time
>> > >> calls ServerInfo.getPrincipal(ServerInfo.java:148), but in the last dump
>> > >> it calls ServerInfo.getAuthenticationToken(ServerInfo.java:153).
>> > >> Both paths lead to (a lot of?!) SHA-512 digest compression work.
>> > >>
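If the repeated hashing visible in these dumps turns out to be redundant work (the dumps alone do not prove that), one generic remedy is to compute the expensive value once and reuse it. The following is only a sketch under that assumption: MemoizedTokenSketch, expensiveHash and memoize are hypothetical names, the iterated SHA-512 loop is a stand-in for the crypt call seen in the traces, and none of it is Accumulo's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class MemoizedTokenSketch {
    /** Counts how often the expensive computation actually runs. */
    public static final AtomicInteger HASH_CALLS = new AtomicInteger();

    /** Stand-in for an expensive hash: many SHA-512 rounds over the input (illustrative count). */
    public static byte[] expensiveHash(String input) {
        HASH_CALLS.incrementAndGet();
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            byte[] state = input.getBytes(StandardCharsets.UTF_8);
            for (int i = 0; i < 5000; i++) {
                state = md.digest(state); // digest() also resets the MessageDigest
            }
            return state;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Wraps a supplier so the delegate runs at most once for a non-null result;
     * later calls return the cached value.
     */
    public static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private volatile T value;

            @Override
            public T get() {
                T v = value;
                if (v == null) {
                    synchronized (this) {
                        v = value;
                        if (v == null) {
                            v = delegate.get();
                            value = v;
                        }
                    }
                }
                return v;
            }
        };
    }

    public static void main(String[] args) {
        Supplier<byte[]> token = memoize(() -> expensiveHash("instance-config"));
        for (int i = 0; i < 1000; i++) {
            token.get(); // only the first call pays for the hash
        }
        System.out.println("hash computations: " + HASH_CALLS.get());
    }
}
```

Whether caching is acceptable here depends on whether the hashed configuration can change at runtime, which this sketch does not address.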
>> > >> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
>> > >> level should not be DEBUG?!
>> > >>
>> > >> Most of its threads either wait for notifications from ZooKeeper:
>> > >>
>> > >> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0 cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in Object.wait() [0x0000fffebb7fc000]
>> > >>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>> > >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>> > >>         - waiting on <no object reference available>
>> > >>         at org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
>> > >>         - waiting to re-lock in wait() <0x00000000f1427458> (a org.apache.accumulo.fate.ZooStore)
>> > >>         at org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
>> > >>         at org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
>> > >>         at org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
>> > >>         at org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
>> > >>         at org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
>> > >>         ...
>> > >>
>> > >> or wait for data:
>> > >> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait() [0x0000fffebadfd000]
>> > >>    java.lang.Thread.State: WAITING (on object monitor)
>> > >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>> > >>         - waiting on <no object reference available>
>> > >>         at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
>> > >>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>> > >>         - waiting to re-lock in wait() <0x00000000f9bf42d8> (a org.apache.zookeeper.ClientCnxn$Packet)
>> > >>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>> > >>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown Source)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown Source)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
>> > >>         at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
>> > >>         at org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
>> > >>         at org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
>> > >>         at org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
>> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown Source)
>> > >>         at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11/ThreadPoolExecutor.java:1128)
>> > >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11/ThreadPoolExecutor.java:628)
>> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >> "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait() [0x0000ffff20f50000]
>> > >>    java.lang.Thread.State: WAITING (on object monitor)
>> > >>         at java.lang.Object.wait(java.base@11.0.11/Native Method)
>> > >>         - waiting on <no object reference available>
>> > >>         at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
>> > >>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>> > >>         - waiting to re-lock in wait() <0x00000000fa781138> (a org.apache.zookeeper.ClientCnxn$Packet)
>> > >>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>> > >>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown Source)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown Source)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>> > >>         at org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
>> > >>         at org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
>> > >>         at org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
>> > >>         at org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
>> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >> 3) SimpleGarbageCollector is also busy getting credentials:
>> > >>
>> > >> "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11/DigestBase.java:149)
>> > >>         at sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11/DigestBase.java:144)
>> > >>         at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:131)
>> > >>         at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
>> > >>         at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>> > >>         at org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
>> > >>         at org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
>> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
>> > >>         at org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
>> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >>
>> > >> "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
>> > >>    java.lang.Thread.State: RUNNABLE
>> > >>         at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
>> > >>         at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
>> > >>         at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11/Provider.java:1107)
>> > >>         at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11/ConcurrentHashMap.java:936)
>> > >>         at java.security.Provider.getService(java.base@11.0.11/Provider.java:1282)
>> > >>         at sun.security.jca.ProviderList.getService(java.base@11.0.11/ProviderList.java:380)
>> > >>         at sun.security.jca.GetInstance.getInstance(java.base@11.0.11/GetInstance.java:157)
>> > >>         at java.security.Security.getImpl(java.base@11.0.11/Security.java:700)
>> > >>         at java.security.MessageDigest.getInstance(java.base@11.0.11/MessageDigest.java:178)
>> > >>         at org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
>> > >>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> > >>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> > >>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> > >>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> > >>         at org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>> > >>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> > >>         at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>> > >>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>> > >>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>> > >>         at org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
>> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
>> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
>> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
>> > >>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
>> > >>         at org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
>> > >>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> > >>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown Source)
>> > >>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>> > >>
>> > >>
>> > >> 4) Nothing interesting for Initialize, Main and ZooKeeperServerMain
>> > >> processes
>> > >>
>> > >>
>> > >> I'm not saying that any of the above is a problem. You know how Accumulo
>> > >> works. It is up to you to decide whether something should be improved.
>> > >>
>> > >> Regards,
>> > >> Mark
>> > >>
>> > >>
>> > >> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com>
>> wrote:
>> > >>
>> > >>>
>> > >>>
>> > >>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org>
>> wrote:
>> > >>>
>> > >>>> It looks like the tests are timing out. This happens frequently
>> when
>> > >>>> running on resource-constrained systems. You can give the test more
>> > >>>> time by increasing the timeout factor: `mvn clean verify
>> > >>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
>> > >>>> -Dtimeout.factor=3`
>> > >>>>
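For readers unfamiliar with this kind of knob, here is a minimal Java sketch of how a multiplier such as -Dtimeout.factor could scale a per-test timeout. It is written under stated assumptions: the class TimeoutFactorSketch and its methods are hypothetical, the 420 s base is borrowed from the failure report purely for illustration (the thread does not say whether that figure was already scaled), and Accumulo's test harness may apply the property differently.

```java
import java.time.Duration;

public class TimeoutFactorSketch {
    /** Parses a factor, falling back to 1 when it is missing, malformed, or below 1. */
    public static int parseFactor(String raw) {
        if (raw == null) {
            return 1;
        }
        try {
            return Math.max(1, Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return 1;
        }
    }

    /** Multiplies a base timeout by the factor. */
    public static Duration scaled(Duration base, int factor) {
        return base.multipliedBy(factor);
    }

    public static void main(String[] args) {
        // Reads the same system property name used on the mvn command line above.
        int factor = parseFactor(System.getProperty("timeout.factor"));
        Duration base = Duration.ofSeconds(420); // illustrative base, see lead-in
        System.out.println("effective timeout: " + scaled(base, factor).getSeconds() + " s");
    }
}
```

Run with -Dtimeout.factor=3, this sketch would report 1260 s; without the property it keeps the base value.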
>> > >>>> There's nothing we know of that would change the way our tests work
>> > >>>> due to ARM64, but you may have issues because of limited RAM, slow
>> CPU
>> > >>>> speeds, slow disk I/O, busy background processes, or other
>> > >>>> resource-related issues. I don't think most of the currently active
>> > >>>> developers use ARM64, or have access to a test machine to
>> reproduce or
>> > >>>>
>> > >>>
>> > >>> In case anyone wants to test on Linux ARM64, you could easily use
>> > >>> Oracle Cloud for free.
>> > >>>
>> > >>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
>> > >>> explains how to create a VM and how to use this VM as a GitHub Actions
>> > >>> runner.
>> > >>>
>> > >>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
>> > >>> mentions this article.
>> > >>>
>> > >>>
>> > >>>> experiment with Accumulo there, so you may have to do some of your
>> own
>> > >>>> troubleshooting. If you can rule out resource-constraint issues,
>> and
>> > >>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is
>> known
>> > >>>> flaky and sometimes times out on x86_64 as well), you could create
>> a
>> > >>>> bug ticket with more details at
>> > >>>> https://github.com/apache/accumulo/issues ; there is an issue
>> template
>> > >>>> specifically for broken and/or flaky tests that you can select when
>> > >>>> creating a new ticket.
>> > >>>>
>> > >>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com>
>> > >>>> wrote:
>> > >>>> >
>> > >>>> > Hi dev1,
>> > >>>> >
>> > >>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>> > >>>> >
>> > >>>> > > Some of those tests are trying to stress conditions that
>> require a
>> > >>>> lot of
>> > >>>> > > resources to replicate specific conditions. Have you tried to
>> run
>> > >>>> those
>> > >>>> > > individual tests in isolation so that you are not competing for
>> > >>>> resources?
>> > >>>> > > Do they always fail, or are the failures transient?
>> > >>>> > >
>> > >>>> >
>> > >>>> > Q: Have you tried to run those individual tests in isolation so
>> that
>> > >>>> you
>> > >>>> > are not competing for resources?
>> > >>>> > A: This is what I meant by the following:
>> > >>>> > ---------------------
>> > >>>> > The tests fail even when executed separately, e.g.:
>> > >>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf
>> :accumulo-test
>> > >>>> > ---------------------
>> > >>>> >
>> > >>>> > Q: Do they always fail, or are the failures transient?
>> > >>>> > A: I also tried to explain that with "These tests fail
>> consistently at
>> > >>>> > every build attempt!"
>> > >>>> >
>> > >>>> > Mark
>> > >>>> >
>> > >>>> > >
>> > >>>> > > -----Original Message-----
>> > >>>> > > From: Mark Jens <ma...@gmail.com>
>> > >>>> > > Sent: Tuesday, November 30, 2021 4:05 AM
>> > >>>> > > To: dev@accumulo.apache.org
>> > >>>> > > Subject: Consistent IT tests failures on Linux ARM64
>> > >>>> > >
>> > >>>> > > Hello Accumulo community,
>> > >>>> > >
>> > >>>> > > At my job we consider using Linux ARM64 servers and I've been
>> > >>>> tasked to
>> > >>>> > > test Accumulo.
>> > >>>> > >
>> > >>>> > > I face some timeout related issues with several IT tests:
>> > >>>> > >
>> > >>>> > >
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > >>>> > > Time elapsed: 420.122 s <<< ERROR!
>> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
>> after
>> > >>>> 420
>> > >>>> > > seconds at java.base@11.0.11
>> /jdk.internal.misc.Unsafe.park(Native
>> > >>>> Method)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > >>>> > > Method)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > >>>> > > at java.base@11.0.11
>> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >>>> > >
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > >>>> > > Time elapsed: 420.122 s <<< ERROR!
>> > >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>> > >>>> > > test-SendThread(localhost:44251)
>> > >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>> > >>>> > > java.base@11.0.11
>> > >>>> > >
>> /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>> > >>>> > > at java.base@11.0.11/sun.nio.ch
>> > >>>> .SelectorImpl.select(SelectorImpl.java:136)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>> > >>>> > > at
>> > >>>> > >
>> > >>>>
>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >>>> > >
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
>> > >>>> > > Time elapsed: 420.011 s <<< ERROR!
>> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
>> after
>> > >>>> 420
>> > >>>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native
>> Method)
>> > >>>> at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
>> > >>>> > > at
>> > >>>>
>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
>> > >>>> > > at
>> > >>>>
>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > >>>> > > Method)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > >>>> > > at java.base@11.0.11
>> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >>>> > >
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.ScannerContextIT
>> > >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 102.909 s - in
>> org.apache.accumulo.test.functional.ScannerContextIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.KerberosRenewalIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 504.472 s - in
>> org.apache.accumulo.test.functional.KerberosRenewalIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.BatchWriterFlushIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 62.132 s - in
>> org.apache.accumulo.test.functional.BatchWriterFlushIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.PermissionsIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.ZookeeperRestartIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 37.37 s - in
>> org.apache.accumulo.test.functional.ZookeeperRestartIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 23.046 s - in
>> > >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>> > >>>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 255.108 s - in
>> > >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.RestartStressIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 78.359 s - in
>> org.apache.accumulo.test.functional.RestartStressIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 59.289 s - in
>> > >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
>> > >>>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BloomFilterIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 135.298 s - in
>> org.apache.accumulo.test.functional.BloomFilterIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BinaryStressIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 38.626 s - in
>> org.apache.accumulo.test.functional.BinaryStressIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.ClassLoaderIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.LogicalTimeIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 116.819 s - in
>> org.apache.accumulo.test.functional.LogicalTimeIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.SplitRecoveryIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 25.421 s - in
>> org.apache.accumulo.test.functional.SplitRecoveryIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BigRootTabletIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 96.86 s - in
>> org.apache.accumulo.test.functional.BigRootTabletIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
>> > >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 238.409 s - in
>> > >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
>> > >>>> > > [INFO] Running
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 219.253 s - in
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>> > >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 489.863 s - in
>> > >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
>> > >>>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.ManagerFailoverIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 47.089 s - in
>> org.apache.accumulo.test.functional.ManagerFailoverIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
>> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BackupManagerIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 22.943 s - in
>> org.apache.accumulo.test.functional.BackupManagerIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.TabletMetadataIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 46.728 s - in
>> org.apache.accumulo.test.functional.TabletMetadataIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.LateLastContactIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 46.648 s - in
>> org.apache.accumulo.test.functional.LateLastContactIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 71.934 s - in
>> > >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.HalfDeadTServerIT
>> > >>>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 307.904 s <<< FAILURE! - in
>> > >>>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
>> > >>>> > > [ERROR]
>> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > >>>> > > Time elapsed: 240.011 s <<< ERROR!
>> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
>> after
>> > >>>> 240
>> > >>>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native
>> Method)
>> > >>>> at
>> > >>>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
>> > >>>> > > at java.base@11.0.11
>> > >>>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > >>>> > > Method)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > >>>> > > at java.base@11.0.11
>> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >>>> > >
>> > >>>> > > [ERROR]
>> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > >>>> > > Time elapsed: 240.012 s <<< ERROR!
>> > >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>> > >>>> > > test-SendThread(localhost:39285)
>> > >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>> > >>>> > > java.base@11.0.11
>> > >>>> > >
>> /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>> > >>>> > > at java.base@11.0.11
>> > >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>> > >>>> > > at java.base@11.0.11/sun.nio.ch
>> > >>>> .SelectorImpl.select(SelectorImpl.java:136)
>> > >>>> > > at
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>> > >>>> > > at
>> > >>>> > >
>> > >>>>
>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >>>> > >
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
>> > >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
>> > >>>> > > [INFO] Running
>> > >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 43.91 s - in
>> > >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
>> > >>>> > > [INFO] Running
>> org.apache.accumulo.test.functional.DeleteRowsSplitIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 113.928 s - in
>> org.apache.accumulo.test.functional.DeleteRowsSplitIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
>> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
>> > >>>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
>> > >>>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time
>> > >>>> elapsed:
>> > >>>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
>> > >>>> > > [INFO] Running
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>> > >>>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1,
>> Time
>> > >>>> elapsed:
>> > >>>> > > 0.039 s - in
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>> > >>>> > > [INFO]
>> > >>>> > > [INFO] Results:
>> > >>>> > > [INFO]
>> > >>>> > > [ERROR] Errors:
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
>> > >>>> > > [ERROR] Run 1:
>> > >>>> > >
>> ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178
>> > >>>> »
>> > >>>> > > TestTimedOut
>> > >>>> > > [ERROR] Run 2:
>> > >>>> > >
>> ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »
>> > >>>> Appears
>> > >>>> > > to ...
>> > >>>> > > [INFO]
>> > >>>> > > [ERROR]
>> ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
>> > >>>> > > TestTimedOut test t...
>> > >>>> > > [ERROR]
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > >>>> > > [ERROR] Run 1:
>> > >>>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
>> > >>>> TestTimedOut
>> > >>>> > > tes...
>> > >>>> > > [ERROR] Run 2:
>> > >>>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
>> > >>>> > > Appears to be stuck...
>> > >>>> > > [INFO]
>> > >>>> > > [ERROR]
>> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > >>>> > > [ERROR] Run 1:
>> > >>>> > >
>> > >>>> > >
>> > >>>>
>> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
>> > >>>> > > » TestTimedOut
>> > >>>> > > [ERROR] Run 2: HalfDeadTServerIT.testRecover » Appears to be
>> > >>>> stuck in
>> > >>>> > > thread Time-limited te...
>> > >>>> > > [INFO]
>> > >>>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
>> > >>>> > > [ERROR] Run 1:
>> > >>>> SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
>> > >>>> > > TestTimedOut test timed ...
>> > >>>> > > [ERROR] Run 2: SslIT.adminStop » Appears to be stuck in
>> thread
>> > >>>> > > Time-limited test-SendThread(...
>> > >>>> > >
>> > >>>> > > These tests fail consistently at every build attempt!
>> > >>>> > >
>> > >>>> > > The tests fail even when executed separately, e.g.:
>> > >>>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf
>> :accumulo-test
>> > >>>> > >
>> > >>>> > >
>> > >>>> > > I am using the current 'main' branch of Accumulo.
>> > >>>> > > JDK 11.0.11
>> > >>>> > > Maven: 3.8.2
>> > >>>> > > OS: Ubuntu 20.04.3 ARM64
>> > >>>> > >
>> > >>>> > > Is there anything that could be done to fix these problems?
>> > >>>> > > For example, some config settings?
>> > >>>> > >
>> > >>>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read
>> that
>> > >>>> Linux
>> > >>>> > > ARM64 is a supported platform since the JVM supports it.
>> > >>>> > >
>> > >>>> > > Thanks!
>> > >>>> > >
>> > >>>> > > Mark
>> > >>>> > >
>> > >>>>
>> > >>>
>>
>
Re: Consistent IT tests failures on Linux ARM64
Posted by Mark Jens <ma...@gmail.com>.
I've just run a few more tests:
1) with the improvement
1.1) [INFO] Running
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
132.823 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
1.2) Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
114.933 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
2) without
[INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[ERROR] Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed:
577.537 s <<< FAILURE! - in
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
Time elapsed: 420.095 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420
seconds
at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@11.0.11
/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.11
/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
at java.base@11.0.11
/java.util.concurrent.FutureTask.get(FutureTask.java:190)
at
app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
I am going to investigate ExternalCompaction_2_IT and
ExternalCompaction_3_IT too. These are the other very slow tests.
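For readers skimming the thread, the "improvement" in case 1) is the credential caching from the pull request quoted below. Here is a minimal sketch of that general idea, not the actual patch: compute an expensive, immutable value once and reuse it, instead of re-hashing on every call. All names here (CachedCredentialsSketch, Memoized, expensiveHash, HASH_CALLS) are hypothetical and are not Accumulo APIs.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class CachedCredentialsSketch {

  /**
   * Wraps a supplier so that the delegate runs at most once.
   * Assumes the delegate never returns null.
   */
  static final class Memoized<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private volatile T value;

    Memoized(Supplier<T> delegate) {
      this.delegate = delegate;
    }

    @Override
    public T get() {
      T result = value;
      if (result == null) {
        // Double-checked locking: only the first caller pays the cost.
        synchronized (this) {
          result = value;
          if (result == null) {
            result = delegate.get();
            value = result;
          }
        }
      }
      return result;
    }
  }

  /** Counts how often the (stand-in) expensive computation runs. */
  static final AtomicInteger HASH_CALLS = new AtomicInteger();

  /** Hypothetical stand-in for an expensive hash over the instance config. */
  static String expensiveHash() {
    HASH_CALLS.incrementAndGet();
    return "hypothetical-token";
  }

  public static void main(String[] args) {
    Supplier<String> credentials = new Memoized<>(CachedCredentialsSketch::expensiveHash);
    for (int i = 0; i < 1000; i++) {
      credentials.get(); // every call after the first returns the cached value
    }
    System.out.println("hash computed " + HASH_CALLS.get() + " time(s)");
  }
}
```

Caching like this is only safe when the inputs cannot change for the lifetime of the holder; that is an assumption of the sketch.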
On Thu, 2 Dec 2021 at 17:40, Christopher <ct...@apache.org> wrote:
> I don't see any reason it would break anything else, and I'm not opposed
> to making a change there to avoid repeated calls to the security provider
> to create the credentials, but I strongly doubt that this would
> fix the performance problem with that IT. I've seen that test pass
> very quickly before, without your change. I think it might be a
> coincidence. I think if you were to capture a thread dump at other
> times, you wouldn't always see it in that code, but you'd find it busy
> doing other work instead. If it does fix it permanently, though, I'd
> be pleasantly surprised. Regardless, I think we can move forward with
> your PR, either way, because it does avoid unnecessary recomputation
> of immutable credentials in ServerInfo.
>
> On Thu, Dec 2, 2021 at 7:23 AM Mark Jens <ma...@gmail.com> wrote:
> >
> > Please review https://github.com/apache/accumulo/pull/2374
> > By caching the ServerInfo's Credentials, ConcurrentDeleteTableIT passes
> > almost 6 times faster now!
> > I am running the whole test suite now to check that it doesn't break
> > anything else.
> >
> > On Thu, 2 Dec 2021 at 13:49, Mark Jens <ma...@gmail.com> wrote:
> >
> > > Reducing the log output did not reduce the test run time:
> > >
> > > diff --git test/src/main/resources/log4j2-test.properties
> > > test/src/main/resources/log4j2-test.properties
> > > index 9124914f7a..810c7bf06f 100644
> > > --- test/src/main/resources/log4j2-test.properties
> > > +++ test/src/main/resources/log4j2-test.properties
> > > @@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
> > > appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
> > >
> > > logger.01.name = org.apache.accumulo.core
> > > -logger.01.level = debug
> > > +logger.01.level = info
> > >
> > > logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
> > > logger.02.level = info
> > > @@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
> > > logger.25.level = info
> > >
> > > logger.26.name = org.apache.hadoop.minikdc
> > > -logger.26.level = debug
> > > +logger.26.level = info
> > >
> > >
> > > @@ -169,6 +169,6 @@ logger.metrics.level = info
> > > logger.metrics.additivity = false
> > > logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput
> > >
> > > -rootLogger.level = debug
> > > +rootLogger.level = info
> > > rootLogger.appenderRef.console.ref = STDOUT
> > >
> > > [INFO] Running
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 785.503 s - in
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> > >
> > >
> > > On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:
> > >
> > >> Hi again,
> > >>
> > >> Here are the thread dumps as promised:
> > >>
> > >> 1) Both TabletServers are very busy compressing at tablet close time. The
> > >> following stacks were dumped at ~5 second intervals:
> > >>
> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0
> cpu=68425.44ms
> > >> elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable
> > >> [0x0000fffe8f3fd000]
> > >> java.lang.Thread.State: RUNNABLE
> > >> at
> sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11
> > >> /SHA5.java:232)
> > >> at sun.security.provider.SHA5.implCompress(java.base@11.0.11
> > >> /SHA5.java:221)
> > >> at
> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> > >> /DigestBase.java:124)
> > >> at
> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> > >> /MessageDigest.java:623)
> > >> at java.security.MessageDigest.update(java.base@11.0.11
> > >> /MessageDigest.java:345)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >> at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> > >> at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> > >> at
> > >>
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> > >> at
> > >>
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> > >> at
> > >>
> org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
> > >> at
> > >>
> org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
> > >> at
> > >>
> org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
> > >> at
> > >>
> org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
> > >> - locked <0x00000000f1585830> (a
> > >> org.apache.accumulo.tserver.tablet.Tablet)
> > >> at
> > >> org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
> > >> at
> > >>
> org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
> > >> at
> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >> at
> > >>
> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> > >> Source)
> > >> at
> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> > >> /ThreadPoolExecutor.java:1128)
> > >> at
> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> > >> /ThreadPoolExecutor.java:628)
> > >> at
> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >> at
> > >>
> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> > >> Source)
> > >> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0
> cpu=72485.20ms
> > >> elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable
> > >> [0x0000fffe8f3fd000]
> > >> java.lang.Thread.State: RUNNABLE
> > >> at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> > >> /DigestBase.java:149)
> > >> at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> > >> /DigestBase.java:144)
> > >> at
> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> > >> /DigestBase.java:131)
> > >> at
> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> > >> /MessageDigest.java:623)
> > >> at java.security.MessageDigest.update(java.base@11.0.11
> > >> /MessageDigest.java:345)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >> ...
> > >>
> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0
> cpu=81174.59ms
> > >> elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
> > >> [0x0000fffe8f3fd000]
> > >> java.lang.Thread.State: RUNNABLE
> > >> at
> sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
> > >> /ByteArrayAccess.java:449)
> > >> at sun.security.provider.SHA5.implDigest(java.base@11.0.11
> > >> /SHA5.java:131)
> > >> at
> sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> > >> /DigestBase.java:210)
> > >> at
> sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> > >> /DigestBase.java:189)
> > >> at
> > >> java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
> > >> /MessageDigest.java:639)
> > >> at java.security.MessageDigest.digest(java.base@11.0.11
> > >> /MessageDigest.java:385)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >> ...
> > >>
> > >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0
> cpu=86499.01ms
> > >> elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
> > >> [0x0000fffe8f3fd000]
> > >> java.lang.Thread.State: RUNNABLE
> > >> at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> > >> /DigestBase.java:149)
> > >> at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> > >> /DigestBase.java:144)
> > >> at
> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> > >> /DigestBase.java:131)
> > >> at
> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> > >> /MessageDigest.java:623)
> > >> at java.security.MessageDigest.update(java.base@11.0.11
> > >> /MessageDigest.java:345)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >> ...
> > >>
> > >> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0
> cpu=109551.37ms
> > >> elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
> > >> [0x0000fffe7bffd000]
> > >> java.lang.Thread.State: RUNNABLE
> > >> at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> > >> /DigestBase.java:149)
> > >> at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> > >> /DigestBase.java:144)
> > >> at
> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> > >> /DigestBase.java:131)
> > >> at
> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> > >> /MessageDigest.java:623)
> > >> at java.security.MessageDigest.update(java.base@11.0.11
> > >> /MessageDigest.java:345)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
> > >> at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >> at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> > >> at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> > >> at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> > >>
> > >> Notice that below ClientContext.getProperties(ClientContext.java:236)
> > >> the stack most of the time goes through
> > >> ServerInfo.getPrincipal(ServerInfo.java:148), but in the last dump it
> > >> goes through ServerInfo.getAuthenticationToken(ServerInfo.java:153).
> > >> Both end up in (a lot of?!) compressing.
> > >>
> > >> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
> > >> level should not be DEBUG?
> > >>
> > >> Most of its threads either wait for notifications from ZooKeeper:
> > >>
> > >> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
> > >> cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
> > >> Object.wait() [0x0000fffebb7fc000]
> > >> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> > >> at java.lang.Object.wait(java.base@11.0.11/Native Method)
> > >> - waiting on <no object reference available>
> > >> at
> > >>
> org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
> > >> - waiting to re-lock in wait() <0x00000000f1427458> (a
> > >> org.apache.accumulo.fate.ZooStore)
> > >> at
> > >>
> org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
> > >> at
> > >>
> org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
> > >> at
> > >> org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
> > >> at
> > >>
> org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
> > >> at
> > >>
> org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
> > >> ...
> > >>
> > >> or wait for data:
> > >> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0
> cpu=7440.91ms
> > >> elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
> > >> [0x0000fffebadfd000]
> > >> java.lang.Thread.State: WAITING (on object monitor)
> > >> at java.lang.Object.wait(java.base@11.0.11/Native Method)
> > >> - waiting on <no object reference available>
> > >> at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> > >> at
> > >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> > >> - waiting to re-lock in wait() <0x00000000f9bf42d8> (a
> > >> org.apache.zookeeper.ClientCnxn$Packet)
> > >> at
> > >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> > >> at
> > >> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
> > >> at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
> > >> at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
> > >> Source)
> > >> at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> > >> Source)
> > >> at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> > >> at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> > >> at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> > >> at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
> > >> at
> org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
> > >> at
> > >> org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
> > >> at
> > >>
> org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
> > >> at
> > >> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
> > >> at
> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >> at
> > >>
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> > >> Source)
> > >> 878803 at
> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> > >> /ThreadPoolExecutor.java:1128)
> > >> 878804 at
> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> > >> /ThreadPoolExecutor.java:628)
> > >> 878805 at
> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >> 878806 at
> > >>
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> > >> Source)
> > >> 878807 at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >> 908220 "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
> > >> elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
> > >> [0x0000ffff20f50000]
> > >> 908221 java.lang.Thread.State: WAITING (on object monitor)
> > >> 908222 at java.lang.Object.wait(java.base@11.0.11/Native Method)
> > >> 908223 - waiting on <no object reference available>
> > >> 908224 at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> > >> 908225 at
> > >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> > >> 908226 - waiting to re-lock in wait() <0x00000000fa781138> (a
> > >> org.apache.zookeeper.ClientCnxn$Packet)
> > >> 908227 at
> > >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> > >> 908228 at
> org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
> > >> 908229 at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
> > >> 908230 at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
> > >> Source)
> > >> 908231 at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> > >> Source)
> > >> 908232 at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> > >> 908233 at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> > >> 908234 at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> > >> 908235 at
> > >>
> org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
> > >> 908236 at
> > >>
> org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
> > >> 908237 at
> > >>
> org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
> > >> 908238 at
> > >> org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
> > >> 908239 at
> > >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >> 908240 at
> > >>
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> > >> Source)
> > >> 908241 at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >> 3) SimpleGarbageCollector is also busy getting credentials:
> > >>
> > >> "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
> > >> tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
> > >> 2503 java.lang.Thread.State: RUNNABLE
> > >> 2504 at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> > >> /DigestBase.java:149)
> > >> 2505 at
> > >>
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> > >> /DigestBase.java:144)
> > >> 2506 at
> sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> > >> /DigestBase.java:131)
> > >> 2507 at
> > >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> > >> /MessageDigest.java:623)
> > >> 2508 at java.security.MessageDigest.update(java.base@11.0.11
> > >> /MessageDigest.java:345)
> > >> 2509 at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> > >> 2510 at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >> 2511 at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >> 2512 at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >> 2513 at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >> 2514 at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >> 2515 at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >> 2516 at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >> 2517 at
> > >>
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> > >> 2518 at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> > >> 2519 at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> > >> 2520 at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> > >> 2521 at
> > >>
> org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
> > >> 2522 at
> > >>
> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
> > >> 2523 at
> > >>
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
> > >> 2524 at
> > >>
> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
> > >> 2525 at
> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >> 2526 at
> > >>
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
> > >> Source)
> > >> 2527 at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >>
> > >> 3151 "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s
> > >> tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
> > >> 3152 java.lang.Thread.State: RUNNABLE
> > >> 3153 at java.util.Arrays.hashCode(java.base@11.0.11
> /Arrays.java:4685)
> > >> 3154 at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
> > >> 3155 at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11
> > >> /Provider.java:1107)
> > >> 3156 at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11
> > >> /ConcurrentHashMap.java:936)
> > >> 3157 at java.security.Provider.getService(java.base@11.0.11
> > >> /Provider.java:1282)
> > >> 3158 at sun.security.jca.ProviderList.getService(java.base@11.0.11
> > >> /ProviderList.java:380)
> > >> 3159 at sun.security.jca.GetInstance.getInstance(java.base@11.0.11
> > >> /GetInstance.java:157)
> > >> 3160 at java.security.Security.getImpl(java.base@11.0.11
> > >> /Security.java:700)
> > >> 3161 at java.security.MessageDigest.getInstance(java.base@11.0.11
> > >> /MessageDigest.java:178)
> > >> 3162 at
> > >>
> org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
> > >> 3163 at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
> > >> 3164 at
> > >>
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> > >> 3165 at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> > >> 3166 at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> > >> 3167 at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> > >> 3168 at
> > >>
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> > >> 3169 at
> > >>
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> > >> 3170 at
> > >>
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> > >> 3171 at
> > >>
> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> > >> 3172 at
> > >>
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> > >> 3173 at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> > >> 3174 at
> > >>
> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> > >> 3175 at
> > >>
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> > >> 3176 at
> > >>
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> > >> 3177 at
> > >>
> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
> > >> 3178 at
> > >>
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
> > >> 3179 at
> > >>
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
> > >> 3180 at
> > >>
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
> > >> 3181 at
> > >>
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
> > >> 3182 at
> > >>
> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
> > >> 3183 at
> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> > >> 3184 at
> > >>
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
> > >> Source)
> > >> 3185 at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> > >>
> > >>
> > >> 4) Nothing interesting for Initialize, Main and ZooKeeperServerMain
> > >> processes
> > >>
> > >>
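
For anyone repeating this kind of inspection, the per-process observations
above (threads waiting, threads runnable) can be summarized by tallying the
"java.lang.Thread.State:" lines of a dump. This is a minimal sketch; the
class name ThreadStateTally and the sample lines are illustrative, and it
assumes the dump uses the state-line format shown in the excerpts above.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadStateTally {

  // Matches the state line that follows each thread header in a dump.
  private static final Pattern STATE =
      Pattern.compile("java\\.lang\\.Thread\\.State: (\\w+)");

  /** Counts how many threads are in each state; keys are sorted by state name. */
  public static Map<String, Integer> tally(List<String> dumpLines) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String line : dumpLines) {
      Matcher m = STATE.matcher(line);
      if (m.find()) {
        counts.merge(m.group(1), 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    // Sample lines in the same shape as the dump excerpts quoted in this thread.
    List<String> sample = List.of(
        "878648 java.lang.Thread.State: TIMED_WAITING (on object monitor)",
        "878782 java.lang.Thread.State: WAITING (on object monitor)",
        "908221 java.lang.Thread.State: WAITING (on object monitor)",
        "2503 java.lang.Thread.State: RUNNABLE");
    System.out.println(tally(sample)); // {RUNNABLE=1, TIMED_WAITING=1, WAITING=2}
  }
}
```

Feeding it the lines of a full dump file (for example via Files.readAllLines)
gives a quick picture of whether a process is mostly idle or mostly busy.
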
> > >> I'm not saying that the above are problematic. You know how Accumulo
> > >> works. It is up to you to decide whether something should be improved.
> > >>
> > >> Regards,
> > >> Mark
> > >>
> > >>
> > >> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:
> > >>
> > >>>
> > >>>
> > >>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org>
> wrote:
> > >>>
> > >>>> It looks like the tests are timing out. This happens frequently when
> > >>>> running on resource-constrained systems. You can give the test more
> > >>>> time by increasing the timeout factor: `mvn clean verify
> > >>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
> > >>>> -Dtimeout.factor=3`
> > >>>>
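
As a rough illustration of what a timeout factor does, the sketch below
scales a base timeout by an integer read from the timeout.factor system
property. It is not Accumulo's test harness: the class and method names are
hypothetical, and the 420 s base is simply the limit reported in the failure
output in this thread.

```java
import java.time.Duration;

public class TimeoutFactorSketch {

  /** Parses the factor, falling back to 1 for missing, malformed, or non-positive values. */
  public static int parseFactor(String raw) {
    if (raw == null) {
      return 1;
    }
    try {
      int factor = Integer.parseInt(raw.trim());
      return factor >= 1 ? factor : 1;
    } catch (NumberFormatException e) {
      return 1;
    }
  }

  /** Multiplies the base timeout by the factor. */
  public static Duration scaled(Duration base, int factor) {
    return base.multipliedBy(factor);
  }

  public static void main(String[] args) {
    // Assumption: the factor arrives as -Dtimeout.factor=N on the JVM command line.
    int factor = parseFactor(System.getProperty("timeout.factor"));
    Duration base = Duration.ofSeconds(420); // limit seen in the reported failure
    System.out.println("effective timeout: " + scaled(base, factor).getSeconds() + " s");
  }
}
```

Started with -Dtimeout.factor=3, this reports an effective timeout of 1260 s,
which is the kind of headroom the suggested command gives a slow machine.
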
> > >>>> There's nothing we know of that would change the way our tests work
> > >>>> due to ARM64, but you may have issues because of limited RAM, slow CPU
> > >>>> speeds, slow disk I/O, busy background processes, or other
> > >>>> resource-related issues. I don't think most of the currently active
> > >>>> developers use ARM64, or have access to a test machine to reproduce or
> > >>>>
> > >>>
> > >>> In case anyone wants to test on Linux ARM64, you could easily use Oracle
> > >>> Cloud for free.
> > >>>
> > >>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
> > >>> explains how to create a VM and how to use this VM as a GitHub Actions
> > >>> runner.
> > >>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
> > >>> mentions this article.
> > >>>
> > >>>
> > >>>> experiment with Accumulo there, so you may have to do some of your own
> > >>>> troubleshooting. If you can rule out resource-constraint issues, and
> > >>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
> > >>>> flaky and sometimes times out on x86_64 as well), you could create a
> > >>>> bug ticket with more details at
> > >>>> https://github.com/apache/accumulo/issues ; there is an issue template
> > >>>> specifically for broken and/or flaky tests that you can select when
> > >>>> creating a new ticket.
> > >>>>
> > >>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com>
> > >>>> wrote:
> > >>>> >
> > >>>> > Hi dev1,
> > >>>> >
> > >>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
> > >>>> >
> > >>>> > > Some of those tests are trying to stress conditions that require a
> > >>>> > > lot of resources to replicate specific conditions. Have you tried to
> > >>>> > > run those individual tests in isolation so that you are not competing
> > >>>> > > for resources? Do they always fail, or are the failures transient?
> > >>>> > >
> > >>>> >
> > >>>> > Q: Have you tried to run those individual tests in isolation so that
> > >>>> > you are not competing for resources?
> > >>>> > A: This is what I meant by the following:
> > >>>> > ---------------------
> > >>>> > The tests fail even when executed separately, e.g.:
> > >>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> > >>>> > ---------------------
> > >>>> >
> > >>>> > Q: Do they always fail, or are the failures transient?
> > >>>> > A: I also tried to explain that with "These tests fail consistently at
> > >>>> > every build attempt!"
> > >>>> >
> > >>>> > Mark
> > >>>> >
> > >>>> > >
> > >>>> > > -----Original Message-----
> > >>>> > > From: Mark Jens <ma...@gmail.com>
> > >>>> > > Sent: Tuesday, November 30, 2021 4:05 AM
> > >>>> > > To: dev@accumulo.apache.org
> > >>>> > > Subject: Consistent IT tests failures on Linux ARM64
> > >>>> > >
> > >>>> > > Hello Accumulo community,
> > >>>> > >
> > >>>> > > At my job we consider using Linux ARM64 servers and I've been
> > >>>> tasked to
> > >>>> > > test Accumulo.
> > >>>> > >
> > >>>> > > I face some timeout related issues with several IT tests:
> > >>>> > >
> > >>>> > >
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > >>>> > > Time elapsed: 420.122 s <<< ERROR!
> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
> after
> > >>>> 420
> > >>>> > > seconds at java.base@11.0.11
> /jdk.internal.misc.Unsafe.park(Native
> > >>>> Method)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > >>>> > > Method)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>> > > at java.base@11.0.11
> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >>>> > >
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > >>>> > > Time elapsed: 420.122 s <<< ERROR!
> > >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> > >>>> > > test-SendThread(localhost:44251)
> > >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> > >>>> > > java.base@11.0.11
> > >>>> > >
> /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > >>>> > > at java.base@11.0.11/sun.nio.ch
> > >>>> .SelectorImpl.select(SelectorImpl.java:136)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > >>>> > > at
> > >>>> > >
> > >>>>
> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> > >>>> > >
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
> > >>>> > > Time elapsed: 420.011 s <<< ERROR!
> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
> after
> > >>>> 420
> > >>>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native
> Method)
> > >>>> at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> > >>>> > > at
> > >>>>
> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> > >>>> > > at
> > >>>>
> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > >>>> > > Method)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>> > > at java.base@11.0.11
> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >>>> > >
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.ScannerContextIT
> > >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 102.909 s - in
> org.apache.accumulo.test.functional.ScannerContextIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.KerberosRenewalIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 504.472 s - in
> org.apache.accumulo.test.functional.KerberosRenewalIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.BatchWriterFlushIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 62.132 s - in
> org.apache.accumulo.test.functional.BatchWriterFlushIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.ZookeeperRestartIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 37.37 s - in
> org.apache.accumulo.test.functional.ZookeeperRestartIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 23.046 s - in
> > >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > >>>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 255.108 s - in
> > >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.RestartStressIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 78.359 s - in
> org.apache.accumulo.test.functional.RestartStressIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 59.289 s - in
> > >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> > >>>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.BinaryStressIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.SplitRecoveryIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 25.421 s - in
> org.apache.accumulo.test.functional.SplitRecoveryIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.BigRootTabletIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
> > >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 238.409 s - in
> > >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
> > >>>> > > [INFO] Running
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 219.253 s - in
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
> > >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 489.863 s - in
> > >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
> > >>>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.ManagerFailoverIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 47.089 s - in
> org.apache.accumulo.test.functional.ManagerFailoverIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
> > >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.BackupManagerIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 22.943 s - in
> org.apache.accumulo.test.functional.BackupManagerIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.TabletMetadataIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 46.728 s - in
> org.apache.accumulo.test.functional.TabletMetadataIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.LateLastContactIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 46.648 s - in
> org.apache.accumulo.test.functional.LateLastContactIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 71.934 s - in
> > >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.HalfDeadTServerIT
> > >>>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 307.904 s <<< FAILURE! - in
> > >>>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
> > >>>> > > [ERROR]
> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > >>>> > > Time elapsed: 240.011 s <<< ERROR!
> > >>>> > > org.junit.runners.model.TestTimedOutException: test timed out
> after
> > >>>> 240
> > >>>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native
> Method)
> > >>>> at
> > >>>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
> > >>>> > > at java.base@11.0.11
> > >>>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > >>>> > > Method)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >>>> > > at java.base@11.0.11
> > >>>> > >
> > >>>> > >
> > >>>>
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>> > > at java.base@11.0.11
> > >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >>>> > >
> > >>>> > > [ERROR]
> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > >>>> > > Time elapsed: 240.012 s <<< ERROR!
> > >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> > >>>> > > test-SendThread(localhost:39285)
> > >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> > >>>> > > java.base@11.0.11
> > >>>> > >
> /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > >>>> > > at java.base@11.0.11
> > >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > >>>> > > at java.base@11.0.11/sun.nio.ch
> > >>>> .SelectorImpl.select(SelectorImpl.java:136)
> > >>>> > > at
> > >>>> > >
> > >>>> > >
> > >>>>
> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > >>>> > > at
> > >>>> > >
> > >>>>
> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> > >>>> > >
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
> > >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
> > >>>> > > [INFO] Running
> > >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 43.91 s - in
> > >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
> > >>>> > > [INFO] Running
> org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 113.928 s - in
> org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
> > >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
> > >>>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
> > >>>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time
> > >>>> elapsed:
> > >>>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
> > >>>> > > [INFO] Running
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > >>>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time
> > >>>> elapsed:
> > >>>> > > 0.039 s - in
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > >>>> > > [INFO]
> > >>>> > > [INFO] Results:
> > >>>> > > [INFO]
> > >>>> > > [ERROR] Errors:
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> > >>>> > > [ERROR] Run 1:
> > >>>> > >
> ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178
> > >>>> »
> > >>>> > > TestTimedOut
> > >>>> > > [ERROR] Run 2:
> > >>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> »
> > >>>> Appears
> > >>>> > > to ...
> > >>>> > > [INFO]
> > >>>> > > [ERROR]
> ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
> > >>>> > > TestTimedOut test t...
> > >>>> > > [ERROR]
> > >>>> > >
> > >>>> > >
> > >>>>
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > >>>> > > [ERROR] Run 1:
> > >>>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
> > >>>> TestTimedOut
> > >>>> > > tes...
> > >>>> > > [ERROR] Run 2:
> > >>>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
> > >>>> > > Appears to be stuck...
> > >>>> > > [INFO]
> > >>>> > > [ERROR]
> > >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > >>>> > > [ERROR] Run 1:
> > >>>> > >
> > >>>> > >
> > >>>>
> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
> > >>>> > > » TestTimedOut
> > >>>> > > [ERROR] Run 2: HalfDeadTServerIT.testRecover » Appears to be
> > >>>> stuck in
> > >>>> > > thread Time-limited te...
> > >>>> > > [INFO]
> > >>>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
> > >>>> > > [ERROR] Run 1:
> > >>>> SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
> > >>>> > > TestTimedOut test timed ...
> > >>>> > > [ERROR] Run 2: SslIT.adminStop » Appears to be stuck in
> thread
> > >>>> > > Time-limited test-SendThread(...
> > >>>> > >
> > >>>> > > These tests fail consistently at every build attempt!
> > >>>> > >
> > >>>> > > The tests fail even when executed separately, e.g.:
> > >>>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> > >>>> > >
> > >>>> > >
> > >>>> > > I am using the current 'main' branch of Accumulo.
> > >>>> > > JDK 11.0.11
> > >>>> > > Maven: 3.8.2
> > >>>> > > OS: Ubuntu 20.04.3 ARM64
> > >>>> > >
> > >>>> > > Is there anything that could be done to fix these problems?
> > >>>> > > For example, some config settings?
> > >>>> > >
> > >>>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that
> > >>>> > > Linux ARM64 is a supported platform since the JVM supports it.
> > >>>> > >
> > >>>> > > Thanks!
> > >>>> > >
> > >>>> > > Mark
> > >>>> > >
> > >>>>
> > >>>
>
Re: Consistent IT tests failures on Linux ARM64
Posted by Christopher <ct...@apache.org>.
I don't see any reason it would break anything else, and I'm not opposed
to making a change there to avoid repeated calls to the security provider
to create the credentials, but I strongly doubt that this would
fix the performance problem with that IT. I've seen that test pass
very quickly before, without your change. I think it might be a
coincidence. I think if you were to capture a thread dump at other
times, you wouldn't always see it in that code, but you'd find it busy
doing other work instead. If it does fix it permanently, though, I'd
be pleasantly surprised. Either way, I think we can move forward with
your PR, because it does avoid unnecessary recomputation
of immutable credentials in ServerInfo.
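To make the recomputation point concrete, here is a minimal sketch of the caching idea. It is not the code from the PR: CachedCredentialsSketch, memoize and expensiveToken are hypothetical names, and the SHA-512 loop is only an assumed stand-in for the crypt-style hash that shows up in the thread dumps.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a holder such as ServerInfo; not Accumulo's actual code. */
final class CachedCredentialsSketch {

  /** Counts how often the expensive hash is really computed. */
  static final AtomicInteger COMPUTATIONS = new AtomicInteger();

  /** Wraps a supplier so that its (non-null) value is computed at most once, thread-safely. */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T result = value;
        if (result == null) {
          synchronized (this) {
            result = value;
            if (result == null) {
              result = delegate.get();
              value = result;
            }
          }
        }
        return result;
      }
    };
  }

  /** Deliberately expensive: many SHA-512 rounds, loosely imitating a crypt-style hash. */
  static byte[] expensiveToken(String instanceConfig) {
    COMPUTATIONS.incrementAndGet();
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = instanceConfig.getBytes(StandardCharsets.UTF_8);
      for (int round = 0; round < 5000; round++) {
        state = md.digest(state);
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      // SHA-512 is required on every Java platform, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // The input string is a made-up placeholder, not a real instance config.
    Supplier<byte[]> credentials = memoize(() -> expensiveToken("instance-id:secret"));
    for (int i = 0; i < 1000; i++) {
      credentials.get(); // only the first call pays for the hash
    }
    System.out.println("hash computations: " + COMPUTATIONS.get());
  }
}
```

Caching like this is only safe when the value is immutable for the lifetime of the holder, which is the assumption made about the credentials here. Whether it explains the test timing is a separate question.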
On Thu, Dec 2, 2021 at 7:23 AM Mark Jens <ma...@gmail.com> wrote:
>
> Please review https://github.com/apache/accumulo/pull/2374
> By caching the ServerInfo's Credentials, ConcurrentDeleteTableIT passes
> almost 6 times faster now!
> I am running the whole test suite now to check that it doesn't break
> anything else.
>
> On Thu, 2 Dec 2021 at 13:49, Mark Jens <ma...@gmail.com> wrote:
>
> > Reducing the log output did not reduce the test run time:
> >
> > diff --git test/src/main/resources/log4j2-test.properties
> > test/src/main/resources/log4j2-test.properties
> > index 9124914f7a..810c7bf06f 100644
> > --- test/src/main/resources/log4j2-test.properties
> > +++ test/src/main/resources/log4j2-test.properties
> > @@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
> > appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
> >
> > logger.01.name = org.apache.accumulo.core
> > -logger.01.level = debug
> > +logger.01.level = info
> >
> > logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
> > logger.02.level = info
> > @@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
> > logger.25.level = info
> >
> > logger.26.name = org.apache.hadoop.minikdc
> > -logger.26.level = debug
> > +logger.26.level = info
> >
> >
> > @@ -169,6 +169,6 @@ logger.metrics.level = info
> > logger.metrics.additivity = false
> > logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput
> >
> > -rootLogger.level = debug
> > +rootLogger.level = info
> > rootLogger.appenderRef.console.ref = STDOUT
> >
> > [INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 785.503 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> >
> >
> > On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:
> >
> >> Hi again,
> >>
> >> Here are the thread dumps as promised:
> >>
> >> 1) Both TabletServers are very busy computing SHA-512 hashes when closing
> >> tablets. The following stacks were dumped at ~5-second intervals:
> >>
> >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms
> >> elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable
> >> [0x0000fffe8f3fd000]
> >> java.lang.Thread.State: RUNNABLE
> >> at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11
> >> /SHA5.java:232)
> >> at sun.security.provider.SHA5.implCompress(java.base@11.0.11
> >> /SHA5.java:221)
> >> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:124)
> >> at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >> at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >> at
> >> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >> at
> >> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> >> at
> >> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> >> at
> >> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> >> at
> >> org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
> >> at
> >> org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
> >> at
> >> org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
> >> at
> >> org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
> >> - locked <0x00000000f1585830> (a
> >> org.apache.accumulo.tserver.tablet.Tablet)
> >> at
> >> org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
> >> at
> >> org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
> >> at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >> at
> >> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> >> Source)
> >> at
> >> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> >> /ThreadPoolExecutor.java:1128)
> >> at
> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> >> /ThreadPoolExecutor.java:628)
> >> at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >> at
> >> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> >> Source)
> >> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms
> >> elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable
> >> [0x0000fffe8f3fd000]
> >> java.lang.Thread.State: RUNNABLE
> >> at
> >> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> >> /DigestBase.java:149)
> >> at
> >> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> >> /DigestBase.java:144)
> >> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:131)
> >> at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >> at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >> ...
> >>
> >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms
> >> elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
> >> [0x0000fffe8f3fd000]
> >> java.lang.Thread.State: RUNNABLE
> >> at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
> >> /ByteArrayAccess.java:449)
> >> at sun.security.provider.SHA5.implDigest(java.base@11.0.11
> >> /SHA5.java:131)
> >> at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> >> /DigestBase.java:210)
> >> at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> >> /DigestBase.java:189)
> >> at
> >> java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
> >> /MessageDigest.java:639)
> >> at java.security.MessageDigest.digest(java.base@11.0.11
> >> /MessageDigest.java:385)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >> ...
> >>
> >> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=86499.01ms
> >> elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
> >> [0x0000fffe8f3fd000]
> >> java.lang.Thread.State: RUNNABLE
> >> at
> >> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> >> /DigestBase.java:149)
> >> at
> >> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> >> /DigestBase.java:144)
> >> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:131)
> >> at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >> at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >> ...
> >>
> >> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0 cpu=109551.37ms
> >> elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
> >> [0x0000fffe7bffd000]
> >> java.lang.Thread.State: RUNNABLE
> >> at
> >> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> >> /DigestBase.java:149)
> >> at
> >> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> >> /DigestBase.java:144)
> >> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:131)
> >> at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >> at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> >> at
> >> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >>
> >> Notice that under ClientContext.getProperties(ClientContext.java:236),
> >> ServerInfo.getProperties usually shows up calling
> >> ServerInfo.getPrincipal(ServerInfo.java:148), but in the last dump it
> >> calls ServerInfo.getAuthenticationToken(ServerInfo.java:153).
> >> Both end up in SystemCredentials.get and lead to (a lot of ?!) hashing.
> >>
> >> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
> >> level should not be DEBUG?
> >>
> >> Most of its threads wait either for notifications from ZooKeeper:
> >>
> >> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
> >> cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
> >> Object.wait() [0x0000fffebb7fc000]
> >> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> >> at java.lang.Object.wait(java.base@11.0.11/Native Method)
> >> - waiting on <no object reference available>
> >> at
> >> org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
> >> - waiting to re-lock in wait() <0x00000000f1427458> (a
> >> org.apache.accumulo.fate.ZooStore)
> >> at
> >> org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
> >> at
> >> org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
> >> at
> >> org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
> >> at
> >> org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
> >> at
> >> org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
> >> ...
> >>
> >> or wait for data:
> >> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms
> >> elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
> >> [0x0000fffebadfd000]
> >> java.lang.Thread.State: WAITING (on object monitor)
> >> at java.lang.Object.wait(java.base@11.0.11/Native Method)
> >> - waiting on <no object reference available>
> >> at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> >> at
> >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> >> - waiting to re-lock in wait() <0x00000000f9bf42d8> (a
> >> org.apache.zookeeper.ClientCnxn$Packet)
> >> at
> >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> >> at
> >> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
> >> Source)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> >> Source)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
> >> at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
> >> at
> >> org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
> >> at
> >> org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
> >> at
> >> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
> >> at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >> at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> >> Source)
> >> at
> >> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> >> /ThreadPoolExecutor.java:1128)
> >> at
> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> >> /ThreadPoolExecutor.java:628)
> >> at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >> at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> >> Source)
> >> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >> "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
> >> elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
> >> [0x0000ffff20f50000]
> >> java.lang.Thread.State: WAITING (on object monitor)
> >> at java.lang.Object.wait(java.base@11.0.11/Native Method)
> >> - waiting on <no object reference available>
> >> at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> >> at
> >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> >> - waiting to re-lock in wait() <0x00000000fa781138> (a
> >> org.apache.zookeeper.ClientCnxn$Packet)
> >> at
> >> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> >> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
> >> Source)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> >> Source)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> >> at
> >> org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
> >> at
> >> org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
> >> at
> >> org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
> >> at
> >> org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
> >> at
> >> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >> at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> >> Source)
> >> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >> 3) SimpleGarbageCollector is also busy getting credentials:
> >>
> >> "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
> >> tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
> >> java.lang.Thread.State: RUNNABLE
> >> at
> >> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> >> /DigestBase.java:149)
> >> at
> >> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> >> /DigestBase.java:144)
> >> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> >> /DigestBase.java:131)
> >> at
> >> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> >> /MessageDigest.java:623)
> >> at java.security.MessageDigest.update(java.base@11.0.11
> >> /MessageDigest.java:345)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >> at
> >> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >> at
> >> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> >> at
> >> org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
> >> at
> >> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
> >> at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
> >> at
> >> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
> >> at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >> at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
> >> Source)
> >> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >>
> >> "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s
> >> tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
> >> java.lang.Thread.State: RUNNABLE
> >> at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
> >> at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
> >> at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11
> >> /Provider.java:1107)
> >> at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11
> >> /ConcurrentHashMap.java:936)
> >> at java.security.Provider.getService(java.base@11.0.11
> >> /Provider.java:1282)
> >> at sun.security.jca.ProviderList.getService(java.base@11.0.11
> >> /ProviderList.java:380)
> >> at sun.security.jca.GetInstance.getInstance(java.base@11.0.11
> >> /GetInstance.java:157)
> >> at java.security.Security.getImpl(java.base@11.0.11
> >> /Security.java:700)
> >> at java.security.MessageDigest.getInstance(java.base@11.0.11
> >> /MessageDigest.java:178)
> >> at
> >> org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
> >> at
> >> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >> at
> >> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> >> at
> >> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> >> at
> >> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >> at
> >> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> >> at
> >> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> >> at
> >> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> >> at
> >> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
> >> at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
> >> at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
> >> at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
> >> at
> >> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
> >> at
> >> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
> >> at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >> at
> >> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
> >> Source)
> >> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >>
> >> 4) Nothing interesting for Initialize, Main and ZooKeeperServerMain
> >> processes
> >>
> >>
> >> I'm not saying that the above are problematic. You know how Accumulo
> >> works. It is up to you to decide whether something should be improved.
> >>
> >> Regards,
> >> Mark
> >>
> >>
> >> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:
> >>
> >>>
> >>>
> >>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
> >>>
> >>>> It looks like the tests are timing out. This happens frequently when
> >>>> running on resource-constrained systems. You can give the test more
> >>>> time by increasing the timeout factor: `mvn clean verify
> >>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
> >>>> -Dtimeout.factor=3`
> >>>>
> >>>> There's nothing we know of that would change the way our tests work
> >>>> due to ARM64, but you may have issues because of limited RAM, slow CPU
> >>>> speeds, slow disk I/O, busy background processes, or other
> >>>> resource-related issues. I don't think most of the currently active
> >>>> developers use ARM64, or have access to a test machine to reproduce or
> >>>>
> >>>
> >>> In case anyone wants to test on Linux ARM64 you could easily use Oracle
> >>> Cloud for free.
> >>>
> >>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
> >>> explains how to create a VM and how to use this VM as a Github Actions
> >>> runner.
> >>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
> >>> mentions this article.
> >>>
> >>>
> >>>> experiment with Accumulo there, so you may have to do some of your own
> >>>> troubleshooting. If you can rule out resource-constraint issues, and
> >>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
> >>>> flaky and sometimes times out on x86_64 as well), you could create a
> >>>> bug ticket with more details at
> >>>> https://github.com/apache/accumulo/issues ; there is an issue template
> >>>> specifically for broken and/or flaky tests that you can select when
> >>>> creating a new ticket.
> >>>>
> >>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com>
> >>>> wrote:
> >>>> >
> >>>> > Hi dev1,
> >>>> >
> >>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
> >>>> >
> >>>> > > Some of those tests are trying to stress conditions that require a
> >>>> lot of
> >>>> > > resources to replicate specific conditions. Have you tried to run
> >>>> those
> >>>> > > individual tests in isolation so that you are not competing for
> >>>> resources?
> >>>> > > Do they always fail, or are the failures transient?
> >>>> > >
> >>>> >
> >>>> > Q: Have you tried to run those individual tests in isolation so that
> >>>> you
> >>>> > are not competing for resources?
> >>>> > A: This is what I mean with the following:
> >>>> > ---------------------
> >>>> > The tests fail even when executed separately, e.g.:
> >>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> >>>> > ---------------------
> >>>> >
> >>>> > Q: Do they always fail, or are the failures transient?
> >>>> > A: I also tried to explain that with "These tests fail consistently at
> >>>> > every build attempt!"
> >>>> >
> >>>> > Mark
> >>>> >
> >>>> > >
> >>>> > > -----Original Message-----
> >>>> > > From: Mark Jens <ma...@gmail.com>
> >>>> > > Sent: Tuesday, November 30, 2021 4:05 AM
> >>>> > > To: dev@accumulo.apache.org
> >>>> > > Subject: Consistent IT tests failures on Linux ARM64
> >>>> > >
> >>>> > > Hello Accumulo community,
> >>>> > >
> >>>> > > At my job we consider using Linux ARM64 servers and I've been
> >>>> tasked to
> >>>> > > test Accumulo.
> >>>> > >
> >>>> > > I face some timeout related issues with several IT tests:
> >>>> > >
> >>>> > >
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> >>>> > > Time elapsed: 420.122 s <<< ERROR!
> >>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
> >>>> 420
> >>>> > > seconds at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native
> >>>> Method)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> >>>> > > at java.base@11.0.11
> >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>> > > Method)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>> > > at java.base@11.0.11
> >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >>>> > >
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> >>>> > > Time elapsed: 420.122 s <<< ERROR!
> >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> >>>> > > test-SendThread(localhost:44251)
> >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> >>>> > > java.base@11.0.11
> >>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> >>>> > > at java.base@11.0.11
> >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> >>>> > > at java.base@11.0.11/sun.nio.ch
> >>>> .SelectorImpl.select(SelectorImpl.java:136)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> >>>> > > at
> >>>> > >
> >>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> >>>> > >
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
> >>>> > > Time elapsed: 420.011 s <<< ERROR!
> >>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
> >>>> 420
> >>>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
> >>>> at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> >>>> > > at
> >>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> >>>> > > at
> >>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> >>>> > > at java.base@11.0.11
> >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>> > > Method)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>> > > at java.base@11.0.11
> >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >>>> > >
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
> >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.BatchWriterFlushIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.ZookeeperRestartIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 23.046 s - in
> >>>> org.apache.accumulo.test.functional.CreateManyScannersIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> >>>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 255.108 s - in
> >>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 59.289 s - in
> >>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> >>>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
> >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 238.409 s - in
> >>>> org.apache.accumulo.test.functional.GarbageCollectorIT
> >>>> > > [INFO] Running
> >>>> > >
> >>>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 219.253 s - in
> >>>> > >
> >>>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
> >>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 489.863 s - in
> >>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
> >>>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
> >>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 71.934 s - in
> >>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
> >>>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 307.904 s <<< FAILURE! - in
> >>>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
> >>>> > > [ERROR]
> >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> >>>> > > Time elapsed: 240.011 s <<< ERROR!
> >>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
> >>>> 240
> >>>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native Method)
> >>>> at
> >>>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
> >>>> > > at java.base@11.0.11
> >>>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
> >>>> > > at java.base@11.0.11
> >>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>> > > Method)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>>> > > at java.base@11.0.11
> >>>> > >
> >>>> > >
> >>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>> > > at java.base@11.0.11
> >>>> /java.lang.reflect.Method.invoke(Method.java:566)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> >>>> > > at java.base@11.0.11
> >>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >>>> > >
> >>>> > > [ERROR]
> >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> >>>> > > Time elapsed: 240.012 s <<< ERROR!
> >>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> >>>> > > test-SendThread(localhost:39285)
> >>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> >>>> > > java.base@11.0.11
> >>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> >>>> > > at java.base@11.0.11
> >>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> >>>> > > at java.base@11.0.11/sun.nio.ch
> >>>> .SelectorImpl.select(SelectorImpl.java:136)
> >>>> > > at
> >>>> > >
> >>>> > >
> >>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> >>>> > > at
> >>>> > >
> >>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> >>>> > >
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
> >>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
> >>>> > > [INFO] Running
> >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 43.91 s - in
> >>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
> >>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
> >>>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
> >>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
> >>>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
> >>>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time
> >>>> elapsed:
> >>>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
> >>>> > > [INFO] Running
> >>>> > >
> >>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> >>>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time
> >>>> elapsed:
> >>>> > > 0.039 s - in
> >>>> > >
> >>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> >>>> > > [INFO]
> >>>> > > [INFO] Results:
> >>>> > > [INFO]
> >>>> > > [ERROR] Errors:
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> >>>> > > [ERROR] Run 1:
> >>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178
> >>>> »
> >>>> > > TestTimedOut
> >>>> > > [ERROR] Run 2:
> >>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »
> >>>> Appears
> >>>> > > to ...
> >>>> > > [INFO]
> >>>> > > [ERROR] ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
> >>>> > > TestTimedOut test t...
> >>>> > > [ERROR]
> >>>> > >
> >>>> > >
> >>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> >>>> > > [ERROR] Run 1:
> >>>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
> >>>> TestTimedOut
> >>>> > > tes...
> >>>> > > [ERROR] Run 2:
> >>>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
> >>>> > > Appears to be stuck...
> >>>> > > [INFO]
> >>>> > > [ERROR]
> >>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> >>>> > > [ERROR] Run 1:
> >>>> > >
> >>>> > >
> >>>> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
> >>>> > > » TestTimedOut
> >>>> > > [ERROR] Run 2: HalfDeadTServerIT.testRecover » Appears to be
> >>>> stuck in
> >>>> > > thread Time-limited te...
> >>>> > > [INFO]
> >>>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
> >>>> > > [ERROR] Run 1:
> >>>> SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
> >>>> > > TestTimedOut test timed ...
> >>>> > > [ERROR] Run 2: SslIT.adminStop » Appears to be stuck in thread
> >>>> > > Time-limited test-SendThread(...
> >>>> > >
> >>>> > > These tests fail consistently at every build attempt!
> >>>> > >
> >>>> > > The tests fail even when executed separately, e.g.:
> >>>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> >>>> > >
> >>>> > >
> >>>> > > I am using the current 'main' branch of Accumulo.
> >>>> > > JDK 11.0.11
> >>>> > > Maven: 3.8.2
> >>>> > > OS: Ubuntu 20.04.3 ARM64
> >>>> > >
> >>>> > > Is there anything that could be done to fix these problems ?
> >>>> > > For example some config settings ?!
> >>>> > >
> >>>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that
> >>>> Linux
> >>>> > > ARM64 is a supported platform since the JVM supports it.
> >>>> > >
> >>>> > > Thanks!
> >>>> > >
> >>>> > > Mark
> >>>> > >
> >>>>
> >>>
Re: Consistent IT tests failures on Linux ARM64
Posted by Mark Jens <ma...@gmail.com>.
Please review https://github.com/apache/accumulo/pull/2374
With the ServerInfo's Credentials cached, ConcurrentDeleteTableIT now passes
almost 6 times faster!
I am running the whole test suite now to check that the change doesn't break
anything else.
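To illustrate the idea (this is not the actual change in the pull request), here
is a minimal sketch of caching an expensive credential hash so that it is
computed once instead of on every call. The class CachedCredentialsSketch, the
methods expensiveHash() and memoize(), and the round count are hypothetical
stand-ins; only the JDK classes used are real.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class CachedCredentialsSketch {

  /** Counts how often the expensive hash is actually computed. */
  static final AtomicInteger computations = new AtomicInteger();

  /**
   * Hypothetical stand-in for an expensive derivation such as the SHA-512
   * crypt seen in the thread dumps. Not Accumulo code.
   */
  static byte[] expensiveHash(String instanceConfig) {
    computations.incrementAndGet();
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] data = instanceConfig.getBytes(StandardCharsets.UTF_8);
      // Repeated rounds imitate the cost of a crypt-style hash.
      for (int i = 0; i < 5000; i++) {
        data = md.digest(data);
      }
      return data;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Thread-safe memoizing supplier: computes on first call, then reuses the value. */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T result = value;
        if (result == null) {
          synchronized (this) {
            result = value;
            if (result == null) {
              result = delegate.get();
              value = result;
            }
          }
        }
        return result;
      }
    };
  }

  public static void main(String[] args) {
    Supplier<byte[]> credentials =
        memoize(() -> expensiveHash("instance-id+site-config"));
    // Many callers ask for the credentials; only the first call pays for the hash.
    for (int i = 0; i < 1000; i++) {
      credentials.get();
    }
    System.out.println("hash computations: " + computations.get());
  }
}
```

In this sketch the cached value is never invalidated, which is only reasonable
if the hashed inputs cannot change during the lifetime of the process. Whether
that holds for the real ServerInfo is something the review has to confirm.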
On Thu, 2 Dec 2021 at 13:49, Mark Jens <ma...@gmail.com> wrote:
> Reducing the log output did not reduce the test run time:
>
> diff --git test/src/main/resources/log4j2-test.properties
> test/src/main/resources/log4j2-test.properties
> index 9124914f7a..810c7bf06f 100644
> --- test/src/main/resources/log4j2-test.properties
> +++ test/src/main/resources/log4j2-test.properties
> @@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
> appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
>
> logger.01.name = org.apache.accumulo.core
> -logger.01.level = debug
> +logger.01.level = info
>
> logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
> logger.02.level = info
> @@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
> logger.25.level = info
>
> logger.26.name = org.apache.hadoop.minikdc
> -logger.26.level = debug
> +logger.26.level = info
>
>
> @@ -169,6 +169,6 @@ logger.metrics.level = info
> logger.metrics.additivity = false
> logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput
>
> -rootLogger.level = debug
> +rootLogger.level = info
> rootLogger.appenderRef.console.ref = STDOUT
>
> [INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 785.503 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
>
>
> On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:
>
>> Hi again,
>>
>> Here are the thread dumps as promised:
>>
>> 1) Both TabletServers are very busy hashing (SHA-512 compress) at tablet
>> close time. The following stacks were dumped at ~5 second intervals:
>>
>> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms
>> elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable
>> [0x0000fffe8f3fd000]
>> java.lang.Thread.State: RUNNABLE
>> at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11
>> /SHA5.java:232)
>> at sun.security.provider.SHA5.implCompress(java.base@11.0.11
>> /SHA5.java:221)
>> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:124)
>> at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>> at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> at
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> at
>> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>> at
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>> at
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>> at
>> org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
>> at
>> org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
>> at
>> org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
>> at
>> org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
>> - locked <0x00000000f1585830> (a
>> org.apache.accumulo.tserver.tablet.Tablet)
>> at
>> org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
>> at
>> org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
>> at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> at
>> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
>> Source)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
>> /ThreadPoolExecutor.java:1128)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
>> /ThreadPoolExecutor.java:628)
>> at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> at
>> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
>> Source)
>> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms
>> elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable
>> [0x0000fffe8f3fd000]
>> java.lang.Thread.State: RUNNABLE
>> at
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> /DigestBase.java:149)
>> at
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> /DigestBase.java:144)
>> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:131)
>> at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>> at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> ...
>>
>> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms
>> elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
>> [0x0000fffe8f3fd000]
>> java.lang.Thread.State: RUNNABLE
>> at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
>> /ByteArrayAccess.java:449)
>> at sun.security.provider.SHA5.implDigest(java.base@11.0.11
>> /SHA5.java:131)
>> at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
>> /DigestBase.java:210)
>> at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
>> /DigestBase.java:189)
>> at
>> java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
>> /MessageDigest.java:639)
>> at java.security.MessageDigest.digest(java.base@11.0.11
>> /MessageDigest.java:385)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> ...
>>
>> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=86499.01ms
>> elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
>> [0x0000fffe8f3fd000]
>> java.lang.Thread.State: RUNNABLE
>> at
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> /DigestBase.java:149)
>> at
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> /DigestBase.java:144)
>> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:131)
>> at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>> at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> ...
>>
>> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0 cpu=109551.37ms
>> elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
>> [0x0000fffe7bffd000]
>> java.lang.Thread.State: RUNNABLE
>> at
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> /DigestBase.java:149)
>> at
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> /DigestBase.java:144)
>> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:131)
>> at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>> at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> at
>> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>> at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>> at
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>>
>> Notice that ServerInfo.getProperties, called from
>> ClientContext.getProperties(ClientContext.java:236), most of the time goes
>> through ServerInfo.getPrincipal(ServerInfo.java:148), but in the last dump
>> it goes through ServerInfo.getAuthenticationToken(ServerInfo.java:153).
>> Both paths lead to (a lot of?!) SHA-512 hashing in Sha2Crypt.
>>
>> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
>> level should not be DEBUG?!
>>
>> Most of its threads either wait for notifications from ZooKeeper:
>>
>> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
>> cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
>> Object.wait() [0x0000fffebb7fc000]
>> java.lang.Thread.State: TIMED_WAITING (on object monitor)
>> at java.lang.Object.wait(java.base@11.0.11/Native Method)
>> - waiting on <no object reference available>
>> at
>> org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
>> - waiting to re-lock in wait() <0x00000000f1427458> (a
>> org.apache.accumulo.fate.ZooStore)
>> at
>> org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
>> at
>> org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
>> at
>> org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
>> at
>> org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
>> at
>> org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
>> ...
>>
>> or wait for data:
>> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms
>> elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
>> [0x0000fffebadfd000]
>> java.lang.Thread.State: WAITING (on object monitor)
>> at java.lang.Object.wait(java.base@11.0.11/Native Method)
>> - waiting on <no object reference available>
>> at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
>> at
>> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>> - waiting to re-lock in wait() <0x00000000f9bf42d8> (a
>> org.apache.zookeeper.ClientCnxn$Packet)
>> at
>> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>> at
>> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
>> Source)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
>> Source)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
>> at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
>> at
>> org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
>> at
>> org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
>> at
>> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
>> at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
>> Source)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
>> /ThreadPoolExecutor.java:1128)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
>> /ThreadPoolExecutor.java:628)
>> at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
>> Source)
>> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>> "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
>> elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
>> [0x0000ffff20f50000]
>> java.lang.Thread.State: WAITING (on object monitor)
>> at java.lang.Object.wait(java.base@11.0.11/Native Method)
>> - waiting on <no object reference available>
>> at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
>> at
>> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
>> - waiting to re-lock in wait() <0x00000000fa781138> (a
>> org.apache.zookeeper.ClientCnxn$Packet)
>> at
>> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
>> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
>> Source)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
>> Source)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
>> at
>> org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
>> at
>> org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
>> at
>> org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
>> at
>> org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
>> at
>> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
>> Source)
>> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>> 3) SimpleGarbageCollector is also busy getting credentials:
>>
>> "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
>> tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
>> java.lang.Thread.State: RUNNABLE
>> at
>> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
>> /DigestBase.java:149)
>> at
>> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
>> /DigestBase.java:144)
>> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
>> /DigestBase.java:131)
>> at
>> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
>> /MessageDigest.java:623)
>> at java.security.MessageDigest.update(java.base@11.0.11
>> /MessageDigest.java:345)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> at
>> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
>> at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
>> at
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> at
>> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>> at
>> org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
>> at
>> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
>> at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
>> at
>> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
>> at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
>> Source)
>> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>>
>> "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s
>> tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
>> java.lang.Thread.State: RUNNABLE
>> at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
>> at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
>> at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11
>> /Provider.java:1107)
>> at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11
>> /ConcurrentHashMap.java:936)
>> at java.security.Provider.getService(java.base@11.0.11
>> /Provider.java:1282)
>> at sun.security.jca.ProviderList.getService(java.base@11.0.11
>> /ProviderList.java:380)
>> at sun.security.jca.GetInstance.getInstance(java.base@11.0.11
>> /GetInstance.java:157)
>> at java.security.Security.getImpl(java.base@11.0.11
>> /Security.java:700)
>> at java.security.MessageDigest.getInstance(java.base@11.0.11
>> /MessageDigest.java:178)
>> at
>> org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
>> at
>> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>> at
>> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>> at
>> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>> at
>> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>> at
>> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>> at
>> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>> at
>> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>> at
>> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>> at
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>> at
>> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>> at
>> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
>> at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
>> at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
>> at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
>> at
>> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
>> at
>> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
>> at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>> at
>> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
>> Source)
>> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>>
>>
>> 4) Nothing interesting for Initialize, Main and ZooKeeperServerMain
>> processes
>>
>>
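Each JDK 11 thread-dump header in items 1) to 3) carries a cpu= field (CPU time the thread has consumed) and an elapsed= field (wall-clock time since the thread started). Dividing the first by the second gives a rough measure of how busy a thread has been. The following is only a minimal sketch of that arithmetic: the class and method names are made up for illustration, and the header strings are copied from the dumps in this message with the wrapped lines joined.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadCpuShare {

  // Matches the cpu=<ms>ms and elapsed=<s>s fields of a thread-dump header line.
  private static final Pattern HEADER =
      Pattern.compile("cpu=([0-9.]+)ms\\s+elapsed=([0-9.]+)s");

  // Returns CPU time divided by wall-clock lifetime, or -1 if the fields are missing.
  public static double cpuShare(String header) {
    Matcher m = HEADER.matcher(header);
    if (!m.find()) {
      return -1;
    }
    double cpuMs = Double.parseDouble(m.group(1));
    double elapsedMs = Double.parseDouble(m.group(2)) * 1000.0;
    return elapsedMs == 0 ? -1 : cpuMs / elapsedMs;
  }

  public static void main(String[] args) {
    String[] headers = {
      "\"tablet migration-Worker-1\" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms elapsed=75.42s",
      "\"gc\" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s",
      "\"Manager-ClientPool-Worker-3\" #61 daemon prio=5 os_prio=0 cpu=375.95ms elapsed=182.38s"
    };
    for (String h : headers) {
      // Print the share as a percentage followed by the header it came from.
      System.out.printf(Locale.ROOT, "%.1f%%  %s%n", cpuShare(h) * 100, h);
    }
  }
}
```

With the values from the dumps, the first tablet migration worker comes out at roughly 91% of its lifetime on CPU, the gc thread at about 7%, and the waiting Manager thread at well under 1%.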
>> I'm not saying that the above are problematic. You know how Accumulo
>> works. It is up to you to decide whether something should be improved.
>>
>> Regards,
>> Mark
>>
>>
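The dumps in items 1) and 3) suggest that the path SystemCredentials.get -> SystemToken.generate -> hashInstanceConfigs -> Crypt.crypt is taken again each time a scanner is created. If that is the case, one generic way to avoid repeating an expensive hash is to compute it once and reuse the result. The sketch below is not Accumulo code and not a proposed patch: the class, the memoize helper and the iterated SHA-512 loop (a stand-in for the commons-codec crypt call, with an arbitrary round count) are all assumptions made for illustration. Whether the real value may safely be cached, for example if the instance configuration can change, is a question for the developers.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class CachedTokenSketch {

  // Counts how often the expensive hash actually runs.
  public static final AtomicInteger HASH_RUNS = new AtomicInteger();

  // Hypothetical stand-in for the expensive crypt() call: many SHA-512 rounds.
  public static byte[] expensiveHash(String input) {
    HASH_RUNS.incrementAndGet();
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = input.getBytes(StandardCharsets.UTF_8);
      for (int i = 0; i < 5000; i++) { // round count chosen arbitrarily for the sketch
        state = md.digest(state); // digest() resets md, so it can be reused
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available in this JVM", e);
    }
  }

  // Wraps a supplier so the delegate runs at most once; later calls reuse the value.
  public static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private T value;
      private boolean computed;

      @Override
      public synchronized T get() {
        if (!computed) {
          value = delegate.get();
          computed = true;
        }
        return value;
      }
    };
  }

  public static void main(String[] args) {
    Supplier<byte[]> token = memoize(() -> expensiveHash("instance-config"));
    for (int i = 0; i < 100; i++) {
      token.get(); // e.g. once per scanner creation
    }
    System.out.println("hash computed " + HASH_RUNS.get() + " time(s)");
  }
}
```

The synchronized getter keeps the sketch simple and thread-safe; a real fix would also have to decide when the cached value must be invalidated.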
>> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
>>>
>>>> It looks like the tests are timing out. This happens frequently when
>>>> running on resource-constrained systems. You can give the test more
>>>> time by increasing the timeout factor: `mvn clean verify
>>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
>>>> -Dtimeout.factor=3`
>>>>
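As a rough illustration of what a timeout factor does: a per-test base timeout is multiplied by the factor passed on the command line. The sketch below is not the Accumulo test harness. Only the property name timeout.factor comes from the command above; the class name, the fallback behaviour and the 60-second base are hypothetical.

```java
import java.time.Duration;

public class TimeoutFactorSketch {

  // Parses a factor such as "3"; falls back to 1 when unset, malformed, or below 1.
  public static int parseFactor(String raw) {
    if (raw == null) {
      return 1;
    }
    try {
      return Math.max(1, Integer.parseInt(raw.trim()));
    } catch (NumberFormatException e) {
      return 1;
    }
  }

  // Scales a base timeout by the given factor.
  public static Duration scaledTimeout(Duration base, int factor) {
    return base.multipliedBy(factor);
  }

  public static void main(String[] args) {
    // Reads -Dtimeout.factor=... if it was passed to the JVM.
    int factor = parseFactor(System.getProperty("timeout.factor"));
    Duration base = Duration.ofSeconds(60); // hypothetical base timeout
    System.out.println("effective timeout: " + scaledTimeout(base, factor).getSeconds() + " s");
  }
}
```

Running it with `java -Dtimeout.factor=3 TimeoutFactorSketch` would scale the hypothetical 60-second base to 180 seconds.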
>>>> There's nothing we know of that would change the way our tests work
>>>> due to ARM64, but you may have issues because of limited RAM, slow CPU
>>>> speeds, slow disk I/O, busy background processes, or other
>>>> resource-related issues. I don't think most of the currently active
>>>> developers use ARM64, or have access to a test machine to reproduce or
>>>>
>>>
>>> In case anyone wants to test on Linux ARM64 you could easily use Oracle
>>> Cloud for free.
>>>
>>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
>>> explains how to create a VM and how to use this VM as a Github Actions
>>> runner.
>>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
>>> mentions this article.
>>>
>>>
>>>> experiment with Accumulo there, so you may have to do some of your own
>>>> troubleshooting. If you can rule out resource-constraint issues, and
>>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
>>>> flaky and sometimes times out on x86_64 as well), you could create a
>>>> bug ticket with more details at
>>>> https://github.com/apache/accumulo/issues ; there is an issue template
>>>> specifically for broken and/or flaky tests that you can select when
>>>> creating a new ticket.
>>>>
>>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Hi dev1,
>>>> >
>>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>>>> >
>>>> > > Some of those tests are trying to stress conditions that require a
>>>> lot of
>>>> > > resources to replicate specific conditions. Have you tried to run
>>>> those
>>>> > > individual tests in isolation so that you are not competing for
>>>> resources?
>>>> > > Do they always fail, or are the failures transient?
>>>> > >
>>>> >
>>>> > Q: Have you tried to run those individual tests in isolation so that
>>>> you
>>>> > are not competing for resources?
>>>> > A: This is what I meant by the following:
>>>> > ---------------------
>>>> > The tests fail even when executed separately, e.g.:
>>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>>>> > ---------------------
>>>> >
>>>> > Q: Do they always fail, or are the failures transient?
>>>> > A: I also tried to explain that with "These tests fail consistently at
>>>> > every build attempt!"
>>>> >
>>>> > Mark
>>>> >
>>>> > >
>>>> > > -----Original Message-----
>>>> > > From: Mark Jens <ma...@gmail.com>
>>>> > > Sent: Tuesday, November 30, 2021 4:05 AM
>>>> > > To: dev@accumulo.apache.org
>>>> > > Subject: Consistent IT tests failures on Linux ARM64
>>>> > >
>>>> > > Hello Accumulo community,
>>>> > >
>>>> > > At my job we consider using Linux ARM64 servers and I've been
>>>> tasked to
>>>> > > test Accumulo.
>>>> > >
>>>> > > I face some timeout related issues with several IT tests:
>>>> > >
>>>> > >
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>>>> > > Time elapsed: 420.122 s <<< ERROR!
>>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>>>> 420
>>>> > > seconds at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native
>>>> Method)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
>>>> > > at java.base@11.0.11
>>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>> > > Method)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> > > at java.base@11.0.11
>>>> /java.lang.reflect.Method.invoke(Method.java:566)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>>>> > >
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>>>> > > Time elapsed: 420.122 s <<< ERROR!
>>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>>>> > > test-SendThread(localhost:44251)
>>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>>>> > > java.base@11.0.11
>>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>>>> > > at java.base@11.0.11
>>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>>>> > > at java.base@11.0.11/sun.nio.ch
>>>> .SelectorImpl.select(SelectorImpl.java:136)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>>>> > > at
>>>> > >
>>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>>>> > >
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
>>>> > > Time elapsed: 420.011 s <<< ERROR!
>>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>>>> 420
>>>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
>>>> at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
>>>> > > at
>>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
>>>> > > at
>>>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
>>>> > > at java.base@11.0.11
>>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>> > > Method)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> > > at java.base@11.0.11
>>>> /java.lang.reflect.Method.invoke(Method.java:566)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>>>> > >
>>>> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
>>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.BatchWriterFlushIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.ZookeeperRestartIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 23.046 s - in
>>>> org.apache.accumulo.test.functional.CreateManyScannersIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>>>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 255.108 s - in
>>>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 59.289 s - in
>>>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
>>>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.GarbageCollectorIT
>>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 238.409 s - in
>>>> org.apache.accumulo.test.functional.GarbageCollectorIT
>>>> > > [INFO] Running
>>>> > >
>>>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 219.253 s - in
>>>> > >
>>>> org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>>>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 489.863 s - in
>>>> org.apache.accumulo.test.functional.SslWithClientAuthIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
>>>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
>>>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 71.934 s - in
>>>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
>>>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
>>>> elapsed:
>>>> > > 307.904 s <<< FAILURE! - in
>>>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
>>>> > > [ERROR]
>>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>>>> > > Time elapsed: 240.011 s <<< ERROR!
>>>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>>>> 240
>>>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native Method)
>>>> at
>>>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
>>>> > > at java.base@11.0.11
>>>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
>>>> > > at java.base@11.0.11
>>>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>> > > Method)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>> > > at java.base@11.0.11
>>>> > >
>>>> > >
>>>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> > > at java.base@11.0.11
>>>> /java.lang.reflect.Method.invoke(Method.java:566)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>>>> > > at java.base@11.0.11
>>>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>>>> > >
>>>> > > [ERROR]
>>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>>>> > > Time elapsed: 240.012 s <<< ERROR!
>>>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>>>> > > test-SendThread(localhost:39285)
>>>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>>>> > > java.base@11.0.11
>>>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>>>> > > at java.base@11.0.11
>>>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>>>> > > at java.base@11.0.11/sun.nio.ch
>>>> .SelectorImpl.select(SelectorImpl.java:136)
>>>> > > at
>>>> > >
>>>> > >
>>>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>>>> > > at
>>>> > >
>>>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>>>> > >
>>>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
>>>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
>>>> > > [INFO] Running
>>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 43.91 s - in
>>>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
>>>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
>>>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
>>>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
>>>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
>>>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time
>>>> elapsed:
>>>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
>>>> > > [INFO] Running
>>>> > >
>>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>>>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time
>>>> elapsed:
>>>> > > 0.039 s - in
>>>> > >
>>>> org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>>>> > > [INFO]
>>>> > > [INFO] Results:
>>>> > > [INFO]
>>>> > > [ERROR] Errors:
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
>>>> > > [ERROR] Run 1:
>>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178
>>>> »
>>>> > > TestTimedOut
>>>> > > [ERROR] Run 2:
>>>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »
>>>> Appears
>>>> > > to ...
>>>> > > [INFO]
>>>> > > [ERROR] ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
>>>> > > TestTimedOut test t...
>>>> > > [ERROR]
>>>> > >
>>>> > >
>>>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>>>> > > [ERROR] Run 1:
>>>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
>>>> TestTimedOut
>>>> > > tes...
>>>> > > [ERROR] Run 2:
>>>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
>>>> > > Appears to be stuck...
>>>> > > [INFO]
>>>> > > [ERROR]
>>>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>>>> > > [ERROR] Run 1:
>>>> > >
>>>> > >
>>>> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
>>>> > > » TestTimedOut
>>>> > > [ERROR] Run 2: HalfDeadTServerIT.testRecover » Appears to be
>>>> stuck in
>>>> > > thread Time-limited te...
>>>> > > [INFO]
>>>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
>>>> > > [ERROR] Run 1:
>>>> SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
>>>> > > TestTimedOut test timed ...
>>>> > > [ERROR] Run 2: SslIT.adminStop » Appears to be stuck in thread
>>>> > > Time-limited test-SendThread(...
>>>> > >
>>>> > > These tests fail consistently at every build attempt!
>>>> > >
>>>> > > The tests fail even when executed separately, e.g.:
>>>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>>>> > >
>>>> > >
>>>> > > I am using the current 'main' branch of Accumulo.
>>>> > > JDK 11.0.11
>>>> > > Maven: 3.8.2
>>>> > > OS: Ubuntu 20.04.3 ARM64
>>>> > >
>>>> > > Is there anything that could be done to fix these problems ?
>>>> > > For example some config settings ?!
>>>> > >
>>>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that
>>>> Linux
>>>> > > ARM64 is a supported platform since the JVM supports it.
>>>> > >
>>>> > > Thanks!
>>>> > >
>>>> > > Mark
>>>> > >
>>>>
>>>
Re: Consistent IT tests failures on Linux ARM64
Posted by Mark Jens <ma...@gmail.com>.
Reducing the log output did not reduce the test run time:
diff --git test/src/main/resources/log4j2-test.properties test/src/main/resources/log4j2-test.properties
index 9124914f7a..810c7bf06f 100644
--- test/src/main/resources/log4j2-test.properties
+++ test/src/main/resources/log4j2-test.properties
@@ -28,7 +28,7 @@ appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
logger.01.name = org.apache.accumulo.core
-logger.01.level = debug
+logger.01.level = info
logger.02.name = org.apache.accumulo.core.clientImpl.ManagerClient
logger.02.level = info
@@ -106,7 +106,7 @@ logger.25.name = org.apache.hadoop.security
logger.25.level = info
logger.26.name = org.apache.hadoop.minikdc
-logger.26.level = debug
+logger.26.level = info
@@ -169,6 +169,6 @@ logger.metrics.level = info
logger.metrics.additivity = false
logger.metrics.appenderRef.metrics.ref = LoggingMetricsOutput
-rootLogger.level = debug
+rootLogger.level = info
rootLogger.appenderRef.console.ref = STDOUT
[INFO] Running org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 785.503 s - in org.apache.accumulo.test.functional.ConcurrentDeleteTableIT
On Thu, 2 Dec 2021 at 12:10, Mark Jens <ma...@gmail.com> wrote:
> Hi again,
>
> Here are the thread dumps as promised:
>
> 1) Both TabletServers are very busy in the SHA-512 digest code
> (implCompress*) at tablet close time. The following stacks were dumped at
> ~5 second intervals:
>
> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms
> elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable
> [0x0000fffe8f3fd000]
> java.lang.Thread.State: RUNNABLE
> at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11
> /SHA5.java:232)
> at sun.security.provider.SHA5.implCompress(java.base@11.0.11
> /SHA5.java:221)
> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:124)
> at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
> at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> at
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> at
> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> at
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> at
> org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> at
> org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
> at
> org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
> at
> org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
> at
> org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
> - locked <0x00000000f1585830> (a
> org.apache.accumulo.tserver.tablet.Tablet)
> at
> org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
> at
> org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
> at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> at
> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> Source)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> /ThreadPoolExecutor.java:1128)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> /ThreadPoolExecutor.java:628)
> at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> at
> io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown
> Source)
> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms
> elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable
> [0x0000fffe8f3fd000]
> java.lang.Thread.State: RUNNABLE
> at
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> /DigestBase.java:149)
> at
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> /DigestBase.java:144)
> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:131)
> at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
> at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> ...
>
> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms
> elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
> [0x0000fffe8f3fd000]
> java.lang.Thread.State: RUNNABLE
> at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
> /ByteArrayAccess.java:449)
> at sun.security.provider.SHA5.implDigest(java.base@11.0.11
> /SHA5.java:131)
> at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> /DigestBase.java:210)
> at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
> /DigestBase.java:189)
> at
> java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
> /MessageDigest.java:639)
> at java.security.MessageDigest.digest(java.base@11.0.11
> /MessageDigest.java:385)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> ...
>
> "tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=86499.01ms
> elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
> [0x0000fffe8f3fd000]
> java.lang.Thread.State: RUNNABLE
> at
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> /DigestBase.java:149)
> at
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> /DigestBase.java:144)
> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:131)
> at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
> at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> ...
>
> "tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0 cpu=109551.37ms
> elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
> [0x0000fffe7bffd000]
> java.lang.Thread.State: RUNNABLE
> at
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> /DigestBase.java:149)
> at
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> /DigestBase.java:144)
> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:131)
> at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
> at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> at
> org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> at
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>
> Notice that ClientContext.getProperties(ClientContext.java:236) goes
> through ServerInfo.getProperties, which most of the time calls
> ServerInfo.getPrincipal(ServerInfo.java:148), but in the last dump it
> calls ServerInfo.getAuthenticationToken(ServerInfo.java:153).
> Both paths end in SystemCredentials.get() and (a lot of?!) SHA-512 hashing.
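To illustrate why repeating such a derivation on every getProperties() call can add up, here is a minimal sketch that computes an expensive hash once and reuses it. Everything in it is an assumption made for illustration: the class and method names (MemoizedTokenSketch, expensiveHash, memoize) are hypothetical, the loop of SHA-512 rounds is only a stand-in for Crypt.crypt(), the round count is arbitrary, and this is not Accumulo code or a proposed patch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.function.Supplier;

public class MemoizedTokenSketch {

  // Stand-in for an expensive derivation: many SHA-512 rounds over the input.
  // The round count is arbitrary and chosen only to make the cost visible.
  static byte[] expensiveHash(String input) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = input.getBytes(StandardCharsets.UTF_8);
      for (int i = 0; i < 5000; i++) {
        state = md.digest(state); // digest() also resets md for the next round
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Wraps a supplier so the delegate runs at most once (for non-null results);
  // later calls return the cached value.
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T v = value;
        if (v == null) {
          synchronized (this) {
            v = value;
            if (v == null) {
              v = delegate.get();
              value = v;
            }
          }
        }
        return v;
      }
    };
  }

  public static void main(String[] args) {
    Supplier<byte[]> token = memoize(() -> expensiveHash("instance-config"));
    long t0 = System.nanoTime();
    token.get(); // pays the hashing cost once
    long t1 = System.nanoTime();
    for (int i = 0; i < 1000; i++) {
      token.get(); // served from the cached value
    }
    long t2 = System.nanoTime();
    System.out.printf("first call: %d us, next 1000 calls: %d us%n",
        (t1 - t0) / 1000, (t2 - t1) / 1000);
  }
}
```

Running main on the ARM64 box and on an x86_64 box would give a rough feel for how much a single derivation costs on each; whether caching is acceptable for the real system token is a separate question for the Accumulo developers.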
>
> 2) The "Manager" process writes ~200 MB of logs. Maybe the default log
> level should not be DEBUG?!
>
> Most of its threads either wait for notifications from ZooKeeper:
>
> "Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
> cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
> Object.wait() [0x0000fffebb7fc000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(java.base@11.0.11/Native Method)
> - waiting on <no object reference available>
> at
> org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
> - waiting to re-lock in wait() <0x00000000f1427458> (a
> org.apache.accumulo.fate.ZooStore)
> at
> org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
> at
> org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
> at org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
> at
> org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
> at
> org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
> ...
>
> or wait for data:
> "Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms
> elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
> [0x0000fffebadfd000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(java.base@11.0.11/Native Method)
> - waiting on <no object reference available>
> at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> - waiting to re-lock in wait() <0x00000000f9bf42d8> (a
> org.apache.zookeeper.ClientCnxn$Packet)
> at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> at
> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
> Source)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> Source)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
> at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
> at
> org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
> at
> org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
> at
> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
> at
> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> at
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> Source)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
> /ThreadPoolExecutor.java:1128)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
> /ThreadPoolExecutor.java:628)
> at
> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> at
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> Source)
> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
> "Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
> elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
> [0x0000ffff20f50000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(java.base@11.0.11/Native Method)
> - waiting on <no object reference available>
> at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
> at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
> - waiting to re-lock in wait() <0x00000000fa781138> (a
> org.apache.zookeeper.ClientCnxn$Packet)
> at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
> Source)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
> Source)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
> at
> org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
> at
> org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
> at
> org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
> at
> org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
> at
> io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> at
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
> Source)
> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
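Reading hundreds of thousands of dump lines by eye is error-prone, so a summary of "thread state at top frame" can help confirm that most threads are waiting rather than running. The sketch below is only an illustration under stated assumptions: ThreadStateSummary and summarize are hypothetical names, and it inspects the JVM it runs in via Thread.getAllStackTraces(), so it does not read the dumps of the Manager process shown above.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadStateSummary {

  // Counts threads per "STATE @ top frame" key for the given stack traces.
  static Map<String, Integer> summarize(Map<Thread, StackTraceElement[]> traces) {
    Map<String, Integer> counts = new TreeMap<>();
    for (Map.Entry<Thread, StackTraceElement[]> e : traces.entrySet()) {
      StackTraceElement[] stack = e.getValue();
      // Threads without Java frames get a placeholder instead of a method name.
      String top = stack.length == 0
          ? "<no frames>"
          : stack[0].getClassName() + "." + stack[0].getMethodName();
      String key = e.getKey().getState() + " @ " + top;
      counts.merge(key, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // Print one line per distinct (state, top frame) pair with its thread count.
    summarize(Thread.getAllStackTraces())
        .forEach((key, n) -> System.out.println(n + "  " + key));
  }
}
```

Many threads grouped under WAITING or TIMED_WAITING at Object.wait, next to a few RUNNABLE threads in the digest code, would match the picture described in this message.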
> 3) SimpleGarbageCollector is also busy getting credentials:
>
> "gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
> tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
> java.lang.Thread.State: RUNNABLE
> at
> sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
> /DigestBase.java:149)
> at
> sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
> /DigestBase.java:144)
> at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
> /DigestBase.java:131)
> at
> java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
> /MessageDigest.java:623)
> at java.security.MessageDigest.update(java.base@11.0.11
> /MessageDigest.java:345)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
> at
> org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> at
> org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> at
> org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> at
> org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> at
> org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> at
> org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> at
> org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> at
> org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> at
> org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
> at
> org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
> at
> org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
> at
> org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
> at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> at
> io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
> Source)
> at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
>
> "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
>         at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
>         at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11/Provider.java:1107)
>         at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11/ConcurrentHashMap.java:936)
>         at java.security.Provider.getService(java.base@11.0.11/Provider.java:1282)
>         at sun.security.jca.ProviderList.getService(java.base@11.0.11/ProviderList.java:380)
>         at sun.security.jca.GetInstance.getInstance(java.base@11.0.11/GetInstance.java:157)
>         at java.security.Security.getImpl(java.base@11.0.11/Security.java:700)
>         at java.security.MessageDigest.getInstance(java.base@11.0.11/MessageDigest.java:178)
>         at org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
>         at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
>         at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
>         at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
>         at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
>         at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
>         at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
>         at org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
>         at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
>         at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
>         at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
>         at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
>         at org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
>         at org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
>         at org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
>         at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
>         at io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown Source)
>         at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
>
>
> 4) Nothing interesting for Initialize, Main and ZooKeeperServerMain
> processes
>
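For what it's worth, the cpu= and elapsed= fields in the two "gc" thread headers above give a rough idea of how much of the time between the samples that thread actually spent on CPU. Below is a small sketch of the arithmetic. The class and method names are made up for illustration; only the four numbers come from the dumps.

```java
import java.util.Locale;

final class CpuShareSketch {
  /**
   * Fraction of wall-clock time a thread spent on CPU between two samples.
   * cpuMs* is the "cpu=" value in milliseconds, elapsedSec* is the
   * "elapsed=" value in seconds, both taken from the thread header line.
   */
  static double cpuShare(double cpuMs1, double elapsedSec1,
                         double cpuMs2, double elapsedSec2) {
    double cpuDeltaMs = cpuMs2 - cpuMs1;
    double wallDeltaMs = (elapsedSec2 - elapsedSec1) * 1000.0;
    if (wallDeltaMs <= 0) {
      throw new IllegalArgumentException("samples must be in time order");
    }
    return cpuDeltaMs / wallDeltaMs;
  }

  public static void main(String[] args) {
    // Values copied from the two "gc" #31 headers quoted above.
    double share = cpuShare(15495.47, 209.43, 15982.95, 218.59);
    System.out.println(
        String.format(Locale.ROOT, "gc thread CPU share: %.1f%%", share * 100));
    // prints "gc thread CPU share: 5.3%"
  }
}
```

By this measure the gc thread was on CPU for only about 5% of the roughly 9 seconds between the two samples, so it was mostly idle or waiting in that window, even though both samples caught it inside sha512Crypt.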
>
> I'm not saying that the above are problematic. You know how Accumulo
> works. It is up to you to decide whether something should be improved.
>
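If something were to be improved here, one possibility is to compute the hashed system token once and reuse it, since the gc stacks above reach Crypt.crypt through SystemCredentials.get() from inside createScanner. The sketch below only illustrates that idea with JDK classes. expensiveHash is a stand-in for the sha512Crypt call (it is not the commons-codec implementation), memoize and the round count are made up for illustration, and it assumes the hashed inputs do not change while the process runs, which I have not verified against the Accumulo source.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class MemoizedHashSketch {
  // Counts how often the expensive computation really runs.
  static final AtomicInteger computations = new AtomicInteger();

  // Stand-in for a deliberately slow, password-style hash: many SHA-512 rounds.
  static byte[] expensiveHash(String input, int rounds) {
    computations.incrementAndGet();
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = input.getBytes(StandardCharsets.UTF_8);
      for (int i = 0; i < rounds; i++) {
        state = md.digest(state); // digest() also resets md for the next round
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Hypothetical helper: runs the delegate once and returns the cached value afterwards.
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T v = value;
        if (v == null) {
          synchronized (this) {
            v = value;
            if (v == null) {
              v = delegate.get();
              value = v;
            }
          }
        }
        return v;
      }
    };
  }

  public static void main(String[] args) {
    Supplier<byte[]> token = memoize(() -> expensiveHash("instance-config", 5000));
    for (int i = 0; i < 100; i++) {
      token.get(); // only the first call pays for the hashing
    }
    System.out.println("hash computations: " + computations.get());
    // prints "hash computations: 1"
  }
}
```

Whether caching like this is acceptable depends on why the token is regenerated in the first place, so please treat it as a question rather than a proposal.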
> Regards,
> Mark
>
>
> On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:
>
>>
>>
>> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
>>
>>> It looks like the tests are timing out. This happens frequently when
>>> running on resource-constrained systems. You can give the test more
>>> time by increasing the timeout factor: `mvn clean verify
>>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
>>> -Dtimeout.factor=3`
>>>
>>> There's nothing we know of that would change the way our tests work
>>> due to ARM64, but you may have issues because of limited RAM, slow CPU
>>> speeds, slow disk I/O, busy background processes, or other
>>> resource-related issues. I don't think most of the currently active
>>> developers use ARM64, or have access to a test machine to reproduce or
>>>
>>
>> In case anyone wants to test on Linux ARM64, you can use Oracle Cloud
>> for free.
>>
>> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
>> explains how to create a VM and how to use this VM as a GitHub Actions
>> runner.
>> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
>> mentions this article.
>>
>>
>>> experiment with Accumulo there, so you may have to do some of your own
>>> troubleshooting. If you can rule out resource-constraint issues, and
>>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
>>> flaky and sometimes times out on x86_64 as well), you could create a
>>> bug ticket with more details at
>>> https://github.com/apache/accumulo/issues ; there is an issue template
>>> specifically for broken and/or flaky tests that you can select when
>>> creating a new ticket.
>>>
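To illustrate what a timeout factor does: a multiplier read from a system property scales each test's base timeout. This is only a sketch with hypothetical names (TimeoutFactorSketch, timeoutFactor, scaledTimeout), not Accumulo's test harness code, and it treats the 420 seconds from the log as the unscaled base, which is an assumption.

```java
import java.time.Duration;

final class TimeoutFactorSketch {
  // Hypothetical helper: parse a multiplier, falling back to 1 when the
  // value is missing, not a number, or less than 1.
  static int timeoutFactor(String raw) {
    if (raw == null || raw.trim().isEmpty()) {
      return 1;
    }
    try {
      return Math.max(Integer.parseInt(raw.trim()), 1);
    } catch (NumberFormatException e) {
      return 1;
    }
  }

  // Scale a base timeout by the factor.
  static Duration scaledTimeout(Duration base, int factor) {
    return base.multipliedBy(factor);
  }

  public static void main(String[] args) {
    int factor = timeoutFactor(System.getProperty("timeout.factor"));
    // 420 s is the timeout reported in the failing test output (assumed base).
    Duration scaled = scaledTimeout(Duration.ofSeconds(420), factor);
    System.out.println("factor=" + factor + " timeout=" + scaled.getSeconds() + "s");
  }
}
```

Started with -Dtimeout.factor=3 it prints "factor=3 timeout=1260s". A larger factor only helps if the test is slow; it will not help if the test is genuinely stuck.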
>>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
>>> >
>>> > Hi dev1,
>>> >
>>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>>> >
>>> > > Some of those tests are trying to stress conditions that require a
>>> > > lot of resources to replicate specific conditions. Have you tried to
>>> > > run those individual tests in isolation so that you are not competing
>>> > > for resources? Do they always fail, or are the failures transient?
>>> > >
>>> >
>>> > Q: Have you tried to run those individual tests in isolation so that
>>> > you are not competing for resources?
>>> > A: This is what I meant by the following:
>>> > ---------------------
>>> > The tests fail even when executed separately, e.g.:
>>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>>> > ---------------------
>>> >
>>> > Q: Do they always fail, or are the failures transient?
>>> > A: I also tried to explain that with "These tests fail consistently at
>>> > every build attempt!"
>>> >
>>> > Mark
>>> >
>>>
>>
Re: Consistent IT tests failures on Linux ARM64
Posted by Mark Jens <ma...@gmail.com>.
Hi again,
Here are the thread dumps as promised:
1) Both TabletServers are very busy computing SHA-512 hashes (the
implCompress frames below) when closing tablets. The following stacks
were dumped about 5 seconds apart:
"tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=68425.44ms elapsed=75.42s tid=0x0000fffeac074800 nid=0x33077e runnable [0x0000fffe8f3fd000]
   java.lang.Thread.State: RUNNABLE
        at sun.security.provider.SHA5.implCompressCheck(java.base@11.0.11/SHA5.java:232)
        at sun.security.provider.SHA5.implCompress(java.base@11.0.11/SHA5.java:221)
        at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:124)
        at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
        at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
        at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
        at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
        at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
        at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
        at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
        at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
        at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
        at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
        at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
        at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
        at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
        at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
        at org.apache.accumulo.core.metadata.schema.AmpleImpl.readTablet(AmpleImpl.java:46)
        at org.apache.accumulo.core.metadata.schema.Ample.readTablet(Ample.java:141)
        at org.apache.accumulo.tserver.tablet.Tablet.closeConsistencyCheck(Tablet.java:1379)
        at org.apache.accumulo.tserver.tablet.Tablet.completeClose(Tablet.java:1331)
        - locked <0x00000000f1585830> (a org.apache.accumulo.tserver.tablet.Tablet)
        at org.apache.accumulo.tserver.tablet.Tablet.close(Tablet.java:1221)
        at org.apache.accumulo.tserver.UnloadTabletHandler.run(UnloadTabletHandler.java:92)
        at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
        at io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11/ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11/ThreadPoolExecutor.java:628)
        at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
        at io.opentelemetry.context.Context$$Lambda$209/0x000000010035c840.run(Unknown Source)
        at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)

"tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=72485.20ms elapsed=79.71s tid=0x0000fffeac074800 nid=0x33077e runnable [0x0000fffe8f3fd000]
   java.lang.Thread.State: RUNNABLE
        at sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11/DigestBase.java:149)
        at sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11/DigestBase.java:144)
        at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11/DigestBase.java:131)
        at java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11/MessageDigest.java:623)
        at java.security.MessageDigest.update(java.base@11.0.11/MessageDigest.java:345)
        at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
        at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
        at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
at
org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
...
"tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=81174.59ms
elapsed=89.01s tid=0x0000fffeac074800 nid=0x33077e runnable
[0x0000fffe8f3fd000]
java.lang.Thread.State: RUNNABLE
at sun.security.provider.ByteArrayAccess.l2bBig(java.base@11.0.11
/ByteArrayAccess.java:449)
at sun.security.provider.SHA5.implDigest(java.base@11.0.11
/SHA5.java:131)
at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
/DigestBase.java:210)
at sun.security.provider.DigestBase.engineDigest(java.base@11.0.11
/DigestBase.java:189)
at
java.security.MessageDigest$Delegate.engineDigest(java.base@11.0.11
/MessageDigest.java:639)
at java.security.MessageDigest.digest(java.base@11.0.11
/MessageDigest.java:385)
at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:439)
at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
at
org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
...
"tablet migration-Worker-1" #4380 daemon prio=5 os_prio=0 cpu=86499.01ms
elapsed=94.68s tid=0x0000fffeac074800 nid=0x33077e runnable
[0x0000fffe8f3fd000]
java.lang.Thread.State: RUNNABLE
at
sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
/DigestBase.java:149)
at
sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
/DigestBase.java:144)
at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
/DigestBase.java:131)
at
java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
/MessageDigest.java:623)
at java.security.MessageDigest.update(java.base@11.0.11
/MessageDigest.java:345)
at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:403)
at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
at
org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
...
"tablet migration-Worker-1" #6107 daemon prio=5 os_prio=0 cpu=109551.37ms
elapsed=117.48s tid=0x0000fffeac01b000 nid=0x33174d runnable
[0x0000fffe7bffd000]
java.lang.Thread.State: RUNNABLE
at
sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
/DigestBase.java:149)
at
sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
/DigestBase.java:144)
at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
/DigestBase.java:131)
at
java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
/MessageDigest.java:623)
at java.security.MessageDigest.update(java.base@11.0.11
/MessageDigest.java:345)
at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:432)
at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
at
org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
at
org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
Notice that ServerInfo.getProperties, called from
ClientContext.getProperties(ClientContext.java:236), most of the time goes
through ServerInfo.getPrincipal(ServerInfo.java:148), but in the last dump
it goes through ServerInfo.getAuthenticationToken(ServerInfo.java:153).
Both paths end up in SystemCredentials.get() and lead to (a lot of ?!)
SHA-512 hashing.
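The stacks above suggest that each scanner creation ends up in SystemCredentials.get(), which runs sha512Crypt again. If that is indeed what happens, one possible mitigation is to compute the token once and reuse it. The following is only a minimal sketch of that idea, not Accumulo's code: MemoizedTokenSketch, expensiveHash and memoize are hypothetical names, and the 5000 chained SHA-512 rounds merely stand in for the real Crypt.crypt call.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch: compute an expensive token once and reuse it afterwards. */
final class MemoizedTokenSketch {

  /** Counts how often the expensive hash actually runs (for demonstration only). */
  static final AtomicInteger HASH_CALLS = new AtomicInteger();

  /** Stand-in for an expensive credential hash: 5000 chained SHA-512 digests. */
  static byte[] expensiveHash(String input) {
    HASH_CALLS.incrementAndGet();
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = input.getBytes(StandardCharsets.UTF_8);
      for (int i = 0; i < 5000; i++) {
        state = md.digest(state); // digest() also resets the MessageDigest for reuse
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available in this JVM", e);
    }
  }

  /**
   * Wraps a supplier so that the delegate runs only once, even when several
   * threads call get() concurrently. Assumes the delegate never returns null.
   */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T v = value;
        if (v == null) {
          synchronized (this) {
            v = value;
            if (v == null) {
              v = delegate.get();
              value = v;
            }
          }
        }
        return v;
      }
    };
  }

  public static void main(String[] args) {
    Supplier<byte[]> token = memoize(() -> expensiveHash("instance-config"));
    for (int i = 0; i < 1000; i++) {
      token.get(); // only the first call pays for the hashing
    }
    System.out.println("hash computed " + HASH_CALLS.get() + " time(s)");
  }
}
```

Whether caching is acceptable here depends on whether the hashed instance configuration can change while the server is running; that is an open question for people who know the code.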
2) The "Manager" process writes ~200 MB of logs. Maybe the default log level
should not be DEBUG ?!
Most of its threads either wait for notifications from ZooKeeper:
"Manager-ClientPool-Worker-3" #61 daemon prio=5 os_prio=0
cpu=375.95ms elapsed=182.38s tid=0x0000fffee0007800 nid=0x32d943 in
Object.wait() [0x0000fffebb7fc000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(java.base@11.0.11/Native Method)
- waiting on <no object reference available>
at
org.apache.accumulo.fate.ZooStore.waitForStatusChange(ZooStore.java:386)
- waiting to re-lock in wait() <0x00000000f1427458> (a
org.apache.accumulo.fate.ZooStore)
at
org.apache.accumulo.fate.AgeOffStore.waitForStatusChange(AgeOffStore.java:209)
at
org.apache.accumulo.core.logging.FateLogger$1.waitForStatusChange(FateLogger.java:75)
at org.apache.accumulo.fate.Fate.waitForCompletion(Fate.java:297)
at
org.apache.accumulo.manager.FateServiceHandler.waitForFateOperation(FateServiceHandler.java:659)
at
org.apache.accumulo.manager.ManagerClientServiceHandler.waitForFateOperation(ManagerClientServiceHandler.java:100)
...
or wait for data:
"Repo Runner-Worker-1" #90 daemon prio=5 os_prio=0 cpu=7440.91ms
elapsed=179.99s tid=0x0000fffeb0002000 nid=0x32d99a in Object.wait()
[0x0000fffebadfd000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(java.base@11.0.11/Native Method)
- waiting on <no object reference available>
at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
- waiting to re-lock in wait() <0x00000000f9bf42d8> (a
org.apache.zookeeper.ClientCnxn$Packet)
at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2587)
at
org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getChildren$5(ZooReader.java:87)
at
org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$182/0x0000000100324040.apply(Unknown
Source)
at
org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
Source)
at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
at
org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:87)
at org.apache.accumulo.fate.ZooStore.reserve(ZooStore.java:141)
at
org.apache.accumulo.fate.AgeOffStore.reserve(AgeOffStore.java:155)
at
org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:50)
at
org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
at
io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
Source)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11
/ThreadPoolExecutor.java:1128)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11
/ThreadPoolExecutor.java:628)
at
io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
Source)
at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
"Status Thread" #41 daemon prio=5 os_prio=0 cpu=1700.28ms
elapsed=187.25s tid=0x0000fffee41f9800 nid=0x32d920 in Object.wait()
[0x0000ffff20f50000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(java.base@11.0.11/Native Method)
- waiting on <no object reference available>
at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1529)
- waiting to re-lock in wait() <0x00000000fa781138> (a
org.apache.zookeeper.ClientCnxn$Packet)
at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1512)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2129)
at
org.apache.accumulo.fate.zookeeper.ZooReader.lambda$getData$0(ZooReader.java:65)
at
org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$220/0x0000000100351440.apply(Unknown
Source)
at
org.apache.accumulo.fate.zookeeper.ZooReader$$Lambda$184/0x0000000100323c40.apply(Unknown
Source)
at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoopMutator(ZooReader.java:165)
at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:144)
at
org.apache.accumulo.fate.zookeeper.ZooReader.retryLoop(ZooReader.java:131)
at
org.apache.accumulo.fate.zookeeper.ZooReader.getData(ZooReader.java:65)
at
org.apache.accumulo.manager.Manager.getManagerGoalState(Manager.java:496)
at
org.apache.accumulo.manager.Manager$StatusThread.updateStatus(Manager.java:822)
at
org.apache.accumulo.manager.Manager$StatusThread.run(Manager.java:797)
at
io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100353840.run(Unknown
Source)
at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
3) SimpleGarbageCollector is also busy getting credentials:
"gc" #31 prio=5 os_prio=0 cpu=15495.47ms elapsed=209.43s
tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
java.lang.Thread.State: RUNNABLE
at
sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@11.0.11
/DigestBase.java:149)
at
sun.security.provider.DigestBase.implCompressMultiBlock(java.base@11.0.11
/DigestBase.java:144)
at sun.security.provider.DigestBase.engineUpdate(java.base@11.0.11
/DigestBase.java:131)
at
java.security.MessageDigest$Delegate.engineUpdate(java.base@11.0.11
/MessageDigest.java:623)
at java.security.MessageDigest.update(java.base@11.0.11
/MessageDigest.java:345)
at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:421)
at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
at
org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
at
org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
at
org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
at
org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
at
org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
at
org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
Source)
at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
"gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s
tid=0x0000ffff28295800 nid=0x32dac5 runnable [0x0000ffff3a5fb000]
java.lang.Thread.State: RUNNABLE
at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11
/Provider.java:1107)
at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11
/ConcurrentHashMap.java:936)
at java.security.Provider.getService(java.base@11.0.11
/Provider.java:1282)
at sun.security.jca.ProviderList.getService(java.base@11.0.11
/ProviderList.java:380)
at sun.security.jca.GetInstance.getInstance(java.base@11.0.11
/GetInstance.java:157)
at java.security.Security.getImpl(java.base@11.0.11
/Security.java:700)
at java.security.MessageDigest.getInstance(java.base@11.0.11
/MessageDigest.java:178)
at
org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
at
org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
at
org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
at
org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
at
org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
at
org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
at
org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
at
org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
at
org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
at
org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
at
org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
at
org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
at
org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getReferences(SimpleGarbageCollector.java:249)
at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletes(GarbageCollectionAlgorithm.java:169)
at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.confirmDeletesTrace(GarbageCollectionAlgorithm.java:276)
at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.deleteBatch(GarbageCollectionAlgorithm.java:330)
at
org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:315)
at
org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:501)
at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
at
io.opentelemetry.context.Context$$Lambda$209/0x0000000100357840.run(Unknown
Source)
at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
4) Nothing interesting for Initialize, Main and ZooKeeperServerMain
processes
I'm not saying that the above are problematic. You know how Accumulo works.
It is up to you to decide whether something should be improved.
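To judge whether the SHA-512 work seen in the tablet server and gc stacks is unusually slow on a given machine, one could time raw digests with the same JDK on ARM64 and on x86_64 and compare the numbers. This is only a rough sketch under stated assumptions: Sha512TimingSketch and its methods are hypothetical names, the seed string is arbitrary, and the round count of 5000 is illustrative rather than measured from Accumulo.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical micro-benchmark: how long do N chained SHA-512 digests take here? */
final class Sha512TimingSketch {

  /** Feeds each digest back in as the next input, 'rounds' times. */
  static byte[] chainedDigest(byte[] seed, int rounds) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] state = seed;
      for (int i = 0; i < rounds; i++) {
        state = md.digest(state);
      }
      return state;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available in this JVM", e);
    }
  }

  /** Returns the wall-clock time in nanoseconds for one chained run. */
  static long timeNanos(int rounds) {
    byte[] seed = "arbitrary-seed".getBytes(StandardCharsets.UTF_8);
    long start = System.nanoTime();
    chainedDigest(seed, rounds);
    return System.nanoTime() - start;
  }

  public static void main(String[] args) {
    int rounds = 5000; // illustrative value, not taken from Accumulo
    timeNanos(rounds); // warm-up run so the JIT has seen the loop once
    for (int run = 0; run < 5; run++) {
      System.out.printf("run %d: %.2f ms for %d rounds%n",
          run, timeNanos(rounds) / 1e6, rounds);
    }
  }
}
```

A single warm-up run is a crude approximation; a benchmark harness would give more trustworthy numbers, but this is enough for a first comparison between two machines.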
Regards,
Mark
On Wed, 1 Dec 2021 at 16:35, Mark Jens <ma...@gmail.com> wrote:
>
>
> On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
>
>> It looks like the tests are timing out. This happens frequently when
>> running on resource-constrained systems. You can give the test more
>> time by increasing the timeout factor: `mvn clean verify
>> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
>> -Dtimeout.factor=3`
>>
>> There's nothing we know of that would change the way our tests work
>> due to ARM64, but you may have issues because of limited RAM, slow CPU
>> speeds, slow disk I/O, busy background processes, or other
>> resource-related issues. I don't think most of the currently active
>> developers use ARM64, or have access to a test machine to reproduce or
>>
>
> In case anyone wants to test on Linux ARM64 you could easily use Oracle
> Cloud for free.
>
> https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
> explains how to create a VM and how to use this VM as a Github Actions
> runner.
> https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
> mentions this article.
>
>
>> experiment with Accumulo there, so you may have to do some of your own
>> troubleshooting. If you can rule out resource-constraint issues, and
>> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
>> flaky and sometimes times out on x86_64 as well), you could create a
>> bug ticket with more details at
>> https://github.com/apache/accumulo/issues ; there is an issue template
>> specifically for broken and/or flaky tests that you can select when
>> creating a new ticket.
>>
>> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
>> >
>> > Hi dev1,
>> >
>> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>> >
>> > > Some of those tests are trying to stress conditions that require a
>> lot of
>> > > resources to replicate specific conditions. Have you tried to run
>> those
>> > > individual tests in isolation so that you are not competing for
>> resources?
>> > > Do they always fail, or are the failures transient?
>> > >
>> >
>> > Q: Have you tried to run those individual tests in isolation so that you
>> > are not competing for resources?
>> > A: This is what I mean with the following:
>> > ---------------------
>> > The tests fail even when executed separately, e.g.:
>> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>> > ---------------------
>> >
>> > Q: Do they always fail, or are the failures transient?
>> > A: I also tried to explain that with "These tests fail consistently at
>> > every build attempt!"
>> >
>> > Mark
>> >
>> > >
>> > > -----Original Message-----
>> > > From: Mark Jens <ma...@gmail.com>
>> > > Sent: Tuesday, November 30, 2021 4:05 AM
>> > > To: dev@accumulo.apache.org
>> > > Subject: Consistent IT tests failures on Linux ARM64
>> > >
>> > > Hello Accumulo community,
>> > >
>> > > At my job we consider using Linux ARM64 servers and I've been tasked
>> to
>> > > test Accumulo.
>> > >
>> > > I face some timeout related issues with several IT tests:
>> > >
>> > >
>> > > [ERROR]
>> > >
>> > >
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > > Time elapsed: 420.122 s <<< ERROR!
>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>> 420
>> > > seconds at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native
>> Method)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
>> > > at java.base@11.0.11
>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > > Method)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >
>> > > [ERROR]
>> > >
>> > >
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > > Time elapsed: 420.122 s <<< ERROR!
>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>> > > test-SendThread(localhost:44251)
>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>> > > java.base@11.0.11
>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>> > > at java.base@11.0.11
>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>> > > at java.base@11.0.11/sun.nio.ch
>> .SelectorImpl.select(SelectorImpl.java:136)
>> > > at
>> > >
>> > >
>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>> > > at
>> > >
>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >
>> > > [ERROR]
>> > >
>> > >
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
>> > > Time elapsed: 420.011 s <<< ERROR!
>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>> 420
>> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native Method) at
>> > >
>> > >
>> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
>> > > at
>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
>> > > at
>> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
>> > > at java.base@11.0.11
>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > > Method)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >
>> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
>> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
>> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
>> > > [INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.CreateManyScannersIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 255.108 s - in
>> org.apache.accumulo.test.functional.CreateInitialSplitsIT
>> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
>> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 59.289 s - in
>> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
>> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time
>> elapsed:
>> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
>> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
>> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
>> > > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
>> > > [INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
>> > > [INFO] Running
>> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 219.253 s - in
>> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
>> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
>> > > [INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
>> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
>> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
>> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time
>> elapsed:
>> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
>> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
>> > > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
>> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
>> > > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
>> > > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
>> > > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 71.934 s - in
>> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
>> > > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
>> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time
>> elapsed:
>> > > 307.904 s <<< FAILURE! - in
>> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
>> > > [ERROR]
>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > > Time elapsed: 240.011 s <<< ERROR!
>> > > org.junit.runners.model.TestTimedOutException: test timed out after
>> 240
>> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native Method) at
>> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
>> > > at java.base@11.0.11
>> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
>> > > at
>> > >
>> > >
>> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
>> > > at java.base@11.0.11
>> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> > > Method)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > > at java.base@11.0.11
>> > >
>> > >
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> > > at
>> > >
>> > >
>> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>> > > at
>> > >
>> > >
>> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>> > > at java.base@11.0.11
>> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
>> > >
>> > > [ERROR]
>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > > Time elapsed: 240.012 s <<< ERROR!
>> > > java.lang.Exception: Appears to be stuck in thread Time-limited
>> > > test-SendThread(localhost:39285)
>> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
>> > > java.base@11.0.11
>> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
>> > > at java.base@11.0.11
>> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
>> > > at java.base@11.0.11/sun.nio.ch
>> .SelectorImpl.select(SelectorImpl.java:136)
>> > > at
>> > >
>> > >
>> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
>> > > at
>> > >
>> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >
>> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
>> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
>> > > [INFO] Running
>> org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
>> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
>> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
>> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
>> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
>> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
>> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
>> > > [INFO] Running
>> > > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time
>> elapsed:
>> > > 0.039 s - in
>> > > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
>> > > [INFO]
>> > > [INFO] Results:
>> > > [INFO]
>> > > [ERROR] Errors:
>> > > [ERROR]
>> > >
>> > >
>> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
>> > > [ERROR] Run 1:
>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 »
>> > > TestTimedOut
>> > > [ERROR] Run 2:
>> > > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »
>> Appears
>> > > to ...
>> > > [INFO]
>> > > [ERROR] ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
>> > > TestTimedOut test t...
>> > > [ERROR]
>> > >
>> > >
>> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
>> > > [ERROR] Run 1:
>> > > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
>> TestTimedOut
>> > > tes...
>> > > [ERROR] Run 2:
>> ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
>> > > Appears to be stuck...
>> > > [INFO]
>> > > [ERROR]
>> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
>> > > [ERROR] Run 1:
>> > >
>> > >
>> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
>> > > » TestTimedOut
>> > > [ERROR] Run 2: HalfDeadTServerIT.testRecover » Appears to be stuck
>> in
>> > > thread Time-limited te...
>> > > [INFO]
>> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
>> > > [ERROR] Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
>> > > TestTimedOut test timed ...
>> > > [ERROR] Run 2: SslIT.adminStop » Appears to be stuck in thread
>> > > Time-limited test-SendThread(...
>> > >
>> > > These tests fail consistently at every build attempt!
>> > >
>> > > The tests fail even when executed separately, e.g.:
>> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
>> > >
>> > >
>> > > I am using the current 'main' branch of Accumulo.
>> > > JDK 11.0.11
>> > > Maven: 3.8.2
>> > > OS: Ubuntu 20.04.3 ARM64
>> > >
>> > > Is there anything that could be done to fix these problems ?
>> > > For example some config settings ?!
>> > >
>> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that
>> Linux
>> > > ARM64 is a supported platform since the JVM supports it.
>> > >
>> > > Thanks!
>> > >
>> > > Mark
>> > >
>>
>
Re: Consistent IT tests failures on Linux ARM64
Posted by Mark Jens <ma...@gmail.com>.
On Tue, 30 Nov 2021 at 18:32, Christopher <ct...@apache.org> wrote:
> It looks like the tests are timing out. This happens frequently when
> running on resource-constrained systems. You can give the test more
> time by increasing the timeout factor: `mvn clean verify
> -Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
> -Dtimeout.factor=3`
>
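For anyone else following the thread, here is a minimal sketch of how a multiplier such as timeout.factor could scale a per-test timeout. This is not Accumulo's actual code: the class and method names are hypothetical, and falling back to a factor of 1 for a missing or malformed value is an assumption. Only the property name `timeout.factor` and the 420 second base timeout come from this thread.

```java
import java.time.Duration;

// Hypothetical helper, not part of Accumulo: shows one way a timeout
// multiplier passed as -Dtimeout.factor=N could be applied to a base timeout.
final class TimeoutFactorSketch {

  // Parse the raw property value. Assumption: a missing, malformed, or
  // non-positive value falls back to a factor of 1 (timeout unchanged).
  static int parseFactor(String raw) {
    if (raw == null) {
      return 1;
    }
    try {
      int factor = Integer.parseInt(raw.trim());
      return factor >= 1 ? factor : 1;
    } catch (NumberFormatException e) {
      return 1;
    }
  }

  // Multiply the base timeout by the parsed factor.
  static Duration scaledTimeout(Duration base, String rawFactor) {
    return base.multipliedBy(parseFactor(rawFactor));
  }

  public static void main(String[] args) {
    // Read the system property the same way a test base class might.
    String raw = System.getProperty("timeout.factor");
    Duration base = Duration.ofSeconds(420); // base timeout seen in the logs
    Duration effective = scaledTimeout(base, raw);
    System.out.println("effective timeout: " + effective.getSeconds() + " s");
  }
}
```

Under these assumptions, a factor of 3 would turn the 420 second limit into 1260 seconds, which only helps if the test is slow rather than truly stuck.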
> There's nothing we know of that would change the way our tests work
> due to ARM64, but you may have issues because of limited RAM, slow CPU
> speeds, slow disk I/O, busy background processes, or other
> resource-related issues. I don't think most of the currently active
> developers use ARM64, or have access to a test machine to reproduce or
>
If anyone wants to test on Linux ARM64, you can easily use Oracle Cloud
for free.
https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a
explains how to create a VM and how to use it as a GitHub Actions
runner.
https://github.com/apache/accumulo/issues/1884#issuecomment-970267282
mentions this article.
> experiment with Accumulo there, so you may have to do some of your own
> troubleshooting. If you can rule out resource-constraint issues, and
> it isn't already a known flaky test (ConcurrentDeleteTableIT is known
> flaky and sometimes times out on x86_64 as well), you could create a
> bug ticket with more details at
> https://github.com/apache/accumulo/issues ; there is an issue template
> specifically for broken and/or flaky tests that you can select when
> creating a new ticket.
>
> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
> >
> > Hi dev1,
> >
> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
> >
> > > Some of those tests are trying to stress conditions that require a lot
> of
> > > resources to replicate specific conditions. Have you tried to run those
> > > individual tests in isolation so that you are not competing for
> resources?
> > > Do they always fail, or are the failures transient?
> > >
> >
> > Q: Have you tried to run those individual tests in isolation so that you
> > are not competing for resources?
> > A: This is what I mean with the following:
> > ---------------------
> > The tests fail even when executed separately, e.g.:
> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> > ---------------------
> >
> > Q: Do they always fail, or are the failures transient?
> > A: I also tried to explain that with "These tests fail consistently at
> > every build attempt!"
> >
> > Mark
> >
> > >
> > > -----Original Message-----
> > > From: Mark Jens <ma...@gmail.com>
> > > Sent: Tuesday, November 30, 2021 4:05 AM
> > > To: dev@accumulo.apache.org
> > > Subject: Consistent IT tests failures on Linux ARM64
> > >
> > > Hello Accumulo community,
> > >
> > > At my job we consider using Linux ARM64 servers and I've been tasked to
> > > test Accumulo.
> > >
> > > I face some timeout related issues with several IT tests:
> > >
> > >
> > > [ERROR]
> > >
> > >
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > > Time elapsed: 420.122 s <<< ERROR!
> > > org.junit.runners.model.TestTimedOutException: test timed out after 420
> > > seconds at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native
> Method)
> > > at java.base@11.0.11
> > > /java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> > > at java.base@11.0.11
> > > /java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> > > at java.base@11.0.11
> > > /java.util.concurrent.FutureTask.get(FutureTask.java:190)
> > > at
> > >
> > >
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> > > at java.base@11.0.11
> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > > Method)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > > at
> > >
> > >
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > > at java.base@11.0.11
> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >
> > > [ERROR]
> > >
> > >
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > > Time elapsed: 420.122 s <<< ERROR!
> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> > > test-SendThread(localhost:44251)
> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> > > java.base@11.0.11
> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > > at java.base@11.0.11
> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > > at java.base@11.0.11/sun.nio.ch
> .SelectorImpl.select(SelectorImpl.java:136)
> > > at
> > >
> > >
> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > > at
> > >
> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> > >
> > > [ERROR]
> > >
> > >
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
> > > Time elapsed: 420.011 s <<< ERROR!
> > > org.junit.runners.model.TestTimedOutException: test timed out after 420
> > > seconds at java.base@11.0.11/java.lang.Thread.sleep(Native Method) at
> > >
> > >
> app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> > > at
> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> > > at
> app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> > > at
> > >
> > >
> app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> > > at
> > >
> > >
> app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> > > at java.base@11.0.11
> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > > Method)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > > at
> > >
> > >
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > > at java.base@11.0.11
> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >
> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
> > > [INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> > > [INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
> > > [INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
> > > [INFO] Running
> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 255.108 s - in
> org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
> > > [INFO] Running
> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 59.289 s - in
> org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
> > > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
> > > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
> > > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
> > > [INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
> > > [INFO] Running
> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 219.253 s - in
> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
> > > [INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
> > > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
> > > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
> > > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
> > > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
> > > [INFO] Running
> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 71.934 s - in
> org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
> > > 307.904 s <<< FAILURE! - in
> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
> > > [ERROR]
> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > > Time elapsed: 240.011 s <<< ERROR!
> > > org.junit.runners.model.TestTimedOutException: test timed out after 240
> > > seconds at java.base@11.0.11/java.lang.Object.wait(Native Method) at
> > > java.base@11.0.11/java.lang.Object.wait(Object.java:328)
> > > at java.base@11.0.11
> /java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
> > > at
> > >
> > >
> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
> > > at
> > >
> > >
> app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
> > > at java.base@11.0.11
> > > /jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > > Method)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > at java.base@11.0.11
> > >
> > >
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > > at
> > >
> > >
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > > at
> > >
> > >
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > > at
> > >
> > >
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > > at java.base@11.0.11
> > > /java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >
> > > [ERROR]
> org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > > Time elapsed: 240.012 s <<< ERROR!
> > > java.lang.Exception: Appears to be stuck in thread Time-limited
> > > test-SendThread(localhost:39285)
> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method) at
> > > java.base@11.0.11
> > > /sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > > at java.base@11.0.11
> > > /sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > > at java.base@11.0.11/sun.nio.ch
> .SelectorImpl.select(SelectorImpl.java:136)
> > > at
> > >
> > >
> app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > > at
> > >
> app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> > >
> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
> > > [INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
> > > [INFO] Running
> > > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
> > > 0.039 s - in
> > > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > > [INFO]
> > > [INFO] Results:
> > > [INFO]
> > > [ERROR] Errors:
> > > [ERROR] org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> > > [ERROR] Run 1: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 » TestTimedOut
> > > [ERROR] Run 2: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction » Appears to ...
> > > [INFO]
> > > [ERROR] ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 » TestTimedOut test t...
> > > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > > [ERROR] Run 1: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut tes...
> > > [ERROR] Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete » Appears to be stuck...
> > > [INFO]
> > > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > > [ERROR] Run 1: HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2 » TestTimedOut
> > > [ERROR] Run 2: HalfDeadTServerIT.testRecover » Appears to be stuck in thread Time-limited te...
> > > [INFO]
> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
> > > [ERROR] Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 » TestTimedOut test timed ...
> > > [ERROR] Run 2: SslIT.adminStop » Appears to be stuck in thread Time-limited test-SendThread(...
> > >
> > > These tests fail consistently at every build attempt!
> > >
> > > The tests fail even when executed separately, e.g.:
> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> > >
> > >
> > > I am using the current 'main' branch of Accumulo.
> > > JDK 11.0.11
> > > Maven: 3.8.2
> > > OS: Ubuntu 20.04.3 ARM64
> > >
> > > Is there anything that could be done to fix these problems ?
> > > For example some config settings ?!
> > >
> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
> > > ARM64 is a supported platform since the JVM supports it.
> > >
> > > Thanks!
> > >
> > > Mark
> > >
>
Re: Consistent IT tests failures on Linux ARM64
Posted by Christopher <ct...@apache.org>.
It looks like the tests are timing out. This happens frequently when
running on resource-constrained systems. You can give the test more
time by increasing the timeout factor: `mvn clean verify
-Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
-Dtimeout.factor=3`
There's nothing we know of that would change the way our tests work
due to ARM64, but you may have issues because of limited RAM, slow CPU
speeds, slow disk I/O, busy background processes, or other
resource-related issues. I don't think most of the currently active
developers use ARM64, or have access to a test machine to reproduce or
experiment with Accumulo there, so you may have to do some of your own
troubleshooting. If you can rule out resource-constraint issues, and
it isn't already a known flaky test (ConcurrentDeleteTableIT is known
flaky and sometimes times out on x86_64 as well), you could create a
bug ticket with more details at
https://github.com/apache/accumulo/issues ; there is an issue template
specifically for broken and/or flaky tests that you can select when
creating a new ticket.
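
As a rough illustration of what a timeout factor does, the sketch below reads the `timeout.factor` system property and multiplies a base timeout by it. How Accumulo's test harness actually consumes the property is not shown in this thread, so the class name, method names, and the 60-second base timeout are all hypothetical.

```java
import java.time.Duration;

// Hypothetical sketch: scales a base test timeout by an integer factor.
// This is NOT Accumulo's implementation; all names here are assumptions.
public class TimeoutScaling {

    /** Parses the factor; falls back to 1 when the value is missing, invalid, or below 1. */
    static int parseFactor(String raw) {
        if (raw == null) {
            return 1;
        }
        try {
            int factor = Integer.parseInt(raw.trim());
            return factor < 1 ? 1 : factor;
        } catch (NumberFormatException e) {
            return 1;
        }
    }

    /** Returns the base timeout multiplied by the parsed factor. */
    static Duration scaled(Duration base, String rawFactor) {
        return base.multipliedBy(parseFactor(rawFactor));
    }

    public static void main(String[] args) {
        // Set with e.g. -Dtimeout.factor=3 on the JVM command line.
        String raw = System.getProperty("timeout.factor");
        Duration base = Duration.ofSeconds(60); // hypothetical base timeout
        System.out.println("effective timeout: " + scaled(base, raw).getSeconds() + " s");
    }
}
```

Under these assumptions, a 60-second base timeout with a factor of 3 becomes 180 seconds, and a missing or malformed value leaves the base timeout unchanged.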
On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
>
> Hi dev1,
>
> On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>
> > Some of those tests are trying to stress conditions that require a lot of
> > resources to replicate specific conditions. Have you tried to run those
> > individual tests in isolation so that you are not competing for resources?
> > Do they always fail, or are the failures transient?
> >
>
> Q: Have you tried to run those individual tests in isolation so that you
> are not competing for resources?
> A: This is what I mean with the following:
> ---------------------
> The tests fail even when executed separately, e.g.:
> mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> ---------------------
>
> Q: Do they always fail, or are the failures transient?
> A: I also tried to explain that with "These tests fail consistently at
> every build attempt!"
>
> Mark
>
Re: Consistent IT tests failures on Linux ARM64
Posted by Mark Jens <ma...@gmail.com>.
On Tue, 30 Nov 2021 at 18:34, Mike Miller <mm...@apache.org> wrote:
> There have been issues with that IT so it is possible it is unrelated to
> your architecture.
> https://github.com/apache/accumulo/pull/2304
> https://github.com/apache/accumulo/issues/1841
Issue #1841 is exactly what I experience!
My test machine has 8 CPU cores @ 2.6GHz and 16GB RAM.
With -Dtimeout.factor=3, ConcurrentDeleteTableIT passes in 768.612 s (i.e.
about 13 mins).
I am now running the whole IT suite: the CPUs are mostly idle, spiking to at
most 20%, and only 4GB of RAM is in use.
Once the tests finish, I will re-run ConcurrentDeleteTableIT and take
some thread dumps to see where it blocks.
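
For a separate test JVM, a thread dump is typically taken with the JDK's `jstack <pid>` tool. As a minimal in-process sketch of the same idea, the code below uses the standard `ThreadMXBean` API to print every live thread of the current JVM with its state and full stack. The class and method names are hypothetical, and it only sees the JVM it runs in, not the processes a test starts.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical sketch: dumps all live threads of the *current* JVM.
public class ThreadDumpSketch {

    /** Returns one entry per live thread: name, state, then each stack frame. */
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip lock details, which not every JVM supports.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            // Format frames manually; ThreadInfo.toString() truncates long stacks.
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Taking several dumps a few seconds apart and comparing them shows which threads stay parked in the same frame, which is usually the quickest way to see where a test blocks.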
>
>
> On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
>
> > Hi dev1,
> >
> > On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
> >
> > > Some of those tests are trying to stress conditions that require a lot of
> > > resources to replicate specific conditions. Have you tried to run those
> > > individual tests in isolation so that you are not competing for resources?
> > > Do they always fail, or are the failures transient?
> > >
> >
> > Q: Have you tried to run those individual tests in isolation so that you
> > are not competing for resources?
> > A: This is what I mean with the following:
> > ---------------------
> > The tests fail even when executed separately, e.g.:
> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> > ---------------------
> >
> > Q: Do they always fail, or are the failures transient?
> > A: I also tried to explain that with "These tests fail consistently at
> > every build attempt!"
> >
> > Mark
> >
> > >
> > > -----Original Message-----
> > > From: Mark Jens <ma...@gmail.com>
> > > Sent: Tuesday, November 30, 2021 4:05 AM
> > > To: dev@accumulo.apache.org
> > > Subject: Consistent IT tests failures on Linux ARM64
> > >
> > > Hello Accumulo community,
> > >
> > > At my job we consider using Linux ARM64 servers and I've been tasked to
> > > test Accumulo.
> > >
> > > I face some timeout related issues with several IT tests:
> > >
> > >
> > > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > > Time elapsed: 420.122 s <<< ERROR!
> > > org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
> > > at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
> > > at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> > > at java.base@11.0.11/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> > > at java.base@11.0.11/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> > > at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> > > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > > at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > > at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > > at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > > at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > > at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > > at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > > at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >
> > > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > > Time elapsed: 420.122 s <<< ERROR!
> > > java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:44251)
> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> > > at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > > at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > > at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
> > > at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > > at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> > >
> > > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
> > > Time elapsed: 420.011 s <<< ERROR!
> > > org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
> > > at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
> > > at app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> > > at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> > > at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> > > at app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> > > at app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> > > at app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> > > at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> > > at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> > > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> > > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> > > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> > > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> > > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> > > at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> > > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > > at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > > at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > > at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > > at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > > at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > > at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > > at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >
> > > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
> > > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
> > > [INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
> > > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> > > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> > > [INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
> > > [INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
> > > [INFO] Running org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 255.108 s - in org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> > > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
> > > [INFO] Running org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 59.289 s - in org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> > > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> > > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
> > > [INFO] Running org.apache.accumulo.test.functional.BulkIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
> > > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
> > > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
> > > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
> > > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
> > > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
> > > [INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
> > > [INFO] Running
> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 219.253 s - in
> > > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
> > > [INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
> > > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
> > > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
> > > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
> > > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
> > > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
> > > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
> > > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
> > > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
> > > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
> > > [INFO] Running org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 71.934 s - in org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
> > > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
> > > 307.904 s <<< FAILURE! - in
> > > org.apache.accumulo.test.functional.HalfDeadTServerIT
> > > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > > Time elapsed: 240.011 s <<< ERROR!
> > > org.junit.runners.model.TestTimedOutException: test timed out after 240 seconds
> > > at java.base@11.0.11/java.lang.Object.wait(Native Method)
> > > at java.base@11.0.11/java.lang.Object.wait(Object.java:328)
> > > at java.base@11.0.11/java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
> > > at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
> > > at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
> > > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > > at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > > at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > > at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > > at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > > at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > > at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > > at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> > >
> > > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > > Time elapsed: 240.012 s <<< ERROR!
> > > java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:39285)
> > > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> > > at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > > at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > > at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
> > > at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > > at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> > >
> > > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
> > > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
> > > [INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
> > > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
> > > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
> > > [INFO] Running org.apache.accumulo.test.AuditMessageIT
> > > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
> > > [INFO] Running
> > > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
> > > 0.039 s - in
> > > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > > [INFO]
> > > [INFO] Results:
> > > [INFO]
> > > [ERROR] Errors:
> > > [ERROR] org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> > > [ERROR] Run 1: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 » TestTimedOut
> > > [ERROR] Run 2: ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction » Appears to ...
> > > [INFO]
> > > [ERROR] ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 » TestTimedOut test t...
> > > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > > [ERROR] Run 1: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut tes...
> > > [ERROR] Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete » Appears to be stuck...
> > > [INFO]
> > > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > > [ERROR] Run 1: HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2 » TestTimedOut
> > > [ERROR] Run 2: HalfDeadTServerIT.testRecover » Appears to be stuck in thread Time-limited te...
> > > [INFO]
> > > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
> > > [ERROR] Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 » TestTimedOut test timed ...
> > > [ERROR] Run 2: SslIT.adminStop » Appears to be stuck in thread Time-limited test-SendThread(...
> > >
> > > These tests fail consistently at every build attempt!
> > >
> > > The tests fail even when executed separately, e.g.:
> > > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> > >
> > >
> > > I am using the current 'main' branch of Accumulo.
> > > JDK 11.0.11
> > > Maven: 3.8.2
> > > OS: Ubuntu 20.04.3 ARM64
> > >
> > > Is there anything that could be done to fix these problems?
> > > For example, some config settings?
> > >
> > > P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
> > > ARM64 is a supported platform since the JVM supports it.
> > >
> > > Thanks!
> > >
> > > Mark
> > >
> >
>
Re: Consistent IT tests failures on Linux ARM64
Posted by Mike Miller <mm...@apache.org>.
There have been issues with that IT so it is possible it is unrelated to
your architecture.
https://github.com/apache/accumulo/pull/2304
https://github.com/apache/accumulo/issues/1841
On Tue, Nov 30, 2021 at 9:34 AM Mark Jens <ma...@gmail.com> wrote:
> Hi dev1,
>
> On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
>
> > Some of those tests are trying to stress conditions that require a lot of
> > resources to replicate specific conditions. Have you tried to run those
> > individual tests in isolation so that you are not competing for resources?
> > Do they always fail, or are the failures transient?
> >
>
> Q: Have you tried to run those individual tests in isolation so that you
> are not competing for resources?
> A: This is what I meant by the following:
> ---------------------
> The tests fail even when executed separately, e.g.:
> mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> ---------------------
>
> Q: Do they always fail, or are the failures transient?
> A: They always fail. I tried to explain that with "These tests fail
> consistently at every build attempt!"
>
> Mark
>
> >
> > -----Original Message-----
> > From: Mark Jens <ma...@gmail.com>
> > Sent: Tuesday, November 30, 2021 4:05 AM
> > To: dev@accumulo.apache.org
> > Subject: Consistent IT tests failures on Linux ARM64
> >
> > Hello Accumulo community,
> >
> > At my job we consider using Linux ARM64 servers and I've been tasked to
> > test Accumulo.
> >
> > I face some timeout related issues with several IT tests:
> >
> >
> > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > Time elapsed: 420.122 s <<< ERROR!
> > org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
> > at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
> > at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> > at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >
> > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > Time elapsed: 420.122 s <<< ERROR!
> > java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:44251)
> > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> > at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
> > at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> >
> > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps
> > Time elapsed: 420.011 s <<< ERROR!
> > org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
> > at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
> > at app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
> > at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
> > at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
> > at app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
> > at app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
> > at app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
> > at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
> > at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
> > at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
> > at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >
> > [INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
> > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
> > [INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
> > [INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
> > [INFO] Running org.apache.accumulo.test.functional.BinaryIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 65.034 s - in org.apache.accumulo.test.functional.BinaryIT
> > [INFO] Running org.apache.accumulo.test.functional.PermissionsIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
> > [INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
> > [INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
> > [INFO] Running org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 255.108 s - in org.apache.accumulo.test.functional.CreateInitialSplitsIT
> > [INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
> > [INFO] Running org.apache.accumulo.test.functional.RestartStressIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
> > [INFO] Running org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 59.289 s - in org.apache.accumulo.test.functional.BulkSplitOptimizationIT
> > [INFO] Running org.apache.accumulo.test.functional.BulkNewIT
> > [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
> > [INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
> > [INFO] Running org.apache.accumulo.test.functional.BulkIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 122.959 s - in org.apache.accumulo.test.functional.BulkIT
> > [INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
> > [INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
> > [INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
> > [INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
> > [INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
> > [INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
> > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
> > [INFO] Running
> > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 219.253 s - in
> > org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
> > [INFO] Running org.apache.accumulo.test.functional.VisibilityIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
> > [INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
> > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
> > [INFO] Running org.apache.accumulo.test.functional.SummaryIT
> > [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 111.552 s - in org.apache.accumulo.test.functional.SummaryIT
> > [INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
> > [INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
> > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
> > [INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
> > [INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
> > [INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
> > [INFO] Running org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 71.934 s - in org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
> > [INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
> > [ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
> > 307.904 s <<< FAILURE! - in
> > org.apache.accumulo.test.functional.HalfDeadTServerIT
> > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > Time elapsed: 240.011 s <<< ERROR!
> > org.junit.runners.model.TestTimedOutException: test timed out after 240 seconds
> > at java.base@11.0.11/java.lang.Object.wait(Native Method)
> > at java.base@11.0.11/java.lang.Object.wait(Object.java:328)
> > at java.base@11.0.11/java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
> > at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
> > at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> > at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> > at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> > at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> > at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> > at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> > at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >
> > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > Time elapsed: 240.012 s <<< ERROR!
> > java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:39285)
> > at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> > at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> > at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
> > at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
> > at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
> > at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
> >
> > [INFO] Running org.apache.accumulo.test.functional.MetadataIT
> > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 97.987 s - in org.apache.accumulo.test.functional.MetadataIT
> > [INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
> > [INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
> > [INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
> > [INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
> > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
> > [INFO] Running org.apache.accumulo.test.AuditMessageIT
> > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 165.169 s - in org.apache.accumulo.test.AuditMessageIT
> > [INFO] Running
> > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
> > 0.039 s - in
> > org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
> > [INFO]
> > [INFO] Results:
> > [INFO]
> > [ERROR] Errors:
> > [ERROR]
> >
> >
> org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
> > [ERROR] Run 1:
> > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 »
> > TestTimedOut
> > [ERROR] Run 2:
> > ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »
> Appears
> > to ...
> > [INFO]
> > [ERROR] ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
> > TestTimedOut test t...
> > [ERROR]
> >
> >
> org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> > [ERROR] Run 1:
> > ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 »
> TestTimedOut
> > tes...
> > [ERROR] Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
> »
> > Appears to be stuck...
> > [INFO]
> > [ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
> > [ERROR] Run 1:
> >
> >
> HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
> > » TestTimedOut
> > [ERROR] Run 2: HalfDeadTServerIT.testRecover » Appears to be stuck in
> > thread Time-limited te...
> > [INFO]
> > [ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
> > [ERROR] Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
> > TestTimedOut test timed ...
> > [ERROR] Run 2: SslIT.adminStop » Appears to be stuck in thread
> > Time-limited test-SendThread(...
> >
> > These tests fail consistently at every build attempt!
> >
> > The tests fail even when executed separately, e.g.:
> > mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> >
> >
> > I am using the current 'main' branch of Accumulo.
> > JDK 11.0.11
> > Maven: 3.8.2
> > OS: Ubuntu 20.04.3 ARM64
> >
> > Is there anything that could be done to fix these problems ?
> > For example some config settings ?!
> >
> > P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
> > ARM64 is a supported platform since the JVM supports it.
> >
> > Thanks!
> >
> > Mark
> >
>
Re: Consistent IT tests failures on Linux ARM64
Posted by Mark Jens <ma...@gmail.com>.
Hi dev1,
On Tue, 30 Nov 2021 at 16:21, dev1 <de...@etcoleman.com> wrote:
> Some of those tests are trying to stress conditions that require a lot of
> resources to replicate specific conditions. Have you tried to run those
> individual tests in isolation so that you are not competing for resources?
> Do they always fail, or are the failures transient?
>
Q: Have you tried to run those individual tests in isolation so that you
are not competing for resources?
A: Yes. This is what I meant by the following:
---------------------
The tests fail even when executed separately, e.g.:
mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
---------------------
Q: Do they always fail, or are the failures transient?
A: They always fail. This is what I meant by "These tests fail consistently
at every build attempt!"
Mark
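A note on reading the failures: each failing test is reported twice, once as a TestTimedOutException and once as "Appears to be stuck in thread ...". The FailOnTimeout, FutureTask.run and Thread.run frames in the traces suggest that the test body runs on a separate "Time-limited test" thread while the runner waits for it with a time limit. The sketch below illustrates that general mechanism in plain Java. It is a simplified, hypothetical stand-in (the class TimeLimitedRun and the method runWithTimeout are made up for illustration) and not JUnit's actual implementation.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical, simplified time-limited runner; not JUnit's actual code. */
public class TimeLimitedRun {

  /** Runs body on its own thread and returns "passed", "failed", "timed out" or "interrupted". */
  public static String runWithTimeout(Runnable body, long timeoutMillis) {
    FutureTask<Void> task = new FutureTask<>(body, null);
    Thread worker = new Thread(task, "Time-limited test");
    worker.setDaemon(true); // do not keep the JVM alive if the body never returns
    worker.start();
    try {
      task.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return "passed";
    } catch (ExecutionException e) {
      // The body threw an exception before the time limit was reached.
      return "failed";
    } catch (TimeoutException e) {
      // Print where the worker is currently blocked, similar to the traces in this thread.
      for (StackTraceElement frame : worker.getStackTrace()) {
        System.err.println("\tat " + frame);
      }
      worker.interrupt();
      return "timed out";
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the caller's interrupt status
      return "interrupted";
    }
  }

  public static void main(String[] args) {
    // A body that returns immediately finishes within the limit.
    System.out.println(runWithTimeout(() -> {}, 1_000));
    // A body that blocks (standing in for a call that never completes) hits the limit.
    System.out.println(runWithTimeout(() -> {
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException ignored) {
        // interrupted by the runner once the time limit has passed
      }
    }, 200));
  }
}
```

Read this way, the frames at the top of each timed-out trace (FutureTask.get, ZooCache$ZooRunnable.retry, ProcessImpl.waitFor) show what the test was waiting on when the limit expired, and that is probably a more useful starting point for debugging than the timeout itself.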
RE: Consistent IT tests failures on Linux ARM64
Posted by dev1 <de...@etcoleman.com>.
Some of those tests stress the system in order to reproduce specific conditions, and doing so requires a lot of resources. Have you tried running the failing tests individually, in isolation, so that they are not competing for resources? Do they always fail, or are the failures transient?
-----Original Message-----
From: Mark Jens <ma...@gmail.com>
Sent: Tuesday, November 30, 2021 4:05 AM
To: dev@accumulo.apache.org
Subject: Consistent IT tests failures on Linux ARM64
Hello Accumulo community,
At my job we consider using Linux ARM64 servers and I've been tasked to test Accumulo.
I face some timeout related issues with several IT tests:
[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete Time elapsed: 420.122 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.11/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
at java.base@11.0.11/java.util.concurrent.FutureTask.get(FutureTask.java:190)
at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete Time elapsed: 420.122 s <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:44251)
at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps Time elapsed: 420.011 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
at app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
at app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
at app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
at app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
[INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Running org.apache.accumulo.test.functional.BinaryIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
65.034 s - in org.apache.accumulo.test.functional.BinaryIT
[INFO] Running org.apache.accumulo.test.functional.PermissionsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
[INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Running org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
255.108 s - in org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Running org.apache.accumulo.test.functional.RestartStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
[INFO] Running org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.289 s - in org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Running org.apache.accumulo.test.functional.BulkNewIT
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
[INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Running org.apache.accumulo.test.functional.BulkIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
122.959 s - in org.apache.accumulo.test.functional.BulkIT
[INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Running
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
219.253 s - in
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Running org.apache.accumulo.test.functional.VisibilityIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
[INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Running org.apache.accumulo.test.functional.SummaryIT
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
111.552 s - in org.apache.accumulo.test.functional.SummaryIT
[INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Running org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
71.934 s - in org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
307.904 s <<< FAILURE! - in
org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
Time elapsed: 240.011 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 240 seconds
at java.base@11.0.11/java.lang.Object.wait(Native Method)
at java.base@11.0.11/java.lang.Object.wait(Object.java:328)
at java.base@11.0.11/java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
at
app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
at
app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at java.base@11.0.11
/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11
/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at
app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at
app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at
app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11
/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
Time elapsed: 240.012 s <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited
test-SendThread(localhost:39285)
at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.11
/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.11
/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
at
app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
[INFO] Running org.apache.accumulo.test.functional.MetadataIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
97.987 s - in org.apache.accumulo.test.functional.MetadataIT
[INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Running org.apache.accumulo.test.AuditMessageIT
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
165.169 s - in org.apache.accumulo.test.AuditMessageIT
[INFO] Running
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
0.039 s - in
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]
org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
[ERROR] Run 1:
ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 » TestTimedOut
[ERROR] Run 2:
ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction » Appears to ...
[INFO]
[ERROR] ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
TestTimedOut test t...
[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
[ERROR] Run 1:
ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut tes...
[ERROR] Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
Appears to be stuck...
[INFO]
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
[ERROR] Run 1:
HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
» TestTimedOut
[ERROR] Run 2: HalfDeadTServerIT.testRecover » Appears to be stuck in
thread Time-limited te...
[INFO]
[ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
[ERROR] Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
TestTimedOut test timed ...
[ERROR] Run 2: SslIT.adminStop » Appears to be stuck in thread
Time-limited test-SendThread(...
These tests fail consistently on every build attempt!
They fail even when executed separately, e.g.:
mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
I am using the current 'main' branch of Accumulo.
JDK 11.0.11
Maven: 3.8.2
OS: Ubuntu 20.04.3 ARM64
Is there anything that could be done to fix these problems?
For example, some config settings?
P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
ARM64 is a supported platform since the JVM supports it.
Thanks!
Mark