Posted to common-issues@hadoop.apache.org by "Mehakmeet Singh (Jira)" <ji...@apache.org> on 2020/07/30 13:55:00 UTC

[jira] [Comment Edited] (HADOOP-17158) Intermittent test timeout for ITestAbfsInputStreamStatistics#testReadAheadCounters

    [ https://issues.apache.org/jira/browse/HADOOP-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167940#comment-17167940 ] 

Mehakmeet Singh edited comment on HADOOP-17158 at 7/30/20, 1:54 PM:
--------------------------------------------------------------------

[~bilahari.th], can you tell me which bucket you were using for testing, and whether only the parallel runs or all runs were timing out?

Also, can you remove the timeout limit and the Thread.sleep() wait from the test, run it a few times, and post the trace here, so that I can see which counter values you are getting in your setup?


was (Author: mehakmeetsingh):
[~bilahari.th], Can you tell with which bucket were you using for testing, and was it only parallel runs or all runs that were timing out?

Also, can you remove the timeout limit and the thread sleep() condition from the test and run it a few times and post the trace here, so I could know what are the counter values that are coming in your setup?

> Intermittent test timeout for ITestAbfsInputStreamStatistics#testReadAheadCounters
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-17158
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17158
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 3.3.0
>            Reporter: Mehakmeet Singh
>            Assignee: Mehakmeet Singh
>            Priority: Major
>
> ITestAbfsInputStreamStatistics#testReadAheadCounters times out intermittently because of race conditions in the readAhead threads.
> Test error:
> {code:java}
> [ERROR] testReadAheadCounters(org.apache.hadoop.fs.azurebfs.ITestAbfsInputStreamStatistics)  Time elapsed: 30.723 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 30000 milliseconds
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.fs.azurebfs.ITestAbfsInputStreamStatistics.testReadAheadCounters(ITestAbfsInputStreamStatistics.java:346)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>         at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> Possible reasons:
> - The readAhead queue is not drained in time, so on some systems the counter values do not reach their thresholds within the 30-second timeout.
> - The test waits until readAheadBytesRead >= 4KB and remoteBytesRead >= 32KB. On some machines this never happens: while threads are still in the readAhead queue filling the buffer, reads are sometimes served by remote reads instead of from the readAhead buffer. One of the two counters then never reaches its threshold, the test loops indefinitely, and it eventually times out.
> Possible Fixes:
> - Write a better test (one that passes under all conditions).
> - Maybe a UT instead of an IT?
> Improving the test is the preferred fix; converting it to a UT is the last resort.
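
One way the "better test" option could look is a bounded wait: poll the counters until a deadline and then fail with the observed values, instead of sleeping in a loop until the JUnit timeout fires. The following is only a minimal sketch under stated assumptions, not the actual test code. The class BoundedCounterWait and the helper awaitCondition are hypothetical names, and the two AtomicLong counters merely stand in for the readAheadBytesRead and remoteBytesRead statistics; only the 4KB and 32KB thresholds come from the issue description.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

public class BoundedCounterWait {

    static final long KB = 1024;

    /**
     * Hypothetical helper: polls the condition until it holds or the deadline
     * passes. Returns true if the condition held, false on timeout or interrupt.
     */
    public static boolean awaitCondition(BooleanSupplier condition,
                                         long timeoutMillis,
                                         long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of looping forever
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-ins for the stream statistics; not the real ABFS counters.
        AtomicLong readAheadBytesRead = new AtomicLong();
        AtomicLong remoteBytesRead = new AtomicLong();

        // Simulated background readAhead work that updates the counters late.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            readAheadBytesRead.addAndGet(4 * KB);
            remoteBytesRead.addAndGet(32 * KB);
        });
        worker.start();

        boolean satisfied = awaitCondition(
            () -> readAheadBytesRead.get() >= 4 * KB && remoteBytesRead.get() >= 32 * KB,
            5_000, 10);
        worker.join();

        // On timeout a real test would fail here and report both counter values.
        System.out.println("satisfied=" + satisfied
            + " readAheadBytesRead=" + readAheadBytesRead.get()
            + " remoteBytesRead=" + remoteBytesRead.get());
    }
}
```

With this shape, a run in which one counter never reaches its threshold ends with a failure message that shows both values, which is the information requested in the comment above, rather than with a bare TestTimedOutException.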



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
