Posted to common-issues@hadoop.apache.org by "Anoop Sam John (Jira)" <ji...@apache.org> on 2020/04/20 17:31:00 UTC

[jira] [Comment Edited] (HADOOP-16998) WASB : NativeAzureFsOutputStream#close() throwing java.lang.IllegalArgumentException instead of IOE which causes HBase RS to get aborted

    [ https://issues.apache.org/jira/browse/HADOOP-16998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087871#comment-17087871 ] 

Anoop Sam John edited comment on HADOOP-16998 at 4/20/20, 5:30 PM:
-------------------------------------------------------------------

Thanks Steve.
The version on which this was observed is 2.7.3, but I believe the problem is present in all versions, including master.
HADOOP-16785 handles cases where writes are called after close(). This case is different. When close() is called, there is still data pending flush. That write fails with an IOE from the Azure Storage SDK. Then the finally block of close() tries to close the Azure Storage SDK level output stream, which throws back the same IOE. This is the stack trace of the exception that we see at the HBase level.
{code}
Caused by: java.lang.IllegalArgumentException: ...
                  at java.lang.Throwable.addSuppressed(Throwable.java:1072)
                  at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
                  at org.apache.hadoop.fs.azure.NativeAzureFileSystem$NativeAzureFsOutputStream.close(NativeAzureFileSystem.java:1055)
                  at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
                  at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
                  at org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.finishClose(AbstractHFileWriter.java:248)
                  at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.finishClose(HFileWriterV3.java:133)
                  at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:368)
                  at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:1080)
                  at org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:67)
                  at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:80)
                  at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:960)
                  at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2411)
                  at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2511)
                  at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2256)
                  at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2218)
                  at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2110)
                  at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2036)
                  at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:501)
                  at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
                  at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
                  at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
                  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: ...
                  at com.microsoft.azure.storage.core.Utility.initIOException(Utility.java:778)
                  at com.microsoft.azure.storage.blob.BlobOutputStreamInternal.writeBlock(BlobOutputStreamInternal.java:462)
                  at com.microsoft.azure.storage.blob.BlobOutputStreamInternal.access$000(BlobOutputStreamInternal.java:47)
                  at com.microsoft.azure.storage.blob.BlobOutputStreamInternal$1.call(BlobOutputStreamInternal.java:406)
                  at com.microsoft.azure.storage.blob.BlobOutputStreamInternal$1.call(BlobOutputStreamInternal.java:403)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                  at java.lang.Thread.run(Thread.java:748)
Caused by: com.microsoft.azure.storage.StorageException: ..
                  at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87)
                  at com.microsoft.azure.storage.core.StorageRequest.materializeException(StorageRequest.java:315)
                  at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:185)
                  at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlockBlob.java:1097)
                  at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlock(CloudBlockBlob.java:1069)
                  at com.microsoft.azure.storage.blob.BlobOutputStreamInternal.writeBlock(BlobOutputStreamInternal.java:456)
                  at com.microsoft.azure.storage.blob.BlobOutputStreamInternal.access$000(BlobOutputStreamInternal.java:47)
                  at com.microsoft.azure.storage.blob.BlobOutputStreamInternal$1.call(BlobOutputStreamInternal.java:406)
                  at com.microsoft.azure.storage.blob.BlobOutputStreamInternal$1.call(BlobOutputStreamInternal.java:403)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                  at java.lang.Thread.run(Thread.java:748)
{code}
If the flush fails with an IOE, we retry it on our end. But because an IllegalArgumentException is thrown here, we end up aborting the RS.
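
To make the top two frames of the trace easier to follow, here is a minimal sketch that reproduces the same pattern outside Hadoop. StickyErrorStream and closeLikeJava8FilterOutputStream are hypothetical names made up for this illustration; the stream only imitates the "rethrow the remembered error" behaviour and is not the real BlobOutputStreamInternal. The try-with-resources is written out by hand because newer JDKs may implement FilterOutputStream#close() differently from the version shown in the issue description.

```java
import java.io.IOException;
import java.io.OutputStream;

class SelfSuppressionDemo {
    /** Hypothetical stand-in: remembers its first error and rethrows the same object. */
    static class StickyErrorStream extends OutputStream {
        private IOException lastError;

        @Override
        public void write(int b) {
            // Data would only be buffered here; nothing can fail yet.
        }

        @Override
        public void flush() throws IOException {
            // Simulate the pending upload failing during flush.
            if (lastError == null) {
                lastError = new IOException("simulated upload failure");
            }
            throw lastError;
        }

        @Override
        public void close() throws IOException {
            // Rethrow the SAME exception instance that flush() already threw.
            if (lastError != null) {
                throw lastError;
            }
        }
    }

    /** Same shape as the try-with-resources close() quoted in the issue description. */
    static String closeLikeJava8FilterOutputStream(OutputStream out) {
        try {
            try (OutputStream ostream = out) {
                ostream.flush();
            }
            return "no exception";
        } catch (Exception e) {
            // Report which exception type actually reaches the caller.
            return e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        // prints "java.lang.IllegalArgumentException"
        System.out.println(closeLikeJava8FilterOutputStream(new StickyErrorStream()));
    }
}
```

The caller never sees the IOException directly. It only appears as the cause of the IllegalArgumentException, which matches the "Caused by" chain in the trace.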



 




> WASB : NativeAzureFsOutputStream#close() throwing java.lang.IllegalArgumentException instead of IOE which causes HBase RS to get aborted
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16998
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16998
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Major
>         Attachments: HADOOP-16998.patch
>
>
> During HFile creation, when close() is called on the OutputStream at the end, there is some pending data to be flushed. When this flush happens, an exception is thrown back from Storage. The Azure-storage SDK layer throws back an IOE. (Even if a StorageException is thrown from Storage, the SDK converts it to an IOE.) But in HBase we end up getting an IllegalArgumentException, which causes the RS to abort. If we got back an IOE, the flush would be retried instead of the RS aborting.
> The reason is as follows.
> NativeAzureFsOutputStream uses the Azure-storage SDK's BlobOutputStreamInternal, but the BlobOutputStreamInternal is wrapped in a SyncableDataOutputStream, which is a FilterOutputStream. During the close operation, NativeAzureFsOutputStream calls close on the SyncableDataOutputStream, which uses the method below from FilterOutputStream:
> {code}
> public void close() throws IOException {
>   try (OutputStream ostream = out) {
>     flush();
>   }
> }
> {code}
> Here the flush call causes an IOE to be thrown. The implicit finally of the try-with-resources statement then calls close on ostream (which is an instance of BlobOutputStreamInternal).
> When BlobOutputStreamInternal#close() is called, if any exception has already occurred on that stream, it throws back the same exception:
> {code}
> public synchronized void close() throws IOException {
>   try {
>     // if the user has already closed the stream, this will throw a STREAM_CLOSED exception
>     // if an exception was thrown by any thread in the threadExecutor, realize it now
>     this.checkStreamState();
>     ...
> }
> private void checkStreamState() throws IOException {
>   if (this.lastError != null) {
>     throw this.lastError;
>   }
> }
> {code}
> So here both the try block and the finally block get exceptions, and Java uses Throwable#addSuppressed() to attach the second exception to the first.
> Within this method, if both exceptions are the same object, it throws an IllegalArgumentException:
> {code}
> public final synchronized void addSuppressed(Throwable exception) {
>   if (exception == this)
>     throw new IllegalArgumentException(SELF_SUPPRESSION_MESSAGE, exception);
>   ....
> }
> {code}
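
One possible way to keep the IOE visible to callers is to flush and close explicitly and to skip addSuppressed() when both failures are the same object. The sketch below only illustrates that idea. It is not necessarily what the attached HADOOP-16998.patch does, and SafeCloseSketch, StickyErrorStream, flushAndClose and describe are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.OutputStream;

class SafeCloseSketch {
    /** Hypothetical stream that rethrows its first error from close(). */
    static class StickyErrorStream extends OutputStream {
        private IOException lastError;

        @Override
        public void write(int b) {
            // Buffer only; nothing fails here.
        }

        @Override
        public void flush() throws IOException {
            if (lastError == null) {
                lastError = new IOException("simulated upload failure");
            }
            throw lastError;
        }

        @Override
        public void close() throws IOException {
            if (lastError != null) {
                throw lastError;
            }
        }
    }

    /** Flushes and closes, surfacing an IOException even when close() rethrows the flush error. */
    static void flushAndClose(OutputStream out) throws IOException {
        IOException primary = null;
        try {
            out.flush();
        } catch (IOException e) {
            primary = e;
        }
        try {
            out.close();
        } catch (IOException e) {
            if (primary == null) {
                primary = e;
            } else if (e != primary) {
                // Only attach a genuinely different exception; self-suppression is illegal.
                primary.addSuppressed(e);
            }
        }
        if (primary != null) {
            throw primary;
        }
    }

    /** Small helper that reports what the caller would observe. */
    static String describe(OutputStream out) {
        try {
            flushAndClose(out);
            return "no exception";
        } catch (Exception e) {
            return e.getClass().getName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // prints "java.io.IOException: simulated upload failure"
        System.out.println(describe(new StickyErrorStream()));
    }
}
```

With this shape the caller gets the original IOE, so a retry path that keys on IOException can still run.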