Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/08 23:06:48 UTC

[GitHub] [iceberg] lykwan opened a new issue #2796: Sporadic `Resetting to invalid mark` Error

lykwan opened a new issue #2796:
URL: https://github.com/apache/iceberg/issues/2796


   We're trying to use S3FileIO and we're running into the following exception on uploads:
   
   ```
   java.io.UncheckedIOException: java.io.IOException: Resetting to invalid mark
   	at software.amazon.awssdk.utils.FunctionalUtils.asRuntimeException(FunctionalUtils.java:180)
   	at software.amazon.awssdk.utils.FunctionalUtils.lambda$safeRunnable$5(FunctionalUtils.java:126)
   	at software.amazon.awssdk.utils.FunctionalUtils.invokeSafely(FunctionalUtils.java:140)
   	at software.amazon.awssdk.core.sync.RequestBody.lambda$fromInputStream$1(RequestBody.java:122)
   	at software.amazon.awssdk.services.s3.internal.handlers.SyncChecksumValidationInterceptor$ChecksumCalculatingStreamProvider.lambda$newStream$0(SyncChecksumValidationInterceptor.java:112)
   	at software.amazon.awssdk.utils.FunctionalUtils.lambda$safeSupplier$4(FunctionalUtils.java:108)
   	at software.amazon.awssdk.utils.FunctionalUtils.invokeSafely(FunctionalUtils.java:136)
   	at software.amazon.awssdk.services.s3.internal.handlers.SyncChecksumValidationInterceptor$ChecksumCalculatingStreamProvider.newStream(SyncChecksumValidationInterceptor.java:112)
   	at software.amazon.awssdk.core.internal.http.StreamManagingStage$ClosingStreamProvider.newStream(StreamManagingStage.java:78)
   	at java.util.Optional.map(Optional.java:215)
   	at software.amazon.awssdk.http.apache.internal.RepeatableInputStreamRequestEntity.getContent(RepeatableInputStreamRequestEntity.java:114)
   	at software.amazon.awssdk.http.apache.internal.RepeatableInputStreamRequestEntity.<init>(RepeatableInputStreamRequestEntity.java:89)
   	at software.amazon.awssdk.http.apache.internal.impl.ApacheHttpRequestFactory.wrapEntity(ApacheHttpRequestFactory.java:153)
   	at software.amazon.awssdk.http.apache.internal.impl.ApacheHttpRequestFactory.createApacheRequest(ApacheHttpRequestFactory.java:133)
   	at software.amazon.awssdk.http.apache.internal.impl.ApacheHttpRequestFactory.create(ApacheHttpRequestFactory.java:53)
   	at software.amazon.awssdk.http.apache.ApacheHttpClient.toApacheRequest(ApacheHttpClient.java:258)
   	at software.amazon.awssdk.http.apache.ApacheHttpClient.prepareRequest(ApacheHttpClient.java:228)
   	at software.amazon.awssdk.core.client.builder.SdkDefaultClientBuilder$NonManagedSdkHttpClient.prepareRequest(SdkDefaultClientBuilder.java:432)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.executeHttpRequest(MakeHttpRequestStage.java:67)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:55)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:39)
   	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
   	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
   	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
   	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:73)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:77)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:39)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:50)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:36)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:64)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:34)
   	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
   	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
   	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:48)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:31)
   	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
   	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
   	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
   	at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:193)
   	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:133)
   	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:159)
   	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:112)
   	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:167)
   	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:94)
   	at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
   	at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:55)
   	at software.amazon.awssdk.services.s3.DefaultS3Client.putObject(DefaultS3Client.java:8645)
   	at org.apache.iceberg.aws.s3.S3OutputStream.completeUploads(S3OutputStream.java:316)
   	at org.apache.iceberg.aws.s3.S3OutputStream.close(S3OutputStream.java:197)
   ```
   
   This occurs sporadically and can succeed on retry, but it keeps recurring.
   
   Has anyone else seen this? I wonder whether `RequestBody.fromInputStream` could be setting a mark at 128 KB, then reading past it, hitting some sort of exception, and trying to reset back to the beginning.
   
   https://github.com/aws/aws-sdk-java-v2/blob/master/core/sdk-core/src/main/java/software/amazon/awssdk/core/sync/RequestBody.java#L131
   
   This would cause the BufferedInputStream created here to fail, because it has read past the readlimit of the mark.
   
   https://github.com/apache/iceberg/blob/master/aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputStream.java#L304
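The mark/reset hypothesis above can be checked in isolation with plain `java.io` classes. The following is a minimal sketch, not the SDK's or Iceberg's actual code path: the class name `MarkResetSketch`, the helper `resetSucceeds`, and the in-memory source are hypothetical, and the 128 KB read limit is taken from the guess in the report rather than confirmed against the SDK.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-alone reproduction of the suspected failure mode.
class MarkResetSketch {
    // Read limit suggested in the report (assumption, not verified against the SDK).
    static final int READ_LIMIT = 128 * 1024;

    // Marks a BufferedInputStream, reads bytesToRead bytes, then tries to reset.
    // Returns true if reset() succeeds and false if it throws an IOException.
    static boolean resetSucceeds(int readLimit, int bytesToRead) {
        // One spare byte so the source never runs dry before the last read.
        byte[] data = new byte[bytesToRead + 1];
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        in.mark(readLimit);
        try {
            for (int i = 0; i < bytesToRead; i++) {
                in.read();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            in.reset();
            return true;
        } catch (IOException e) {
            // On OpenJDK the message here is "Resetting to invalid mark".
            return false;
        }
    }

    public static void main(String[] args) {
        // Reading up to the read limit keeps the mark valid.
        System.out.println(resetSucceeds(READ_LIMIT, READ_LIMIT));
        // Reading one byte past the read limit lets the stream drop the mark.
        System.out.println(resetSucceeds(READ_LIMIT, READ_LIMIT + 1));
    }
}
```

If the second call reports a failed reset, that matches the suspicion that any retry after more than the read limit has been consumed cannot rewind the staged stream.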


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] otayel commented on issue #2796: Sporadic `Resetting to invalid mark` Error

Posted by GitBox <gi...@apache.org>.
otayel commented on issue #2796:
URL: https://github.com/apache/iceberg/issues/2796#issuecomment-1044365273


   I am also seeing this problem on our side. Any insights on how we can solve it?




[GitHub] [iceberg] igorcalabria commented on issue #2796: Sporadic `Resetting to invalid mark` Error

Posted by GitBox <gi...@apache.org>.
igorcalabria commented on issue #2796:
URL: https://github.com/apache/iceberg/issues/2796#issuecomment-1020471059


   I can confirm that this happens pretty consistently when running the `rewrite_manifests` action on a streaming table. Subsequent retries always seem to work.




[GitHub] [iceberg] jackye1995 commented on issue #2796: Sporadic `Resetting to invalid mark` Error

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on issue #2796:
URL: https://github.com/apache/iceberg/issues/2796#issuecomment-878553759


   Let me try to reproduce, and will reply back when I get some results. Feel free to add more details if you have some steps that can consistently reproduce this, thank you! @lykwan 




[GitHub] [iceberg] johnclara commented on issue #2796: Sporadic `Resetting to invalid mark` Error

Posted by GitBox <gi...@apache.org>.
johnclara commented on issue #2796:
URL: https://github.com/apache/iceberg/issues/2796#issuecomment-878481134


   @jackye1995 any ideas?






[GitHub] [iceberg] otayel commented on issue #2796: Sporadic `Resetting to invalid mark` Error

Posted by GitBox <gi...@apache.org>.
otayel commented on issue #2796:
URL: https://github.com/apache/iceberg/issues/2796#issuecomment-1046814641


   An update from my side. This was happening to me consistently when running a SQL query from Spark to delete some entries. From reading more online, this exception usually occurs when S3 fails to upload a file due to a network issue and the SDK retries using the data from the buffer; if the data is larger than the buffer size (default 128 KB), it throws the above exception. Looking at the EMR cluster logs, I noticed exceptions before this one related to S3 throttling, `S3Exception: Please reduce your request rate.`, which might have resulted in a retry and thus the `Resetting to invalid mark` exception.
   
   I have addressed this issue for now by doing two things:
   - Increasing the buffer size to 512 MB + 1 byte by setting the system property `com.amazonaws.sdk.s3.defaultStreamBufferSize`. The 512 MB comes from the fact that we compact the data early on with `rewriteDataFiles`, which has a default max file size of 512 MB.
   - Decreasing the EMR cluster size on my side to try to avoid the S3 throttling exception.
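For reference, the first workaround could be applied programmatically along these lines. This is only a sketch under stated assumptions: the class name `StreamBufferSizeSketch` and the helper `bufferSizeFor` are hypothetical, the property presumably has to be set before any S3 client is created, and the thread does not confirm which SDK version actually reads this property.

```java
// Hypothetical helper that applies the buffer-size workaround described above.
class StreamBufferSizeSketch {
    // System property named in the comment above.
    static final String PROPERTY = "com.amazonaws.sdk.s3.defaultStreamBufferSize";

    // One byte more than the largest file expected to be uploaded.
    static long bufferSizeFor(long maxFileSizeBytes) {
        return maxFileSizeBytes + 1;
    }

    public static void main(String[] args) {
        // 512 MB is the default max file size quoted for rewriteDataFiles.
        long maxFileSizeBytes = 512L * 1024 * 1024;
        long bufferSize = bufferSizeFor(maxFileSizeBytes);

        // Assumption: this runs before the S3 client is constructed.
        System.setProperty(PROPERTY, Long.toString(bufferSize));
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
    }
}
```

On a cluster, the same property would more likely be passed as a `-D` JVM option to the driver and executors; how that is wired up depends on the deployment.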

