Posted to commits@pinot.apache.org by "suddendust (via GitHub)" <gi...@apache.org> on 2023/02/13 10:38:27 UTC

[GitHub] [pinot] suddendust opened a new issue, #10273: AWS SDK java.nio.file.FileAlreadyExistsException During State Transition from OFFLINE -> ONLINE

suddendust opened a new issue, #10273:
URL: https://github.com/apache/pinot/issues/10273

   Certain segments on our servers cannot come online because of this exception:
   
   ```
   Caused by: java.io.IOException: Failed to read response into file: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   Caused by: java.nio.file.FileAlreadyExistsException: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   2023/02/13 09:35:49.630 INFO [S3PinotFS] [HelixTaskExecutor-message_handle_thread_27] Copy s3://my-s3-bucket-pinot/controller/data/service_call_view/service_call_view__43__350__20230211T0514Z to local /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   2023/02/13 09:35:50.317 WARN [PinotFSSegmentFetcher] [HelixTaskExecutor-message_handle_thread_27] Caught exception while fetching segment from: s3://my-s3-bucket-pinot/controller/data/service_call_view/service_call_view__43__350__20230211T0514Z to: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   software.amazon.awssdk.core.exception.SdkClientException: Unable to unmarshall response (Failed to read response into file: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz). Response Code: 200, Response Text: OK
   Caused by: software.amazon.awssdk.core.exception.NonRetryableException: Failed to read response into file: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   Caused by: java.io.IOException: Failed to read response into file: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   Caused by: java.nio.file.FileAlreadyExistsException: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   2023/02/13 09:35:51.300 INFO [S3PinotFS] [HelixTaskExecutor-message_handle_thread_27] Copy s3://my-s3-bucket-pinot/controller/data/service_call_view/service_call_view__43__350__20230211T0514Z to local /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   2023/02/13 09:35:52.058 WARN [PinotFSSegmentFetcher] [HelixTaskExecutor-message_handle_thread_27] Caught exception while fetching segment from: s3://my-s3-bucket-pinot/controller/data/service_call_view/service_call_view__43__350__20230211T0514Z to: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   software.amazon.awssdk.core.exception.SdkClientException: Unable to unmarshall response (Failed to read response into file: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz). Response Code: 200, Response Text: OK
   Caused by: software.amazon.awssdk.core.exception.NonRetryableException: Failed to read response into file: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   Caused by: java.io.IOException: Failed to read response into file: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   Caused by: java.nio.file.FileAlreadyExistsException: /var/pinot/server/data/index/service_call_view_REALTIME/service_call_view__43__350__20230211T0514Z.tar.gz
   2023/02/13 09:35:52.059 WARN [service_call_view_REALTIME-RealtimeTableDataManager] [HelixTaskExecutor-message_handle_thread_27] Failed to download segment service_call_view__43__350__20230211T0514Z from deep store:
   2023/02/13 09:35:52.060 WARN [service_call_view_REALTIME-RealtimeTableDataManager] [HelixTaskExecutor-message_handle_thread_27] Download segment service_call_view__43__350__20230211T0514Z from deepstore uri s3://my-s3-bucket-pinot/controller/data/service_call_view/service_call_view__43__350__20230211T0514Z failed.
   2023/02/13 09:35:52.060 ERROR [SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel] [HelixTaskExecutor-message_handle_thread_27] Caught exception in state transition from OFFLINE -> ONLINE for resource: service_call_view_REALTIME, partition: service_call_view__43__350__20230211T0514Z
   2023/02/13 09:35:52.061 ERROR [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread_27] Exception while executing a state transition task service_call_view__43__350__20230211T0514Z
   2023/02/13 09:35:56.848 INFO [HelixTask] [HelixTaskExecutor-message_handle_thread_27] Message: 18143b00-938e-472e-b81e-1469b00a5d72 (parent: null) handling task for service_call_view_REALTIME:service_call_view__43__350__20230211T0514Z completed at: 1676280956848, results: false. FrameworkTime: 15 ms; HandlerTime: 8675 ms.
   ```
   
   As a result, the segment moves to the `ERROR` state and rebalancing keeps failing. Can we enable overwrites in this case?
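
To illustrate what "enabling overwrites" could mean, here is a minimal sketch that uses only `java.nio.file` as a stand-in for the download step. It is not Pinot's or the AWS SDK's code: the class `OverwriteSketch` and its methods are hypothetical names, and the strict copy merely reproduces the same `FileAlreadyExistsException` seen in the log when a stale local `.tar.gz` is already present.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class OverwriteSketch {

    // Stand-in for a download that refuses to overwrite: Files.copy without
    // options throws FileAlreadyExistsException when the target exists.
    public static void downloadStrict(InputStream body, Path target) throws IOException {
        Files.copy(body, target);
    }

    // Option A (assumption): remove any stale local copy, then run the strict download.
    public static void deleteThenDownload(InputStream body, Path target) throws IOException {
        Files.deleteIfExists(target);
        downloadStrict(body, target);
    }

    // Option B (assumption): let the write itself replace an existing file.
    public static void downloadWithOverwrite(InputStream body, Path target) throws IOException {
        Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("segment-download");
        Path target = dir.resolve("segment.tar.gz");

        // Simulate a leftover file from an earlier, failed attempt.
        Files.write(target, "stale".getBytes(StandardCharsets.UTF_8));

        byte[] fresh = "fresh".getBytes(StandardCharsets.UTF_8);
        deleteThenDownload(new ByteArrayInputStream(fresh), target);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

Option A fits a download API that cannot be told to overwrite, since the caller cleans up first. Option B is simpler when the write call itself accepts a replace flag. Whether either is safe here depends on no other thread using the same local path at the same time, which this sketch does not address.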


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #10273: AWS SDK java.nio.file.FileAlreadyExistsException During State Transition from OFFLINE -> ONLINE

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10273:
URL: https://github.com/apache/pinot/issues/10273#issuecomment-1428859367

   cc @snleee 

