Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/08/13 23:40:14 UTC

[GitHub] [incubator-druid] clintropolis commented on issue #6573: Druid tasks fail occasionally on Azure Storage

clintropolis commented on issue #6573: Druid tasks fail occasionally on Azure Storage
URL: https://github.com/apache/incubator-druid/issues/6573#issuecomment-521048760
 
 
   Hi, @fdoumet 
   
   First off, apologies for letting your issue go stale. We get quite a lot of them and sometimes miss some, so thanks for chasing away the stale bot 😅. On the bright side, it made me notice this issue. While I don't have access to an Azure environment to actually test this, I was able to do a bit of research and have an idea of how we might improve the situation.
   
   Looking into the error message in the issue report, specifically
   
   ```The specified block list is invalid.```
   
   I ran into this [Stack Overflow discussion](https://stackoverflow.com/questions/12917857/the-specified-block-list-is-invalid-while-uploading-blobs-in-parallel), and these two issue reports for a .NET library, https://github.com/Azure/azure-storage-net/issues/456 and https://github.com/Azure/azure-storage-net/issues/780, seem to indicate that this error message can happen when multiple writers try to write to the same blob, and that a potential solution might be to retry the operation.
   
   The good news is that the Azure storage pusher does have a retry mechanism, controlled by `druid.azure.maxTries`, which defaults to `3`. In this case, however, the failure surfaces as an `IOException`, which was not internally considered retryable. The `StorageException` that is the cause of that `IOException` is considered eligible for retry, so I've opened a PR, #8296, that adjusts the retry logic used by the push implementation so that it now retries on error messages like the one you reported.
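
   To illustrate the general idea only, here is a minimal, standalone sketch of a retry helper that treats an exception as retryable when a storage-level exception appears anywhere in its cause chain. This is not the code from the PR and does not use Druid's or Azure's classes: `RetrySketch`, `isRetryable`, `retry`, and the local `StorageException` stand-in are all hypothetical names, and the `maxTries` argument merely mirrors the `druid.azure.maxTries` setting.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetrySketch {
    // Hypothetical stand-in for the storage client's exception type,
    // defined locally so the example has no external dependencies.
    public static class StorageException extends Exception {
        public StorageException(String message) {
            super(message);
        }
    }

    // Walk the cause chain: an IOException that wraps a StorageException
    // is treated as retryable, a bare IOException is not.
    public static boolean isRetryable(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof StorageException) {
                return true;
            }
        }
        return false;
    }

    // Run the task up to maxTries times, rethrowing immediately when the
    // failure is not retryable or the attempts are exhausted.
    public static <T> T retry(Callable<T> task, int maxTries) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (attempt >= maxTries || !isRetryable(e)) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated push: fails twice with a wrapped storage error, then succeeds.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IOException(
                    new StorageException("The specified block list is invalid."));
            }
            return "pushed";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

   A real implementation would likely also wait between attempts (for example with a backoff); that is left out here to keep the sketch short.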
   
   I don't know if this will actually fix the issue, and unfortunately I am unable to test it, but with any luck it will do the trick 🤞.
   
   
   
   
