Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/23 19:14:10 UTC

[GitHub] [spark] steveloughran edited a comment on pull request #35569: [SPARK-38250][CORE] Check existence before deleting stagingDir in HadoopMapReduceCommitProtocol

steveloughran edited a comment on pull request #35569:
URL: https://github.com/apache/spark/pull/35569#issuecomment-1049106381


   * Alluxio shouldn't be complaining that the file isn't there. delete(path) must return true, as the requirement "path is not present when we return" is met.
   * Removing checks before delete() saves one round trip when working with object stores.
   * Valid point about namenode lock overheads, but not something I personally worry too much about. A lock of some form may be needed for the exists probe too, and you've now got two RPCs. If the path you wanted to delete was missing most of the times you made the call, then maybe the check could be justified; otherwise it adds one call per operation.
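
   To make the round-trip argument concrete, here is a minimal sketch. It does not use the Hadoop FileSystem API: `CountingStore` and `DeleteRoundTrips` are hypothetical names for an in-memory stand-in that counts one round trip per call, and its `delete` returns nothing so that only the call count is modelled.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in for a remote store; each call counts as one round trip. */
final class CountingStore {
    private final Set<String> paths = new HashSet<>();
    private int roundTrips = 0;

    /** Test setup helper; deliberately not counted as a round trip. */
    void create(String path) {
        paths.add(path);
    }

    boolean exists(String path) {
        roundTrips++;
        return paths.contains(path);
    }

    /** Deleting an absent path is not an error; the path is gone afterwards either way. */
    void delete(String path) {
        roundTrips++;
        paths.remove(path);
    }

    int roundTrips() {
        return roundTrips;
    }
}

final class DeleteRoundTrips {
    /** Probe first, then delete: two calls whenever the path is present. */
    static int checkThenDelete(CountingStore store, String path) {
        int before = store.roundTrips();
        if (store.exists(path)) {
            store.delete(path);
        }
        return store.roundTrips() - before;
    }

    /** Delete unconditionally: always exactly one call. */
    static int deleteOnly(CountingStore store, String path) {
        int before = store.roundTrips();
        store.delete(path);
        return store.roundTrips() - before;
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        store.create("/staging");
        System.out.println("check-then-delete, path present: "
                + checkThenDelete(store, "/staging"));
        store.create("/staging");
        System.out.println("delete only, path present: "
                + deleteOnly(store, "/staging"));
        System.out.println("check-then-delete, path absent: "
                + checkThenDelete(store, "/staging"));
    }
}
```

   In this model the probe only breaks even when the path is absent (one call either way); whenever the path exists it costs an extra call.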
   
   Overall then, -1 to this, though it does sound like Alluxio is overreacting.
   
   >  why doesn't fs.delete check the existence on the server side, I think the following ideas might be related.
   
   It doesn't consider deleting a nonexistent path to be an error. After the call finishes, the path you passed in isn't there, which is the outcome it is trying to offer.
   
   > Similarly, for users of FileSystem, maybe some FileSystems do check the existence before deleting like Alluxio, but, IMHO, we can't ask all the FileSystem to do the same, it's better to do the check on users' side.
   
   Why?
    
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


