Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/04/24 14:57:19 UTC

[GitHub] [spark] squito commented on issue #24447: [SPARK-26089] Checking the shuffle transmitted data to handle large corrupt shuffle blocks

URL: https://github.com/apache/spark/pull/24447#issuecomment-486277895
 
 
   Hi @turboFei, thanks for posting this.  Have you looked at [SPARK-26089](https://issues.apache.org/jira/browse/SPARK-26089) yet?  That actually addresses two of your three concerns:
   
   * only check the first maxBytesInFlight/3 bytes.
   * need additional memory.
   
   I'm not saying there's no value here -- that change does not handle the third one, "only detect the compressed or wrapped data."  I've also always worried about whether it really makes sense to rely on the codec to detect corruption, so using a digest could also make sense.  But this is a large change, so the case should be made clearly.
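   
   (For anyone following along: here is a minimal, self-contained sketch of the digest idea, using a plain CRC32 over the transmitted block bytes.  This is not the code in either PR; the object and method names are made up for illustration, and where the digest would actually travel is exactly the design question the PR has to answer.)
   
   ```scala
   import java.util.zip.CRC32
   
   // Hypothetical sketch of digest-based shuffle corruption detection:
   // the map side computes a fixed-size checksum over the block bytes it
   // writes (already compressed/encrypted), and the reduce side recomputes
   // it over the fetched bytes, independently of what the codec reports.
   object ShuffleDigestSketch {
   
     // Writer side: digest the raw block bytes as they are written.
     def computeDigest(blockBytes: Array[Byte]): Long = {
       val crc = new CRC32()
       crc.update(blockBytes, 0, blockBytes.length)
       crc.getValue
     }
   
     // Fetch side: recompute over the fetched bytes and compare against the
     // digest shipped with the block (e.g. alongside its size -- hypothetical).
     def verify(fetchedBytes: Array[Byte], expectedDigest: Long): Boolean =
       computeDigest(fetchedBytes) == expectedDigest
   
     def main(args: Array[String]): Unit = {
       val written = Array.fill[Byte](1 << 20)(42.toByte)
       val digest  = computeDigest(written)
   
       // Simulate a single flipped byte late in the block, i.e. corruption
       // that a check over only the first part of the stream would miss.
       val fetched = written.clone()
       fetched(900000) = 7.toByte
   
       println(s"clean block verifies:   ${verify(written, digest)}") // true
       println(s"corrupt block verifies: ${verify(fetched, digest)}") // false
     }
   }
   ```
   
   The only point of the sketch is that a digest check is cheap and codec-independent; whether that is worth the extra per-block metadata is the trade-off the design doc should spell out.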
   
