Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/03 05:30:34 UTC

[GitHub] [spark] HeartSaVioR commented on pull request #29630: [SPARK-32097] Enable Spark History Server to read from multiple directories

HeartSaVioR commented on pull request #29630:
URL: https://github.com/apache/spark/pull/29630#issuecomment-686262943


   I'm not sure this PR really deals with reading from multiple directories. The change replaces the directory listing with a glob using *. Could you please elaborate on what the difference is? The change also doesn't include any new unit tests verifying it.
   
   As a general comment on the idea: supporting multiple root directories is still possible, but it would probably be better as a static list (IMHO) rather than a regex, since listing with a glob pattern is known to be very slow.
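   
   To illustrate the distinction, here is a rough sketch in plain Python on a local temporary directory. It is not Spark or SHS code; the function names and the team-a/team-b layout are made up for illustration. The static variant issues one listing per configured root, while the glob variant has to expand the wildcard by scanning the parent and matching each entry.

```python
import os
import tempfile
from glob import glob


def list_logs_static(roots):
    """List entries directly under each explicitly configured root (hypothetical helper)."""
    entries = []
    for root in roots:
        # One listing call per configured directory; no pattern matching involved.
        for name in sorted(os.listdir(root)):
            entries.append(os.path.join(root, name))
    return entries


def list_logs_glob(pattern):
    """Expand a wildcard pattern (hypothetical helper); parents are scanned and matched."""
    return sorted(glob(pattern))


with tempfile.TemporaryDirectory() as base:
    # Hypothetical layout: two root directories, each holding one event log.
    for team in ("team-a", "team-b"):
        os.makedirs(os.path.join(base, team))
        open(os.path.join(base, team, "app-1"), "w").close()

    roots = [os.path.join(base, "team-a"), os.path.join(base, "team-b")]
    static_result = list_logs_static(roots)
    glob_result = list_logs_glob(os.path.join(base, "*", "*"))
```

   Both variants find the same entries here; the difference is in how much of the file system has to be walked to get there, which matters far more on remote storage than in this local toy setup.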
   
   One thing I'm afraid of with multiple root directories is that SHS is already very complicated from a thread-safety point of view, even though we only allow a single root directory, and this may make things more complicated. I'm on the fence about doing this until we are clear that it won't make SHS more complicated.

