Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/03 07:44:37 UTC

[GitHub] [spark] Gaurangi94 commented on pull request #29630: [SPARK-32097] Enable Spark History Server to read from multiple directories

Gaurangi94 commented on pull request #29630:
URL: https://github.com/apache/spark/pull/29630#issuecomment-686316853


   > I'm not sure your PR really deals with reading from multiple directories. The change is listing -> glob with *. Could you please elaborate on what the difference is? The change also doesn't have any new unit tests verifying the changes.
   > 
   > As a general comment on the idea, having multiple root directories is still possible, but it is probably better to use just a static list (IMHO) instead of a regex, as listing with a glob pattern is known to be very slow.
   > 
   > One thing I'm afraid of with having multiple root directories is that SHS is already very complicated from a thread-safety point of view, even when we only allow a single root directory, and this may make things more complicated. I'm on the fence on doing this until we are clear that it won't make SHS more complicated.
   
   Thanks for your response! By multiple directories I meant that a regex could potentially match more than one directory. For an external file system, a glob pattern might be better, since we would have to make just one call over the network. Also, it will be easier for the user to specify just one setting instead of multiple values. What do you think?
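   
   To make the two options concrete, here is a minimal sketch of the difference between resolving log directories from one glob pattern and from a static list. It uses Python's standard library on a local temporary directory as a stand-in for the real file system; the function names and the `cluster-*` / `spark-events` layout are hypothetical and are not taken from the PR. On HDFS the equivalent lookup would presumably go through Hadoop's `FileSystem.globStatus`, which this sketch does not exercise.

```python
import glob
import os
import tempfile


def resolve_log_dirs_from_glob(pattern):
    """Return every existing directory matched by one glob pattern, sorted."""
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


def resolve_log_dirs_from_list(dirs):
    """Return the entries of an explicit list that exist as directories, sorted."""
    return sorted(d for d in dirs if os.path.isdir(d))


def build_demo_tree(root):
    """Create three hypothetical event-log directories under root."""
    names = ["cluster-a", "cluster-b", "other"]
    for name in names:
        os.makedirs(os.path.join(root, name, "spark-events"))
    return names


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_demo_tree(root)

        # One setting: a single pattern that can match several directories.
        pattern = os.path.join(root, "cluster-*", "spark-events")
        for d in resolve_log_dirs_from_glob(pattern):
            print("glob:", d)

        # Alternative: the user enumerates every directory explicitly.
        static = [
            os.path.join(root, "cluster-a", "spark-events"),
            os.path.join(root, "cluster-b", "spark-events"),
        ]
        for d in resolve_log_dirs_from_list(static):
            print("list:", d)
```

   Both approaches resolve to the same two directories here. The glob version needs only one user-facing value, while the static list avoids pattern expansion, which is the cost the reviewer is concerned about.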
   
   I will add the unit tests. Thanks for pointing that out.
   
   MHS will function only as a read-only server. Can thread-safety be an issue in that case?

