Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/02/12 13:24:56 UTC

[GitHub] gaborgsomogyi opened a new pull request #23764: [SPARK-26825][SS] Fix temp checkpoint creation in cluster mode when default filesystem is not local.

URL: https://github.com/apache/spark/pull/23764
 
 
   ## What changes were proposed in this pull request?
   
   There are situations where a temporary checkpoint directory is created by Spark, for example when one uses the console sink. In such cases the current implementation of `StreamingQueryManager` creates the directory with `Utils.createTempDir` and passes it to the appropriate `StreamExecution`. `StreamExecution` then does the following:
   * Creates the directory again
   * Resolves the provided directory
   
   The problem arises when the directory is resolved. The path provided by `StreamingQueryManager` doesn't contain the `file://` scheme, so resolution can switch from the local filesystem to another one, for example HDFS (when HDFS is the default filesystem).
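
To illustrate the resolution problem, here is a minimal sketch using only `java.net.URI` from the JDK. It is not Spark's or Hadoop's actual code path. The `qualify` helper, the `hdfs://namenode:8020/` default filesystem, and the `/tmp/temporary-1234` path are all assumptions made up for the example. It only shows the general rule that a path without a scheme takes the scheme of whatever it is resolved against, while a path with an explicit `file://` scheme stays local.

```java
import java.net.URI;

class SchemeResolutionSketch {
    // Hypothetical stand-in for path qualification: a path without a scheme
    // is interpreted relative to the default filesystem URI.
    static URI qualify(URI defaultFs, String path) {
        return defaultFs.resolve(path);
    }

    public static void main(String[] args) {
        // Assumed default filesystem for the example (not taken from the PR).
        URI defaultFs = URI.create("hdfs://namenode:8020/");

        // A bare local path, similar in shape to what a temp-dir helper returns.
        URI bare = qualify(defaultFs, "/tmp/temporary-1234");
        // The same path with an explicit file:// scheme.
        URI withScheme = qualify(defaultFs, "file:///tmp/temporary-1234");

        System.out.println(bare.getScheme());       // hdfs
        System.out.println(withScheme.getScheme()); // file
    }
}
```

In this sketch the bare path ends up on the default filesystem, which is the kind of silent switch described above, while the path carrying its own scheme is left untouched.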
   
   In this PR I've made the following changes:
   * The directory is created only in `StreamExecution`
   * The `file://` scheme is added to the directory path
   * Previously it was not clear when the checkpoint directory could not be created because of permission issues, so I've added an exception that is thrown when the checkpoint directory doesn't exist and cannot be created
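
The last change can be sketched roughly as follows. This is an illustration written against `java.nio.file` rather than the Hadoop `FileSystem` API that Spark uses, and the `ensureDir` helper, the exception type, and the message text are assumptions for the example, not the code in this PR. The idea is simply to fail loudly when the directory is missing and cannot be created, instead of continuing with a directory that does not exist.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class CheckpointDirSketch {
    // Hypothetical helper: return the directory if it exists or can be
    // created, otherwise throw with a message naming the path.
    static Path ensureDir(Path dir) {
        if (Files.isDirectory(dir)) {
            return dir;
        }
        try {
            return Files.createDirectories(dir);
        } catch (IOException e) {
            // Permission problems and similar failures surface here
            // instead of being silently ignored.
            throw new IllegalStateException(
                "Failed to create checkpoint directory " + dir, e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("checkpoint-sketch");

        // Normal case: the directory is created and returned.
        Path created = ensureDir(base.resolve("checkpoint"));
        System.out.println(Files.isDirectory(created));

        // Failure case: a regular file is in the way, so creation fails.
        Path blocker = Files.createFile(base.resolve("blocker"));
        try {
            ensureDir(blocker.resolve("checkpoint"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The failure case uses a regular file as the parent because it fails regardless of the user running the example, but a directory without write permission would take the same path through the `catch` block.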
   
   ## How was this patch tested?
   
   Existing unit tests + started a query in client/cluster mode.
   

