Posted to dev@spark.apache.org by GitBox <gi...@apache.org> on 2022/04/04 00:20:51 UTC

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #35979: [SPARK-38664][CORE] Support compact EventLog when there are illegal characters in the path

HyukjinKwon commented on code in PR #35979:
URL: https://github.com/apache/spark/pull/35979#discussion_r841300801


##########
core/src/main/scala/org/apache/spark/deploy/history/EventLogFileCompactor.scala:
##########
@@ -221,5 +221,5 @@ private class CompactedEventLogFileWriter(
     hadoopConf: Configuration)
   extends SingleEventLogFileWriter(appId, appAttemptId, logBaseDir, sparkConf, hadoopConf) {
 
-  override val logPath: String = originalFilePath.toUri.toString + EventLogFileWriter.COMPACTED

Review Comment:
   Is it because the string representation of the path can omit the URI scheme, or because it does not percent-encode the special characters? In general, we should always use Hadoop's `Path` when working on Hadoop paths so that `fs.default.name` is respected.
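   A minimal sketch of the encoding concern, using only `java.net.URI` as a stand-in for Hadoop's `Path` (the host `namenode`, the sample path, and the `.compact` suffix here are hypothetical, chosen to mirror `EventLogFileWriter.COMPACTED`). Appending a suffix to the URI's string form operates on the percent-encoded representation, while appending to the raw path does not, so the two results diverge whenever the path contains illegal characters such as spaces:

   ```java
   import java.net.URI;
   import java.net.URISyntaxException;

   public class UriSuffix {
       // Builds a URI from a raw path and appends a suffix to its string form.
       // The multi-argument URI constructor percent-encodes illegal characters,
       // so the suffix lands on the encoded representation.
       public static String viaUriString(String rawPath, String suffix) {
           try {
               URI uri = new URI("hdfs", "namenode", rawPath, null);
               return uri.toString() + suffix;
           } catch (URISyntaxException e) {
               throw new IllegalArgumentException(e);
           }
       }

       // Appends the suffix directly to the raw, unencoded path.
       public static String viaRawPath(String rawPath, String suffix) {
           return rawPath + suffix;
       }

       public static void main(String[] args) {
           String raw = "/logs/app 1"; // hypothetical path containing a space
           System.out.println(viaUriString(raw, ".compact")); // hdfs://namenode/logs/app%201.compact
           System.out.println(viaRawPath(raw, ".compact"));   // /logs/app 1.compact
       }
   }
   ```

   Hadoop's `Path` hides this distinction behind one canonical representation, which is why delegating to it is safer than manipulating the strings in Spark.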
   
   The only exception would be when this path is stored somewhere external. In that case, the URI has to be stored fully qualified so that a different `fs.default.name` does not affect the original path.
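   The fully-qualified point can be sketched with plain `java.net.URI` resolution (the cluster names `clusterA`/`clusterB` and the path are hypothetical; `resolve` here plays the role of qualifying a path against the default filesystem). A scheme-less path takes on whatever default filesystem resolves it, while a fully-qualified URI is unaffected:

   ```java
   import java.net.URI;

   public class QualifiedUri {
       // Resolves a stored path against a default filesystem URI, loosely
       // mimicking how a scheme-less Hadoop path is qualified.
       public static URI qualify(URI defaultFs, URI storedPath) {
           return defaultFs.resolve(storedPath);
       }

       public static void main(String[] args) {
           URI clusterA = URI.create("hdfs://clusterA/");
           URI clusterB = URI.create("hdfs://clusterB/");

           // A scheme-less path picks up whichever default fs resolves it:
           URI relative = URI.create("/logs/app1");
           System.out.println(qualify(clusterA, relative)); // hdfs://clusterA/logs/app1
           System.out.println(qualify(clusterB, relative)); // hdfs://clusterB/logs/app1

           // A fully-qualified URI stays pinned to its original filesystem:
           URI qualified = URI.create("hdfs://clusterA/logs/app1");
           System.out.println(qualify(clusterB, qualified)); // hdfs://clusterA/logs/app1
       }
   }
   ```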
   
   This path/URI handling has many subtle holes, so we just delegate to Hadoop's `Path` implementation and leverage their bug fixes.
   
   I am fine if this change does not handle all of the cases, but I wanted to point out that this direction is better than handling it manually in Spark.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
