Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/10 12:20:55 UTC

[GitHub] [hudi] xushiyan commented on a diff in pull request #5282: [HUDI-3845] Fix delete mor table's partition with urlencode's error

xushiyan commented on code in PR #5282:
URL: https://github.com/apache/hudi/pull/5282#discussion_r846771115


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java:
##########
@@ -471,7 +470,8 @@ public void remove() {
   private static FSDataInputStream getFSDataInputStream(FileSystem fs,
                                                         HoodieLogFile logFile,
                                                         int bufferSize) throws IOException {
-    FSDataInputStream fsDataInputStream = fs.open(logFile.getPath(), bufferSize);
+    String escapePathName = PartitionPathEncodeUtils.unescapePathName(logFile.getPath().toString());
+    FSDataInputStream fsDataInputStream = fs.open(new Path(escapePathName), bufferSize);

Review Comment:
   This has a performance impact: the unescape examines every character of the path. In addition, calling `logFile.getPath()` constructs a `Path` object and L474 then constructs another one, which is costly. There should be a deeper fix for this: we shouldn't have to unescape every `logFile.getPath()` just to read the file.
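
To make the per-character cost concrete, here is a minimal sketch of a percent-decoding helper, assuming it behaves roughly like the `unescapePathName` call in the diff. The class `UnescapeSketch` and its method are hypothetical stand-ins written for illustration, not Hudi's `PartitionPathEncodeUtils`. The early return shows one way to skip the rebuild when a path contains no `%`; it only limits the overhead and is not the deeper fix asked for above.

```java
public class UnescapeSketch {

    // Hypothetical stand-in for a percent-decoding helper (not Hudi's code).
    static String unescapePathName(String path) {
        // Fast path: without a '%' there is nothing to decode, so the
        // same String instance is returned and nothing is allocated.
        int first = path.indexOf('%');
        if (first < 0) {
            return path;
        }

        // Slow path: copy the untouched prefix, then walk the remaining characters.
        StringBuilder sb = new StringBuilder(path.length());
        sb.append(path, 0, first);
        int i = first;
        while (i < path.length()) {
            char c = path.charAt(i);
            if (c == '%' && i + 2 < path.length()) {
                int hi = Character.digit(path.charAt(i + 1), 16);
                int lo = Character.digit(path.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    // Valid "%XX" escape: emit the decoded character.
                    sb.append((char) (hi * 16 + lo));
                    i += 3;
                    continue;
                }
            }
            // Not a valid escape: keep the character as-is.
            sb.append(c);
            i++;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescapePathName("dt=2022%2F04%2F10"));
        System.out.println(unescapePathName("plain/partition/path"));
    }
}
```

Even with the early return, every call still scans the string once with `indexOf`, and the caller in the diff still builds a second `Path`, so the cost is reduced rather than removed.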


