Posted to commits@spark.apache.org by gu...@apache.org on 2020/12/28 05:08:41 UTC
[spark] branch master updated: [SPARK-33532][SQL] Add comments to a
unreachable branch in SpecificParquetRecordReaderBase.initialize method
This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new e6f0198 [SPARK-33532][SQL] Add comments to a unreachable branch in SpecificParquetRecordReaderBase.initialize method
e6f0198 is described below
commit e6f019836c099398542b443f7700f79de81da0d5
Author: yangjie01 <ya...@baidu.com>
AuthorDate: Mon Dec 28 14:07:50 2020 +0900
[SPARK-33532][SQL] Add comments to a unreachable branch in SpecificParquetRecordReaderBase.initialize method
### What changes were proposed in this pull request?
This PR mainly adds a comment for the `rowGroupOffsets != null` branch in `SpecificParquetRecordReaderBase.initialize(InputSplit, TaskAttemptContext)` to indicate that Spark's Parquet read path no longer enters this branch after SPARK-13883 and SPARK-13989. The branch is not deleted because PARQUET-131 wants to move `SpecificParquetRecordReaderBase` into the parquet-mr project.
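For readers unfamiliar with the method, the two branches can be modeled with a minimal, self-contained sketch. This is not the Spark source: `RowGroupSelectionSketch`, `Block`, and `selectBlocks` are hypothetical names, and the null branch simply keeps every row group instead of applying the range and filter logic of the real method. The non-null branch mirrors the offset-matching idea visible in the diff context (`Set<Long> offsets = new HashSet<>()`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RowGroupSelectionSketch {
    /** Hypothetical stand-in for a Parquet row group; only its starting offset matters here. */
    static final class Block {
        final long startingPos;
        Block(long startingPos) { this.startingPos = startingPos; }
    }

    /**
     * Simplified model of the branch structure (an assumption, not the real method body).
     * rowGroupOffsets == null: row groups are chosen on the task side (here: all are kept).
     * rowGroupOffsets != null: legacy path, keep only blocks whose offset was pre-selected.
     */
    static List<Block> selectBlocks(List<Block> footerBlocks, long[] rowGroupOffsets) {
        if (rowGroupOffsets == null) {
            // The path Spark takes today, since the split is built with null offsets.
            return new ArrayList<>(footerBlocks);
        }
        // The path this commit documents as no longer reached by Spark.
        Set<Long> offsets = new HashSet<>();
        for (long offset : rowGroupOffsets) {
            offsets.add(offset);
        }
        List<Block> selected = new ArrayList<>();
        for (Block block : footerBlocks) {
            if (offsets.contains(block.startingPos)) {
                selected.add(block);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Block> footer = Arrays.asList(new Block(4L), new Block(1024L), new Block(4096L));
        System.out.println(selectBlocks(footer, null).size());                // prints 3
        System.out.println(selectBlocks(footer, new long[] {1024L}).size());  // prints 1
    }
}
```

Because the callers named in the commit always pass null offsets, only the first branch of a method shaped like this would run in Spark; the second remains for a possible move to parquet-mr.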
### Why are the changes needed?
Add a useful comment.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Pass Jenkins or GitHub Actions.
Closes #30484 from LuciferYang/SPARK-33532.
Authored-by: yangjie01 <ya...@baidu.com>
Signed-off-by: HyukjinKwon <gu...@apache.org>
---
.../datasources/parquet/SpecificParquetRecordReaderBase.java | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java b/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java
index c975e52..be68880 100644
--- a/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java
+++ b/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java
@@ -107,6 +107,13 @@ public abstract class SpecificParquetRecordReaderBase<T> extends RecordReader<Vo
FilterCompat.Filter filter = getFilter(configuration);
blocks = filterRowGroups(filter, footer.getBlocks(), fileSchema);
} else {
+ // SPARK-33532: After SPARK-13883 and SPARK-13989, the parquet read process will
+ // no longer enter this branch because `ParquetInputSplit` only be constructed in
+ // `ParquetFileFormat.buildReaderWithPartitionValues` and
+ // `ParquetPartitionReaderFactory.buildReaderBase` method,
+ // and the `rowGroupOffsets` in `ParquetInputSplit` set to null explicitly.
+ // We didn't delete this branch because PARQUET-131 wanted to move this to the
+ // parquet-mr project.
// otherwise we find the row groups that were selected on the client
footer = readFooter(configuration, file, NO_FILTER);
Set<Long> offsets = new HashSet<>();