Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/12/02 00:41:00 UTC

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6345: Spark 3.3: Choose readers based on task types

aokolnychyi commented on code in PR #6345:
URL: https://github.com/apache/iceberg/pull/6345#discussion_r1037687769


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/BatchDataReader.java:
##########
@@ -28,21 +28,48 @@
 import org.apache.iceberg.io.CloseableIterator;
 import org.apache.iceberg.io.InputFile;
 import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
+import org.apache.iceberg.spark.source.metrics.TaskNumDeletes;
+import org.apache.iceberg.spark.source.metrics.TaskNumSplits;
 import org.apache.spark.rdd.InputFileBlockHolder;
+import org.apache.spark.sql.connector.metric.CustomTaskMetric;
+import org.apache.spark.sql.connector.read.PartitionReader;
 import org.apache.spark.sql.vectorized.ColumnarBatch;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-class BatchDataReader extends BaseBatchReader<FileScanTask> {
+class BatchDataReader extends BaseBatchReader<FileScanTask>

Review Comment:
   This class was used only as a `PartitionReader` in `SparkScan`, where we extended it, implemented `PartitionReader`, and named the resulting subclass `BatchReader`. Now that there is a common reader factory, we may have multiple batch readers, which is why `BatchDataReader` seemed like a more accurate name than `BatchReader`. Since nothing else used this class, I decided to implement `PartitionReader` directly here.
   
   Any feedback is welcome. See `SparkScan` below for the old usage.
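
For illustration only, the shape of the change can be sketched roughly as follows. This is not the code from the PR: `SimplePartitionReader` is a hypothetical, simplified stand-in for Spark's `PartitionReader<T>`, the `Old...`/`New...` class names and the `String` "batches" are made up, and the real readers work on `ColumnarBatch` and file scan tasks.

```java
import java.io.Closeable;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ReaderSketch {

  // Hypothetical, simplified stand-in for Spark's PartitionReader<T>.
  interface SimplePartitionReader<T> extends Closeable {
    boolean next();

    T get();

    @Override
    void close(); // narrowed to throw no checked exception, to keep the sketch short
  }

  // Old shape: the reader has the right methods but does not declare the interface.
  static class OldBatchDataReader {
    private final Iterator<String> batches;
    private String current;

    OldBatchDataReader(List<String> batches) {
      this.batches = batches.iterator();
    }

    public boolean next() {
      if (batches.hasNext()) {
        current = batches.next();
        return true;
      }
      return false;
    }

    public String get() {
      return current;
    }

    public void close() {
      current = null;
    }
  }

  // Old shape, continued: the scan class adds the interface via an otherwise empty subclass.
  static class OldScan {
    static class BatchReader extends OldBatchDataReader
        implements SimplePartitionReader<String> {
      BatchReader(List<String> batches) {
        super(batches);
      }
    }
  }

  // New shape: the reader implements the interface itself, so no adapter subclass is needed.
  static class NewBatchDataReader implements SimplePartitionReader<String> {
    private final Iterator<String> batches;
    private String current;

    NewBatchDataReader(List<String> batches) {
      this.batches = batches.iterator();
    }

    @Override
    public boolean next() {
      if (batches.hasNext()) {
        current = batches.next();
        return true;
      }
      return false;
    }

    @Override
    public String get() {
      return current;
    }

    @Override
    public void close() {
      current = null;
    }
  }

  // Reads every element from a reader, joins them with commas, and closes the reader.
  static String drain(SimplePartitionReader<String> reader) {
    StringBuilder out = new StringBuilder();
    try {
      while (reader.next()) {
        if (out.length() > 0) {
          out.append(',');
        }
        out.append(reader.get());
      }
    } finally {
      reader.close();
    }
    return out.toString();
  }

  public static void main(String[] args) {
    List<String> batches = Arrays.asList("batch-0", "batch-1");
    System.out.println(drain(new OldScan.BatchReader(batches)));
    System.out.println(drain(new NewBatchDataReader(batches)));
  }
}
```

Both variants behave the same from the caller's side; the second simply drops the adapter subclass, which is reasonable when the scan class is the only consumer.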



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


For additional commands, e-mail: issues-help@iceberg.apache.org