Posted to issues@spark.apache.org by "Yang Jie (Jira)" <ji...@apache.org> on 2019/10/14 03:20:00 UTC

[jira] [Created] (SPARK-29454) Reduce one unsafeProjection call when read parquet file

Yang Jie created SPARK-29454:
--------------------------------

             Summary: Reduce one unsafeProjection call when read parquet file
                 Key: SPARK-29454
                 URL: https://issues.apache.org/jira/browse/SPARK-29454
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.4, 2.3.4, 2.2.3
            Reporter: Yang Jie


ParquetGroupConverter calls the unsafeProjection function to convert a SpecificInternalRow to an UnsafeRow for every record when a Parquet data file is read with ParquetRecordReader. When partitionSchema is not empty, ParquetFileFormat then calls an unsafeProjection function again to convert this UnsafeRow to another UnsafeRow. Likewise, when DataSourceV2 is used, PartitionReaderWithPartitionValues always performs this second conversion.

I think the first conversion in ParquetGroupConverter is redundant, and it is enough for ParquetRecordReader to return a SpecificInternalRow.
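
To make the cost concrete, here is a minimal, self-contained sketch of the two read paths. It is a toy model and not Spark code: ProjectionSketch, CountingProjection, join, readCurrent and readProposed are all hypothetical names, rows are plain Object[] arrays, and the "projection" is only a counted array copy standing in for an unsafe projection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ProjectionSketch {

    /** Hypothetical stand-in for an unsafe projection: copies the row and counts calls. */
    static final class CountingProjection {
        int calls = 0;

        Object[] apply(Object[] row) {
            calls++;
            return Arrays.copyOf(row, row.length);
        }
    }

    /** Appends partition values to a data row (this toy version copies for simplicity). */
    static Object[] join(Object[] data, Object[] partitionValues) {
        Object[] joined = Arrays.copyOf(data, data.length + partitionValues.length);
        System.arraycopy(partitionValues, 0, joined, data.length, partitionValues.length);
        return joined;
    }

    /** Path described in the issue: project in the converter, then project again after the join. */
    static List<Object[]> readCurrent(List<Object[]> records, Object[] partitionValues,
                                      CountingProjection projection) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] mutableRow : records) {
            Object[] converted = projection.apply(mutableRow);            // first projection
            out.add(projection.apply(join(converted, partitionValues))); // second projection
        }
        return out;
    }

    /** Proposed path: hand the mutable row straight to the join and project only once. */
    static List<Object[]> readProposed(List<Object[]> records, Object[] partitionValues,
                                       CountingProjection projection) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] mutableRow : records) {
            out.add(projection.apply(join(mutableRow, partitionValues))); // only projection
        }
        return out;
    }

    public static void main(String[] args) {
        List<Object[]> records = new ArrayList<>();
        records.add(new Object[] {1, "a"});
        records.add(new Object[] {2, "b"});
        records.add(new Object[] {3, "c"});
        Object[] partitionValues = new Object[] {"2019-10-14"};

        CountingProjection current = new CountingProjection();
        CountingProjection proposed = new CountingProjection();
        List<Object[]> a = readCurrent(records, partitionValues, current);
        List<Object[]> b = readProposed(records, partitionValues, proposed);

        System.out.println("current path projection calls:  " + current.calls);
        System.out.println("proposed path projection calls: " + proposed.calls);
        System.out.println("same output rows: " + Arrays.deepEquals(a.toArray(), b.toArray()));
    }
}
```

In this model both paths produce identical rows, while the current path performs two projection calls per record and the proposed path performs one. How much that saves in Spark itself would need to be measured with a benchmark.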



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
