Posted to issues@spark.apache.org by "Sean R. Owen (Jira)" <ji...@apache.org> on 2022/04/16 20:26:00 UTC

[jira] [Commented] (SPARK-38536) Spark 3 can not read mixed format partitions

    [ https://issues.apache.org/jira/browse/SPARK-38536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523208#comment-17523208 ] 

Sean R. Owen commented on SPARK-38536:
--------------------------------------

3.0.x is EOL; 3.1.x is not, but I'm not sure a backport is worth it, given that this scenario seems pretty uncommon and that 3.1.x is almost EOL. People should be using 3.2.x or, shortly, 3.3.x.

> Spark 3 can not read mixed format partitions
> --------------------------------------------
>
>                 Key: SPARK-38536
>                 URL: https://issues.apache.org/jira/browse/SPARK-38536
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0, 3.2.1
>            Reporter: Huicheng Song
>            Priority: Major
>
> Spark 3.x reads partitions with the table's input format, which fails when a partition has a different input format than the table.
> This is a regression introduced by SPARK-26630. Before that fix, Spark used the partition's InputFormat when creating a HadoopRDD. With that fix, Spark uses only the table's InputFormat when creating a HadoopRDD, causing failures.
> Reading mixed-format partitions is an important scenario, especially for format migration. It is also well supported in query engines like Hive and Presto.



