Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:22:05 UTC
[jira] [Updated] (SPARK-17550) DataFrameWriter.partitionBy() should throw exception if column is not present in Dataframe
[ https://issues.apache.org/jira/browse/SPARK-17550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-17550:
---------------------------------
Labels: bulk-closed (was: )
> DataFrameWriter.partitionBy() should throw exception if column is not present in Dataframe
> ------------------------------------------------------------------------------------------
>
> Key: SPARK-17550
> URL: https://issues.apache.org/jira/browse/SPARK-17550
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Reporter: Aniket Kulkarni
> Priority: Minor
> Labels: bulk-closed
>
> I have a Spark job that performs computations on event data and eventually persists the results to Hive.
> I was trying to write to Hive using the code snippet shown below:
> dataframe.write.format("orc").partitionBy(col1,col2).options(options).mode(SaveMode.Append).saveAsTable(hiveTable)
> The write to Hive failed because col2 in the above example was not present in the DataFrame. This was tedious to debug because no exception or error message showed up in the logs; all I saw were repeated executor-lost failures.
> I think an exception should be thrown when one tries to write to Hive with a partitioning column that does not exist in the DataFrame.
> If this is indeed something that needs to be fixed, I would like to volunteer to fix it in the spark-core code base.
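
As an illustration of the check requested above, here is a minimal sketch of a pre-write validation that fails fast when a partition column is missing. It is an assumption about how a caller could guard against the problem, not Spark's actual implementation or the fix for this issue. The object PartitionColumnCheck and both of its methods are hypothetical names. Matching is case-insensitive by default, on the assumption that the session uses Spark's default setting spark.sql.caseSensitive=false.

```scala
// Hypothetical helper, not part of Spark's API: validates partition column
// names against a DataFrame's column names before a write is attempted.
object PartitionColumnCheck {

  /** Returns the requested partition columns that are absent from the schema. */
  def missingPartitionColumns(
      schemaColumns: Seq[String],
      partitionColumns: Seq[String],
      caseSensitive: Boolean = false): Seq[String] = {
    // Normalize names so that "Col1" and "col1" match unless caseSensitive is set.
    def normalize(name: String): String =
      if (caseSensitive) name else name.toLowerCase(java.util.Locale.ROOT)

    val known = schemaColumns.map(normalize).toSet
    partitionColumns.filterNot(c => known.contains(normalize(c)))
  }

  /** Throws IllegalArgumentException if any partition column is missing. */
  def requirePartitionColumns(
      schemaColumns: Seq[String],
      partitionColumns: Seq[String]): Unit = {
    val missing = missingPartitionColumns(schemaColumns, partitionColumns)
    require(
      missing.isEmpty,
      s"Partition column(s) not found in DataFrame: ${missing.mkString(", ")}; " +
        s"available columns: ${schemaColumns.mkString(", ")}")
  }
}

// Assumed usage before the write from the report
// (dataframe, col1 and col2 are the reporter's names):
//   PartitionColumnCheck.requirePartitionColumns(dataframe.columns.toSeq, Seq(col1, col2))
```

With a guard like this, the job would stop on the driver with a message naming the missing column, rather than surfacing only as executor failures.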
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)