Posted to issues@spark.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2023/08/21 11:05:00 UTC

[jira] [Resolved] (SPARK-44883) Spark insertInto with location GCS bucket root causes NPE

     [ https://issues.apache.org/jira/browse/SPARK-44883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved SPARK-44883.
------------------------------------
    Resolution: Duplicate

> Spark insertInto with location GCS bucket root causes NPE
> ---------------------------------------------------------
>
>                 Key: SPARK-44883
>                 URL: https://issues.apache.org/jira/browse/SPARK-44883
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: Dipayan Dev
>            Priority: Minor
>
> In our organisation, the location of one of our Hive tables points to the root of a GCS bucket. The latest Dataproc image, 2.1, ships *Spark 3.3.0*, so this needs to be fixed.
> Spark Scala code to reproduce this issue:
> {noformat}
> val DF = Seq(("test1", 123)).toDF("name", "num")
> DF.write.option("path", "gs://test_dd123/").mode(SaveMode.Overwrite).partitionBy("num").format("orc").saveAsTable("schema_name.table_name")
> val DF1 = Seq(("test2", 125)).toDF("name", "num")
> DF1.write.mode(SaveMode.Overwrite).format("orc").insertInto("schema_name.table_name")
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.<init>(Path.java:141)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:120)
>   at org.apache.hadoop.fs.Path.suffix(Path.java:441)
>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.$anonfun$getCustomPartitionLocations$1(InsertIntoHadoopFsRelationCommand.scala:254) {noformat}
> The issue appears to come from Hadoop's Path class:
> {noformat}
> scala> import org.apache.hadoop.fs.Path
> import org.apache.hadoop.fs.Path
> scala> val path: Path = new Path("gs://test_dd123/")
> path: org.apache.hadoop.fs.Path = gs://test_dd123/
> scala> path.suffix("/num=123")
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.<init>(Path.java:150)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:129)
>   at org.apache.hadoop.fs.Path.suffix(Path.java:450){noformat}
> Path.suffix throws an NPE when called on the root of a GCS bucket.
>  
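The trace above is consistent with how Path.suffix appears to be implemented in Hadoop: it builds new Path(getParent(), getName() + suffix), and getParent() returns null for a bucket root, so the parent/child constructor dereferences a null parent. A minimal sketch of a caller-side way to avoid that call follows. It assumes the Hadoop client jars are on the classpath; partitionPath is a hypothetical helper for illustration, not part of Spark or Hadoop, and this is not necessarily how the duplicate issue resolves the problem.

```scala
import org.apache.hadoop.fs.Path

// Hypothetical helper (not a Spark or Hadoop API): build a child path with the
// (parent, child) constructor instead of Path.suffix, so the parent of `base`
// is never looked up.
def partitionPath(base: Path, partitionDir: String): Path =
  // Drop a leading "/" so the child resolves relative to `base` rather than
  // replacing its path component.
  new Path(base, partitionDir.stripPrefix("/"))

val root = new Path("gs://test_dd123/")

// A bucket root has no parent, which is what Path.suffix trips over.
assert(root.getParent == null)

// Same input that failed in the report, routed through the helper.
val partition = partitionPath(root, "/num=123")
println(partition)
```

The same helper also works for a non-root table location such as gs://test_dd123/some_table, since it only ever resolves the child against the base path.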



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
