Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:00:32 UTC

[jira] [Updated] (SPARK-19575) Reading from or writing to a hive serde table with a non pre-existing location should succeed

     [ https://issues.apache.org/jira/browse/SPARK-19575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-19575:
---------------------------------
    Labels: bulk-closed  (was: )

> Reading from or writing to a hive serde table with a non pre-existing location should succeed
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19575
>                 URL: https://issues.apache.org/jira/browse/SPARK-19575
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Song Jun
>            Priority: Major
>              Labels: bulk-closed
>
> Currently, selecting from a hive serde table with a non-pre-existing location throws an exception:
> ```
> Input path does not exist: file:/tmp/spark-37caa4e6-5a6a-4361-a905-06cc56afb274
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/tmp/spark-37caa4e6-5a6a-4361-a905-06cc56afb274
>         at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
>         at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:194)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:2080)
>         at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
>         at org.apache.spark.sql.QueryTest$.checkAnswer(QueryTest.scala:258)
> ```
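> A minimal repro sketch in the spark-shell (the table name and path are illustrative, not from the original report):
> ```
> // spark-shell with Hive support enabled
> spark.sql("CREATE TABLE t (a INT) STORED AS TEXTFILE")
> // point the table at a location that does not exist on the filesystem
> spark.sql("ALTER TABLE t SET LOCATION 'file:/tmp/non-existent-dir'")
> // currently throws org.apache.hadoop.mapred.InvalidInputException
> spark.sql("SELECT * FROM t").show()
> ```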
> This is follow-up work to SPARK-19329, which unified the behavior when reading from or writing to a datasource table with a non-pre-existing location; hive serde tables should be unified in the same way.
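> Under the unified behavior, the sketch above would be expected to behave roughly as follows (assumed semantics, mirroring the datasource-table fix in SPARK-19329):
> ```
> // reading from the missing location should return an empty result instead of failing
> spark.sql("SELECT * FROM t").show()    // expected: 0 rows
> // writing should create the location and succeed
> spark.sql("INSERT INTO t VALUES (1)")
> spark.sql("SELECT * FROM t").show()    // expected: one row, a = 1
> ```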



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org