Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2015/08/11 01:27:46 UTC
[jira] [Created] (SPARK-9804) "isSrcLocal" parameter in loadTable /
loadPartition is incorrect for HDFS source data
Marcelo Vanzin created SPARK-9804:
-------------------------------------
Summary: "isSrcLocal" parameter in loadTable / loadPartition is incorrect for HDFS source data
Key: SPARK-9804
URL: https://issues.apache.org/jira/browse/SPARK-9804
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.5.0
Reporter: Marcelo Vanzin
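The shims for Hive >= 0.14 hardcode the value of the {{isSrcLocal}} parameter to true when calling loadTable / loadPartition. If the source data is not actually local (for example, it lives on HDFS), Hive tries to copy it with {{copyFromLocalFile}} and fails with errors like this: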
{noformat}
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://vanzin-st1-1.vpc.cloudera.com:8020/user/hive/warehouse/spark_hive.db/src/.hive-staging_hive_2015-08-10_15-20-28_215_840551940044534110-1/-ext-10000/part-00000, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:648)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:529)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:340)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1908)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1876)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1841)
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2517)
at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2589)
at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1395)
at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1319)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.sql.hive.client.Shim_v0_14.loadPartition(HiveShim.scala:430)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$loadPartition$1.apply$mcV$sp(ClientWrapper.scala:473)
...
{noformat}
This can be triggered by running a query like the following:
{code}
INSERT INTO TABLE blah PARTITION(key=value) SELECT ...;
{code}
Here, "key=value" is a new partition being added to an existing table.
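
To illustrate the kind of check that could replace the hardcoded value, here is a minimal sketch that derives the flag from the source path's URI scheme. It is not the Spark or Hive code: the class {{SrcLocalCheck}}, its method, the {{defaultScheme}} parameter, and the example paths are all hypothetical, and it uses only {{java.net.URI}} rather than Hadoop's FileSystem API. It also assumes the path is a valid URI string (no unescaped spaces).

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Spark or Hive.
final class SrcLocalCheck {
    private SrcLocalCheck() {}

    /**
     * Returns true only when the path points at the local filesystem.
     *
     * defaultScheme is an assumption of this sketch: it stands in for the
     * scheme of the cluster's default filesystem, which is what a path
     * without an explicit scheme would resolve against.
     */
    static boolean isSrcLocal(String path, String defaultScheme) {
        String scheme = URI.create(path).getScheme();
        if (scheme == null) {
            // No explicit scheme: fall back to the default filesystem's scheme.
            scheme = defaultScheme;
        }
        return "file".equalsIgnoreCase(scheme);
    }

    public static void main(String[] args) {
        // Made-up paths, loosely modeled on the staging path in the trace above.
        String staging = "hdfs://namenode:8020/user/hive/warehouse/t/.hive-staging/part-00000";
        System.out.println(isSrcLocal(staging, "hdfs"));                  // false
        System.out.println(isSrcLocal("file:///tmp/part-00000", "hdfs")); // true
        System.out.println(isSrcLocal("/tmp/part-00000", "hdfs"));        // false
    }
}
```

With a check along these lines, a staging directory on HDFS would be passed with {{isSrcLocal}} set to false, so the load would not go through {{copyFromLocalFile}}. A real fix would more likely compare the filesystem the path resolves to against the local filesystem instead of comparing scheme strings.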