Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2015/08/11 01:27:46 UTC
[jira] [Created] (SPARK-9804) "isSrcLocal" parameter in loadTable / loadPartition is incorrect for HDFS source data

Marcelo Vanzin created SPARK-9804:
-------------------------------------

             Summary: "isSrcLocal" parameter in loadTable / loadPartition is incorrect for HDFS source data
                 Key: SPARK-9804
                 URL: https://issues.apache.org/jira/browse/SPARK-9804
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.5.0
            Reporter: Marcelo Vanzin


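The shims for Hive >= 0.14 hardcode the value of the {{isSrcLocal}} parameter to {{true}}. If the source data is not actually local (for example, when it lives on HDFS), Hive still tries to copy it from the local file system and fails with errors like this: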

{noformat}
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://vanzin-st1-1.vpc.cloudera.com:8020/user/hive/warehouse/spark_hive.db/src/.hive-staging_hive_2015-08-10_15-20-28_215_840551940044534110-1/-ext-10000/part-00000, expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:648)
        at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:529)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:340)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1908)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1876)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1841)
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2517)
        at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2589)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1395)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1319)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.sql.hive.client.Shim_v0_14.loadPartition(HiveShim.scala:430)
        at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$loadPartition$1.apply$mcV$sp(ClientWrapper.scala:473)
...
{noformat}

This can be triggered by running a query like the following:

{code}
INSERT INTO TABLE blah PARTITION(key=value) SELECT ...;
{code}

Here, "key=value" is a new partition being added to an existing table.
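
One possible way to avoid hardcoding the flag is to derive it from the file system of the source path. The following is only a sketch in plain Java, not the actual shim code: the class {{SrcLocalCheck}}, the helper's signature, and the {{defaultFs}} argument (standing in for Hadoop's {{fs.defaultFS}} setting) are hypothetical, and it compares URI schemes instead of calling the Hadoop FileSystem API.

```java
import java.net.URI;

// Hypothetical helper, for illustration only; not part of Spark or Hive.
final class SrcLocalCheck {

    // Returns true only when the source path resolves to the local file system.
    // Paths without a scheme are resolved against defaultFs, which is an
    // assumption standing in for Hadoop's fs.defaultFS setting.
    static boolean isSrcLocal(String sourcePath, String defaultFs) {
        String scheme = URI.create(sourcePath).getScheme();
        if (scheme == null) {
            scheme = URI.create(defaultFs).getScheme();
        }
        return "file".equalsIgnoreCase(scheme);
    }

    public static void main(String[] args) {
        // A staging path on HDFS is not local, so the flag should be false.
        System.out.println(isSrcLocal(
            "hdfs://namenode:8020/user/hive/warehouse/t/part-00000", "file:///"));
        // An explicit file:// path is local, so the flag should be true.
        System.out.println(isSrcLocal("file:///tmp/staging/part-00000", "file:///"));
    }
}
```

With a check along these lines, the value passed to loadTable / loadPartition would follow the actual location of the staged data instead of always being {{true}}.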



