Posted to issues@spark.apache.org by "Michael Armbrust (JIRA)" <ji...@apache.org> on 2015/08/12 19:39:45 UTC

[jira] [Resolved] (SPARK-9804) "isSrcLocal" parameter in loadTable / loadPartition is incorrect for HDFS source data

     [ https://issues.apache.org/jira/browse/SPARK-9804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust resolved SPARK-9804.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.0

Issue resolved by pull request 8086
[https://github.com/apache/spark/pull/8086]

> "isSrcLocal" parameter in loadTable / loadPartition is incorrect for HDFS source data
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-9804
>                 URL: https://issues.apache.org/jira/browse/SPARK-9804
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Marcelo Vanzin
>             Fix For: 1.5.0
>
>
> The shims for Hive >= 0.14 hardcode the {{isSrcLocal}} parameter to true. If the source data is not actually local (for example, it lives on HDFS), Hive tries to copy it from the local file system and fails with errors like this:
> {noformat}
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://vanzin-st1-1.vpc.cloudera.com:8020/user/hive/warehouse/spark_hive.db/src/.hive-staging_hive_2015-08-10_15-20-28_215_840551940044534110-1/-ext-10000/part-00000, expected: file:///
>         at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:648)
>         at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
>         at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:529)
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:340)
>         at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1908)
>         at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1876)
>         at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1841)
>         at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2517)
>         at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2589)
>         at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1395)
>         at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1319)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.sql.hive.client.Shim_v0_14.loadPartition(HiveShim.scala:430)
>         at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$loadPartition$1.apply$mcV$sp(ClientWrapper.scala:473)
> ...
> {noformat}
> This can be triggered by running a query like the following:
> {code}
> INSERT INTO TABLE blah PARTITION(key=value) SELECT ...;
> {code}
> Here, "key=value" is a new partition being added to the existing table.
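
To illustrate the failure mode described above, here is a minimal sketch of deriving the flag from the load path instead of hardcoding it to true. It is not the change made in the pull request and does not use the Hive or Hadoop APIs: the class SrcLocalCheck, its isSrcLocal method, and the defaultFsScheme parameter are hypothetical names introduced only for this example, and the host name and paths are made up. It assumes a path counts as local only when its URI scheme (or, for a path without a scheme, the scheme of the default file system) is "file".

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Spark, Hive, or Hadoop.
final class SrcLocalCheck {

    // Returns true only when the path resolves to the local file system.
    // defaultFsScheme is an assumed stand-in for the scheme of the cluster's
    // default file system (for example "hdfs"); it applies to paths that
    // carry no scheme of their own.
    static boolean isSrcLocal(String loadPath, String defaultFsScheme) {
        String scheme = URI.create(loadPath).getScheme();
        if (scheme == null) {
            scheme = defaultFsScheme;
        }
        return "file".equalsIgnoreCase(scheme);
    }

    public static void main(String[] args) {
        // A staging file on HDFS: treating it as local would make the
        // copy go through the local file system and fail with "Wrong FS".
        System.out.println(isSrcLocal("hdfs://namenode:8020/user/hive/warehouse/blah/part-00000", "hdfs"));
        // A file that really is on the local file system.
        System.out.println(isSrcLocal("file:///tmp/staging/part-00000", "hdfs"));
    }
}
```

With a check of this kind, an HDFS staging path would be passed with the flag set to false, so the load would not take the copyFromLocalFile route seen in the stack trace.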


