Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2016/12/06 23:11:58 UTC

[jira] [Created] (SPARK-18752) "isSrcLocal" parameter to Hive loadTable / loadPartition should come from user

Marcelo Vanzin created SPARK-18752:
--------------------------------------

             Summary: "isSrcLocal" parameter to Hive loadTable / loadPartition should come from user
                 Key: SPARK-18752
                 URL: https://issues.apache.org/jira/browse/SPARK-18752
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.1.0
            Reporter: Marcelo Vanzin
            Priority: Minor


While testing against some recent changes in upstream Hive, we ran into an issue with the HiveShim code that calls "loadTable" and "loadPartition".

The semantics in Hive changed slightly: if you provide the wrong value for "isSrcLocal", you can now end up with an invalid table, because the Hive code will move the temp directory itself to the final destination instead of moving its children.

The problem in Spark is that HiveShim.scala tries to figure out the value of "isSrcLocal" based on where the source and target directories are; that's not correct. "isSrcLocal" should be set based on the user query (e.g. "LOAD DATA LOCAL" would set it to "true"). So we need to propagate that information from the user query down to HiveShim.
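
To illustrate the distinction, here is a minimal sketch in Scala. It is not Spark's actual code: "LoadDataCommand", "LoadDataSketch", "parse" and "guessIsSrcLocal" are hypothetical names, the regex handles only a simplified form of the statement, and the path-based guess is a stand-in for whatever location-based inference the shim performs. The point is only that the flag can be read directly from the statement, while a guess based on paths can disagree with what the user asked for.

```scala
import java.net.URI

// Hypothetical model of a parsed LOAD DATA statement; not Spark's real class.
final case class LoadDataCommand(path: String, isLocal: Boolean, isOverwrite: Boolean)

object LoadDataSketch {
  // Simplified pattern for LOAD DATA [LOCAL] INPATH '<path>' [OVERWRITE] INTO TABLE ...
  // Optional groups that do not participate in the match are returned as null.
  private val LoadData =
    """(?is)\s*LOAD\s+DATA\s+(LOCAL\s+)?INPATH\s+'([^']*)'\s+(OVERWRITE\s+)?INTO\s+TABLE\s+.*""".r

  // The user's intent: isLocal is true only if the query says LOCAL.
  def parse(sql: String): Option[LoadDataCommand] = sql match {
    case LoadData(local, path, overwrite) =>
      Some(LoadDataCommand(path, isLocal = local != null, isOverwrite = overwrite != null))
    case _ => None
  }

  // Assumed stand-in for location-based inference: treat a path with no
  // scheme or a "file" scheme as local, regardless of what the query said.
  def guessIsSrcLocal(path: String): Boolean = {
    val scheme = new URI(path).getScheme
    scheme == null || scheme == "file"
  }
}
```

With this sketch, parsing "LOAD DATA INPATH 'file:/tmp/staging/part-0' INTO TABLE t" gives isLocal = false because the query has no LOCAL keyword, while guessIsSrcLocal on the same path returns true. Carrying the parsed flag down to the shim as an explicit argument would avoid that kind of mismatch.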



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
