Posted to issues@spark.apache.org by "Hao Ren (JIRA)" <ji...@apache.org> on 2015/06/22 23:08:01 UTC

[jira] [Comment Edited] (SPARK-6675) HiveContext setConf is not stable

    [ https://issues.apache.org/jira/browse/SPARK-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596638#comment-14596638 ] 

Hao Ren edited comment on SPARK-6675 at 6/22/15 9:07 PM:
---------------------------------------------------------

Hi,

I tried branch-1.4.
It works now. Issue closed.

Thank you for your work.




was (Author: invkrh):
Hi,

I tried branch-1.4.
It works now.

Thank you for your work.



> HiveContext setConf is not stable
> ---------------------------------
>
>                 Key: SPARK-6675
>                 URL: https://issues.apache.org/jira/browse/SPARK-6675
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>         Environment: AWS EC2 xlarge2 cluster launched by Spark's EC2 script
>            Reporter: Hao Ren
>            Priority: Critical
>
> I find that HiveContext.setConf does not work correctly. Here are some code snippets showing the problem:
> snippet 1:
> {code}
> import org.apache.spark.sql.hive.HiveContext
> import org.apache.spark.{SparkConf, SparkContext}
> object Main extends App {
>   val conf = new SparkConf()
>     .setAppName("context-test")
>     .setMaster("local[8]")
>   val sc = new SparkContext(conf)
>   val hc = new HiveContext(sc)
>   hc.setConf("spark.sql.shuffle.partitions", "10")
>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
>   hc.getAllConfs filter(_._1.contains("shuffle.partitions")) foreach println
> }
> {code}
> Results:
> (hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)
> (spark.sql.shuffle.partitions,10)
> snippet 2:
> {code}
> ...
>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>   hc.setConf("spark.sql.shuffle.partitions", "10")
>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
>   hc.getAllConfs filter(_._1.contains("shuffle.partitions")) foreach println
> ...
> {code}
> Results:
> (hive.metastore.warehouse.dir,/user/hive/warehouse)
> (spark.sql.shuffle.partitions,10)
> You can see that I just permuted the two setConf calls, and that leads to two different Hive configurations.
> It seems that HiveContext cannot set a new value for the "hive.metastore.warehouse.dir" key when that is the first setConf call.
> You need another setConf call before changing "hive.metastore.warehouse.dir": for example, set "hive.metastore.warehouse.dir" twice, as in snippet 3 below, or set another key first as in snippet 1.
> snippet 3:
> {code}
> ...
>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
> ...
> {code}
> Results:
> (hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)
> You can reproduce this on the latest branch-1.3 (1.3.1-SNAPSHOT, htag = 7d029cb1eb6f1df1bce1a3f5784fb7ce2f981a33).
> I have also tested the released 1.3.0 (htag = 4aaf48d46d13129f0f9bdafd771dd80fe568a7dc); it has the same problem.
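> A minimal workaround sketch, based only on the behaviour observed in snippets 1 and 3 above (the object name and paths here are illustrative, not from the report): make some other setConf call before setting "hive.metastore.warehouse.dir", then read the value back with getConf to confirm it actually took effect.
> {code}
> import org.apache.spark.sql.hive.HiveContext
> import org.apache.spark.{SparkConf, SparkContext}
> object WarehouseDirWorkaround extends App {
>   val conf = new SparkConf()
>     .setAppName("warehouse-dir-workaround")
>     .setMaster("local[8]")
>   val sc = new SparkContext(conf)
>   val hc = new HiveContext(sc)
>   // Touch another key first, so the warehouse.dir change is not the first setConf call.
>   hc.setConf("spark.sql.shuffle.partitions", "10")
>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>   // Read the value back to verify it was applied.
>   println(hc.getConf("hive.metastore.warehouse.dir"))
> }
> {code}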



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
