Posted to dev@hive.apache.org by chengxiang li <ch...@intel.com> on 2014/11/13 14:10:58 UTC

Review Request 27987: HIVE-8833 implement remote spark client

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/
-----------------------------------------------------------

Review request for hive, Rui Li, Szehon Ho, and Xuefu Zhang.


Bugs: HIVE-8833
    https://issues.apache.org/jira/browse/HIVE-8833


Repository: hive-git


Description
-------

Hive will support submitting Spark jobs through both a local Spark client and a remote Spark client. We should unify the Spark client API and implement the remote Spark client through Remote Spark Context.


Diffs
-----

  ql/pom.xml 06d7f27 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java ee16c9e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 2fea62d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e3e6d16 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 51e0510 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobRef.java bf43b6e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java d4d14a3 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 8346b28 

Diff: https://reviews.apache.org/r/27987/diff/


Testing
-------


Thanks,

chengxiang li


Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by chengxiang li <ch...@intel.com>.

> On Nov. 14, 2014, 7:32 p.m., Marcelo Vanzin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClient.java, line 26
> > <https://reviews.apache.org/r/27987/diff/3/?file=763278#file763278line26>
> >
> >     nit: space before {
> >     
> >     Maybe implement Closeable?

fixed.
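
For readers following the Closeable suggestion, here is a minimal sketch of what it buys the caller. The interface and class names below (ClientLike, RecordingClient) are hypothetical stand-ins and are not the code in this patch; the only real API used is java.io.Closeable and try-with-resources.

```java
import java.io.Closeable;

public class CloseableClientSketch {

  /** Hypothetical client interface; extending Closeable lets callers use try-with-resources. */
  interface ClientLike extends Closeable {
    String execute(String jobName);
  }

  /** Hypothetical implementation that only records whether close() was called. */
  static class RecordingClient implements ClientLike {
    boolean closed = false;

    @Override
    public String execute(String jobName) {
      return "submitted " + jobName;
    }

    // Narrowing the throws clause (no IOException) is allowed when overriding close().
    @Override
    public void close() {
      closed = true;
    }
  }

  /** Runs one job and relies on try-with-resources to close the client. */
  static boolean runAndClose(RecordingClient client) {
    try (RecordingClient c = client) {
      c.execute("example-job");
    }
    return client.closed;
  }

  public static void main(String[] args) {
    System.out.println(runAndClose(new RecordingClient()));
  }
}
```

The point of the design is that cleanup no longer depends on every caller remembering to call a custom shutdown method.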


> On Nov. 14, 2014, 7:32 p.m., Marcelo Vanzin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java, line 101
> > <https://reviews.apache.org/r/27987/diff/3/?file=763279#file763279line101>
> >
> >     Doesn't this work?
> >     
> >     for (Map.Entry<String, String> entry : hiveConf)

yeah, it works. fixed.
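
The for-each form works because Hadoop's Configuration, which HiveConf extends, implements Iterable<Map.Entry<String, String>>. The sketch below illustrates the loop with a hypothetical stand-in class (SimpleConf) so that it runs without Hadoop on the classpath; the prefix filter is also just an assumption for illustration, not the logic in the patch.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class ConfIterationSketch {

  /** Hypothetical stand-in for a configuration object that is iterable over its entries. */
  static class SimpleConf implements Iterable<Map.Entry<String, String>> {
    private final Map<String, String> props = new LinkedHashMap<String, String>();

    void set(String key, String value) {
      props.put(key, value);
    }

    @Override
    public Iterator<Map.Entry<String, String>> iterator() {
      return props.entrySet().iterator();
    }
  }

  /** Copies every entry whose key starts with the given prefix, using the for-each form. */
  static Map<String, String> copyWithPrefix(SimpleConf conf, String prefix) {
    Map<String, String> result = new LinkedHashMap<String, String>();
    for (Map.Entry<String, String> entry : conf) {
      if (entry.getKey().startsWith(prefix)) {
        result.put(entry.getKey(), entry.getValue());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    SimpleConf conf = new SimpleConf();
    conf.set("spark.master", "local");
    conf.set("hive.exec.parallel", "true");
    System.out.println(copyWithPrefix(conf, "spark."));
  }
}
```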


> On Nov. 14, 2014, 7:32 p.m., Marcelo Vanzin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java, line 107
> > <https://reviews.apache.org/r/27987/diff/3/?file=763279#file763279line107>
> >
> >     Is Hive still using commons-logging? slf4j makes this much better since it handles format strings for you...

Hive uses commons-logging, so I'll keep this as is.


> On Nov. 14, 2014, 7:32 p.m., Marcelo Vanzin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java, line 94
> > <https://reviews.apache.org/r/27987/diff/3/?file=763281#file763281line94>
> >
> >     Don't you get warnings here since JobHandle needs a type parameter?

Not in my IntelliJ, actually, which is strange. Fixed.
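
To illustrate the raw-type concern, here is a small sketch. The JobHandle interface and submit method below are hypothetical stand-ins written for this example, not the spark-client classes under review.

```java
import java.io.Serializable;

public class TypedHandleSketch {

  /** Hypothetical stand-in for a generic job handle. */
  interface JobHandle<T extends Serializable> {
    T get();
  }

  /** Hypothetical submit method that returns an already-completed handle. */
  static <T extends Serializable> JobHandle<T> submit(final T result) {
    return new JobHandle<T>() {
      @Override
      public T get() {
        return result;
      }
    };
  }

  /** Uses a parameterized declaration, so get() returns String without a cast. */
  static int typedLength() {
    JobHandle<String> handle = submit("done");
    return handle.get().length();
  }

  public static void main(String[] args) {
    // A raw declaration such as `JobHandle raw = submit("done");` still compiles,
    // but javac -Xlint:rawtypes flags it and raw.get() is typed only as Serializable.
    System.out.println(typedLength());
  }
}
```

Whether an IDE shows the warning depends on its inspection settings, which may explain why it did not appear locally.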


> On Nov. 14, 2014, 7:32 p.m., Marcelo Vanzin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java, line 51
> > <https://reviews.apache.org/r/27987/diff/3/?file=763284#file763284line51>
> >
> >     You could use:
> >     
> >       new URI(path).getScheme() != null

fixed.


> On Nov. 14, 2014, 7:32 p.m., Marcelo Vanzin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java, line 55
> > <https://reviews.apache.org/r/27987/diff/3/?file=763284#file763284line55>
> >
> >     You could use:
> >     
> >       new File(path).toURI().toURL()

fixed.
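
A minimal sketch of how the two suggested expressions could fit together, using only java.net.URI and java.io.File. The class and method names (PathToUrlSketch, hasScheme, toUrl) are assumptions for illustration, and the actual code in SparkUtilities may be organized differently.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class PathToUrlSketch {

  /** Returns true when the string parses as a URI with an explicit scheme, e.g. "hdfs://...". */
  static boolean hasScheme(String path) {
    try {
      return new URI(path).getScheme() != null;
    } catch (URISyntaxException e) {
      // new URI(...) rejects strings that are not valid URIs, e.g. ones containing spaces.
      throw new IllegalArgumentException("Not a valid URI: " + path, e);
    }
  }

  /** Converts a scheme-less local path to a file: URL; other input is parsed as a URI. */
  static URL toUrl(String path) {
    try {
      if (hasScheme(path)) {
        // URI.toURL() fails for schemes without a registered URL handler.
        return URI.create(path).toURL();
      }
      return new File(path).toURI().toURL();
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Cannot convert to URL: " + path, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(hasScheme("/tmp/udf.jar"));
    System.out.println(toUrl("/tmp/udf.jar").getProtocol());
  }
}
```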


- chengxiang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/#review61479
-----------------------------------------------------------




Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by Marcelo Vanzin <va...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/#review61479
-----------------------------------------------------------

Ship it!


LGTM, just small nits.


ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClient.java
<https://reviews.apache.org/r/27987/#comment103130>

    nit: space before {
    
    Maybe implement Closeable?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java
<https://reviews.apache.org/r/27987/#comment103134>

    Use "properties.load(Reader)" instead, so you can force UTF-8 encoding.
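
    As background: Properties.load(InputStream) decodes the stream as ISO-8859-1, while Properties.load(Reader) uses whatever charset the Reader was built with. A minimal sketch of the suggestion follows; the class and method names are assumptions for illustration, and how the patch opens the underlying stream is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class Utf8PropertiesSketch {

  /** Loads properties from the stream, decoding it explicitly as UTF-8. */
  static Properties loadUtf8(InputStream in) {
    Properties properties = new Properties();
    try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
      properties.load(reader);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return properties;
  }

  public static void main(String[] args) {
    // Hypothetical property content containing a non-ASCII character.
    byte[] bytes = "spark.app.name=caf\u00e9\n".getBytes(StandardCharsets.UTF_8);
    Properties props = loadUtf8(new ByteArrayInputStream(bytes));
    System.out.println(props.getProperty("spark.app.name"));
  }
}
```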



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java
<https://reviews.apache.org/r/27987/#comment103135>

    Doesn't this work?
    
    for (Map.Entry<String, String> entry : hiveConf)



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java
<https://reviews.apache.org/r/27987/#comment103137>

    Is Hive still using commons-logging? slf4j makes this much better since it handles format strings for you...



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java
<https://reviews.apache.org/r/27987/#comment103145>

    Don't you get warnings here since JobHandle needs a type parameter?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java
<https://reviews.apache.org/r/27987/#comment103148>

    You could use:
    
      new URI(path).getScheme() != null



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java
<https://reviews.apache.org/r/27987/#comment103150>

    You could use:
    
      new File(path).toURI().toURL()


- Marcelo Vanzin




Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by chengxiang li <ch...@intel.com>.

> On Nov. 17, 2014, 10:11 p.m., Szehon Ho wrote:
> > spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java, line 193
> > <https://reviews.apache.org/r/27987/diff/4/?file=765595#file765595line193>
> >
> >     Sorry for the basic question: what does 'spark' as the spark.master value signify?

Spark supports different kinds of resource managers, selected through spark.master, for example:
local, local[coreNumber], local-cluster
spark://hostname:port
mesos://hostname:port
yarn-cluster
yarn-client

spark://hostname:port refers to a standalone Spark cluster.
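
As a rough illustration of how code might branch on these values, here is a sketch that classifies a spark.master string by prefix. The enum and method are hypothetical and written only for this example; this is not how Spark itself or SparkClientImpl parses the setting.

```java
public class SparkMasterSketch {

  /** Hypothetical grouping of the spark.master forms listed above. */
  enum Kind { LOCAL, STANDALONE, MESOS, YARN, UNKNOWN }

  /** Classifies a spark.master value by simple prefix matching. */
  static Kind classify(String master) {
    if (master == null) {
      return Kind.UNKNOWN;
    }
    if (master.equals("local") || master.startsWith("local[")
        || master.startsWith("local-cluster")) {
      return Kind.LOCAL;
    }
    if (master.startsWith("spark://")) {
      return Kind.STANDALONE; // standalone Spark cluster
    }
    if (master.startsWith("mesos://")) {
      return Kind.MESOS;
    }
    if (master.equals("yarn-cluster") || master.equals("yarn-client")) {
      return Kind.YARN;
    }
    return Kind.UNKNOWN;
  }

  public static void main(String[] args) {
    System.out.println(classify("spark://hostname:7077"));
  }
}
```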


- chengxiang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/#review61819
-----------------------------------------------------------




Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by Szehon Ho <sz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/#review61819
-----------------------------------------------------------


Looks mostly good, just some minor nits and a basic question, as I'm not too familiar with this area.


ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java
<https://reviews.apache.org/r/27987/#comment103720>

    Can we correct this typo while we are in this class?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java
<https://reviews.apache.org/r/27987/#comment103725>

    Should we put this inside the null-check, to avoid NPE?



spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java
<https://reviews.apache.org/r/27987/#comment103730>

    Sorry for the basic question: what does 'spark' as the spark.master value signify?


- Szehon Ho




Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by chengxiang li <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/
-----------------------------------------------------------

(Updated Nov. 18, 2014, 1:52 a.m.)


Review request for hive, Rui Li, Szehon Ho, and Xuefu Zhang.


Changes
-------

fixed an import error.


Bugs: HIVE-8833
    https://issues.apache.org/jira/browse/HIVE-8833


Repository: hive-git


Description
-------

Hive will support submitting Spark jobs through both a local Spark client and a remote Spark client. We should unify the Spark client API and implement the remote Spark client through Remote Spark Context.


Diffs (updated)
-----

  ql/pom.xml 076f993 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java 3cca7f4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 2fea62d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e3e6d16 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 51e0510 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobRef.java bf43b6e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java d4d14a3 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 8346b28 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5af66ee 

Diff: https://reviews.apache.org/r/27987/diff/


Testing
-------


Thanks,

chengxiang li


Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by chengxiang li <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/
-----------------------------------------------------------

(Updated Nov. 18, 2014, 1:37 a.m.)


Review request for hive, Rui Li, Szehon Ho, and Xuefu Zhang.


Changes
-------

Fixed the issues Szehon mentioned.


Bugs: HIVE-8833
    https://issues.apache.org/jira/browse/HIVE-8833


Repository: hive-git


Description
-------

Hive will support submitting Spark jobs through both a local Spark client and a remote Spark client. We should unify the Spark client API and implement the remote Spark client through Remote Spark Context.


Diffs (updated)
-----

  ql/pom.xml 076f993 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java 3cca7f4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 2fea62d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e3e6d16 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 51e0510 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobRef.java bf43b6e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java d4d14a3 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 8346b28 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5af66ee 

Diff: https://reviews.apache.org/r/27987/diff/


Testing
-------


Thanks,

chengxiang li


Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by chengxiang li <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/
-----------------------------------------------------------

(Updated Nov. 17, 2014, 3:47 a.m.)


Review request for hive, Rui Li, Szehon Ho, and Xuefu Zhang.


Bugs: HIVE-8833
    https://issues.apache.org/jira/browse/HIVE-8833


Repository: hive-git


Description
-------

Hive will support submitting Spark jobs through both a local Spark client and a remote Spark client. We should unify the Spark client API and implement the remote Spark client through Remote Spark Context.


Diffs (updated)
-----

  ql/pom.xml 06d7f27 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java ee16c9e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 2fea62d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e3e6d16 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 51e0510 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobRef.java bf43b6e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java d4d14a3 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 8346b28 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5af66ee 

Diff: https://reviews.apache.org/r/27987/diff/


Testing
-------


Thanks,

chengxiang li


Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by chengxiang li <ch...@intel.com>.

> On Nov. 14, 2014, 4:39 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java, line 73
> > <https://reviews.apache.org/r/27987/diff/3/?file=763281#file763281line73>
> >
> >     Public or private?

Package scope should be suitable here: we only allow HiveSparkClientFactory to create new RemoteHiveSparkClient instances, so it's better not to expose it publicly.


> On Nov. 14, 2014, 4:39 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java, line 166
> > <https://reviews.apache.org/r/27987/diff/3/?file=763281#file763281line166>
> >
> >     I'm wondering if addFile() should accept uri instead.

I can look at this later, as it concerns the Remote Spark Context API.


- chengxiang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/#review61398
-----------------------------------------------------------




Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by Xuefu Zhang <xz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/#review61398
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java
<https://reviews.apache.org/r/27987/#comment102959>

    Public or private?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java
<https://reviews.apache.org/r/27987/#comment102965>

    I'm wondering if addFile() should accept uri instead.


- Xuefu Zhang




Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by chengxiang li <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/
-----------------------------------------------------------

(Updated Nov. 14, 2014, 3:43 a.m.)


Review request for hive, Rui Li, Szehon Ho, and Xuefu Zhang.


Bugs: HIVE-8833
    https://issues.apache.org/jira/browse/HIVE-8833


Repository: hive-git


Description
-------

Hive will support submitting Spark jobs through both a local Spark client and a remote Spark client. We should unify the Spark client API and implement the remote Spark client through Remote Spark Context.


Diffs (updated)
-----

  ql/pom.xml 06d7f27 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java ee16c9e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 2fea62d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e3e6d16 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 51e0510 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobRef.java bf43b6e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java d4d14a3 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 8346b28 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5af66ee 

Diff: https://reviews.apache.org/r/27987/diff/


Testing
-------


Thanks,

chengxiang li


Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by chengxiang li <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/
-----------------------------------------------------------

(Updated Nov. 14, 2014, 3:34 a.m.)


Review request for hive, Rui Li, Szehon Ho, and Xuefu Zhang.


Changes
-------

1. Enable submitting Spark jobs through programmatic SparkSubmit.
2. Make automatic calculation of the reducer number work in local Spark context mode, so that we don't break qtest.
This patch should now cover all features for this JIRA.


Bugs: HIVE-8833
    https://issues.apache.org/jira/browse/HIVE-8833


Repository: hive-git


Description
-------

Hive will support submitting Spark jobs through both a local Spark client and a remote Spark client. We should unify the Spark client API and implement the remote Spark client through Remote Spark Context.


Diffs (updated)
-----

  ql/pom.xml 06d7f27 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java ee16c9e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 2fea62d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e3e6d16 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 51e0510 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobRef.java bf43b6e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java d4d14a3 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 8346b28 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5af66ee 

Diff: https://reviews.apache.org/r/27987/diff/


Testing
-------


Thanks,

chengxiang li


Re: Review Request 27987: HIVE-8833 implement remote spark client

Posted by chengxiang li <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27987/#review61250
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java
<https://reviews.apache.org/r/27987/#comment102775>

    Disabled temporarily because this depends on the Spark context; we should enable it in a uniform way for both the local and the remote Spark client.


- chengxiang li

