You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by vanzin <gi...@git.apache.org> on 2018/01/05 22:32:37 UTC

[GitHub] spark pull request #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' opt...

GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/20169

    [SPARK-17088][hive] Fix 'sharesHadoopClasses' option when creating client.

    Because the call to the constructor of HiveClientImpl crosses class loader
    boundaries, different versions of the same class (Configuration in this
    case) were loaded, and that caused a runtime error when instantiating the
    client. By using a safer type in the signature of the constructor, it's
    possible to avoid the problem.
    
    I considered removing 'sharesHadoopClasses', but it may still be desired
    (even though there are 0 users of it since it was not working). When Spark
    starts to support Hadoop 3, it may be necessary to use that option to
    load clients for older Hive metastore versions that don't know about
    Hadoop 3.
    
    Tested with added unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-17088

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20169.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20169
    
----
commit 668fcbac6b24e6c1e4b6c720459654ca8e88f03c
Author: Marcelo Vanzin <va...@...>
Date:   2018-01-05T22:28:09Z

    [SPARK-17088][hive] Fix 'sharesHadoopClasses' option when creating client.
    
    Because the call to the constructor of HiveClientImpl crosses class loader
    boundaries, different versions of the same class (Configuration) in this
    case were loaded, and that caused a runtime error when instantiating the
    client. By using a safer type in the signature of the constructor, it's
    possible to avoid the problem.
    
    I considered removing 'sharesHadoopClasses', but it may still be desired
    (even though there are 0 users of it since it was not working). When Spark
    starts to support Hadoop 3, it may be necessary to use that option to
    load clients for older Hive metastore versions that don't know about
    Hadoop 3.
    
    Tested with added unit test.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Yes I'm not following, @gatorsmile. You're saying there are no owners but wanted to revert a change because you felt it was code that should only change with your review. You're correct that there are no formal owners and that this all relies on proactive review and proactive ping. We have to balance expediency with the cost of occasional post hoc review .
    
    Above you suggested you hadn't really looked at this change, and I don't see any specific objection you have. Reverting would have been well out of line. 
    
    This seemed fine process wise until that point. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    And where exactly was a peer review missing here?
    
    I didn't check this in without review. You're basically saying that only you and Wenchen are acceptable "peers". I just disagree strongly with that statement.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' opt...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20169#discussion_r163434545
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -82,8 +83,9 @@ import org.apache.spark.util.{CircularBuffer, Utils}
      */
     private[hive] class HiveClientImpl(
         override val version: HiveVersion,
    +    warehouseDir: Option[String],
         sparkConf: SparkConf,
    -    hadoopConf: Configuration,
    +    hadoopConf: JIterable[JMap.Entry[String, String]],
    --- End diff --
    
    The test case was added in this PR!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    >  I am not familiar with the code base of SHS
    
    Please leave the strawmen out of the discussion. I *am* familiar with this part of the code base, you just believe I'm not.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/148/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    @vanzin You might not understand my point. We need a peer review, even if we are familiar with the code base. Let me give a better example for clarifying my point. 
    
    Even if I am familiar with SQL, I still ping @cloud-fan , right? Even if @cloud-fan might not have a bandwidth to review the PR, at least he is pinged. Do you think we should encourage this in our community? Also cc @srowen 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    > That is why I suggested above to ping me and @cloud-fan when the community does the change in Spark SQL.
    
    The point I'm making (and I believe Hyukjin too) is that it should be the opposite. If you want to be involved in all these reviews, it's up to you to monitor what's going on. A new contributor, for example, cannot know who to ping when he wants to make a change.
    
    And I hope you see how "x and y must be pinged in any SQL review" goes against the idea that the project is owned by all committers, not just a few of them.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    > My only concern is we should notify people who wrote the code 
    
    Like, ahem, me? (Check this code history. I've worked on quite a good chunk of it over time.)
    
    I agree with what @HyukjinKwon said here. A policy of "some people are required to be pinged on every PR in a certain module" is exactly what was voted down by the PMC. We already have a huge bandwidth problem with review, we should be encouraging committers (and other contributors) to be more active, not discouraging them that with that sort of policy.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    A late LGTM. @vanzin I'm sorry that we sometimes missed your pinging, but I think it's still good to ping more SQL people for SQL changes, for notification purpose. A post-hoc review is better than no review. Thanks for your understading!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    > Could you please ping me or @cloud-fan when changing the codes in SQL?
    
    I've had really spotty results pinging people. How about you guys make an effort to monitor changes in SQL if you really want to personally approve all of them?
    
    > Could we revert this PR since this is not well reviewed?
    
    No, it's well reviewed and tested. You not really understanding the fix is a different issue. I'd appreciate if you made a bigger effort than you made so far in understanding the issue and the fix.
    
    Thanks for taking a late look though.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Thank you for your previous contributions~ In the recent few releases, there are major refactoring in the related codes. The codes do not belong to anybody. Our goal is to build the best-quality Spark and benefit more people. 
    
    To improve the code quality, we always encourage more reviews from the committers and contributors, especially the ones who are active in the related code base. That is why I suggested above to ping me and @cloud-fan when the community does the change in Spark SQL. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' opt...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20169#discussion_r163431536
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -82,8 +83,9 @@ import org.apache.spark.util.{CircularBuffer, Utils}
      */
     private[hive] class HiveClientImpl(
         override val version: HiveVersion,
    +    warehouseDir: Option[String],
         sparkConf: SparkConf,
    -    hadoopConf: Configuration,
    +    hadoopConf: JIterable[JMap.Entry[String, String]],
    --- End diff --
    
    Any test case?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    **[Test build #86539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86539/testReport)** for PR 20169 at commit [`668fcba`](https://github.com/apache/spark/commit/668fcbac6b24e6c1e4b6c720459654ca8e88f03c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    The difference is that I won't care if you get a review from someone else and and check that code in. Even if you don't ping me. By the fact that you are a committer, I trust your judgement that you made the change carefully and got it properly reviewed. If I happen to find a problem in it later, I won't go back and blame you for not pinging me, if that were the case. I'd fix the issue in a new PR and go on.
    
    You seem to be focused on the fact that you were not consulted about a change. You should be focusing on the fact that you are one person among many peers that we've all trusted to have good judgement in moving the project forward, and that you personally do not need to be involved in everything, and that it's really a little out of line for you to imply that.
    
    I don't really have anything else to offer here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' opt...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20169#discussion_r163429758
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -82,8 +83,9 @@ import org.apache.spark.util.{CircularBuffer, Utils}
      */
     private[hive] class HiveClientImpl(
         override val version: HiveVersion,
    +    warehouseDir: Option[String],
         sparkConf: SparkConf,
    -    hadoopConf: Configuration,
    +    hadoopConf: JIterable[JMap.Entry[String, String]],
    --- End diff --
    
    Why this change is needed?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Merging to master.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Yeah. That is a valid argument for the new contributors, but it is not true for the active contributors and committers like you and me. 
    
    Let me give an example. I am not familiar with the code base of SHS, but at least, I knew you did most of the work recently. I will ping you before merging the code changes, although I have the right to merge it without notifying you. Do you think this suggestion is good to improve the code quality of SHS? 
    
    WDYT? @srowen 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Let me explain my point here. I am not trying to offend anybody. Spark is becoming more and more complex. Peer review and test cases are the major methods we used for quality control. Spark is an infrastructure software, instead of an application software. We really need to be careful when reviewing the PRs. Nobody's codes are perfect. For example, if I made a change in SHS, I will ping @vanzin for sure, because @vanzin is the original author and the most active committer of SHS.
    
    Personally, I am doing my best to review the changes in the SQL. It would be nice if you can ping me or @cloud-fan . Since Spark 2.3 release is the recent focus, the current focus of the whole community is for testing and reviewing the changes of Spark 2.3 release. 
    
    Regarding this PR, I think we do not need to introduce new parameter `warehouseDir`, which is already contained in `hadoopConf `. Keeping the interface clean is always good, I believe. Please review my follow-up PR: https://github.com/apache/spark/pull/20377


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    @gatorsmile, I see you merge changes with no review frequently, so I'm surprised to see you say this. I trust your judgment when you do. You've made the point repeatedly that you think you should be pinged on changes to this file. OK, people will try to remember that. For your part, you can be more proactive about review, as there are evidently files you really want to keep an eye on. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    >  I think you also notice that Wenchen and I are the top contributors and reviewers in Spark SQL
    
    No disagreement there. 
    
    > If any change made in Spark SQL, I think it would be nice to ping us, right?
    
    Nice but not required. That's the whole point. If a committer is comfortable with the change he made, and the reviews he got, he does not need approval from any specific set of other committers.
    
    Again, if you feel you need to be involved in all reviews, it's up to *you* to do that.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    I think @vanzin is as qualified to modify this code as anyone. I defer to your and his judgment about who really needs to weigh in, if anyone, before merging. We don't, and can't, efficiently operate otherwise. If this had resulted in a breakage or error, I'd expect the author to learn that, actually, modifying this code is more tricky than expected and really does need a second set of eyes next time.
    
    But it didn't, so I'm not sure what problem you're solving. I do not see something that should have happened differently, even in hindsight. This is getting into committer-splaining well-understood ideas to your peers, so, would leave it here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' opt...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20169#discussion_r163434724
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -82,8 +83,9 @@ import org.apache.spark.util.{CircularBuffer, Utils}
      */
     private[hive] class HiveClientImpl(
         override val version: HiveVersion,
    +    warehouseDir: Option[String],
    --- End diff --
    
    Same as the code that existed before.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    **[Test build #86539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86539/testReport)** for PR 20169 at commit [`668fcba`](https://github.com/apache/spark/commit/668fcbac6b24e6c1e4b6c720459654ca8e88f03c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    @vanzin Could we revert this PR since this is not well reviewed?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    **[Test build #85733 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85733/testReport)** for PR 20169 at commit [`668fcba`](https://github.com/apache/spark/commit/668fcbac6b24e6c1e4b6c720459654ca8e88f03c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    @vanzin I never said I and Wenchen are acceptable peers. I have to correct it. I think you also notice that Wenchen and I are the top contributors and reviewers in Spark SQL. If any change made in Spark SQL, I think it would be nice to ping us, right? Also cc @srowen 
    
    @srowen If you check the history of my commits, I never merge my own PRs without ping the committers who are familiar with the code base. Right? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' opt...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20169#discussion_r163429951
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -82,8 +83,9 @@ import org.apache.spark.util.{CircularBuffer, Utils}
      */
     private[hive] class HiveClientImpl(
         override val version: HiveVersion,
    +    warehouseDir: Option[String],
         sparkConf: SparkConf,
    -    hadoopConf: Configuration,
    +    hadoopConf: JIterable[JMap.Entry[String, String]],
    --- End diff --
    
    That's the bug fix... explained in the PR description.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    It might be _optionally_ good to leave cc who are related with the proposal itself for notification purpose but I don't think it's quite useful to leave cc to few specific committers as a requirement since we suffer from a reviewing bandwidth problem. I was thinking that we should try to distribute this overhead.
    
    I would do the ping only for big, important or invasive changes to let them focus on important stuff.
    
    I think it's basically committer's responsibility to check PRs, rather than asking PR authors to ping and wait for reviews to few specific committers.
    
    To be clear, this PR has been even open left more then 2 weeks after a review and approval!



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    (I take back my comment @gatorsmile. I feel sure I've seen you quickly commit a change -- which I think is entirely fine in cases where you think it's appropriate -- but couldn't find an example clicking through your last 10 PRs or so. You turn them around quickly, but that's not the same.)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    To be clear, I think I personally usually cc related guys. Yup, if I were fixing this codes, I would have cc'ed you @gatorsmile because I agree that it's good to do. Sure, more eyes are better. Sure, I think I am personally trying to support you and other committers to review and roll it better.
    
    I left a comment only because It sounded to cc few specific committers as a requirement because it's not quite what has been on my mind.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Will review this more carefully later. 
    
    @vanzin Could you please ping me or @cloud-fan when changing the codes in SQL? I believe we are more familiar about the code base in this area compared with the other active committers. Thanks! 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    To be honest, I really care the quality, especially for such an important infrastructure software. The above argument is not related to the trust at all. Sorry for any misunderstanding. 
    
    For example, if I got involved in SHS-related PRs, I will ping you for ensuring the code changes in SHS are properly reviewed. I also believe this is what we can do at best practices to improve the quality. I trust all the committers but I just want to ensure the best committers got notified and review it carefully, if possible.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85733/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    > but I don't think it's quite useful to leave cc to few specific committers as a requirement 
    
    I agree with that, we should not be bound to a few specific committers. But I think it's good and polite to **notify** people who is the major author of the file being touched, even he is not a committer. For example, the major author of the touched `HiveClientImpl.scala` is @gatorsmile .
    
    > I think it's basically committer's responsibility to check PRs, rather than asking PR authors to ping and wait for reviews to few specific committers.
    
    I also agree with that, I'm also happy to see this PR got reviewed and merged. My only concern is we should notify people who wrote the code you are changing, and we doesn't need to wait for the response.
     


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' opt...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20169


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' opt...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20169#discussion_r163431452
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -82,8 +83,9 @@ import org.apache.spark.util.{CircularBuffer, Utils}
      */
     private[hive] class HiveClientImpl(
         override val version: HiveVersion,
    +    warehouseDir: Option[String],
    --- End diff --
    
    What is the semantics when `warehouseDir ` is None?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86539/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' opt...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20169#discussion_r163429626
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -82,8 +83,9 @@ import org.apache.spark.util.{CircularBuffer, Utils}
      */
     private[hive] class HiveClientImpl(
         override val version: HiveVersion,
    +    warehouseDir: Option[String],
    --- End diff --
    
    Could you add `@param`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    **[Test build #85733 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85733/testReport)** for PR 20169 at commit [`668fcba`](https://github.com/apache/spark/commit/668fcbac6b24e6c1e4b6c720459654ca8e88f03c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20169: [SPARK-17088][hive] Fix 'sharesHadoopClasses' option whe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20169
  
    @vanzin Yes. That is nice to have, but not required in the Apache community. 
    
    I believe both of us want to build the best-quality Spark. That is why I asked whether we should encourage this in the community, especially for the active committers who are very familiar with the Spark community.
    
    For example, even if I were confident about my own change in SHS, I will still ping you at the beginning. This is just to request your review for double checking the code changes in SHS. Maybe you do not have a bandwidth now, you might still can take a look at it when you are free. I do not expect you to check the whole change history and review all the changes in SHS, right?



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org