Posted to reviews@spark.apache.org by deanchen <gi...@git.apache.org> on 2015/04/20 06:18:33 UTC

[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

GitHub user deanchen opened a pull request:

    https://github.com/apache/spark/pull/5586

    [SPARK-6918][YARN] Secure HBase support.

    Obtain an HBase security token locally with Kerberos credentials so it can be sent to executors.

    Similar to obtainTokenForNamenodes. Fails gracefully if the HBase classes are not on the classpath.
    
    Tested on eBay's secure HBase cluster.
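
The "fails gracefully" behaviour described above can be illustrated with a small reflection guard. This is only a sketch under stated assumptions, not the code in the patch: the class `OptionalHBaseSupport` and its methods are hypothetical names, and the sketch stops at detecting the HBase classes rather than requesting a token.

```java
import java.util.Optional;

/** Sketch: only attempt HBase token retrieval when the HBase classes are loadable. */
final class OptionalHBaseSupport {
    // HBase class names; they resolve only if the HBase jars are on the classpath.
    static final String HBASE_CONF = "org.apache.hadoop.hbase.HBaseConfiguration";
    static final String TOKEN_UTIL = "org.apache.hadoop.hbase.security.token.TokenUtil";

    /** Returns the class if it can be loaded, or empty instead of throwing. */
    static Optional<Class<?>> loadIfPresent(String className) {
        try {
            // initialize=false: check for presence without running static initializers
            Class<?> c = Class.forName(className, false,
                    OptionalHBaseSupport.class.getClassLoader());
            return Optional.<Class<?>>of(c);
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    /** True only when both HBase classes needed for a token request are available. */
    static boolean hbaseOnClasspath() {
        return loadIfPresent(HBASE_CONF).isPresent()
                && loadIfPresent(TOKEN_UTIL).isPresent();
    }

    public static void main(String[] args) {
        if (hbaseOnClasspath()) {
            System.out.println("HBase classes found: a token could be requested here.");
        } else {
            // Hypothetical fallback: report and continue rather than fail the job.
            System.out.println("HBase classes not found: skipping HBase token.");
        }
    }
}
```

Looking the classes up by name keeps HBase out of the compile-time dependencies, so a build without the HBase jars still runs.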

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/deanchen/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5586.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5586
    
----
commit aa7fab658464c751efa44c065961e87be724db3a
Author: Dean Chen <de...@gmail.com>
Date:   2015-04-20T02:19:53Z

    [SPARK-6918][YARN] Secure HBase support.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-94549813
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30596/
    Test PASSed.


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-97065628
  
    jenkins, test this please


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-96678326
  
    So can you detail how one actually uses this?  The Hive support can be compiled into Spark, but HBase cannot be, so I assume that for this to work you have to include the HBase jars.  Does just specifying driver-class-path work for both yarn-client and yarn-cluster modes?

    Did you test this on both secure and non-secure clusters?


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-94352454
  
    Can one of the admins verify this patch?


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-96770354
  
    ok to test


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-94523582
  
      [Test build #30596 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30596/consoleFull) for   PR 5586 at commit [`aa7fab6`](https://github.com/apache/spark/commit/aa7fab658464c751efa44c065961e87be724db3a).


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/5586


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by deanchen <gi...@git.apache.org>.
Github user deanchen commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-97659943
  
    @XuTingjun This looks like a generic Spark driver error when an executor crashes. Can you please dig up the executor stack trace containing the root cause?


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by deanchen <gi...@git.apache.org>.
Github user deanchen commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-96842084
  
    Yes, including the HBase jars on the driver and/or executor classpath (e.g. _/usr/lib/hbase/lib/hbase-client.jar:/usr/lib/hbase/lib/hbase-common.jar:/usr/lib/hbase/lib/hbase-hadoop2-compat.jar:/usr/lib/hbase/lib/hbase-protocol.jar:/usr/lib/hbase/lib/htrace-core-2.04.jar_) will allow the driver and executor to reference the HBase configuration and create a new connection. The assumption is that the HBase jars are also in those same directories on the executors. hbase-site.xml will need to be moved into /conf or into the Spark conf path, since that is where the ZooKeeper config for HBase is kept.

    I've tested this in yarn-client and yarn-cluster modes on our secure production cluster with HBase 0.98, with and without the HBase jars included, and also in the HDP sandbox with HBase 0.98 over an unsecured HBase connection (all running locally).

    Updated the pull request to remove _throw new RuntimeException_ on line 1117 and log an error instead, since users may be running a secure YARN cluster without security on HBase.
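
As a rough illustration of the setup described in this comment, the sketch below assembles the classpath string from the jar names listed above and prints the matching spark-submit options. It is a sketch under assumptions: `HBaseClasspathArgs` and its methods are hypothetical helper names, passing the same path through `spark.executor.extraClassPath` is one possible way to cover the executors, and the application jar and its arguments are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: build the HBase classpath and the related spark-submit options. */
final class HBaseClasspathArgs {
    /** Joins jar names under libDir with ':' (the Linux classpath separator). */
    static String joinClasspath(String libDir, List<String> jarNames) {
        List<String> paths = new ArrayList<>();
        for (String jar : jarNames) {
            paths.add(libDir + "/" + jar);
        }
        return String.join(":", paths);
    }

    /** Options only; the application jar and its arguments are not included. */
    static List<String> submitOptions(String master, String classpath) {
        return Arrays.asList(
                "--master", master,
                "--driver-class-path", classpath,
                "--conf", "spark.executor.extraClassPath=" + classpath);
    }

    public static void main(String[] args) {
        // Jar names and directory taken from the comment above.
        List<String> jars = Arrays.asList(
                "hbase-client.jar", "hbase-common.jar", "hbase-hadoop2-compat.jar",
                "hbase-protocol.jar", "htrace-core-2.04.jar");
        String cp = joinClasspath("/usr/lib/hbase/lib", jars);
        System.out.println("spark-submit " + String.join(" ", submitOptions("yarn-cluster", cp)));
    }
}
```

The executor option only helps if the jars exist at the same paths on every node, which is the assumption stated in the comment.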


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-94549797
  
      [Test build #30596 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30596/consoleFull) for   PR 5586 at commit [`aa7fab6`](https://github.com/apache/spark/commit/aa7fab658464c751efa44c065961e87be724db3a).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by XuTingjun <gi...@git.apache.org>.
Github user XuTingjun commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-97653791
  
    Recently, whenever I run a select command from the beeline shell to read data in HBase, it always throws this exception:
    >java.lang.IllegalStateException: unread block data
            at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2424)
            at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1383)
            at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
            at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
            at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
            at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
            at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
            at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
            at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
    My cluster information is: /opt/jdk1.8.0_40, hadoop26.0, hbase1.0.0, zookeeper 3.5.0



[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-97067460
  
      [Test build #31140 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31140/consoleFull) for   PR 5586 at commit [`0c190ef`](https://github.com/apache/spark/commit/0c190efb5007110426da510136aec90bb0e94e36).


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-96766899
  
    Can one of the admins verify this patch?


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by XuTingjun <gi...@git.apache.org>.
Github user XuTingjun commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-94367970
  
    Yeah, LGTM, I need this feature. We can put HBase's config into hbase-site.xml, right?


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-97106637
  
      [Test build #31140 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31140/consoleFull) for   PR 5586 at commit [`0c190ef`](https://github.com/apache/spark/commit/0c190efb5007110426da510136aec90bb0e94e36).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-94522526
  
    Jenkins, test this please


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by deanchen <gi...@git.apache.org>.
Github user deanchen commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-94352574
  
    @XuTingjun I noticed you were also interested in this feature on https://github.com/apache/spark/pull/5031


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by deanchen <gi...@git.apache.org>.
Github user deanchen commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-94471741
  
    The HBaseConfiguration object will read from hbase-default.xml or hbase-site.xml on the classpath. Do you have the HBase config in another file? The ZooKeeper configs are what is needed for obtaining the security token and should always be in hbase-site.xml, so just copying that file into the Spark config dir should do the trick.
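
Since the configuration is resolved from the classpath, a quick way to check whether hbase-site.xml is actually visible is to ask the class loader for it. The sketch below uses only the JDK; `HBaseSiteLocator` is a hypothetical helper name and is not part of Spark or HBase.

```java
import java.net.URL;
import java.util.Optional;

/** Sketch: report where (or whether) a config file is visible on the classpath. */
final class HBaseSiteLocator {
    /** Returns the URL of the named classpath resource, or empty if it is absent. */
    static Optional<URL> findOnClasspath(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Optional.ofNullable(loader.getResource(resourceName));
    }

    public static void main(String[] args) {
        Optional<URL> site = findOnClasspath("hbase-site.xml");
        if (site.isPresent()) {
            System.out.println("hbase-site.xml found at " + site.get());
        } else {
            System.out.println("hbase-site.xml is not on the classpath");
        }
    }
}
```

Running this with the same classpath as the driver shows which copy of the file would be picked up, if any.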


[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/5586#issuecomment-97433860
  
    I think this looks good.  It would be nice to have an example of accessing HBase for other users to reference, but that is out of scope for this PR.

