Posted to reviews@spark.apache.org by alexliu68 <gi...@git.apache.org> on 2015/02/05 23:31:08 UTC
[GitHub] spark pull request: [SPARK-5622][SQL] add connector configuration ...
GitHub user alexliu68 opened a pull request:
https://github.com/apache/spark/pull/4406
[SPARK-5622][SQL] add connector configuration to thrift-server
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/alexliu68/spark SPARK-SQL-5622
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4406.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4406
----
commit 482c08fd15bd24130bc34d33498557b86a242492
Author: Alex Liu <al...@yahoo.com>
Date: 2015-02-05T22:23:11Z
[SPARK-5622][SQL] add connector configuration to thrift-server
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
Posted by alexliu68 <gi...@git.apache.org>.
Github user alexliu68 closed the pull request at:
https://github.com/apache/spark/pull/4406
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-74166182
I'm confused about how this solution relates to what is requested in the JIRA. This appears to be moving configuration from System.properties to the hiveconf. Why not just read from the system properties directly?
I'll add that the correct integration point here would be to create a [RelationProvider](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala#L39). When a user asks for a Cassandra table, we pass the relation provider a SQLContext, and you can get any configuration you need from that object.
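The integration point described above can be sketched as follows. This is a minimal, self-contained illustration of the pattern only: the *Stub types mimic the shape of Spark's sql.sources interfaces, and the option names are hypothetical, not the actual Spark or spark-cassandra-connector API.

```scala
// Minimal sketch of the RelationProvider pattern: the provider receives
// a context object and reads any configuration it needs from there.
// The *Stub types mimic the shape of Spark's sql.sources interfaces;
// they are NOT the real Spark API, and the option names are made up.
case class SQLContextStub(conf: Map[String, String]) {
  def getConf(key: String, default: String): String =
    conf.getOrElse(key, default)
}

trait BaseRelationStub {
  def describe: String
}

trait RelationProviderStub {
  // Spark hands the provider a SQLContext plus per-table options.
  def createRelation(
      ctx: SQLContextStub,
      parameters: Map[String, String]): BaseRelationStub
}

class CassandraSourceStub extends RelationProviderStub {
  override def createRelation(
      ctx: SQLContextStub,
      parameters: Map[String, String]): BaseRelationStub = {
    // Connector credentials come from the context, not from
    // JVM-global system properties.
    val user = ctx.getConf("spark.cassandra.auth.username", "anonymous")
    val table = parameters.getOrElse("table", "unknown")
    new BaseRelationStub {
      def describe: String = s"cassandra table '$table' accessed as '$user'"
    }
  }
}
```

Because the SQLContext is passed in at table-resolution time, each session's configuration can differ without any connector-specific code in the Thrift server itself.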
Posted by alexliu68 <gi...@git.apache.org>.
Github user alexliu68 commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-74178413
The following log shows how HiveServer2 starts the metastore:
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: HiveServer2: Async execution pool size 50
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:OperationManager is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service: SessionManager is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service: CLIService is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:ThriftBinaryCLIService is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service: HiveServer2 is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:OperationManager is started.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:SessionManager is started.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:CLIService is started.
INFO 2015-02-12 15:37:56 org.apache.hadoop.hive.metastore.HiveMetaStore: 0: Opening raw store with implemenation class:com.datastax.bdp.hadoop.hive.metastore.CassandraHiveMetaStore
DEBUG 2015-02-12 15:37:56 org.apache.hadoop.conf.Configuration: java.io.IOException: config(config)
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:263)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getConf(HiveMetaStore.java:381)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:413)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:402)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:441)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:326)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:286)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4060)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:121)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:104)
at org.apache.hive.service.cli.CLIService.start(CLIService.java:82)
at org.apache.hive.service.CompositeService.start(CompositeService.java:70)
at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:73)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:69)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
DEBUG 2015-02-12 15:37:56 com.datastax.bdp.hadoop.hive.metastore.CassandraHiveMetaStore: Creating CassandraHiveMetaStore
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-74176301
Okay, this code is not giving you access to anything that you couldn't have just gotten from system properties, so it seems like a no-op to me.
It's also not clear to me why you need to be involved in metastore startup at all.
Posted by alexliu68 <gi...@git.apache.org>.
Github user alexliu68 commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-78637718
Let's close it.
Posted by helena <gi...@git.apache.org>.
Github user helena commented on a diff in the pull request:
https://github.com/apache/spark/pull/4406#discussion_r24527030
--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala ---
@@ -35,7 +37,7 @@ import org.apache.spark.scheduler.{SparkListenerApplicationEnd, SparkListener}
*/
object HiveThriftServer2 extends Logging {
var LOG = LogFactory.getLog(classOf[HiveServer2])
-
+ var connectors = Seq("cassandra")
--- End diff --
@alexliu68 Can't this be done with a -D system property that the connector reads, or with a setting in the Spark conf such as 'spark.cassandra.....'? I.e., I don't think anything Cassandra-specific should be here.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-73144244
Can one of the admins verify this patch?
Posted by alexliu68 <gi...@git.apache.org>.
Github user alexliu68 commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-74175322
When the Hive Thrift server with DSE integration starts, it starts our custom Hive metastore, which uses a Cassandra client to access Cassandra. The Cassandra client takes the username/password configuration settings to access Cassandra nodes. Another use case is connecting to the Thrift server through Beeline and passing the username/password; in that case the Cassandra client has to use hiveconf to get the username/password per user session. That's the reason we can't simply read it through system properties.
RelationProvider is not involved during Hive metastore startup, so I'm afraid it can't help in our case.
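The distinction being drawn here, between JVM-wide system properties and per-session configuration, can be sketched like this. The property key and the Map standing in for a session's hiveconf are illustrative assumptions, not the actual DSE or Hive code:

```scala
// JVM-global: one value shared by every session in the Thrift server,
// so two Beeline users cannot authenticate with different credentials.
// The property key "cassandra.username" is illustrative only.
def globalUser: String =
  sys.props.getOrElse("cassandra.username", "anonymous")

// Per-session: each connection can carry its own credentials, modeled
// here as a plain Map standing in for that session's hiveconf.
def sessionUser(sessionConf: Map[String, String]): String =
  sessionConf.getOrElse("cassandra.username", globalUser)
```

This is why a per-session conf is needed for the Beeline case: the global property can only ever hold one credential at a time.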
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-78574035
Any update here? Otherwise I suggest we close this issue.