Posted to reviews@spark.apache.org by alexliu68 <gi...@git.apache.org> on 2015/02/05 23:31:08 UTC
[GitHub] spark pull request: [SPARK-5622][SQL] add connector configuration ...
GitHub user alexliu68 opened a pull request:
https://github.com/apache/spark/pull/4406
[SPARK-5622][SQL] add connector configuration to thrift-server
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/alexliu68/spark SPARK-SQL-5622
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4406.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4406
----
commit 482c08fd15bd24130bc34d33498557b86a242492
Author: Alex Liu <al...@yahoo.com>
Date: 2015-02-05T22:23:11Z
[SPARK-5622][SQL] add connector configuration to thrift-server
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
Posted by alexliu68 <gi...@git.apache.org>.
Github user alexliu68 closed the pull request at:
https://github.com/apache/spark/pull/4406
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-74166182
I'm confused about how this solution relates to what is requested in the JIRA. This appears to be moving configuration from System.properties to the hiveconf. Why not just read from the system properties directly?
I'll add that the correct integration point here would be to create a [RelationProvider](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala#L39). When a user asks for a Cassandra table, we pass the relation provider a SQLContext, and you can get any configuration you need from that object.
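The integration point described above can be sketched as follows. This is a minimal, self-contained illustration of the pattern only: the *Stub types mimic the shape of Spark's sql.sources interfaces, and the option names are hypothetical, not the actual Spark or spark-cassandra-connector API.

```scala
// Minimal sketch of the RelationProvider pattern: the provider receives
// a context object and reads any configuration it needs from there.
// The *Stub types mimic the shape of Spark's sql.sources interfaces;
// they are NOT the real Spark API, and the option names are made up.
case class SQLContextStub(conf: Map[String, String]) {
  def getConf(key: String, default: String): String =
    conf.getOrElse(key, default)
}

trait BaseRelationStub {
  def describe: String
}

trait RelationProviderStub {
  // Spark hands the provider a SQLContext plus per-table options.
  def createRelation(
      ctx: SQLContextStub,
      parameters: Map[String, String]): BaseRelationStub
}

class CassandraSourceStub extends RelationProviderStub {
  override def createRelation(
      ctx: SQLContextStub,
      parameters: Map[String, String]): BaseRelationStub = {
    // Connector credentials come from the context, not from
    // JVM-global system properties.
    val user = ctx.getConf("spark.cassandra.auth.username", "anonymous")
    val table = parameters.getOrElse("table", "unknown")
    new BaseRelationStub {
      def describe: String = s"cassandra table '$table' accessed as '$user'"
    }
  }
}
```

Because the SQLContext is passed in at table-resolution time, each session's configuration can differ without any connector-specific code in the Thrift server itself.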
Posted by alexliu68 <gi...@git.apache.org>.
Github user alexliu68 commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-74178413
The following log shows how HiveServer2 starts the metastore:
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: HiveServer2: Async execution pool size 50
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:OperationManager is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service: SessionManager is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service: CLIService is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:ThriftBinaryCLIService is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service: HiveServer2 is inited.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:OperationManager is started.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:SessionManager is started.
INFO 2015-02-12 15:37:56 org.apache.hive.service.AbstractService: Service:CLIService is started.
INFO 2015-02-12 15:37:56 org.apache.hadoop.hive.metastore.HiveMetaStore: 0: Opening raw store with implemenation class:com.datastax.bdp.hadoop.hive.metastore.CassandraHiveMetaStore
DEBUG 2015-02-12 15:37:56 org.apache.hadoop.conf.Configuration: java.io.IOException: config(config)
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:263)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getConf(HiveMetaStore.java:381)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:413)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:402)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:441)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:326)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:286)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4060)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:121)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:104)
at org.apache.hive.service.cli.CLIService.start(CLIService.java:82)
at org.apache.hive.service.CompositeService.start(CompositeService.java:70)
at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:73)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:69)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
DEBUG 2015-02-12 15:37:56 com.datastax.bdp.hadoop.hive.metastore.CassandraHiveMetaStore: Creating CassandraHiveMetaStore
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-74176301
Okay, this code is not giving you access to anything that you couldn't have just gotten from system properties, so it seems like a no-op to me.
It's also not clear to me why you need to be involved in metastore startup at all.
Posted by alexliu68 <gi...@git.apache.org>.
Github user alexliu68 commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-78637718
Let's close it.
Posted by helena <gi...@git.apache.org>.
Github user helena commented on a diff in the pull request:
https://github.com/apache/spark/pull/4406#discussion_r24527030
--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala ---
@@ -35,7 +37,7 @@ import org.apache.spark.scheduler.{SparkListenerApplicationEnd, SparkListener}
*/
object HiveThriftServer2 extends Logging {
var LOG = LogFactory.getLog(classOf[HiveServer2])
-
+ var connectors = Seq("cassandra")
--- End diff --
@alexliu68 Can't this be done with a -D system property that the connector reads, or with a setting in the Spark conf such as 'spark.cassandra.....'? I.e., I don't think anything Cassandra-specific should be here.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-73144244
Can one of the admins verify this patch?
Posted by alexliu68 <gi...@git.apache.org>.
Github user alexliu68 commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-74175322
When the Hive Thrift server with DSE integration starts, it starts our custom Hive metastore, which uses a Cassandra client to access Cassandra. The Cassandra client takes the username/password configuration settings to access Cassandra nodes. Another use case is connecting to the Thrift server through Beeline and passing the username/password; in that case the Cassandra client has to use hiveconf to get the username/password per user session. That's the reason we can't simply read it through system properties.
RelationProvider is not involved during Hive metastore startup, so I'm afraid it can't help in our case.
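The distinction being drawn here, between JVM-wide system properties and per-session configuration, can be sketched like this. The property key and the Map standing in for a session's hiveconf are illustrative assumptions, not the actual DSE or Hive code:

```scala
// JVM-global: one value shared by every session in the Thrift server,
// so two Beeline users cannot authenticate with different credentials.
// The property key "cassandra.username" is illustrative only.
def globalUser: String =
  sys.props.getOrElse("cassandra.username", "anonymous")

// Per-session: each connection can carry its own credentials, modeled
// here as a plain Map standing in for that session's hiveconf.
def sessionUser(sessionConf: Map[String, String]): String =
  sessionConf.getOrElse("cassandra.username", globalUser)
```

This is why a per-session conf is needed for the Beeline case: the global property can only ever hold one credential at a time.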
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/4406#issuecomment-78574035
Any update here? Otherwise I suggest we close this issue.