Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/10/12 09:30:14 UTC

[GitHub] [incubator-seatunnel] AlexNilone opened a new issue, #3076: [Bug] [Module Name] Is JDBC access to Hive on CDH 5.13 not supported?

AlexNilone opened a new issue, #3076:
URL: https://github.com/apache/incubator-seatunnel/issues/3076

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
   
   
   ### What happened
   
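   For specific reasons, I need to access the HiveServer2 service through JDBC. The cluster runs CDH 5.13. Because of Hive version constraints, SPARK_HOME points to the Spark 2.4.0 directory installed through CDH parcels. The driver is the cluster's own hive-jdbc jar, placed in a separate subdirectory created under the plugins directory.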
   
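   Two problems:
   1. The query returns no data.
   2. When the job is submitted in YARN client mode, a GSS ticket error is reported (the cluster has Kerberos authentication enabled).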
   
   ### SeaTunnel Version
   
   2.1.3
   
   ### SeaTunnel Config
   
   ```conf
   env {
     spark.sql.catalogImplementation = "hive"
     spark.app.name = "SeaTunnel"
     spark.executor.instances = 1
     spark.executor.cores = 1
     spark.num.executors = 1
     spark.executor.memory = "1g"
     execution.parallelism = 1
     spark.yarn.keytab = /hdfs.keytab
     spark.yarn.principal = "hdfs/server001@MYCDH"
   }
   
   source {
     jdbc {
       driver = org.apache.hive.jdbc.HiveDriver,
       url = "jdbc:hive2://server001:10000/;principal=hive/server001@MYCDH",
       user = "hive",
       password = "hive",
       table = "test_seatunnel_source"
       result_table_name = "test_seatunnel_source"
     }
   }
   
   transform {
   }
   
   sink {
     Console {}
   }
   ```
   
   
   ### Running Command
   
   ```shell
   start-seatunnel-spark.sh --master local --deploy-mode client \
   	--config /data/apache-seatunnel-incubating-2.1.3/config/hivejdbc-console.conf
   
   
   start-seatunnel-spark.sh --master yarn --deploy-mode client \
   	--config /data/apache-seatunnel-incubating-2.1.3/config/hivejdbc-console.conf
   ```
   
   
   ### Error Exception
   
   ```log
   The table header is printed, but no data rows are returned.
   
   22/10/12 17:20:33 INFO jdbc.Utils: Resolved authority: cdh129135:10000
   22/10/12 17:20:33 INFO jdbc.JDBCRDD: closed connection
   22/10/12 17:20:33 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 1069 bytes result sent to driver
   22/10/12 17:20:33 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 876 ms on localhost (executor driver) (1/1)
   22/10/12 17:20:33 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
   22/10/12 17:20:33 INFO scheduler.DAGScheduler: ResultStage 0 (show at Console.scala:38) finished in 1.295 s
   22/10/12 17:20:33 INFO scheduler.DAGScheduler: Job 0 finished: show at Console.scala:38, took 1.344624 s
   +------------------------+--------------------------+
   |test_seatunnel_source.id|test_seatunnel_source.name|
   +------------------------+--------------------------+
   +------------------------+--------------------------+
   
   22/10/12 17:20:33 INFO spark.SparkContext: Invoking stop() from shutdown hook
   22/10/12 17:20:33 INFO server.AbstractConnector: Stopped Spark@1f52eb6f{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
   22/10/12 17:20:33 INFO ui.SparkUI: Stopped Spark web UI at http://cdh129135:4040
   22/10/12 17:20:33 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
   22/10/12 17:20:33 INFO memory.MemoryStore: MemoryStore cleared
   22/10/12 17:20:33 INFO storage.BlockManager: BlockManager stopped
   22/10/12 17:20:33 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
   22/10/12 17:20:33 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
   22/10/12 17:20:33 INFO spark.SparkContext: Successfully stopped SparkContext
   22/10/12 17:20:33 INFO util.ShutdownHookManager: Shutdown hook called
   22/10/12 17:20:33 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-c931a4e1-6abc-476b-813a-718773b5e110
   22/10/12 17:20:33 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7add2b21-6a02-4943-9cf1-5349f3fc3c37
   
   
   
   --- Running in YARN mode reports a GSS error
   Caused by: org.apache.thrift.transport.TTransportException: GSS initiate failed
           at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
           at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
           at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
           at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
           at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
           at java.security.AccessController.doPrivileged(Native Method)
           at javax.security.auth.Subject.doAs(Subject.java:422)
           at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
           at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
           at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:204)
   ```
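   
   One way to narrow both problems down is to run the same query through `beeline` after an explicit Kerberos login, outside SeaTunnel and Spark. This is a sketch under assumptions, not something taken from this report's environment. `GSS initiate failed` usually indicates that the connecting process had no valid Kerberos ticket, so checking the ticket cache first is a reasonable step. The function name `check_hive_jdbc` is hypothetical, and `kinit`, `klist`, and `beeline` are assumed to be on the PATH of a cluster gateway node.
   
   ```shell
   # Hypothetical helper: reproduce the HiveServer2 JDBC query outside SeaTunnel.
   # Arguments: $1 = keytab path, $2 = login principal, $3 = JDBC URL, $4 = table name
   check_hive_jdbc() {
     # Obtain a Kerberos ticket from the keytab; stop if the login fails.
     kinit -kt "$1" "$2" || { echo "kinit failed" >&2; return 1; }
     # List the ticket cache so the active principal and expiry are visible.
     klist || return 1
     # Run a small query against HiveServer2 over JDBC.
     beeline -u "$3" -e "SELECT * FROM $4 LIMIT 5"
   }
   ```
   
   With the values from the config above, the call would be `check_hive_jdbc /hdfs.keytab hdfs/server001@MYCDH "jdbc:hive2://server001:10000/;principal=hive/server001@MYCDH" test_seatunnel_source`. If `beeline` returns rows, the table and the Kerberos setup are probably fine and the problem more likely lies in how the Spark JDBC source builds or runs the query. If it returns nothing, the table may be empty or not readable by that principal.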
   
   
   ### Flink or Spark Version
   
   Spark 2.4.0 (official CDH 5.13 parcels package)
   
   ### Java or Scala Version
   
   1.8
   
   ### Screenshots
   
   1
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] CalvinKirs closed issue #3076: [Bug] [Module Name] Is JDBC access to Hive on CDH 5.13 not supported?

Posted by GitBox <gi...@apache.org>.
CalvinKirs closed issue #3076: [Bug] [Module Name] Is JDBC access to Hive on CDH 5.13 not supported?
URL: https://github.com/apache/incubator-seatunnel/issues/3076

