Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/02/21 21:51:54 UTC

[GitHub] [hudi] rubenssoto opened a new issue #2588: [SUPPORT] Cannot create hive connection

rubenssoto opened a new issue #2588:
URL: https://github.com/apache/hudi/issues/2588


   Hello People,
   
   EMR 6.1
   Spark 3.0.0
   Hudi 0.7.0
   
   My EMR is configured to use the Glue catalog as its metastore. Most of the time my Hudi jobs work great, but sometimes I have problems with the Hive connection:
   ```
   21/02/21 09:30:10 ERROR Client: Application diagnostics message: User class threw exception: java.lang.Exception: Error on Table: mobile_checkout, Error Message: org.apache.hudi.hive.HoodieHiveSyncException: Cannot create hive connection jdbc:hive2://ip-10-0-49-168.us-west-2.compute.internal:10000/
   	at jobs.TableProcessor.start(TableProcessor.scala:95)
   	at TableProcessorWrapper$.$anonfun$main$2(TableProcessorWrapper.scala:23)
   	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
   	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
   	at scala.util.Success.$anonfun$map$1(Try.scala:255)
   	at scala.util.Success.map(Try.scala:213)
   	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
   	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
   	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
   	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
   	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
   	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
   	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
   	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
   	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
   
   Exception in thread "main" org.apache.spark.SparkException: Application application_1613496813774_2019 finished with failed status
   	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1191)
   	at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1582)
   	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:936)
   	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
   	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
   	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
   	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1015)
   	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1024)
   	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   21/02/21 09:30:10 INFO ShutdownHookManager: Shutdown hook called
   ```
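   
   For context, the write-time Hive sync configuration involved here looks roughly like the sketch below. This is not the actual job code: the option keys are the standard Hudi 0.7 datasource keys, and the table, database, field names, and JDBC host are placeholders.
   
   ```scala
   import org.apache.spark.sql.{DataFrame, SaveMode}
   
   // Minimal sketch of a Hudi upsert with JDBC-based Hive sync enabled.
   // The hive_sync.jdbcurl connection below is the one that fails with
   // "Cannot create hive connection jdbc:hive2://...:10000/".
   def writeWithHiveSync(df: DataFrame, basePath: String): Unit = {
     df.write
       .format("hudi")
       .option("hoodie.table.name", "example_table")                     // placeholder
       .option("hoodie.datasource.write.recordkey.field", "id")          // placeholder
       .option("hoodie.datasource.write.precombine.field", "updated_at") // placeholder
       .option("hoodie.datasource.hive_sync.enable", "true")
       .option("hoodie.datasource.hive_sync.database", "example_db")     // placeholder
       .option("hoodie.datasource.hive_sync.table", "example_table")     // placeholder
       .option("hoodie.datasource.hive_sync.jdbcurl", "jdbc:hive2://<hiveserver-host>:10000")
       .mode(SaveMode.Append)
       .save(basePath)
   }
   ```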
   
   I opened a ticket with AWS; it has been open for more than 2 weeks and has not been solved yet.
   I know this is probably not a Hudi problem, but do you have any idea how to solve it?
   Is there another way in Hudi to sync with Hive?
   I have been using Hive with Spark for some months and I have never seen an error like this.
   
   
   Thank you so much!





[GitHub] [hudi] rubenssoto commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-785933427


   @bvaradar I think it is a Hive issue; I'm trying to increase the Hive heap size, and I hope it helps.
   
   I process the tables in concurrent threads, so I have almost 20 Hive connections open at once.
   
   Do you have any experience running Hudi and Hive this way? Hudi probably only executes simple queries to verify the table schema.
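   
   Not a Hudi fix, but since the tables are processed in parallel Futures on the default ForkJoin pool (visible in the stack trace), one thing that may help HiveServer2 is bounding the parallelism so fewer sync connections are open at the same time. A rough sketch under that assumption; the pool size, table list, and processTable body are placeholders, not the real job code:
   
   ```scala
   import java.util.concurrent.Executors
   import scala.concurrent.duration.Duration
   import scala.concurrent.{Await, ExecutionContext, Future}
   
   // Sketch: cap how many tables (and hence Hive sync connections) run at once
   // by using a fixed-size pool instead of the default ForkJoin pool.
   val maxConcurrentTables = 4 // placeholder: tune to what HiveServer2 tolerates
   implicit val boundedEc: ExecutionContext =
     ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(maxConcurrentTables))
   
   // Placeholder for the per-table Hudi write + Hive sync.
   def processTable(table: String): Unit = println(s"processing $table")
   
   val tables = Seq("table_a", "table_b", "table_c") // placeholder list
   val allDone = Future.traverse(tables)(t => Future(processTable(t)))
   Await.result(allDone, Duration.Inf)
   ```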
   
   





[GitHub] [hudi] rubenssoto commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-818455364


   Hello Guys,
   
   I think this is something related to EMR and Hive; we built our own workaround to only enable Hive sync when the schema changes.
   
   It could be a good feature for Hudi: enable sync only when the schema changes or a new partition arrives.
   
   I will close the issue, because it is not something related to Hudi.
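   
   For reference, a rough sketch of that kind of workaround (not our actual implementation; the schema-file location and helper name are placeholders): persist the schema that was last synced next to the table, and flip hoodie.datasource.hive_sync.enable only when the incoming DataFrame's schema differs. New partitions would still need their own check, as noted later in the thread.
   
   ```scala
   import java.nio.charset.StandardCharsets
   import org.apache.hadoop.fs.{FileSystem, Path}
   import org.apache.spark.sql.{DataFrame, SparkSession}
   
   // Sketch: return true (and record the new schema) only when the DataFrame
   // schema differs from the one stored next to the table. Paths are placeholders.
   def shouldSyncHive(spark: SparkSession, df: DataFrame, schemaFile: String): Boolean = {
     val fs            = FileSystem.get(new java.net.URI(schemaFile), spark.sparkContext.hadoopConfiguration)
     val path          = new Path(schemaFile)
     val currentSchema = df.schema.json
     val changed =
       if (!fs.exists(path)) true
       else {
         val in = fs.open(path)
         val previous = try scala.io.Source.fromInputStream(in, "UTF-8").mkString finally in.close()
         previous != currentSchema
       }
     if (changed) { // record the new schema for the next run
       val out = fs.create(path, true)
       try out.write(currentSchema.getBytes(StandardCharsets.UTF_8)) finally out.close()
     }
     changed
   }
   
   // Usage sketch (file name is a placeholder):
   // .option("hoodie.datasource.hive_sync.enable",
   //         shouldSyncHive(spark, df, s"$basePath/.last_synced_schema.json").toString)
   ```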





[GitHub] [hudi] rubenssoto commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-810773796


   @bvaradar which Hive version do you use? How many jobs run at the same time?
   
   In your case you probably have a large Hive deployment sized to handle many requests; in my case I use Hive inside EMR only to integrate with Glue.





[GitHub] [hudi] rubenssoto edited a comment on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto edited a comment on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-783534393


   <img width="1309" alt="Captura de Tela 2021-02-22 às 14 19 11" src="https://user-images.githubusercontent.com/36298331/108744715-07c64a80-7519-11eb-8b02-98261e74474d.png">
   
   Sometimes it takes a while for the error to show up; these jobs normally run in 5 minutes, but they spend more than 20 minutes trying to connect to Hive.





[GitHub] [hudi] rubenssoto commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-785309606


   Hello Guys,
   
   I found some new errors:
   
   `21/02/24 18:53:18 ERROR HiveSyncTool: Got runtime exception when hive syncing
   org.apache.hudi.hive.HoodieHiveSyncException: Failed to sync partitions for table order_delivery_failure
   	at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:211)
   	at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:148)
   	at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:94)
   	at org.apache.hudi.HoodieSparkSqlWriter$.syncHive(HoodieSparkSqlWriter.scala:355)
   	at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$4(HoodieSparkSqlWriter.scala:403)
   	at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$4$adapted(HoodieSparkSqlWriter.scala:399)
   	at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
   	at org.apache.hudi.HoodieSparkSqlWriter$.metaSync(HoodieSparkSqlWriter.scala:399)
   	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:460)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:218)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:134)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:124)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:123)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:963)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:104)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:227)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:132)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:104)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:227)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:132)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:248)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:131)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:963)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:399)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:288)
   	at hudiwriter.HudiWriter.merge(HudiWriter.scala:79)
   	at hudiwriter.HudiContext.writeToHudi(HudiContext.scala:34)
   	at jobs.TableProcessor.start(TableProcessor.scala:86)
   	at TableProcessorWrapper$.$anonfun$main$2(TableProcessorWrapper.scala:23)
   	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
   	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
   	at scala.util.Success.$anonfun$map$1(Try.scala:255)
   	at scala.util.Success.map(Try.scala:213)
   	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
   	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
   	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
   	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
   	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
   	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
   	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
   	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
   	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
   Caused by: java.lang.IllegalArgumentException: Partition key parts [] does not match with partition values []. Check partition strategy. 
   	at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:40)
   	at org.apache.hudi.hive.HoodieHiveClient.getPartitionClause(HoodieHiveClient.java:163)
   	at org.apache.hudi.hive.HoodieHiveClient.constructAddPartitions(HoodieHiveClient.java:147)
   	at org.apache.hudi.hive.HoodieHiveClient.addPartitionsToTable(HoodieHiveClient.java:121)
   	at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:206)
   	... 54 more`
   
   
   The error above happened many times, and after that I had this error:
   
   `21/02/24 18:54:05 ERROR HiveConnection: Error opening session
   org.apache.thrift.transport.TTransportException
   	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
   	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
   	at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
   	at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
   	at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
   	at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:38)
   	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
   	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
   	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
   	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
   	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
   	at org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:168)
   	at org.apache.hive.service.rpc.thrift.TCLIService$Client.OpenSession(TCLIService.java:155)
   	at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:680)
   	at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:200)
   	at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
   	at java.sql.DriverManager.getConnection(DriverManager.java:664)
   	at java.sql.DriverManager.getConnection(DriverManager.java:247)
   	at org.apache.hudi.hive.HoodieHiveClient.createHiveConnection(HoodieHiveClient.java:436)
   	at org.apache.hudi.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:88)
   	at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:66)
   	at org.apache.hudi.HoodieSparkSqlWriter$.syncHive(HoodieSparkSqlWriter.scala:355)
   	at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$4(HoodieSparkSqlWriter.scala:403)
   	at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$4$adapted(HoodieSparkSqlWriter.scala:399)
   	at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
   	at org.apache.hudi.HoodieSparkSqlWriter$.metaSync(HoodieSparkSqlWriter.scala:399)
   	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:460)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:218)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:134)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:124)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:123)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:963)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:104)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:227)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:132)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:104)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:227)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:132)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:248)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:131)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:963)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:399)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:288)
   	at hudiwriter.HudiWriter.merge(HudiWriter.scala:79)
   	at hudiwriter.HudiContext.writeToHudi(HudiContext.scala:34)
   	at jobs.TableProcessor.start(TableProcessor.scala:86)
   	at TableProcessorWrapper$.$anonfun$main$2(TableProcessorWrapper.scala:23)
   	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
   	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
   	at scala.util.Success.$anonfun$map$1(Try.scala:255)
   	at scala.util.Success.map(Try.scala:213)
   	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
   	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
   	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
   	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
   	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
   	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
   	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
   	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
   	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)`
   
   
   and the last one:
   
   `21/02/24 18:54:05 INFO TableProcessor: Searching for files for database: courier_api and table: poc_inbox_history
   Error on Table: deliveryman_route, Error Message: org.apache.hudi.hive.HoodieHiveSyncException: Cannot create hive connection jdbc:hive2://ip-10-0-57-142.us-west-2.compute.internal:10000/
   21/02/24 18:54:05 INFO TableProcessor: Searching for files for database: courier_api and table: poc_kpis
   java.lang.Exception: Error on Table: deliveryman_route, Error Message: org.apache.hudi.hive.HoodieHiveSyncException: Cannot create hive connection jdbc:hive2://ip-10-0-57-142.us-west-2.compute.internal:10000/
   	at jobs.TableProcessor.start(TableProcessor.scala:104)
   	at TableProcessorWrapper$.$anonfun$main$2(TableProcessorWrapper.scala:23)
   	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
   	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
   	at scala.util.Success.$anonfun$map$1(Try.scala:255)
   	at scala.util.Success.map(Try.scala:213)
   	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
   	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
   	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
   	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
   	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
   	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
   	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
   	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
   	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)`





[GitHub] [hudi] bvaradar commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-812256475


   We use hive-2.3.4 and have more than 100 jobs running.





[GitHub] [hudi] rubenssoto removed a comment on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto removed a comment on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-785309606





[GitHub] [hudi] rubenssoto commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-783534393


   <img width="1309" alt="Captura de Tela 2021-02-22 às 14 19 11" src="https://user-images.githubusercontent.com/36298331/108744715-07c64a80-7519-11eb-8b02-98261e74474d.png">
   
   Sometimes it takes a while for the error to show up; these jobs run in 5 minutes, but they spend more than 20 minutes trying to connect to Hive.





[GitHub] [hudi] bvaradar commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-785928870


   @rubenssoto : The stack trace does not contain Hudi in it, so I don't know how to help in this regard. Regarding the high CPU load on the Hive server, are you also running Hive queries apart from the HMS integration?
   
   @n3nash @nsivabalan : Any other ideas ?





[GitHub] [hudi] bvaradar edited a comment on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
bvaradar edited a comment on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-810682199


   Hive syncing through Hudi works fine in an AWS production setup (non-EMR) without any issues. In case this helps, here is an example config through DeltaStreamer:
   
   --hoodie-conf hoodie.datasource.hive_sync.database=<db> --hoodie-conf hoodie.datasource.hive_sync.enable=true --hoodie-conf hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://hiveserver:10000 --hoodie-conf hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.MultiPartKeysValueExtractor --hoodie-conf hoodie.datasource.hive_sync.partition_fields=<partion_fields> --hoodie-conf hoodie.datasource.hive_sync.table=<table_name>
   





[GitHub] [hudi] nsivabalan commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-810416047


   Balaji/Udit: once you respond, can you please remove the "awaiting-community-help" label from the issue and add the "awaiting-user-response" label?





[GitHub] [hudi] rubenssoto closed issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto closed issue #2588:
URL: https://github.com/apache/hudi/issues/2588


   





[GitHub] [hudi] nsivabalan commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-821760500


   Did you file a ticket with EMR then? If you know the actual root cause and the fix, it would be good to bring it up with the EMR folks so that EMR can patch the fix and others can benefit.





[GitHub] [hudi] bvaradar commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-783449274


   @umehrot2 : Can you check if there is anything that can be done in Hudi to fix this in the EMR ecosystem?





[GitHub] [hudi] rubenssoto commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-821765451


   @nsivabalan 
   I really tried; our migration to Hudi was delayed by more than 2 months, and we have had a ticket open with AWS for 2 months with no solution given to us.
   
   We will create a file in the table folder with all the table columns; when the new DataFrame has different columns compared to that file, we will enable Hudi Hive sync. I think that could solve the problem for now, until AWS gives us a better solution.
   
   Another approach we want to try is to sync the Hive table through the metastore, disabling JDBC (see the sketch below).
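   
   For the metastore approach, a minimal sketch of the write options involved, assuming the hoodie.datasource.hive_sync.use_jdbc flag is available in the Hudi version in use; the database and table names are placeholders:
   
   ```scala
   import org.apache.spark.sql.{DataFrame, SaveMode}
   
   // Sketch: Hive sync through the metastore client instead of a HiveServer2
   // JDBC connection. Assumes the use_jdbc flag exists in the Hudi build in use.
   def writeWithMetastoreSync(df: DataFrame, basePath: String): Unit = {
     df.write
       .format("hudi")
       .option("hoodie.table.name", "example_table")                 // placeholder
       .option("hoodie.datasource.hive_sync.enable", "true")
       .option("hoodie.datasource.hive_sync.use_jdbc", "false")      // no jdbc:hive2:// connection
       .option("hoodie.datasource.hive_sync.database", "example_db") // placeholder
       .option("hoodie.datasource.hive_sync.table", "example_table") // placeholder
       .mode(SaveMode.Append)
       .save(basePath)
   }
   ```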





[GitHub] [hudi] bvaradar commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-810682199


   Hive syncing through Hudi works fine in an AWS production setup (non-EMR) without any issues. Here is an example config through DeltaStreamer:
   
   --hoodie-conf hoodie.datasource.hive_sync.database=<db> --hoodie-conf hoodie.datasource.hive_sync.enable=true --hoodie-conf hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://hiveserver:10000 --hoodie-conf hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.MultiPartKeysValueExtractor --hoodie-conf hoodie.datasource.hive_sync.partition_fields=<partion_fields> --hoodie-conf hoodie.datasource.hive_sync.table=<table_name>
   
   





[GitHub] [hudi] rubenssoto commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-784705748


   @umehrot2 @bvaradar if you have any idea, please help me; this is the only problem that prevents me from putting Hudi into production.





[GitHub] [hudi] nsivabalan commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-821761181


   And with regard to your point about syncing Hive only when required:
   we might have to establish a Hive connection to get the schema, compare it with the latest Hudi schema, and find differences in partitions, so I am not sure how we can avoid creating Hive connections even when no changes are made to the schema or partitions.





[GitHub] [hudi] rubenssoto commented on issue #2588: [SUPPORT] Cannot create hive connection

Posted by GitBox <gi...@apache.org>.
rubenssoto commented on issue #2588:
URL: https://github.com/apache/hudi/issues/2588#issuecomment-784695940


   <img width="1680" alt="Captura de Tela 2021-02-23 às 23 09 52" src="https://user-images.githubusercontent.com/36298331/108935409-68927780-762c-11eb-9c3f-591f1b626557.png">
   
   I'm having the problem right now; on the master node you can see a lot of CPU usage, but I don't understand why Hive is using so much CPU.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org