Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/27 14:38:01 UTC

[GitHub] [hudi] jdattani opened a new issue, #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

jdattani opened a new issue, #5451:
URL: https://github.com/apache/hudi/issues/5451

   **Describe the problem you faced**
   
   Using DynamoDB as the lock provider for concurrent writes results in an error stating java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException
   
   **To Reproduce**
   
   Steps to reproduce the behaviour:
   
   - Build Hudi from 0.10.1 source files
   
   - Provide the following Hudi write options as part of a PySpark script: 'hoodie.write.concurrency.mode': 'optimistic_concurrency_control', 'hoodie.cleaner.policy.failed.writes': 'LAZY', 'hoodie.write.lock.provider': 'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider', 'hoodie.write.lock.dynamodb.table': '<TABLE_NAME>', 'hoodie.write.lock.dynamodb.partition_key': '<KEY_NAME>'
   
   
   **Expected behavior**
   
   Job is able to acquire lock.
   
   **Environment Description**
   
   * Hudi version : 0.10.1 
   
   * Spark version : 3.1.2
   
   * Hive version :
   
   * Hadoop version :
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   
   
   **Additional context**
   
   Running on Glue 3.0. The DynamoDB table was created manually, and the role assigned to the job has all the permissions needed to operate on the table.
   
   ```
   'hoodie.write.concurrency.mode': 'optimistic_concurrency_control',
   'hoodie.cleaner.policy.failed.writes': 'LAZY',
   'hoodie.write.lock.dynamodb.endpoint_url': 'dynamodb.us-east-1.amazonaws.com',
   'hoodie.write.lock.provider': 'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider',
   'hoodie.write.lock.dynamodb.table': '<TABLE_NAME>',
   'hoodie.write.lock.dynamodb.partition_key': '<KEY_NAME>',
   'hoodie.write.lock.dynamodb.region': 'us-east-1',
   
   ```
   
   Tried both with and without providing "hoodie.write.lock.dynamodb.endpoint_url"
   
   Job runs fine without concurrency mode configurations.
   
   **Stacktrace**
   
   ```
   2022-04-27 14:13:05,812 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(73)): Error from Python:Traceback (most recent call last):
     File "/tmp/glue_process_bundle.py", line 17, in <module>
       start_process(glue_ctx, config, glue_catalog_svc)
     File "/tmp/glue_process_bundle.zip/jobs/process.py", line 180, in start_signal_process
       load(final_df, config)
     File "/tmp/glue_process_bundle.zip/jobs/process.py", line 99, in load
       df.write.format("hudi").options(**hudi_options).mode("append").save(config.params.processed_bucket)
     File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1109, in save
       self._jwrite.save(path)
     File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
       answer, self.gateway_client, self.target_id, self.name)
     File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
       return f(*a, **kw)
     File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
       format(target_id, ".", name), value)
   py4j.protocol.Py4JJavaError: An error occurred while calling o255.save.
   : java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException
   	at java.lang.Class.forName0(Native Method)
   	at java.lang.Class.forName(Class.java:264)
   	at org.apache.hudi.common.util.ReflectionUtils.getClass(ReflectionUtils.java:54)
   	at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:89)
   	at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:100)
   	at org.apache.hudi.client.transaction.lock.LockManager.getLockProvider(LockManager.java:91)
   	at org.apache.hudi.client.transaction.lock.LockManager.unlock(LockManager.java:83)
   	at org.apache.hudi.client.transaction.TransactionManager.endTransaction(TransactionManager.java:71)
   	at org.apache.hudi.client.SparkRDDWriteClient.getTableAndInitCtx(SparkRDDWriteClient.java:445)
   	at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:157)
   	at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:217)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:277)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:185)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:223)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:220)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:181)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:238)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: java.lang.ClassNotFoundException: com.amazonaws.services.dynamodbv2.model.LockNotGrantedException
   	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
   	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
   	... 51 more
   ```
   
   Since this is a NoClassDefFoundError, I was wondering whether there are additional SDK jars that I need to include to use this functionality?
   
   Thanks.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] jdattani commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
jdattani commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1117374695

   @yihua @umehrot2 The issue was resolved after setting the DynamoDB partition key name to "key". Thanks for your inputs.




[GitHub] [hudi] yihua commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1112747391

   @umehrot2 Do you know the right setup for using DynamoDB as the lock provider?




[GitHub] [hudi] kazdy commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
kazdy commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1111420094

   Hi, I would add this jar to your job:
   https://mvnrepository.com/artifact/org.apache.hudi/hudi-aws
   
   I think these dependencies were separated from Spark bundle in Hudi 0.10.1 release: 
   https://github.com/apache/hudi/pull/4542




[GitHub] [hudi] yihua commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1113630739

   @umehrot2 It looks like DynamoDB throws a validation exception when it is used as the lock provider. Do you have any idea about this?




[GitHub] [hudi] jdattani closed issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
jdattani closed issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException
URL: https://github.com/apache/hudi/issues/5451




[GitHub] [hudi] yihua commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1113795072

   Thanks, @umehrot2, for the information. Do you think we can improve the documentation on how to use the DynamoDB-based lock provider?




[GitHub] [hudi] jtmzheng commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by "jtmzheng (via GitHub)" <gi...@apache.org>.
jtmzheng commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1481615398

   
   > @kazdy @yihua I tried including hudi-aws jar to Glue Dependent JARs path. But still getting the exact same error. Is there anything else I can try?
   
   This seems currently broken wrt https://github.com/apache/hudi/issues/5451#issuecomment-1111800234 using the hudi-aws-bundle (same issue). Ran into it on EMR 6.9 (/usr/lib/hudi/hudi-aws-bundle-0.12.1-amzn-0.jar)
   
   
   




[GitHub] [hudi] jdattani commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
jdattani commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1113001589

   @yihua I tried adding retries as you suggested, still got the same error msg:  Unable to acquire lock, lock object null.
   
   Additional detail on the error states : **"The provided key element does not match the schema "**
   
   ```
   'hoodie.write.concurrency.mode': 'optimistic_concurrency_control',
   'hoodie.cleaner.policy.failed.writes': 'LAZY',
   'hoodie.write.lock.dynamodb.endpoint_url': 'dynamodb.us-east-1.amazonaws.com',
   'hoodie.write.lock.provider': 'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider',
   'hoodie.write.lock.dynamodb.table': '<TABLE_NAME>',
   'hoodie.write.lock.dynamodb.partition_key': 'tablename',
   'hoodie.write.lock.dynamodb.region': 'us-east-1',
   'hoodie.write.lock.wait_time_ms':'1500',
   'hoodie.write.lock.num_retries':'10'
   ```
   
   I see it retried the operation 3 times.
   
   ```
   
   2022-04-29 07:55:09,998 WARN [Thread-8] dynamodbv2.AmazonDynamoDBLockClient (AmazonDynamoDBLockClient.java:acquireLock(534)): Could not acquire lock because of a client side failure in talking to DDB
   com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The provided key element does not match the schema (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: 6EVA0VDQKC3Q7HNI6NNCCL67MJVV4KQNSO5AEMVJF66Q9ASUAAJG; Proxy: null)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1819)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1403)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1372)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
   
   2022-04-29 07:55:11,010 WARN [Thread-8] dynamodbv2.AmazonDynamoDBLockClient (AmazonDynamoDBLockClient.java:acquireLock(534)): Could not acquire lock because of a client side failure in talking to DDB
   com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The provided key element does not match the schema (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: M630K6QR1GRQ8UNJKRV1CGNQAJVV4KQNSO5AEMVJF66Q9ASUAAJG; Proxy: null)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1819)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1403)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1372)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770
   
   
   2022-04-29 07:55:12,018 WARN [Thread-8] dynamodbv2.AmazonDynamoDBLockClient (AmazonDynamoDBLockClient.java:acquireLock(534)): Could not acquire lock because of a client side failure in talking to DDB
   com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The provided key element does not match the schema (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: N1U0BAITEL1G6Q9HH4UJKO2VERVV4KQNSO5AEMVJF66Q9ASUAAJG; Proxy: null)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1819)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1403)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1372)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
   
   ```
   I checked the CloudTrail logs and could see a successful "DescribeTable" call from the Glue job to the DynamoDB lock table, so that rules out any IAM-related issues.
   
   The table has only one field - partition key "tablename" as seen in screenshot below
   
   <img width="991" alt="image" src="https://user-images.githubusercontent.com/14809344/165906389-95bf5357-675f-4fea-8e27-7533acbe8558.png">
   
   I also manually inserted a dummy value in the table. 
   
   <img width="998" alt="image" src="https://user-images.githubusercontent.com/14809344/165906550-2d04497c-c8d7-4841-95da-579c4aba969a.png">
   
   Do we also need to define a sort key on the table? 
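
For reference, the mismatch behind the ValidationException above can be checked offline. This is a minimal sketch, assuming the `KeySchema` list has the shape returned by DynamoDB's DescribeTable call, and relying on a later reply in this thread that says the partition key column name is fixed to `key`. The helper names `hash_key_name` and `matches_lock_schema` are hypothetical and not part of Hudi or any AWS SDK.

```python
# Column name that a later reply in this thread reports as fixed.
EXPECTED_HASH_KEY = "key"


def hash_key_name(key_schema):
    """Return the partition (HASH) key attribute name from a KeySchema list."""
    for element in key_schema:
        if element["KeyType"] == "HASH":
            return element["AttributeName"]
    return None


def matches_lock_schema(key_schema):
    """True when the table's partition key column is named as expected."""
    return hash_key_name(key_schema) == EXPECTED_HASH_KEY


# Schema of the table described in this comment: partition key "tablename".
reported = [{"AttributeName": "tablename", "KeyType": "HASH"}]
print(matches_lock_schema(reported))  # False
```

The check only looks at the partition key name; it makes no claim about whether a sort key is required.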




[GitHub] [hudi] umehrot2 commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
umehrot2 commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1115302883

   @jdattani I see how it can be a bit confusing. The `partition_key` here actually means the value to use for the column, and not the column name itself. The column name is fixed to be `key`.
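
To illustrate the distinction, here is a minimal Python sketch, assuming the lock table is created by hand as in this issue. The helper names `lock_table_definition` and `lock_options` are hypothetical. The first returns keyword arguments in the shape that boto3's `create_table` accepts (the string attribute type and the billing mode are assumptions); the second returns Hudi write options in which `partition_key` carries the value stored in the `key` column.

```python
# Attribute name of the lock table's partition key, fixed to "key" per the
# explanation above.
LOCK_KEY_ATTRIBUTE = "key"


def lock_table_definition(table_name):
    """Keyword arguments shaped for a DynamoDB create_table call."""
    return {
        "TableName": table_name,
        "KeySchema": [{"AttributeName": LOCK_KEY_ATTRIBUTE, "KeyType": "HASH"}],
        # Assumption: the key is a string attribute.
        "AttributeDefinitions": [
            {"AttributeName": LOCK_KEY_ATTRIBUTE, "AttributeType": "S"}
        ],
        # Assumption: on-demand billing; choose whatever suits the workload.
        "BillingMode": "PAY_PER_REQUEST",
    }


def lock_options(table_name, lock_value, region):
    """Hudi write options for the DynamoDB-based lock provider."""
    return {
        "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
        "hoodie.cleaner.policy.failed.writes": "LAZY",
        "hoodie.write.lock.provider":
            "org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider",
        "hoodie.write.lock.dynamodb.table": table_name,
        # This is the VALUE written into the "key" column, not a column name.
        "hoodie.write.lock.dynamodb.partition_key": lock_value,
        "hoodie.write.lock.dynamodb.region": region,
    }
```

The options dictionary can be merged into the existing write options and passed as `df.write.format("hudi").options(**hudi_options)`, as in the original script.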
   
   




[GitHub] [hudi] ad1happy2go commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1559574192

   @jdattani Closing this bug as it's working fine. Please reopen it if you need any more support on this. Thanks.




[GitHub] [hudi] ad1happy2go commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1545975371

   @jtmzheng I was not able to reproduce the bug. I tried with both version 0.12.2 and the master code, and I was able to use DynamoDB without any issues with the Spark bundle and the AWS bundle.
   
   Can you please let us know the steps to reproduce the error?




[GitHub] [hudi] ad1happy2go commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1546096342

   I have also tried both Glue 3.0 and Glue 4.0 with the default "--datalake-formats hudi", and both work fine with DynamoDB-based concurrency control.




[GitHub] [hudi] yihua commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1115361949

   @jdattani Have you tried the suggestion to see if it works? Feel free to close this issue once all is good.




[GitHub] [hudi] jdattani commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
jdattani commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1111800234

   @kazdy @yihua I tried including hudi-aws jar to Glue Dependent JARs path. But still getting the exact same error. Is there anything else I can try?




[GitHub] [hudi] yihua commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1111530355

   @jdattani As @kazdy suggested, you need to add `hudi-aws` jar for classes that are specific to AWS.  Let us know if adding the jar works.




[GitHub] [hudi] yihua commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1112757926

   @jdattani Could you add the Hudi config below, bumping the retries, and try again? It is likely due to a transient error.
   ```
   hoodie.write.lock.client.num_retries=10
   ```
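
In a PySpark job the same setting goes into the write options dictionary. A minimal sketch, assuming the options are kept in a plain dict as in the original script; the helper name `with_lock_retries` is hypothetical. Note that this key, `hoodie.write.lock.client.num_retries`, is spelled differently from the `hoodie.write.lock.num_retries` key used in another comment in this thread.

```python
def with_lock_retries(options, client_retries=10):
    """Return a copy of the write options with the lock client retry count set."""
    updated = dict(options)  # copy so the caller's dict is left untouched
    updated["hoodie.write.lock.client.num_retries"] = str(client_retries)
    return updated
```

For example, `hudi_options = with_lock_retries(hudi_options)` before calling `df.write.format("hudi").options(**hudi_options)`.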




[GitHub] [hudi] jdattani commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
jdattani commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1112023970

   Update: I ended up adding all three jars that were removed in https://github.com/apache/hudi/pull/4542 to the Glue Dependent JARs path:
   
   com.amazonaws:dynamodb-lock-client
   com.amazonaws:aws-java-sdk-dynamodb
   com.amazonaws:aws-java-sdk-core
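
As a rough sketch of how such a jar list might be assembled for a Glue job, assuming the jars have been uploaded to S3 and are passed through Glue's `--extra-jars` job parameter as a comma-separated list. The helper names are hypothetical, and the check is only a file-name heuristic, not a classpath inspection.

```python
# Maven coordinates (group:artifact) of the three jars listed above.
REQUIRED_ARTIFACTS = [
    "com.amazonaws:dynamodb-lock-client",
    "com.amazonaws:aws-java-sdk-dynamodb",
    "com.amazonaws:aws-java-sdk-core",
]


def extra_jars_argument(jar_uris):
    """Join S3 URIs into one comma-separated value for --extra-jars."""
    return ",".join(jar_uris)


def missing_artifacts(jar_uris):
    """List required coordinates whose artifact id appears in no jar file name."""
    file_names = [uri.rsplit("/", 1)[-1] for uri in jar_uris]
    missing = []
    for coordinate in REQUIRED_ARTIFACTS:
        artifact_id = coordinate.split(":")[1]
        if not any(artifact_id in name for name in file_names):
            missing.append(coordinate)
    return missing
```

Calling `missing_artifacts` on the planned jar URIs before submitting the job gives an early hint when one of the three artifacts has been left out.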
   
   The error changed. I am now getting: An error occurred while calling o255.save. Unable to acquire lock, lock object null.
   The issue is similar to the one mentioned here: https://github.com/apache/hudi/issues/4456, and I am using the same config as suggested there. The only difference from what is suggested [in issue 4456](https://github.com/apache/hudi/issues/4456#issuecomment-1041228841) is that my table was already created.
   
   ```
   2022-04-28 10:00:19,695 ERROR [spark-listener-group-eventLog] scheduler.AsyncEventQueue (Logging.scala:logError(94)): Listener EventLoggingListener threw an exception
   java.util.ConcurrentModificationException
   	at java.util.Hashtable$Enumerator.next(Hashtable.java:1408)
   	at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$6.next(Wrappers.scala:424)
   	at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$6.next(Wrappers.scala:420)
   	at scala.collection.Iterator.foreach(Iterator.scala:941)
   	at scala.collection.Iterator.foreach$(Iterator.scala:941)
   	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
   	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
   
   File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
       format(target_id, ".", name), value)
   py4j.protocol.Py4JJavaError: An error occurred while calling o255.save.
   : **org.apache.hudi.exception.HoodieLockException: Unable to acquire lock, lock object null**
   	at org.apache.hudi.client.transaction.lock.LockManager.lock(LockManager.java:76)
   	at org.apache.hudi.client.transaction.TransactionManager.beginTransaction(TransactionManager.java:51)
   	at org.apache.hudi.client.SparkRDDWriteClient.getTableAndInitCtx(SparkRDDWriteClient.java:430)
   	at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:157)
   	at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:217)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:277)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:185)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:223)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:220)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:181)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:238)
   	at java.lang.Thread.run(Thread.java:750)
   ```
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] umehrot2 commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
umehrot2 commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1115303685

   I created a couple of Jira tickets to improve the naming and docs, and to package the jars in a bundle so customers don't have to pass these AWS-specific jars manually:
   
   https://issues.apache.org/jira/browse/HUDI-4011
   https://issues.apache.org/jira/browse/HUDI-4010




[GitHub] [hudi] umehrot2 commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
umehrot2 commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1113643504

   Did you create the DynamoDB table manually? The code needs a string attribute named `key` in that table, and that's it. I am not sure where the `tableName` attribute came from; I am guessing you added it manually.
   
   I would recommend deleting the table and letting the Hudi code set up the DynamoDB table for you. As for the other class-not-found errors: yes, it seems those jars have been removed from the bundle. Simply adding the `hudi-aws` jar will not help, since it is not a bundle. I will look into why that change was made, but for now you have to work around it by adding those jars yourself.
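
The key requirement described above can be checked before a job runs. What follows is a minimal sketch and not part of Hudi: `has_expected_lock_key` and `check_table` are hypothetical helpers that inspect a DynamoDB `describe_table` response (as returned by boto3) and report whether the table's partition (HASH) key is a string attribute named `key`. The table name and region passed to `check_table` are placeholders you would supply.

```python
from typing import Any, Dict


def has_expected_lock_key(description: Dict[str, Any], key_name: str = "key") -> bool:
    """Return True if the table's HASH key is a string attribute called key_name.

    `description` is the dict returned by DynamoDB's DescribeTable call
    (for example boto3's client.describe_table). This helper is hypothetical
    and only illustrates the requirement discussed in this thread.
    """
    table = description.get("Table", {})
    # Names of the attributes used as the partition (HASH) key.
    hash_keys = [
        entry["AttributeName"]
        for entry in table.get("KeySchema", [])
        if entry.get("KeyType") == "HASH"
    ]
    # Map each declared attribute to its type ("S" means string).
    attr_types = {
        entry["AttributeName"]: entry["AttributeType"]
        for entry in table.get("AttributeDefinitions", [])
    }
    return hash_keys == [key_name] and attr_types.get(key_name) == "S"


def check_table(table_name: str, region: str) -> bool:
    """Fetch the table description with boto3 and validate its key schema."""
    import boto3  # imported lazily so the pure check above works without boto3

    client = boto3.client("dynamodb", region_name=region)
    return has_expected_lock_key(client.describe_table(TableName=table_name))
```

A table whose partition key is called `tableName`, as in the manually created table discussed here, would make `has_expected_lock_key` return `False`.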




[GitHub] [hudi] jdattani commented on issue #5451: [SUPPORT] Hudi 0.10.1 raises exception java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException

Posted by GitBox <gi...@apache.org>.
jdattani commented on issue #5451:
URL: https://github.com/apache/hudi/issues/5451#issuecomment-1113914876

   @umehrot2 Thanks for your input. Yes, the table was created manually.
   
   Since `partition_key` is a setting in the Hudi config, I assumed the key attribute could have any name and Hudi would read it from the config.
   https://hudi.apache.org/docs/concurrency_control/
   
   hoodie.write.lock.provider=org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider
   hoodie.write.lock.dynamodb.table
   hoodie.write.lock.dynamodb.partition_key
   hoodie.write.lock.dynamodb.region
   
   Nevertheless, I will try changing the partition key field name and post an update on whether that solves the issue.
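
For reference, the lock settings named in this thread could be assembled in a PySpark script roughly as follows. This is only a sketch: `dynamodb_lock_options` is a hypothetical helper, the argument values are placeholders, and the option keys are the ones quoted in this issue.

```python
from typing import Dict


def dynamodb_lock_options(table: str, partition_key: str, region: str) -> Dict[str, str]:
    """Build the Hudi write options for DynamoDB-based locking quoted in this issue."""
    return {
        "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
        "hoodie.cleaner.policy.failed.writes": "LAZY",
        "hoodie.write.lock.provider": (
            "org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider"
        ),
        "hoodie.write.lock.dynamodb.table": table,
        "hoodie.write.lock.dynamodb.partition_key": partition_key,
        "hoodie.write.lock.dynamodb.region": region,
    }


# Placeholder values; replace them with your own table, key value, and region.
options = dynamodb_lock_options("<TABLE_NAME>", "<KEY_NAME>", "us-east-1")
```

The resulting dictionary can be merged with the other Hudi write options and passed to the DataFrame writer via `.options(**options)`.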
   

