Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/10/11 17:36:32 UTC

[GitHub] [hudi] gtwuser opened a new issue, #6925: [SUPPORT]Dynamo DB is not getting created during concurrent writes to table

gtwuser opened a new issue, #6925:
URL: https://github.com/apache/hudi/issues/6925

   Using the configs below, as mentioned in the documentation, we are writing multiple DataFrames to Hudi tables concurrently via `concurrent.futures.ProcessPoolExecutor(max_workers=200) as executor`, but the DynamoDB lock table is not getting created. Please correct me here; I am trying to understand why the DynamoDB table is not getting created (a sketch of the write pattern follows the full config below).
   ```bash
   'hoodie.write.lock.provider': 'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider',
   'hoodie.write.lock.dynamodb.table': 'hudi_db_lock',
   'hoodie.write.lock.dynamodb.endpoint_url': 'dynamodb.us-east-1.amazonaws.com',
   'hoodie.write.lock.dynamodb.partition_key': 'hudi_db_lock',
   ```
   
   `Full config`:
   ```bash
   commonConfig = {
                   'className': 'org.apache.hudi',
                   'hoodie.datasource.hive_sync.use_jdbc': 'false',
                   'hoodie.datasource.write.precombine.field': 'payload.recordedAt',
                   'hoodie.datasource.write.recordkey.field': 'metadata.msgID,metadata.topic',
                   'hoodie.write.lock.provider': 'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider',
                   'hoodie.write.lock.dynamodb.table': 'hudi_db_lock',
                   'hoodie.write.lock.dynamodb.endpoint_url':'dynamodb.us-east-1.amazonaws.com',
                   'hoodie.write.lock.dynamodb.partition_key': 'hudi_db_lock',
                   'hoodie.table.name': 'sse',
                   # 'hoodie.consistency.check.enabled': 'true',
                   'hoodie.datasource.hive_sync.database': args['database_name'],
                   'hoodie.datasource.write.reconcile.schema': 'true',
                   'hoodie.datasource.hive_sync.table': f'sse_{"_".join(prefix.split("/")[-7:-5])}'.lower(),
                   'hoodie.datasource.hive_sync.enable': 'true',
                   'path': 's3://' + args['curated_bucket'] + '/merged/sse-native/' + f'{prefix.split("/")[-7]}'.lower(),
                    # max parquet file size target below is 128,000,000 bytes (~128 MB)
                    'hoodie.parquet.small.file.limit': '307200',
                    'hoodie.parquet.max.file.size': '128000000'
               }
   ```
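
   For context, a minimal sketch (not the actual job script) of the concurrent write pattern described above; `startMerging` and `keys` mirror the names visible in the traceback later in this thread, while `build_dataframe_for` is a hypothetical stand-in used only for illustration:
   ```python
   # Minimal sketch of the concurrent write pattern described above.
   # `build_dataframe_for` is a hypothetical stand-in for the real job code.
   import concurrent.futures

   def startMerging(prefix):
       df = build_dataframe_for(prefix)      # hypothetical: builds the DataFrame for one S3 prefix
       (df.write
          .format('org.apache.hudi')         # same datasource as 'className' in commonConfig
          .options(**commonConfig)           # the shared Hudi/lock config shown above (includes 'path')
          .mode('append')
          .save())

   # 200 workers means up to 200 writers hitting the same table concurrently,
   # which is why the DynamoDB-based lock provider is configured.
   with concurrent.futures.ProcessPoolExecutor(max_workers=200) as executor:
       for result in executor.map(startMerging, keys):   # `keys` = list of S3 prefixes to merge
           pass
   ```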
   
   **Environment Description**
   
   * Hudi version : 0.11.1
   
   * Spark version : 3.1
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) :  no
   
   
   **Additional context**
   
   
   We are running the Hudi APIs via AWS Glue jobs.
   
   @n3nash @nsivabalan @alexeykudinkin please provide some pointers on this.




[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1275137180

   Tried following the steps mentioned in https://github.com/apache/hudi/issues/2533, but I don't find any files containing this instant time `20221011182252268`.
   
   .hoodie folder contents: https://gist.github.com/gtwuser/c08aac8fbb30574e225cb0cdbdf334d9




[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1275108161

   I suspect this issue is leading to other errors like the one below; please provide some info on this as well:
   ```bash
   2022-10-11 18:24:54,723 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(73)): Error from Python:Traceback (most recent call last):
     File "/tmp/second-delete-upsert.py", line 231, in <module>
       main()
     File "/tmp/second-delete-upsert.py", line 227, in main
       for result in executor.map(startMerging, keys):
     File "/usr/lib64/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
       yield fs.pop().result()
     File "/usr/lib64/python3.7/concurrent/futures/_base.py", line 435, in result
       return self.__get_result()
     File "/usr/lib64/python3.7/concurrent/futures/_base.py", line 384, in __get_result
       raise self._exception
     File "/usr/lib64/python3.7/concurrent/futures/thread.py", line 57, in run
       result = self.fn(*self.args, **self.kwargs)
     File "/tmp/second-delete-upsert.py", line 219, in startMerging
       .mode('append') \
     File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1107, in save
       self._jwrite.save()
     File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
       answer, self.gateway_client, self.target_id, self.name)
     File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
       return f(*a, **kw)
     File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
       format(target_id, ".", name), value)
   py4j.protocol.Py4JJavaError: An error occurred while calling o3614.save.
   : org.apache.hudi.exception.HoodieRollbackException: Failed to rollback s3://hudi-curated-bucket-b36cf0f0/merged/sse-native/fmc commits 20221011182252268
   	at org.apache.hudi.client.BaseHoodieWriteClient.rollback(BaseHoodieWriteClient.java:789)
   	at org.apache.hudi.client.BaseHoodieWriteClient.rollbackFailedWrites(BaseHoodieWriteClient.java:1198)
   	at org.apache.hudi.client.BaseHoodieWriteClient.rollbackFailedWrites(BaseHoodieWriteClient.java:1181)
   	at org.apache.hudi.client.BaseHoodieWriteClient.rollbackFailedWrites(BaseHoodieWriteClient.java:1169)
   	at org.apache.hudi.client.BaseHoodieWriteClient.lambda$startCommitWithTime$97cdbdca$1(BaseHoodieWriteClient.java:970)
   	at org.apache.hudi.common.util.CleanerUtils.rollbackFailedWrites(CleanerUtils.java:151)
   	at org.apache.hudi.client.BaseHoodieWriteClient.startCommitWithTime(BaseHoodieWriteClient.java:969)
   	at org.apache.hudi.client.BaseHoodieWriteClient.startCommitWithTime(BaseHoodieWriteClient.java:962)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:328)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:184)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:185)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:223)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:220)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:181)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:301)
   	at sun.reflect.GeneratedMethodAccessor231.invoke(Unknown Source)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:238)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieRollbackException: Found in-flight commits after time :20221011182252268, please rollback greater commits first
   	at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.validateRollbackCommitSequence(BaseRollbackActionExecutor.java:198)
   	at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.doRollbackAndGetStats(BaseRollbackActionExecutor.java:220)
   	at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.runRollback(BaseRollbackActionExecutor.java:118)
   	at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.execute(BaseRollbackActionExecutor.java:145)
   	at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.rollback(HoodieSparkCopyOnWriteTable.java:281)
   	at org.apache.hudi.client.BaseHoodieWriteClient.rollback(BaseHoodieWriteClient.java:773)
   	... 46 more
   ```
   




[GitHub] [hudi] nsivabalan commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1284374671

   And something to keep in mind: if you are spinning up 200 concurrent workers, some are bound to fail when commits overlap and a writer times out before it can acquire the lock, simply because there are 200 concurrent writers. You may have to increase some of the lock configs like retries, lock timeouts, etc.
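
   For illustration, the kind of lock retry/timeout knobs being referred to might look like this in the same config-dict style used earlier in the thread (the values are placeholders, and the exact keys should be verified against the `LockConfiguration` of the Hudi version in use):
   ```python
   # Placeholder values; verify the keys against the Hudi version in use (0.11.x here).
   lockTuningConfig = {
       'hoodie.write.lock.wait_time_ms': '60000',                     # how long a writer waits for the lock
       'hoodie.write.lock.num_retries': '15',                         # retries at the Hudi lock layer
       'hoodie.write.lock.wait_time_ms_between_retry': '5000',        # back-off between those retries
       'hoodie.write.lock.client.num_retries': '15',                  # retries inside the lock client
       'hoodie.write.lock.client.wait_time_ms_between_retry': '5000',
   }
   commonConfig.update(lockTuningConfig)
   ```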
   




[GitHub] [hudi] nsivabalan commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1289920224

   sure, thanks!
   




[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1303982961

   > > Oh btw, one issue I see w/ your configs is, you are not setting the configs properly for multi-writer. https://hudi.apache.org/docs/concurrency_control#enabling-multi-writing
   > > ```
   > > hoodie.write.concurrency.mode=optimistic_concurrency_control
   > > hoodie.cleaner.policy.failed.writes=LAZY
   > > hoodie.write.lock.provider=<lock-provider-classname>
   > > ```
   > >
   > > you were setting the last one, but not the first one.
   > > Also, can you first try out using just 1 thread along w/ setting the above configs? We can go from there if the basic setup is not working.
   > 
   > Sorry for the delay. After adding the above changes we are now getting this error: `Caused by: java.lang.ClassNotFoundException: org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider` [link to the full error stack](https://gist.github.com/gtwuser/cd23f4851aab035774717f08ce1fbfc2). I feel we are close to fixing it, so please let me know what I am missing now. I checked the classpath of the `DynamoDBBasedLockProvider` and it seems correct.
   > 
   > Please note that the jars used are from Hudi `v0.13.0`. **Jars used**:
   > 
   > ```shell
   > hudi-utilities-bundle_2.12-0.13.0-SNAPSHOT.jar,
   > hudi-spark3.1-bundle_2.12-0.13.0-SNAPSHOT.jar,
   > calcite-core-1.30.0.jar
   > ```
   > 
   > **config used for locking**:
   > 
   > ```shell
   >             'className': 'org.apache.hudi',
   >             'hoodie.datasource.hive_sync.use_jdbc': 'false',
   >             'hoodie.datasource.write.precombine.field': 'payload.recordedAt',
   >             'hoodie.datasource.write.recordkey.field': 'metadata.msgID,metadata.topic',
   >             'hoodie.write.lock.provider': 'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider',
   >             'hoodie.write.lock.dynamodb.table': 'hudi_db_lock',
   >             'hoodie.write.lock.dynamodb.partition_key':'hudi_db_lock',
   >             'hoodie.write.concurrency.mode': 'optimistic_concurrency_control',
   >             'hoodie.write.lock.dynamodb.endpoint_url':'dynamodb.us-east-1.amazonaws.com',
   >             'hoodie.cleaner.policy.failed.writes': 'LAZY',
   > ```
   
   The same ClassNotFoundException is reproducible with the jars below as well, which are from the Hudi `0.12.1` build:
   ```bash
   hudi-utilities_2.12-0.12.1.jar,
   hudi-spark3.1-bundle_2.12-0.12.1.jar,
   calcite-core-1.30.0.jar
   ```
   




[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1303973569

   Another observation: with the jars below on the classpath, the error changes to
   `Caused by: java.lang.NoClassDefFoundError: com/amazonaws/services/dynamodbv2/model/LockNotGrantedException`:
   
   **jars used**:
   ```bash
    hudi-utilities_2.12-0.11.1.jar, 
    hudi-spark3.1-bundle_2.12-0.11.1.jar, 
    calcite-core-1.30.0.jar
   ```
   **errors**: [link to fullstack error](https://gist.github.com/gtwuser/4cb8db2c8b186eb8954bb074262aada4)
   




[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1284446526

   Sure @nsivabalan, will test this as well and get back to you ASAP.




[GitHub] [hudi] nsivabalan commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1301606359

   @gtwuser: any updates, please?




[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1303947778

   > Oh btw, one issue I see w/ your configs is, you are not setting the configs properly for multi-writer. https://hudi.apache.org/docs/concurrency_control#enabling-multi-writing
   > 
   > ```
   > hoodie.write.concurrency.mode=optimistic_concurrency_control
   > hoodie.cleaner.policy.failed.writes=LAZY
   > hoodie.write.lock.provider=<lock-provider-classname>
   > ```
   > 
   > you were setting the last one, but not the first one.
   > 
   > Also, can you first try out using just 1 thread along w/ setting the above configs? We can go from there if the basic setup is not working.
   
   Sorry for the delay. After adding the above changes we are now getting this error:
   `Caused by: java.lang.ClassNotFoundException: org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider`
   [link to the full error stack](https://gist.github.com/gtwuser/cd23f4851aab035774717f08ce1fbfc2)
   I feel we are close to fixing it, so please let me know what I am missing now. I checked the classpath of the `DynamoDBBasedLockProvider` and it seems correct.
   
   **config used for locking**:
   ```bash
               'className': 'org.apache.hudi',
               'hoodie.datasource.hive_sync.use_jdbc': 'false',
               'hoodie.datasource.write.precombine.field': 'payload.recordedAt',
               'hoodie.datasource.write.recordkey.field': 'metadata.msgID,metadata.topic',
               'hoodie.write.lock.provider': 'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider',
               'hoodie.write.lock.dynamodb.table': 'hudi_db_lock',
               'hoodie.write.lock.dynamodb.partition_key':'hudi_db_lock',
               'hoodie.write.concurrency.mode': 'optimistic_concurrency_control',
               'hoodie.write.lock.dynamodb.endpoint_url':'dynamodb.us-east-1.amazonaws.com',
               'hoodie.cleaner.policy.failed.writes': 'LAZY',
   ```



[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1303995265

   > Hi @gtwuser could you provide the full Spark driver logs for the job with concurrent writers? Could you also share the files under `.hoodie/` in the table base path for debugging?

   @yihua please find the contents of the `.hoodie` folder attached, but I see all files are 0 bytes.
   [_hoodie.zip](https://github.com/apache/hudi/files/9941196/_hoodie.zip)
   




[GitHub] [hudi] yihua commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1276868209

   Hi @gtwuser could you provide the full Spark driver logs for the job with concurrent writers?  Could you also share the files under `.hoodie/` in the table base path for debugging? 




[GitHub] [hudi] nsivabalan commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1284372437

   Oh btw, one issue I see w/ your configs is, you are not setting the configs properly for multi-writer.
   https://hudi.apache.org/docs/concurrency_control#enabling-multi-writing
   ```
   hoodie.write.concurrency.mode=optimistic_concurrency_control
   hoodie.cleaner.policy.failed.writes=LAZY
   hoodie.write.lock.provider=<lock-provider-classname>
   ```
   
   you were setting the last one, but not the first one. 
   
   Also, can you first try out using just 1 thread along w/ setting the above configs? We can go from there if the basic setup is not working.
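
   In the same dict style as the reporter's `commonConfig`, the multi-writer additions from that doc would look roughly like this (the lock provider line is the one already being set; only the first two keys are missing):
   ```python
   # Sketch: the three multi-writer settings from the concurrency-control docs,
   # merged into the existing config dict.
   multiWriterConfig = {
       'hoodie.write.concurrency.mode': 'optimistic_concurrency_control',
       'hoodie.cleaner.policy.failed.writes': 'LAZY',
       'hoodie.write.lock.provider': 'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider',
   }
   commonConfig.update(multiWriterConfig)
   ```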
   
   
   




[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1275139488

   ^^ @nsivabalan @umehrot2 @vinothchandar @alexeykudinkin 




[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1303965269

   Another issue which I found happening recently along with the ClassNotFoundException:
   ```bash
   2022-11-04 17:28:19,605 INFO [Thread-19] heartbeat.HoodieHeartbeatClient (HoodieHeartbeatClient.java:stop(196)): Deleted heartbeat file for instant 20221104172809097
   2022-11-04 17:28:19,605 INFO [Thread-19] transaction.TransactionManager (TransactionManager.java:close(84)): Transaction manager closed
   2022-11-04 17:28:19,606 ERROR [Thread-19] v2.AppendDataExec (Logging.scala:logError(73)): Data source write support org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite@11fa0f7f is aborting.
   2022-11-04 17:28:19,606 ERROR [Thread-19] internal.DataSourceInternalWriterHelper (DataSourceInternalWriterHelper.java:abort(94)): Commit 20221104172809097 aborted 
    2022-11-04 17:28:19,637 ERROR [Thread-16] v2.AppendDataExec (Logging.scala:logError(73)): Data source write support org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite@61232ffe is aborting.
   ```




[GitHub] [hudi] nsivabalan commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1284365727

   @gtwuser : gentle ping. 




[GitHub] [hudi] gtwuser closed issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser closed issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table
URL: https://github.com/apache/hudi/issues/6925




[GitHub] [hudi] gtwuser commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
gtwuser commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1304855171

   After scanning through the existing GitHub issues I was able to fix it. Ref: [#5451](https://github.com/apache/hudi/issues/5451)
   Thanks to all.
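
   For readers hitting the same `ClassNotFoundException` / `NoClassDefFoundError` shown earlier in this thread: those errors mean the classes of the DynamoDB lock provider and of the DynamoDB lock client are not on the Glue job classpath. A purely illustrative sketch of Glue job parameters follows; the artifact names, versions, and S3 paths are assumptions, not taken from the linked issue:
   ```python
   # Hypothetical Glue job argument sketch; jar names/versions/paths are assumptions.
   glue_job_args = {
       '--extra-jars': ','.join([
           's3://my-artifacts/hudi-spark3.1-bundle_2.12-0.12.1.jar',
           's3://my-artifacts/hudi-aws-0.12.1.jar',              # contains DynamoDBBasedLockProvider
           's3://my-artifacts/dynamodb-lock-client-1.1.0.jar',   # contains LockNotGrantedException
           's3://my-artifacts/aws-java-sdk-dynamodb-1.12.196.jar',
       ]),
   }
   ```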




[GitHub] [hudi] nsivabalan commented on issue #6925: [SUPPORT]Table in Dynamo DB is not getting created during concurrent writes to table

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #6925:
URL: https://github.com/apache/hudi/issues/6925#issuecomment-1305139824

   👍 
   

