Posted to commits@hudi.apache.org by "pthalasta (via GitHub)" <gi...@apache.org> on 2023/06/07 23:07:42 UTC

[GitHub] [hudi] pthalasta opened a new issue, #8901: [SUPPORT] Spark job never terminates

pthalasta opened a new issue, #8901:
URL: https://github.com/apache/hudi/issues/8901

   **Describe the problem you faced**
   The Spark job never terminates when I run the sample code provided in the spark-guide docs. It prints a `Dynamic Attach failed` warning and produces no further output.
   
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Save the spark-guide example as a PySpark script and run it with:
   ```
   spark-submit  --deploy-mode client    spark-test.py  --packages org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.1,spark-avro_2.13-3.4.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
   ```
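
A note on the command above: `spark-submit` expects its options before the application file, and anything after the file is passed to the application as arguments. The sketch below assembles the same command with the options placed first. `build_spark_submit` is a hypothetical helper, not part of Spark or Hudi, and the `spark-avro` entry from the report is left out because its coordinate is not in `group:artifact:version` form as given.

```python
import shlex


def build_spark_submit(script, packages, conf, deploy_mode="client"):
    """Return a spark-submit argv with every option placed before the script.

    Hypothetical helper for illustration only.
    """
    argv = ["spark-submit", "--deploy-mode", deploy_mode]
    if packages:
        # --packages takes a single comma-separated list of Maven coordinates.
        argv += ["--packages", ",".join(packages)]
    for key, value in conf.items():
        argv += ["--conf", f"{key}={value}"]
    # The application file comes last; anything after it would be app arguments.
    argv.append(script)
    return argv


cmd = build_spark_submit(
    "spark-test.py",
    packages=["org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.1"],
    conf={
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.hudi.catalog.HoodieCatalog",
        "spark.sql.extensions": "org.apache.spark.sql.hudi.HoodieSparkSessionExtension",
    },
)
print(shlex.join(cmd))
```

Reordering the options does not by itself address the hang reported here; it only ensures the options reach `spark-submit` rather than the script.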
   
   **Environment Description**
   OS: MacOS
   
   * Hudi version : hudi bundle hudi-spark-bundle_2.12-0.13.1
   
   * Spark version : 3.4.0 (local installation of spark through brew)
   
   * Hive version : 3.1.3
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   **Stacktrace**
   
   ```
   # WARNING: Unable to get Instrumentation. Dynamic Attach failed. You may add this JAR as -javaagent manually, or supply -Djdk.attach.allowAttachSelf
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] pthalasta commented on issue #8901: [SUPPORT] Spark job never terminates

Posted by "pthalasta (via GitHub)" <gi...@apache.org>.
pthalasta commented on issue #8901:
URL: https://github.com/apache/hudi/issues/8901#issuecomment-1583381245

   I was able to add an env variable as mentioned in the warning message; however, the job still never terminates. These are the last few lines of the logs that I see:
   
   ```
   23/06/08 13:54:10 INFO ClusteringUtils: Found 0 files in pending clustering operations
   23/06/08 13:54:10 INFO AbstractTableFileSystemView: Building file system view for partition (files)
   23/06/08 13:54:11 INFO AbstractTableFileSystemView: addFilesToView: NumFiles=1, NumFileGroups=1, FileGroupsCreationTime=0, StoreTimeTaken=0
   ```
   
   Can someone help me with this?




[GitHub] [hudi] ad1happy2go commented on issue #8901: [SUPPORT] Spark job never terminates

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #8901:
URL: https://github.com/apache/hudi/issues/8901#issuecomment-1587006058

   It should be out soon; it is being actively worked on. Thanks. Closing this issue. Feel free to reopen it if you have any concerns.




[GitHub] [hudi] ad1happy2go commented on issue #8901: [SUPPORT] Spark job never terminates

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #8901:
URL: https://github.com/apache/hudi/issues/8901#issuecomment-1584077632

   @pthalasta We are still working on Spark 3.4 support. Hudi 0.13.1 doesn't support Spark 3.4 at the moment.
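
The mismatch in this thread (a `hudi-spark3.3-bundle` run against Spark 3.4.0) can be caught before submitting a job. The sketch below compares the Spark version encoded in the bundle's artifact name with the Spark version in use. The helper names are hypothetical, and a matching name is only a naming check, not a guarantee that a given Hudi release supports that Spark version. The `3.3.2` value is illustrative; in practice the version string could be read from the running Spark installation.

```python
import re

# Matches Hudi Spark bundle artifact names such as "hudi-spark3.3-bundle_2.12".
_BUNDLE_RE = re.compile(r"hudi-spark(\d+\.\d+)-bundle_(\d+\.\d+)")


def bundle_spark_version(coordinate):
    """Return the Spark major.minor encoded in a bundle coordinate, or None."""
    match = _BUNDLE_RE.search(coordinate)
    return match.group(1) if match else None


def bundle_matches_spark(coordinate, spark_version):
    """True if the bundle's Spark major.minor equals that of spark_version."""
    expected = bundle_spark_version(coordinate)
    actual = ".".join(spark_version.split(".")[:2])
    return expected is not None and expected == actual


coordinate = "org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.1"
print(bundle_matches_spark(coordinate, "3.4.0"))  # False: the setup reported in this issue
print(bundle_matches_spark(coordinate, "3.3.2"))  # True: an illustrative Spark 3.3 version
```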




[GitHub] [hudi] pthalasta commented on issue #8901: [SUPPORT] Spark job never terminates

Posted by "pthalasta (via GitHub)" <gi...@apache.org>.
pthalasta commented on issue #8901:
URL: https://github.com/apache/hudi/issues/8901#issuecomment-1585043901

   @ad1happy2go thanks for the update! I changed the Spark version to 3.3 and it seems to be working as expected. Is there an ETA for when Spark 3.4 will be supported?




[GitHub] [hudi] xushiyan closed issue #8901: [SUPPORT] Spark job never terminates

Posted by "xushiyan (via GitHub)" <gi...@apache.org>.
xushiyan closed issue #8901: [SUPPORT] Spark job never terminates
URL: https://github.com/apache/hudi/issues/8901

