Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/01/19 21:23:16 UTC

[GitHub] [hudi] harishraju-govindaraju commented on issue #4641: [SUPPORT] - HudiDeltaStreamer - EMR - SparkSubmit Not working

harishraju-govindaraju commented on issue #4641:
URL: https://github.com/apache/hudi/issues/4641#issuecomment-1016880242


   Folks,
   
   I had to change the jar locations to S3 paths, which got me past the earlier error. However, I am now facing another one. This is my first run of DeltaStreamer, and I assumed that on the first run DeltaStreamer would create the Hudi table. Instead, I get an error saying the Hoodie table is not found. Does that mean I cannot use DeltaStreamer for initial loads?
   
   Exception in thread "main" org.apache.hudi.exception.TableNotFoundException: Hoodie table not found in path s3://ztrusted1/default/hudi-table1/.hoodie
   
   Here is my spark-submit command. Please help.
   
   spark-submit \
   --jars "s3://zcustomjar/spark-avro_2.12-3.1.2.jar,s3://zcustomjar/hudi-spark-bundle_2.11-0.5.3-rc2.jar" \
   --deploy-mode "client" \
   --class "org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer" /usr/lib/hudi/hudi-utilities-bundle.jar \
   --table-type COPY_ON_WRITE \
   --source-ordering-field id \
   --target-base-path s3://ztrusted1/default/hudi-table1 --target-table hudi-table1 \
   --hoodie-conf hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator \
   --hoodie-conf hoodie.datasource.write.recordkey.field=id \
   --hoodie-conf hoodie.deltastreamer.source.dfs.root=s3://zlanding1 \
   --hoodie-conf hoodie.datasource.write.partitionpath.field=compcode \
   --hoodie-conf hoodie.datasource.write.operation=insert
   
   
   

