Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/11 15:19:56 UTC

[GitHub] [hudi] tooptoop4 opened a new issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

tooptoop4 opened a new issue #1948:
URL: https://github.com/apache/hudi/issues/1948


   /home/ec2-user/spark_home/bin/spark-submit \
     --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
     --jars "/home/ec2-user/spark-avro_2.11-2.4.6.jar" \
     --master spark://redact:7077 \
     --deploy-mode client \
     /home/ec2-user/hudi-utilities-bundle_2.11-0.5.3-1.jar \
     --table-type COPY_ON_WRITE \
     --source-ordering-field dms_timestamp \
     --source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
     --target-base-path s3a://redact/my/finaltbl \
     --target-table mytestdms \
     --transformer-class org.apache.hudi.utilities.transform.AWSDmsTransformer \
     --payload-class org.apache.hudi.payload.AWSDmsAvroPayload \
     --hoodie-conf hoodie.datasource.write.recordkey.field=id \
     --hoodie-conf hoodie.datasource.write.partitionpath.field=id \
     --hoodie-conf hoodie.deltastreamer.source.dfs.root=s3a://redact/my/dms/test
   
   ```
   2020-08-11 15:11:43,418 [main] ERROR org.apache.hudi.common.util.DFSPropertiesConfiguration - Error reading in properies from dfs
   java.io.FileNotFoundException: File file:/home/ec2-user/src/test/resources/delta-streamer-config/dfs-source.properties does not exist
           at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:635)
           at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:861)
           at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:625)
           at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
           at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:146)
           at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:347)
           at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:787)
           at org.apache.hudi.common.util.DFSPropertiesConfiguration.visitFile(DFSPropertiesConfiguration.java:87)
           at org.apache.hudi.common.util.DFSPropertiesConfiguration.<init>(DFSPropertiesConfiguration.java:60)
           at org.apache.hudi.common.util.DFSPropertiesConfiguration.<init>(DFSPropertiesConfiguration.java:64)
           at org.apache.hudi.utilities.UtilHelpers.readConfig(UtilHelpers.java:118)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.<init>(HoodieDeltaStreamer.java:451)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.<init>(HoodieDeltaStreamer.java:97)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.<init>(HoodieDeltaStreamer.java:91)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:380)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
           at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
           at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
           at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
           at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
           at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
           at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
           at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] bhasudha commented on issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

Posted by GitBox <gi...@apache.org>.
bhasudha commented on issue #1948:
URL: https://github.com/apache/hudi/issues/1948#issuecomment-673347421


   You need to point DeltaStreamer's `--props` option at a valid properties file - https://github.com/apache/hudi/blob/379cf0786fe9fea94ec8c0da7d467ae2fb30dd0b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L217/ . Alternatively, pass the properties individually using `--hoodie-conf`.
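
A minimal sketch of the properties-file route, assuming the three `--hoodie-conf` values from the issue description are simply moved into a file. The file path is hypothetical; the keys and values are copied from the command in the description. The last line prints the argument that would be appended to the `spark-submit` command.

```shell
# Hypothetical location for the properties file; any readable path should do.
PROPS_FILE=/tmp/dms-deltastreamer.properties

# Write the same key=value pairs that the issue passes via --hoodie-conf.
printf '%s\n' \
  'hoodie.datasource.write.recordkey.field=id' \
  'hoodie.datasource.write.partitionpath.field=id' \
  'hoodie.deltastreamer.source.dfs.root=s3a://redact/my/dms/test' \
  > "$PROPS_FILE"

# Argument to add to the spark-submit command.
PROPS_ARG="--props file://$PROPS_FILE"
echo "$PROPS_ARG"
```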





[GitHub] [hudi] bhasudha commented on issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

Posted by GitBox <gi...@apache.org>.
bhasudha commented on issue #1948:
URL: https://github.com/apache/hudi/issues/1948#issuecomment-673938434


   I think your configs are missing the schema provider related configs. Sample property file configs are here - https://github.com/apache/hudi/blob/master/hudi-utilities/src/test/resources/delta-streamer-config/dfs-source.properties

   You would need to specify the source table schema and the target table schema. That's my guess. Can you try that and see?





[GitHub] [hudi] tooptoop4 commented on issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

Posted by GitBox <gi...@apache.org>.
tooptoop4 commented on issue #1948:
URL: https://github.com/apache/hudi/issues/1948#issuecomment-673519517


   I use `--hoodie-conf` as shown in the description, but is a property file mandatory?





[GitHub] [hudi] tooptoop4 commented on issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

Posted by GitBox <gi...@apache.org>.
tooptoop4 commented on issue #1948:
URL: https://github.com/apache/hudi/issues/1948#issuecomment-673968115


   Why do I need those schemas? The Spark job succeeds and the data is as expected in the target table; I just wonder why it needs to print java.io.FileNotFoundException.





[GitHub] [hudi] bvaradar commented on issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #1948:
URL: https://github.com/apache/hudi/issues/1948#issuecomment-678230791


   @tooptoop4 : As a workaround, pass an empty but existing file. I have opened a jira (https://issues.apache.org/jira/browse/HUDI-1209) to allow props to be optional.
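
The workaround could look roughly like the sketch below. The file path is hypothetical, and the `file://` prefix is an assumption based on the stack trace in the description, where the default location appears to be resolved as a local `file:` path under the working directory when `--props` is not given.

```shell
# Hypothetical path for the placeholder properties file.
EMPTY_PROPS=/tmp/empty-deltastreamer.properties

# Create the file if it is missing and truncate it so it is empty.
: > "$EMPTY_PROPS"

# Argument to append to the spark-submit command from the issue description;
# the existing --hoodie-conf flags stay as they are.
EMPTY_PROPS_ARG="--props file://$EMPTY_PROPS"
echo "$EMPTY_PROPS_ARG"
```

With the file in place, the properties lookup should find it and the `FileNotFoundException` should no longer be logged, while the actual settings still come from the `--hoodie-conf` flags.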





[GitHub] [hudi] bvaradar closed issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

Posted by GitBox <gi...@apache.org>.
bvaradar closed issue #1948:
URL: https://github.com/apache/hudi/issues/1948


   

