Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 19:17:55 UTC

[GitHub] [beam] kennknowles opened a new issue, #18553: can't read/write hdfs in Flink CLUSTER(Standalone)

URL: https://github.com/apache/beam/issues/18553

   I wrote a simple demo like this:
   
   ```
   Configuration conf = new Configuration();
   conf.set("fs.default.name", "hdfs://localhost:9000");
   // other code
   p.apply("ReadLines", TextIO.read().from("hdfs://localhost:9000/tmp/words"))
    .apply(TextIO.write().to("hdfs://localhost:9000/tmp/hdfsout"));
   ```
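
   A possible cause, stated here as an assumption rather than a confirmed diagnosis: the `Configuration` above is created but never handed to Beam, so the HDFS filesystem may not be registered when the job runs on the cluster. A minimal sketch of wiring it through Beam's `HadoopFileSystemOptions` (from the `beam-sdks-java-io-hadoop-file-system` module; the class name `FlinkWithHDFS` and all paths are taken from the report):

   ```
   import java.util.Collections;
   import org.apache.beam.sdk.Pipeline;
   import org.apache.beam.sdk.io.TextIO;
   import org.apache.beam.sdk.io.hdfs.HadoopFileSystemOptions;
   import org.apache.beam.sdk.options.PipelineOptionsFactory;
   import org.apache.hadoop.conf.Configuration;

   public class FlinkWithHDFS {
     public static void main(String[] args) {
       Configuration conf = new Configuration();
       conf.set("fs.default.name", "hdfs://localhost:9000");

       // Pass the Hadoop configuration into the pipeline options so that Beam
       // registers HadoopFileSystem for hdfs:// paths, instead of only creating
       // the Configuration object locally.
       HadoopFileSystemOptions options = PipelineOptionsFactory.fromArgs(args)
           .withValidation()
           .as(HadoopFileSystemOptions.class);
       options.setHdfsConfiguration(Collections.singletonList(conf));

       Pipeline p = Pipeline.create(options);
       p.apply("ReadLines", TextIO.read().from("hdfs://localhost:9000/tmp/words"))
        .apply(TextIO.write().to("hdfs://localhost:9000/tmp/hdfsout"));
       p.run().waitUntilFinish();
     }
   }
   ```

   This is a sketch only; it assumes the Hadoop filesystem module is on the classpath and in the staged shaded jar.
   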
   
   
   It works in Flink local mode with this command:
   
   ```
   mvn exec:java -Dexec.mainClass=com.joe.FlinkWithHDFS -Pflink-runner \
       -Dexec.args="--runner=FlinkRunner \
         --filesToStage=target/flinkBeam-2.2.0-SNAPSHOT-shaded.jar"
   ```
   
   
   but it does not work in cluster mode:
   
   ```
   mvn exec:java -Dexec.mainClass=com.joe.FlinkWithHDFS -Pflink-runner \
       -Dexec.args="--runner=FlinkRunner \
         --filesToStage=target/flinkBeam-2.2.0-SNAPSHOT-shaded.jar \
         --flinkMaster=localhost:6123"
   ```
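
   If the problem is indeed an unregistered Hadoop configuration (an assumption, not confirmed in this thread), `HadoopFileSystemOptions` also accepts the configuration as a JSON-valued pipeline option, which would let the cluster run carry it without code changes. A hedged sketch, reusing the main class and paths from the report:

   ```
   # Sketch: pass the HDFS configuration as the --hdfsConfiguration pipeline
   # option (a JSON list of property maps) alongside the existing flags.
   mvn exec:java -Dexec.mainClass=com.joe.FlinkWithHDFS -Pflink-runner \
       -Dexec.args="--runner=FlinkRunner \
         --filesToStage=target/flinkBeam-2.2.0-SNAPSHOT-shaded.jar \
         --flinkMaster=localhost:6123 \
         --hdfsConfiguration='[{\"fs.default.name\": \"hdfs://localhost:9000\"}]'"
   ```
   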
   
   
   It seems the Flink cluster treats the hdfs:// path as a local file system path.
   The read-side log from flink-jobmanager.log is:
   
   ```
   2017-09-27 20:17:37,962 INFO  org.apache.flink.runtime.jobmanager.JobManager - Successfully ran initialization on master in 136 ms.
   2017-09-27 20:17:37,968 INFO  org.apache.beam.sdk.io.FileBasedSource         - Filepattern hdfs://localhost:9000/tmp/words2 matched 0 files with total size 0
   2017-09-27 20:17:37,968 INFO  org.apache.beam.sdk.io.FileBasedSource         - Splitting filepattern hdfs://localhost:9000/tmp/words2 into bundles of size 0 took 0 ms and produced 0 files and 0 bundles
   ```
   
   
   The error message on the write side is:
   
   ```
   Caused by: java.lang.ClassCastException: org.apache.beam.sdk.io.hdfs.HadoopResourceId cannot be cast to org.apache.beam.sdk.io.LocalResourceId
           at org.apache.beam.sdk.io.LocalFileSystem.create(LocalFileSystem.java:77)
           at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:256)
           at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:243)
           at org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:922)
           at org.apache.beam.sdk.io.FileBasedSink$Writer.openUnwindowed(FileBasedSink.java:884)
           at org.apache.beam.sdk.io.WriteFiles.finalizeForDestinationFillEmptyShards(WriteFiles.java:909)
           at org.apache.beam.sdk.io.WriteFiles.access$900(WriteFiles.java:110)
           at org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:858)
   ```
   
   
   Can somebody help me? I've tried everything I can think of and can't work it out. [cry]
   Related issue: https://issues.apache.org/jira/browse/BEAM-2457
   
   
   
   
   Imported from Jira [BEAM-2995](https://issues.apache.org/jira/browse/BEAM-2995). Original Jira may contain additional context.
   Reported by: huangjianhuang.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org