Posted to commits@beam.apache.org by "Juan Carlos Garcia M (JIRA)" <ji...@apache.org> on 2018/05/22 14:02:00 UTC

[jira] [Commented] (BEAM-2277) IllegalArgumentException when using Hadoop file system for WordCount example.

    [ https://issues.apache.org/jira/browse/BEAM-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483997#comment-16483997 ] 

Juan Carlos Garcia M commented on BEAM-2277:
--------------------------------------------

It looks like this problem either was not fully fixed or has regressed in
{noformat}
<beam.version>2.4.0</beam.version>{noformat}
Testing the *WindowedWordCount* pipeline from the word-count-beam example project on the Spark runner failed with:
{noformat}
18/05/22 15:53:49 ERROR executor.Executor: Exception in task 3.2 in stage 4.0 (TID 21)
org.apache.beam.sdk.util.UserCodeException: java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds have the same scheme, but received hdfs, example-counts-windo-14.
at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:36)
at org.apache.beam.sdk.io.WriteFiles$FinalizeTempFileBundles$FinalizeFn$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:141)
at org.apache.beam.runners.spark.translation.DoFnRunnerWithMetrics.processElement(DoFnRunnerWithMetrics.java:64)
at org.apache.beam.runners.spark.translation.SparkProcessContext$ProcCtxtIterator.computeNext(SparkProcessContext.java:145)
at org.apache.beam.runners.spark.repackaged.com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:145)
at org.apache.beam.runners.spark.repackaged.com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:140)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:461)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:921)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:921)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2067)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2067)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds have the same scheme, but received hdfs, example-counts-windo-14.
at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
at org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:436)
at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:281)
at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.moveToOutputFiles(FileBasedSink.java:755)
at org.apache.beam.sdk.io.WriteFiles$FinalizeTempFileBundles$FinalizeFn.process(WriteFiles.java:798){noformat}
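
For anyone digging into this: per the trace above, the check that fires is {{FileSystems.validateSrcDestLists}}, which requires every source and destination {{ResourceId}} in a single {{FileSystems.copy}} call to share one scheme. Below is a minimal sketch of that mismatch outside a pipeline (the class name, paths, and namenode host are made up for illustration; it assumes beam-sdks-java-io-hadoop-file-system is on the classpath and HADOOP_CONF_DIR points at a working cluster config so the hdfs scheme resolves):
{code:java}
import java.util.Collections;

import org.apache.beam.sdk.io.FileSystems;
import org.apache.beam.sdk.io.fs.ResourceId;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class SchemeMismatchSketch {
  public static void main(String[] args) throws Exception {
    // Register every FileSystem implementation on the classpath
    // (including HadoopFileSystem when the hadoop-file-system module is present).
    FileSystems.setDefaultPipelineOptions(PipelineOptionsFactory.create());

    // Source: a temp shard on HDFS, so its ResourceId scheme is "hdfs".
    ResourceId src = FileSystems.matchNewResource(
        "hdfs://namenode/user/myuser/.temp-beam/shard-00000", false /* isDirectory */);

    // Destination: an output spec with no scheme at all, which FileSystems
    // resolves against the default "file" scheme rather than HDFS.
    ResourceId dst = FileSystems.matchNewResource(
        "example-counts-00000-of-00005", false /* isDirectory */);

    // Throws IllegalArgumentException ("Expect srcResourceIds and
    // destResourceIds have the same scheme, ...") from validateSrcDestLists,
    // before any actual I/O happens.
    FileSystems.copy(Collections.singletonList(src), Collections.singletonList(dst));
  }
}
{code}
In other words, whatever builds the destination ResourceIds during finalization is not resolving to {{hdfs}} here. The second "scheme" in the message above ({{example-counts-windo-14}}) looks like a fragment of the output file name, which suggests the output spec itself is being parsed for a scheme. As a workaround it may be worth passing a fully qualified output, e.g. {{--output=hdfs:///user/myuser/wc/wc}} as in the original report, so the temp files and the final files stay on the same FileSystem (untested here, just inferring from the precondition).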
 

> IllegalArgumentException when using Hadoop file system for WordCount example.
> -----------------------------------------------------------------------------
>
>                 Key: BEAM-2277
>                 URL: https://issues.apache.org/jira/browse/BEAM-2277
>             Project: Beam
>          Issue Type: Bug
>          Components: z-do-not-use-sdk-java-extensions
>            Reporter: Aviem Zur
>            Assignee: Aviem Zur
>            Priority: Blocker
>             Fix For: 2.0.0
>
>
> IllegalArgumentException when using Hadoop file system for WordCount example.
> Occurred when running WordCount example using Spark runner on a YARN cluster.
> Command-line arguments:
> {code:none}
> --runner=SparkRunner --inputFile=hdfs:///user/myuser/kinglear.txt --output=hdfs:///user/myuser/wc/wc
> {code}
> Stack trace:
> {code:none}
> java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds have the same scheme, but received file, hdfs.
> 	at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
> 	at org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:394)
> 	at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:236)
> 	at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:626)
> 	at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:516)
> 	at org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592)
> {code}


