You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@beam.apache.org by "Amit Sela (JIRA)" <ji...@apache.org> on 2016/09/22 15:41:20 UTC

[jira] [Closed] (BEAM-645) Running Wordcount in Spark Checks Locally and Outputs in HDFS

     [ https://issues.apache.org/jira/browse/BEAM-645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amit Sela closed BEAM-645.
--------------------------
       Resolution: Duplicate
    Fix Version/s: 0.3.0-incubating

> Running Wordcount in Spark Checks Locally and Outputs in HDFS
> -------------------------------------------------------------
>
>                 Key: BEAM-645
>                 URL: https://issues.apache.org/jira/browse/BEAM-645
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>    Affects Versions: 0.3.0-incubating
>            Reporter: Jesse Anderson
>            Assignee: Amit Sela
>             Fix For: 0.3.0-incubating
>
>
> When running the Wordcount example with the Spark runner, the Spark runner uses the input file in HDFS. When the program performs its startup checks, it looks for the file in the local filesystem.
> To workaround this issue, you have to create a file in the local filesystem and put the actual file in HDFS.
> Here is the stack trace when the file doesn't exist in the local filesystem:
> {quote}Exception in thread "main" java.lang.IllegalStateException: Unable to find any files matching Macbeth.txt
> 	at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:199)
> 	at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:279)
> 	at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:192)
> 	at org.apache.beam.sdk.runners.PipelineRunner.apply(PipelineRunner.java:76)
> 	at org.apache.beam.runners.spark.SparkRunner.apply(SparkRunner.java:128)
> 	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:400)
> 	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:323)
> 	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:58)
> 	at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:173)
> 	at org.apache.beam.examples.WordCount.main(WordCount.java:195)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> 	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> 	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)