You are viewing a plain text version of this content.
Posted to user@beam.apache.org by Claire Yuan <cl...@yahoo-inc.com> on 2017/06/16 18:45:08 UTC

Cannot find output with Apex runner

Hi all,
  I was following the instructions here Apache Apex Runner to submit the work to the cluster. The build seems to be successful. However, I could not find where the output is. I set the param in my maven command with:
--output=/user/claire/output/
and I checked with hadoop dfs -ls /home/claire/output/ but it seems no such directory was created.
I also checked my local directory with
--output=/home/claire/output/, and still no output there.
Finally I set the output directory manually with:
--output=hdfs:///user/claireyuan/output
and it gave an exception: Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.5.0:java (default-cli) on project beam-examples-java: An exception occured while executing the Java class. null: InvocationTargetException: Unable to find registrar for hdfs -> [Help 1]
  

I am wondering where I should look for the output, or how I should set the output directory?
Claire
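A note on the paths above: Beam routes a path to a filesystem by its URI scheme, so a scheme-less path such as /user/claire/output/ is treated as a path on the default (local) filesystem of whichever machine the code runs on, while hdfs:///user/claireyuan/output is routed to HDFS. A JDK-only sketch of that scheme parsing, using the two paths from the message above:

```java
import java.net.URI;

public class SchemeCheck {
    public static void main(String[] args) {
        // Paths taken from the report above; a null scheme means the path
        // falls through to the default (local) filesystem.
        String[] outputs = {"/user/claire/output/", "hdfs:///user/claireyuan/output"};
        for (String o : outputs) {
            String scheme = URI.create(o).getScheme();
            System.out.println(o + " -> " + (scheme == null ? "no scheme (local/default)" : scheme));
        }
    }
}
```

So the first two attempts would have written (if anything) to local disk on the worker machines, which is consistent with nothing appearing in HDFS.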

Re: Cannot find output with Apex runner

Posted by Flavio Fiszman <fl...@google.com>.
Hi Claire, did you ever discover what the issue was?
If you did, would you mind sharing it so we have some knowledge of common
issues and how to fix them?
If not, here are a couple of related questions I found that may be of help:
https://stackoverflow.com/questions/17265002/hadoop-no-filesystem-for-scheme-file
https://stackoverflow.com/questions/26958865/no-filesystem-for-scheme-hdfs/28135140#28135140
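Both of those threads boil down to the Hadoop classes behind the hdfs scheme not being visible at runtime (hadoop-hdfs provides DistributedFileSystem, which backs "No FileSystem for scheme: hdfs" when missing). As a quick, hedged sanity check, you can probe whether those classes are on the classpath of the failing process; this sketch uses only the JDK:

```java
public class ClasspathProbe {
    public static void main(String[] args) {
        // Standard Hadoop class names: DistributedFileSystem backs the "hdfs"
        // scheme and ships in hadoop-hdfs; if it can't be loaded, Hadoop
        // reports "No FileSystem for scheme: hdfs".
        String[] classes = {
            "org.apache.hadoop.hdfs.DistributedFileSystem",
            "org.apache.hadoop.fs.LocalFileSystem"
        };
        for (String c : classes) {
            try {
                Class.forName(c);
                System.out.println(c + ": on classpath");
            } catch (ClassNotFoundException e) {
                System.out.println(c + ": NOT on classpath");
            }
        }
    }
}
```

Running this with the same classpath as the Beam job (e.g. via exec-maven-plugin) should show quickly whether the problem is a missing runtime dependency rather than a configuration issue.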
Thanks!

On Fri, Jun 16, 2017 at 5:21 PM, Stephen Sisk <si...@google.com> wrote:

> That failure "Failed to construct Hadoop filesystem with configuration
> Configuration: /home/conf/hadoop/core-site.xml,
> /home/conf/hadoop/hdfs-site.xml" seems pretty interesting.
>
> I don't know much about hadoop/hdfs - is there config info you should be
> putting there? Perhaps you have the *-site.xml files in a different
> location? I believe I saw some discussion elsewhere about using a standard
> hadoop environmental variable to set the location to read from (earlier on
> user@)
>
> S
>
> On Fri, Jun 16, 2017 at 3:31 PM Claire Yuan <cl...@yahoo-inc.com>
> wrote:
>
>> Hi,
>>    Here is what I got after I explicitly include the dependency for
>>
>>        <dependency>
>>           <groupId>org.apache.beam</groupId>
>>           <artifactId>beam-sdks-java-io-hadoop-file-system</artifactId>
>>           <scope>runtime</scope>
>>         </dependency>
>> in my pom.xml:
>> Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.5.0:java
>> (default-cli) on project beam-examples-java: An exception occured while
>> executing the Java class. null: InvocationTargetException: Failed to
>> construct Hadoop filesystem with configuration Configuration:
>> /home/conf/hadoop/core-site.xml, /home/conf/hadoop/hdfs-site.xml: No
>> FileSystem for scheme: hdfs ->
>> I am also wondering if I add it correctly?
>>
>> Claire
>>
>>
>> On Friday, June 16, 2017 2:32 PM, Stephen Sisk <si...@google.com> wrote:
>>
>>
>> We've seen a couple reports involving the "Unable to find registrar for
>> hdfs"
>>
>> The other potential cause is misconfiguration of HDFS/beam can't find the
>> HDFS config.
>>
>> I filed https://issues.apache.org/jira/browse/BEAM-2457 - we don't
>> believe this is a bug in beam, but a number of users seem to be running
>> into the issues so there might be an undiagnosed issue or a common
>> misconfiguration problem.
>>
>> Claire - if you figure out the root cause, it'd be helpful if you let us
>> know what solved the issue so we can improve the error message you saw.
>> (and if you can't figure it out, hopefully folks on this list will help you
>> figure it out)
>>
>> S
>>
>> On Fri, Jun 16, 2017 at 1:58 PM Kenneth Knowles <kl...@google.com> wrote:
>>
>> Hi Claire,
>>
>> The 'hdfs' filesystem is registered when you include the artifact
>> "org.apache.beam:beam-sdks-java-io-hadoop-file-system". Do you have this
>> in your dependencies?
>>
>> Kenn
>>
>> On Fri, Jun 16, 2017 at 11:45 AM, Claire Yuan <cl...@yahoo-inc.com>
>> wrote:
>>
>> Hi all,
>>   I was following the instruction here Apache Apex Runner
>> <https://beam.apache.org/documentation/runners/apex/> to submit the work
>> into the cluster. The building seems to be successful. However, the thing
>> is that I could not find where the output is. I set my param in my maven
>> command with:
>> --output=/user/claire/output/
>> and I checked with hadoop dfs -ls /home/claire/output/ but seems no such
>> directory created.
>> I also checked my local directory with
>> --output=/home/claire/output/, and still no output there
>> Finally I set the output directory manually with:
>> --output=hdfs:///user/claireyuan/output
>> it gave exception as: Failed to execute goal
>> org.codehaus.mojo:exec-maven-plugin:1.5.0:java (default-cli) on project
>> beam-examples-java: An exception occured while executing the Java class.
>> null: InvocationTargetException: Unable to find registrar for hdfs -> [Help
>> 1]
>>
>> Apache Apex Runner
>> Apache Beam is an open source, unified model and set of language-specific
>> SDKs for defining and executing data p...
>> <https://beam.apache.org/documentation/runners/apex/>
>> I am wondering where I should check or modify my output directory to be?
>>
>> Claire
>>
>>
>>
>>
>>

Re: Cannot find output with Apex runner

Posted by Stephen Sisk <si...@google.com>.
That failure "Failed to construct Hadoop filesystem with configuration
Configuration: /home/conf/hadoop/core-site.xml,
/home/conf/hadoop/hdfs-site.xml" seems pretty interesting.

I don't know much about hadoop/hdfs - is there config info you should be
putting there? Perhaps you have the *-site.xml files in a different
location? I believe I saw some discussion elsewhere about using a standard
hadoop environment variable to set the location to read from (earlier on
user@)

S
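If it helps, a first check is simply whether the two files named in that error exist where the Configuration says they are. A minimal JDK-only sketch (the paths are copied from the error message; substitute your cluster's actual conf directory, which HADOOP_CONF_DIR conventionally points to):

```java
import java.nio.file.Files;
import java.nio.file.Paths;

public class ConfCheck {
    public static void main(String[] args) {
        // Paths copied from the error message above; adjust to your installation.
        String[] sites = {
            "/home/conf/hadoop/core-site.xml",
            "/home/conf/hadoop/hdfs-site.xml"
        };
        for (String s : sites) {
            System.out.println(s + " exists: " + Files.exists(Paths.get(s)));
        }
        // HADOOP_CONF_DIR, when set, is the conventional location for these files.
        System.out.println("HADOOP_CONF_DIR=" + System.getenv("HADOOP_CONF_DIR"));
    }
}
```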

On Fri, Jun 16, 2017 at 3:31 PM Claire Yuan <cl...@yahoo-inc.com>
wrote:

> Hi,
>    Here is what I got after I explicitly include the dependency for
>
>        <dependency>
>           <groupId>org.apache.beam</groupId>
>           <artifactId>beam-sdks-java-io-hadoop-file-system</artifactId>
>           <scope>runtime</scope>
>         </dependency>
> in my pom.xml:
> Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.5.0:java
> (default-cli) on project beam-examples-java: An exception occured while
> executing the Java class. null: InvocationTargetException: Failed to
> construct Hadoop filesystem with configuration Configuration:
> /home/conf/hadoop/core-site.xml, /home/conf/hadoop/hdfs-site.xml: No
> FileSystem for scheme: hdfs ->
> I am also wondering if I add it correctly?
>
> Claire
>
>
> On Friday, June 16, 2017 2:32 PM, Stephen Sisk <si...@google.com> wrote:
>
>
> We've seen a couple reports involving the "Unable to find registrar for
> hdfs"
>
> The other potential cause is misconfiguration of HDFS/beam can't find the
> HDFS config.
>
> I filed https://issues.apache.org/jira/browse/BEAM-2457 - we don't
> believe this is a bug in beam, but a number of users seem to be running
> into the issues so there might be an undiagnosed issue or a common
> misconfiguration problem.
>
> Claire - if you figure out the root cause, it'd be helpful if you let us
> know what solved the issue so we can improve the error message you saw.
> (and if you can't figure it out, hopefully folks on this list will help you
> figure it out)
>
> S
>
> On Fri, Jun 16, 2017 at 1:58 PM Kenneth Knowles <kl...@google.com> wrote:
>
> Hi Claire,
>
> The 'hdfs' filesystem is registered when you include the artifact
> "org.apache.beam:beam-sdks-java-io-hadoop-file-system". Do you have this in
> your dependencies?
>
> Kenn
>
> On Fri, Jun 16, 2017 at 11:45 AM, Claire Yuan <cl...@yahoo-inc.com>
> wrote:
>
> Hi all,
>   I was following the instruction here Apache Apex Runner
> <https://beam.apache.org/documentation/runners/apex/> to submit the work
> into the cluster. The building seems to be successful. However, the thing
> is that I could not find where the output is. I set my param in my maven
> command with:
> --output=/user/claire/output/
> and I checked with hadoop dfs -ls /home/claire/output/ but seems no such
> directory created.
> I also checked my local directory with
> --output=/home/claire/output/, and still no output there
> Finally I set the output directory manually with:
> --output=hdfs:///user/claireyuan/output
> it gave exception as: Failed to execute goal
> org.codehaus.mojo:exec-maven-plugin:1.5.0:java (default-cli) on project
> beam-examples-java: An exception occured while executing the Java class.
> null: InvocationTargetException: Unable to find registrar for hdfs -> [Help
> 1]
>
> Apache Apex Runner
> Apache Beam is an open source, unified model and set of language-specific
> SDKs for defining and executing data p...
> <https://beam.apache.org/documentation/runners/apex/>
> I am wondering where I should check or modify my output directory to be?
>
> Claire
>
>
>
>
>

Re: Cannot find output with Apex runner

Posted by Claire Yuan <cl...@yahoo-inc.com>.
Hi,
   Here is what I got after I explicitly included the dependency

       <dependency>
          <groupId>org.apache.beam</groupId>
          <artifactId>beam-sdks-java-io-hadoop-file-system</artifactId>
          <scope>runtime</scope>
        </dependency>

in my pom.xml:

Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.5.0:java (default-cli) on project beam-examples-java: An exception occured while executing the Java class. null: InvocationTargetException: Failed to construct Hadoop filesystem with configuration Configuration: /home/conf/hadoop/core-site.xml, /home/conf/hadoop/hdfs-site.xml: No FileSystem for scheme: hdfs ->

I am also wondering if I added it correctly?
Claire 

    On Friday, June 16, 2017 2:32 PM, Stephen Sisk <si...@google.com> wrote:
 

 We've seen a couple reports involving the "Unable to find registrar for hdfs"
The other potential cause is misconfiguration of HDFS/beam can't find the HDFS config.

I filed https://issues.apache.org/jira/browse/BEAM-2457 - we don't believe this is a bug in beam, but a number of users seem to be running into the issues so there might be an undiagnosed issue or a common misconfiguration problem. 
Claire - if you figure out the root cause, it'd be helpful if you let us know what solved the issue so we can improve the error message you saw. (and if you can't figure it out, hopefully folks on this list will help you figure it out)
S
On Fri, Jun 16, 2017 at 1:58 PM Kenneth Knowles <kl...@google.com> wrote:

Hi Claire,
The 'hdfs' filesystem is registered when you include the artifact "org.apache.beam:beam-sdks-java-io-hadoop-file-system". Do you have this in your dependencies?
Kenn
On Fri, Jun 16, 2017 at 11:45 AM, Claire Yuan <cl...@yahoo-inc.com> wrote:

Hi all,
  I was following the instruction here Apache Apex Runner to submit the work into the cluster. The building seems to be successful. However, the thing is that I could not find where the output is. I set my param in my maven command with:
--output=/user/claire/output/
and I checked with hadoop dfs -ls /home/claire/output/ but seems no such directory created.
I also checked my local directory with
--output=/home/claire/output/, and still no output there
Finally I set the output directory manually with:
--output=hdfs:///user/claireyuan/output
it gave exception as: Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.5.0:java (default-cli) on project beam-examples-java: An exception occured while executing the Java class. null: InvocationTargetException: Unable to find registrar for hdfs -> [Help 1]
  

 I am wondering where I should check or modify my output directory to be?
Claire




   

Re: Cannot find output with Apex runner

Posted by Stephen Sisk <si...@google.com>.
We've seen a couple of reports involving the "Unable to find registrar for
hdfs" error.

The other potential cause is a misconfiguration of HDFS such that Beam can't
find the HDFS config.

I filed https://issues.apache.org/jira/browse/BEAM-2457 - we don't believe
this is a bug in beam, but a number of users seem to be running into the
issues so there might be an undiagnosed issue or a common misconfiguration
problem.

Claire - if you figure out the root cause, it'd be helpful if you let us
know what solved the issue so we can improve the error message you saw.
(and if you can't figure it out, hopefully folks on this list will help you
figure it out)

S

On Fri, Jun 16, 2017 at 1:58 PM Kenneth Knowles <kl...@google.com> wrote:

> Hi Claire,
>
> The 'hdfs' filesystem is registered when you include the artifact
> "org.apache.beam:beam-sdks-java-io-hadoop-file-system". Do you have this in
> your dependencies?
>
> Kenn
>
> On Fri, Jun 16, 2017 at 11:45 AM, Claire Yuan <cl...@yahoo-inc.com>
> wrote:
>
>> Hi all,
>>   I was following the instruction here Apache Apex Runner
>> <https://beam.apache.org/documentation/runners/apex/> to submit the work
>> into the cluster. The building seems to be successful. However, the thing
>> is that I could not find where the output is. I set my param in my maven
>> command with:
>> --output=/user/claire/output/
>> and I checked with hadoop dfs -ls /home/claire/output/ but seems no such
>> directory created.
>> I also checked my local directory with
>> --output=/home/claire/output/, and still no output there
>> Finally I set the output directory manually with:
>> --output=hdfs:///user/claireyuan/output
>> it gave exception as: Failed to execute goal
>> org.codehaus.mojo:exec-maven-plugin:1.5.0:java (default-cli) on project
>> beam-examples-java: An exception occured while executing the Java class.
>> null: InvocationTargetException: Unable to find registrar for hdfs -> [Help
>> 1]
>>
>> Apache Apex Runner
>> Apache Beam is an open source, unified model and set of language-specific
>> SDKs for defining and executing data p...
>> <https://beam.apache.org/documentation/runners/apex/>
>> I am wondering where I should check or modify my output directory to be?
>>
>> Claire
>>
>
>

Re: Cannot find output with Apex runner

Posted by Kenneth Knowles <kl...@google.com>.
Hi Claire,

The 'hdfs' filesystem is registered when you include the artifact
"org.apache.beam:beam-sdks-java-io-hadoop-file-system". Do you have this in
your dependencies?
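For background on why that artifact matters: Beam discovers filesystems at runtime through Java's ServiceLoader, scanning META-INF/services entries contributed by artifacts on the classpath; the hadoop-file-system artifact contributes the registrar for hdfs. A stand-in sketch of that lookup (FileSystemRegistrar here is a local placeholder interface, not Beam's actual class), showing the empty-result case behind "Unable to find registrar":

```java
import java.util.ServiceLoader;

public class RegistrarLookup {
    // Placeholder for Beam's registrar SPI; no provider is registered for it,
    // mirroring a classpath that lacks beam-sdks-java-io-hadoop-file-system.
    interface FileSystemRegistrar {}

    public static void main(String[] args) {
        ServiceLoader<FileSystemRegistrar> loader =
            ServiceLoader.load(FileSystemRegistrar.class);
        boolean found = loader.iterator().hasNext();
        System.out.println(found
            ? "registrar found"
            : "Unable to find registrar for hdfs");
    }
}
```

Adding the dependency puts a META-INF/services entry on the classpath so the ServiceLoader scan finds a provider for the hdfs scheme.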

Kenn

On Fri, Jun 16, 2017 at 11:45 AM, Claire Yuan <cl...@yahoo-inc.com>
wrote:

> Hi all,
>   I was following the instruction here Apache Apex Runner
> <https://beam.apache.org/documentation/runners/apex/> to submit the work
> into the cluster. The building seems to be successful. However, the thing
> is that I could not find where the output is. I set my param in my maven
> command with:
> --output=/user/claire/output/
> and I checked with hadoop dfs -ls /home/claire/output/ but seems no such
> directory created.
> I also checked my local directory with
> --output=/home/claire/output/, and still no output there
> Finally I set the output directory manually with:
> --output=hdfs:///user/claireyuan/output
> it gave exception as: Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.5.0:java
> (default-cli) on project beam-examples-java: An exception occured while
> executing the Java class. null: InvocationTargetException: Unable to find
> registrar for hdfs -> [Help 1]
>
> Apache Apex Runner
> Apache Beam is an open source, unified model and set of language-specific
> SDKs for defining and executing data p...
> <https://beam.apache.org/documentation/runners/apex/>
> I am wondering where I should check or modify my output directory to be?
>
> Claire
>