Posted to user@beam.apache.org by Robertson Williams <rw...@gmail.com> on 2016/05/24 10:24:57 UTC

expected a valid 'gs://' path but was given '/tmp/tmpLocation'

I tried the latest version, 0.1.0-SNAPSHOT, cloned from git, but when
testing with MinimalWordCount, it throws

    expected a valid 'gs://' path but was given '/tmp/tmpLocation'

Can I run the MinimalWordCount example locally (by supplying a temp location on
the local file system, e.g. file://), or is it bound to gs only? Judging from
the source[1][2], it seems to read from gs only, but I may be missing
something.

Which part can I change so that MinimalWordCount runs without this error?

Thanks

[1]
https://github.com/apache/incubator-beam/blob/e3105c8e109535f801fd145b91b0c7aa93b86d1a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DataflowPathValidator.java

[2]
https://github.com/apache/incubator-beam/blob/96765f19b1bd8149240cd77eb7cf7fb636e477e4/sdks/java/core/src/main/java/org/apache/beam/sdk/util/gcsfs/GcsPath.java

Re: expected a valid 'gs://' path but was given '/tmp/tmpLocation'

Posted by Punit Naik <na...@gmail.com>.
Below I have attached the MinimalWordCount example, modified to run on the
DirectPipelineRunner.

You can execute it by cd'ing into the beam parent folder and then running the
following command:

mvn compile exec:java -pl examples/java \
  -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
  -Dexec.args="/tmp/in /tmp/out"

Here /tmp/in is the input file and /tmp/out is the output file.

NOTE: For some reason it only accepts inputs from the /tmp folder, and
it only reads files, not folders. I am not sure why it behaves this way.

Re: expected a valid 'gs://' path but was given '/tmp/tmpLocation'

Posted by Robertson Williams <rw...@gmail.com>.
Thanks for the explanation!

On Tue, May 24, 2016 at 7:52 PM, Davor Bonaci <da...@google.com> wrote:

> Yes -- MinimalWordCount example currently defaults to the
> DataflowPipelineRunner, which runs pipelines on the Google Cloud Dataflow
> service. (We'll be changing this.) In general, Cloud-based runners don't
> have access to your local machine, hence the exception you saw.
>
> DirectPipelineRunner can execute pipelines locally, mainly for testing
> purposes.

Re: expected a valid 'gs://' path but was given '/tmp/tmpLocation'

Posted by Davor Bonaci <da...@google.com>.
Yes -- MinimalWordCount example currently defaults to the
DataflowPipelineRunner, which runs pipelines on the Google Cloud Dataflow
service. (We'll be changing this.) In general, Cloud-based runners don't
have access to your local machine, hence the exception you saw.

DirectPipelineRunner can execute pipelines locally, mainly for testing
purposes.
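
To illustrate the kind of check that produces this exception, here is a
minimal sketch in plain Java. It is not the actual DataflowPathValidator
code: the class name GcsPathCheckSketch and its two methods are made up for
illustration, and a real validator may well check more than the prefix.

```java
// Hypothetical illustration only; this is NOT the Beam DataflowPathValidator source.
final class GcsPathCheckSketch {

    // Assumption: a path is treated as a cloud storage path when it starts with "gs://".
    static final String GCS_PREFIX = "gs://";

    // Returns true when the path starts with gs:// and has something after the prefix.
    static boolean isGcsPath(String path) {
        return path != null
            && path.startsWith(GCS_PREFIX)
            && path.length() > GCS_PREFIX.length();
    }

    // Returns the path unchanged if it looks like a gs:// path, otherwise throws.
    static String validateForCloudRunner(String path) {
        if (!isGcsPath(path)) {
            throw new IllegalArgumentException(
                String.format("expected a valid 'gs://' path but was given '%s'", path));
        }
        return path;
    }

    public static void main(String[] args) {
        // A gs:// location passes the check.
        System.out.println(validateForCloudRunner("gs://example-bucket/tmp"));

        // A local path is rejected, much like the error reported in this thread.
        try {
            validateForCloudRunner("/tmp/tmpLocation");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A runner that executes locally has no reason to apply such a check, which is
why a path like /tmp/tmpLocation is only a problem for the cloud-based runner.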

On Tue, May 24, 2016 at 3:48 AM, Robertson Williams <rw...@gmail.com>
wrote:

> Just found out what went wrong. Changing to use
>
>   org.apache.beam.sdk.options.DirectPipelineOptions
>   org.apache.beam.sdk.runners.DirectPipelineRunner
>
> fixes the problem.
>
> Thanks

Re: expected a valid 'gs://' path but was given '/tmp/tmpLocation'

Posted by Robertson Williams <rw...@gmail.com>.
Just found out what went wrong. Changing to use

  org.apache.beam.sdk.options.DirectPipelineOptions
  org.apache.beam.sdk.runners.DirectPipelineRunner

fixes the problem.

Thanks

