Posted to commits@beam.apache.org by "Ryan Williams (JIRA)" <ji...@apache.org> on 2018/07/16 19:20:00 UTC

[jira] [Comment Edited] (BEAM-4742) Allow custom docker-image in portable wordcount example

    [ https://issues.apache.org/jira/browse/BEAM-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545622#comment-16545622 ] 

Ryan Williams edited comment on BEAM-4742 at 7/16/18 7:19 PM:
--------------------------------------------------------------

Minor follow-up here: I've seen the second issue ({{IOError}} when opening a path for writing in a temp dir) many more times while testing portable wordcount; [here is a relevant stack-trace in the full job-server output|https://gist.github.com/ryan-williams/dd240df5c8901c1f0d00e38b0e436c6c#file-gistfile1-txt-L1781-L1801].

Strangely, it usually happens once after I spin up a fresh job-server: my first attempt to run portable wordcount fails, but an immediate second run succeeds. However, I've also seen the first 2 or 3 runs fail, so I don't have an explanation for it. It seems to suggest a race somewhere, which is a little unsettling, but I don't have enough info to say more.

The exception itself comes when {{iobase._WriteBundleDoFn}} calls into {{filebasedsink.open_writer}}, bypassing the {{initialize_write}} call to {{mkdirs}}.

I'm mostly thinking this means we should go through with [#5903|https://github.com/apache/beam/pull/5903] and make {{localfilesystem}} always create intermediate directories on write-paths, which is covered by [BEAM-4747|https://issues.apache.org/jira/browse/BEAM-4747], so I'll continue the discussion there.
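
For illustration only, here is a minimal sketch of the "create intermediate directories on write" behavior described above. It is not the code from #5903 and does not use Beam's {{localfilesystem}} API; {{open_for_write}} is a hypothetical helper name, and the sketch assumes a plain local path.

```python
import errno
import os
import tempfile


def open_for_write(path):
    """Open `path` for binary writing, creating missing parent dirs first.

    Hypothetical helper for illustration; not part of Apache Beam.
    """
    parent = os.path.dirname(path)
    if parent:
        try:
            os.makedirs(parent)
        except OSError as e:
            # Tolerate the directory already existing, e.g. when another
            # writer created it between our check and our mkdir.
            if e.errno != errno.EEXIST or not os.path.isdir(parent):
                raise
    return open(path, 'wb')


# A path whose parent directories do not exist yet, similar to the
# temp-dir output path that triggered the IOError.
base = tempfile.mkdtemp()
target = os.path.join(base, 'beam-temp', 'shard-0', 'out.txt')

try:
    open(target, 'wb').close()
    plain_open_failed = False
except IOError:
    # Errno 2: No such file or directory
    plain_open_failed = True

with open_for_write(target) as f:
    f.write(b'hello\n')
```

Swallowing {{EEXIST}} means two writers racing to create the same parent directory would both succeed, which seems relevant given the intermittent failures described above, though that is a guess about the cause rather than a diagnosis.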


was (Author: rdub):
Minor follow-up here: I've seen the second issue ({{IOError}} when opening a path for writing in a temp dir) many more times while testing portable wordcount; [here is a relevant stack-trace in the full job-server output|https://gist.github.com/ryan-williams/dd240df5c8901c1f0d00e38b0e436c6c#file-gistfile1-txt-L1781-L1801].

Strangely, it usually happens once after I spin up a fresh job-server: my first attempt to run portable wordcount fails, but an immediate second run succeeds. However, I've also seen the first 2 or 3 runs fail, so I don't have an explanation for it. It seems to suggest a race somewhere, which is a little unsettling, but I don't have enough info to say more.

The exception itself comes when {{iobase._WriteBundleDoFn}} calls into {{filebasedsink.open_writer}}, bypassing the {{initialize_write}} call to {{mkdirs}}.

I'm mostly thinking this means we should go through with [#5903|https://github.com/apache/beam/pull/5903] and make {{localfilesystem}} always create intermediate directories on write-paths.

I'll probably file a fresh JIRA for that, as well.

> Allow custom docker-image in portable wordcount example
> -------------------------------------------------------
>
>                 Key: BEAM-4742
>                 URL: https://issues.apache.org/jira/browse/BEAM-4742
>             Project: Beam
>          Issue Type: Improvement
>          Components: examples-python
>    Affects Versions: 2.5.0
>            Reporter: Ryan Williams
>            Assignee: Ryan Williams
>            Priority: Minor
>             Fix For: 2.5.0
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> I hit a couple snags [running the portable wordcount example|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/build.gradle#L200-L214]:
>  * -[the default docker image is hard-coded to a bintray URL|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/apache_beam/runners/portability/portable_runner.py#L60-L68], but I published my image to Docker Hub- I missed that [there's already a pipeline option for this|https://github.com/apache/beam/pull/5902#discussion_r201071859]! Thanks [~lcwik]
>  * the default output path is in a temporary directory that doesn't exist at the time of the {{open}} call, so I got {{IOError: [Errno 2] No such file or directory}} 
> I'll send a PR with fixes to each of these shortly.
> I've also not found where to observe output from successfully running the example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)