Posted to commits@beam.apache.org by "Kamil Szewczyk (JIRA)" <ji...@apache.org> on 2017/07/11 11:57:00 UTC

[jira] [Commented] (BEAM-1286) DataflowRunner handling of missing filesToStage

    [ https://issues.apache.org/jira/browse/BEAM-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16082100#comment-16082100 ] 

Kamil Szewczyk commented on BEAM-1286:
--------------------------------------

DataflowRunner allows filesToStage to be missing, but it then automatically detects the classpath resources to stage, which are jar files. The Flink runner behaves the same way. So part of the handling is already done.
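
To make the fallback described above concrete, here is a minimal sketch of classpath-based detection. It is not the DataflowRunner code: the class ClasspathStagingSketch and the method detectFilesToStage are hypothetical names, and reading the "java.class.path" system property is an assumption chosen to keep the example self-contained. The check in main only illustrates the "fail if nothing is found" proposal from the issue description; it is not current runner behaviour.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ClasspathStagingSketch {

    // Hypothetical helper (not a Beam API): split a classpath string and
    // keep the absolute paths of the entries that exist on disk.
    static List<String> detectFilesToStage(String classpath) {
        List<String> files = new ArrayList<>();
        if (classpath == null || classpath.isEmpty()) {
            return files;
        }
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue; // skip empty segments such as "a.jar::b.jar"
            }
            File file = new File(entry);
            if (file.exists()) {
                files.add(file.getAbsolutePath());
            }
        }
        return files;
    }

    public static void main(String[] args) {
        // Assumption: the JVM classpath stands in for an unset filesToStage.
        List<String> files = detectFilesToStage(System.getProperty("java.class.path"));
        if (files.isEmpty()) {
            // The behaviour proposed in the issue: fail instead of logging and moving on.
            throw new IllegalStateException("No files to stage were found on the classpath");
        }
        files.forEach(System.out::println);
    }
}
```

With a sketch like this, an empty result would only occur if no classpath entry exists on disk, which is the case the issue asks about.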

I have set up the Dataflow runner. By default, when the “--gcpTempLocation” option is set, a new staging folder is created and the jar files are uploaded there. When none of “--tempLocation”, “--gcpTempLocation”, or “--stagingLocation” is set, a new Dataflow staging bucket is created and the files are uploaded there.

Those files are available to the workers, but not necessarily used by them. Am I right?
Why should we fail when nothing to be staged is found?

> DataflowRunner handling of missing filesToStage
> -----------------------------------------------
>
>                 Key: BEAM-1286
>                 URL: https://issues.apache.org/jira/browse/BEAM-1286
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Daniel Halperin
>              Labels: newbie, starter
>
> DataflowRunner allows filesToStage to be missing -- it logs an error and moves on. Is this the right behavior? It can complicate user experience.
> At least, I guess that if nothing to be staged is found, we should fail.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)