Posted to commits@beam.apache.org by "Pei He (JIRA)" <ji...@apache.org> on 2016/07/07 22:23:11 UTC

[jira] [Commented] (BEAM-430) Introducing gcpTempLocation that default to tempLocation

    [ https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366882#comment-15366882 ] 

Pei He commented on BEAM-430:
-----------------------------

Because stagingLocation will default to gcpTempLocation, and gcpTempLocation will in turn default to tempLocation, DataflowRunner can no longer use stagingLocation as the default value for tempLocation.
We will break the dependency cycle between stagingLocation and tempLocation that currently exists in DataflowRunner.
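
To illustrate why the old DataflowRunner default would conflict with the new chain, here is a minimal sketch that models each "defaults to" relationship as an edge and checks whether following the edges ever revisits an option. The class DefaultChainCheck and the method hasCycle are hypothetical names made up for this illustration; they are not part of Beam.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class DefaultChainCheck {
    /**
     * Follows "defaults to" edges starting at {@code start}.
     * Returns true if some option is reached twice, i.e. the defaults form a cycle.
     * (Hypothetical helper for illustration only; not Beam code.)
     */
    public static boolean hasCycle(Map<String, String> defaultsTo, String start) {
        Set<String> seen = new HashSet<>();
        String current = start;
        while (current != null) {
            if (!seen.add(current)) {
                return true; // already visited this option
            }
            current = defaultsTo.get(current); // null when the option has no default
        }
        return false;
    }

    public static void main(String[] args) {
        // Chain described in the comment above.
        Map<String, String> proposed = new LinkedHashMap<>();
        proposed.put("stagingLocation", "gcpTempLocation");
        proposed.put("gcpTempLocation", "tempLocation");
        System.out.println(hasCycle(proposed, "stagingLocation")); // prints "false"

        // Same chain plus a tempLocation -> stagingLocation default in the runner.
        Map<String, String> withRunnerDefault = new LinkedHashMap<>(proposed);
        withRunnerDefault.put("tempLocation", "stagingLocation");
        System.out.println(hasCycle(withRunnerDefault, "stagingLocation")); // prints "true"
    }
}
```

With only the two proposed defaults the walk ends at tempLocation; adding a tempLocation -> stagingLocation default closes the loop, which is the case the comment says must go away.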

> Introducing gcpTempLocation that default to tempLocation
> --------------------------------------------------------
>
>                 Key: BEAM-430
>                 URL: https://issues.apache.org/jira/browse/BEAM-430
>             Project: Beam
>          Issue Type: Improvement
>            Reporter: Pei He
>            Assignee: Pei He
>            Priority: Minor
>
> Currently, DataflowPipelineOptions.stagingLocation defaults to tempLocation, which requires tempLocation to be a GCS path.
> Similarly, BigQueryIO uses tempLocation and also requires it to be on GCS.
> As a result, users cannot set tempLocation to a non-GCS path when using DataflowRunner or BigQueryIO.
> However, tempLocation could be on any file system. For example, WordCount writes its output to tempLocation by default.
> The proposal is to add gcpTempLocation, which defaults to tempLocation if tempLocation is a GCS path.
> stagingLocation and BigQueryIO will use gcpTempLocation by default.
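
The defaulting rules in the proposal above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Beam implementation: the Options holder, isGcsPath, and the two resolve methods are hypothetical names, and a GCS path is assumed to be any string starting with "gs://".

```java
import java.util.Optional;

public class TempLocationDefaults {
    /** Hypothetical holder for the three options; null means "not set by the user". */
    public static final class Options {
        public String tempLocation;
        public String gcpTempLocation;
        public String stagingLocation;
    }

    /** Assumption: a GCS path is identified by the "gs://" scheme prefix. */
    public static boolean isGcsPath(String path) {
        return path != null && path.startsWith("gs://");
    }

    /** gcpTempLocation: an explicit value wins; otherwise tempLocation, but only if it is on GCS. */
    public static Optional<String> resolveGcpTempLocation(Options o) {
        if (o.gcpTempLocation != null) {
            return Optional.of(o.gcpTempLocation);
        }
        if (isGcsPath(o.tempLocation)) {
            return Optional.of(o.tempLocation);
        }
        return Optional.empty(); // no usable default
    }

    /** stagingLocation: an explicit value wins; otherwise fall back to gcpTempLocation. */
    public static Optional<String> resolveStagingLocation(Options o) {
        if (o.stagingLocation != null) {
            return Optional.of(o.stagingLocation);
        }
        return resolveGcpTempLocation(o);
    }

    public static void main(String[] args) {
        Options local = new Options();
        local.tempLocation = "/tmp/wordcount"; // non-GCS path, allowed under the proposal
        System.out.println(resolveStagingLocation(local).isPresent());

        Options gcs = new Options();
        gcs.tempLocation = "gs://my-bucket/tmp"; // hypothetical bucket name
        System.out.println(resolveStagingLocation(gcs).orElse("<unset>"));
    }
}
```

Nothing here ever derives tempLocation from stagingLocation, so resolution always terminates. What should happen when no GCS default is available (an error, or a different fallback) is not specified in the description, so the sketch simply returns an empty Optional.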



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)