Posted to issues@beam.apache.org by "Valentyn Tymofieiev (JIRA)" <ji...@apache.org> on 2018/11/15 01:16:00 UTC

[jira] [Created] (BEAM-6069) Bigquery Tornadoes example fails to run when we pass a custom temp location.

Valentyn Tymofieiev created BEAM-6069:
-----------------------------------------

             Summary: Bigquery Tornadoes example fails to run when we pass a custom temp location.
                 Key: BEAM-6069
                 URL: https://issues.apache.org/jira/browse/BEAM-6069
             Project: Beam
          Issue Type: Bug
          Components: examples-java, io-java-gcp
            Reporter: Valentyn Tymofieiev
            Assignee: Reuven Lax


Steps to reproduce:
{noformat}
PROJECT=$(gcloud config get-value project)
BUCKET=${USER}_gcs_bucket
BQ_DATASET=${USER}_bq_dataset
TABLE_NAME=out

bq mk --project=$PROJECT $BQ_DATASET
gsutil mb gs://$BUCKET


PATH_TO_REPO_CLONE=/path/to/beam

mvn archetype:generate -DarchetypeGroupId=org.apache.beam -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples  -DarchetypeVersion=2.8.0  -DgroupId=org.example  -DartifactId=word-count-beam  -Dversion="0.1" -Dpackage=org.apache.beam.examples -DinteractiveMode=false

cd word-count-beam/

mkdir src/main/java/org/apache/beam/examples/cookbook

cp $PATH_TO_REPO_CLONE/examples/java/src/main/java/org/apache/beam/examples/cookbook/BigQueryTornadoes.java ./src/main/java/org/apache/beam/examples/cookbook

mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.cookbook.BigQueryTornadoes -Dexec.args="--runner=DataflowRunner --project=$PROJECT --input=clouddataflow-readonly:samples.weather_stations --gcpTempLocation=gs://$BUCKET/tmp --output=$BQ_DATASET.$TABLE_NAME" -Pdataflow-runner

{noformat}
This fails with:
{noformat}
java.lang.IllegalArgumentException: BigQueryIO.Read needs a GCS temp location to store temp files.
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO$TypedRead.validate(BigQueryIO.java:662)
at org.apache.beam.sdk.Pipeline$ValidateVisitor.enterCompositeTransform(Pipeline.java:641)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:645)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:649)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:311)
at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:245)
at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:458)
at org.apache.beam.sdk.Pipeline.validate(Pipeline.java:577)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:312)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
at org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:166)
at org.apache.beam.examples.cookbook.BigQueryTornadoes.main(BigQueryTornadoes.java:172)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
at java.lang.Thread.run(Thread.java:748)

{noformat}
Ironically, the example works if we remove --gcpTempLocation. The logs show that in that case the pipeline uses a bucket with a name like gs://dataflow-staging-us-central1-927334603519.
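
A possible workaround, sketched here and not verified against this report: the exception is raised from BigQueryIO's validate step, so the check may be reading the general --tempLocation option and not --gcpTempLocation. Under that assumption, passing both options with the same GCS path might let validation pass. The default values below (example-project, the fallback user name) are placeholders, and the command is only printed so that it can be inspected first.

```shell
# Placeholders mirroring the reproduction steps; override them via the environment.
PROJECT="${PROJECT:-example-project}"            # hypothetical project id
BUCKET="${BUCKET:-${USER:-user}_gcs_bucket}"
BQ_DATASET="${BQ_DATASET:-${USER:-user}_bq_dataset}"
TABLE_NAME="${TABLE_NAME:-out}"
TEMP_DIR="gs://$BUCKET/tmp"

# Assumption: the failing check reads --tempLocation, so set it alongside --gcpTempLocation.
EXEC_ARGS="--runner=DataflowRunner --project=$PROJECT --input=clouddataflow-readonly:samples.weather_stations --tempLocation=$TEMP_DIR --gcpTempLocation=$TEMP_DIR --output=$BQ_DATASET.$TABLE_NAME"

# Print the command for inspection; drop the leading 'echo' to actually run it.
echo mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.cookbook.BigQueryTornadoes "-Dexec.args=\"$EXEC_ARGS\"" -Pdataflow-runner
```

If this does run, it would only sidestep the problem; the behavior reported above (a custom --gcpTempLocation alone being rejected) would still stand.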


