Posted to commits@beam.apache.org by "Fallon (JIRA)" <ji...@apache.org> on 2017/08/10 17:38:00 UTC
[jira] [Created] (BEAM-2761) Write to empty BigQuery partition fails with "No schema specified on job or table." despite having provided schema
Fallon created BEAM-2761:
----------------------------
Summary: Write to empty BigQuery partition fails with "No schema specified on job or table." despite having provided schema
Key: BEAM-2761
URL: https://issues.apache.org/jira/browse/BEAM-2761
Project: Beam
Issue Type: Bug
Components: runner-dataflow
Affects Versions: 2.1.0, 2.2.0
Reporter: Fallon
Assignee: Thomas Groh
Priority: Minor
In 2.1.0-SNAPSHOT and 2.2.0-SNAPSHOT, a job that writes an empty PCollection to a BigQuery partition fails with "java.lang.RuntimeException: Failed to create load job with id prefix". The failure is accompanied by the message "No schema specified on job or table", even though the write is configured with a schema.
Command to run job:
{code}
mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.EmptyPCollection \
-Dexec.args="--runner=DataflowRunner --project=<GCP project> \
--gcpTempLocation=<tmp location>" \
-Pdataflow-runner
{code}
Code to reproduce the problem:
{code:title=EmptyPCollection.java|borderStyle=solid}
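package org.apache.beam.examples;

import com.google.api.services.bigquery.model.TableRow;
import java.util.Arrays;
import java.util.List;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

public class EmptyPCollection {
  public static void main(String[] args) {
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    options.setTempLocation("<your tmp location>");
    Pipeline pipeline = Pipeline.create(options);
    String schema = "{\"fields\": [{\"name\": \"pet\", \"type\": \"string\", \"mode\": \"required\"}]}";
    String table = "mydataset.pets";
    List<String> pets = Arrays.asList("Dog", "Cat", "Goldfish");
    PCollection<String> inputText = pipeline.apply(Create.of(pets)).setCoder(StringUtf8Coder.of());
    PCollection<TableRow> rows = inputText.apply(ParDo.of(new DoFn<String, TableRow>() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        String text = c.element();
        // No pet starts with "X", so nothing is output and the PCollection is empty.
        // Changing the prefix to "D" (matching "Dog") makes the job succeed.
        if (text.startsWith("X")) {
          TableRow row = new TableRow();
          row.set("pet", text);
          c.output(row);
        }
      }
    }));
    rows.apply(BigQueryIO.writeTableRows().to(table).withJsonSchema(schema)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED));
    pipeline.run().waitUntilFinish();
  }
}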
{code}
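As a quick sanity check on why the reproduction yields an empty PCollection, the DoFn's filter can be mirrored in plain Java without Beam. This is only an illustrative sketch: the class PetFilterSketch and its method matching are hypothetical names that do not appear in the pipeline above, and the sketch does not touch BigQuery or reproduce the load-job failure itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of the reported pipeline): mirrors the
// startsWith filter used in the DoFn so the empty-output case is easy to see.
class PetFilterSketch {
    // Returns the elements the DoFn would emit for the given prefix.
    static List<String> matching(List<String> pets, String prefix) {
        List<String> out = new ArrayList<>();
        for (String pet : pets) {
            if (pet.startsWith(prefix)) {
                out.add(pet);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> pets = Arrays.asList("Dog", "Cat", "Goldfish");
        // Prefix "X" matches nothing, which corresponds to the failing case.
        System.out.println("X -> " + matching(pets, "X"));
        // Prefix "D" matches "Dog", which corresponds to the working case.
        System.out.println("D -> " + matching(pets, "D"));
    }
}
```

With the prefix "X" the result is an empty list, which corresponds to the case where the BigQuery write fails; with "D" a single element ("Dog") is produced and the job reportedly works.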