Posted to issues@beam.apache.org by "fdiazgon (Jira)" <ji...@apache.org> on 2019/12/05 16:15:00 UTC

[jira] [Updated] (BEAM-8896) WITH query AS + SELECT query JOIN other throws invalid type

     [ https://issues.apache.org/jira/browse/BEAM-8896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

fdiazgon updated BEAM-8896:
---------------------------
    Description: 
The first of the following three queries fails, despite the queries being equivalent:
{code:java}
Pipeline p = Pipeline.create();

Schema schemaA =
    Schema.of(
        Schema.Field.of("id", Schema.FieldType.BYTES),
        Schema.Field.of("fA1", Schema.FieldType.STRING));

Schema schemaB =
    Schema.of(
        Schema.Field.of("id", Schema.FieldType.STRING),
        Schema.Field.of("fB1", Schema.FieldType.STRING));

PCollection<Row> inputA =
    p.apply(Create.of(ImmutableList.<Row>of()).withCoder(SchemaCoder.of(schemaA)));

PCollection<Row> inputB =
    p.apply(Create.of(ImmutableList.<Row>of()).withCoder(SchemaCoder.of(schemaB)));

// Fails
String query1 =
    "WITH query AS "
        + "( "
        + " SELECT id, fA1, fA1 AS fA1_2 "
        + " FROM tblA"
        + ") "
        + "SELECT fA1, fB1, fA1_2 "
        + "FROM query "
        + "JOIN tblB ON (TO_HEX(query.id) = tblB.id)";

// Ok
String query2 =
    "WITH query AS "
        + "( "
        + " SELECT fA1, fB1, fA1 AS fA1_2 "
        + " FROM tblA "
        + " JOIN tblB "
        + " ON (TO_HEX(tblA.id) = tblB.id) "
        + ")"
        + "SELECT fA1, fB1, fA1_2 "
        + "FROM query ";

// Ok
String query3 =
    "WITH query AS "
        + "( "
        + " SELECT TO_HEX(id) AS id, fA1, fA1 AS fA1_2 "
        + " FROM tblA"
        + ") "
        + "SELECT fA1, fB1, fA1_2 "
        + "FROM query "
        + "JOIN tblB ON (query.id = tblB.id)";

Schema transform3 =
    PCollectionTuple.of("tblA", inputA)
        .and("tblB", inputB)
        .apply(SqlTransform.query(query3))
        .getSchema();
System.out.println(transform3);

Schema transform2 =
    PCollectionTuple.of("tblA", inputA)
        .and("tblB", inputB)
        .apply(SqlTransform.query(query2))
        .getSchema();
System.out.println(transform2);

Schema transform1 =
    PCollectionTuple.of("tblA", inputA)
        .and("tblB", inputB)
        .apply(SqlTransform.query(query1))
        .getSchema();
System.out.println(transform1);
{code}
 

The error is:
{noformat}
Exception in thread "main" java.lang.AssertionError: Field ordinal 2 is invalid for  type 'RecordType(VARBINARY id, VARCHAR fA1)'
    at org.apache.beam.repackaged.sql.org.apache.calcite.rex.RexBuilder.makeFieldAccess(RexBuilder.java:197)
{noformat}
 

If I change `schemaB.id` to `BYTES` (and drop the `TO_HEX` call), all queries work fine.

  was:
The first of the following two queries fails, despite the queries being equivalent:
{code:java}
Pipeline p = Pipeline.create();

Schema schemaA =
    Schema.of(
        Schema.Field.of("id", Schema.FieldType.BYTES),
        Schema.Field.of("fA1", Schema.FieldType.STRING));

Schema schemaB =
    Schema.of(
        Schema.Field.of("id", Schema.FieldType.STRING),
        Schema.Field.of("fB1", Schema.FieldType.STRING));

PCollection<Row> inputA =
    p.apply(Create.of(ImmutableList.<Row>of()).withCoder(SchemaCoder.of(schemaA)));

PCollection<Row> inputB =
    p.apply(Create.of(ImmutableList.<Row>of()).withCoder(SchemaCoder.of(schemaB)));

// Fails
String query1 =
    "WITH query AS "
        + "( "
        + " SELECT id, fA1, fA1 AS fA1_2 "
        + " FROM tblA"
        + ") "
        + "SELECT fA1, fB1, fA1_2 "
        + "FROM query "
        + "JOIN tblB ON (TO_HEX(query.id) = tblB.id)";

// Ok
String query2 =
    "WITH query AS "
        + "( "
        + " SELECT fA1, fB1, fA1 AS fA1_2 "
        + " FROM tblA "
        + " JOIN tblB "
        + " ON (TO_HEX(tblA.id) = tblB.id) "
        + ")"
        + "SELECT fA1, fB1, fA1_2 "
        + "FROM query ";

Schema transform2 =
    PCollectionTuple.of("tblA", inputA)
        .and("tblB", inputB)
        .apply(SqlTransform.query(query2))
        .getSchema();
System.out.println(transform2);

Schema transform1 =
    PCollectionTuple.of("tblA", inputA)
        .and("tblB", inputB)
        .apply(SqlTransform.query(query1))
        .getSchema();
System.out.println(transform1);
{code}
 

The error is:
{noformat}
Exception in thread "main" java.lang.AssertionError: Field ordinal 2 is invalid for  type 'RecordType(VARBINARY id, VARCHAR fA1)'
    at org.apache.beam.repackaged.sql.org.apache.calcite.rex.RexBuilder.makeFieldAccess(RexBuilder.java:197)
{noformat}
 

If I change `schemaB.id` to `BYTES` (and drop the `TO_HEX` call), both queries work fine.


> WITH query AS + SELECT query JOIN other throws invalid type
> -----------------------------------------------------------
>
>                 Key: BEAM-8896
>                 URL: https://issues.apache.org/jira/browse/BEAM-8896
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>    Affects Versions: 2.16.0
>            Reporter: fdiazgon
>            Assignee: Andrew Pilloud
>            Priority: Major


