You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Sonam Ramchand <so...@venturedive.com> on 2021/02/09 18:22:37 UTC
COVAR_POP aggregate function test for the ZetaSql dialec
Hi Devs,
I am trying to test the COVAR_POP aggregate function for the ZetaSql
dialect. I see
https://github.com/apache/beam/blob/b74fcf7b30d956fb42830d652a57b265a1546973/sdks/[…]he/beam/sdk/extensions/sql/impl/transform/agg/CovarianceFn.java
<https://github.com/apache/beam/blob/b74fcf7b30d956fb42830d652a57b265a1546973/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/agg/CovarianceFn.java#L50>
is
implemented as CombineFn and it works correctly
https://github.com/apache/beam/blob/befcc3d780d561e81f23512742862a65c0ae3b69/sdks/[…]eam/sdk/extensions/sql/BeamSqlDslAggregationCovarianceTest.java
<https://github.com/apache/beam/blob/befcc3d780d561e81f23512742862a65c0ae3b69/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslAggregationCovarianceTest.java#L87>
.However, for ZetaSql dialect, it throws:
covar_pop has more than one argument.
java.lang.IllegalArgumentException: covar_pop has more than one argument.Unit
test:
public void testZetaSqlCovarPop() {
String sql = "SELECT COVAR_POP(row_id,int64_col) FROM
table_all_types GROUP BY bool_col";
ZetaSQLQueryPlanner zetaSQLQueryPlanner = new ZetaSQLQueryPlanner(config);
BeamRelNode beamRelNode = zetaSQLQueryPlanner.convertToBeamRel(sql);
PCollection<Row> stream = BeamSqlRelUtils.toPCollection(pipeline,
beamRelNode);
final Schema schema = Schema.builder().addDoubleField("field1").build();
PAssert.that(stream)
.containsInAnyOrder(
Row.withSchema(schema).addValue(-1.00000).build(),
Row.withSchema(schema).addValue(-1.55556).build());
pipeline.run().waitUntilFinish(Duration.standardMinutes(PIPELINE_EXECUTION_WAITTIME_MINUTES));
}
Can anybody help me in understanding the cause of this problem? I do not
understand how it works correctly in other places and not in ZetaSqlDialect.
For reference: https://github.com/apache/beam/pull/13915
I would really appreciate any sort of input on this.
--
Regards,
*Sonam*
Software Engineer
Mobile: +92 3088337296
<http://venturedive.com/>
Re: COVAR_POP aggregate function test for the ZetaSql dialec
Posted by Andrew Pilloud <ap...@google.com>.
Looks like the code that converts the parsed ZetaSQL to a Calcite logical
expression doesn't currently support aggregate functions with multiple
columns. See this TODO:
https://github.com/apache/beam/blob/3bb232fb098700de408f574585dfe74bbaff7230/sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/translation/AggregateScanConverter.java#L147
On Tue, Feb 9, 2021 at 10:22 AM Sonam Ramchand <
sonam.ramchand@venturedive.com> wrote:
> Hi Devs,
> I am trying to test the COVAR_POP aggregate function for the ZetaSql
> dialect. I see
> https://github.com/apache/beam/blob/b74fcf7b30d956fb42830d652a57b265a1546973/sdks/[…]he/beam/sdk/extensions/sql/impl/transform/agg/CovarianceFn.java
> <https://github.com/apache/beam/blob/b74fcf7b30d956fb42830d652a57b265a1546973/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/agg/CovarianceFn.java#L50> is
> implemented as CombineFn and it works correctly
> https://github.com/apache/beam/blob/befcc3d780d561e81f23512742862a65c0ae3b69/sdks/[…]eam/sdk/extensions/sql/BeamSqlDslAggregationCovarianceTest.java
> <https://github.com/apache/beam/blob/befcc3d780d561e81f23512742862a65c0ae3b69/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslAggregationCovarianceTest.java#L87>
> .However, for ZetaSql dialect, it throws:
> covar_pop has more than one argument.
> java.lang.IllegalArgumentException: covar_pop has more than one argument.Unit
> test:
>
> public void testZetaSqlCovarPop() {
> String sql = "SELECT COVAR_POP(row_id,int64_col) FROM table_all_types GROUP BY bool_col";
>
> ZetaSQLQueryPlanner zetaSQLQueryPlanner = new ZetaSQLQueryPlanner(config);
> BeamRelNode beamRelNode = zetaSQLQueryPlanner.convertToBeamRel(sql);
> PCollection<Row> stream = BeamSqlRelUtils.toPCollection(pipeline, beamRelNode);
>
> final Schema schema = Schema.builder().addDoubleField("field1").build();
> PAssert.that(stream)
> .containsInAnyOrder(
> Row.withSchema(schema).addValue(-1.00000).build(),
> Row.withSchema(schema).addValue(-1.55556).build());
>
> pipeline.run().waitUntilFinish(Duration.standardMinutes(PIPELINE_EXECUTION_WAITTIME_MINUTES));
> }
>
> Can anybody help me in understanding the cause of this problem? I do not
> understand how it works correctly in other places and not in ZetaSqlDialect.
> For reference: https://github.com/apache/beam/pull/13915
>
> I would really appreciate any sort of input on this.
> --
>
> Regards,
> *Sonam*
> Software Engineer
> Mobile: +92 3088337296 <+92%20308%208337296>
>
> <http://venturedive.com/>
>