You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Dan Hill <qu...@gmail.com> on 2020/09/18 07:47:41 UTC

Flink Table SQL and writing nested Avro files

Hi!

I want to join two tables and write the results to Avro where the left and
right rows are nested in the avro output.  Is it possible to do this with
the SQL interface?

Thanks!
- Dan

 CREATE TABLE `flat_avro` (
   `left` ROW<id BIGINT, name STRING>,
   `right` ROW<id BIGINT, name STRING>
) WITH (
   'connector' = 'filesystem',
   'path' = 's3p://blah/blah',
   'format' = 'avro'
);

INSERT INTO `flat_avro`
SELECT left.*, right.*
FROM `left`
LEFT JOIN `right`
ON `left`.`id` = `right`.`id`
);

Re: Flink Table SQL and writing nested Avro files

Posted by Dan Hill <qu...@gmail.com>.
Nice!  I'll try that.  Thanks, Dawid!

On Mon, Sep 21, 2020 at 2:37 AM Dawid Wysakowicz <dw...@apache.org>
wrote:

> Hi Dan,
>
> I think the best what I can suggest is this:
>
> SELECT
>
>     ROW(left.field0, left.field1, left.field2, ...),
>
>     ROW(right.field0, right.field1, right.field2, ...)
>
> FROM ...
>
> You will need to list all the fields manually, as SQL does not allow for
> asterisks in regular function calls.
>
> If you are willing to give the Table API a try you might workaround some
> of the manual work with the Column Function[1]
>
>         Table join = t1.join(t2).where($("id1").isEqual($("id2")));
>         join
>             .select(
>                 row(withColumns(range(1, t1.getSchema().getFieldCount()))),
>                 row(withColumns(range(
>                     t1.getSchema().getFieldCount() + 1,
>                     t1.getSchema().getFieldCount() +
> t2.getSchema().getFieldCount())))
>             )
>             .executeInsert("flat_avro")
>             .await();
>
>
> Best,
>
> Dawid
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/functions/systemFunctions.html#column-functions
> On 18/09/2020 09:47, Dan Hill wrote:
>
> Hi!
>
> I want to join two tables and write the results to Avro where the left and
> right rows are nested in the avro output.  Is it possible to do this with
> the SQL interface?
>
> Thanks!
> - Dan
>
>  CREATE TABLE `flat_avro` (
>    `left` ROW<id BIGINT, name STRING>,
>    `right` ROW<id BIGINT, name STRING>) WITH (
>    'connector' = 'filesystem',
>    'path' = 's3p://blah/blah',
>    'format' = 'avro');INSERT INTO `flat_avro` SELECT left.*, right.* FROM `left` LEFT JOIN `right`ON `left`.`id` = `right`.`id`);
>
>

Re: Flink Table SQL and writing nested Avro files

Posted by Dawid Wysakowicz <dw...@apache.org>.
Hi Dan,

I think the best what I can suggest is this:

|SELECT ||
|

|    ROW(left.field0, left.field1, left.field2, ...),|

|    ROW(right.field0, right.field1, right.field2, ...)|

|FROM ...|

You will need to list all the fields manually, as SQL does not allow for
asterisks in regular function calls.

If you are willing to give the Table API a try you might workaround some
of the manual work with the Column Function[1]

        Table join = t1.join(t2).where($("id1").isEqual($("id2")));
        join
            .select(
                row(withColumns(range(1, t1.getSchema().getFieldCount()))),
                row(withColumns(range(
                    t1.getSchema().getFieldCount() + 1,
                    t1.getSchema().getFieldCount() +
t2.getSchema().getFieldCount())))
            )
            .executeInsert("flat_avro")
            .await();


Best,

Dawid

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/functions/systemFunctions.html#column-functions

On 18/09/2020 09:47, Dan Hill wrote:
> Hi!
>
> I want to join two tables and write the results to Avro where the left
> and right rows are nested in the avro output.  Is it possible to do
> this with the SQL interface?  
>
> Thanks!
> - Dan
>  CREATE TABLE `flat_avro` (
>    `left` ROW<id BIGINT, name STRING>,
>    `right` ROW<id BIGINT, name STRING>
> ) WITH (
>    'connector' = 'filesystem',
>    'path' = 's3p://blah/blah',
>    'format' = 'avro'
> );INSERT INTO `flat_avro`
> SELECT left.*, right.* FROM `left`
> LEFT JOIN `right`
> ON `left`.`id` = `right`.`id` );