Posted to user@beam.apache.org by Wiśniowski Piotr <co...@gmail.com> on 2023/04/20 15:57:51 UTC

Beam shell sql with zeta

Hi,

I have a question about using ZetaSQL with the Beam SQL extensions in the 
SQL shell. I am trying to run:

```

SET runner = DirectRunner;
SET tempLocation = `/tmp/test/`;
SET streaming = `True`;
SET plannerName = `org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner`;

CREATE EXTERNAL TABLE etl_raw(
     event_timestamp TIMESTAMP,
     event_type VARCHAR,
     message_id VARCHAR,
     tracking_source VARCHAR,
     tracking_version VARCHAR,
     `repo_state` STRUCT<`head` STRUCT<`commit` VARCHAR, `name` VARCHAR>>
)
TYPE pubsub
LOCATION 'projects/xxx/topics/xxx'
TBLPROPERTIES '{"format":"json"}';

```

But I get the error `parse failed: Encountered "STRUCT"`.

If I change `STRUCT` to `ROW` (as in Calcite), the DDL passes, but I 
still fail to receive data. Running

`SELECT * FROM etl_raw LIMIT 1;` throws 
`java.lang.NoSuchFieldException: head (state=,code=0)`, even though I am 
sure that the field is present in the JSON payload.
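
A sketch of the `ROW` variant described above, assuming only the two 
`STRUCT` keywords are swapped for `ROW` and everything else in the DDL 
is left unchanged:

```
-- Sketch only: same table as above, with STRUCT replaced by ROW.
CREATE EXTERNAL TABLE etl_raw(
     event_timestamp TIMESTAMP,
     event_type VARCHAR,
     message_id VARCHAR,
     tracking_source VARCHAR,
     tracking_version VARCHAR,
     `repo_state` ROW<`head` ROW<`commit` VARCHAR, `name` VARCHAR>>
)
TYPE pubsub
LOCATION 'projects/xxx/topics/xxx'
TBLPROPERTIES '{"format":"json"}';
```

This form is accepted by the parser, but the `SELECT` against it is what 
fails with the `NoSuchFieldException` for `head`.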

With the `repo_state` field commented out, I am able to retrieve the data. 
Unfortunately, I cannot make the payload flat, because it comes from a 
3rd-party hook and I have no control over its structure.

In general, I am unable to parse a JSON message from Pub/Sub that has a 
structured (nested) field.

Is anyone familiar with this part of Beam's functionality?

Best regards

Wisniowski Piotr