Posted to user@flink.apache.org by Piyush Narang <p....@criteo.com> on 2019/03/28 19:31:55 UTC

Support for custom triggers in Table / SQL

Hi folks,

I’m trying to write a Flink job that computes a set of counters. The job needs custom triggers, and I’m trying to figure out the best way to express that.

The query looks something like this:
SELECT userId, customUDAF(...) AS counter1, customUDAF(...) AS counter2, ...
FROM (
  SELECT * FROM my_kafka_stream
)
GROUP BY userId, HOP(`timestamp`, INTERVAL '6' HOUR, INTERVAL '7' DAY)

We sink this to a KV store (memcache / couchbase) for persistence.

Some of these counters span a wide time window (the longest is 7 days), so to keep the state tractable we need a fairly large slide interval (6 hours or more). However, some of our users need the counters to be updated frequently (e.g. every minute), so we were looking at how to achieve that with the Table / SQL API. I see that this is possible with the custom triggers<https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/windows.html#triggers> support in the DataStream API, but I’m wondering whether it is also possible with the Table / SQL APIs.
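
To make the state trade-off concrete, here is a back-of-the-envelope sketch in plain Java (not Flink code). It assumes each record is kept in every sliding window that overlaps it, which is how the DataStream sliding window assigners behave. The names `SlidingWindowMath` and `windowsPerRecord` are hypothetical and used only for illustration.

```java
import java.time.Duration;

final class SlidingWindowMath {
    // Number of overlapping sliding windows that contain a single record.
    // Assumption: the window size is an exact multiple of the slide.
    static long windowsPerRecord(Duration size, Duration slide) {
        if (slide.isZero() || slide.isNegative()) {
            throw new IllegalArgumentException("slide must be positive");
        }
        if (size.toMillis() % slide.toMillis() != 0) {
            throw new IllegalArgumentException("size must be a multiple of slide");
        }
        return size.toMillis() / slide.toMillis();
    }

    public static void main(String[] args) {
        Duration size = Duration.ofDays(7);
        // 6-hour slide, as in the query above: prints 28
        System.out.println(windowsPerRecord(size, Duration.ofHours(6)));
        // 1-minute slide, to get per-minute updates from the slide alone: prints 10080
        System.out.println(windowsPerRecord(size, Duration.ofMinutes(1)));
    }
}
```

Under that assumption, shrinking the slide from 6 hours to 1 minute multiplies the number of windows each record touches by 360, which is why we would rather keep the 6-hour slide and fire the windows early.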

I did see another thread where Fabian brought up this design doc<https://docs.google.com/document/d/1wrla8mF_mmq-NW9sdJHYVgMyZsgCmHumJJ5f5WUzTiM/edit?ts=59816a8b#heading=h.1zg4jlmqwzlr>, which lists what support for emit triggers looks like in various streaming platforms. Is this being actively worked on? If not, any suggestions on how we could get the ball rolling (Google Doc design + JIRA)?

Thanks,

-- Piyush


Re: Support for custom triggers in Table / SQL

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Piyush,

Custom triggers (or early firing) are currently not supported by SQL or the
Table API.
They are also not on the roadmap [1].
Currently, most efforts on the relational API are focused on restructuring
the code and working towards the integration of the Blink contribution [2].

AFAIK, there are no concrete plans to work on support for early results in
SQL.
You'd need to use the DataStream API for this use case.
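
For readers unfamiliar with the term, the following is a toy simulation of early firing in plain Java. It is not Flink code and does not reproduce Flink's exact timer semantics; the class `EarlyFiringCounter` and its methods are hypothetical. In the DataStream API, the built-in ContinuousEventTimeTrigger and ContinuousProcessingTimeTrigger provide the real mechanism: they fire a window repeatedly at a fixed interval before it closes.

```java
import java.util.ArrayList;
import java.util.List;

final class EarlyFiringCounter {
    private final long windowEndMs;     // exclusive end of the window
    private final long fireIntervalMs;  // how often a partial result is emitted
    private long nextFireMs;            // next event-time point for an early result
    private long count = 0;
    private boolean closed = false;
    private final List<String> results = new ArrayList<>();

    EarlyFiringCounter(long windowStartMs, long windowEndMs, long fireIntervalMs) {
        this.windowEndMs = windowEndMs;
        this.fireIntervalMs = fireIntervalMs;
        this.nextFireMs = windowStartMs + fireIntervalMs;
    }

    // Called once per record that falls into this window.
    void onElement() {
        if (!closed) {
            count++;
        }
    }

    // Called when event time advances to the given watermark.
    void onWatermark(long watermarkMs) {
        if (closed) {
            return;
        }
        if (watermarkMs >= windowEndMs) {
            // The window is complete: emit the final result exactly once.
            results.add("final:" + count);
            closed = true;
            return;
        }
        if (watermarkMs >= nextFireMs) {
            // Early firing: emit the running count without closing the window.
            results.add("early:" + count);
            while (nextFireMs <= watermarkMs) {
                nextFireMs += fireIntervalMs;
            }
        }
    }

    List<String> results() {
        return results;
    }

    public static void main(String[] args) {
        // A 10-minute window that emits a partial count every minute.
        EarlyFiringCounter window = new EarlyFiringCounter(0L, 600_000L, 60_000L);
        window.onElement();
        window.onElement();
        window.onWatermark(60_000L);   // early result with 2 records
        window.onElement();
        window.onWatermark(130_000L);  // early result with 3 records
        window.onWatermark(600_000L);  // final result
        System.out.println(window.results()); // prints [early:2, early:3, final:3]
    }
}
```

With a key-value sink like the one described in the question, each early result would presumably just overwrite the previous value for the same key, so the downstream store always holds the latest partial count.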

Best, Fabian

[1] https://flink.apache.org/roadmap.html
[2] https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html
