Posted to user-zh@flink.apache.org by "casel.chen" <ca...@126.com> on 2021/08/04 03:55:06 UTC

Flink SQL: TopN of IP occurrence counts

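Scenario: compute statistics over user access log data in real time. For each user with more than 5 access events within a one-minute window, output the 3 source_ip values that occur most often for that user in that window.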


Source table data
user_name, source_ip, ts
张三, 100, 00:08
张三, 104, 00:12
张三, 100, 00:15
张三, 101, 00:35
张三, 100, 00:38
张三, 102, 00:40
张三, 102, 00:45
张三, 101, 00:47
张三, 100, 00:55


张三, 100, 01:15
李四, 200, 01:17
李四, 200, 01:19
李四, 200, 01:27
王五, 302, 01:35


Target table data
user_name, source_ip, occur_times, window_start, window_end
张三, 100, 4, 00:00, 01:00
张三, 101, 2, 00:00, 01:00
张三, 102, 2, 00:00, 01:00
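
To pin down the intended semantics, here is a minimal Python sketch (not Flink SQL) that recomputes the target rows from the source rows. It assumes ts is in MM:SS form, that windows are one-minute tumbling windows, and that ties in occurrence count are broken by ascending source_ip; the post does not specify a tie-break. The names EVENTS, top_ips, min_events and top_n are made up for illustration.

```python
from collections import Counter, defaultdict

# Source rows copied from the post: (user_name, source_ip, ts as "MM:SS").
EVENTS = [
    ("张三", "100", "00:08"), ("张三", "104", "00:12"), ("张三", "100", "00:15"),
    ("张三", "101", "00:35"), ("张三", "100", "00:38"), ("张三", "102", "00:40"),
    ("张三", "102", "00:45"), ("张三", "101", "00:47"), ("张三", "100", "00:55"),
    ("张三", "100", "01:15"), ("李四", "200", "01:17"), ("李四", "200", "01:19"),
    ("李四", "200", "01:27"), ("王五", "302", "01:35"),
]


def top_ips(events, min_events=5, top_n=3):
    """Return (user, ip, count, window_start, window_end) rows.

    A user qualifies in a window only with MORE than min_events events there.
    """
    # (window minute, user) -> Counter of source_ip occurrences
    counts = defaultdict(Counter)
    for user, ip, ts in events:
        minute = int(ts.split(":")[0])  # assumption: ts is "MM:SS"
        counts[(minute, user)][ip] += 1

    rows = []
    for (minute, user), per_ip in sorted(counts.items()):
        if sum(per_ip.values()) <= min_events:
            continue  # user did not exceed the threshold in this window
        # Rank by count descending; assumed tie-break: source_ip ascending.
        ranked = sorted(per_ip.items(), key=lambda kv: (-kv[1], kv[0]))
        for ip, n in ranked[:top_n]:
            rows.append((user, ip, n, f"{minute:02d}:00", f"{minute + 1:02d}:00"))
    return rows


if __name__ == "__main__":
    for row in top_ips(EVENTS):
        print(row)
```

On the sample data only 张三 in window 00:00 to 01:00 exceeds 5 events (9 in total), so the sketch yields the same three rows as the target table. Source IP 104 (1 occurrence) falls outside the top 3.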


=====================================================
create TEMPORARY table event_table (
    user_name STRING,
    source_ip STRING,
    ts TIMESTAMP
  )
with ('connector' = 'datagen');


create TEMPORARY table alert_table (
    user_name STRING,
    source_ip STRING,
    occur_times BIGINT,
    ts TIMESTAMP
  )
with ('connector' = 'print');


Questions:
1. How can this be implemented with Flink 1.12 SQL?
2. How can this be implemented with Window TopN in Flink 1.13?


Thanks!