Posted to user-zh@flink.apache.org by "799590989@qq.com.INVALID" <79...@qq.com.INVALID> on 2022/12/23 07:06:18 UTC
Why does a snowflake-algorithm UDF return the same result on every call, and how can it change on each call like UUID?
Environment
flink-1.13.6_scala_2.11
java 1.8
hive 1.1.0-cdh5.16.2
hbase 1.2.0-cdh5.16.2
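Running in yarn-session mode
The UDF class is attached.
The use case is to join a Hive table with a Kafka table and write the result to HBase. Because we query HBase every 3 seconds for the 10 most recent rows in reverse order, I want to use an ID generated by the snowflake algorithm as the HBase rowkey so that the rows are easy to sort.
However, after registering the UDF in Flink, repeated calls return the same result, so the ID does not auto-increment as intended. I ran the following SQL in sql-client: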
CREATE FUNCTION IF NOT EXISTS SnowflakeId AS 'com.chinaoly.SnowflakeId' LANGUAGE JAVA;
SELECT SnowflakeId() AS a
UNION ALL
SELECT SnowflakeId() AS b
UNION ALL
SELECT SnowflakeId() AS c
UNION ALL
SELECT SnowflakeId() AS d;
The result is:
Table program finished. Page: Last of 1 Updated: 14:44:19.665
a
1606178806577209344
1606178806577209344
1606178806577209344
1606178806577209344
1606178806577209344
1606178806577209344
1606178806577209344
1606178806577209344
1606178806577209344
Or:
SELECT SnowflakeId() AS a, SnowflakeId() AS b, SnowflakeId() AS c, SnowflakeId() AS d;
The result is:
a b c d
1606181622549028864 1606181622549028864 1606181622549028864 1606181622549028864
By contrast, the built-in UUID function does generate a different result on each call, for example:
SELECT UUID() AS a, UUID() AS b, UUID() AS c, UUID() AS d, UUID() AS e, UUID() AS f, UUID() AS g, UUID() AS h;
The result is:
a (CHAR(36) NOT NULL):
e16cd19c-b636-4a26-bca3-94424fee8313
b (CHAR(36) NOT NULL):
bb242ce0-6d73-428a-bc7e-6d35d44e7f1c
c (CHAR(36) NOT NULL):
357e56b6-a4ca-4666-9d09-0addf84db421
d (CHAR(36) NOT NULL):
deb70796-dc96-464c-a477-d19416fd0c0d
e (CHAR(36) NOT NULL):
233bf58e-9869-42d0-ad67-c21f653143d3
When run with UNION ALL, the results are likewise identical:
SELECT UUID() AS a
UNION ALL
SELECT UUID() AS b
UNION ALL
SELECT UUID() AS c;
The result is:
a
e6955237-64f3-4b72-bf15-03098~
e6955237-64f3-4b72-bf15-03098~
e6955237-64f3-4b72-bf15-03098~
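On the identical SnowflakeId() values shown earlier: the attached class is not reproduced here, so the following is only a minimal sketch of a snowflake-style generator, not the actual com.chinaoly.SnowflakeId. The class name, bit layout (41-bit timestamp, 10-bit worker id, 12-bit sequence) and epoch constant are all assumptions. One likely cause of the repeated value, also an assumption, is that Flink treats a user-defined function as deterministic by default, so the planner may evaluate a call without arguments once during planning. In that case, overriding isDeterministic() to return false in the ScalarFunction would be the thing to try.

```java
// Hypothetical stand-in for the attached UDF's generator logic (stdlib only).
final class SnowflakeSketch {
    // Assumed layout: 41 bits timestamp | 10 bits worker id | 12 bits sequence.
    private static final long EPOCH = 1288834974657L; // assumed custom epoch, in ms
    private static final long WORKER_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;
    private static final long MAX_WORKER_ID = ~(-1L << WORKER_BITS);
    private static final long SEQUENCE_MASK = ~(-1L << SEQUENCE_BITS);

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId out of range: " + workerId);
        }
        this.workerId = workerId;
    }

    // Returns a strictly increasing id for this worker.
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards");
        }
        if (now == lastTimestamp) {
            // Same millisecond: bump the sequence, wait for the next ms on overflow.
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    // In a Flink ScalarFunction wrapping this generator, eval() would call nextId(),
    // and isDeterministic() would be overridden to return false (assumption about
    // what the attached UDF is missing).
}
```

Called directly, outside Flink, each nextId() call yields a new, larger value, which suggests the repetition comes from how the planner handles the call rather than from the algorithm itself.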
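Because of our requirements, the output fields are not fixed, so I cannot use an output field as the rowkey. How should I design the rowkey when sinking to HBase so that I can fetch the latest 10 rows in reverse chronological order?
谌祥, Hangzhou - Java backend developer - big data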
799590989@qq.com
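On the rowkey question in the post: HBase stores rows sorted lexicographically by rowkey, so one common pattern is to subtract an increasing id (or timestamp) from Long.MAX_VALUE and zero-pad the result to a fixed width. A plain forward scan limited to 10 rows then returns the newest rows first. The sketch below is only an illustration under that assumption. The class and method names are hypothetical, and it presumes the ids are non-negative and increase over time, as snowflake ids normally do.

```java
import java.util.Locale;

// Hypothetical helper: turns an increasing id into a rowkey that sorts newest-first.
final class ReverseRowKey {
    private ReverseRowKey() {
    }

    // Larger (newer) ids map to lexicographically smaller keys.
    static String fromId(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative: " + id);
        }
        // Long.MAX_VALUE has 19 decimal digits, so pad to a fixed width of 19
        // to keep string order identical to numeric order.
        return String.format(Locale.ROOT, "%019d", Long.MAX_VALUE - id);
    }
}
```

For example, ReverseRowKey.fromId(1606181622549028864L) sorts before ReverseRowKey.fromId(1606178806577209344L), so the later of the two ids from the post would be read first.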