Posted to user-zh@flink.apache.org by JasonLee <17...@163.com> on 2020/07/21 12:38:30 UTC
Re: flink-1.11 ddl kafka-to-hive issue
Hi,
Has the Hive table never had any data, or does data show up after a while?
JasonLee
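On 2020-07-21 at 19:09, kcz wrote:
hive-1.2.1
The checkpoints succeed (I looked in the checkpoint directory and it does contain checkpoint data, and Kafka has data too), but the Hive table has no data. Am I missing something?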
String hiveSql = "CREATE TABLE stream_tmp.fs_table (\n" +
" host STRING,\n" +
" url STRING," +
" public_date STRING" +
") partitioned by (public_date string) " +
"stored as PARQUET " +
"TBLPROPERTIES (\n" +
" 'sink.partition-commit.delay'='0 s',\n" +
" 'sink.partition-commit.trigger'='partition-time',\n" +
" 'sink.partition-commit.policy.kind'='metastore,success-file'" +
")";
tableEnv.executeSql(hiveSql);
tableEnv.executeSql("INSERT INTO stream_tmp.fs_table SELECT host, url, DATE_FORMAT(public_date, 'yyyy-MM-dd') FROM stream_tmp.source_table");
Re: flink-1.11 ddl kafka-to-hive issue
Posted by Jark Wu <im...@gmail.com>.
Have you tried configuring the rolling policy?
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/filesystem.html#sink-rolling-policy-rollover-interval
Best,
Jark
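A minimal sketch of what configuring the rolling policy could look like for the table in the original post. It only builds the DDL text, which would then be passed to tableEnv.executeSql(...) as in the original code. The class name, the method name, and the interval values '1 min' and '10 s' are illustrative assumptions; the option keys are the ones documented on the linked filesystem connector page. The partition column is declared only in PARTITIONED BY here, following Hive's convention, which differs from the posted DDL. Whether the Hive sink in this Flink version honours these options is worth verifying.

```java
// Sketch only: builds the CREATE TABLE text with explicit rolling-policy options.
final class RollingPolicyDdl {

    /** Returns a Hive-dialect CREATE TABLE statement with the given rolling intervals. */
    static String build(String rolloverInterval, String checkInterval) {
        return String.join("\n",
            "CREATE TABLE stream_tmp.fs_table (",
            "  host STRING,",
            "  url STRING",
            // Assumption: Hive convention, partition column declared only here.
            ") PARTITIONED BY (public_date STRING)",
            "STORED AS PARQUET",
            "TBLPROPERTIES (",
            // How long a part file may stay open before it is rolled.
            "  'sink.rolling-policy.rollover-interval'='" + rolloverInterval + "',",
            // How often the time-based rolling condition is checked.
            "  'sink.rolling-policy.check-interval'='" + checkInterval + "',",
            // The remaining options are unchanged from the original post.
            "  'sink.partition-commit.delay'='0 s',",
            "  'sink.partition-commit.trigger'='partition-time',",
            "  'sink.partition-commit.policy.kind'='metastore,success-file'",
            ")");
    }

    public static void main(String[] args) {
        // Illustrative values; tune them to the checkpoint interval in use.
        System.out.println(build("1 min", "10 s"));
    }
}
```

Only the two rolling-policy lines differ from the posted table properties, so this isolates the effect of the suggestion from the partition-commit settings.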
Re: flink-1.11 ddl kafka-to-hive issue
Posted by Jingsong Li <ji...@gmail.com>.
How is your source table defined? Are you sure the watermark is advancing? (You can check this in the Flink UI.)
Could you try removing 'sink.partition-commit.trigger'='partition-time'?
Best,
Jingsong
On Wed, Jul 22, 2020 at 12:02 AM Leonard Xu <xb...@gmail.com> wrote:
> Hi,
>
> Was the Hive table created in Flink? If so, did you use the Hive dialect when creating it? See [1] for how to set it.
>
> Best
> Leonard Xu
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/hive_dialect.html#use-hive-dialect
--
Best, Jingsong Lee
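A minimal sketch of the two partition-commit variants discussed in this reply, again only building the DDL text for tableEnv.executeSql(...). With usePartitionTime set to false the trigger option is omitted, so the connector falls back to its documented default, 'process-time', which does not depend on watermarks. With true, the partition-time trigger is kept and a timestamp pattern is added; the pattern '$public_date 00:00:00' is an assumption based on the 'yyyy-MM-dd' partition values in the original INSERT, and this variant still requires a watermark on the source table. The class and method names are hypothetical, and the partition column is declared only in PARTITIONED BY, following Hive's convention.

```java
// Sketch only: builds the CREATE TABLE text for either commit trigger.
final class PartitionCommitDdl {

    /** Returns the DDL; when usePartitionTime is false the trigger option is left out. */
    static String build(boolean usePartitionTime) {
        StringBuilder props = new StringBuilder();
        if (usePartitionTime) {
            // Assumption: maps a 'yyyy-MM-dd' partition value to a full timestamp.
            props.append("  'partition.time-extractor.timestamp-pattern'='$public_date 00:00:00',\n");
            // Commits once the watermark passes the partition time plus the delay.
            props.append("  'sink.partition-commit.trigger'='partition-time',\n");
        }
        props.append("  'sink.partition-commit.delay'='0 s',\n");
        props.append("  'sink.partition-commit.policy.kind'='metastore,success-file'\n");

        return "CREATE TABLE stream_tmp.fs_table (\n"
            + "  host STRING,\n"
            + "  url STRING\n"
            + ") PARTITIONED BY (public_date STRING)\n"
            + "STORED AS PARQUET\n"
            + "TBLPROPERTIES (\n"
            + props
            + ")";
    }

    public static void main(String[] args) {
        // Variant suggested in this reply: no explicit trigger.
        System.out.println(build(false));
    }
}
```

Trying the variant without the trigger first helps separate a watermark problem on the source table from a problem in the sink configuration.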
Re: flink-1.11 ddl kafka-to-hive issue
Posted by Leonard Xu <xb...@gmail.com>.
Hi,
Was the Hive table created in Flink? If so, did you use the Hive dialect when creating it? See [1] for how to set it.
Best
Leonard Xu
[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/hive_dialect.html#use-hive-dialect
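> On 2020-07-21 at 22:57, kcz <57...@qq.com> wrote:
>
> There has never been any data, and I can't tell what is wrong. The table already exists in Hive. In my tests, writing to HDFS through DDL works fine.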
Re: flink-1.11 ddl kafka-to-hive issue
Posted by kcz <57...@qq.com>.
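There has never been any data, and I can't tell what is wrong. The table already exists in Hive. In my tests, writing to HDFS through DDL works fine.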