Posted to user-zh@flink.apache.org by kcz <57...@qq.com> on 2020/07/21 11:09:19 UTC

flink-1.11 ddl kafka-to-hive problem

hive-1.2.1
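Checkpoints are completing successfully (I looked in the checkpoint directory and it does contain checkpoint data, and Kafka has data too), but the Hive table has no data. Am I missing something?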
String hiveSql = "CREATE  TABLE  stream_tmp.fs_table (\n" +
        "  host STRING,\n" +
        "  url STRING," +
        "  public_date STRING" +
        ") partitioned by (public_date string) " +
        "stored as PARQUET " +
        "TBLPROPERTIES (\n" +
        "  'sink.partition-commit.delay'='0 s',\n" +
        "  'sink.partition-commit.trigger'='partition-time',\n" +
        "  'sink.partition-commit.policy.kind'='metastore,success-file'" +
        ")";
tableEnv.executeSql(hiveSql);


tableEnv.executeSql("INSERT INTO  stream_tmp.fs_table SELECT host, url, DATE_FORMAT(public_date, 'yyyy-MM-dd') FROM stream_tmp.source_table");

Re: flink-1.11 ddl kafka-to-hive problem

Posted by Jark Wu <im...@gmail.com>.
Have you tried configuring the rolling policy?
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/filesystem.html#sink-rolling-policy-rollover-interval
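A minimal sketch of what configuring the rolling policy might look like, reusing the table from the question. The three `sink.rolling-policy.*` option names come from the Flink 1.11 filesystem connector page linked above; the class name, the helper method, and the values (`128MB`, `15 min`, `1 min`) are assumptions chosen only for illustration. The sketch lists `public_date` only in the `PARTITIONED BY` clause, which is how Hive DDL declares partition columns. It only builds the DDL string, so it runs without Flink on the classpath.

```java
final class RollingPolicySketch {

    // Builds a Hive-dialect DDL for the sink table with explicit rolling-policy settings.
    // All three arguments are illustrative values, not recommendations.
    static String sinkDdl(String fileSize, String rolloverInterval, String checkInterval) {
        return "CREATE TABLE stream_tmp.fs_table (\n"
             + "  host STRING,\n"
             + "  url STRING\n"
             + ") PARTITIONED BY (public_date STRING)\n"
             + "STORED AS PARQUET\n"
             + "TBLPROPERTIES (\n"
             + "  'sink.rolling-policy.file-size'='" + fileSize + "',\n"
             + "  'sink.rolling-policy.rollover-interval'='" + rolloverInterval + "',\n"
             + "  'sink.rolling-policy.check-interval'='" + checkInterval + "',\n"
             + "  'sink.partition-commit.delay'='0 s',\n"
             + "  'sink.partition-commit.trigger'='partition-time',\n"
             + "  'sink.partition-commit.policy.kind'='metastore,success-file'\n"
             + ")";
    }

    public static void main(String[] args) {
        String ddl = sinkDdl("128MB", "15 min", "1 min");
        System.out.println(ddl);
        // With Flink available, this string would be passed to tableEnv.executeSql(ddl)
        // in place of the hiveSql string from the question.
    }
}
```

For bulk formats such as Parquet, the same documentation page notes that the checkpoint interval also influences when part files are finished, so the rolling settings and the checkpoint interval are worth checking together.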

Best,
Jark

On Tue, 21 Jul 2020 at 20:38, JasonLee <17...@163.com> wrote:

> hi
> Does the Hive table never get any data, or does data show up after a while?

Re: flink-1.11 ddl kafka-to-hive problem

Posted by Jingsong Li <ji...@gmail.com>.
How is your source table defined? Are you sure the watermark is advancing? (You can check this in the Flink UI.)
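The source table is never shown in this thread, so the following is only a hypothetical sketch of a Kafka source definition that declares a watermark. It assumes `public_date` is a `TIMESTAMP(3)` event-time column; the topic, broker address, consumer group, JSON format, and the 5-second out-of-orderness bound are all made-up values. The connector option names follow the Flink 1.11 Kafka SQL connector. The code only builds the DDL string, so it runs without Flink.

```java
final class SourceWatermarkSketch {

    // Builds a default-dialect DDL for a Kafka source with an event-time watermark.
    // Every argument and literal below is an illustrative assumption.
    static String sourceDdl(String topic, String bootstrapServers, int maxDelaySeconds) {
        return "CREATE TABLE stream_tmp.source_table (\n"
             + "  host STRING,\n"
             + "  url STRING,\n"
             + "  public_date TIMESTAMP(3),\n"
             + "  WATERMARK FOR public_date AS public_date - INTERVAL '"
             + maxDelaySeconds + "' SECOND\n"
             + ") WITH (\n"
             + "  'connector'='kafka',\n"
             + "  'topic'='" + topic + "',\n"
             + "  'properties.bootstrap.servers'='" + bootstrapServers + "',\n"
             + "  'properties.group.id'='demo-group',\n"
             + "  'format'='json',\n"
             + "  'scan.startup.mode'='latest-offset'\n"
             + ")";
    }

    public static void main(String[] args) {
        System.out.println(sourceDdl("demo_topic", "localhost:9092", 5));
        // With Flink available: tableEnv.executeSql(sourceDdl(...)) under the default dialect.
    }
}
```

Without a `WATERMARK` clause on the source (or if the watermark stops advancing, for example because some Kafka partitions are idle), a `partition-time` commit trigger has nothing to compare the partition time against.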

Could you try removing 'sink.partition-commit.trigger'='partition-time'?
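As a rough illustration of that change: with the trigger option removed, the sink should fall back to the connector default, which the Flink 1.11 filesystem documentation lists as `process-time` and which does not depend on watermarks. The helper names below are hypothetical; the property keys and values are the ones from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class CommitTriggerSketch {

    // The three table properties used in the question, in their original order.
    static Map<String, String> questionProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("sink.partition-commit.delay", "0 s");
        props.put("sink.partition-commit.trigger", "partition-time");
        props.put("sink.partition-commit.policy.kind", "metastore,success-file");
        return props;
    }

    // Renders a map as a TBLPROPERTIES clause for a Hive-dialect CREATE TABLE.
    static String tblProperties(Map<String, String> props) {
        return props.entrySet().stream()
                .map(e -> "  '" + e.getKey() + "'='" + e.getValue() + "'")
                .collect(Collectors.joining(",\n", "TBLPROPERTIES (\n", "\n)"));
    }

    public static void main(String[] args) {
        Map<String, String> props = questionProperties();
        // Drop the explicit trigger so the connector default applies.
        props.remove("sink.partition-commit.trigger");
        System.out.println(tblProperties(props));
    }
}
```

If data appears in Hive after this change, that points at the watermark (or partition-time extraction) rather than at the sink itself.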

Best,
Jingsong

On Wed, Jul 22, 2020 at 12:02 AM Leonard Xu <xb...@gmail.com> wrote:

> Hi,
>
> Was the Hive table created in Flink? If so, did you use the Hive dialect when creating it? You can set it up by following [1].
>
> Best
> Leonard Xu
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/hive_dialect.html#use-hive-dialect

-- 
Best, Jingsong Lee

Re: flink-1.11 ddl kafka-to-hive problem

Posted by Leonard Xu <xb...@gmail.com>.
Hi,

Was the Hive table created in Flink? If so, did you use the Hive dialect when creating it? You can set it up by following [1].

Best
Leonard Xu
[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/hive_dialect.html#use-hive-dialect
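A small sketch of the ordering this implies when driving everything from Java: switch to the Hive dialect, run the Hive DDL, switch back to the default dialect, then run the INSERT. In a real job the two operations would presumably map to `tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE)` / `SqlDialect.DEFAULT` and `tableEnv.executeSql(...)`; here they are hidden behind a hypothetical `Session` interface so that the example runs without Flink on the classpath. The placeholder SQL strings are not real statements.

```java
import java.util.ArrayList;
import java.util.List;

final class HiveDialectSketch {

    /** Hypothetical stand-in for the two TableEnvironment operations used here. */
    interface Session {
        void useDialect(String dialect); // e.g. tableEnv.getConfig().setSqlDialect(...)
        void executeSql(String sql);     // e.g. tableEnv.executeSql(sql)
    }

    // Runs the Hive DDL under the Hive dialect, then the INSERT under the default dialect.
    static void createTableThenInsert(Session session, String hiveDdl, String insertSql) {
        session.useDialect("hive");
        session.executeSql(hiveDdl);
        session.useDialect("default");
        session.executeSql(insertSql);
    }

    // Records the calls instead of executing them, to make the ordering visible.
    static List<String> dryRun(String hiveDdl, String insertSql) {
        List<String> log = new ArrayList<>();
        createTableThenInsert(new Session() {
            @Override public void useDialect(String dialect) { log.add("dialect=" + dialect); }
            @Override public void executeSql(String sql) { log.add("sql=" + sql); }
        }, hiveDdl, insertSql);
        return log;
    }

    public static void main(String[] args) {
        dryRun("<hive CREATE TABLE>", "<INSERT INTO ... SELECT ...>")
                .forEach(System.out::println);
    }
}
```

Switching back before the INSERT follows the pattern used in the Flink Hive streaming examples, where Flink built-in functions such as `DATE_FORMAT` are used under the default dialect.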

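> On Jul 21, 2020, at 22:57, kcz <57...@qq.com> wrote:
> 
> There has never been any data, and I don't know what is wrong. The table already exists in Hive. I tested writing to HDFS with DDL and that works fine.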


Re: flink-1.11 ddl kafka-to-hive problem

Posted by kcz <57...@qq.com>.
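There has never been any data, and I don't know what is wrong. The table already exists in Hive. I tested writing to HDFS with DDL and that works fine.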






Re: flink-1.11 ddl kafka-to-hive problem

Posted by JasonLee <17...@163.com>.
hi
Does the Hive table never get any data, or does data show up after a while?

