Posted to dev@flink.apache.org by Becket Qin <be...@gmail.com> on 2019/12/04 04:12:29 UTC

Re: [DISCUSS] FLIP-72: Introduce Pulsar Connector

Hi all,

Sorry for the long-overdue update. I have updated the FLIP-27 wiki page with
the latest proposals. Some notable changes include:
1. A new generic communication mechanism between SplitEnumerator and
SourceReader.
2. Some detailed API method signature changes.

We left a few things out of this FLIP and will address them in separate
FLIPs, including:
1. Per-split event time.
2. Event time alignment.
3. Fine-grained failover for SplitEnumerator failure.

Please let us know if you have any questions.

Thanks,

Jiangjie (Becket) Qin

On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen <he...@gmail.com>
wrote:

> Hi everyone,
>
> I've put the catalog design in a separate doc with more details for
> easier communication.
>
>
> https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing
>
> I would love to hear your thoughts on this.
>
> Best,
> Yijie
>
> On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <he...@gmail.com>
> wrote:
>
> > Hi everyone,
> >
> > Glad to receive your valuable feedback.
> >
> > I'd first separate the Pulsar catalog into another doc and show more design
> > and implementation details there.
> >
> > For the current FLIP-72, I would split it so that the sink part is the
> > current work and the source part is kept as future work until FLIP-27 is
> > finalized.
> >
> > I have also replied to some of the comments in the design doc. I will rewrite
> > the catalog part according to Bowen's advice in both email and comments.
> >
> > Thanks for the help again.
> >
> > Best,
> > Yijie
> >
> > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <wa...@gmail.com> wrote:
> >
> >> Hi Yijie,
> >>
> >> I also agree with Jark on separating the Catalog part into another FLIP.
> >>
> >> With FLIP-27[1] also in the air, it is probably a good idea to split out and
> >> unblock the sink implementation contribution.
> >> I would suggest either putting a detailed implementation plan section in
> >> the doc, or (maybe too much separation?) splitting them into different
> >> FLIPs. What do you guys think?
> >>
> >> --
> >> Rong
> >>
> >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> >>
> >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <im...@gmail.com> wrote:
> >>
> >> > Hi Yijie,
> >> >
> >> > Thanks for the design document. I agree with Bowen that the catalog part
> >> > needs more details.
> >> > And I would suggest separating the Pulsar Catalog into another FLIP. IMO, it
> >> > has little to do with source/sink.
> >> > Having a separate FLIP can unblock the contribution of the sink (or source)
> >> > and keep the discussion more focused.
> >> > I also left some comments in the document.
> >> >
> >> > Thanks,
> >> > Jark
> >> >
> >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <he...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi Bowen,
> >> > >
> >> > > Thanks for your comments. I'll add catalog details as you suggested.
> >> > >
> >> > > One more question: since we decided not to implement the source part of
> >> > > the connector at the moment, what can users do with a Pulsar catalog?
> >> > > Create a table backed by Pulsar and check existing Pulsar tables to see
> >> > > their schemas? Drop tables maybe?
> >> > >
> >> > > Best,
> >> > > Yijie
> >> > >
> >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <bo...@gmail.com> wrote:
> >> > >
> >> > > > Hi Yijie,
> >> > > >
> >> > > > Per the discussion, maybe you can move the Pulsar source to the 'future
> >> > > > work' section in the FLIP for now?
> >> > > >
> >> > > > Besides, the FLIP seems to be quite rough at the moment, and I'd
> >> > > > recommend adding more details.
> >> > > >
> >> > > > A few questions, mainly regarding the proposed Pulsar catalog.
> >> > > >
> >> > > >    - Can you provide some background on the Pulsar schema registry and
> >> > > >    how it works?
> >> > > >    - The proposed design of the Pulsar catalog is very vague now. Can
> >> > > >    you share some details of how a Pulsar catalog would work
> >> > > >    internally? E.g.
> >> > > >       - Which APIs does it support exactly? E.g. I see from your
> >> > > >       prototype that table creation is supported but not alteration.
> >> > > >       - Is it going to connect to a Pulsar schema registry via an
> >> > > >       HTTP client or a Pulsar client, etc.?
> >> > > >       - Will it be able to handle multiple versions of Pulsar, or
> >> > > >       just one? How is compatibility handled between different
> >> > > >       Flink-Pulsar versions?
> >> > > >       - Will it support only reading from the Pulsar schema registry,
> >> > > >       or both read/write? Will it work end-to-end in Flink SQL for
> >> > > >       users to create and manipulate a Pulsar table such as
> >> > > >       "CREATE TABLE t WITH PROPERTIES(type=pulsar)" and
> >> > > >       "DROP TABLE t"?
> >> > > >       - Is a Pulsar topic always going to be a non-partitioned table?
> >> > > >       How is a partitioned topic mapped to a Flink table?
> >> > > >    - How do we map Flink's catalog/database namespace to Pulsar's
> >> > > >    multi-tenant namespaces? I'm not very familiar with how
> >> > > >    multi-tenancy works in Pulsar, and some background context/use
> >> > > >    cases may help here too. E.g.
> >> > > >       - Can a Pulsar client/consumer/producer be multi-tenant at
> >> > > >       the same time?
> >> > > >       - How does authentication work in Pulsar's multi-tenancy and
> >> > > >       the catalog? Asking since I didn't see that the proposed Pulsar
> >> > > >       catalog has username/password configs.
> >> > > >       - The FLIP seems to propose mapping a Pulsar cluster and
> >> > > >       'tenant/namespace' respectively to Flink's 'catalog' and
> >> > > >       'database'. I wonder whether it totally makes sense, or should
> >> > > >       we actually map "tenant" to "catalog", and "namespace" to
> >> > > >       "database"?
> >> > > >
> >> > > > Cheers,
> >> > > > Bowen
> >> > > >
> >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <henry.yijieshen@gmail.com> wrote:
> >> > > >
> >> > > >> Hi everyone,
> >> > > >>
> >> > > >> Per discussion in the previous thread
> >> > > >> <http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html>,
> >> > > >> I have created FLIP-72 to kick off a more detailed discussion on the
> >> > > >> Flink Pulsar connector:
> >> > > >>
> >> > > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector
> >> > > >>
> >> > > >> In short, the connector has the following features:
> >> > > >>
> >> > > >>    - Pulsar as a streaming source with exactly-once guarantee.
> >> > > >>    - Sink streaming results to Pulsar with at-least-once semantics.
> >> > > >>    - Built upon Flink's new Table API type system (FLIP-37
> >> > > >>    <https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System>),
> >> > > >>    and can automatically (de)serialize messages with the help of the
> >> > > >>    Pulsar schema.
> >> > > >>    - Integrates with Flink's new Catalog API (FLIP-30
> >> > > >>    <https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs>),
> >> > > >>    which enables the use of Pulsar topics as tables in the Table API
> >> > > >>    as well as the SQL client.
> >> > > >>
> >> > > >> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u
> >> > > >>
> >> > > >> Would love to hear your thoughts on this.
> >> > > >>
> >> > > >> Best,
> >> > > >> Yijie
> >> > > >>
> >> > > >
> >> > >
> >> >
> >>
> >
>

Re: [DISCUSS] FLIP-72: Introduce Pulsar Connector

Posted by Yijie Shen <he...@gmail.com>.
Hi everyone,

I've updated the Catalog PR and made all settings lowercase. Tests have been
added as well.
Hi Bowen, could you please take a look?
https://github.com/apache/flink/pull/10455

For the sink part of the connector, I've made a separate PR
https://github.com/apache/flink/pull/10875. Could someone help review this?


Best,
Yijie


On Thu, Jan 9, 2020 at 8:44 AM Bowen Li <bo...@gmail.com> wrote:

> Hi Yijie,
>
> There's just one more concern on the yaml configs. Otherwise, I think we
> should be good to go.
>
> Can you update your PR and ensure all tests pass? I can help review and
> merge in the next couple weeks.
>
> Thanks,
> Bowen
>
>
> On Mon, Dec 23, 2019 at 7:03 PM Yijie Shen <he...@gmail.com>
> wrote:
>
> > Hi Bowen,
> >
> > I've updated the design doc, PTAL.
> > Btw, the PR for the catalog is https://github.com/apache/flink/pull/10455;
> > could you please take a look?
> >
> > Best,
> > Yijie
> >
> > On Mon, Dec 9, 2019 at 8:44 AM Bowen Li <bo...@gmail.com> wrote:
> >
> > > Hi Yijie,
> > >
> > > I took a look at the design doc. LGTM overall, left a few questions.
> > >
> > > On Tue, Dec 3, 2019 at 10:39 PM Becket Qin <be...@gmail.com> wrote:
> > >
> > > > Yes, you are absolutely right. Cannot believe I posted in the wrong
> > > > thread...
> > > >
> > > > On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <im...@gmail.com> wrote:
> > > >
> > > > >> Thanks, Becket, for the update.
> > > > >>
> > > > >> But shouldn't this message be posted in the FLIP-27 discussion thread[1]?
> > > >>
> > > >>
> > > >> Best,
> > > >> Jark
> > > >>
> > > >> [1]:
> > > >>
> > > >>
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html
> > > >>

Re: [DISCUSS] FLIP-72: Introduce Pulsar Connector

Posted by Bowen Li <bo...@gmail.com>.
Hi Yijie,

There's just one more concern on the yaml configs. Otherwise, I think we
should be good to go.

Can you update your PR and ensure all tests pass? I can help review and
merge in the next couple weeks.

Thanks,
Bowen


On Mon, Dec 23, 2019 at 7:03 PM Yijie Shen <he...@gmail.com>
wrote:

> Hi Bowen,
>
> I've updated the design doc, PTAL.
> Btw, the PR for the catalog is https://github.com/apache/flink/pull/10455;
> could you please take a look?
>
> Best,
> Yijie
>
> On Mon, Dec 9, 2019 at 8:44 AM Bowen Li <bo...@gmail.com> wrote:
>
> > Hi Yijie,
> >
> > I took a look at the design doc. LGTM overall, left a few questions.
> >
> > On Tue, Dec 3, 2019 at 10:39 PM Becket Qin <be...@gmail.com> wrote:
> >
> > > Yes, you are absolutely right. Cannot believe I posted in the wrong
> > > thread...
> > >
> > > On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <im...@gmail.com> wrote:
> > >
> > >> Thanks, Becket, for the update.
> > >>
> > >> But shouldn't this message be posted in the FLIP-27 discussion thread[1]?
> > >>
> > >>
> > >> Best,
> > >> Jark
> > >>
> > >> [1]: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html
> > >>
> > >> On Wed, 4 Dec 2019 at 12:12, Becket Qin <be...@gmail.com> wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > Sorry for the long-overdue update. I have updated the FLIP-27 wiki page
> > >> > with the latest proposals. Some notable changes include:
> > >> > 1. A new generic communication mechanism between SplitEnumerator and
> > >> > SourceReader.
> > >> > 2. Some detailed API method signature changes.
> > >> >
> > >> > We left a few things out of this FLIP and will address them in separate
> > >> > FLIPs, including:
> > >> > 1. Per-split event time.
> > >> > 2. Event time alignment.
> > >> > 3. Fine-grained failover for SplitEnumerator failure.
> > >> >
> > >> > Please let us know if you have any questions.
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Jiangjie (Becket) Qin
> > >> >
> > >> > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen <
> > henry.yijieshen@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hi everyone,
> > >> > >
> > >> > > I've put the catalog part design in separate doc with more details
> > for
> > >> > > easier communication.
> > >> > >
> > >> > >
> > >> > >
> > >> >
> > >>
> >
> https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing
> > >> > >
> > >> > > I would love to hear your thoughts on this.
> > >> > >
> > >> > > Best,
> > >> > > Yijie
> > >> > >
> > >> > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <
> > >> henry.yijieshen@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > Hi everyone,
> > >> > > >
> > >> > > > Glad to receive your valuable feedbacks.
> > >> > > >
> > >> > > > I'd first separate the Pulsar catalog as another doc and show
> more
> > >> > design
> > >> > > > and implementation details there.
> > >> > > >
> > >> > > > For the current FLIP-72, I would separate it into the sink part
> > for
> > >> > > > current work and keep the source part as future works until we
> > reach
> > >> > > > FLIP-27 finals.
> > >> > > >
> > >> > > > I also reply to some of the comments in the design doc. I will
> > >> rewrite
> > >> > > the
> > >> > > > catalog part in regarding to Bowen's advice in both email and
> > >> comments.
> > >> > > >
> > >> > > > Thanks for the help again.
> > >> > > >
> > >> > > > Best,
> > >> > > > Yijie
> > >> > > >
> > >> > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <walterddr@gmail.com
> >
> > >> > wrote:
> > >> > > >
> > >> > > >> Hi Yijie,
> > >> > > >>
> > >> > > >> I also agree with Jark on separating the Catalog part into
> > another
> > >> > FLIP.
> > >> > > >>
> > >> > > >> With FLIP-27[1] also in the air, it is also probably great to
> > split
> > >> > and
> > >> > > >> unblock the sink implementation contribution.
> > >> > > >> I would suggest either putting in a detail implementation plan
> > >> section
> > >> > > in
> > >> > > >> the doc, or (maybe too much separation?) splitting them into
> > >> different
> > >> > > >> FLIPs. What do you guys think?
> > >> > > >>
> > >> > > >> --
> > >> > > >> Rong
> > >> > > >>
> > >> > > >> [1]
> > >> > > >>
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > >> > > >>
> > >> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <im...@gmail.com>
> > wrote:
> > >> > > >>
> > >> > > >> > Hi Yijie,
> > >> > > >> >
> > >> > > >> > Thanks for the design document. I agree with Bowen that the
> > >> catalog
> > >> > > part
> > >> > > >> > needs more details.
> > >> > > >> > And I would suggest to separate Pulsar Catalog as another
> FLIP.
> > >> IMO,
> > >> > > it
> > >> > > >> has
> > >> > > >> > little to do with source/sink.
> > >> > > >> > Having a separate FLIP can unblock the contribution for sink
> > (or
> > >> > > source)
> > >> > > >> > and keep the discussion more focus.
> > >> > > >> > I also left some comments in the documentation.
> > >> > > >> >
> > >> > > >> > Thanks,
> > >> > > >> > Jark
> > >> > > >> >
> > >> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <
> > >> henry.yijieshen@gmail.com
> > >> > >
> > >> > > >> > wrote:
> > >> > > >> >
> > >> > > >> > > Hi Bowen,
> > >> > > >> > >
> > >> > > >> > > Thanks for your comments. I'll add catalog details as you
> > >> > suggested.
> > >> > > >> > >
> > >> > > >> > > One more question: since we decide to not implement source
> > >> part of
> > >> > > the
> > >> > > >> > > connector at the moment.
> > >> > > >> > > What can users do with a Pulsar catalog?
> > >> > > >> > > Create a table backed by Pulsar and check existing pulsar
> > >> tables
> > >> > to
> > >> > > >> see
> > >> > > >> > > their schemas? Drop tables maybe?
> > >> > > >> > >
> > >> > > >> > > Best,
> > >> > > >> > > Yijie
> > >> > > >> > >
> > >> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <
> > bowenli86@gmail.com>
> > >> > > wrote:
> > >> > > >> > >
> > >> > > >> > > > Hi Yijie,
> > >> > > >> > > >
> > >> > > >> > > > Per the discussion, maybe you can move pulsar source to
> > >> 'future
> > >> > > >> work'
> > >> > > >> > > > section in the FLIP for now?
> > >> > > >> > > >
> > >> > > >> > > > Besides, the FLIP seems to be quite rough at the moment,
> > and
> > >> I'd
> > >> > > >> > > recommend
> > >> > > >> > > > to add more details .
> > >> > > >> > > >
> > >> > > >> > > > A few questions mainly regarding the proposed pulsar
> > catalog.
> > >> > > >> > > >
> > >> > > >> > > >    - Can you provide some background of pulsar schema
> > >> registry
> > >> > and
> > >> > > >> how
> > >> > > >> > it
> > >> > > >> > > >    works?
> > >> > > >> > > >    - The proposed design of pulsar catalog is very vague
> > now,
> > >> > can
> > >> > > >> you
> > >> > > >> > > >    share some details of how a pulsar catalog would work
> > >> > > internally?
> > >> > > >> > E.g.
> > >> > > >> > > >       - which APIs does it support exactly? E.g. I see
> from
> > >> your
> > >> > > >> > > >       prototype that table creation is supported but not
> > >> > > alteration.
> > >> > > >> > > >       - is it going to connect to a pulsar schema
> registry
> > >> via a
> > >> > > >> http
> > >> > > >> > > >       client or a pulsar client, etc
> > >> > > >> > > >       - will it be able to handle multiple versions of
> > >> pulsar,
> > >> > or
> > >> > > >> just
> > >> > > >> > > >       one? How is compatibility handles between different
> > >> > > >> Flink-Pulsar
> > >> > > >> > > versions?
> > >> > > >> > > >       - will it support only reading from pulsar schema
> > >> > registry ,
> > >> > > >> or
> > >> > > >> > > >       both read/write? Will it work end-to-end in Flink
> SQL
> > >> for
> > >> > > >> users
> > >> > > >> > to
> > >> > > >> > > create
> > >> > > >> > > >       and manipulate a pulsar table such as "CREATE
> TABLE t
> > >> WITH
> > >> > > >> > > >       PROPERTIES(type=pulsar)" and "DROP TABLE t"?
> > >> > > >> > > >       - Is a pulsar topic always gonna be a
> non-partitioned
> > >> > table?
> > >> > > >> How
> > >> > > >> > is
> > >> > > >> > > >       a partitioned topic mapped to a Flink table?
> > >> > > >> > > >    - How to map Flink's catalog/database namespace to
> > >> pulsar's
> > >> > > >> > > >    multi-tenant namespaces? I'm not very familiar with
> how
> > >> multi
> > >> > > >> > tenancy
> > >> > > >> > > works
> > >> > > >> > > >    in pulsar, and some background context/use cases may
> > help
> > >> > here
> > >> > > >> too.
> > >> > > >> > > E.g.
> > >> > > >> > > >       - can a pulsar client/consumer/producer be
> > >> multiple-tenant
> > >> > > at
> > >> > > >> the
> > >> > > >> > > >       same time?
> > >> > > >> > > >       - how does authentication work in pulsar's
> > >> multi-tenancy
> > >> > and
> > >> > > >> the
> > >> > > >> > > >       catalog? asking since I didn't see the proposed
> > pulsar
> > >> > > catalog
> > >> > > >> > has
> > >> > > >> > > >       username/password configs
> > >> > > >> > > >       - the FLIP seems propose mapping a pulsar cluster
> and
> > >> > > >> > > >       'tenant/namespace' respectively to Flink's
> 'catalog'
> > >> and
> > >> > > >> > > 'database'. I
> > >> > > >> > > >       wonder whether it totally makes sense, or should we
> > >> > actually
> > >> > > >> map
> > >> > > >> > > "tenant"
> > >> > > >> > > >       to "catalog", and "namespace" to "database"?
> > >> > > >> > > >
> > >> > > >> > > > Cheers,
> > >> > > >> > > > Bowen
> > >> > > >> > > >
> > >> > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <henry.yijieshen@gmail.com>
> > >> > > >> > > > wrote:
> > >> > > >> > > >
> > >> > > >> > > >> Hi everyone,
> > >> > > >> > > >>
> > >> > > >> > > >> Per discussion in the previous thread
> > >> > > >> > > >> <http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html>,
> > >> > > >> > > >> I have created FLIP-72 to kick off a more detailed discussion on the Flink
> > >> > > >> > > >> Pulsar connector:
> > >> > > >> > > >>
> > >> > > >> > > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector
> > >> > > >> > > >>
> > >> > > >> > > >> In short, the connector has the following features:
> > >> > > >> > > >>
> > >> > > >> > > >>    - Pulsar as a streaming source with exactly-once guarantee.
> > >> > > >> > > >>    - Sink streaming results to Pulsar with at-least-once semantics.
> > >> > > >> > > >>    - Build upon Flink new Table API Type system (FLIP-37
> > >> > > >> > > >>    <https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System>),
> > >> > > >> > > >>    and can automatically (de)serialize messages with the help of Pulsar
> > >> > > >> > > >>    schema.
> > >> > > >> > > >>    - Integrate with Flink new Catalog API (FLIP-30
> > >> > > >> > > >>    <https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs>),
> > >> > > >> > > >>    which enables the use of Pulsar topics as tables in Table API as well as
> > >> > > >> > > >>    SQL client.
> > >> > > >> > > >>
> > >> > > >> > > >> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u
> > >> > > >> > > >>
> > >> > > >> > > >>
> > >> > > >> > > >> Would love to hear your thoughts on this.
> > >> > > >> > > >>
> > >> > > >> > > >> Best,
> > >> > > >> > > >> Yijie

Re: [DISCUSS] FLIP-72: Introduce Pulsar Connector

Posted by Yijie Shen <he...@gmail.com>.
Hi Bowen,

I've updated the design doc, PTAL.
Btw the PR for catalog is https://github.com/apache/flink/pull/10455, could
you please take a look?

Best,
Yijie

On Mon, Dec 9, 2019 at 8:44 AM Bowen Li <bo...@gmail.com> wrote:

> Hi Yijie,
>
> I took a look at the design doc. LGTM overall, left a few questions.

Re: [DISCUSS] FLIP-72: Introduce Pulsar Connector

Posted by Bowen Li <bo...@gmail.com>.
Hi Yijie,

I took a look at the design doc. LGTM overall, left a few questions.

On Tue, Dec 3, 2019 at 10:39 PM Becket Qin <be...@gmail.com> wrote:

> Yes, you are absolutely right. Cannot believe I posted in the wrong
> thread...

Re: [DISCUSS] FLIP-72: Introduce Pulsar Connector

Posted by Becket Qin <be...@gmail.com>.
Yes, you are absolutely right. Cannot believe I posted in the wrong
thread...

On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <im...@gmail.com> wrote:

> Thanks Becket for the update,
>
> But shouldn't this message be posted in FLIP-27 discussion thread[1]?
>
>
> Best,
> Jark
>
> [1]:
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html
>
> On Wed, 4 Dec 2019 at 12:12, Becket Qin <be...@gmail.com> wrote:
>
> > Hi all,
> >
> > Sorry for the long belated update. I have updated FLIP-27 wiki page with
> > the latest proposals. Some noticeable changes include:
> > 1. A new generic communication mechanism between SplitEnumerator and
> > SourceReader.
> > 2. Some detail API method signature changes.
> >
> > We left a few things out of this FLIP and will address them in separate
> > FLIPs. Including:
> > 1. Per split event time.
> > 2. Event time alignment.
> > 3. Fine grained failover for SplitEnumerator failure.
> >
> > Please let us know if you have any question.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen <he...@gmail.com>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > I've put the catalog part design in separate doc with more details for
> > > easier communication.
> > >
> > > https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing
> > >
> > > I would love to hear your thoughts on this.
> > >
> > > Best,
> > > Yijie
> > >
> > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <henry.yijieshen@gmail.com>
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Glad to receive your valuable feedback.
> > > >
> > > > I'd first separate the Pulsar catalog as another doc and show more design
> > > > and implementation details there.
> > > >
> > > > For the current FLIP-72, I would separate it into the sink part for
> > > > current work and keep the source part as future work until FLIP-27 is
> > > > finalized.
> > > >
> > > > I have also replied to some of the comments in the design doc. I will
> > > > rewrite the catalog part according to Bowen's advice in both email and
> > > > comments.
> > > >
> > > > Thanks for the help again.
> > > >
> > > > Best,
> > > > Yijie
> > > >
> > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <wa...@gmail.com> wrote:
> > > >
> > > >> Hi Yijie,
> > > >>
> > > >> I also agree with Jark on separating the Catalog part into another FLIP.
> > > >>
> > > >> With FLIP-27[1] also in the air, it is also probably great to split and
> > > >> unblock the sink implementation contribution.
> > > >> I would suggest either putting a detailed implementation plan section in
> > > >> the doc, or (maybe too much separation?) splitting them into different
> > > >> FLIPs. What do you guys think?
> > > >>
> > > >> --
> > > >> Rong
> > > >>
> > > >> [1]
> > > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > > >>
> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <im...@gmail.com> wrote:
> > > >>
> > > >> > Hi Yijie,
> > > >> >
> > > >> > Thanks for the design document. I agree with Bowen that the catalog part
> > > >> > needs more details.
> > > >> > And I would suggest separating the Pulsar Catalog into another FLIP. IMO, it
> > > >> > has little to do with source/sink.
> > > >> > Having a separate FLIP can unblock the contribution for sink (or source)
> > > >> > and keep the discussion more focused.
> > > >> > I also left some comments in the documentation.
> > > >> >
> > > >> > Thanks,
> > > >> > Jark
> > > >> >
> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <henry.yijieshen@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> > > Hi Bowen,
> > > >> > >
> > > >> > > Thanks for your comments. I'll add catalog details as you suggested.
> > > >> > >
> > > >> > > One more question: since we decided not to implement the source part of
> > > >> > > the connector at the moment,
> > > >> > > what can users do with a Pulsar catalog?
> > > >> > > Create a table backed by Pulsar and check existing pulsar tables to see
> > > >> > > their schemas? Drop tables maybe?
> > > >> > >
> > > >> > > Best,
> > > >> > > Yijie
> > > >> > >
> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <bo...@gmail.com>
> > > wrote:
> > > >> > >
> > > >> > > > Hi Yijie,
> > > >> > > >
> > > >> > > > Per the discussion, maybe you can move pulsar source to
> 'future
> > > >> work'
> > > >> > > > section in the FLIP for now?
> > > >> > > >
> > > >> > > > Besides, the FLIP seems to be quite rough at the moment, and
> I'd
> > > >> > > recommend
> > > >> > > > to add more details .
> > > >> > > >
> > > >> > > > A few questions mainly regarding the proposed pulsar catalog.
> > > >> > > >
> > > >> > > >    - Can you provide some background of pulsar schema registry
> > and
> > > >> how
> > > >> > it
> > > >> > > >    works?
> > > >> > > >    - The proposed design of pulsar catalog is very vague now,
> > can
> > > >> you
> > > >> > > >    share some details of how a pulsar catalog would work
> > > internally?
> > > >> > E.g.
> > > >> > > >       - which APIs does it support exactly? E.g. I see from
> your
> > > >> > > >       prototype that table creation is supported but not
> > > alteration.
> > > >> > > >       - is it going to connect to a pulsar schema registry
> via a
> > > >> http
> > > >> > > >       client or a pulsar client, etc
> > > >> > > >       - will it be able to handle multiple versions of pulsar,
> > or
> > > >> just
> > > >> > > >       one? How is compatibility handles between different
> > > >> Flink-Pulsar
> > > >> > > versions?
> > > >> > > >       - will it support only reading from pulsar schema
> > registry ,
> > > >> or
> > > >> > > >       both read/write? Will it work end-to-end in Flink SQL
> for
> > > >> users
> > > >> > to
> > > >> > > create
> > > >> > > >       and manipulate a pulsar table such as "CREATE TABLE t
> WITH
> > > >> > > >       PROPERTIES(type=pulsar)" and "DROP TABLE t"?
> > > >> > > >       - Is a pulsar topic always gonna be a non-partitioned
> > table?
> > > >> How
> > > >> > is
> > > >> > > >       a partitioned topic mapped to a Flink table?
> > > >> > > >    - How to map Flink's catalog/database namespace to pulsar's
> > > >> > > >    multi-tenant namespaces? I'm not very familiar with how
> multi
> > > >> > tenancy
> > > >> > > works
> > > >> > > >    in pulsar, and some background context/use cases may help
> > here
> > > >> too.
> > > >> > > E.g.
> > > >> > > >       - can a pulsar client/consumer/producer be
> multiple-tenant
> > > at
> > > >> the
> > > >> > > >       same time?
> > > >> > > >       - how does authentication work in pulsar's multi-tenancy
> > and
> > > >> the
> > > >> > > >       catalog? asking since I didn't see the proposed pulsar
> > > catalog
> > > >> > has
> > > >> > > >       username/password configs
> > > >> > > >       - the FLIP seems to propose mapping a pulsar cluster and
> > > >> > > >       'tenant/namespace' respectively to Flink's 'catalog' and
> > > >> > > 'database'. I
> > > >> > > >       wonder whether it totally makes sense, or should we
> > actually
> > > >> map
> > > >> > > "tenant"
> > > >> > > >       to "catalog", and "namespace" to "database"?
> > > >> > > >
> > > >> > > > Cheers,
> > > >> > > > Bowen
> > > >> > > >
> > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <
> > > >> henry.yijieshen@gmail.com>
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > >> Hi everyone,
> > > >> > > >>
> > > >> > > >> Per discussion in the previous thread
> > > >> > > >> <
> > > >> > > >>
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html
> > > >> > > >> >,
> > > >> > > >> I have created FLIP-72 to kick off a more detailed discussion
> > on
> > > >> the
> > > >> > > Flink
> > > >> > > >> Pulsar connector:
> > > >> > > >>
> > > >> > > >>
> > > >> > > >>
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector
> > > >> > > >>
> > > >> > > >> In short, the connector has the following features:
> > > >> > > >>
> > > >> > > >>    -
> > > >> > > >>
> > > >> > > >>    Pulsar as a streaming source with exactly-once guarantee.
> > > >> > > >>    -
> > > >> > > >>
> > > >> > > >>    Sink streaming results to Pulsar with at-least-once
> > semantics.
> > > >> > > >>    -
> > > >> > > >>
> > > >> > > >>    Build upon Flink new Table API Type system (FLIP-37
> > > >> > > >>    <
> > > >> > > >>
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System
> > > >> > > >> >
> > > >> > > >>    ), and can automatically (de)serialize messages with the
> > help
> > > of
> > > >> > > Pulsar
> > > >> > > >>    schema.
> > > >> > > >>    -
> > > >> > > >>
> > > >> > > >>    Integrate with Flink new Catalog API (FLIP-30
> > > >> > > >>    <
> > > >> > > >>
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs
> > > >> > > >> >),
> > > >> > > >>    which enables the use of Pulsar topics as tables in Table
> > API
> > > as
> > > >> > well
> > > >> > > >> as
> > > >> > > >>    SQL client.
> > > >> > > >>
> > > >> > > >>
> > > >> > > >>
> > > >> > > >>
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u
> > > >> > > >>
> > > >> > > >>
> > > >> > > >> Would love to hear your thoughts on this.
> > > >> > > >>
> > > >> > > >> Best,
> > > >> > > >> Yijie
> > > >> > > >>
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-72: Introduce Pulsar Connector

Posted by Jark Wu <im...@gmail.com>.
Thanks Becket for the update,

But shouldn't this message be posted in the FLIP-27 discussion thread [1]?


Best,
Jark

[1]:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html

On Wed, 4 Dec 2019 at 12:12, Becket Qin <be...@gmail.com> wrote:

> Hi all,
>
> Sorry for the long-belated update. I have updated the FLIP-27 wiki page with
> the latest proposals. Some notable changes include:
> 1. A new generic communication mechanism between SplitEnumerator and
> SourceReader.
> 2. Some detailed API method signature changes.
>
> We left a few things out of this FLIP and will address them in separate
> FLIPs. Including:
> 1. Per split event time.
> 2. Event time alignment.
> 3. Fine grained failover for SplitEnumerator failure.
>
> Please let us know if you have any questions.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen <he...@gmail.com>
> wrote:
>
> > Hi everyone,
> >
> > I've put the catalog part design in a separate doc with more details for
> > easier communication.
> >
> >
> >
> https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing
> >
> > I would love to hear your thoughts on this.
> >
> > Best,
> > Yijie
> >
> > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <he...@gmail.com>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > Glad to receive your valuable feedback.
> > >
> > > I'd first separate the Pulsar catalog as another doc and show more
> design
> > > and implementation details there.
> > >
> > > For the current FLIP-72, I would separate it into the sink part for
> > > current work and keep the source part as future work until FLIP-27 is
> > > finalized.
> > >
> > > I also replied to some of the comments in the design doc. I will rewrite
> > the
> > > catalog part with regard to Bowen's advice in both email and comments.
> > >
> > > Thanks for the help again.
> > >
> > > Best,
> > > Yijie
> > >
> > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <wa...@gmail.com>
> wrote:
> > >
> > >> Hi Yijie,
> > >>
> > >> I also agree with Jark on separating the Catalog part into another
> FLIP.
> > >>
> > >> With FLIP-27[1] also in the air, it is also probably great to split
> and
> > >> unblock the sink implementation contribution.
> > >> I would suggest either putting in a detailed implementation plan section
> > in
> > >> the doc, or (maybe too much separation?) splitting them into different
> > >> FLIPs. What do you guys think?
> > >>
> > >> --
> > >> Rong
> > >>
> > >> [1]
> > >>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > >>
> > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <im...@gmail.com> wrote:
> > >>
> > >> > Hi Yijie,
> > >> >
> > >> > Thanks for the design document. I agree with Bowen that the catalog
> > part
> > >> > needs more details.
> > >> > And I would suggest separating the Pulsar Catalog into another FLIP. IMO,
> > it
> > >> has
> > >> > little to do with source/sink.
> > >> > Having a separate FLIP can unblock the contribution for sink (or
> > source)
> > >> > and keep the discussion more focused.
> > >> > I also left some comments in the documentation.
> > >> >
> > >> > Thanks,
> > >> > Jark
> > >> >
> > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <henry.yijieshen@gmail.com
> >
> > >> > wrote:
> > >> >
> > >> > > Hi Bowen,
> > >> > >
> > >> > > Thanks for your comments. I'll add catalog details as you
> suggested.
> > >> > >
> > >> > > One more question: since we decided not to implement the source part of
> > the
> > >> > > connector at the moment.
> > >> > > What can users do with a Pulsar catalog?
> > >> > > Create a table backed by Pulsar and check existing pulsar tables
> to
> > >> see
> > >> > > their schemas? Drop tables maybe?
> > >> > >
> > >> > > Best,
> > >> > > Yijie
> > >> > >
> > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <bo...@gmail.com>
> > wrote:
> > >> > >
> > >> > > > Hi Yijie,
> > >> > > >
> > >> > > > Per the discussion, maybe you can move pulsar source to 'future
> > >> work'
> > >> > > > section in the FLIP for now?
> > >> > > >
> > >> > > > Besides, the FLIP seems to be quite rough at the moment, and I'd
> > >> > > recommend
> > >> > > > to add more details.
> > >> > > >
> > >> > > > A few questions mainly regarding the proposed pulsar catalog.
> > >> > > >
> > >> > > >    - Can you provide some background of pulsar schema registry
> and
> > >> how
> > >> > it
> > >> > > >    works?
> > >> > > >    - The proposed design of pulsar catalog is very vague now,
> can
> > >> you
> > >> > > >    share some details of how a pulsar catalog would work
> > internally?
> > >> > E.g.
> > >> > > >       - which APIs does it support exactly? E.g. I see from your
> > >> > > >       prototype that table creation is supported but not
> > alteration.
> > >> > > >       - is it going to connect to a pulsar schema registry via a
> > >> http
> > >> > > >       client or a pulsar client, etc
> > >> > > >       - will it be able to handle multiple versions of pulsar,
> or
> > >> just
> > >> > > >       one? How is compatibility handled between different
> > >> Flink-Pulsar
> > >> > > versions?
> > >> > > >       - will it support only reading from pulsar schema
> registry,
> > >> or
> > >> > > >       both read/write? Will it work end-to-end in Flink SQL for
> > >> users
> > >> > to
> > >> > > create
> > >> > > >       and manipulate a pulsar table such as "CREATE TABLE t WITH
> > >> > > >       PROPERTIES(type=pulsar)" and "DROP TABLE t"?
> > >> > > >       - Is a pulsar topic always gonna be a non-partitioned
> table?
> > >> How
> > >> > is
> > >> > > >       a partitioned topic mapped to a Flink table?
> > >> > > >    - How to map Flink's catalog/database namespace to pulsar's
> > >> > > >    multi-tenant namespaces? I'm not very familiar with how multi
> > >> > tenancy
> > >> > > works
> > >> > > >    in pulsar, and some background context/use cases may help
> here
> > >> too.
> > >> > > E.g.
> > >> > > >       - can a pulsar client/consumer/producer be multiple-tenant
> > at
> > >> the
> > >> > > >       same time?
> > >> > > >       - how does authentication work in pulsar's multi-tenancy
> and
> > >> the
> > >> > > >       catalog? asking since I didn't see the proposed pulsar
> > catalog
> > >> > has
> > >> > > >       username/password configs
> > >> > > >       - the FLIP seems to propose mapping a pulsar cluster and
> > >> > > >       'tenant/namespace' respectively to Flink's 'catalog' and
> > >> > > 'database'. I
> > >> > > >       wonder whether it totally makes sense, or should we
> actually
> > >> map
> > >> > > "tenant"
> > >> > > >       to "catalog", and "namespace" to "database"?
> > >> > > >
> > >> > > > Cheers,
> > >> > > > Bowen
> > >> > > >
> > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <
> > >> henry.yijieshen@gmail.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > >> Hi everyone,
> > >> > > >>
> > >> > > >> Per discussion in the previous thread
> > >> > > >> <
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html
> > >> > > >> >,
> > >> > > >> I have created FLIP-72 to kick off a more detailed discussion
> on
> > >> the
> > >> > > Flink
> > >> > > >> Pulsar connector:
> > >> > > >>
> > >> > > >>
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector
> > >> > > >>
> > >> > > >> In short, the connector has the following features:
> > >> > > >>
> > >> > > >>    -
> > >> > > >>
> > >> > > >>    Pulsar as a streaming source with exactly-once guarantee.
> > >> > > >>    -
> > >> > > >>
> > >> > > >>    Sink streaming results to Pulsar with at-least-once
> semantics.
> > >> > > >>    -
> > >> > > >>
> > >> > > >>    Build upon Flink new Table API Type system (FLIP-37
> > >> > > >>    <
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System
> > >> > > >> >
> > >> > > >>    ), and can automatically (de)serialize messages with the
> help
> > of
> > >> > > Pulsar
> > >> > > >>    schema.
> > >> > > >>    -
> > >> > > >>
> > >> > > >>    Integrate with Flink new Catalog API (FLIP-30
> > >> > > >>    <
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs
> > >> > > >> >),
> > >> > > >>    which enables the use of Pulsar topics as tables in Table
> API
> > as
> > >> > well
> > >> > > >> as
> > >> > > >>    SQL client.
> > >> > > >>
> > >> > > >>
> > >> > > >>
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u
> > >> > > >>
> > >> > > >>
> > >> > > >> Would love to hear your thoughts on this.
> > >> > > >>
> > >> > > >> Best,
> > >> > > >> Yijie
> > >> > > >>
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>