Posted to user@flink.apache.org by "Christian Krudewig (Corporate Development)" <ch...@dpdhl.com> on 2021/09/27 19:27:30 UTC

How to add a Flink connector to stateful functions

Hello everyone,

 

Currently I'm busy setting up a pipeline with Stateful Functions using a
deployment of the standard Docker image "apache/flink-statefun" to
Kubernetes. It has been going smoothly so far and I love the whole toolset.
But now I want to add egress modules for both OpenSearch (= Elasticsearch
protocol) and ScyllaDB (= Cassandra protocol). The documentation at
https://ci.apache.org/projects/flink/flink-statefun-docs-master/docs/io-module/flink-connectors/
indicates that I can somehow simply plug in the standard Flink DataStream
connectors for Elasticsearch and Cassandra. But I didn't understand how
exactly.

It says "include the dependency in your pom". Does that mean that I need to
build the Stateful Functions Java application and afterwards the Docker
image? That would be a bit unfortunate in terms of long-term maintenance
effort, because I would need to keep my local checkout in sync with the
official repositories and rebuild every now and then. Or can this be
added on top of the existing Docker image by dropping some jar file into
some magic plugin folder?

 

Sorry, I hope this doesn't sound captious. I just want to understand and do
it the right way. Maybe there is also some minimal example? I didn't find
any in the playground, on Stack Overflow, or on the mailing lists.

 

Thanks,

 

Christian Krudewig


Re: How to add a Flink connector to stateful functions

Posted by Seth Wiesman <sj...@gmail.com>.
I just want to add that the StateFun documentation does cover using custom
Flink connectors [1].

[1]
https://nightlies.apache.org/flink/flink-statefun-docs-release-3.1/docs/modules/io/flink-connectors/#flink-connectors



AW: How to add a Flink connector to stateful functions

Posted by "Christian Krudewig (Corporate Development)" <ch...@dpdhl.com>.
Hello Igal,

 

Thanks for replying in detail and also so quickly.

 

It’ll take me some time to try it out, thank you!

 

Best,

 

Christian

 

 

--

Dr. Christian Krudewig
Corporate Development – Data Analytics

Deutsche Post DHL
Headquarters
Charles-de-Gaulle-Str. 20
53113 Bonn
Germany 

Phone: +49 (0) 228 – 189 63389 

christian.krudewig@dpdhl.com


Deutsche Post AG; Registered office Bonn; Register court Bonn; HRB 6792

Board of Management: Dr. Frank Appel, Chairman; Ken Allen, Oskar de Bok, Melanie Kreis, Dr. Tobias Meyer, Dr. Thomas Ogilvie, John Pearson, Tim Scharwath 

Chairman of the Supervisory Board: Dr. Nikolaus von Bornhard

This message is from Deutsche Post AG and may contain confidential business information. It is intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient please contact the sender and delete this message and any attachment from your system. Unauthorized publication, use, dissemination, forwarding, printing or copying of this E-Mail and its attachments is strictly prohibited.

 


Re: How to add a Flink connector to stateful functions

Posted by Igal Shilman <ig...@apache.org>.
Hello Christian,

I'm happy to hear that you are trying out StateFun and like the toolset!

Currently StateFun supports only Kafka/Kinesis egresses "out of the box",
simply because so far folks haven't requested anything else. I can create a
JIRA issue for that and we'll see how the community responds.

Meanwhile, exposing existing Flink connectors as sinks is also possible,
as described in the link you provided.
You can see, for example, how our e2e test does it [1].

The way it works is:
1. You indeed need to create a Java application that depends on the
specific Flink connector that you are using.
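
For step 1, the application's pom would pull in the connector dependencies.
A sketch (the artifactIds, versions, and Scala suffix here are illustrative
assumptions and must be matched to your actual StateFun/Flink release):

```xml
<!-- Illustrative only: align versions with your StateFun/Flink release. -->
<dependencies>
  <!-- StateFun embedded I/O types (egress specs, module SPI);
       already on the classpath of the apache/flink-statefun image -->
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>statefun-flink-io</artifactId>
    <version>3.1.0</version>
    <scope>provided</scope>
  </dependency>
  <!-- The Flink connector you want to expose as an egress -->
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-elasticsearch7_2.12</artifactId>
    <version>1.13.2</version>
  </dependency>
</dependencies>
```

Connector dependencies should be bundled (e.g. shaded) into the module jar,
while StateFun's own classes stay `provided` since the image supplies them.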

2. The application needs to contain a StatefulFunctionModule that binds
this Egress.
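
A minimal, untested sketch of such a module, in the spirit of the e2e test
in [1]. All names here (class, namespace, egress id) are placeholders, and
the PrintSinkFunction is only a stand-in for a real Elasticsearch or
Cassandra sink from the corresponding Flink connector:

```java
import java.util.Map;

import org.apache.flink.statefun.flink.io.datastream.SinkFunctionSpec;
import org.apache.flink.statefun.sdk.EgressIdentifier;
import org.apache.flink.statefun.sdk.spi.StatefulFunctionModule;
import org.apache.flink.streaming.api.functions.sink.PrintSinkFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public final class MyEgressModule implements StatefulFunctionModule {

  // Namespace and name must match the egress id your functions send to.
  private static final EgressIdentifier<String> EGRESS_ID =
      new EgressIdentifier<>("com.example", "my-egress", String.class);

  @Override
  public void configure(Map<String, String> globalConfiguration, Binder binder) {
    // Stand-in sink; swap in e.g. an Elasticsearch or Cassandra sink here.
    SinkFunction<String> sink = new PrintSinkFunction<>();
    binder.bindEgress(new SinkFunctionSpec<>(EGRESS_ID, sink));
  }
}
```

Note that embedded modules are discovered via java.util.ServiceLoader, so
the jar also needs a META-INF/services entry (or an @AutoService annotation,
as the e2e test uses) naming the module class.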

3. Then you build a JAR and start StateFun using the official Docker image
apache/flink-statefun, mounting your module into the modules/ path, for
example:
/opt/statefun/modules/my_module/
Alternatively, you can create your own Docker image that derives from
StateFun and only adds that jar to the modules directory. [2]
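
The derived-image variant can be sketched roughly as follows (the image tag
and jar name are assumptions; compare with the e2e Dockerfile in [2]):

```dockerfile
# Derive from the official image; each subdirectory of
# /opt/statefun/modules is picked up as a module at startup.
FROM apache/flink-statefun:3.1.0
COPY target/my-egress-module.jar /opt/statefun/modules/my_module/
```

The mount-based alternative needs no custom image at all, e.g. a Kubernetes
volume, or locally something like
docker run -v $(pwd)/my-egress-module.jar:/opt/statefun/modules/my_module/my-egress-module.jar apache/flink-statefun:3.1.0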

I hope that it helps,
Igal

[1]
https://github.com/apache/flink-statefun/blob/master/statefun-e2e-tests/statefun-smoke-e2e-driver/src/main/java/org/apache/flink/statefun/e2e/smoke/driver/DriverModule.java#L40

[2]
https://github.com/apache/flink-statefun/blob/master/statefun-e2e-tests/statefun-smoke-e2e-embedded/src/test/resources/Dockerfile#L20


AW: How to add a Flink connector to stateful functions

Posted by "Christian Krudewig (Corporate Development)" <ch...@dpdhl.com>.
Hello Roman,

Well, if that's the way to do it, I can manage to maintain a fork of the statefun repo with these tiny changes. But first my question is whether that is the way it should be done, or whether there is another way to activate these connectors.

Best,

Christian


Re: How to add a Flink connector to stateful functions

Posted by Roman Khachatryan <ro...@apache.org>.
Hi,

> Does that mean that I need to build the stateful functions java application and afterwards the docker image?
Yes, you have to rebuild the application after updating the pom, as
well as its docker image.

Is your concern related to synchronizing local docker images with the
official repo?
If so, wouldn't using a specific statefun image version solve this issue?

Regards,
Roman
