Posted to dev@ignite.apache.org by Robert Metzger <rm...@apache.org> on 2016/04/04 16:35:57 UTC

Re: Apache Flink <=> Apache Ignite integration

Hi Raul,

Thanks a lot for reaching out to the Flink community.
I'm really excited to see a Flink connector in Ignite. If you feel that the
connector would be more suitable for our "connector library", feel free to
open a JIRA issue and a pull request.

Were there requests in the Ignite community to have an integration with
Flink?



On Thu, Mar 31, 2016 at 5:20 PM, Saikat Maitra <sa...@gmail.com>
wrote:

> Hi ,
>
> I agree with Roman and Raul.
> https://issues.apache.org/jira/browse/IGNITE-813 allows injecting data
> into a cache via the Data Streamer. Integrating with Ignite FileSystem as a
> source and sink will allow for a bidirectional connector. It will also allow
> easier implementation of DataStream transformations over Ignite FileSystem.
>
> Regards
> Saikat
>
> On Thu, Mar 31, 2016 at 2:44 PM, Aljoscha Krettek <al...@apache.org>
> wrote:
>
> > Hi,
> > it should already be possible to use the Ignite FileSystem to store state
> > since we just use the HDFS FileSystem interface for that. Of course, one
> > would have to properly set up the jars and paths and everything for Flink
> > to pick up the IGFS classes.
> >
> > Cheers,
> > Aljoscha
> >
> > On Wed, 30 Mar 2016 at 16:50 Raul Kripalani <ra...@apache.org> wrote:
> >
> > > On Wed, Mar 30, 2016 at 2:20 PM, Roman <rs...@yahoo.com.invalid>
> > wrote:
> > >
> > > > Raul,
> > > >
> > > > Small comment from me.
> > > >
> > > > >* As a Flink sink => inject data directly into a cache via a
> > > DataStreamer.
> > > > After reviews, IGNITE-813 is exactly this functionality.
> > > >
> > > >
> > > That's cool, Roman! The idea would be to host these (richer) modules as
> > > Flink connectors, like they do with others:
> > >
> > > https://github.com/apache/flink/tree/master/flink-streaming-connectors
> > > https://github.com/apache/flink/tree/master/flink-batch-connectors
> > >
> >
>

Re: Apache Flink <=> Apache Ignite integration

Posted by Anton Vinogradov <av...@gridgain.com>.

Hi All,
I'll review it in the near future.

On Tue, Jul 19, 2016 at 11:53 AM, Denis Magda <dm...@gridgain.com> wrote:

> Hi Saikat,
>
> Thanks for this contribution.
>
> *Anton V.*, please review the following contribution.
>
> —
> Denis
>
> On Jul 16, 2016, at 11:09 PM, Saikat Maitra <sa...@gmail.com>
> wrote:
>
> Hi
>
> I have raised a PR for the following scope.
>
> As a Flink source => run a continuous query against one or multiple
> caches
>
> PR https://github.com/apache/ignite/pull/870
> Jira https://issues.apache.org/jira/browse/IGNITE-3303
>
> Please review and share feedback.
>
> Regards
> Saikat
>
>
> On Mon, Apr 4, 2016 at 8:24 PM, Stephan Ewen <se...@apache.org> wrote:
>
> Hi!
>
>  - Sounds like the having Ignite for snapshots should work pretty much out
> of the box (via the IGFS)
>  - The source and sink connector sounds like the next logical step. Does
> Ignite have a notion of stream partitions and offsets, to build a
> consistent replay around? This should probably have its dedicated issue and
> discussion thread.
>
>  - For Ignite as an execution backend - I am not sure how relevant and
> feasible that is. Many DataStream API features make use of the specific
> Flink runtime. For streaming, the runtime is not as decoupled as for batch.
>  - I think the parameter server integration would not be part of the Flink
> codebase - this is a pretty application specific thing that should be its
> own project and it is actually not tightly coupled to Flink.
>
> Greetings,
> Stephan
>
>
> On Mon, Apr 4, 2016 at 4:35 PM, Robert Metzger <rm...@apache.org>
> wrote:
>
> Hi Raul,
>
> thanks a lot for reaching out to the Flink community.
> I'm really excited to see a Flink connector in Ignite. If you feel that the
> connector would be more suitable for our "connector library" feel free to
> open a JIRA and open a pull request.
>
> Were there requests in the Ignite community to have an integration with
> Flink?
>
>
>
> On Thu, Mar 31, 2016 at 5:20 PM, Saikat Maitra <sa...@gmail.com>
> wrote:
>
> Hi ,
>
> I agree with Roman and Raul.
> https://issues.apache.org/jira/browse/IGNITE-813 allows injecting data to
> into cache via Data Streamer. Integrating with Ignite FileSystem for source
> and sink will allow for bidirectional connector. It will also allow easier
> implementation for DataStream transformations over Ignite FileSystem.
>
> Regards
> Saikat
>
> On Thu, Mar 31, 2016 at 2:44 PM, Aljoscha Krettek <aljoscha@apache.org>
> wrote:
>
> Hi,
> it should already be possible to use the Ignite FileSystem to store state
> since we just use the HDFS FileSystem interface for that. Of course, one
> would have to properly set up the jars and paths and everything for Flink
> to pick up the IGFS classes.
>
> Cheers,
> Aljoscha
>
> On Wed, 30 Mar 2016 at 16:50 Raul Kripalani <ra...@apache.org> wrote:
>
> On Wed, Mar 30, 2016 at 2:20 PM, Roman <rs...@yahoo.com.invalid> wrote:
>
> Raul,
>
> Small comment from me.
>
> * As a Flink sink => inject data directly into a cache via a DataStreamer.
>
> After reviews, IGNITE-813 is exactly this functionality.
>
> That's cool, Roman! The idea would be to host these (richer) modules as
> Flink connectors, like they do with others:
>
> https://github.com/apache/flink/tree/master/flink-streaming-connectors
> https://github.com/apache/flink/tree/master/flink-batch-connectors
>

Re: Apache Flink <=> Apache Ignite integration

Posted by Denis Magda <dm...@gridgain.com>.

Hi Saikat,

Thanks for this contribution.

Anton V., please review the following contribution.

—
Denis

> On Jul 16, 2016, at 11:09 PM, Saikat Maitra <sa...@gmail.com> wrote:
> 
> Hi
> 
> I have raised a PR for the following scope.
> 
> As a Flink source => run a continuous query against one or multiple
> caches
> 
> PR https://github.com/apache/ignite/pull/870
> Jira https://issues.apache.org/jira/browse/IGNITE-3303
> 
> Please review and share feedback.
> 
> Regards
> Saikat
> 
> 
> On Mon, Apr 4, 2016 at 8:24 PM, Stephan Ewen <se...@apache.org> wrote:
> 
>> Hi!
>> 
>>  - Sounds like the having Ignite for snapshots should work pretty much out
>> of the box (via the IGFS)
>>  - The source and sink connector sounds like the next logical step. Does
>> Ignite have a notion of stream partitions and offsets, to build a
>> consistent replay around? This should probably have its dedicated issue and
>> discussion thread.
>> 
>>  - For Ignite as an execution backend - I am not sure how relevant and
>> feasible that is. Many DataStream API features make use of the specific
>> Flink runtime. For streaming, the runtime is not as decoupled as for batch.
>>  - I think the parameter server integration would not be part of the Flink
>> codebase - this is a pretty application specific thing that should be its
>> own project and it is actually not tightly coupled to Flink.
>> 
>> Greetings,
>> Stephan
>> 
>> 
>> On Mon, Apr 4, 2016 at 4:35 PM, Robert Metzger <rm...@apache.org>
>> wrote:
>> 
>>> Hi Raul,
>>> 
>>> thanks a lot for reaching out to the Flink community.
>>> I'm really excited to see a Flink connector in Ignite. If you feel that
>> the
>>> connector would be more suitable for our "connector library" feel free to
>>> open a JIRA and open a pull request.
>>> 
>>> Were there requests in the Ignite community to have an integration with
>>> Flink?
>>> 
>>> 
>>> 
>>> On Thu, Mar 31, 2016 at 5:20 PM, Saikat Maitra <sa...@gmail.com>
>>> wrote:
>>> 
>>>> Hi ,
>>>> 
>>>> I agree with Roman and Raul.
>>>> https://issues.apache.org/jira/browse/IGNITE-813 allows injecting data
>>> to
>>>> into cache via Data Streamer. Integrating with Ignite FileSystem for
>>> source
>>>> and sink will allow for bidirectional connector. It will also allow
>>> easier
>>>> implementation for DataStream transformations over Ignite FileSystem.
>>>> 
>>>> Regards
>>>> Saikat
>>>> 
>>>> On Thu, Mar 31, 2016 at 2:44 PM, Aljoscha Krettek <aljoscha@apache.org
>>> 
>>>> wrote:
>>>> 
>>>>> Hi,
>>>>> it should already be possible to use the Ignite FileSystem to store
>>> state
>>>>> since we just use the HDFS FileSystem interface for that. Of course,
>>> one
>>>>> would have to properly set up the jars and paths and everything for
>>> Flink
>>>>> to pick up the IGFS classes.
>>>>> 
>>>>> Cheers,
>>>>> Aljoscha
>>>>> 
>>>>> On Wed, 30 Mar 2016 at 16:50 Raul Kripalani <ra...@apache.org>
>> wrote:
>>>>> 
>>>>>> On Wed, Mar 30, 2016 at 2:20 PM, Roman <rs...@yahoo.com.invalid>
>>>>> wrote:
>>>>>> 
>>>>>>> Raul,
>>>>>>> 
>>>>>>> Small comment from me.
>>>>>>> 
>>>>>>>> * As a Flink sink => inject data directly into a cache via a
>>>>>> DataStreamer.
>>>>>>> After reviews, IGNITE-813 is exactly this functionality.
>>>>>>> 
>>>>>>> 
>>>>>> That's cool, Roman! The idea would be to host these (richer)
>> modules
>>> as
>>>>>> Flink connectors, like they do with others:
>>>>>> 
>>>>>> 
>>> https://github.com/apache/flink/tree/master/flink-streaming-connectors
>>>>>> https://github.com/apache/flink/tree/master/flink-batch-connectors
>>>>>> 
>>>>> 
>>>> 
>>> 
>>

Re: Apache Flink <=> Apache Ignite integration

Posted by Saikat Maitra <sa...@gmail.com>.

Hi

I have raised a PR for the following scope.

As a Flink source => run a continuous query against one or multiple
caches

PR https://github.com/apache/ignite/pull/870
Jira https://issues.apache.org/jira/browse/IGNITE-3303

Please review and share feedback.

Regards
Saikat


On Mon, Apr 4, 2016 at 8:24 PM, Stephan Ewen <se...@apache.org> wrote:

> Hi!
>
>   - Sounds like the having Ignite for snapshots should work pretty much out
> of the box (via the IGFS)
>   - The source and sink connector sounds like the next logical step. Does
> Ignite have a notion of stream partitions and offsets, to build a
> consistent replay around? This should probably have its dedicated issue and
> discussion thread.
>
>   - For Ignite as an execution backend - I am not sure how relevant and
> feasible that is. Many DataStream API features make use of the specific
> Flink runtime. For streaming, the runtime is not as decoupled as for batch.
>   - I think the parameter server integration would not be part of the Flink
> codebase - this is a pretty application specific thing that should be its
> own project and it is actually not tightly coupled to Flink.
>
> Greetings,
> Stephan
>
>
> On Mon, Apr 4, 2016 at 4:35 PM, Robert Metzger <rm...@apache.org>
> wrote:
>
> > Hi Raul,
> >
> > thanks a lot for reaching out to the Flink community.
> > I'm really excited to see a Flink connector in Ignite. If you feel that
> the
> > connector would be more suitable for our "connector library" feel free to
> > open a JIRA and open a pull request.
> >
> > Were there requests in the Ignite community to have an integration with
> > Flink?
> >
> >
> >
> > On Thu, Mar 31, 2016 at 5:20 PM, Saikat Maitra <sa...@gmail.com>
> > wrote:
> >
> > > Hi ,
> > >
> > > I agree with Roman and Raul.
> > > https://issues.apache.org/jira/browse/IGNITE-813 allows injecting data
> > to
> > > into cache via Data Streamer. Integrating with Ignite FileSystem for
> > source
> > > and sink will allow for bidirectional connector. It will also allow
> > easier
> > > implementation for DataStream transformations over Ignite FileSystem.
> > >
> > > Regards
> > > Saikat
> > >
> > > On Thu, Mar 31, 2016 at 2:44 PM, Aljoscha Krettek <aljoscha@apache.org
> >
> > > wrote:
> > >
> > > > Hi,
> > > > it should already be possible to use the Ignite FileSystem to store
> > state
> > > > since we just use the HDFS FileSystem interface for that. Of course,
> > one
> > > > would have to properly set up the jars and paths and everything for
> > Flink
> > > > to pick up the IGFS classes.
> > > >
> > > > Cheers,
> > > > Aljoscha
> > > >
> > > > On Wed, 30 Mar 2016 at 16:50 Raul Kripalani <ra...@apache.org>
> wrote:
> > > >
> > > > > On Wed, Mar 30, 2016 at 2:20 PM, Roman <rs...@yahoo.com.invalid>
> > > > wrote:
> > > > >
> > > > > > Raul,
> > > > > >
> > > > > > Small comment from me.
> > > > > >
> > > > > > >* As a Flink sink => inject data directly into a cache via a
> > > > > DataStreamer.
> > > > > > After reviews, IGNITE-813 is exactly this functionality.
> > > > > >
> > > > > >
> > > > > That's cool, Roman! The idea would be to host these (richer)
> modules
> > as
> > > > > Flink connectors, like they do with others:
> > > > >
> > > > >
> > https://github.com/apache/flink/tree/master/flink-streaming-connectors
> > > > > https://github.com/apache/flink/tree/master/flink-batch-connectors
> > > > >
> > > >
> > >
> >
>

Re: Apache Flink <=> Apache Ignite integration

Posted by Stephan Ewen <se...@apache.org>.

Hi Raul!

Concerning the source connector and position marker: Great idea!
The FlinkKafkaConsumer uses pretty much the same trick - the
offset-per-partition is used to filter during replays.
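
To make the offset-per-partition trick concrete, here is a minimal sketch in
plain Java. It is not Flink's actual consumer code: the class name
OffsetReplayFilter and its methods are hypothetical, and it only assumes that
each record carries a partition id and a monotonically increasing offset.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: drops already-processed records during a replay. */
public class OffsetReplayFilter {
    /** Last offset processed per partition, as restored from a checkpoint. */
    private final Map<Integer, Long> lastOffsets = new HashMap<>();

    public OffsetReplayFilter(Map<Integer, Long> restoredOffsets) {
        lastOffsets.putAll(restoredOffsets);
    }

    /** Returns true if the record is new, and advances that partition's offset. */
    public boolean accept(int partition, long offset) {
        long last = lastOffsets.getOrDefault(partition, -1L);
        if (offset <= last) {
            return false; // already covered by the checkpoint: filter out on replay
        }
        lastOffsets.put(partition, offset);
        return true;
    }

    /** Copy of the current offsets, i.e. what a checkpoint would store. */
    public Map<Integer, Long> snapshot() {
        return new HashMap<>(lastOffsets);
    }

    public static void main(String[] args) {
        Map<Integer, Long> restored = new HashMap<>();
        restored.put(0, 41L); // partition 0 was processed up to offset 41
        OffsetReplayFilter filter = new OffsetReplayFilter(restored);
        System.out.println(filter.accept(0, 41L)); // false: replayed record
        System.out.println(filter.accept(0, 42L)); // true: first new record
        System.out.println(filter.accept(1, 0L));  // true: unseen partition
    }
}
```

The point of the design is that the source may re-deliver records after a
failure, and the restored offsets decide which of them are passed downstream.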

Greetings,
Stephan



On Tue, Apr 5, 2016 at 2:24 AM, Raul Kripalani <ra...@apache.org> wrote:

> On Mon, Apr 4, 2016 at 3:54 PM, Stephan Ewen <se...@apache.org> wrote:
>
> >
> >   - Sounds like the having Ignite for snapshots should work pretty much
> > out
> > of the box (via the IGFS)
> >   - The source and sink connector sounds like the next logical step. Does
> > Ignite have a notion of stream partitions and offsets, to build a
> > consistent replay around? This should probably have its dedicated issue
> and
> > discussion thread.
> >
> >   - For Ignite as an execution backend - I am not sure how relevant and
> > feasible that is. Many DataStream API features make use of the specific
> > Flink runtime. For streaming, the runtime is not as decoupled as for
> > batch.
> >   - I think the parameter server integration would not be part of the
> > Flink
> > codebase - this is a pretty application specific thing that should be its
> > own project and it is actually not tightly coupled to Flink.
>
>
> Danke, Stephan! I think I'll start with the sink/source connector – reusing
> what's already been committed to our codebase.
>
> With regards to source replayability, I plan to integrate Ignite Continuous
> Queries as a source. If the user's data objects contain an indexed
> ascending numeric or datetime field, we could use such a field as a
> "position marker" by launching the query with the appropriate WHERE filter
> when a replay is demanded.
>
> Do you have similar use cases with existing connectors?
>
> Cheers,
>
> *Raúl Kripalani*
> PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
> Messaging Engineer
> http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> Blog: raul.io
> <http://raul.io/?utm_source=email&utm_medium=email&utm_campaign=apache> |
> twitter: @raulvk <https://twitter.com/raulvk>
>

Re: Apache Flink <=> Apache Ignite integration

Posted by Raul Kripalani <ra...@apache.org>.

On Mon, Apr 4, 2016 at 3:54 PM, Stephan Ewen <se...@apache.org> wrote:

>
>   - Sounds like the having Ignite for snapshots should work pretty much
> out
> of the box (via the IGFS)
>   - The source and sink connector sounds like the next logical step. Does
> Ignite have a notion of stream partitions and offsets, to build a
> consistent replay around? This should probably have its dedicated issue and
> discussion thread.
>
>   - For Ignite as an execution backend - I am not sure how relevant and
> feasible that is. Many DataStream API features make use of the specific
> Flink runtime. For streaming, the runtime is not as decoupled as for
> batch.
>   - I think the parameter server integration would not be part of the
> Flink
> codebase - this is a pretty application specific thing that should be its
> own project and it is actually not tightly coupled to Flink.

Thanks, Stephan! I think I'll start with the sink/source connector – reusing
what's already been committed to our codebase.

With regards to source replayability, I plan to integrate Ignite Continuous
Queries as a source. If the user's data objects contain an indexed
ascending numeric or datetime field, we could use such a field as a
"position marker" by launching the query with the appropriate WHERE filter
when a replay is demanded.
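
A rough sketch of the position-marker idea, in plain Java with no Ignite or
Flink dependencies. Everything here is an assumption for illustration: the
class PositionMarkerSource, the Event type and its ascending seq field are
hypothetical, and the in-memory predicate merely stands in for a WHERE filter
such as "seq > ?" on the real query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: a source that replays from a stored position marker. */
public class PositionMarkerSource {
    /** Hypothetical data object with an ascending numeric field. */
    public static final class Event {
        public final long seq;
        public final String payload;

        public Event(long seq, String payload) {
            this.seq = seq;
            this.payload = payload;
        }
    }

    /** Highest seq emitted so far; -1 means nothing has been emitted. */
    private long marker = -1L;

    /** Stand-in for the WHERE filter: only events after the marker qualify. */
    private Predicate<Event> filterAfter(long from) {
        return e -> e.seq > from;
    }

    /** Emits the payloads of all events newer than the marker and advances it. */
    public List<String> poll(List<Event> cache) {
        Predicate<Event> filter = filterAfter(marker);
        List<String> out = new ArrayList<>();
        for (Event e : cache) {
            if (filter.test(e)) {
                out.add(e.payload);
                marker = Math.max(marker, e.seq);
            }
        }
        return out;
    }

    /** Value to store in a checkpoint. */
    public long checkpoint() {
        return marker;
    }

    /** Rewinds the source to a previously stored marker before a replay. */
    public void restore(long storedMarker) {
        marker = storedMarker;
    }

    public static void main(String[] args) {
        List<Event> cache = Arrays.asList(
            new Event(1L, "a"), new Event(2L, "b"), new Event(3L, "c"));
        PositionMarkerSource source = new PositionMarkerSource();
        System.out.println(source.poll(cache)); // [a, b, c]
        source.restore(1L);                     // replay demanded from marker 1
        System.out.println(source.poll(cache)); // [b, c]
    }
}
```

The sketch only works if the marker field never decreases for newly written
entries, which is why an indexed ascending numeric or datetime field is needed.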

Do you have similar use cases with existing connectors?

Cheers,

*Raúl Kripalani*
PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
Messaging Engineer
http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
Blog: raul.io
<http://raul.io/?utm_source=email&utm_medium=email&utm_campaign=apache> |
twitter: @raulvk <https://twitter.com/raulvk>

Re: Apache Flink <=> Apache Ignite integration

Posted by Stephan Ewen <se...@apache.org>.

Hi!

  - Sounds like having Ignite for snapshots should work pretty much out
of the box (via the IGFS)
  - The source and sink connectors sound like the next logical step. Does
Ignite have a notion of stream partitions and offsets, to build a
consistent replay around? This should probably have its own dedicated issue
and discussion thread.

  - For Ignite as an execution backend - I am not sure how relevant and
feasible that is. Many DataStream API features make use of the specific
Flink runtime. For streaming, the runtime is not as decoupled as for batch.
  - I think the parameter server integration would not be part of the Flink
codebase - this is a pretty application-specific thing that should be its
own project and it is actually not tightly coupled to Flink.

Greetings,
Stephan


On Mon, Apr 4, 2016 at 4:35 PM, Robert Metzger <rm...@apache.org> wrote:

> Hi Raul,
>
> thanks a lot for reaching out to the Flink community.
> I'm really excited to see a Flink connector in Ignite. If you feel that the
> connector would be more suitable for our "connector library" feel free to
> open a JIRA and open a pull request.
>
> Were there requests in the Ignite community to have an integration with
> Flink?
>
>
>
> On Thu, Mar 31, 2016 at 5:20 PM, Saikat Maitra <sa...@gmail.com>
> wrote:
>
> > Hi ,
> >
> > I agree with Roman and Raul.
> > https://issues.apache.org/jira/browse/IGNITE-813 allows injecting data
> to
> > into cache via Data Streamer. Integrating with Ignite FileSystem for
> source
> > and sink will allow for bidirectional connector. It will also allow
> easier
> > implementation for DataStream transformations over Ignite FileSystem.
> >
> > Regards
> > Saikat
> >
> > On Thu, Mar 31, 2016 at 2:44 PM, Aljoscha Krettek <al...@apache.org>
> > wrote:
> >
> > > Hi,
> > > it should already be possible to use the Ignite FileSystem to store
> state
> > > since we just use the HDFS FileSystem interface for that. Of course,
> one
> > > would have to properly set up the jars and paths and everything for
> Flink
> > > to pick up the IGFS classes.
> > >
> > > Cheers,
> > > Aljoscha
> > >
> > > On Wed, 30 Mar 2016 at 16:50 Raul Kripalani <ra...@apache.org> wrote:
> > >
> > > > On Wed, Mar 30, 2016 at 2:20 PM, Roman <rs...@yahoo.com.invalid>
> > > wrote:
> > > >
> > > > > Raul,
> > > > >
> > > > > Small comment from me.
> > > > >
> > > > > >* As a Flink sink => inject data directly into a cache via a
> > > > DataStreamer.
> > > > > After reviews, IGNITE-813 is exactly this functionality.
> > > > >
> > > > >
> > > > That's cool, Roman! The idea would be to host these (richer) modules
> as
> > > > Flink connectors, like they do with others:
> > > >
> > > >
> https://github.com/apache/flink/tree/master/flink-streaming-connectors
> > > > https://github.com/apache/flink/tree/master/flink-batch-connectors
> > > >
> > >
> >
>

Re: Apache Flink <=> Apache Ignite integration

Posted by Raul Kripalani <ra...@apache.org>.

On Mon, Apr 4, 2016 at 3:35 PM, Robert Metzger <rm...@apache.org> wrote:

> thanks a lot for reaching out to the Flink community.
> I'm really excited to see a Flink connector in Ignite. If you feel that
> the
> connector would be more suitable for our "connector library" feel free to
> open a JIRA and open a pull request.
>

Will do.

> Were there requests in the Ignite community to have an integration with
> Flink?
>

Actually, there's a little story behind this. I'm personally interested in
reactive programming and I was keen on developing RxJava semantics for
Ignite, e.g. to consider DataStreamers as Observables and to apply
operators e.g. join, debounce, etc. In that exploration, Flink came up as a
synergistic project to integrate with and hence this thread.

Not as exciting as saying: "dude, we had 1000s of requests for this from
users" :) But once it's there, I'm pretty sure people will use it.

Cheers,

*Raúl Kripalani*
PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
Messaging Engineer
http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
Blog: raul.io
<http://raul.io/?utm_source=email&utm_medium=email&utm_campaign=apache> |
twitter: @raulvk <https://twitter.com/raulvk>