Posted to dev@bahir.apache.org by "Christian Kadner (JIRA)" <ji...@apache.org> on 2017/09/20 21:06:00 UTC

[jira] [Comment Edited] (BAHIR-135) Add Spark Streaming Hazelcast Extension

    [ https://issues.apache.org/jira/browse/BAHIR-135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173822#comment-16173822 ] 

Christian Kadner edited comment on BAHIR-135 at 9/20/17 9:05 PM:
-----------------------------------------------------------------

[~erenavsarogullari] - First off, thank you for your proposal! I have a few questions for you.

1. How does your proposed connector differ from this [Spark Connector for Hazelcast|https://github.com/hazelcast/hazelcast-spark], which is already available as {{[hazelcast-spark|https://mvnrepository.com/artifact/com.hazelcast/hazelcast-spark]}} from Maven? Is the difference {{DataSource}} vs. {{DStream}}?

2. Following up on (1): would it make sense to contribute your code to that existing connector, since it appears to be officially supported by [hazelcast.org|https://hazelcast.org/plugins/?type=big-data]?

3. If you think your Streaming extension should exist in a separate connector, I suggest you create a PR in Bahir marked as WIP and solicit feedback/review from Hazelcast community members who may be interested in using Hazelcast with Spark (you may find people on the Spark or Hazelcast mailing lists, forums, Stack Overflow threads, ...). For example, viktor@hazelcast.com has this [slide deck|https://static.rainfocus.com/oracle/oraclecode17/sess/1485452090137001XkeN/PF/In-memory_Analytics_with_Apache_Spark_and_Hazelcast_-_OracleCode_-_03-01-2017.pdf] on Hazelcast and Spark.


> Add Spark Streaming Hazelcast Extension
> ---------------------------------------
>
>                 Key: BAHIR-135
>                 URL: https://issues.apache.org/jira/browse/BAHIR-135
>             Project: Bahir
>          Issue Type: New Feature
>          Components: Spark Streaming Connectors
>            Reporter: Eren Avsarogullari
>
> I would like to propose a Spark Streaming Hazelcast extension.
> Hazelcast is an in-memory data grid (IMDG) solution under the Apache 2 License that provides distributed data structures such as distributed map, list, set, queue, etc. When an entry is _added_, _updated_, _removed_ or _evicted_, Hazelcast fires a new event. This flow is almost the same for all of the above distributed data structures. This extension aims to subscribe to these distributed events via Hazelcast Event Listeners and create a DStream from the distributed data structure changes. This extension supports Distributed Map, List, Set, Queue, Topic, MultiMap and Replicated Map.
> Please find the following documentation for further details.
> *Proposal:* [https://docs.google.com/document/d/1YN_9u72Wv699g8ivM3c8K_zZUbUl73JtquWy-g71Tm4/edit?usp=sharing]
> The repo is also ready for review. It covers the implementation, full unit test coverage, and examples.
> *Repo:* [https://github.com/erenavsarogullari/bahir/tree/Hazelcast_Streaming]
> This extension can be useful for both the Spark and Hazelcast communities to listen to these Hazelcast events, analyze them, and transform the event payloads via Spark.
> Please let me know if you need further details. All feedback is welcome.
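
The flow described in the issue (a data structure fires an event on each change, a listener subscribes to those events, and the events are grouped into the batches a DStream is made of) can be illustrated with a small stand-alone sketch. This is a simulation only and makes no claim about how the proposed extension is implemented: it uses neither Hazelcast nor Spark, and every name in it (EntryEvent, ObservableMap, EventReceiver, run_demo) is hypothetical. Eviction is not simulated.

```python
import queue
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class EntryEvent:
    """Hypothetical stand-in for a data-grid entry event (not a Hazelcast class)."""
    event_type: str  # "ADDED", "UPDATED" or "REMOVED" in this sketch
    key: Any
    value: Any


class ObservableMap:
    """Toy in-process map that fires an event on every change (not Hazelcast)."""

    def __init__(self) -> None:
        self._data: Dict[Any, Any] = {}
        self._listeners: List[Callable[[EntryEvent], None]] = []

    def add_entry_listener(self, listener: Callable[[EntryEvent], None]) -> None:
        self._listeners.append(listener)

    def _fire(self, event: EntryEvent) -> None:
        for listener in self._listeners:
            listener(event)

    def put(self, key: Any, value: Any) -> None:
        # An existing key means an update, otherwise an add.
        event_type = "UPDATED" if key in self._data else "ADDED"
        self._data[key] = value
        self._fire(EntryEvent(event_type, key, value))

    def remove(self, key: Any) -> None:
        if key in self._data:
            value = self._data.pop(key)
            self._fire(EntryEvent("REMOVED", key, value))


class EventReceiver:
    """Buffers events pushed by a listener; each drain() returns one micro-batch."""

    def __init__(self, source: ObservableMap) -> None:
        self._buffer: "queue.Queue[EntryEvent]" = queue.Queue()
        # The listener simply enqueues every event it is handed.
        source.add_entry_listener(self._buffer.put)

    def drain(self) -> List[EntryEvent]:
        batch: List[EntryEvent] = []
        while True:
            try:
                batch.append(self._buffer.get_nowait())
            except queue.Empty:
                return batch


def run_demo() -> Tuple[List[EntryEvent], List[EntryEvent]]:
    grid_map = ObservableMap()
    receiver = EventReceiver(grid_map)
    grid_map.put("a", 1)      # fires ADDED
    grid_map.put("a", 2)      # fires UPDATED
    first_batch = receiver.drain()
    grid_map.remove("a")      # fires REMOVED
    second_batch = receiver.drain()
    return first_batch, second_batch


if __name__ == "__main__":
    for i, batch in enumerate(run_demo(), start=1):
        print(f"batch {i}:", [(e.event_type, e.key, e.value) for e in batch])
```

In this sketch each call to drain() plays the role of one batch interval. In a real connector the listener would instead be registered on a Hazelcast structure and the buffered events handed to Spark Streaming; how the linked repo does that is not shown here.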


