Posted to user@gora.apache.org by Furkan KAMACI <fu...@gmail.com> on 2015/03/27 18:14:01 UTC

Spark Backend Support for Gora Proposal (GORA-386)

Hi All,

I've submitted a proposal for GORA-386, Spark Backend Support.

As you already know, the Apache Gora open source framework provides an
in-memory data model and persistence for big data. Gora supports persisting
to column stores, key-value stores, document stores and RDBMSs, and
analyzing the data with extensive Apache Hadoop MapReduce support.

On the other hand, Spark is an Apache project advertised as “lightning fast
cluster computing”. It has a thriving open-source community and is the most
active Apache project at the moment.

Apache Gora already has Map/Reduce support. However, there is no generic
abstraction layer that would allow other execution engines to be used in
its place.

In my proposal, I aim to create an abstraction layer and support Spark as a
backend. My goals include a Gora Input Format to RDD transformation, a
generic abstraction layer for backends, and data storage via a newly
developed GoraInputmap. Because Gora will undergo an architectural change,
I plan to test its functionality with the new architecture.

I also have some further plans if I can finish my proposal early. I want to
test how well Hadoop-style Map/Reduce jobs can be mapped into Spark style.
There are some interesting articles about it, e.g. [1]
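
To make the idea concrete, here is a rough sketch in plain Python (no Hadoop
or Spark involved) of how a Hadoop-style mapper/reducer pair could be reused
unchanged inside a Spark-style pipeline of flatMap, groupByKey and flatMap.
All names below (mapper, reducer, flat_map, group_by_key, run_spark_style)
are hypothetical stand-ins for illustration. They are not the Spark or Gora
API, and this is not necessarily how the proposal would implement it.

```python
from collections import defaultdict
from itertools import chain


# Hadoop-style user code: a mapper emits (key, value) pairs for each record,
# and a reducer receives one key together with all of its values.
def mapper(line):
    for word in line.split():
        yield (word.lower(), 1)


def reducer(key, values):
    yield (key, sum(values))


# Spark-style operators written as plain functions over lists.
# Assumption: these only imitate the shape of the corresponding RDD
# transformations; they are not distributed and not the real API.
def flat_map(fn, records):
    return list(chain.from_iterable(fn(r) for r in records))


def group_by_key(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return list(groups.items())


def run_spark_style(records):
    # map phase -> flatMap, shuffle -> groupByKey, reduce phase -> flatMap
    pairs = flat_map(mapper, records)
    grouped = group_by_key(pairs)
    return dict(flat_map(lambda kv: reducer(kv[0], kv[1]), grouped))


counts = run_spark_style(["Gora on Spark", "Spark on Hadoop"])
```

The point of the sketch is only that the mapper and reducer bodies do not
change; what changes is the pipeline that wires them together.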

Kind Regards,
Furkan KAMACI

[1]
http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/

Re: Spark Backend Support for Gora Proposal (GORA-386)

Posted by Lewis John Mcgibbney <le...@gmail.com>.

W00T!!!

-- 
*Lewis*