Posted to user@spark.apache.org by Deenar Toraskar <de...@gmail.com> on 2015/10/20 11:42:57 UTC

Re: can I use Spark as alternative for gem fire cache ?

Kali

>> Can I cache an RDD in memory for a whole day? As far as I know, an RDD will
be emptied once the Spark code finishes executing (correct me if I am wrong).

Spark can definitely be used as a replacement for in-memory databases for
certain use cases. Spark RDDs are not shared amongst contexts. You need a
long-running Spark context and a REST API (see Spark JobServer) or some other
RPC mechanism to allow clients to access information from the cached RDD in
the long-running context.
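To make the long-running-process pattern concrete, here is a minimal pure-Python sketch (not Spark JobServer itself, and no Spark APIs): a long-lived process holds a cached lookup table in memory, with a plain dict standing in for the persisted RDD, and exposes point lookups to clients over HTTP. The route, handler, and sample data are all illustrative assumptions.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# A plain dict stands in for the dimension data that would live in a
# persisted RDD inside the long-running Spark context.
DIM_CACHE = {"GB": "United Kingdom", "US": "United States"}

class LookupHandler(BaseHTTPRequestHandler):
    """Answers GET /lookup?key=... from the in-memory cache."""

    def do_GET(self):
        key = parse_qs(urlparse(self.path).query).get("key", [None])[0]
        body = json.dumps({"key": key, "value": DIM_CACHE.get(key)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def start_server():
    """Start the lookup service on an ephemeral port; returns the server."""
    server = HTTPServer(("127.0.0.1", 0), LookupHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def lookup(key, port):
    """What a client (e.g. the ETL tool) would call over the wire."""
    url = f"http://127.0.0.1:{port}/lookup?key={key}"
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())["value"]
```

In real use, the long-running side would be a Spark context holding persisted RDDs, fronted by something like Spark JobServer rather than this hand-rolled handler.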

Things to note: RDDs are immutable and do not support granular updates or
operations like key-value lookups out of the box (though IndexedRDD
addresses some of these use cases). Spark will not be suitable for all IMDB
use cases. If you are using IMDBs for aggregation and reporting, Spark is a
much better fit; if you are using IMDBs to maintain shared mutable state,
Spark is not designed for those use cases.
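The lookup limitation is easy to picture with a toy model (plain Python, not Spark APIs): a cached key-value RDD behaves like a list of partitions that must be scanned end to end for each key, whereas an IndexedRDD-style structure pays once to build a hash index and then answers point lookups directly. The data below is made up for illustration.

```python
# Toy model of a cached key-value dataset split into partitions.
partitions = [
    [("GB", "United Kingdom"), ("FR", "France")],
    [("US", "United States"), ("DE", "Germany")],
]

def rdd_style_lookup(key):
    """Plain RDD-style lookup: scan every record in every partition."""
    return [v for part in partitions for k, v in part if k == key]

# IndexedRDD-style: build a hash index once, then do O(1) point lookups.
index = {k: v for part in partitions for k, v in part}

print(rdd_style_lookup("US"))  # ['United States']
print(index["US"])             # United States
```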

Hope that helps.

Deenar

On 17 October 2015 at 19:05, Ndjido Ardo Bar <nd...@gmail.com> wrote:

> Hi Kali,
>
> If I understand you well, Tachyon (http://tachyon-project.org) can be a
> good alternative. You can use the Spark API to load and persist data into
> Tachyon.
> Hope that will help.
>
> Ardo
>
> > On 17 Oct 2015, at 15:28, "Kali.tummala@gmail.com" <
> Kali.tummala@gmail.com> wrote:
> >
> > Hi All,
> >
> > Can Spark be used as an alternative to GemFire cache? We use GemFire
> > cache to store (cache) dimension data in memory, which is later used by our
> > custom-made Java ETL tool. Can I do something like the below?
> >
> > Can I cache an RDD in memory for a whole day? As far as I know, an RDD will
> > be emptied once the Spark code finishes executing (correct me if I am wrong).
> >
> > Spark:
> > create an RDD
> > rdd.persist()
> >
> > Thanks
> >
> >
> >
> >
> >
> >
> > --
> > View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/can-I-use-Spark-as-alternative-for-gem-fire-cache-tp25106.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> > For additional commands, e-mail: user-help@spark.apache.org
> >
>
>

Re: can I use Spark as alternative for gem fire cache ?

Posted by Jags Ramnarayanan <jr...@pivotal.io>.
Kali,
   This is possible, depending on the access pattern of your ETL logic. If
you only read (no point mutations) and can pay the additional price of
scanning your dimension data each time you need to look something up, then
Spark could work out. Note that a key-value RDD isn't really a Map
internally: each partition is scanned looking for your key.
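A small plain-Python sketch (illustrative only, not Spark internals) makes that cost visible: finding one key in a partitioned key-value dataset touches every record, because without an index or partitioner there is no way to rule any record out.

```python
def lookup_with_cost(partitions, key):
    """Scan all partitions for `key`, counting how many records are touched."""
    touched, hits = 0, []
    for part in partitions:
        for k, v in part:
            touched += 1
            if k == key:
                hits.append(v)
    return hits, touched

# Four partitions of 100 records each: one lookup examines all 400 records.
parts = [[(i, i * i) for i in range(p * 100, (p + 1) * 100)] for p in range(4)]
hits, touched = lookup_with_cost(parts, 250)
print(hits, touched)  # [62500] 400
```

That per-lookup scan is the price Jags mentions; it is acceptable for occasional lookups over small dimension data, but not for high-frequency point reads.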

By the way, if you are avoiding GemFire because it is a commercial product,
note that it is now also an Apache incubator project, Geode
<http://geode.incubator.apache.org/>.
Geode also has a Spark connector
<https://github.com/apache/incubator-geode/tree/develop/gemfire-spark-connector>
that makes working with GemFire region data (in parallel from each Spark
partition) a breeze: any region is visible as an RDD.

-- Jags
(www.snappydata.io)

On Tue, Oct 20, 2015 at 2:42 AM, Deenar Toraskar <de...@gmail.com>
wrote:

> Kali
>
> >> Can I cache an RDD in memory for a whole day? As far as I know, an RDD will
> be emptied once the Spark code finishes executing (correct me if I am wrong).
>
> Spark can definitely be used as a replacement for in-memory databases for
> certain use cases. Spark RDDs are not shared amongst contexts. You need a
> long-running Spark context and a REST API (see Spark JobServer) or some other
> RPC mechanism to allow clients to access information from the cached RDD in
> the long-running context.
>
> Things to note: RDDs are immutable and do not support granular updates or
> operations like key-value lookups out of the box (though IndexedRDD
> addresses some of these use cases). Spark will not be suitable for all IMDB
> use cases. If you are using IMDBs for aggregation and reporting, Spark is a
> much better fit; if you are using IMDBs to maintain shared mutable state,
> Spark is not designed for those use cases.
>
> Hope that helps.
>
> Deenar
>
> On 17 October 2015 at 19:05, Ndjido Ardo Bar <nd...@gmail.com> wrote:
>
>> Hi Kali,
>>
>> If I understand you well, Tachyon (http://tachyon-project.org) can be a
>> good alternative. You can use the Spark API to load and persist data into
>> Tachyon.
>> Hope that will help.
>>
>> Ardo
>>
>> > On 17 Oct 2015, at 15:28, "Kali.tummala@gmail.com" <
>> Kali.tummala@gmail.com> wrote:
>> >
>> > Hi All,
>> >
>> > Can Spark be used as an alternative to GemFire cache? We use GemFire
>> > cache to store (cache) dimension data in memory, which is later used by
>> > our custom-made Java ETL tool. Can I do something like the below?
>> >
>> > Can I cache an RDD in memory for a whole day? As far as I know, an RDD will
>> > be emptied once the Spark code finishes executing (correct me if I am wrong).
>> >
>> > Spark:
>> > create an RDD
>> > rdd.persist()
>> >
>> > Thanks
>> >
>> >
>> >
>> >
>> >
>> >
>>
>