Posted to user@spark.apache.org by Rahul Jeevanandam <ra...@incture.com> on 2015/10/09 12:10:59 UTC

Datastore or DB for spark

Hi Guys,

I wanted to know which databases you use with Spark.

-- 
Regards,

*Rahul J*

Re: Datastore or DB for spark

Posted by Deenar Toraskar <de...@gmail.com>.
The choice of datastore is driven by your use case. In fact, Spark can work
with multiple datastores at once. Each datastore is optimised for certain
kinds of data.

e.g. HDFS is great for analytics and large data sets at rest. It is
scalable and very performant, but its files are write-once (you can append,
but not update in place). NoSQL databases support key-value and indexed
lookups, provide granular update semantics, and typically provide eventual
consistency. Relational databases provide stronger transactional guarantees.

So you can pick, choose and mix the storage layers appropriate to the
data at hand, e.g. logs might go to HDFS, product catalogues to Cassandra
and transactions to a relational database. Spark can read from and write to
all of these data sources.
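
To make the mix-and-match idea concrete, here is a minimal Python sketch, not a definitive setup. It assumes the three sources from the example above; every host, path, keyspace and table name is a placeholder, and the `Source` type and `load` helper are made up for illustration. The Cassandra entry assumes the separate spark-cassandra-connector package is on the classpath, and the JDBC entry assumes a PostgreSQL driver purely as an example of a relational database.

```python
# Sketch only: hosts, paths, keyspace and table names below are placeholders.
from typing import NamedTuple, Optional


class Source(NamedTuple):
    fmt: str                      # Spark data source format name
    options: dict                 # options handed to DataFrameReader.options
    path: Optional[str] = None    # only used by file-based sources


SOURCES = {
    # Logs at rest on HDFS, read as plain text files.
    "logs": Source("text", {}, "hdfs://namenode:8020/logs/2015/10/"),
    # Product catalogue in Cassandra (needs the spark-cassandra-connector).
    "catalogue": Source(
        "org.apache.spark.sql.cassandra",
        {"keyspace": "shop", "table": "products"},
    ),
    # Transactions in a relational database, read over JDBC.
    "transactions": Source(
        "jdbc",
        {
            "url": "jdbc:postgresql://dbhost:5432/sales",
            "dbtable": "transactions",
            "driver": "org.postgresql.Driver",
        },
    ),
}


def load(spark, name):
    """Return a DataFrame for the named source.

    `spark` is anything exposing `.read`, e.g. a SparkSession or,
    on Spark 1.x, a SQLContext.
    """
    src = SOURCES[name]
    reader = spark.read.format(src.fmt).options(**src.options)
    return reader.load(src.path) if src.path else reader.load()
```

With a running session, `load(spark, "logs")` and `load(spark, "transactions")` would both come back as DataFrames, so they can be joined or queried together regardless of where the data lives.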

Hope that helps.

On 9 October 2015 at 23:37, Xiao Li <ga...@gmail.com> wrote:

> FYI, in my local environment, Spark is connected to DB2 on z/OS but that
> requires a special JDBC driver.
>
> Xiao Li
>
>
> 2015-10-09 8:38 GMT-07:00 Rahul Jeevanandam <ra...@incture.com>:
>
>> Hi Jörn Franke
>>
>> I was sure that a relational database wouldn't be a good option for Spark.
>> But what about distributed databases like HBase, Cassandra, etc.?
>>
>> On Fri, Oct 9, 2015 at 7:21 PM, Jörn Franke <jo...@gmail.com> wrote:
>>
>>> I am not aware of any empirical evidence, but I think Hadoop (HDFS) as a
>>> datastore for Spark is quite common. With relational databases you usually
>>> do not have so much data and you do not benefit from data locality.
>>>
>>> On Fri, 9 Oct 2015 at 15:16, Rahul Jeevanandam <ra...@incture.com> wrote:
>>>
>>>> I want to know what everyone is using. Which datastore is popular among
>>>> the Spark community?
>>>>
>>>> On Fri, Oct 9, 2015 at 6:16 PM, Ted Yu <yu...@gmail.com> wrote:
>>>>
>>>>> There are connectors for HBase, Cassandra, etc.
>>>>>
>>>>> Which data store do you use now?
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Oct 9, 2015, at 3:10 AM, Rahul Jeevanandam <ra...@incture.com>
>>>>> wrote:
>>>>>
>>>>> Hi Guys,
>>>>>
>>>>> I wanted to know which databases you use with Spark.
>>>>>
>>>>> --
>>>>> Regards,
>>>>>
>>>>> *Rahul J*
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> *Rahul J*
>>>>
>>>
>>
>>
>> --
>> Regards,
>> *Rahul J*
>> Associate Architect – Technology
>> Incture <http://www.incture.com/>
>>
>
>

Re: Datastore or DB for spark

Posted by Xiao Li <ga...@gmail.com>.
FYI, in my local environment, Spark is connected to DB2 on z/OS but that
requires a special JDBC driver.
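
For readers wondering what such a JDBC setup might look like, here is a rough Python sketch. The message above does not say which driver is used; the sketch assumes IBM's JDBC driver class `com.ibm.db2.jcc.DB2Driver` and its `jdbc:db2://host:port/database` URL form. The helper name `db2_jdbc_options` and all argument values are hypothetical.

```python
def db2_jdbc_options(host, port, database, table, user, password):
    """Build the option dict for Spark's built-in "jdbc" data source.

    Assumes IBM's JDBC driver; its jar has to be on Spark's classpath,
    for example via spark-submit's --jars and --driver-class-path.
    """
    return {
        "url": f"jdbc:db2://{host}:{port}/{database}",
        "driver": "com.ibm.db2.jcc.DB2Driver",
        "dbtable": table,
        "user": user,
        "password": password,
    }
```

The resulting dict would be passed as `spark.read.format("jdbc").options(**opts).load()`, where `spark` is a SparkSession (or a SQLContext on Spark 1.x).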

Xiao Li



Re: Datastore or DB for spark

Posted by Rahul Jeevanandam <ra...@incture.com>.
Hi Jörn Franke

I was sure that a relational database wouldn't be a good option for Spark.
But what about distributed databases like HBase, Cassandra, etc.?



-- 
Regards,
*Rahul J*
Associate Architect – Technology
Incture <http://www.incture.com/>

Re: Datastore or DB for spark

Posted by Jörn Franke <jo...@gmail.com>.
I am not aware of any empirical evidence, but I think Hadoop (HDFS) as a
datastore for Spark is quite common. With relational databases you usually
do not have so much data and you do not benefit from data locality.


Re: Datastore or DB for spark

Posted by Rahul Jeevanandam <ra...@incture.com>.
I want to know what everyone is using. Which datastore is popular among
the Spark community?



-- 
Regards,

*Rahul J*

Re: Datastore or DB for spark

Posted by Ted Yu <yu...@gmail.com>.
There are connectors for HBase, Cassandra, etc.

Which data store do you use now?

Cheers
