Posted to general@hadoop.apache.org by Saravana Kumar <sa...@gmail.com> on 2011/08/09 11:47:35 UTC

Derby with Hadoop --Why?

Hi

What is the significance of Derby in the Hadoop project?
Why are people using Derby along with Hadoop?

Regards
Saravana Kumar.J

Re: Derby with Hadoop --Why?

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
[CCed general@]

Mike,

What you are describing is a MapReduce application scenario, where the DB is
handled from your MR code; there is nothing special on the Hadoop side.

Thanks.

Alejandro

On Wed, Aug 10, 2011 at 5:34 AM, Segel, Mike <ms...@navteq.com> wrote:

> Arrgh!
> It's been far too many years since I was handed my diploma and kicked off
> campus. :-)
> IMHO it's a bit esoteric for a class room homework assignment.  Maybe an
> interview question used to stump most of the candidates?
>
> The funny thing is that on the walk to work, I started to think of if it
> made sense for a certain subset of m/r problems to use derby as an in memory
> db/local lightweight DB
> Ok and before you say WTF, I'm talking about a subset of problems where
> depending on the input to the Mapper.map() method, you may want to do a
> quick look up against a database which contains lookup data that is static
> and is indexed.
>
> Ok, so maybe I shouldn't read my e-mails before heading off to work... :-)
>
> -Mike
>
>
> -----Original Message-----
> From: Ted Dunning [mailto:tdunning@maprtech.com]
> Sent: Wednesday, August 10, 2011 12:55 AM
> To: general@hadoop.apache.org
> Subject: Re: Derby with Hadoop --Why?
>
> No.  He meant nothing of the kind.  The other explanations expanded on
> this.
>
> This sounds like homework.  If so, I would recommend a bit of reading
> before asking.
>
> On Tue, Aug 9, 2011 at 10:37 PM, Saravana Kumar
> <sa...@gmail.com>wrote:
> [SNIP]
>

RE: Derby with Hadoop --Why?

Posted by "Segel, Mike" <ms...@navteq.com>.
Arrgh!
It's been far too many years since I was handed my diploma and kicked off campus. :-)
IMHO it's a bit esoteric for a classroom homework assignment.  Maybe an interview question used to stump most of the candidates?

The funny thing is that on the walk to work, I started to wonder whether it makes sense for a certain subset of m/r problems to use Derby as an in-memory or local lightweight DB.
OK, and before you say WTF, I'm talking about a subset of problems where, depending on the input to the Mapper.map() method, you may want to do a quick lookup against a database that contains static, indexed lookup data.

Ok, so maybe I shouldn't read my e-mails before heading off to work... :-)

-Mike


-----Original Message-----
From: Ted Dunning [mailto:tdunning@maprtech.com] 
Sent: Wednesday, August 10, 2011 12:55 AM
To: general@hadoop.apache.org
Subject: Re: Derby with Hadoop --Why?

No.  He meant nothing of the kind.  The other explanations expanded on this.

This sounds like homework.  If so, I would recommend a bit of reading before asking.

On Tue, Aug 9, 2011 at 10:37 PM, Saravana Kumar
<sa...@gmail.com>wrote:
[SNIP]


Re: Derby with Hadoop --Why?

Posted by Ted Dunning <td...@maprtech.com>.
No.  He meant nothing of the kind.  The other explanations expanded on this.

This sounds like homework.  If so, I would recommend a bit of reading before
asking.

On Tue, Aug 9, 2011 at 10:37 PM, Saravana Kumar
<sa...@gmail.com>wrote:

> Thanks For the Explanation but needs some clarity as well
>
> Do you mean to say all the Information required to run a map/reduce job is
> effectively stored in derby. It means hadoop(not Ecosystem) uses Derby?
>
> On Tue, Aug 9, 2011 at 5:52 PM, Michael Segel <michael_segel@hotmail.com
> >wrote:
>
> >
> > Derby?
> >
> > First a little history...
> > Derby started out long ago as Cloudscape. Cloudscape was bought by
> > Informix. Informix was bought by IBM. IBM didn't understand Cloudscape
> and
> > decided to open source the project under APL. Hence Derby was born.
> >
> > Derby is an excellent lightweight 100% java database. So when you have a
> > Java framework, using Derby makes a lot of sense. Derby is used to
> persist
> > some environment information and I believe its used in part of some of
> the
> > unit testing.
> >
> > Where Derby has been replaced by MySQL is when someone wanted a
> multi-user
> > database and they were more comfortable with MySQL than they were with
> > Derby. (Hint: Derby can be started as an embedded single user database,
> or
> > as a multi-user database by changing its invocation at startup. ;-)
> >
> > So I would guess the initial reason to go with Derby was that its
> released
> > under APL and there were no licensing issues. ;-)
> >
> >
> > > Date: Tue, 9 Aug 2011 15:17:35 +0530
> > > Subject: Derby with Hadoop --Why?
> > > From: saravana.hadoop@gmail.com
> > > To: general@hadoop.apache.org
> >  >
> > > Hi
> > >
> > > What is the significance of Derby in Hadoop Project.
> > > Why people are using Derby along with Hadoop
> > >
> > > Regards
> > > Saravana Kumar.J
> >
> >
>

Re: Derby with Hadoop --Why?

Posted by Konstantin Boudnik <co...@apache.org>.
Well, for one, Hive uses Derby by default as its metastore.
What makes you think that the Hadoop project is using Derby?

Also, this question seems to belong on common-dev@ (Cc'ed) rather than general@
(Bcc'ed)

Cos

On Wed, Aug 10, 2011 at 11:07AM, Saravana Kumar wrote:
> Thanks For the Explanation but needs some clarity as well
> 
> Do you mean to say all the Information required to run a map/reduce job is
> effectively stored in derby. It means hadoop(not Ecosystem) uses Derby?
> 
> On Tue, Aug 9, 2011 at 5:52 PM, Michael Segel <mi...@hotmail.com>wrote:
> 
> >
> > Derby?
> >
> > First a little history...
> > Derby started out long ago as Cloudscape. Cloudscape was bought by
> > Informix. Informix was bought by IBM. IBM didn't understand Cloudscape and
> > decided to open source the project under APL. Hence Derby was born.
> >
> > Derby is an excellent lightweight 100% java database. So when you have a
> > Java framework, using Derby makes a lot of sense. Derby is used to persist
> > some environment information and I believe its used in part of some of the
> > unit testing.
> >
> > Where Derby has been replaced by MySQL is when someone wanted a multi-user
> > database and they were more comfortable with MySQL than they were with
> > Derby. (Hint: Derby can be started as an embedded single user database, or
> > as a multi-user database by changing its invocation at startup. ;-)
> >
> > So I would guess the initial reason to go with Derby was that its released
> > under APL and there were no licensing issues. ;-)
> >
> >
> > > Date: Tue, 9 Aug 2011 15:17:35 +0530
> > > Subject: Derby with Hadoop --Why?
> > > From: saravana.hadoop@gmail.com
> > > To: general@hadoop.apache.org
> >  >
> > > Hi
> > >
> > > What is the significance of Derby in Hadoop Project.
> > > Why people are using Derby along with Hadoop
> > >
> > > Regards
> > > Saravana Kumar.J
> >
> >

Re: Derby with Hadoop --Why?

Posted by Saravana Kumar <sa...@gmail.com>.
Thanks for the explanation, but I need some clarification.

Do you mean that all the information required to run a map/reduce job is
effectively stored in Derby? Does that mean Hadoop itself (not the ecosystem) uses Derby?

On Tue, Aug 9, 2011 at 5:52 PM, Michael Segel <mi...@hotmail.com>wrote:

>
> Derby?
>
> First a little history...
> Derby started out long ago as Cloudscape. Cloudscape was bought by
> Informix. Informix was bought by IBM. IBM didn't understand Cloudscape and
> decided to open source the project under APL. Hence Derby was born.
>
> Derby is an excellent lightweight 100% java database. So when you have a
> Java framework, using Derby makes a lot of sense. Derby is used to persist
> some environment information and I believe its used in part of some of the
> unit testing.
>
> Where Derby has been replaced by MySQL is when someone wanted a multi-user
> database and they were more comfortable with MySQL than they were with
> Derby. (Hint: Derby can be started as an embedded single user database, or
> as a multi-user database by changing its invocation at startup. ;-)
>
> So I would guess the initial reason to go with Derby was that its released
> under APL and there were no licensing issues. ;-)
>
>
> > Date: Tue, 9 Aug 2011 15:17:35 +0530
> > Subject: Derby with Hadoop --Why?
> > From: saravana.hadoop@gmail.com
> > To: general@hadoop.apache.org
>  >
> > Hi
> >
> > What is the significance of Derby in Hadoop Project.
> > Why people are using Derby along with Hadoop
> >
> > Regards
> > Saravana Kumar.J
>
>

RE: Derby with Hadoop --Why?

Posted by Michael Segel <mi...@hotmail.com>.
Derby?

First a little history...
Derby started out long ago as Cloudscape. Cloudscape was bought by Informix. Informix was bought by IBM. IBM didn't understand Cloudscape and decided to open source the project under the Apache License. Hence Derby was born.

Derby is an excellent lightweight 100% Java database. So when you have a Java framework, using Derby makes a lot of sense. Derby is used to persist some environment information, and I believe it's used in some of the unit testing.

Where Derby has been replaced by MySQL, it is because someone wanted a multi-user database and was more comfortable with MySQL than with Derby. (Hint: Derby can be started as an embedded single-user database or as a multi-user database by changing its invocation at startup. ;-)
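
To make the hint concrete, here is a minimal sketch of the JDBC connection URLs that select each mode. The class and method names (DerbyUrls, embedded, inMemory, networkClient) and the database name "lookup" are made up for illustration. The URL forms themselves are the ones Derby's JDBC drivers accept, and the network form assumes a Derby Network Server has already been started on the given host and port (1527 is the default).

```java
// Illustrative sketch only: DerbyUrls and its methods are hypothetical helpers,
// not part of Derby or Hadoop. They just assemble Derby JDBC connection URLs.
final class DerbyUrls {
    private DerbyUrls() {}

    // Embedded mode: the Derby engine runs inside the calling JVM, and only
    // one JVM at a time can boot the database.
    static String embedded(String dbName) {
        return "jdbc:derby:" + dbName + ";create=true";
    }

    // In-memory variant of embedded mode (Derby 10.5 and later): the database
    // lives on the heap and disappears when the JVM exits.
    static String inMemory(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";create=true";
    }

    // Client/server mode: several JVMs connect over the network to a
    // separately started Derby Network Server.
    static String networkClient(String host, int port, String dbName) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName + ";create=true";
    }

    public static void main(String[] args) {
        System.out.println(embedded("lookup"));
        System.out.println(inMemory("lookup"));
        System.out.println(networkClient("localhost", 1527, "lookup"));
    }
}
```

Any of these strings can be passed to java.sql.DriverManager.getConnection(...) once the matching Derby jar is on the classpath. The application code stays the same; only the URL, and whether a server process is running, changes.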

So I would guess the initial reason to go with Derby was that it's released under the Apache License and there were no licensing issues. ;-)


> Date: Tue, 9 Aug 2011 15:17:35 +0530
> Subject: Derby with Hadoop --Why?
> From: saravana.hadoop@gmail.com
> To: general@hadoop.apache.org
> 
> Hi
> 
> What is the significance of Derby in Hadoop Project.
> Why people are using Derby along with Hadoop
> 
> Regards
> Saravana Kumar.J