Posted to dev@hive.apache.org by shirish <sh...@gmail.com> on 2010/04/07 08:58:24 UTC

Hive support to cassandra

Hello,

I am Shirish, a final-year undergraduate student at the Indian Institute of
Information Technology, Allahabad. I am participating in GSoC this year and
am taking
CASSANDRA-913 <https://issues.apache.org/jira/browse/CASSANDRA-913> (Add
Hive Support) as my GSoC project, which deals with adding Hive support
with Cassandra as its back end. I would like to know if anyone could help
me with how to approach this. I have already started reading the wiki and the
presentations/papers.

Thanking you,

Yours sincerely,
Shirish Reddy P
(Student)
Indian Institute Of Information Technology, Allahabad
Mob No. +919987398253

RE: Hive support to cassandra

Posted by John Sichi <js...@facebook.com>.
Just to clarify (since I mentioned HyperTable and Cassandra in that blog post), Facebook's own integration efforts are currently going into Hive+HBase alone, but for the Hive project as a whole, we'd be happy to see storage handlers beyond HBase.  Someone from HyperTable has been working on one and asking questions on hive-user.  At talks I have given, a number of people have expressed interest in Cassandra, but so far I haven't seen anyone take ownership on that after the GSoC project was a no-go.

Each technology has its own pros and cons, which I'll stay out of here, but I will say that I believe Hive can be useful as a scale-out data integration/transformation technology even for stores which are unsuited for data warehousing.

JVS

________________________________________
From: Jeff Hammerbacher [hammer@cloudera.com]
Sent: Wednesday, June 16, 2010 5:44 PM
To: hive-dev@hadoop.apache.org
Subject: Re: Hive support to cassandra

Hey Tom,

I don't want to be rude, but if you're using Cassandra for your data
warehouse environment, you're doing it wrong. HBase is the primary focus for
integration with Hive (see
http://www.cloudera.com/blog/2010/06/integrating-hive-and-hbase/, for
example). Cassandra is a great choice for an OLTP application, but certainly
not for a data warehouse.

Later,
Jeff

On Wed, Jun 16, 2010 at 3:22 PM, tom kersnick <hi...@gmail.com> wrote:

> Quick question for all of you.  Its seems that there is more movement using
> Hive with Hbase rather than Cassandra.  Do you see this changing in the
> near
> future?  I have a client who is interested in using Cassandra due to the
> ease of maintenance.  They are planning on using Cassandra for both their
> data warehouse and OLTP environments.  Thoughts?
>
> I saw this ticket and I wanted to ask.
>
> Thanks in advance.
>
> /tom
>
>
> On Mon, May 3, 2010 at 12:42 PM, Edward Capriolo <edlinuxguru@gmail.com
> >wrote:
>
> > On Thu, Apr 8, 2010 at 1:17 PM, shirish <sh...@gmail.com>
> wrote:
> >
> > > > All,
> > > >
> > > > http://code.google.com/soc/.
> > > >
> > > > It is an interesting thing that Google offers stipends to get open
> > source
> > > > code written. However, last year I was was interested in a project
> that
> > > did
> > > > NOT get accepted into GSOC. It was quite deflating to be not
> > > > accepted/rejected.
> > > >
> > > > Money does make the world go around, and if we all had plenty of
> money
> > we
> > > > would all have more time to write open source code :) But on the
> chance
> > > > your
> > > > application does get rejected consider doing it anyway!
> > > >
> > > > Edward
> > > >
> > >
> > > Definitely Edward, Thanks for the suggestion :)
> > >
> > > shirish
> > >
> >
> > I did not see any cassandra or hive SOC projects....
> > http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010.
> :(
> > So
> > if no one is going to pick this cassandra interface up I will pick it up
> > after I close some pending things ....that is two strikes for me and
> GSOC.
> >
>

Re: Hive support to cassandra

Posted by Jeff Hammerbacher <ha...@cloudera.com>.
Hey Tom,

Well, I was being a bit short, and for that I apologize. To elaborate:
Cassandra was conceived of as a solution for a vastly different problem than
data warehousing, and certain design decisions in the early days were made
in light of the needs of OLTP data management. To the best of my knowledge,
its primary users and contributors have continued that focus. The
integration with Hadoop MapReduce is primarily useful for bulk import and
export, as well as for facilitating data hygiene by making bulk
transformations possible (e.g. recoding a column or enforcing a consistency
constraint in an asynchronous fashion).

More generally, OLTP ("application data management") and data warehousing
("analytical data management") are two very different beasts, and to expect
a single storage system to be optimal for both kinds of workloads is one
place where I feel things went a bit wrong in the RDBMS world. I'm hopeful
that we can avoid some of that confusion with these next generation storage
systems, though the temptation of making both workloads happen in a single
system is likely too large to be avoided. Something like
https://issues.apache.org/jira/browse/HBASE-2357 may be helpful here if you
insist on making both workloads happen in a single system.

In any case, using Hive against an RCFile in HDFS is probably the best way
to go in the short term for the data warehouse, as both the HBase and
Cassandra support in Hive are experimental.

Regards,
Jeff

On Wed, Jun 16, 2010 at 9:14 PM, tom kersnick <hi...@gmail.com> wrote:

> You are not being rude Jeff.  This is a request from the client due to ease
> of use of Cassandra compared to Hbase.  I'm with you on this.  They are
> looking for apples to apples consistency.  Easy migration of data from OLTP
> (Cassandra) to their Data Warehouse (Cassandra?).  Apparently not.  Is it
> possible to migrate from Cassandra to Hbase?  Any documentation on this
> type
> of push to Hbase from Cassandra would be helpful.
>
> Thanks in advance.
>
> /tom
>
>
>
>
>
> On Wed, Jun 16, 2010 at 5:44 PM, Jeff Hammerbacher <hammer@cloudera.com
> >wrote:
>
> > Hey Tom,
> >
> > I don't want to be rude, but if you're using Cassandra for your data
> > warehouse environment, you're doing it wrong. HBase is the primary focus
> > for
> > integration with Hive (see
> > http://www.cloudera.com/blog/2010/06/integrating-hive-and-hbase/,
> > example). Cassandra is a great choice for an OLTP application, but
> > certainly
> > not for a data warehouse.
> >
> > Later,
> > Jeff
> >
> > On Wed, Jun 16, 2010 at 3:22 PM, tom kersnick <hi...@gmail.com>
> wrote:
> >
> > > Quick question for all of you.  Its seems that there is more movement
> > using
> > > Hive with Hbase rather than Cassandra.  Do you see this changing in the
> > > near
> > > future?  I have a client who is interested in using Cassandra due to
> the
> > > ease of maintenance.  They are planning on using Cassandra for both
> their
> > > data warehouse and OLTP environments.  Thoughts?
> > >
> > > I saw this ticket and I wanted to ask.
> > >
> > > Thanks in advance.
> > >
> > > /tom
> > >
> > >
> > > On Mon, May 3, 2010 at 12:42 PM, Edward Capriolo <
> edlinuxguru@gmail.com
> > > >wrote:
> > >
> > > > On Thu, Apr 8, 2010 at 1:17 PM, shirish <sh...@gmail.com>
> > > wrote:
> > > >
> > > > > > All,
> > > > > >
> > > > > > http://code.google.com/soc/.
> > > > > >
> > > > > > It is an interesting thing that Google offers stipends to get
> open
> > > > source
> > > > > > code written. However, last year I was was interested in a
> project
> > > that
> > > > > did
> > > > > > NOT get accepted into GSOC. It was quite deflating to be not
> > > > > > accepted/rejected.
> > > > > >
> > > > > > Money does make the world go around, and if we all had plenty of
> > > money
> > > > we
> > > > > > would all have more time to write open source code :) But on the
> > > chance
> > > > > > your
> > > > > > application does get rejected consider doing it anyway!
> > > > > >
> > > > > > Edward
> > > > > >
> > > > >
> > > > > Definitely Edward, Thanks for the suggestion :)
> > > > >
> > > > > shirish
> > > > >
> > > >
> > > > I did not see any cassandra or hive SOC projects....
> > > >
> http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010.
> > > :(
> > > > So
> > > > if no one is going to pick this cassandra interface up I will pick it
> > up
> > > > after I close some pending things ....that is two strikes for me and
> > > GSOC.
> > > >
> > >
> >
>

Re: Hive support to cassandra

Posted by tom kersnick <hi...@gmail.com>.
I really appreciate the help Edward.

/tom


On Thu, Jun 17, 2010 at 8:04 AM, Edward Capriolo <ed...@gmail.com>wrote:

> On Thu, Jun 17, 2010 at 12:14 AM, tom kersnick <hi...@gmail.com> wrote:
>
> > You are not being rude Jeff.  This is a request from the client due to
> ease
> > of use of Cassandra compared to Hbase.  I'm with you on this.  They are
> > looking for apples to apples consistency.  Easy migration of data from
> OLTP
> > (Cassandra) to their Data Warehouse (Cassandra?).  Apparently not.  Is it
> > possible to migrate from Cassandra to Hbase?  Any documentation on this
> > type
> > of push to Hbase from Cassandra would be helpful.
> >
> > Thanks in advance.
> >
> > /tom
> >
> >
> >
> >
> >
> > On Wed, Jun 16, 2010 at 5:44 PM, Jeff Hammerbacher <hammer@cloudera.com
> > >wrote:
> >
> > > Hey Tom,
> > >
> > > I don't want to be rude, but if you're using Cassandra for your data
> > > warehouse environment, you're doing it wrong. HBase is the primary
> focus
> > > for
> > > integration with Hive (see
> > > http://www.cloudera.com/blog/2010/06/integrating-hive-and-hbase/,
> > > example). Cassandra is a great choice for an OLTP application, but
> > > certainly
> > > not for a data warehouse.
> > >
> > > Later,
> > > Jeff
> > >
> > > On Wed, Jun 16, 2010 at 3:22 PM, tom kersnick <hi...@gmail.com>
> > wrote:
> > >
> > > > Quick question for all of you.  Its seems that there is more movement
> > > using
> > > > Hive with Hbase rather than Cassandra.  Do you see this changing in
> the
> > > > near
> > > > future?  I have a client who is interested in using Cassandra due to
> > the
> > > > ease of maintenance.  They are planning on using Cassandra for both
> > their
> > > > data warehouse and OLTP environments.  Thoughts?
> > > >
> > > > I saw this ticket and I wanted to ask.
> > > >
> > > > Thanks in advance.
> > > >
> > > > /tom
> > > >
> > > >
> > > > On Mon, May 3, 2010 at 12:42 PM, Edward Capriolo <
> > edlinuxguru@gmail.com
> > > > >wrote:
> > > >
> > > > > On Thu, Apr 8, 2010 at 1:17 PM, shirish <sh...@gmail.com>
> > > > wrote:
> > > > >
> > > > > > > All,
> > > > > > >
> > > > > > > http://code.google.com/soc/.
> > > > > > >
> > > > > > > It is an interesting thing that Google offers stipends to get
> > open
> > > > > source
> > > > > > > code written. However, last year I was was interested in a
> > project
> > > > that
> > > > > > did
> > > > > > > NOT get accepted into GSOC. It was quite deflating to be not
> > > > > > > accepted/rejected.
> > > > > > >
> > > > > > > Money does make the world go around, and if we all had plenty
> of
> > > > money
> > > > > we
> > > > > > > would all have more time to write open source code :) But on
> the
> > > > chance
> > > > > > > your
> > > > > > > application does get rejected consider doing it anyway!
> > > > > > >
> > > > > > > Edward
> > > > > > >
> > > > > >
> > > > > > Definitely Edward, Thanks for the suggestion :)
> > > > > >
> > > > > > shirish
> > > > > >
> > > > >
> > > > > I did not see any cassandra or hive SOC projects....
> > > > >
> > http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010.
> > > > :(
> > > > > So
> > > > > if no one is going to pick this cassandra interface up I will pick
> it
> > > up
> > > > > after I close some pending things ....that is two strikes for me
> and
> > > > GSOC.
> > > > >
> > > >
> > >
> >
>
> I am currently in the process of writing a cassandra storage handler to
> match the Hbase one. Ill open a ticket for it. I was looking to tackle it
> after the Hive Variables ticket I am working on.
>
> Edward
>

Re: Hive support to cassandra

Posted by Edward Capriolo <ed...@gmail.com>.
On Thu, Jun 17, 2010 at 12:14 AM, tom kersnick <hi...@gmail.com> wrote:

> You are not being rude Jeff.  This is a request from the client due to ease
> of use of Cassandra compared to Hbase.  I'm with you on this.  They are
> looking for apples to apples consistency.  Easy migration of data from OLTP
> (Cassandra) to their Data Warehouse (Cassandra?).  Apparently not.  Is it
> possible to migrate from Cassandra to Hbase?  Any documentation on this
> type
> of push to Hbase from Cassandra would be helpful.
>
> Thanks in advance.
>
> /tom
>
>
>
>
>
> On Wed, Jun 16, 2010 at 5:44 PM, Jeff Hammerbacher <hammer@cloudera.com
> >wrote:
>
> > Hey Tom,
> >
> > I don't want to be rude, but if you're using Cassandra for your data
> > warehouse environment, you're doing it wrong. HBase is the primary focus
> > for
> > integration with Hive (see
> > http://www.cloudera.com/blog/2010/06/integrating-hive-and-hbase/,
> > example). Cassandra is a great choice for an OLTP application, but
> > certainly
> > not for a data warehouse.
> >
> > Later,
> > Jeff
> >
> > On Wed, Jun 16, 2010 at 3:22 PM, tom kersnick <hi...@gmail.com>
> wrote:
> >
> > > Quick question for all of you.  Its seems that there is more movement
> > using
> > > Hive with Hbase rather than Cassandra.  Do you see this changing in the
> > > near
> > > future?  I have a client who is interested in using Cassandra due to
> the
> > > ease of maintenance.  They are planning on using Cassandra for both
> their
> > > data warehouse and OLTP environments.  Thoughts?
> > >
> > > I saw this ticket and I wanted to ask.
> > >
> > > Thanks in advance.
> > >
> > > /tom
> > >
> > >
> > > On Mon, May 3, 2010 at 12:42 PM, Edward Capriolo <
> edlinuxguru@gmail.com
> > > >wrote:
> > >
> > > > On Thu, Apr 8, 2010 at 1:17 PM, shirish <sh...@gmail.com>
> > > wrote:
> > > >
> > > > > > All,
> > > > > >
> > > > > > http://code.google.com/soc/.
> > > > > >
> > > > > > It is an interesting thing that Google offers stipends to get
> open
> > > > source
> > > > > > code written. However, last year I was was interested in a
> project
> > > that
> > > > > did
> > > > > > NOT get accepted into GSOC. It was quite deflating to be not
> > > > > > accepted/rejected.
> > > > > >
> > > > > > Money does make the world go around, and if we all had plenty of
> > > money
> > > > we
> > > > > > would all have more time to write open source code :) But on the
> > > chance
> > > > > > your
> > > > > > application does get rejected consider doing it anyway!
> > > > > >
> > > > > > Edward
> > > > > >
> > > > >
> > > > > Definitely Edward, Thanks for the suggestion :)
> > > > >
> > > > > shirish
> > > > >
> > > >
> > > > I did not see any cassandra or hive SOC projects....
> > > >
> http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010.
> > > :(
> > > > So
> > > > if no one is going to pick this cassandra interface up I will pick it
> > up
> > > > after I close some pending things ....that is two strikes for me and
> > > GSOC.
> > > >
> > >
> >
>

I am currently in the process of writing a Cassandra storage handler to
match the HBase one. I'll open a ticket for it. I was looking to tackle it
after the Hive Variables ticket I am working on.

Edward

Re: Hive support to cassandra

Posted by tom kersnick <hi...@gmail.com>.
You are not being rude, Jeff.  This is a request from the client due to the
ease of use of Cassandra compared to HBase.  I'm with you on this.  They are
looking for apples-to-apples consistency: easy migration of data from OLTP
(Cassandra) to their data warehouse (Cassandra?).  Apparently not.  Is it
possible to migrate from Cassandra to HBase?  Any documentation on this type
of push to HBase from Cassandra would be helpful.

Thanks in advance.

/tom





On Wed, Jun 16, 2010 at 5:44 PM, Jeff Hammerbacher <ha...@cloudera.com>wrote:

> Hey Tom,
>
> I don't want to be rude, but if you're using Cassandra for your data
> warehouse environment, you're doing it wrong. HBase is the primary focus
> for
> integration with Hive (see
> http://www.cloudera.com/blog/2010/06/integrating-hive-and-hbase/,
> example). Cassandra is a great choice for an OLTP application, but
> certainly
> not for a data warehouse.
>
> Later,
> Jeff
>
> On Wed, Jun 16, 2010 at 3:22 PM, tom kersnick <hi...@gmail.com> wrote:
>
> > Quick question for all of you.  Its seems that there is more movement
> using
> > Hive with Hbase rather than Cassandra.  Do you see this changing in the
> > near
> > future?  I have a client who is interested in using Cassandra due to the
> > ease of maintenance.  They are planning on using Cassandra for both their
> > data warehouse and OLTP environments.  Thoughts?
> >
> > I saw this ticket and I wanted to ask.
> >
> > Thanks in advance.
> >
> > /tom
> >
> >
> > On Mon, May 3, 2010 at 12:42 PM, Edward Capriolo <edlinuxguru@gmail.com
> > >wrote:
> >
> > > On Thu, Apr 8, 2010 at 1:17 PM, shirish <sh...@gmail.com>
> > wrote:
> > >
> > > > > All,
> > > > >
> > > > > http://code.google.com/soc/.
> > > > >
> > > > > It is an interesting thing that Google offers stipends to get open
> > > source
> > > > > code written. However, last year I was was interested in a project
> > that
> > > > did
> > > > > NOT get accepted into GSOC. It was quite deflating to be not
> > > > > accepted/rejected.
> > > > >
> > > > > Money does make the world go around, and if we all had plenty of
> > money
> > > we
> > > > > would all have more time to write open source code :) But on the
> > chance
> > > > > your
> > > > > application does get rejected consider doing it anyway!
> > > > >
> > > > > Edward
> > > > >
> > > >
> > > > Definitely Edward, Thanks for the suggestion :)
> > > >
> > > > shirish
> > > >
> > >
> > > I did not see any cassandra or hive SOC projects....
> > > http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010.
> > :(
> > > So
> > > if no one is going to pick this cassandra interface up I will pick it
> up
> > > after I close some pending things ....that is two strikes for me and
> > GSOC.
> > >
> >
>

Re: Hive support to cassandra

Posted by Jeff Hammerbacher <ha...@cloudera.com>.
Hey Tom,

I don't want to be rude, but if you're using Cassandra for your data
warehouse environment, you're doing it wrong. HBase is the primary focus for
integration with Hive (see
http://www.cloudera.com/blog/2010/06/integrating-hive-and-hbase/, for
example). Cassandra is a great choice for an OLTP application, but certainly
not for a data warehouse.

Later,
Jeff

On Wed, Jun 16, 2010 at 3:22 PM, tom kersnick <hi...@gmail.com> wrote:

> Quick question for all of you.  Its seems that there is more movement using
> Hive with Hbase rather than Cassandra.  Do you see this changing in the
> near
> future?  I have a client who is interested in using Cassandra due to the
> ease of maintenance.  They are planning on using Cassandra for both their
> data warehouse and OLTP environments.  Thoughts?
>
> I saw this ticket and I wanted to ask.
>
> Thanks in advance.
>
> /tom
>
>
> On Mon, May 3, 2010 at 12:42 PM, Edward Capriolo <edlinuxguru@gmail.com
> >wrote:
>
> > On Thu, Apr 8, 2010 at 1:17 PM, shirish <sh...@gmail.com>
> wrote:
> >
> > > > All,
> > > >
> > > > http://code.google.com/soc/.
> > > >
> > > > It is an interesting thing that Google offers stipends to get open
> > source
> > > > code written. However, last year I was was interested in a project
> that
> > > did
> > > > NOT get accepted into GSOC. It was quite deflating to be not
> > > > accepted/rejected.
> > > >
> > > > Money does make the world go around, and if we all had plenty of
> money
> > we
> > > > would all have more time to write open source code :) But on the
> chance
> > > > your
> > > > application does get rejected consider doing it anyway!
> > > >
> > > > Edward
> > > >
> > >
> > > Definitely Edward, Thanks for the suggestion :)
> > >
> > > shirish
> > >
> >
> > I did not see any cassandra or hive SOC projects....
> > http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010.
> :(
> > So
> > if no one is going to pick this cassandra interface up I will pick it up
> > after I close some pending things ....that is two strikes for me and
> GSOC.
> >
>

Re: Hive support to cassandra

Posted by tom kersnick <hi...@gmail.com>.
Quick question for all of you.  It seems that there is more movement using
Hive with HBase than with Cassandra.  Do you see this changing in the near
future?  I have a client who is interested in using Cassandra due to the
ease of maintenance.  They are planning on using Cassandra for both their
data warehouse and OLTP environments.  Thoughts?

I saw this ticket and I wanted to ask.

Thanks in advance.

/tom


On Mon, May 3, 2010 at 12:42 PM, Edward Capriolo <ed...@gmail.com>wrote:

> On Thu, Apr 8, 2010 at 1:17 PM, shirish <sh...@gmail.com> wrote:
>
> > > All,
> > >
> > > http://code.google.com/soc/.
> > >
> > > It is an interesting thing that Google offers stipends to get open
> source
> > > code written. However, last year I was was interested in a project that
> > did
> > > NOT get accepted into GSOC. It was quite deflating to be not
> > > accepted/rejected.
> > >
> > > Money does make the world go around, and if we all had plenty of money
> we
> > > would all have more time to write open source code :) But on the chance
> > > your
> > > application does get rejected consider doing it anyway!
> > >
> > > Edward
> > >
> >
> > Definitely Edward, Thanks for the suggestion :)
> >
> > shirish
> >
>
> I did not see any cassandra or hive SOC projects....
> http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010. :(
> So
> if no one is going to pick this cassandra interface up I will pick it up
> after I close some pending things ....that is two strikes for me and GSOC.
>

Re: Hive support to cassandra

Posted by Edward Capriolo <ed...@gmail.com>.
On Thu, Apr 8, 2010 at 1:17 PM, shirish <sh...@gmail.com> wrote:

> > All,
> >
> > http://code.google.com/soc/.
> >
> > It is an interesting thing that Google offers stipends to get open source
> > code written. However, last year I was was interested in a project that
> did
> > NOT get accepted into GSOC. It was quite deflating to be not
> > accepted/rejected.
> >
> > Money does make the world go around, and if we all had plenty of money we
> > would all have more time to write open source code :) But on the chance
> > your
> > application does get rejected consider doing it anyway!
> >
> > Edward
> >
>
> Definitely Edward, Thanks for the suggestion :)
>
> shirish
>

I did not see any Cassandra or Hive SoC projects at
http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010. :( So
if no one is going to pick this Cassandra interface up, I will pick it up
after I close some pending things. That is two strikes for me and GSoC.

Re: Hive support to cassandra

Posted by shirish <sh...@gmail.com>.
> All,
>
> http://code.google.com/soc/.
>
> It is an interesting thing that Google offers stipends to get open source
> code written. However, last year I was was interested in a project that did
> NOT get accepted into GSOC. It was quite deflating to be not
> accepted/rejected.
>
> Money does make the world go around, and if we all had plenty of money we
> would all have more time to write open source code :) But on the chance
> your
> application does get rejected consider doing it anyway!
>
> Edward
>

Definitely, Edward. Thanks for the suggestion :)

shirish

Re: Hive support to cassandra

Posted by Edward Capriolo <ed...@gmail.com>.
On Thu, Apr 8, 2010 at 1:19 AM, shirish <sh...@gmail.com> wrote:

> On Wed, Apr 7, 2010 at 7:42 PM, Edward Capriolo <edlinuxguru@gmail.com
> >wrote:
>
> > On Wed, Apr 7, 2010 at 2:58 AM, shirish <sh...@gmail.com>
> wrote:
> >
> > > Hello,
> > >
> > > I am shirish, a final year undergraduate student from Indian Institute
> of
> > > Information Technology, Allahabad. I am participating in GSOC this year
> > and
> > > I am taking the
> > > cassandra-913<https://issues.apache.org/jira/browse/CASSANDRA-913>(Add
> > > Hive Support) as a gsoc project, which deals with adding hive support
> > > with cassandra as its back end. I would like to know if any one could
> > help
> > > me on how to approach this. I have already started reading the wiki,
> and
> > > the
> > > presentations/papers.
> > >
> > > Thanking you,
> > >
> > > Yours sincerely,
> > > Shirish Reddy P
> > > (Student)
> > > Indian Institute Of Information Technology, Allahabad
> > > Mob No. +919987398253
> > >
> >
> > Shirish,
> >
> > I was looking to do this as well. Since cassandra 6.0 has an input
> format.
> > org.apache.cassandra.hadoop.ColumnFamilyInputFormat.class
> >
> > Without looking into the code much, We should be able to pretty much copy
> > the work in trunk that uses HBase an an input and port it to cassandra.
> >
> > Let me know if you want to tag team this one.
> > Edward
> >
>
> Hello Edward,
>
> Thanks for the reply. Since the project is under gsoc I don't know that the
> rules would allow in teaming up and working on a single issue. Thank you
> for
> your support Edward.
>
> shirish.
>


All,

http://code.google.com/soc/.

It is an interesting thing that Google offers stipends to get open source
code written. However, last year I was interested in a project that did
NOT get accepted into GSoC. It was quite deflating to be rejected.

Money does make the world go around, and if we all had plenty of money we
would all have more time to write open source code :) But on the chance your
application does get rejected, consider doing it anyway!

Edward

Re: Hive support to cassandra

Posted by shirish <sh...@gmail.com>.
On Wed, Apr 7, 2010 at 7:42 PM, Edward Capriolo <ed...@gmail.com>wrote:

> On Wed, Apr 7, 2010 at 2:58 AM, shirish <sh...@gmail.com> wrote:
>
> > Hello,
> >
> > I am shirish, a final year undergraduate student from Indian Institute of
> > Information Technology, Allahabad. I am participating in GSOC this year
> and
> > I am taking the
> > cassandra-913<https://issues.apache.org/jira/browse/CASSANDRA-913>(Add
> > Hive Support) as a gsoc project, which deals with adding hive support
> > with cassandra as its back end. I would like to know if any one could
> help
> > me on how to approach this. I have already started reading the wiki, and
> > the
> > presentations/papers.
> >
> > Thanking you,
> >
> > Yours sincerely,
> > Shirish Reddy P
> > (Student)
> > Indian Institute Of Information Technology, Allahabad
> > Mob No. +919987398253
> >
>
> Shirish,
>
> I was looking to do this as well. Since cassandra 6.0 has an input format.
> org.apache.cassandra.hadoop.ColumnFamilyInputFormat.class
>
> Without looking into the code much, We should be able to pretty much copy
> the work in trunk that uses HBase an an input and port it to cassandra.
>
> Let me know if you want to tag team this one.
> Edward
>

Hello Edward,

Thanks for the reply. Since the project is under GSoC, I don't know whether
the rules would allow teaming up and working on a single issue. Thank you for
your support, Edward.

shirish.

RE: Hive support to cassandra

Posted by John Sichi <js...@facebook.com>.
If you end up needing help understanding the HBase part, just ping me.

JVS
________________________________________
From: Edward Capriolo [edlinuxguru@gmail.com]
Sent: Wednesday, April 07, 2010 7:12 AM
To: hive-dev@hadoop.apache.org
Subject: Re: Hive support to cassandra

On Wed, Apr 7, 2010 at 2:58 AM, shirish <sh...@gmail.com> wrote:

> Hello,
>
> I am shirish, a final year undergraduate student from Indian Institute of
> Information Technology, Allahabad. I am participating in GSOC this year and
> I am taking the
> cassandra-913<https://issues.apache.org/jira/browse/CASSANDRA-913>(Add
> Hive Support) as a gsoc project, which deals with adding hive support
> with cassandra as its back end. I would like to know if any one could help
> me on how to approach this. I have already started reading the wiki, and
> the
> presentations/papers.
>
> Thanking you,
>
> Yours sincerely,
> Shirish Reddy P
> (Student)
> Indian Institute Of Information Technology, Allahabad
> Mob No. +919987398253
>

Shirish,

I was looking to do this as well, since Cassandra 0.6 has an input format:
org.apache.cassandra.hadoop.ColumnFamilyInputFormat.class

Without looking into the code much, we should be able to pretty much copy
the work in trunk that uses HBase as an input and port it to Cassandra.

Let me know if you want to tag team this one.
Edward

Re: Hive support to cassandra

Posted by Edward Capriolo <ed...@gmail.com>.
On Wed, Apr 7, 2010 at 2:58 AM, shirish <sh...@gmail.com> wrote:

> Hello,
>
> I am shirish, a final year undergraduate student from Indian Institute of
> Information Technology, Allahabad. I am participating in GSOC this year and
> I am taking the
> cassandra-913<https://issues.apache.org/jira/browse/CASSANDRA-913>(Add
> Hive Support) as a gsoc project, which deals with adding hive support
> with cassandra as its back end. I would like to know if any one could help
> me on how to approach this. I have already started reading the wiki, and
> the
> presentations/papers.
>
> Thanking you,
>
> Yours sincerely,
> Shirish Reddy P
> (Student)
> Indian Institute Of Information Technology, Allahabad
> Mob No. +919987398253
>

Shirish,

I was looking to do this as well, since Cassandra 0.6 has an input format:
org.apache.cassandra.hadoop.ColumnFamilyInputFormat.class

Without looking into the code much, we should be able to pretty much copy
the work in trunk that uses HBase as an input and port it to Cassandra.

Let me know if you want to tag team this one.
Edward