Posted to user@crunch.apache.org by Josh Wills <jo...@gmail.com> on 2012/11/01 02:04:53 UTC

Re: accumulo integration

+crunch-user, to see if we have any lurking accumulo users

Hey Anthony,

I don't think that we have much Accumulo experience yet among the
committers, so I'm hesitant to add a crunch-accumulo subproject w/o having
someone on the team who is dedicated to maintaining it. If you have stuff
you want to open source on github, we would be happy to link to it on the
Crunch homepage (something we should do for crunchR, come to think of it),
and we're all very happy to work together on bug fixes and new features to
support your use cases. Ideally, we would all work together for a while and
get to like working with each other, and then you would join the committers
and own the submodule.

I'll let other folks weigh in, but that's my two cents.

J


On Wed, Oct 31, 2012 at 7:29 AM, Anthony Fox <ad...@ccri.com> wrote:

> Hi all,
>
> I've started exploring Apache Crunch for use in developing some analytics
> on top of the Apache Accumulo column family store.  So far, it looks very
> promising.  I've implemented the source and sink and exposed tables through
> the scrunch repl.  Being able to interactively define and submit map/reduce
> jobs from the repl will make developing new analytics much easier.  There
> are some enhancements that I'll need to put together to support some of my
> analytical workflows.  Much of this effort can be abstracted and applied to
> the HBase support as well.  If this is of interest to anyone, I'd be happy
> to contribute back to the crunch project.  Let me know.
>
> Thanks,
> Anthony
>

Re: accumulo integration

Posted by Matthias Friedrich <ma...@mafr.de>.
Hi,

I fully agree. Accumulo looks cool, but at least I don't have any
experience with it. Besides, the HBase dependency disaster is still
fresh in my mind. 50+ dependencies, with several of them being
incompatible with Hadoop and Crunch and no way to be sure it actually
works.

BTW: Do we have someone on the team who could help us solve the HBase
issues?

Regards,
  Matthias

On Wednesday, 2012-10-31, Josh Wills wrote:
> +crunch-user, to see if we have any lurking accumulo users
> 
> Hey Anthony,
> 
> I don't think that we have much Accumulo experience yet among the
> committers, so I'm hesitant to add a crunch-accumulo subproject w/o having
> someone on the team who is dedicated to maintaining it. If you have stuff
> you want to open source on github, we would be happy to link to it on the
> Crunch homepage (something we should do for crunchR, come to think of it),
> and we're all very happy to work together on bug fixes and new features to
> support your use cases. Ideally, we would all work together for a while and
> get to like working with each other, and then you would join the committers
> and own the submodule.
> 
> I'll let other folks weigh in, but that's my two cents.
> 
> J
> 
> 
> On Wed, Oct 31, 2012 at 7:29 AM, Anthony Fox <ad...@ccri.com> wrote:
> 
> > Hi all,
> >
> > I've started exploring Apache Crunch for use in developing some analytics
> > on top of the Apache Accumulo column family store.  So far, it looks very
> > promising.  I've implemented the source and sink and exposed tables through
> > the scrunch repl.  Being able to interactively define and submit map/reduce
> > jobs from the repl will make developing new analytics much easier.  There
> > are some enhancements that I'll need to put together to support some of my
> > analytical workflows.  Much of this effort can be abstracted and applied to
> > the HBase support as well.  If this is of interest to anyone, I'd be happy
> > to contribute back to the crunch project.  Let me know.
> >
> > Thanks,
> > Anthony
> >
