Posted to dev@accumulo.apache.org by Sean Busbey <bu...@cloudera.com> on 2014/02/28 19:53:48 UTC

Re: Cascading Accumulo

-users
+dev

Would we be better off linking to them from the main page?

For one, we don't know that Anagha wants their code moved from github into
a repo controlled by our PMC.

As a practical matter for non-committers, they'd have to start going
through our review. Though I believe we could grant commit rights to just
the repo their project would be in[1], I don't know that this would add
much practical value over maintaining the code in its current location.


[1]: http://accumulo.apache.org/contrib.html


On Fri, Feb 28, 2014 at 1:12 PM, Arshak Navruzyan <ar...@gmail.com> wrote:

> Garry,
>
> Thanks for putting this up.
>
> Committers, perhaps we should start vetting these (another one was the
> node.js connector that was posted in the last few days) and posting them up
> under the contrib section.  I can spend some time to verify the
> functionality, make sure they are adequately documented, and push them to the
> accumulo github and the contrib section.
>
> Arshak
>
>
> On Fri, Feb 28, 2014 at 4:21 PM, Garry Steedman <gs...@talkthree.com> wrote:
>
>> Hello all,
>>
>> my colleague Anagha Khanolkar has developed some useful Cascading
>> (2.5.2) extensions for Accumulo (1.5.0).
>>
>> Source is available here: https://github.com/airawat/cascading.accumulo
>>
>> cheers,
>>
>> Garry
>>
>
>

Re: Cascading Accumulo

Posted by David Medinets <da...@gmail.com>.
+1 for linking.


On Fri, Feb 28, 2014 at 8:13 PM, Arshak Navruzyan <ar...@gmail.com> wrote:

> I like the single ecosystem page approach.
>
> If I am new to Accumulo and visit this page, I would expect to see a list
> like this under the heading "What are Accumulo related projects I should
> care about?"
>
> - Non-Java client libraries (C++, Python, Scala ...)
> - Event processing interfaces (Storm ...)
> - Querying (Hive, Pig ...)
> - Data pipelines (Cascading/Scalding, Flume)
> - Serialization (Avro, Protobuf, Thrift)
>
> I am sure that there are many more; however, as a new user, something like
> this seems like "table stakes".

Re: Cascading Accumulo

Posted by Josh Elser <jo...@gmail.com>.
Until it becomes untenable to manage/maintain, I'd be in favor of
keeping a page somewhere on the Accumulo site where people can
request to be listed.

Happy to keep these links up to date and process any requests.

On 2/28/14, 8:13 PM, Arshak Navruzyan wrote:
> I like the single ecosystem page approach.
>
> If I am new to Accumulo and visit this page, I would expect to see a list
> like this under the heading "What are Accumulo related projects I should
> care about?"
>
> - Non-Java client libraries (C++, Python, Scala ...)
> - Event processing interfaces (Storm ...)
> - Querying (Hive, Pig ...)
> - Data pipelines (Cascading/Scalding, Flume)
> - Serialization (Avro, Protobuf, Thrift)
>
> I am sure that there are many more; however, as a new user, something like
> this seems like "table stakes".

Re: Cascading Accumulo

Posted by Arshak Navruzyan <ar...@gmail.com>.
I like the single ecosystem page approach.

If I am new to Accumulo and visit this page, I would expect to see a list
like this under the heading "What are Accumulo related projects I should
care about?"

- Non-Java client libraries (C++, Python, Scala ...)
- Event processing interfaces (Storm ...)
- Querying (Hive, Pig ...)
- Data pipelines (Cascading/Scalding, Flume)
- Serialization (Avro, Protobuf, Thrift)

I am sure that there are many more; however, as a new user, something like
this seems like "table stakes".


On Fri, Feb 28, 2014 at 8:39 PM, Christopher <ct...@apache.org> wrote:

> I agree with both points.
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii

Re: Cascading Accumulo

Posted by Christopher <ct...@apache.org>.
I agree with both points.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Fri, Feb 28, 2014 at 2:34 PM, Sean Busbey <bu...@cloudera.com> wrote:
> On Fri, Feb 28, 2014 at 2:16 PM, Arshak Navruzyan <ar...@gmail.com> wrote:
>
>> Sean,
>>
>> I was just using the Cascading code as an example.  Not suggesting we
>> hijack Anagha's project :)
>>
>> I think there is a benefit to having a "standard" set of contribs that have
>> been reviewed / tested.  Something similar to this:
>> http://docs.mongodb.org/ecosystem/
>>
>> I'm concerned that letting new users go hunting for contribs on their own
>> (which may or may not work, may or may not be documented) gives the
>> Accumulo project a little bit of an incomplete feeling.  Obviously it's not
>> possible to take on the responsibility to moderate every possible
>> contribution but some basic/obvious ones would be helpful.
>>
>> Arshak
>>
>>
>>
> I definitely agree on the importance of a curated set of related outside
> projects. I'd much rather we not have links that answer the question "How
> do I search github for Accumulo projects?" since that doesn't really answer
> the question of "What are Accumulo related projects I should care about?"
>
> Maybe we shouldn't continue to separate "contrib projects" in Accumulo and
> "ecosystem projects" that are hosted and run elsewhere to the extent we
> currently do. If I'm looking for examples of things that leverage Accumulo
> I probably don't want to look at two different places on our website. Maybe
> we could list the Accumulo hosted contribs and the external projects on an
> ecosystem page? "Powered by Accumulo" maybe?

Re: Cascading Accumulo

Posted by Sean Busbey <bu...@cloudera.com>.
On Fri, Feb 28, 2014 at 2:16 PM, Arshak Navruzyan <ar...@gmail.com> wrote:

> Sean,
>
> I was just using the Cascading code as an example.  Not suggesting we
> hijack Anagha's project :)
>
> I think there is a benefit to having a "standard" set of contribs that have
> been reviewed / tested.  Something similar to this:
> http://docs.mongodb.org/ecosystem/
>
> I'm concerned that letting new users go hunting for contribs on their own
> (which may or may not work, may or may not be documented) gives the
> Accumulo project a little bit of an incomplete feeling.  Obviously it's not
> possible to take on the responsibility to moderate every possible
> contribution but some basic/obvious ones would be helpful.
>
> Arshak
>
>
>
I definitely agree on the importance of a curated set of related outside
projects. I'd much rather we not have links that answer the question "How
do I search github for Accumulo projects?" since that doesn't really answer
the question of "What are Accumulo related projects I should care about?"

Maybe we shouldn't continue to separate "contrib projects" in Accumulo and
"ecosystem projects" that are hosted and run elsewhere to the extent we
currently do. If I'm looking for examples of things that leverage Accumulo
I probably don't want to look at two different places on our website. Maybe
we could list the Accumulo hosted contribs and the external projects on an
ecosystem page? "Powered by Accumulo" maybe?

Re: Cascading Accumulo

Posted by Arshak Navruzyan <ar...@gmail.com>.
Sean,

I was just using the Cascading code as an example.  Not suggesting we
hijack Anagha's project :)

I think there is a benefit to having a "standard" set of contribs that have
been reviewed / tested.  Something similar to this:
http://docs.mongodb.org/ecosystem/

I'm concerned that letting new users go hunting for contribs on their own
(which may or may not work, may or may not be documented) gives the
Accumulo project a little bit of an incomplete feeling.  Obviously it's not
possible to take on the responsibility to moderate every possible
contribution but some basic/obvious ones would be helpful.

Arshak


On Fri, Feb 28, 2014 at 7:53 PM, Sean Busbey <bu...@cloudera.com> wrote:

> -users
> +dev
>
> Would we be better off linking to them from the main page?
>
> For one, we don't know that Anagha wants their code moved from github into
> a repo controlled by our PMC.
>
> As a practical matter for non-committers, they'd have to start going
> through our review. Though I believe we could grant commit rights to just
> the repo their project would be in[1], I don't know that this would add
> much practical value over maintaining the code in its current location.
>
>
> [1]: http://accumulo.apache.org/contrib.html

Re: Cascading Accumulo

Posted by Billie Rinaldi <bi...@gmail.com>.
On Fri, Feb 28, 2014 at 11:13 AM, Billie Rinaldi <bi...@gmail.com> wrote:

> On Fri, Feb 28, 2014 at 11:10 AM, Christopher <ct...@apache.org> wrote:
>
>> Agreed. Linking is better. Even just a search link would be useful:
>> "https://github.com/search?q=accumulo"
>>
>
> We have that on the Papers & Other Links page.
> (actually https://github.com/search?q=accumulo&type=Repositories)
>

Maybe we should add &s=updated to that url, too.
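
For illustration, a rough sketch of how that search link could be assembled
from its query parameters, assuming Python's standard urllib.parse module.
The helper name build_search_url is hypothetical, and the parameter names
(q, type, s) are simply the ones appearing in the URLs in this thread; how
GitHub interprets them is not verified here.

```python
from urllib.parse import urlencode


def build_search_url(query, sort=None):
    """Build a repository search URL like the ones quoted in this thread.

    build_search_url is a hypothetical helper for illustration only; the
    parameter names (q, type, s) are copied from the thread, not from any
    documented API.
    """
    # Base parameters: the search term, restricted to repositories.
    params = {"q": query, "type": "Repositories"}
    # Optionally append the sort key suggested above (e.g. "updated").
    if sort is not None:
        params["s"] = sort
    # urlencode joins the pairs with "&" and escapes values as needed.
    return "https://github.com/search?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("accumulo"))
    print(build_search_url("accumulo", sort="updated"))
```

Building the link from a parameter mapping rather than by string
concatenation keeps additions like the sort key to a one-line change.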


Re: Cascading Accumulo

Posted by Billie Rinaldi <bi...@gmail.com>.
On Fri, Feb 28, 2014 at 11:10 AM, Christopher <ct...@apache.org> wrote:

> Agreed. Linking is better. Even just a search link would be useful:
> "https://github.com/search?q=accumulo"
>

We have that on the Papers & Other Links page.
(actually https://github.com/search?q=accumulo&type=Repositories)


Re: Cascading Accumulo

Posted by Christopher <ct...@apache.org>.
Agreed. Linking is better. Even just a search link would be useful:
"https://github.com/search?q=accumulo"

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Fri, Feb 28, 2014 at 1:59 PM, Keith Turner <ke...@deenlo.com> wrote:
> On Fri, Feb 28, 2014 at 1:53 PM, Sean Busbey <bu...@cloudera.com> wrote:
>
>> -users
>> +dev
>>
>> Would we be better off linking to them from the main page?
>>
>
> That's what I was thinking.   The concept of contribs in Accumulo is not
> well defined and lacks ownership AFAICT.

Re: Cascading Accumulo

Posted by Keith Turner <ke...@deenlo.com>.
On Fri, Feb 28, 2014 at 1:53 PM, Sean Busbey <bu...@cloudera.com> wrote:

> -users
> +dev
>
> Would we be better off linking to them from the main page?
>

That's what I was thinking.   The concept of contribs in Accumulo is not
well defined and lacks ownership AFAICT.


>
> For one, we don't know that Anagha wants their code moved from github into
> a repo controlled by our PMC.
>
> As a practical matter for non-committers, they'd have to start going
> through our review. Though I believe we could grant commit rights to just
> the repo their project would be in[1], I don't know that this would add
> much practical value over maintaining the code in its current location.
>
>
> [1]: http://accumulo.apache.org/contrib.html