Posted to dev@hive.apache.org by Edward Capriolo <ed...@gmail.com> on 2011/03/30 17:08:39 UTC

Jira issues

The number of open tickets keeps growing. Searching for keywords in
open issues is at this point very hard to do and a real productivity
killer. I believe we should be much more proactive about catching
issues as they open and routing them properly.

My comments on:

HIVE-2085 Document GenericUD(A|T)F.

This is the second ticket I have seen calling for "more
documentation". All tickets like these should be closed instantly
unless
1) User wants to write an xdoc or java doc and get it committed.
Otherwise user can update the wiki. There are many things that could
be documented "better" and having a ticket to remind us about each one
is not practical.

HIVE-2079 The warehouse directory shouldn't be 777'ed

We need to intercept these issues better and get better descriptions
of exactly what the person wants to do. Such as:
1) How does this currently work?
2) What would be better?
3) What needs to be changed?
4) What is the timeline?

This will help the person get on the right track. It will also help us
figure out how actionable the issue is. Many of these tickets sound
nice, but some never come to fruition and then sit indefinitely in the
open state.

We need to be more aggressive in intercepting issues and not be afraid
to close them as LATER or WONT FIX. Otherwise we are never going to be
able to turn this around and the open issues are just going to keep
growing. Point users to the roadmap on the wiki or to IRC so they
can discuss features before creating vague tickets.

Edward

Re: Jira issues

Posted by Edward Capriolo <ed...@gmail.com>.
On Wed, Mar 30, 2011 at 11:59 AM, Lars Francke <la...@gmail.com> wrote:
>> HIVE-2085 Document GenericUD(A|T)F.
>>
>> This is the second ticket I have seen calling for "more
>> documentation". All tickets like these should be closed instantly
>> unless
>>
>> 1) User wants to write an xdoc or java doc and get it committed.
>> Otherwise user can update the wiki. There are many things that could
>> be documented "better" and having a ticket to remind us about each one
>> is not practical.
>
> This is not a case of "better" documentation because there is
> virtually no documentation at all for this feature. And I can't create
> a patch for this because I just don't know enough.
>
> It is my opinion that if you close these kinds of tickets then every
> new feature that gets committed should only be allowed with
> accompanying documentation because otherwise all these features will
> be useless to most people who learn about Hive from the documentation
> and are unaware of them. In this case those issues should probably be
> assigned to whoever implemented the feature.
>
>> We need to be more aggressive in intercepting issues and not be afraid
>> to close them as LATER or WONT FIX. Otherwise we are never going to be
>> able to turn this around and the open issues are just going to keep
>> growing. Point users to the roadmap on the wiki or to IRC so they
>> can discuss features before creating vague tickets.
>
> I also don't agree here. Other projects encourage these kinds of
> issues and are doing fine with them. It is always easy to filter out
> Documentation issues once they are in the correct category. These
> kinds of issues are also a helpful resource for those searching for
> others having the same problem even if there's no fix available or no
> one working on it.
>
> I obviously say this as a non-committer. I love the progress Hive is
> making and I'm very thankful for all the work going on, but as a user
> I would sometimes wish for more and better documentation over new
> features. I firmly believe this would also help attract new
> developers to help out. The code base isn't the easiest to understand
> either when starting from scratch.
>
> Cheers,
> Lars
>



> It is my opinion that if you close these kinds of tickets then every
> new feature that gets committed should only be allowed with
> accompanying documentation because otherwise all these features will
> be useless to most people who learn about Hive from the documentation
> and are unaware of them. In this case those issues should probably be
> assigned to whoever implemented the feature.

Lars,

I have been pitching exactly that for months/years now. I really
wanted to do xdocs inline with code commits. I did a lot of work to
move the wiki to xdocs:
http://issues.apache.org/jira/browse/hive-1135. I repeatedly tried to
move us off the wiki for the exact reasons you cited, mainly that no
one outside a tight-knit group even understands half the features of
Hive. I always try to impress on everyone how annoying this is.

But alas, breaking the status quo is hard.
http://web.archiveorange.com/archive/v/gaVLEAiZ4td2nWmyBbNJ

Personally, I find that the time it takes to run a single Hive unit
test, generate a patch, and get it reviewed and committed totally
eclipses the time it would take to include a little one-paragraph blob
in xdoc. So to me it is a no-brainer. But I am off topic.

> I also don't agree here. Other projects encourage these kinds of
> issues and are doing fine with them.

The main point is that opening a ticket that no one is ever going to
look at or work on does not help anything. Look at our 30-day
divergence: https://issues.apache.org/jira/browse/HIVE. What other
projects let into their issue tracker does not concern me. Some
projects try very hard to keep the open count low. It makes it much
easier to cut releases. It makes the project look organized and
managed. That is where I would like Hive to be. A sinister person not
familiar with Hive might say "2000 open critical bugs, this software
is total buggy crap." But that is not really what is going on. Is it?

HIVE-2 is still open. HIVE fricken-2! Right now I cannot even tell if
it applies anymore.

When I mentioned your ticket I was not trying to specifically call you
out. I was using it as an example that we should engage tickets early
and make assessments of them rather than letting open tickets linger.

Specifically on your issue: having written a bunch of UDFs and
GenericUDFs, I understand they are rather confusing. There is a great
need for this documentation, but this is true for most APIs in Hive,
and most open source as a whole. IMHO, opening a Jira for more docs is
kind of like preaching to the choir. If you're not going to be the
point man for it, it is probably not happening any time soon.

Re: Jira issues

Posted by Lars Francke <la...@gmail.com>.
> HIVE-2085 Document GenericUD(A|T)F.
>
> This is the second ticket I have seen calling for "more
> documentation". All tickets like these should be closed instantly
> unless
>
> 1) User wants to write an xdoc or java doc and get it committed.
> Otherwise user can update the wiki. There are many things that could
> be documented "better" and having a ticket to remind us about each one
> is not practical.

This is not a case of "better" documentation because there is
virtually no documentation at all for this feature. And I can't create
a patch for this because I just don't know enough.

It is my opinion that if you close these kinds of tickets then every
new feature that gets committed should only be allowed with
accompanying documentation because otherwise all these features will
be useless to most people who learn about Hive from the documentation
and are unaware of them. In this case those issues should probably be
assigned to whoever implemented the feature.

> We need to be more aggressive in intercepting issues and not be afraid
> to close them as LATER or WONT FIX. Otherwise we are never going to be
> able to turn this around and the open issues are just going to keep
> growing. Point users to the roadmap on the wiki or to IRC so they
> can discuss features before creating vague tickets.

I also don't agree here. Other projects encourage these kinds of
issues and are doing fine with them. It is always easy to filter out
Documentation issues once they are in the correct category. These
kinds of issues are also a helpful resource for those searching for
others having the same problem even if there's no fix available or no
one working on it.

I obviously say this as a non-committer. I love the progress Hive is
making and I'm very thankful for all the work going on, but as a user
I would sometimes wish for more and better documentation over new
features. I firmly believe this would also help attract new
developers to help out. The code base isn't the easiest to understand
either when starting from scratch.

Cheers,
Lars