Posted to dev@hbase.apache.org by Jonathan Gray <jg...@facebook.com> on 2010/07/12 19:47:48 UTC

HBase Hackathon IV Wrap-up

HBase Developers,

In conjunction with the recent HUG11 (http://hbaseblog.com/2010/07/04/hug11-hbase-0-90-preview-wrap-up/) we also held the fourth HBase Hackathon at Facebook.

At this hackathon, we spent a bit of time at the end in a discussion of ways to improve HBase outside of normal feature development.  Specifically the focus was on Usability, Documentation, and Public Relations, with an emphasis on getting people to take ownership over different things.  Notes from the meeting below.

The discussion was loosely guided by these slides:
http://hbaseblog.com/files/HBase_Hackathon_4_Slides.pdf

USABILITY

- Add the option for our scripts to start/stop an HDFS cluster like we do with ZK (no owner) (HBASE-2811)

- Configuration improvements (owned by jgray and larsgeorge)
  + Better documentation of what is important (currently too confusing) (HBASE-2328, also HBASE-2006 is an old but related jira about documenting config)
  + Document what has changed between versions in one place
  + Ship with sample configs (HBASE-2377)
  + Configuration automagic (ask questions, generates config) and verification (for example, to test cross-version) tools
  + Wasn't discussed at the hackathon but also related is HBASE-2056, better defaults for configs

- Automated verification (owned by andrei and cosmin)
  + ulimit, etc (HBASE-2750)
  + HDFS + JVM versions (RS->M info, JMX)
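
As a rough illustration of the ulimit item above, a check along these lines could be run before starting a node. This is only a sketch: the threshold MIN_OPEN_FILES and the function names are made up for the example, and it is not what HBASE-2750 implements.

```python
import resource

# Hypothetical threshold for illustration; not a value taken from these notes.
MIN_OPEN_FILES = 32768


def limit_ok(soft_limit, minimum):
    """Return True if a soft limit satisfies the required minimum."""
    # RLIM_INFINITY means "no limit", which always satisfies the minimum.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= minimum


def check_open_files(minimum=MIN_OPEN_FILES):
    """Check this process's open-file limit (what `ulimit -n` reports)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return limit_ok(soft, minimum), soft


if __name__ == "__main__":
    ok, soft = check_open_files()
    status = "OK" if ok else "TOO LOW"
    print(f"open files (ulimit -n): {soft} [{status}], wanted >= {MIN_OPEN_FILES}")
```

The comparison is kept in its own function so the same logic could be reused for other limits; the `resource` module is only available on Unix-like systems.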

DOCUMENTATION

- Specific getting started guides
  + Three primary types of new users, should make a guide for each:
    1) Just want to play on one node, no previous hadoop experience
    2) Want a real cluster but have no hadoop experience or existing hdfs cluster
    3) Want a real cluster and already have an existing hdfs cluster
  + Don't bury everything in javadoc + wiki

- Instead of spreading documentation around javadoc, wiki, posts, etc... we will do a "Book" per release (owned by todd and also wade?)
  + No final decision on format, maybe docbook?
  + HBASE-2650

PUBLIC RELATIONS

- Stream and record meetups
  + Will be done for the next HUG (jgray)

- HBase Blog
  + Created hbaseblog.com, will also be blog.hbase.org (jgray)
  + This is a community blog, any committer/contributor is welcome to post!  Talk to jgray to get an account created.

- HBase.org and logo
  + Work in progress.  Currently owned by stack.


Thanks to everyone who participated.

JG

Re: HBase Hackathon IV Wrap-up

Posted by Steven Noels <st...@outerthought.org>.
On Wed, Jul 14, 2010 at 1:20 AM, Stack <st...@duboce.net> wrote:


> That fellas have to put up a webapp container to write docs
> is too high a barrier in my opinion (I have to have a mysql running
> too?).  Getting fellas writing doc. is like pulling teeth at the best
> of times so barriers should be at a minimum.
>
> How would versioning work?  We'd have to pull out of daisy and commit
> into the repo?
>


People would access a centrally-hosted instance of Daisy to author/maintain
docs. Offline editing is impossible.

At designated moments, we can then use the Books feature of Daisy to pull
a static copy from the webapp into the ASF repo - I'd say close to major
releases.



> > Consider it being a Confluence without the UI/Atlassian fanciness, slightly
> > more complex (but also more flexible), however pretty stable and with a
> > solid techdoc legacy. We're quite busy but I could look if we could free up
> > some time to support a trial if that's what the community wants.
>
>
> Thanks for the offer.  You suggesting you'd host it?  We'd make
> docs.hbase.org point to such a hosting?
>

There's a bit of work involved in setting up something which appeals both
visually and structurally to documentation authors - but nothing to be
afraid of. With the holiday season approaching however, doing so might take
until late August. If that's still helpful, we can look into helping out.

With regards to physical provisioning of such a box, we don't have any
leftover capacity at the moment. Either we ask Apache (one of the Sun zones
boxes perhaps), or we investigate spare capacity somewhere else. We don't
have any internal hosting capacity to speak of; I suspect other HBase users
have that readily available. If that's not possible, it's a matter of hiring
a VPS somewhere out there and pointing docs.hbase.org to it.


> > Speaking of which, with my ASF Member hat on: we should have some formal
> > vetting on doc contributions as well - with Daisy for Cocoon we had a simple
> > tick box upon registration to declare a contributor had the legal rights to
> > actually contribute under the ASF license terms. If the idea is to open up
> > doc contributions, of course.
> >
> While this'd be sweet, there is such a checkbox in JIRA that
> contributors need to check for us to commit (doc) patches.
>


Ideally, doc contributors would get access to Daisy as well, but only be
allowed to work in draft mode. Doc editors would then publish their changes.

Steven.
-- 
Steven Noels                            http://outerthought.org/
Open Source Content Applications
Makers of Kauri, Daisy CMS and Lily

Re: HBase Hackathon IV Wrap-up

Posted by Stack <st...@duboce.net>.
On Mon, Jul 12, 2010 at 10:25 PM, Steven Noels <st...@outerthought.org> wrote:
> On Tue, Jul 13, 2010 at 7:03 AM, Stack <st...@duboce.net> wrote:
>
> Not on a formal basis - here's an old blog post talking about Daisy in a
> techdoc environment:
> http://blogs.sun.com/coolstuff/entry/daisy_wysiwyg_wiki_for_pdf
>

Going by this article, really, we should be running daisy instead of
the apache supplied wiki and we'd get a boat load of cool stuff.


> Yes, it's a webapp through which you author.

That fellas have to put up a webapp container to write docs
is too high a barrier in my opinion (I have to have a mysql running
too?).  Getting fellas writing doc. is like pulling teeth at the best
of times so barriers should be at a minimum.

How would versioning work?  We'd have to pull out of daisy and commit
into the repo?

> Consider it being a Confluence without the UI/Atlassian fanciness, slightly
> more complex (but also more flexible), however pretty stable and with a
> solid techdoc legacy. We're quite busy but I could look if we could free up
> some time to support a trial if that's what the community wants.


Thanks for the offer.  You suggesting you'd host it?  We'd make
docs.hbase.org point to such a hosting?


> However, I
> think Todd/Cloudera are sold on Confluence already - and there's already
> some setup on the ASF side of things to export Confluence content to static
> HTML that can be SVN-versioned for the ASF website.
>

HBase doesn't use confluence.  No plans to either, not that I've heard of.

> Speaking of which, with my ASF Member hat on: we should have some formal
> vetting on doc contributions as well - with Daisy for Cocoon we had a simple
> tick box upon registration to declare a contributor had the legal rights to
> actually contribute under the ASF license terms. If the idea is to open up
> doc contributions, of course.
>
While this'd be sweet, there is such a checkbox in JIRA that
contributors need to check for us to commit (doc) patches.

Thanks Steven,
St.Ack

Re: HBase Hackathon IV Wrap-up

Posted by Steven Noels <st...@outerthought.org>.
On Tue, Jul 13, 2010 at 7:03 AM, Stack <st...@duboce.net> wrote:


> I took a look.  It's very docbooky looking (I like the html and html as
> one page renderings).
>

Yes.


> There must be pointers comparing daisy to docbook and why daisy to docbook?
>

Not on a formal basis - here's an old blog post talking about Daisy in a
techdoc environment:
http://blogs.sun.com/coolstuff/entry/daisy_wysiwyg_wiki_for_pdf

If the underlying need is to have a Docbook export (like when going to a
paper publisher), I guess for a specific setup creating a custom Books
publishing pipeline that does this is very feasible. Daisy stores its
textual content in a semantically/structurally clean form of HTML.


>
> > Important in your consideration would be the requirement of offline
> > authoring.
> >
>
> Why?  Because can only author when daisy server running?
>

Yes, it's a webapp through which you author. A nice one, if I may say: it's
tested and production-quality stuff which has been going for a long time
now, and some of our customers have been creating 700+ page publications
with it.

Consider it being a Confluence without the UI/Atlassian fanciness, slightly
more complex (but also more flexible), however pretty stable and with a
solid techdoc legacy. We're quite busy but I could look if we could free up
some time to support a trial if that's what the community wants. However, I
think Todd/Cloudera are sold on Confluence already - and there's already
some setup on the ASF side of things to export Confluence content to static
HTML that can be SVN-versioned for the ASF website.

Speaking of which, with my ASF Member hat on: we should have some formal
vetting on doc contributions as well - with Daisy for Cocoon we had a simple
tick box upon registration to declare a contributor had the legal rights to
actually contribute under the ASF license terms. If the idea is to open up
doc contributions, of course.

Steven.

Re: HBase Hackathon IV Wrap-up

Posted by Stack <st...@duboce.net>.
On Mon, Jul 12, 2010 at 12:26 PM, Steven Noels <st...@outerthought.org> wrote:
> On Mon, Jul 12, 2010 at 7:47 PM, Jonathan Gray <jg...@facebook.com> wrote:
>
>> - Instead of spreading documentation around javadoc, wiki, posts, etc... we
>> will do a "Book" per release (owned by todd and also wade?)
>>  + No final decision on format, maybe docbook?
>>
> We're in the process of revamping/unifying our own project web properties as
> we speak, and www.restlet.org is using Daisy as well (
> http://wiki.restlet.org/) - the Apache Cocoon project is also using Daisy (
> http://cocoon.zones.apache.org/daisy/) but that instance seems to be down at
> the moment. http://www.daisycms.org/books/ shows you how Daisy renders a
> document collection into PDF.

I took a look.  It's very docbooky looking (I like the html and html as
one page renderings).

There must be pointers comparing daisy to docbook and why daisy to docbook?


> Important in your consideration would be the requirement of offline
> authoring.
>

Why?  Because can only author when daisy server running?

Thanks Steven.
St.Ack

Tool for HBase book was: HBase Hackathon IV Wrap-up

Posted by Thomas Koch <th...@koch.ro>.
Hi,

Just a suggestion that you might consider for writing HBase documentation:
sphinx[1].

* Uses reStructuredText as input format[2], therefore
  * the source text format is pleasant to read too
  * the source can be checked in to SVN/Git so code and documentation
    are in the same system. Jira patches can come together with
    documentation changes in the same patch.
* Produces HTML and PDF as output
* Support to include javadoc documentation is under development[3] as
  a GSOC project
* Popular in the python world. Python seems to be popular in the
  Hadoop ecosystem too.
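
For illustration only: a Sphinx project is driven by a small Python file named conf.py that sits next to the reStructuredText sources. A minimal sketch could look like the following, where the project name, author, and release string are placeholders and not anything decided in this thread:

```python
# conf.py: minimal Sphinx configuration (placeholder values, for illustration)

project = "HBase Book"         # hypothetical title
author = "HBase contributors"  # hypothetical
release = "0.90"               # placeholder; the idea is one book per release

# Name of the root document without its extension, i.e. index.rst.
master_doc = "index"

# No extensions are needed for plain reStructuredText sources.
extensions = []
```

With such a file in a directory of .rst sources, `sphinx-build -b html . _build/html` builds the HTML output; PDF output usually goes through the LaTeX builder (`-b latex`).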

I've not yet used sphinx itself, only reStructuredText. But I very much
enjoy having the documentation as plain text files alongside my code while
still being able to compile a website or a PDF from it.

If you'd like to consider this option, I may have time after my next exams in
September to dig deeper into sphinx.

[1] http://sphinx.pocoo.org/
[2] http://docutils.sf.net/rst.html
[3] http://leapon.net/files/Multiple%20language%20support%20for%20autodoc%20in%20Sphinx%20via%20ANTLR.html

Best regards,

Thomas Koch, http://www.koch.ro

Re: HBase Hackathon IV Wrap-up

Posted by Steven Noels <st...@outerthought.org>.
On Mon, Jul 12, 2010 at 7:47 PM, Jonathan Gray <jg...@facebook.com> wrote:

> - Instead of spreading documentation around javadoc, wiki, posts, etc... we
> will do a "Book" per release (owned by todd and also wade?)
>  + No final decision on format, maybe docbook?
>

If you're looking for an environment to coordinate documentation development
*and* generate static versions (books) out of it, there's little value in
hiding the fact that we have been running an open source project delivering
such a tool (www.daisycms.org - of which Lily isn't necessarily a successor
but still has some conceptual ties) for six years or so.

We're in the process of revamping/unifying our own project web properties as
we speak, and www.restlet.org is using Daisy as well (
http://wiki.restlet.org/) - the Apache Cocoon project is also using Daisy (
http://cocoon.zones.apache.org/daisy/) but that instance seems to be down at
the moment. http://www.daisycms.org/books/ shows you how Daisy renders a
document collection into PDF.

Daisy being a Java server app, it requires (only slightly) more setup and
continuous care than a Docbook tool chain though.

Important in your consideration would be the requirement of offline
authoring.

Steven.