Posted to dev@bookkeeper.apache.org by Ivan Kelly <iv...@apache.org> on 2014/10/07 17:57:40 UTC

website frontpage

Hi folks,

I've just looked at the frontpage (zookeeper.apache.org/bookkeeper), and
it says pretty much nothing about what bookkeeper is and what it's good
for. There's even a header "What is the BookKeeper?". The? This isn't
even good English.

Anyhow, I propose we add some text to make what bookkeeper does more
understandable, especially for people who are unfamiliar with
distributed systems.

I've just knocked out some text, which I'll add to the frontpage
tomorrow if there are no strong objections. Suggestions are welcome, of
course.

-Ivan

<snip>
h2. What is Bookkeeper?

Bookkeeper is a log replication service which can be used to build
replicated state machines. A log contains a sequence of events which can
be applied to a state machine. Bookkeeper guarantees that each replica
state machine will see all the same entries, in the same order.

h2. Eh? What good is that to me?

Imagine, for example, that you have a database that you want to be able
to access even if the database server goes down. You'll need to replicate
it to multiple servers. You need to ensure that if one database sees an
update, all databases see the update. But what happens if one database
server is cut off from the network for a time? Or if two clients try to
update the same field at exactly the same instant? This is where log
replication comes in.

A database can be seen as a state machine. It is the sum of all the
updates which it has applied since its initial state. Therefore, if you
consider your replicated database as a replicated state machine, you can
do the replication using a log replication service. If all updates are
written to the log replication service before being applied to the
database, then the database will continue to be available and consistent
even if some of the replicas fail.

This approach can be applied to many types of distributed systems, such
as messaging systems, coordination systems, filesystems, etc.

h2. What is Bookkeeper not?

Bookkeeper has nothing to do with application/error/trace logging. There
are already many projects (link to log4j, slf4j, logback) dedicated to
that problem.

h2. How about Hedwig?

Hedwig is a distributed publish and subscribe system, which uses
bookkeeper to replicate its messages.
</snip>
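
The state-machine argument in the draft text above can be illustrated with a
minimal sketch. It is not BookKeeper code and does not use the BookKeeper API:
ReplicatedLog is a hypothetical in-memory stand-in for a log replication
service, and DatabaseReplica is a hypothetical key-value "database" whose state
is just the result of applying log entries in order. The point it shows is
that a replica which was cut off for a while ends up in the same state as the
others once it replays the entries it missed.

```python
class ReplicatedLog:
    """Hypothetical in-memory stand-in for a log replication service."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        """Append an entry and return its position in the log."""
        self._entries.append(entry)
        return len(self._entries) - 1

    def read_from(self, position):
        """Return all entries at or after the given position, in log order."""
        return self._entries[position:]


class DatabaseReplica:
    """Hypothetical replica: its state is the sum of the updates it has applied."""

    def __init__(self, log):
        self._log = log
        self._next = 0      # position of the next log entry to apply
        self.state = {}     # the "database": a simple key-value map

    def catch_up(self):
        """Apply every entry not yet seen, strictly in log order."""
        for key, value in self._log.read_from(self._next):
            self.state[key] = value
            self._next += 1


log = ReplicatedLog()
replica_a = DatabaseReplica(log)
replica_b = DatabaseReplica(log)

# Two updates to the same field: the log decides the order for everyone.
log.append(("colour", "red"))
log.append(("colour", "blue"))
replica_a.catch_up()

# replica_b is "cut off from the network" while another update is written.
log.append(("size", "large"))
replica_a.catch_up()

# Once replica_b reconnects, it replays the log and converges.
replica_b.catch_up()
assert replica_a.state == replica_b.state
```

In this sketch the single in-memory list plays the role that a replicated,
durable log would play in a real deployment; making that log itself survive
server failures is the part the draft text attributes to Bookkeeper.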

Re: website frontpage

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
Yeah, I don't want to make a big deal out of this, but I think it is perfectly fine to use jira for this, and I don't see a big deal in including such jiras in the release notes. In fact, it might be a good thing to track important changes to the documentation/website text like that. We obviously don't need to create a jira for every typo we see, though.

But I'm flexible here: I like the idea of tracking changes to the website text, but can live without it.

-Flavio




Re: website frontpage

Posted by Ivan Kelly <iv...@apache.org>.
> Why is it a bad idea to include them in release notes? Even if you don't want
> to include those tickets, you could mark the ticket with a specific category
> (e.g. documentation). Then it is easy to filter out those tickets you don't
> want to include in the release notes.
Because then the release notes and svn get out of sync, and it becomes
hard to tell if what we say is in a release actually is in a release.

I've created a jira, but I'm not going to put a fix version on it. 

> why not use review board? we already have pretty standard tools for
> reviewing.
I've put a review on reviewboard. I had stopped using it ages ago
because it was as slow as hell. It seems ok now though.


-Ivan

Re: website frontpage

Posted by Sijie Guo <gu...@gmail.com>.
On Wed, Oct 8, 2014 at 9:13 AM, Ivan Kelly <iv...@apache.org> wrote:

>
> Flavio Junqueira writes:
>
> > We don't use jira only for things that need to be checked in. Infra
> > tickets and podling name search are examples of queues that typically
> > don't check in anything.
> Those are both groups which don't have any code artifact releases. We
> do, and putting non-code stuff in jira means that when we have a release,
> jira and svn/changes.txt will be out of sync, so someone has to go in
> and fix it.
>

Tracking tasks in JIRA is pretty good practice.

Why is it a bad idea to include them in release notes? Even if you don't want
to include those tickets, you could mark the ticket with a specific category
(e.g. documentation). Then it is easy to filter out those tickets you don't
want to include in the release notes.


>
> >
> > I was asking this just so that we could iterate, but you might as well
> > just check it in and we can do it directly on the website.
> Jira isn't the best tool for iterating over text. Suggestions get lost,
> and quoting in comments is awkward at best.
>

why not use review board? we already have pretty standard tools for
reviewing.


> I've created a google doc if you want to give feedback:
>
> https://docs.google.com/document/d/1lit0QLJ58RG4-Fn0DAaTYF8naMZWzKinQL0g2VNmM3c/edit?usp=sharing
>
> -Ivan
>

Re: website frontpage

Posted by Ivan Kelly <iv...@apache.org>.
Flavio Junqueira writes:

> We don't use jira only for things that need to be checked in. Infra
> tickets and podling name search are examples of queues that typically
> don't check in anything.
Those are both groups which don't have any code artifact releases. We
do, and putting non-code stuff in jira means that when we have a release,
jira and svn/changes.txt will be out of sync, so someone has to go in
and fix it.

>
> I was asking this just so that we could iterate, but you might as well
> just check it in and we can do it directly on the website.
Jira isn't the best tool for iterating over text. Suggestions get lost,
and quoting in comments is awkward at best. 

I've created a google doc if you want to give feedback:
https://docs.google.com/document/d/1lit0QLJ58RG4-Fn0DAaTYF8naMZWzKinQL0g2VNmM3c/edit?usp=sharing

-Ivan

Re: website frontpage

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
We don't use jira only for things that need to be checked in. Infra tickets and podling name search are examples of queues that typically don't check in anything.

I was asking this just so that we could iterate, but you might as well just check it in and we can do it directly on the website.

-Flavio



Re: website frontpage

Posted by Ivan Kelly <iv...@apache.org>.
> Do you want to put it on a jira? 
This isn't a code change. Nothing will go into
https://svn.apache.org/repos/asf/zookeeper/bookkeeper/trunk/ for this,
so it has no version, so jira tracking makes little sense. Only the
documentation under doc/ needs jira issues.

> We were having some problems with the site building pipeline; I need to
> check if it is fixed.
Seems to be working for me. I just removed the aforementioned "the".

-Ivan

Re: website frontpage

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
Do you want to put it on a jira? 


We were having some problems with the site building pipeline; I need to check if it is fixed.

-Flavio


