Posted to dev@bookkeeper.apache.org by "zhaijia03@gmail.com" <zh...@gmail.com> on 2014/12/13 16:12:54 UTC

Would you please confirm the understanding of maxOutstandingRequests in GarbageCollector?

Hi,

I am preparing a patch for https://issues.apache.org/jira/browse/BOOKKEEPER-827 , which tries to change maxOutstandingRequests from "by entries" to "by bytes".
While reading the comment above setCompactionMaxOutstandingRequests(), I found one part a little confusing:
"A higher value for this parameter means more memory will be used for offsets"
<=== Here, I thought the memory would mainly be occupied by the entries that have been added to the "FileChannel" but not yet flushed. Is that right?

bookkeeper-server/src/main/java/org/apache/bookkeeper/conf/ServerConfiguration.java : line 1231
 ------------------------ 
    /**
     * Set the maximum number of entries which can be compacted without flushing.
     *
     * When compacting, the entries are written to the entrylog and the new offsets
     * are cached in memory. Once the entrylog is flushed the index is updated with
     * the new offsets. This parameter controls the number of entries added to the
     * entrylog before a flush is forced. A higher value for this parameter means
     * more memory will be used for offsets. Each offset consists of 3 longs.
     *
     * This parameter should _not_ be modified unless you know what you're doing.
     * The default is 100,000.
     *
     * @param maxOutstandingRequests number of entries to compact before flushing
     *
     * @return ServerConfiguration
     */
    public ServerConfiguration setCompactionMaxOutstandingRequests(int maxOutstandingRequests) {
        setProperty(COMPACTION_MAX_OUTSTANDING_REQUESTS, maxOutstandingRequests);
        return this;
    }
 ------------------------ 
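
For a sense of scale, the offset memory implied by the Javadoc above can be estimated with a little arithmetic. This is only a rough sketch: it assumes each cached offset costs exactly three 8-byte longs, as the comment states, and it ignores JVM object headers and collection overhead. The class and method names are made up for illustration and are not part of BookKeeper.

```java
// Hypothetical helper for illustration only; not BookKeeper code.
class OffsetMemoryEstimate {
    // Assumption taken from the Javadoc: each cached offset consists of 3 longs.
    static final int LONGS_PER_OFFSET = 3;

    // Rough lower bound on the bytes held by cached offsets.
    // Ignores object headers and any collection overhead.
    static long estimateOffsetBytes(int maxOutstandingRequests) {
        return (long) maxOutstandingRequests * LONGS_PER_OFFSET * Long.BYTES;
    }

    public static void main(String[] args) {
        // Documented default of 100,000 entries: 100,000 * 3 * 8 bytes.
        System.out.println(estimateOffsetBytes(100_000) + " bytes"); // prints "2400000 bytes"
    }
}
```

Under these assumptions the default setting keeps roughly 2.4 MB of raw offset data in memory between flushes.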

Thanks a lot.
-Jia

Re: Would you please confirm the understanding of maxOutstandingRequests in GarbageCollector?

Posted by Ivan Kelly <iv...@apache.org>.
I'll reply on the JIRA.

On Tue, Dec 16, 2014 at 6:51 AM, Jia Zhai <zh...@gmail.com> wrote:

> Hi Ivan,
>
> Thanks a ton for your explanation. It is very clear.
>
> If you have time, would you please help take a look at BOOKKEEPER-827?
> https://issues.apache.org/jira/browse/BOOKKEEPER-827
>
> After this enhancement, users can choose to apply "compactionRate"
> and "maxOutstandingRequests" either "by entries" or "by bytes". The new
> parameter "maxOutstandingRequestsBytes" stands for the number of bytes
> that entries have occupied before a flush.
>
>
> Thanks a lot.
>
> -Jia

Re: Would you please confirm the understanding of maxOutstandingRequests in GarbageCollector?

Posted by Jia Zhai <zh...@gmail.com>.
Hi Ivan,

Thanks a ton for your explanation. It is very clear.

If you have time, would you please help take a look at BOOKKEEPER-827?
https://issues.apache.org/jira/browse/BOOKKEEPER-827

After this enhancement, users can choose to apply "compactionRate"
and "maxOutstandingRequests" either "by entries" or "by bytes". The new
parameter "maxOutstandingRequestsBytes" stands for the number of bytes
that entries have occupied before a flush.
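
To illustrate the difference between the two modes, here is a minimal sketch. It is not the actual patch: the class FlushThreshold and its methods are hypothetical names used only for illustration, and it assumes a flush is forced as soon as the outstanding entry count (or byte total) reaches the configured limit.

```java
// Hypothetical helper for illustration only; not taken from the BOOKKEEPER-827 patch.
class FlushThreshold {
    private final long limit;      // max entries, or max bytes, before a flush is forced
    private final boolean byBytes; // true: count bytes; false: count entries
    private long outstanding = 0;  // amount accumulated since the last flush

    FlushThreshold(long limit, boolean byBytes) {
        this.limit = limit;
        this.byBytes = byBytes;
    }

    // Record one compacted entry; returns true when a flush should be forced.
    boolean onEntryAdded(int entrySizeBytes) {
        outstanding += byBytes ? entrySizeBytes : 1;
        return outstanding >= limit;
    }

    // Reset the counter once the entry log has been flushed.
    void onFlushed() {
        outstanding = 0;
    }
}
```

With a limit of 3 "by entries", the third entry triggers a flush regardless of size. With a limit of 2048 "by bytes", three 1000-byte entries trigger it, while many small entries would not.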


Thanks a lot.

-Jia

On Tue, Dec 16, 2014 at 12:14 AM, Ivan Kelly <iv...@apache.org> wrote:
>
> No, in this case it is talking about the offsets of the entries.
>
> When we do GC, we read from the old (partially empty) log and copy any
> non-deleted entries we find into the new entry log. This means that the
> index location for that entry needs to change to point to the new entrylog
> and new offset. We cannot update the index, though, until the new entry
> log has flushed. Otherwise, we could update the index and then crash,
> leaving the new index entry pointing to data which had never been
> flushed to disk. We cannot
> flush the entry log for each entry, because that would slow everything
> down. So we batch. We copy over 'maxOutstandingRequests' entries, keep a
> record of the new offsets, and then wait for a flush of the entrylog, or
> trigger one (I can't remember which). Once the flush has occurred, the
> offsets can be added to the index.
>
> Let us know if this isn't clear, as it most likely isn't :)
>
> -Ivan

Re: Would you please confirm the understanding of maxOutstandingRequests in GarbageCollector?

Posted by Ivan Kelly <iv...@apache.org>.
No, in this case it is talking about the offsets of the entries.

When we do GC, we read from the old (partially empty) log and copy any
non-deleted entries we find into the new entry log. This means that the
index location for that entry needs to change to point to the new entrylog
and new offset. We cannot update the index, though, until the new entry log
has flushed. Otherwise, we could update the index and then crash, leaving
the new index entry pointing to data which had never been flushed to disk.
We cannot
flush the entry log for each entry, because that would slow everything
down. So we batch. We copy over 'maxOutstandingRequests' entries, keep a
record of the new offsets, and then wait for a flush of the entrylog, or
trigger one (I can't remember which). Once the flush has occurred, the
offsets can be added to the index.
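
The batching described above could be sketched roughly as follows. This is a simplified model, not the real GarbageCollector code: every class, field and method name here is hypothetical, the three longs per offset are assumed to be ledger id, entry id and new location, and the sketch assumes a flush is triggered (rather than waited for) once the batch is full.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of batched compaction; none of these types are BookKeeper classes.
class CompactionSketch {
    // New location of one copied entry: 3 longs (assumed meaning of each field).
    static final class Offset {
        final long ledgerId;
        final long entryId;
        final long location;

        Offset(long ledgerId, long entryId, long location) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
            this.location = location;
        }
    }

    final int maxOutstandingRequests;
    final List<Offset> pending = new ArrayList<>();  // offsets not yet in the index
    final Map<String, Long> index = new HashMap<>(); // stand-in for the ledger index
    long nextLocation = 0; // stand-in for the write position in the new entry log
    int flushCount = 0;    // number of simulated flushes

    CompactionSketch(int maxOutstandingRequests) {
        this.maxOutstandingRequests = maxOutstandingRequests;
    }

    // Copy one live entry into the new entry log and remember its new offset.
    void copyEntry(long ledgerId, long entryId, int sizeBytes) {
        pending.add(new Offset(ledgerId, entryId, nextLocation));
        nextLocation += sizeBytes;
        if (pending.size() >= maxOutstandingRequests) {
            flushAndUpdateIndex();
        }
    }

    // Flush first, then publish offsets, so the index never points at unflushed data.
    void flushAndUpdateIndex() {
        if (pending.isEmpty()) {
            return;
        }
        flushCount++; // a real implementation would force the entry log to disk here
        for (Offset o : pending) {
            index.put(o.ledgerId + ":" + o.entryId, o.location);
        }
        pending.clear();
    }
}
```

In this model the pending list is the memory the Javadoc refers to: it grows with each copied entry and is emptied only after the flush, so a larger maxOutstandingRequests means more offsets held in memory at once.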

Let us know if this isn't clear, as it most likely isn't :)

-Ivan
