Posted to dev@bookkeeper.apache.org by Jia Zhai <zh...@gmail.com> on 2016/06/02 01:34:55 UTC

Improve Write performance with Relax durability.

Hello all,

I am wondering whether you have any plans for supporting relaxed durability.
Is it a good feature to have in BookKeeper (and also for DistributedLog)?

I am thinking of adding a new flag to bookkeeper#addEntry(..., Boolean sync),
so the application can control whether or not to sync individual entries
(a rough sketch of the intended usage follows the list below).

- On the write protocol, add a flag to indicate whether this write should be
synced to disk or not.
- On the bookie side, if the addEntry request is sync, go through the original
pipeline. If the addEntry request disables sync, complete the add callbacks
after writing to the journal file and before flushing the journal.
- Add entries with sync disabled will be flushed to disk along with subsequent
synced add entries.
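
To make this concrete, below is a rough sketch of how an application might use
such a flag. Note this is only a sketch: the boolean sync parameter and the
addEntry/asyncAddEntry overloads that take it are hypothetical and do not exist
in the current BookKeeper client API.

    import org.apache.bookkeeper.client.BKException;
    import org.apache.bookkeeper.client.LedgerHandle;

    public class RelaxedDurabilityExample {
        // Writes a batch of events with relaxed durability and a final synced
        // entry. The boolean 'sync' argument is the proposed flag; these
        // overloads do not exist in the current client API.
        static void writeBatch(LedgerHandle lh, byte[][] events) throws Exception {
            for (byte[] event : events) {
                // sync=false: the add callback may complete once the entry is
                // written to the journal, before the journal is fsynced.
                lh.asyncAddEntry(event, /* sync */ false,
                        (rc, handle, entryId, ctx) -> {
                            if (rc != BKException.Code.OK) {
                                System.err.println("add failed: rc=" + rc);
                            }
                        }, null);
            }
            // sync=true: behaves like today's addEntry, acknowledged only
            // after the journal has been flushed to disk.
            lh.addEntry("batch-marker".getBytes(), /* sync */ true);
        }
    }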

For my use cases in DistributedLog, this feature could be used to support
streams that don't have strong durability requirements.

What do you guys think? Shall I create a jira to implement this?

Thanks a lot
-Jia

Re: Improve Write performance with Relax durability.

Posted by Jia Zhai <zh...@gmail.com>.
Thanks a lot for taking care of this and for providing this use case.


Re: Improve Write performance with Relax durability.

Posted by Sijie Guo <si...@apache.org>.
On Wed, Aug 3, 2016 at 12:51 PM, Enrico Olivelli <eo...@gmail.com>
wrote:

> Hi Jia,
> I have another similar use case for this feature.
> Suppose a ledger is used as a db transaction log.
> The client issues a sequence of data manipulation instructions inside the
> scope of the transaction; if everything goes well, a commit is finally added
> to the sequence. From the client perspective it is important to wait for
> sync only on the last entry, that is, the 'commit'.
> In my case all the entries will be added with sync=false and then the last
> with sync=true. But it is important that the addEntry with sync returns
> only once all the previous entries of the same sequence, or of the same
> ledger, have been written to stable storage.
>
Yup, I think that's a common usage pattern.



> In this case I see the real challenge being that entries span multiple
> bookies, and it will be very hard to coordinate such a sync
>

Does making ensemble size equal to ack quorum size work here?
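
(For reference, a ledger whose ensemble size equals its ack quorum size can
already be created with the existing client API; a minimal sketch, with the
ZooKeeper connect string as a placeholder:)

    import org.apache.bookkeeper.client.BookKeeper;
    import org.apache.bookkeeper.client.BookKeeper.DigestType;
    import org.apache.bookkeeper.client.LedgerHandle;

    public class EnsembleEqualsAckQuorum {
        public static void main(String[] args) throws Exception {
            // "zk:2181" is a placeholder ZooKeeper connect string.
            BookKeeper bk = new BookKeeper("zk:2181");
            // ensembleSize = writeQuorumSize = ackQuorumSize = 3, so an add is
            // acknowledged only after every bookie storing the entry has it.
            LedgerHandle lh = bk.createLedger(3, 3, 3,
                    DigestType.CRC32, "passwd".getBytes());
            lh.addEntry("hello".getBytes());
            lh.close();
            bk.close();
        }
    }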


> At the moment it is not very urgent for my projects, but I think that it
> could be a useful feature
>
> Enrico

Re: Improve Write performance with Relax durability.

Posted by Enrico Olivelli <eo...@gmail.com>.
Hi Jia,
I have another similar use case for this feature.
Suppose a ledger is used as a db transaction log.
The client issues a sequence of data manipulation instructions inside the
scope of the transaction; if everything goes well, a commit is finally added
to the sequence. From the client perspective it is important to wait for
sync only on the last entry, that is, the 'commit'.
In my case all the entries will be added with sync=false and then the last
with sync=true. But it is important that the addEntry with sync returns
only once all the previous entries of the same sequence, or of the same
ledger, have been written to stable storage.
In this case I see the real challenge being that entries span multiple bookies,
and it will be very hard to coordinate such a sync (a sketch of the intended
write pattern follows below).
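
A minimal sketch of that write pattern, again assuming the hypothetical
per-entry sync flag from the original proposal (not part of today's API):

    import org.apache.bookkeeper.client.LedgerHandle;

    public class TxnLogWriter {
        // Appends the statements of one transaction followed by its commit
        // record. The boolean 'sync' argument is the proposed (hypothetical)
        // per-entry durability flag.
        static void appendTransaction(LedgerHandle lh, byte[][] statements,
                                      byte[] commitRecord) throws Exception {
            for (byte[] stmt : statements) {
                // Data manipulation entries: acknowledged without waiting
                // for the journal fsync.
                lh.addEntry(stmt, /* sync */ false);
            }
            // The commit entry is synced; for this use case the call must
            // return only once all previous entries of the ledger have been
            // written to stable storage as well.
            lh.addEntry(commitRecord, /* sync */ true);
        }
    }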

At the moment it is not very urgent for my projects, but I think that it could
be a useful feature

Enrico


Re: Improve Write performance with Relax durability.

Posted by Jia Zhai <zh...@gmail.com>.
Thanks a lot for all of your suggestions. I would like to have a try: I will
open a JIRA ticket and move the proposal, discussion, and testing there.


Re: Improve Write performance with Relax durability.

Posted by Sijie Guo <gu...@gmail.com>.
I think that's a fair consideration. However, I am thinking that if we allow
non-durable ledgers, it means 1) the application needs to handle missing
entries, and 2) re-replication should handle non-durable ledgers by ignoring
entries that are missing.
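
(As an illustration of point 1, a reader of a non-durable ledger would have to
be prepared to skip entries that no bookie can serve any more; a rough sketch
using the existing read API:)

    import java.util.Enumeration;
    import org.apache.bookkeeper.client.BKException;
    import org.apache.bookkeeper.client.LedgerEntry;
    import org.apache.bookkeeper.client.LedgerHandle;

    public class LossTolerantReader {
        // Reads entry by entry up to the LAC, skipping entries that have been
        // lost on every bookie of their ensemble.
        static void readAll(LedgerHandle lh) throws Exception {
            long lac = lh.getLastAddConfirmed();
            for (long id = 0; id <= lac; id++) {
                try {
                    Enumeration<LedgerEntry> entries = lh.readEntries(id, id);
                    process(id, entries.nextElement().getEntry());
                } catch (BKException bke) {
                    if (bke.getCode() == BKException.Code.NoSuchEntryException) {
                        continue; // entry missing everywhere: skip it
                    }
                    throw bke;
                }
            }
        }

        static void process(long entryId, byte[] payload) { /* application logic */ }
    }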

But let's see what Jia is proposing.

- Sijie


Re: Improve Write performance with Relax durability.

Posted by Venkateswara Rao Jujjuri <ju...@gmail.com>.
@sijie let me expand what I mean by "this changes something fundamental".

Everything starts from the fact that we are not persisting. I also share a lot
of the points raised by @Matteo.

- In theory, we could lose all copies of EntryId X but persist EntryId X+Y.
How do reads, replication, and consistency cope with that?
- We could advance the LAC but lose the last set of entries. What do we do?
Do we adjust the LAC? At what boundaries?
- One of the core principles of a LOG is that if entry X is there, all the
entries up until X are available too; with this change we may need to deal
with sparse / missing entries (see the sketch below).
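
As a sketch of that last point, a typical tailing reader today assumes every
entry up to the LAC can be read; any entry in that range lost on all of its
bookies would break this loop (existing read API, nothing hypothetical here):

    import java.util.Enumeration;
    import org.apache.bookkeeper.client.LedgerEntry;
    import org.apache.bookkeeper.client.LedgerHandle;

    public class TailingReader {
        // Catches up from nextEntryId to the current LAC. The loop relies on
        // the invariant that [0, LAC] is a contiguous, fully readable range.
        static long catchUp(LedgerHandle lh, long nextEntryId) throws Exception {
            long lac = lh.getLastAddConfirmed();
            if (nextEntryId > lac) {
                return nextEntryId; // nothing new to read
            }
            // With relaxed durability an entry in this range could have been
            // lost on all of its bookies, and this read would fail instead of
            // returning the contiguous sequence the reader expects.
            Enumeration<LedgerEntry> entries = lh.readEntries(nextEntryId, lac);
            while (entries.hasMoreElements()) {
                LedgerEntry entry = entries.nextElement();
                handle(entry.getEntryId(), entry.getEntry());
            }
            return lac + 1;
        }

        static void handle(long entryId, byte[] payload) { /* application logic */ }
    }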

I believe this is more of a step towards making BookKeeper an in-memory log,
but I am afraid it is more of a core change.

Thanks,
JV

On Fri, Jun 3, 2016 at 12:05 AM, Matteo Merli <mm...@apache.org> wrote:

> I was interested in trying something in this area, but never actually got
> to do it.
>
> A few random notes:
>
> 1. My suspicion, with no backing data at this point, is that simply skipping
>    the fsync for "non-durable" ledgers might not give a big improvement: just
>    a bit less latency for non-fsynced writes but roughly the same throughput.
>    Imagine a bookie receiving writes for 2 ledgers, 1 durable and the other
>    non-durable. Since the entries are appended to the journal as they come
>    in, the fsync() for the durable ledger write will also carry the data for
>    the previous non-durable ledger write, causing more IOPS if that was
>    spanning a different disk block. Given that the bookie throughput is
>    typically limited by the IOPS capacity of the journal device, having
>    non-durable writes might not help that much.
>
> 2. The other options I was thinking of were:
>    - Do not append the non-durable entries to the journal (redundancy is
>      anyway given by writing to multiple bookies). In this case, though, a
>      single bookie could lose more entries depending on flushTime, and could
>      also lose entries in case of a process crash, not just a kernel panic
>      or power outage.
>
>    - Use a separate journal for non-durable writes which will not be
>      fsynced().
>
>    - Configure the durability at the bookie level and then use the
>      placement/isolation policy to choose the appropriate set of bookies for
>      a non-durable ledger.
>
> 3. How will bookie replication operate when getting read errors?
>
> Matteo
>
> On Thu, Jun 2, 2016 at 11:09 PM Sijie Guo <si...@apache.org> wrote:
>
> > I think if a ledger is configured to be non-durable, it is kind of the
> > application's responsibility to tolerate the data loss.
> > So I don't think it will actually require any change on the bookkeeper
> > client side.
> >
> > - Sijie
> >
> > On Thu, Jun 2, 2016 at 7:29 AM, Venkateswara Rao Jujjuri <
> > jujjuri@gmail.com>
> > wrote:
> >
> > > I agree that we must make this a ledger property, not a per-entry write
> > > property.
> > >
> > > But the biggest doubt in my mind is that this changes something
> > > fundamental: the LAC. Are we allowing sparse ledgers in failure
> > > scenarios? Handling the read side may become more complex.
> > >
> > > On Thu, Jun 2, 2016 at 12:19 AM, Sijie Guo <gu...@gmail.com> wrote:
> > >
> > >> This seems interesting to me. However, it might be safe to start with a
> > >> flag configured per ledger, rather than per entry. Also, it would be
> > >> good to hear the opinions from other people. JV, Matteo? (If I remember
> > >> correctly, Matteo mentioned that Yahoo might be working on a similar
> > >> thing.)
> > >>
> > >> +1 for creating a BOOKKEEPER jira to track this.
> > >>
> > >> - Sijie
> > >>
> > >> On Wed, Jun 1, 2016 at 6:37 PM, Jia Zhai <zh...@gmail.com> wrote:
> > >>
> > >> > + distributedlog-user
> > >> > For more input and comments. :)
> > >> >
> > >> > Thanks.



-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Re: Improve Write performance with Relax durability.

Posted by Matteo Merli <mm...@apache.org>.
I was interested in trying something in this area, but never actually got
to do it.

A few random notes:

1. My suspicion, with no backing data at this point, is that simply
    skipping the fsync for "non-durable" ledgers might not give a big
    improvement: just a bit less latency for non-fsynced writes but
    roughly the same throughput. Imagine a bookie receiving writes for 2
    ledgers, 1 durable and the other non-durable. Since the entries are
    appended to the journal as they come in, the fsync() for the durable
    ledger write will also carry along the data of the previous
    non-durable ledger write, causing more IOPS if that data spans a
    different disk block. Given that bookie throughput is typically
    limited by the IOPS capacity of the journal device, having
    non-durable writes might not help that much.

2. The other options I was thinking of were:
      - Do not append the non-durable entries to the journal (redundancy
        is anyway given by writing to multiple bookies). In this case
        though, a single bookie could lose more entries depending on
        flushTime, and could also lose entries even in case of a process
        crash, not just a kernel panic or power outage.

      - Use a separate journal for non-durable writes which will not be
        fsynced.

      - Configure the durability at the bookie level and then use a
        placement/isolation policy to choose the appropriate set of
        bookies for a non-durable ledger.

3. How will bookie replication operate when it gets read errors?

Matteo
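
To make point 1 concrete, here is a minimal, purely illustrative sketch of
a journal write loop with a per-entry durability flag. It is not the real
org.apache.bookkeeper.bookie.Journal code (it omits group commit, padding
and real error handling, and the QueueEntry/SimpleJournal names are made
up); it only shows why the fsync() issued for a durable entry also
persists any non-durable bytes buffered before it, so the journal device
ends up doing roughly the same work:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.util.concurrent.BlockingQueue;

    // One queued addEntry request; 'sync' is the proposed durability flag.
    class QueueEntry {
        final ByteBuffer data;
        final boolean sync;
        final Runnable ackCallback;   // completes the client's addEntry
        QueueEntry(ByteBuffer data, boolean sync, Runnable ackCallback) {
            this.data = data;
            this.sync = sync;
            this.ackCallback = ackCallback;
        }
    }

    // Heavily simplified journal thread: append everything in arrival
    // order, fsync only when a durable entry asks for it.
    class SimpleJournal implements Runnable {
        private final BlockingQueue<QueueEntry> queue;
        private final FileChannel journalFile;

        SimpleJournal(BlockingQueue<QueueEntry> queue, FileChannel journalFile) {
            this.queue = queue;
            this.journalFile = journalFile;
        }

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    QueueEntry qe = queue.take();
                    journalFile.write(qe.data);   // buffered append, no fsync yet
                    if (qe.sync) {
                        // This force() persists the durable entry *and* every
                        // non-durable entry appended before it, so the disk
                        // sees the same write pattern either way.
                        journalFile.force(false);
                    }
                    // Sync entries are acked after force(), non-sync entries
                    // are acked right after the buffered write.
                    qe.ackCallback.run();
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            } catch (IOException ioe) {
                // A real bookie would fail the pending requests here.
                throw new RuntimeException(ioe);
            }
        }
    }

In this sketch the only saving for a non-sync entry is that its callback
does not wait for force(); the bytes still go through the same journal
file on the same device.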


Re: Improve Write performance with Relax durability.

Posted by Sijie Guo <si...@apache.org>.
I think if a ledger is configured to be non-durable, it is kind of the
application's responsibility to tolerate the data loss.
So I don't think it would actually require any change on the bookkeeper
client side.

- Sijie
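
As a sketch of what that application-side responsibility could look like
on the read path (assuming the existing synchronous
LedgerHandle#readEntries API, and assuming -- which is still an open
question in this thread -- that an entry lost on all of its bookies
surfaces as a BKException on read), the reader of a relaxed-durability
stream would have to decide for itself whether to skip or to fail:

    import java.util.Enumeration;

    import org.apache.bookkeeper.client.BKException;
    import org.apache.bookkeeper.client.LedgerEntry;
    import org.apache.bookkeeper.client.LedgerHandle;

    class RelaxedDurabilityReader {

        // Read entry by entry so that one unreadable entry does not fail
        // the whole range; the policy for a missing entry belongs to the
        // application, not to the bookkeeper client.
        void readTolerantly(LedgerHandle lh) throws InterruptedException {
            long lastConfirmed = lh.getLastAddConfirmed();
            for (long entryId = 0; entryId <= lastConfirmed; entryId++) {
                try {
                    Enumeration<LedgerEntry> entries = lh.readEntries(entryId, entryId);
                    while (entries.hasMoreElements()) {
                        process(entries.nextElement());
                    }
                } catch (BKException bke) {
                    // The application chose relaxed durability, so it owns
                    // this decision: skip the hole, or abort the read.
                    onMissingEntry(entryId, bke);
                }
            }
        }

        private void process(LedgerEntry entry) {
            // application logic
        }

        private void onMissingEntry(long entryId, BKException cause) {
            // application logic, e.g. log and continue
        }
    }

(Reading one entry at a time is only for clarity; a real reader would
batch the reads.)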


Re: Improve Write performance with Relax durability.

Posted by Venkateswara Rao Jujjuri <ju...@gmail.com>.
I agree that we must make this a ledger property, not a per-entry write
property.

But the biggest doubt in my mind is that this changes something
fundamental: the LAC. Are we allowing a sparse ledger in failure
scenarios? Handling the read side may become more complex.


-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Re: Improve Write performance with Relax durability.

Posted by Sijie Guo <gu...@gmail.com>.
This seems interesting to me. However, it might be safer to start with a
flag configured per ledger, rather than per entry. Also, it would be good
to hear the opinions of other people. JV, Matteo? (If I remember
correctly, Matteo mentioned that Yahoo might be working on a similar thing.)

+1 for creating a BOOKKEEPER jira to track this.

- Sijie
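
For what the per-ledger variant might look like on the API, here is a
purely hypothetical sketch -- none of these types exist in bookkeeper and
all the names are made up -- only to illustrate that the flag would be
chosen once at ledger creation time (and could live in the ledger
metadata) instead of travelling on every addEntry:

    // Hypothetical: durability is a property of the ledger, fixed at
    // creation time.
    enum DurabilityMode {
        SYNC,      // current behavior: ack only after the journal fsync
        RELAXED    // ack once the entry is buffered in the journal
    }

    // Hypothetical creation-time options; in the real client this would
    // have to be plumbed through createLedger(...) and stored in the
    // ledger metadata so a bookie can look it up from the ledger id alone.
    final class LedgerOptions {
        final int ensembleSize;
        final int writeQuorumSize;
        final int ackQuorumSize;
        final DurabilityMode durability;

        LedgerOptions(int ensembleSize, int writeQuorumSize, int ackQuorumSize,
                      DurabilityMode durability) {
            this.ensembleSize = ensembleSize;
            this.writeQuorumSize = writeQuorumSize;
            this.ackQuorumSize = ackQuorumSize;
            this.durability = durability;
        }
    }

    class RelaxedLedgerExample {
        // A DistributedLog stream without strong durability requirements
        // would create its ledgers with RELAXED; everything else keeps the
        // default SYNC path untouched.
        LedgerOptions relaxedStreamOptions() {
            return new LedgerOptions(3, 3, 2, DurabilityMode.RELAXED);
        }
    }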


Re: Improve Write performance with Relax durability.

Posted by Jia Zhai <zh...@gmail.com>.
+ distributedlog-user
For more input and comments. :)

Thanks.
