Posted to dev@bookkeeper.apache.org by Sijie Guo <si...@apache.org> on 2016/10/25 23:11:18 UTC

[Discuss] 64 bits ledger id

*Problem: *

Currently the ledger id is a long, so it should be 64 bits. However,
BookKeeper can currently generate only 32-bit ledger ids, because ZooKeeper's
sequence znodes only produce 32-bit sequence numbers.
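
As a rough illustration of the 32-bit limit: ZooKeeper appends a 10-digit, zero-padded counter to sequential znode names, and that counter is a signed int. The sketch below only simulates the naming and the wraparound locally; the class name, the "ID-" prefix, and the helper methods are hypothetical and are not BookKeeper or ZooKeeper APIs.

```java
import java.util.Locale;

class SequenceSuffixSketch {
    // Mimics how a sequential znode name is formed: prefix + 10-digit counter.
    static String sequentialName(String prefix, int counter) {
        return prefix + String.format(Locale.ROOT, "%010d", counter);
    }

    // Recovers the counter from a name produced by sequentialName.
    static int parseSuffix(String znodeName, String prefix) {
        return Integer.parseInt(znodeName.substring(prefix.length()));
    }

    // A signed 32-bit counter wraps to Integer.MIN_VALUE after Integer.MAX_VALUE.
    static int next(int counter) {
        return counter + 1;
    }

    public static void main(String[] args) {
        String prefix = "ID-"; // hypothetical prefix, for illustration only
        System.out.println(sequentialName(prefix, 42));
        System.out.println(sequentialName(prefix, Integer.MAX_VALUE));
        System.out.println(sequentialName(prefix, next(Integer.MAX_VALUE)));
    }
}
```

Whatever ledger id is parsed out of such a name can never use more than 32 bits, even though it is stored in a long.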

This problem was raised before in BOOKKEEPER-421. Jiannan has already done a
fair amount of work on this, and there were several patches for it.

This email thread is to start the discussion of 64-bit ledger id support
in BookKeeper.

*Discuss*:

Based on BOOKKEEPER-421, the changes will mainly happen in the following
places. Assume the metadata store is ZooKeeper.


   1. How to generate 64-bit ledger ids? (64 Bits Ledger ID Generation
   <https://issues.apache.org/jira/browse/BOOKKEEPER-552>)
   2. How to store 64-bit ledger ids in ZooKeeper? (New LedgerManager
   for 64 Bits Ledger ID Management in ZooKeeper
   <https://issues.apache.org/jira/browse/BOOKKEEPER-553>)
   3. How can garbage collection correctly handle 64-bit ledger ids? (
   https://issues.apache.org/jira/browse/BOOKKEEPER-553?focusedCommentId=13558192&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13558192
   )
   4. How can we upgrade the current HierarchicalLedgerManager to support
   64 bits? [??]

Feel free to take a look at those tickets and make any proposals.

- Sijie

Re: [Discuss] 64 bits ledger id

Posted by Sijie Guo <si...@apache.org>.
On Tue, Oct 25, 2016 at 4:39 PM, Venkateswara Rao Jujjuri <jujjuri@gmail.com
> wrote:

> We are using 64 bit ledger ids internally right now, but the ledger id is
> supported by the application/caller.
> We have extended Hierarchical ledger manger to Long hierarchical ledger
> manger for this.
>

Can anyone from your team describe how you extended the ledger
manager? I am interested in how you handled backward compatibility
for Hierarchical Ledger Manager.



>
> Ultimately we would like to move to 128 bit UUID as the ledger id. That
> makes ledgers unique without
> the need of centralized ZK/metadata server.
>
> On Tue, Oct 25, 2016 at 4:11 PM, Sijie Guo <si...@apache.org> wrote:
>
> > *Problem: *
> >
> > Currently the ledger id is long, which it should be 64-bits. However
> > currently bookkeeper only can generate 32-bits ledger id as zookeeper's
> > sequence znode only produce 32-bits.
> >
> > This problem was basically raised before at BOOKKEEPER-421. Jiannan has
> > already done fair amount of work on this and there were several patches
> for
> > it.
> >
> > This email thread is to start the discussion for 64-bits ledger id
> support
> > in bookkeeper.
> >
> > *Discuss*:
> >
> > Based on bookkeeper-421, the changes will relatively happen in following
> > places. Assume the metadata store is ZooKeeper.
> >
> >
> >    1. How to generate 64-bits ledger id? (64 Bits Ledger ID Generation
> >    <https://issues.apache.org/jira/browse/BOOKKEEPER-552>)
> >    2. How to store the 64-bits ledger id in zookeeper? (New LedgerManager
> >    for 64 Bits Ledger ID Management in ZooKeeper
> >    <https://issues.apache.org/jira/browse/BOOKKEEPER-553>)
> >    3. How can the garbage collect handle correctly with 64-bits ledger
> id?
> > (
> >    https://issues.apache.org/jira/browse/BOOKKEEPER-553?
> > focusedCommentId=13558192&page=com.atlassian.jira.
> > plugin.system.issuetabpanels:comment-tabpanel#comment-13558192
> >    )
> >    4. How can we upgrade current HierarchicalLedgerManager to support
> >    64-bits. [??]
> >
> > Feel free to take a look at those tickets and make any proposals.
> >
> > - Sijie
> >
>
>
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi
>

Re: [Discuss] 64 bits ledger id

Posted by Venkateswara Rao Jujjuri <ju...@gmail.com>.
That limits us to a 19-digit ledgerId, and Charan's patch should work as-is,
since it uses a different namespace.
And 2^63 should be pretty good for Yahoo's use case too.

On Tue, Nov 1, 2016 at 2:05 AM, Sijie Guo <si...@apache.org> wrote:

> On Fri, Oct 28, 2016 at 5:01 PM, Charan Reddy G <re...@gmail.com>
> wrote:
>
> > In our last meeting, we briefly discussed about options of handling
> > negative long ledgerids in LongeHierarchicalLedgerManager. Mainly by
> using
> > the unsigned long value (20 digits) for determining the ledger zNode
> path.
> > I see couple of problems with that
> >
> > (signed long)         ->     (unsigned long)
> > -9223372036854775808  ->  "09223372036854775808"
> >  9223372036854775807  ->  "09223372036854775807"
> >
> > now both these ledger znodes would be under the same parent zNode
> > 0922/3372/0368/5477
> >
> > gc method in ScanAndCompareGarbageCollector, for LedgerRange
> > (0922/3372/0368/5477) the following logic would break. Because we would
> get
> > mix of positive and negative ledgerids in this ledgerrange and
> > scanandcompare logic with bkActiveLedgers (ledgerStorage.
> getActiveLedgers)
> > will not work.
> >
> >
> >             while(ledgerRangeIterator.hasNext()) {
> >                 LedgerRange lRange = ledgerRangeIterator.next();
> >
> >                 Long start = lastEnd + 1;
> >                 Long end = lRange.end();
> >                 if (!ledgerRangeIterator.hasNext()) {
> >                     end = Long.MAX_VALUE;
> >                 }
> >
> >                 Iterable<Long> subBkActiveLedgers =
> > bkActiveLedgers.subSet(start, true, end, true);
> >
> > and also with ledgerrange with all negative ledgerids we need to do some
> > tweaking for the 'start' and the 'end'.
> >
> > So in summary, I feel its risky to use unsigned long for determining the
> > ledger zNode path, since we used signed long for ledgerids all across the
> > codebase and this inconsistency can cause issue in some other areas as
> > well.
> >
> > So I'm more convinced to use the signed long for determining the ledger
> > zNode path and let the LongHierarchicalLedgerManager/Iterator deal with
> > negative ledgerids accordingly (for eg. deal with znode with '-' while
> > doing comparison and iteration). So it makes scope of the change limited
> to
> > LongHierarchicalLedgerManager rather than applying patches to other
> areas.
> >
> > (ledger id)      ->  (znode path)
> > -9223372036854775808  ->  "-922/3372/0368/5477/5808"
> >  9223372036854775807  ->  "0922/3372/0368/5477/5807"
> >
>
> Or shall we just simply not use negative numbers?
>
>
> >
> > Thanks,
> > Charan
> >
> >
> > On Wed, Oct 26, 2016 at 10:55 PM, Sijie Guo <si...@apache.org> wrote:
> >
> > > On Wed, Oct 26, 2016 at 3:49 PM, Matteo Merli <mm...@apache.org>
> wrote:
> > >
> > > > On Wed, Oct 26, 2016 at 11:45 AM Venkateswara Rao Jujjuri <
> > > > jujjuri@gmail.com>
> > > > wrote:
> > > >
> > > > > - Ledgers are unique across multiple clusters. Useful if storage
> > tiers
> > > > with
> > > > > different stores are employed.
> > > > >
> > > >
> > > > For this you could combine the ledgerId with another 64bit id, that
> > could
> > > > encode the rest of the required informations ( storage tier, cluster,
> > ..
> > > )
> > > >
> > >
> > > +1 on this idea
> > >
> > >
> > > >
> > > >
> > > > > - No centralized id creation - Allows client to give the name
> instead
> > > of
> > > > > server generating name on create.
> > > > >
> > > >
> > > > This should be already possible with your changes in 4.4, right?
> > > >
> > > > Wouldn't be enough to combine the 64bit ledgerId with an additional
> id
> > > that
> > > > doesn't need to flow through BK ?
> > > >
> > >
> >
>



-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Re: [Discuss] 64 bits ledger id

Posted by Sijie Guo <si...@apache.org>.
On Fri, Oct 28, 2016 at 5:01 PM, Charan Reddy G <re...@gmail.com>
wrote:

> In our last meeting, we briefly discussed about options of handling
> negative long ledgerids in LongeHierarchicalLedgerManager. Mainly by using
> the unsigned long value (20 digits) for determining the ledger zNode path.
> I see couple of problems with that
>
> (signed long)         ->     (unsigned long)
> -9223372036854775808  ->  "09223372036854775808"
>  9223372036854775807  ->  "09223372036854775807"
>
> now both these ledger znodes would be under the same parent zNode
> 0922/3372/0368/5477
>
> gc method in ScanAndCompareGarbageCollector, for LedgerRange
> (0922/3372/0368/5477) the following logic would break. Because we would get
> mix of positive and negative ledgerids in this ledgerrange and
> scanandcompare logic with bkActiveLedgers (ledgerStorage.getActiveLedgers)
> will not work.
>
>
>             while(ledgerRangeIterator.hasNext()) {
>                 LedgerRange lRange = ledgerRangeIterator.next();
>
>                 Long start = lastEnd + 1;
>                 Long end = lRange.end();
>                 if (!ledgerRangeIterator.hasNext()) {
>                     end = Long.MAX_VALUE;
>                 }
>
>                 Iterable<Long> subBkActiveLedgers =
> bkActiveLedgers.subSet(start, true, end, true);
>
> and also with ledgerrange with all negative ledgerids we need to do some
> tweaking for the 'start' and the 'end'.
>
> So in summary, I feel its risky to use unsigned long for determining the
> ledger zNode path, since we used signed long for ledgerids all across the
> codebase and this inconsistency can cause issue in some other areas as
> well.
>
> So I'm more convinced to use the signed long for determining the ledger
> zNode path and let the LongHierarchicalLedgerManager/Iterator deal with
> negative ledgerids accordingly (for eg. deal with znode with '-' while
> doing comparison and iteration). So it makes scope of the change limited to
> LongHierarchicalLedgerManager rather than applying patches to other areas.
>
> (ledger id)      ->  (znode path)
> -9223372036854775808  ->  "-922/3372/0368/5477/5808"
>  9223372036854775807  ->  "0922/3372/0368/5477/5807"
>

Or shall we just simply not use negative numbers?


>
> Thanks,
> Charan
>
>
> On Wed, Oct 26, 2016 at 10:55 PM, Sijie Guo <si...@apache.org> wrote:
>
> > On Wed, Oct 26, 2016 at 3:49 PM, Matteo Merli <mm...@apache.org> wrote:
> >
> > > On Wed, Oct 26, 2016 at 11:45 AM Venkateswara Rao Jujjuri <
> > > jujjuri@gmail.com>
> > > wrote:
> > >
> > > > - Ledgers are unique across multiple clusters. Useful if storage
> tiers
> > > with
> > > > different stores are employed.
> > > >
> > >
> > > For this you could combine the ledgerId with another 64bit id, that
> could
> > > encode the rest of the required informations ( storage tier, cluster,
> ..
> > )
> > >
> >
> > +1 on this idea
> >
> >
> > >
> > >
> > > > - No centralized id creation - Allows client to give the name instead
> > of
> > > > server generating name on create.
> > > >
> > >
> > > This should be already possible with your changes in 4.4, right?
> > >
> > > Wouldn't be enough to combine the 64bit ledgerId with an additional id
> > that
> > > doesn't need to flow through BK ?
> > >
> >
>

Re: [Discuss] 64 bits ledger id

Posted by Charan Reddy G <re...@gmail.com>.
In our last meeting, we briefly discussed options for handling
negative long ledgerids in LongHierarchicalLedgerManager, mainly by using
the unsigned long value (20 digits) to determine the ledger zNode path.
I see a couple of problems with that.

(signed long)         ->     (unsigned long)
-9223372036854775808  ->  "09223372036854775808"
 9223372036854775807  ->  "09223372036854775807"

Now both of these ledger znodes would be under the same parent zNode,
0922/3372/0368/5477.
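
The collision under one parent znode can be reproduced with a minimal sketch, assuming the path is simply the unsigned decimal value left-padded to 20 digits and split into groups of four. The class and method names here are hypothetical and are not taken from the actual patch.

```java
class UnsignedPathSketch {
    // Renders the id as an unsigned 64-bit decimal, pads to 20 digits,
    // and joins five 4-digit groups with '/'.
    static String unsignedPath(long ledgerId) {
        String digits = Long.toUnsignedString(ledgerId);
        StringBuilder padded = new StringBuilder();
        for (int i = digits.length(); i < 20; i++) {
            padded.append('0');
        }
        padded.append(digits);

        StringBuilder path = new StringBuilder();
        for (int i = 0; i < 20; i += 4) {
            if (i > 0) {
                path.append('/');
            }
            path.append(padded, i, i + 4);
        }
        return path.toString();
    }

    // Drops the last path component to get the parent znode path.
    static String parentOf(String path) {
        return path.substring(0, path.lastIndexOf('/'));
    }

    public static void main(String[] args) {
        String min = unsignedPath(Long.MIN_VALUE);
        String max = unsignedPath(Long.MAX_VALUE);
        System.out.println(min + " and " + max);
        System.out.println("same parent: " + parentOf(min).equals(parentOf(max)));
    }
}
```

The smallest negative id and the largest positive id, which sit at opposite ends of the signed order, end up as siblings.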

In the gc method of ScanAndCompareGarbageCollector, the following logic would
break for the LedgerRange (0922/3372/0368/5477), because we would get a
mix of positive and negative ledgerids in this LedgerRange, and the
scan-and-compare logic against bkActiveLedgers (ledgerStorage.getActiveLedgers)
will not work.


            while(ledgerRangeIterator.hasNext()) {
                LedgerRange lRange = ledgerRangeIterator.next();

                Long start = lastEnd + 1;
                Long end = lRange.end();
                if (!ledgerRangeIterator.hasNext()) {
                    end = Long.MAX_VALUE;
                }

                Iterable<Long> subBkActiveLedgers =
                        bkActiveLedgers.subSet(start, true, end, true);

Also, for a LedgerRange containing only negative ledgerids, we would need to
do some tweaking of 'start' and 'end'.
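
To make the failure mode concrete, here is a small sketch that assumes the active ledgers are kept in a signed-ordered NavigableSet<Long>, as the subSet call above suggests. The range bounds are illustrative values derived from the unsigned path 0922/3372/0368/5477; the class and helper names are hypothetical.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

class SignedRangeSketch {
    // Returns true when [start, end] is accepted by subSet as one
    // inclusive interval in signed order, false when subSet rejects it.
    static boolean isValidSignedInterval(NavigableSet<Long> active, long start, long end) {
        try {
            active.subSet(start, true, end, true);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // thrown when start > end in signed order
        }
    }

    public static void main(String[] args) {
        NavigableSet<Long> active = new TreeSet<>();
        active.add(Long.MIN_VALUE);
        active.add(5L);
        active.add(Long.MAX_VALUE);

        // Unsigned bounds of everything stored under 0922/3372/0368/5477.
        long start = Long.parseUnsignedLong("9223372036854770000");
        long end = Long.parseUnsignedLong("9223372036854779999");

        System.out.println("start as signed: " + start);
        System.out.println("end as signed:   " + end);
        System.out.println("usable as one signed interval: "
                + isValidSignedInterval(active, start, end));
    }
}
```

Because the upper bound wraps to a negative signed value, the range cannot be expressed as a single signed interval without extra handling.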

So in summary, I feel it's risky to use unsigned long to determine the
ledger zNode path, since we use signed long for ledgerids all across the
codebase, and this inconsistency can cause issues in other areas as well.

So I'm more inclined to use the signed long to determine the ledger
zNode path and let the LongHierarchicalLedgerManager/Iterator deal with
negative ledgerids accordingly (e.g. handle znodes with '-' while
doing comparison and iteration). That limits the scope of the change to
LongHierarchicalLedgerManager rather than applying patches to other areas.

(ledger id)      ->  (znode path)
-9223372036854775808  ->  "-922/3372/0368/5477/5808"
 9223372036854775807  ->  "0922/3372/0368/5477/5807"
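
The two example mappings above can be reproduced with a short sketch, assuming the path is the signed decimal value rendered as exactly 20 characters (a '-' or a leading '0', followed by 19 digits) and split into groups of four. This is only an illustration of the mapping, not the LongHierarchicalLedgerManager code; the class and method names are hypothetical.

```java
import java.util.Locale;

class SignedPathSketch {
    // "%020d" pads with zeros after the sign, so every long becomes
    // exactly 20 characters: "-" or "0" first, then 19 digits.
    static String signedPath(long ledgerId) {
        String s = String.format(Locale.ROOT, "%020d", ledgerId);
        StringBuilder path = new StringBuilder();
        for (int i = 0; i < 20; i += 4) {
            if (i > 0) {
                path.append('/');
            }
            path.append(s, i, i + 4);
        }
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(signedPath(Long.MIN_VALUE));
        System.out.println(signedPath(Long.MAX_VALUE));
    }
}
```

With this layout, negative ids live under top-level znodes that start with '-', so they never share a parent with positive ids.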

Thanks,
Charan


On Wed, Oct 26, 2016 at 10:55 PM, Sijie Guo <si...@apache.org> wrote:

> On Wed, Oct 26, 2016 at 3:49 PM, Matteo Merli <mm...@apache.org> wrote:
>
> > On Wed, Oct 26, 2016 at 11:45 AM Venkateswara Rao Jujjuri <
> > jujjuri@gmail.com>
> > wrote:
> >
> > > - Ledgers are unique across multiple clusters. Useful if storage tiers
> > with
> > > different stores are employed.
> > >
> >
> > For this you could combine the ledgerId with another 64bit id, that could
> > encode the rest of the required informations ( storage tier, cluster, ..
> )
> >
>
> +1 on this idea
>
>
> >
> >
> > > - No centralized id creation - Allows client to give the name instead
> of
> > > server generating name on create.
> > >
> >
> > This should be already possible with your changes in 4.4, right?
> >
> > Wouldn't be enough to combine the 64bit ledgerId with an additional id
> that
> > doesn't need to flow through BK ?
> >
>

Re: [Discuss] 64 bits ledger id

Posted by Sijie Guo <si...@apache.org>.
On Wed, Oct 26, 2016 at 3:49 PM, Matteo Merli <mm...@apache.org> wrote:

> On Wed, Oct 26, 2016 at 11:45 AM Venkateswara Rao Jujjuri <
> jujjuri@gmail.com>
> wrote:
>
> > - Ledgers are unique across multiple clusters. Useful if storage tiers
> with
> > different stores are employed.
> >
>
> For this you could combine the ledgerId with another 64bit id, that could
> encode the rest of the required informations ( storage tier, cluster, .. )
>

+1 on this idea


>
>
> > - No centralized id creation - Allows client to give the name instead of
> > server generating name on create.
> >
>
> This should be already possible with your changes in 4.4, right?
>
> Wouldn't be enough to combine the 64bit ledgerId with an additional id that
> doesn't need to flow through BK ?
>

Re: [Discuss] 64 bits ledger id

Posted by Sijie Guo <si...@apache.org>.
Thanks for the clarification on the implementation.

When you contribute the change back, it might be worth considering
using the 2-4-4 split for ledger ids whose higher 32 bits are 0, so that it
stays backward compatible with the existing HierarchicalLedgerManager.
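
A minimal sketch of that check, assuming "higher 32 bits are 0" is tested with an unsigned shift and that the legacy layout is a 10-digit, zero-padded value grouped 2-4-4. Only the digit grouping is shown; any node-name prefix used by the real manager is left out, and the class and method names are hypothetical.

```java
import java.util.Locale;

class LegacySplitSketch {
    // True when the id fits in the lower 32 bits (upper 32 bits all zero).
    static boolean fitsLegacyLayout(long ledgerId) {
        return (ledgerId >>> 32) == 0;
    }

    // Groups a 32-bit-range id as 2-4-4 digits, e.g. 42 -> 00/0000/0042.
    static String legacyPath(long ledgerId) {
        if (!fitsLegacyLayout(ledgerId)) {
            throw new IllegalArgumentException("id needs the long layout: " + ledgerId);
        }
        String s = String.format(Locale.ROOT, "%010d", ledgerId);
        return s.substring(0, 2) + "/" + s.substring(2, 6) + "/" + s.substring(6, 10);
    }

    public static void main(String[] args) {
        System.out.println(legacyPath(42L));
        System.out.println(fitsLegacyLayout(1L << 32));
    }
}
```

Ids that fail the check would fall through to the longer split, so existing znodes keep their current paths.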

- Sijie

On Wed, Oct 26, 2016 at 4:36 PM, Charan Reddy G <re...@gmail.com>
wrote:

> *Can anyone from your team describe how did you guys extend the
> ledgermanager? I am interested in how did you guys handle backward
> compatibilityfor Hierarchical Ledger Manager.*
>
> We created LongHierarchicalLedgerManager, which would work for *positive
> long* ledgerids (so technically only 63 bits for ledgerid).
> This LongHierarchicalLedgerManager extends HierarchicalLedgerManager and
> its logic is similar to HierarchicalLedgerManager. But instead of using
> 2-level hierarchical znodes (2-4-4 split), we use 4-level hierarchical
> znode with 3-4-4-4-4 split. We didn't plan for backward compatibility for
> HierarchicalLedgerManager, since we started the cluster with
> LongHierarchicalLedgerManager, we were ok with it.
>
> Thanks,
> Charan
>
> On Wed, Oct 26, 2016 at 3:49 PM, Matteo Merli <mm...@apache.org> wrote:
>
> > On Wed, Oct 26, 2016 at 11:45 AM Venkateswara Rao Jujjuri <
> > jujjuri@gmail.com>
> > wrote:
> >
> > > - Ledgers are unique across multiple clusters. Useful if storage tiers
> > with
> > > different stores are employed.
> > >
> >
> > For this you could combine the ledgerId with another 64bit id, that could
> > encode the rest of the required informations ( storage tier, cluster, ..
> )
> >
> >
> > > - No centralized id creation - Allows client to give the name instead
> of
> > > server generating name on create.
> > >
> >
> > This should be already possible with your changes in 4.4, right?
> >
> > Wouldn't be enough to combine the 64bit ledgerId with an additional id
> that
> > doesn't need to flow through BK ?
> >
>

Re: [Discuss] 64 bits ledger id

Posted by Charan Reddy G <re...@gmail.com>.
*Can anyone from your team describe how did you guys extend the
ledger manager? I am interested in how did you guys handle backward
compatibility for Hierarchical Ledger Manager.*

We created LongHierarchicalLedgerManager, which works for *positive
long* ledgerids (so technically only 63 bits for the ledgerid).
LongHierarchicalLedgerManager extends HierarchicalLedgerManager, and
its logic is similar. But instead of using
2-level hierarchical znodes (2-4-4 split), we use 4-level hierarchical
znodes with a 3-4-4-4-4 split. We didn't plan for backward compatibility with
HierarchicalLedgerManager; since we started the cluster with
LongHierarchicalLedgerManager, we were OK with that.
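
For illustration, the 3-4-4-4-4 split of a non-negative long might look like the sketch below. It assumes the id is zero-padded to 19 digits before grouping; the class and method names are hypothetical and this is not the contributed implementation.

```java
import java.util.Locale;

class LongSplitSketch {
    // Pads a non-negative id to 19 digits and groups it as 3-4-4-4-4.
    static String longPath(long ledgerId) {
        if (ledgerId < 0) {
            throw new IllegalArgumentException("only non-negative ids are supported");
        }
        String s = String.format(Locale.ROOT, "%019d", ledgerId);
        StringBuilder path = new StringBuilder(s.substring(0, 3));
        for (int i = 3; i < 19; i += 4) {
            path.append('/').append(s, i, i + 4);
        }
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(longPath(42L));
        System.out.println(longPath(Long.MAX_VALUE));
    }
}
```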

Thanks,
Charan

On Wed, Oct 26, 2016 at 3:49 PM, Matteo Merli <mm...@apache.org> wrote:

> On Wed, Oct 26, 2016 at 11:45 AM Venkateswara Rao Jujjuri <
> jujjuri@gmail.com>
> wrote:
>
> > - Ledgers are unique across multiple clusters. Useful if storage tiers
> with
> > different stores are employed.
> >
>
> For this you could combine the ledgerId with another 64bit id, that could
> encode the rest of the required informations ( storage tier, cluster, .. )
>
>
> > - No centralized id creation - Allows client to give the name instead of
> > server generating name on create.
> >
>
> This should be already possible with your changes in 4.4, right?
>
> Wouldn't be enough to combine the 64bit ledgerId with an additional id that
> doesn't need to flow through BK ?
>

Re: [Discuss] 64 bits ledger id

Posted by Matteo Merli <mm...@apache.org>.
On Wed, Oct 26, 2016 at 11:45 AM Venkateswara Rao Jujjuri <ju...@gmail.com>
wrote:

> - Ledgers are unique across multiple clusters. Useful if storage tiers with
> different stores are employed.
>

For this you could combine the ledgerId with another 64-bit id that could
encode the rest of the required information (storage tier, cluster, ...).


> - No centralized id creation - Allows client to give the name instead of
> server generating name on create.
>

This should already be possible with your changes in 4.4, right?

Wouldn't it be enough to combine the 64-bit ledgerId with an additional id that
doesn't need to flow through BK?
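
One way to picture such a combined identifier is the sketch below. The bit layout (16 bits of storage tier, 48 bits of cluster) and all names are assumptions made up for illustration; the thread does not specify an encoding.

```java
final class CompositeLedgerIdSketch {
    private static final long CLUSTER_MASK = 0xFFFFFFFFFFFFL; // lower 48 bits

    private final long scope;    // hypothetical: tier in bits 48..63, cluster in bits 0..47
    private final long ledgerId; // the 64-bit id that BookKeeper itself sees

    private CompositeLedgerIdSketch(long scope, long ledgerId) {
        this.scope = scope;
        this.ledgerId = ledgerId;
    }

    static CompositeLedgerIdSketch of(int tier, long cluster, long ledgerId) {
        if (tier < 0 || tier > 0xFFFF) {
            throw new IllegalArgumentException("tier must fit in 16 bits");
        }
        if (cluster < 0 || cluster > CLUSTER_MASK) {
            throw new IllegalArgumentException("cluster must fit in 48 bits");
        }
        return new CompositeLedgerIdSketch(((long) tier << 48) | cluster, ledgerId);
    }

    int tier() { return (int) (scope >>> 48); }
    long cluster() { return scope & CLUSTER_MASK; }
    long ledgerId() { return ledgerId; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CompositeLedgerIdSketch)) {
            return false;
        }
        CompositeLedgerIdSketch other = (CompositeLedgerIdSketch) o;
        return scope == other.scope && ledgerId == other.ledgerId;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(scope) + Long.hashCode(ledgerId);
    }

    public static void main(String[] args) {
        CompositeLedgerIdSketch id = of(3, 17L, 42L);
        System.out.println(id.tier() + " " + id.cluster() + " " + id.ledgerId());
    }
}
```

Only ledgerId() would be passed to BookKeeper; the scope part stays with the application.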

Re: [Discuss] 64 bits ledger id

Posted by Venkateswara Rao Jujjuri <ju...@gmail.com>.
Right, we can still detect and handle collisions either way, and with both
methods (seq-long or uuid) a collision is extremely unlikely.

But as a general practice, a lot of software is built on the
assumption that a UUID will never have a collision.

- Ledgers are unique across multiple clusters. Useful if storage tiers with
different stores are employed.
- No centralized id creation - Allows client to give the name instead of
server generating name on create.



On Wed, Oct 26, 2016 at 9:40 AM, Flavio Junqueira <fp...@apache.org> wrote:

> Agreed that the probability of collision is low, but it exists and there
> is no way to check if it kicks in if we don't have that information
> centralized. For wraparounds, we at least have a way of detecting that we
> reached the maximum value, but granted that if we reach the maximum value,
> then we may have duplicates as we reset the counter.
>
> Assuming that we still keep ledger metadata in the metadata service and it
> is keyed by ledger id, we can check whether the ledger id is taken upon
> writing the ledger metadata, can't we?
>
> -Flavio
>
> > On 26 Oct 2016, at 16:41, Venkateswara Rao Jujjuri <ju...@gmail.com>
> wrote:
> >
> > Flavio,
> >
> > In theory you are right. But in practice it is virtually unique.
> > RFC 4122 (https://www.ietf.org/rfc/rfc4122.txt) talks about multiple
> ways
> > to generate virtually unique ID.
> > Even in the sequence-id we have a theoretical chance of wrapping. So if
> we
> > are worried about duplicates,
> > we need a way to handle it. But in both cases (64 bit long or 128 bit
> uuid)
> > collisions are impractical.
> >
> > Also I am not advocating to get rid of metadata server completely, Just
> > saying for this purpose we don't need to.
> > Thanks,
> > JV
> >
> >
> > On Wed, Oct 26, 2016 at 3:43 AM, Flavio Junqueira <fp...@apache.org>
> wrote:
> >
> >> Hi JV,
> >>
> >> Do I understand correctly that you're proposing to generate 128-bit
> UUIDs
> >> locally to use as ledger ids? If so, I'm concerned that even if the
> >> probability of collision is low, there is still a chance of having
> >> collisions with no way to verify in the case you aren't relying on a
> >> metadata server.
> >>
> >> I feel that I'm missing something here because you can't completely get
> >> rid of the metadata server without getting in trouble or at least
> without
> >> deeper changes. I'd appreciate if you could clarify, please.
> >>
> >> -Flavio
> >>
> >>> On 26 Oct 2016, at 00:39, Venkateswara Rao Jujjuri <ju...@gmail.com>
> >> wrote:
> >>>
> >>> We are using 64 bit ledger ids internally right now, but the ledger id
> is
> >>> supported by the application/caller.
> >>> We have extended Hierarchical ledger manger to Long hierarchical ledger
> >>> manger for this.
> >>>
> >>> Ultimately we would like to move to 128 bit UUID as the ledger id. That
> >>> makes ledgers unique without
> >>> the need of centralized ZK/metadata server.
> >>>
> >>> On Tue, Oct 25, 2016 at 4:11 PM, Sijie Guo <si...@apache.org> wrote:
> >>>
> >>>> *Problem: *
> >>>>
> >>>> Currently the ledger id is long, which it should be 64-bits. However
> >>>> currently bookkeeper only can generate 32-bits ledger id as
> zookeeper's
> >>>> sequence znode only produce 32-bits.
> >>>>
> >>>> This problem was basically raised before at BOOKKEEPER-421. Jiannan
> has
> >>>> already done fair amount of work on this and there were several
> patches
> >> for
> >>>> it.
> >>>>
> >>>> This email thread is to start the discussion for 64-bits ledger id
> >> support
> >>>> in bookkeeper.
> >>>>
> >>>> *Discuss*:
> >>>>
> >>>> Based on bookkeeper-421, the changes will relatively happen in
> following
> >>>> places. Assume the metadata store is ZooKeeper.
> >>>>
> >>>>
> >>>>  1. How to generate 64-bits ledger id? (64 Bits Ledger ID Generation
> >>>>  <https://issues.apache.org/jira/browse/BOOKKEEPER-552>)
> >>>>  2. How to store the 64-bits ledger id in zookeeper? (New
> LedgerManager
> >>>>  for 64 Bits Ledger ID Management in ZooKeeper
> >>>>  <https://issues.apache.org/jira/browse/BOOKKEEPER-553>)
> >>>>  3. How can the garbage collect handle correctly with 64-bits ledger
> >> id?
> >>>> (
> >>>>  https://issues.apache.org/jira/browse/BOOKKEEPER-553?
> >>>> focusedCommentId=13558192&page=com.atlassian.jira.
> >>>> plugin.system.issuetabpanels:comment-tabpanel#comment-13558192
> >>>>  )
> >>>>  4. How can we upgrade current HierarchicalLedgerManager to support
> >>>>  64-bits. [??]
> >>>>
> >>>> Feel free to take a look at those tickets and make any proposals.
> >>>>
> >>>> - Sijie
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Jvrao
> >>> ---
> >>> First they ignore you, then they laugh at you, then they fight you,
> then
> >>> you win. - Mahatma Gandhi
> >>
> >>
> >
> >
> > --
> > Jvrao
> > ---
> > First they ignore you, then they laugh at you, then they fight you, then
> > you win. - Mahatma Gandhi
>
>


-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Re: [Discuss] 64 bits ledger id

Posted by Flavio Junqueira <fp...@apache.org>.
Agreed that the probability of collision is low, but it exists and there is no way to check if it kicks in if we don't have that information centralized. For wraparounds, we at least have a way of detecting that we reached the maximum value, but granted that if we reach the maximum value, then we may have duplicates as we reset the counter.

Assuming that we still keep ledger metadata in the metadata service and it is keyed by ledger id, we can check whether the ledger id is taken upon writing the ledger metadata, can't we?
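
A minimal sketch of that check, using an in-memory map as a stand-in for the metadata service. The assumption is that the real store offers an atomic create-if-absent operation keyed by ledger id; the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class LedgerIdReservationSketch {
    // Stand-in for the metadata service: ledger id -> serialized metadata.
    private final ConcurrentMap<Long, byte[]> store = new ConcurrentHashMap<>();

    // Atomically writes the metadata only if the id is not taken yet.
    // Returns false when another ledger already owns this id.
    boolean tryCreate(long ledgerId, byte[] metadata) {
        return store.putIfAbsent(ledgerId, metadata) == null;
    }

    public static void main(String[] args) {
        LedgerIdReservationSketch metadataService = new LedgerIdReservationSketch();
        System.out.println(metadataService.tryCreate(7L, new byte[0])); // first writer
        System.out.println(metadataService.tryCreate(7L, new byte[0])); // duplicate id
    }
}
```

A client that gets false back would pick another id and retry, regardless of how the id was generated.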

-Flavio 

> On 26 Oct 2016, at 16:41, Venkateswara Rao Jujjuri <ju...@gmail.com> wrote:
> 
> Flavio,
> 
> In theory you are right. But in practice it is virtually unique.
> RFC 4122 (https://www.ietf.org/rfc/rfc4122.txt) talks about multiple ways
> to generate virtually unique ID.
> Even in the sequence-id we have a theoretical chance of wrapping. So if we
> are worried about duplicates,
> we need a way to handle it. But in both cases (64 bit long or 128 bit uuid)
> collisions are impractical.
> 
> Also I am not advocating to get rid of metadata server completely, Just
> saying for this purpose we don't need to.
> Thanks,
> JV
> 
> 
> On Wed, Oct 26, 2016 at 3:43 AM, Flavio Junqueira <fp...@apache.org> wrote:
> 
>> Hi JV,
>> 
>> Do I understand correctly that you're proposing to generate 128-bit UUIDs
>> locally to use as ledger ids? If so, I'm concerned that even if the
>> probability of collision is low, there is still a chance of having
>> collisions with no way to verify in the case you aren't relying on a
>> metadata server.
>> 
>> I feel that I'm missing something here because you can't completely get
>> rid of the metadata server without getting in trouble or at least without
>> deeper changes. I'd appreciate if you could clarify, please.
>> 
>> -Flavio
>> 
>>> On 26 Oct 2016, at 00:39, Venkateswara Rao Jujjuri <ju...@gmail.com>
>> wrote:
>>> 
>>> We are using 64 bit ledger ids internally right now, but the ledger id is
>>> supported by the application/caller.
>>> We have extended Hierarchical ledger manger to Long hierarchical ledger
>>> manger for this.
>>> 
>>> Ultimately we would like to move to 128 bit UUID as the ledger id. That
>>> makes ledgers unique without
>>> the need of centralized ZK/metadata server.
>>> 
>>> On Tue, Oct 25, 2016 at 4:11 PM, Sijie Guo <si...@apache.org> wrote:
>>> 
>>>> *Problem: *
>>>> 
>>>> Currently the ledger id is long, which it should be 64-bits. However
>>>> currently bookkeeper only can generate 32-bits ledger id as zookeeper's
>>>> sequence znode only produce 32-bits.
>>>> 
>>>> This problem was basically raised before at BOOKKEEPER-421. Jiannan has
>>>> already done fair amount of work on this and there were several patches
>> for
>>>> it.
>>>> 
>>>> This email thread is to start the discussion for 64-bits ledger id
>> support
>>>> in bookkeeper.
>>>> 
>>>> *Discuss*:
>>>> 
>>>> Based on bookkeeper-421, the changes will relatively happen in following
>>>> places. Assume the metadata store is ZooKeeper.
>>>> 
>>>> 
>>>>  1. How to generate 64-bits ledger id? (64 Bits Ledger ID Generation
>>>>  <https://issues.apache.org/jira/browse/BOOKKEEPER-552>)
>>>>  2. How to store the 64-bits ledger id in zookeeper? (New LedgerManager
>>>>  for 64 Bits Ledger ID Management in ZooKeeper
>>>>  <https://issues.apache.org/jira/browse/BOOKKEEPER-553>)
>>>>  3. How can the garbage collect handle correctly with 64-bits ledger
>> id?
>>>> (
>>>>  https://issues.apache.org/jira/browse/BOOKKEEPER-553?
>>>> focusedCommentId=13558192&page=com.atlassian.jira.
>>>> plugin.system.issuetabpanels:comment-tabpanel#comment-13558192
>>>>  )
>>>>  4. How can we upgrade current HierarchicalLedgerManager to support
>>>>  64-bits. [??]
>>>> 
>>>> Feel free to take a look at those tickets and make any proposals.
>>>> 
>>>> - Sijie
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Jvrao
>>> ---
>>> First they ignore you, then they laugh at you, then they fight you, then
>>> you win. - Mahatma Gandhi
>> 
>> 
> 
> 
> -- 
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi


Re: [Discuss] 64 bits ledger id

Posted by Venkateswara Rao Jujjuri <ju...@gmail.com>.
Flavio,

In theory you are right, but in practice it is virtually unique.
RFC 4122 (https://www.ietf.org/rfc/rfc4122.txt) describes multiple ways
to generate a virtually unique ID.
Even with the sequence id we have a theoretical chance of wrapping. So if we
are worried about duplicates, we need a way to handle them. But in both cases
(64-bit long or 128-bit UUID) collisions are extremely unlikely in practice.
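
As a small illustration of what a locally generated 128-bit id looks like in Java, the sketch below uses java.util.UUID and splits the value into two longs. How such an id would be carried through BookKeeper is not decided in this thread; the class and helper names are hypothetical.

```java
import java.util.UUID;

class UuidLedgerIdSketch {
    // Splits a UUID into its two 64-bit halves (most significant first).
    static long[] toLongs(UUID id) {
        return new long[] { id.getMostSignificantBits(), id.getLeastSignificantBits() };
    }

    // Rebuilds the UUID from the two halves.
    static UUID fromLongs(long msb, long lsb) {
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID(); // random (version 4) UUID, generated locally
        long[] halves = toLongs(id);
        System.out.println(id + " -> " + halves[0] + ", " + halves[1]);
        System.out.println("round trip ok: " + id.equals(fromLongs(halves[0], halves[1])));
    }
}
```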

Also, I am not advocating getting rid of the metadata server completely; I am
just saying that for this purpose we don't need it.
Thanks,
JV


On Wed, Oct 26, 2016 at 3:43 AM, Flavio Junqueira <fp...@apache.org> wrote:

> Hi JV,
>
> Do I understand correctly that you're proposing to generate 128-bit UUIDs
> locally to use as ledger ids? If so, I'm concerned that even if the
> probability of collision is low, there is still a chance of having
> collisions with no way to verify in the case you aren't relying on a
> metadata server.
>
> I feel that I'm missing something here because you can't completely get
> rid of the metadata server without getting in trouble or at least without
> deeper changes. I'd appreciate if you could clarify, please.
>
> -Flavio
>
> > On 26 Oct 2016, at 00:39, Venkateswara Rao Jujjuri <ju...@gmail.com>
> wrote:
> >
> > We are using 64 bit ledger ids internally right now, but the ledger id is
> > supported by the application/caller.
> > We have extended Hierarchical ledger manger to Long hierarchical ledger
> > manger for this.
> >
> > Ultimately we would like to move to 128 bit UUID as the ledger id. That
> > makes ledgers unique without
> > the need of centralized ZK/metadata server.
> >
> > On Tue, Oct 25, 2016 at 4:11 PM, Sijie Guo <si...@apache.org> wrote:
> >
> >> *Problem: *
> >>
> >> Currently the ledger id is long, which it should be 64-bits. However
> >> currently bookkeeper only can generate 32-bits ledger id as zookeeper's
> >> sequence znode only produce 32-bits.
> >>
> >> This problem was basically raised before at BOOKKEEPER-421. Jiannan has
> >> already done fair amount of work on this and there were several patches
> for
> >> it.
> >>
> >> This email thread is to start the discussion for 64-bits ledger id
> support
> >> in bookkeeper.
> >>
> >> *Discuss*:
> >>
> >> Based on bookkeeper-421, the changes will relatively happen in following
> >> places. Assume the metadata store is ZooKeeper.
> >>
> >>
> >>   1. How to generate 64-bits ledger id? (64 Bits Ledger ID Generation
> >>   <https://issues.apache.org/jira/browse/BOOKKEEPER-552>)
> >>   2. How to store the 64-bits ledger id in zookeeper? (New LedgerManager
> >>   for 64 Bits Ledger ID Management in ZooKeeper
> >>   <https://issues.apache.org/jira/browse/BOOKKEEPER-553>)
> >>   3. How can the garbage collect handle correctly with 64-bits ledger
> id?
> >> (
> >>   https://issues.apache.org/jira/browse/BOOKKEEPER-553?
> >> focusedCommentId=13558192&page=com.atlassian.jira.
> >> plugin.system.issuetabpanels:comment-tabpanel#comment-13558192
> >>   )
> >>   4. How can we upgrade current HierarchicalLedgerManager to support
> >>   64-bits. [??]
> >>
> >> Feel free to take a look at those tickets and make any proposals.
> >>
> >> - Sijie
> >>
> >
> >
> >
> > --
> > Jvrao
> > ---
> > First they ignore you, then they laugh at you, then they fight you, then
> > you win. - Mahatma Gandhi
>
>


-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Re: [Discuss] 64 bits ledger id

Posted by Flavio Junqueira <fp...@apache.org>.
Hi JV,

Do I understand correctly that you're proposing to generate 128-bit UUIDs locally to use as ledger ids? If so, I'm concerned that even if the probability of collision is low, there is still a chance of collisions, and no way to detect them if you aren't relying on a metadata server.

I feel that I'm missing something here, because you can't completely get rid of the metadata server without getting into trouble, or at least without deeper changes. I'd appreciate it if you could clarify.

-Flavio
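
To put a rough number on "the probability of collision is low", the following is a minimal sketch, not something from the thread. It assumes the UUIDs are random (version 4), which carry 122 random bits, and uses the standard birthday-bound approximation p ~= n^2 / 2^(bits+1). The class and method names are made up for illustration. It only estimates the odds; it does nothing about the detection problem raised above.

```java
final class UuidCollisionEstimate {
    // Assumption: version 4 (random) UUIDs, where 6 of the 128 bits are
    // fixed by the version and variant fields, leaving 122 random bits.
    static final int RANDOM_BITS = 122;

    /**
     * Birthday-bound approximation of the probability that at least two of
     * `ledgers` randomly generated ids collide. Only meaningful while the
     * result is much smaller than 1.
     */
    static double collisionProbability(double ledgers, int randomBits) {
        return (ledgers * ledgers) / Math.pow(2.0, randomBits + 1);
    }

    public static void main(String[] args) {
        double[] counts = {1e6, 1e9, 1e12};
        for (double n : counts) {
            System.out.printf("%.0e ledgers -> p ~ %.3e%n",
                    n, collisionProbability(n, RANDOM_BITS));
        }
    }
}
```

Even for very large ledger counts the estimate stays tiny, but it is never zero, and without a shared metadata store there is nothing to notice a collision when one does happen.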
 
> On 26 Oct 2016, at 00:39, Venkateswara Rao Jujjuri <ju...@gmail.com> wrote:
> 
> We are using 64 bit ledger ids internally right now, but the ledger id is
> supported by the application/caller.
> We have extended Hierarchical ledger manager to Long hierarchical ledger
> manager for this.
> 
> Ultimately we would like to move to 128 bit UUID as the ledger id. That
> makes ledgers unique without
> the need of centralized ZK/metadata server.
> 
> On Tue, Oct 25, 2016 at 4:11 PM, Sijie Guo <si...@apache.org> wrote:
> 
>> *Problem: *
>> 
>> Currently the ledger id is long, which it should be 64-bits. However
>> currently bookkeeper only can generate 32-bits ledger id as zookeeper's
>> sequence znode only produce 32-bits.
>> 
>> This problem was basically raised before at BOOKKEEPER-421. Jiannan has
>> already done fair amount of work on this and there were several patches for
>> it.
>> 
>> This email thread is to start the discussion for 64-bits ledger id support
>> in bookkeeper.
>> 
>> *Discuss*:
>> 
>> Based on bookkeeper-421, the changes will relatively happen in following
>> places. Assume the metadata store is ZooKeeper.
>> 
>> 
>>   1. How to generate 64-bits ledger id? (64 Bits Ledger ID Generation
>>   <https://issues.apache.org/jira/browse/BOOKKEEPER-552>)
>>   2. How to store the 64-bits ledger id in zookeeper? (New LedgerManager
>>   for 64 Bits Ledger ID Management in ZooKeeper
>>   <https://issues.apache.org/jira/browse/BOOKKEEPER-553>)
>>   3. How can the garbage collect handle correctly with 64-bits ledger id? (
>>   https://issues.apache.org/jira/browse/BOOKKEEPER-553?focusedCommentId=13558192&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13558192
>>   )
>>   4. How can we upgrade current HierarchicalLedgerManager to support
>>   64-bits. [??]
>> 
>> Feel free to take a look at those tickets and make any proposals.
>> 
>> - Sijie
>> 
> 
> 
> 
> -- 
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi


Re: [Discuss] 64 bits ledger id

Posted by Venkateswara Rao Jujjuri <ju...@gmail.com>.
We are using 64-bit ledger ids internally right now, but the ledger id is
supplied by the application/caller.
We have extended the hierarchical ledger manager into a long hierarchical
ledger manager for this.
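
As an illustration of what a "long" hierarchical layout could look like, here is a minimal sketch. It is an assumption for discussion, not the layout described in this message: the 3/4/4/4/4 digit split, the "L" prefix on the leaf, and the class and method names are all hypothetical. The idea is to zero-pad the 64-bit id to 19 decimal digits (the width of Long.MAX_VALUE) and spread it over several znode levels so that no single znode gets more than 10^4 children.

```java
final class LongLedgerPathSketch {
    // Hypothetical level widths: 3 + 4 + 4 + 4 + 4 = 19 decimal digits,
    // enough for any non-negative Java long.
    private static final int[] LEVEL_WIDTHS = {3, 4, 4, 4, 4};

    /** Builds a znode path for a non-negative 64-bit ledger id. */
    static String ledgerPath(String root, long ledgerId) {
        if (ledgerId < 0) {
            throw new IllegalArgumentException("negative ledger id: " + ledgerId);
        }
        String digits = String.format("%019d", ledgerId);
        StringBuilder path = new StringBuilder(root);
        int pos = 0;
        for (int i = 0; i < LEVEL_WIDTHS.length; i++) {
            path.append('/');
            if (i == LEVEL_WIDTHS.length - 1) {
                path.append('L'); // mark the leaf znode that holds the metadata
            }
            path.append(digits, pos, pos + LEVEL_WIDTHS[i]);
            pos += LEVEL_WIDTHS[i];
        }
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(ledgerPath("/ledgers", 1L));
        System.out.println(ledgerPath("/ledgers", Long.MAX_VALUE));
    }
}
```

A layout like this differs from a path built for 32-bit ids, so existing ledgers would need either a migration step or a manager that can read both layouts.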

Ultimately we would like to move to a 128-bit UUID as the ledger id. That
makes ledgers unique without the need for a centralized ZK/metadata server.
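
Since the ledger id is a Java long today, a 128-bit id would not fit in the existing type. A minimal sketch of how such an id could be generated locally and carried as two 64-bit halves, assuming random (version 4) UUIDs from java.util.UUID; the class and method names are hypothetical and nothing here reflects an existing BookKeeper API:

```java
import java.util.UUID;

final class UuidLedgerIdSketch {
    /** Splits a UUID into its high and low 64-bit halves. */
    static long[] toHalves(UUID id) {
        return new long[] {id.getMostSignificantBits(), id.getLeastSignificantBits()};
    }

    /** Rebuilds the UUID from the two halves produced by toHalves. */
    static UUID fromHalves(long high, long low) {
        return new UUID(high, low);
    }

    public static void main(String[] args) {
        // Generated locally, with no call to ZooKeeper or any other service.
        UUID ledgerId = UUID.randomUUID();
        long[] halves = toHalves(ledgerId);
        System.out.println(ledgerId + " -> " + halves[0] + ", " + halves[1]);
        System.out.println("round trip ok: "
                + fromHalves(halves[0], halves[1]).equals(ledgerId));
    }
}
```

Any interface or on-disk format that currently carries the id as a single long would need to change to carry both halves.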

On Tue, Oct 25, 2016 at 4:11 PM, Sijie Guo <si...@apache.org> wrote:

> *Problem: *
>
> Currently the ledger id is long, which it should be 64-bits. However
> currently bookkeeper only can generate 32-bits ledger id as zookeeper's
> sequence znode only produce 32-bits.
>
> This problem was basically raised before at BOOKKEEPER-421. Jiannan has
> already done fair amount of work on this and there were several patches for
> it.
>
> This email thread is to start the discussion for 64-bits ledger id support
> in bookkeeper.
>
> *Discuss*:
>
> Based on bookkeeper-421, the changes will relatively happen in following
> places. Assume the metadata store is ZooKeeper.
>
>
>    1. How to generate 64-bits ledger id? (64 Bits Ledger ID Generation
>    <https://issues.apache.org/jira/browse/BOOKKEEPER-552>)
>    2. How to store the 64-bits ledger id in zookeeper? (New LedgerManager
>    for 64 Bits Ledger ID Management in ZooKeeper
>    <https://issues.apache.org/jira/browse/BOOKKEEPER-553>)
>    3. How can the garbage collect handle correctly with 64-bits ledger id? (
>    https://issues.apache.org/jira/browse/BOOKKEEPER-553?focusedCommentId=13558192&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13558192
>    )
>    4. How can we upgrade current HierarchicalLedgerManager to support
>    64-bits. [??]
>
> Feel free to take a look at those tickets and make any proposals.
>
> - Sijie
>



-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi