Posted to dev@bookkeeper.apache.org by Ivan Kelly <iv...@yahoo-inc.com> on 2011/07/13 18:30:55 UTC
Issue with addEntry api
I'm having an issue with the LedgerHandle#addEntry API.
[1] best illustrates it. I'm buffering namenode transactions in the stream and only transmitting when either flush is called or I have enough data to pass my threshold. This means I have a byte buffer in my class which I fill up as new transactions come in. When I transmit, I set this buffer as an entry to BookKeeper; that is, N whole namenode transactions will be contained in a single BookKeeper entry.
The problem is this byte buffer (a DataOutputBuffer in this case). I reuse the same buffer over and over, but it has a fixed size. If I transmit before it is full, the whole buffer is transmitted anyway, so a reused buffer will retransmit old transactions out of order. For example, on first use the buffer fills with [a,b,c,d,e], which is added as an entry, and the buffer is reset. Then transaction f is added and flushed; in this case [f,b,c,d,e] is transmitted, resending the stale b through e.
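To make the failure mode concrete, here is a minimal, self-contained sketch of the reused fixed-size buffer described above (the names are illustrative, not the actual BookKeeperEditLogOutputStream code):

```java
public class ReusedBufferDemo {
    public static void main(String[] args) {
        byte[] buf = new byte[5];

        // First flush: the buffer holds transactions a..e; all 5 bytes are valid.
        byte[] first = {'a', 'b', 'c', 'd', 'e'};
        System.arraycopy(first, 0, buf, 0, first.length);
        System.out.println(new String(buf)); // abcde -- correct entry

        // "Resetting" the buffer just rewinds the write position; it does not
        // clear the underlying array.
        int pos = 0;
        buf[pos++] = 'f'; // only one new transaction arrives before the next flush

        // An addEntry(byte[]) that sends the whole array leaks the stale bytes.
        System.out.println(new String(buf)); // fbcde -- old transactions resent
    }
}
```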
What I need is the ability to set offset and length in the byte[] passed to addEntry. Is there a reason this wasn't added in the initial implementation? If not, and if you agree this is a valid use case, I'll open a JIRA and add this functionality. I'm getting around this now by doing an extra Arrays.copyOf, which is less than ideal.
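A rough sketch of the Arrays.copyOf workaround next to the proposed overload (the Ledger interface below is a hypothetical stand-in for LedgerHandle, and the three-argument addEntry is the call being proposed, not an existing API):

```java
import java.util.Arrays;

public class AddEntryWorkaround {
    // Hypothetical stand-in for the client handle; only addEntry(byte[]) exists today.
    interface Ledger {
        void addEntry(byte[] data);
    }

    static void flush(Ledger lh, byte[] buf, int validLength) {
        // Current workaround: copy only the valid prefix so stale bytes from a
        // reused buffer never leave the process. Costs an extra copy per flush.
        lh.addEntry(Arrays.copyOf(buf, validLength));
        // With a proposed addEntry(byte[] data, int offset, int length) overload
        // the copy would disappear: lh.addEntry(buf, 0, validLength);
    }

    public static void main(String[] args) {
        byte[] buf = {'f', 'b', 'c', 'd', 'e'}; // reused buffer; only 'f' is fresh
        flush(entry -> System.out.println(new String(entry)), buf, 1); // prints f
    }
}
```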
-Ivan
[1] https://github.com/ivankelly/hadoop-common/blob/HDFS-1580-BK/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/bkjournal/BookKeeperEditLogOutputStream.java#L149
Re: Issue with addEntry api
Posted by Flavio Junqueira <fp...@yahoo-inc.com>.
Sounds good to me as well, so +1 for opening a JIRA and discussing how to do it.
I also wanted to point out that BOOKKEEPER-11 will be changing the
client API (adding new calls, not changing existing ones). Anyone
interested or with an opinion, please have a look.
-Flavio
Re: Issue with addEntry api
Posted by Ivan Kelly <iv...@yahoo-inc.com>.
I created BOOKKEEPER-33 to discuss this.
-Ivan
Re: Issue with addEntry api
Posted by Utkarsh Srivastava <ut...@twitter.com>.
Seems reasonable to me.
Is it worth using ByteBuffer instead of passing around offset and limit
separately?
Utkarsh
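One way to picture the ByteBuffer suggestion: the buffer's position and limit already carry the offset and length implicitly, so a ByteBuffer-taking addEntry would not need extra parameters. A minimal sketch (the API shape is hypothetical; only the java.nio behavior shown is standard):

```java
import java.nio.ByteBuffer;

public class ByteBufferEntryDemo {
    public static void main(String[] args) {
        byte[] buf = {'f', 'b', 'c', 'd', 'e'}; // reused buffer; only 'f' is new

        // Wrapping with offset 0 and length 1 exposes just the valid prefix;
        // a hypothetical addEntry(ByteBuffer) could read position..limit directly.
        ByteBuffer entry = ByteBuffer.wrap(buf, 0, 1);
        System.out.println(entry.remaining()); // 1 -- only the fresh byte is visible
    }
}
```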