Posted to dev@bookkeeper.apache.org by Ivan Kelly <iv...@yahoo-inc.com> on 2011/07/13 18:30:55 UTC

Issue with addEntry API

I'm having an issue with the LedgerHandle#addEntry API.

[1] best illustrates it. I'm buffering namenode transactions in the stream and only transmitting either when flush is called or when I have buffered enough data to pass my threshold. This means I have a byte buffer in my class which I fill up as new transactions come in. When I transmit, I write this buffer as an entry to BookKeeper, i.e. N whole namenode transactions are contained in a single BK entry.

The problem is this byte buffer (DataOutputBuffer in this case). I reuse the same buffer over and over, but the buffer has a fixed size. If I transmit before it is full, the whole buffer is transmitted anyway, so reusing the buffer retransmits old transactions out of order. For example, on first use the buffer fills with [a,b,c,d,e], which is added as an entry, and the byte buffer is reset. Then transaction f is added and flushed; in this case [f,b,c,d,e] is now transmitted, replaying the stale transactions b through e.
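To make the failure mode concrete, here is a minimal sketch. LedgerHandle#addEntry(byte[]) and Hadoop's DataOutputBuffer are the real APIs; the single-byte "transactions", the buffer size, and the handle setup are illustrative assumptions only:

    import org.apache.bookkeeper.client.LedgerHandle;
    import org.apache.hadoop.io.DataOutputBuffer;

    class StaleBufferSketch {
        // Assumes lh is an already-opened LedgerHandle; setup is omitted.
        static void flushTwice(LedgerHandle lh) throws Exception {
            DataOutputBuffer buf = new DataOutputBuffer(5);  // sized to the flush threshold
            buf.write(new byte[] {'a', 'b', 'c', 'd', 'e'}); // five txns fill the buffer
            lh.addEntry(buf.getData()); // entry = [a,b,c,d,e], as intended
            buf.reset();                // rewinds the count; the array still holds b..e

            buf.write('f');             // one new txn arrives, then flush is called
            lh.addEntry(buf.getData()); // getData() returns the whole backing array,
                                        // so this entry is [f,b,c,d,e]: b..e replayed
        }
    }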

What I need is the ability to set offset and length for the byte[] passed to addEntry. Is there a reason this wasn't added in the initial implementation? If not, and if you agree this is a valid use case, I'll open a JIRA and add this functionality. I'm getting around it for now with an extra Arrays.copyOf, which is less than ideal.
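For reference, the workaround and the proposal side by side, reusing buf and lh from the sketch above. Arrays.copyOf (java.util.Arrays) and the one-argument addEntry exist today; the three-argument form is only the proposed signature, not existing API:

    // Today: copy the valid prefix so only it is transmitted (extra allocation + copy).
    lh.addEntry(Arrays.copyOf(buf.getData(), buf.getLength()));

    // Proposed (hypothetical signature, the subject of the JIRA):
    //     long addEntry(byte[] data, int offset, int length)
    lh.addEntry(buf.getData(), 0, buf.getLength());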

-Ivan



[1] https://github.com/ivankelly/hadoop-common/blob/HDFS-1580-BK/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/bkjournal/BookKeeperEditLogOutputStream.java#L149

Re: Issue with addEntry API

Posted by Flavio Junqueira <fp...@yahoo-inc.com>.
Sounds good to me as well, so +1 for opening a JIRA and discussing how to do it.

I also wanted to point out that BOOKKEEPER-11 will be changing the client API (adding new calls, not changing existing ones). Anyone interested or with an opinion, please have a look.

-Flavio

On Jul 13, 2011, at 11:34 PM, Utkarsh Srivastava wrote:

> Seems reasonable to me.
>
> Is it worth using ByteBuffer instead of passing around offset and limit
> separately?
>
> Utkarsh
>
> On Wed, Jul 13, 2011 at 9:30 AM, Ivan Kelly <iv...@yahoo-inc.com> wrote:
>
>> [...]

flavio junqueira

research scientist

fpj@yahoo-inc.com
direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301




Re: Issue with addEntry API

Posted by Ivan Kelly <iv...@yahoo-inc.com>.
I created BOOKKEEPER-33 to discuss this.

-Ivan

On 13 Jul 2011, at 23:34, Utkarsh Srivastava wrote:

> Seems reasonable to me.
> 
> Is it worth using ByteBuffer instead of passing around offset and limit
> separately?
> 
> Utkarsh
> 
> On Wed, Jul 13, 2011 at 9:30 AM, Ivan Kelly <iv...@yahoo-inc.com> wrote:
> 
>> [...]


Re: Issue with addEntry API

Posted by Utkarsh Srivastava <ut...@twitter.com>.
Seems reasonable to me.

Is it worth using ByteBuffer instead of passing around offset and limit
separately?
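
To illustrate the alternative: java.nio.ByteBuffer can wrap a region of a reused byte[] without copying, so a single ByteBuffer argument carries the offset and limit implicitly. Both signatures below are hypothetical, and buf and lh are the buffer and handle from Ivan's example:

    // Proposed:    long addEntry(byte[] data, int offset, int length)
    // Alternative: long addEntry(ByteBuffer data)
    // A caller would then wrap the valid region, with no extra copy:
    ByteBuffer entry = ByteBuffer.wrap(buf.getData(), 0, buf.getLength());
    lh.addEntry(entry);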

Utkarsh

On Wed, Jul 13, 2011 at 9:30 AM, Ivan Kelly <iv...@yahoo-inc.com> wrote:

> [...]