Posted to common-dev@hadoop.apache.org by Kay Kay <ka...@gmail.com> on 2009/12/30 09:36:18 UTC
clearing o.a.h.io.Text
In o.a.h.io.Text, the clear method currently just resets length to 0
and does nothing about the bytes held internally.
Curious to know the reasoning behind the decision (letting the internal
bytes be reused for future appends vs. memory leaks from not
clearing them)? Thanks.
$ svn diff
Index: src/java/org/apache/hadoop/io/Text.java
===================================================================
--- src/java/org/apache/hadoop/io/Text.java (revision 894545)
+++ src/java/org/apache/hadoop/io/Text.java (working copy)
@@ -224,6 +224,7 @@
    */
   public void clear() {
     length = 0;
+    bytes = EMPTY_BYTES;
   }
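
To make the two behaviours under discussion concrete, here is a minimal sketch. SketchText is a hypothetical stand-in written for this illustration, not Hadoop's Text class; clearAndRelease() is an invented name for the variant the patch above proposes, and the growth policy in append() is an assumption.

```java
// Hypothetical stand-in for a growable, byte-backed string. NOT Hadoop's Text.
class SketchText {
    private static final byte[] EMPTY_BYTES = new byte[0];

    private byte[] bytes = EMPTY_BYTES; // backing array, may be longer than length
    private int length = 0;             // number of valid bytes in the array

    /** Appends len bytes from src starting at off, growing the array if needed. */
    void append(byte[] src, int off, int len) {
        if (length + len > bytes.length) {
            // Assumed growth policy: at least the required size, otherwise double.
            int newCap = Math.max(length + len, bytes.length * 2);
            byte[] grown = new byte[newCap];
            System.arraycopy(bytes, 0, grown, 0, length);
            bytes = grown;
        }
        System.arraycopy(src, off, bytes, length, len);
        length += len;
    }

    /** Resets to the empty string but keeps the backing array for reuse. */
    void clear() {
        length = 0;
    }

    /** Variant matching the proposed patch: also drops the backing array. */
    void clearAndRelease() {
        length = 0;
        bytes = EMPTY_BYTES;
    }

    int getLength() { return length; }

    int capacity() { return bytes.length; }

    public static void main(String[] args) {
        SketchText t = new SketchText();
        t.append(new byte[1000], 0, 1000);
        t.clear();
        System.out.println(t.getLength() + " " + t.capacity()); // 0 1000
        t.clearAndRelease();
        System.out.println(t.getLength() + " " + t.capacity()); // 0 0
    }
}
```

With clear(), a later append of up to 1000 bytes needs no new allocation, which is the reuse case; with clearAndRelease(), the old array becomes eligible for garbage collection but the next append has to allocate again.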
Re: clearing o.a.h.io.Text
Posted by Owen O'Malley <ow...@gmail.com>.
On Jan 1, 2010, at 2:39 AM, Kay Kay <ka...@gmail.com> wrote:
>
> I believe that behavior would be surprising to the user if they were
> expecting the object resources to be released entirely, by calling the
> clear() method.
I disagree. Clear only promises to reset to the empty string. It
doesn't imply freeing resources.
> Maybe clear() could reset the internal byte buffer, and another method,
> called reset() / rewind(), could be provided that reuses the existing
> internal buffer while resetting only the length variable.
Changing the semantics of Text methods is very difficult. Clear is exactly
the right verb for what it does. A patch that makes the Javadoc clear
would be appreciated.
Once we have setCapacity, a lot of these issues go away.
txt.setCapacity(0)
makes it very clear what your intent is.
-- Owen
Re: clearing o.a.h.io.Text
Posted by Kay Kay <ka...@gmail.com>.
On Thu, Dec 31, 2009 at 11:03 PM, Owen O'Malley <om...@apache.org> wrote:
>
> On Dec 30, 2009, at 12:36 AM, Kay Kay wrote:
>
>> In o.a.h.io.Text - the clear method currently just resets length to 0,
>> while not doing anything about the bytes internally.
>>
>> Curious to know the thoughts behind the decision (to let the internal
>> bytes to be reused for future appends vs. memory leaks due to not clearing
>> them ) ?
>>
>
> The byte array that backs up the Text object is always reused.
I believe that behavior would surprise users who expect calling the
clear() method to release the object's resources entirely.
Maybe clear() could reset the internal byte buffer, and another method,
called reset() / rewind(), could be provided that reuses the existing
internal buffer while resetting only the length variable.
> It might make sense to have a setCapacity method on Text that is similar to
> BytesWritable's. With such a method, it would be possible to shrink the size
> of the backing array.
>
>
HADOOP-6476 is in place for this.
> -- Owen
>
Re: clearing o.a.h.io.Text
Posted by Owen O'Malley <om...@apache.org>.
On Dec 30, 2009, at 12:36 AM, Kay Kay wrote:
> In o.a.h.io.Text - the clear method currently just resets length to
> 0, while not doing anything about the bytes internally.
>
> Curious to know the thoughts behind the decision (to let the
> internal bytes to be reused for future appends vs. memory leaks due
> to not clearing them ) ?
The byte array that backs the Text object is always reused. It
might make sense to have a setCapacity method on Text that is similar
to BytesWritable's. With such a method, it would be possible to shrink
the size of the backing array.
-- Owen
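
For illustration only, a setCapacity along the lines suggested above might look like the sketch below. CapacityText is a hypothetical class written for this example; it is not Hadoop's Text, and it is not the HADOOP-6476 patch. The truncate-on-shrink behaviour is an assumption modeled loosely on how BytesWritable handles capacity changes.

```java
// Hypothetical class for illustration. NOT Hadoop's Text or the HADOOP-6476 patch.
class CapacityText {
    private byte[] bytes = new byte[0]; // backing array
    private int length = 0;             // number of valid bytes

    /** Replaces the contents, growing the backing array only when it is too small. */
    void set(byte[] src) {
        if (src.length > bytes.length) {
            bytes = new byte[src.length];
        }
        System.arraycopy(src, 0, bytes, 0, src.length);
        length = src.length;
    }

    /**
     * Resizes the backing array to exactly newCapacity bytes.
     * Assumption: shrinking below the current length truncates the contents,
     * which could cut a multi-byte UTF-8 sequence in half.
     */
    void setCapacity(int newCapacity) {
        if (newCapacity < 0) {
            throw new IllegalArgumentException("negative capacity: " + newCapacity);
        }
        if (newCapacity == bytes.length) {
            return; // nothing to do
        }
        if (newCapacity < length) {
            length = newCapacity;
        }
        byte[] resized = new byte[newCapacity];
        System.arraycopy(bytes, 0, resized, 0, length);
        bytes = resized;
    }

    int getLength() { return length; }

    int capacity() { return bytes.length; }

    /** Returns a copy of the valid bytes only. */
    byte[] copyBytes() {
        byte[] out = new byte[length];
        System.arraycopy(bytes, 0, out, 0, length);
        return out;
    }

    public static void main(String[] args) {
        CapacityText t = new CapacityText();
        t.set(new byte[] {'h', 'e', 'l', 'l', 'o'});
        t.setCapacity(64);
        System.out.println(t.getLength() + " " + t.capacity()); // 5 64
        t.setCapacity(0); // explicit request to release the buffer
        System.out.println(t.getLength() + " " + t.capacity()); // 0 0
    }
}
```

Under this sketch, clear() can keep its "reset to empty" meaning while callers who want the memory back say so explicitly with setCapacity(0).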