Posted to java-user@lucene.apache.org by Christopher Laux <ct...@googlemail.com> on 2010/03/08 19:18:42 UTC

memory management style

Hi all,

I'm not sure if this is the right list, as it's sort of a development
question too, but I don't want to bother them over there. Anyway, I'm
curious as to the reason for using "manual memory management" a la
ByteBlockPool and consorts in Java. Is it for performance reasons
alone, to avoid the allocation and garbage collection of many small
objects or is there some residue of C-style thinking in the early
years?

Even then, shouldn't there be a more Java-ish solution using the
existing streams classes? Would that be the way to go if one started
over? I realize this is not very realistic, I'm asking out of
curiosity.

Thanks,
Chris

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: memory management style

Posted by Christopher Laux <ct...@googlemail.com>.

On Mon, Mar 8, 2010 at 7:52 PM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> This was done for performance (to remove alloc/init/GC load).
>
> There are two parts to it -- first, consolidating what used to be lots
> of little objects into shared byte[]/int[] blocks.  Second, reusing
> those blocks.

Thanks, just one more question: does anyone know why these are
two-dimensional arrays? It seems more trouble than a one-dimensional
array and I don't see the benefit.

-Chris


Re: memory management style

Posted by Michael McCandless <lu...@mikemccandless.com>.

On Mon, Mar 8, 2010 at 1:18 PM, Christopher Laux <ct...@googlemail.com> wrote:

> I'm not sure if this is the right list, as it's sort of a development
> question too, but I don't want to bother them over there. Anyway, I'm
> curious as to the reason for using "manual memory management" a la
> ByteBlockPool and consorts in Java. Is it for performance reasons
> alone, to avoid the allocation and garbage collection of many small
> objects or is there some residue of C-style thinking in the early
> years?

This was done for performance (to remove alloc/init/GC load).

There are two parts to it -- first, consolidating what used to be lots
of little objects into shared byte[]/int[] blocks.  Second, reusing
those blocks.

I think the biggest perf gains were from the first (consolidating tiny
objs together), but we probably still have some gains from the second.

A simple test would be to change the pools to not re-use and then
measure indexing throughput.
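
As a rough illustration of the two parts described above (consolidating
many small objects into shared byte[] blocks, then reusing those blocks),
here is a minimal sketch. It is not Lucene's ByteBlockPool: the class name
SimpleBlockPool, the 32 KB block size, the reuse flag, and the allocations
counter are all assumptions made up for this example. The main method is
only a crude stand-in for the suggested experiment; a meaningful test would
measure real indexing throughput.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified pool for illustration; not Lucene's ByteBlockPool. */
final class SimpleBlockPool {
    static final int BLOCK_SIZE = 1 << 15; // assumed block size (32 KB)

    private final boolean reuse;                                // toggle for the experiment
    private final ArrayDeque<byte[]> free = new ArrayDeque<>(); // recycled blocks
    private final List<byte[]> blocks = new ArrayList<>();      // blocks currently in use
    private byte[] current;                                     // block being written to
    private int upto;                                           // next free position in current
    int allocations;                                            // times a new block was allocated

    SimpleBlockPool(boolean reuse) {
        this.reuse = reuse;
    }

    /** Copies data into the shared blocks and returns its global start offset. */
    int append(byte[] data) {
        if (data.length > BLOCK_SIZE) {
            throw new IllegalArgumentException("data larger than one block");
        }
        // Move to a fresh block when there is none yet or the data does not fit.
        if (current == null || upto + data.length > BLOCK_SIZE) {
            nextBlock();
        }
        int start = (blocks.size() - 1) * BLOCK_SIZE + upto;
        System.arraycopy(data, 0, current, upto, data.length);
        upto += data.length;
        return start;
    }

    /** Reads one previously written byte by its global offset. */
    byte byteAt(int offset) {
        return blocks.get(offset / BLOCK_SIZE)[offset % BLOCK_SIZE];
    }

    private void nextBlock() {
        // Take a recycled block if reuse is on; it may still hold stale bytes.
        byte[] block = reuse ? free.poll() : null;
        if (block == null) {
            block = new byte[BLOCK_SIZE];
            allocations++;
        }
        blocks.add(block);
        current = block;
        upto = 0;
    }

    /** Releases all blocks (e.g. after a flush); keeps them for later if reuse is on. */
    void reset() {
        if (reuse) {
            free.addAll(blocks);
        }
        blocks.clear();
        current = null;
        upto = 0;
    }

    public static void main(String[] args) {
        byte[] term = "lucene".getBytes(StandardCharsets.UTF_8);
        for (boolean reuse : new boolean[] {true, false}) {
            SimpleBlockPool pool = new SimpleBlockPool(reuse);
            long t0 = System.nanoTime();
            for (int flush = 0; flush < 200; flush++) {
                for (int i = 0; i < 100_000; i++) {
                    pool.append(term);
                }
                pool.reset();
            }
            long ms = (System.nanoTime() - t0) / 1_000_000;
            System.out.println("reuse=" + reuse
                + " allocations=" + pool.allocations + " ms=" + ms);
        }
    }
}
```

Every appended value lives inside a shared block and is addressed by an int
offset, so no per-value object is created. With reuse switched off, each
reset() drops the blocks and the next writes allocate fresh ones, which is
the kind of difference the test above would try to measure.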

> Even then, shouldn't there be a more Java-ish solution using the
> existing streams classes? Would that be the way to go if one started
> over? I realize this is not very realistic, I'm asking out of
> curiosity.

Actually that's how Lucene used to work, and then (in 2.3 I think) we
cut over to the current approach of writing to reused blocks in RAM.  If
we were to start over I don't think I'd change much over where we are
now, at least on this aspect of Lucene.  There are plenty of other
things I'd change ;)

But... one can always make a custom indexing chain (it's a
package-private API now, but possible) to do something totally
different.  E.g., I think a chain dedicated to inverting tiny docs could
show sizable gains over the default chain Lucene uses today.

Mike
