Posted to java-user@lucene.apache.org by Arun Kumar K <ar...@gmail.com> on 2013/04/01 13:56:47 UTC

"4.1 consuming more memory than 3.0.2 while Indexing"

Hi Guys,

I have been measuring the heap space required for indexing and
searching with 3.0.2 vs. 4.1 (with BlockPostings Format).

I have a 2GB index with 1 million docs and around 42 fields, 40 of
which are random strings.

I have seen that memory for search has dropped by 5X with 4.1 (with
BlockPostings Format), but memory usage during indexing with 4.1 is
around 800MB~1.7GB, whereas for 3.0.2 it is 300~600MB.
Indexing time is almost the same with both versions.


So 4.1 requires 2~3X more memory than 3.0.2 for indexing.

What could be the reasons for this?
Did I go wrong somewhere? Can somebody explain?

Arun

Re: "4.1 consuming more memory than 3.0.2 while Indexing"

Posted by Arun Kumar K <ar...@gmail.com>.
Hi Simon,

I have used the default RAM settings in the index writer config for both
versions.
I have used only 1 thread for indexing.
My system's MaxHeapSize is 2GB and MaxPermSize is 175MB.


Arun
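
A note on the numbers above: MaxHeapSize only caps the Java heap, so it
does not by itself say how much heap an indexing run actually used. One
way to get a heap-only figure that can be compared between 3.0.2 and 4.1
is to read the peak usage of the JVM's heap memory pools around the run.
What follows is a minimal sketch using only the JDK. The class name
HeapPeak and its helper methods are hypothetical, and the allocation loop
is just a stand-in for the real indexing loop.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Lucene: reports heap-only figures from the JVM.
final class HeapPeak {

    // Upper limit the JVM will try to use for the heap (roughly -Xmx / MaxHeapSize).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Reset the recorded peak of every heap pool; call this just before indexing starts.
    static void resetPeaks() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                pool.resetPeakUsage();
            }
        }
    }

    // Sum of the per-pool peaks. Pools can peak at different moments, so this is
    // an upper bound on the true simultaneous peak, not an exact value.
    static long peakHeapBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage peak = pool.getPeakUsage();
            if (peak != null) {
                total += peak.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        resetPeaks();

        // Stand-in for the indexing loop: keep 32 MB of byte arrays alive.
        byte[][] chunks = new byte[32][];
        for (int i = 0; i < chunks.length; i++) {
            chunks[i] = new byte[1 << 20];
        }

        System.out.printf("max heap: %d MB, peak heap (upper bound): %d MB, chunks held: %d%n",
                maxHeapBytes() >> 20, peakHeapBytes() >> 20, chunks.length);
    }
}
```

Running the same measurement around the indexing code in both versions
would give figures that exclude memory-mapped files and other non-heap
memory, which a process-level tool cannot separate out.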


On Mon, Apr 1, 2013 at 7:09 PM, Simon Willnauer
<si...@gmail.com> wrote:

> can you provide some information how much ram you are setting on the
> index writer config?
>
> also how many threads are you using for indexing?
>
> simon
>
> On Mon, Apr 1, 2013 at 2:21 PM, Arun Kumar K <ar...@gmail.com> wrote:
> > Hi Adrien,
> >
> > I have seen memory usage using linux command top for RES memory & i have
> > used SimpleFSDirectory for indexing (NOT FSDirectory which i guess will
> > take it's default as MMapDir)  in both versions.
> >
> > Does 4.1 still take MMapDir as default ?
> >
> >
> > Arun
> >
> >
> > On Mon, Apr 1, 2013 at 5:38 PM, Adrien Grand <jp...@gmail.com> wrote:
> >
> >> On Mon, Apr 1, 2013 at 1:56 PM, Arun Kumar K <ar...@gmail.com> wrote:
> >> > Hi Guys,
> >>
> >> Hi,
> >>
> >> > I have been finding out the heap space requirement for indexing and
> >> > searching with 3.0.2 vs 4.1 (with BlockPostings Format).
> >> >
> >> > I have a 2GB index with 1 million docs with around 42 fields with 40
> >> > fields being random strings.
> >> >
> >> > I have seen that memory for search has reduced by 5X with 4.1 (with
> >> > BlockPostings Format) but the memory usage during indexing with 4.1 is
> >> > around 800MB~1.7 GB whereas for 3.0.2 it is 300~600MB.
> >> > But indexing time is almost same with both versions.
> >>
> >> How did you measure memory usage ? Operating systems may report high
> >> memory usage because MMapDirectory became the new default directory
> >> implementation and memory-mapped files are taken into account in the
> >> virtual memory of your processes. See
> >> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> >> for more information.
> >>
> >> --
> >> Adrien
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>

Re: "4.1 consuming more memory than 3.0.2 while Indexing"

Posted by Simon Willnauer <si...@gmail.com>.
Can you provide some information on how much RAM you are setting on the
index writer config?

Also, how many threads are you using for indexing?

simon

On Mon, Apr 1, 2013 at 2:21 PM, Arun Kumar K <ar...@gmail.com> wrote:
> Hi Adrien,
>
> I have seen memory usage using linux command top for RES memory & i have
> used SimpleFSDirectory for indexing (NOT FSDirectory which i guess will
> take it's default as MMapDir)  in both versions.
>
> Does 4.1 still take MMapDir as default ?
>
>
> Arun
>
>
> On Mon, Apr 1, 2013 at 5:38 PM, Adrien Grand <jp...@gmail.com> wrote:
>
>> On Mon, Apr 1, 2013 at 1:56 PM, Arun Kumar K <ar...@gmail.com> wrote:
>> > Hi Guys,
>>
>> Hi,
>>
>> > I have been finding out the heap space requirement for indexing and
>> > searching with 3.0.2 vs 4.1 (with BlockPostings Format).
>> >
>> > I have a 2GB index with 1 million docs with around 42 fields with 40
>> > fields being random strings.
>> >
>> > I have seen that memory for search has reduced by 5X with 4.1 (with
>> > BlockPostings Format) but the memory usage during indexing with 4.1 is
>> > around 800MB~1.7 GB whereas for 3.0.2 it is 300~600MB.
>> > But indexing time is almost same with both versions.
>>
>> How did you measure memory usage ? Operating systems may report high
>> memory usage because MMapDirectory became the new default directory
>> implementation and memory-mapped files are taken into account in the
>> virtual memory of your processes. See
>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>> for more information.
>>
>> --
>> Adrien
>>

Re: "4.1 consuming more memory than 3.0.2 while Indexing"

Posted by Arun Kumar K <ar...@gmail.com>.
Hi Adrien,

I measured memory usage with the Linux top command (RES column), and I
used SimpleFSDirectory for indexing (NOT FSDirectory, which I guess
defaults to MMapDir) in both versions.

Does 4.1 still use MMapDir as the default?


Arun
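
For context on the RES figure: it is a per-process number, so besides
live heap objects it also includes heap pages the JVM has touched but not
released, JVM overhead, and any resident file-backed pages. A rough way
to see the gap is to print the resident set size next to the heap usage
from inside the JVM. The sketch below assumes a Linux system where
/proc/self/status is readable and where its VmRSS line corresponds to
what top shows as RES; the class name ResVsHeap and its methods are
hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: compares process RSS (Linux only) with JVM heap usage.
final class ResVsHeap {

    // Extracts the VmRSS value in kB from the lines of /proc/<pid>/status,
    // or returns -1 if no VmRSS line is present.
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    // Resident set size of this JVM in kB, or -1 when it cannot be determined
    // (for example on a non-Linux system).
    static long currentRssKb() {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            return -1;
        }
        try {
            return parseVmRssKb(Files.readAllLines(status, StandardCharsets.UTF_8));
        } catch (IOException | RuntimeException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long heapCommittedKb = rt.totalMemory() / 1024;
        long heapUsedKb = (rt.totalMemory() - rt.freeMemory()) / 1024;

        System.out.println("process RSS (kB):    " + currentRssKb());
        System.out.println("heap committed (kB): " + heapCommittedKb);
        System.out.println("heap used (kB):      " + heapUsedKb);
    }
}
```

If RSS is far above "heap used" while indexing, the difference is memory
that a heap-only measurement would not attribute to the index writer.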


On Mon, Apr 1, 2013 at 5:38 PM, Adrien Grand <jp...@gmail.com> wrote:

> On Mon, Apr 1, 2013 at 1:56 PM, Arun Kumar K <ar...@gmail.com> wrote:
> > Hi Guys,
>
> Hi,
>
> > I have been finding out the heap space requirement for indexing and
> > searching with 3.0.2 vs 4.1 (with BlockPostings Format).
> >
> > I have a 2GB index with 1 million docs with around 42 fields with 40
> > fields being random strings.
> >
> > I have seen that memory for search has reduced by 5X with 4.1 (with
> > BlockPostings Format) but the memory usage during indexing with 4.1 is
> > around 800MB~1.7 GB whereas for 3.0.2 it is 300~600MB.
> > But indexing time is almost same with both versions.
>
> How did you measure memory usage ? Operating systems may report high
> memory usage because MMapDirectory became the new default directory
> implementation and memory-mapped files are taken into account in the
> virtual memory of your processes. See
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> for more information.
>
> --
> Adrien
>

Re: "4.1 consuming more memory than 3.0.2 while Indexing"

Posted by Adrien Grand <jp...@gmail.com>.
On Mon, Apr 1, 2013 at 1:56 PM, Arun Kumar K <ar...@gmail.com> wrote:
> Hi Guys,

Hi,

> I have been finding out the heap space requirement for indexing and
> searching with 3.0.2 vs 4.1 (with BlockPostings Format).
>
> I have a 2GB index with 1 million docs with around 42 fields with 40 fields
> being random strings.
>
> I have seen that memory for search has reduced by 5X with 4.1 (with
> BlockPostings Format) but the memory usage during indexing with 4.1 is
> around 800MB~1.7 GB whereas for 3.0.2 it is 300~600MB.
> But indexing time is almost same with both versions.

How did you measure memory usage? Operating systems may report high
memory usage because MMapDirectory became the new default directory
implementation, and memory-mapped files are counted in the
virtual memory of your processes. See
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
for more information.

-- 
Adrien
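
To illustrate the point about memory-mapped files, here is a small
JDK-only sketch (no Lucene involved) that maps a scratch file and prints
heap usage before and after. The class name HeapVsMapped, the helper
names, and the 64 MB size are all hypothetical choices for the demo. The
mapped buffer is a direct buffer backed by the file, so its contents are
not stored on the Java heap, although the OS still accounts for the
mapping at the process level.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.management.ManagementFactory;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo: a memory-mapped file is visible to the OS but not in heap usage.
final class HeapVsMapped {

    // Heap currently in use, as reported by the JVM itself.
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // Maps sizeBytes of the given file read-write and touches one byte per 4 KB
    // so that the pages actually become resident.
    static MappedByteBuffer mapScratchFile(Path file, int sizeBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer buf = channel.map(FileChannel.MapMode.READ_WRITE, 0, sizeBytes);
            for (int i = 0; i < sizeBytes; i += 4096) {
                buf.put(i, (byte) 1);
            }
            // The mapping stays valid after the channel is closed.
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-demo", ".bin");
        tmp.toFile().deleteOnExit();

        long before = heapUsedBytes();
        MappedByteBuffer buf = mapScratchFile(tmp, 64 * 1024 * 1024);
        long after = heapUsedBytes();

        System.out.printf("heap used before: %d KB, after: %d KB, mapped: %d MB, direct: %b%n",
                before / 1024, after / 1024, buf.capacity() >> 20, buf.isDirect());
    }
}
```

The heap figures are not exactly reproducible from run to run, but the
change between them should be small compared with the 64 MB that was
mapped, whereas a process-level tool would show the mapping.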
