Posted to solr-user@lucene.apache.org by Kissue Kissue <ki...@gmail.com> on 2011/09/28 13:56:07 UTC

Still too many files after running solr optimization

Hi,

I am using Solr 3.3. I noticed that after indexing about 700,000 records
and running an optimize at the end, I still have about 91 files in my index
directory. I thought that optimization was supposed to reduce the number of
files.

My settings are the defaults that came with Solr (mergeFactor, etc.).

Any ideas what I could be doing wrong?

Re: Still too many files after running solr optimization

Posted by Chris Hostetter <ho...@fucit.org>.
: I was worried because when I used only Lucene for the same indexing,
: there were many files before optimization, but after optimization I always ended
: up with just 3 files in my index folder. Just want to find out if this is
: ok.

It sounds like you were most likely using the "Compound File Format"
(which causes multiple per-field files to be encapsulated into a single
file per segment) when you were using Lucene directly (I believe it is the
default), but in Solr you are not.
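
To see how the files in an index directory are spread across segments, a
small script can group them by segment name. This is only a rough sketch: it
assumes the usual Lucene 3.x naming, where per-segment files start with an
underscore (for example _0.fdt or _0_1.del) and commit files start with
"segments". The function name group_by_segment is made up for illustration.

```python
from collections import defaultdict


def group_by_segment(filenames):
    """Group index file names by the segment they appear to belong to.

    Assumes Lucene 3.x style names: "_<segment>.<ext>" or
    "_<segment>_<generation>.<ext>" for segment files, and names starting
    with "segments" for commit files. Anything else goes under "(other)".
    """
    groups = defaultdict(list)
    for name in filenames:
        if name.startswith("segments"):
            groups["(commit)"].append(name)
        elif name.startswith("_"):
            stem = name.split(".", 1)[0]               # "_0_1" from "_0_1.del"
            segment = "_" + stem[1:].split("_", 1)[0]  # "_0" from "_0_1"
            groups[segment].append(name)
        else:
            groups["(other)"].append(name)
    return dict(groups)
```

Calling it with os.listdir() on the index directory shows whether the
remaining files belong to one segment (many files, non-compound format) or
to several segments.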

Check the "<useCompoundFile>" setting(s) in your solrconfig.xml:

https://lucene.apache.org/java/3_4_0/fileformats.html#Compound%20Files
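
As a quick way to do that check, the sketch below lists every
<useCompoundFile> element found in a solrconfig.xml together with the
element that contains it. It makes no assumption about which sections exist;
the function name compound_file_settings is hypothetical, and the
indexDefaults / mainIndex section names mentioned in the docstring are only
my recollection of the 3.x example config.

```python
import xml.etree.ElementTree as ET


def compound_file_settings(solrconfig_xml):
    """Return (parent_tag, enabled) for each <useCompoundFile> element.

    solrconfig_xml is the content of solrconfig.xml (str or bytes).
    In a Solr 3.x example config the parents are typically sections
    such as indexDefaults and mainIndex (an assumption, not verified here).
    """
    root = ET.fromstring(solrconfig_xml)
    settings = []
    for parent in root.iter():
        for child in parent:
            if child.tag == "useCompoundFile":
                enabled = (child.text or "").strip().lower() == "true"
                settings.append((parent.tag, enabled))
    return settings
```

If every entry comes back False, Solr writes separate files per segment,
which would explain a higher file count than with plain Lucene.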

For most Solr users, the compound file format is a bad idea because it
can decrease performance. The only reason to use it is if you are in a
heavily constrained setup where you need to be very restrictive about the
number of open file handles.


-Hoss

Re: Still too many files after running solr optimization

Posted by Vadim Kisselmann <v....@googlemail.com>.
We had a misunderstanding :)

Docs are the docs in the index.
Files are the files in the index directory (index parts).

During optimization, docs are not deleted unless they are flagged as
deleted.
But you do merge your index and delete files in your index directory, that's
right.

After a second optimize, the files that were still open for reading
are deleted.

Regards



2011/9/28 Manish Bafna <ma...@gmail.com>

> We tested it many times.
> The first optimize creates the new (merged) index files, but
> the existing index files are not deleted (because they might still be
> open for reading).
> The second optimize deletes everything other than the new index files.
>
> This is happening specifically on Windows.

Re: Still too many files after running solr optimization

Posted by Manish Bafna <ma...@gmail.com>.
We tested it many times.
The first optimize creates the new (merged) index files, but
the existing index files are not deleted (because they might still be
open for reading).
The second optimize deletes everything other than the new index files.

This is happening specifically on Windows.
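
One way to confirm this behaviour on a given machine is to snapshot the index
directory before and after each optimize and compare the listings. A minimal
sketch follows; the helper names snapshot and diff_snapshots are made up, and
the index directory path is whatever your Solr data directory is.

```python
import os


def snapshot(index_dir):
    """Return the set of file names currently in the index directory."""
    return set(os.listdir(index_dir))


def diff_snapshots(before, after):
    """Report which files disappeared and which appeared between snapshots."""
    return {
        "removed": sorted(before - after),
        "added": sorted(after - before),
    }
```

Taking one snapshot before the first optimize, one after it, and one after the
second optimize shows whether the old segment files only go away on the
second run.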

On Wed, Sep 28, 2011 at 8:23 PM, Vadim Kisselmann
<v....@googlemail.com> wrote:
> 2011/9/28 Manish Bafna <ma...@gmail.com>
>
>> >>Will it not merge the index?
>>
>
> yes
>
>
>> >>While merging on windows, the old index files dont get deleted.
>> >>(Windows has an issue where the file opened for reading cannot be
>> >>deleted)
>> >>
>> >>So, if you call optimize again, it will delete the older index files.
>>
>> no.
> During optimize you only delete docs that are flagged as deleted, no
> matter how old they are.
> If numDocs and maxDocs are equal, you only rebuild
> and merge your index, but you delete nothing.
>
> Regards

Re: Still too many files after running solr optimization

Posted by Vadim Kisselmann <v....@googlemail.com>.
2011/9/28 Manish Bafna <ma...@gmail.com>

> >>Will it not merge the index?
>

yes


> >>While merging on windows, the old index files dont get deleted.
> >>(Windows has an issue where the file opened for reading cannot be
> >>deleted)
> >>
> >>So, if you call optimize again, it will delete the older index files.
>

No.

During optimize you only delete docs that are flagged as deleted, no
matter how old they are.
If numDocs and maxDocs are equal, you only rebuild
and merge your index, but you delete nothing.

Regards





Re: Still too many files after running solr optimization

Posted by Manish Bafna <ma...@gmail.com>.
Won't it merge the index?
While merging on Windows, the old index files don't get deleted
(on Windows, a file that is open for reading cannot be
deleted).

So, if you call optimize again, it will delete the older index files.

On Wed, Sep 28, 2011 at 6:43 PM, Vadim Kisselmann
<v....@googlemail.com> wrote:
> If numDocs and maxDocs are equal, nothing will be deleted
> on optimize.
> You only rebuild your index.
>
> Regards
> Vadim

Re: Still too many files after running solr optimization

Posted by Vadim Kisselmann <v....@googlemail.com>.
If numDocs and maxDocs are equal, nothing will be deleted
on optimize.
You only rebuild your index.
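
The comparison can be written down as a tiny helper. This is just an
illustration of the arithmetic: the difference between the two counters is
the number of docs flagged as deleted, which is what an optimize would
expunge. The function name is hypothetical, and the counters are the ones
called numDocs and maxDocs in this thread (I believe Solr's statistics page
labels the latter maxDoc).

```python
def docs_flagged_deleted(num_docs, max_docs):
    """Number of docs that are flagged as deleted but still in the index.

    num_docs: live (searchable) documents.
    max_docs: live documents plus documents flagged as deleted.
    """
    if num_docs < 0 or max_docs < num_docs:
        raise ValueError("expected 0 <= num_docs <= max_docs")
    return max_docs - num_docs
```

A result of 0 means an optimize has no docs to expunge; it only merges
segments.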

Regards
Vadim




2011/9/28 Kissue Kissue <ki...@gmail.com>

> numDocs and maxDocs are the same.
>
> I was worried because when I used only Lucene for the same indexing,
> there were many files before optimization, but after optimization I always
> ended
> up with just 3 files in my index folder. Just want to find out if this is
> ok.
>
> Thanks

Re: Still too many files after running solr optimization

Posted by Kissue Kissue <ki...@gmail.com>.
numDocs and maxDocs are the same.

I was worried because when I used only Lucene for the same indexing,
there were many files before optimization, but after optimization I always ended
up with just 3 files in my index folder. Just want to find out if this is
ok.

Thanks

On Wed, Sep 28, 2011 at 1:23 PM, Vadim Kisselmann <
v.kisselmann@googlemail.com> wrote:

> Why should optimization reduce the number of files?
> That happens only when you index docs with the same unique key.
>
> Is there a difference between numDocs and maxDocs after the optimize?
> If yes:
> what does your optimize command look like?
>
> Regards
> Vadim

Re: Still too many files after running solr optimization

Posted by Vadim Kisselmann <v....@googlemail.com>.
Why should optimization reduce the number of files?
That happens only when you index docs with the same unique key.

Is there a difference between numDocs and maxDocs after the optimize?
If yes:
what does your optimize command look like?

Regards
Vadim



2011/9/28 Manish Bafna <ma...@gmail.com>

> Try running optimize twice.
> The second run will be quick and will delete a lot of files.

Re: Still too many files after running solr optimization

Posted by Manish Bafna <ma...@gmail.com>.
Try running optimize twice.
The second run will be quick and will delete a lot of files.
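
For reference, here is a rough sketch of issuing the optimize twice over HTTP.
It assumes the update handler is reachable at <base>/update and accepts the
optimize=true and waitSearcher=true request parameters; the base URL
(for example http://localhost:8983/solr) and the helper names are assumptions
you will need to adapt to your installation.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(base_url):
    """Build the update-handler URL that requests an optimize."""
    params = {"optimize": "true", "waitSearcher": "true"}
    return base_url.rstrip("/") + "/update?" + urlencode(params)


def optimize_twice(base_url, opener=urlopen):
    """Request an optimize two times in a row; return the HTTP status codes.

    opener is injectable so the function can be exercised without a
    running Solr instance.
    """
    url = optimize_url(base_url)
    statuses = []
    for _ in range(2):
        with opener(url) as response:
            statuses.append(response.status)
    return statuses
```

Something like optimize_twice("http://localhost:8983/solr") would then run
both passes; the second should return quickly if the first already merged
everything.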

On Wed, Sep 28, 2011 at 5:26 PM, Kissue Kissue <ki...@gmail.com> wrote:
> Hi,
>
> I am using Solr 3.3. I noticed that after indexing about 700,000 records
> and running an optimize at the end, I still have about 91 files in my index
> directory. I thought that optimization was supposed to reduce the number of
> files.
>
> My settings are the default that came with Solr (mergefactor, etc)
>
> Any ideas what i could be doing wrong?
>