Posted to solr-user@lucene.apache.org by mechravi25 <me...@yahoo.co.in> on 2012/01/02 13:10:32 UTC

Query regarding segment files in SOLR

Hi,

I have a few questions regarding segment files. I have optimized data in
my Solr core, and the following files are there:
{_2ni.fdt,_2ni.fdx,_2ni.fnm,_2ni.frq,_2ni.nrm,_2ni.prx,_2ni.tii,_2ni.tis}
and two other files {segments.gen,segments_2hr}.

My understanding is that those 8 files together represent one segment,
and the other two files hold information about the segments.

My solrconfig.xml file has the following details

    <ramBufferSizeMB>320</ramBufferSizeMB>
    <mergeFactor>10</mergeFactor>
    <maxBufferedDocs>100000</maxBufferedDocs>


I am now adding a single document to the index without optimizing, and I
notice that another set of segment files is created. In all, I index 11
documents individually, and once the 11th document is added, all 11 newly
added documents are combined into a new segment.

My question is this:

In the Solr wiki, this is the explanation given for mergeFactor: "For example,
if you set mergeFactor to 10, a new segment will be created on the disk for
every 1000 (or maxBufferedDocs) documents added to the index. When the 10th
segment of size 1000 is added, all 10 will be merged into a single segment
of size 10,000."

But for me, a new set of segment files is created for every document. Does
the explanation given in the Solr wiki apply only when the documents are
indexed continuously?

What is the purpose of maxBufferedDocs here?

Thanks

--
View this message in context: http://lucene.472066.n3.nabble.com/Query-regarding-segment-files-in-SOLR-tp3626428p3626428.html

Re: Query regarding segment files in SOLR

Posted by Shawn Heisey <so...@elyograg.org>.
On 1/2/2012 5:10 AM, mechravi25 wrote:
> My solrconfig.xml file has the following details
>
>     <ramBufferSizeMB>320</ramBufferSizeMB>
>      <mergeFactor>10</mergeFactor>
>      <maxBufferedDocs>100000</maxBufferedDocs>
>
>
> I am now adding only one document to the index without optimizing and I
> notice that another set of segment files is getting created. In all, I
> create index 11 documents individually and once the 11th document is added,
> all the newly added 11 documents are combined in a new segment file.
>
> My question is this,
>
> In the solr wiki this is the explanation given for MergeFactor  For example,
> if you set mergeFactor to 10, a new segment will be created on the disk for
> every 1000 (or maxBufferedDocs) documents added to the index. When the 10th
> segment of size 1000 is added, all 10 will be merged into a single segment
> of size 10,000.
>
> But for me for every document a new set of segment files is created. Does
> the explanation given in the solr wiki suitable only when the documents are
> indexed continuously?
>
> What is the purpose for maxBufferedDocs here?

These settings govern segment creation only between commits.  A commit
(which makes the new documents searchable) writes whatever is currently
buffered to disk as a segment, and the counting starts over.  If you are
committing every time you add a document, then you will get a segment
for every document.  If you add many documents before committing, then
the limits you have specified will be enforced ... but none of the new
documents will be searchable until they are committed.
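
To make the difference concrete, here is a minimal sketch in Python of a
toy index writer.  It is only an illustration under simplifying
assumptions, not Lucene's actual merge policy: ToyIndexWriter and the two
helper functions are hypothetical names, segments are tracked only by
their document count, the RAM limit is ignored, and a merge is assumed to
happen as soon as the newest mergeFactor segments have the same size.
The point at which the real merge policy triggers may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndexWriter:
    """Toy model of flush and merge behaviour; NOT Lucene's real merge policy."""
    max_buffered_docs: int = 1000
    merge_factor: int = 10
    buffered: int = 0                             # docs held in RAM, not yet on disk
    segments: list = field(default_factory=list)  # doc count of each on-disk segment

    def add(self, n=1):
        """Buffer n documents, flushing whenever the buffer limit is reached."""
        for _ in range(n):
            self.buffered += 1
            if self.buffered >= self.max_buffered_docs:
                self._flush()

    def commit(self):
        """A commit writes whatever is buffered, however little that is."""
        self._flush()

    def _flush(self):
        if self.buffered == 0:
            return
        self.segments.append(self.buffered)
        self.buffered = 0
        self._maybe_merge()

    def _maybe_merge(self):
        # Assumption: merge once the newest merge_factor segments are equal in size.
        while len(self.segments) >= self.merge_factor:
            tail = self.segments[-self.merge_factor:]
            if len(set(tail)) != 1:
                break
            del self.segments[-self.merge_factor:]
            self.segments.append(sum(tail))


def bulk_then_commit():
    """Wiki scenario: 10,000 docs, 1000 per flush, one commit at the end."""
    writer = ToyIndexWriter(max_buffered_docs=1000, merge_factor=10)
    writer.add(10_000)
    writer.commit()
    return writer.segments


def commit_every_doc(n_docs):
    """Scenario from this thread: a commit after every single document."""
    writer = ToyIndexWriter(max_buffered_docs=100_000, merge_factor=10)
    for _ in range(n_docs):
        writer.add()
        writer.commit()
    return writer.segments


if __name__ == "__main__":
    print(bulk_then_commit())    # segment sizes after the bulk load
    print(commit_every_doc(9))   # segment sizes after 9 single-document commits
```

In this model, maxBufferedDocs only matters in the first scenario.  With a
commit after every document the buffer never holds more than one
document, so every commit produces a one-document segment and only
mergeFactor has any visible effect.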

As for maxBufferedDocs, it is likely that you would hit your configured 
RAM limit before you hit your configured maxBufferedDocs limit.
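
As a rough back-of-the-envelope check of that statement, the sketch below
computes the average buffered size per document at which both limits
would be reached at the same moment.  It assumes that a flush is
triggered by whichever limit is hit first and that 1 MB means 1024 * 1024
bytes.  The function names are hypothetical, and the in-memory footprint
of a buffered document is not the same as its raw size on disk, so treat
the result as an order-of-magnitude guide only.

```python
RAM_BUFFER_MB = 320          # ramBufferSizeMB from the solrconfig.xml above
MAX_BUFFERED_DOCS = 100_000  # maxBufferedDocs from the solrconfig.xml above


def break_even_bytes_per_doc(ram_buffer_mb=RAM_BUFFER_MB,
                             max_buffered_docs=MAX_BUFFERED_DOCS):
    """Average buffered bytes per doc at which both limits trip together."""
    return ram_buffer_mb * 1024 * 1024 / max_buffered_docs


def first_limit_hit(avg_buffered_bytes_per_doc,
                    ram_buffer_mb=RAM_BUFFER_MB,
                    max_buffered_docs=MAX_BUFFERED_DOCS):
    """Name the setting that would trigger a flush first (assumed model)."""
    threshold = break_even_bytes_per_doc(ram_buffer_mb, max_buffered_docs)
    if avg_buffered_bytes_per_doc > threshold:
        return "ramBufferSizeMB"
    return "maxBufferedDocs"


if __name__ == "__main__":
    print(round(break_even_bytes_per_doc(), 1))  # bytes per buffered document
    print(first_limit_hit(5000))
    print(first_limit_hit(1000))
```

With these settings the break-even point is a little over 3 KB of buffer
per document, so documents that take more room than that in memory would
fill the RAM buffer before 100000 of them accumulate.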

Thanks,
Shawn