Posted to solr-user@lucene.apache.org by Bernd Fehling <be...@uni-bielefeld.de> on 2011/01/03 08:55:23 UTC

names of index files

Dear list,

some questions about the names of the index files.
With an older Solr 4.x version from trunk my index looks like:
_2t1.fdt
_2t1.fdx
_2t1.fnm
_2t1.frq
_2t1.nrm
_2t1.prx
_2t1.tii
_2t1.tis
segments_2
segments.gen

With the most recent version from trunk it looks like:
_3a9.fdt
_3a9.fdx
_3a9.fnm
_3a9_0.frq
_3a9.nrm
_3a9_0.prx
_3a9_0.tii
_3a9_0.tis
segments_4
segments.gen

Why is there a "_0" in some of the file names?
Is it from Lucene or from Solr or a fault in my system?

Both indexes are optimized, any idea?

Regards, Bernd

Re: names of index files

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
Hi Grant,

Simon Willnauer gave me an excellent explanation:

> Why is there a "_0" in some of the file names?
> Is it from Lucene or from Solr or a fault in my system?
Lucene 4.0, as you might know, has the ability to plug in a Codec, which
has full control over how postings are stored, which format is used,
and which files are written. Each field within a segment can have its
own codec, i.e. field "foo" can use "Standard" while field "bar" uses
"Pulsing", for instance. In such a case, since Pulsing is just a
wrapper around the Standard codec, both codecs try to write the same set
of files per segment. For that reason we introduced a codec ID that is
valid per segment. It is really an ordinal built from the set of codecs
used per segment. This ordinal is used to build the file names. In your
case you only have one codec (I suppose it's the Standard codec) with the
ord "0". This ordinal is used for all files that codec writes. The
files without that ordinal (nrm, fdt, fdx and fnm) are written by the
IndexWriter directly, but the functionality behind them might be exposed
via codecs sooner or later.
So, after all, this is Lucene functionality and your system is
doing the right thing!
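
To make the naming scheme above concrete, here is a minimal sketch in
Python. It is not Lucene code: parse_index_file_name and
assign_codec_ords are hypothetical helpers written only for this
illustration. It assumes segment names are an underscore followed by
lowercase base-36 characters, and that ordinals are handed out in the
order codecs are first seen, which may differ from what trunk really does.

```python
import re
from typing import Dict, NamedTuple, Optional

# Hypothetical helpers for illustration only; none of this is Lucene API.
# Assumed pattern: "_<segment>[_<codec ord>].<extension>", e.g. "_3a9_0.frq".
_FILE_RE = re.compile(r"^(_[0-9a-z]+)(?:_(\d+))?\.([a-z]+)$")


class IndexFileName(NamedTuple):
    segment: str               # e.g. "_3a9"
    codec_ord: Optional[int]   # None if the file name carries no codec ordinal
    extension: str             # e.g. "frq"


def parse_index_file_name(name: str) -> Optional[IndexFileName]:
    """Split a per-segment file name; return None for other files."""
    match = _FILE_RE.match(name)
    if match is None:
        return None  # e.g. "segments_4" or "segments.gen"
    segment, ord_text, extension = match.groups()
    codec_ord = int(ord_text) if ord_text is not None else None
    return IndexFileName(segment, codec_ord, extension)


def assign_codec_ords(field_codecs: Dict[str, str]) -> Dict[str, int]:
    """Give each distinct codec an ordinal, in first-seen order (assumption)."""
    ords: Dict[str, int] = {}
    for codec in field_codecs.values():
        ords.setdefault(codec, len(ords))
    return ords


if __name__ == "__main__":
    for file_name in ["_3a9.fdt", "_3a9_0.frq", "segments_4", "segments.gen"]:
        print(file_name, "->", parse_index_file_name(file_name))
    print(assign_codec_ords({"foo": "Standard", "bar": "Pulsing"}))
```

With the listing from the original question, the postings files (frq,
prx, tii, tis) parse with codec ordinal 0, while fdt, fdx, fnm and nrm
carry no ordinal, which matches the explanation above.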

> I also didn't find any information at
> http://lucene.apache.org/java/3_0_3/fileformats.html
It's not a 3.x feature: codecs are introduced in 4.0, a.k.a. trunk.


Thanks again to Simon.



On 03.01.2011 16:19, Grant Ingersoll wrote:
> http://lucene.apache.org/java/3_0_2/fileformats.html (and other versions) contains the explanation of what the file formats are and the naming conventions.  Since you are on trunk, you will need to get the docs for that particular version of Lucene and take a look in them.
> 
> -Grant

Re: names of index files

Posted by Grant Ingersoll <gs...@apache.org>.
http://lucene.apache.org/java/3_0_2/fileformats.html (and other versions) contains the explanation of what the file formats are and the naming conventions.  Since you are on trunk, you will need to get the docs for that particular version of Lucene and take a look in them.

-Grant

--------------------------
Grant Ingersoll
http://www.lucidimagination.com