Posted to java-user@lucene.apache.org by wgggfiy <wu...@qq.com> on 2012/11/16 09:57:36 UTC

what is the format of .tim and .tiq in lucene 4.0 ?

Hi, guys. I'm now studying Lucene 4.0 and have run into difficulties.
Compared with previous versions, the term dictionary in this version is
different. What is a block? And what is the FST? Please help, thanks.



--
View this message in context: http://lucene.472066.n3.nabble.com/what-is-the-format-of-tim-and-tiq-in-lucene-4-0-tp4020677.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

Re: what is the format of .tim and .tiq in lucene 4.0 ?

Posted by Michael McCandless <lu...@mikemccandless.com>.
The format is unfortunately rather intricate ...

FST = finite state transducer (see e.g.
http://blog.mikemccandless.com/2010/12/using-finite-state-transducers-in.html
).  We use that to hold the terms index (*.tip), which is loaded into
RAM.

The blocks exist because we encode between 25 and 48 terms together
as one block.  Blocks are picked according to how terms share prefixes
so that we get better compression and faster lookup.  It's a variant of
a burst trie (see e.g.
http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499 ).
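
To illustrate why grouping terms by shared prefix helps compression,
here is a minimal sketch in Java. It is not the actual Lucene block
writer: the class name PrefixBlockSketch and its methods are made up
for this example, and it only shows the idea of storing a block's
common prefix once and keeping just the suffix of each term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not Lucene's real term block encoding. */
final class PrefixBlockSketch {

    /** Returns the longest prefix shared by every term in the block. */
    static String commonPrefix(List<String> terms) {
        if (terms.isEmpty()) {
            return "";
        }
        String prefix = terms.get(0);
        for (String term : terms) {
            int max = Math.min(prefix.length(), term.length());
            int i = 0;
            while (i < max && prefix.charAt(i) == term.charAt(i)) {
                i++;
            }
            // Shrink the candidate prefix to what this term still shares.
            prefix = prefix.substring(0, i);
        }
        return prefix;
    }

    /** Strips the shared prefix so only each term's suffix is stored. */
    static List<String> suffixes(List<String> terms) {
        int skip = commonPrefix(terms).length();
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            out.add(term.substring(skip));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> block =
            Arrays.asList("search", "searched", "searcher", "searching");
        System.out.println(commonPrefix(block)); // prints "search"
        System.out.println(suffixes(block));     // prints "[, ed, er, ing]"
    }
}
```

With the prefix written once, the four terms above need only the
suffixes "", "ed", "er" and "ing" inside the block.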

The index points to the start of blocks, so in looking up a term we
figure out from the index which block may have the term (if any), seek
there, and scan for it.
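
The lookup steps above can be sketched roughly as follows. This is a
simplified stand-in, not the real implementation: a TreeMap keyed by
each block's first term plays the role of the FST terms index, and
in-memory lists play the role of the on-disk blocks. The class
BlockLookupSketch and its contains method are hypothetical names, and
every block is assumed to be non-empty and sorted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; Lucene's real index is an FST over prefixes. */
final class BlockLookupSketch {

    // Stand-in for the in-RAM terms index: first term of a block -> block number.
    private final TreeMap<String, Integer> index = new TreeMap<>();

    // Stand-in for the term dictionary file: each block is a sorted run of terms.
    private final List<List<String>> blocks;

    BlockLookupSketch(List<List<String>> blocks) {
        this.blocks = blocks;
        for (int i = 0; i < blocks.size(); i++) {
            index.put(blocks.get(i).get(0), i);
        }
    }

    boolean contains(String term) {
        // Step 1: ask the index which block may hold the term (if any).
        Map.Entry<String, Integer> entry = index.floorEntry(term);
        if (entry == null) {
            return false; // the term sorts before the first block
        }
        // Step 2: "seek" to that block, then scan it for the term.
        for (String candidate : blocks.get(entry.getValue())) {
            if (candidate.equals(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BlockLookupSketch dict = new BlockLookupSketch(Arrays.asList(
            Arrays.asList("apple", "apply", "apt"),
            Arrays.asList("search", "searched", "searcher")));
        System.out.println(dict.contains("apply"));  // prints "true"
        System.out.println(dict.contains("banana")); // prints "false"
    }
}
```

Only the small index has to stay in memory; a lookup touches at most
one block.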

Mike McCandless

http://blog.mikemccandless.com

