Posted to java-user@lucene.apache.org by wj <pp...@gmail.com> on 2012/07/06 04:16:58 UTC
A question about the description of the .tii file format in the documentation
See http://lucene.apache.org/core/3_6_0/fileformats.html#tii
1 tii structure
The structure of this file is very similar to the .tis file, with the
addition of one item per record, the IndexDelta.
TermInfoIndex (.tii)--> TIVersion, IndexTermCount, IndexInterval,
SkipInterval, MaxSkipLevels, TermIndices
TIVersion --> UInt32
IndexTermCount --> UInt64
IndexInterval --> UInt32
SkipInterval --> UInt32
TermIndices --> <TermInfo, IndexDelta> IndexTermCount
IndexDelta --> VLong
2 tis structure
TermInfoFile (.tis)--> TIVersion, TermCount, IndexInterval,
SkipInterval, MaxSkipLevels, TermInfos
TIVersion --> UInt32
TermCount --> UInt64
IndexInterval --> UInt32
SkipInterval --> UInt32
MaxSkipLevels --> UInt32
TermInfos --> <TermInfo> TermCount
TermInfo --> <Term, DocFreq, FreqDelta, ProxDelta, SkipDelta>
Term --> <PrefixLength, Suffix, FieldNum>
Suffix --> String
PrefixLength, DocFreq, FreqDelta, ProxDelta, SkipDelta --> VInt
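As a cross-check of the fixed-width header fields in the grammar above, here is a minimal Python sketch that unpacks them. It assumes the fields are big-endian and packed back to back in grammar order, which is consistent with the hex dumps further down in this message. The name parse_header and the dictionary keys are only illustrative, not part of Lucene.

```python
import struct

# Assumed layout: big-endian, no padding, fields in grammar order:
# TIVersion (UInt32), TermCount (UInt64), IndexInterval (UInt32),
# SkipInterval (UInt32), MaxSkipLevels (UInt32).
HEADER = struct.Struct(">IQIII")


def parse_header(buf):
    """Return the five fixed-width header fields and the header size in bytes."""
    version, count, index_interval, skip_interval, max_skip_levels = (
        HEADER.unpack_from(buf, 0)
    )
    fields = {
        "version": version,  # 0xFFFFFFFC here; read as a signed 32-bit int this is -4
        "count": count,
        "index_interval": index_interval,
        "skip_interval": skip_interval,
        "max_skip_levels": max_skip_levels,
    }
    return fields, HEADER.size


# First 24 bytes of the .tis dump in this message.
tis_header = bytes.fromhex(
    "FF FF FF FC 00 00 00 00 00 00 00 12 00 00 00 80 00 00 00 10 00 00 00 0A"
)
fields, size = parse_header(tis_header)
print(fields, size)
```

For the .tis dump in this message, this reads a term count of 18 (0x12), IndexInterval 128, SkipInterval 16, MaxSkipLevels 10, and a header size of 24 bytes.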
My question is: is the TermInfo structure in the TII file the same as the TermInfo in the TIS file?
---------------------------------------------THE TIS HEX
FF FF FF FC 00 00 00 00 00 00 00 12 00 00 00 80
00 00 00 10 00 00 00 0A
00 02 6D 79 00 01 00 00
00 08 73 74 6F 72 65 79 65 73 00 01 02 04 00 04
74 65 73 74 00 01 01 01 00 02 6D 79 01 01 01 01
00 07 73 74 6F 72 65 6E 6F 01 01 01 01 00 04 74
65 73 74 01 01 01 01 00 04 64 6F 63 31 02 01 01
01 00 02 6D 79 02 01 01 01 00 08 73 74 6F 72 65
79 65 73 02 01 01 01 00 04 74 65 73 74 02 01 01
01 00 04 64 6F 63 32 03 01 01 01 00 02 6D 79 03
01 01 01 00 08 73 74 6F 72 65 79 65 73 03 01 01
01 00 04 74 65 73 74 03 01 01 01 00 04 64 6F 63
32 04 01 01 01 00 02 6D 79 04 01 01 01 00 07 73
74 6F 72 65 6E 6F 04 01 01 01 00 04 74 65 73 74
04 01 01 01
One of the TermInfo entries in the TIS file:
00 :PrefixLength
02 :Suffix string length
6D 79 :term text "my" (Unicode character codes)
00 :FieldNum
01 :DocFreq (the term occurs in only one doc)
00 :FreqDelta, determines the position of this term's TermFreqs within
the .frq file.
00 :ProxDelta, determines the position of this term's TermPositions
within the .prx file.
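The byte-by-byte reading above can be sketched in Python as follows. This is only a sketch under some assumptions: a VInt carries 7 data bits per byte, low-order group first, with the high bit set when another byte follows; a String is a VInt length followed by that many UTF-8 bytes; FieldNum is a VInt; and SkipDelta is stored only when DocFreq is not smaller than SkipInterval, as the file-format document describes. The helper names read_vint and read_term_info are made up for illustration.

```python
def read_vint(buf, pos):
    """Decode one VInt starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if b & 0x80 == 0:
            return value, pos
        shift += 7


def read_term_info(buf, pos, skip_interval, prev_text=""):
    """Decode one TermInfo record (assumed layout, see the grammar above)."""
    prefix_len, pos = read_vint(buf, pos)
    suffix_len, pos = read_vint(buf, pos)
    suffix = buf[pos:pos + suffix_len].decode("utf-8")
    pos += suffix_len
    field_num, pos = read_vint(buf, pos)
    doc_freq, pos = read_vint(buf, pos)
    freq_delta, pos = read_vint(buf, pos)
    prox_delta, pos = read_vint(buf, pos)
    skip_delta = None
    if doc_freq >= skip_interval:  # SkipDelta is absent for rare terms
        skip_delta, pos = read_vint(buf, pos)
    entry = {
        # The term shares its first prefix_len characters with the previous term.
        "text": prev_text[:prefix_len] + suffix,
        "field_num": field_num,
        "doc_freq": doc_freq,
        "freq_delta": freq_delta,
        "prox_delta": prox_delta,
        "skip_delta": skip_delta,
    }
    return entry, pos


# The first two records that follow the 24-byte header in the .tis dump.
records = bytes.fromhex(
    "00 02 6D 79 00 01 00 00 "
    "00 08 73 74 6F 72 65 79 65 73 00 01 02 04"
)
first, pos = read_term_info(records, 0, skip_interval=16)
second, pos = read_term_info(records, pos, skip_interval=16, prev_text=first["text"])
print(first)
print(second)
```

With these assumptions the first record decodes to the term "my" in field 0 with DocFreq 1, and the second to "storeyes" with FreqDelta 2 and ProxDelta 4. Neither record has a SkipDelta because DocFreq is below the SkipInterval of 16.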
--------------------------------------------------------THE TII HEX
FF FF FF FC 00 00 00 00 00 00 00 01 00 00 00 80
00 00 00 10 00 00 00 0A 00 00 FF FF FF FF 0F 00
00 00 18
FF FF FF FC :TIVersion
00 00 00 00 00 00 00 01 :IndexTermCount
00 00 00 80 :IndexInterval
00 00 00 10 :SkipInterval
00 00 00 0A 00 00 FF FF FF FF 0F 00 00 00 18
But what is the TermInfo in the TII file? This confuses me; any help would be appreciated, thanks.
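One possible reading of the trailing bytes, offered as a sketch and not as a confirmed answer: if 00 00 00 0A is MaxSkipLevels (as in the .tis header) and the remaining 11 bytes are a single <TermInfo, IndexDelta> entry with the same TermInfo layout as in the .tis file, they can be decoded as a plain run of VInts. The helper read_vints and the field labels below are illustrative, and the labels hold only under that assumption.

```python
def read_vints(buf):
    """Decode a byte string consisting only of back-to-back VInts."""
    values = []
    value = 0
    shift = 0
    for b in buf:
        value |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7  # continuation bit set: more bytes follow
        else:
            values.append(value)
            value = 0
            shift = 0
    return values


# Bytes of the .tii dump after the assumed MaxSkipLevels field (00 00 00 0A).
tail = bytes.fromhex("00 00 FF FF FF FF 0F 00 00 00 18")
values = read_vints(tail)

# Labels assume the .tis TermInfo layout followed by IndexDelta. Treating the
# whole tail as VInts works here only because the suffix length is 0, so no
# string bytes follow it.
labels = [
    "PrefixLength", "SuffixLength", "FieldNum",
    "DocFreq", "FreqDelta", "ProxDelta", "IndexDelta",
]
entry = dict(zip(labels, values))

# FieldNum read as a signed 32-bit integer.
field_num_signed = entry["FieldNum"] - (1 << 32) if entry["FieldNum"] >= (1 << 31) else entry["FieldNum"]

# Size of the fixed header: UInt32 + UInt64 + three UInt32 fields.
header_size = 4 + 8 + 4 + 4 + 4

print(entry, field_num_signed, header_size)
```

Under this reading the entry is an empty term (PrefixLength 0, suffix length 0) whose FieldNum is 0xFFFFFFFF, i.e. -1 as a signed 32-bit value, with DocFreq, FreqDelta and ProxDelta all 0. The IndexDelta of 24 (0x18) equals the size of the 24-byte header, so it would point at the first TermInfo record in the .tis file.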